Recently, I started experimenting with a very simple idea:
Parents use WeChat to start a video call
↓
A child’s tablet receives the incoming call
↓
Both sides enter a video conversation
At first glance, it sounds like a normal video chat application.
But once I actually started building it, I realized the project was less about “making a video chat page” and more about:
connecting the WeChat ecosystem,
PWA behavior,
real-time communication,
and Android tablet lifecycle management
into a coherent system.
And eventually, the whole MVP became an experiment in:
SDK orchestration
+
AI-assisted development
+
Architecture-first Vibe Coding
Why Build This?
The original motivation was surprisingly simple.
I noticed that many “family companionship” products suffer from the same problem:
too complicated for parents,
too heavy for children.
Most communication products assume:
- both users install apps
- both users register accounts
- both users learn the interface
- both users manage permissions
But in reality:
most parents already live inside WeChat.
So the question became:
Could parents use only WeChat,
while children use only a tablet,
and still achieve a smooth video calling experience?
That became the core MVP goal.
The Initial Architecture
The first system design looked like this:
WeChat Side
↓
Backend Service
↓
Tencent Cloud IM / TRTC
↓
Child Tablet
Each layer had a different responsibility:
- WeChat: entry point
- IM: messaging and signaling
- TRTC: real-time audio/video
- Tablet: incoming call UI and video display
And this led to one important realization very early on:
IM is not the same thing as video calling.
One Important Misunderstanding
Initially, I assumed:
Tencent Cloud IM SDK
≈ video communication SDK
But once I read the documentation more carefully, the architecture became much clearer.
Tencent Cloud actually separates the system into two layers:
Tencent Cloud IM
Responsible for:
- messaging
- online status
- signaling
- session management
- push notifications
Meanwhile, the actual media layer comes from:
TUICallKit / TRTC
(see the Tencent Cloud TUIKit documentation)
Responsible for:
- WebRTC
- audio/video streams
- room management
- microphone
- camera
At that point, the mental model finally clicked:
IM is the “doorbell”
TRTC is the “conversation”
Once that distinction became clear, the overall architecture became dramatically simpler.
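The doorbell/conversation split can be sketched in a few lines. This is a hypothetical model, not the actual Tencent Cloud IM message schema: the point is only that the signaling layer decides *whether* a conversation should start, while the media layer is a separate concern it merely dispatches to.

```typescript
// Hypothetical signaling event carried over the IM layer
// (names are illustrative, not Tencent's real message format).
type SignalEvent =
  | { kind: "invite"; roomId: string; from: string }
  | { kind: "cancel"; roomId: string }
  | { kind: "hangup"; roomId: string };

// What the media layer (TRTC in the real system) would be asked to do.
type MediaAction =
  | { action: "joinRoom"; roomId: string }
  | { action: "leaveRoom"; roomId: string }
  | { action: "none" };

// The "doorbell" never touches cameras, microphones, or streams;
// it only maps a signal to a media-layer request.
function routeSignal(event: SignalEvent): MediaAction {
  switch (event.kind) {
    case "invite":
      return { action: "joinRoom", roomId: event.roomId };
    case "hangup":
      return { action: "leaveRoom", roomId: event.roomId };
    case "cancel":
      return { action: "none" };
  }
}
```

Keeping this boundary explicit is what made the rest of the system easier to reason about: either SDK could in principle be swapped without touching the other side.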
Why I Chose PWA Instead of Native Apps
Initially, I considered:
Flutter
React Native
Native Android
But later I realized:
the biggest complexity during MVP development
was not UI development.
The difficult parts were actually:
- WeChat ecosystem integration
- signaling flow
- Android lifecycle behavior
- push notifications
- permissions
- WebRTC compatibility
So eventually I chose:
PWA + APK wrapping
The reason was simple:
iteration speed.
The tablet side only needed:
- fullscreen UI
- microphone
- camera
- notifications
- WebRTC support
Modern Android Chromium environments already support most of these capabilities well enough for an MVP.
The deployment path eventually became:
PWA
↓
TWA / WebView
↓
Android APK
This dramatically reduced iteration overhead.
How the System Actually Works
The final MVP flow became:
Parent presses “Call Child”
↓
WeChat sends request
↓
Backend creates room
↓
IM sends signaling event
↓
Child tablet receives incoming call
↓
Both clients join TRTC room
↓
Video call starts
The backend mainly handles:
- UserSig generation
- device binding
- room management
- permission validation
- message routing
Meanwhile, the tablet side handles:
- incoming call UI
- camera
- microphone
- video rendering
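The backend's call-setup step can be sketched as a single function: bind a fresh room to the call, sign both users, and produce the signaling payload the IM layer should deliver. Everything here is illustrative; in particular, `signUser` is a placeholder for Tencent's real server-side UserSig generation, which must come from their library, never from a stub like this.

```typescript
interface CallSetup {
  roomId: string;
  callerSig: string;
  calleeSig: string;
  signal: { kind: "invite"; roomId: string; from: string };
}

let roomCounter = 0;

// Placeholder only: a real implementation delegates to the vendor's
// server-side UserSig library with the app's secret key.
function signUser(userId: string): string {
  return `sig-${userId}`;
}

// One call setup = one room + credentials + one signaling event.
function setupCall(caller: string, callee: string): CallSetup {
  const roomId = `room-${++roomCounter}`;
  return {
    roomId,
    callerSig: signUser(caller),
    calleeSig: signUser(callee),
    signal: { kind: "invite", roomId, from: caller },
  };
}
```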
The Hardest Part Wasn’t the UI
One thing became very obvious during development:
the hardest part was not drawing interfaces.
The difficult part was:
- Android background behavior
- PWA lifecycle limitations
- WebRTC compatibility
- notification reliability
- permission handling
For example:
Can a PWA reliably wake up for incoming calls?
The answer turned out to be:
only partially.
Different Android vendors behave differently:
- different WebView versions
- different Chromium implementations
- different battery policies
- different background restrictions
Eventually, it became clear that:
pure PWA architecture was not stable enough.
So the system gradually evolved toward:
PWA
+
APK wrapping
+
native notification bridge
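The bridge fallback above can be sketched as a two-step delivery attempt. The interface and names here are assumptions: in the wrapped APK, the native side would inject something like an `AndroidBridge` object into the WebView, and the web layer would treat it as a second notification channel.

```typescript
// Anything that can attempt to show an incoming-call notification.
// Returns true if delivery succeeded.
interface NotifyChannel {
  notify(title: string): boolean;
}

// Try the web notification path first; if the vendor's background or
// battery policy blocked it, fall back to the injected native channel.
function notifyIncomingCall(
  web: NotifyChannel,
  native: NotifyChannel,
  title: string
): "web" | "native" | "failed" {
  if (web.notify(title)) return "web";
  if (native.notify(title)) return "native";
  return "failed";
}
```

The useful property is that the call flow never needs to know which path fired; it only learns whether the doorbell rang at all.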
At that point, the project stopped feeling like “web development” and started feeling more like:
cross-system orchestration.
Codex and Architecture-First Vibe Coding
Another interesting part of this project was the development workflow itself.
Instead of building everything manually from scratch, I experimented with what I’d call:
Architecture-first Vibe Coding
The workflow looked something like this:
Read Tencent Cloud documentation
↓
Describe the system goal to Codex
↓
Generate architecture ideas
↓
Analyze SDK integration paths
↓
Generate PWA structure
↓
Debug WebRTC behavior
↓
Adjust Android lifecycle handling
↓
Iterate continuously
During this process, I realized something important:
AI’s biggest value was not “writing code.”
Its biggest value was:
accelerating architectural exploration.
Most real problems were actually about:
- whether the system design made sense
- whether SDKs could cooperate
- whether lifecycle assumptions were valid
- whether ecosystem limitations existed
rather than about syntax itself.
What the MVP Finally Achieved
At its current stage, the MVP can already:
- launch calls from WeChat
- notify the child tablet
- establish real-time video communication
- synchronize signaling through IM
- run on Android tablets
- operate through a simplified interaction flow
Most importantly:
it finally started feeling like a product,
instead of a collection of SDK demos.
Final Thoughts
This project gradually changed the way I think about modern software development.
What became interesting was no longer:
“Can AI write code?”
The more interesting question became:
“Can AI accelerate system-level reasoning?”
Because increasingly, development feels less like:
typing individual functions
and more like:
organizing ecosystems,
SDKs,
AI systems,
and human intent
into a coherent structure.
And perhaps that is becoming one of the most important skills in the AI era.