In-app screen capture as a mob plugin: the device captures its own screen,
hardware-encodes it to H264 on-device, and hands Annex-B NAL units to the BEAM.
This is the in-app replacement for sloppy_joe's host-side adb screenrecord
(SloppyJoe.Media.Capture.Adb) — so a NAT'd phone can publish its screen with no adb
and no host on the same network. Follows the mob_camera frame-streaming template.
Pairs with sloppy_joe's WebRTC device view: the H264 NALs drop straight into
SloppyJoe.Media.Capture.H264 (split + FU-A payload), the existing RTP path.
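
To make the "split + FU-A payload" step concrete, here is a minimal sketch of splitting an Annex-B buffer on its start codes and fragmenting an oversized NAL unit into FU-A payloads (RFC 6184, FU type 28). The module `AnnexBSketch`, its function names, and the `max` payload budget are hypothetical and for illustration only; this is not the API of `SloppyJoe.Media.Capture.H264`.

```elixir
defmodule AnnexBSketch do
  # Hypothetical helper for illustration; not SloppyJoe.Media.Capture.H264.

  # Split an Annex-B buffer into NAL units, dropping 3- and 4-byte start codes.
  def split(buffer) when is_binary(buffer) do
    buffer
    |> :binary.split([<<0, 0, 0, 1>>, <<0, 0, 1>>], [:global])
    |> Enum.reject(&(&1 == <<>>))
  end

  # Turn one NAL unit into RTP payloads of at most `max` bytes each.
  # A NAL that fits is passed through unchanged (single NAL unit packet).
  def packetize(nal, max) when max > 2 and byte_size(nal) <= max, do: [nal]

  # A larger NAL is fragmented: each FU-A packet carries a 1-byte FU indicator
  # (original F and NRI bits, type 28) and a 1-byte FU header (start bit,
  # end bit, reserved bit, original NAL type), followed by a slice of the body.
  def packetize(<<f::1, nri::2, type::5, body::binary>>, max) when max > 2 do
    chunks = chunk(body, max - 2)
    last = length(chunks) - 1

    chunks
    |> Enum.with_index()
    |> Enum.map(fn {part, i} ->
      s = if i == 0, do: 1, else: 0
      e = if i == last, do: 1, else: 0
      <<f::1, nri::2, 28::5, s::1, e::1, 0::1, type::5, part::binary>>
    end)
  end

  # Cut a binary into slices of at most `size` bytes.
  defp chunk(<<>>, _size), do: []
  defp chunk(bin, size) when byte_size(bin) <= size, do: [bin]

  defp chunk(bin, size) do
    <<head::binary-size(size), rest::binary>> = bin
    [head | chunk(rest, size)]
  end
end
```

An access unit delivered by the plugin would go through `split/1` first, then each NAL through `packetize/2`. The pass-through clause matters because a FU-A packet must not carry both the start and end bits, so a NAL that already fits is never wrapped in a single fragment.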
Stages
- [x] 1a — package (Elixir + manifest).
  `mix.exs` (path-dep mob), `lib/mob_screencast.ex` (`start_stream`/`stop_stream`/`request_keyframe`, delivers `{:screencast, :frame, %{bytes, format: :h264, keyframe, …}}`), `src/mob_screencast_nif.erl` (3 stubs), `priv/mob_plugin.exs`. Compiles vs local mob; manifest validates + classifies tier 1.
- [~] 1b — native capture + encode (the bulk).
  - [x] Android code written (`priv/native/jni/mob_screencast_nif.zig` + `priv/native/android/MobScreencastBridge.kt`): zig NIF mirrors the device-proven mob_camera pattern (nativeRegister + 3 NIFs + `nativeDeliverScreencastFrame` → `{:screencast, :frame, %{bytes, …, keyframe}}`); Kotlin bridge = `MediaProjection` (consent via a headless `ScreencastConsentFragment`) → `MediaCodec` AVC encoder (surface input) ← `VirtualDisplay`, a drain thread that prepends SPS/PPS to keyframes and pushes Annex-B access units. `zig ast-check` clean; manifest tier 1. Targets API ≤ 33 (Moto G is API 30) on the assumption that it runs without the foreground service; the device run later disproved this (see the foreground-service fix under the device build).
  - [~] Android device build + partial verify (mob_plugin_demo host, Moto G ZY22DP6HFL):
    - [x] `--native` build merges the zig NIF + Kotlin bridge, compiles + links + deploys (`mix mob.plugins` shows it tier 1, vetting clean). Two real bugs found + fixed by the build/run: zig comptime atom (`erts.atom` needs a comptime string) + Kotlin main-thread fragment launch (consent dialog must post to the main thread).
    - [x] On-device: dist RPC confirmed `MobScreencast` + the NIF load, and `start_stream/2` is callable (collector launched on the Moto G).
    - [x] Consent → projection → encoder pipeline EXECUTES on the Moto G. The MediaProjection consent dialog appears ("Start recording or casting with MobPluginDemo? … Start now"), and tapping Start now grants it + runs `onProjectionResult` (proven by a crash trace there before the guard landed). FOUR bugs found + fixed by the device build/run: zig comptime atom; main-thread; the host is a Compose ComponentActivity not a FragmentActivity (use ActivityResultRegistry like mob_camera); `onProjectionResult` crashed unguarded on the encoder setup (now wrapped; integer I-frame interval).
    - [x] FRAMES FLOW END-TO-END ON THE MOTO G (2026-06-10). `capturing 358x720 @ 1000000bps`; the collector saw the `{:screencast, :frame, …}` count climb `{55,true}` → `{75,true}` (~10 fps, keyframes prefixed with SPS/PPS), `ERR=:none`, `screencast_stop_stream → :ok`. Two more real bugs found + fixed to get here:
      - Stale binary / driver_tab: the running app reported `:nif_not_loaded` for screencast while camera loaded — the installed `.so`'s compiled `driver_tab` predated the screencast row (the regen happened after that build). A clean Android-only `mix mob.deploy --native --device ZY22DP6HFL` rebuilt it and `screencast_request_keyframe` returned `:ok`. (The dual-platform build fails on the not-yet-written iOS `.m`; the `--device <android>` form builds Android only.)
      - Foreground service required even on API 30: `getMediaProjection` threw `SecurityException: Media projections require a foreground service of type … MEDIA_PROJECTION`. The "API ≤ 33 runs directly" assumption was wrong on this OEM build. Fixed: `ScreencastService` (foreground service, `FOREGROUND_SERVICE_TYPE_MEDIA_PROJECTION`, folded into bridge_kt since the merge copies only one Kotlin file) + a `<service android:foregroundServiceType="mediaProjection">` in the host AndroidManifest; `onProjectionResult` now stashes the consent result and starts the service, which foregrounds itself then calls `beginCaptureFromService` to obtain the projection.
  - [~] iOS (`priv/native/ios/mob_screencast_nif.m`): WRITTEN. `RPScreenRecorder` (in-app `startCaptureWithHandler`, per-session consent, mic off) → `VideoToolbox` `VTCompressionSession` (H264, constrained-baseline, no B-frames, realtime) → AVCC→Annex-B conversion in the VT output callback (SPS/PPS from the format description prepended to keyframes) → `enif_send` of the SAME `{:screencast, :frame, %{bytes, width, height, format: :h264, timestamp_ms, keyframe}}` map the Android NIF emits. Consent outcome → `{:screencast, :permission, :granted | :denied}` from the completion handler. All VT lifecycle is serialized on a single `io.mob.screencast.session` queue (sample buffers CFRetain'd across the async hop); `request_keyframe` sets a force-IDR flag for the next encode. `ERL_NIF_INIT(mob_screencast_nif, …)`.
    - [x] Compiles + static-links (verified isolated, not yet in a full mob build): `-fsyntax-only` clean against iPhoneOS26.4 SDK; `-c` with mob's exact iOS NIF flags (`-fobjc-arc -fmodules -DSTATIC_ERLANG_NIF -DSTATIC_ERLANG_NIF_LIBNAME=mob_screencast_nif`) exits 0 and produces the `_mob_screencast_nif_nif_init` symbol the iOS driver_tab references.
    - [ ] Full mob iOS build + on-device verify (iPhone — ReplayKit screen capture doesn't work on the simulator). Needs re-enabling `:mob_screencast` in the host mob.exs (now safe: the `.m` exists + compiles, so the dual-platform build no longer fails at `mob_screencast_nif.o`) and a device. Drive `start_stream` → ReplayKit consent → confirm `{:screencast, :frame, …}` H264 flows + a keyframe carries SPS/PPS. NOTE: `max_size` is NOT yet honored on iOS (encodes at native screen res; bitrate is still capped by the encoder's AverageBitRate, so output size is fine — quality/CPU only). A parallel iOS agent shares this device + mob.exs, so coordinate.
- [ ] 2 — sloppy_joe integration (architecture fork; downstream of the plugin).
The device BEAM has H264; getting it to the browser is the decision:
  - A. Carrier relay — the device ships NALs over its existing `/device` dial-out (DeviceLink); a new host-side `SloppyJoe.Media.Capture.MobScreencast` receives them and feeds the host Publisher's `send_rtp`. Reuses the whole stack; reaches NAT'd phones; server relays media (not pure P2P). Smallest lift.
  - B. On-device WebRTC — the device runs the WebRTC publisher itself (ex_webrtc on mob), P2P to the browser via STUN/TURN, server signaling-only. Truest decoupling; large native lift (SRTP/ICE on device).

  Recommendation: A first (it's mostly wiring + reuses everything), B as a later option.
- [ ] 3 — device-verify Moto G ZY22DP6HFL + iPhone: `start_stream` → frames decode; foreground-service consent flow; parity with `Capture.Adb` output.
- [ ] 4 — tests, docs, `mix mob.plugin.sign`, CHANGELOG, mob_new wizard opt-in.
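
The device checks above (frame count climbing, keyframes carrying SPS/PPS) can be approximated with a small collector. This is a sketch under assumptions: `CollectorSketch` is a hypothetical module, the frame map is assumed to use the atom keys `:bytes` and `:keyframe`, and the caller is assumed to be the process that receives the plugin's messages. The call that starts the stream is left out because its arguments are not shown here.

```elixir
defmodule CollectorSketch do
  # Hypothetical collector for illustration; not part of mob_screencast.

  # Drain {:screencast, :frame, map} messages from the caller's mailbox until
  # `idle_ms` milliseconds pass without a frame, then return the counters.
  def collect(idle_ms, acc \\ %{frames: 0, keyframes: 0, keyframes_with_sps: 0}) do
    receive do
      {:screencast, :frame, %{bytes: bytes, keyframe: key?}} ->
        collect(idle_ms, tally(acc, bytes, key?))
    after
      idle_ms -> acc
    end
  end

  # Keyframes are also checked for a leading SPS NAL.
  defp tally(acc, bytes, true) do
    sps = if starts_with_sps?(bytes), do: 1, else: 0

    %{
      acc
      | frames: acc.frames + 1,
        keyframes: acc.keyframes + 1,
        keyframes_with_sps: acc.keyframes_with_sps + sps
    }
  end

  defp tally(acc, _bytes, _not_key), do: %{acc | frames: acc.frames + 1}

  # True when the access unit begins with an Annex-B start code followed by a
  # NAL whose 5-bit type field is 7 (sequence parameter set).
  def starts_with_sps?(<<0, 0, 0, 1, _::3, 7::5, _::binary>>), do: true
  def starts_with_sps?(<<0, 0, 1, _::3, 7::5, _::binary>>), do: true
  def starts_with_sps?(_other), do: false
end
```

After a capture session, `CollectorSketch.collect(500)` returns the counters. If `keyframes` and `keyframes_with_sps` differ, some keyframe arrived without its parameter sets at the front.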
Known gaps
- AndroidManifest fragment (foreground service): a `MediaProjection` capture must run inside `<service android:foregroundServiceType="mediaProjection">`, which the plugin manifest can't yet contribute (`apply_plugin_android_manifest!` merges only `<uses-permission>`, not `<service>` — identical class to `mob_camera`'s FileProvider/uses-feature gap). WORKED AROUND for the device verify by carrying the `<service android:name="io.mob.screencast.ScreencastService" … foregroundServiceType="mediaProjection">` in the host (mob_plugin_demo) AndroidManifest; the plugin contributes the `FOREGROUND_SERVICE*` `<uses-permission>` entries. Stage-2 decision still open: add a manifest-fragment capability to the plugin system so the `<service>` ships with the plugin, instead of every host having to declare it.
- Per-session consent UX: both platforms prompt the user each capture session. For an unattended emulator that's a one-time tap; for a phone it's per session by OS policy.
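
Because consent is asked on every session, a caller may want to wait for the outcome before expecting frames. The sketch below assumes the `{:screencast, :permission, :granted | :denied}` message described for the iOS NIF is delivered to the calling process. Whether Android sends the same message is an assumption, and `ConsentSketch` is a hypothetical module.

```elixir
defmodule ConsentSketch do
  # Hypothetical helper for illustration; not part of mob_screencast.

  # Wait up to `timeout_ms` for the consent outcome. Returns :granted or
  # :denied when the message arrives, :timeout if the user never answers.
  def await(timeout_ms) do
    receive do
      {:screencast, :permission, status} when status in [:granted, :denied] ->
        status
    after
      timeout_ms -> :timeout
    end
  end
end
```

A `:timeout` result can be treated like `:denied` for an unattended run, since nobody is present to tap the dialog.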