SkillHub

virtual-remote-desktop

v1.0.3

KasmVNC-based virtual desktop for headless Linux with AI-first automation and human handoff. Use when most steps are automated but a user must manually intervene for captcha/risk-control/login approval, then return to automation. Includes requirement-driven setup for mobile/desktop takeover and brow...

Sourced from ClawHub, Authored by zhangxin15435

Installation

Please help me install the skill `virtual-remote-desktop` from SkillHub official store. npx skills add zhangxin15435/virtual-remote-desktop

Virtual Remote Desktop (KasmVNC edition)

What this skill is for

Use this when the workflow is: 1. AI runs browser automation most of the time 2. Captcha / risk-control / MFA appears 3. User takes over remotely for 1-3 minutes 4. AI continues automatically

This version replaces x11vnc+noVNC with KasmVNC and keeps computer-use style action scripts for AI control.

Core commands

0) Install KasmVNC (one-time)

bash /home/ubuntu/.openclaw/workspace/skills/virtual-remote-desktop/scripts/install_kasmvnc.sh

This installer also prepares required runtime tools: - fluxbox (lightweight desktop) - xdotool + scrot (computer-use actions) - xauth

Before starting, always confirm these user requirements: 1. takeover device: 手机 or 电脑 2. website render mode: 手机网页 or 桌面网页 3. access mode: 本地隧道 (127.0.0.1) or 临时公网 (0.0.0.0) 4. network quality: 弱网 / 普通 / 良好

Use guided script (interactive Q&A):

bash /home/ubuntu/.openclaw/workspace/skills/virtual-remote-desktop/scripts/start_vrd_guided.sh

Preview config without starting:

bash /home/ubuntu/.openclaw/workspace/skills/virtual-remote-desktop/scripts/start_vrd_guided.sh --dry-run

1.1) Direct start (manual env)

bash /home/ubuntu/.openclaw/workspace/skills/virtual-remote-desktop/scripts/start_vrd.sh

Important env vars: - AUTO_LAUNCH_URL (optional): open target page automatically - KASM_BIND (default 127.0.0.1, safer) - AUTO_STOP_IDLE_SECS (default 900) - BROWSER_MOBILE_MODE=1 (launch browser with mobile emulation) - BROWSER_DEVICE=iphone14pro|pixel7|ipad

Example:

AUTO_LAUNCH_URL="https://example.com/login" 
AUTO_STOP_IDLE_SECS=1200 
bash /home/ubuntu/.openclaw/workspace/skills/virtual-remote-desktop/scripts/start_vrd.sh

Mobile-friendly VNC stream example (better phone takeover UX):

MOBILE_MODE=1 MOBILE_PRESET=phone 
AUTO_STOP_IDLE_SECS=900 
bash /home/ubuntu/.openclaw/workspace/skills/virtual-remote-desktop/scripts/start_vrd.sh

Mobile stream options: - MOBILE_MODE=1 enables mobile defaults - MOBILE_PRESET=phone|tablet sets default resolution (960x540 / 1280x720) - KASM_MAX_FPS can be lowered further (e.g. 18) on weak networks

Browser mobile emulation (website renders as mobile page):

AUTO_LAUNCH_URL="https://example.com" 
BROWSER_MOBILE_MODE=1 BROWSER_DEVICE=iphone14pro 
bash /home/ubuntu/.openclaw/workspace/skills/virtual-remote-desktop/scripts/start_vrd.sh

Notes: - Browser mobile emulation changes UA + viewport + touch behavior. - It is different from MOBILE_MODE (which optimizes VNC stream size).

2) Check status / health

bash /home/ubuntu/.openclaw/workspace/skills/virtual-remote-desktop/scripts/status_vrd.sh
bash /home/ubuntu/.openclaw/workspace/skills/virtual-remote-desktop/scripts/health_vrd.sh

3) Stop desktop

bash /home/ubuntu/.openclaw/workspace/skills/virtual-remote-desktop/scripts/stop_vrd.sh

AI automation actions (computer-use style)

All actions run on the active virtual display from pids.env.

# screenshot (base64)
bash scripts/action_screenshot.sh

# click / type / key / scroll
bash scripts/action_click.sh 500 420 left
bash scripts/action_type.sh "hello"
bash scripts/action_key.sh "ctrl+l"
bash scripts/action_scroll.sh down 4

# helpers
bash scripts/action_mouse_move.sh 800 300
bash scripts/action_cursor_position.sh
bash scripts/action_wait.sh 2

Recommended loop: 1. action_screenshot.sh 2. analyze 3. action_click/type/key/... 4. screenshot verify 5. repeat


Best handoff pattern (AI ↔ User)

When captcha/risk-control appears: 1. AI sends user the KasmVNC URL + username/password 2. User manually solves challenge 3. User replies “done” 4. AI runs screenshot + validation step 5. AI resumes automation

This avoids full manual operation while keeping recovery fast.


UX presets

  • KASM_BIND=127.0.0.1
  • user connects via SSH tunnel
  • best for security

Preset B: temporary public takeover

  • KASM_BIND=0.0.0.0
  • short AUTO_STOP_IDLE_SECS (e.g. 300)
  • use only for urgent remote intervention

Notes

  • KasmVNC uses HTTPS + per-user auth (username/password).
  • This skill stores runtime files in ~/.openclaw/vrd-data by default.
  • Browser profile persistence is controlled by CHROME_PROFILE_DIR.
  • If captcha frequency is high, keep one long-lived profile to reduce repeated challenges.