SkillHub

whatsapp-ultimate

v4.0.1

You put 5 agents in a WhatsApp group. They all respond at once. Your API bill does a backflip. Protocol v2 fixes that — congestion control, conversation lifecycle, and budget-aware scheduling. Agents that know when to talk, when to shut up, and when to burn your unused tokens before reset.

Sourced from ClawHub, Authored by Oscar Serra

Installation

Please help me install the skill `whatsapp-ultimate` from SkillHub official store. npx skills add globalcaos/whatsapp-ultimate

WhatsApp Ultimate

Everything you can do in WhatsApp, your AI agent can do too.

This skill documents all WhatsApp capabilities available through OpenClaw's native channel integration. No external Docker services, no CLI wrappers — just direct WhatsApp Web protocol via Baileys.


Prerequisites

  • OpenClaw with WhatsApp channel configured
  • WhatsApp account linked via QR code (openclaw whatsapp login)

Capabilities Overview

Category Features
Messaging Text, media, polls, stickers, voice notes, GIFs
Interactions Reactions, replies/quotes, edit, unsend
Groups Create, rename, icon, description, participants, admin, invite links
History Full-text search, vCard contact extraction with phone numbers

Total: 22 distinct actions


Messaging

Send Text

message action=send channel=whatsapp to="+34612345678" message="Hello!"

Send Media (Image/Video/Document)

message action=send channel=whatsapp to="+34612345678" message="Check this out" filePath=/path/to/image.jpg

Supported: JPG, PNG, GIF, MP4, PDF, DOC, etc.

Send Poll

message action=poll channel=whatsapp to="+34612345678" pollQuestion="What time?" pollOption=["3pm", "4pm", "5pm"]

Send Sticker

message action=sticker channel=whatsapp to="+34612345678" filePath=/path/to/sticker.webp

Must be WebP format, ideally 512x512.

Send Voice Note

message action=send channel=whatsapp to="+34612345678" filePath=/path/to/audio.ogg asVoice=true

Critical: Use OGG/Opus format for WhatsApp voice notes. MP3 may not play correctly.

Send GIF

message action=send channel=whatsapp to="+34612345678" filePath=/path/to/animation.mp4 gifPlayback=true

Convert GIF to MP4 first (WhatsApp requires this):

ffmpeg -i input.gif -movflags faststart -pix_fmt yuv420p -vf "scale=trunc(iw/2)*2:trunc(ih/2)*2" output.mp4 -y

Interactions

Add Reaction

message action=react channel=whatsapp chatJid="[email protected]" messageId="ABC123" emoji="🚀"

Remove Reaction

message action=react channel=whatsapp chatJid="[email protected]" messageId="ABC123" remove=true

Reply/Quote Message

message action=reply channel=whatsapp to="[email protected]" replyTo="QUOTED_MSG_ID" message="Replying to this!"

Edit Message (Own Messages Only)

message action=edit channel=whatsapp chatJid="[email protected]" messageId="ABC123" message="Updated text"

Unsend/Delete Message

message action=unsend channel=whatsapp chatJid="[email protected]" messageId="ABC123"

Group Management

Create Group

message action=group-create channel=whatsapp name="Project Team" participants=["+34612345678", "+34687654321"]

Rename Group

message action=renameGroup channel=whatsapp groupId="[email protected]" name="New Name"

Set Group Icon

message action=setGroupIcon channel=whatsapp groupId="[email protected]" filePath=/path/to/icon.jpg

Set Group Description

message action=setGroupDescription channel=whatsapp groupJid="[email protected]" description="Team chat for Q1 project"

Add Participant

message action=addParticipant channel=whatsapp groupId="[email protected]" participant="+34612345678"

Remove Participant

message action=removeParticipant channel=whatsapp groupId="[email protected]" participant="+34612345678"

Promote to Admin

message action=promoteParticipant channel=whatsapp groupJid="[email protected]" participants=["+34612345678"]

Demote from Admin

message action=demoteParticipant channel=whatsapp groupJid="[email protected]" participants=["+34612345678"]

Leave Group

message action=leaveGroup channel=whatsapp groupId="[email protected]"
message action=getInviteCode channel=whatsapp groupJid="[email protected]"

Returns: https://chat.whatsapp.com/XXXXX

message action=revokeInviteCode channel=whatsapp groupJid="[email protected]"

Get Group Info

message action=getGroupInfo channel=whatsapp groupJid="[email protected]"

Returns: name, description, participants, admins, creation date.


JID Formats

WhatsApp uses JIDs (Jabber IDs) internally:

Type Format Example
Individual <number>@s.whatsapp.net [email protected]
Group <id>@g.us [email protected]

When using to= with phone numbers, OpenClaw auto-converts to JID format.


Tips

Voice Notes

Always use OGG/Opus format:

ffmpeg -i input.wav -c:a libopus -b:a 64k output.ogg

Stickers

Convert images to WebP stickers:

ffmpeg -i input.png -vf "scale=512:512:force_original_aspect_ratio=decrease,pad=512:512:(ow-iw)/2:(oh-ih)/2:color=0x00000000" output.webp

Rate Limits

WhatsApp has anti-spam measures. Avoid: - Bulk messaging to many contacts - Rapid-fire messages - Messages to contacts who haven't messaged you first

Message IDs

To react/edit/unsend, you need the message ID. Incoming messages include this in the event payload. For your own sent messages, the send response includes the ID.


Comparison with Other Skills

Feature whatsapp-ultimate wacli whatsapp-automation gif-whatsapp
Native integration ❌ (CLI) ❌ (Docker) N/A
Send text
Send media
Polls
Stickers
Voice notes
GIFs
Reactions
Reply/Quote
Edit
Unsend
Group create
Group management ✅ (full)
Receive messages
Two-way chat
External deps None Go binary Docker + WAHA ffmpeg


Protocol v2: Multi-Agent Discussions

Version 4.0 introduces a complete multi-agent framework for WhatsApp group chats. Multiple AI agents can discuss topics, debate ideas, and converge on conclusions — all with built-in safeguards against message explosion.

Agent Identity

Each agent gets its own personality, icon, and (optionally) model:

channels:
  whatsapp:
    agentIcon: "🤖"          # single-agent icon prefix
    turnEndMarker: "⚡"       # end-of-turn marker in 1:1 chats
    multiAgent:
      mainAgentId: "jarvis"
      agents:
        jarvis:
          id: "jarvis"
          name: "Jarvis"
          icon: "🤖"
        luna:
          id: "luna"
          name: "Luna"
          icon: "🌙"
          model: "sonnet"
        rex:
          id: "rex"
          name: "Rex"
          icon: "🦖"
          model: "haiku"

Agent personalities live in the workspace:

workspace/
├── SOUL.md                  # main agent
├── agents/
│   ├── luna/SOUL.md         # Luna's personality
│   └── rex/SOUL.md          # Rex's personality

Intra-Agent Chats

Register WhatsApp groups where agents discuss freely (no trigger prefix needed):

      intraAgentChats:
        brainstorm:
          chatId: "[email protected]"
          participants: ["jarvis", "luna", "rex"]
          owner: "oscar"
          mode: "broadcast"        # broadcast | addressed | round-robin

Routing modes: - broadcast — all agents respond (with congestion control) - addressed — only respond when mentioned by name ("Luna, what do you think?") - round-robin — structured turn-taking

Congestion Control (Exponential Courtesy Protocol)

Prevents N agents from all responding simultaneously:

      congestion:
        enabled: true
        baseDelayFactor: 150     # ms × agentCount² base delay
        maxDelay: 30000          # 30s cap
        backpressureThreshold: 1.5  # slow down over-talkers
        windowMs: 60000          # 60s sliding window

How it works: - Base delay scales quadratically with agent count (2 agents ≈ 600ms, 5 agents ≈ 3750ms) - Random jitter prevents synchronization - Agents talking more than their fair share get 2× delay penalty - If another agent posts during your wait, restart the timer (yield-on-collision)

Conversation Lifecycle

Agents detect when discussions go stale and know when to wrap up:

      lifecycle:
        stalenessWindow: 5        # compare last N messages
        stalenessThreshold: 0.85  # cosine similarity trigger
        maxTurnsPerObjective: 30  # hard cap
        autoClose: true

Features: - Staleness detection — cosine similarity of message embeddings detects circular discussions - Agreement loop detection — catches "I agree" / "Good point" / "Exactly" loops - Topic steering — one agent claims pivot role to redirect conversation - Objective tracking — set goals, track completion, auto-close with summary - Closure protocol — propose → ack → converge (all agents must agree)

Budget-Aware Scheduling

Adjusts conversation depth based on API usage and reset timing:

      budget:
        provider: "anthropic"
        windowDays: 7
        burnModeEnabled: true
        burnTriggerHours: 24     # hours before reset
        burnUsageThreshold: 0.20 # usage below 20%

Four modes:

Mode When Congestion Staleness Max Turns Tangents
Conservative >85% used 2× slower 0.80 ½ No
Moderate 60-85% Normal 0.85 Normal No
Aggressive <60% 0.7× faster 0.85 Normal Yes
Burn <20% used, <24h to reset 0.3× faster 0.95 Encouraged

Burn mode philosophy: unused tokens expire at reset. Better to have emergent agent-agent discussions than waste the budget.

DM Trigger Prefix

Protocol v2 extends triggerPrefix to DMs (previously groups only):

  • Owner — always bypasses triggerPrefix
  • Authorized contacts — must start message with prefix (e.g., "Jarvis, help me with...")
  • Intra-agent chats — bypass triggerPrefix entirely

Turn-End Marker

In 1:1 chats (selfChat or owner-only DM), append a visual marker to signal turn completion:

channels:
  whatsapp:
    turnEndMarker: "⚡"

4.0.0

  • Protocol v2: Multi-agent discussions with configurable routing (broadcast/addressed/round-robin)
  • Added: Congestion control — Exponential Courtesy Protocol prevents message explosion in multi-agent chats
  • Added: Conversation lifecycle — staleness detection, agreement loop detection, topic steering, objective tracking, closure protocol
  • Added: Budget-aware scheduling — four spending modes including burn mode for pre-reset token usage
  • Added: Agent identity system — per-agent SOUL.md, icons, names, model overrides
  • Added: DM triggerPrefix gating — non-owner contacts must use prefix in DMs
  • Added: Turn-end marker (⚡) for 1:1 chats
  • Added: agentIcon config for outbound message prefixing

3.7.0

  • Added: vCard phone number extraction — contact messages now return structured vcard field with names and phone numbers
  • Added: contactsArrayMessage support — multi-contact shares are now parsed
  • Improved: New contact messages store phone numbers in text_content for full-text search (e.g. search by phone number)
  • Improved: raw_json now included in search results for contact-type messages, enabling vCard extraction from historical data

3.4.0

  • Fixed: Chat search now resolves LID/JID aliases — searching by chat name finds messages across both @lid and @s.whatsapp.net JID formats
  • Added: resolveChatJids() cross-references chats, contacts, and messages tables to discover all JID aliases for a given chat filter
  • Improved: Search falls back to original LIKE behaviour if no JIDs resolve, so no regressions

3.0.0

Your Agent
    ↓
OpenClaw message tool
    ↓
WhatsApp Channel Plugin
    ↓
Baileys (WhatsApp Web Protocol)
    ↓
WhatsApp Servers

No external services. No Docker. No CLI tools. Direct protocol integration.


Pairs Well With

  • smart-model-router — auto-select the right model per agent role (creative → Sonnet, analyst → Haiku, devil's advocate → GPT)
  • agent-superpowers — verification iron law and three-agent review for when your multi-agent discussions produce code
  • subagent-overseer — monitor agent sessions without burning tokens on polling loops

👉 https://github.com/globalcaos/tinkerclaw

Clone it. Fork it. Break it. Make it yours.


License

MIT — Part of OpenClaw


  • OpenClaw: https://github.com/openclaw/openclaw
  • Baileys: https://github.com/WhiskeySockets/Baileys
  • ClawHub: https://clawhub.com