whatsapp-ultimate
v4.0.1You put 5 agents in a WhatsApp group. They all respond at once. Your API bill does a backflip. Protocol v2 fixes that — congestion control, conversation lifecycle, and budget-aware scheduling. Agents that know when to talk, when to shut up, and when to burn your unused tokens before reset.
Installation
WhatsApp Ultimate
Everything you can do in WhatsApp, your AI agent can do too.
This skill documents all WhatsApp capabilities available through OpenClaw's native channel integration. No external Docker services, no CLI wrappers — just direct WhatsApp Web protocol via Baileys.
Prerequisites
- OpenClaw with WhatsApp channel configured
- WhatsApp account linked via QR code (
openclaw whatsapp login)
Capabilities Overview
| Category | Features |
|---|---|
| Messaging | Text, media, polls, stickers, voice notes, GIFs |
| Interactions | Reactions, replies/quotes, edit, unsend |
| Groups | Create, rename, icon, description, participants, admin, invite links |
| History | Full-text search, vCard contact extraction with phone numbers |
Total: 22 distinct actions
Messaging
Send Text
message action=send channel=whatsapp to="+34612345678" message="Hello!"
Send Media (Image/Video/Document)
message action=send channel=whatsapp to="+34612345678" message="Check this out" filePath=/path/to/image.jpg
Supported: JPG, PNG, GIF, MP4, PDF, DOC, etc.
Send Poll
message action=poll channel=whatsapp to="+34612345678" pollQuestion="What time?" pollOption=["3pm", "4pm", "5pm"]
Send Sticker
message action=sticker channel=whatsapp to="+34612345678" filePath=/path/to/sticker.webp
Must be WebP format, ideally 512x512.
Send Voice Note
message action=send channel=whatsapp to="+34612345678" filePath=/path/to/audio.ogg asVoice=true
Critical: Use OGG/Opus format for WhatsApp voice notes. MP3 may not play correctly.
Send GIF
message action=send channel=whatsapp to="+34612345678" filePath=/path/to/animation.mp4 gifPlayback=true
Convert GIF to MP4 first (WhatsApp requires this):
ffmpeg -i input.gif -movflags faststart -pix_fmt yuv420p -vf "scale=trunc(iw/2)*2:trunc(ih/2)*2" output.mp4 -y
Interactions
Add Reaction
message action=react channel=whatsapp chatJid="[email protected]" messageId="ABC123" emoji="🚀"
Remove Reaction
message action=react channel=whatsapp chatJid="[email protected]" messageId="ABC123" remove=true
Reply/Quote Message
message action=reply channel=whatsapp to="[email protected]" replyTo="QUOTED_MSG_ID" message="Replying to this!"
Edit Message (Own Messages Only)
message action=edit channel=whatsapp chatJid="[email protected]" messageId="ABC123" message="Updated text"
Unsend/Delete Message
message action=unsend channel=whatsapp chatJid="[email protected]" messageId="ABC123"
Group Management
Create Group
message action=group-create channel=whatsapp name="Project Team" participants=["+34612345678", "+34687654321"]
Rename Group
message action=renameGroup channel=whatsapp groupId="[email protected]" name="New Name"
Set Group Icon
message action=setGroupIcon channel=whatsapp groupId="[email protected]" filePath=/path/to/icon.jpg
Set Group Description
message action=setGroupDescription channel=whatsapp groupJid="[email protected]" description="Team chat for Q1 project"
Add Participant
message action=addParticipant channel=whatsapp groupId="[email protected]" participant="+34612345678"
Remove Participant
message action=removeParticipant channel=whatsapp groupId="[email protected]" participant="+34612345678"
Promote to Admin
message action=promoteParticipant channel=whatsapp groupJid="[email protected]" participants=["+34612345678"]
Demote from Admin
message action=demoteParticipant channel=whatsapp groupJid="[email protected]" participants=["+34612345678"]
Leave Group
message action=leaveGroup channel=whatsapp groupId="[email protected]"
Get Invite Link
message action=getInviteCode channel=whatsapp groupJid="[email protected]"
Returns: https://chat.whatsapp.com/XXXXX
Revoke Invite Link
message action=revokeInviteCode channel=whatsapp groupJid="[email protected]"
Get Group Info
message action=getGroupInfo channel=whatsapp groupJid="[email protected]"
Returns: name, description, participants, admins, creation date.
JID Formats
WhatsApp uses JIDs (Jabber IDs) internally:
| Type | Format | Example |
|---|---|---|
| Individual | <number>@s.whatsapp.net |
[email protected] |
| Group | <id>@g.us |
[email protected] |
When using to= with phone numbers, OpenClaw auto-converts to JID format.
Tips
Voice Notes
Always use OGG/Opus format:
ffmpeg -i input.wav -c:a libopus -b:a 64k output.ogg
Stickers
Convert images to WebP stickers:
ffmpeg -i input.png -vf "scale=512:512:force_original_aspect_ratio=decrease,pad=512:512:(ow-iw)/2:(oh-ih)/2:color=0x00000000" output.webp
Rate Limits
WhatsApp has anti-spam measures. Avoid: - Bulk messaging to many contacts - Rapid-fire messages - Messages to contacts who haven't messaged you first
Message IDs
To react/edit/unsend, you need the message ID. Incoming messages include this in the event payload. For your own sent messages, the send response includes the ID.
Comparison with Other Skills
| Feature | whatsapp-ultimate | wacli | whatsapp-automation | gif-whatsapp |
|---|---|---|---|---|
| Native integration | ✅ | ❌ (CLI) | ❌ (Docker) | N/A |
| Send text | ✅ | ✅ | ❌ | ❌ |
| Send media | ✅ | ✅ | ❌ | ❌ |
| Polls | ✅ | ❌ | ❌ | ❌ |
| Stickers | ✅ | ❌ | ❌ | ❌ |
| Voice notes | ✅ | ❌ | ❌ | ❌ |
| GIFs | ✅ | ❌ | ❌ | ✅ |
| Reactions | ✅ | ❌ | ❌ | ❌ |
| Reply/Quote | ✅ | ❌ | ❌ | ❌ |
| Edit | ✅ | ❌ | ❌ | ❌ |
| Unsend | ✅ | ❌ | ❌ | ❌ |
| Group create | ✅ | ❌ | ❌ | ❌ |
| Group management | ✅ (full) | ❌ | ❌ | ❌ |
| Receive messages | ✅ | ✅ | ✅ | ❌ |
| Two-way chat | ✅ | ❌ | ❌ | ❌ |
| External deps | None | Go binary | Docker + WAHA | ffmpeg |
Protocol v2: Multi-Agent Discussions
Version 4.0 introduces a complete multi-agent framework for WhatsApp group chats. Multiple AI agents can discuss topics, debate ideas, and converge on conclusions — all with built-in safeguards against message explosion.
Agent Identity
Each agent gets its own personality, icon, and (optionally) model:
channels:
whatsapp:
agentIcon: "🤖" # single-agent icon prefix
turnEndMarker: "⚡" # end-of-turn marker in 1:1 chats
multiAgent:
mainAgentId: "jarvis"
agents:
jarvis:
id: "jarvis"
name: "Jarvis"
icon: "🤖"
luna:
id: "luna"
name: "Luna"
icon: "🌙"
model: "sonnet"
rex:
id: "rex"
name: "Rex"
icon: "🦖"
model: "haiku"
Agent personalities live in the workspace:
workspace/
├── SOUL.md # main agent
├── agents/
│ ├── luna/SOUL.md # Luna's personality
│ └── rex/SOUL.md # Rex's personality
Intra-Agent Chats
Register WhatsApp groups where agents discuss freely (no trigger prefix needed):
intraAgentChats:
brainstorm:
chatId: "[email protected]"
participants: ["jarvis", "luna", "rex"]
owner: "oscar"
mode: "broadcast" # broadcast | addressed | round-robin
Routing modes: - broadcast — all agents respond (with congestion control) - addressed — only respond when mentioned by name ("Luna, what do you think?") - round-robin — structured turn-taking
Congestion Control (Exponential Courtesy Protocol)
Prevents N agents from all responding simultaneously:
congestion:
enabled: true
baseDelayFactor: 150 # ms × agentCount² base delay
maxDelay: 30000 # 30s cap
backpressureThreshold: 1.5 # slow down over-talkers
windowMs: 60000 # 60s sliding window
How it works: - Base delay scales quadratically with agent count (2 agents ≈ 600ms, 5 agents ≈ 3750ms) - Random jitter prevents synchronization - Agents talking more than their fair share get 2× delay penalty - If another agent posts during your wait, restart the timer (yield-on-collision)
Conversation Lifecycle
Agents detect when discussions go stale and know when to wrap up:
lifecycle:
stalenessWindow: 5 # compare last N messages
stalenessThreshold: 0.85 # cosine similarity trigger
maxTurnsPerObjective: 30 # hard cap
autoClose: true
Features: - Staleness detection — cosine similarity of message embeddings detects circular discussions - Agreement loop detection — catches "I agree" / "Good point" / "Exactly" loops - Topic steering — one agent claims pivot role to redirect conversation - Objective tracking — set goals, track completion, auto-close with summary - Closure protocol — propose → ack → converge (all agents must agree)
Budget-Aware Scheduling
Adjusts conversation depth based on API usage and reset timing:
budget:
provider: "anthropic"
windowDays: 7
burnModeEnabled: true
burnTriggerHours: 24 # hours before reset
burnUsageThreshold: 0.20 # usage below 20%
Four modes:
| Mode | When | Congestion | Staleness | Max Turns | Tangents |
|---|---|---|---|---|---|
| Conservative | >85% used | 2× slower | 0.80 | ½ | No |
| Moderate | 60-85% | Normal | 0.85 | Normal | No |
| Aggressive | <60% | 0.7× faster | 0.85 | Normal | Yes |
| Burn | <20% used, <24h to reset | 0.3× faster | 0.95 | 2× | Encouraged |
Burn mode philosophy: unused tokens expire at reset. Better to have emergent agent-agent discussions than waste the budget.
DM Trigger Prefix
Protocol v2 extends triggerPrefix to DMs (previously groups only):
- Owner — always bypasses triggerPrefix
- Authorized contacts — must start message with prefix (e.g., "Jarvis, help me with...")
- Intra-agent chats — bypass triggerPrefix entirely
Turn-End Marker
In 1:1 chats (selfChat or owner-only DM), append a visual marker to signal turn completion:
channels:
whatsapp:
turnEndMarker: "⚡"
4.0.0
- Protocol v2: Multi-agent discussions with configurable routing (broadcast/addressed/round-robin)
- Added: Congestion control — Exponential Courtesy Protocol prevents message explosion in multi-agent chats
- Added: Conversation lifecycle — staleness detection, agreement loop detection, topic steering, objective tracking, closure protocol
- Added: Budget-aware scheduling — four spending modes including burn mode for pre-reset token usage
- Added: Agent identity system — per-agent SOUL.md, icons, names, model overrides
- Added: DM triggerPrefix gating — non-owner contacts must use prefix in DMs
- Added: Turn-end marker (⚡) for 1:1 chats
- Added:
agentIconconfig for outbound message prefixing
3.7.0
- Added: vCard phone number extraction — contact messages now return structured
vcardfield with names and phone numbers - Added:
contactsArrayMessagesupport — multi-contact shares are now parsed - Improved: New contact messages store phone numbers in
text_contentfor full-text search (e.g. search by phone number) - Improved:
raw_jsonnow included in search results for contact-type messages, enabling vCard extraction from historical data
3.4.0
- Fixed: Chat search now resolves LID/JID aliases — searching by chat name finds messages across both
@lidand@s.whatsapp.netJID formats - Added:
resolveChatJids()cross-references chats, contacts, and messages tables to discover all JID aliases for a given chat filter - Improved: Search falls back to original LIKE behaviour if no JIDs resolve, so no regressions
3.0.0
Your Agent
↓
OpenClaw message tool
↓
WhatsApp Channel Plugin
↓
Baileys (WhatsApp Web Protocol)
↓
WhatsApp Servers
No external services. No Docker. No CLI tools. Direct protocol integration.
Pairs Well With
- smart-model-router — auto-select the right model per agent role (creative → Sonnet, analyst → Haiku, devil's advocate → GPT)
- agent-superpowers — verification iron law and three-agent review for when your multi-agent discussions produce code
- subagent-overseer — monitor agent sessions without burning tokens on polling loops
👉 https://github.com/globalcaos/tinkerclaw
Clone it. Fork it. Break it. Make it yours.
License
MIT — Part of OpenClaw
Links
- OpenClaw: https://github.com/openclaw/openclaw
- Baileys: https://github.com/WhiskeySockets/Baileys
- ClawHub: https://clawhub.com