Mixture of Agents (MoA)

TL;DR: Make 3 AI models argue with each other. Get an answer better than any single model. Cost: ~$0.03.

Two Usage Modes

A. Standalone CLI (Node.js)

export OPENROUTER_API_KEY="your-key"
node scripts/moa.js "Your complex question"

B. OpenClaw Skill (Agent-orchestrated)

# Install
clawhub install moa

# Or copy to ~/clawd/skills/moa/

The agent can then invoke MoA for complex analysis tasks.

Origin Story

The concept of "Mixture of Agents" comes from research showing LLMs can improve each other's outputs through collaboration. I built this for VC deal analysis—when evaluating startups, you want multiple perspectives, not one model's opinion.

The journey: 1. Started with 5 free OpenRouter models (Llama, Gemini, Mistral, Qwen, Nemotron) 2. Rate limits killed me at 2am during peak hours 3. Switched to 3 paid frontier specialists 4. Result: ~$0.03/query, answers better than any single model

When to Use

Complex analysis — due diligence, market research, technical evaluation
Brainstorming — get diverse ideas, synthesize the best
Fact-checking — cross-reference across models with different training data
High-stakes decisions — when one model's blind spots could hurt you
Contrarian thinking — different models have different biases

When NOT to use: - Quick Q&A (too slow, 30-90s latency) - Real-time chat (not designed for streaming) - Simple lookups (overkill)

Model Configuration

Paid Tier (Default) — Recommended

Role	Model	~Latency	Strength
Proposer 1	`moonshotai/kimi-k2.5`	23s	Long context, strong reasoning
Proposer 2	`z-ai/glm-5`	36s	Technical depth, different training corpus
Proposer 3	`minimax/minimax-m2.5`	64s	Nuance catching, thorough analysis
Aggregator	`moonshotai/kimi-k2.5`	15s	Fast synthesis

Why these models? - Frontier-class but less congested than GPT-4/Claude - Different training data = genuinely different perspectives - Chinese models excel at certain reasoning tasks - Combined cost still cheaper than single Opus call

Cost breakdown:

3 proposers × ~$0.008 = $0.024
1 aggregator × ~$0.005 = $0.005
─────────────────────────────
Total: ~$0.029/query

Free Tier (Fallback)

5 models: Llama 3.3 70B, Gemini 2.0 Flash, Mistral Small, Nemotron 70B, Qwen 2.5 72B

⚠️ Warning: Free tier hits rate limits during peak hours. Use --free flag only for testing.

How It Works

        ┌─────────────┐
        │   PROMPT    │
        └──────┬──────┘
               │
    ┌──────────┼──────────┐
    ▼          ▼          ▼
┌────────┐ ┌────────┐ ┌────────┐
│Kimi 2.5│ │ GLM 5  │ │MiniMax │  ← Parallel (they "argue")
│(reason)│ │(depth) │ │(nuance)│
└───┬────┘ └───┬────┘ └───┬────┘
    │          │          │
    └──────────┼──────────┘
               ▼
       ┌──────────────┐
       │  AGGREGATOR  │
       │  (Kimi 2.5)  │
       │              │
       │ • Best of 3  │
       │ • Resolve    │
       │   conflicts  │
       │ • Synthesize │
       └──────┬───────┘
              ▼
       ┌──────────────┐
       │ FINAL ANSWER │
       │ (Synthesized)│
       └──────────────┘

API Reference

Function Signature

interface MoAOptions {
  prompt: string;           // Required: The question to analyze
  tier?: 'paid' | 'free';   // Default: 'paid'
}

interface MoAResult {
  synthesis: string;        // The final aggregated answer
}

// Throws on complete failure (all models down, invalid key)
// Returns partial synthesis if 1-2 models fail
async function handle(options: MoAOptions): Promise<string>

CLI Usage

# Paid tier (default)
node scripts/moa.js "Your complex question"

# Free tier
node scripts/moa.js "Your question" --free

Programmatic Usage

const { handle } = require('./scripts/moa.js');

const synthesis = await handle({ 
  prompt: "Analyze the competitive moats in AI code generation",
  tier: 'paid'
});

console.log(synthesis);

Failure Modes

Scenario	Behavior
1 proposer fails	Synthesis from remaining 2 models
2 proposers fail	Synthesis from 1 model (degraded)
All proposers fail	Returns error message
Invalid API key	Immediate error with setup instructions
Rate limit (free tier)	Returns rate limit error

The system is designed to degrade gracefully. A 2/3 response is still valuable.

Example Use Cases

VC Due Diligence

node scripts/moa.js "Analyze the competitive landscape for AI code generation. 
Who has defensible moats? Who's likely to be commoditized? Be specific."

Technical Evaluation

node scripts/moa.js "Compare RLHF vs DPO vs RLAIF for LLM alignment. 
Which scales better? What are the failure modes of each?"

Market Research

node scripts/moa.js "What are the emerging use cases for embodied AI in 2026? 
Focus on robotics, drones, and autonomous systems. Include specific companies."

Performance Expectations

Metric	Paid Tier	Free Tier
P50 Latency	~45s	~60s
P95 Latency	~90s	~120s+
Success Rate	>99%	~80% (rate limits)
Cost/Query	~$0.03	$0.00

Tips

Be specific — Vague prompts get vague synthesis
Ask for structure — "Give me pros/cons" or "List top 5" helps the aggregator
Use for analysis, not chat — MoA shines for complex reasoning
Batch your queries — 30-90s per query, so plan accordingly

Installation

Via ClawHub (Recommended)

clawhub install moa

Manual

Copy skills/moa/ to your ~/clawd/skills/ directory
Set OPENROUTER_API_KEY in your environment
The agent can now invoke MoA for complex queries

Environment Variables

Variable	Required	Description
`OPENROUTER_API_KEY`	Yes	Your OpenRouter API key

Get your key at: https://openrouter.ai/keys

Credits

MoA concept: Together AI Research
Implementation: @Scianna
Built for: OpenClaw