Music Generator Skill (Full SOP)

Capability Overview

This skill supports the following intents:
1) Generate a full song with lyrics
2) Generate pure background music (BGM)
3) Generate lyrics only (no audio)
4) Query music generation task status

Users can describe the music they want in plain language. The system auto-determines the mode and handles parameter inference and task tracking.

Triggers and Natural Language Examples

The following natural language requests will trigger this skill:
- Generate a romantic love song
- Write lyrics about the night sky
- Create electronic music suitable for short‑video background
- Check my music generation progress
- I want a cheerful background music track

Execution (SOP Step‑by‑Step)

Preflight Check (Mandatory)

Read MUSICFUL_API_KEY from the skill folder’s .env (resolved at runtime via the running script path): /.env
If not configured (empty/missing), immediately inform the user:
- "MUSICFUL_API_KEY is not configured. Please visit https://www.musicful.ai/api/authentication/interface-key/ to obtain/purchase an interface key, then write the KEY into /.env under MUSICFUL_API_KEY."
Stop subsequent calls and wait for the user to complete configuration before continuing.
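
The preflight check above can be sketched as follows. This is a minimal illustration rather than the skill's actual implementation: the helper names `read_env_file` and `load_api_key` are hypothetical, the `.env` parsing handles only simple `KEY=VALUE` lines, and the order in which the `.env` file and the process environment are consulted is an assumption.

```python
import os
from pathlib import Path


def read_env_file(path: Path) -> dict:
    """Parse simple KEY=VALUE lines; blank lines and # comments are skipped."""
    values = {}
    if not path.is_file():
        return values
    for raw in path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def load_api_key(env_path: Path, environ=None) -> str:
    """Return MUSICFUL_API_KEY, or raise if it is missing or empty.

    Assumption: the .env file is consulted first, then the process environment.
    """
    environ = os.environ if environ is None else environ
    key = read_env_file(env_path).get("MUSICFUL_API_KEY", "")
    if not key:
        key = environ.get("MUSICFUL_API_KEY", "")
    if not key.strip():
        # The caller should show the configuration message and stop here.
        raise RuntimeError("MUSICFUL_API_KEY is not configured.")
    return key.strip()
```

A caller would run `load_api_key(...)` before any API request and, on `RuntimeError`, show the configuration message to the user instead of continuing.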

The execution flow is intent‑based and incorporates a two‑stage return and a "lyrics‑first" UX:
- Single command entry: /music_generator, with mode branch control:
  - mode=normal (default): generate and show lyrics → submit generation → return preview (status=2) → return final (status=0)
  - mode=bgm: pure music (instrumental=1), no lyrics → preview first → then final
  - mode=lyrics: return lyrics text immediately
- When using the "custom lyrics" path (built into the normal flow or future extensions): submit generation directly and poll (preview first, then final)
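
One way to picture the mode branching described above is a small lookup from mode to ordered steps. The step names and the `steps_for_mode` helper are hypothetical labels chosen for this sketch; only the three mode names and their ordering come from the description above.

```python
# Ordered steps per mode; the step names are illustrative labels only.
MODE_STEPS = {
    "normal": ["generate_lyrics", "show_lyrics", "submit_generation",
               "return_preview", "return_final"],
    "bgm": ["submit_generation", "return_preview", "return_final"],
    "lyrics": ["generate_lyrics", "return_lyrics"],
}


def steps_for_mode(mode: str = "normal", custom_lyrics: bool = False) -> list:
    """Return the ordered steps for a mode.

    With user-supplied ("custom") lyrics in normal mode, the lyrics steps
    are skipped and the task is submitted directly.
    """
    if mode not in MODE_STEPS:
        raise ValueError(f"unknown mode: {mode!r}")
    steps = list(MODE_STEPS[mode])
    if mode == "normal" and custom_lyrics:
        steps = [s for s in steps if s not in ("generate_lyrics", "show_lyrics")]
    return steps
```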

Scenario A: Generate a Full Song with Lyrics

Typical user inputs:
- Generate a romantic electronic song
- Write a sad rock song and generate audio
- Here are some lyrics, please use them to create the song: [...user‑provided lyrics...]

Detailed Flow

Step 1: Intent Recognition
- If the user provides complete lyrics, treat it as lyrics‑provided generation;
- Otherwise, assume lyrics need to be generated automatically and then used to synthesize the song.

Step 2: Lyrics Handling
- If lyrics are provided → use them directly;
- If not provided → call the V1 Lyrics API to generate lyrics content.

Step 3: Submit Music Generation Task
- POST {BASE_URL}/v1/music/generate
- body: { action: "custom", lyrics: "<lyrics>", style: "<inferred from user>", mv: "<default to latest high‑quality model>" }
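
The request for Step 3 might be assembled as in the sketch below. It only builds the URL and JSON body and does not send anything, because the authentication header format is not described here. The helper name `build_custom_request` is hypothetical, and `mv` is left as a required argument since no concrete model identifiers are given; falling back to "Pop" when no style is inferred follows the skill's stated default.

```python
import json

GENERATE_PATH = "/v1/music/generate"


def build_custom_request(base_url: str, lyrics: str, style: str, mv: str):
    """Build (url, json_body) for a lyrics-based generation request."""
    if not lyrics.strip():
        raise ValueError("lyrics must not be empty for action='custom'")
    url = base_url.rstrip("/") + GENERATE_PATH
    body = {
        "action": "custom",
        "lyrics": lyrics,
        "style": style or "Pop",  # default when no style could be inferred
        "mv": mv,                 # model version; value supplied by the caller
    }
    # ensure_ascii=False keeps non-English lyrics readable in the payload.
    return url, json.dumps(body, ensure_ascii=False)
```

For example, `build_custom_request("https://api.musicful.ai", lyrics_text, "sad rock", mv="<model>")` yields the endpoint URL and a body ready to POST with whatever HTTP client the skill uses.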

Step 4: Automatic Task Polling (Two Stages)
- GET {BASE_URL}/v1/music/tasks?ids=
- Status semantics (key):
  - status = 2 → preview stage complete (returns audio_url as preview link)
  - status = 0 → full audio complete (returns audio_url as downloadable final link)
  - others → processing or failed (use fail_code/fail_reason)
- Polling strategy:
  1) On first status=2: immediately announce "preview is ready" and return audio_url for listening;
  2) Continue polling until status=0: then return "final audio is ready" with audio_url (download/publish).
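
The two‑stage polling strategy could look roughly like the generator below. Several details are assumptions made for this sketch: `fetch_status` is a hypothetical callable that performs the GET and returns one task as a flat dict with `status`, `audio_url`, and optionally `fail_code`/`fail_reason`; a non‑empty fail field is treated as failure; and the interval and timeout values are placeholders.

```python
import time

PREVIEW_READY = 2  # preview stage complete
FINAL_READY = 0    # full audio complete


def poll_task(fetch_status, task_id, interval=5.0, timeout=600.0,
              sleep=time.sleep, clock=time.monotonic):
    """Yield ("preview", url) at most once, then ("final", url).

    Raises RuntimeError on a reported failure and TimeoutError if the
    final audio is not ready before the deadline.
    """
    deadline = clock() + timeout
    preview_sent = False
    while clock() < deadline:
        task = fetch_status(task_id)
        status = task.get("status")
        if status == PREVIEW_READY and not preview_sent:
            preview_sent = True
            yield ("preview", task["audio_url"])
        elif status == FINAL_READY:
            yield ("final", task["audio_url"])
            return
        elif task.get("fail_code") or task.get("fail_reason"):
            # Assumption: a populated fail field marks the task as failed.
            raise RuntimeError(
                f"task {task_id} failed: "
                f"{task.get('fail_code')} {task.get('fail_reason')}"
            )
        sleep(interval)
    raise TimeoutError(f"task {task_id} did not finish within {timeout}s")
```

Because the function is a generator, the caller can announce the preview link as soon as it is yielded and keep iterating for the final link.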

Step 5: Return Results to the User (Two‑Stage × Two Songs)
- The system by default generates two songs/two task_ids (e.g., ids=[id1,id2]). For each id, perform the two‑stage return independently:
  - Stage 1: when this id reaches status=2 → return the preview link (audio_url)
  - Stage 2: keep polling this id → when status=0 → return the full mp3 download link (audio_url)
- Recommended output format (one block per song):
  - title:
  - prompt: <original user description>
  - lyrics: <full lyrics for lyric mode; empty for BGM>
  - preview: <preview link (status=2)>
  - full: <final mp3 (status=0)>

Scenario B: Generate Pure Background Music

Typical user inputs:
- Generate a piece of pure background music
- I want an electronic instrumental suitable for video background

Detailed Flow

Step 1: Intent Recognition
- Detect semantics like "pure music/background music/accompaniment" → enter the pure BGM flow.

Step 2: Submit Music Generation Task
- POST {BASE_URL}/v1/music/generate
- body: { action: "auto", style: "<inferred from user>", mv: "<default latest model>", instrumental: 1 }

Step 3: Automatic Task Polling (Two Stages)
- Same as Scenario A: status=2 → preview; status=0 → full

Step 4: Return Preview & Final Links (Two Steps)
- Stage 1: preview link (status=2)
- Stage 2: final link (status=0)

Scenario C: Generate Lyrics Only

User inputs:
- Write lyrics about a summer beach
- I only need lyrics, about a rainy day

Process

- POST {BASE_URL}/v1/lyrics body: { prompt: "<user description>" }
- Return: { lyrics: "<AI‑generated lyrics>" }

Scenario D: Query Music Generation Task Status

User inputs:
- Check the progress of task_id=abc123
- See how far my song generation has progressed

Detailed Flow

1) Extract task_id from the user input;
2) GET {BASE_URL}/v1/music/tasks?ids=<task_id>
3) Return task status and audio information.

Parameter Inference Rules

| Parameter | Source | Default/Notes |
| --- | --- | --- |
| style | Inferred from user input | Default to Pop/general if none |
| mv | Default high‑quality | Prefer latest high‑quality |
| instrumental | Set to 1 for BGM | Otherwise 0 |
| lyrics | User‑provided / auto | (none) |
| title | Inferred or auto‑named | (none) |

BASE_URL and API Key:
- MUSICFUL_BASE_URL (default: https://api.musicful.ai)
- MUSICFUL_API_KEY (read from the skill folder’s .env; environment variable MUSICFUL_API_KEY is also honored if set)
- Entry points: scripts/musicful_api.py, CLI: scripts/run_musicful.py / scripts/dispatch_music_generator.py
- Important: ensure MUSICFUL_API_KEY is configured before calling; if missing, the server may respond with HTTP 500 (helps pinpoint auth/config issues quickly).

Error Handling and Fallback

1) If the request is unclear (e.g., "generate music" without clarifying lyrics vs BGM) → ask a follow‑up;
2) If the API call fails → return clear failure reason and suggestions;
3) If polling times out → prompt the user to wait or retry.

Unified Return Format

Success:

    { "status": "success", "data": { ... } }

Error:

    { "status": "error", "message": "<reason>" }

Example Dialogues

- User: Generate a sad rock song
- Skill: Shows generated lyrics → submits job → returns preview link → returns full mp3 link
- User: An ambient BGM for a quiet night
- Skill: Submits job (instrumental=1) → returns preview → returns full
- User: Write lyrics about the night sky
- Skill: Returns generated lyrics
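
The unified return format described in this document could be produced by small helpers such as the ones below. The function names `success`, `error`, and `run_safely` are hypothetical and not taken from the skill's scripts; the sketch only shows one way to keep every result inside the same two envelopes.

```python
def success(data) -> dict:
    """Wrap a result in the success envelope."""
    return {"status": "success", "data": data}


def error(message: str) -> dict:
    """Wrap a failure reason in the error envelope."""
    return {"status": "error", "message": message}


def run_safely(action) -> dict:
    """Call action() and convert any exception into the error envelope."""
    try:
        return success(action())
    except Exception as exc:  # report the reason rather than crash the skill
        return error(f"{type(exc).__name__}: {exc}")
```

Wrapping each scenario's handler in `run_safely` would mean an API failure or polling timeout reaches the user as a clear `message` instead of a traceback.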
