SkillHub

zvukogram

v1.1.4

Text-to-Speech via Zvukogram API with SSML support. Use when you need to generate speech from text, create podcasts, voice notifications, or work with audio. Supports speed control, stress marks, English word transcription, and audio fragment merging.

Sourced from ClawHub, Authored by erview

Installation

Please help me install the skill `zvukogram` from SkillHub official store. npx skills add erview/zvukogram

Zvukogram TTS

Speech generation via Zvukogram API with SSML markup support.

Requirements

To use this skill, you need: - Zvukogram API token — get it at https://zvukogram.com/ - Zvukogram account email

Setup

Create file ~/.config/zvukogram/config.json:

mkdir -p ~/.config/zvukogram
{
  "token": "your_api_token_here",
  "email": "[email protected]"
}

Or use environment variables:

export ZVUKOGRAM_TOKEN=your_api_token_here
export [email protected]

Quick Start

# Simple TTS
python3 scripts/tts.py --text "Hello, world!" --voice Алена --output hello.mp3

# With +20% speed
python3 scripts/tts.py --text "Fast text" --voice Алена --speed 1.2 --output fast.mp3

# Check balance
python3 scripts/balance.py

Features

  • TTS generation — text to speech
  • SSML support — stress marks, pauses, speed
  • Audio merging — combine fragments via ffmpeg
  • Transcription — proper pronunciation of English words

SSML Markup

Stress Marks

Use + before stressed vowel:

З+амок — stress on "a"
зам+ок — stress on "o"

Aliases (Transcription)

<sub alias="Оупен Эй Ай">OpenAI</sub>
<sub alias="Самсунг">Samsung</sub>
<sub alias="Ал+ьтман">Альтман</sub>

Speed

<prosody rate="1.2">20% faster</prosody>
<prosody rate="fast">Fast text</prosody>

Pauses

<break time="500ms"/>

Available Voices

  • Алена — female, neutral (recommended)
  • Андрей — male, neutral (recommended)
  • Александра — female, soft
  • Антон — male, business

Full list: see references/VOICES.md

Examples

See references/EXAMPLES.md for: - Dialogs and podcasts - News voiceover - Voice notifications - Long texts

Transcription

See references/TRANSCRIPTION.md for proper pronunciation: - OpenAI → Оупен Эй Ай - GPT → Джи Пи Ти - Samsung → Самсунг - Altman → Ал+ьтман

SSML Reference

  • Full, agent-readable reference (recommended): references/SSML.md
  • Quick lookup: references/SSML_CHEATSHEET.md
  • Official Zvukogram SSML docs: https://zvukogram.com/node/ssml/

Troubleshooting

See references/TROUBLESHOOTING.md for: - API errors - Audio issues - Diagnostics

API Limitations

  • Max 1000 characters per request (/text)
  • Up to 1M characters via /longtext
  • Do not rely on <voice> / <speak> wrappers for API usage. For multi-voice, generate and merge fragments (one request per voice).
  • API docs: https://zvukogram.com/node/api/
  • Voice rating: https://zvukogram.com/rating/
  • Support: https://t.me/zvukogram