telegram-voice-transcribe

Transcribe Telegram voice notes into text via OpenAI Whisper (whisper-1).

Quick Workflow

1. Detect a voice message: look for voice.file_id or audio.file_id in the inbound message metadata.
2. Run the transcription script: `python3 ~/openclaw/skills/telegram-voice-transcribe/scripts/transcribe.py --file-id <file_id> --language es`
3. Read the JSON output; the `transcript` field contains the text.
4. Respond to the user based on the transcript content, treating it like typed text.
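The workflow above could be wired together roughly as follows. This is a minimal sketch, not the skill's own integration code: it assumes the inbound message arrives as a dict shaped like a Telegram Bot API message, and that transcribe.py prints its JSON result to stdout. The helper names (extract_file_id, build_command, transcribe) are hypothetical.

```python
import json
import os
import subprocess
from typing import Optional

# Script path taken from the workflow step; adjust if the skill lives elsewhere.
SCRIPT = os.path.expanduser(
    "~/openclaw/skills/telegram-voice-transcribe/scripts/transcribe.py"
)


def extract_file_id(message: dict) -> Optional[str]:
    """Return voice.file_id or audio.file_id from a message dict, else None."""
    for key in ("voice", "audio"):
        media = message.get(key)
        if isinstance(media, dict) and media.get("file_id"):
            return media["file_id"]
    return None


def build_command(file_id: str, language: str = "es") -> list:
    """Build the argument list for the transcription script."""
    return ["python3", SCRIPT, "--file-id", file_id, "--language", language]


def transcribe(message: dict, language: str = "es") -> Optional[dict]:
    """Run the script for a voice message; return None for non-voice messages."""
    file_id = extract_file_id(message)
    if file_id is None:
        return None
    # Assumption: the script writes a single JSON object to stdout.
    result = subprocess.run(
        build_command(file_id, language), capture_output=True, text=True
    )
    return json.loads(result.stdout)
```

Passing the command as a list rather than a shell string avoids quoting problems if a file_id ever contains unusual characters.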

Mode	Flag	When to use
Telegram file_id	`--file-id <id>`	Standard case — voice message in Telegram
Local file	`--file <path>`	Testing, or file already downloaded
URL	`--url <https://...>`	Audio hosted externally
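If the caller does not know in advance which mode applies, the flag can be picked from the input itself. The heuristic below is only an illustration and is not part of the script; mode_args is a hypothetical helper, and treating every other string as a Telegram file_id is an assumption.

```python
import os


def mode_args(source: str) -> list:
    """Pick the input flag for a source string (illustrative heuristic)."""
    if source.startswith(("http://", "https://")):
        return ["--url", source]
    if os.path.isfile(source):
        return ["--file", source]
    # Assumption: anything that is neither a URL nor a local file is a file_id.
    return ["--file-id", source]
```

The returned list can be appended to the python3 command for the script in place of a hard-coded flag.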

For Spanish speakers, always pass --language es. Giving Whisper the input language as an ISO 639-1 code improves both speed and accuracy.

{"transcript": "Hola, necesito que hagas un cambio en el juego", "language": "es", "duration_s": 4.2}

If an `error` key is present in the output, surface its message to the user and check the setup.
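A caller might handle both outcomes along these lines. This is a sketch assuming the script's output is a single JSON object with either a transcript field or an error field, as in the example above; read_transcript is a hypothetical name.

```python
import json


def read_transcript(raw_output: str) -> str:
    """Return the transcript text, or raise if the script reported an error."""
    try:
        data = json.loads(raw_output)
    except json.JSONDecodeError as exc:
        raise RuntimeError(f"Output is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise RuntimeError("Output is not a JSON object")
    if "error" in data:
        # Surface the script's own message so the user can check the setup.
        raise RuntimeError(f"Transcription failed: {data['error']}")
    return data["transcript"]
```

Raising an exception keeps the error path separate from the normal reply path, so a failed transcription is never answered as if it were user text.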

See references/setup.md for full setup, hooks integration, costs, and a local Whisper alternative.

Error	Fix
`OPENAI_API_KEY not set`	Configure key via `openclaw configure --section env`
`TELEGRAM_BOT_TOKEN required`	Add bot token to env
`openai package not installed`	`pip install openai`
Telegram `400 Bad Request`	file_id expired — Telegram file_ids expire after ~48h
File too large	Whisper API limit is 25 MB; split the audio or use local Whisper
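For local files, the size limit can be checked before uploading. A minimal sketch: fits_whisper_limit is a hypothetical helper, and interpreting 25 MB as 25 * 1024 * 1024 bytes is an assumption.

```python
import os

# Assumption: the 25 MB limit is read as binary megabytes.
WHISPER_MAX_BYTES = 25 * 1024 * 1024


def fits_whisper_limit(path: str, max_bytes: int = WHISPER_MAX_BYTES) -> bool:
    """Return True if the file at path is within the upload size limit."""
    return os.path.getsize(path) <= max_bytes
```

When the check fails, the caller can split the audio or fall back to local Whisper instead of waiting for the API to reject the upload.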