voice-recognition
v1.0.0
Local speech-to-text with OpenAI Whisper CLI. Supports Chinese, English, 100+ languages with translation and summarization.
Installation
Please help me install the skill `voice-recognition` from SkillHub official store.
npx skills add gykdly/voice-recognition
Voice Recognition (Whisper)
Local speech-to-text with OpenAI Whisper CLI.
Features
- Local processing - No API key needed, free
- Multi-language - Chinese, English, 100+ languages
- Translation - Translate to English
- Summarization - Generate a quick summary of the transcript
Usage
Basic
# Chinese recognition
python3 /Users/liyi/.openclaw/workspace/scripts/voice识别_升级版.py audio.m4a
# Force Chinese
python3 /Users/liyi/.openclaw/workspace/scripts/voice识别_升级版.py audio.m4a --zh
# English recognition
python3 /Users/liyi/.openclaw/workspace/scripts/voice识别_升级版.py audio.m4a --en
# Translate to English
python3 /Users/liyi/.openclaw/workspace/scripts/voice识别_升级版.py audio.m4a --translate
# With summary
python3 /Users/liyi/.openclaw/workspace/scripts/voice识别_升级版.py audio.m4a --summarize
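The script's internals are not shown here, so the following is only a sketch of how flags like these could map onto the Whisper CLI. It assumes the script shells out to `whisper` and uses that CLI's `--model`, `--language`, and `--task` options. The helper name `build_whisper_command` is hypothetical. Summarization is left out because it is not a Whisper CLI feature.

```python
from typing import List, Optional


def build_whisper_command(
    audio_path: str,
    language: Optional[str] = None,
    translate: bool = False,
    model: str = "medium",
) -> List[str]:
    """Build an argument list for the `whisper` CLI (hypothetical helper)."""
    cmd = ["whisper", audio_path, "--model", model]
    if language is not None:
        # e.g. "zh" for --zh, "en" for --en; omit to let Whisper auto-detect
        cmd += ["--language", language]
    if translate:
        # Whisper's translate task always targets English
        cmd += ["--task", "translate"]
    return cmd


if __name__ == "__main__":
    print(" ".join(build_whisper_command("audio.m4a", language="zh")))
```

The returned list can be passed to `subprocess.run(cmd, check=True)`; building a list rather than a shell string avoids quoting problems with paths that contain spaces or non-ASCII characters.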
Quick Command (add to ~/.zshrc)
alias voice="python3 /Users/liyi/.openclaw/workspace/scripts/voice识别_升级版.py"
Then use:
voice ~/Downloads/audio.m4a --zh
Requirements
- OpenAI Whisper CLI: brew install openai-whisper
- Python 3.10+
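A quick way to confirm both requirements before running the script is sketched below. It assumes the Whisper CLI is installed under the executable name `whisper`; the helper `check_requirements` is hypothetical and not part of the skill.

```python
import shutil
import sys
from typing import List, Tuple


def check_requirements(min_python: Tuple[int, int] = (3, 10)) -> List[str]:
    """Return a list of problems found; an empty list means all checks passed."""
    problems = []
    if sys.version_info < min_python:
        required = ".".join(str(part) for part in min_python)
        problems.append(f"Python {required}+ is required")
    if shutil.which("whisper") is None:
        problems.append("`whisper` not found on PATH (try: brew install openai-whisper)")
    return problems


if __name__ == "__main__":
    for problem in check_requirements():
        print("Missing requirement:", problem)
```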
Files
- scripts/voice识别_升级版.py - Main script
- scripts/voice_tool_README.md - Documentation
Supported Formats
- MP3, M4A, WAV, OGG, FLAC, WebM
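If you batch-process a folder, it can help to filter files by extension first. This is a minimal sketch based only on the list above; the names `SUPPORTED_EXTENSIONS` and `is_supported` are illustrative, and the check looks at the file name only, not the actual audio encoding.

```python
from pathlib import Path

# Extensions taken from the Supported Formats list above
SUPPORTED_EXTENSIONS = {".mp3", ".m4a", ".wav", ".ogg", ".flac", ".webm"}


def is_supported(path: str) -> bool:
    """Return True if the file extension is in the supported list (case-insensitive)."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS


if __name__ == "__main__":
    for name in ["audio.m4a", "talk.MP3", "notes.txt"]:
        print(name, is_supported(name))
```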
Language Support
100+ languages including:
- Chinese (zh)
- English (en)
- Japanese (ja)
- Korean (ko)
- And more...
Notes
- Default model: medium (balance of speed and accuracy)
- First run downloads the model to ~/.cache/whisper
- Processing time varies by audio length and model size
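To see whether the first-run download has already happened, you can look in the cache directory. The sketch below assumes the model is stored as `<model>.pt` (for example `medium.pt`) inside `~/.cache/whisper`; the file naming is an assumption, and `model_is_cached` is a hypothetical helper.

```python
from pathlib import Path
from typing import Optional


def model_is_cached(model: str = "medium", cache_dir: Optional[Path] = None) -> bool:
    """Return True if a model file appears to exist in the Whisper cache.

    Assumes the file is named "<model>.pt"; adjust if your install differs.
    """
    if cache_dir is None:
        cache_dir = Path.home() / ".cache" / "whisper"
    return (cache_dir / f"{model}.pt").is_file()


if __name__ == "__main__":
    if model_is_cached():
        print("Model already downloaded")
    else:
        print("Model will be downloaded on first run")
```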