bailian-studio
v0.0.2Call Aliyun Bailian via DashScope; OCR text extraction first + TTS speak.
Installation
Please help me install the skill `bailian-studio` from SkillHub official store.
npx skills add yab/bailian-studio
Bailian Studio
First feature: OCR text extraction via DashScope.
Requirements
- Python 3
dashscope(>=1.22.2)oss2ffmpeg(TTS 播放依赖,使用 ffplay)
Install:
pip install dashscope oss2
Config
API Key (priority order):
1. DASHSCOPE_API_KEY env
2. secrets/bailian.env
OSS (priority order):
1. OSS_ACCESS_KEY, OSS_SECRET_KEY, OSS_BUCKET, OSS_ENDPOINT, OSS_REGION env
2. secrets/bailian.env
Example secrets/bailian.env:
DASHSCOPE_API_KEY=sk-xxx
# TTS 可选配置(留空走默认)
BAILIAN_TTS_MODEL=qwen3-tts-flash
BAILIAN_TTS_VOICE=
BAILIAN_TTS_SAMPLE_RATE=16000
OSS_ACCESS_KEY=ak-xxx
OSS_SECRET_KEY=sk-xxx
OSS_BUCKET=your-bucket
OSS_ENDPOINT=oss-cn-beijing.aliyuncs.com
OSS_REGION=cn-beijing
Defaults:
- Region/base URL: Beijing (https://dashscope.aliyuncs.com/api/v1)
Optional overrides:
- DASHSCOPE_BASE_URL
Usage
TTS (speak)
python3 {baseDir}/scripts/tts_speak.py --text "你好"
OCR (text)
From local image (uploads to OSS):
python3 {baseDir}/scripts/ocr_text.py --image /path/to.png
From URL:
python3 {baseDir}/scripts/ocr_text.py --url https://example.com/image.png