senseaudio-asr
v1.0.2
Build and troubleshoot SenseAudio speech recognition integrations, including HTTP transcription (`/v1/audio/transcriptions`), realtime WebSocket ASR (`/ws/v1/audio/transcriptions`), audio quality analysis (`/v1/audio/analysis`), and recognition record queries (`/v1/audio/records`). Use this whenever...
SenseAudio ASR
Use this skill for all SenseAudio speech recognition tasks.
Credential source: read the API key from `SENSEAUDIO_API_KEY` and send it only in the `Authorization: Bearer ...` header.
Do not place API keys in query parameters, logs, transcripts, or saved examples.
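
The credential rules above could be sketched in Python as follows. This is a minimal illustration using only the standard library; the helper names `auth_headers` and `redact_headers` are hypothetical and not part of any SenseAudio SDK.

```python
import os


def auth_headers(env=os.environ):
    """Build request headers from SENSEAUDIO_API_KEY (helper name is hypothetical)."""
    key = env.get("SENSEAUDIO_API_KEY", "").strip()
    if not key:
        # Fail early instead of sending an unauthenticated request.
        raise RuntimeError("SENSEAUDIO_API_KEY is not set")
    # The key goes only into the Authorization header, never into the URL.
    return {"Authorization": f"Bearer {key}"}


def redact_headers(headers):
    """Return a copy of the headers that is safe to log or save in examples."""
    return {
        name: "<redacted>" if name.lower() == "authorization" else value
        for name, value in headers.items()
    }
```

Logging `redact_headers(auth_headers())` rather than the raw headers keeps the key out of logs and saved examples.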
Read First
references/asr.md
Workflow
- Pick recognition mode:
  - HTTP file transcription for offline audio.
  - WebSocket for realtime streaming microphone/audio chunks.
  - Audio analysis for noise and quality checks before recognition.
  - Records query for recent recognition history lookup.
- Choose model by feature needs:
  - Lite for low-cost basic transcription.
  - ASR for streaming, translation, diarization, sentiment, and timestamps.
  - Pro when diarization plus explicit `max_speakers` control is needed.
  - DeepThink for streaming, translation, and intelligent editing; do not send `language`, diarization, sentiment, timestamps, ITN, or punctuation controls.
- Build minimal request:
  - Required auth, file/audio format, model.
  - Add optional controls only when needed.
  - Keep uploaded files at or below 10MB; split longer audio before sending.
- Validate compatibility:
  - Check model-parameter support before sending.
  - Enforce WS `pcm` / `16000Hz` / mono requirements.
  - For HTTP `stream=true`, expect SSE text deltas only, not structured verbose fields.
- Parse robustly:
  - Handle JSON/text/verbose/SSE forms.
  - Handle WS terminal events and failures.
  - Treat returned `audio` URLs, `api_key`, `session_id`, and `trace_id` as sensitive operational data.
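
The compatibility and parsing steps of the workflow can be approximated with a few pre-flight checks. The sketch below is illustrative only. The model identifiers (`"deepthink"`, `"pro"`), the parameter keys other than `language` and `max_speakers`, and all function names are assumptions; the real names should be taken from `references/asr.md`. It also assumes that `max_speakers` is accepted only by Pro and that "10MB" means decimal megabytes, which is the stricter reading.

```python
# Hypothetical names throughout; confirm real model IDs and parameter keys in references/asr.md.
MAX_UPLOAD_BYTES = 10 * 1000 * 1000  # "10MB" read conservatively as decimal megabytes (assumption)

# Controls the workflow says not to send to DeepThink (key spellings are assumed).
DEEPTHINK_FORBIDDEN = {"language", "diarization", "sentiment", "timestamps", "itn", "punctuation"}

# Response fields the workflow flags as sensitive operational data.
SENSITIVE_FIELDS = {"audio", "api_key", "session_id", "trace_id"}


def check_http_request(model, params, file_size):
    """Return a list of problems found before an HTTP transcription request is sent."""
    problems = []
    if file_size > MAX_UPLOAD_BYTES:
        problems.append("file exceeds 10MB; split the audio before sending")
    if model == "deepthink":
        for name in sorted(DEEPTHINK_FORBIDDEN & set(params)):
            problems.append(f"DeepThink does not accept '{name}'")
    if "max_speakers" in params and model != "pro":
        # Assumption: explicit max_speakers control is available on Pro only.
        problems.append("max_speakers control is read here as Pro-only")
    return problems


def check_ws_audio(encoding, sample_rate, channels):
    """Return a list of problems with audio intended for the WebSocket endpoint."""
    problems = []
    if encoding != "pcm":
        problems.append("WS audio must be pcm")
    if sample_rate != 16000:
        problems.append("WS audio must be sampled at 16000 Hz")
    if channels != 1:
        problems.append("WS audio must be mono")
    return problems


def redact(event):
    """Copy a parsed response dict, masking sensitive top-level fields before logging."""
    return {k: "<redacted>" if k in SENSITIVE_FIELDS else v for k, v in event.items()}
```

An empty list from either check means no known incompatibility was found. It does not guarantee that the service will accept the request.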