senseaudio-asr

v1.0.2

Build and troubleshoot SenseAudio speech recognition integrations, including HTTP transcription (`/v1/audio/transcriptions`), realtime WebSocket ASR (`/ws/v1/audio/transcriptions`), audio quality analysis (`/v1/audio/analysis`), and recognition record queries (`/v1/audio/records`). Use this whenever...

Sourced from ClawHub, Authored by scikkk

Installation

Please help me install the skill `senseaudio-asr` from the SkillHub official store.

npx skills add scikkk/senseaudio-asr

SenseAudio ASR

Use this skill for all SenseAudio speech recognition tasks.

Credential source: read the API key from `SENSEAUDIO_API_KEY` and send it only in the `Authorization: Bearer ...` header. Do not place API keys in query parameters, logs, transcripts, or saved examples.
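A minimal sketch of the credential rule above, assuming `SENSEAUDIO_API_KEY` is an environment variable. The helper names `build_auth_headers` and `redact` are hypothetical and are not part of any SenseAudio SDK; they only illustrate keeping the key in the header and out of logs.

```python
import os


def build_auth_headers(env=None):
    """Return request headers that carry the key only as a Bearer token."""
    # Assumption: the key lives in the process environment.
    env = os.environ if env is None else env
    api_key = env.get("SENSEAUDIO_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("SENSEAUDIO_API_KEY is not set")
    return {"Authorization": f"Bearer {api_key}"}


def redact(headers):
    """Copy headers with the bearer token masked, safe for logs or saved examples."""
    return {
        name: "Bearer ***" if name.lower() == "authorization" else value
        for name, value in headers.items()
    }
```

Passing the result of `redact(headers)` to any logger or example file, rather than the raw headers, keeps the key out of saved output.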

Read First

  • references/asr.md

Workflow

  1. Pick recognition mode:
     • HTTP file transcription for offline audio.
     • WebSocket for realtime streaming microphone/audio chunks.
     • Audio analysis for noise and quality checks before recognition.
     • Records query for recent recognition history lookup.

  2. Choose model by feature needs:
     • Lite for low-cost basic transcription.
     • ASR for streaming, translation, diarization, sentiment, and timestamps.
     • Pro when diarization plus explicit `max_speakers` control is needed.
     • DeepThink for streaming, translation, and intelligent editing; do not send language, diarization, sentiment, timestamps, ITN, or punctuation controls.

  3. Build minimal request:
     • Required auth, file/audio format, model.
     • Add optional controls only when needed.
     • Keep uploaded files at or below 10 MB; split longer audio before sending.

  4. Validate compatibility:
     • Check model-parameter support before sending.
     • Enforce WS `pcm` / 16000 Hz / mono requirements.
     • For HTTP `stream=true`, expect SSE text deltas only, not structured verbose fields.

  5. Parse robustly:
     • Handle JSON/text/verbose/SSE forms.
     • Handle WS terminal events and failures.
     • Treat returned audio URLs, `api_key`, `session_id`, and `trace_id` as sensitive operational data.
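The compatibility checks in the workflow above could be run as a pre-flight step, sketched below under several assumptions. The model labels (`"Lite"`, `"ASR"`, `"Pro"`, `"DeepThink"`) and every control name except `max_speakers` are placeholders for whatever `references/asr.md` specifies. "10MB" is read as 10 MiB, and `max_speakers` is treated as Pro-only because the workflow mentions it only for Pro. The function names are hypothetical.

```python
MAX_UPLOAD_BYTES = 10 * 1024 * 1024  # assumption: "10MB" interpreted as 10 MiB

# Placeholder names for the controls DeepThink should not receive;
# the real parameter names are defined in references/asr.md.
DEEPTHINK_FORBIDDEN = {
    "language", "diarization", "sentiment", "timestamps", "itn", "punctuation",
}


def check_http_request(model, params, file_size_bytes):
    """Return a list of problems found before sending an HTTP transcription."""
    problems = []
    if file_size_bytes > MAX_UPLOAD_BYTES:
        problems.append("file exceeds 10MB; split the audio before sending")
    if model == "DeepThink":
        for name in sorted(DEEPTHINK_FORBIDDEN & set(params)):
            problems.append(f"DeepThink does not accept {name}")
    if "max_speakers" in params and model != "Pro":
        problems.append("max_speakers is only described for Pro")
    return problems


def check_ws_audio(audio_format, sample_rate_hz, channels):
    """Return a list of problems with the WebSocket audio settings."""
    problems = []
    if audio_format != "pcm":
        problems.append("WebSocket audio must be pcm")
    if sample_rate_hz != 16000:
        problems.append("WebSocket audio must be 16000Hz")
    if channels != 1:
        problems.append("WebSocket audio must be mono")
    return problems
```

Both functions return an empty list when nothing is wrong, so a caller can send the request only when the list is empty and otherwise report every problem at once instead of failing on the first one.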