Crossword Diagnostics

STT Lab

Compare browser-native, OpenAI Realtime, and Google Cloud streaming transcription with timing diagnostics.
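As a rough illustration of the timing diagnostics, the sketch below records elapsed milliseconds from a start call to named marks and prints "n/a" until a mark exists. The `SttTimer` class and the mark names `listening` and `firstFinal` are hypothetical. They assume that "Listening after start" and "First final loaded" are both measured from the moment the start button is pressed, which this page does not confirm.

```typescript
// Hypothetical helper, not the lab's actual code: measures time since start().
type Clock = () => number;

class SttTimer {
  private now: Clock;
  private startedAt: number | null = null;
  private marks = new Map<string, number>();

  // The clock is injectable so the timer can be exercised deterministically.
  constructor(now: Clock = () => performance.now()) {
    this.now = now;
  }

  // Begin a new measurement and forget earlier marks.
  start(): void {
    this.startedAt = this.now();
    this.marks.clear();
  }

  // Record a mark only the first time it is seen after start().
  mark(name: string): void {
    if (this.startedAt === null || this.marks.has(name)) return;
    this.marks.set(name, this.now() - this.startedAt);
  }

  // Mirror the page's placeholder: "n/a" until the mark has been recorded.
  report(name: string): string {
    const ms = this.marks.get(name);
    return ms === undefined ? "n/a" : `${Math.round(ms)} ms`;
  }
}
```

One timer per engine would keep the three measurements independent, so a slow cloud handshake does not distort the browser numbers.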

1) Browser-based STT

Uses the browser SpeechRecognition engine directly (no backend stream) for quick interim/final diagnostics.

Language code, Interim results, Continuous mode
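The three controls above correspond to the `lang`, `interimResults`, and `continuous` properties of the Web Speech API recognizer. The following is a minimal sketch of how interim and final text might be separated. The helper names `splitResults` and `startBrowserStt` are hypothetical, and the structural types are declared locally because DOM typings for this API are not assumed to be available.

```typescript
// Minimal local shapes so the sketch does not rely on Web Speech DOM typings.
interface ResultLike { isFinal: boolean; transcript: string }
interface Transcripts { finalText: string; interimText: string }

// Split one batch of recognition results into final and interim text.
function splitResults(results: ResultLike[]): Transcripts {
  let finalText = "";
  let interimText = "";
  for (const r of results) {
    if (r.isFinal) finalText += r.transcript;
    else interimText += r.transcript;
  }
  return { finalText: finalText.trim(), interimText: interimText.trim() };
}

// Browser-only wiring. Returns null where the API is unavailable (e.g. Node).
function startBrowserStt(
  lang: string,
  onUpdate: (t: Transcripts) => void,
): { stop: () => void } | null {
  const g = globalThis as any;
  const Ctor = g.SpeechRecognition ?? g.webkitSpeechRecognition;
  if (!Ctor) return null;

  const rec = new Ctor();
  rec.lang = lang;            // "Language code"
  rec.interimResults = true;  // "Interim results"
  rec.continuous = true;      // "Continuous mode"

  rec.onresult = (event: any) => {
    const batch: ResultLike[] = [];
    // Only results from resultIndex onward changed in this event.
    for (let i = event.resultIndex; i < event.results.length; i++) {
      const res = event.results[i];
      batch.push({ isFinal: res.isFinal, transcript: res[0].transcript });
    }
    onUpdate(splitResults(batch));
  };

  rec.start();
  return { stop: () => rec.stop() };
}
```

Keeping `splitResults` free of browser objects makes the interim/final logic checkable outside a browser, while `startBrowserStt` stays a thin adapter.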

Not listening

Browser stream: disconnected

Listening after start: n/a

First final loaded: n/a

No final browser transcript yet.

No interim browser transcript yet.

No completed browser utterances yet.

2) OpenAI Realtime STT

Uses OpenAI Realtime session + WebRTC stream + transcription delta events.

Cloud access requires sign-in and explicit project permission (`stt_lab`) for your account.

Turn detection type, Threshold (0.0–1.0), Prefix padding (ms), Silence duration (ms)
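The controls above resemble the server-side voice activity detection settings of the OpenAI Realtime API (`threshold`, `prefix_padding_ms`, `silence_duration_ms`). The sketch below shows one way they might be validated, and how transcription delta events could be folded into a live partial and a list of completed utterances. The function names are hypothetical, and the exact payload this lab sends is an assumption.

```typescript
// Assumed shape of a server VAD turn_detection object.
interface ServerVadTurnDetection {
  type: "server_vad";
  threshold: number;
  prefix_padding_ms: number;
  silence_duration_ms: number;
}

// Validate the form values and build the turn_detection object.
function buildTurnDetection(
  threshold: number,
  prefixPaddingMs: number,
  silenceDurationMs: number,
): ServerVadTurnDetection {
  if (!(threshold >= 0 && threshold <= 1)) {
    throw new RangeError("threshold must be between 0.0 and 1.0");
  }
  if (!(prefixPaddingMs >= 0) || !(silenceDurationMs >= 0)) {
    throw new RangeError("durations must be non-negative milliseconds");
  }
  return {
    type: "server_vad",
    threshold,
    prefix_padding_ms: Math.round(prefixPaddingMs),
    silence_duration_ms: Math.round(silenceDurationMs),
  };
}

// Only the fields this sketch reads from the two transcription events.
type TranscriptEvent =
  | { type: "conversation.item.input_audio_transcription.delta"; delta: string }
  | { type: "conversation.item.input_audio_transcription.completed"; transcript: string };

interface TranscriptState { partial: string; finals: string[] }

// Deltas extend the live partial; a completed event finalizes the utterance.
function applyTranscriptEvent(state: TranscriptState, event: TranscriptEvent): TranscriptState {
  if (event.type === "conversation.item.input_audio_transcription.delta") {
    return { partial: state.partial + event.delta, finals: state.finals };
  }
  return { partial: "", finals: [...state.finals, event.transcript] };
}
```

Returning a new state object on every event keeps the partial and completed views easy to render without shared mutable state.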

Not listening

Live session: disconnected

Status: Idle.

Listening after start: n/a

First final loaded: n/a

No final transcript yet.

No live partial transcript.

No completed utterances yet.

3) Google Cloud streaming STT

Uses Google Speech-to-Text v2 streaming (recognizer location is configurable) via /api/stt-lab/google-v2-stream.

Cloud access requires sign-in and explicit project permission (`stt_lab`) for your account.

Model, Location, Language code, Single utterance (auto-stop after first finalized result), Interim results, Interim stability threshold (0-1)
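To make the options above concrete, here is a hedged sketch of how a backend might turn them into a Speech-to-Text v2 recognizer path, regional endpoint, and streaming config, and how a client might apply the stability threshold and single-utterance setting. All function names are hypothetical. Treating the stability threshold and single-utterance stop as client-side checks is an assumption about this lab, not documented behavior.

```typescript
// Fields read from a streaming result in this sketch.
interface GoogleResult { transcript: string; isFinal: boolean; stability: number }

// v2 recognizer resource name; "_" selects the implicit default recognizer.
function recognizerPath(project: string, location: string, recognizer: string = "_"): string {
  return `projects/${project}/locations/${location}/recognizers/${recognizer}`;
}

// Non-global locations are served from a location-prefixed endpoint.
function apiEndpoint(location: string): string {
  return location === "global" ? "speech.googleapis.com" : `${location}-speech.googleapis.com`;
}

// Streaming config in the camelCase form used by the Node.js client.
function buildStreamingConfig(model: string, languageCode: string, interimResults: boolean) {
  return {
    config: { autoDecodingConfig: {}, model, languageCodes: [languageCode] },
    streamingFeatures: { interimResults },
  };
}

// Assumed client-side filter: always show finals, and show interim
// results only when their stability reaches the threshold.
function shouldDisplay(result: GoogleResult, minStability: number): boolean {
  return result.isFinal || result.stability >= minStability;
}

// Assumed client-side rule for "Single utterance": stop after the first final.
function shouldStop(result: GoogleResult, singleUtterance: boolean): boolean {
  return singleUtterance && result.isFinal;
}
```

A threshold of 0 would show every interim result, while values near 1 would suppress most of them until the final arrives.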

Not listening

Google stream: disconnected

Status: Idle.

Listening after start: n/a

First final loaded: n/a

No final Google transcript yet.

No interim Google transcript yet.

No completed Google utterances yet.

No logs yet.