You say “Thanks.” The network says “Hold on.”

TL;DR

For short clips, DNS/TLS and upload setup can be most of the wait.
On-device avoids the handshake tax entirely.
The “feel” of dictation is dominated by stop-to-text latency: the gap between when you stop speaking and when the text appears.

Cloud flows typically include a DNS lookup, TCP and TLS handshakes, and at least one remote hop before any audio is processed. For very short phrases, this setup time can dominate the total wait. On‑device dictation avoids the detours entirely: your audio stays local, there’s nothing to upload, and text appears without waiting on the network.
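To make the handshake tax concrete, here is a rough back-of-the-envelope model in Python. It is a sketch under stated assumptions, not a measurement. It assumes a cold connection (no cached DNS, no connection reuse), one round trip each for DNS and TCP, one round trip for a TLS 1.3 handshake (two for TLS 1.2), and that the whole clip is uploaded after you stop speaking. The names `Network`, `cloud_setup_ms`, `cloud_stop_to_text_ms`, and `on_device_stop_to_text_ms` are made up for illustration, and the DNS cost is simplified to one round trip at the same RTT as the server.

```python
from dataclasses import dataclass


@dataclass
class Network:
    rtt_ms: float       # round-trip time to the server, in milliseconds
    uplink_kbps: float  # upload bandwidth, in kilobits per second


def cloud_setup_ms(net: Network, tls_round_trips: int = 1) -> float:
    """Time spent before any audio byte is sent (assumed cold connection).

    Assumption: DNS lookup = 1 RTT, TCP handshake = 1 RTT,
    TLS handshake = 1 RTT (TLS 1.3) or 2 RTTs (TLS 1.2).
    """
    return net.rtt_ms * (1 + 1 + tls_round_trips)


def cloud_stop_to_text_ms(
    net: Network, audio_kb: float, server_ms: float, tls_round_trips: int = 1
) -> float:
    """Setup + upload + one request/response round trip + server processing."""
    # kilobytes -> kilobits (x8), divided by kbps gives seconds; x1000 gives ms
    upload_ms = audio_kb * 8 * 1000 / net.uplink_kbps
    return cloud_setup_ms(net, tls_round_trips) + upload_ms + net.rtt_ms + server_ms


def on_device_stop_to_text_ms(local_ms: float) -> float:
    """On-device: no network terms at all, only local recognition time."""
    return local_ms
```

As an illustration with made-up inputs: a 32 KB clip (about one second of 16 kHz, 16-bit mono audio) over a link with 100 ms RTT and 1000 kbps uplink, with 100 ms of server processing, comes out to 756 ms in this model, of which 300 ms (about 40%) is setup alone. Shorter clips shrink the upload term but not the setup term, which is why setup dominates for short phrases.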

See the effect in the interactive demo (choose Short and try different networks): /blog/latency-demo


Short utterances and the hidden cost of handshakes
