Short utterances and the hidden cost of handshakes

For 5–15 second notes, network setup time can outweigh everything else. On‑device dictation avoids the detours.

· Updated · 1 min read

You say “Thanks.” The network says “Hold on.”

TL;DR

  • For short clips, DNS lookups, TLS handshakes, and upload setup can account for most of the wait.
  • On-device dictation avoids the handshake tax entirely.
  • The “feel” of dictation is dominated by stop-to-text latency.

Cloud flows typically include a DNS lookup, TCP and TLS handshakes, and at least one remote hop before any audio is processed. For very short phrases, this setup time can dominate the wait. On‑device dictation avoids the detours entirely: your audio stays local, there’s nothing to upload, and text appears as soon as local recognition finishes.
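
To make the arithmetic concrete, here is a rough back-of-the-envelope sketch in Python. Every number and name in it is an illustrative assumption, not a measurement: it assumes a cold connection (no cached DNS, no connection reuse), a one-round-trip TCP handshake followed by a one-round-trip TLS 1.3 handshake, and a short clip compressed to roughly 10 KB.

```python
from dataclasses import dataclass


@dataclass
class Network:
    """Hypothetical network profile; all values are assumptions."""
    rtt_ms: float        # round-trip time to the server
    dns_ms: float        # uncached DNS lookup
    uplink_kbps: float   # usable upload bandwidth in kilobits per second


def cloud_setup_ms(net: Network) -> float:
    # DNS lookup + TCP handshake (1 RTT) + TLS 1.3 handshake (1 RTT).
    return net.dns_ms + 2 * net.rtt_ms


def cloud_wait_ms(net: Network, audio_kb: float, server_ms: float) -> float:
    # Kilobytes -> kilobits, divided by kbps gives seconds, then to ms.
    upload_ms = audio_kb * 8 / net.uplink_kbps * 1000
    # One more round trip carries the request out and the text back.
    return cloud_setup_ms(net) + upload_ms + net.rtt_ms + server_ms


if __name__ == "__main__":
    # Assumed "slow mobile" profile and an assumed 10 KB clip.
    slow = Network(rtt_ms=200, dns_ms=100, uplink_kbps=1000)
    total = cloud_wait_ms(slow, audio_kb=10, server_ms=200)
    setup = cloud_setup_ms(slow)
    print(f"setup {setup:.0f} ms of {total:.0f} ms ({setup / total:.0%})")
```

Under these assumed numbers, setup alone is about half of the total wait, and the upload itself is a small slice. Warm connections, cached DNS, or session resumption would shrink the setup term, so treat this as a worst-case illustration rather than a benchmark.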

See the effect in the interactive demo (choose Short and try different networks): /blog/latency-demo

Related: Offline stays fast · Long sessions
