Long sessions: uploads vs 30‑second windowing

Why streaming on‑device and finalizing only the last ~30 seconds keeps long dictations responsive.

Half an hour of clean audio is not a “quick upload.”

TL;DR

  • Long sessions punish cloud workflows because upload time scales with audio length.
  • On-device streaming stays responsive by working in fixed windows.
  • Finalization is bounded: when you stop, only the last window needs finishing.

Uploading long, high‑quality audio takes time, especially on variable Wi‑Fi. Many cloud tools avoid heavy compression to protect accuracy, which keeps files large and makes the upload slower still.
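To make the scaling concrete, here is a rough back-of-the-envelope sketch. The bitrate (uncompressed 48 kHz, 16-bit mono PCM) and the 10 Mbit/s uplink are illustrative assumptions, not measurements from any particular cloud tool, and the function name `upload_seconds` is made up for this example. It also ignores protocol overhead and server-side processing, so treat the results as a best case.

```python
def upload_seconds(audio_seconds: float, audio_kbps: float, uplink_mbps: float) -> float:
    """Ideal transfer time: total audio bits divided by uplink bits per second."""
    total_bits = audio_seconds * audio_kbps * 1000
    return total_bits / (uplink_mbps * 1_000_000)


# Assumption: 48 kHz, 16-bit, mono PCM works out to 768 kbit/s.
PCM_KBPS = 48_000 * 16 / 1000

if __name__ == "__main__":
    # Assumption: a steady 10 Mbit/s uplink; real Wi-Fi varies.
    for minutes in (1, 10, 30):
        secs = upload_seconds(minutes * 60, PCM_KBPS, uplink_mbps=10)
        print(f"{minutes:>2} min of audio -> ~{secs:.0f} s to upload")
```

Under these assumptions, 30 minutes of audio needs roughly 138 seconds just to transfer, and the figure grows linearly with session length.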

Voice Type stays on‑device and transcribes continuously as you speak. When you stop, it finishes only the last ~30s window (≈2–3s on an M1), so the wait at the end does not grow with the length of the session. That’s why long sessions feel snappy in practice.
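The bounded-finalization idea can be illustrated with a toy model. This is not Voice Type's implementation: the class `WindowedSession`, its methods, and the one-second chunk size are hypothetical, and the model only tracks how much audio is still unfinished at stop time rather than doing any transcription.

```python
class WindowedSession:
    """Toy model: audio older than the trailing window counts as already final."""

    def __init__(self, window_seconds: float = 30.0):
        self.window_seconds = window_seconds
        self.committed_seconds = 0.0  # finalized while streaming
        self.pending_seconds = 0.0    # trailing audio not yet finalized

    def feed(self, chunk_seconds: float) -> None:
        """Add newly captured audio and commit whatever falls out of the window."""
        self.pending_seconds += chunk_seconds
        overflow = self.pending_seconds - self.window_seconds
        if overflow > 0:
            self.committed_seconds += overflow
            self.pending_seconds = self.window_seconds

    def stop(self) -> float:
        """Return the seconds of audio that still need finishing at stop."""
        remaining = self.pending_seconds
        self.committed_seconds += remaining
        self.pending_seconds = 0.0
        return remaining


def audio_left_at_stop(total_seconds: int, chunk_seconds: float = 1.0) -> float:
    """Simulate a session of the given length, fed in fixed-size chunks."""
    session = WindowedSession()
    for _ in range(int(total_seconds / chunk_seconds)):
        session.feed(chunk_seconds)
    return session.stop()


if __name__ == "__main__":
    for total in (20, 300, 1800):
        print(f"{total:>4} s session -> {audio_left_at_stop(total):.0f} s left to finish")
```

In this model a 5-minute session and a 30-minute session both leave 30 seconds of audio to finish at stop, while a 20-second session leaves only 20.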

Explore the difference (choose Medium or Long in the demo): /blog/latency-demo

Related: Offline vs Cloud (practical guide)

Freshness: Updated Dec 25, 2025

This article is reviewed against current product behavior, macOS guidance, and linked references. If a workflow changed after Dec 25, 2025, check the latest product docs and Apple guidance before relying on older steps or screenshots.

Try Voice Type

Dictate into any Mac text field without waiting on uploads.

Voice Type fits people who want local dictation, custom vocabulary, and a faster stop-to-text loop. The trial is the quickest way to see how it behaves on your own setup.

Freshly reviewed · 7-day trial · one-time purchase