30 Sept 2025

Cleaner input, cleaner transcripts: audio conditioning for accuracy

Normalized loudness and gentle filtering help the recognizer hear what you meant, not the room.

If the input is messy, the output will be too.

Voice Type normalizes loudness to a consistent target and applies a light high-pass filter to reduce low-frequency rumble. Combined with noise-aware voice activity detection, this gives the model input closer to the audio it was trained on, which leads to fewer garbles and better punctuation.
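To make the two conditioning steps concrete, here is a minimal sketch in plain Python. It is not Voice Type's actual implementation: the function names, the 80 Hz cutoff, and the -20 dBFS target are assumptions chosen for illustration, and RMS level stands in for a proper perceptual loudness measure. Samples are assumed to be floats in the range -1.0 to 1.0.

```python
import math


def high_pass(samples, sample_rate, cutoff_hz=80.0):
    """First-order high-pass filter to reduce low-frequency rumble.

    cutoff_hz=80.0 is an illustrative assumption, not a documented setting.
    """
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate
    alpha = rc / (rc + dt)
    out = []
    prev_in = 0.0
    prev_out = 0.0
    for x in samples:
        # Standard one-pole high-pass recurrence.
        y = alpha * (prev_out + x - prev_in)
        out.append(y)
        prev_in, prev_out = x, y
    return out


def rms_dbfs(samples):
    """RMS level in dB relative to full scale (1.0); -inf for silence."""
    if not samples:
        return float("-inf")
    mean_sq = sum(s * s for s in samples) / len(samples)
    if mean_sq == 0.0:
        return float("-inf")
    return 10.0 * math.log10(mean_sq)


def normalize_loudness(samples, target_dbfs=-20.0, peak_limit=1.0):
    """Scale samples so their RMS level hits target_dbfs.

    target_dbfs=-20.0 is an assumed target. The gain is reduced if it
    would push the loudest sample past peak_limit (to avoid clipping).
    """
    level = rms_dbfs(samples)
    if level == float("-inf"):
        return list(samples)  # Nothing to scale in pure silence.
    gain = 10.0 ** ((target_dbfs - level) / 20.0)
    peak = max(abs(s) for s in samples)
    if peak * gain > peak_limit:
        gain = peak_limit / peak
    return [s * gain for s in samples]


def condition(samples, sample_rate):
    """Filter first, then normalize, so rumble does not skew the level."""
    return normalize_loudness(high_pass(samples, sample_rate), target_dbfs=-20.0)
```

Filtering before normalizing is a deliberate choice in this sketch: if rumble were still present when the level is measured, it would count toward the loudness estimate and the speech would end up quieter than intended.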

We avoid heavy “prompt fixes” that can make transcripts look confident but less faithful. Instead, we improve the signal before recognition.