
Benchmark · February 16, 2026 · 10 min read

How We Measure Dictation Latency

A reproducible method for evaluating end-of-dictation completion speed across dictation tools.

Quick answer

We measure from end of speech to visible final text output using the same 20-second phrase across repeated trials.

Tags

benchmark · latency · methodology · measurement

Speed claims in voice products are easy to make and hard to trust unless the timing boundary is explicit.

This post explains exactly how Almond measures dictation latency so anyone can reproduce the process with their own stack.

The metric we optimize

Our primary metric is end-of-dictation to visible final text. That means the timer starts when speech ends and stops when the final transcript is fully usable in the target app.

We use this boundary because it maps to human experience. Users feel the delay after they stop speaking; raw transcription throughput while they are still talking is not what they notice.
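
Expressed as a computation, the metric is nothing more than the difference between two marks on the same clock. A minimal sketch in Python; the names end_of_speech and final_text_visible are illustrative timestamps you capture yourself, not part of any Almond API:

    def dictation_latency(end_of_speech: float, final_text_visible: float) -> float:
        """End-of-dictation latency: seconds from the end of speech to the moment
        the final transcript is visible and editable in the target app."""
        return final_text_visible - end_of_speech

    # Speech ended at t=31.4s, final text landed at t=33.1s on the same clock.
    print(dictation_latency(31.4, 33.1))  # ~1.7 seconds

The only subtlety is that both marks must come from the same monotonic clock, otherwise clock drift leaks into the number.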

What we keep constant

To make comparisons fair, we control the test environment:

  • Same spoken phrase length: 20 seconds.
  • Same microphone and speaking cadence.
  • Same hardware and macOS version.
  • Same target app and insertion context.
  • Repeated trials, then median reporting.

Any variable that changes between runs can distort conclusions, so we keep the setup simple and repeatable.
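
If you script your runs, pinning those constants in one place makes it obvious when the setup drifts between sessions. A sketch of such a record; every field name and value below is illustrative, not a real Almond configuration:

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class BenchmarkSetup:
        phrase_seconds: int = 20            # same spoken phrase length every run
        microphone: str = "built-in"        # same mic and speaking cadence
        machine: str = "Apple Silicon Mac"  # same hardware and macOS version
        macos_version: str = "15.6"
        target_app: str = "Notes"           # same app and insertion context
        trials_per_tool: int = 10           # repeated trials, median reported

    SETUP = BenchmarkSetup()  # reuse the same frozen setup for every tool under test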

Trial protocol

  1. Start recording in the target dictation tool.
  2. Speak the standardized 20-second phrase.
  3. Stop speaking and mark that timestamp.
  4. Stop timing when final text is visible and editable.
  5. Repeat across multiple runs and record median plus spread.
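
The protocol translates directly into a small timing script. A minimal sketch, assuming you mark steps 3 and 4 by pressing Enter while the tool under test runs normally in the foreground; nothing here hooks into the dictation tool itself:

    import time

    def run_trial() -> float:
        """One trial: returns end-of-dictation latency in seconds."""
        input("Start recording, speak the 20-second phrase, then press Enter the instant you stop speaking...")
        end_of_speech = time.monotonic()

        input("Press Enter when the final text is visible and editable in the target app...")
        final_text_visible = time.monotonic()

        return final_text_visible - end_of_speech

    latencies = [run_trial() for _ in range(10)]  # repeat across runs
    print(sorted(latencies))

Manual marking adds human reaction time to every trial, but it adds it to every tool equally, so relative comparisons still hold.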

Why median and variance both matter

Median tells you typical speed. Variance tells you predictability.

A tool with a decent median but high variance still feels unreliable in daily writing. That is why we look at both central tendency and consistency.
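
With per-trial latencies in hand, both numbers fall out of Python's standard library. The sample values below are made up to show the effect of a single spike:

    import statistics

    latencies = [1.8, 1.7, 2.1, 1.9, 6.4, 1.8, 2.0, 1.7, 1.9, 2.2]  # seconds, illustrative

    median = statistics.median(latencies)
    spread = statistics.pstdev(latencies)  # or report an interquartile range

    print(f"median: {median:.2f}s, spread: {spread:.2f}s")
    # The lone 6.4s outlier barely moves the median but inflates the spread,
    # which is exactly the unpredictability users feel in daily writing.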

Common benchmark mistakes

  • Stopping the timer at partial or streaming text instead of final usable output.
  • Running single-shot tests that ignore random spikes.
  • Comparing across different workflows or app contexts.
  • Ignoring correction overhead after insertion.

Each of these mistakes can make a workflow that feels slow in practice look fast on paper.

How to replicate this yourself

You can follow the same method with your own phrase, app mix, and dictation tools. We recommend at least 10 trials per tool and reporting both median and rough spread.

If you want the complete reference, review our full methodology page and benchmark summary at dictation-speed-benchmark.

Bottom line

Measurement should make claims easier to verify, not harder. Clear boundaries create better product comparisons and better product decisions.


Published February 16, 2026 · Updated February 16, 2026
