
Benchmark · February 16, 2026 · 10 min read

How We Measure Dictation Latency

A reproducible method for evaluating end-of-dictation completion speed across dictation tools.

Quick answer

We measure from end of speech to visible final text output using the same 20-second phrase across repeated trials.

Tags

benchmark · latency · methodology · measurement

Speed claims in voice products are easy to make and hard to trust unless the timing boundary is explicit.

This post explains exactly how Almond measures dictation latency so anyone can reproduce the process with their own stack.

The metric we optimize

Our primary metric is end-of-dictation to visible final text. That means the timer starts when speech ends and stops when the final transcript is fully usable in the target app.

We use this boundary because it maps to human experience. Users feel the delay after they stop speaking; raw transcription throughput while they are still talking is not what they notice.
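
Expressed as a computation, the metric is nothing more than the difference between two marks on the same clock. A minimal sketch in Python; the names end_of_speech and final_text_visible are illustrative timestamps you capture yourself, not part of any Almond API:

    def dictation_latency(end_of_speech: float, final_text_visible: float) -> float:
        """End-of-dictation latency: seconds from the end of speech to the moment
        the final transcript is visible and editable in the target app."""
        return final_text_visible - end_of_speech

    # Speech ended at t=31.4s, final text landed at t=33.1s on the same clock.
    print(dictation_latency(31.4, 33.1))  # ~1.7 seconds

The only subtlety is that both marks must come from the same monotonic clock, otherwise clock drift leaks into the number.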

What we keep constant

To make comparisons fair, we control the test environment:

  • Same spoken phrase length: 20 seconds.
  • Same microphone and speaking cadence.
  • Same hardware and macOS version.
  • Same target app and insertion context.
  • Repeated trials, then median reporting.

Any variable that changes between runs can distort conclusions, so we keep the setup simple and repeatable.
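
If you script your runs, pinning those constants in one place makes it obvious when the setup drifts between sessions. A sketch of such a record; every field name and value below is illustrative, not a real Almond configuration:

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class BenchmarkSetup:
        phrase_seconds: int = 20            # same spoken phrase length every run
        microphone: str = "built-in"        # same mic and speaking cadence
        machine: str = "Apple Silicon Mac"  # same hardware and macOS version
        macos_version: str = "15.6"
        target_app: str = "Notes"           # same app and insertion context
        trials_per_tool: int = 10           # repeated trials, median reported

    SETUP = BenchmarkSetup()  # reuse the same frozen setup for every tool under test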

Trial protocol

  1. Start recording in the target dictation tool.
  2. Speak the standardized 20-second phrase.
  3. Stop speaking and mark that timestamp.
  4. Stop timing when final text is visible and editable.
  5. Repeat across multiple runs and record median plus spread.
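
The protocol translates directly into a small timing script. A minimal sketch, assuming you mark steps 3 and 4 by pressing Enter while the tool under test runs normally in the foreground; nothing here hooks into the dictation tool itself:

    import time

    def run_trial() -> float:
        """One trial: returns end-of-dictation latency in seconds."""
        input("Start recording, speak the 20-second phrase, then press Enter the instant you stop speaking...")
        end_of_speech = time.monotonic()

        input("Press Enter when the final text is visible and editable in the target app...")
        final_text_visible = time.monotonic()

        return final_text_visible - end_of_speech

    latencies = [run_trial() for _ in range(10)]  # repeat across runs
    print(sorted(latencies))

Manual marking adds human reaction time to every trial, but it adds it to every tool equally, so relative comparisons still hold.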

Why median and variance both matter

Median tells you typical speed. Variance tells you predictability.

A tool with a decent median but high variance still feels unreliable in daily writing. That is why we look at both central tendency and consistency.
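
With per-trial latencies in hand, both numbers fall out of Python's standard library. The sample values below are made up to show the effect of a single spike:

    import statistics

    latencies = [1.8, 1.7, 2.1, 1.9, 6.4, 1.8, 2.0, 1.7, 1.9, 2.2]  # seconds, illustrative

    median = statistics.median(latencies)
    spread = statistics.pstdev(latencies)  # or report an interquartile range

    print(f"median: {median:.2f}s, spread: {spread:.2f}s")
    # The lone 6.4s outlier barely moves the median but inflates the spread,
    # which is exactly the unpredictability users feel in daily writing.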

Common benchmark mistakes

  • Stopping the timer at partial or streaming text instead of final usable output.
  • Running single-shot tests that ignore random spikes.
  • Comparing across different workflows or app contexts.
  • Ignoring correction overhead after insertion.

Each of these mistakes can make a workflow that feels slow in practice look fast on paper.

How to replicate this yourself

You can follow the same method with your own phrase, app mix, and dictation tools. We recommend at least 10 trials per tool and reporting both median and rough spread.

If you want the complete reference, review our full methodology page and benchmark summary at dictation-speed-benchmark.

Bottom line

Measurement should make claims easier to verify, not harder. Clear boundaries create better product comparisons and better product decisions.


Published February 16, 2026 · Updated February 16, 2026
