Audio Transcription

# Assumes `evaluator` is an already-initialized evaluator client from the SDK.
result = evaluator.evaluate(
    eval_templates="audio_transcription",
    inputs={
        # URL (or local file path) of the audio containing the speech
        "audio": "https://datasets-server.huggingface.co/assets/MLCommons/peoples_speech/--/f10597c5d3d3a63f8b6827701297c3afdf178272/--/clean/train/0/audio/audio.wav",
        # Transcription text to be checked against the audio
        "transcription": "i wanted this to share a few things but i'm going to not share as much as i wanted to share because we are starting late i'd like to get this thing going so we all get home at a decent hour this this election is very important to"
    },
    model_name="turing_flash"
)

# Score for the first eval result, followed by the explanation behind it
print(result.eval_results[0].output)
print(result.eval_results[0].reason)


Required Input	Type	Description
`audio`	`string`	The file path or URL to the audio file containing the speech
`transcription`	`string`	The text transcription to be evaluated for accuracy
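
Both inputs are plain strings, so a quick local check before calling the evaluator can catch a mistyped path or an empty transcription early. The sketch below is only an illustration: `build_inputs` is a hypothetical helper, not part of the SDK, and it assumes that `audio` is either an http(s) URL or a path to a file that exists on the local machine.

```python
import os
from urllib.parse import urlparse


def build_inputs(audio: str, transcription: str) -> dict:
    """Hypothetical helper: run basic local checks and return the inputs dict."""
    if not isinstance(audio, str) or not audio.strip():
        raise ValueError("audio must be a non-empty string")
    if not isinstance(transcription, str) or not transcription.strip():
        raise ValueError("transcription must be a non-empty string")

    # Accept http(s) URLs as-is; anything else is treated as a local file path.
    parsed = urlparse(audio)
    is_url = parsed.scheme in ("http", "https") and bool(parsed.netloc)
    if not is_url and not os.path.isfile(audio):
        raise FileNotFoundError(
            f"audio is neither an http(s) URL nor an existing file: {audio}"
        )

    # Only surrounding whitespace is removed; the wording is left untouched.
    return {"audio": audio, "transcription": transcription.strip()}
```

The returned dictionary has the same two keys as the `inputs` argument in the example above, so it could be passed in its place.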

Output
Field	Description
Result	Returns a numeric score; a higher score indicates a more accurate transcription
Reason	Provides a detailed explanation of the transcription assessment

What to Do If You Get Undesired Results

If the transcription accuracy score is lower than expected:

Ensure the audio is clear with minimal background noise
Check for proper capitalization and punctuation in the transcription
Include all filler words (um, uh, etc.) for verbatim accuracy if required
Verify correct spelling of technical terms, names, or specialized vocabulary
Review for word substitution errors where similar-sounding words are confused
Consider using professional transcription services for important content
For non-native speakers, ensure the transcriber is familiar with the accent
Use timestamps for longer audio to help identify where errors might occur
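
When a trusted reference transcript is available, several of the checks above (capitalization, punctuation, word substitutions) can be narrowed down locally before re-running the eval. The following is a minimal sketch using Python's standard `difflib`; it is not how the eval computes its score. The `normalize` and `word_diff` helpers are hypothetical names, and the normalization rules (lowercasing, dropping punctuation other than apostrophes) are assumptions that may need adjusting for verbatim transcripts.

```python
import re
from difflib import SequenceMatcher


def normalize(text: str) -> list[str]:
    """Lowercase, drop punctuation except apostrophes, and split into words."""
    text = text.lower()
    text = re.sub(r"[^\w\s']", " ", text)
    return text.split()


def word_diff(reference: str, hypothesis: str) -> list[tuple[str, str, str]]:
    """Return (operation, reference words, hypothesis words) for each mismatch.

    The operation is one of 'replace', 'delete', or 'insert', as reported by
    difflib.SequenceMatcher when aligning the two word sequences.
    """
    ref, hyp = normalize(reference), normalize(hypothesis)
    matcher = SequenceMatcher(a=ref, b=hyp, autojunk=False)
    return [
        (tag, " ".join(ref[i1:i2]), " ".join(hyp[j1:j2]))
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


if __name__ == "__main__":
    # Hypothetical pair: a trusted reference and the transcription under test.
    reference = "I'd like to get this thing going."
    hypothesis = "i'd like to get these thing going"
    for op, ref_words, hyp_words in word_diff(reference, hypothesis):
        print(f"{op}: reference={ref_words!r} transcription={hyp_words!r}")
```

Because both texts are normalized first, differences that remain are word-level substitutions, omissions, or insertions rather than casing or punctuation, which helps separate the two kinds of problem.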

Comparing Audio Transcription with Similar Evals

Audio Quality: While Audio Transcription evaluates the accuracy of converting speech to text, Audio Quality assesses the perceptual quality of the audio itself.
Context Adherence: Audio Transcription focuses on accurately capturing spoken words, while Context Adherence evaluates how well content aligns with given context or instructions.
