Evaluation Using Interface
Input:
- Required Inputs:
  - input_audio: The audio file (URL or local path) containing the speech to be transcribed.
  - input_transcription: The text transcription to be evaluated for accuracy against the audio.
Output:
- Score: A numeric score between 0 and 1, where 1 represents perfect transcription accuracy.
- Reason: A detailed explanation of the transcription assessment.
Evaluation Using SDK
Click here to learn how to set up evaluation using the SDK.
Input:
- Required Inputs:
  - input_audio (string): The file path or URL to the audio file containing the speech.
  - input_transcription (string): The text transcription to be evaluated for accuracy.
Output:
- Score: Returns a float value between 0 and 1, where higher values indicate a more accurate transcription.
- Reason: Provides a detailed explanation of the transcription assessment.
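The input and output shapes listed above can be sketched in plain Python. This is a minimal illustration, not the SDK itself: it makes no SDK calls, the helper names `build_inputs` and `parse_result` and the class `TranscriptionEvalResult` are hypothetical, and the result keys `score` and `reason` are assumed from the field names in this section.

```python
from dataclasses import dataclass


@dataclass
class TranscriptionEvalResult:
    """Hypothetical container mirroring the documented output fields."""
    score: float  # between 0 and 1; higher means a more accurate transcription
    reason: str   # explanation of the assessment


def build_inputs(input_audio: str, input_transcription: str) -> dict:
    """Assemble the two required string inputs under their documented names."""
    if not isinstance(input_audio, str) or not input_audio.strip():
        raise ValueError("input_audio must be a non-empty file path or URL")
    if not isinstance(input_transcription, str) or not input_transcription.strip():
        raise ValueError("input_transcription must be a non-empty string")
    return {"input_audio": input_audio, "input_transcription": input_transcription}


def parse_result(raw: dict) -> TranscriptionEvalResult:
    """Validate a raw result (assumed keys: 'score', 'reason') against the 0 to 1 range."""
    score = float(raw["score"])
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score {score} is outside the documented 0 to 1 range")
    return TranscriptionEvalResult(score=score, reason=str(raw["reason"]))


# Placeholder values for illustration only.
inputs = build_inputs("https://example.com/call.wav", "Hello, thanks for calling.")
result = parse_result({"score": 0.92, "reason": "One filler word was omitted."})
print(inputs["input_audio"], result.score)
```

Validating the score range on the way in makes it easier to notice when a result does not match the documented contract, rather than silently comparing against an out-of-range value.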
What to Do If You Get Undesired Results
If the transcription accuracy score is lower than expected:
- Ensure the audio is clear with minimal background noise
- Check for proper capitalization and punctuation in the transcription
- Include all filler words (um, uh, etc.) for verbatim accuracy if required
- Verify correct spelling of technical terms, names, or specialized vocabulary
- Review for word substitution errors where similar-sounding words are confused
- Consider using professional transcription services for important content
- For non-native speakers, ensure the transcriber is familiar with the accent
- Use timestamps for longer audio to help identify where errors might occur
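Some of the checks above (filler words, spelling, word substitutions) can be partly automated when a trusted reference transcript is available. That is an assumption here, since the evaluation itself only takes the audio and the transcription. The sketch below uses Python's standard `difflib` module to list word-level differences after normalizing case and punctuation; the function names `normalize` and `word_differences` are hypothetical.

```python
import difflib
import string


def normalize(text: str) -> list[str]:
    """Lowercase, strip ASCII punctuation, and split into words."""
    table = str.maketrans("", "", string.punctuation)
    return text.lower().translate(table).split()


def word_differences(reference: str, candidate: str) -> list[tuple[str, str, str]]:
    """Return (operation, reference words, candidate words) for each mismatch."""
    ref, cand = normalize(reference), normalize(candidate)
    matcher = difflib.SequenceMatcher(a=ref, b=cand, autojunk=False)
    diffs = []
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op != "equal":
            diffs.append((op, " ".join(ref[i1:i2]), " ".join(cand[j1:j2])))
    return diffs


# Illustrative sentences: the candidate drops a filler word and confuses
# two similar-sounding words.
reference = "Um, please send the invoice to their office by Friday."
candidate = "Please send the invoice to there office by Friday."
for op, ref_words, cand_words in word_differences(reference, candidate):
    print(f"{op}: {ref_words!r} -> {cand_words!r}")
```

For these two sentences the diff reports a deleted "um" and "their" replaced by "there". Because normalization removes case and punctuation, capitalization and punctuation problems need a separate, unnormalized comparison.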
Comparing Audio Transcription with Similar Evals
- Audio Quality: While Audio Transcription evaluates the accuracy of converting speech to text, Audio Quality assesses the perceptual quality of the audio itself.
- Context Adherence: Audio Transcription focuses on accurately capturing spoken words, while Context Adherence evaluates how well content aligns with given context or instructions.