OCR Evaluation
Evaluates the quality of OCR output by verifying that the extracted JSON content faithfully represents the information in the source PDF document.
```python
# Assumes `evaluator` is an initialized Evaluator client from the SDK.
result = evaluator.evaluate(
    eval_templates="ocr_evaluation",
    inputs={
        "input_pdf": "path/to/document.pdf",
        "json_content": '{"name": "John Doe", "date": "2024-01-01", "amount": "$100.00"}'
    },
    model_name="turing_flash"
)
print(result.eval_results[0].output)
print(result.eval_results[0].reason)
```

```typescript
import { Evaluator, Templates } from "@future-agi/ai-evaluation";

const evaluator = new Evaluator();
const result = await evaluator.evaluate(
  "ocr_evaluation",
  {
    input_pdf: "path/to/document.pdf",
    json_content: '{"name": "John Doe", "date": "2024-01-01", "amount": "$100.00"}'
  },
  {
    modelName: "turing_flash",
  }
);
console.log(result);
```

Input

| Required Input | Type | Description |
|---|---|---|
| input_pdf | string | The PDF document to verify against |
| json_content | string | The JSON content extracted by OCR, to be evaluated |
Output

| Field | Description |
|---|---|
| Result | Returns a numeric score where higher values indicate more accurate OCR extraction |
| Reason | Provides a detailed explanation of the OCR quality assessment |
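Since the result is a numeric score, a common pattern is to gate a downstream pipeline on a threshold. The helper and threshold value below are illustrative assumptions, not part of the SDK; it takes the score from `result.eval_results[0].output` as shown above:

```python
def ocr_passes(score, threshold: float = 0.8) -> bool:
    """Treat the extraction as trustworthy only at or above a chosen threshold.

    `threshold` is an assumption here: pick a value suited to your documents
    by inspecting scores (and their Reason fields) on a labeled sample.
    """
    return float(score) >= threshold

print(ocr_passes(0.92))  # → True
print(ocr_passes(0.55))  # → False
```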
What to Do When OCR Evaluation Score is Low
If the OCR evaluation score is lower than expected:
- Check for poor scan quality or low-resolution images in the PDF
- Verify that the OCR tool supports the fonts and languages present in the document
- Review the JSON structure to ensure it maps correctly to the document fields
- Look for misinterpreted characters (e.g., `0` vs `O`, `1` vs `l`)
- Ensure tables and multi-column layouts are being parsed correctly
- Consider pre-processing the PDF to improve contrast and clarity before OCR
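The last point above can be sketched as a simple contrast stretch. This is a minimal pure-Python illustration operating on a strip of grayscale pixel values; a real pipeline would apply an imaging library such as Pillow or OpenCV to each rendered PDF page before OCR:

```python
def stretch_contrast(pixels):
    """Linearly rescale grayscale values (0-255) to span the full range.

    A faint, low-contrast scan occupies a narrow band of gray levels;
    stretching that band to 0-255 makes glyph edges easier for OCR to find.
    """
    lo, hi = min(pixels), max(pixels)
    if hi == lo:  # flat image: nothing to stretch
        return pixels[:]
    return [round((p - lo) * 255 / (hi - lo)) for p in pixels]

faint_scan = [100, 110, 120, 130]    # low-contrast pixel strip
print(stretch_contrast(faint_scan))  # → [0, 85, 170, 255]
```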
Comparing OCR Evaluation with Similar Evals
- Ground Truth Match: While OCR Evaluation checks the accuracy of structured extraction from a PDF, Ground Truth Match compares any generated output against a known expected value.