JSON & Structured Output
14 metrics for validating JSON correctness, schema compliance, type checking, and structured output quality.
- 5 JSON metrics (binary): is_json, contains_json, json_schema, json_validation, json_syntax
- 9 structured output metrics (continuous 0.0-1.0): schema compliance, types, fields, hierarchy
- All run locally via `from fi.evals import evaluate`
Validate whether your LLM produces correct JSON and well-structured output.
from fi.evals import evaluate
result = evaluate("is_json", output='{"name": "Alice", "age": 30}')
print(result.score) # 1.0
print(result.passed) # True
JSON Metrics
Binary checks: the output either passes or fails.
| Metric | What it checks |
|---|---|
| is_json | Entire output is valid JSON |
| contains_json | Output contains JSON somewhere within it |
| json_schema | Output matches a provided JSON schema |
| json_validation | Structural correctness and data integrity |
| json_syntax | Correct JSON syntax: quoting, brackets, commas, escaping |
is_json
Entire output is valid JSON.
result = evaluate("is_json", output='{"status": "ok", "code": 200}')
# score → 1.0
result = evaluate("is_json", output="This is not JSON")
# score → 0.0
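To make "entire output is valid JSON" concrete, here is a minimal standard-library sketch of the same idea. It is not the fi.evals implementation, and `looks_like_json` is a hypothetical helper name; how `is_json` itself treats edge cases such as bare scalars is not stated here.

```python
import json


def looks_like_json(text: str) -> bool:
    """Return True if the whole string parses as one JSON document."""
    try:
        json.loads(text)
    except (json.JSONDecodeError, TypeError):
        return False
    return True


print(looks_like_json('{"status": "ok", "code": 200}'))  # True
print(looks_like_json("This is not JSON"))               # False
print(looks_like_json('Here: {"a": 1}'))                 # False (extra text)
```

The last call shows the difference from contains_json: any surrounding text makes the whole-string parse fail.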
contains_json
Output contains JSON somewhere within it, even surrounded by text.
result = evaluate("contains_json", output='Here is the result: {"name": "Alice"} as expected.')
# score → 1.0
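One way to find JSON embedded in free text is to try decoding from every `{` or `[` with the standard library's `json.JSONDecoder.raw_decode`. The sketch below only illustrates that approach; `find_embedded_json` is a hypothetical name, and contains_json may detect JSON differently.

```python
import json


def find_embedded_json(text: str):
    """Return the first JSON object or array found inside text, or None."""
    decoder = json.JSONDecoder()
    for start, ch in enumerate(text):
        if ch not in "{[":
            continue
        try:
            # raw_decode parses one JSON value starting at `start`
            # and ignores whatever text follows it.
            value, _end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            continue
        return value
    return None


print(find_embedded_json('Here is the result: {"name": "Alice"} as expected.'))
# {'name': 'Alice'}
```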
json_schema
Output matches a provided JSON schema. Pass the schema via config.
result = evaluate(
    "json_schema",
    output='{"name": "Alice", "age": 30}',
    config={
        "schema": {
            "type": "object",
            "properties": {"name": {"type": "string"}, "age": {"type": "integer"}},
            "required": ["name", "age"],
        }
    },
)
# score → 1.0
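For intuition about what the schema above asserts, the following sketch checks only three JSON Schema keywords (`type`, `properties`, `required`) with the standard library. It is a deliberately small subset written for illustration, not how json_schema is implemented; `matches` and `TYPE_MAP` are hypothetical names.

```python
import json

# JSON Schema type names mapped to Python types (illustrative subset).
TYPE_MAP = {
    "object": dict,
    "array": list,
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
    "null": type(None),
}


def matches(value, schema: dict) -> bool:
    """Check value against the `type`, `properties`, and `required` keywords."""
    expected = schema.get("type")
    if expected is not None:
        # bool is a subclass of int in Python, so reject it for numeric types.
        if expected in ("integer", "number") and isinstance(value, bool):
            return False
        if not isinstance(value, TYPE_MAP[expected]):
            return False
    if isinstance(value, dict):
        if any(key not in value for key in schema.get("required", [])):
            return False
        for key, sub_schema in schema.get("properties", {}).items():
            if key in value and not matches(value[key], sub_schema):
                return False
    return True


schema = {
    "type": "object",
    "properties": {"name": {"type": "string"}, "age": {"type": "integer"}},
    "required": ["name", "age"],
}
print(matches(json.loads('{"name": "Alice", "age": 30}'), schema))    # True
print(matches(json.loads('{"name": "Alice", "age": "30"}'), schema))  # False
```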
json_validation
Validates JSON for structural correctness and data integrity.
result = evaluate("json_validation", output='{"users": [{"id": 1, "name": "Alice"}]}')
# score → 1.0
json_syntax
Checks for correct JSON syntax — quoting, brackets, commas, escaping.
result = evaluate("json_syntax", output='{"temp": 22.5, "valid": true}')
# score → 1.0
result = evaluate("json_syntax", output='{"valid": True}')
# score → 0.0 (True should be true)
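When a syntax check fails, it usually helps to know where. The sketch below uses the standard library's `json.JSONDecodeError`, which carries the message and position of the first error. It is independent of fi.evals, and `syntax_error` is a hypothetical helper.

```python
import json


def syntax_error(text: str):
    """Return a description of the first JSON syntax error, or None if valid."""
    try:
        json.loads(text)
    except json.JSONDecodeError as exc:
        return f"{exc.msg} at line {exc.lineno}, column {exc.colno}"
    return None


print(syntax_error('{"temp": 22.5, "valid": true}'))  # None
print(syntax_error('{"valid": True}'))                # points at the Python-style True
```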
Structured Output Metrics
Continuous scores (0.0-1.0) measuring how closely the output structure matches an expected structure. Pass both as JSON strings: the model output as output and the expected structure as expected_output.
| Metric | What it measures |
|---|---|
| schema_compliance | Conformance to expected schema (field names, nesting, types) |
| type_compliance | Whether field types match expected types |
| field_completeness | Proportion of expected fields that are present |
| required_fields | Whether all required fields are present |
| field_coverage | Percentage of expected fields with non-null, non-empty values |
| hierarchy_score | How well nested structure matches the expected hierarchy |
| tree_edit_distance | Normalized edit distance between tree structures |
| structured_output_score | Composite score combining schema, type, field, and hierarchy checks |
| quick_structured_check | Fast basic structure check for quick pass/fail |
schema_compliance
How well the output conforms to the expected schema (field names, nesting, types).
result = evaluate(
    "schema_compliance",
    output='{"name": "Alice", "age": 30, "email": "alice@test.com"}',
    expected_output='{"name": "string", "age": 0, "email": "string"}',
)
# score → 1.0
type_compliance
Whether field types match expected types.
result = evaluate(
    "type_compliance",
    output='{"name": "Alice", "age": 30, "active": true}',
    expected_output='{"name": "Bob", "age": 25, "active": false}',
)
# score → 1.0 (all types match)
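A type comparison like this can be pictured as mapping each value to its JSON type name and counting the fields where the two sides agree. The sketch below does that for top-level fields only. It is an assumption about the general idea rather than the actual scoring formula, and `json_type` and `type_match_ratio` are hypothetical names.

```python
import json


def json_type(value) -> str:
    """Map a decoded Python value to its JSON type name."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # check bool before int: bool subclasses int
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    return "object"


def type_match_ratio(output: str, expected: str) -> float:
    """Fraction of expected top-level fields whose JSON type matches."""
    out, exp = json.loads(output), json.loads(expected)
    if not exp:
        return 1.0
    hits = sum(
        1 for key, value in exp.items()
        if key in out and json_type(out[key]) == json_type(value)
    )
    return hits / len(exp)


print(type_match_ratio(
    '{"name": "Alice", "age": 30, "active": true}',
    '{"name": "Bob", "age": 25, "active": false}',
))  # 1.0
```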
field_completeness
Proportion of expected fields that are present.
result = evaluate(
    "field_completeness",
    output='{"name": "Alice"}',
    expected_output='{"name": "string", "age": 0, "email": "string"}',
)
# score → ~0.33 (1 of 3 fields)
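The "1 of 3 fields" arithmetic can be reproduced by hand. This sketch counts top-level keys only and uses a hypothetical helper name, `completeness_ratio`; whether field_completeness also descends into nested objects is not stated here.

```python
import json


def completeness_ratio(output: str, expected: str) -> float:
    """Fraction of expected top-level keys that appear in the output."""
    out, exp = json.loads(output), json.loads(expected)
    if not exp:
        return 1.0
    present = [key for key in exp if key in out]
    return len(present) / len(exp)


ratio = completeness_ratio(
    '{"name": "Alice"}',
    '{"name": "string", "age": 0, "email": "string"}',
)
print(round(ratio, 2))  # 0.33
```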
required_fields
Whether all required fields are present.
result = evaluate(
    "required_fields",
    output='{"id": 1, "name": "Alice", "email": "alice@test.com"}',
    expected_output='{"id": 0, "name": "string", "email": "string"}',
)
# score → 1.0
field_coverage
Percentage of expected fields with non-null, non-empty values.
result = evaluate(
    "field_coverage",
    output='{"name": "Alice", "age": null, "bio": ""}',
    expected_output='{"name": "string", "age": 0, "bio": "string"}',
)
# score → ~0.33 (only name has a value)
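Coverage differs from completeness in that a field must be present and also carry a value. The sketch below treats `null`, empty strings, empty arrays, and empty objects as missing. That definition of "empty" is an assumption made for illustration, and `has_value` and `coverage_ratio` are hypothetical names.

```python
import json


def has_value(value) -> bool:
    """True unless the value is null or an empty string, list, or dict."""
    return value is not None and value != "" and value != [] and value != {}


def coverage_ratio(output: str, expected: str) -> float:
    """Fraction of expected top-level keys present with a non-empty value."""
    out, exp = json.loads(output), json.loads(expected)
    if not exp:
        return 1.0
    filled = sum(1 for key in exp if key in out and has_value(out[key]))
    return filled / len(exp)


ratio = coverage_ratio(
    '{"name": "Alice", "age": null, "bio": ""}',
    '{"name": "string", "age": 0, "bio": "string"}',
)
print(round(ratio, 2))  # 0.33
```

Under this definition `0` and `false` still count as values, since they are legitimate data rather than gaps.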
hierarchy_score
How well nested structure matches the expected hierarchy.
result = evaluate(
    "hierarchy_score",
    output='{"user": {"name": "Alice", "address": {"city": "NYC"}}}',
    expected_output='{"user": {"name": "string", "address": {"city": "string"}}}',
)
# score → 1.0
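One simple way to reason about hierarchy is to flatten each document into dotted key paths and compare the sets. The sketch below takes that approach as an illustration only; the real hierarchy_score may weight depth or ordering differently, and `key_paths` and `path_overlap` are hypothetical names.

```python
import json


def key_paths(value, prefix: str = "") -> set:
    """Collect dotted paths for every key in nested JSON objects."""
    paths = set()
    if isinstance(value, dict):
        for key, child in value.items():
            path = f"{prefix}.{key}" if prefix else key
            paths.add(path)
            paths |= key_paths(child, path)
    return paths


def path_overlap(output: str, expected: str) -> float:
    """Fraction of expected key paths found at the same place in the output."""
    out_paths = key_paths(json.loads(output))
    exp_paths = key_paths(json.loads(expected))
    if not exp_paths:
        return 1.0
    return len(out_paths & exp_paths) / len(exp_paths)


expected = '{"user": {"name": "string", "address": {"city": "string"}}}'
print(path_overlap('{"user": {"name": "Alice", "address": {"city": "NYC"}}}', expected))  # 1.0
print(path_overlap('{"user": {"name": "Alice"}, "city": "NYC"}', expected))               # 0.5
```

In the second call `city` is present but at the wrong level, so only `user` and `user.name` match 2 of the 4 expected paths.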
tree_edit_distance
Normalized edit distance between tree structures, reported as a similarity: higher means more similar, and identical trees score 1.0.
result = evaluate(
    "tree_edit_distance",
    output='{"a": 1, "b": {"c": 2}}',
    expected_output='{"a": 1, "b": {"c": 2}}',
)
# score → 1.0 (identical)
structured_output_score
Composite score combining schema compliance, type compliance, field completeness, and hierarchy.
result = evaluate(
    "structured_output_score",
    output='{"id": 1, "name": "Alice", "tags": ["python"], "meta": {"role": "engineer"}}',
    expected_output='{"id": 0, "name": "string", "tags": ["string"], "meta": {"role": "string"}}',
)
# score → ~0.95
quick_structured_check
Fast basic structure check. Use when you need a quick pass/fail.
result = evaluate(
    "quick_structured_check",
    output='{"name": "Alice", "age": 30}',
    expected_output='{"name": "string", "age": 0}',
)
# score → 1.0