fuzzy_match

This evaluation template compares two texts for similarity using fuzzy matching techniques. It is useful for detecting approximate matches between text strings when exact matching would be too strict, tolerating minor differences in wording, spelling, or formatting.

Interface Usage

result = evaluator.evaluate(
    eval_templates="fuzzy_match", 
    inputs={
        "input": "The Eiffel Tower is a famous landmark in Paris, built in 1889 for the World's Fair. It stands 324 meters tall.",
        "output": "The Eiffel Tower, located in Paris, was built in 1889 and is 324 meters high."
    },
    model_name="turing_flash"
)

print(result.eval_results[0].metrics[0].value)
print(result.eval_results[0].reason)

Python SDK Usage

from futureagi import Evaluator

# Initialize the evaluator
evaluator = Evaluator(api_key="your_api_key")

# Evaluate the fuzzy match between two text strings
result = evaluator.evaluate(
    eval_templates="fuzzy_match", 
    inputs={
        "input": "The Eiffel Tower is a famous landmark in Paris, built in 1889 for the World's Fair. It stands 324 meters tall.",
        "output": "The Eiffel Tower, located in Paris, was built in 1889 and is 324 meters high."
    },
    model_name="turing_flash"
)

# Access the result
match_score = result.eval_results[0].metrics[0].value
reason = result.eval_results[0].reason

print(f"Fuzzy match score: {match_score}")
print(f"Reason: {reason}")

Example Output

True
The two texts convey essentially the same information about the Eiffel Tower. Both mention that it's located in Paris, was built in 1889, and is 324 meters tall/high. The wording is slightly different, but the key facts are identical, making this a strong fuzzy match despite not being word-for-word identical.
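The template returns a boolean, so it lends itself to thresholded gating in a pipeline. As a rough local stand-in for illustration only (the fuzzy_match template uses a model-based comparison, not character similarity), Python's standard-library difflib can approximate the same True/False decision; the threshold below is an assumption, not a value from the template:

```python
from difflib import SequenceMatcher

def rough_fuzzy_match(a: str, b: str, threshold: float = 0.6) -> bool:
    """Approximate a fuzzy match with character-level similarity.

    Illustration only: the fuzzy_match template compares meaning via a
    model, whereas SequenceMatcher.ratio() measures surface overlap.
    """
    ratio = SequenceMatcher(None, a.lower(), b.lower()).ratio()
    return ratio >= threshold

input_text = ("The Eiffel Tower is a famous landmark in Paris, built in 1889 "
              "for the World's Fair. It stands 324 meters tall.")
output_text = "The Eiffel Tower, located in Paris, was built in 1889 and is 324 meters high."
print(rough_fuzzy_match(input_text, output_text))
```

Because this proxy ignores semantics, two paraphrases with little character overlap can score low even though the fuzzy_match template would consider them a match.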

Troubleshooting

If you encounter issues with this evaluation:

  • Ensure that both input texts are properly formatted and contain meaningful content
  • This evaluation works best with texts that convey similar information but might have different wording
  • For very short texts (1-2 words), results may be less reliable
  • If you need more precise matching, consider using levenshtein_similarity instead

Related Evaluations

  • levenshtein_similarity: Provides a stricter character-by-character comparison
  • embedding_similarity: Compares semantic meaning rather than surface-level text
  • semantic_list_contains: Checks if specific semantic concepts are present in both texts
  • rouge_score: Evaluates based on n-gram overlap, especially useful for summarization tasks