Summarization accuracy evaluation assesses how well a model's summary captures the key information and meaning of the original document. It checks that summaries are both concise and faithful to the source material.
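To build intuition for what such an evaluation measures, here is a crude, illustrative token-recall check. This is not the metric the fi SDK implements, only a sketch of the underlying idea that summary content should be grounded in the source:

```python
def token_recall(document: str, summary: str) -> float:
    """Fraction of summary words that also appear in the source document.

    Illustrative only: a real summarization-accuracy evaluator uses an LLM
    judge, not simple word overlap.
    """
    doc_words = {w.strip(".,") for w in document.lower().split()}
    summary_words = summary.lower().split()
    if not summary_words:
        return 0.0
    hits = sum(1 for w in summary_words if w.strip(".,") in doc_words)
    return hits / len(summary_words)

score = token_recall(
    "Climate change is a significant global challenge.",
    "Climate change is a global challenge.",
)
# score == 1.0: every summary word is grounded in the source
```

A word missing from the source lowers the score, which is the same failure mode (unsupported content) the LLM-based evaluator is designed to catch.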

Configuration

The evaluation requires the following configuration:

Parameter    Description
---------    -----------
model        The model to be used for evaluation
from fi.evals import SummarizationAccuracy

summary_eval = SummarizationAccuracy(config={"model": "gpt-4o-mini"})

Test Case Setup

The evaluation requires both the original document and the generated summary:

from fi.testcases import LLMTestCase

test_case = LLMTestCase(
    document="Climate change is a significant global challenge. Rising temperatures, melting ice caps, and extreme weather events are affecting ecosystems worldwide. Scientists warn that immediate action is needed to reduce greenhouse gas emissions and prevent catastrophic environmental damage.",
    response="Climate change poses a global threat with effects like rising temperatures and extreme weather, requiring urgent action to reduce emissions."
)

Client Setup

Initialize the evaluation client with your API credentials:

from fi.evals import EvalClient

evaluator = EvalClient(
    fi_api_key="your_api_key", 
    fi_secret_key="your_secret_key"
)
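Hardcoded credentials are shown above for brevity; in practice you would typically read them from the environment. A minimal sketch, assuming environment variable names `FI_API_KEY` and `FI_SECRET_KEY` (these names are illustrative, not mandated by the SDK):

```python
import os

# Assumed variable names for illustration; use whatever your deployment defines.
fi_api_key = os.getenv("FI_API_KEY", "your_api_key")
fi_secret_key = os.getenv("FI_SECRET_KEY", "your_secret_key")

# These values would then be passed to
# EvalClient(fi_api_key=fi_api_key, fi_secret_key=fi_secret_key)
```

This keeps secrets out of source control while leaving the client initialization unchanged.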

Complete Example

from fi.evals import SummarizationAccuracy, EvalClient
from fi.testcases import LLMTestCase

# Initialize the summarization accuracy evaluator
summary_eval = SummarizationAccuracy(config={"model": "gpt-4o-mini"})

# Create a test case with the source document and the generated summary
test_case = LLMTestCase(
    document="Climate change is a significant global challenge. Rising temperatures, melting ice caps, and extreme weather events are affecting ecosystems worldwide. Scientists warn that immediate action is needed to reduce greenhouse gas emissions and prevent catastrophic environmental damage.",
    response="Climate change poses a global threat with effects like rising temperatures and extreme weather, requiring urgent action to reduce emissions."
)

# Run the evaluation
evaluator = EvalClient(fi_api_key="your_api_key", fi_secret_key="your_secret_key")
result = evaluator.evaluate(summary_eval, test_case)
print(result)  # Passes when the summary accurately captures the key information