Factual Accuracy

Evaluation Using Interface

Input:

Required Inputs:
- output: The output column generated by the model.
Optional Inputs:
- context: The context column provided to the model.
- input: The input column provided to the model.
Configuration Parameters:
- Check Internet: Boolean - Whether to verify information using external sources.

Output:

Score: Percentage score between 0 and 100

Interpretation:

Higher scores: Indicate that the output is factually accurate based on the provided context/input or general knowledge (if Check Internet is enabled).
Lower scores: Suggest the presence of factual inaccuracies in the output.

Evaluation Using SDK

To learn how to set up evaluation using the SDK, see the SDK setup instructions.

Input Type	Parameter	Type	Description
Required Inputs	`output`	`string`	The output generated by the model.
Optional Inputs	`context`	`string`	The context provided to the model.
Optional Inputs	`input`	`string`	The input provided to the model.

Output	Type	Description
`Score`	`float`	Returns a score between 0 and 1, where higher values indicate better factual accuracy.

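from fi.testcases import TestCase
from fi.evals.templates import FactualAccuracy

# Build the test case: `output` is required; `context` and `input` are optional.
test_case = TestCase(
    output="example output",
    context="example context",
    input="example input",
)

# `check_internet` controls whether information is verified using external sources.
template = FactualAccuracy(config={
    "check_internet": False
})

# NOTE: `evaluator` is not defined in this snippet. It is assumed to be an
# evaluator client created beforehand, as covered by the SDK setup instructions.
response = evaluator.evaluate(
    eval_templates=[template],
    inputs=[test_case],
    model_name="turing_flash",
)

# The score is a float between 0 and 1; higher means better factual accuracy.
print(f"Score: {response.eval_results[0].metrics[0].value}")
print(f"Reason: {response.eval_results[0].reason}")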
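
The interface reports the score as a percentage between 0 and 100, while the SDK returns a float between 0 and 1. The sketch below converts an SDK score to the interface scale. It assumes the two scales differ only by a factor of 100, which this page does not state explicitly, and `to_percentage` is a hypothetical helper, not part of the SDK.

```python
def to_percentage(sdk_score: float) -> float:
    """Convert an SDK-style score in [0, 1] to a percentage in [0, 100].

    Assumption: the interface percentage is the SDK score multiplied by 100.
    """
    if not 0.0 <= sdk_score <= 1.0:
        raise ValueError(f"expected a score between 0 and 1, got {sdk_score}")
    # Round to two decimals to hide floating-point noise.
    return round(sdk_score * 100.0, 2)


print(to_percentage(0.85))
```

Rejecting out-of-range values early makes it harder to mix up the two scales, for example by passing a percentage where a 0 to 1 score is expected.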

What to Do When Factual Accuracy Evaluation Gives a Low Score

When factual accuracy evaluation gives a low score, first reassess the evaluation criteria to confirm that they are clearly defined and aligned with the evaluation’s goals, and adjust them if they are incomplete or not relevant. Then examine the output itself for factual inaccuracies: identify each statement that conflicts with the provided context or input, and revise the content to correct it.
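
Reviewing low-scoring outputs can be sketched as a small triage routine. This is an illustration only: the `EvalRecord` structure, the `low_scoring` helper, the sample records, and the 0.5 threshold are all assumptions made for this example and are not part of the SDK. The routine collects results below the threshold and orders them worst first, so the least accurate outputs are examined first.

```python
from dataclasses import dataclass


@dataclass
class EvalRecord:
    """Hypothetical container for one evaluation result."""
    output: str
    score: float  # SDK-style score between 0 and 1
    reason: str


def low_scoring(records: list[EvalRecord], threshold: float = 0.5) -> list[EvalRecord]:
    """Return the records scoring below `threshold`, lowest score first."""
    flagged = [r for r in records if r.score < threshold]
    return sorted(flagged, key=lambda r: r.score)


# Made-up sample data, for illustration only.
records = [
    EvalRecord("Paris is the capital of France.", 0.95, "Matches the context."),
    EvalRecord("The Eiffel Tower is in Berlin.", 0.10, "Contradicts the context."),
    EvalRecord("The Seine flows through Lyon.", 0.30, "Not supported by the context."),
]

for record in low_scoring(records):
    print(f"{record.score:.2f}  {record.output}  ({record.reason})")
```

The threshold is a judgment call. Pick a value that matches how strict the evaluation needs to be for your use case.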

Differentiating Factual Accuracy from Groundedness

Factual accuracy verifies that the information in the output is correct, judged against the given input and context. Groundedness instead checks that the response strictly adheres to the provided context and does not introduce unsupported or external information. Their inputs also differ: factual accuracy requires the output and can optionally use the input and context, whereas groundedness requires only a response and its context.
