Using FutureAGI Evals

You can check out the Colab notebook to get started quickly with FutureAGI Evals.

Installing FutureAGI SDK

pip install futureagi

Initializing FutureAGI Evals

from fi.evals import Evaluator

# Passing the keys explicitly is optional; omit both arguments
# if the API key and secret key are set as environment variables.
evaluator = Evaluator(
    fi_api_key="<your_api_key>",
    fi_secret_key="<your_api_secret>",
)

See the FutureAGI documentation to learn how to access your API keys. Setting the API key and secret key as environment variables, rather than hard-coding them, is recommended.
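One way to follow the environment-variable recommendation is sketched below. The helper `load_credentials` is hypothetical, and the variable names `FI_API_KEY` and `FI_SECRET_KEY` are assumptions that should be confirmed against the FutureAGI documentation. Only the keyword names `fi_api_key` and `fi_secret_key` come from the snippet above.

```python
import os

# Assumed environment variable names; confirm them in the FutureAGI docs.
API_KEY_VAR = "FI_API_KEY"
SECRET_KEY_VAR = "FI_SECRET_KEY"


def load_credentials(env=os.environ):
    """Return Evaluator keyword arguments read from the environment.

    Raises RuntimeError early if either variable is missing or empty,
    instead of failing later inside an API call.
    """
    missing = [name for name in (API_KEY_VAR, SECRET_KEY_VAR) if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {
        "fi_api_key": env[API_KEY_VAR],
        "fi_secret_key": env[SECRET_KEY_VAR],
    }
```

With the variables exported in the shell, the evaluator could then be created as `Evaluator(**load_credentials())`, which keeps the keys out of the source file.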

Define the TestCase

from fi.testcases import TestCase

test_case = TestCase(
    input="Can you help me with my homework?",
    output="Sure, I can help you with that.",
    context="You are a helpful assistant that can help with homework.",
)

Define the Evaluation and run it

from fi.evals import ContextAdherence

eval_template = ContextAdherence()

result = evaluator.evaluate(
    inputs=[test_case],
    model_name="turing_flash",
    eval_templates=[eval_template],
)

print(result.eval_results[0].metrics[0].value)
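The print statement above reads only the first metric of the first result. When several test cases or templates are passed, a small helper can gather every value. This is a minimal sketch: `collect_metric_values` is a hypothetical helper, and it assumes only the `eval_results`, `metrics`, and `value` attributes shown above. The `SimpleNamespace` object is a stand-in for illustration, not a real SDK result.

```python
from types import SimpleNamespace


def collect_metric_values(result):
    """Flatten result.eval_results[*].metrics[*].value into a list."""
    values = []
    for eval_result in result.eval_results:
        for metric in eval_result.metrics:
            values.append(metric.value)
    return values


# Stand-in object with the same attribute shape as the result above.
fake_result = SimpleNamespace(
    eval_results=[
        SimpleNamespace(metrics=[SimpleNamespace(value=1.0)]),
        SimpleNamespace(metrics=[SimpleNamespace(value=0.5)]),
    ]
)
print(collect_metric_values(fake_result))
```

In a real run, the `result` returned by `evaluator.evaluate(...)` would be passed in place of `fake_result`.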

Example Script using Anthropic Client and FutureAGI Evals

from anthropic import Anthropic
from fi.testcases import TestCase
from fi.evals import ContextAdherence, Evaluator

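# Anthropic() reads ANTHROPIC_API_KEY from the environment.
anthropic = Anthropic()

# Evaluator() with no arguments relies on the FutureAGI keys
# being set as environment variables.
evaluator = Evaluator()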

response = anthropic.messages.create(
    max_tokens=1000,
    model="claude-3-5-sonnet-20240620",
    messages=[
        {"role": "user", "content": "Can you help me with my homework?"}
    ]
)

test_case = TestCase(
    input="Can you help me with my homework?",
    output=response.content[0].text,
    context="You are a helpful assistant that can help with homework.",
)

eval_template = ContextAdherence()

result = evaluator.evaluate(
    inputs=[test_case],
    model_name="turing_flash",
    eval_templates=[eval_template],
)

print(result.eval_results[0].metrics[0].value)
