Evaluation Using Interface

Input:

  • Required Inputs:
    • response: The column containing the API’s response content (e.g., JSON body, status code).
  • Optional Inputs:
    • None specified for this evaluation.
  • Configuration Parameters:
    • (Optional) expected_status_code: Integer - The expected HTTP status code for a successful call (e.g., 200).
    • (Optional) validate_json_body: Boolean - Whether to check if the response body is valid JSON.

Output:

  • Result: Passed / Failed

Interpretation:

  • Passed: Indicates that the API call response met the validation criteria (e.g., matched the expected_status_code, contained valid JSON if validate_json_body was true).
  • Failed: Suggests an issue with the API response based on the configured criteria (e.g., unexpected status code, malformed JSON body).
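
The pass/fail rule described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the evaluator's actual implementation: the function name `check_api_response` and its `body` and `status_code` arguments are hypothetical, and only `expected_status_code` and `validate_json_body` come from the configuration parameters listed above.

```python
import json
from typing import Optional


def check_api_response(
    body: str,
    status_code: Optional[int] = None,
    expected_status_code: Optional[int] = None,
    validate_json_body: bool = False,
) -> bool:
    """Return True (Passed) when every configured check succeeds.

    Hypothetical helper for illustration only; not part of any SDK.
    """
    # The status check runs only when an expected code is configured.
    if expected_status_code is not None and status_code != expected_status_code:
        return False

    # The JSON check runs only when it is explicitly enabled.
    if validate_json_body:
        try:
            json.loads(body)
        except (TypeError, ValueError):
            # json.JSONDecodeError is a subclass of ValueError.
            return False

    return True
```

With this sketch, a well-formed body with a matching status code passes, while a truncated body such as `'{"temperature": 75'` fails as soon as `validate_json_body` is enabled. When neither parameter is configured, nothing is checked and the result is Passed.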

Evaluation Using Python SDK

Click here to learn how to set up evaluation using the Python SDK.


Input Type               | Parameter            | Type   | Description
Required Inputs          | response             | string | The API response content (e.g., JSON body as a string, or status code).
Configuration Parameters | expected_status_code | int    | (Optional) The expected HTTP status code for success.
                         | validate_json_body   | bool   | (Optional) If true, checks if the response string is valid JSON. Default: False.

Output | Type | Description
Result | bool | Returns 1.0 if the validation passes, 0.0 otherwise (Fail).

from fi.evals import EvalClient
from fi.evals.templates import ApiCall
from fi.testcases import TestCase

test_case = TestCase(
    response='{"temperature": 75, "conditions": "sunny"}'
)

template = ApiCall(
    config={
        "url": "https://api.weather.com/v1/current?apiKey=YOUR_WEATHER_API_KEY",  # Add API key in URL
        "headers": {
            "apiKey": "YOUR_WEATHER_API_KEY",
            "Content-Type": "application/json"
        },
        "payload": {
            "city": "London",
            "units": "fahrenheit"
        }
    }
)

evaluator = EvalClient(
    fi_api_key="your_api_key",
    fi_secret_key="your_secret_key",
    fi_base_url="https://api.futureagi.com"
)

response = evaluator.evaluate(eval_templates=[template], inputs=[test_case])


What to do when API Call Evaluation Fails

Check that the API endpoint, headers, and payload are configured correctly. Then review the response for error messages or unexpected status codes, which usually point to the cause of the failure.
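
As a starting point for that review, the sketch below turns a status code and response body into a short diagnostic hint. It is a hypothetical triage helper (the name `diagnose_response` and the hint wording are assumptions, not part of the SDK) and relies only on generic HTTP status classes and Python's standard library.

```python
import json
from http import HTTPStatus


def diagnose_response(status_code: int, body: str) -> str:
    """Return a short hint about why a response may have failed validation.

    Hypothetical helper for illustration; categories follow generic HTTP semantics.
    """
    if 400 <= status_code < 500:
        hint = "client error: check the URL, headers, API key, and payload"
    elif 500 <= status_code < 600:
        hint = "server error: the external service failed; retry or check its status"
    elif 200 <= status_code < 300:
        hint = "status looks successful"
    else:
        hint = "unexpected status class"

    # Look up the standard reason phrase; unknown codes raise ValueError.
    try:
        reason = HTTPStatus(status_code).phrase
    except ValueError:
        reason = "Unknown"

    # Report whether the body parses as JSON.
    try:
        json.loads(body)
        body_note = "body is valid JSON"
    except (TypeError, ValueError):
        body_note = "body is not valid JSON"

    return f"{status_code} {reason}: {hint}; {body_note}"
```

For example, calling `diagnose_response(401, '{"error": "invalid key"}')` reports a client error with a valid JSON body, which suggests checking the API key before anything else.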


Differentiating API Call Eval from Function Calling Eval

The API Call evaluation focuses on making network requests to external services and validating the responses, while Evaluate LLM Function Calling examines whether LLMs correctly identify and execute function calls.

API calls are used for external interactions like retrieving data or triggering actions, while function call evaluation ensures that LLMs correctly interpret and execute function calls based on input prompts.

They also differ in validation criteria: API calls are assessed on response content, status codes, and data integrity, whereas function call evaluation focuses on the accuracy of function call identification and parameter extraction.
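
The difference in validation criteria can be made concrete with two small checks placed side by side. Both functions are hypothetical sketches written for this comparison (the names, the `name`/`arguments` dictionary shape, and the default status code of 200 are assumptions); neither reflects how the actual evaluations are implemented.

```python
import json


def api_response_ok(status_code: int, body: str, expected_status_code: int = 200) -> bool:
    """API-call style check: judge the response itself (status code and body)."""
    if status_code != expected_status_code:
        return False
    try:
        json.loads(body)
    except (TypeError, ValueError):
        return False
    return True


def function_call_ok(expected: dict, actual: dict) -> bool:
    """Function-calling style check: right function chosen, right parameters extracted."""
    same_function = expected.get("name") == actual.get("name")
    same_arguments = expected.get("arguments") == actual.get("arguments")
    return same_function and same_arguments
```

The first check never looks at what the model intended, only at what the external service returned. The second never touches the network and only compares the call the model produced against the call that was expected.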