Definition

The API Call evaluation assesses the validity and correctness of responses from external APIs, ensuring that they align with expected criteria. This evaluation is essential for systems that depend on API integrations, as it helps verify response structure, data accuracy, and reliability.

A Passed result indicates that the API response meets all predefined requirements, while a Failed result signifies that the response is incorrect, incomplete, or does not conform to expected standards.


Calculation

The evaluation runs in four steps. First, configuration setup defines the API URL, payload, and headers. Next, the system makes the API call and handles any network errors. The response evaluation then verifies that the status code indicates a successful request, checks the response body for the expected data fields or values, and confirms that the format is correct (e.g., JSON or XML). Finally, the validation outcome is determined: if the response meets all criteria, the evaluation passes; otherwise, it fails.
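
The steps above can be illustrated with a minimal Python sketch. It is not the evaluator's actual implementation: the function names `call_api` and `evaluate_response`, the `expected_fields` mapping, and the JSON-only format check are all assumptions made for illustration, and only the standard library is used.

```python
import json
import urllib.error
import urllib.request


def call_api(url, payload=None, headers=None, timeout=10.0):
    """Make the API call. Returns (status_code, body_text), or (None, error_text)
    when no response was received. Hypothetical helper, for illustration only."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    request = urllib.request.Request(url, data=data, headers=headers or {})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status, response.read().decode("utf-8")
    except urllib.error.HTTPError as exc:
        # The server answered, but with a 4xx/5xx status code.
        return exc.code, exc.read().decode("utf-8", errors="replace")
    except OSError as exc:
        # Network-level failure: DNS error, refused connection, timeout.
        return None, str(exc)


def evaluate_response(status, body, expected_fields):
    """Return ("Passed" or "Failed", reason).

    expected_fields maps a field name to its expected value; a value of None
    means only the presence of the field is checked (an assumption of this sketch).
    """
    if status is None:
        return "Failed", f"network error: {body}"
    if not 200 <= status < 300:
        return "Failed", f"unexpected status code {status}"
    try:
        parsed = json.loads(body)
    except ValueError:
        return "Failed", "response body is not valid JSON"
    if not isinstance(parsed, dict):
        return "Failed", "response body is not a JSON object"
    for field, expected in expected_fields.items():
        if field not in parsed:
            return "Failed", f"missing field {field!r}"
        if expected is not None and parsed[field] != expected:
            return "Failed", f"unexpected value for field {field!r}"
    return "Passed", "all checks succeeded"
```

In use, the output of `call_api` would be passed straight into `evaluate_response`, for example `evaluate_response(*call_api(url, payload, headers), expected_fields={"id": None})`. Keeping the network call separate from the checks lets the validation logic be tested without contacting a real service.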


What to do when API Call Evaluation Fails

Check that the API endpoint and parameters are correctly configured. Then review the response for error messages or status codes, which can help identify the cause of the failure.
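
As a rough aid for this kind of troubleshooting, the sketch below maps a status code and response body to human-readable hints. The function name `diagnose_failure` is hypothetical, and the body keys it inspects (`error`, `message`, `detail`) are common conventions rather than anything guaranteed by a particular API.

```python
import json


def diagnose_failure(status, body):
    """Return a list of hints about why an API call may have failed."""
    hints = []
    if status is None:
        hints.append("no response received: check the URL, DNS, and network access")
    elif status in (401, 403):
        hints.append("authentication or authorization problem: check the headers")
    elif status == 404:
        hints.append("endpoint not found: check the API URL")
    elif 400 <= status < 500:
        hints.append("client error: check the payload and parameters")
    elif status >= 500:
        hints.append("server error: the external service may be failing")

    # Surface an error message if the body is JSON and carries one.
    try:
        parsed = json.loads(body)
    except (TypeError, ValueError):
        parsed = None
    if isinstance(parsed, dict):
        for key in ("error", "message", "detail"):  # assumed key names
            if key in parsed:
                hints.append(f"{key}: {parsed[key]}")
    return hints
```

An empty list means the status code and body gave no obvious clue, in which case the expected fields or values in the evaluation configuration are the next thing to review.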


Differentiating API Call Eval from Function Calling Eval

The API Call evaluation focuses on making network requests to external services and validating the responses, while Evaluate LLM Function Calling examines whether LLMs correctly identify and execute function calls.

API calls are used for external interactions like retrieving data or triggering actions, while function call evaluation ensures that LLMs correctly interpret and execute function calls based on input prompts.

They also differ in validation criteria: API calls are assessed on response content, status codes, and data integrity, whereas function call evaluation focuses on the accuracy of function call identification and parameter extraction.
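
To make the contrast concrete, a function call check might look like the sketch below, which inspects no status code or response body at all. The function name `evaluate_function_call` and the dictionary shape (`name` and `arguments` keys) are assumptions for illustration, not the format used by Evaluate LLM Function Calling.

```python
def evaluate_function_call(predicted, expected):
    """Compare the call an LLM produced against the expected call.

    Both arguments are dicts with a "name" and an "arguments" mapping
    (a hypothetical shape chosen for this sketch).
    """
    # Function call identification: did the model pick the right function?
    if predicted.get("name") != expected["name"]:
        return "Failed", "wrong function identified"
    # Parameter extraction: did the model extract the right arguments?
    if predicted.get("arguments") != expected["arguments"]:
        return "Failed", "arguments do not match"
    return "Passed", "function and arguments match"
```

Here the criteria concern what the model chose to call and with which parameters, while an API Call evaluation judges what the external service sent back.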