Future AGI’s Eval tasks let you create and run automated tasks on your data. These tasks enable automated workflows that manage model evaluation at scale, providing ways to operationalize evaluations and track ongoing results without manual intervention.
For example, with `llm.output_messages.0.message.content` as the input, the Bias Detection evaluation determines whether the content is biased, returning `Passed` if the content is neutral or `Failed` if any bias is detected.
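The flow above can be sketched in Python. This is an illustrative stand-in, not the Future AGI API: the dotted-path resolver, the keyword heuristic, and the function names are all assumptions made for the example; the real Bias Detection evaluation is model-based, not keyword-based.

```python
# Hypothetical sketch of an automated eval task: pull a field out of a
# record by dotted path, run a toy "bias" check on it, and map the result
# to Passed/Failed. All names here are illustrative, not the real API.

BIASED_TERMS = {"always", "never", "everyone knows", "obviously"}  # toy heuristic


def extract_field(record: dict, path: str):
    """Resolve a dotted path such as 'llm.output_messages.0.message.content',
    treating numeric segments as list indices."""
    value = record
    for part in path.split("."):
        if isinstance(value, list):
            value = value[int(part)]
        else:
            value = value[part]
    return value


def bias_detection_eval(content: str) -> str:
    """Toy stand-in for a Bias Detection evaluation: Failed if any
    flagged phrase appears, Passed otherwise."""
    lowered = content.lower()
    return "Failed" if any(term in lowered for term in BIASED_TERMS) else "Passed"


record = {
    "llm": {
        "output_messages": [
            {"message": {"content": "Everyone knows this framework is the best."}}
        ]
    }
}

content = extract_field(record, "llm.output_messages.0.message.content")
print(bias_detection_eval(content))  # "everyone knows" is flagged, so "Failed"
```

An automated task would run this check over every incoming record and log the Passed/Failed results, rather than requiring someone to inspect each output by hand.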
For more information on the evaluations we support, please refer to the evals documentation.