Text to SQL

Evaluation Using Interface

Input:

Required Inputs:
- input: The natural language query or instruction.
- output: The generated SQL query to evaluate.

Output:

Result: Returns ‘Passed’ if the SQL query correctly represents the natural language request, ‘Failed’ if it doesn’t.
Reason: A detailed explanation of why the SQL query was classified as correct or incorrect.

Evaluation Using Python SDK

Click here to learn how to set up evaluation using the Python SDK.

Input:

Required Inputs:
- input: string - The natural language query or instruction.
- output: string - The generated SQL query to evaluate.

Output:

Result: Returns a list containing ‘Passed’ if the SQL query correctly represents the natural language request, or ‘Failed’ if it doesn’t.
Reason: Provides a detailed explanation of the evaluation.

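# Assumes `evaluator` is an evaluator client that has already been
# initialized as described in the SDK setup guide.
result = evaluator.evaluate(
    eval_templates="text_to_sql",
    inputs={
        # The natural language request.
        "input": "List the names of all employees who work in the sales department.",
        # The generated SQL query to evaluate against that request.
        "output": "SELECT name FROM employees WHERE department = 'sales';"
    },
    model_name="turing_flash"
)

# The result for the first (and only) evaluation: a list such as ['Passed'].
print(result.eval_results[0].output)
# The explanation for that result.
print(result.eval_results[0].reason)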

Example Output:

['Passed']
The evaluation is 'Passed' because the SQL query correctly and efficiently implements the natural language request.

*   The query uses the **appropriate SELECT statement** to retrieve only the requested 'name' field from the employees table.
*   The **WHERE clause** correctly filters for employees in the 'sales' department using case-sensitive string comparison.
*   The query is **syntactically correct** with proper semicolon termination and correct SQL syntax.
*   The solution is **optimally efficient**, retrieving only the necessary data without unnecessary joins or sub-queries.

A different evaluation was not possible because the SQL query fully satisfies all requirements of the natural language prompt.
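
Because the SDK reports the result as a list (for example `['Passed']`) while the interface shows a bare 'Passed' or 'Failed', it can help to normalize the value before branching on it. The following is a minimal sketch; `is_passed` is a hypothetical helper name, not part of the SDK, and it assumes the output is either a string or a list of strings.

```python
from typing import Sequence, Union


def is_passed(output: Union[str, Sequence[str]]) -> bool:
    """Return True when an eval output (string or list of strings) is 'Passed'."""
    # Wrap a bare string so both shapes are handled the same way.
    values = [output] if isinstance(output, str) else list(output)
    # An empty list is treated as not passed.
    return len(values) > 0 and all(v.strip().lower() == "passed" for v in values)


print(is_passed(["Passed"]))  # True
print(is_passed("Failed"))    # False
```

With the SDK example above, this would be called as `is_passed(result.eval_results[0].output)`.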

What to Do If You Get Undesired Results

If the SQL query is evaluated as incorrect (Failed) and you want to improve it:

- Ensure the SQL syntax is correct and follows standard conventions.
- Verify that all tables and columns referenced match the database schema implied by the natural language query.
- Check that the query filters for exactly the data requested (no more, no less).
- Make sure appropriate joins are used when multiple tables are involved.
- Confirm that the query handles potential edge cases, such as NULL values, appropriately.
- Use the correct data types for values in comparisons (e.g., quotation marks for strings).
- For complex queries, consider breaking them down into simpler parts for troubleshooting.
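
Several of the checks above can be approximated locally before re-running the evaluation. The sketch below uses Python's built-in `sqlite3` module with an in-memory database. The `employees` schema, the fixture rows, and the helper names `check_query` and `run_on_fixture` are all assumptions made for illustration. SQLite's dialect may also differ from your target database, so treat this as a sanity check rather than a reproduction of how the evaluator judges a query.

```python
import sqlite3
from typing import Optional

# Hypothetical schema for illustration; replace with your real table definitions.
SCHEMA = """
CREATE TABLE employees (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    department TEXT
);
"""

# Tiny fixture, including a NULL department to exercise that edge case.
ROWS = [
    (1, "Ada", "sales"),
    (2, "Grace", "engineering"),
    (3, "Linus", None),
]


def check_query(sql: str) -> Optional[str]:
    """Return None if SQLite can compile the query against SCHEMA, else the error text."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(SCHEMA)
        # EXPLAIN compiles the statement without returning its data, so syntax
        # errors and unknown tables or columns surface here.
        conn.execute("EXPLAIN " + sql)
        return None
    except sqlite3.Error as exc:
        return str(exc)
    finally:
        conn.close()


def run_on_fixture(sql: str) -> list:
    """Run the query against the fixture rows and return all result rows."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(SCHEMA)
        conn.executemany("INSERT INTO employees VALUES (?, ?, ?)", ROWS)
        return conn.execute(sql).fetchall()
    finally:
        conn.close()


query = "SELECT name FROM employees WHERE department = 'sales';"
print(check_query(query))                           # None
print(check_query("SELECT nme FROM employees;"))    # error text for the unknown column
print(run_on_fixture(query))                        # [('Ada',)]
```

Running a query on a small fixture also makes the NULL behavior visible: `WHERE department != 'sales'` returns only Grace here, because a row with a NULL department satisfies neither `=` nor `!=`.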

Comparing Text to SQL with Similar Evals

Task Completion: While Text to SQL focuses specifically on converting natural language to SQL queries, Task Completion evaluates whether a response completes the requested task more generally.
Evaluate Function Calling: Text to SQL evaluates SQL generation specifically, whereas Evaluate Function Calling assesses the correctness of function calls and parameters more broadly.
Is Code: Text to SQL evaluates the correctness of SQL generation, while Is Code detects whether content contains code of any type.
