Use the Evaluation tab to test and validate AI behavior within a domain. Here you create sets of prompts that simulate user interactions and provide the expected SQL output alongside each prompt, so the AI's accuracy in translating natural language into correct database queries can be measured directly. After running evaluations, review the results to identify areas for improvement.
UI showing the Evaluation tab

Evaluation Sets

Within this tab, create and manage collections of prompts used to evaluate AI behavior. Add a new evaluation set by providing a name and a JSON definition that contains the prompts and, optionally, the expected SQL for each prompt. Existing evaluation sets are listed in this view, allowing you to review and manage them over time.
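The exact schema of the JSON definition depends on your installation; the field names below (prompts, prompt, expected_sql) and the table and column names in the sample query are illustrative assumptions, not a documented contract. A minimal sketch of an evaluation set might look like this:

```json
{
  "prompts": [
    {
      "prompt": "How many orders were placed in 2023?",
      "expected_sql": "SELECT COUNT(*) FROM orders WHERE order_date >= '2023-01-01' AND order_date < '2024-01-01';"
    },
    {
      "prompt": "List the five customers with the highest total spend."
    }
  ]
}
```

The second prompt omits expected_sql to reflect that expected SQL is optional for a prompt.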

Evaluation Runs

This view lists completed evaluation runs and their results, allowing you to review past executions, compare outcomes, and track how AI performance changes over time.
To learn how to create an Evaluation Set, understand indicators, and review the results, consult the related article Use Evaluations Sets and Runs.