Week of 2025-08-19

What’s New

Features
  • Comparison Summary: Compare evaluation and prompt summaries across two datasets, now with detailed graphs and scores.
  • Function Evals: Add and edit function-type custom evals, chosen from the list of evals supported by Future AGI.
  • Edit Synthetic Dataset: Edit existing synthetic datasets directly or create a new version from changes.
  • Document Column Support in Dataset: New document column type to upload/store files in cells (TXT, DOC, DOCX, PDF).
  • User Tab in Dashboard and Observe: Searchable, filterable user list and detailed user view with metrics, interactive charts, synced time filters, and traces/sessions tabs.
  • Timestamp Columns in Traces/Spans: Added Start Time and End Time columns in Observe → LLM Tracing and Prototype → All Runs → Run Details.
  • Configure Labels: Configure system and custom labels per prompt version in Prompt Management.
  • Async Evals via SDK: Run evaluations asynchronously for long-running evaluations or large datasets.
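
As a rough illustration of the asynchronous evaluation pattern mentioned in the last item, the sketch below runs several evaluations concurrently with Python's asyncio. It does not use the Future AGI SDK: `run_eval` is a hypothetical stand-in for a long-running evaluation call, and `run_evals_async`, `evaluate_rows`, and `max_concurrency` are names invented for this example.

```python
import asyncio


async def run_eval(row: dict) -> dict:
    """Hypothetical stand-in for a long-running evaluation call (not the real SDK)."""
    await asyncio.sleep(0.01)  # simulate network or model latency
    passed = len(row["output"]) > 0  # toy criterion: output must be non-empty
    return {"id": row["id"], "passed": passed}


async def run_evals_async(rows: list[dict], max_concurrency: int = 4) -> list[dict]:
    """Evaluate all rows concurrently, with at most max_concurrency in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(row: dict) -> dict:
        async with semaphore:
            return await run_eval(row)

    # asyncio.gather returns results in the same order as the inputs
    return await asyncio.gather(*(guarded(row) for row in rows))


def evaluate_rows(rows: list[dict]) -> list[dict]:
    """Synchronous entry point for scripts that are not already async."""
    return asyncio.run(run_evals_async(rows))
```

Bounding concurrency with a semaphore keeps a large dataset from issuing every request at once, which is usually what matters when the evaluation backend is rate limited.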
Bugs/Improvements
  • SDK Code Snippets: Updated the SDK code snippets for columns and rows on the create dataset, add rows, and dataset landing pages.
  • Fixed an editing issue in the custom evals form: An incorrect config was displayed on the evals page for function evals.
  • Fixed the bottom section of the trace detail drawer disappearing: Dragging the bottom section no longer causes the entire bottom area to disappear.
  • UI screen optimization for different screen sizes.
  • Bug fixes for the updated summary screen: color, text, and font alignment.
  • Cell loading state issues while creating synthetic data.
  • UI enhancement for simulation agent flow.
  • CSV upload bug in datasets and UI fixes for add feedback pop-up.
Week of 2025-08-11

What’s New

Features
  • Summary Screen Revamp (Evaluation and Prompt): Unified visual overview of model performance with pass rates and comparative spider/bar/pie charts; includes compare views, drill-downs, and consistent filters.
  • Alerts Revamp: Create alert rules in Observe (+New Alert) from Alerts tab or project; notifications via Slack/Email with guided Alert Type and Configuration steps.
  • Upgrades in Prompt SDK: Prompt caching increases prompt availability after the first run. Deploy prompts to production, staging, or dev and run A/B tests using the prompt SDK.
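
To make the caching idea in the last item concrete, here is a minimal sketch of how a client-side prompt cache could keep prompts available after the first fetch. It is an assumption-laden illustration, not the prompt SDK's actual mechanism: `PromptCache`, its `fetch` callback, the `label` argument, and the TTL value are all hypothetical.

```python
import time
from typing import Callable


class PromptCache:
    """Tiny in-memory TTL cache keyed by (prompt name, label). Illustrative only."""

    def __init__(self, fetch: Callable[[str, str], str], ttl_seconds: float = 300.0):
        self._fetch = fetch  # hypothetical remote lookup supplied by the caller
        self._ttl = ttl_seconds
        self._entries: dict[tuple[str, str], tuple[float, str]] = {}

    def get(self, name: str, label: str = "production") -> str:
        key = (name, label)
        now = time.monotonic()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]  # fresh cache hit: no remote call
        try:
            template = self._fetch(name, label)
        except Exception:
            if entry is not None:
                return entry[1]  # remote lookup failed: serve the stale copy
            raise
        self._entries[key] = (now, template)
        return template
```

In this sketch the second and later calls for the same name and label are served from memory, and a failed refresh falls back to the last known template, which is one plausible reading of "availability after the first run".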
Bugs/Improvements
  • Fixed run prompt issues for longer prompts (>5K words).
  • Bug fixes for voice simulation: naming convention in transcripts, deleting runs, and selection of agent simulator.
Week of 2025-08-07

What’s New

Features
  • Voice Simulation: New testing infrastructure that deploys AI agents to conduct real conversations with your voice systems, analyzing actual audio, not just transcripts.
  • Edit Evals Config: Edit the config (prompt/criteria) of your custom evals in the evals playground; adding new variables is not supported.
Bugs/Improvements
  • Bug fix for dynamic column creation via Weaviate.
  • Reduced dependencies for TraceAI packages (HTTPS & gRPC).
  • Automated eval refinement: Retune your evals in evals playground by providing feedback.
  • Markdown now available as a default option for improved readability.
  • Support for video (traces and spans) in Observe project.
Week of 2025-07-29

What’s New

Features
  • Edit, Duplicate, and Delete Custom Evals: Duplicate, edit, or delete custom evals that are no longer in use or whose logic is outdated.
  • Bulk Annotation/User Feedback: Bulk annotate your Observe traces with user feedback directly using the API or SDK.
  • JSON View for Evals Log: Access evals log data in JSON format in evals playground.
Bugs/Improvements
  • Span name visibility in traces for Observe and Prototype.
  • Bug fix for adding owner to workspace.
  • Error handling for evaluations in prompt workbench.
  • Add variables to system and assistant user roles in prompt workbench.
  • Speed enhancement for dataset loading.
  • Error state handling for evaluations in prompt workbench.
Week of 2025-07-21

What’s New

Features
  • Run button on single cell in evaluations workbench.
  • Add notes to Observe traces.
Bugs/Improvements
  • Improved search logic to render relevant search results in dataset.
  • Dataset bugs and API network call optimizations.
  • Fixed audio icon.
  • Error handling for network connection issues.
  • Bug fixes for prompt workbench versioning issues.
  • Changed the color mapping for deterministic type evals.
  • Updated loaders for evals playground.
  • Pagination fix in Observe.
  • Added clear functionality in add to dataset column mapping fields in Observe.
  • Clear graph property when Observe changes; fixed thumbs down icon not rendering.
  • Generate variable bug fix in prompt workbench.
  • Fixed experiment page breaking on content tab switch.
  • Fixed the created_at 30-day filter on evals log section.
Week of 2025-07-14

What’s New

Bugs/Improvements
  • Prevented overscroll in X direction for entire platform.
  • Glitch after refreshing while generating sample data.
  • Error message update for doc uploads and save button status for doc upload.
  • Variable auto-population issue in compare prompt for multiple versions.
  • Restricted function tab to LLM spans only.
  • Error handling for mandatory system prompt for a few LLM models.
  • Added API null check in all places.
  • Streaming issues after run prompt when the current prompt version is updated.
  • Truncate model name in model details drawer.
  • Fixed "No rows" error on dataset homepage for some users on slow connections.
  • Easier removal of filters for Observe and Prototype.
  • Fixed validation in quick filter number-related fields.
  • Fixed inconsistent fonts in evaluation workbench.
  • Added loading state to evaluations tab.
  • Fixed knowledge base name not being visible in a few cases.
  • Fixed spacing issue in run prompt.
  • Link updated for the workbench help section and width update as list.
Week of 2025-05-05

What’s New

Features
  • Diff view in experiment.
  • Updated sections for Prototype and Observe.
  • Error localization in Observe.
  • [Observe+Prototype] Adding annotations flow for trace view details.
  • Updated dataset layout and table design.
  • Higher rate limits to send more traces in Observe.
  • Sorting in alert.
  • Support for audio in Observe and datasets.
Bugs/Improvements
  • Improved error handling in prompt versioning.
  • Removed unnecessary keys from evaluation outputs.
  • Better handling of required keys to column names in add_evaluation in dataset.
  • Removed TraceAI code from FutureAGI SDK - experiment rerun fix.
  • SSO login issues.
  • Eval ranking fixes.
  • Fixed sizing and view issue in dataset when row size is adjusted.
  • Fixed sidebar item not showing active style when child page is active globally.
  • Fixed red background in the edit field when editing integer-type values.
  • Fixed crashing of page when adding JSON value in dataset.
  • Fixed knowledge base status update issue in case of network issues.
  • Experiment tab bugs for some browsers and loading state issues on experiment page.
  • Bug in run insight section of Prototype.
Week of 2025-04-28

What’s New

Features
  • Prototype / All Runs columns dropdown change.
  • Prototype / Configure project.
  • Trace details view for Observe/Prototype.
  • Allow search in dataset.
  • Run insights view - evals (deployed without the error modal part).
  • Improved user flow for synthetic data creation with “best practices” for each input.
  • Add to dataset flow from Prototype.
  • API for Gmail account signup.
  • Enabling search within data.
  • First-time user experience walkthrough for newly onboarded users.
  • Quick filters for annotations view in Prototype and Observe.
  • Compare runs in Prototype.
  • Diff view for compare dataset.
  • Enhancement of Observe and Prototype.
  • New evals for audio: conversational and completeness evals.
Bugs/Improvements
  • New choice for Tone Eval if none of the choices are suitable.
  • Bug on experiment view.
  • UI/UX bugs - knowledge base and audio support for evals.
  • Fixed required input field column detail not appearing on Audio Quality evals.
  • UX changes for loader of plan screen.
  • Changed the color and the percentage of the eval chips in experiment.
Week of 2025-04-21

What’s New

Features
  • Quick filters in Prototype & Observe.
  • Added support for knowledge base creation and updating.
  • Optimization of synthetic data generation.
  • Evaluate working in compare datasets.
Bugs/Improvements
  • Better UI when the rate limit is hit.
  • Audio and knowledge base bug fixes.
  • Improved wrong evals view.
  • Fixes in compare dataset.
  • Changed the logo URL.
  • Filter issue fixed in Prototype.
  • Rate limit error message to upgrade the plan.
  • Experiment optimization under datasets to work faster.
  • Hugging Face error handling for different datasets.