Week of 2025-09-08
What’s New
Features
- AI-Powered Trace Intelligence: Unlock deeper insights into your AI applications with our comprehensive trace analytics agent. Get detailed performance scores across multiple evaluation metrics, pinpoint problematic spans in your execution flow, and receive AI-generated recommendations to optimize your AI agent’s performance automatically.
- Enterprise-Grade Multi-Workspace Security: Deploy with confidence using our complete RBAC framework. Create isolated workspaces, manage team members with full CRUD capabilities (edit, deactivate, resend invitations), and implement role-based access controls that scale with your organization’s security requirements.
- Intelligent Prompt Organization System: Transform your prompt management with our new folder-based architecture. Organize prompts and templates in a hierarchical structure, create reusable templates from existing prompts, and maintain consistency across your AI workflows. Templates function as fully-featured prompts while eliminating repetitive configuration tasks.
- Comprehensive Annotation Quality Dashboard: Monitor annotation quality at scale with our centralized analytics dashboard. Track key metrics including annotator agreement rates, completion times, and advanced quality scores (cosine similarity, Pearson correlation, Fleiss’ kappa; an illustrative sketch of these scores follows this list) to ensure your training data meets the highest standards.
- Enhanced Test Execution Visibility: We’ve expanded the simulate feature with additional scenario columns in the test execution list, complete with grouping capabilities. Users can now filter results more effectively and customize column visibility to focus on the most relevant data for their testing workflows.
- Performance of Simulate Voice Agents: View the performance of your voice agent test runs in an easy-to-read dashboard. Check out Top Performing Scenarios and other valuable metrics to optimize your voice AI implementations and track conversation quality across different test scenarios.
- Enhanced Data Visibility in Dataset Summary: See exactly how many data points contributed to your summary results and evaluation metrics, for complete transparency.
- Code Snippets for Running Evals via SDK: Copy-paste-ready commands in the evals playground let you run any evaluation without manual configuration.
- Improved Observability Reliability: Enhanced backend resilience for incomplete span creation scenarios and fixed issues when OpenTelemetry exports fail partially, ensuring complete trace visibility.
- UI/UX Enhancements: Streamlined simulation flow interfaces for a better user experience and standardized decimal precision across the platform (all numeric values now display 2 decimal places).
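For reference, the quality scores named in the Annotation Quality Dashboard item above are standard inter-annotator metrics. The sketch below shows how cosine similarity, Pearson correlation, and Fleiss’ kappa can be computed with common open-source libraries; it is an illustration only, not platform code, and the toy ratings are invented for the example.

```python
# Illustration only: how the dashboard's quality scores can be computed with
# standard libraries. The toy ratings below are invented for demonstration;
# this is not Future AGI platform code.
import numpy as np
from scipy.spatial.distance import cosine
from scipy.stats import pearsonr
from statsmodels.stats.inter_rater import aggregate_raters, fleiss_kappa

# Six items scored by two annotators on a 1-5 scale.
annotator_a = np.array([5, 4, 3, 5, 2, 4])
annotator_b = np.array([5, 3, 3, 4, 2, 5])

# Cosine similarity between the two score vectors (1.0 means identical direction).
cosine_similarity = 1 - cosine(annotator_a, annotator_b)

# Pearson correlation measures linear agreement between the annotators.
pearson_r, _ = pearsonr(annotator_a, annotator_b)

# Fleiss' kappa expects an items-by-categories count table; aggregate_raters
# builds it from raw ratings (rows = items, columns = raters).
table, _ = aggregate_raters(np.column_stack([annotator_a, annotator_b]))
kappa = fleiss_kappa(table)

print(f"cosine={cosine_similarity:.3f}, pearson={pearson_r:.3f}, kappa={kappa:.3f}")
```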
Week of 2025-08-29
What’s New
Features
- Add Rows in Evals Tab of Prompt Workbench: Instantly add new rows with variable values in the evaluations screen, allowing you to generate outputs and evaluate without returning to the Prompt Workbench homepage.
- Trace Linked to Prompt Workbench: View comprehensive performance metrics (latency, cost, tokens, evaluation metrics) for each prompt version linked to traces (and spans) across development, staging, and production environments via the Metrics section in Prompt Workbench.
- Critical Issue Detection & Mitigation Advice on Datasets: Get actionable, AI-powered insights with recommendations to improve your agent’s performance and accelerate your path to production.
- Access FAGI from AWS Marketplace: Sign up or sign in to the FAGI platform via AWS Marketplace and leverage AWS contracts and billing to work with FAGI.
- Support for LlamaIndex OTEL Instrumentation in TypeScript: Easily add observability to agents leveraging the LlamaIndex framework with our TypeScript SDK on the FAGI platform.
- Improved UX for Evaluate Pages: Enhanced the Evaluate Page interface for a consistent experience across devices.
- Faster Alert Graph Loading: Reduced load times of alert graphs in the Alerts feature for quicker and smoother performance.
- UI Improvements for Sidebar Navigation: Enhanced sidebar navigation for better usability.
- User Filtering on Navigation: When navigating from the Users List or User Details Page to the LLM Tracing or Sessions Page, the user’s ID is now automatically applied as a filter.
- User Details Filter Persistence: User filters (for traces and sessions) now persist across page refreshes.
- UI Enhancements for Simulator Agent Form: Improved the user interface for the simulator agent form.
- Support for Video in Trace Detail Screen: Added support for viewing videos in the Trace Details screen.
- Fixed Scroll Issue in Agent Description Box (Simulation): Enabled scroll functionality via mouse in the agent description box within the simulation module.
- Error Handling on Simulation Page: Improved error handling for low credit balances on the simulation homepage to enhance user experience.
- Credit Utilization for Error Localizer: Added visibility of credit utilization for the error localizer in the usage summary screen.
Week of 2025-08-19
What’s New
Features
- Comparison Summary: Compare evaluation and prompt summaries of two different datasets with detailed graphs and scores.
- Function Evals: Enable adding and editing function-type custom evals from the list of evals supported by Future AGI.
- Edit Synthetic Dataset: Edit existing synthetic datasets directly or create a new version from changes.
- Document Column Support in Dataset: New document column type to upload/store files in cells (TXT, DOC, DOCX, PDF).
- User Tab in Dashboard and Observe: Searchable, filterable user list and detailed user view with metrics, interactive charts, synced time filters, and traces/sessions tabs.
- Displaying the Timestamp Column in Trace/Spans: Added Start Time and End Time columns in Observe → LLM Tracing and Prototype → All Runs → Run Details.
- Configure Labels: Configure system and custom labels per prompt version in Prompt Management.
- Async Evals via SDK: Run evaluations asynchronously for long-running evals or larger datasets (see the pattern sketch after this list).
- SDK Code Snippets: Updated the SDK code snippets for columns and rows on the create dataset, add rows, and dataset landing pages.
- Fixed the editing issue in the custom evals form: an incorrect config was displayed on the evals page for function evals.
- Fixed the disappearing bottom section in the trace detail drawer: dragging the bottom section caused the entire bottom area to disappear.
- UI screen optimization for different screen sizes.
- Bug fixes for the updated summary screen: color, text, and font alignment.
- Fixed cell loading state issues while creating synthetic data.
- UI enhancement for simulation agent flow.
- CSV upload bug in datasets and UI fixes for add feedback pop-up.
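To illustrate the async evals item above, the snippet below is a minimal sketch of the asyncio fan-out pattern that an async evaluation API enables. The function names and the simulated latency are hypothetical stand-ins, not the actual Future AGI SDK surface; consult the SDK reference for the real entry points.

```python
# Minimal sketch of the asyncio fan-out pattern enabled by async evals.
# evaluate_async / run_evals are hypothetical stand-ins (with simulated
# latency), not the actual Future AGI SDK API.
import asyncio
import random

async def evaluate_async(eval_name: str, row: dict) -> dict:
    # Stand-in for an SDK call that submits one evaluation and awaits the
    # result without blocking the event loop.
    await asyncio.sleep(random.uniform(0.1, 0.3))  # simulated eval latency
    return {"row_id": row["id"], "eval": eval_name, "score": random.random()}

async def run_evals(eval_name: str, rows: list[dict]) -> list[dict]:
    # Fan out one task per dataset row and gather results concurrently;
    # this is what makes async useful for long-running evals or large datasets.
    return await asyncio.gather(*(evaluate_async(eval_name, r) for r in rows))

if __name__ == "__main__":
    rows = [{"id": i, "output": f"answer {i}"} for i in range(5)]
    print(asyncio.run(run_evals("completeness", rows)))
```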
Week of 2025-08-11
What’s New
Features
- Summary Screen Revamp (Evaluation and Prompt): Unified visual overview of model performance with pass rates and comparative spider/bar/pie charts; includes compare views, drill-downs, and consistent filters.
- Alerts Revamp: Create alert rules in Observe (+New Alert) from Alerts tab or project; notifications via Slack/Email with guided Alert Type and Configuration steps.
- Upgrades in Prompt SDK: Prompt caching increases prompt availability after the first run. Seamlessly deploy prompts to production, staging, or dev and run A/B tests using the Prompt SDK (a caching and routing sketch follows this list).
- Fixed run-prompt issues for longer prompts (>5K words).
- Bug fixes for voice simulation: transcript naming convention, deleting runs, and agent simulator selection.
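As a rough illustration of the Prompt SDK upgrades above, the sketch below shows the general caching and environment-routing idea: cache a prompt after the first fetch and split traffic between two labelled versions for an A/B test. The helper names and version labels are assumptions for illustration, not the real Prompt SDK API.

```python
# Sketch of the caching and environment-routing idea behind the Prompt SDK
# upgrades. fetch_prompt_from_api and the "@A"/"@B" version labels are
# hypothetical illustrations, not the real Prompt SDK surface.
import functools
import random

def fetch_prompt_from_api(name: str, environment: str) -> str:
    # Placeholder for the network call that retrieves a prompt version
    # tagged for a given environment (production / staging / dev).
    return f"[{environment}] prompt template for {name}"

@functools.lru_cache(maxsize=128)
def get_prompt(name: str, environment: str) -> str:
    # The first call hits the API; later calls for the same (name, environment)
    # pair are served from the in-process cache, which is the availability
    # benefit that prompt caching provides after the first run.
    return fetch_prompt_from_api(name, environment)

def get_prompt_ab(name: str, environment: str, traffic_to_b: float = 0.5) -> str:
    # Simple A/B routing between two labelled versions of the same prompt.
    version = "B" if random.random() < traffic_to_b else "A"
    return get_prompt(f"{name}@{version}", environment)

print(get_prompt_ab("support-triage", "production"))
```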
Week of 2025-08-07
What’s New
Features
- Voice Simulation: New testing infrastructure that deploys AI agents to conduct real conversations with your voice systems, analyzing actual audio, not just transcripts.
- Edit Evals Config: Edit the config (prompt/criteria) for your custom evals via the evals playground; adding new variables is not supported.
- Bug fix for dynamic column creation via Weaviate.
- Reduced dependencies for TraceAI packages (HTTPS & gRPC).
- Automated eval refinement: Retune your evals in evals playground by providing feedback.
- Markdown now available as a default option for improved readability.
- Support for video (traces and spans) in Observe project.
Week of 2025-07-29
What’s New
Features
- Edit, Duplicate, and Delete Custom Evals: Duplicate, edit, or delete custom evaluations that are no longer in use or whose logic is outdated.
- Bulk Annotation/User Feedback: Bulk annotate your Observe traces with user feedback directly using the API or SDK (an illustrative request sketch follows this list).
- JSON View for Evals Log: Access evals log data in JSON format in evals playground.
- Span name visibility in traces for Observe and Prototype.
- Bug fix for adding owner to workspace.
- Error handling for evaluations in prompt workbench.
- Add variables to system and assistant user roles in prompt workbench.
- Speed enhancement for dataset loading.
- Error state handling for evaluations in prompt workbench.
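To illustrate the bulk annotation item above, here is a hedged sketch of sending feedback for many traces in a single HTTP request. The base URL, endpoint path, and payload fields are placeholders invented for this example; refer to the API/SDK documentation for the actual interface.

```python
# Hedged sketch of bulk feedback over HTTP. The base URL, endpoint path,
# and payload fields are placeholders invented for this example, not the
# documented Future AGI API.
import os
import requests

API_BASE = "https://api.example.com"        # placeholder base URL
API_KEY = os.environ.get("FI_API_KEY", "")  # assumed env var name

def bulk_annotate(trace_ids: list[str], feedback: str, score: int) -> None:
    # Send feedback for many traces in one request instead of annotating
    # them one by one in the UI.
    payload = {
        "annotations": [
            {"trace_id": t, "feedback": feedback, "score": score}
            for t in trace_ids
        ]
    }
    resp = requests.post(
        f"{API_BASE}/v1/annotations/bulk",  # hypothetical endpoint
        json=payload,
        headers={"Authorization": f"Bearer {API_KEY}"},
        timeout=30,
    )
    resp.raise_for_status()

# Example: bulk_annotate(["trace-1", "trace-2"], feedback="hallucination", score=0)
```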
Week of 2025-07-21
What’s New
Features
- Run button on single cell in evaluations workbench.
- Users can now add notes to Observe traces.
- Improved search logic to render relevant search results in dataset.
- Dataset bugs and API network call optimizations.
- Fixed audio icon.
- Error handling for network connection issues.
- Bug fixes for prompt workbench versioning issues.
- Changed the color mapping for deterministic type evals.
- Updated loaders for evals playground.
- Pagination fix in Observe.
- Added clear functionality in add to dataset column mapping fields in Observe.
- Clear graph property when Observe changes; fixed thumbs down icon not rendering.
- Generate variable bug fix in prompt workbench.
- Experiment page break on content tab switch.
- Fixed the created_at 30-day filter on evals log section.
Week of 2025-07-14
What’s New
Bugs/Improvements
- Prevented overscroll in X direction for entire platform.
- Glitch after refreshing while generating sample data.
- Updated the error message and save button status for doc uploads.
- Variable auto-population issue in compare prompt for multiple versions.
- Restricted function tab to LLM spans only.
- Error handling for mandatory system prompt for a few LLM models.
- Added API null check in all places.
- Streaming issues after run prompt when the current prompt version is updated.
- Truncate model name in model details drawer.
- Fixed a "no rows" error on the dataset homepage for some users on slow connections.
- Easier removal of filters for Observe and Prototype.
- Fixed validation in quick filter number-related fields.
- Fixed inconsistent fonts in evaluation workbench.
- Added loading state to evaluations tab.
- Fixed an issue where the knowledge base name was not visible in a few cases.
- Fixed spacing issue in run prompt.
- Updated the link for the workbench help section and the width of the list view.
Week of 2025-05-05
What’s New
Features
- Diff view in experiment.
- Updated sections for Prototype and Observe.
- Error localization in Observe.
- [Observe+Prototype] Adding annotations flow for trace view details.
- Updated dataset layout and table design.
- Higher rate limits to send more traces in Observe.
- Sorting in alert.
- Support for audio in Observe and datasets.
- Improved error handling in prompt versioning.
- Removed unnecessary keys from evaluation outputs.
- Better handling of required keys to column names in add_evaluation in dataset.
- Removed TraceAI code from FutureAGI SDK - experiment rerun fix.
- SSO login issues.
- Eval ranking fixes.
- Fixed sizing and view issue in dataset when row size is adjusted.
- Fixed sidebar item not showing active style when child page is active globally.
- Fixed integer-type fields showing a red background when edited.
- Fixed crashing of page when adding JSON value in dataset.
- Fixed knowledge base status update issue in case of network issues.
- Experiment tab bugs for some browsers and loading state issues on experiment page.
- Bug in run insight section of Prototype.
Week of 2025-04-28
What’s New
Features
- Prototype / All Runs columns dropdown change.
- Prototype / Configure project.
- Trace details view for Observe/Prototype.
- Allow search in dataset.
- Run insights view - evals (deployed without the error modal part).
- Improved user flow for synthetic data creation with “best practices” for each input.
- Add to dataset flow from Prototype.
- API for Gmail account signup.
- Enabling search within data.
- First-time user experience walkthrough for newly onboarded users.
- Quick filters for annotations view in Prototype and Observe.
- Compare runs in Prototype.
- Diff view for compare dataset.
- Enhancement of Observe and Prototype.
- Addition of new evals for audio - conversational and completeness evals.
- New choice for Tone Eval if none of the choices are suitable.
- Bug on experiment view.
- UI/UX bugs - knowledge base and audio support for evals.
- Fixed required input field column details not appearing for Audio Quality evals.
- UX changes for loader of plan screen.
- Changed the color and the percentage of the eval chips in experiment.
Week of 2025-04-21
What’s New
Features
- Quick filters in Prototype & Observe.
- Added support for knowledge base creation and updating.
- Optimization of synthetic data generation.
- Evaluate working in compare datasets.
- Improved the UI shown when the rate limit is hit.
- Audio and knowledge base bug fixes.
- Improved wrong evals view.
- Fixes in compare dataset.
- Changed the logo URL.
- Filter issue fixed in Prototype.
- Rate-limit error message now prompts users to upgrade their plan.
- Optimized experiments under datasets to run faster.
- Hugging Face error handling for different datasets.