Loading a dataset into the Future AGI platform is straightforward: upload it directly as JSON or CSV, or import it from Hugging Face. The docs give detailed steps for adding a dataset to Future AGI.
After the dataset loads successfully, it appears in the dashboard. Click Run Prompt in the top-right corner and create a prompt that generates a summary. Once a summary has been generated for each row, download the dataset using the download button in the top-right corner.
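Before running any evaluation, it can help to confirm that the downloaded file contains the summary columns you expect. The following is a minimal sketch, assuming the export is a CSV whose summary columns carry the names used later in this guide. The helper names (`load_rows`, `missing_columns`, `count_empty`) are hypothetical and not part of the Future AGI SDK.

```python
import csv

# Column names taken from the evaluation loop in this guide.
SUMMARY_COLUMNS = ["summary-gpt-4o", "summary-gpt-4o-mini", "summary-claude3.5-sonnet"]


def load_rows(path):
    """Read a CSV export into a list of dicts, one dict per row."""
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))


def missing_columns(rows, expected=SUMMARY_COLUMNS):
    """Return the expected columns that are absent from the export."""
    present = set(rows[0].keys()) if rows else set()
    return [col for col in expected if col not in present]


def count_empty(rows, column):
    """Count rows whose value in `column` is missing or blank."""
    return sum(1 for row in rows if not (row.get(column) or "").strip())
```

Calling `missing_columns(load_rows(path))` on your export should return an empty list when every model's summary column is present, and `count_empty` flags rows where a prompt run produced no summary.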
Summary Quality: Evaluates if a summary effectively captures the main points, maintains factual accuracy, and achieves appropriate length while preserving the original meaning. Checks for both inclusion of key information and exclusion of unnecessary details.
BERTScore: Compares a generated response with a reference text using contextual embeddings from pre-trained language models such as bert-base-uncased.
It calculates precision, recall, and F1 score at the token level, based on the cosine similarity between the embedding of each token in the generated response and the embeddings of the tokens in the reference text.
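To make the token-level matching concrete, here is a minimal sketch of the greedy matching behind these three numbers. It assumes token embeddings are already available and uses made-up 2-D vectors in place of real model output. It also leaves out the optional IDF weighting and baseline rescaling, so it illustrates the idea rather than the implementation the platform runs.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def greedy_bertscore(candidate_vecs, reference_vecs):
    """Return (precision, recall, f1) using greedy token matching."""
    # Precision: match each candidate token to its most similar reference token.
    precision = sum(
        max(cosine(c, r) for r in reference_vecs) for c in candidate_vecs
    ) / len(candidate_vecs)
    # Recall: match each reference token to its most similar candidate token.
    recall = sum(
        max(cosine(r, c) for c in candidate_vecs) for r in reference_vecs
    ) / len(reference_vecs)
    # F1: harmonic mean of precision and recall.
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Toy "embeddings" (illustrative values, not produced by a real model).
candidate = [[1.0, 0.0], [0.0, 1.0]]
reference = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

p, r, f1 = greedy_bertscore(candidate, reference)
print(f"precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

In this toy case every candidate token has an exact match in the reference, so precision is 1.0, while the third reference token is only partially covered by the candidate, which pulls recall below 1.0.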
    combined_results = []
    summary_columns = ["summary-gpt-4o", "summary-gpt-4o-mini", "summary-claude3.5-sonnet"]

    for column in summary_columns:
        print(f"Evaluating Summary Quality for {column}...")
        evaluate_summary_quality(dataset, column)
        print(f"Evaluating BERTScore for {column}...")
        evaluate_bertscore(dataset, column)
        print()
Output:
    Evaluating Summary Quality for summary-gpt-4o...
    Evaluating BERTScore for summary-gpt-4o...

    Evaluating Summary Quality for summary-gpt-4o-mini...
    Evaluating BERTScore for summary-gpt-4o-mini...

    Evaluating Summary Quality for summary-claude3.5-sonnet...
    Evaluating BERTScore for summary-claude3.5-sonnet...
    import pandas as pd  # needed for the DataFrame; harmless if already imported earlier
    from tabulate import tabulate

    combined_results_df = pd.DataFrame(combined_results)

    # Format each averaged metric to two decimal places for display.
    for col in ["Avg. Summary Quality", "Avg. Precision", "Avg. Recall", "Avg. F1"]:
        if col in combined_results_df.columns:
            combined_results_df[col] = combined_results_df[col].apply(lambda x: f"{x:.2f}")
        else:
            print(f"Warning: Column {col} not found in the dataframe")

    # Render the comparison table: one row per summary column.
    print(tabulate(
        combined_results_df,
        headers='keys',
        tablefmt='fancy_grid',
        showindex=False,
        colalign=("left", "center", "center", "center", "center")
    ))
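The formatting loop above expects each entry in `combined_results` to carry the four averaged metrics under those exact keys. As a rough sketch of what one such entry might look like, the hypothetical helper below averages per-row scores into a single table row. The function name, the `"Summary Column"` label, and the scores are all made up for illustration and are not taken from the evaluation functions themselves.

```python
from statistics import mean


def summarize_scores(column, quality, precision, recall, f1):
    """Collapse per-row scores into one table row of averaged metrics."""
    return {
        "Summary Column": column,  # hypothetical label for the first table column
        "Avg. Summary Quality": mean(quality),
        "Avg. Precision": mean(precision),
        "Avg. Recall": mean(recall),
        "Avg. F1": mean(f1),
    }


# Made-up per-row scores, for illustration only.
row = summarize_scores(
    "summary-gpt-4o",
    quality=[0.5, 1.0],
    precision=[0.75, 0.25],
    recall=[0.5, 0.5],
    f1=[0.25, 0.75],
)

# Apply the same two-decimal formatting the table code uses.
formatted = {k: (f"{v:.2f}" if isinstance(v, float) else v) for k, v in row.items()}
print(formatted)
```

Keeping one dictionary per summary column means `pd.DataFrame(combined_results)` yields one row per model, which is the five-column shape the `colalign` tuple assumes.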