Meta-Prompt Optimizer

A guide to the Meta-Prompt optimizer, which uses a teacher LLM for deep reasoning-based prompt refinement through systematic failure analysis and rewriting.

Meta-Prompt uses a powerful teacher LLM to analyze how your prompt performs, understand why it fails on specific examples, formulate hypotheses about improvements, and completely rewrite the prompt. This approach is inspired by the promptim library and excels at tasks requiring deep reasoning.


When to Use Meta-Prompt

✅ Best For

  • Complex reasoning tasks
  • Tasks where understanding failures helps
  • Refining well-scoped prompts
  • Deep iterative improvement

❌ Not Ideal For

  • Quick experiments (slower)
  • Simple classification tasks
  • Very large datasets (costly)
  • Tasks with unclear failure patterns

How It Works

Meta-Prompt follows a systematic analysis-and-rewrite cycle:

Evaluate Current Prompt

Run the current prompt on a subset of your dataset and collect scores

Identify Failures

Focus on examples with low scores to understand what went wrong
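The failure-selection step can be sketched in a few lines. This is an illustrative helper, not the library's internal code; `select_failures` and the example/score pairing are assumptions for the sketch:

```python
# Hypothetical sketch of failure selection: pair each example with its score,
# then keep the lowest-scoring ones for the teacher to analyze.
def select_failures(examples, scores, max_failures=5):
    """Return the worst-performing examples, sorted by ascending score."""
    ranked = sorted(zip(examples, scores), key=lambda pair: pair[1])
    return [example for example, score in ranked[:max_failures]]

examples = ["text A", "text B", "text C", "text D"]
scores = [0.9, 0.2, 0.7, 0.4]
print(select_failures(examples, scores, max_failures=2))  # → ['text B', 'text D']
```

Capping `max_failures` keeps the teacher's context focused on the clearest failure patterns rather than flooding it with every low-scoring row.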

Formulate Hypothesis

Teacher model analyzes failures and proposes a specific improvement theory

Rewrite Prompt

Generate a complete new prompt implementing the hypothesis

Repeat

Continue for multiple rounds, building on previous insights
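The whole cycle can be summarized as a loop. This is an illustrative sketch of the algorithm described above, not the library's implementation; `evaluate`, `select_failures`, and `ask_teacher` are stand-ins for the real evaluator and teacher-LLM calls:

```python
# Illustrative analysis-and-rewrite loop. Each round: evaluate, find failures,
# ask the teacher for a hypothesis and rewrite, and keep the candidate only
# if it scores better than the current best.
def meta_prompt_loop(prompt, subset, evaluate, select_failures, ask_teacher,
                     num_rounds=5):
    best_prompt = prompt
    best_score = evaluate(best_prompt, subset)            # 1. evaluate current prompt
    history = []                                          # failed attempts the teacher sees
    for _ in range(num_rounds):
        failures = select_failures(best_prompt, subset)   # 2. identify failures
        hypothesis, candidate = ask_teacher(best_prompt, failures, history)  # 3-4
        candidate_score = evaluate(candidate, subset)
        if candidate_score > best_score:                  # keep only improvements
            best_prompt, best_score = candidate, candidate_score
        else:
            history.append((candidate, hypothesis))       # record the failed attempt
    return best_prompt, best_score
```

Passing rejected candidates back in `history` is what lets the teacher build on previous insights instead of proposing the same fix twice.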

What the teacher sees (each round): Current prompt; previous failed attempts (to avoid repeating mistakes); performance data (which examples failed and why); your task description.

What the teacher returns: A hypothesis and an improved prompt, for example:

{
  "hypothesis": "The prompt fails on complex multi-sentence texts because it doesn't specify a structure. Adding explicit instruction to identify main points first should improve clarity.",
  "improved_prompt": "First identify the 2-3 main points in the following text. Then write a single concise sentence that captures these points:\n\n{text}"
}
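A reply in that shape can be parsed into its two fields before the rewritten prompt is evaluated. The parser below is a hypothetical sketch keyed to the example above; the library may handle this differently:

```python
import json

# Hypothetical parser for a teacher reply like the JSON example above.
# The field names "hypothesis" and "improved_prompt" come from that example.
def parse_teacher_response(raw):
    reply = json.loads(raw)
    return reply["hypothesis"], reply["improved_prompt"]

raw = ('{"hypothesis": "Add structure first.", '
       '"improved_prompt": "List main points, then summarize: {text}"}')
hypothesis, improved = parse_teacher_response(raw)
print(improved.format(text="..."))  # → List main points, then summarize: ...
```

Note that the improved prompt keeps the `{text}` placeholder, so it can be filled per-example just like the original prompt template.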

Note

Unlike optimizers that tweak parts of a prompt, Meta-Prompt rewrites the entire prompt each iteration based on deep analysis.


Basic Usage

from fi.opt.optimizers import MetaPromptOptimizer
from fi.opt.generators import LiteLLMGenerator
from fi.opt.datamappers import BasicDataMapper
from fi.opt.base.evaluator import Evaluator

# Setup teacher model (use a powerful model for analysis)
teacher = LiteLLMGenerator(
    model="gpt-4o",
    prompt_template="{prompt}"
)

# Setup evaluator
evaluator = Evaluator(
    eval_template="summary_quality",
    eval_model_name="turing_flash",
    fi_api_key="your_key",
    fi_secret_key="your_secret"
)

# Setup data mapper
data_mapper = BasicDataMapper(
    key_map={"input": "text", "output": "generated_output"}
)

# Create optimizer
optimizer = MetaPromptOptimizer(
    teacher_generator=teacher
)

# Run optimization
result = optimizer.optimize(
    evaluator=evaluator,
    data_mapper=data_mapper,
    dataset=dataset,
    initial_prompts=["Summarize this text: {text}"],
    task_description="Create concise, informative summaries",
    num_rounds=5,
    eval_subset_size=40
)

print(f"Final score: {result.final_score:.2%}")
print(f"Best prompt:\n{result.best_generator.get_prompt_template()}")

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| teacher_generator | LiteLLMGenerator | required | Model for analysis and rewrites (e.g. gpt-4o, claude-3-opus) |
| task_description | str | "I want to improve my prompt." | What the optimized prompt should achieve; more specific helps |
| num_rounds | int | 5 | Analysis-and-rewrite iterations (passed to optimize()) |
| eval_subset_size | int | 40 | Examples to evaluate each round (passed to optimize()) |
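num_rounds and eval_subset_size are the main cost levers: more rounds and larger subsets give the teacher better evidence but multiply evaluator calls. A rough cost model, under the simplifying assumption of one evaluator call per example per round plus one teacher call per round:

```python
# Rough cost estimate (assumption: one evaluator call per example per round,
# plus one teacher call per round; the real optimizer may differ).
def estimate_calls(num_rounds, eval_subset_size):
    evaluator_calls = num_rounds * eval_subset_size
    teacher_calls = num_rounds
    return evaluator_calls, teacher_calls

print(estimate_calls(5, 40))  # defaults → (200, 5)
print(estimate_calls(3, 20))  # quicker experiment → (60, 3)
```

This is why Meta-Prompt is listed as costly for very large datasets: evaluator calls scale with the subset size every round, so for quick experiments it is cheaper to shrink both knobs.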



Underlying Research

The Meta-Prompt optimizer is inspired by meta-learning and reflective AI systems, where a model improves its own processes.

  • Meta-Learning: The core idea is formalized in research like “System Prompt Optimization with Meta-Learning”, which uses bilevel optimization. Another related work is “metaTextGrad”, which optimizes both prompts and their surrounding structures.
  • Industry Tools: This reflective approach is used in tools like Google’s Vertex AI Prompt Optimizer and is a key feature in advanced models for self-improvement.
  • Frameworks: The concept is explored in libraries like promptim and is classified in surveys as a leading LLM-driven optimization method.
