dreadnode.optimization
API reference for the dreadnode.optimization module.
SearchSpace
SearchSpace = Mapping[str, Distribution | list[Primitive]]
Type alias for search space definitions.
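A minimal, illustrative search space using only primitive-choice lists, which the alias permits alongside Distribution objects such as Float, Int, or Categorical. The enumeration at the end is only to show what the space spans; it is not how the library samples.

```python
import itertools

# Illustrative only: plain lists are treated as categorical choices.
# Any value could instead be a Distribution (Float, Int, Categorical).
search_space = {
    "model": ["gpt-4", "claude"],
    "temperature": [0.0, 0.5, 1.0],
}

# A grid-style enumeration of every combination in this space:
combos = [
    dict(zip(search_space, values))
    for values in itertools.product(*search_space.values())
]
```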
StudyStopCondition
StudyStopCondition = StopCondition[list[Trial[CandidateT]]]
Type alias for study stop conditions.
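Conceptually, a stop condition is a predicate over the trial history. The sketch below uses plain floats in place of Trial objects to keep it self-contained; the real StudyStopCondition operates on list[Trial[CandidateT]].

```python
# Hedged sketch: a threshold-based stop condition.
# Plain float scores stand in for Trial objects here.
def score_above(threshold: float):
    def condition(trial_scores: list[float]) -> bool:
        # Stop once any trial meets or exceeds the threshold.
        return any(s >= threshold for s in trial_scores)
    return condition

stop = score_above(0.95)
```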
BudgetUpdated
Signals that GEPA updated optimization budget usage.
CandidateAccepted
Signals that GEPA accepted a proposed candidate.
CandidateRejected
Signals that GEPA rejected a proposed candidate.
CapabilityEnvAdapter
Capability adapter that scores candidates against a provisioned task environment.
Each dataset row is evaluated by provisioning a TaskEnvironment via dreadnode.task_env, rendering the task instruction, running the rebuilt agent, and invoking the configured scorers against the agent's output. Scorers can read dreadnode.core.current_task_environment to reach the live sandbox (e.g. to shell-probe for a flag) while it is still provisioned.
Dataset row conventions
- task_ref (optional): overrides the adapter's default task ref on a per-row basis. Drives which task each trial provisions.
- inputs (optional): per-row template bindings substituted into the task's instruction. The primary mechanism for per-row variation.
- Scoring fields (expected_output, needle, reward, etc.) for reward-recipe-based scoring.
The dataset’s goal field is explicitly NOT consulted: the task’s
rendered instruction is the agent’s user message, and the capability’s
mutable surfaces are the optimization target. “Injecting a different
prompt per row” isn’t a capability_env concept — it’s a capability_agent
concept, and that adapter should be used instead.
Attributes:
- task_ref (str): Default task reference passed to dreadnode.task_env when a row does not override it.
- timeout_sec (int | None): Optional per-env provisioning timeout.
parallel_rows
parallel_rows: int = Field(default=1, ge=1)
Maximum dataset rows to evaluate concurrently within one candidate's evaluate() call. 1 preserves serial behaviour. Higher values provision that many TaskEnvironment sandboxes in parallel, so watch platform concurrency limits.
evaluate
evaluate(batch: list[dict[str, Any]], candidate: dict[str, str], *, capture_traces: bool = False) -> OptimizationEvaluationBatch
Evaluate a candidate by running the rebuilt agent against per-row task envs.
evaluate_candidate
evaluate_candidate(candidate: dict[str, str], example: dict[str, Any] | None = None) -> OptimizationEvaluation
Evaluate one candidate in GEPA-compatible (score, side_info) form.
Categorical
Categorical(choices: list[Primitive])
Categorical distribution for discrete choices.
Parameters:
- choices (list[Primitive]): List of possible values.
Distribution
Distribution()
Base class for all search space distributions.
DreadnodeAgentAdapter
Adapter that evaluates agent instruction candidates with Evaluation.
apply_candidate
apply_candidate(candidate: dict[str, str]) -> Agent
Clone the agent and apply an instruction-only candidate.
evaluate
evaluate(batch: list[dict[str, Any]], candidate: dict[str, str], *, capture_traces: bool = False) -> OptimizationEvaluationBatch
Evaluate one batch of examples and return per-example scores.
evaluate_candidate
evaluate_candidate(candidate: dict[str, str], example: dict[str, Any] | None = None) -> OptimizationEvaluation
Evaluate one candidate in a GEPA-compatible (score, side_info) shape.
make_reflective_dataset
make_reflective_dataset(candidate: dict[str, str], eval_batch: OptimizationEvaluationBatch, components_to_update: list[str]) -> dict[str, list[dict[str, t.Any]]]
Build component-scoped reflective data for GEPA.
seed_candidate
seed_candidate() -> dict[str, str]
Return the current instruction candidate for this agent.
EngineConfig
Execution settings for the optimization engine.
to_gepa_kwargs
to_gepa_kwargs() -> dict[str, t.Any]
Return GEPA-compatible keyword arguments for the engine config.
Float
Float(low: float, high: float, log: bool = False, step: float | None = None)
Floating-point distribution for continuous parameters.
Parameters:
- low (float): Lower bound (inclusive).
- high (float): Upper bound (inclusive).
- log (bool, default: False): If True, sample in log space.
- step (float | None, default: None): Discretization step size.
GEPABackend
GEPA-backed implementation of Dreadnode optimize_anything.
Int
Int(low: int, high: int, log: bool = False, step: int = 1)
Integer distribution for discrete parameters.
Parameters:
- low (int): Lower bound (inclusive).
- high (int): Upper bound (inclusive).
- log (bool, default: False): If True, sample in log space.
- step (int, default: 1): Step size between values.
IterationStart
Signals the start of an optimization iteration.
MergeConfig
Merge-policy settings for candidate combination.
to_gepa_kwargs
to_gepa_kwargs() -> dict[str, t.Any]
Return GEPA-compatible keyword arguments for merge settings.
NewBestTrial
Signals that a new best trial has been found.
Optimization
Dreadnode-native optimize_anything executor.
effective_dataset
effective_dataset: list[Any] | None
Return the trainset if provided, otherwise dataset.
optimization_id
optimization_id: UUID
Stable identifier for this optimization run.
console
console() -> OptimizationResult[CandidateT]
Run the optimization with a live console adapter.
OptimizationAdapter
Adapter contract for systems that need batched evaluation and reflection.
OptimizationBackend
Base interface for optimization backends.
OptimizationBackendError
Raised when an optimization backend cannot execute a request.
OptimizationConfig
Top-level configuration for Dreadnode optimize_anything runs.
OptimizationDependencyError
Raised when an optimization backend dependency is unavailable.
OptimizationEnd
Signals the end of an optimize_anything run.
OptimizationError
Signals that optimize_anything failed before producing a result.
OptimizationEvaluation
OptimizationEvaluation(score: float | None = None, scores: dict[str, float] = dict(), side_info: dict[str, Any] = dict(), evaluation_result: EvalResult[Any, Any] | None = None, traces: Any = None)
Normalized evaluator output for optimize_anything.
OptimizationEvaluationBatch
OptimizationEvaluationBatch(outputs: list[Any] = list(), scores: list[float] = list(), trajectories: list[Any] | None = None, objective_scores: list[dict[str, float]] | None = None)
Batch evaluation data returned by Dreadnode-native adapters.
OptimizationEvaluator
Callable used to score a text candidate.
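Any callable mapping a candidate to a score fits this shape. The function below is a toy example (the scoring rule is invented for illustration), not the exact protocol signature.

```python
# Hedged sketch: a trivial evaluator that rewards candidates containing
# the keyword "flag" and lightly penalizes length.
def evaluator(candidate: str) -> float:
    return (1.0 if "flag" in candidate else 0.0) - 0.001 * len(candidate)

score = evaluator("find the flag")
```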
OptimizationEvent
Base event type for Dreadnode optimize_anything.
OptimizationResult
OptimizationResult(backend: str, seed_candidate: CandidateT | None = None, best_candidate: CandidateT | None = None, best_score: float | None = None, best_scores: dict[str, float] = dict(), objective: str | None = None, train_size: int = 0, val_size: int = 0, pareto_frontier: list[CandidateT] = list(), history: list[Any] = list(), metadata: dict[str, Any] = dict(), raw_result: Any = None)
Result of a Dreadnode optimize_anything run.
frontier_size
frontier_size: int
Return the number of candidates currently on the Pareto frontier.
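For intuition, a Pareto frontier keeps every candidate not dominated by another on all objectives. The sketch below is illustrative only (assuming all objectives are maximized), not the library's implementation.

```python
# Hedged sketch: compute a Pareto frontier over per-candidate score dicts,
# assuming every objective is maximized.
def pareto_frontier(points: list[dict[str, float]]) -> list[dict[str, float]]:
    def dominates(a: dict[str, float], b: dict[str, float]) -> bool:
        # a dominates b if it is at least as good everywhere, better somewhere.
        return all(a[k] >= b[k] for k in b) and any(a[k] > b[k] for k in b)
    return [p for p in points if not any(dominates(q, p) for q in points)]

scores = [
    {"accuracy": 0.9, "speed": 0.2},
    {"accuracy": 0.7, "speed": 0.8},
    {"accuracy": 0.6, "speed": 0.5},  # dominated by the second point
]
front = pareto_frontier(scores)
```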
to_dict
to_dict() -> dict[str, t.Any]
Return a JSON-serializable result dictionary.
OptimizationStart
Signals the beginning of an optimize_anything run.
ParetoFrontUpdated
Signals that the Pareto frontier changed.
RefinerConfig
Candidate-refinement settings for optimize_anything.
to_gepa_kwargs
to_gepa_kwargs() -> dict[str, t.Any]
Return GEPA-compatible keyword arguments for refiner settings.
ReflectionConfig
Reflection-model settings passed through to GEPA.
to_gepa_kwargs
to_gepa_kwargs() -> dict[str, t.Any]
Return GEPA-compatible keyword arguments for the reflection config.
Sample
Sample(candidate: CandidateT, metadata: dict[str, Any] = dict())
A candidate proposed by a sampler.
Attributes:
- candidate (CandidateT): The candidate value to evaluate.
- metadata (dict[str, Any]): Optional metadata (e.g., parent_id for graph-based search).
parent_id
parent_id: UUID | None
Convenience accessor for parent_id in metadata.
Sampler
Base class for optimization samplers.
Samplers propose candidates and learn from evaluation results. Study controls the execution loop; samplers are passive.
The sample/tell interface:
- sample(history) -> list[Sample]: Propose candidates to evaluate
- tell(trials): Receive evaluation results
Example
class GridSampler(Sampler[dict]):
    def __init__(self, grid: dict[str, list]):
        self.combinations = list(itertools.product(*grid.values()))
        self.keys = list(grid.keys())
        self.index = 0

    def sample(self, history: list[Trial]) -> list[Sample]:
        if self.exhausted:
            return []
        candidate = dict(zip(self.keys, self.combinations[self.index]))
        self.index += 1
        return [Sample(candidate)]

    @property
    def exhausted(self) -> bool:
        return self.index >= len(self.combinations)
exhausted
exhausted: bool
Check if the sampler has no more candidates to propose.
Override for finite samplers (grid search, explicit candidate list). Default: never exhausted (infinite sampling).
Returns:
- bool: True if the sampler cannot propose more candidates.
reset
reset() -> None
Reset sampler state for reuse.
Override if the sampler maintains state that should be cleared between study runs.
sample
sample(history: list[Trial[CandidateT]]) -> list[Sample[CandidateT]] | t.Awaitable[list[Sample[CandidateT]]]
Propose candidates to evaluate.
Can be sync or async. If async (returns awaitable), Study will await it. This allows samplers that use async operations (like LLM calls) to generate candidates.
Parameters:
- history (list[Trial[CandidateT]]): All trials evaluated so far (completed, failed, or pruned).
Returns:
- list[Sample[CandidateT]] | Awaitable[list[Sample[CandidateT]]]: List of samples to evaluate together as a batch. Return an empty list to signal the sampler is exhausted. Can also return an awaitable that resolves to the list.
tell
tell(trials: list[Trial[CandidateT]]) -> None
Receive evaluation results.
Called after each batch from sample() completes evaluation. Override to update internal state based on results.
Parameters:
- trials (list[Trial[CandidateT]]): Completed trials from the last sample() batch. Each trial has status, scores, and other result data.
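The sample/tell contract can be exercised end to end with a toy sampler. This standalone sketch uses plain float scores in place of Trial objects (the class and its state are illustrative, not part of the library).

```python
# Hedged sketch of the sample/tell loop a Study drives.
class BestTracker:
    def __init__(self, candidates: list[float]):
        self.queue = list(candidates)
        self.best: float | None = None

    def sample(self, history: list) -> list[float]:
        # Propose one candidate per batch; an empty list signals exhaustion.
        return [self.queue.pop(0)] if self.queue else []

    def tell(self, trials: list[float]) -> None:
        # Learn from results: track the best score seen so far.
        for score in trials:
            if self.best is None or score > self.best:
                self.best = score

sampler = BestTracker([0.2, 0.9, 0.5])
while (batch := sampler.sample([])):
    sampler.tell(batch)
```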
SessionRuntimeAdapter
Capability optimization that runs each trial through a real ManagedRuntimeClient session.
See OPTIMIZE_RUNTIME.MD §5 for the full design. Inherits seed, materialize, propose_new_texts, and make_reflective_dataset from StackAwareCapabilityAdapter, and overrides evaluate, materialize_candidate (to write under Storage instead of tempfile), and _format_feedback (optional turn excerpt).
materialize_retention
materialize_retention: Literal["all", "frontier_only"] = "frontier_only"
Which materialized capability trees to keep on disk after the optimization run terminates.
optimization_job_id
optimization_job_id: str | None = None
Threaded into Storage.optimization_job_path so materialized trees land under <storage>/optimizations/<job>/iter-N/<hash>/. The bridge that wraps the adapter (the same code that calls api.create_optimization_job) is expected to set this before the first evaluate call.
persist_sessions
persist_sessions: Literal["all", "accepted", "none"] = "all"
Which trial sessions to persist. "accepted" is a future enhancement (deferred sync until the candidate-accept signal); the first cut treats it the same as "all".
policy
policy: str | dict[str, Any] = 'headless'
Policy name or dict passed to RuntimeClient.create_session. The headless policy contributes a max_steps hook automatically; pass a dict to override, e.g. {"name": "headless", "max_steps": 10}.
system_prompt_append
system_prompt_append: str | None = None
Mirrors the CLI --system-prompt overlay; threaded into ManagedRuntimeClient at boot.
task_ref
task_ref: str | None = None
Optional task reference; if set, each row provisions dn.task_env. Mirrors CapabilityEnvAdapter.
trace_excerpt_chars
trace_excerpt_chars: int = 0
When >0, inline a tool-call summary into the reflective dataset's Feedback field. Tunes how much trajectory context the GEPA reflection LM sees per row. Default off for parity with the parent adapter.
aclose
aclose() -> None
Shut down the in-process runtime. Safe to call multiple times.
evaluate
evaluate(batch: list[dict[str, Any]], candidate: dict[str, str], *, capture_traces: bool = False) -> OptimizationEvaluationBatch
Materialize the candidate, register the transient capability, and drive trial sessions.
evaluate_candidate
evaluate_candidate(candidate: dict[str, str], example: dict[str, Any] | None = None) -> OptimizationEvaluation
Single-row evaluation entry in GEPA-compatible (score, side_info) shape.
mark_frontier
mark_frontier(candidate_hash: str) -> None
Pin a candidate's materialized tree against frontier_only cleanup.
materialize_candidate
materialize_candidate(candidate: dict[str, str], *, job_id: str | None = None, iteration: int | None = None, candidate_hash: str | None = None) -> MaterializedCapabilityCandidate
Materialize the candidate under Storage.optimization_candidate_path(job_id, iteration, hash). Falls through to StackAwareCapabilityAdapter.materialize_candidate (which uses tempfile.TemporaryDirectory) when called without optimization context, preserving the parent's behavior for callers that don't go through the adapter's evaluate.
StackAwareCapabilityAdapter
Capability-level adapter for stack-aware local optimization.
policy_factory
policy_factory: Callable[[], Any] | None = None
Optional factory returning a SessionPolicy whose extra_hooks() are layered into the agent on each evaluation (e.g. HeadlessSessionPolicy contributing a max_steps hook). Called per _build_agent.
proposal_enabled
proposal_enabled: bool
Whether this adapter exposes a custom candidate proposer.
registry
registry: Any = None
Optional CapabilityRegistry for cross-capability tool/hook merging. When provided, registry.all_tools() + registry.all_hooks() are layered into the agent alongside the materialized capability's own tools/hooks.
system_prompt_append
system_prompt_append: str | None = None
Mirrors the production CLI --system-prompt overlay; appended to the final system prompt by create_agent so optimization sees the same prompt stack production does.
apply_candidate
apply_candidate(candidate: dict[str, str]) -> t.Any
Build an agent from a materialized candidate workspace.
cleanup
cleanup() -> None
Delete any materialized candidate workspaces retained by apply_candidate().
component_keys
component_keys() -> list[str]
Return all editable component keys in stable order.
evaluate
evaluate(batch: list[dict[str, Any]], candidate: dict[str, str], *, capture_traces: bool = False) -> OptimizationEvaluationBatch
Evaluate a candidate by rebuilding the capability and running Evaluation.
evaluate_candidate
evaluate_candidate(candidate: dict[str, str], example: dict[str, Any] | None = None) -> OptimizationEvaluation
Evaluate one candidate in GEPA-compatible (score, side_info) form.
make_reflective_dataset
make_reflective_dataset(candidate: dict[str, str], eval_batch: OptimizationEvaluationBatch, components_to_update: list[str]) -> dict[str, list[dict[str, t.Any]]]
Build component-scoped reflective data for GEPA.
materialize_candidate
materialize_candidate(candidate: dict[str, str]) -> MaterializedCapabilityCandidate
Copy the capability to a temp workspace and apply candidate edits.
propose_new_texts
propose_new_texts(candidate: dict[str, str], reflective_dataset: dict[str, list[dict[str, Any]]], components_to_update: list[str]) -> dict[str, str]
Delegate candidate proposal to an optional proposer capability agent.
seed_candidate
seed_candidate() -> dict[str, str]
Return the current flat candidate map for mutable capability surfaces.
Study
Optimization study using a sampler and objective function.
Study controls the optimization loop:
- Ask sampler for candidates via sample()
- Evaluate candidates via objective function
- Inform sampler of results via tell()
- Repeat until stopping condition or sampler exhausted
Example
async def objective(candidate: dict) -> float:
    agent = Agent(model=candidate['model'], temperature=candidate['temp'])
    result = await agent.run("test prompt")
    return compute_score(result)

study = Study(
    name="optimize-agent",
    objective=objective,
    sampler=GridSampler({'model': ['gpt-4', 'claude'], 'temp': [0.5, 1.0]}),
    direction="maximize",
)
result = await study.run()
Attributes:
- objective (SkipValidation[ObjectiveFunc[CandidateT]]): Function that takes a candidate and returns score(s).
- sampler (SkipValidation[Sampler[CandidateT]]): Sampler that proposes candidates and learns from results.
- direction (Direction | list[Direction]): "maximize" or "minimize" (or a list for multi-objective).
- n_iterations (int): Maximum number of iterations (sample/tell cycles).
- constraints (ScorersLike[CandidateT]): Optional scorers to validate candidates before running.
- stop_conditions (list[StudyStopCondition]): Conditions that will stop the study early.
airt_assessment_id
airt_assessment_id: str | None = None
AIRT assessment ID for platform linking.
airt_attack_domain
airt_attack_domain: str | None = None
Attack domain: 'generative' or 'adversarial_ml'.
airt_attack_name
airt_attack_name: str | None = None
AIRT attack type (tap, pair, goat, crescendo).
airt_attacker_model
airt_attacker_model: str | None = None
Attacker model identifier.
airt_category
airt_category: str | None = None
AIRT category tier (safety/security).
airt_distance_norm
airt_distance_norm: str | None = None
Distance norm for ML attacks: 'l0', 'l1', 'l2', 'linf'.
airt_evaluator_model
airt_evaluator_model: str | None = None
Evaluator/judge model identifier.
airt_goal
airt_goal: str | None = None
AIRT attack goal text.
airt_goal_category
airt_goal_category: str | None = None
AIRT goal category slug (e.g. cybersecurity, weapons).
airt_input_modality
airt_input_modality: str | None = None
Input modality: 'image', 'tabular', 'text'.
airt_jailbreak_threshold
airt_jailbreak_threshold: float = 0.5
Score threshold for classifying a trial as a jailbreak (default 0.5).
airt_original_class
airt_original_class: str | None = None
Original classification label for ML attacks.
airt_perturbation_budget
airt_perturbation_budget: float | None = None
Perturbation budget (epsilon) for ML attacks.
airt_sub_category
airt_sub_category: str | None = None
AIRT sub-category slug (e.g. cybersecurity, weapons).
airt_target_model
airt_target_model: str | None = None
Target model identifier.
airt_transforms
airt_transforms: list[str] | None = None
AIRT transforms applied to prompts.
compliance_tags
compliance_tags: dict[str, Any] = Field(default_factory=dict)
Compliance framework tags (OWASP, ATLAS, SAIF, NIST) for this study.
constraints
constraints: ScorersLike[CandidateT] = Field(default_factory=list)
Scorers that validate candidates before evaluation. A trial is pruned if any constraint fails.
direction
direction: Direction | list[Direction] = 'maximize'
Optimization direction(s). Use a list for multi-objective.
directions
directions: list[Direction]
Get the directions as a list.
max_trials
max_trials: int | None = None
Hard cap on total trial count. When set, the study stops after this many trials regardless of iteration count. This prevents batch expansion from generating excessive trials (e.g., beam_width * branching_factor per iteration).
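The arithmetic behind the cap is worth spelling out. With a batch-expanding sampler, trial count grows multiplicatively per iteration; the numbers below are purely illustrative.

```python
# Illustrative: why max_trials matters for batch-expanding samplers.
beam_width, branching_factor, n_iterations = 4, 3, 100

# Without a cap, each iteration can spawn beam_width * branching_factor trials.
trials_without_cap = beam_width * branching_factor * n_iterations

# With max_trials set (here, a hypothetical 200), the study stops early.
trials_with_cap = min(trials_without_cap, 200)
```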
n_iterations
n_iterations: int = Config(default=100, ge=1)
Maximum number of iterations (sample/tell cycles) to run.
objective
objective: SkipValidation[ObjectiveFunc[CandidateT]]
Function that evaluates a candidate and returns score(s).
objective_names
objective_names: list[str]
Get objective names (populated after the first trial).
sampler
sampler: SkipValidation[Sampler[CandidateT]]
Sampler that proposes candidates to evaluate.
stop_conditions
stop_conditions: list[StudyStopCondition] = Field(default_factory=list)
Conditions that stop the study early when met.
add_stop_condition
add_stop_condition(condition: StudyStopCondition) -> te.Self
Add a stopping condition, returning a new Study.
console
console() -> StudyResult[CandidateT]
Run with a live progress dashboard.
StudyEnd
Signals the end of the study.
StudyEvent
Base class for study-level events.
as_dict
as_dict() -> dict[str, t.Any]
Serialize the event for transport.
emit
emit(span: TaskSpan) -> None
Emit this event's telemetry to the span.
StudyResult
StudyResult(trials: list[Trial[CandidateT]] = list(), stop_reason: StudyStopReason = "unknown", stop_explanation: str | None = None)
The final result of an optimization study, containing all trials and summary statistics.
Attributes:
- trials (list[Trial[CandidateT]]): A complete list of all trials generated during the study.
- stop_reason (StudyStopReason): The reason the study concluded.
- stop_explanation (str | None): A human-readable explanation for why the study stopped.
best_score
best_score: float | None
The highest score among all finished trials. Returns None if no trials succeeded.
best_trial
best_trial: Trial[CandidateT] | None
The trial with the highest score among all finished trials. Returns None if no trials succeeded.
failed_trials
failed_trials: list[Trial[CandidateT]]
A list of all trials that failed.
finished_trials
finished_trials: int
Number of successfully finished trials.
pending_trials
pending_trials: list[Trial[CandidateT]]
A list of all trials that are still pending.
pruned_trials
pruned_trials: list[Trial[CandidateT]]
A list of all trials that were pruned.
running_trials
running_trials: list[Trial[CandidateT]]
A list of all trials that are currently running.
total_trials
total_trials: int
Total number of trials.
to_dataframe
to_dataframe() -> pd.DataFrame
Converts the trials into a pandas DataFrame for analysis.
to_dicts
to_dicts() -> list[dict[str, t.Any]]
Flattens the results into a list of dictionaries, one for each trial.
to_jsonl
to_jsonl(path: str | Path) -> None
Saves the trials to a JSON Lines (JSONL) file.
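JSON Lines means one JSON object per line, one line per trial. The round-trip below shows the format with hand-written trial dicts, independent of the library.

```python
import json
import tempfile
from pathlib import Path

# Illustrative JSONL round-trip matching the format to_jsonl produces:
# one JSON object per line.
trials = [{"id": 1, "score": 0.4}, {"id": 2, "score": 0.9}]
path = Path(tempfile.mkdtemp()) / "trials.jsonl"
path.write_text("\n".join(json.dumps(t) for t in trials) + "\n")

loaded = [json.loads(line) for line in path.read_text().splitlines()]
```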
StudyStart
Signals the beginning of a study.
TrackingConfig
Tracing and reflection-data settings for optimization runs.
to_gepa_kwargs
to_gepa_kwargs() -> dict[str, t.Any]
Return GEPA-compatible keyword arguments for tracking settings.
Trial
Represents a single, evaluated point in the search space.
Attributes:
- id (UUID): Unique identifier for the trial.
- candidate (CandidateT): The candidate configuration being assessed.
- status (TrialStatus): Current status of the trial.
- score (float): The primary, single-value fitness score for this trial. This is an average of all objective scores for this trial, adjusted for their objective directions (higher is better).
- eval_result: Complete evaluation result of the trial and associated dataset.
- pruning_reason (str | None): Reason for pruning this trial, if applicable.
- error (str | None): Any error which occurred while processing this trial.
- step (int): The optimization step which produced this trial.
- dataset: The specific dataset used for probing.
- created_at (datetime): The creation timestamp of the trial.
all_scores
all_scores: dict[str, float]
A dictionary of all named metric mean values from the evaluation result. This includes scores not directly related to the objective.
score_breakdown
score_breakdown: dict[str, list[float]]
Returns a breakdown of all objective scores across all samples in the evaluation result.
Returns:
- dict[str, list[float]]: A dictionary where keys are objective names and values are lists of scores, with each score corresponding to a sample from the evaluation dataset.
__await__
__await__() -> t.Generator[t.Any, None, Trial[CandidateT]]
Await the completion of the trial.
done
done() -> bool
A non-blocking check to see if the trial's evaluation is complete.
get_directional_score
get_directional_score(name: str | None = None, default: float = -float("inf")) -> float
Get a specific named objective score, adjusted for optimization direction (higher is better), or the overall score if no name is given.
Parameters:
- name (str | None, default: None): The name of the objective.
- default (float, default: -float('inf')): The value to return if the named score is not found.
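The "adjusted for optimization direction" semantics can be illustrated standalone: for a "minimize" objective, negating the raw score makes higher always better when comparing trials. This is a conceptual sketch, not the library's implementation.

```python
# Hedged sketch of direction-adjusted scoring.
def directional(score: float, direction: str) -> float:
    # Negate "minimize" objectives so higher is always better.
    return -score if direction == "minimize" else score

raw_losses = [0.3, 0.1, 0.7]
adjusted = [directional(s, "minimize") for s in raw_losses]
best = max(adjusted)  # corresponds to the lowest raw loss, 0.1
```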
wait_for
wait_for(*trials: Trial[CandidateT]) -> list[Trial[CandidateT]]
Await the completion of multiple trials.
Parameters:
- *trials (Trial[CandidateT]): The trials to wait for.
Returns:
- list[Trial[CandidateT]]: A future that resolves to a list of completed trials.
TrialComplete
Signals that a trial has completed successfully.
TrialEvent
Base class for trial-level events. Linked to the study via span hierarchy.
as_dict
as_dict() -> dict[str, t.Any]
Serialize the event for transport.
emit
emit(span: TaskSpan) -> None
Emit this event's telemetry to the span.
TrialFailed
Signals that a trial has failed.
TrialPruned
Signals that a trial was pruned (constraint not satisfied).
TrialStart
Signals the start of a trial.
ValsetEvaluated
Signals that GEPA finished a validation-set evaluation.
optimize_anything
optimize_anything(seed_candidate: CandidateT | None = None, evaluator: OptimizationEvaluator[CandidateT] | None = None, *, name: str | None = None, description: str = "", objective: str | None = None, background: str | None = None, dataset: list[Any] | None = None, trainset: list[Any] | None = None, valset: list[Any] | None = None, config: OptimizationConfig | None = None, backend: str | OptimizationBackend[CandidateT] = "gepa", adapter: OptimizationAdapter[CandidateT] | None = None, tags: list[str] | None = None, label: str | None = None, concurrency: int = 1) -> Optimization[CandidateT]
Construct a Dreadnode-native optimize_anything executor.