dreadnode.tracing

API reference for the dreadnode.tracing module.

Span

Span(
    name: str,
    tracer: Tracer,
    *,
    attributes: AnyDict | None = None,
    label: str | None = None,
    type: SpanType = "span",
    tags: Sequence[str] | None = None,
)

active

active: bool

Check if the span is currently active (recording).

duration

duration: float

Get the duration of the span in seconds.

exception

exception: BaseException | None

Get the exception recorded in the span, if any.

failed

failed: bool

Check if the span has failed.

is_recording

is_recording: bool

Check if the span is currently recording.

label

label: str

Get the label of the span.

TaskContext

Context for transferring and continuing tasks across processes.

TaskSpan

TaskSpan(
    name: str,
    tracer: Tracer,
    *,
    storage: Storage | None = None,
    project: str = "default",
    task_id: str | UUID | None = None,
    type: SpanType = "task",
    attributes: AnyDict | None = None,
    label: str | None = None,
    params: AnyDict | None = None,
    metrics: MetricsDict | None = None,
    tags: Sequence[str] | None = None,
    arguments: Arguments | None = None,
)

Self-sufficient task span with object storage, metrics, params, and artifacts.

TaskSpan is the primary span type for all operations. It manages its own:

Object storage (inputs, outputs, arbitrary objects)
Metrics tracking
Parameters
Artifacts
Child tasks

TaskSpans can be nested - a TaskSpan can contain child TaskSpans.

agent_id

agent_id: str | None

Get the ID of the nearest agent span in the parent chain.

all_tasks

all_tasks: list[TaskSpan[Any]]

Get all tasks, including nested subtasks.
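
The difference between tasks (direct children only) and all_tasks (every descendant) can be sketched with a small stand-in class. Node below is hypothetical and is not the dreadnode implementation; the depth-first ordering is also an assumption, since the ordering of all_tasks is not stated here.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for a TaskSpan; not part of dreadnode."""

    name: str
    tasks: list[Node] = field(default_factory=list)  # direct children only

    @property
    def all_tasks(self) -> list[Node]:
        # Depth-first walk: each child, followed by that child's descendants.
        found: list[Node] = []
        for child in self.tasks:
            found.append(child)
            found.extend(child.all_tasks)
        return found


root = Node("root", [Node("a", [Node("a1")]), Node("b")])
direct = [t.name for t in root.tasks]
nested = [t.name for t in root.all_tasks]
```

Here direct holds only the two children of root, while nested also includes the grandchild a1.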

arguments

arguments: Arguments | None

Get the arguments used for this task if created from a function.

eval_id

eval_id: str | None

Get the ID of the nearest evaluation span in the parent chain.

inputs

inputs: AnyDict

Get all logged inputs.

metrics

metrics: MetricsDict

Get all metrics.

output

output: R

Get the output of this task if created from a function.

outputs

outputs: AnyDict

Get all logged outputs.

params

params: AnyDict

Get all parameters.

parent_task

parent_task: TaskSpan[Any] | None

Get the parent task if it exists.

parent_task_id

parent_task_id: str

Get the parent task ID if it exists.

root_id

root_id: str

Get the root task’s ID (for span grouping/routing).

run_id

run_id: str

Alias for root_id (backwards compatibility).

study_id

study_id: str | None

Get the ID of the nearest study span in the parent chain.

task_id

task_id: str

Get this task’s unique ID.

tasks

tasks: list[TaskSpan[Any]]

Get the list of child tasks.

from_context

from_context(
    context: TaskContext,
    tracer: Tracer,
    storage: Storage | None = None,
) -> TaskSpan[t.Any]

Continue a task from captured context on a remote host.

get_average_metric_value

get_average_metric_value(key: str) -> float

Get the mean of a metric series.
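
As an illustration of what "mean of a metric series" refers to, the sketch below averages every value logged under one metric name. The series dictionary and the average_metric_value helper are hypothetical stand-ins; how the real method treats a missing or empty series is not documented here.

```python
from statistics import fmean

# Hypothetical metric series: metric name -> values logged across steps.
series = {"accuracy": [0.5, 0.75, 1.0]}


def average_metric_value(key: str) -> float:
    """Arithmetic mean of every value logged under `key`."""
    return fmean(series[key])


avg = average_metric_value("accuracy")
```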

get_object

get_object(hash_: str) -> Object

Get an object by its hash.

link_objects

link_objects(
    object_hash: str,
    link_hash: str,
    attributes: AnyDict | None = None,
) -> None

Link two objects together.

log_artifact

log_artifact(
    local_uri: str | Path, *, name: str | None = None
) -> dict[str, t.Any] | None

Log a file as an artifact.

log_input

log_input(
    name: str,
    value: Any,
    *,
    label: str | None = None,
    attributes: AnyDict | None = None,
) -> str

Log an input value.

log_metric

log_metric(
    name: str,
    value: float | bool,
    *,
    step: int = 0,
    origin: Any | None = None,
    timestamp: datetime | None = None,
    aggregation: MetricAggMode | None = None,
    prefix: str | None = None,
    attributes: JsonDict | None = None,
) -> Metric

log_metric(
    name: str,
    value: Metric,
    *,
    origin: Any | None = None,
    aggregation: MetricAggMode | None = None,
    prefix: str | None = None,
) -> Metric

log_metric(
    name: str,
    value: float | bool | Metric,
    *,
    step: int = 0,
    origin: Any | None = None,
    timestamp: datetime | None = None,
    aggregation: MetricAggMode | None = None,
    prefix: str | None = None,
    attributes: JsonDict | None = None,
) -> Metric

Log a metric value.

log_object

log_object(
    value: Any,
    *,
    label: str | None = None,
    event_name: str = EVENT_NAME_OBJECT,
    attributes: AnyDict | None = None,
) -> str

Store an object and return its hash. Objects are stored but not logged as span events.
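
The hash-based methods (log_object, get_object, link_objects) suggest a content-addressed store: a value is stored under a hash of its content, retrieved by that hash, and related to other values by pairs of hashes. The sketch below shows that pattern only. ObjectStore is hypothetical, and hashing the canonical JSON form with SHA-256 is an assumption, not the documented dreadnode behavior.

```python
from __future__ import annotations

import hashlib
import json
from typing import Any


class ObjectStore:
    """Hypothetical content-addressed store; not the dreadnode implementation."""

    def __init__(self) -> None:
        self.objects: dict[str, Any] = {}
        self.links: list[tuple[str, str, dict[str, Any]]] = []

    def log_object(self, value: Any) -> str:
        # Assumption: hash the canonical JSON form, so equal values share one hash.
        payload = json.dumps(value, sort_keys=True).encode("utf-8")
        hash_ = hashlib.sha256(payload).hexdigest()
        self.objects[hash_] = value
        return hash_

    def get_object(self, hash_: str) -> Any:
        return self.objects[hash_]

    def link_objects(
        self,
        object_hash: str,
        link_hash: str,
        attributes: dict[str, Any] | None = None,
    ) -> None:
        # Record a directed relation between two stored objects.
        self.links.append((object_hash, link_hash, attributes or {}))


store = ObjectStore()
prompt_hash = store.log_object({"prompt": "hello"})
same_hash = store.log_object({"prompt": "hello"})  # identical content
reply_hash = store.log_object({"reply": "hi"})
store.link_objects(reply_hash, prompt_hash, {"relation": "response_to"})
```

Because the key is derived from the content, logging the same value twice yields the same hash and only one stored entry.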

log_output

log_output(
    name: str,
    value: Any,
    *,
    label: str | None = None,
    attributes: AnyDict | None = None,
) -> str

Log an output value.

log_param

log_param(key: str, value: Any) -> None

Log a single parameter.

log_params

log_params(**params: Any) -> None

Log multiple parameters.

bind_session_id

bind_session_id(session_id: str) -> t.Iterator[None]

Bind a session ID to all spans created in the current context.
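
The Iterator[None] return type is what a generator-based context manager looks like before it is wrapped, so the function is presumably used in a with block. The sketch below shows one way such a binding could work using contextvars. The _session_id variable and the new_span helper are hypothetical, and the real storage mechanism may differ.

```python
from __future__ import annotations

import contextlib
import contextvars
from collections.abc import Iterator

# Hypothetical context variable; the real mechanism is not documented here.
_session_id: contextvars.ContextVar[str | None] = contextvars.ContextVar(
    "session_id", default=None
)


@contextlib.contextmanager
def bind_session_id(session_id: str) -> Iterator[None]:
    token = _session_id.set(session_id)
    try:
        yield
    finally:
        _session_id.reset(token)  # restore whatever was bound before


def new_span(name: str) -> dict[str, str | None]:
    # Stand-in span factory: stamps the currently bound session ID.
    return {"name": name, "session_id": _session_id.get()}


with bind_session_id("session-123"):
    inside = new_span("step")
outside = new_span("step")
```

Spans created inside the with block carry the session ID; spans created after it do not.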

find_span_by_type

find_span_by_type(span_type: str) -> TaskSpan[t.Any] | None

Find the nearest ancestor span with the given type.

Walks up the parent chain from the current task span to find a span matching the specified type (e.g., “agent”, “evaluation”, “study”).

Parameters:

span_type (str) – The span type to search for (e.g., “agent”, “evaluation”, “study”).

Returns:

TaskSpan[Any] | None – The matching TaskSpan, or None if not found.
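
The parent-chain walk described above can be sketched with a minimal linked structure. SpanNode and find_by_type are hypothetical stand-ins, and starting the search at the current span itself (rather than at its parent) is an assumption.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class SpanNode:
    """Hypothetical stand-in for a TaskSpan; only the fields the walk needs."""

    name: str
    type: str
    parent: SpanNode | None = None


def find_by_type(current: SpanNode | None, span_type: str) -> SpanNode | None:
    # Follow parent links until a span of the requested type is found
    # or the chain runs out.
    node = current
    while node is not None:
        if node.type == span_type:
            return node
        node = node.parent
    return None


study = SpanNode("sweep", "study")
trial = SpanNode("trial-1", "trial", parent=study)
task = SpanNode("score", "task", parent=trial)

nearest_study = find_by_type(task, "study")
nearest_agent = find_by_type(task, "agent")
```

The first lookup returns the study span two levels up; the second returns None because no agent span exists in the chain.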

get_current_run_span

get_current_run_span() -> TaskSpan[t.Any] | None

Get the current task span (backwards compatibility).

get_current_task_span

get_current_task_span() -> TaskSpan[t.Any] | None

Get the current task span.

get_default_tracer

get_default_tracer() -> Tracer

Get the default tracer from the default Dreadnode instance.

Span factories for type-safe tracing.

Only study_span and trial_span are actively used by Study. All other span creation should use dreadnode.task_span() directly.

study_span

study_span(
    name: str,
    *,
    label: str | None = None,
    tags: list[str] | None = None,
    airt_assessment_id: str | None = None,
    airt_attack_name: str | None = None,
    airt_goal: str | None = None,
    airt_goal_category: str | None = None,
    airt_category: str | None = None,
    airt_sub_category: str | None = None,
    airt_transforms: list[str] | None = None,
    airt_target_model: str | None = None,
    airt_attacker_model: str | None = None,
    airt_evaluator_model: str | None = None,
    airt_attack_domain: str | None = None,
    airt_distance_norm: str | None = None,
    airt_input_modality: str | None = None,
    airt_perturbation_budget: float | None = None,
    airt_original_class: str | None = None,
) -> TaskSpan[t.Any]

Create a bare span for optimization study execution.

Events populate all attributes via emit().

Parameters:

name (str) – The study name.
label (str | None, default: None) – Human-readable label.
tags (list[str] | None, default: None) – Additional tags.
airt_assessment_id (str | None, default: None) – AIRT assessment ID (for platform linking).
airt_attack_name (str | None, default: None) – AIRT attack name.
airt_goal (str | None, default: None) – AIRT attack goal.
airt_goal_category (str | None, default: None) – AIRT goal category.
airt_transforms (list[str] | None, default: None) – AIRT transforms applied.
airt_target_model (str | None, default: None) – Target model identifier.
airt_attacker_model (str | None, default: None) – Attacker model identifier.
airt_evaluator_model (str | None, default: None) – Evaluator model identifier.

Returns:

TaskSpan[Any] – A bare TaskSpan for study execution.

trial_span

trial_span(
    trial_id: str,
    *,
    step: int,
    task_name: str | None = None,
    label: str | None = None,
    tags: list[str] | None = None,
    airt_assessment_id: str | None = None,
    airt_trial_index: int | None = None,
    airt_attack_name: str | None = None,
    airt_goal: str | None = None,
    airt_goal_category: str | None = None,
    airt_category: str | None = None,
    airt_sub_category: str | None = None,
    airt_transforms: list[str] | None = None,
    airt_target_model: str | None = None,
    airt_attacker_model: str | None = None,
    airt_evaluator_model: str | None = None,
    airt_attack_domain: str | None = None,
    airt_distance_norm: str | None = None,
    airt_input_modality: str | None = None,
) -> TaskSpan[t.Any]

Create a bare span for an optimization trial.

Events populate all attributes via emit().

Parameters:

trial_id (str) – Unique trial identifier.
step (int) – Trial number in the study.
task_name (str | None, default: None) – Name of the task being evaluated (for label).
label (str | None, default: None) – Human-readable label.
tags (list[str] | None, default: None) – Additional tags.
airt_assessment_id (str | None, default: None) – AIRT assessment ID (for linking trial to assessment).
airt_trial_index (int | None, default: None) – AIRT trial index within the attack.
airt_attack_name (str | None, default: None) – AIRT attack name.
airt_goal (str | None, default: None) – AIRT attack goal.
airt_goal_category (str | None, default: None) – AIRT goal category.
airt_transforms (list[str] | None, default: None) – AIRT transforms applied.
airt_target_model (str | None, default: None) – Target model identifier.
airt_attacker_model (str | None, default: None) – Attacker model identifier.
airt_evaluator_model (str | None, default: None) – Evaluator/judge model identifier.

Returns:

TaskSpan[Any] – A bare TaskSpan for trial execution.

TraceBackend

TraceBackend = Literal['local', 'remote']

Controls remote OTLP streaming.

"local" — local JSONL only. No OTLP streaming.
"remote" — local JSONL and OTLP streaming.
None (default) — Auto-detect: stream if credentials exist.

Local JSONL is always populated regardless of this setting.
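
The three settings above reduce to a single decision: whether to stream over OTLP in addition to writing local JSONL. A minimal sketch of that decision follows. The should_stream function and its has_credentials argument are hypothetical; how dreadnode actually detects credentials is not covered here.

```python
from __future__ import annotations

from typing import Literal

TraceBackend = Literal["local", "remote"]


def should_stream(backend: TraceBackend | None, has_credentials: bool) -> bool:
    """Return True when spans should also be streamed over OTLP."""
    if backend is None:
        # Auto-detect: stream only if credentials are available.
        return has_credentials
    return backend == "remote"


decisions = {
    ("local", True): should_stream("local", True),
    ("remote", False): should_stream("remote", False),
    (None, True): should_stream(None, True),
    (None, False): should_stream(None, False),
}
```

An explicit setting always wins over auto-detection: "local" disables streaming even when credentials exist, and "remote" requests it even when they do not.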

JsonlSpanExporter

JsonlSpanExporter(storage: Storage)

SpanExporter that writes spans to session or run-scoped JSONL files.

LocalStorageSpanExporter

LocalStorageSpanExporter(storage: Storage)

SpanExporter that writes spans to local JSONL files.

TraceExportConfig

TraceExportConfig(
    storage: Storage,
    run_id: str,
    _artifacts_file: IO[str] | None = None,
    _lock: Lock = threading.Lock(),
)

Configuration for trace exports to Storage.

Used by log_artifact() to write artifact metadata to JSONL.

get_path

get_path(signal: str, ext: str = 'jsonl') -> Path

Get the file path for a specific signal type.

shutdown

shutdown() -> None

Close any open file handles.

write_artifact

write_artifact(artifact: dict[str, Any]) -> None

Write artifact metadata to artifacts.jsonl.
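
The _lock field in the constructor suggests that artifact writes are serialized across threads. The sketch below shows the general pattern of appending one JSON object per line under a lock. ArtifactLog is a hypothetical stand-in; unlike the real class, which appears to keep a file handle open until shutdown(), it reopens the file on every write for simplicity.

```python
import json
import tempfile
import threading
from pathlib import Path
from typing import Any


class ArtifactLog:
    """Hypothetical stand-in for the artifact-writing part of TraceExportConfig."""

    def __init__(self, path: Path) -> None:
        self.path = path
        self._lock = threading.Lock()

    def write_artifact(self, artifact: dict[str, Any]) -> None:
        line = json.dumps(artifact)
        # Hold the lock for the whole append so lines never interleave.
        with self._lock, self.path.open("a", encoding="utf-8") as fh:
            fh.write(line + "\n")


log_path = Path(tempfile.mkdtemp()) / "artifacts.jsonl"
log = ArtifactLog(log_path)

threads = [
    threading.Thread(target=log.write_artifact, args=({"name": f"file-{i}.txt"},))
    for i in range(4)
]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

records = [
    json.loads(line)
    for line in log_path.read_text(encoding="utf-8").splitlines()
]
```

Each line of the resulting file parses as an independent JSON object, which is what makes JSONL safe to append to and to read incrementally.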

WebSocketSpanExporter

WebSocketSpanExporter(
    run_id: str,
    host: str = "127.0.0.1",
    port: int = DEFAULT_MCP_PORT,
    *,
    auto_start: bool = True,
)

SpanExporter that sends spans to dreadnode serve via WebSocket.

Used by agents to stream spans in real-time to the serve endpoint for immediate visibility in Armada.

Create a WebSocket span exporter.

Parameters:

run_id (str) – The run identifier.
host (str, default: '127.0.0.1') – Server host address.
port (int, default: DEFAULT_MCP_PORT) – Server port (default from MCP_SERVER_PORT env var or 8787).
auto_start (bool, default: True) – Whether to auto-start the server if not running.

export

export(spans: Sequence[ReadableSpan]) -> SpanExportResult

Export spans to WebSocket server.

force_flush

force_flush(timeout_millis: int = 30000) -> bool

Force flush any pending spans.

shutdown

shutdown() -> None

Close the WebSocket connection.

span_to_flat_dict

span_to_flat_dict(span: ReadableSpan) -> dict

Convert an OTEL ReadableSpan to a flat dict for JSON serialization.

This is the canonical span serialization used by all local exporters (JSONL, WebSocket).

task_span_to_graph

task_span_to_graph(task: TaskSpan[Any]) -> nx.DiGraph

Convert a TaskSpan hierarchy to a networkx directed graph.

Parameters:

task (TaskSpan[Any]) – The root TaskSpan to convert.

Returns:

DiGraph – A networkx DiGraph representing the task hierarchy.
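
To illustrate the shape of such a conversion without depending on dreadnode or networkx, the sketch below flattens a task hierarchy into parent-to-child edges. Task and hierarchy_edges are hypothetical stand-ins, and using task_id as the node key is an assumption; the node keys and attributes the real function emits are not documented here.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Task:
    """Hypothetical stand-in for a TaskSpan."""

    task_id: str
    tasks: list[Task] = field(default_factory=list)


def hierarchy_edges(root: Task) -> list[tuple[str, str]]:
    """Collect one (parent_id, child_id) edge per parent/child pair."""
    edges: list[tuple[str, str]] = []
    stack = [root]
    while stack:
        parent = stack.pop()
        for child in parent.tasks:
            edges.append((parent.task_id, child.task_id))
            stack.append(child)
    return edges


root = Task("root", [Task("a", [Task("a1")]), Task("b")])
edges = hierarchy_edges(root)
```

An edge list like this can be passed to networkx's DiGraph constructor to build the directed graph. A root with no children produces no edges, so it would need to be added as a node explicitly.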