# SDK
The Dreadnode Python SDK — install, configure, and the module layout every reference page assumes.
The dreadnode package is the Python surface for everything the platform does: agents, datasets, evaluations, scorers, optimization, training, tracing, and capability authoring. Every reference page in this section is auto-generated from the SDK source, so signatures and docstrings track the code.
```python
import dreadnode as dn

dn.configure(
    server="https://app.dreadnode.io",
    api_key="dn_...",
    organization="acme",
    workspace="research",
)
```

For account setup and installation, see Getting Started and Authentication. This page covers the shape of the SDK itself — modules, idioms, and the conventions each reference page assumes.
## The module map

The SDK splits into one module per concern. Each row points at the reference page for that module.
| Module | What it gives you |
|---|---|
| `dreadnode` | Top-level API: `configure`, `task`, `run`, `log_*`, types, meta annotations |
| `dreadnode.agents` | `Agent`, `Tool`, `Toolset`, reactions, hooks, stopping conditions, MCP |
| `dreadnode.airt` | Prebuilt attack studies for AI red teaming |
| `dreadnode.capabilities` | `Capability`, `Worker`, loader, sync client, manifest types |
| `dreadnode.datasets` | `Dataset`, `LocalDataset`, `load_dataset` |
| `dreadnode.evaluations` | `Evaluation`, sample events, the `@evaluation` decorator |
| `dreadnode.generators` | `Chat`, `Message`, `Generator` (LiteLLM, HTTP, vLLM, Transformers) |
| `dreadnode.models` | `Model`, `LocalModel`, `load_model` |
| `dreadnode.optimization` | `Optimization`, backends, agent adapter, events |
| `dreadnode.samplers` | Sampling strategies for studies (Random, Grid, MAP-Elites, ZOO, Optuna…) |
| `dreadnode.scorers` | 100+ reusable scoring functions (safety, bias, format, security) |
| `dreadnode.storage` | S3 / GCS / Azure / MinIO credentials, session store |
| `dreadnode.tools` | Standard agent tools: bash, python, read, write, fetch, grep… |
| `dreadnode.tracing` | `Span`, `TaskSpan`, `study_span`, `trial_span`, OTLP exporters |
| `dreadnode.training` | Trainers (SFT, DPO, PPO) for Ray, Anyscale, Azure ML, Prime Intellect |
| `dreadnode.transforms` | 35+ transform families for prompt rewriting and attack construction |
Most real code starts on `dreadnode.*` directly — `dn.task`, `dn.log_metric`, `dn.Agent` — and only reaches into submodules when you need something specific like `dn.scorers.exact_match` or `dn.transforms.cipher`.
## Idioms

### `dn.*` is the default instance

Every top-level function on `dreadnode` is bound to a lazily-created `Dreadnode` instance. `dn.configure(...)`, `dn.run(...)`, and `dn.log_metric(...)` all operate on the same default. Construct your own `Dreadnode(...)` only when you need multiple isolated configurations in the same process.
### Decorate functions to track them

Tasks, evaluations, and scorers are created by decorating a plain async function. The decorated object remembers the function and can be composed, executed, or logged without further setup.
```python
import dreadnode as dn

@dn.task
async def triage(alert: str) -> str:
    # Your logic here.
    return classify(alert)

@dn.scorer
async def is_high_priority(output: str) -> float:
    return 1.0 if output == "urgent" else 0.0

result = await triage("Unusual login from new IP")
```

### Runs group tasks; spans group anything
Wrap related work in `dn.run(...)` to give it a project, tags, and a top-level trace. Inside a run, every `@dn.task` call creates a nested `TaskSpan`. Use `dn.span(...)` when you want a labeled section of trace without the task decorator overhead.
```python
with dn.run("triage-batch", project="soc", tags=["prod"]):
    for alert in alerts:
        await triage(alert)
```

### Async where it counts

Task execution, agent runs, and evaluations are all async. `dn.configure(...)` and the `dn.run(...)` context manager are sync, so the common shape is a sync `with` block around `await` calls. Wrap scripts in `asyncio.run(main())` at the top; notebooks and agent loops can `await` directly.
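That script shape can be sketched with a stand-in coroutine. The `triage` body below is a placeholder, and the comment marks where the sync `dn.run(...)` context manager would wrap the loop:

```python
import asyncio


async def triage(alert: str) -> str:
    # Stand-in for a @dn.task-decorated coroutine.
    return "urgent" if "login" in alert.lower() else "low"


async def main() -> list[str]:
    # In real code this loop sits inside a sync `with dn.run(...):` block.
    alerts = ["Unusual login from new IP", "Scheduled heartbeat"]
    return [await triage(a) for a in alerts]


if __name__ == "__main__":
    print(asyncio.run(main()))
```

The `async def main()` plus a single `asyncio.run(main())` at the bottom keeps the whole script on one event loop, which is the pattern the SDK's async surface expects.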
## Load and publish artifacts

The SDK can pull published datasets, models, capabilities, and environments into local storage, and can publish new ones back to the registry.
| Goal | API |
|---|---|
| Pull a published package locally | `dn.pull_package(["dataset://org/name:version"])` |
| Load a pulled package | `dn.load_package("dataset://org/name@version")` |
| Load a local capability directory | `dn.load_capability("./capabilities/recon-kit")` |
| Publish a capability | `dn.push_capability("./capabilities/recon-kit", publish=True)` |
| Publish a dataset, model, or environment | `dn.push_dataset(...)`, `dn.push_model(...)`, `dn.push_environment(...)` |
| List locally-cached or remote packages | `dn.list_registry("capabilities")` (or `"datasets"`, `"models"`, `"environments"`) |
Reference formats differ slightly: `pull_package` takes OCI-style `scheme://org/name:version`, while `load_package` takes `scheme://org/name@version`. Pin versions in benchmarks and training jobs — a moving `latest` makes runs hard to reproduce.
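To make the two formats concrete, here is a tiny hypothetical helper (not part of the SDK) that rewrites a pull-style reference into the load-style one:

```python
def pull_ref_to_load_ref(pull_ref: str) -> str:
    """Rewrite scheme://org/name:version into scheme://org/name@version.

    Hypothetical helper for illustration; the SDK does not ship this.
    """
    scheme, _, rest = pull_ref.partition("://")
    if not rest:
        raise ValueError(f"missing scheme in {pull_ref!r}")
    # rpartition keeps any colons earlier in the path intact.
    path, sep, version = rest.rpartition(":")
    if not sep:
        raise ValueError(f"no pinned version in {pull_ref!r}")
    return f"{scheme}://{path}@{version}"
```

So `pull_ref_to_load_ref("dataset://org/name:1.0.0")` yields `"dataset://org/name@1.0.0"` — the same package, addressed the way `load_package` expects.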
For the full narrative on each artifact type — manifest shape, publishing lifecycle, catalog browsing, and loading patterns — see Datasets, Models, Capabilities, and Tasks (“environments” in the SDK).
## SDK vs CLI

The SDK and CLI are complementary. Reach for the SDK when your workflow belongs in code — agent definitions, evaluations, custom scorers, training loops, CI jobs. Reach for the CLI for login, profile switching, registry operations, and quick platform inspection from a shell.
A typical loop is “build and test in Python, publish with the CLI, pin the published version in the next SDK run.”
## Examples

Runnable scripts and notebooks ship in the SDK repo:

- Scripts: `packages/sdk/examples/scripts/` — run from `packages/sdk` with `uv run python examples/scripts/<name>.py`
- Notebooks: `packages/sdk/examples/notebooks/`
Good entry points: `agent_with_tools.py`, `evaluation_with_scorers.py`, `optimization_study.py`, and `airt_pair.py`.
## Common confusion points

- **Top-level re-exports duplicate domain pages.** `dn.Task`, `dn.Scorer`, `dn.Agent`, etc. render on `dreadnode` and on the domain page. They're the same class, just reached through different paths.
- **Capabilities are loaded, not `load_package`'d.** Use `dn.load_capability("./path")` for local directories and `dn pull` + `dn install` from the CLI for published bundles.
- **"Environments" in the SDK are "tasks" everywhere else.** `dn.push_environment(...)` publishes what the app and CLI call a task; the registry URI is `environment://org/name:version`.
- **`ApiClient` is the escape hatch.** When an endpoint doesn't have a first-class SDK wrapper — billing, device-code login, hosted job submission, raw world control — drop to `from dreadnode.app.api.client import ApiClient`.
- **Tracing needs a run or span.** Calling `dn.log_metric(...)` outside of `dn.run(...)`, a `@dn.task`, or `dn.span(...)` warns and no-ops.