# SDK
The Dreadnode Python SDK — install, configure, and the module layout every reference page assumes.
The dreadnode package is the Python surface for everything the platform does: agents, datasets, evaluations, scorers, optimization, training, tracing, and capability authoring. Every reference page in this section is auto-generated from the SDK source, so signatures and docstrings track the code.
```python
import dreadnode as dn

dn.configure(
    server="https://app.dreadnode.io",
    api_key="dn_...",
    organization="acme",
    workspace="research",
)
```

For account setup and installation, see Getting Started and Authentication. This page covers the shape of the SDK itself — modules, idioms, and the conventions each reference page assumes.
## The module map

The SDK splits into one module per concern. Each row points at the reference page for that module.
| Module | What it gives you |
|---|---|
| `dreadnode` | Top-level API: `configure`, `task`, `run`, `log_*`, types, meta annotations |
| `dreadnode.agents` | `Agent`, `Tool`, `Toolset`, reactions, hooks, stopping conditions, MCP |
| `dreadnode.airt` | Prebuilt attack studies for AI red teaming |
| `dreadnode.capabilities` | `Capability`, `Worker`, loader, sync client, manifest types |
| `dreadnode.datasets` | `Dataset`, `LocalDataset`, `load_dataset` |
| `dreadnode.evaluations` | `Evaluation`, sample events, the `@evaluation` decorator |
| `dreadnode.generators` | `Chat`, `Message`, `Generator` (LiteLLM, HTTP, vLLM, Transformers) |
| `dreadnode.models` | `Model`, `LocalModel`, `load_model` |
| `dreadnode.optimization` | `Optimization`, backends, agent adapter, events |
| `dreadnode.samplers` | Sampling strategies for studies (Random, Grid, MAP-Elites, ZOO, Optuna…) |
| `dreadnode.scorers` | 100+ reusable scoring functions (safety, bias, format, security) |
| `dreadnode.storage` | S3 / GCS / Azure / MinIO credentials, session store |
| `dreadnode.tools` | Standard agent tools: bash, python, read, write, fetch, grep… |
| `dreadnode.tracing` | `Span`, `TaskSpan`, `study_span`, `trial_span`, OTLP exporters |
| `dreadnode.training` | Trainers (SFT, DPO, PPO) for Ray, Anyscale, Azure ML, Prime Intellect |
| `dreadnode.transforms` | 35+ transform families for prompt rewriting and attack construction |
Most real code starts on `dreadnode.*` directly — `dn.task`, `dn.log_metric`, `dn.Agent` — and only reaches into submodules when you need something specific like `dn.scorers.exact_match` or `dn.transforms.cipher`.
## Idioms

### `dn.*` is the default instance

Every top-level function on `dreadnode` is bound to a lazily-created `Dreadnode` instance. `dn.configure(...)`, `dn.run(...)`, and `dn.log_metric(...)` all operate on the same default. Construct your own `Dreadnode(...)` only when you need multiple isolated configurations in the same process.
### Decorate functions to track them

Tasks, evaluations, and scorers are created by decorating a plain async function. The decorated object remembers the function and can be composed, executed, or logged without further setup.
```python
import dreadnode as dn

@dn.task
async def triage(alert: str) -> str:
    # Your logic here.
    return classify(alert)

@dn.scorer
async def is_high_priority(output: str) -> float:
    return 1.0 if output == "urgent" else 0.0

result = await triage("Unusual login from new IP")
```

### Runs group tasks; spans group anything
Wrap related work in `dn.run(...)` to give it a project, tags, and a top-level trace. Inside a run, every `@dn.task` call creates a nested `TaskSpan`. Use `dn.span(...)` when you want a labeled section of trace without the task decorator overhead.
```python
with dn.run("triage-batch", project="soc", tags=["prod"]):
    for alert in alerts:
        await triage(alert)
```

### Async where it counts

Task execution, agent runs, and evaluations are all async. `dn.configure(...)` and the `dn.run(...)` context manager are sync, so the common shape is a sync `with` block around `await` calls. Wrap scripts in `asyncio.run(main())` at the top; notebooks and agent loops can `await` directly.
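That script shape can be sketched with a stand-in coroutine. The `triage` body below is a placeholder, and the comment marks where the sync `dn.run(...)` context manager would wrap the loop:

```python
import asyncio


async def triage(alert: str) -> str:
    # Stand-in for a @dn.task-decorated coroutine.
    return "urgent" if "login" in alert.lower() else "low"


async def main() -> list[str]:
    # In real code this loop sits inside a sync `with dn.run(...):` block.
    alerts = ["Unusual login from new IP", "Scheduled heartbeat"]
    return [await triage(a) for a in alerts]


if __name__ == "__main__":
    print(asyncio.run(main()))
```

The `async def main()` plus a single `asyncio.run(main())` at the bottom keeps the whole script on one event loop, which is the pattern the SDK's async surface expects.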
## Load and publish artifacts

The SDK can pull published datasets, models, capabilities, and environments into local storage, and can publish new ones back to the registry.
| Goal | API |
|---|---|
| Pull a published package locally | `dn.pull_package(["dataset://org/name:version"])` |
| Load a pulled package | `dn.load_package("dataset://org/name@version")` |
| Load a local capability directory | `dn.load_capability("./capabilities/recon-kit")` |
| Publish a capability | `dn.push_capability("./capabilities/recon-kit", publish=True)` |
| Publish a dataset, model, or environment | `dn.push_dataset(...)`, `dn.push_model(...)`, `dn.push_environment(...)` |
| List locally-cached or remote packages | `dn.list_registry("capabilities")` (or `"datasets"`, `"models"`, `"environments"`) |
Reference formats differ slightly: `pull_package` takes OCI-style `scheme://org/name:version`, while `load_package` takes `scheme://org/name@version`. Pin versions in benchmarks and training jobs — a moving `latest` makes runs hard to reproduce.
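To make the two formats concrete, here is a tiny hypothetical helper (not part of the SDK) that rewrites a pull-style reference into the load-style one:

```python
def pull_ref_to_load_ref(pull_ref: str) -> str:
    """Rewrite scheme://org/name:version into scheme://org/name@version.

    Hypothetical helper for illustration; the SDK does not ship this.
    """
    scheme, _, rest = pull_ref.partition("://")
    if not rest:
        raise ValueError(f"missing scheme in {pull_ref!r}")
    # rpartition keeps any colons earlier in the path intact.
    path, sep, version = rest.rpartition(":")
    if not sep:
        raise ValueError(f"no pinned version in {pull_ref!r}")
    return f"{scheme}://{path}@{version}"
```

So `pull_ref_to_load_ref("dataset://org/name:1.0.0")` yields `"dataset://org/name@1.0.0"` — the same package, addressed the way `load_package` expects.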
For the full narrative on each artifact type — manifest shape, publishing lifecycle, catalog browsing, and loading patterns — see Datasets, Models, Capabilities, and Tasks (“environments” in the SDK).
## SDK vs CLI

The SDK and CLI are complementary. Reach for the SDK when your workflow belongs in code — agent definitions, evaluations, custom scorers, training loops, CI jobs. Reach for the CLI for login, profile switching, registry operations, and quick platform inspection from a shell.
A typical loop is “build and test in Python, publish with the CLI, pin the published version in the next SDK run.”
## Examples

Runnable scripts and notebooks ship in the SDK repo:

- Scripts: `packages/sdk/examples/scripts/` — run from `packages/sdk` with `uv run python examples/scripts/<name>.py`
- Notebooks: `packages/sdk/examples/notebooks/`
Good entry points: `agent_with_tools.py`, `evaluation_with_scorers.py`, `optimization_study.py`, and `airt_pair.py`.
## Common confusion points

- **Top-level re-exports duplicate domain pages.** `dn.Task`, `dn.Scorer`, `dn.Agent`, etc. render on `dreadnode` and on the domain page. They're the same class, just reached through different paths.
- **Capabilities are loaded, not `load_package`'d.** Use `dn.load_capability("./path")` for local directories and `dn pull` + `dn install` from the CLI for published bundles.
- **"Environments" in the SDK are "tasks" everywhere else.** `dn.push_environment(...)` publishes what the app and CLI call a task; the registry URI is `environment://org/name:version`.
- **`ApiClient` is the escape hatch.** When an endpoint doesn't have a first-class SDK wrapper — billing, device-code login, hosted job submission, raw world control — drop to `from dreadnode.app.api.client import ApiClient`.
- **Tracing needs a run or span.** Calling `dn.log_metric(...)` outside of `dn.run(...)`, a `@dn.task`, or `dn.span(...)` warns and no-ops.