SDK

The Dreadnode Python SDK — install, configure, and the module layout every reference page assumes.

The dreadnode package is the Python surface for everything the platform does: agents, datasets, evaluations, scorers, optimization, training, tracing, and capability authoring. Every reference page in this section is auto-generated from the SDK source, so signatures and docstrings track the code.

import dreadnode as dn

dn.configure(
    server="https://app.dreadnode.io",
    api_key="dn_...",
    organization="acme",
    workspace="research",
)
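To keep the API key out of source control, the same four settings can be collected from environment variables before calling dn.configure. The sketch below is one way to do that. The EXAMPLE_DN_* variable names and both helper functions are hypothetical, invented for this example. It also assumes dn.configure accepts any subset of the keyword arguments shown above.

```python
import os
from typing import Mapping

# Hypothetical variable names chosen for this example; they are not names
# the SDK is documented to read on its own.
ENV_TO_KWARG = {
    "EXAMPLE_DN_SERVER": "server",
    "EXAMPLE_DN_API_KEY": "api_key",
    "EXAMPLE_DN_ORGANIZATION": "organization",
    "EXAMPLE_DN_WORKSPACE": "workspace",
}


def settings_from_env(env: Mapping[str, str]) -> dict[str, str]:
    """Collect configure() keyword arguments from an environment mapping.

    Unset or empty variables are skipped; a missing API key is an error.
    """
    settings = {
        kwarg: env[var] for var, kwarg in ENV_TO_KWARG.items() if env.get(var)
    }
    if "api_key" not in settings:
        raise RuntimeError("EXAMPLE_DN_API_KEY is not set")
    return settings


def configure_from_env() -> None:
    """Apply the collected settings (requires the dreadnode package)."""
    import dreadnode as dn  # imported lazily so the helper above stays testable

    # Assumption: configure() accepts a subset of the four keyword arguments.
    dn.configure(**settings_from_env(os.environ))
```

Call configure_from_env() once at process start, before any task or run is created.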

For account setup and installation, see Getting Started and Authentication. This page covers the shape of the SDK itself — modules, idioms, and the conventions each reference page assumes.

The SDK splits into one module per concern. Each row points at the reference page for that module.

| Module | What it gives you |
| --- | --- |
| dreadnode | Top-level API: configure, task, run, log_*, types, meta annotations |
| dreadnode.agents | Agent, Tool, Toolset, reactions, hooks, stopping conditions, MCP |
| dreadnode.airt | Prebuilt attack studies for AI red teaming |
| dreadnode.capabilities | Capability, Worker, loader, sync client, manifest types |
| dreadnode.datasets | Dataset, LocalDataset, load_dataset |
| dreadnode.evaluations | Evaluation, sample events, the @evaluation decorator |
| dreadnode.generators | Chat, Message, Generator (LiteLLM, HTTP, vLLM, Transformers) |
| dreadnode.models | Model, LocalModel, load_model |
| dreadnode.optimization | Optimization, backends, agent adapter, events |
| dreadnode.samplers | Sampling strategies for studies (Random, Grid, MAP-Elites, ZOO, Optuna…) |
| dreadnode.scorers | 100+ reusable scoring functions (safety, bias, format, security) |
| dreadnode.storage | S3 / GCS / Azure / MinIO credentials, session store |
| dreadnode.tools | Standard agent tools: bash, python, read, write, fetch, grep |
| dreadnode.tracing | Span, TaskSpan, study_span, trial_span, OTLP exporters |
| dreadnode.training | Trainers (SFT, DPO, PPO) for Ray, Anyscale, Azure ML, Prime Intellect |
| dreadnode.transforms | 35+ transform families for prompt rewriting and attack construction |

Most code works against the top-level dreadnode namespace directly (dn.task, dn.log_metric, dn.Agent) and reaches into submodules only for something specific, such as dn.scorers.exact_match or dn.transforms.cipher.

Every top-level function on dreadnode is bound to a lazily created default Dreadnode instance. dn.configure(...), dn.run(...), and dn.log_metric(...) all operate on that same instance. Construct your own Dreadnode(...) only when you need multiple isolated configurations in the same process.

Tasks, evaluations, and scorers are created by decorating a plain async function. The decorated object remembers the function and can be composed, executed, or logged without further setup.

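import dreadnode as dn

@dn.task
async def triage(alert: str) -> str:
    # Your logic here; classify() is a placeholder for your own function.
    return classify(alert)

@dn.scorer
async def is_high_priority(output: str) -> float:
    return 1.0 if output == "urgent" else 0.0

# Top-level await works in a notebook; in a script, run this inside asyncio.run(...).
result = await triage("Unusual login from new IP")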

Wrap related work in dn.run(...) to give it a project, tags, and a top-level trace. Inside a run, every @dn.task call creates a nested TaskSpan. Use dn.span(...) when you want a labeled section of trace without the task decorator overhead.

with dn.run("triage-batch", project="soc", tags=["prod"]):
    for alert in alerts:
        await triage(alert)

Task execution, agent runs, and evaluations are all async. dn.configure(...) and the dn.run(...) context manager are sync, so the common shape is a sync with block around await calls. In a script, put the async work in a main() coroutine and call asyncio.run(main()) at the top level; notebooks and agent loops can await directly.
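The shape described above, a sync with block wrapping await calls inside a coroutine driven by asyncio.run, can be sketched with the standard library alone. In the sketch, run and triage are stand-ins for dn.run and a @dn.task function, and the events list exists only to make the ordering visible. None of it is SDK code.

```python
import asyncio
from contextlib import contextmanager
from typing import Iterator

events: list[str] = []


@contextmanager
def run(name: str) -> Iterator[None]:
    """Stand-in for the sync dn.run(...) context manager."""
    events.append(f"start {name}")
    try:
        yield
    finally:
        events.append(f"end {name}")


async def triage(alert: str) -> str:
    """Stand-in for an async @dn.task function."""
    await asyncio.sleep(0)  # yield to the event loop, as real I/O would
    events.append(f"triage {alert}")
    return "urgent" if "login" in alert.lower() else "routine"


async def main(alerts: list[str]) -> list[str]:
    # A plain `with` (not `async with`) wrapping awaited calls.
    with run("triage-batch"):
        return [await triage(alert) for alert in alerts]


# Scripts start the event loop once, at the top level.
results = asyncio.run(main(["Unusual login from new IP", "Disk at 70%"]))
```

In a notebook the last line would be results = await main([...]) instead, since a loop is already running there.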

The SDK can pull published datasets, models, capabilities, and environments into local storage, and can publish new ones back to the registry.

| Goal | API |
| --- | --- |
| Pull a published package locally | dn.pull_package(["dataset://org/name:version"]) |
| Load a pulled package | dn.load_package("dataset://org/name@version") |
| Load a local capability directory | dn.load_capability("./capabilities/recon-kit") |
| Publish a capability | dn.push_capability("./capabilities/recon-kit", publish=True) |
| Publish a dataset, model, or environment | dn.push_dataset(...), dn.push_model(...), dn.push_environment(...) |
| List locally-cached or remote packages | dn.list_registry("capabilities") (or "datasets", "models", "environments") |

Reference formats differ slightly: pull_package takes OCI-style scheme://org/name:version, while load_package takes scheme://org/name@version. Pin versions in benchmarks and training jobs — a moving latest makes runs hard to reproduce.
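Because the two reference formats differ only in the version separator, a small helper can keep one pinned reference and render it both ways. The sketch below is illustrative: PackageRef, parse_ref, pull_ref, load_ref, and require_pinned are hypothetical names, not part of the SDK. The regular expression encodes only the two shapes quoted above, and the example reference dataset://acme/alerts:1.2.0 is made up.

```python
import re
from typing import NamedTuple


class PackageRef(NamedTuple):
    scheme: str
    org: str
    name: str
    version: str


# Accepts scheme://org/name:version and scheme://org/name@version.
_REF = re.compile(
    r"^(?P<scheme>[a-z]+)://(?P<org>[^/]+)/(?P<name>[^/:@]+)[:@](?P<version>.+)$"
)


def parse_ref(ref: str) -> PackageRef:
    """Split a reference into its parts; a missing version is an error."""
    match = _REF.match(ref)
    if match is None:
        raise ValueError(f"not a versioned package reference: {ref!r}")
    return PackageRef(**match.groupdict())


def require_pinned(ref: PackageRef) -> PackageRef:
    """Reject the moving 'latest' version so runs stay reproducible."""
    if ref.version == "latest":
        raise ValueError(f"pin an explicit version for {ref.org}/{ref.name}")
    return ref


def pull_ref(ref: PackageRef) -> str:
    """Render the colon form that pull_package is described as taking."""
    return f"{ref.scheme}://{ref.org}/{ref.name}:{ref.version}"


def load_ref(ref: PackageRef) -> str:
    """Render the @ form that load_package is described as taking."""
    return f"{ref.scheme}://{ref.org}/{ref.name}@{ref.version}"


pinned = require_pinned(parse_ref("dataset://acme/alerts:1.2.0"))
```

With the SDK installed, the intended use would be dn.pull_package([pull_ref(pinned)]) followed by dn.load_package(load_ref(pinned)).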

For the full narrative on each artifact type — manifest shape, publishing lifecycle, catalog browsing, and loading patterns — see Datasets, Models, Capabilities, and Tasks (“environments” in the SDK).

The SDK and CLI are complementary. Reach for the SDK when your workflow belongs in code — agent definitions, evaluations, custom scorers, training loops, CI jobs. Reach for the CLI for login, profile switching, registry operations, and quick platform inspection from a shell.

A typical loop is “build and test in Python, publish with the CLI, pin the published version in the next SDK run.”

Runnable scripts and notebooks ship in the SDK repo:

  • Scripts: packages/sdk/examples/scripts/ — run from packages/sdk with uv run python examples/scripts/<name>.py
  • Notebooks: packages/sdk/examples/notebooks/

Good entry points: agent_with_tools.py, evaluation_with_scorers.py, optimization_study.py, and airt_pair.py.

  • Top-level re-exports duplicate domain pages. dn.Task, dn.Scorer, dn.Agent, etc. render on dreadnode and on the domain page. They’re the same class, just reached through different paths.
  • Capabilities are loaded, not load_package’d. Use dn.load_capability("./path") for local directories and dn pull + dn install from the CLI for published bundles.
  • “Environments” in the SDK are “tasks” everywhere else. dn.push_environment(...) publishes what the app and CLI call a task; the registry URI is environment://org/name:version.
  • ApiClient is the escape hatch. When an endpoint doesn’t have a first-class SDK wrapper — billing, device-code login, hosted job submission, raw world control — drop to from dreadnode.app.api.client import ApiClient.
  • Tracing needs a run or span. Calling dn.log_metric(...) outside of dn.run(...), a @dn.task, or dn.span(...) warns and no-ops.