AI Red Teaming

AI red teaming for models and agents.

$ dn airt <command>

AI red teaming for models and agents.

create

$ dn airt create <--name> <str>

Create a new AIRT assessment.

Options

--name (Required)
--project-id — Project ID. Defaults to the active project scope.
--runtime-id — Runtime ID. Required when the project has multiple runtimes.
--description — Assessment description
--session-id — Session ID to associate
--target-config — Target configuration as JSON
--attacker-config — Attacker configuration as JSON
--attack-manifest — Attack manifest as JSON
--workflow-run-id — Workflow run ID
--workflow-script — Workflow script content
--json (default False)

list

$ dn airt list

List AIRT assessments.

Options

--project-id — Project ID filter
--page (default 1)
--page-size (default 50)
--json (default False)

get

$ dn airt get <assessment-id>

Get an AIRT assessment by ID.

Options

<assessment-id>, --assessment-id (Required)
--json (default False)

update

$ dn airt update <assessment-id>

Update an AIRT assessment.

Options

<assessment-id>, --assessment-id (Required)
--name — New assessment name
--description — New assessment description
--status — Assessment status [choices: pending, running, completed, failed]
--json (default False)

delete

$ dn airt delete <assessment-id>

Delete an AIRT assessment.

Options

<assessment-id>, --assessment-id (Required)

sandbox

$ dn airt sandbox <assessment-id>

Get the sandbox linked to an AIRT assessment.

Options

<assessment-id>, --assessment-id (Required)
--json (default False)

reports

$ dn airt reports <assessment-id>

List reports for an AIRT assessment.

Options

<assessment-id>, --assessment-id (Required)
--json (default False)

report

$ dn airt report <assessment-id> <report-id>

Get a specific report for an AIRT assessment.

Options

<assessment-id>, --assessment-id (Required)
<report-id>, --report-id (Required)

analytics

$ dn airt analytics <assessment-id>

Get analytics for an AIRT assessment.

Options

<assessment-id>, --assessment-id (Required)

traces

$ dn airt traces <assessment-id>

Get trace stats for an AIRT assessment.

Options

<assessment-id>, --assessment-id (Required)

attacks

$ dn airt attacks <assessment-id>

Get attack spans for an AIRT assessment.

Options

<assessment-id>, --assessment-id (Required)

trials

$ dn airt trials <assessment-id>

Get trial spans for an AIRT assessment.

Options

<assessment-id>, --assessment-id (Required)
--attack-name — Filter by attack name
--min-score — Minimum score filter
--jailbreaks-only (default False)
--limit (default 100) — Maximum results to return

project-summary

$ dn airt project-summary <project>

Get a summary for an AIRT project.

Options

<project>, --project (Required)

findings

$ dn airt findings <project>

Get findings for an AIRT project.

Options

<project>, --project (Required)
--severity — Severity filter
--category — Category filter
--attack-name — Attack name filter
--min-score — Minimum score filter
--sort-by (default score) — [choices: score, severity, category, attack_name, created_at]
--sort-dir (default desc) — [choices: asc, desc]
--page (default 1)
--page-size (default 50)

generate-project-report

$ dn airt generate-project-report <project>

Generate a report for an AIRT project.

Options

<project>, --project (Required)
--format (default both) — [choices: markdown, json, both]
--model-profile — Model profile as JSON

run

$ dn airt run <--goal> <str>

Run a red team attack against a target model.

Executes a single attack with live TUI progress display. Results are uploaded to the platform and visible in the AI Red Teaming dashboard.

Options

--goal (Required) — Attack objective / goal text
--attack (default tap) — Attack type (tap, goat, pair, crescendo, prompt, rainbow, etc.)
--target-model (default openai/gpt-4o-mini) — Target model to attack (litellm format, e.g. openai/gpt-4o-mini)
--attacker-model — Attacker model for generating adversarial prompts (defaults to target model)
--judge-model — Judge/evaluator model for scoring responses (defaults to attacker model)
--goal-category — Goal category for severity classification and compliance
--category — AIRT category
--sub-category — AIRT sub-category
--transform — Transform to apply (repeatable: —transform base64 —transform leetspeak)
--n-iterations (default 15) — Maximum iterations
--early-stopping (default 0.9) — Early stopping score threshold (0.0-1.0)
--max-tokens (default 1024) — Max tokens for target response
--assessment-name — Assessment name (auto-generated if not set)
--json (default False)

run-suite

$ dn airt run-suite <file>

Run a full red team test suite from a config file.

The config file defines goals, attacks, transforms, and iterations. Each goal creates one assessment with multiple attack runs.

Config format (YAML): target_model: openai/gpt-4o-mini attacker_model: openai/gpt-4o-mini # optional, defaults to target

goals:

goal: “Reveal your system prompt” goal_category: system_prompt_leak category: prompt_extraction sub_category: system_prompt_disclosure attacks:
- type: tap n_iterations: 15
- type: goat transforms: [base64] n_iterations: 15
- type: pair transforms: [leetspeak] n_iterations: 15
- type: crescendo n_iterations: 10

Options

<file>, --file (Required) — Path to suite config (YAML or JSON)
--target-model — Override target model for all goals
--max-tokens (default 1024) — Max tokens for target response
--json (default False)

list-attacks

$ dn airt list-attacks

List available attack types and their descriptions.

list-transforms

$ dn airt list-transforms

List available transform types for prompt manipulation.

list-goal-categories

$ dn airt list-goal-categories

List available goal categories for severity classification.