Task Sets
Curate a named, org-scoped list of task references and run the whole suite as one evaluation — with per-caller resolution and reproducible snapshots.
A task set is a named list of task references you run as one evaluation. Instead of passing
twenty --task flags, you curate the suite once, then point an evaluation at it:
```bash
# Author once
dn task-set init apex-web           # scaffold task-set.yaml
dn task-set push ./task-set.yaml    # upload

# Run the whole suite
dn evaluation create apex-baseline --task-set my-org/apex-web --model openai/gpt-4.1-mini --wait
```
A set is a bookmark, not a bundle. It points at existing tasks by reference: it never copies their content, and editing a set never touches the tasks it names. That keeps a set cheap to curate and share, and it means the same set resolves to different tasks for different callers, depending on what each can see.
Authoring a set
In the app, open Environments, choose Task Sets, then create a set or select one from the left rail. Owners can use Edit to change description, tags, source, and members; the same detail modal handles publish, unpublish, and delete actions.
You can also create or replace a set from a task-set.yaml manifest. Scaffold one with
dn task-set init, edit the members, and push:
```yaml
name: apex-web
description: Web exploitation challenges from the apex suite
tags: [web, ctf, apex]
source: apex

members:
  - my-org/apex-sqli-basic # bare: latest version you can see
  - my-org/apex-xss-stored@1.2.0 # pinned to an exact version
  - dreadnode/portswigger-cors-01 # cross-org, must be public
  - task_name: my-org/apex-csrf # object form carries notes
    notes: 'Added for the Q2 regression suite'
```
Every member is a fully-qualified <org>/<task>[@version]
reference — the owning org is always explicit. A bare reference (my-org/apex-sqli-basic) tracks
the latest version you can see; a pinned reference (@1.2.0) locks an exact version. Both forms
can share one list.
Member existence is not checked at push time. A set can name a task that doesn’t exist yet, or
a private task in another org — the reference is evaluated when someone reads or runs the set, not
when you save it. For the full field list, caps, and validator rules, see
dn task-set.
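To make the reference grammar concrete, here is a minimal Python sketch that splits a `<org>/<task>[@version]` string into its parts and tells bare from pinned. The `TaskRef` type, the `parse_ref` helper, and the regular expression are hypothetical illustrations, not part of the `dn` CLI or any SDK; the platform's actual validator rules may be stricter.

```python
import re
from typing import NamedTuple, Optional

# Hypothetical pattern for "<org>/<task>[@version]". The real validator's
# character rules are documented under dn task-set and may differ.
_REF = re.compile(
    r"^(?P<org>[^/@\s]+)/(?P<task>[^/@\s]+)(?:@(?P<version>[^/@\s]+))?$"
)


class TaskRef(NamedTuple):
    org: str
    task: str
    version: Optional[str]  # None means a bare reference

    @property
    def pinned(self) -> bool:
        return self.version is not None


def parse_ref(text: str) -> TaskRef:
    """Split a fully-qualified member reference into org, task, and version."""
    match = _REF.match(text.strip())
    if match is None:
        raise ValueError(f"not a fully-qualified reference: {text!r}")
    return TaskRef(match["org"], match["task"], match["version"])


bare = parse_ref("my-org/apex-sqli-basic")           # version is None
pinned = parse_ref("my-org/apex-xss-stored@1.2.0")   # version is "1.2.0"
```

A reference without an owning org, such as `apex-sqli-basic`, raises `ValueError` in this sketch, mirroring the rule that the org is always explicit.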
Resolution: what actually runs
Resolution turns each reference into a concrete task version under the visibility of the caller. It runs every time someone reads the set and again when an evaluation expands it. A version is visible to you if your org owns it or it’s marked public. Each member resolves to one of three states:
| Status | Meaning |
|---|---|
| Resolved | The reference points at a task version you can run and score. |
| No verifier | The version resolves but has no verification rule — it runs unscored. |
| Not found | Absent, or private to another org. Expected for a bookmark — not an error. |
Because resolution is per-caller, the same public set shows a full member list to its author and a
partly not-found list to an outside viewer. The platform never silently drops members; it
discloses each one’s status.
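The per-caller behavior can be modeled in a few lines of Python. This is a sketch under stated assumptions, not the platform's implementation: the `TaskVersion` record, the `resolve` function, and the in-memory `catalog` are hypothetical, versions are assumed to be dotted integers, and only the `not_found` identifier appears in the CLI output shown in this guide (`resolved` and `no_verifier` are assumed spellings of the other two states).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskVersion:
    # Hypothetical record for one task version in an in-memory catalog.
    org: str
    task: str
    version: str
    public: bool
    has_verifier: bool


def _version_key(version: str) -> tuple:
    # Assumes dotted numeric versions such as "1.2.0".
    return tuple(int(part) for part in version.split("."))


def resolve(ref: str, caller_org: str, catalog: list) -> tuple:
    """Return (status, version) for one member as seen by caller_org."""
    name, _, wanted = ref.partition("@")
    org, _, task = name.partition("/")
    visible = [
        tv for tv in catalog
        if (tv.org, tv.task) == (org, task)
        and (tv.org == caller_org or tv.public)   # owned by the caller, or public
        and (not wanted or tv.version == wanted)  # pinned refs must match exactly
    ]
    if not visible:
        # Absent and private-to-another-org are indistinguishable to the caller.
        return ("not_found", None)
    chosen = max(visible, key=lambda tv: _version_key(tv.version))
    status = "resolved" if chosen.has_verifier else "no_verifier"
    return (status, chosen.version)


catalog = [
    TaskVersion("my-org", "apex-sqli-basic", "1.0.0", public=False, has_verifier=True),
    TaskVersion("my-org", "apex-sqli-basic", "1.1.0", public=False, has_verifier=True),
    TaskVersion("my-org", "apex-csrf", "0.3.0", public=True, has_verifier=False),
]
members = ["my-org/apex-sqli-basic", "my-org/apex-csrf", "my-org/apex-typo"]

as_author = [resolve(m, "my-org", catalog) for m in members]
as_outsider = [resolve(m, "other-org", catalog) for m in members]
```

In this model the author sees the private task resolve to its latest version, while the outside caller gets `not_found` for the same member. Both lists keep all three entries, so nothing is dropped silently.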
Running an evaluation against a set
Pass a set in place of --task. The platform resolves it under your visibility and emits one
evaluation row per resolvable member:
```bash
dn evaluation create apex-baseline \
  --task-set my-org/apex-web \
  --model openai/gpt-4.1-mini \
  --wait

# Skipped 1 member not resolvable under your visibility:
#   dreadnode/portswigger-cors-01 (not_found)
```
--task-set is mutually exclusive with --task and with a file-backed dataset. Members you
can’t resolve are skipped, not failed — they’re reported in the evaluation’s skipped_members
list (with a reason) and surfaced on every read, so a typo or a privatized task never silently
shrinks your run.
The resolved rows are snapshotted onto the evaluation at creation. Bare references are pinned to the version that resolved at that moment. Editing, reordering, or deleting the set afterward does not change a finished evaluation — its snapshot is the record of what ran. To reproduce a run, the evaluation replays its own snapshot, not the live set.
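The snapshot semantics described above can be illustrated with a small Python model. Everything here is hypothetical: the `create_snapshot` and `resolve` helpers, the `visible` mapping of task names to versions (ordered oldest to newest), and the shape of each `skipped_members` entry are assumptions made for the sketch, not the platform's data model.

```python
from typing import Optional


def resolve(ref: str, visible: dict) -> Optional[str]:
    """Return the concrete version a reference resolves to, or None."""
    name, _, pinned = ref.partition("@")
    versions = visible.get(name, [])  # assumed ordered oldest to newest
    if pinned:
        return pinned if pinned in versions else None
    return versions[-1] if versions else None


def create_snapshot(members: list, visible: dict) -> dict:
    """Pin every resolvable member and record the rest as skipped."""
    rows, skipped = [], []
    for ref in members:
        name = ref.partition("@")[0]
        version = resolve(ref, visible)
        if version is None:
            # Entry shape is an assumption; the guide only says a reason is given.
            skipped.append({"member": ref, "reason": "not_found"})
        else:
            rows.append(f"{name}@{version}")  # bare refs become pinned here
    # Tuples make the snapshot independent of later edits to the set.
    return {"rows": tuple(rows), "skipped_members": tuple(skipped)}


visible = {
    "my-org/apex-sqli-basic": ["1.0.0", "1.1.0"],
    "my-org/apex-xss-stored": ["1.2.0"],
}
members = [
    "my-org/apex-sqli-basic",
    "my-org/apex-xss-stored@1.2.0",
    "dreadnode/portswigger-cors-01",
]
snap = create_snapshot(members, visible)

# Later changes to the live set or to the tasks do not alter the snapshot.
visible["my-org/apex-sqli-basic"].append("2.0.0")
members.clear()
```

After the later edits, `snap["rows"]` still names `my-org/apex-sqli-basic@1.1.0`, which is the property that makes a finished evaluation reproducible from its own record.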
Publishing a set
A set carries a single visibility flag. Publishing makes the listing visible to other orgs; it does not change who can run the member tasks, and it never redacts member names:
```bash
dn task-set publish my-org/apex-web
dn task-set unpublish my-org/apex-web
```
Publishing is ungated: it always succeeds, even if some members aren’t publicly runnable. When a
member resolves for you but wouldn’t for an outside caller — a private task, or one with no
verifier — publishing returns a warnings list naming those members so you know what external
viewers can’t run. The warning is informational and never blocks the write.
Visibility is owned solely by publish and unpublish. Editing a set’s fields or members with
dn task-set push replaces the record but never flips its publish state.
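As a rough model of these two rules, the Python sketch below treats publishing as an unconditional flag flip that returns warnings, and pushing as a record replacement that carries the flag over. The `TaskSet` record, the `publish` and `push` functions, and the status dictionaries are hypothetical and exist only to illustrate the behavior described here.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class TaskSet:
    # Hypothetical in-memory stand-in for a stored set.
    name: str
    members: tuple
    public: bool = False


def publish(task_set: TaskSet, owner_view: dict, outsider_view: dict):
    """Always publish; warn about members an outside caller can't run and score."""
    warnings = [
        m for m in task_set.members
        if owner_view.get(m, "not_found") != "not_found"      # resolves for the owner
        and outsider_view.get(m, "not_found") != "resolved"   # private or no verifier
    ]
    return replace(task_set, public=True), warnings  # warnings never block the write


def push(existing: TaskSet, new_members: list) -> TaskSet:
    """Replace the members; the publish flag is carried over untouched."""
    return replace(existing, members=tuple(new_members))


draft = TaskSet("my-org/apex-web", ("my-org/apex-sqli-basic", "my-org/apex-csrf"))
owner_view = {"my-org/apex-sqli-basic": "resolved", "my-org/apex-csrf": "no_verifier"}
outsider_view = {"my-org/apex-sqli-basic": "not_found", "my-org/apex-csrf": "no_verifier"}

published, warnings = publish(draft, owner_view, outsider_view)
repushed = push(published, ["my-org/apex-sqli-basic"])
```

Here both members end up in `warnings` (one is private, one has no verifier), yet `published.public` is still `True`, and `repushed` keeps that flag even though its member list changed.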
What’s next
- dn task-set: every subcommand and flag, including list, info, add, remove, pull, and clone.
- Tasks: author the tasks a set references.
- Running evaluations: manifests, secrets, CI blocking, and comparison.