Task Sets

Curate a named, org-scoped list of task references and run the whole suite as one evaluation — with per-caller resolution and reproducible snapshots.

A task set is a named list of task references you run as one evaluation. Instead of passing twenty --task flags, you curate the suite once, then point an evaluation at it:

# Author once
dn task-set init apex-web          # scaffold task-set.yaml
dn task-set push ./task-set.yaml   # upload

# Run the whole suite
dn evaluation create apex-baseline --task-set my-org/apex-web --model openai/gpt-4.1-mini --wait

A set is a bookmark, not a bundle. It points at existing tasks by reference — it never copies their content, and editing a set never touches the tasks it names. That keeps a set cheap to curate and share, and it means the same set resolves to different tasks for different callers, depending on what each can see.

Authoring a set

In the app, open Environments, choose Task Sets, then create a set or select one from the left rail. Owners can use Edit to change description, tags, source, and members; the same detail modal handles publish, unpublish, and delete actions.

You can also create or replace a set from a task-set.yaml manifest. Scaffold one with dn task-set init, edit the members, and push:

name: apex-web
description: Web exploitation challenges from the apex suite
tags: [web, ctf, apex]
source: apex

members:
  - my-org/apex-sqli-basic # bare — latest version you can see
  - my-org/apex-xss-stored@1.2.0 # pinned to an exact version
  - dreadnode/portswigger-cors-01 # cross-org, must be public
  - task_name: my-org/apex-csrf # object form carries notes
    notes: 'Added for the Q2 regression suite'

Every member is a fully-qualified <org>/<task>[@version] reference — the owning org is always explicit. A bare reference (my-org/apex-sqli-basic) tracks the latest version you can see; a pinned reference (@1.2.0) locks an exact version. Both forms can share one list.
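
The reference shape above can be sketched as a small parser. This is an illustration rather than the platform's validator: the character set in the pattern, the TaskRef type, and the parse_member helper are all assumptions. Only the <org>/<task>[@version] shape and the task_name and notes keys come from the manifest shown earlier.

```python
import re
from typing import NamedTuple, Optional, Union

# Assumed character set; the docs only specify the <org>/<task>[@version] shape.
REF_PATTERN = re.compile(
    r"^(?P<org>[A-Za-z0-9_-]+)/(?P<task>[A-Za-z0-9_-]+)"
    r"(?:@(?P<version>[A-Za-z0-9.+-]+))?$"
)


class TaskRef(NamedTuple):
    """Hypothetical parsed form of a member reference."""
    org: str
    task: str
    version: Optional[str]  # None for a bare reference

    @property
    def pinned(self) -> bool:
        return self.version is not None


def parse_member(entry: Union[str, dict]) -> TaskRef:
    """Accept the string form or the object form (task_name plus optional notes)."""
    ref = entry["task_name"] if isinstance(entry, dict) else entry
    match = REF_PATTERN.match(ref)
    if match is None:
        raise ValueError(f"expected <org>/<task>[@version], got {ref!r}")
    return TaskRef(match["org"], match["task"], match["version"])


members = [
    "my-org/apex-sqli-basic",                                   # bare
    "my-org/apex-xss-stored@1.2.0",                             # pinned
    {"task_name": "my-org/apex-csrf", "notes": "Added for the Q2 regression suite"},
]
parsed = [parse_member(m) for m in members]
```

A reference without an org prefix, such as apex-csrf on its own, raises ValueError in this sketch, mirroring the rule that the owning org is always explicit.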

Member existence is not checked at push time. A set can name a task that doesn’t exist yet, or a private task in another org — the reference is evaluated when someone reads or runs the set, not when you save it. For the full field list, caps, and validator rules, see dn task-set.

Resolution: what actually runs

Resolution turns each reference into a concrete task version under the visibility of the caller — it runs every time someone reads the set and again when an evaluation expands it. A version is visible to you if your org owns it or it’s marked public. Each member resolves to one of three states:

Status	Meaning
Resolved	The reference points at a task version you can run and score.
No verifier	The version resolves but has no verification rule — it runs unscored.
Not found	Absent, or private to another org. Expected for a bookmark — not an error.

Because resolution is per-caller, the same public set shows a fully resolved member list to its author and a partly not-found list to an outside viewer. The platform never silently drops members; it discloses each one’s status.
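
The resolution rules above can be modeled in a few lines. This is a minimal sketch under stated assumptions, not the platform's resolver: the TaskVersion record, the in-memory catalog, the dotted-numeric version ordering, and the resolved and no_verifier status spellings are hypothetical. Only not_found appears in the CLI output on this page.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class TaskVersion:
    """Hypothetical record for one version of a task."""
    org: str
    name: str
    version: str        # assumed dotted-numeric, e.g. "1.2.0"
    public: bool
    has_verifier: bool


def version_key(version: str) -> Tuple[int, ...]:
    # Assumption: versions compare numerically, part by part.
    return tuple(int(part) for part in version.split("."))


def resolve(
    ref: str, caller_org: str, catalog: List[TaskVersion]
) -> Tuple[str, Optional[TaskVersion]]:
    """Resolve one reference under the caller's visibility."""
    path, _, pinned = ref.partition("@")
    org, _, name = path.partition("/")
    candidates = [
        t for t in catalog
        if (t.org, t.name) == (org, name)
        and (t.org == caller_org or t.public)      # owned by caller, or public
        and (not pinned or t.version == pinned)    # pinned refs match exactly
    ]
    if not candidates:
        # Absent and private-to-another-org are indistinguishable to the caller.
        return "not_found", None
    chosen = max(candidates, key=lambda t: version_key(t.version))
    return ("resolved" if chosen.has_verifier else "no_verifier"), chosen


catalog = [
    TaskVersion("my-org", "apex-sqli-basic", "1.0.0", public=False, has_verifier=True),
    TaskVersion("my-org", "apex-sqli-basic", "1.1.0", public=False, has_verifier=True),
    TaskVersion("my-org", "apex-csrf", "0.3.0", public=True, has_verifier=False),
]
```

Calling resolve with the same reference and a different caller_org is enough to turn a resolved member into a not-found one, which is the per-caller behavior described above.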

Running an evaluation against a set

Pass a set in place of --task. The platform resolves it under your visibility and emits one evaluation row per resolvable member:

dn evaluation create apex-baseline \
  --task-set my-org/apex-web \
  --model openai/gpt-4.1-mini \
  --wait

# Skipped 1 member not resolvable under your visibility:
#   dreadnode/portswigger-cors-01  (not_found)

--task-set is mutually exclusive with --task and with a file-backed dataset. Members you can’t resolve are skipped, not failed — they’re reported in the evaluation’s skipped_members list (with a reason) and surfaced on every read, so a typo or a privatized task never silently shrinks your run.

The resolved rows are snapshotted onto the evaluation at creation. Bare references are pinned to the version that resolved at that moment. Editing, reordering, or deleting the set afterward does not change a finished evaluation: its snapshot is the record of what ran. Reproducing a run replays the evaluation’s own snapshot, not the live set.
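
The skip-and-snapshot behavior can be sketched as follows. This is an illustration under assumptions, not the platform's implementation: snapshot_members, the resolve_version callback, and the member and reason keys are hypothetical. Only the skipped_members name and the not_found reason come from this page.

```python
from typing import Callable, Dict, List, Optional


def snapshot_members(
    members: List[str],
    resolve_version: Callable[[str], Optional[str]],
) -> Dict[str, list]:
    """Expand a set into pinned rows plus a list of skipped members."""
    rows: List[str] = []
    skipped: List[Dict[str, str]] = []
    for ref in members:
        version = resolve_version(ref)  # None when the caller cannot resolve it
        if version is None:
            skipped.append({"member": ref, "reason": "not_found"})
            continue
        base = ref.split("@", 1)[0]
        rows.append(f"{base}@{version}")  # bare references become pinned
    return {"rows": rows, "skipped_members": skipped}


# Hypothetical lookup of what resolves for the caller at creation time.
visible = {
    "my-org/apex-sqli-basic": "1.1.0",
    "my-org/apex-xss-stored@1.2.0": "1.2.0",
}
live_set = [
    "my-org/apex-sqli-basic",
    "my-org/apex-xss-stored@1.2.0",
    "dreadnode/portswigger-cors-01",
]
snapshot = snapshot_members(live_set, visible.get)

live_set.clear()  # editing the live set later leaves the snapshot untouched
```

Because the snapshot stores pinned strings rather than a pointer to the set, clearing live_set afterward has no effect on the recorded rows.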

Publishing a set

A set carries a single visibility flag. Publishing makes the listing visible to other orgs — it does not change who can run the member tasks, and it never redacts member names:

dn task-set publish my-org/apex-web
dn task-set unpublish my-org/apex-web

Publishing is ungated: it always succeeds, even if some members aren’t publicly runnable. When a member resolves for you but wouldn’t for an outside caller — a private task, or one with no verifier — publishing returns a warnings list naming those members so you know what external viewers can’t run. The warning is informational and never blocks the write.
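
One way to picture the ungated write is the sketch below. It assumes a hypothetical in-memory TaskSet and MemberInfo model and a publish function that are not part of the dn CLI or API; it only mirrors the rule that the flag always flips and the warnings are advisory.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class MemberInfo:
    """Hypothetical view of a member that resolves for the owner."""
    public: bool
    has_verifier: bool


@dataclass
class TaskSet:
    name: str
    members: Dict[str, MemberInfo]
    public: bool = False


def publish(task_set: TaskSet) -> List[str]:
    """Always publish; return the members an outside caller could not run."""
    warnings = [
        ref for ref, info in task_set.members.items()
        if not (info.public and info.has_verifier)
    ]
    task_set.public = True  # the write happens regardless of warnings
    return warnings


apex_web = TaskSet(
    name="my-org/apex-web",
    members={
        "my-org/apex-sqli-basic": MemberInfo(public=False, has_verifier=True),
        "dreadnode/portswigger-cors-01": MemberInfo(public=True, has_verifier=True),
        "my-org/apex-csrf": MemberInfo(public=True, has_verifier=False),
    },
)
warnings = publish(apex_web)
```

In this sketch the set ends up public even though two of its three members are flagged.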

Visibility is owned solely by publish and unpublish. Editing a set’s fields or members with dn task-set push replaces the record but never flips its publish state.

What’s next

dn task-set — every subcommand and flag, including list, info, add, remove, pull, and clone.
Tasks — author the tasks a set references.
Running evaluations — manifests, secrets, CI blocking, and comparison.