CLI
The datagen command — its shape, its defaults, and how to learn it from the terminal.
The datagen CLI is the primary surface. Every workflow a human or a coding agent can drive — creating a dataset, reviewing a preview, registering a Resource, managing auth — is a subcommand. The web UI exists for humans who prefer to click through a preview; everything else lives here.
Command tree
datagen
├── datasets # create, list, get, preview, approve, feedback, watch, download
├── resources # list, get, create, delete, templates, materialize
├── auth # login, logout, status, org
├── config # show
└── health # check API connectivity
Five groups. Four of them are noun-first (datagen <noun> <verb>). The fifth, health, is a top-level sanity check.
Progressive --help
Every subcommand is self-documenting. --help at any depth prints the flags, their types, their defaults, and a one-line description of what the command does. Point a coding agent at the CLI and it can resolve the next command from the current one without ever reading these docs:
datagen --help
datagen datasets --help
datagen datasets create --help
This is the recommended way for Claude Code, Cursor, or Windsurf to drive the CLI. See Claude Code workflow for the full pattern.
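The progressive pattern above can be sketched as a small shell helper that prints help at every depth of a command path. The function name `datagen_help_walk` is hypothetical and not part of the CLI; it relies only on `--help` being accepted at each level, as described in this section.

```shell
# Hypothetical helper (not part of datagen): print --help for each
# prefix of a command path, from the root down to the full command.
datagen_help_walk() {
  # Root level: datagen --help
  datagen --help || return
  prefix=""
  for word in "$@"; do
    prefix="$prefix $word"
    # Left unquoted on purpose so the prefix splits back into words;
    # subcommand names contain no spaces.
    datagen $prefix --help || return
  done
}
```

Calling `datagen_help_walk datasets create` runs the same three commands listed above, in order.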
Output formats
Every subcommand accepts --format with two values:
| Value | When to use |
|---|---|
| text (default) | Human-readable. Tables for list commands, rendered panels for previews, plain lines otherwise. |
| json | Machine-readable. Stable shape, suitable for scripting or piping into jq. |
Data goes to stdout; diagnostics and prompts go to stderr. You can safely pipe --format json output without interleaved noise.
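Because data and diagnostics travel on separate streams, a script can capture them independently. The following is a minimal sketch: the wrapper name `datagen_capture` and the file names are hypothetical, and the only CLI behaviour it assumes is the `--format json` option and the stdout/stderr split described above.

```shell
# Hypothetical wrapper (not part of datagen): run any subcommand with
# --format json, saving stdout (data) and stderr (diagnostics) separately.
datagen_capture() {
  out_file=$1   # where the JSON payload goes
  log_file=$2   # where diagnostics and prompts go
  shift 2
  datagen "$@" --format json >"$out_file" 2>"$log_file"
}
```

For example, `datagen_capture datasets.json datagen.log datasets list` leaves the JSON in `datasets.json`, which can then be pretty-printed with `jq . datasets.json`. The payload's field names are not documented on this page, so no more specific jq filter is shown.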
Global options
| Flag | Meaning |
|---|---|
| -v, --verbose | Increase log verbosity. Repeatable (-vv). |
| --help | Show help for the current command and exit. |
| --install-completion | Install shell completion. |
| --show-completion | Print the completion script for your shell. |
Connection settings (API URL, token, timeout) come from environment variables and a credentials file that datagen auth login manages — see Config.
The health command
A top-level ping against the API. Useful as a first check after install or auth.
$ datagen health
Connected to https://api.datagen.example (12 models available)
Pass --format json for a structured payload.
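A script can use the health check as a pre-flight gate before doing real work. The sketch below is an illustration only: the function name `require_datagen_api` is hypothetical, and it assumes that `datagen health` exits non-zero when the API is unreachable. This page does not document exit codes, so confirm that behaviour before relying on it.

```shell
# Hypothetical pre-flight check (not part of datagen).
# ASSUMPTION: datagen health exits non-zero when the API cannot be reached;
# exit codes are not documented on this page.
require_datagen_api() {
  if datagen health >/dev/null 2>&1; then
    return 0
  fi
  # Diagnostics go to stderr, matching the CLI's own convention.
  echo "datagen: API not reachable; try 'datagen auth status'" >&2
  return 1
}
```

A typical use is `require_datagen_api || exit 1` at the top of a script.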
Where to go next
Dataset formats
A delivered dataset is a small bundle of artifact files — JSONL and Parquet for the items, plus a quality report in Markdown and JSON. An optional HuggingFace mirror is available when you wire up a token.
datasets
The full lifecycle of a dataset from the command line — create, watch, preview, approve or iterate, download.