CLI Reference
HEDGEHOG provides a Typer-based command-line interface. The primary command is hedgehog, with hedge available as a short alias. Both are interchangeable.
uv run hedgehog --help
uv run hedge --help
Commands
| Command | Description |
|---|---|
| run | Run the pipeline explicitly as a subcommand |
| report | Regenerate the HTML report from an existing run |
| info | Display pipeline information and available stages |
| version | Display version information |
| setup | Install optional external tools and assets |
| tui | Launch the interactive TUI (Terminal User Interface) |
hedgehog (default pipeline command)
The root CLI command executes the evaluation pipeline by default. It runs all enabled stages using the configuration at src/hedgehog/configs/config.yml.
uv run hedgehog [OPTIONS]
The same pipeline command is also available as an explicit subcommand:
uv run hedgehog run [OPTIONS]
Options
| Option | Short | Type | Default | Description |
|---|---|---|---|---|
| --config | -c | TEXT | src/hedgehog/configs/config.yml | Master YAML config path. |
| --mols | -m | TEXT | None | Input molecule table path or glob. Overrides config input. Prefer CSV/TSV with a smiles header. |
| --out | -o | TEXT | None | Output directory override (folder_to_save). |
| --stage | -s | STAGE | None | Run one or more stages only. Repeat --stage to select multiple stages. Uses config molecules unless --mols is provided. See hedgehog info for stage descriptions. |
| --reuse | | FLAG | False | Reuse existing results folder. |
| --force-new | | FLAG | False | Always create a new results folder. |
| --auto-install | | FLAG | False | Auto-install missing optional tools without prompts. |
| --progress | | FLAG | False | Show live progress bar. |
| --large-dataset | | FLAG | False | Stream large molecule libraries in chunks, write row-level .parts shards, and skip plots/report-heavy outputs. Filters are calculated by default but do not remove molecules unless large_dataset_filter_data: true. |
The --reuse and --force-new flags are mutually exclusive. Using both will produce an error.
The --out option cannot be used together with --reuse or --force-new.
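In a wrapper script, the two flag rules above can be checked before the CLI is invoked. The following is a minimal sketch, not part of HEDGEHOG: the function name check_folder_flags and its 0/1 argument convention are hypothetical and only mirror the documented rules.

```shell
# Hypothetical pre-flight check mirroring the documented flag rules.
# Usage: check_folder_flags <reuse:0|1> <force_new:0|1> <out_dir or empty string>
check_folder_flags() {
    # Rule 1: --reuse and --force-new are mutually exclusive.
    if [ "$1" = "1" ] && [ "$2" = "1" ]; then
        echo "error: --reuse and --force-new are mutually exclusive" >&2
        return 1
    fi
    # Rule 2: --out cannot be combined with --reuse or --force-new.
    if [ -n "$3" ] && { [ "$1" = "1" ] || [ "$2" = "1" ]; }; then
        echo "error: --out cannot be combined with --reuse or --force-new" >&2
        return 1
    fi
    return 0
}
```

For example, check_folder_flags 1 0 "" succeeds, while check_folder_flags 1 1 "" and check_folder_flags 0 1 results/my_run both return a non-zero status.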
The progress bar is disabled by default in CLI runs. Use --progress when you want live stage progress rendering.
Use --large-dataset for PubChem/Enamine-scale statistics runs. It stores intermediate row-level tables as compressed shard directories such as filtered_molecules.parts/ and descriptors_all.parts/, uses large_dataset_chunk_rows from the master config, and skips plots plus HTML report generation. Without an explicit --stage selection, large-dataset mode runs the scalable statistics path: mol_prep, descriptors, struct_filters, and synthesis. Descriptor, structural, and synthesis filters are calculated by default, but downstream large-mode outputs keep all molecules unless large_dataset_filter_data: true. The synthesis stage skips AiZynthFinder retrosynthesis in large-dataset mode.
When you request any stage other than mol_prep, the pipeline may still run mol_prep first if the Mol Prep stage is enabled in config_mol_prep.yml. This is intentional: downstream stages operate on standardized molecules when that preprocessing stage is active.
Available Stages
Each --stage occurrence accepts one of the following values:
| Stage | Description |
|---|---|
| mol_prep | Standardize and filter molecules (salts/fragments, metals, charges, tautomers, stereo) |
| descriptors | Compute 22 physicochemical descriptors per molecule |
| struct_filters | Apply structural filters (Lilly, NIBR, PAINS, etc.) |
| synthesis | Evaluate synthetic accessibility using retrosynthesis (AiZynthFinder) and other metrics |
| docking | Calculate docking scores with SMINA/GNINA/Matcha |
| docking_filters | Filter docking poses by quality and interactions |
| final_descriptors | Recompute descriptors on the final filtered set |
When running --stage struct_filters, the input is resolved in the following order of preference:
- Existing descriptors output (if present in the run folder)
- MolPrep output
- Sampled molecules input
Input File Contract
Recommended input is CSV/TSV with a smiles header:
smiles,model_name
CCO,demo
CCN,demo
Required:
- smiles
Optional:
- model_name or name
- mol_idx
If mol_idx is missing, HEDGEHOG assigns it automatically. Headerless
.smi-style files are supported for simple one-SMILES-per-line inputs, but
CSV/TSV is the clearest format for production runs.
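As an illustration of the contract above, the sketch below writes a minimal CSV input and checks that its header row contains a smiles column. The helper has_smiles_header and the temporary file location are hypothetical and not part of HEDGEHOG; the check only inspects the first line and does not validate the SMILES strings themselves.

```shell
# Write a minimal input table that follows the recommended contract.
input_file="$(mktemp -d)/molecules.csv"
printf '%s\n' 'smiles,model_name' 'CCO,demo' 'CCN,demo' > "$input_file"

# Hypothetical check: succeed only if the header row of a CSV/TSV file
# contains a column named exactly "smiles".
has_smiles_header() {
    # Split the first line on commas and tabs, one column name per line,
    # then look for an exact whole-line match.
    head -n 1 "$1" | tr ',\t' '\n\n' | grep -qx 'smiles'
}
```

A file that passes the check can then be supplied with --mols, for example uv run hedgehog --mols "$input_file".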
To run multiple stages in a single invocation, repeat --stage:
uv run hedgehog --stage descriptors --stage struct_filters
Folder Behavior
The output folder strategy depends on what you are running:
- Full pipeline: a new numbered folder (results/run_N/) is created automatically.
- Single stage rerun (--stage without --mols): the existing folder is reused so the stage output is updated in place.
- Single stage with new molecules (--stage + --mols): a new folder is created since the input changed.
- --reuse: always reuses the configured folder, regardless of what exists.
- --force-new: always creates a new numbered folder, regardless of context.
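The folder rules above can be summarized as a small decision function. This is a sketch of the documented behavior only, not HEDGEHOG's implementation; the name folder_strategy and its 0/1 arguments are hypothetical, and it assumes the explicit flags take priority over the context-based defaults.

```shell
# Hypothetical summary of the documented folder strategy.
# Arguments (each 0 or 1): $1 = --stage given, $2 = --mols given,
#                          $3 = --reuse, $4 = --force-new
# Prints "reuse" or "new".
folder_strategy() {
    # Explicit flags are checked first.
    if [ "$3" = "1" ]; then echo "reuse"; return; fi
    if [ "$4" = "1" ]; then echo "new"; return; fi
    # Stage rerun without new molecules updates the existing folder in place.
    if [ "$1" = "1" ] && [ "$2" = "0" ]; then echo "reuse"; return; fi
    # Full pipeline, or a stage run with new molecules, gets a new folder.
    echo "new"
}
```

For instance, folder_strategy 1 0 0 0 (a stage rerun without --mols) prints reuse, and folder_strategy 1 1 0 0 prints new.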
Every run also writes a transient .RUN_INCOMPLETE marker in the results folder. The marker is removed on successful completion. If it remains present, the run was interrupted or failed and the run log should be checked.
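Automation can use the marker to decide whether a results folder is safe to consume. A minimal sketch, assuming the marker sits directly inside the results folder as described; the helper name run_is_complete is hypothetical.

```shell
# Hypothetical helper: succeed only if the results directory exists and
# no .RUN_INCOMPLETE marker remains in it.
run_is_complete() {
    [ -d "$1" ] && [ ! -e "$1/.RUN_INCOMPLETE" ]
}
```

A script might call run_is_complete results/run_10 and regenerate the report only when it succeeds, falling back to inspecting the run log otherwise.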
Stage Failure and Skip Semantics
Stage completion is not binary “success or crash” across the whole pipeline. Some stages end the run early, while others can be skipped and let the pipeline continue.
- Early exit with a completed upstream pipeline state: if mol_prep finishes but leaves zero molecules, the pipeline stops immediately.
- Hard failure / early stop: structural filters stop the run if they do not complete successfully.
- Conditional continuation: synthesis can continue the pipeline when the stage failed but still produced usable output, or when no output file was created and downstream stages can continue without synthesis results.
- Soft skip: docking, docking filters, and final descriptors can be skipped when their required upstream data does not exist or contains no molecules.
This matters when interpreting stage counts in the log and the final report: a missing stage output does not always mean the whole run failed in the same way.
Examples
# Run the full pipeline with default config
uv run hedgehog
# Run with a custom master config
uv run hedgehog --config src/hedgehog/configs/config.yml
# Run with a custom molecule set
uv run hedgehog --mols input/my_molecules.csv
# Run into a custom output directory
uv run hedgehog --out results/my_run
# Run with a glob pattern
uv run hedgehog --mols "input/generated/*.csv"
# Run only the docking stage
uv run hedge --stage docking
# Run descriptors and structural filters in one invocation
uv run hedge --stage descriptors --stage struct_filters
# Run a specific stage with custom molecules
uv run hedge --stage descriptors --mols input/candidates.csv
# Rerun into the same results folder
uv run hedge --reuse
# Force a fresh results folder for a stage rerun
uv run hedge --stage synthesis --force-new
# Enable live progress bar
uv run hedge --progress
# Stream a large library without monolithic intermediate CSVs or plots
uv run hedge --large-dataset --mols input/pubchem.csv
hedgehog report
Regenerates the reporting artifacts from an existing pipeline run directory without re-running any pipeline stages. Useful when report templates, plotting logic, or the stage-audit notebook have been updated.
uv run hedgehog report <RESULTS_DIR>
Arguments
| Argument | Type | Required | Description |
|---|---|---|---|
| RESULTS_DIR | TEXT | Yes | Path to an existing pipeline results directory (e.g., results/run_10). |
The command loads the saved configuration from configs/master_config_resolved.yml inside the results directory and regenerates:
- report.html
- report_data.json
- RUN_INFO.md
- stage_filter_audit.ipynb
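After regenerating a report, a script can confirm that the listed artifacts exist. The sketch below assumes the four files are written at the top level of the results directory, which this reference does not state explicitly; the helper name missing_report_artifacts is hypothetical.

```shell
# Hypothetical helper: print the name of each documented report artifact
# that is absent from the given results directory (assumed top-level layout).
missing_report_artifacts() {
    for name in report.html report_data.json RUN_INFO.md stage_filter_audit.ipynb; do
        [ -e "$1/$name" ] || echo "$name"
    done
}
```

Empty output means all four artifacts were found; any printed name points to a file worth checking in the run log.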
Examples
# Regenerate report for a specific run
uv run hedgehog report results/run_10
# Using the short alias
uv run hedge report results/run_5
hedgehog info
Displays a table of all available pipeline stages with their descriptions. Useful for checking stage names before using --stage.
uv run hedgehog info
hedgehog version
Prints the current HEDGEHOG version and project tagline.
uv run hedgehog version
hedgehog setup aizynthfinder
Installs the upstream aizynthfinder package into the project environment and downloads its public data into modules/aizynthfinder/.
Downloads are auto-accepted by default.
uv run hedgehog setup aizynthfinder
If you explicitly want interactive confirmation:
uv run hedgehog setup aizynthfinder --no-yes
hedgehog setup sync
Downloads the SYNC 3D synthesizability classifier checkpoint to modules/sync/classifier_emb.ckpt.
uv run hedgehog setup sync
Use --no-yes to prompt before download:
uv run hedgehog setup sync --no-yes
hedgehog setup fsscore
Clones the upstream FSScore repository into modules/fsscore so you can point the synthesis
configuration to the pretrained checkpoint without adding heavy torch dependencies
to the base HEDGEHOG environment.
uv run hedgehog setup fsscore --yes
Options:
| Option | Short | Type | Default | Description |
|---|---|---|---|---|
| --yes | -y | FLAG | False | Auto-accept checkout prompt. |
After checkout, configure:
- HEDGEHOG_FSSCORE_PYTHON to an isolated FSScore environment
- HEDGEHOG_FSSCORE_MODEL_PATH (or HEDGEHOG_FSSCORE_REPO_PATH)
hedgehog setup nonpher-check
Validates the optional Nonpher runtime used by the synthesis nonpher scorer.
This command does not install dependencies. For portable setup, first use
uv-isolated environments under a writable per-host HEDGEHOG_OPTIONAL_ENV_ROOT
(for example ~/work/hedgehog_optional_envs), then probe with --python.
uv run hedgehog setup nonpher-check
Options:
| Option | Type | Default | Description |
|---|---|---|---|
| --python | TEXT | None | External interpreter to probe (for example ~/work/hedgehog_optional_envs/nonpher/bin/python). |
| --probe-smiles | TEXT | CCO | Probe molecule used for runtime validation. |
Examples:
# Probe current uv environment
uv run hedgehog setup nonpher-check
# Probe uv-only isolated Nonpher env
uv run hedgehog setup nonpher-check --python ~/work/hedgehog_optional_envs/nonpher/bin/python
# If uv-only bootstrap is blocked by native deps, probe a validated external runtime
uv run hedgehog setup nonpher-check --python /mnt/ligandpro/shared_storage/data/nikolenko/hedgehog_optional_envs/nonpher-hybrid-py38-v2/bin/python
hedgehog setup nvmolkit-worker
Creates an isolated virtual environment at .venv-nvmolkit-worker and installs the optional nvMolKit worker there.
uv run hedgehog setup nvmolkit-worker
Options:
| Option | Short | Type | Default | Description |
|---|---|---|---|---|
| --yes/--no-yes | -y | FLAG | True | Auto-accept dependency download prompt. |
| --python | | TEXT | None | Python interpreter for worker venv. Resolution order: explicit value, then python3.12, python3.11, python3.10. |
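The resolution order for --python can be approximated in a shell script, for example to predict which interpreter a worker venv would get on a given host. This is a sketch of the documented order only; the function name resolve_worker_python is hypothetical, and it assumes an explicit value is used as given without further validation.

```shell
# Hypothetical sketch of the documented resolution order:
# explicit value first, then python3.12, python3.11, python3.10 from PATH.
# $1 = explicit --python value (may be empty)
resolve_worker_python() {
    if [ -n "$1" ]; then
        echo "$1"
        return 0
    fi
    for candidate in python3.12 python3.11 python3.10; do
        if command -v "$candidate" >/dev/null 2>&1; then
            echo "$candidate"
            return 0
        fi
    done
    # None of the candidates was found on PATH.
    return 1
}
```

Calling resolve_worker_python "" prints the first candidate found on PATH, or returns a non-zero status when none is installed.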
Examples:
# Auto-install worker dependencies (default)
uv run hedgehog setup nvmolkit-worker
# Force worker venv to use Python 3.11
uv run hedgehog setup nvmolkit-worker --python python3.11
hedgehog setup shepherd-worker
Creates an isolated virtual environment at .venv-shepherd-worker and installs the optional Shepherd-Score backend there.
uv run hedgehog setup shepherd-worker
Options:
| Option | Short | Type | Default | Description |
|---|---|---|---|---|
| --yes | -y | FLAG | False | Auto-accept dependency download prompt. |
| --python | | TEXT | None | Python interpreter for worker venv. Resolution order: explicit value, then python3.12, python3.11, python3.10. |
Examples:
# Auto-install worker dependencies
uv run hedgehog setup shepherd-worker --yes
# Force worker venv to use Python 3.11
uv run hedgehog setup shepherd-worker --python python3.11 --yes
Environment Variables
The CLI supports a few environment variables that are useful in automation and CI:
| Variable | Effect |
|---|---|
| HEDGEHOG_AUTO_INSTALL=1 | Auto-accept optional dependency downloads and setup prompts. |
| HEDGEHOG_NON_INTERACTIVE=1 | Auto-decline download/setup prompts instead of waiting for interactive input. |
| HEDGEHOG_PLAIN_OUTPUT=1 | Disable Rich formatting and emit plain console output without the banner styling. |
HEDGEHOG_AUTO_INSTALL is what the --auto-install flag sets internally for a pipeline run. The setup commands also use it when their --yes flag is enabled.
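In a CI job, these variables are typically exported once before the pipeline is invoked. The sketch below is one possible arrangement: the function names hedgehog_ci_env and run_hedgehog_ci are hypothetical, and the choice to decline prompts rather than auto-accept them is an assumption about what a locked-down CI environment wants.

```shell
# Hypothetical CI setup: never block on prompts and keep logs free of
# Rich formatting. Uses only the variables documented above.
hedgehog_ci_env() {
    export HEDGEHOG_NON_INTERACTIVE=1
    export HEDGEHOG_PLAIN_OUTPUT=1
}

# Hypothetical wrapper: apply the CI environment, then forward all
# arguments to the documented CLI entry point.
run_hedgehog_ci() {
    hedgehog_ci_env
    uv run hedgehog "$@"
}
```

A job that should install missing optional tools automatically would set HEDGEHOG_AUTO_INSTALL=1 instead of HEDGEHOG_NON_INTERACTIVE=1. How the two variables interact when both are set is not covered by this reference.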
hedgehog tui
Launches the interactive Terminal User Interface for visual pipeline configuration and management.
Requirements and behavior:
- Requires Node.js >= 18 and npm.
- Requires a source checkout that contains the repository tui/ directory.
- If the TUI bundle does not exist yet, HEDGEHOG automatically runs npm install and npm run build inside tui/ before launching.
uv run hedgehog tui
Options:
| Option | Short | Type | Default | Description |
|---|---|---|---|---|
| --session | -s | TEXT | None | Resume TUI directly into results for the specified job id. |
Examples:
# Start TUI normally
uv run hedgehog tui
# Resume directly into an existing job
uv run hedgehog tui --session 1a2b3c4d
For TUI-specific behavior such as editable config copies, job history storage, preflight checks, and keyboard shortcuts, see the TUI documentation.
The TUI can also be launched directly from the TUI package:
cd tui
npm run tui
Alias
hedge is a short alias for hedgehog. All commands work identically with either name:
uv run hedge --stage docking
uv run hedgehog --stage docking
Exit Codes
| Code | Meaning |
|---|---|
| 0 | Pipeline completed successfully |
| 1 | Pipeline completed with failures, conflicting flags (--reuse + --force-new, or --out with --reuse/--force-new), TUI directory not found, or Node.js not available |
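A calling script can branch on these codes. The following sketch maps an exit status to a short message; the helper name describe_exit is hypothetical, and any code other than 0 or 1 is treated as undocumented because the table above lists only those two.

```shell
# Hypothetical helper: translate a HEDGEHOG exit status into a message.
# $1 = exit status of the previous hedgehog invocation
describe_exit() {
    case "$1" in
        0) echo "pipeline completed successfully" ;;
        1) echo "failure, conflicting flags, or missing TUI prerequisites" ;;
        *) echo "undocumented exit code $1" ;;
    esac
}
```

Typical use is to run uv run hedgehog and then call describe_exit $? on the next line. Since code 1 covers several distinct causes, the run log remains the place to find out which one applied.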