CLI Reference

HEDGEHOG provides a Typer-based command-line interface. The primary command is hedgehog, with hedge available as a short alias. Both are interchangeable.


uv run hedgehog --help
uv run hedge --help

Commands

Command	Description
`run`	Run the pipeline explicitly as a subcommand
`report`	Regenerate the HTML report from an existing run
`info`	Display pipeline information and available stages
`version`	Display version information
`setup`	Install optional external tools and assets
`tui`	Launch the interactive TUI (Terminal User Interface)

`hedgehog` (default pipeline command)

The root CLI command executes the evaluation pipeline by default. It runs all enabled stages using the configuration at src/hedgehog/configs/config.yml.


uv run hedgehog [OPTIONS]

The same pipeline command is also available as an explicit subcommand:


uv run hedgehog run [OPTIONS]

Options

Option	Short	Type	Default	Description
`--config`	`-c`	`TEXT`	`src/hedgehog/configs/config.yml`	Master YAML config path.
`--mols`	`-m`	`TEXT`	`None`	Input molecule table path or glob. Overrides config input. Prefer CSV/TSV with a `smiles` header.
`--out`	`-o`	`TEXT`	`None`	Output directory override (`folder_to_save`).
`--stage`	`-s`	`STAGE`	`None`	Run one or more stages only. Repeat `--stage` to select multiple stages. Uses config molecules unless `--mols` is provided. See `hedgehog info` for stage descriptions.
`--reuse`		`FLAG`	`False`	Reuse existing results folder.
`--force-new`		`FLAG`	`False`	Always create a new results folder.
`--auto-install`		`FLAG`	`False`	Auto-install missing optional tools without prompts.
`--progress`		`FLAG`	`False`	Show live progress bar.
`--large-dataset`		`FLAG`	`False`	Stream large molecule libraries in chunks, write row-level `.parts` shards, and skip plots/report-heavy outputs. Filters are calculated by default but do not remove molecules unless `large_dataset_filter_data: true`.

The --reuse and --force-new flags are mutually exclusive. Using both will produce an error.

The --out option cannot be used together with --reuse or --force-new.
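The two restrictions above can be summarized as a small pre-flight check. This is only a sketch of the documented rules, not HEDGEHOG's own validation code; the function name `folder_flag_error` and the message wording are hypothetical.

```python
from typing import Optional


def folder_flag_error(reuse: bool, force_new: bool, out: Optional[str]) -> Optional[str]:
    """Return a message for a flag combination documented as invalid, else None."""
    if reuse and force_new:
        return "--reuse and --force-new are mutually exclusive"
    if out is not None and (reuse or force_new):
        return "--out cannot be combined with --reuse or --force-new"
    return None


# Example: --out together with --reuse is rejected.
print(folder_flag_error(reuse=True, force_new=False, out="results/my_run"))
```

A wrapper script could run such a check before invoking the CLI, so that a conflicting combination is caught before the command exits with code 1.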

The progress bar is disabled by default in CLI runs. Use --progress when you want live stage progress rendering.

Use --large-dataset for PubChem/Enamine-scale statistics runs. It stores intermediate row-level tables as compressed shard directories such as filtered_molecules.parts/ and descriptors_all.parts/, uses large_dataset_chunk_rows from the master config, and skips plots plus HTML report generation. Without an explicit --stage selection, large-dataset mode runs the scalable statistics path: mol_prep, descriptors, struct_filters, and synthesis. Descriptor, structural, and synthesis filters are calculated by default, but downstream large-mode outputs keep all molecules unless large_dataset_filter_data: true. The synthesis stage skips AiZynthFinder retrosynthesis in large-dataset mode.

When you request any stage other than mol_prep, the pipeline may still run mol_prep first if the Mol Prep stage is enabled in config_mol_prep.yml. This is intentional: downstream stages operate on standardized molecules when that preprocessing stage is active.

Available Stages

Each --stage occurrence accepts one of the following values:

Stage	Description
`mol_prep`	Standardize and filter molecules (salts/fragments, metals, charges, tautomers, stereo)
`descriptors`	Compute 22 physicochemical descriptors per molecule
`struct_filters`	Apply structural filters (Lilly, NIBR, PAINS, etc.)
`synthesis`	Evaluate synthetic accessibility using retrosynthesis (AiZynthFinder) and other metrics
`docking`	Calculate docking scores with SMINA/GNINA/Matcha
`docking_filters`	Filter docking poses by quality and interactions
`final_descriptors`	Recompute descriptors on the final filtered set

When running --stage struct_filters, the input is resolved in the following order, using the first source that is available:

Existing descriptors output (if present in the run folder)
MolPrep output
Sampled molecules input

Input File Contract

Recommended input is CSV/TSV with a smiles header:


smiles,model_name
CCO,demo
CCN,demo

Required:

smiles

Optional:

model_name or name
mol_idx

If mol_idx is missing, HEDGEHOG assigns it automatically. Headerless .smi-style files are supported for simple one-SMILES-per-line inputs, but CSV/TSV is the clearest format for production runs.
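As an illustration of the contract above, the following sketch checks a CSV/TSV header before a run. It is a standalone helper, not part of HEDGEHOG; the name `check_input_header` is hypothetical, and exact, case-sensitive column matching is an assumption.

```python
import csv
import io
from typing import List, TextIO

# Optional columns named in the input contract.
OPTIONAL_COLUMNS = ("model_name", "name", "mol_idx")


def check_input_header(handle: TextIO, delimiter: str = ",") -> List[str]:
    """Return the optional columns present; raise ValueError if `smiles` is missing."""
    header = next(csv.reader(handle, delimiter=delimiter), None)
    if header is None or "smiles" not in header:
        raise ValueError("input table needs a `smiles` header column")
    return [column for column in OPTIONAL_COLUMNS if column in header]


sample = io.StringIO("smiles,model_name\nCCO,demo\nCCN,demo\n")
print(check_input_header(sample))
```

Headerless .smi-style files would fail this check by design, so it only makes sense for the recommended CSV/TSV inputs. Pass `delimiter="\t"` for TSV files.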

To run multiple stages in a single invocation, repeat --stage:


uv run hedgehog --stage descriptors --stage struct_filters
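When the command is assembled from a script, the repeated `--stage` option maps naturally onto a list. The helper below is a sketch under that assumption; `build_hedgehog_command` is a hypothetical name, and it uses only the `--stage`, `--mols`, and `--out` options documented above.

```python
from typing import List, Optional, Sequence


def build_hedgehog_command(
    stages: Sequence[str] = (),
    mols: Optional[str] = None,
    out: Optional[str] = None,
) -> List[str]:
    """Build an argument list, repeating --stage once per selected stage."""
    command = ["uv", "run", "hedgehog"]
    for stage in stages:
        command += ["--stage", stage]
    if mols is not None:
        command += ["--mols", mols]
    if out is not None:
        command += ["--out", out]
    return command


print(build_hedgehog_command(["descriptors", "struct_filters"]))
```

The resulting list can be passed to `subprocess.run` without going through a shell, which avoids quoting problems with glob patterns in `--mols`.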

Folder Behavior

The output folder strategy depends on what you are running:

Full pipeline — a new numbered folder (results/run_N/) is created automatically.
Single stage rerun (--stage without --mols) — the existing folder is reused so the stage output is updated in place.
Single stage with new molecules (--stage + --mols) — a new folder is created since the input changed.
--reuse — always reuses the configured folder, regardless of what exists.
--force-new — always creates a new numbered folder, regardless of context.
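The rules above amount to a small decision table. The sketch below restates them in code to make the precedence explicit; it is not HEDGEHOG's implementation, and the function name `folder_strategy` and its return values are hypothetical.

```python
def folder_strategy(
    stage_selected: bool,
    mols_given: bool,
    reuse: bool = False,
    force_new: bool = False,
) -> str:
    """Return "reuse" or "new" following the documented folder rules."""
    if reuse and force_new:
        raise ValueError("--reuse and --force-new are mutually exclusive")
    if reuse:
        return "reuse"  # explicit flag wins regardless of context
    if force_new:
        return "new"  # explicit flag wins regardless of context
    if stage_selected and not mols_given:
        return "reuse"  # stage rerun updates the existing folder in place
    return "new"  # full pipeline, or a stage run on new molecules


print(folder_strategy(stage_selected=True, mols_given=False))
```

For example, `--stage synthesis --force-new` yields a new folder even though a plain `--stage synthesis` rerun would reuse the existing one.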

Every run also writes a transient .RUN_INCOMPLETE marker in the results folder. The marker is removed on successful completion. If it remains present, the run was interrupted or failed and the run log should be checked.
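The marker makes it easy to find interrupted runs from a script. The following sketch assumes that run folders are direct children of a results root such as results/; the helper name `incomplete_runs` is hypothetical.

```python
from pathlib import Path
from typing import List

MARKER = ".RUN_INCOMPLETE"


def incomplete_runs(results_root: Path) -> List[Path]:
    """Return run folders under results_root that still contain the marker."""
    return sorted(
        run_dir
        for run_dir in results_root.iterdir()
        if run_dir.is_dir() and (run_dir / MARKER).exists()
    )


if Path("results").is_dir():
    for run_dir in incomplete_runs(Path("results")):
        print(f"{run_dir} did not finish; check its run log")
```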

Stage Failure and Skip Semantics

Stage completion is not binary “success or crash” across the whole pipeline. Some stages end the run early, while others can be skipped and let the pipeline continue.

Early exit after a successful stage: if mol_prep completes but leaves zero molecules, the pipeline stops immediately.
Hard failure / early stop: structural filters stop the run if they do not complete successfully.
Conditional continuation: synthesis can continue the pipeline when the stage failed but still produced usable output, or when no output file was created and downstream stages can continue without synthesis results.
Soft skip: docking, docking filters, and final descriptors can be skipped when their required upstream data does not exist or contains no molecules.

This matters when interpreting stage counts in the log and the final report: a missing stage output does not always mean the whole run failed in the same way.

Examples


# Run the full pipeline with default config
uv run hedgehog
 
# Run with a custom master config
uv run hedgehog --config src/hedgehog/configs/config.yml
 
# Run with a custom molecule set
uv run hedgehog --mols input/my_molecules.csv
 
# Run into a custom output directory
uv run hedgehog --out results/my_run
 
# Run with a glob pattern
uv run hedgehog --mols "input/generated/*.csv"
 
# Run only the docking stage
uv run hedge --stage docking
 
# Run descriptors and structural filters in one invocation
uv run hedge --stage descriptors --stage struct_filters
 
# Run a specific stage with custom molecules
uv run hedge --stage descriptors --mols input/candidates.csv
 
# Rerun into the same results folder
uv run hedge --reuse
 
# Force a fresh results folder for a stage rerun
uv run hedge --stage synthesis --force-new
 
# Enable live progress bar
uv run hedge --progress
 
# Stream a large library without monolithic intermediate CSVs or plots
uv run hedge --large-dataset --mols input/pubchem.csv

`hedgehog report`

Regenerates the reporting artifacts from an existing pipeline run directory without re-running any pipeline stages. Useful when report templates, plotting logic, or the stage-audit notebook have been updated.


uv run hedgehog report <RESULTS_DIR>

Arguments

Argument	Type	Required	Description
`RESULTS_DIR`	`TEXT`	Yes	Path to an existing pipeline results directory (e.g., `results/run_10`).

The command loads the saved configuration from configs/master_config_resolved.yml inside the results directory and regenerates:

report.html
report_data.json
RUN_INFO.md
stage_filter_audit.ipynb
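Because the command depends on the saved configuration, a script can check for that file before calling `hedgehog report`. This is a minimal sketch; the helper name `can_regenerate_report` is hypothetical, and only the configs/master_config_resolved.yml path is taken from the description above.

```python
from pathlib import Path

# Location of the saved configuration, relative to the results directory.
RESOLVED_CONFIG = Path("configs") / "master_config_resolved.yml"


def can_regenerate_report(results_dir: Path) -> bool:
    """Return True if the results directory contains the saved master config."""
    return (results_dir / RESOLVED_CONFIG).is_file()


print(can_regenerate_report(Path("results/run_10")))
```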

Examples


# Regenerate report for a specific run
uv run hedgehog report results/run_10
 
# Using the short alias
uv run hedge report results/run_5

`hedgehog info`

Displays a table of all available pipeline stages with their descriptions. Useful for checking stage names before using --stage.


uv run hedgehog info

`hedgehog version`

Prints the current HEDGEHOG version and project tagline.


uv run hedgehog version

`hedgehog setup aizynthfinder`

Installs the upstream aizynthfinder package into the project environment and downloads its public data into modules/aizynthfinder/. Downloads are auto-accepted by default.


uv run hedgehog setup aizynthfinder

If you explicitly want interactive confirmation:


uv run hedgehog setup aizynthfinder --no-yes

`hedgehog setup sync`

Downloads the SYNC 3D synthesizability classifier checkpoint to modules/sync/classifier_emb.ckpt.


uv run hedgehog setup sync

Use --no-yes to prompt before download:


uv run hedgehog setup sync --no-yes

`hedgehog setup fsscore`

Clones the upstream FSScore repository into modules/fsscore so that you can point the synthesis configuration at the pretrained checkpoint without adding heavy torch dependencies to the base HEDGEHOG environment.


uv run hedgehog setup fsscore --yes

Options:

Option	Short	Type	Default	Description
`--yes`	`-y`	`FLAG`	`False`	Auto-accept checkout prompt.

After checkout, configure:

HEDGEHOG_FSSCORE_PYTHON to an isolated FSScore environment
HEDGEHOG_FSSCORE_MODEL_PATH (or HEDGEHOG_FSSCORE_REPO_PATH)

`hedgehog setup nonpher-check`

Validates the optional Nonpher runtime used by the synthesis nonpher scorer. This command does not install dependencies. For a portable setup, first create uv-isolated environments under a writable per-host HEDGEHOG_OPTIONAL_ENV_ROOT (for example ~/work/hedgehog_optional_envs), then probe them with --python.


uv run hedgehog setup nonpher-check

Options:

Option	Type	Default	Description
`--python`	`TEXT`	`None`	External interpreter to probe (for example `~/work/hedgehog_optional_envs/nonpher/bin/python`).
`--probe-smiles`	`TEXT`	`CCO`	Probe molecule used for runtime validation.

Examples:


# Probe current uv environment
uv run hedgehog setup nonpher-check
 
# Probe uv-only isolated Nonpher env
uv run hedgehog setup nonpher-check --python ~/work/hedgehog_optional_envs/nonpher/bin/python
 
# If uv-only bootstrap is blocked by native deps, probe a validated external runtime
uv run hedgehog setup nonpher-check --python /mnt/ligandpro/shared_storage/data/nikolenko/hedgehog_optional_envs/nonpher-hybrid-py38-v2/bin/python

`hedgehog setup nvmolkit-worker`

Creates an isolated virtual environment at .venv-nvmolkit-worker and installs the optional nvMolKit worker there.


uv run hedgehog setup nvmolkit-worker

Options:

Option	Short	Type	Default	Description
`--yes/--no-yes`	`-y`	`FLAG`	`True`	Auto-accept dependency download prompt.
`--python`		`TEXT`	`None`	Python interpreter for worker venv. Resolution order: explicit value, then `python3.12`, `python3.11`, `python3.10`.
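The resolution order for `--python` can be pictured as follows. This sketch assumes the candidates are looked up on PATH, which the table does not state, and the helper name `resolve_worker_python` is hypothetical; what HEDGEHOG does when no candidate is found is not covered here.

```python
import shutil
from typing import Callable, Optional

# Fallback order taken from the option description.
CANDIDATES = ("python3.12", "python3.11", "python3.10")


def resolve_worker_python(
    explicit: Optional[str] = None,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[str]:
    """Return the explicit interpreter if given, else the first candidate found."""
    if explicit is not None:
        return explicit
    for name in CANDIDATES:
        path = which(name)
        if path is not None:
            return path
    return None  # assumption: no interpreter available


print(resolve_worker_python())
```

The `which` parameter exists only so that the lookup can be replaced in tests.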

Examples:


# Auto-install worker dependencies (default)
uv run hedgehog setup nvmolkit-worker
 
# Force worker venv to use Python 3.11
uv run hedgehog setup nvmolkit-worker --python python3.11

`hedgehog setup shepherd-worker`

Creates an isolated virtual environment at .venv-shepherd-worker and installs the optional Shepherd-Score backend there.


uv run hedgehog setup shepherd-worker

Options:

Option	Short	Type	Default	Description
`--yes`	`-y`	`FLAG`	`False`	Auto-accept dependency download prompt.
`--python`		`TEXT`	`None`	Python interpreter for worker venv. Resolution order: explicit value, then `python3.12`, `python3.11`, `python3.10`.

Examples:


# Auto-install worker dependencies
uv run hedgehog setup shepherd-worker --yes
 
# Force worker venv to use Python 3.11
uv run hedgehog setup shepherd-worker --python python3.11 --yes

Environment Variables

The CLI supports a few environment variables that are useful in automation and CI:

Variable	Effect
`HEDGEHOG_AUTO_INSTALL=1`	Auto-accept optional dependency downloads and setup prompts.
`HEDGEHOG_NON_INTERACTIVE=1`	Auto-decline download/setup prompts instead of waiting for interactive input.
`HEDGEHOG_PLAIN_OUTPUT=1`	Disable Rich formatting and emit plain console output without the banner styling.

HEDGEHOG_AUTO_INSTALL is what the --auto-install flag sets internally for a pipeline run. The setup commands also use it when their --yes flag is enabled.
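In automation, these variables are typically set on the child process rather than globally. The sketch below builds such an environment; the helper name `ci_environment` is hypothetical, and it sets only one of the two prompt-related variables at a time because their interaction is not described here.

```python
import os
from typing import Dict, Mapping, Optional


def ci_environment(
    base: Optional[Mapping[str, str]] = None,
    auto_install: bool = False,
) -> Dict[str, str]:
    """Return a copy of the environment with HEDGEHOG automation variables set."""
    env = dict(os.environ if base is None else base)
    env["HEDGEHOG_PLAIN_OUTPUT"] = "1"  # plain console output for CI logs
    if auto_install:
        env["HEDGEHOG_AUTO_INSTALL"] = "1"  # auto-accept downloads and setup prompts
    else:
        env["HEDGEHOG_NON_INTERACTIVE"] = "1"  # auto-decline instead of waiting
    return env


print(sorted(key for key in ci_environment({}) if key.startswith("HEDGEHOG_")))
```

The returned dictionary can be passed as the `env` argument of `subprocess.run` when launching `uv run hedgehog`.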

`hedgehog tui`

Launches the interactive Terminal User Interface for visual pipeline configuration and management.

Requirements and behavior:

Requires Node.js >= 18 and npm.
Requires a source checkout that contains the repository tui/ directory.
If the TUI bundle does not exist yet, HEDGEHOG automatically runs npm install and npm run build inside tui/ before launching.
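The Node.js requirement can be checked ahead of time by parsing the output of `node --version`, which prints a string such as `v18.17.0`. The helpers below are a sketch and not part of HEDGEHOG; the names `node_major` and `node_is_supported` are hypothetical.

```python
import re
from typing import Optional


def node_major(version_output: str) -> Optional[int]:
    """Extract the major version from `node --version` output."""
    match = re.match(r"v?(\d+)\.", version_output.strip())
    return int(match.group(1)) if match else None


def node_is_supported(version_output: str, minimum: int = 18) -> bool:
    """Return True if the reported Node.js major version is at least `minimum`."""
    major = node_major(version_output)
    return major is not None and major >= minimum


print(node_is_supported("v18.17.0"))
```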


uv run hedgehog tui

Options:

Option	Short	Type	Default	Description
`--session`	`-s`	`TEXT`	`None`	Resume TUI directly into results for the specified job id.

Examples:


# Start TUI normally
uv run hedgehog tui
 
# Resume directly into an existing job
uv run hedgehog tui --session 1a2b3c4d

For TUI-specific behavior such as editable config copies, job history storage, preflight checks, and keyboard shortcuts, see the TUI documentation.

The TUI can also be launched directly from the TUI package:


cd tui
npm run tui

Alias

hedge is a short alias for hedgehog. All commands work identically with either name:


uv run hedge --stage docking
uv run hedgehog --stage docking

Exit Codes

Code	Meaning
`0`	Pipeline completed successfully
`1`	Pipeline completed with failures, conflicting flags (`--reuse` + `--force-new`, or `--out` with `--reuse/--force-new`), TUI directory not found, or Node.js not available
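Since exit code `1` covers several distinct causes, a calling script can only tell success from failure and should point the user to the console output or run log for details. A minimal sketch, with the hypothetical helper name `run_and_report`:

```python
import subprocess
import sys
from typing import Sequence


def run_and_report(command: Sequence[str]) -> bool:
    """Run a command and return True only when it exits with code 0."""
    completed = subprocess.run(list(command), check=False)
    if completed.returncode == 0:
        return True
    print(
        f"command exited with code {completed.returncode}; "
        "check the console output and run log",
        file=sys.stderr,
    )
    return False
```

For example, `run_and_report(["uv", "run", "hedgehog", "--stage", "docking"])` returns `False` both for a failed pipeline and for a conflicting flag combination.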
