CLI Reference

HEDGEHOG provides a Typer-based command-line interface. The primary command is hedgehog, with hedge available as a short alias. Both are interchangeable.

uv run hedgehog --help
uv run hedge --help

Commands

| Command | Description |
| --- | --- |
| run | Run the pipeline explicitly as a subcommand |
| report | Regenerate the HTML report from an existing run |
| info | Display pipeline information and available stages |
| version | Display version information |
| setup | Install optional external tools and assets |
| tui | Launch the interactive TUI (Terminal User Interface) |

hedgehog (default pipeline command)

The root CLI command executes the evaluation pipeline by default. It runs all enabled stages using the configuration at src/hedgehog/configs/config.yml.

uv run hedgehog [OPTIONS]

The same pipeline command is also available as an explicit subcommand:

uv run hedgehog run [OPTIONS]

Options

| Option | Short | Type | Default | Description |
| --- | --- | --- | --- | --- |
| --config | -c | TEXT | src/hedgehog/configs/config.yml | Master YAML config path. |
| --mols | -m | TEXT | None | Input molecule table path or glob. Overrides config input. Prefer CSV/TSV with a smiles header. |
| --out | -o | TEXT | None | Output directory override (folder_to_save). |
| --stage | -s | STAGE | None | Run one or more stages only. Repeat --stage to select multiple stages. Uses config molecules unless --mols is provided. See hedgehog info for stage descriptions. |
| --reuse | | FLAG | False | Reuse existing results folder. |
| --force-new | | FLAG | False | Always create a new results folder. |
| --auto-install | | FLAG | False | Auto-install missing optional tools without prompts. |
| --progress | | FLAG | False | Show live progress bar. |
| --large-dataset | | FLAG | False | Stream large molecule libraries in chunks, write row-level .parts shards, and skip plots/report-heavy outputs. Filters are calculated by default but do not remove molecules unless large_dataset_filter_data: true. |

The --reuse and --force-new flags are mutually exclusive. Using both will produce an error.

The --out option cannot be used together with --reuse or --force-new.
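The two conflict rules above can be expressed as a small pre-flight check, for example in a wrapper script that assembles HEDGEHOG invocations. This is a minimal sketch, not HEDGEHOG's own validation code; the function name check_folder_flags and the error messages are hypothetical.

```python
from typing import Optional


def check_folder_flags(
    reuse: bool = False,
    force_new: bool = False,
    out: Optional[str] = None,
) -> None:
    """Raise ValueError for the flag combinations documented as invalid.

    Hypothetical helper for wrapper scripts; the real CLI performs its
    own checks and reports its own error text.
    """
    if reuse and force_new:
        raise ValueError("--reuse and --force-new are mutually exclusive")
    if out is not None and (reuse or force_new):
        raise ValueError("--out cannot be combined with --reuse or --force-new")


# Valid combinations pass silently.
check_folder_flags(reuse=True)
check_folder_flags(out="results/my_run")
```

Running the check before spawning the CLI lets automation fail fast instead of waiting for the process to exit with an error.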

The progress bar is disabled by default in CLI runs. Use --progress when you want live stage progress rendering.

Use --large-dataset for PubChem/Enamine-scale statistics runs. It stores intermediate row-level tables as compressed shard directories such as filtered_molecules.parts/ and descriptors_all.parts/, uses large_dataset_chunk_rows from the master config, and skips plots plus HTML report generation. Without an explicit --stage selection, large-dataset mode runs the scalable statistics path: mol_prep, descriptors, struct_filters, and synthesis. Descriptor, structural, and synthesis filters are calculated by default, but downstream large-mode outputs keep all molecules unless large_dataset_filter_data: true. The synthesis stage skips AiZynthFinder retrosynthesis in large-dataset mode.
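To illustrate what chunked streaming means in general, the sketch below reads a molecule table in fixed-size row chunks with the standard library. It is only an illustration of the idea behind large_dataset_chunk_rows; the function iter_chunks is hypothetical and says nothing about how HEDGEHOG reads input or writes its .parts shards.

```python
import csv
import io
from itertools import islice
from typing import Dict, Iterator, List, TextIO


def iter_chunks(handle: TextIO, chunk_rows: int) -> Iterator[List[Dict[str, str]]]:
    """Yield lists of at most chunk_rows rows without loading the whole table."""
    reader = csv.DictReader(handle)
    while True:
        chunk = list(islice(reader, chunk_rows))
        if not chunk:
            return
        yield chunk


# Three molecules read two rows at a time.
demo = io.StringIO("smiles\nCCO\nCCN\nCCC\n")
sizes = [len(chunk) for chunk in iter_chunks(demo, chunk_rows=2)]
print(sizes)  # [2, 1]
```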

When you request any stage other than mol_prep, the pipeline may still run mol_prep first if the Mol Prep stage is enabled in config_mol_prep.yml. This is intentional: downstream stages operate on standardized molecules when that preprocessing stage is active.

Available Stages

Each --stage occurrence accepts one of the following values:

| Stage | Description |
| --- | --- |
| mol_prep | Standardize and filter molecules (salts/fragments, metals, charges, tautomers, stereo) |
| descriptors | Compute 22 physicochemical descriptors per molecule |
| struct_filters | Apply structural filters (Lilly, NIBR, PAINS, etc.) |
| synthesis | Evaluate synthetic accessibility using retrosynthesis (AiZynthFinder) and other metrics |
| docking | Calculate docking scores with SMINA/GNINA/Matcha |
| docking_filters | Filter docking poses by quality and interactions |
| final_descriptors | Recompute descriptors on the final filtered set |

When running --stage struct_filters, input resolution is:

  1. Existing descriptors output (if present in the run folder)
  2. MolPrep output
  3. Sampled molecules input
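The resolution order above is a first-match fallback chain. The sketch below shows that pattern; the relative file names are hypothetical placeholders, since the actual output paths inside a run folder are not specified here.

```python
from pathlib import Path
from typing import Iterable, Optional


def first_existing(candidates: Iterable[Path]) -> Optional[Path]:
    """Return the first candidate that exists as a file, or None."""
    for path in candidates:
        if path.is_file():
            return path
    return None


def resolve_struct_filters_input(run_dir: Path) -> Optional[Path]:
    """Apply the documented priority order using HYPOTHETICAL file names."""
    return first_existing(
        [
            run_dir / "descriptors" / "output.csv",  # 1. descriptors output
            run_dir / "mol_prep" / "output.csv",  # 2. MolPrep output
            run_dir / "input" / "sampled_molecules.csv",  # 3. sampled input
        ]
    )
```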

Input File Contract

Recommended input is CSV/TSV with a smiles header:

smiles,model_name
CCO,demo
CCN,demo

Required:

  • smiles

Optional:

  • model_name or name
  • mol_idx

If mol_idx is missing, HEDGEHOG assigns it automatically. Headerless .smi-style files are supported for simple one-SMILES-per-line inputs, but CSV/TSV is the clearest format for production runs.
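Before a long run, it can help to check an input table against this contract. The following is a minimal sketch using only the standard library; load_molecules is a hypothetical helper, and the 0-based numbering it uses for a missing mol_idx is an assumption, not necessarily the scheme HEDGEHOG applies.

```python
import csv
import io
from typing import Dict, List


def load_molecules(text: str) -> List[Dict[str, str]]:
    """Parse CSV text, require a smiles column, and fill in mol_idx if absent."""
    reader = csv.DictReader(io.StringIO(text))
    if "smiles" not in (reader.fieldnames or []):
        raise ValueError("input table needs a 'smiles' column")
    rows = []
    for idx, row in enumerate(reader):
        row = dict(row)
        # ASSUMPTION: sequential 0-based index when the column is missing.
        row.setdefault("mol_idx", str(idx))
        rows.append(row)
    return rows


rows = load_molecules("smiles,model_name\nCCO,demo\nCCN,demo\n")
print(rows[1])  # {'smiles': 'CCN', 'model_name': 'demo', 'mol_idx': '1'}
```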

To run multiple stages in a single invocation, repeat --stage:

uv run hedgehog --stage descriptors --stage struct_filters
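When a script assembles the command line, the repeated --stage option can be generated from a list. This sketch only builds the argument list and does not run anything; stage_args is a hypothetical helper, and the stage names are copied from the Available Stages table.

```python
from typing import List, Sequence

KNOWN_STAGES = (
    "mol_prep",
    "descriptors",
    "struct_filters",
    "synthesis",
    "docking",
    "docking_filters",
    "final_descriptors",
)


def stage_args(stages: Sequence[str]) -> List[str]:
    """Return one '--stage NAME' pair per requested stage."""
    args: List[str] = []
    for stage in stages:
        if stage not in KNOWN_STAGES:
            raise ValueError(f"unknown stage: {stage}")
        args += ["--stage", stage]
    return args


cmd = ["uv", "run", "hedgehog", *stage_args(["descriptors", "struct_filters"])]
print(" ".join(cmd))  # uv run hedgehog --stage descriptors --stage struct_filters
```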

Folder Behavior

The output folder strategy depends on what you are running:

  • Full pipeline — a new numbered folder (results/run_N/) is created automatically.
  • Single stage rerun (--stage without --mols) — the existing folder is reused so the stage output is updated in place.
  • Single stage with new molecules (--stage + --mols) — a new folder is created since the input changed.
  • --reuse — always reuses the configured folder, regardless of what exists.
  • --force-new — always creates a new numbered folder, regardless of context.
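The rules above can be read as a small decision function. The sketch below encodes them for illustration only; folder_strategy is a hypothetical name, and the real CLI may take additional context into account.

```python
from typing import Optional, Sequence


def folder_strategy(
    stages: Sequence[str] = (),
    mols: Optional[str] = None,
    reuse: bool = False,
    force_new: bool = False,
) -> str:
    """Return 'reuse' or 'new' following the documented folder rules."""
    if reuse:
        return "reuse"  # --reuse always reuses the configured folder
    if force_new:
        return "new"  # --force-new always creates a numbered folder
    if stages and mols is None:
        return "reuse"  # stage rerun on the same molecules updates in place
    return "new"  # full pipeline, or a stage run with new molecules


print(folder_strategy())  # new
print(folder_strategy(stages=["docking"]))  # reuse
print(folder_strategy(stages=["descriptors"], mols="input/candidates.csv"))  # new
```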

Every run also writes a transient .RUN_INCOMPLETE marker in the results folder. The marker is removed on successful completion. If it remains present, the run was interrupted or failed and the run log should be checked.
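A script that post-processes results can use the marker to decide whether a run folder is safe to read. This is a minimal sketch based only on the marker file name given above; run_completed is a hypothetical helper.

```python
from pathlib import Path

MARKER_NAME = ".RUN_INCOMPLETE"


def run_completed(results_dir) -> bool:
    """Return True when the results folder exists and has no incomplete marker."""
    folder = Path(results_dir)
    if not folder.is_dir():
        raise FileNotFoundError(f"no such results folder: {folder}")
    return not (folder / MARKER_NAME).exists()
```

For example, run_completed("results/run_10") returning False would indicate that the run was interrupted or failed and that its log deserves a look.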

Stage Failure and Skip Semantics

Stage completion is not binary “success or crash” across the whole pipeline. Some stages end the run early, while others can be skipped and let the pipeline continue.

  • Early exit after a successful stage: if mol_prep finishes but leaves zero molecules, the pipeline stops immediately.
  • Hard failure / early stop: structural filters stop the run if they do not complete successfully.
  • Conditional continuation: synthesis can continue the pipeline when the stage failed but still produced usable output, or when no output file was created and downstream stages can continue without synthesis results.
  • Soft skip: docking, docking filters, and final descriptors can be skipped when their required upstream data does not exist or contains no molecules.

This matters when interpreting stage counts in the log and the final report: a missing stage output does not always mean the whole run failed in the same way.

Examples

# Run the full pipeline with default config
uv run hedgehog

# Run with a custom master config
uv run hedgehog --config src/hedgehog/configs/config.yml

# Run with a custom molecule set
uv run hedgehog --mols input/my_molecules.csv

# Run into a custom output directory
uv run hedgehog --out results/my_run

# Run with a glob pattern
uv run hedgehog --mols "input/generated/*.csv"

# Run only the docking stage
uv run hedge --stage docking

# Run descriptors and structural filters in one invocation
uv run hedge --stage descriptors --stage struct_filters

# Run a specific stage with custom molecules
uv run hedge --stage descriptors --mols input/candidates.csv

# Rerun into the same results folder
uv run hedge --reuse

# Force a fresh results folder for a stage rerun
uv run hedge --stage synthesis --force-new

# Enable live progress bar
uv run hedge --progress

# Stream a large library without monolithic intermediate CSVs or plots
uv run hedge --large-dataset --mols input/pubchem.csv

hedgehog report

Regenerates the reporting artifacts from an existing pipeline run directory without re-running any pipeline stages. Useful when report templates, plotting logic, or the stage-audit notebook have been updated.

uv run hedgehog report <RESULTS_DIR>

Arguments

| Argument | Type | Required | Description |
| --- | --- | --- | --- |
| RESULTS_DIR | TEXT | Yes | Path to an existing pipeline results directory (e.g., results/run_10). |

The command loads the saved configuration from configs/master_config_resolved.yml inside the results directory and regenerates:

  • report.html
  • report_data.json
  • RUN_INFO.md
  • stage_filter_audit.ipynb

Examples

# Regenerate report for a specific run
uv run hedgehog report results/run_10

# Using the short alias
uv run hedge report results/run_5

hedgehog info

Displays a table of all available pipeline stages with their descriptions. Useful for checking stage names before using --stage.

uv run hedgehog info

hedgehog version

Prints the current HEDGEHOG version and project tagline.

uv run hedgehog version

hedgehog setup aizynthfinder

Installs the upstream aizynthfinder package into the project environment and downloads its public data into modules/aizynthfinder/. Downloads are auto-accepted by default.

uv run hedgehog setup aizynthfinder

If you explicitly want interactive confirmation:

uv run hedgehog setup aizynthfinder --no-yes

hedgehog setup sync

Downloads the SYNC 3D synthesizability classifier checkpoint to modules/sync/classifier_emb.ckpt.

uv run hedgehog setup sync

Use --no-yes to prompt before download:

uv run hedgehog setup sync --no-yes

hedgehog setup fsscore

Clones the upstream FSScore repository into modules/fsscore so that you can point the synthesis configuration at the pretrained checkpoint without adding heavy torch dependencies to the base HEDGEHOG environment.

uv run hedgehog setup fsscore --yes

Options:

| Option | Short | Type | Default | Description |
| --- | --- | --- | --- | --- |
| --yes | -y | FLAG | False | Auto-accept checkout prompt. |

After checkout, configure:

  • HEDGEHOG_FSSCORE_PYTHON to an isolated FSScore environment
  • HEDGEHOG_FSSCORE_MODEL_PATH (or HEDGEHOG_FSSCORE_REPO_PATH)
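One way to supply these variables from a launcher script is to build them as a dictionary and merge it into the process environment. This is a sketch under assumptions: fsscore_env is a hypothetical helper, the interpreter path in the example is a placeholder, and preferring the model path when both values are given is a choice made here, not documented behavior.

```python
from typing import Dict, Optional


def fsscore_env(
    python: str,
    model_path: Optional[str] = None,
    repo_path: Optional[str] = None,
) -> Dict[str, str]:
    """Return the FSScore-related variables for a HEDGEHOG run."""
    if not (model_path or repo_path):
        raise ValueError("provide model_path or repo_path")
    env = {"HEDGEHOG_FSSCORE_PYTHON": python}
    if model_path:
        # ASSUMPTION: the model path takes priority when both are supplied.
        env["HEDGEHOG_FSSCORE_MODEL_PATH"] = model_path
    else:
        env["HEDGEHOG_FSSCORE_REPO_PATH"] = repo_path
    return env


# Placeholder interpreter path; modules/fsscore is the checkout location above.
print(fsscore_env("/path/to/fsscore-env/bin/python", repo_path="modules/fsscore"))
```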

hedgehog setup nonpher-check

Validates the optional Nonpher runtime used by the synthesis nonpher scorer. This command does not install dependencies. For a portable setup, first create a uv-isolated environment under a writable, per-host HEDGEHOG_OPTIONAL_ENV_ROOT (for example ~/work/hedgehog_optional_envs), then probe it with --python.

uv run hedgehog setup nonpher-check

Options:

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| --python | TEXT | None | External interpreter to probe (for example ~/work/hedgehog_optional_envs/nonpher/bin/python). |
| --probe-smiles | TEXT | CCO | Probe molecule used for runtime validation. |

Examples:

# Probe current uv environment
uv run hedgehog setup nonpher-check

# Probe uv-only isolated Nonpher env
uv run hedgehog setup nonpher-check --python ~/work/hedgehog_optional_envs/nonpher/bin/python

# If uv-only bootstrap is blocked by native deps, probe a validated external runtime
uv run hedgehog setup nonpher-check --python /mnt/ligandpro/shared_storage/data/nikolenko/hedgehog_optional_envs/nonpher-hybrid-py38-v2/bin/python

hedgehog setup nvmolkit-worker

Creates an isolated virtual environment at .venv-nvmolkit-worker and installs the optional nvMolKit worker there.

uv run hedgehog setup nvmolkit-worker

Options:

| Option | Short | Type | Default | Description |
| --- | --- | --- | --- | --- |
| --yes/--no-yes | -y | FLAG | True | Auto-accept dependency download prompt. |
| --python | | TEXT | None | Python interpreter for worker venv. Resolution order: explicit value, then python3.12, python3.11, python3.10. |

Examples:

# Auto-install worker dependencies (default)
uv run hedgehog setup nvmolkit-worker

# Force worker venv to use Python 3.11
uv run hedgehog setup nvmolkit-worker --python python3.11
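The interpreter resolution order listed for --python (explicit value, then python3.12, python3.11, python3.10) can be sketched as a simple fallback search. This is an illustration, not the setup command's implementation; pick_python is a hypothetical name, and the lookup function is injectable so the logic can be exercised without those interpreters installed.

```python
import shutil
from typing import Callable, Optional

FALLBACKS = ("python3.12", "python3.11", "python3.10")


def pick_python(
    explicit: Optional[str] = None,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[str]:
    """Return the explicit interpreter, else the first fallback found on PATH."""
    if explicit:
        return explicit
    for name in FALLBACKS:
        found = which(name)
        if found:
            return found
    return None


# Simulate a machine where only python3.11 is installed.
only_311 = lambda name: "/usr/bin/python3.11" if name == "python3.11" else None
print(pick_python(which=only_311))  # /usr/bin/python3.11
```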

hedgehog setup shepherd-worker

Creates an isolated virtual environment at .venv-shepherd-worker and installs the optional Shepherd-Score backend there.

uv run hedgehog setup shepherd-worker

Options:

| Option | Short | Type | Default | Description |
| --- | --- | --- | --- | --- |
| --yes | -y | FLAG | False | Auto-accept dependency download prompt. |
| --python | | TEXT | None | Python interpreter for worker venv. Resolution order: explicit value, then python3.12, python3.11, python3.10. |

Examples:

# Auto-install worker dependencies
uv run hedgehog setup shepherd-worker --yes

# Force worker venv to use Python 3.11
uv run hedgehog setup shepherd-worker --python python3.11 --yes

Environment Variables

The CLI supports a few environment variables that are useful in automation and CI:

| Variable | Effect |
| --- | --- |
| HEDGEHOG_AUTO_INSTALL=1 | Auto-accept optional dependency downloads and setup prompts. |
| HEDGEHOG_NON_INTERACTIVE=1 | Auto-decline download/setup prompts instead of waiting for interactive input. |
| HEDGEHOG_PLAIN_OUTPUT=1 | Disable Rich formatting and emit plain console output without the banner styling. |

HEDGEHOG_AUTO_INSTALL is what the --auto-install flag sets internally for a pipeline run. The setup commands also use it when their --yes flag is enabled.
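In a CI job, these variables are typically set on the child process rather than globally. The sketch below builds such an environment; ci_environment is a hypothetical helper, and it deliberately sets only one of the two prompt-related variables because how the CLI behaves when both are set is not described here.

```python
import os
from typing import Dict, Mapping, Optional


def ci_environment(
    auto_install: bool,
    base: Optional[Mapping[str, str]] = None,
) -> Dict[str, str]:
    """Return a copy of the environment tuned for non-interactive runs."""
    env = dict(os.environ if base is None else base)
    env["HEDGEHOG_PLAIN_OUTPUT"] = "1"  # plain logs are easier to read in CI
    if auto_install:
        env["HEDGEHOG_AUTO_INSTALL"] = "1"  # accept optional downloads
    else:
        env["HEDGEHOG_NON_INTERACTIVE"] = "1"  # decline instead of waiting
    return env


env = ci_environment(auto_install=False, base={"PATH": "/usr/bin"})
print(sorted(env))  # ['HEDGEHOG_NON_INTERACTIVE', 'HEDGEHOG_PLAIN_OUTPUT', 'PATH']
```

The result can be passed as the env argument of subprocess.run when launching the pipeline.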

hedgehog tui

Launches the interactive Terminal User Interface for visual pipeline configuration and management.

Requirements and behavior:

  • Requires Node.js >= 18 and npm.
  • Requires a source checkout that contains the repository tui/ directory.
  • If the TUI bundle does not exist yet, HEDGEHOG automatically runs npm install and npm run build inside tui/ before launching.

uv run hedgehog tui
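To check the Node.js requirement ahead of time, a script can parse the version string that node --version prints (for example v18.17.0). This is a minimal sketch with hypothetical function names; it parses a string and does not call node itself.

```python
import re
from typing import Optional


def node_major(version_output: str) -> Optional[int]:
    """Extract the major version from text such as 'v18.17.0'."""
    match = re.match(r"v?(\d+)\.", version_output.strip())
    return int(match.group(1)) if match else None


def node_is_supported(version_output: str, minimum: int = 18) -> bool:
    """Return True when the parsed major version meets the minimum."""
    major = node_major(version_output)
    return major is not None and major >= minimum


print(node_is_supported("v18.17.0"))  # True
print(node_is_supported("v16.20.2"))  # False
```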

Options:

| Option | Short | Type | Default | Description |
| --- | --- | --- | --- | --- |
| --session | -s | TEXT | None | Resume TUI directly into results for the specified job id. |

Examples:

# Start TUI normally
uv run hedgehog tui

# Resume directly into an existing job
uv run hedgehog tui --session 1a2b3c4d

For TUI-specific behavior such as editable config copies, job history storage, preflight checks, and keyboard shortcuts, see the TUI documentation.

The TUI can also be launched directly from the TUI package:

cd tui
npm run tui

Alias

hedge is a short alias for hedgehog. All commands work identically with either name:

uv run hedge --stage docking
uv run hedgehog --stage docking

Exit Codes

| Code | Meaning |
| --- | --- |
| 0 | Pipeline completed successfully |
| 1 | Pipeline completed with failures, conflicting flags (--reuse + --force-new, or --out with --reuse/--force-new), TUI directory not found, or Node.js not available |
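Automation can branch on these codes after the process exits. The sketch below shows the pattern with subprocess; describe_exit is a hypothetical helper, and the demo uses the current Python interpreter as a stand-in command so that it runs without HEDGEHOG installed.

```python
import subprocess
import sys
from typing import Sequence


def describe_exit(cmd: Sequence[str]) -> str:
    """Run a command and map its exit code to the documented meanings."""
    code = subprocess.run(cmd).returncode
    if code == 0:
        return "completed successfully"
    if code == 1:
        return "failed (pipeline failure, flag conflict, or missing TUI/Node.js)"
    return f"undocumented exit code {code}"


# Stand-in commands; replace with e.g. ["uv", "run", "hedgehog"] in real use.
print(describe_exit([sys.executable, "-c", "raise SystemExit(0)"]))
print(describe_exit([sys.executable, "-c", "raise SystemExit(1)"]))
```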