Troubleshooting
Common problems encountered when running HEDGEHOG, with explanations and solutions.
missing required 'smiles' column
Symptom: input loading fails with an error that the file is missing the
required smiles column.
Cause: the input was parsed as CSV/TSV but does not contain a smiles
header, or the file is not in the expected molecule-table format.
Fix: use CSV/TSV with a smiles column:
smiles,model_name
CCO,demo
CCN,demo
For headerless .smi files, keep one SMILES per line and use an optional second
token for model_name.
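Before starting a run, the header can be checked with a few lines of Python. This is a minimal sketch, not HEDGEHOG's own loader: has_smiles_column is a hypothetical helper, and the exact, case-sensitive match on the header name is an assumption.

```python
import csv
import io


def has_smiles_column(text: str, delimiter: str = ",") -> bool:
    """Return True if the first row of a CSV/TSV table contains a 'smiles' header.

    Hypothetical helper; the case-sensitive match is an assumption.
    """
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    header = next(reader, [])
    return "smiles" in (name.strip() for name in header)


good = "smiles,model_name\nCCO,demo\nCCN,demo\n"
bad = "CCO,demo\nCCN,demo\n"  # first row is data, not a header

print(has_smiles_column(good))  # True
print(has_smiles_column(bad))   # False
```

For TSV input, pass delimiter="\t".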
GNINA/SMINA Binary Not Found
Symptom: docking fails before producing gnina_out.sdf or smina_out.sdf.
Cause: docking is enabled but the selected binary cannot be resolved from
PATH or from the explicit bin value in config_docking.yml.
Fix:
- Install the binary and make it available on PATH, or set an explicit path:
gnina_config:
  bin: /path/to/gnina
- For descriptor/filter-only runs, disable docking or run only the safe stages:
uv run hedgehog --stage descriptors --stage struct_filters --force-new
No Docking Output Found
Symptom: logs report that docking finished but no results were detected.
Check first:
- selected docking tool in config_docking.yml
- receptor and autobox/reference ligand paths
- generated scripts under stages/05_docking/_workdir/
- binary path and executable permissions
- auto_run setting
The expected final artifacts are tool-specific SDF files such as
stages/05_docking/gnina/gnina_out.sdf or
stages/05_docking/smina/smina_out.sdf.
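Two of these checks, binary resolution and output detection, can be scripted. The following is a minimal sketch rather than HEDGEHOG's own logic: resolve_binary and find_docking_outputs are hypothetical helpers, the run directory is an assumed location, and treating an empty SDF as missing is an assumption.

```python
import os
import shutil
import tempfile
from pathlib import Path
from typing import List, Optional


def resolve_binary(name: str, explicit_bin: Optional[str] = None) -> Optional[str]:
    """Return an executable path, preferring an explicit bin value over PATH."""
    if explicit_bin:
        path = Path(explicit_bin).expanduser()
        # The file must exist and carry the executable bit.
        return str(path) if path.is_file() and os.access(path, os.X_OK) else None
    return shutil.which(name)


def find_docking_outputs(run_dir: Path) -> List[Path]:
    """Return the tool-specific SDF files that exist and are non-empty."""
    candidates = [
        run_dir / "stages/05_docking/gnina/gnina_out.sdf",
        run_dir / "stages/05_docking/smina/smina_out.sdf",
    ]
    return [p for p in candidates if p.is_file() and p.stat().st_size > 0]


# Demo on a throwaway directory that mimics a run with GNINA output only.
with tempfile.TemporaryDirectory() as tmp:
    run_dir = Path(tmp)
    out = run_dir / "stages/05_docking/gnina/gnina_out.sdf"
    out.parent.mkdir(parents=True)
    out.write_text("demo\n$$$$\n")
    found = [p.name for p in find_docking_outputs(run_dir)]

print(found)  # ['gnina_out.sdf']
print(resolve_binary("gnina", explicit_bin="/nonexistent/gnina"))  # None
```

An empty result from find_docking_outputs together with a None from resolve_binary points at the binary; an empty result with a valid binary points at the receptor, ligand, or script paths instead.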
Docking Script Path Issues
Docking scripts generated by HEDGEHOG execute from a _workdir/ subdirectory inside the docking stage output, not from the stage directory itself. This causes path resolution issues if relative paths are used.
Three specific path bugs to watch for:
- Config reference: the docking config file lives in the pipeline’s root working directory. Use an absolute path or config_file.relative_to(ligands_dir) instead of just the file name.
- External prep output path argument: the output SDF path must be absolute, otherwise it resolves relative to _workdir/, creating nested _workdir/_workdir/ paths.
- External prep input path argument: same issue as above; the input CSV path must be absolute.
Symptoms: docking jobs fail with FileNotFoundError, or output files appear in unexpected nested directories.
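The nesting can be reproduced without running a docking job. The sketch below only models how a process resolves an argument against its working directory; resolve_from and the /runs/demo layout are hypothetical and exist purely for illustration.

```python
from pathlib import PurePosixPath


def resolve_from(cwd: PurePosixPath, arg: str) -> PurePosixPath:
    """Return the path a process running in `cwd` would open for `arg`."""
    path = PurePosixPath(arg)
    return path if path.is_absolute() else cwd / path


# Hypothetical run layout, used only for illustration.
stage_dir = PurePosixPath("/runs/demo/stages/05_docking")
workdir = stage_dir / "_workdir"

relative_arg = "_workdir/prepared.sdf"        # written relative to the stage directory
absolute_arg = str(stage_dir / relative_arg)  # the same file, as an absolute path

nested = resolve_from(workdir, relative_arg)
correct = resolve_from(workdir, absolute_arg)

print(nested)   # /runs/demo/stages/05_docking/_workdir/_workdir/prepared.sdf
print(correct)  # /runs/demo/stages/05_docking/_workdir/prepared.sdf
```

The relative argument picks up a second _workdir/ component because the script's working directory is already _workdir/.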
Fix: ensure all paths passed to docking scripts are absolute:
# Wrong: relative path breaks when cwd is _workdir/
script_args = [str(output_sdf)]
# Correct: always resolve to absolute
script_args = [str(output_sdf.resolve())]
Conda / Mamba Detection (GNINA)
If you configure GNINA to run inside an existing conda environment, HEDGEHOG may need to locate
conda.sh to activate that environment before launching GNINA. When conda_sh is not provided in
your docking configuration, HEDGEHOG auto-detects common conda installs by searching for these
directory names under the user’s home:
- miniforge
- miniconda3
- mambaforge
- anaconda3
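To see which conda.sh such a search would find on a given machine, the lookup can be approximated as follows. This is a sketch under stated assumptions, not HEDGEHOG's actual code: find_conda_sh is a hypothetical helper, and both the search order and the etc/profile.d/conda.sh location inside each directory are assumptions.

```python
import tempfile
from pathlib import Path
from typing import Optional

# Directory names searched under the home directory (order is an assumption).
CONDA_DIR_NAMES = ("miniforge", "miniconda3", "mambaforge", "anaconda3")


def find_conda_sh(home: Path, explicit: Optional[str] = None) -> Optional[Path]:
    """Return an explicit conda_sh if it exists, else the first match under `home`."""
    if explicit:
        path = Path(explicit).expanduser()
        return path if path.is_file() else None
    for name in CONDA_DIR_NAMES:
        candidate = home / name / "etc" / "profile.d" / "conda.sh"
        if candidate.is_file():
            return candidate
    return None


# Demo on a fake home directory that contains only a miniconda3 install.
with tempfile.TemporaryDirectory() as tmp:
    fake_home = Path(tmp)
    script = fake_home / "miniconda3" / "etc" / "profile.d" / "conda.sh"
    script.parent.mkdir(parents=True)
    script.touch()
    detected = find_conda_sh(fake_home)
    detected_name = detected.relative_to(fake_home).as_posix() if detected else None

print(detected_name)  # miniconda3/etc/profile.d/conda.sh
```

On a real machine, find_conda_sh(Path.home()) returning None suggests that conda_sh needs to be set explicitly.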
Symptom: docking (GNINA) stage fails with an error like conda not found or CondaError, or
GNINA cannot find required shared libraries after activation.
Fix: ensure one of the supported conda distributions is installed and its bin/ directory is on
your PATH, or set conda_sh explicitly to your .../etc/profile.d/conda.sh path.
# Verify conda is accessible
conda --version
# If using miniforge, conda.sh is typically here:
ls ~/miniforge/etc/profile.d/conda.sh
Autobox Coordinate Frame Mismatch
When using autobox docking (where the search box is derived from a reference ligand SDF), the reference ligand must be in the same coordinate frame as the receptor PDB file.
Symptom: docking scores are unreasonably poor, or all poses are placed far from the binding site.
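A quick sanity check for this symptom is to measure how far the reference ligand sits from the receptor. The sketch below is illustrative only: the helper names are hypothetical, ligand coordinates are assumed to come from any SDF reader, and what counts as "too far" is left to the reader.

```python
import math
from typing import List, Sequence, Tuple

Coord = Tuple[float, float, float]


def pdb_atom_coords(pdb_text: str) -> List[Coord]:
    """Read x, y, z from ATOM/HETATM records (fixed PDB columns 31-54)."""
    coords = []
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")):
            coords.append((float(line[30:38]), float(line[38:46]), float(line[46:54])))
    return coords


def centroid(coords: Sequence[Coord]) -> Coord:
    """Return the arithmetic mean position of a non-empty coordinate list."""
    n = len(coords)
    return (
        sum(c[0] for c in coords) / n,
        sum(c[1] for c in coords) / n,
        sum(c[2] for c in coords) / n,
    )


def ligand_to_receptor_distance(ligand: Sequence[Coord], receptor: Sequence[Coord]) -> float:
    """Distance from the ligand centroid to the nearest receptor atom, in angstroms."""
    center = centroid(ligand)
    return min(math.dist(center, atom) for atom in receptor)


# Synthetic one-atom receptor and two-atom ligand, for illustration only.
receptor_pdb = "ATOM  {:5d}  CA  ALA A{:4d}    {:8.3f}{:8.3f}{:8.3f}".format(1, 1, 10.0, 10.0, 10.0)
receptor = pdb_atom_coords(receptor_pdb)
ligand = [(10.0, 10.0, 13.0), (10.0, 10.0, 15.0)]

distance = ligand_to_receptor_distance(ligand, receptor)
print(f"{distance:.1f}")  # 4.0
```

A ligand that belongs in the pocket normally has receptor atoms within a few angstroms of its centroid; a distance of tens of angstroms suggests the two files are in different frames.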
Common cause: the reference ligand was extracted from a holo (ligand-bound) crystal structure, but the receptor PDB comes from a different structure, such as an apo (ligand-free) one, with its own coordinate frame. Even a small translation or rotation between crystal structures can shift the autobox away from the binding pocket.
Fix: superimpose the two structures so that the receptor and the reference ligand share one coordinate frame, or use a reference ligand from the same PDB file as the receptor:
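from Bio.PDB import PDBIO, PDBParser, Superimposer
parser = PDBParser(QUIET=True)
ref_struct = parser.get_structure("holo", "receptor_holo.pdb")
mobile_struct = parser.get_structure("apo", "receptor_apo.pdb")
# Index CA atoms by (chain ID, residue ID) so the two lists pair up one-to-one
def ca_atoms(structure):
    return {(res.get_parent().id, res.id): res["CA"] for res in structure[0].get_residues() if "CA" in res}
ref_ca, mobile_ca = ca_atoms(ref_struct), ca_atoms(mobile_struct)
shared = sorted(ref_ca.keys() & mobile_ca.keys())
# Superimpose on the shared CA atoms; this moves the apo structure into the holo frame
sup = Superimposer()
sup.set_atoms([ref_ca[k] for k in shared], [mobile_ca[k] for k in shared])
sup.apply(list(mobile_struct.get_atoms()))
io = PDBIO()
io.set_structure(mobile_struct)
io.save("receptor_apo_aligned.pdb")
External Tool Licensing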
The optional ligand preparation and protein preparation steps may use proprietary third-party tools. These often require a valid vendor license.
Symptom: pipeline fails at input preprocessing with license checkout failures or vendor-tool initialization errors.
Fix:
- Set the environment variable(s) required by your tool vendor to point to the installation directory:
export TOOL_HOME=/opt/proprietary_tools/suite2024-1
- Verify the license server is accessible:
$TOOL_HOME/licadmin STAT
- If you do not have a valid license, remove or clear the ligand_preparation_tool and protein_preparation_tool fields in config.yml. HEDGEHOG will skip external preprocessing and rely on the Mol Prep stage (Datamol standardization), which does not require a license:
# config.yml — disable external preparation tools
ligand_preparation_tool:
protein_preparation_tool:
Memory Issues with Large SDF Files
Docking stages load and process SDF files that can grow very large when docking thousands of molecules. This may cause out-of-memory errors, especially on systems with limited RAM.
Symptoms: the process is killed by the OS (Killed or OOM), or you see MemoryError in the log.
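Splitting the input into batches is one way to bound memory per run. The sketch below assumes a CSV with a header row, as in the input format HEDGEHOG expects; split_csv and the batch file naming are hypothetical, not part of HEDGEHOG.

```python
import csv
import tempfile
from pathlib import Path
from typing import List


def _write_batch(out_dir: Path, stem: str, index: int, header: List[str], rows: List[List[str]]) -> Path:
    """Write one batch file, repeating the header so each batch is a valid input."""
    path = out_dir / f"{stem}_batch{index:03d}.csv"
    with path.open("w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(header)
        writer.writerows(rows)
    return path


def split_csv(src: Path, out_dir: Path, batch_size: int) -> List[Path]:
    """Split `src` into consecutive CSV files of at most `batch_size` data rows."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    out_dir.mkdir(parents=True, exist_ok=True)
    written: List[Path] = []
    with src.open(newline="") as handle:
        reader = csv.reader(handle)
        header = next(reader, None)
        if header is None:  # empty file: nothing to split
            return written
        batch: List[List[str]] = []
        for row in reader:
            batch.append(row)
            if len(batch) == batch_size:
                written.append(_write_batch(out_dir, src.stem, len(written), header, batch))
                batch = []
        if batch:  # final, shorter batch
            written.append(_write_batch(out_dir, src.stem, len(written), header, batch))
    return written


# Demo: five molecules split into batches of two.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "molecules.csv"
    src.write_text("smiles,model_name\n" + "".join(f"C{'C' * i}O,demo\n" for i in range(5)))
    batches = split_csv(src, Path(tmp) / "batches", batch_size=2)
    batch_names = [p.name for p in batches]
    rows_per_batch = [len(p.read_text().splitlines()) - 1 for p in batches]

print(batch_names)     # ['molecules_batch000.csv', 'molecules_batch001.csv', 'molecules_batch002.csv']
print(rows_per_batch)  # [2, 2, 1]
```

Each batch file can then be passed to a separate pipeline run.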
Mitigations:
- Reduce input size: use the sample_size parameter in config.yml to cap the number of molecules entering the pipeline:
sample_size: 500 # Process at most 500 molecules
- Split input files: divide your input CSV into smaller batches and run the pipeline on each batch separately.
- Reduce parallelism: lower n_jobs to reduce peak memory from concurrent docking processes:
n_jobs: 8 # Default is often set to all cores
- Monitor memory: watch system memory during the docking stage:
# In a separate terminal
watch -n 5 free -h
AiZynthFinder Installation
The retrosynthesis stage requires AiZynthFinder. Install it with the built-in setup command.
Symptom: synthesis stage fails with ModuleNotFoundError: aizynthfinder or the retrosynthesis subprocess exits immediately.
Fix:
- Run setup from the project root:
uv run hedgehog setup aizynthfinder
This command installs the optional retrosynthesis extra into the project environment and downloads the required public data (model files and templates) into modules/aizynthfinder/.
- Verify the installation:
# Check that public data was downloaded
ls modules/aizynthfinder/public/
# Check that logging config exists
ls modules/aizynthfinder/aizynthfinder/data/
- If setup fails during dependency sync, verify your local uv installation, Python version compatibility (AiZynthFinder currently supports Python 3.10-3.12), and outbound package-index access.
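The same verification can be gathered into one report from Python. This is a rough sketch: check_aizynthfinder is a hypothetical helper, the directory layout is the one described above, and treating an empty directory as missing data is an assumption.

```python
import importlib.util
import sys
from pathlib import Path
from typing import Dict


def _non_empty_dir(path: Path) -> bool:
    """Return True if `path` is a directory containing at least one entry."""
    return path.is_dir() and any(path.iterdir())


def check_aizynthfinder(project_root: Path) -> Dict[str, bool]:
    """Collect the installation checks for the retrosynthesis stage."""
    base = project_root / "modules" / "aizynthfinder"
    return {
        # Supported interpreter range as stated in this guide.
        "python_3.10_to_3.12": (3, 10) <= sys.version_info[:2] <= (3, 12),
        # Whether the package can be found in the current environment.
        "package_importable": importlib.util.find_spec("aizynthfinder") is not None,
        "public_data_present": _non_empty_dir(base / "public"),
        "logging_config_dir_present": _non_empty_dir(base / "aizynthfinder" / "data"),
    }


# Run from the project root, inside the project environment.
for check, passed in check_aizynthfinder(Path.cwd()).items():
    print(f"{check}: {'ok' if passed else 'MISSING'}")
```

Run it with the project environment's interpreter (for example through uv run python), otherwise the package check reflects a different environment.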
Shepherd-Score on Python 3.13
Symptom: installing Shepherd dependencies fails on Python 3.13 with wheel or
open3d errors.
open3d (required by shepherd-score) does not publish wheels for some Python ABIs (notably cp313), so installing Shepherd dependencies in the main environment can fail.
The HEDGEHOG base install is decoupled from the Shepherd and legacy PoseCheck dependencies. For Shepherd, use an isolated worker environment:
uv run hedgehog setup shepherd-worker --yes
If the Shepherd backend is unavailable at runtime, docking filters soft-skip Shepherd and log a warning with setup instructions.
TUI Port Conflicts
The TUI (Text User Interface) uses a JSON-RPC protocol over stdio to communicate between the Node.js frontend and the Python backend. The backend is launched as a child process — it does not bind to a network port by default.
However, the Node.js development server (used during TUI development) binds to a local port. If that port is already in use, the TUI will fail to start.
Symptom: Error: listen EADDRINUSE when launching the TUI in development mode.
Fix:
- Find and kill the process using the port:
# Find what is using port 3000
lsof -i :3000
# Kill the process
kill <PID>
- For production use, launch the TUI via the CLI command, which uses the built bundle and stdio communication without binding any port:
uv run hedgehog tui
- If the TUI has not been built yet, the CLI will automatically run npm install && npm run build inside the tui/ directory before launching.
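Where lsof is not available, a short Python probe can tell whether the development port is taken. A minimal sketch: port_in_use is a hypothetical helper, it assumes the server listens on the IPv4 loopback address, and port 3000 is simply the port used in the example above.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        probe.settimeout(0.5)
        # connect_ex returns 0 on success instead of raising an exception.
        return probe.connect_ex((host, port)) == 0


print(port_in_use(3000))
```

The probe only reports that the port is occupied; identifying the owning process still requires a tool such as lsof.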
TUI Cancellation Is Not Immediate
Symptom: pressing c in the TUI requests cancellation, but an external
docking or synthesis command continues for a while.
Cause: cancellation is cooperative. The pipeline stops at safe checkpoints, and long-running external subprocesses may not terminate immediately.
Fix: wait for the current stage checkpoint and inspect the run log. If a separate external process must be stopped manually, identify it from the stage work directory or system process list before killing it.
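The delay follows from how cooperative cancellation works in general. The sketch below illustrates the pattern only and is not HEDGEHOG's implementation: run_stages and the stage callables are hypothetical, and the cancel flag is checked only between stages.

```python
import threading
from typing import Callable, List, Sequence, Tuple

Stage = Tuple[str, Callable[[], None]]


def run_stages(stages: Sequence[Stage], cancel: threading.Event) -> List[str]:
    """Run stages in order, checking the cancel flag only at stage boundaries."""
    completed: List[str] = []
    for name, run in stages:
        if cancel.is_set():  # safe checkpoint: nothing is half-written here
            break
        run()  # a stage that is already running is never interrupted
        completed.append(name)
    return completed


cancel = threading.Event()
stages = [
    ("descriptors", lambda: None),
    ("docking", cancel.set),      # cancellation is requested while docking runs
    ("synthesis", lambda: None),  # never starts
]

completed = run_stages(stages, cancel)
print(completed)  # ['descriptors', 'docking']
```

The docking stage still completes even though cancellation was requested while it was running; only the following stage is skipped.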
Environment-Specific Docking Config
The default docking configuration file (config_docking.yml) ships with
repository-relative example receptor and reference-ligand paths. Production runs
should replace them with target-specific files.
Symptom: docking stage fails immediately with FileNotFoundError for receptor PDB or reference ligand files.
Fix: copy the bundled configs into your project directory, pass the copied
master config with --config, and update receptor/reference ligand paths:
receptor_pdb: /path/to/your/receptor.pdb
gnina_config:
  autobox_ligand: /path/to/your/reference_ligand.sdf
Common Log Messages
| Log message | Meaning | Action |
|---|---|---|
| No data available for structural filters | The previous stage produced no output | Check that descriptors stage completed and produced molecules |
| Synthesis finished but no output file detected | AiZynthFinder did not write results | Check AiZynthFinder installation and that model/public data files exist |
| Docking finished but no results detected in output directories | Neither SMINA nor GNINA produced output | Check binary paths and receptor PDB |
| No molecules left after synthesis | All molecules were filtered out | Relax synthesis thresholds in config_synthesis.yml |
| Pipeline completed with failures | At least one enabled stage did not complete | Review the per-stage status in the log output |
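For long logs, the table can be turned into a small lookup that scans a run log for the known messages. This is a sketch: suggest_actions is a hypothetical helper, the messages and actions are copied from the table, and plain substring matching is an assumption about how the messages appear in the log.

```python
from typing import List, Tuple

# Message -> suggested action, copied from the table above.
KNOWN_MESSAGES = {
    "No data available for structural filters":
        "Check that descriptors stage completed and produced molecules",
    "Synthesis finished but no output file detected":
        "Check AiZynthFinder installation and that model/public data files exist",
    "Docking finished but no results detected in output directories":
        "Check binary paths and receptor PDB",
    "No molecules left after synthesis":
        "Relax synthesis thresholds in config_synthesis.yml",
    "Pipeline completed with failures":
        "Review the per-stage status in the log output",
}


def suggest_actions(log_text: str) -> List[Tuple[str, str]]:
    """Return (message, action) pairs for every known message found in the log."""
    return [(msg, action) for msg, action in KNOWN_MESSAGES.items() if msg in log_text]


# Made-up log excerpt, for illustration only.
sample_log = (
    "WARNING Docking finished but no results detected in output directories\n"
    "ERROR Pipeline completed with failures\n"
)

for message, action in suggest_actions(sample_log):
    print(f"{message} -> {action}")
```

Results come back in table order, not in the order the messages appear in the log.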