If you’ve ever worked with protein simulation pipelines, you know that preparing a clean structure is foundational—but incredibly repetitive when dealing with large datasets. Whether it’s removing water molecules, adding hydrogens, or fixing alternate locations, these tasks consume time that could be better spent interpreting results or refining hypotheses.
This is where SAMSON’s Batch Protein Prepare extension can help. If you need to process dozens—or hundreds—of PDB files consistently, this tool automates tedious cleanup tasks so you don’t have to.
Why batch preparation matters 👩‍🔬
Large-scale studies increasingly require consistent preprocessing of multiple structures—whether for docking, molecular dynamics, or computational screening. Doing it manually invites inconsistency and introduces potential errors. The Batch Protein Prepare extension helps eliminate variability by applying the same preparation pipeline to every structure.
So, what can Batch Protein Prepare do?
With this extension, you can:
- Run preparation on a whole folder containing structures in PDB, PDBx/mmCIF, MMTF, or even MOL2 format.
- Preserve the internal folder structure: the outputs are organized to mirror your input directory.
- Automatically download structures using a list of PDB codes you paste or load from a text file.
- Apply the same preparation steps you use in Home > Prepare: strip ligands, water, and ions; add hydrogens; fix alternate locations; and more.
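To make the "mirrored output folder" idea concrete, here is a minimal Python sketch of how such a plan could be computed. It is not SAMSON's implementation: the function name `plan_outputs` and the file extensions in `SUPPORTED_SUFFIXES` are assumptions chosen for illustration.

```python
from pathlib import Path

# Assumed file extensions for the formats listed above (not taken from SAMSON).
SUPPORTED_SUFFIXES = {".pdb", ".cif", ".mmtf", ".mol2"}


def plan_outputs(input_root: Path, output_root: Path) -> dict[Path, Path]:
    """Map each supported structure file to an output path that mirrors its subfolder."""
    plan = {}
    for path in sorted(input_root.rglob("*")):
        if path.is_file() and path.suffix.lower() in SUPPORTED_SUFFIXES:
            # relative_to() keeps the sub-directory layout under the new root.
            plan[path] = output_root / path.relative_to(input_root)
    return plan
```

With this layout, a file at `input/kinases/1abc.pdb` would map to `output/kinases/1abc.pdb`, and unrelated files such as notes or spreadsheets are skipped.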
How it works
The extension is straightforward to set up. After installing it from the SAMSON Extension store, you launch it via the interface and specify either a folder or a list of PDB identifiers. Then, you choose your preparation options—e.g., whether to remove certain molecules or add hydrogen atoms. Once done, hit “Start” and let the tool do the heavy lifting.
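If you keep your PDB identifiers in a text file, it can help to sanity-check the list before pasting it in. The sketch below is a standalone Python helper, not part of SAMSON. It assumes classic four-character PDB codes (a digit from 1 to 9 followed by three alphanumerics). The `download_url` helper assumes the public RCSB file server, which may differ from the source SAMSON actually uses.

```python
import re

# Classic PDB identifiers: a digit 1-9 followed by three alphanumeric characters.
PDB_ID = re.compile(r"[1-9][A-Za-z0-9]{3}")


def parse_pdb_codes(text: str) -> list[str]:
    """Extract unique, upper-cased PDB codes from pasted text, preserving order."""
    codes = []
    for token in re.split(r"[\s,;]+", text.strip()):
        if PDB_ID.fullmatch(token) and token.upper() not in codes:
            codes.append(token.upper())
    return codes


def download_url(code: str) -> str:
    # Assumption: the public RCSB download endpoint; SAMSON may fetch from elsewhere.
    return f"https://files.rcsb.org/download/{code}.pdb"
```

Tokens that do not look like PDB codes are silently dropped here; in a real workflow you might prefer to report them so typos do not go unnoticed.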
This batch mode is particularly useful when:
- You’re preparing curated datasets for virtual screening.
- You want to test a pipeline across hundreds of protein targets.
- You’re working in education or training and need consistent examples for a class.
A small but important tip 📝
If consistency matters to your workflow, batch mode is safer than manual preparation. One missed step in an individual file can cause downstream simulation crashes or skew analysis results.
It’s also easy to track what’s been done: SAMSON writes cleaned versions and leaves your original files untouched, so the process is both safe and reversible.
From high-throughput workflows to reproducible computational studies, SAMSON’s Batch Protein Prepare helps keep structure preparation from becoming a bottleneck.
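To illustrate what "clean copy, untouched original" means in practice, here is a deliberately simplified Python sketch that works on PDB text only. It is not how SAMSON prepares structures. The function names are hypothetical, and the alternate-location rule (keep blank or `A` conformers) is a simplifying assumption; real tools typically choose conformers by occupancy. The sketch relies on the fixed-column PDB layout, where column 17 holds the alternate location and columns 18 to 20 hold the residue name.

```python
from pathlib import Path


def clean_pdb_lines(lines: list[str]) -> list[str]:
    """Drop water records and keep only blank or 'A' alternate locations."""
    kept = []
    for line in lines:
        if line.startswith(("ATOM", "HETATM")):
            if line[17:20] == "HOH":
                continue  # water residue (columns 18-20)
            if line[16:17] not in ("", " ", "A"):
                continue  # simplification: discard conformers other than 'A'
        kept.append(line)
    return kept


def prepare_copy(source: Path, destination: Path) -> None:
    """Write a cleaned copy to a separate path; the source file is only read."""
    cleaned = clean_pdb_lines(source.read_text().splitlines())
    destination.parent.mkdir(parents=True, exist_ok=True)
    destination.write_text("\n".join(cleaned) + "\n")
```

Because the output always goes to a separate path, rerunning the preparation with different options only requires deleting the output folder. Note that the sketch leaves the `A` marker in column 17 rather than blanking it.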
To dive deeper into how this extension works and what else you can do during batch processing, visit the full documentation: https://documentation.samson-connect.net/tutorials/prepare-protein/prepare-protein/
SAMSON and all SAMSON Extensions are free for non-commercial use. You can get SAMSON at https://www.samson-connect.net.