Cleaning Hundreds of Protein Files Just Got Easier

If you’ve ever tried preparing more than a handful of PDB files for simulations or docking, you know the pain: cleaning structures one by one, checking for missing atoms, removing waters, renaming residues… It’s a tedious, time-consuming task, and one that is hard to keep consistent when done by hand.

Luckily, SAMSON offers a less painful route for molecular modelers looking to prepare protein structures in batches. Whether you’re screening a set of targets or running simulations on multiple models, the Batch Protein Prepare extension simplifies this entire process.

Why Batch Preparation Matters

In any high-throughput computational pipeline, consistency and time are critical. Manual errors, such as forgetting to strip water molecules or overlooking a low-occupancy alternate location, can derail a workflow or invalidate results. On top of that, preparing large datasets means repeating the same operations again and again. Batch processing addresses both problems: every structure goes through the same steps, and you only set them up once.

Instead of going through the Home > Prepare interface for each file, the Batch Protein Prepare extension lets you:

  • Automatically download structures from PDB using identifiers
  • Process entire directories of PDB, mmCIF, MOL2, or MMTF files
  • Apply filtering and protonation steps uniformly across all inputs

It’s particularly handy when your inputs come from a mix of sources: local files, downloaded codes, or data from collaborators.

How It Works

To start batch preparation, install the Batch Protein Prepare extension from SAMSON Connect.

Once installed, you have two key options:

  1. Prepare existing structures: Select a folder containing PDB files (or files in other supported formats), and the extension processes each one. The output preserves the original folder hierarchy for traceability.
  2. Download and prepare by PDB code: Enter PDB identifiers directly as a string or load them from a text file. The tool fetches each structure, cleans it, and saves the output.
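
To make the two input modes more concrete, here is a minimal Python sketch of the bookkeeping they involve. It is not the extension's code and does not use the SAMSON API. The function names (parse_pdb_codes, download_url, mirrored_output_path) are hypothetical, the identifier check only covers classic four-character PDB codes, and the URL pattern is the public RCSB file download endpoint.

```python
import re
from pathlib import Path

# Public RCSB endpoint for PDB-format files (an assumption about where
# structures would come from; the extension may use a different source).
RCSB_URL = "https://files.rcsb.org/download/{code}.pdb"

# Format check only: a digit followed by three alphanumeric characters.
PDB_ID = re.compile(r"^[0-9][A-Za-z0-9]{3}$")


def parse_pdb_codes(text):
    """Extract unique PDB codes from free-form text, uppercased, in input order.

    Tokens may be separated by commas, semicolons, or any whitespace,
    so the same function works for a typed string or a text file's contents.
    """
    codes = []
    for token in re.split(r"[,\s;]+", text.strip()):
        if PDB_ID.match(token) and token.upper() not in codes:
            codes.append(token.upper())
    return codes


def download_url(code):
    """Build the download URL for one PDB code."""
    return RCSB_URL.format(code=code.upper())


def mirrored_output_path(input_file, input_root, output_root):
    """Map an input file to the same relative location under output_root."""
    relative = Path(input_file).relative_to(input_root)
    return Path(output_root) / relative
```

For a text file of identifiers, you could call parse_pdb_codes(Path("codes.txt").read_text()), where codes.txt is a placeholder name. Mirroring the relative path is what keeps every prepared file traceable to its source.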

Cleaning steps include:

  • Removing alternate locations
  • Deleting unnecessary ligands, cofactors, and solvents
  • Clearing ions
  • Adding hydrogens based on residue type
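
For intuition, the first three cleaning steps can be sketched on plain PDB-format text in a few lines of Python. This is a simplified illustration, not how the extension works internally: the function name clean_pdb_lines is hypothetical, it keeps conformer "A" instead of choosing by occupancy, it drops every HETATM record (which would also remove modified residues such as MSE), and it does not add hydrogens.

```python
# Record types retained in the cleaned output. Everything else (HETATM,
# ANISOU, CONECT, header records, ...) is dropped in this simplified sketch.
KEPT_RECORDS = {"ATOM", "TER", "MODEL", "ENDMDL", "END"}


def clean_pdb_lines(lines, keep_altloc="A"):
    """Return a cleaned copy of PDB-format lines.

    - HETATM records (ligands, cofactors, waters, ions) are removed.
    - For ATOM records, only atoms with no alternate location flag or with
      the flag `keep_altloc` are kept, and the flag is blanked.
    """
    cleaned = []
    for line in lines:
        record = line[:6].strip()  # record name occupies columns 1-6
        if record not in KEPT_RECORDS:
            continue
        if record == "ATOM" and len(line) > 16:
            altloc = line[16]  # alternate location indicator, column 17
            if altloc not in (" ", keep_altloc):
                continue
            line = line[:16] + " " + line[17:]
        cleaned.append(line)
    return cleaned
```

Applied to the lines of every file in a folder, a function like this gives each structure the same treatment, which is the consistency batch preparation is meant to provide.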

This makes it significantly easier to prepare data for batch docking, simulations, or modeling workflows.

Who Should Use This

If you work with structure datasets, whether for virtual screening, comparative modeling, or database curation, Batch Protein Prepare is for you. It’s particularly helpful for research groups that deal with multi-model systems, variant analysis, or pipeline automation.

Final Thoughts

This extension doesn’t just save time. It reduces manual errors and helps maintain data consistency across your simulation or analysis pipeline. So the next time you groan at the sight of a folder with 300 PDB files, remember there’s a button for that.

To learn more about the full range of tools for protein preparation and validation in SAMSON, visit the documentation page: https://documentation.samson-connect.net/tutorials/prepare-protein/prepare-protein/

SAMSON and all SAMSON Extensions are free for non-commercial use. You can get SAMSON at https://www.samson-connect.net.
