When Downloaded Structures Aren’t Ready: Dealing with Missing Atoms and Residues

Working with publicly available protein structures can feel like opening a surprise package. Sometimes, it’s all there: chains complete, atoms in place, ready for simulation. More often, though, it’s not. Missing heavy atoms, absent side chains, incomplete backbones, and uncertain protonation states are not just inconvenient; they can silently disrupt energy minimizations, docking runs, or molecular dynamics simulations. 🧪

So, what’s the best way to fill these holes without switching tools or writing extensive scripts? If you’re using SAMSON, there’s a practical solution built right in: the PDBFixer extension.

Fixing deeper issues with PDBFixer

Integrated directly with SAMSON, the PDBFixer extension uses the PDBFixer Python package to identify and resolve common structure problems before they cascade into hard-to-diagnose failures.

Common problems it addresses:

  • Missing atoms or side chains: Common in X-ray structures, particularly in flexible regions. PDBFixer will rebuild these automatically.
  • Missing residues: Residues that are listed in SEQRES but absent from the ATOM records. The extension adds them, building missing loops where needed.
  • Non-standard residues: Converts modified residues to their closest standard equivalents, useful when simulations require standard residue types.
  • No hydrogens: Protonates structures based on your specified pH — especially useful when working with pH-dependent interactions.
  • Solvent and membrane setup: Can add an explicit water box around your protein, let you choose ion types, and even embed membrane proteins in a lipid environment.
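
For readers curious about what happens under the hood, the steps above map roughly onto calls in the open-source PDBFixer Python package. The sketch below is an illustration under assumptions, not the extension's actual code: the helper names repair and fix_file, the default pH of 7.0, and the file paths are hypothetical, and the options SAMSON actually passes may differ.

```python
def repair(fixer, ph=7.0, remove_heterogens=False, keep_water=True):
    """Run the usual PDBFixer repair steps, in order, on a PDBFixer-like object."""
    fixer.findMissingResidues()         # residues in SEQRES but not in the model
    fixer.findNonstandardResidues()     # locate modified residues
    fixer.replaceNonstandardResidues()  # swap them for standard equivalents
    if remove_heterogens:
        # Optional: drop ligands and ions, keeping or dropping water.
        fixer.removeHeterogens(keepWater=keep_water)
    fixer.findMissingAtoms()            # must come after findMissingResidues
    fixer.addMissingAtoms()             # rebuild heavy atoms and missing residues
    fixer.addMissingHydrogens(ph)       # protonate at the chosen pH
    return fixer


def fix_file(path_in, path_out, ph=7.0):
    """Hypothetical wrapper: load a PDB file, repair it, write the result."""
    # Local imports, so this module loads even where pdbfixer/openmm are absent.
    from pdbfixer import PDBFixer
    from openmm.app import PDBFile

    fixer = repair(PDBFixer(filename=str(path_in)), ph=ph)
    with open(path_out, "w") as handle:
        PDBFile.writeFile(fixer.topology, fixer.positions, handle, keepIds=True)
```

With pdbfixer and openmm installed, a call such as fix_file("input.pdb", "output.pdb") would write a repaired copy of the structure.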

All this is accessible visually and interactively from within SAMSON. No scripting, no command-line tools, and no switching between programs.

Batch-mode support

The PDBFixer extension also works in batch mode — which is invaluable when preparing libraries of proteins from large datasets. Whether imported PDBs contain inconsistencies or just need consistent preparation settings (e.g., identical pH and solvent conditions), you can apply the same process across dozens or hundreds of files. Perfect for systematic studies or training datasets.
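
Outside SAMSON, the same idea can be sketched in a few lines of Python. This is a minimal illustration, not how the extension implements batch mode: batch_prepare, the "_fixed" suffix, and the fix_one callable (any function taking a source and a destination path) are hypothetical names chosen for the example.

```python
from pathlib import Path


def batch_prepare(input_dir, output_dir, fix_one, pattern="*.pdb"):
    """Apply fix_one(src, dst) to every matching file with identical settings.

    Failures are collected instead of aborting the run, so one bad
    structure does not block the rest of the dataset.
    """
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    done, failed = [], {}
    for src in sorted(Path(input_dir).glob(pattern)):
        dst = out / f"{src.stem}_fixed.pdb"
        try:
            fix_one(src, dst)
            done.append(dst)
        except Exception as err:  # record the problem and keep going
            failed[src.name] = str(err)
    return done, failed
```

Returning the failures alongside the successes makes it easy to review which structures need manual attention after the run.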

This extension avoids over-manipulation too: alternate conformations are resolved by keeping the higher occupancy atoms, and optional solvent/ligand/ion removal is fully controllable.

Example: PDBFixer in SAMSON

[Figure: the PDBFixer dialog]

When to use it?

Use the PDBFixer extension any time you:

  • Load a protein and see missing sections or gaps
  • Want to simulate a protein that lacks hydrogens at physiological pH
  • Are prepping a batch of structures for docking, dynamics, or ML workflows
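
Gaps are not always visible at a glance. As a rough pre-check, jumps in residue numbering can be spotted from the fixed columns of the PDB ATOM records (chain ID in column 22, residue number in columns 23 to 26). The function name find_numbering_gaps is hypothetical, and the check is only a heuristic: it ignores insertion codes, and a numbering jump does not always mean residues are missing.

```python
def find_numbering_gaps(pdb_text):
    """Return (chain, last_seen, next_seen) for each jump in residue numbering."""
    gaps = []
    last = {}  # chain ID -> last residue number seen on that chain
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM") or len(line) < 26:
            continue
        chain = line[21]          # column 22: chain identifier
        resnum = int(line[22:26])  # columns 23-26: residue sequence number
        prev = last.get(chain)
        if prev is not None and resnum - prev > 1:
            gaps.append((chain, prev, resnum))
        last[chain] = resnum
    return gaps
```

Called on the text of a PDB file, it returns an empty list when the numbering of every chain is contiguous.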

It’s not just about completing structures — it’s about reducing uncertainty before you simulate. Fewer early errors, less debugging, more time spent modeling.

To learn more about protein preparation and validation in SAMSON, visit the original documentation page: https://documentation.samson-connect.net/tutorials/prepare-protein/prepare-protein/.

SAMSON and all SAMSON Extensions are free for non-commercial use. You can download SAMSON at https://www.samson-connect.net.
