Anyone who has prepared protein systems for molecular dynamics (MD) simulations knows how tricky it can be to clean up your structure without deleting something important. A common pain point: selectively removing crystal waters that are outside of the active site, while keeping those that might be functionally relevant. This blog post walks you through a targeted method available in the GROMACS Wizard within the SAMSON platform to help you do just that.
Why bother?
Crystal structures from the Protein Data Bank (PDB) often contain dozens or even hundreds of water molecules. Many of these have no specific role in the protein's function, but some do, especially those located in the active site. Deleting all waters indiscriminately could lead to a non-functional model, while keeping all of them may affect the solvated system and slow down the simulation. A practical solution is selective deletion.
The feature: Expand selection > Advanced
SAMSON offers a visual and customizable way to isolate only the crystal water molecules that lie outside of a chosen zone—usually defined around your active site. Here’s how:
- First, select atoms, residues, or structures that define your active site. This could include ligands, catalytic residues, or active-site waters you know you want to keep.
- Right-click on your selection in either the Document view or the Viewport. Choose Expand selection > Advanced.
- In the pop-up, set the Node Type to Water.
- Choose “beyond” and set a distance, e.g., 5 Å. Only waters farther than this distance from your selected active-site structures will be selected.
- Enable Auto-update to preview the resulting selection. Adjust the distance if needed.
- Click OK once satisfied, then verify the highlighted waters.
- Finally, right-click the selection and choose Erase selection.
Because the selection is based on spatial proximity, this method keeps the waters that lie close to your protein’s active site, which are the ones most likely to be tightly bound or functionally relevant, and removes only those farther away. It’s a simple yet powerful way to clean your system intelligently.
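To make the spatial rule behind the “beyond” option concrete, here is a minimal Python sketch of the same idea outside SAMSON. It is not SAMSON's API or its actual implementation: the function name `waters_beyond`, the coordinates, and the choice to represent each water by its oxygen position are all assumptions made for illustration.

```python
import math

def waters_beyond(water_oxygens, site_atoms, cutoff=5.0):
    """Return indices of waters farther than `cutoff` (in Å) from every site atom.

    Assumption: each water is represented by its oxygen coordinates only.
    """
    selected = []
    for i, water in enumerate(water_oxygens):
        # Distance from this water to the closest active-site atom
        nearest = min(math.dist(water, atom) for atom in site_atoms)
        if nearest > cutoff:
            selected.append(i)
    return selected

# Hypothetical coordinates (Å), for illustration only
site_atoms = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
water_oxygens = [
    (2.0, 1.0, 0.0),     # close to the site: kept
    (6.0, 0.0, 0.0),     # 4.5 Å from the second site atom: kept
    (0.0, 9.0, 0.0),     # beyond 5 Å: selected for deletion
    (10.0, 10.0, 10.0),  # beyond 5 Å: selected for deletion
]

to_erase = waters_beyond(water_oxygens, site_atoms, cutoff=5.0)
print(to_erase)  # [2, 3]
```

Note that the second water is kept even though it is 6 Å from the first site atom: what matters is the distance to the nearest atom in the selection, which is why including all relevant ligands and residues in the initial selection makes a difference.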

A flexible workflow
The real benefit of this approach is flexibility. Rather than relying on fixed residue ranges or PDB chain information, you’re using spatial logic—something that better reflects the biochemical context of your structure. And because the workflow is visual and interactively updated, it’s easy to experiment and refine.
For researchers preparing simulations of enzymes, membrane proteins, or any system where specific waters matter (e.g., structural waters or those involved in ligand binding), this feature can save hours of manual editing and guesswork.
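If you want a feel for how sensitive the result is to the chosen distance before committing to a value, the same spatial logic can be scanned over several cutoffs. The sketch below is again a standalone illustration with hypothetical coordinates and a hypothetical helper name (`count_selected_by_cutoff`), not something provided by SAMSON or GROMACS Wizard.

```python
import math

def count_selected_by_cutoff(water_oxygens, site_atoms, cutoffs):
    """Map each cutoff (Å) to the number of waters lying beyond it."""
    # Nearest site-atom distance for each water, computed once
    nearest = [
        min(math.dist(water, atom) for atom in site_atoms)
        for water in water_oxygens
    ]
    return {c: sum(d > c for d in nearest) for c in cutoffs}

# Hypothetical coordinates (Å), for illustration only
site_atoms = [(0.0, 0.0, 0.0)]
water_oxygens = [(3.0, 0.0, 0.0), (0.0, 4.5, 0.0), (0.0, 0.0, 6.0), (8.0, 0.0, 0.0)]

counts = count_selected_by_cutoff(water_oxygens, site_atoms, [3.5, 5.0, 7.0])
for cutoff, n in counts.items():
    print(f"beyond {cutoff} Å: {n} waters would be selected")
```

A larger cutoff keeps more waters around the active site and selects fewer for deletion, which mirrors what you see when adjusting the distance with Auto-update enabled.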
Want to learn more about system pre-processing in GROMACS Wizard? You can check out the full tutorial here: https://documentation.samson-connect.net/tutorials/gromacs-wizard/preprocess/
SAMSON and all SAMSON Extensions are free for non-commercial use. You can download SAMSON at https://www.samson-connect.net.
