Filtering Molecular Models by Atom Counts: A Quick Guide for Structural Modelers

When working with large molecular datasets, one of the most common challenges modelers face is filtering out molecules based on their structural complexity. Whether you’re curating a dataset, setting up a simulation, or organizing your models, figuring out how to quickly filter structures by the number of atoms, chains, residues or molecular groups can save precious time.

In SAMSON, the Node Specification Language (NSL) provides a powerful and concise way to do just that. More specifically, within the structuralModel attribute space (short name: sm), you can define filters based on quantitative attributes such as the number of atoms, molecules, chains, and residues. This post explains how.

Common Use Cases

  • You want to visualize only models with fewer than 200 atoms to ensure rendering performance.
  • You’re analyzing proteins and need to extract all structures with more than 3 chains.
  • You’re running simulations where only molecules with 5 to 10 residues are valid inputs.

Filtering by Number of Atoms

Start with atom count, one of the most commonly used criteria. A less-than comparison against 200 on the model's atom count selects all structural models with fewer than 200 atoms. For a specific range, the min:max form matches models with 100 to 200 atoms, inclusive.
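To make the difference between the two forms concrete, here is a minimal Python sketch of the semantics described above. It is not SAMSON's API: the ModelSummary class, its n_atoms field, and the sample data are hypothetical stand-ins for models in a document. It shows a strict less-than threshold next to an inclusive range.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelSummary:
    """Hypothetical stand-in for a structural model; not a SAMSON type."""
    name: str
    n_atoms: int


def fewer_atoms_than(models, limit):
    """Keep models whose atom count is strictly below `limit`."""
    return [m for m in models if m.n_atoms < limit]


def atoms_in_range(models, low, high):
    """Keep models whose atom count lies in [low, high], both ends included."""
    return [m for m in models if low <= m.n_atoms <= high]


# Made-up sample data for illustration only.
models = [
    ModelSummary("ligand", 42),
    ModelSummary("peptide", 100),
    ModelSummary("small protein", 200),
    ModelSummary("complex", 4500),
]

small = [m.name for m in fewer_atoms_than(models, 200)]
mid_sized = [m.name for m in atoms_in_range(models, 100, 200)]
print(small)      # ['ligand', 'peptide']
print(mid_sized)  # ['peptide', 'small protein']
```

Note that the 200-atom model is excluded by the threshold but included by the range, since the range counts both of its bounds.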

Filtering by Molecules, Chains, and Residues

You can also focus on the structural organization at the molecular and biomolecular level:

  • sm.nm < 3: Fewer than 3 molecules
  • sm.nc 2:4: Between 2 and 4 chains
  • sm.nr > 130: More than 130 residues

By combining these expressions, you can rapidly home in on exactly the structures you need for your task.
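The three expressions listed above can be read as simple predicates on per-model counts. The following toy Python evaluator is only a sketch of that reading: it handles just the three forms shown in the list, it is not SAMSON's parser, and it assumes that the range form includes both bounds, as stated for atom counts. The counts dictionary is made-up data.

```python
import re

# Matches "sm.<attr> <op> <int>" or "sm.<attr> <low>:<high>".
_PATTERN = re.compile(
    r"^sm\.(?P<attr>\w+)\s+"
    r"(?:(?P<op>[<>])\s*(?P<value>\d+)|(?P<low>\d+):(?P<high>\d+))$"
)


def matches(expression, counts):
    """Evaluate one toy expression against a dict of attribute counts."""
    m = _PATTERN.match(expression.strip())
    if m is None:
        raise ValueError(f"unsupported expression: {expression!r}")
    actual = counts[m["attr"]]
    if m["op"] == "<":
        return actual < int(m["value"])
    if m["op"] == ">":
        return actual > int(m["value"])
    # Range form: both bounds are included (assumption, see lead-in).
    return int(m["low"]) <= actual <= int(m["high"])


# Hypothetical counts for one model: 2 molecules, 3 chains, 150 residues.
counts = {"nm": 2, "nc": 3, "nr": 150}

expressions = ("sm.nm < 3", "sm.nc 2:4", "sm.nr > 130")
results = {e: matches(e, counts) for e in expressions}
print(results)
```

For this sample model all three expressions evaluate to true, so it would pass each filter individually.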

Element-Specific Atom Counts (Carbons, Hydrogens, etc.)

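Filtering models by the number of atoms of a specific element, such as carbon or hydrogen, is easy too: a comparison on a per-element count works just like the atom-count filters above.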
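As an illustration of what a per-element filter computes, here is a small Python sketch using the standard library's collections.Counter. The models dictionary and the with_more_than helper are hypothetical and unrelated to SAMSON's API; they only show the idea of counting atoms by element and comparing the count to a threshold.

```python
from collections import Counter

# Hypothetical element symbols for the atoms of two made-up models.
models = {
    "ethanol": ["C", "C", "O", "H", "H", "H", "H", "H", "H"],
    "water": ["O", "H", "H"],
}


def element_counts(symbols):
    """Count how many atoms of each element a model contains."""
    return Counter(symbols)


def with_more_than(models, element, minimum):
    """Names of models with strictly more than `minimum` atoms of `element`."""
    return [
        name
        for name, symbols in models.items()
        if element_counts(symbols)[element] > minimum
    ]


carbon_rich = with_more_than(models, "C", 1)
hydrogenated = with_more_than(models, "H", 1)
print(carbon_rich)   # ['ethanol']
print(hydrogenated)  # ['ethanol', 'water']
```

A Counter returns 0 for elements that do not occur, so a model without any carbon simply fails a "more than N carbons" filter instead of raising an error.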

Advanced Combinations

You can chain multiple conditions as needed. For example, a single expression can match models with fewer than 200 atoms, more than 5 carbon atoms, and exactly one chain.
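The combined filter described above amounts to a logical AND over three conditions. A minimal Python sketch of that logic follows; the ModelCounts type, its field names, and the sample models are hypothetical and only serve to illustrate the combination, not how SAMSON evaluates NSL.

```python
from typing import NamedTuple


class ModelCounts(NamedTuple):
    """Hypothetical per-model summary; not a SAMSON type."""
    name: str
    atoms: int
    carbons: int
    chains: int


# One predicate per condition from the text.
CONDITIONS = (
    lambda m: m.atoms < 200,   # fewer than 200 atoms
    lambda m: m.carbons > 5,   # more than 5 carbon atoms
    lambda m: m.chains == 1,   # exactly one chain
)


def select(models):
    """Keep models that satisfy every condition (logical AND)."""
    return [m for m in models if all(cond(m) for cond in CONDITIONS)]


# Made-up sample data for illustration only.
models = [
    ModelCounts("short peptide", atoms=120, carbons=40, chains=1),
    ModelCounts("dimer", atoms=180, carbons=60, chains=2),
    ModelCounts("ion cluster", atoms=30, carbons=0, chains=1),
    ModelCounts("large protein", atoms=5000, carbons=1600, chains=1),
]

selected = [m.name for m in select(models)]
print(selected)  # ['short peptide']
```

Each rejected model fails exactly one condition here (chain count, carbon count, and atom count respectively), which shows how every added condition narrows the selection.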

Why This Matters

Manually checking each model for complexity, atom types, or structural segments doesn’t scale. NSL filtering in SAMSON makes this task efficient, especially when dealing with hundreds or thousands of structures.

Every time you define dataset inputs, filter out anomalies, or focus on a certain type of connectivity, NSL becomes an invaluable tool. Once you get used to the syntax, it becomes second nature to query models precisely and consistently.

Learn more in the full documentation here: https://documentation.samson-connect.net/users/latest/nsl/structuralModel/

SAMSON and all SAMSON Extensions are free for non-commercial use. You can download SAMSON at https://www.samson-connect.net.
