Selective Insights: Finding Molecules by Atom Count in SAMSON

When working with complex molecular systems, narrowing down the parts of your model that matter can be a huge time saver. Whether you’re screening large molecular databases, refining simulation systems, or preparing focused data sets, the ability to filter molecules by specific attributes helps streamline your workflow.

One common need? Isolating molecules based on the number of atoms they contain.

Why Filter by Atom Count?

Large molecular models often include a wide range of molecule sizes, from small ligands to large biomolecular assemblies. Let’s say you want to:

  • Analyze only small molecules (e.g., with fewer than 100 atoms)
  • Exclude larger biomolecules like proteins or polymers
  • Focus on a specific size range for docking or visualization

Without an efficient filtering tool, you would have to inspect the model molecule by molecule, which quickly becomes tedious.

Using mol.nat in SAMSON’s Node Specification Language (NSL)

SAMSON provides a flexible way to filter molecules using its Node Specification Language (NSL). To target molecules based on their atom counts, the attribute to use is numberOfAtoms, or just nat in shorthand.

In NSL syntax, this is accessed through the molecule attribute space, so you’d write:

mol.nat

Here are some examples of how to use it:

  • mol.nat < 100 – selects molecules with fewer than 100 atoms
  • mol.nat 200:500 – selects molecules with atom counts between 200 and 500
  • not mol.nat > 1000 – selects molecules with 1000 atoms or fewer, which excludes the larger ones
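
To make the meaning of these three expressions concrete, here is a small Python sketch that mimics them on plain data. It does not use SAMSON or its API: the Molecule record and the sample values are hypothetical, and treating the 200:500 range as inclusive at both ends is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Molecule:
    """Hypothetical stand-in for a molecule node (not a SAMSON class)."""
    name: str
    atom_count: int


molecules = [
    Molecule("water", 3),
    Molecule("ligand", 42),
    Molecule("peptide", 350),
    Molecule("protein", 4800),
]

# Mimics: mol.nat < 100
small = [m.name for m in molecules if m.atom_count < 100]

# Mimics: mol.nat 200:500 (bounds assumed inclusive)
mid_sized = [m.name for m in molecules if 200 <= m.atom_count <= 500]

# Mimics: not mol.nat > 1000
not_large = [m.name for m in molecules if not (m.atom_count > 1000)]

print(small)
print(mid_sized)
print(not_large)
```

Note that the last filter keeps every molecule with 1000 atoms or fewer, so it overlaps with the first two.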

Combining with Other Filters

What makes NSL filtering especially powerful is its compositional syntax. You can combine mol.nat with other attributes such as:

  • mol.nH – number of hydrogens
  • mol.nC – number of carbon atoms
  • mol.name – molecule names (strings, with optional wildcards)

For instance:

mol.nat < 100 and mol.name "Ligand*"

This would return all molecules with fewer than 100 atoms whose names start with “Ligand”.
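
The logic of that combined filter can be sketched in plain Python as follows. This is only an illustration, not SAMSON code: the Molecule record, the matches helper, and the sample library are hypothetical, and the sketch assumes that the NSL wildcard behaves like a case-sensitive shell-style pattern.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass
class Molecule:
    """Hypothetical stand-in for a molecule node (not a SAMSON class)."""
    name: str
    atom_count: int


def matches(mol, max_atoms, name_pattern):
    """Mimic: mol.nat < max_atoms and mol.name "name_pattern"."""
    # Both conditions must hold, as with "and" in the expression above.
    return mol.atom_count < max_atoms and fnmatchcase(mol.name, name_pattern)


library = [
    Molecule("Ligand_A", 34),    # small, name matches
    Molecule("Ligand_B", 180),   # name matches, but too large
    Molecule("Cofactor", 44),    # small, but name does not match
]

selected = [m.name for m in library if matches(m, 100, "Ligand*")]
print(selected)
```

Only the first entry satisfies both conditions, so it is the only one selected.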

Use Case: Preprocessing Molecular Libraries

If you’re preparing compound libraries for screening or simulation, you might not want extremely small or extremely large molecules. Using mol.nat, you could specify size bounds to ensure you’re including molecules within a meaningful range, such as:

mol.nat 50:500

This can help you generate cleaner, more relevant datasets faster and with fewer manual steps.
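
If you apply the same size bounds to many libraries, it may be convenient to generate the expression text rather than type it each time. The helper below is a hypothetical sketch: size_range_expression is not part of SAMSON, and it only assembles a string using the syntax shown in this post, which you would then paste into SAMSON yourself.

```python
def size_range_expression(min_atoms, max_atoms, name_pattern=None):
    """Build an NSL-style string such as 'mol.nat 50:500'.

    Hypothetical helper: it formats text only and never talks to SAMSON.
    """
    if min_atoms < 0 or max_atoms < min_atoms:
        raise ValueError("expected 0 <= min_atoms <= max_atoms")
    expression = f"mol.nat {min_atoms}:{max_atoms}"
    if name_pattern is not None:
        # Optional name filter, combined with "and" as in the earlier example.
        expression += f' and mol.name "{name_pattern}"'
    return expression


print(size_range_expression(50, 500))
print(size_range_expression(50, 500, "Ligand*"))
```

Validating the bounds up front catches reversed ranges before they reach the selection step.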

Where to Find More Attributes

mol.nat is just one of many attributes in NSL. You might also explore filters on formal charge (mol.fc), partial charge (mol.pc), material ownership, chain count, and more.

To learn more, visit the full documentation page on molecule attributes in SAMSON.

SAMSON and all SAMSON Extensions are free for non-commercial use. You can download SAMSON at https://www.samson-connect.net.
