Filtering Molecules by Structure: Speed Up Your Modeling Workflow

Molecular modelers often deal with large systems containing hundreds—or even thousands—of molecules. Whether you’re working on a protein complex, a nanomaterial, or a drug candidate database, being able to rapidly sort and filter your molecular structures based on specific criteria can save time and reduce frustration.

That’s where the Node Specification Language (NSL) in SAMSON helps. In this post, we’ll focus on one self-contained feature: filtering molecules by their structural properties, such as the number of atoms, chains, or residues, or the number of atoms of a given element like carbon or nitrogen.

Why does structure-based filtering matter?

Imagine you’re preparing a dataset of biomolecular structures for simulation or visualization, and you only want proteins with more than 100 residues or molecules with fewer than 10 carbon atoms. Instead of manually browsing your dataset, you can use NSL queries to automatically select what you need. This makes your process scalable, reproducible, and less error-prone.

Getting started with NSL molecule attributes

NSL provides a dedicated attribute space for molecules, abbreviated as mol, that matches only molecule nodes. Within this space, an attribute can be compared against a value (as in mol.nat < 1000) or matched against a range (as in mol.nC 10:20).

Examples of useful queries:

mol.nat < 1000 – Selects molecules with fewer than 1,000 atoms.
mol.nC 10:20 – Matches molecules with 10 to 20 carbon atoms.
mol.nr > 130 – Finds molecules with more than 130 residues.
mol.nc < 3 – Selects molecules with fewer than 3 chains.
mol.pc > 1.5 – Selects molecules with a partial charge greater than 1.5.
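
To make the meaning of these comparisons and ranges concrete, here is a minimal Python sketch that applies the same conditions to a few made-up molecule summaries. It does not use SAMSON or NSL: the MolSummary class, its field names, and the sample values are hypothetical, and the range 10:20 is assumed to include both endpoints.

```python
from dataclasses import dataclass


@dataclass
class MolSummary:
    """Hypothetical per-molecule summary; not a SAMSON class."""
    name: str
    n_atoms: int
    n_carbons: int
    n_residues: int
    n_chains: int
    partial_charge: float


def in_range(value, low, high):
    """Mimic a range such as 10:20, assumed inclusive at both ends."""
    return low <= value <= high


# Made-up sample data for illustration only.
molecules = [
    MolSummary("peptide", 320, 15, 20, 1, 0.0),
    MolSummary("enzyme", 4200, 1300, 270, 2, -3.0),
    MolSummary("ligand", 45, 8, 1, 1, 2.0),
]

# Each list mirrors one of the queries above.
few_atoms = [m.name for m in molecules if m.n_atoms < 1000]                # mol.nat < 1000
mid_carbon = [m.name for m in molecules if in_range(m.n_carbons, 10, 20)]  # mol.nC 10:20
many_residues = [m.name for m in molecules if m.n_residues > 130]          # mol.nr > 130
few_chains = [m.name for m in molecules if m.n_chains < 3]                 # mol.nc < 3
charged = [m.name for m in molecules if m.partial_charge > 1.5]            # mol.pc > 1.5

print(few_atoms, mid_carbon, many_residues, few_chains, charged)
```

In SAMSON itself the one-line queries replace all of this bookkeeping; the sketch only spells out what each condition tests.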

How it works in practice

Enter these expressions in any selection box or script interface in SAMSON that supports NSL. For example, to keep only molecules with more than five nitrogen atoms, a partial charge above 1.0, and more than two structural segments, you could write:

mol.nN > 5 and mol.pc > 1.0 and mol.ns > 2
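
The way "and" narrows a selection can be sketched in plain Python by composing small predicates. This is an illustration under stated assumptions, not SAMSON's API: MolSummary, all_of, and the sample molecules are hypothetical names and values.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class MolSummary:
    """Hypothetical per-molecule summary; not a SAMSON class."""
    name: str
    n_nitrogens: int
    partial_charge: float
    n_segments: int


Predicate = Callable[[MolSummary], bool]


def all_of(*predicates: Predicate) -> Predicate:
    """Combine predicates so a molecule passes only if every one holds."""
    return lambda mol: all(p(mol) for p in predicates)


# Python counterpart of: mol.nN > 5 and mol.pc > 1.0 and mol.ns > 2
query = all_of(
    lambda m: m.n_nitrogens > 5,
    lambda m: m.partial_charge > 1.0,
    lambda m: m.n_segments > 2,
)

# Made-up sample data: only "A" satisfies all three conditions.
molecules = [
    MolSummary("A", 8, 1.5, 3),
    MolSummary("B", 8, 0.5, 3),  # fails the charge condition
    MolSummary("C", 2, 2.0, 4),  # fails the nitrogen condition
]

selected = [m.name for m in molecules if query(m)]
print(selected)
```

Each extra condition can only shrink the result, which is why chaining conditions is a convenient way to home in on a small subset.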

Beyond atoms: segments and structural groups

One of the more underrated features is the ability to filter by the number of structural segments and groups:

mol.ns 1:3 – Matches molecules with 1 to 3 segments.
mol.nsg > 10 – Targets molecules with more than 10 structural groups.

This can be especially useful when analyzing large biomolecules, polymers, or composite materials that come pre-organized in structural hierarchies. With NSL, you can navigate them quickly and group similar structures for batch processing or comparative visualization.

Example use cases

Filter molecules for coarse-grained vs. atomistic simulations using mol.ncga (number of coarse-grained atoms).
Exclude hidden molecules from view using not mol.h.
Identify all selected molecules using mol.selected.
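
Flag attributes and negation follow the same logic as the numeric filters. The Python sketch below mimics them on made-up data; MolState and its fields are hypothetical, and the form mol.ncga > 0 in the comments is an assumed query written by analogy with the numeric examples earlier in this post.

```python
from dataclasses import dataclass


@dataclass
class MolState:
    """Hypothetical molecule state; not a SAMSON class."""
    name: str
    hidden: bool
    selected: bool
    n_coarse_grained_atoms: int


# Made-up sample data for illustration only.
molecules = [
    MolState("protein", hidden=False, selected=True, n_coarse_grained_atoms=0),
    MolState("solvent", hidden=True, selected=False, n_coarse_grained_atoms=0),
    MolState("cg_lipid", hidden=False, selected=False, n_coarse_grained_atoms=12),
]

visible = [m.name for m in molecules if not m.hidden]                    # not mol.h
picked = [m.name for m in molecules if m.selected]                       # mol.selected
coarse = [m.name for m in molecules if m.n_coarse_grained_atoms > 0]     # assumed: mol.ncga > 0

# Combining a negated flag with a numeric condition:
# visible molecules that contain no coarse-grained atoms.
visible_atomistic = [
    m.name for m in molecules
    if not m.hidden and m.n_coarse_grained_atoms == 0
]

print(visible, picked, coarse, visible_atomistic)
```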

Each attribute query is short and readable, and can be combined with others using operators such as and and not to build increasingly specific filters. That flexibility makes NSL useful not just for large datasets but also for smaller modeling tasks where precision matters.

Want to see the full list of molecule attributes available for filtering? Learn more in the official documentation.

SAMSON and all SAMSON Extensions are free for non-commercial use. You can get SAMSON at https://www.samson-connect.net.