When working with large molecular datasets, one of the most common challenges modelers face is filtering out molecules based on their structural complexity. Whether you’re curating a dataset, setting up a simulation, or organizing your models, figuring out how to quickly filter structures by the number of atoms, chains, residues or molecular groups can save precious time.
In SAMSON, the Node Specification Language (NSL) provides a powerful and concise way to do just that. More specifically, within the structuralModel attribute space (short name: sm), you can define filters based on quantitative attributes such as atom counts, chain counts, and more. This post explains how.
Common Use Cases
- You want to visualize only models with fewer than 200 atoms to ensure rendering performance.
- You’re analyzing proteins and need to extract all structures with more than 3 chains.
- You’re running simulations where only molecules with 5-10 residues are valid inputs.
Filtering by Number of Atoms
Start with atom count, one of the most commonly used criteria:
    sm.nat < 200
This will select all structural models with fewer than 200 atoms. For a specific range:
    sm.nat 100:200
This matches models with 100 to 200 atoms, inclusive.
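To make the inclusive range semantics concrete, here is a minimal Python sketch of the comparison that sm.nat 100:200 expresses. It does not call SAMSON's API: the atom_counts dictionary and the in_range helper are hypothetical names introduced only for illustration.

```python
# Hypothetical atom counts per model name (illustrative data, not from SAMSON).
atom_counts = {
    "ligand_a": 42,
    "peptide_b": 100,
    "peptide_c": 200,
    "protein_d": 1250,
}


def in_range(value, low, high):
    """Return True when low <= value <= high (both bounds included)."""
    return low <= value <= high


# Analogue of the NSL expression `sm.nat 100:200`.
selected = [name for name, nat in atom_counts.items() if in_range(nat, 100, 200)]
print(selected)
```

Both boundary values (100 and 200 atoms) pass the check, which is the behavior to keep in mind when choosing range limits.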
Filtering by Molecules, Chains, and Residues
You can also focus on the structural organization at the molecular and biomolecular level:
- sm.nm < 3: Fewer than 3 molecules
- sm.nc 2:4: Between 2 and 4 chains
- sm.nr > 130: More than 130 residues
By combining these expressions, you can rapidly home in on exactly the structures you need for your task.
Element-Specific Atom Counts (Carbons, Hydrogens, etc.)
Filtering models by the number of atoms of a specific element is easy too:
- sm.nC < 10: Models with fewer than 10 carbon atoms
- sm.nH 10:20: Models with 10 to 20 hydrogen atoms
- sm.nS > 5: Models with more than 5 sulfur atoms
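As a rough illustration of what an element-specific count represents, the Python sketch below tallies element symbols for a single model and evaluates analogues of two of the filters above. It assumes a model can be reduced to a plain list of element symbols; the ethanol list and the variable names are hypothetical and unrelated to SAMSON's API.

```python
from collections import Counter

# Hypothetical model reduced to its element symbols (ethanol, C2H6O).
ethanol = ["C", "C", "O", "H", "H", "H", "H", "H", "H"]

# Count how many atoms of each element the model contains.
counts = Counter(ethanol)

# Analogues of `sm.nC < 10` and `sm.nH 10:20` evaluated on this one model.
fewer_than_10_carbons = counts["C"] < 10
has_10_to_20_hydrogens = 10 <= counts["H"] <= 20

print(counts["C"], counts["H"], fewer_than_10_carbons, has_10_to_20_hydrogens)
```

A Counter returns 0 for elements that are absent, so a check such as counts["S"] > 5 is simply false for a sulfur-free model instead of raising an error.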
Advanced Combinations
You can chain multiple conditions as needed. For example:
    sm.nat < 200 and sm.nC > 5 and sm.nc == 1
This matches models with fewer than 200 atoms, more than 5 carbon atoms, and exactly one chain.
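The logic of such a combined filter can be sketched in plain Python as a predicate applied to each model. This is only an illustration under stated assumptions: the ModelSummary class, its sample values, and the matches function are hypothetical, and the field names merely echo the NSL short names.

```python
from dataclasses import dataclass


@dataclass
class ModelSummary:
    """Hypothetical per-model counts; field names echo the NSL short names."""
    name: str
    nat: int  # number of atoms
    nC: int   # number of carbon atoms
    nc: int   # number of chains


def matches(m: ModelSummary) -> bool:
    # Analogue of `sm.nat < 200 and sm.nC > 5 and sm.nc == 1`:
    # every condition must hold for the model to be kept.
    return m.nat < 200 and m.nC > 5 and m.nc == 1


# Illustrative data only.
models = [
    ModelSummary("small_peptide", nat=150, nC=48, nc=1),
    ModelSummary("water_cluster", nat=30, nC=0, nc=0),
    ModelSummary("dimer", nat=180, nC=60, nc=2),
    ModelSummary("large_protein", nat=4200, nC=1300, nc=1),
]

kept = [m.name for m in models if matches(m)]
print(kept)
```

Because the conditions are joined with and, a model is rejected as soon as any single condition fails: the dimer is dropped for its chain count alone, even though its atom and carbon counts qualify.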
Why This Matters
Manually checking each model for complexity, atom types, or structural segments doesn't scale. NSL filtering in SAMSON makes this task efficient, especially when dealing with hundreds or thousands of structures.
Every time you define dataset inputs, filter out anomalies, or focus on a certain type of connectivity, NSL becomes an invaluable tool. Once you get used to the syntax, it becomes second nature to query models precisely and consistently.
Learn more in the full documentation here: https://documentation.samson-connect.net/users/latest/nsl/structuralModel/
SAMSON and all SAMSON Extensions are free for non-commercial use. You can download SAMSON at https://www.samson-connect.net.
