Working with large datasets in molecular modeling often means dealing with hundreds or even thousands of structures. Filtering through these models to identify only those relevant to a specific task—such as those with a manageable number of atoms—can be tedious.
That’s where the Node Specification Language (NSL) in SAMSON becomes incredibly helpful. One hidden gem in NSL is the numberOfAtoms structural model attribute. It lets you filter models based on the number of atoms they contain, allowing you to load only what’s needed and avoid overwhelming your system resources.
Why filter by number of atoms?
If you’re setting up molecular dynamics simulations ⚛️, building training datasets for machine learning, or simply exploring a database, you might want to quickly identify structures whose size falls below or above a certain threshold. For instance:
- You want to exclude structures larger than 10,000 atoms so that the remaining ones can be rendered in real time on your laptop.
- You need only small-molecule drug candidates with fewer than 500 atoms.
- You want to visualize only macromolecules with at least 2,000 atoms.
How to filter with NSL
SAMSON’s Node Specification Language provides an intuitive syntax for this. The relevant attribute is numberOfAtoms, with the short form nat. To access it in the structural model attribute space, you would use sm.nat, where sm stands for structuralModel.
Here are some practical examples:
sm.nat < 1000 → Match models with fewer than 1,000 atoms
sm.nat > 2000 → Match models with more than 2,000 atoms
sm.nat 500:1500 → Match models with between 500 and 1,500 atoms
Each of these expressions can be used directly in the SAMSON selection bar, allowing for real-time filtering of structures in your workspace.
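To make the meaning of these three filters concrete, here is a minimal Python sketch that mimics them on a made-up list of models. It does not use SAMSON or its API: the Model class, the model names, and the atom counts are all hypothetical, and the range 500:1500 is assumed to include both endpoints.

```python
from dataclasses import dataclass


@dataclass
class Model:
    """Hypothetical stand-in for a structural model (not a SAMSON class)."""
    name: str
    number_of_atoms: int


# Made-up example data, for illustration only
models = [
    Model("ligand", 320),
    Model("peptide", 850),
    Model("small_protein", 1500),
    Model("enzyme", 4200),
]


def names(selected):
    """Return the names of the selected models."""
    return [m.name for m in selected]


# Analogue of: sm.nat < 1000
fewer_than_1000 = names(m for m in models if m.number_of_atoms < 1000)

# Analogue of: sm.nat > 2000
more_than_2000 = names(m for m in models if m.number_of_atoms > 2000)

# Analogue of: sm.nat 500:1500 (assumption: both bounds are inclusive)
in_range = names(m for m in models if 500 <= m.number_of_atoms <= 1500)

print(fewer_than_1000)
print(more_than_2000)
print(in_range)
```

In SAMSON itself, none of this code is needed: the NSL expression alone does the matching.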
Combine with other attributes
You can combine sm.nat with other NSL filters for more powerful queries. For example, if you’re only interested in models with at least 2 chains and fewer than 5,000 atoms:
sm.nc >= 2 and sm.nat < 5000
The NSL engine interprets these filters and matches the corresponding nodes immediately.
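The logic of the combined filter can be sketched in plain Python as a predicate that requires both conditions to hold. Again, this is only an illustration under stated assumptions: the StructuralModel class and the sample data are hypothetical and are not part of SAMSON.

```python
from dataclasses import dataclass


@dataclass
class StructuralModel:
    """Hypothetical stand-in for a structural model (not a SAMSON class)."""
    name: str
    number_of_atoms: int
    number_of_chains: int


# Made-up example data, for illustration only
models = [
    StructuralModel("monomer", 1200, 1),
    StructuralModel("dimer", 3400, 2),
    StructuralModel("capsid_fragment", 48000, 6),
]


def matches(model):
    """Analogue of: sm.nc >= 2 and sm.nat < 5000"""
    return model.number_of_chains >= 2 and model.number_of_atoms < 5000


selected = [m.name for m in models if matches(m)]
print(selected)
```

Only the dimer passes: the monomer has too few chains, and the capsid fragment has too many atoms.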
Bonus: use autocomplete
SAMSON’s user interface provides auto-completion for NSL queries, making it easy to discover available attributes and avoid typos 🚀
Overall, numberOfAtoms is one of the simplest yet most powerful ways to filter molecular data quickly and effectively.
To learn more about the full range of structural model attributes, visit the official SAMSON NSL documentation.
SAMSON and all SAMSON Extensions are free for non-commercial use. You can get SAMSON at https://www.samson-connect.net.
