Quickly Filter Molecular Folders by Atom Counts in SAMSON

Filtering and analyzing large molecular datasets can quickly become overwhelming, especially when you’re looking for structures with specific characteristics such as atom counts, element compositions, or number of chains. If you work on large projects in SAMSON, locating the right molecular folders efficiently is essential to keeping your workflow manageable.

The Node Specification Language (NSL) in SAMSON provides a convenient way to query molecular models based on a variety of attributes. One highly useful yet often underused group is the set of folder attributes, which describe the contents of each folder. These attributes let you build powerful filters and target exactly the data you need.

Why Use Folder Attributes for Filtering?

Imagine you’re analyzing simulation results where each folder represents a different molecule or molecular system. Some may contain thousands of atoms, while others are much smaller. Instead of visually browsing through the UI to inspect each one, you can use NSL queries to programmatically select just the folders you’re interested in.

For instance, a single NSL expression on a folder's atom count is enough to select only the folders that contain fewer than 1000 atoms, or only those that contain between 100 and 200 atoms.

Such expressions are written against f.nat: the f prefix corresponds to the folder attribute space in NSL, and nat is short for numberOfAtoms.
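Written out, the two filters described above might look like the following. This is a sketch, not a verified listing: it assumes NSL's `<` comparison operator and its `low:high` range notation applied to `f.nat`. The first line targets folders with fewer than 1000 atoms, the second targets folders with 100 to 200 atoms.

```text
f.nat < 1000
f.nat 100:200
```

Each line is a separate query. Whether the range bounds are inclusive is worth confirming in the NSL documentation before relying on exact cutoffs.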

Available Atom-Level Filters

Here are some additional folder attributes that are particularly useful for filtering by element:

  • f.nC – Number of Carbon atoms
  • f.nH – Number of Hydrogen atoms
  • f.nN – Number of Nitrogen atoms
  • f.nO – Number of Oxygen atoms
  • f.nS – Number of Sulfur atoms

Each of these counts can be used in a query in the same way as f.nat.
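For illustration, a few element-count queries could be written as below. The thresholds are arbitrary values chosen for this sketch, and the `<`, `>`, and `low:high` notations are assumed to behave as they do for other numeric NSL attributes.

```text
f.nC > 10
f.nO > 0
f.nS < 5
f.nN 1:20
```

Read top to bottom, these would target folders with more than 10 Carbon atoms, folders containing at least one Oxygen atom, folders with fewer than 5 Sulfur atoms, and folders with 1 to 20 Nitrogen atoms.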

Beyond Atom Counts

Folder attributes also allow you to filter based on molecular structure and hierarchy:

  • f.nc – Number of chains
  • f.nr – Number of residues
  • f.nm – Number of molecules
  • f.nsg – Number of structural groups
  • f.nsm – Number of structural models

By combining these with logical operators (and, or, not), you can build more complex queries. For example, a condition on f.nm joined by and to a condition on f.nC selects folders with fewer than 4 molecules and more than 10 Carbon atoms.
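A sketch of the query for fewer than 4 molecules and more than 10 Carbon atoms, assuming the `<` and `>` comparison operators and the `and` keyword:

```text
f.nm < 4 and f.nC > 10
```

Note that f.nC (Carbon atoms) and f.nc (chains) differ only in letter case, so capitalization matters when typing these attribute names.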

A Simple Use Case

Suppose you’re preparing a data set for training a machine learning model and want to include only molecular systems that have 100 to 200 atoms, at least 2 chains, and at least one Oxygen atom. All three conditions fit into a single query joined with and.

The result is a clean, efficient filter that narrows your data immediately, so you can focus on analysis instead of browsing.
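One way to write the use-case query is sketched below. It assumes the `low:high` range notation for the atom count and expresses "at least 2 chains" as `f.nc > 1` and "includes Oxygen atoms" as `f.nO > 0`, so that only the `>` operator is needed.

```text
f.nat 100:200 and f.nc > 1 and f.nO > 0
```

If the range notation turns out to behave differently than assumed here, the atom-count condition could instead be spelled with two comparisons, for example `f.nat > 99 and f.nat < 201`.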

To explore all available folder attributes, check out the official NSL folder attribute documentation.

SAMSON and all SAMSON Extensions are free for non-commercial use. You can get SAMSON at https://www.samson-connect.net.
