Finding Segments with Just the Right Composition

When working with molecular models that include thousands of nodes, efficiently filtering and selecting parts of your system is essential. One recurring challenge for modelers is identifying segments with specific properties: a high oxygen content, a given number of residues, or a particular charge. If your simulations require grouping or analyzing such subsets, SAMSON’s Node Specification Language (NSL) comes in handy.

This post focuses on how to use NSL to filter segments based on quantitative attributes like the number of atoms, residues, and elemental compositions (e.g., carbons, hydrogens, oxygens). These attributes not only help you isolate regions of interest but also facilitate building specific simulation or analysis pipelines.

Why segment attributes matter

Segments in SAMSON can represent chains, molecules, or structural groupings within your model. Understanding and utilizing segment attributes enables you to:

Classify segments by their complexity (e.g., total number of atoms).
Identify outliers in structural databases or simulation results (e.g., segments with an unusually high number of sulfurs or charged residues).
Preprocess models for machine learning by filtering based on numeric properties.

Examples of useful filters

Using the segment attribute space (short name s), you can craft simple yet powerful NSL queries. Here are a few practical patterns:

Find segments with many atoms: s.nat > 1000
Select segments whose partial charge exceeds 0.5: s.pc > 0.5
Get segments with 10 to 20 oxygen atoms: s.nO 10:20
Locate segments with a formal charge above 5: s.fc > 5
Select segments with more than 130 residues: s.nr > 130
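
To make the meaning of these count attributes concrete, here is a minimal Python sketch that reproduces the idea outside SAMSON. It is not SAMSON's API: the Segment class, its count method, and the in_range helper are hypothetical, and reading 10:20 as inclusive on both ends is an assumption based on the "10 to 20" wording above.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Segment:
    """Toy stand-in for a segment: a name plus one element symbol per atom."""
    name: str
    elements: list

    @property
    def nat(self):
        # Total number of atoms, analogous to s.nat
        return len(self.elements)

    def count(self, symbol):
        # Number of atoms of one element, analogous to s.nO, s.nC, ...
        return Counter(self.elements)[symbol]


def in_range(value, low, high):
    # Assumed inclusive on both ends, mirroring "10 to 20" for s.nO 10:20
    return low <= value <= high


# Made-up compositions, for illustration only
segments = [
    Segment("A", ["C"] * 40 + ["O"] * 12 + ["H"] * 60),
    Segment("B", ["C"] * 5 + ["O"] * 2 + ["H"] * 10),
]

# Analogue of: s.nO 10:20
oxygen_rich = [s.name for s in segments if in_range(s.count("O"), 10, 20)]
print(oxygen_rich)
```

In SAMSON itself none of this bookkeeping is needed: the NSL expression does the counting and the comparison for you.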

Pro tip: Combine attributes

You can combine multiple attributes in a single query. For instance, to get segments with 100–200 atoms and more than 10 carbons, you could use:

s.nat 100:200 and s.nC > 10

This kind of query is especially useful for identifying segments relevant to specific reaction mechanisms or binding site analyses.
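
The logic of this combined query can be mirrored in plain Python to show how the two conditions compose. This is only a sketch under stated assumptions: the segments are hypothetical dictionaries of precomputed counts, matches is an illustrative helper rather than a SAMSON function, and the range is assumed to be inclusive.

```python
# Hypothetical precomputed counts per segment (not read from SAMSON)
segments = {
    "chain_A": {"nat": 150, "nC": 48},
    "chain_B": {"nat": 150, "nC": 8},
    "chain_C": {"nat": 950, "nC": 300},
}


def matches(attrs):
    # Analogue of: s.nat 100:200 and s.nC > 10
    atoms_in_range = 100 <= attrs["nat"] <= 200  # assumed inclusive bounds
    enough_carbons = attrs["nC"] > 10
    return atoms_in_range and enough_carbons


selected = [name for name, attrs in segments.items() if matches(attrs)]
print(selected)
```

Here only chain_A satisfies both conditions: chain_B falls short of the carbon threshold, and chain_C is outside the atom range.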

A closer look at residue and group counts

The segment attribute space includes filters specific to biological polymers and structural groupings:

s.nr – number of residues (e.g., s.nr 100:130)
s.nsg – number of structural groups (e.g., s.nsg > 10)

These are particularly useful when working with proteins or nucleic acids, where modular structure is important to capture.

How it’s different from visual filters

While tools like visibility toggles and selection brushes are useful for visual interaction, NSL filters give you precise, reproducible control over what gets selected: the same expression always applies the same criteria, which is especially helpful when scripting workflows or working with large datasets.

To learn more about the full set of segment attributes you can leverage in SAMSON, visit the official documentation page: NSL Segment Attributes.

SAMSON and all SAMSON Extensions are free for non-commercial use. You can get SAMSON at https://www.samson-connect.net.