Filter Molecular Chains by Charge, Size, and Visibility in SAMSON

When working on large molecular systems, molecular modelers often face the challenge of quickly locating and analyzing specific chains that meet a set of criteria: hidden chains, chains with a particular charge, chains exceeding a certain number of atoms, or even those with a specific name pattern. Sifting manually through hundreds of chains slows down workflows and increases the likelihood of missing important outliers.

This is where the Node Specification Language (NSL) in SAMSON becomes particularly useful. SAMSON exposes a collection of powerful yet simple-to-use chain attributes that let you define precisely the characteristics of chains you want to work with. This blog post introduces you to the chain attributes in NSL and shows how to write filters for common scenarios modelers face.

Accessing Chain-Specific Attributes

In NSL, the chain attribute space (abbreviated c) lets you query chains by a wide variety of properties. These include properties inherited from node and structuralGroup objects, as well as attributes specific to chains themselves. Here are some commonly used ones:

c.id: chain ID (e.g., c.id 2:5)
c.nr: number of residues (e.g., c.nr > 100)
c.h: hidden flag (e.g., c.h or not c.h)
c.n: name of the chain (e.g., c.n "L*")
c.fc: formal charge (e.g., c.fc 6:8)
c.nat: number of atoms (e.g., c.nat < 1000)
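
To make the value forms above concrete, here is a minimal Python sketch that mimics them on plain in-memory records. It is not SAMSON's API: the `Chain` dataclass, its field names, and the helpers `in_range` and `name_matches` are hypothetical, and the sketch assumes that a range such as 6:8 includes both endpoints and that "L*" behaves like a shell-style wildcard.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass
class Chain:
    """Hypothetical stand-in for a chain node; not a SAMSON class."""
    id: int
    name: str
    n_residues: int
    n_atoms: int
    formal_charge: int
    hidden: bool = False


def in_range(value, low, high):
    # Mimics a range such as `c.fc 6:8` (assumed inclusive at both ends).
    return low <= value <= high


def name_matches(chain, pattern):
    # Mimics `c.n "L*"` (assumed shell-style, case-sensitive wildcard).
    return fnmatchcase(chain.name, pattern)


# Made-up example data, for illustration only.
chains = [
    Chain(id=1, name="A", n_residues=250, n_atoms=3900, formal_charge=-2),
    Chain(id=2, name="L1", n_residues=1, n_atoms=42, formal_charge=0, hidden=True),
    Chain(id=3, name="B", n_residues=130, n_atoms=2100, formal_charge=7),
]

by_id = [c.name for c in chains if in_range(c.id, 2, 5)]                 # like c.id 2:5
by_size = [c.name for c in chains if c.n_residues > 100]                 # like c.nr > 100
by_name = [c.name for c in chains if name_matches(c, "L*")]              # like c.n "L*"
by_charge = [c.name for c in chains if in_range(c.formal_charge, 6, 8)]  # like c.fc 6:8
print(by_id, by_size, by_name, by_charge)
```

Each list comprehension plays the role of one NSL expression: the attribute picks a property, and the range, comparison, or pattern decides which chains pass.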

Solving Practical Problems

The real advantage of chain attributes is in solving practical modeling issues. For example:

🧪 Finding Chains Based on Visibility

If you’re working on presentation-ready images and need to isolate visible chains, use:

c.v

This matches all currently visible chains. To find hidden ones, use:

not c.v

⚛️ Inspecting Chains With High Positive Charge

Electrostatic properties are critical when modeling enzyme active sites or binding interactions. To filter highly charged chains:

c.fc > 8

🔍 Locating Chains With Name Patterns

To find chains whose names start with the letter “L” (e.g., ligands):

c.n "L*"

🧵 Selecting Chains Based on Number of Residues

If you are interested only in longer chains, like full proteins (e.g., 120 or more residues):

c.nr >= 120

🚮 Cleaning Unused or Very Small Chains

Isolate chains with fewer than 50 atoms, for example to delete them or to simplify the system for coarse-grained simulations:

c.nat < 50

Working with Combined Filters

You can also combine multiple attributes to create more precise queries. For example, to select all visible chains that have a chain ID between 1 and 3 and contain fewer than 10 hydrogen atoms:

c.v and c.id 1:3 and c.nH < 10
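
The logic of such a combined query can be sketched in Python as a conjunction of small predicates. This is only an illustration of how `and` narrows a selection, not SAMSON code: the `Chain` record, its fields, and the `all_of` helper are hypothetical, and the 1:3 range is assumed to include both endpoints.

```python
from typing import Callable, NamedTuple


class Chain(NamedTuple):
    """Hypothetical chain record; not a SAMSON class."""
    id: int
    visible: bool
    n_hydrogens: int


Predicate = Callable[[Chain], bool]


def all_of(*preds: Predicate) -> Predicate:
    # A chain passes only if every predicate accepts it (logical "and").
    return lambda chain: all(p(chain) for p in preds)


def is_visible(c: Chain) -> bool:      # like c.v
    return c.visible


def id_1_to_3(c: Chain) -> bool:       # like c.id 1:3 (assumed inclusive)
    return 1 <= c.id <= 3


def few_hydrogens(c: Chain) -> bool:   # like c.nH < 10
    return c.n_hydrogens < 10


combined = all_of(is_visible, id_1_to_3, few_hydrogens)

# Made-up example data, for illustration only.
chains = [
    Chain(id=1, visible=True, n_hydrogens=4),
    Chain(id=2, visible=False, n_hydrogens=0),
    Chain(id=3, visible=True, n_hydrogens=250),
    Chain(id=4, visible=True, n_hydrogens=2),
]

selected_ids = [c.id for c in chains if combined(c)]
print(selected_ids)
```

Each additional condition can only shrink the selection, so putting the most restrictive attribute first makes a long query easier to read.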

With these kinds of filters, you can interact with your molecular data much more effectively and streamline your modeling workflow.

To explore the full list of chain attributes and examples, visit the official documentation here:
https://documentation.samson-connect.net/users/latest/nsl/chain/

SAMSON and all SAMSON Extensions are free for non-commercial use. You can get SAMSON at https://www.samson-connect.net.