Quickly Select Protein Chains by Size with NSL

When working with large biomolecular structures, molecular modelers often need to identify and select specific protein chains by an attribute such as their number of residues. Whether you’re preparing a system for simulation, simplifying a view for analysis, or isolating a region of interest, filtering chains by size saves time and reduces manual errors.

The Node Specification Language (NSL) in SAMSON gives users a powerful and compact way to select molecular elements based on a wide range of attributes. Let’s explore how to solve a common task: selecting chains based on their number of residues.

Chain Selection Based on Size

Every chain in SAMSON has a numberOfResidues attribute, accessible in NSL via the short code c.nr. This allows you to quickly identify chains that are above or below certain size thresholds, or within specific ranges. Here are some common queries and when to use them:

c.nr > 100: Selects chains with more than 100 residues. Great for focusing on large chains in multi-chain biomolecules.
c.nr <= 50: Selects chains with 50 residues or fewer, possibly ligands or short peptides.
c.nr 60:90: Selects chains whose residue count lies between 60 and 90, inclusive. Useful for narrowing down to medium-sized chains.
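
To make the semantics of these three queries concrete, here is a minimal Python sketch that applies the same conditions to a plain dictionary. It is only an illustration: it does not use the SAMSON API, and the chain IDs, residue counts, and the select helper are all hypothetical.

```python
# Hypothetical residue counts per chain ID (illustrative data, not a real structure).
chain_sizes = {"A": 320, "B": 75, "C": 12, "D": 90, "E": 60}


def select(sizes, predicate):
    """Return the sorted chain IDs whose residue count satisfies the predicate."""
    return sorted(cid for cid, n in sizes.items() if predicate(n))


# Mirrors: c.nr > 100
large = select(chain_sizes, lambda n: n > 100)

# Mirrors: c.nr <= 50
small = select(chain_sizes, lambda n: n <= 50)

# Mirrors: c.nr 60:90 (both endpoints are included)
medium = select(chain_sizes, lambda n: 60 <= n <= 90)

print(large, small, medium)
```

Note that chains D (90 residues) and E (60 residues) both fall in the medium group because the range is inclusive at both ends.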

This functionality becomes particularly helpful when you’re working with large structures such as ribosomes, viral capsids, or multi-domain proteins, where hundreds of chains may be involved. Manually checking the size of each chain through visualization would be time-consuming and error-prone.

Combining Filters for Fine Control

You can combine NSL filters to refine your selection. For example:

c.v and c.nr > 150

This selects chains that are both visible and have more than 150 residues. Or try:

not c.h and c.nr < 50

This selects chains that are not hidden and contain fewer than 50 residues. You can even combine ranges and multiple conditions:

c.nr 80:120, 150:180

Meaning: select chains with either 80–120 or 150–180 residues.
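
The logic of the three combined expressions can be sketched in plain Python as follows. This is a toy model rather than SAMSON code: the Chain class, its fields, the in_any_range helper, and the sample data are hypothetical, and the sketch assumes that "hidden" is simply the opposite of "visible" and that range endpoints are inclusive.

```python
from dataclasses import dataclass


@dataclass
class Chain:
    """Hypothetical stand-in for a chain: a name, a residue count, a visibility flag."""
    name: str
    residues: int
    visible: bool


def in_any_range(n, ranges):
    """True if n lies inside at least one inclusive (low, high) range."""
    return any(low <= n <= high for low, high in ranges)


# Illustrative data only.
chains = [
    Chain("A", 200, True),
    Chain("B", 160, False),
    Chain("C", 100, True),
    Chain("D", 40, True),
    Chain("E", 130, True),
]

# Mirrors: c.v and c.nr > 150
visible_large = [c.name for c in chains if c.visible and c.residues > 150]

# Mirrors: not c.h and c.nr < 50 (assuming hidden means not visible)
shown_small = [c.name for c in chains if c.visible and c.residues < 50]

# Mirrors: c.nr 80:120, 150:180
two_ranges = [
    c.name for c in chains if in_any_range(c.residues, [(80, 120), (150, 180)])
]

print(visible_large, shown_small, two_ranges)
```

In this toy data, chain B has more than 150 residues but is excluded from the first selection because it is not visible, while it still matches the range query, which does not test visibility.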

Why This Matters

Efficient chain selection makes it easier to annotate, analyze, color, or isolate specific regions of a molecular system. Instead of visually inspecting each chain and counting residues manually, you can reduce the task to a single NSL expression.

The filtering logic is readable and repeatable, which is essential for collaborating with others or documenting your workflows. It’s also very easy to modify and reuse, especially when you switch between projects with different systems.

Explore More

You can learn about more chain attributes and filters, including how to filter by number of segments, partial charge, visibility, and atomic composition, in the full NSL documentation page:

NSL: Chain Attributes Documentation

SAMSON and all SAMSON Extensions are free for non-commercial use. You can get SAMSON at www.samson-connect.net.