LigandMPNN
Design protein sequences with LigandMPNN for ligand-aware structure-based design
Category: development Source: adaptyvbio/protein-design-skills
LigandMPNN is a development skill for designing protein sequences with ligand awareness, covering structure-based protein design, binding site optimization, and conditional sequence generation.
What Is This?
Overview
LigandMPNN is a machine learning model that designs protein sequences for a given backbone structure while accounting for bound ligands or small molecules. It uses a graph neural network to encode protein structure and ligand interactions together, enabling the design of proteins that preserve binding functionality while exploring sequence diversity. Because sequence generation is conditioned on both the protein backbone geometry and the ligand atoms, it is well suited to applications where maintaining ligand binding is critical.
This skill integrates LigandMPNN into your protein design workflow, providing tools to design binding pockets, optimize existing sequences around ligands, and generate variants that maintain or improve ligand interactions. It handles the complexity of structure-based design while accounting for the chemical environment created by bound molecules. The model can be used to redesign entire proteins or focus on specific regions, such as active sites or allosteric pockets, where ligand interactions are essential for function.
Who Should Use This
Protein engineers, computational biologists, and drug discovery researchers who need to design proteins that bind specific ligands or maintain ligand binding while exploring sequence space should use this skill. It is also valuable for academic researchers studying protein-ligand interactions, as well as biotechnology professionals developing biosensors or therapeutic proteins.
Why Use It?
Problems It Solves
Traditional protein design often ignores bound ligands or treats them as static obstacles. LigandMPNN addresses this by explicitly modeling ligand interactions during sequence generation, so designed sequences are more likely to remain compatible with ligand binding. This reduces the common problem of generating sequences that fold correctly but lose ligand binding capability. By considering the ligand as part of the design process, LigandMPNN helps avoid mutations that would disrupt critical contacts or alter the binding pocket geometry.
Core Highlights
LigandMPNN generates protein sequences conditioned on both structure and ligand presence simultaneously. The model preserves binding pocket geometry while allowing sequence variation in non-critical regions. It enables rapid exploration of sequence space for ligand-binding proteins without expensive experimental screening. The skill integrates seamlessly with structure preparation and validation workflows for complete design pipelines. LigandMPNN also supports batch processing of multiple protein-ligand complexes, making it suitable for high-throughput design projects.
How to Use It?
Basic Usage
from ligandmpnn import LigandMPNN

model = LigandMPNN()
designed_sequences = model.design(
    pdb_file="protein_ligand.pdb",
    num_sequences=10
)
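For readers working directly from the upstream LigandMPNN repository rather than through a Python wrapper, a comparable run is typically launched with its `run.py` script. The sketch below only assembles the command line. The flag names (`--model_type`, `--pdb_path`, `--out_folder`, `--seed`, `--batch_size`, `--number_of_batches`, `--temperature`, `--fixed_residues`) are assumptions taken from the upstream README and should be checked against `python run.py --help` in your checkout. The helper name `build_ligandmpnn_command` and the file paths are hypothetical.

```python
import shlex
from typing import List, Optional, Sequence


def build_ligandmpnn_command(
    pdb_path: str,
    out_folder: str,
    seed: int = 111,
    batch_size: int = 1,
    number_of_batches: int = 10,
    temperature: float = 0.1,
    fixed_residues: Optional[Sequence[str]] = None,
) -> List[str]:
    """Assemble an argv list for the upstream run.py script (flag names assumed)."""
    cmd = [
        "python", "run.py",
        "--model_type", "ligand_mpnn",
        "--pdb_path", pdb_path,
        "--out_folder", out_folder,
        "--seed", str(seed),
        "--batch_size", str(batch_size),
        "--number_of_batches", str(number_of_batches),
        "--temperature", str(temperature),
    ]
    if fixed_residues:
        # Residues are written as chain letter plus residue number, e.g. "A12".
        cmd += ["--fixed_residues", " ".join(fixed_residues)]
    return cmd


cmd = build_ligandmpnn_command(
    "protein_ligand.pdb", "outputs/design", fixed_residues=["A12", "A45"]
)
# Print a copy-pasteable shell command; nothing is executed here.
print(shlex.join(cmd))
# To run it from inside a LigandMPNN checkout: subprocess.run(cmd, check=True)
```

Building the argument list separately from executing it makes the run easy to log and to reproduce with a different seed.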
Real-World Examples
Design a binding pocket for an enzyme while maintaining ligand interactions:
# Reuses the `model` object created in Basic Usage.
sequences = model.design(
    pdb_file="enzyme_substrate.pdb",
    fixed_positions=[1, 2, 3],  # residues kept at their input identity
    num_sequences=5
)
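The `fixed_positions` list in the example above is hard-coded. In practice one might derive it from the residues that sit close to the ligand. The following is a minimal sketch of that idea using only the standard library: it reads fixed-column PDB records and reports residues with any atom within a distance cutoff of a ligand atom. The 5 Å cutoff, the rule that every non-water `HETATM` record counts as ligand, and the helper names `pocket_residues` and `pdb_line` are illustrative assumptions. How the resulting residue numbers map onto `fixed_positions` depends on the numbering convention of the interface you use.

```python
import math
from typing import List, Tuple

Atom = Tuple[str, int, Tuple[float, float, float]]


def pocket_residues(pdb_text: str, cutoff: float = 5.0) -> List[Tuple[str, int]]:
    """Return sorted (chain, residue number) pairs with any atom within
    `cutoff` angstroms of a non-water HETATM atom."""
    protein: List[Atom] = []
    ligand: List[Tuple[float, float, float]] = []
    for line in pdb_text.splitlines():
        record = line[0:6].strip()
        if record not in ("ATOM", "HETATM"):
            continue
        # Fixed-column PDB layout: x, y, z occupy columns 31-54.
        xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
        if record == "ATOM":
            protein.append((line[21], int(line[22:26]), xyz))
        elif line[17:20].strip() != "HOH":  # skip crystallographic water
            ligand.append(xyz)
    near = {
        (chain, resseq)
        for chain, resseq, xyz in protein
        if any(math.dist(xyz, lig) <= cutoff for lig in ligand)
    }
    return sorted(near)


def pdb_line(record, serial, name, resname, chain, resseq, x, y, z) -> str:
    """Format one coordinate record in PDB column layout (toy data helper)."""
    return (
        f"{record:<6}{serial:>5} {name:<4} {resname:>3} {chain}{resseq:>4}    "
        f"{x:8.3f}{y:8.3f}{z:8.3f}"
    )


# Toy structure: two residues near the ligand, one far away next to a water.
example = "\n".join([
    pdb_line("ATOM", 1, "CA", "ALA", "A", 10, 0.0, 0.0, 0.0),
    pdb_line("ATOM", 2, "CA", "GLY", "A", 11, 3.8, 0.0, 0.0),
    pdb_line("ATOM", 3, "CA", "LEU", "A", 50, 20.0, 0.0, 0.0),
    pdb_line("HETATM", 4, "C1", "LIG", "B", 101, 2.0, 3.0, 0.0),
    pdb_line("HETATM", 5, "O", "HOH", "B", 201, 20.0, 1.0, 0.0),
])
print(pocket_residues(example))
```

A distance cutoff is a crude proxy for "critical to binding"; catalytic or coordinating residues identified from the literature may need to be added by hand.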
Generate variants of an antibody that preserve antigen binding:
variants = model.design(
    pdb_file="antibody_antigen.pdb",
    chain_ids=["H", "L"],
    temperature=0.1
)
Advanced Tips
Use lower temperature values for more conservative sequences that stay close to the model's highest-probability predictions, and higher values for more diverse exploration. Specify fixed positions for residues critical to ligand binding to prevent unwanted mutations in the binding pocket. You can also adjust the number of design iterations and set custom constraints for specific regions, such as disulfide bonds or post-translational modification sites. For large proteins, consider designing in segments and merging results.
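To make the effect of temperature concrete, here is a small sketch of softmax with temperature, the scaling commonly used when sampling from sequence models. It assumes the model samples each residue from a distribution of this form; the three logits are made-up scores for three candidate amino acids at one position, not LigandMPNN output.

```python
import math
from typing import List


def softmax_with_temperature(logits: List[float], temperature: float) -> List[float]:
    """Convert logits to probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for three amino acids at a single position.
logits = [2.0, 1.0, 0.0]
for t in (0.1, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(f"T={t}: {[round(p, 3) for p in probs]}")
```

At low temperature almost all probability mass falls on the top-scoring residue, so repeated sampling returns nearly identical sequences; raising the temperature flattens the distribution and increases diversity.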
When to Use It?
Use Cases
Design novel enzymes that bind specific substrates while improving catalytic efficiency. Optimize antibody sequences to maintain antigen binding while reducing immunogenicity. Create protein binders for drug discovery by designing binding pockets around small molecule ligands. Engineer biosensors that respond to specific ligand binding with conformational changes. LigandMPNN is also useful for stabilizing protein-ligand complexes in synthetic biology applications or for designing proteins that modulate signaling pathways in response to ligand binding.
Related Topics
This skill complements protein folding prediction with AlphaFold, molecular dynamics simulation, and structure-based drug design workflows. It can be integrated with downstream analysis tools for binding energy estimation or in silico mutagenesis studies.
Important Notes
Requirements
You need a PDB file with protein structure and ligand coordinates. A GPU is recommended for reasonable inference speed, particularly when generating many sequences. Python 3.8 or higher is required with PyTorch installed. Additional dependencies may include RDKit for ligand processing and Biopython for structure manipulation.
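These requirements can be checked before starting a design run. The sketch below is one possible pre-flight check, not part of LigandMPNN: it verifies the Python version, looks for an installed `torch` package without importing it, and confirms that the PDB text contains both protein (`ATOM`) and non-water ligand (`HETATM`) records. The function name `check_prerequisites` and the sample records are hypothetical.

```python
import importlib.util
import sys
from typing import Dict


def check_prerequisites(pdb_text: str) -> Dict[str, bool]:
    """Report whether the environment and input look usable for a design run."""
    has_protein = False
    has_ligand = False
    for line in pdb_text.splitlines():
        if line.startswith("ATOM"):
            has_protein = True
        elif line.startswith("HETATM") and line[17:20].strip() != "HOH":
            # Water molecules are HETATM records too, but are not ligands.
            has_ligand = True
    return {
        "python_ok": sys.version_info >= (3, 8),
        # find_spec locates the package without importing it.
        "torch_installed": importlib.util.find_spec("torch") is not None,
        "has_protein_atoms": has_protein,
        "has_ligand_atoms": has_ligand,
    }


# Toy records in PDB column layout; the residue name sits in columns 18-20.
complex_pdb = "\n".join([
    "ATOM      1  CA  ALA A   1       0.000   0.000   0.000",
    "HETATM    2  C1  LIG A 101       1.000   1.000   1.000",
])
apo_pdb = "\n".join([
    "ATOM      1  CA  ALA A   1       0.000   0.000   0.000",
    "HETATM    2  O   HOH A 201       5.000   5.000   5.000",
])
print(check_prerequisites(complex_pdb))
print(check_prerequisites(apo_pdb))
```

If PyTorch is installed, `torch.cuda.is_available()` can additionally tell you whether a GPU will be used.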
Usage Recommendations
Always validate designed sequences with structure prediction before experimental testing. Use multiple design runs with different random seeds to explore sequence diversity. Consider the biological context when selecting fixed positions and design constraints. For best results, combine LigandMPNN outputs with molecular docking or molecular dynamics simulations to assess binding stability.
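When comparing runs with different seeds, it helps to put a number on how diverse the designs actually are. One simple option, sketched here with made-up sequences, is the mean pairwise identity across all designs for the same backbone: values near 1.0 mean the runs converged on nearly the same sequence, lower values mean broader exploration. The function names and the example sequences are illustrative, not LigandMPNN output.

```python
from itertools import combinations
from typing import List


def pairwise_identity(a: str, b: str) -> float:
    """Fraction of positions at which two equal-length sequences agree."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def mean_pairwise_identity(seqs: List[str]) -> float:
    """Average identity over all unordered pairs of sequences."""
    pairs = list(combinations(seqs, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(pairwise_identity(a, b) for a, b in pairs) / len(pairs)


# Hypothetical designs for the same eight-residue backbone.
designs = ["MKTAYIAK", "MKTAYLAK", "MRTAYIAR"]
print(mean_pairwise_identity(designs))
```

Designs from a fixed backbone all have the same length, so no alignment step is needed before comparing them position by position.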
Limitations
LigandMPNN works best with ligands that resemble those in its training data. Very large proteins may require chunking or special handling. The model cannot predict ligand binding affinity; it only generates sequences compatible with the input structure. Experimental validation is essential to confirm binding and function.