PPLM is a protein–protein language model that learns directly from paired sequences through a novel attention architecture, explicitly capturing inter-protein context.
Building on PPLM, we developed PPLM-PPI, PPLM-Affinity, and PPLM-Contact for predicting binary protein–protein interactions, estimating binding affinity,
and identifying interface residue contacts, respectively.
In PPLM-PPI, the embeddings and attention matrices generated by PPLM are first aggregated with max and mean pooling,
and then passed through a multilayer perceptron to predict the probability that the two input sequences interact.
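The pooling-and-MLP pipeline can be sketched as follows. This is a minimal numpy illustration with hypothetical embedding dimensions and randomly initialized weights, not the trained PPLM-PPI model; the variable names (`emb_a`, `emb_b`, `W1`, etc.) are assumptions for illustration only.

```python
import numpy as np

def pool_embeddings(emb):
    """Concatenate max- and mean-pooled residue embeddings: (L, d) -> (2d,)."""
    return np.concatenate([emb.max(axis=0), emb.mean(axis=0)])

def mlp_forward(x, W1, b1, W2, b2):
    """Two-layer perceptron: ReLU hidden layer, sigmoid output."""
    h = np.maximum(0.0, x @ W1 + b1)
    logit = h @ W2 + b2
    return 1.0 / (1.0 + np.exp(-logit))  # interaction probability

rng = np.random.default_rng(0)
d = 8                                     # hypothetical embedding width
emb_a = rng.standard_normal((30, d))      # stand-in PPLM embeddings, protein A
emb_b = rng.standard_normal((45, d))      # stand-in PPLM embeddings, protein B

# Pool each protein separately, then concatenate into one pair vector (4d,).
pair = np.concatenate([pool_embeddings(emb_a), pool_embeddings(emb_b)])

W1 = rng.standard_normal((4 * d, 16)); b1 = np.zeros(16)
W2 = rng.standard_normal(16);          b2 = 0.0
p = mlp_forward(pair, W1, b1, W2, b2)     # scalar in (0, 1)
```

With untrained random weights the output is meaningless; the point is only the data flow from per-residue embeddings to a single pairwise probability.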
In PPLM-Affinity, the final layer of PPLM is fine-tuned on binding affinity data, and its embeddings, after max pooling,
are passed through two translation layers to predict the binding affinity of the receptor–ligand pair.
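The affinity head can be sketched in a few lines. This is a hedged numpy approximation under stated assumptions: the two "translation layers" are modeled here as two dense layers with a tanh nonlinearity between them, and all shapes, weights, and the function name `predict_affinity` are hypothetical.

```python
import numpy as np

def predict_affinity(rec_emb, lig_emb, W1, b1, W2, b2):
    """Max-pool receptor/ligand residue embeddings, then apply two dense
    ("translation") layers to regress a scalar binding affinity."""
    x = np.concatenate([rec_emb.max(axis=0), lig_emb.max(axis=0)])
    h = np.tanh(x @ W1 + b1)    # first translation layer (nonlinearity assumed)
    return float(h @ W2 + b2)   # second translation layer -> scalar affinity

rng = np.random.default_rng(0)
rec = rng.standard_normal((50, 8))        # stand-in receptor embeddings (L, d)
lig = rng.standard_normal((12, 8))        # stand-in ligand embeddings (L, d)
W1 = rng.standard_normal((16, 8)); b1 = np.zeros(8)
W2 = rng.standard_normal(8);       b2 = 0.0
affinity = predict_affinity(rec, lig, W1, b1, W2, b2)
```

Because the final PPLM layer is fine-tuned on affinity data, the pooled embeddings already carry affinity-relevant signal; the translation layers then only need to map them to a scalar.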
In PPLM-Contact, the inter-protein attention matrices generated by PPLM are integrated with MSA-derived features and monomer distance maps
to capture both evolutionary and structural information, which are then used to model interface residue contacts
through a novel inter-protein transformer network.
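The feature integration step can be sketched as building a residue-pair feature tensor. This numpy example is illustrative only: the inter-protein transformer is replaced by a per-pair logistic layer as a stand-in, and the head count, feature choices, and all array names are assumptions, not the published architecture.

```python
import numpy as np

rng = np.random.default_rng(1)
La, Lb, H = 20, 25, 4                    # hypothetical lengths and head count

attn   = rng.random((H, La, Lb))         # inter-protein attention maps (assumed)
msa    = rng.random((La, Lb))            # MSA-derived coevolution scores (assumed)
dist_a = rng.random((La, La))            # monomer distance map, protein A
dist_b = rng.random((Lb, Lb))            # monomer distance map, protein B

# Per-residue structural summaries broadcast onto the (La, Lb) pair grid.
struct_a = dist_a.mean(axis=1)[:, None, None] * np.ones((1, Lb, 1))
struct_b = dist_b.mean(axis=1)[None, :, None] * np.ones((La, 1, 1))

# Stack attention, evolutionary, and structural channels: (La, Lb, H + 3).
feat = np.concatenate(
    [attn.transpose(1, 2, 0), msa[..., None], struct_a, struct_b], axis=-1
)

# Stand-in for the inter-protein transformer: a per-pair logistic layer.
w = rng.standard_normal(feat.shape[-1])
contact_prob = 1.0 / (1.0 + np.exp(-(feat @ w)))   # (La, Lb) contact map
```

The key design point is that every residue pair (i, j) across the two chains gets a feature vector combining learned attention (sequence context), MSA statistics (evolution), and monomer geometry (structure) before any contact is predicted.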
These downstream methods consistently outperform existing approaches across their respective tasks,
highlighting the potential of language models as a powerful framework for computational PPI studies.
Protein-Protein Interaction Prediction (PPLM-PPI)
Protein-Protein Binding Affinity Prediction (PPLM-Affinity)
References
Jun Liu, Hungyu Chen, Yang Zhang. A Corporative Language Model for Protein-Protein Interaction, Binding Affinity, and Interface Contact Prediction. In preparation.