DeepFoldRNA is a deep learning method for de novo RNA tertiary structure prediction.
Starting from a query sequence, it first collects an alignment of homologous sequences from multiple sequence databases.
Spatial restraints (distance maps and inter-residue orientations) are then predicted by deep self-attention-based
neural networks and converted into negative log-likelihood potentials. Finally, full-length structure models are
generated using L-BFGS folding simulations based on minimization of the potential with respect to the backbone
Figure 1. Flowchart of DeepFoldRNA, which consists of two modules: (A) Module-I: Restraint Construction. Starting from a query nucleic acid sequence,
multiple RNA sequence databases are searched
in order to create a multiple sequence alignment (MSA) for the target RNA.
The MSA is then used to derive the predicted secondary structure, which in turn is used to
initialize the pair embedding that captures the spatial relationship between each pair of nucleotides.
The raw MSA is also embedded into the network to initialize the MSA representation, which
captures the information contained in the alignment of homologous sequences.
The MSA and pair embeddings are then processed by the MSA Transformer layers,
which use multiple self-attention mechanisms to extract information from the MSA and
pair embeddings, where communication is encouraged between the two to ensure consistency.
Next, the sequence embedding is extracted from the row in the final MSA embedding
that corresponds to the query sequence, which is further processed using self-attention
mechanisms by the Sequence Transformer layers. Finally, the predicted distance and orientation
maps are generated from a linear projection of the final pair embedding,
while the pseudo-torsion angles are predicted by a linear projection of the sequence embedding.
(B) Module-II: 3D structure assembly. These restraints are converted into a negative log-likelihood potential, and L-BFGS
folding simulations are used to minimize this potential in torsion angle space to produce a final model.
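The conversion of a predicted restraint distribution into a negative log-likelihood potential can be illustrated with a minimal sketch. This is not the DeepFoldRNA implementation: the function name `nll_potential`, the smoothing constant `eps`, and the four-bin probability vector are all assumptions chosen for illustration, and the L-BFGS minimization step itself is not shown.

```python
import math

def nll_potential(bin_probs, eps=1e-8):
    """Turn predicted bin probabilities into negative log-likelihood energies.

    eps is a hypothetical smoothing constant that avoids log(0).
    """
    total = sum(bin_probs)
    return [-math.log(p / total + eps) for p in bin_probs]

# Made-up predicted distribution over four distance bins for one residue pair.
probs = [0.05, 0.70, 0.20, 0.05]
energies = nll_potential(probs)

# The most probable bin receives the lowest energy.
best_bin = min(range(len(energies)), key=energies.__getitem__)
```

Under this scheme, a confident prediction produces a deep, narrow energy well, while a flat distribution contributes little to the total potential.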
Figure 2. Definition of the geometric restraints predicted by DeepFoldRNA.
(A) inter-residue distances; (B) inter-residue torsion angles; and (C) backbone pseudo-torsion angles.
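As a rough illustration of these quantities, the sketch below computes a distance between two points and a signed torsion angle defined by four points. The helper names and the coordinates are invented for this example; which atoms DeepFoldRNA uses for each restraint is defined by the method itself, not by this code.

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def distance(p, q):
    """Euclidean distance between two 3D points."""
    d = _sub(p, q)
    return math.sqrt(_dot(d, d))

def dihedral(p0, p1, p2, p3):
    """Signed torsion angle in degrees defined by four 3D points."""
    b1 = _sub(p1, p0)
    b2 = _sub(p2, p1)  # central bond, the rotation axis
    b3 = _sub(p3, p2)
    n1 = _cross(b1, b2)  # normal of the plane through p0, p1, p2
    n2 = _cross(b2, b3)  # normal of the plane through p1, p2, p3
    b2_len = math.sqrt(_dot(b2, b2))
    b2_hat = tuple(x / b2_len for x in b2)
    x = _dot(n1, n2)
    y = _dot(b2_hat, _cross(n1, n2))
    return math.degrees(math.atan2(y, x))

# Made-up coordinates: the first and last points lie on the same side
# of the central bond (cis), so the torsion angle is zero.
cis = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1))
```

A pseudo-torsion angle is computed the same way; only the choice of the four atoms differs.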
The DeepFoldRNA server takes as input a nucleic acid sequence in FASTA format, as well as
an email address used to notify the user when the job has finished running. In addition,
an email is sent after submission to confirm that the job is running.
The output of the DeepFoldRNA server consists of the following sections:
The predicted structure model, with a download link provided underneath the graphic.
One model will be generated for each sequence. To generate additional models, users
may download the stand-alone package, which is freely available to academic/non-profit users.
The input sequence in FASTA format.
The sequence profile generated from the alignment of homologous sequences detected
for the given target. Under the sequence logo, users may download the raw
multiple sequence alignment (MSA). Note that U is converted to T in the MSA.
The predicted distance and orientation maps, which are used to guide the DeepFoldRNA simulations.
Specifically, the distance map depicts the pairwise distances (from 0 to 40 Angstroms) between the
N1/N9 base atoms, while the orientation map illustrates the pairwise torsion angles formed by
the C4' and N1/N9 atoms of nucleotide i and the N1/N9 and C4' atoms of nucleotide j
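
To make the distance map concrete, the following sketch builds a pairwise distance matrix from one representative atom per nucleotide. It is illustrative only: the coordinates are made up, `distance_map` is a hypothetical helper, and capping values at 40 Angstroms is an assumption based on the plotted range described above, not a documented detail of the server.

```python
import math

MAX_DIST = 40.0  # assumed cap, matching the 0 to 40 Angstrom range of the map

def distance_map(coords, max_dist=MAX_DIST):
    """Pairwise Euclidean distances between points, capped at max_dist."""
    n = len(coords)
    return [[min(math.dist(coords[i], coords[j]), max_dist) for j in range(n)]
            for i in range(n)]

# Made-up N1/N9 coordinates (in Angstroms) for three nucleotides.
n_atoms = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (0.0, 0.0, 50.0)]
dmap = distance_map(n_atoms)
```

The resulting matrix is symmetric with a zero diagonal, which is why such maps are usually read from one triangle only.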