About geo.npy file

Moderator: robpearc

sawcheet
Posts: 1
Joined: Mon Dec 04, 2023 3:03 pm

About geo.npy file

Post by sawcheet »

After running DRfold on my sequence, the output file geo.npy contains arrays named pp, cc, nn, pcc, pnn, cnn, pccp, pnnp, and cnnc.
I don't know what these terms mean, and I could not find any online resource that explains them. Can anyone help me understand what they are, and what the long lists of scores in these arrays represent when plotted between nucleotides?
dr.jawairia
Posts: 2
Joined: Mon Dec 04, 2023 6:44 pm

Re: About geo.npy file

Post by dr.jawairia »

ATACAAGAGATGTGAGAAGCACCATAAAAGGCGTTGTGAGGAGTTGTGGGGGAGTGAGGGAGAGAAGAGG
TTGAAAAGCTTATTAGCTGCTGTACGGTAAAACTCCTTCTTTCTGCAACATGGGGAAGAACAAACTCCTT
CATCCAAGTCTGGTTCTTCTCCTCTTGGTCCTCCTGCCCACAGACGCCTCAGTCTCTGGAAAACCGCAGT
ATATGGTTCTGGTCCCCTCCCTGCTCCACACTGAGACCACTGAGAAGGGCTGTGTCCTTCTGAGCTACCT
GAATGAGACAGTGACTGTAAGTGCTTCCTTGGAGTCTGTCAGGGGAAACAGGAGCCTCTTCACTGACCTG
GAGGCGGAGAATGACGTACTCCACTGTGTCGCCTTCGCTGTCCCAAAGTCTTCATCCAATGAGGAGGTAA
TGTTCCTCACTGTCCAAGTGAAAGGACCAACCCAAGAATTTAAGAAGCGGACCACAGTGATGGTTAAGAA
CGAGGACAGTCTGGTCTTTGTCCAGACAGACAAATCAATCTACAAACCAGGGCAGACAGTGAAATTTCGT
GTTGTCTCCATGGATGAAAACTTTCACCCCCTGAATGAGTTGATTCCACTAGTATACATTCAGGATCCCA
AAGGAAATCGCATCGCACAATGGCAGAGTTTCCAGTTAGAGGGTGGCCTCAAGCAATTTTCTTTTCCCCT
CTCATCAGAGCCCTTCCAGGGCTCCTACAAGGTGGTGGTACAGAAGAAATCAGGTGGAAGGACAGAGCAC
CCTTTCACCGTGGAGGAATTTGTTCTTCCCAAGTTTGAAGTACAAGTAACAGTGCCAAAGATAATCACCA
TCTTGGAAGAAGAGATGAATGTATCAGTGTGTGGCCTATACACATATGGGAAGCCTGTCCCTGGACATGT
GACTGTGAGCATTTGCAGAAAGTATAGTGACGCTTCCGACTGCCACGGTGAAGATTCACAGGCTTTCTGT
GAGAAATTCAGTGGACAGCTAAACAGCCATGGCTGCTTCTATCAGCAAGTAAAAACCAAGGTCTTCCAGC
TGAAGAGGAAGGAGTATGAAATGAAACTTCACACTGAGGCCCAGATCCAAGAAGAAGGAACAGTGGTGGA
ATTGACTGGAAGGCAGTCCAGTGAAATCACAAGAACCATAACCAAACTCTCATTTGTGAAAGTGGACTCA
CACTTTCGACAGGGAATTCCCTTCTTTGGGCAGGTGCGCCTAGTAGATGGGAAAGGCGTCCCTATACCAA
ATAAAGTCATATTCATCAGAGGAAATGAAGCAAACTATTACTCCAATGCTACCACGGATGAGCATGGCCT
TGTACAGTTCTCTATCAACACCACCAATGTTATGGGTACCTCTCTTACTGTTAGGGTCAATTACAAGGAT
CGTAGTCCCTGTTACGGCTACCAGTGGGTGTCAGAAGAACACGAAGAGGCACATCACACTGCTTATCTTG
TGTTCTCCCCAAGCAAGAGCTTTGTCCACCTTGAGCCCATGTCTCATGAACTACCCTGTGGCCATACTCA
GACAGTCCAGGCACATTATATTCTGAATGGAGGCACCCTGCTGGGGCTGAAGAAGCTCTCCTTCTATTAT
CTGATAATGGCAAAGGGAGGCATTGTCCGAACTGGGACTCATGGACTGCTTGTGAAGCAGGAAGACATGA
AGGGCCATTTTTCCATCTCAATCCCTGTGAAGTCAGACATTGCTCCTGTCGCTCGGTTGCTCATCTATGC
TGTTTTACCTACCGGGGACGTGATTGGGGATTCTGCAAAATATGATGTTGAAAATTGTCTGGCCAACAAG
GTGGATTTGAGCTTCAGCCCATCACAAAGTCTCCCAGCCTCACACGCCCACCTGCGAGTCACAGCGGCTC
CTCAGTCCGTCTGCGCCCTCCGTGCTGTGGACCAAAGCGTGCTGCTCATGAAGCCTGATGCTGAGCTCTC
GGCGTCCTCGGTTTACAACCTGCTACCAGAAAAGGACCTCACTGGCTTCCCTGGGCCTTTGAATGACCAG
GACAATGAAGACTGCATCAATCGTCATAATGTCTATATTAATGGAATCACATATACTCCAGTATCAAGTA
CAAATGAAAAGGATATGTACAGCTTCCTAGAGGACATGGGCTTAAAGGCATTCACCAACTCAAAGATTCG
TAAACCCAAAATGTGTCCACAGCTTCAACAGTATGAAATGCATGGACCTGAAGGTCTACGTGTAGGTTTT
TATGAGTCAGATGTAATGGGAAGAGGCCATGCACGCCTGGTGCATGTTGAAGAGCCTCACACGGAGACCG
TACGAAAGTACTTCCCTGAGACATGGATCTGGGATTTGGTGGTGGTAAACTCAGCAGGTGTGGCTGAGGT
AGGAGTAACAGTCCCTGACACCATCACCGAGTGGAAGGCAGGGGCCTTCTGCCTGTCTGAAGATGCTGGA
CTTGGTATCTCTTCCACTGCCTCTCTCCGAGCCTTCCAGCCCTTCTTTGTGGAGCTCACAATGCCTTACT
CTGTGATTCGTGGAGAGGCCTTCACACTCAAGGCCACGGTCCTAAACTACCTTCCCAAATGCATCCGGGT
CAGTGTGCAGCTGGAAGCCTCTCCCGCCTTCCTAGCTGTCCCAGTGGAGAAGGAACAAGCGCCTCACTGC
ATCTGTGCAAACGGGCGGCAAACTGTGTCCTGGGCAGTAACCCCAAAGTCATTAGGAAATGTGAATTTCA
CTGTGAGCGCAGAGGCACTAGAGTCTCAAGAGCTGTGTGGGACTGAGGTGCCTTCAGTTCCTGAACACGG
AAGGAAAGACACAGTCATCAAGCCTCTGTTGGTTGAACCTGAAGGACTAGAGAAGGAAACAACATTCAAC
TCCCTACTTTGTCCATCAGGTGGTGAGGTTTCTGAAGAATTATCCCTGAAACTGCCACCAAATGTGGTAG
AAGAATCTGCCCGAGCTTCTGTCTCAGTTTTGGGAGACATATTAGGCTCTGCCATGCAAAACACACAAAA
TCTTCTCCAGATGCCCTATGGCTGTGGAGAGCAGAATATGGTCCTCTTTGCTCCTAACATCTATGTACTG
GATTATCTAAATGAAACACAGCAGCTTACTCCAGAGATCAAGTCCAAGGCCATTGGCTATCTCAACACTG
GTTACCAGAGACAGTTGAACTACAAACACTATGATGGCTCCTACAGCACCTTTGGGGAGCGATATGGCAG
GAACCAGGGCAACACCTGGCTCACAGCCTTTGTTCTGAAGACTTTTGCCCAAGCTCGAGCCTACATCTTC
ATCGATGAAGCACACATTACCCAAGCCCTCATATGGCTCTCCCAGAGGCAGAAGGACAATGGCTGTTTCA
GGAGCTCTGGGTCACTGCTCAACAATGCCATAAAGGGAGGAGTAGAAGATGAAGTGACCCTCTCCGCCTA
TATCACCATCGCCCTTCTGGAGATTCCTCTCACAGTCACTCACCCTGTTGTCCGCAATGCCCTGTTTTGC
CTGGAGTCAGCCTGGAAGACAGCACAAGAAGGGGACCATGGCAGCCATGTATATACCAAAGCACTGCTGG
CCTATGCTTTTGCCCTGGCAGGTAACCAGGACAAGAGGAAGGAAGTACTCAAGTCACTTAATGAGGAAGC
TGTGAAGAAAGACAACTCTGTCCATTGGGAGCGCCCTCAGAAACCCAAGGCACCAGTGGGGCATTTTTAC
GAACCCCAGGCTCCCTCTGCTGAGGTGGAGATGACATCCTATGTGCTCCTCGCTTATCTCACGGCCCAGC
CAGCCCCAACCTCGGAGGACCTGACCTCTGCAACCAACATCGTGAAGTGGATCACGAAGCAGCAGAATGC
CCAGGGCGGTTTCTCCTCCACCCAGGACACAGTGGTGGCTCTCCATGCTCTGTCCAAATATGGAGCAGCC
ACATTTACCAGGACTGGGAAGGCTGCACAGGTGACTATCCAGTCTTCAGGGACATTTTCCAGCAAATTCC
AAGTGGACAACAACAACCGCCTGTTACTGCAGCAGGTCTCATTGCCAGAGCTGCCTGGGGAATACAGCAT
GAAAGTGACAGGAGAAGGATGTGTCTACCTCCAGACATCCTTGAAATACAATATTCTCCCAGAAAAGGAA
GAGTTCCCCTTTGCTTTAGGAGTGCAGACTCTGCCTCAAACTTGTGATGAACCCAAAGCCCACACCAGCT
TCCAAATCTCCCTAAGTGTCAGTTACACAGGGAGCCGCTCTGCCTCCAACATGGCGATCGTTGATGTGAA
GATGGTCTCTGGCTTCATTCCCCTGAAGCCAACAGTGAAAATGCTTGAAAGATCTAACCATGTGAGCCGG
ACAGAAGTCAGCAGCAACCATGTCTTGATTTACCTTGATAAGGTGTCAAATCAGACACTGAGCTTGTTCT
TCACGGTTCTGCAAGATGTCCCAGTAAGAGATCTGAAACCAGCCATAGTGAAAGTCTATGATTACTACGA
GACGGATGAGTTTGCAATTGCTGAGTACAATGCTCCTTGCAGCAAAGATCTTGGAAATGCTTGAAGACCA
CAAGGCTGAAAAGTGCTTTGCTGGAGTCCTGTTCTCAGAGCTCCACAGAAGACACGTGTTTTTGTATCTT
TAAAGACTTGATGAATAAACACTTTTTCTGGTCAATGTC
liyangum
Posts: 8
Joined: Tue May 11, 2021 1:42 am

Re: About geo.npy file

Post by liyangum »

sawcheet wrote: Mon Dec 04, 2023 3:06 pm After running DRfold on my sequence, the output file geo.npy contains arrays named pp, cc, nn, pcc, pnn, cnn, pccp, pnnp, and cnnc.
I don't know what these terms mean, and I could not find any online resource that explains them. Can anyone help me understand what they are, and what the long lists of scores in these arrays represent when plotted between nucleotides?
Hi,

The terms in geo.npy are described in detail in the section "Prediction terms of geometry models" of the DRfold paper (https://www.nature.com/articles/s41467-023-41303-9).
For RNA structure prediction you can normally ignore geo.npy, since our program has already incorporated these predictions into the final structure. However, if you want to analyze the distances or orientations, you can refer to the code starting at line 180 of https://github.com/leeyang/DRfold/blob/ ... ld/Fold.py to see how those geometry predictions are used.

Thanks,
Yang LI
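
For anyone who just wants to look inside the file before reading Fold.py, here is a minimal sketch in Python. It assumes geo.npy is a Python dictionary of arrays saved with NumPy (which would match the named arrays listed in the question), so it has to be loaded with allow_pickle=True; that layout is an assumption, not something confirmed in this thread. The helper summarize_geo and the demo file are hypothetical, and the shapes in the demo are made up purely so the snippet runs on its own.

```python
import os
import tempfile

import numpy as np


def summarize_geo(path):
    """Return {array_name: shape} for a geo.npy-style file (hypothetical helper)."""
    loaded = np.load(path, allow_pickle=True)
    # np.save wraps a Python dict in a 0-d object array; .item() unwraps it.
    if loaded.dtype == object and loaded.shape == ():
        loaded = loaded.item()
    if isinstance(loaded, dict):
        return {name: np.asarray(value).shape for name, value in loaded.items()}
    # Fallback: the file holds a single plain array rather than a dict.
    return {"<array>": loaded.shape}


# Synthetic stand-in for a real geo.npy. The key names come from the question;
# the sizes (5 nucleotides, 4 bins) are invented for this demo only.
SEQ_LEN, BINS = 5, 4
names = ["pp", "cc", "nn", "pcc", "pnn", "cnn", "pccp", "pnnp", "cnnc"]
fake_geo = {name: np.zeros((SEQ_LEN, SEQ_LEN, BINS)) for name in names}

demo_path = os.path.join(tempfile.mkdtemp(), "geo_demo.npy")
np.save(demo_path, fake_geo)

summary = summarize_geo(demo_path)
for name, shape in summary.items():
    print(name, shape)
```

To inspect a real prediction, call summarize_geo on the path to your own geo.npy. Only load pickled files that you generated yourself, since allow_pickle=True can execute arbitrary code from an untrusted file. The meaning of each array is given in the paper section mentioned above.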