We developed a protein design algorithm that selects the final designed sequences
from clusters of low-free-energy decoy sequences.
Test set
Click on the test proteins and browse the designed sequences and their I-TASSER
predicted 3D models.
1GUTA 2CMPA 3G36A 3FILA 1OAIA 2VPBA 2V1QA 1KQ1A 2P5KA 1TUKA 2O9SA 1UTGA 1V5IB 2B97A 2QCPX 2CVIA 3G21A 2J8BA 2D3DA 3FEAA 2ZXYA 2GPIA 2FTRA 1IUJA 1MG4A 2PV2A 1VQSA 3IV4A 3CTGA 1NZ0A 3E9TA 1O7IA 3H7HA 1WN2A 2F01A 1DBWA 2ERBA 1EAQA 1OH0A 1VZIA 2VZCA 1ZHVA 1JF8A 3EBTA 2PR7A 1QHQA 2O1QA 2WLVA 2ANXA 3FH2A 2V0UA 3EF8A
INSTALLATION (Linux)
1. cd to the directory where you want to install the program.
2. Download and unpack "cluseq.tar.gz" into the installation directory.
3. Change the path to input file "blosum62.txt" in source file "cseqlongset/Amino1CharSeq.cpp" (line 6). The path must be changed to the location you choose for "blosum62.txt" in your file system.
4. Run the "build" script from the installation directory.
5. You are ready to run CLUSEQ.

USAGE
cluseq <path_to_the_input_file>
- The input file (example) contains the set of amino acid sequences to cluster. The sequences are all of the same length and occupy consecutive lines in the file. Each sequence lies on a single line and is immediately followed by a line terminator.
- The output (example) is printed to screen. First it lists the cluster centers (tags), then it lists the entire clusters. The clusters are expressed in terms of sequence indices in the input file. Both the sequence indices and the cluster indices start at 0. Cluster 0 is the largest cluster. The first index of each cluster represents the tag of the cluster.
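
The input-file rules above (equal-length sequences, one per line, each followed by a line terminator) can be checked before running CLUSEQ. The following is a minimal Python sketch and is not part of the CLUSEQ distribution. The function name check_cluseq_input is hypothetical, and the restriction to the 20 standard one-letter amino acid codes is an assumption, since the set of letters CLUSEQ accepts is not stated here.

```python
# Assumption: sequences use only the 20 standard one-letter amino acid codes.
STANDARD_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def check_cluseq_input(text):
    """Return a list of problems found in the text of a candidate input file.

    Hypothetical helper; sequence indices are 0-based, matching the
    indices CLUSEQ uses in its output.
    """
    if not text:
        return ["input is empty"]

    problems = []
    # Every sequence, including the last, must be followed by a line terminator.
    if not text.endswith("\n"):
        problems.append("last sequence is not followed by a line terminator")

    lines = text.splitlines()
    # The first line sets the length that all other sequences must match.
    expected_length = len(lines[0])
    for index, sequence in enumerate(lines):
        if not sequence:
            problems.append(f"sequence {index}: empty line")
            continue
        if len(sequence) != expected_length:
            problems.append(
                f"sequence {index}: length {len(sequence)}, "
                f"expected {expected_length}"
            )
        unexpected = sorted(set(sequence) - STANDARD_AMINO_ACIDS)
        if unexpected:
            problems.append(
                f"sequence {index}: unexpected characters {''.join(unexpected)}"
            )
    return problems


if __name__ == "__main__":
    import sys

    # Usage: python check_input.py <path_to_the_input_file>
    with open(sys.argv[1], newline="") as handle:
        for problem in check_cluseq_input(handle.read()):
            print(problem)
```

An empty result means the file satisfies the stated layout rules. If CLUSEQ accepts letters outside the standard 20, the STANDARD_AMINO_ACIDS set would need to be widened accordingly.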