Locally Installed C-I-Tasser and I-Tasser fails for certain proteins

Moderator: robpearc

rpearson_7
Posts: 17
Joined: Wed Nov 10, 2021 8:05 pm

Re: Locally Installed C-I-Tasser and I-Tasser fails for certain proteins

Post by rpearson_7 »

I am still experiencing problems.

It appears now that only 6 threading programs yield output.

Code:

3.1 do threading
start parallel threading CEthreader
start parallel threading mCEthreader
start parallel threading eCEthreader
start parallel threading PPAS
start parallel threading dPPAS
start parallel threading dPPAS2
start parallel threading Env-PPAS
start parallel threading MUSTER
start parallel threading wPPAS
start parallel threading wdPPAS
start parallel threading wMUSTER
only 6 threading programs have output, please check threading programs

These are the files I have in my directory:

[Attachment: files_in_dir.PNG]

err_CITCEthreader_P10636.fasta:

Code:

slurmstepd: error: couldn't chdir to `/tmp/rpearson/CITP10636.fasta': No such file or directory: going to /tmp instead

WARNING: Ignoring unknown option -mapt ...

WARNING: Ignoring unknown option 0 ...
/home/rpearson/Structure_Prediction_Tools/C-I-TASSER-1.0/contact/ResPre/aaweights.py:189: NumbaWarning:
Compilation is falling back to object mode WITH looplifting enabled because Function "cal_large_matrix1" failed type inference due to: No implementation of function Function(<built-in function zeros>) found for signature:

 >>> zeros(list(int64)<iv=None>)

There are 2 candidate implementations:
   - Of which 2 did not match due to:
   Overload of function 'zeros': File: numba/core/typing/npydecl.py: Line 511.
     With argument(s): '(list(int64)<iv=None>)':
    No match.

During: resolving callee type: Function(<built-in function zeros>)
During: typing of call at /home/rpearson/Structure_Prediction_Tools/C-I-TASSER-1.0/contact/ResPre/aaweights.py (198)


File "aaweights.py", line 198:
def cal_large_matrix1(msa,weight):
    <source elided>
    pa=np.zeros((N,ALPHA))
    cov=np.zeros([N*ALPHA,N*ALPHA ])
    ^

  @jit
/home/rpearson/Structure_Prediction_Tools/C-I-TASSER-1.0/contact/ResPre/aaweights.py:189: NumbaWarning:
Compilation is falling back to object mode WITHOUT looplifting enabled because Function "cal_large_matrix1" failed type inference due to: Cannot determine Numba type of <class 'numba.core.dispatcher.LiftedLoop'>

File "aaweights.py", line 199:
def cal_large_matrix1(msa,weight):
    <source elided>
    cov=np.zeros([N*ALPHA,N*ALPHA ])
    for i in range(N):
    ^

  @jit
/home/rpearson/.conda/envs/Quark_and_Itasser_Python3/lib/python3.6/site-packages/numba/core/object_mode_passes.py:152: NumbaWarning: Function "cal_large_matrix1" was compiled in object mode without forceobj=True, but has lifted loops.

File "aaweights.py", line 192:
def cal_large_matrix1(msa,weight):
    <source elided>
    #output:21*l*21*l
    ALPHA=21
    ^

  state.func_ir.loc))
/home/rpearson/.conda/envs/Quark_and_Itasser_Python3/lib/python3.6/site-packages/numba/core/object_mode_passes.py:162: NumbaDeprecationWarning:
Fall-back from the nopython compilation path to the object mode compilation path has been detected, this is deprecated behaviour.

For more information visit https://numba.pydata.org/numba-doc/latest/reference/deprecation.html#deprecation-of-object-mode-fall-back-behaviour-when-using-jit

File "aaweights.py", line 192:
def cal_large_matrix1(msa,weight):
    <source elided>
    #output:21*l*21*l
    ALPHA=21
    ^

  state.func_ir.loc))
/home/rpearson/Structure_Prediction_Tools/C-I-TASSER-1.0/contact/ResPre/respre.py:52: UserWarning: volatile was removed and now has no effect. Use `with torch.no_grad():` instead.
  x=Variable(x,volatile=True)

err_CITdPPAS2_P10636.fasta:

Code:

slurmstepd: error: couldn't chdir to `/tmp/rpearson/CITP10636.fasta': No such file or directory: going to /tmp instead
At line 440 of file ppa1.f
Fortran runtime error: No such file or directory
Illegal division by zero at /var/spool/slurmd/job571942/slurm_script line 358.

err_CITdPPAS_P10636.fasta:

Code:

slurmstepd: error: couldn't chdir to `/tmp/rpearson/CITP10636.fasta': No such file or directory: going to /tmp instead
At line 440 of file ppa1.f
Fortran runtime error: No such file or directory
Illegal division by zero at /var/spool/slurmd/job571941/slurm_script line 359.

err_CITeCEthreader_P10636.fasta:

Code:

slurmstepd: error: couldn't chdir to `/tmp/rpearson/CITP10636.fasta': No such file or directory: going to /tmp instead

WARNING: Ignoring unknown option -mapt ...

WARNING: Ignoring unknown option 0 ...
/home/rpearson/Structure_Prediction_Tools/C-I-TASSER-1.0/contact/ResPre/aaweights.py:189: NumbaWarning:
Compilation is falling back to object mode WITH looplifting enabled because Function "cal_large_matrix1" failed type inference due to: No implementation of function Function(<built-in function zeros>) found for signature:

 >>> zeros(list(int64)<iv=None>)

There are 2 candidate implementations:
   - Of which 2 did not match due to:
   Overload of function 'zeros': File: numba/core/typing/npydecl.py: Line 511.
     With argument(s): '(list(int64)<iv=None>)':
    No match.

During: resolving callee type: Function(<built-in function zeros>)
During: typing of call at /home/rpearson/Structure_Prediction_Tools/C-I-TASSER-1.0/contact/ResPre/aaweights.py (198)


File "aaweights.py", line 198:
def cal_large_matrix1(msa,weight):
    <source elided>
    pa=np.zeros((N,ALPHA))
    cov=np.zeros([N*ALPHA,N*ALPHA ])
    ^

  @jit
/home/rpearson/Structure_Prediction_Tools/C-I-TASSER-1.0/contact/ResPre/aaweights.py:189: NumbaWarning:
Compilation is falling back to object mode WITHOUT looplifting enabled because Function "cal_large_matrix1" failed type inference due to: Cannot determine Numba type of <class 'numba.core.dispatcher.LiftedLoop'>

File "aaweights.py", line 199:
def cal_large_matrix1(msa,weight):
    <source elided>
    cov=np.zeros([N*ALPHA,N*ALPHA ])
    for i in range(N):
    ^

  @jit
/home/rpearson/.conda/envs/Quark_and_Itasser_Python3/lib/python3.6/site-packages/numba/core/object_mode_passes.py:152: NumbaWarning: Function "cal_large_matrix1" was compiled in object mode without forceobj=True, but has lifted loops.

File "aaweights.py", line 192:
def cal_large_matrix1(msa,weight):
    <source elided>
    #output:21*l*21*l
    ALPHA=21
    ^

  state.func_ir.loc))
/home/rpearson/.conda/envs/Quark_and_Itasser_Python3/lib/python3.6/site-packages/numba/core/object_mode_passes.py:162: NumbaDeprecationWarning:
Fall-back from the nopython compilation path to the object mode compilation path has been detected, this is deprecated behaviour.

For more information visit https://numba.pydata.org/numba-doc/latest/reference/deprecation.html#deprecation-of-object-mode-fall-back-behaviour-when-using-jit

File "aaweights.py", line 192:
def cal_large_matrix1(msa,weight):
    <source elided>
    #output:21*l*21*l
    ALPHA=21
    ^

  state.func_ir.loc))
/home/rpearson/Structure_Prediction_Tools/C-I-TASSER-1.0/contact/ResPre/respre.py:52: UserWarning: volatile was removed and now has no effect. Use `with torch.no_grad():` instead.
  x=Variable(x,volatile=True)

err_CITEnv-PPAS_P10636.fasta:

Code:

slurmstepd: error: couldn't chdir to `/tmp/rpearson/CITP10636.fasta': No such file or directory: going to /tmp instead
At line 697 of file zal3.f
Fortran runtime error: No such file or directory

err_CITmCEthreader_P10636.fasta:

Code:

slurmstepd: error: couldn't chdir to `/tmp/rpearson/CITP10636.fasta': No such file or directory: going to /tmp instead

WARNING: Ignoring unknown option -mapt ...

WARNING: Ignoring unknown option 0 ...
/home/rpearson/Structure_Prediction_Tools/C-I-TASSER-1.0/contact/ResPre/aaweights.py:189: NumbaWarning:
Compilation is falling back to object mode WITH looplifting enabled because Function "cal_large_matrix1" failed type inference due to: No implementation of function Function(<built-in function zeros>) found for signature:

 >>> zeros(list(int64)<iv=None>)

There are 2 candidate implementations:
   - Of which 2 did not match due to:
   Overload of function 'zeros': File: numba/core/typing/npydecl.py: Line 511.
     With argument(s): '(list(int64)<iv=None>)':
    No match.

During: resolving callee type: Function(<built-in function zeros>)
During: typing of call at /home/rpearson/Structure_Prediction_Tools/C-I-TASSER-1.0/contact/ResPre/aaweights.py (198)


File "aaweights.py", line 198:
def cal_large_matrix1(msa,weight):
    <source elided>
    pa=np.zeros((N,ALPHA))
    cov=np.zeros([N*ALPHA,N*ALPHA ])
    ^

  @jit
/home/rpearson/Structure_Prediction_Tools/C-I-TASSER-1.0/contact/ResPre/aaweights.py:189: NumbaWarning:
Compilation is falling back to object mode WITHOUT looplifting enabled because Function "cal_large_matrix1" failed type inference due to: Cannot determine Numba type of <class 'numba.core.dispatcher.LiftedLoop'>

File "aaweights.py", line 199:
def cal_large_matrix1(msa,weight):
    <source elided>
    cov=np.zeros([N*ALPHA,N*ALPHA ])
    for i in range(N):
    ^

  @jit
/home/rpearson/.conda/envs/Quark_and_Itasser_Python3/lib/python3.6/site-packages/numba/core/object_mode_passes.py:152: NumbaWarning: Function "cal_large_matrix1" was compiled in object mode without forceobj=True, but has lifted loops.

File "aaweights.py", line 192:
def cal_large_matrix1(msa,weight):
    <source elided>
    #output:21*l*21*l
    ALPHA=21
    ^

  state.func_ir.loc))
/home/rpearson/.conda/envs/Quark_and_Itasser_Python3/lib/python3.6/site-packages/numba/core/object_mode_passes.py:162: NumbaDeprecationWarning:
Fall-back from the nopython compilation path to the object mode compilation path has been detected, this is deprecated behaviour.

For more information visit https://numba.pydata.org/numba-doc/latest/reference/deprecation.html#deprecation-of-object-mode-fall-back-behaviour-when-using-jit

File "aaweights.py", line 192:
def cal_large_matrix1(msa,weight):
    <source elided>
    #output:21*l*21*l
    ALPHA=21
    ^

  state.func_ir.loc))
/home/rpearson/Structure_Prediction_Tools/C-I-TASSER-1.0/contact/ResPre/respre.py:52: UserWarning: volatile was removed and now has no effect. Use `with torch.no_grad():` instead.
  x=Variable(x,volatile=True)

err_CITMUSTER_P10636.fasta:

Code:

slurmstepd: error: couldn't chdir to `/tmp/rpearson/CITP10636.fasta': No such file or directory: going to /tmp instead
At line 1067 of file zal33.f
Fortran runtime error: No such file or directory
Illegal division by zero at /var/spool/slurmd/job571944/slurm_script line 604.

err_CITPPAS_P10636.fasta:

Code:

slurmstepd: error: couldn't chdir to `/tmp/rpearson/CITP10636.fasta': No such file or directory: going to /tmp instead

err_CITwdPPAS_P10636.fasta:

Code:

slurmstepd: error: couldn't chdir to `/tmp/rpearson/CITP10636.fasta': No such file or directory: going to /tmp instead
Exception in thread "main" java.io.FileNotFoundException: /home/rpearson/Structure_Prediction_Tools/CIT_Lib/DEP/1a4zA1.dep (No such file or directory)
        at java.io.FileInputStream.open0(Native Method)
        at java.io.FileInputStream.open(FileInputStream.java:195)
        at java.io.FileInputStream.<init>(FileInputStream.java:138)
        at java.io.FileInputStream.<init>(FileInputStream.java:93)
        at java.io.FileReader.<init>(FileReader.java:58)
        at c.a(c.java)
        at c.main(c.java)
Illegal division by zero at /var/spool/slurmd/job571946/slurm_script line 351.

err_CITwMUSTER_P10636.fasta:

Code:

slurmstepd: error: couldn't chdir to `/tmp/rpearson/CITP10636.fasta': No such file or directory: going to /tmp instead
Exception in thread "main" java.io.FileNotFoundException: /home/rpearson/Structure_Prediction_Tools/CIT_Lib/DEP/1a4zA1.dep (No such file or directory)
        at java.io.FileInputStream.open0(Native Method)
        at java.io.FileInputStream.open(FileInputStream.java:195)
        at java.io.FileInputStream.<init>(FileInputStream.java:138)
        at java.io.FileInputStream.<init>(FileInputStream.java:93)
        at java.io.FileReader.<init>(FileReader.java:58)
        at e.a(e.java)
        at e.main(e.java)
Illegal division by zero at /var/spool/slurmd/job571947/slurm_script line 575.

It looks to me like I have a problem with CITwMUSTER and CITwPPAS, but I am fairly lost. I just downloaded the DEP directory from your website and replaced my old DEP dir with it.

To be clear, I am running C-I-Tasser and I-Tasser using the same libraries. The only difference between a C-I-Tasser run and an I-Tasser run is whether the -cit flag is set to true or false. Can CIT and IT share the same libraries, or should I point them to different locations?

I am going to copy all files from both versions of the DEP dir into a new dir that I will name DEP. I hope this clears the constant error messages. I wish instructions for a working conda environment, with package versions etc., were available. I also wish the default settings were provided so that results could be as close as possible to those from your server. Maybe this will be possible in the future.

Please let me know if you have any ideas or suggestions!

Thanks so much,

Rich

Post by rpearson_7 »

Well, this is not looking good so far...

I am now getting the following error in err_CITwMUSTER_Q9KL03.fasta:

Code:

slurmstepd: error: couldn't chdir to `/tmp/rpearson/CITQ9KL03.fasta': No such file or directory: going to /tmp instead
Exception in thread "main" java.lang.NumberFormatException: For input string: "-NAN."
        at sun.misc.FloatingDecimal.readJavaFormatString(FloatingDecimal.java:2043)
        at sun.misc.FloatingDecimal.parseFloat(FloatingDecimal.java:122)
        at java.lang.Float.parseFloat(Float.java:451)
        at java.lang.Float.valueOf(Float.java:416)
        at e.a(e.java)
        at e.main(e.java)

The same error appears in err_CITwdPPAS_Q9KL03.fasta:

Code:

slurmstepd: error: couldn't chdir to `/tmp/rpearson/CITQ9KL03.fasta': No such file or directory: going to /tmp instead
Exception in thread "main" java.lang.NumberFormatException: For input string: "-NAN."
        at sun.misc.FloatingDecimal.readJavaFormatString(FloatingDecimal.java:2043)
        at sun.misc.FloatingDecimal.parseFloat(FloatingDecimal.java:122)
        at java.lang.Float.parseFloat(Float.java:451)
        at java.lang.Float.valueOf(Float.java:416)
        at c.a(c.java)
        at c.main(c.java)

Here is another note. Since I am re-running some of these proteins, I did not want to delete the entire directory and waste time re-creating the MSA etc. So I deleted all err_* files, all init.* files, and all out_* files, then re-ran from there. I am at a complete loss now and not sure what to do from here.
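
For reference, the cleanup I describe above could be sketched roughly as follows. This is only an illustration: the function names list_stale and clean_stale are made up and are not part of C-I-TASSER, and it assumes the err_*, init.* and out_* files sit directly in the protein's data directory. It also assumes a find that supports -maxdepth and -delete (GNU find on CentOS does).

```bash
#!/bin/bash

# Hypothetical helper: show the files that would be removed,
# without touching anything (a dry run).
list_stale() {
    find "$1" -maxdepth 1 -type f \
        \( -name 'err_*' -o -name 'init.*' -o -name 'out_*' \)
}

# Hypothetical helper: remove the same files, printing each one.
# Everything else in the directory (MSA, seq.fasta, ...) is left alone.
clean_stale() {
    find "$1" -maxdepth 1 -type f \
        \( -name 'err_*' -o -name 'init.*' -o -name 'out_*' \) \
        -print -delete
}

# Usage (path is a placeholder):
#   list_stale  /path/to/datadir
#   clean_stale /path/to/datadir
```

Running list_stale first makes it easy to confirm that only the threading leftovers match before anything is deleted.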

Post by rpearson_7 »

Okay, I made some more progress.

First, let's recap what I am doing and some quick problems and solutions I found so far.

I am using the standalone C-I-Tasser program on an HPC cluster (Linux, CentOS, with the SLURM job manager). I downloaded the libraries a long time ago and have only recently started using the program on a regular basis. I started to notice that some of the longer sequences (more than ~550 residues) would not produce model outputs. I figured this could be a time or memory issue, so I went into runI-TASSER.pl and changed the memory from 10000mb to 20000mb and the time from 72:00:00 to 144:00:00. When I ran this, I got errors about the time, so I reverted the time to the default 72:00:00 and kept the memory at 20000mb. This seemed to work for a couple of the larger proteins I was having issues with.

However, some of the proteins gave me errors about missing $/DEP/*.dep files. I confirmed that these files really were missing from the DEP directory in the library. I downloaded a fresh DEP directory, checked whether the missing file was in the new DEP dir, and it was. I re-ran the proteins that had errored out and still got missing .dep file errors on some of them (others ran successfully). This time the missing .dep files were not in the new DEP dir, but they were in my old DEP dir. So I made a combined DEP dir by copying all of the files from the new DEP dir into the old one. I re-ran the proteins and each one worked perfectly.
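
In case it helps anyone doing the same merge, here is a minimal sketch of how two DEP directories could be compared and combined. The function names missing_dep and merge_dep are made up for illustration and are not part of C-I-TASSER, and the paths in the usage comment are placeholders. It assumes the files of interest are the *.dep files sitting directly in each directory.

```bash
#!/bin/bash

# Hypothetical helper: print the names of .dep files that exist in
# directory $1 but are absent from directory $2.
missing_dep() {
    local src="$1" dst="$2" f
    for f in "$src"/*.dep; do
        [ -e "$f" ] || continue    # glob matched nothing
        [ -e "$dst/$(basename "$f")" ] || basename "$f"
    done
}

# Hypothetical helper: copy every .dep file from $1 into $2,
# never overwriting a file that $2 already has.
merge_dep() {
    local src="$1" dst="$2" f
    mkdir -p "$dst"
    for f in "$src"/*.dep; do
        [ -e "$f" ] || continue
        [ -e "$dst/$(basename "$f")" ] || cp -p "$f" "$dst/"
    done
}

# Usage (paths are placeholders):
#   missing_dep /path/to/DEP_old /path/to/DEP_new   # what the new download lacks
#   merge_dep   /path/to/DEP_old /path/to/DEP_new   # fill the gaps
```

Not overwriting existing files is a deliberate choice here: where both directories have a file, the copy already in the target directory is kept.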

The only thing I am faced with now is the following:

I need 5 models. Of the approximately 55 proteins I ran, 8 yielded only 1 model for both C-I-Tasser and I-Tasser, and 1 other protein yielded only 3 models for both. I find it a little strange that C-I-Tasser and I-Tasser yielded the same number of models in every single case.
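
A rough sketch of how the runs with too few models could be tallied is below. It rests on assumptions that should be checked against the actual output: that each protein has its own subdirectory under one root, and that final models are written there as model1.pdb, model2.pdb, and so on. The function names count_models and report_short_runs are made up for illustration.

```bash
#!/bin/bash

# Assumption: final models are named model1.pdb, model2.pdb, ...
# Adjust the -name pattern if the output files are named differently.
count_models() {
    find "$1" -maxdepth 1 -type f -name 'model[0-9]*.pdb' | wc -l | tr -d ' '
}

# Print "<directory name><TAB><count>" for every subdirectory of $1
# that holds fewer than $2 models (default 5).
report_short_runs() {
    local root="$1" want="${2:-5}" d n
    for d in "$root"/*/; do
        [ -d "$d" ] || continue
        n=$(count_models "$d")
        if [ "$n" -lt "$want" ]; then
            printf '%s\t%s\n' "$(basename "$d")" "$n"
        fi
    done
}

# Usage (path is a placeholder):
#   report_short_runs /path/to/run_root 5
```

Any subdirectory that is not a protein run, such as a folder holding the input FASTA files, would show up with a count of 0 and can be ignored.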

It is important for my research to obtain 5 model outputs for both C-I-Tasser and I-Tasser. I am wondering if anyone has a suggestion on how I can try to force the programs to yield the correct number of outputs I specified in the command. Here is how I am running the programs:

For C-I-Tasser I am running the following:

Code:

#!/bin/bash

#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=28
#SBATCH -J A0AQQ7_CIT
#SBATCH --mem=0
#SBATCH -D /home/rpearson/research/NegValSet/CIT

cd /home/rpearson/research/NegValSet/CIT

srun --ntasks=1 --nodes=1 --cpus-per-task=28 bash -c "echo Making A0AQQ7 directory.; mkdir A0AQQ7 ; scp ./fastas/A0AQQ7.fasta ./A0AQQ7 ; echo Running A0AQQ7 ; cd /home/rpearson/research/NegValSet/CIT/A0AQQ7 ; mv A0AQQ7.fasta seq.fasta ; perl /home/rpearson/Structure_Prediction_Tools/C-I-TASSER-1.0/I-TASSERmod/runI-TASSER.pl -pkgdir /home/rpearson/Structure_Prediction_Tools/C-I-TASSER-1.0 -libdir /home/rpearson/Structure_Prediction_Tools/CIT_Lib -seqname A0AQQ7.fasta -datadir /home/rpearson/research/NegValSet/CIT/A0AQQ7 -outdir /home/rpearson/research/NegValSet/CIT/A0AQQ7 -runstyle parallel -homoflag benchmark -idcut 0.3 -light true -nmodel 5 -hours 5 -LBS false -EC false -GO false -java_home /usr -cit true; echo A0AQQ7 complete." &

wait
conda deactivate

For I-Tasser I am running the following:

Code:

#!/bin/bash

#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=28
#SBATCH -J A0AQQ7_IT
#SBATCH --mem=0
#SBATCH -D /home/rpearson/research/NegValSet/IT

cd /home/rpearson/research/NegValSet/IT

srun --ntasks=1 --nodes=1 --cpus-per-task=28 bash -c "echo Making A0AQQ7 directory.; mkdir A0AQQ7 ; scp ./fastas/A0AQQ7.fasta ./A0AQQ7 ; echo Running A0AQQ7 ; cd /home/rpearson/research/NegValSet/IT/A0AQQ7 ; mv A0AQQ7.fasta seq.fasta ; perl /home/rpearson/Structure_Prediction_Tools/C-I-TASSER-1.0/I-TASSERmod/runI-TASSER.pl -pkgdir /home/rpearson/Structure_Prediction_Tools/C-I-TASSER-1.0 -libdir /home/rpearson/Structure_Prediction_Tools/IT_Lib -seqname A0AQQ7.fasta -datadir /home/rpearson/research/NegValSet/CIT/A0AQQ7 -outdir /home/rpearson/research/NegValSet/IT/A0AQQ7 -runstyle parallel -homoflag benchmark -idcut 0.3 -light true -nmodel 5 -hours 5 -LBS false -EC false -GO false -java_home /usr -cit false; echo A0AQQ7 complete." &

wait
conda deactivate

Could getting only 1 model be the result of the protein being in the training set? I doubt it, since homologues are supposed to be removed, but who knows. Any suggestions on how to force the correct number of models would be greatly appreciated!

As always, thanks so much!

Rich