Bioinformatics Playground

Live, learn & play safe.

The complete guide to installing DISOPRED 3 (Mac + Linux)

DISOPRED, initially published by Jones and Ward in 2003, is a tool that predicts intrinsically disordered regions (IDRs) in proteins. The third major release, DISOPRED 3, was made public in 2015 (Jones & Cozzetto). This great piece of software not only works as a standalone predictor, it is also an essential component of other useful bioinformatics tools, e.g. ActiveDriver.

According to the authors, DISOPRED 3 offers a significant increase in prediction accuracy and efficiency over previous versions. Although a README.txt is provided, getting DISOPRED 3 to run is not a piece of cake, because the original installation guide leaves out several critical steps.

After numerous attempts and failures, and with the aid of two helpful articles (1, 2), I eventually streamlined the process of installing DISOPRED 3 on a typical Mac or Linux workstation. Here I present the complete guide to properly installing and configuring DISOPRED 3.

Please feel free to leave a response if you have any questions.


BLAST and Database

BLAST toolkit

Note: Please make sure you are not installing BLAST+ as it is NOT yet supported by DISOPRED 3.

  1. Download BLAST 2.2.26 from NCBI. Make sure you download the correct binary for your operating system.
  2. Unzip the compressed file to a new folder called blast-2.2.26.
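
Because DISOPRED 3 relies on the legacy toolkit rather than BLAST+, it can help to confirm that the unpacked folder really contains the legacy programs. The following is a minimal sketch: the function name check_legacy_blast is hypothetical, and the list of programs is an assumption (formatdb is used later in this guide; blastpgp and makemat are the legacy tools DISOPRED 3 is assumed to call).

```bash
#!/usr/bin/env bash
# Hypothetical helper: report whether a folder looks like a legacy BLAST install.
# Usage: check_legacy_blast /path/to/blast-2.2.26
check_legacy_blast() {
    local blast_dir="$1"
    local missing=0
    local tool
    # Assumed tool list: these programs exist in legacy BLAST but not in BLAST+.
    for tool in blastpgp makemat formatdb; do
        if [ -x "$blast_dir/bin/$tool" ]; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    # Exit status 0 only when every program was found.
    return "$missing"
}
```

Calling `check_legacy_blast blast-2.2.26` from the folder that contains the unpacked toolkit prints one line per program and returns a non-zero status if any of them is absent.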

Uniref90 database

Note: please be patient as formatting will take a while to complete.

  1. Create a new folder called blastdb to store all resources
  2. Download the newest Uniref90 (fasta format) from Uniprot
  3. Unzip the compressed file to a new folder named uniref90
  4. Format the database using the formatdb tool in the BLAST toolkit:

     ```bash
     # Change directory to the folder with the unzipped database file
     cd uniref90
     # Change uniref90.fasta to the name of the file you just downloaded
     formatdb -i uniref90.fasta
     ```
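
Since formatting takes a long time, a quick check that it actually finished can save a failed run later. This is only a sketch: db_is_formatted is a hypothetical name, and it assumes that formatdb leaves either a .pin index next to the FASTA file or, for a database split into volumes, a .pal alias file.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeed if formatdb index files exist for a FASTA file.
# Usage: db_is_formatted uniref90/uniref90.fasta
db_is_formatted() {
    local fasta="$1"
    # Assumption: a single-volume protein database has <name>.pin, while a
    # multi-volume database has an alias file <name>.pal instead.
    if [ -f "$fasta.pin" ] || [ -f "$fasta.pal" ]; then
        echo "formatted: $fasta"
        return 0
    fi
    echo "not formatted: $fasta" >&2
    return 1
}
```

For example, `db_is_formatted uniref90/uniref90.fasta` run from the blastdb folder returns a non-zero status when neither file is present.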

Matrices

  1. Download the matrices for blastpgp from NCBI to a new folder named data inside blastdb.
  2. Create a configuration file for BLAST called ~/.ncbirc.
  3. Put in the required parameters shown below. Make sure to change your_name to the name of your home directory. On macOS, home directories are under /Users rather than /home.

; Start the section for BLAST configuration
[BLAST]
; Specifies the path where BLAST databases are installed
BLASTDB=/home/your_name/blastdb/uniref90
BLASTMAT=/home/your_name/blastdb/data
; Specifies the data sources to use for automatic resolution
; for sequence identifiers
DATA_LOADERS=blastdb
[NCBI]
data=/home/your_name/blastdb/data
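
To avoid typing the home directory by hand, the same file can be generated from a shell function. This is a minimal sketch, assuming the folder layout used in this guide (uniref90 and data inside blastdb); the function name write_ncbirc is hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper: write the BLAST configuration shown above to a file.
# Usage: write_ncbirc <blastdb folder> <output file>
write_ncbirc() {
    local blastdb_root="$1"
    local out_file="$2"
    # The here-document expands $blastdb_root, so no manual editing is needed.
    cat > "$out_file" <<EOF
; Start the section for BLAST configuration
[BLAST]
; Specifies the path where BLAST databases are installed
BLASTDB=$blastdb_root/uniref90
BLASTMAT=$blastdb_root/data
; Specifies the data sources to use for automatic resolution
; for sequence identifiers
DATA_LOADERS=blastdb
[NCBI]
data=$blastdb_root/data
EOF
}
```

A call such as `write_ncbirc "$HOME/blastdb" "$HOME/.ncbirc"` produces the right paths on both macOS and Linux, because $HOME already points to /Users/your_name or /home/your_name. Note that it overwrites an existing ~/.ncbirc.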

DISOPRED 3
Download
  1. Download the latest DISOPRED 3 and dso_lib from UCL.
  2. Unzip these compressed files.
  3. Replace the DISOPRED/dso_lib folder with the newly downloaded dso_lib.
  4. If you are running macOS, please modify the source code:
    • In src/disord_pred.c, find these lines

      #ifdef unix
      #define CLKRATE 1000000
      #endif

      and replace with

      #ifdef unix
      #define CLKRATE 1000000
      #endif

      #ifdef __MACH__
      #define CLKRATE 1000000
      #endif

    • In src/Makefile, find these lines (the mv that ships with macOS does not support the GNU -t option)

      install:
          mkdir ../bin/
          mv -t ../bin disopred2 diso_neu_net diso_neighb combine svm-predict

      and replace with

      install:
          mkdir ../bin/
          mv disopred2 diso_neu_net diso_neighb combine svm-predict ../bin/
      
Install and Test
  1. Compile the source code.

    cd DISOPRED/src
    make clean
    make
    make install

  2. Configure the run_disopred.pl file with the locations of BLAST and the database, and the number of cores to use.

    Make sure to change your_name to the name of your home directory, and adjust the paths if your folders are not on the Desktop. On Linux, home directories are typically under /home rather than /Users. $SEQ_DB must point to the FASTA file you formatted with formatdb.

    ## IMPORTANT: Set the paths to folder with the NCBI executables and to the
    ## sequence database
    my $NCBI_DIR = "/Users/your_name/Desktop/blast-2.2.26/bin/";
    my $SEQ_DB = "/Users/your_name/Desktop/blastdb/uniref90/uniref90.fasta";

    ## IMPORTANT: Changing these flags will alter the behaviour of blastpgp
    ## You may want to use -a n to speed-up the search using n processors, if available
    my $PSIBLAST_PAR = "-a 1 -b 0 -j 3 -h 0.001";

  3. Run DISOPRED 3 with the example data.

References
  1. Jones, D. T., & Ward, J. J. (2003). Prediction of disordered regions in proteins from position specific score matrices. Proteins: Structure, Function, and Bioinformatics, 53(S6), 573–578.
  2. Jones, D. T., & Cozzetto, D. (2015). DISOPRED3: precise disordered region predictions with annotated protein-binding activity. Bioinformatics, 31(6), 857–863.
