Quickstart

Before We Start

This section provides several usage examples for Blast2GO CLI. Please read the Installation chapter carefully and configure the GO Mapping database in your properties file before you continue.

Important things to consider:

  • If you are using MS-Windows, adapt all commands accordingly: replace ./blast2go_cli.run with blast2go_cli.exe.

  • A properties file is always required. Create it with:

    ./blast2go_cli.run -createproperties cli.prop

  • GO Annotation, Enzyme Code Mapping, Statistics, GO Slim and various import and export functions make use of the obo file (-useobo). Blast2GO CLI comes with a built-in obo file; however, we recommend downloading the latest version.

    The obo file version should match the GO Mapping database version; see http://resources.biobam.com/b2g_res/obo_files/index.html

    The latest version can always be found here: http://resources.biobam.com/b2g_res/obo_files/go_latest.obo.gz
    Pass this file with -useobo when executing a command:
    ./blast2go_cli.run -useobo go_latest.obo.gz -properties cli.prop -annotation ...

  • Make sure to have a working MongoDB server, with the GO Mapping database dump imported (see Installation chapter).
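
The points above can be collected into a small pre-flight check. The following is a minimal sketch for a POSIX shell and is not part of Blast2GO CLI: the helper names b2g_binary and check_file are hypothetical, and the uname patterns used to recognise Windows-like shells are an assumption.

```shell
#!/bin/sh

# Hypothetical helper: print the launcher name for the current platform.
# Assumption: Windows-like shells report MINGW*, MSYS* or CYGWIN* from uname.
b2g_binary() {
    case "$(uname -s)" in
        MINGW*|MSYS*|CYGWIN*) echo "blast2go_cli.exe" ;;
        *)                    echo "./blast2go_cli.run" ;;
    esac
}

# Hypothetical helper: report whether a required file is present.
# Returns 0 if the file exists, 1 otherwise.
check_file() {
    if [ -f "$1" ]; then
        echo "ok: $1"
    else
        echo "missing: $1"
        return 1
    fi
}
```

A possible use, creating the properties file only when it is missing and checking for the downloaded obo file:

    B2G=$(b2g_binary)
    check_file cli.prop || "$B2G" -createproperties cli.prop
    check_file go_latest.obo.gz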

Examples

  1. Load a DNA fasta file, add the corresponding BLAST results and perform GO Mapping and Annotation. Furthermore, we want to save the .box file and the PDF report into the current directory.

    ./blast2go_cli.run -properties cli.prop -loadfasta \
    example_data/plant_nucleotide.fasta -loadblast31 example_data/plant_blast_31.zip \
    -mapping -annotation -savebox example_1.box -savereport example_1.pdf

  2. This example requires a local Swissprot database installation: download and extract the file from ftp.ncbi.nlm.nih.gov/blast/db/swissprot.tar.gz. Load nucleotide sequences, run local BLAST against the Swissprot database, then perform GO Mapping and Annotation and create several statistics. Finally, the whole project is saved in .box format with the chosen name prefix, together with the log file. The LocalBlastAlgoParameters section of the properties file has to be configured:

    // ** LocalBlastAlgoParameters **
    LocalBlastAlgoParameters.blastProgram=blastx-fast
    LocalBlastAlgoParameters.blastDbFile=/path/to/swissprot.pal
    LocalBlastAlgoParameters.blastXML2ResultEnable=true
    LocalBlastAlgoParameters.blastXML2Result=blast_xmls

    Command-line:

    ./blast2go_cli.run -properties cli.prop -loadfasta example_data/plant_nucleotide.fasta \
    -workspace example_data -nameprefix localblastSwissprot \
    -localblast </path/to/blast binaries> -mapping -annotation -statistics \
    bspecdis,mdbresmap,aannotscore -savebox -savelog blast2go.log
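
    Instead of editing cli.prop by hand, the keys can be set from the shell. This is a minimal sketch, assuming the properties file holds one key=value pair per line with no spaces around "=", as in the snippet above; set_prop is a hypothetical helper and not a Blast2GO CLI feature.

    ```shell
    #!/bin/sh

    # Hypothetical helper: set KEY to VALUE in a key=value properties file.
    # An existing line for KEY is replaced; otherwise the pair is appended.
    # Usage: set_prop FILE KEY VALUE
    set_prop() (
        file=$1
        tmp=$(mktemp) || exit 1
        # Key and value are passed through the environment so that awk does
        # not interpret backslashes in them.
        if PROP_KEY=$2 PROP_VALUE=$3 awk -F= '
            $1 == ENVIRON["PROP_KEY"] {
                print $1 "=" ENVIRON["PROP_VALUE"]; found = 1; next
            }
            { print }
            END { if (!found) print ENVIRON["PROP_KEY"] "=" ENVIRON["PROP_VALUE"] }
        ' "$file" > "$tmp"; then
            # Write back in place so the file keeps its permissions.
            cat "$tmp" > "$file"
            status=$?
        else
            status=1
        fi
        rm -f "$tmp"
        exit "$status"
    )
    ```

    For this example the calls could look like:

    set_prop cli.prop LocalBlastAlgoParameters.blastProgram blastx-fast
    set_prop cli.prop LocalBlastAlgoParameters.blastDbFile /path/to/swissprot.pal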

  3. Load example_1.box from the first example and run Cloud-InterProScan (online). We will save the project, as well as the InterProScan results (default configuration):

    ./blast2go_cli.run -properties cli.prop -loadbox example_1.box \
    -cloudips <cloud-key> -savebox example_data/example_ips.box

  4. Convert sequences to proteins and save them into a fasta file.

    ./blast2go_cli.run -properties cli.prop -useobo go_latest.obo.gz -loadfasta \
    example_data/plant_nucleotide.fasta -savelorf plant_protein

  5. Load a Blast2GO project file, apply the plant GO Slim, and save the results as plant_annotated_goslim.box in the current directory.

    ./blast2go_cli.run -properties cli.prop -useobo go_latest.obo.gz -loadbox \
    example_data/plant_annotated.box -goslim example_data/goslim_plant.obo -savebox plant_annotated_goslim.box

  6. This example needs the results from example 4. Run CloudBlast restricted to the Viridiplantae taxonomy (33090), then GO Mapping and Annotation on the protein sequences, and save the results as .box and in a customized annotation format. Please configure the following properties sections:

    // ** CloudBlastAlgoParameters **
    ServiceCloudBlastAlgoParameters.blastProgram=blastp-fast
    ServiceCloudBlastAlgoParameters.blastDB=nr
    ServiceCloudBlastAlgoParameters.species=33090
    ServiceCloudBlastAlgoParameters.blastXML2ResultEnable=true
    ServiceCloudBlastAlgoParameters.blastXML2Result=blast_xmls
    
    // ** ExportAnnotParameters **
    ExportAnnotParameters.format=custom
    ExportAnnotParameters.desc=true
    ExportAnnotParameters.go=category_and_id_and_term
    ExportAnnotParameters.goseparator=tabulator
    ExportAnnotParameters.column=tabulator
    ExportAnnotParameters.row=sequence

    Command-line:

    ./blast2go_cli.run -properties cli.prop -loadfasta \
    example_data/plant_protein.fasta -cloudblast <cloud-key> \
    -mapping -annotation -savebox plant_protein.box \
    -saveannot plant_annotation.annot

  7. Load a protein fasta file, add the corresponding BLAST results and execute GO Mapping and Annotation. All files (.box, .pdf, .annot and .txt) will be saved with the name prefix "example_7" in the "work_dir" folder in the current directory. Additionally, the data distribution pie chart and enzyme statistics will also be saved in the "work_dir" folder.

    Make sure that the export annotation format is annot.

    // ** ExportAnnotParameters **
    ExportAnnotParameters.format=annot

    Command-line:

    ./blast2go_cli.run -properties cli.prop \
    -loadfasta example_data/plant_protein.fasta \
    -loadblast31 example_data/plant_blast_31.zip \
    -mapping -annotation -workspace work_dir \
    -nameprefix example_7 -savebox -saveannot \
    -savereport -saveseqtable -statistics gdatadispie,aecdis

  8. Load a fasta file, a BLAST result file, and InterProScan 5.0 files, perform GO Mapping and Annotation, then create all the statistical charts. As a result, we want to obtain the .box and the PDF report, which will be saved with the default name prefix “b2g_project” into the current directory.

    Command-line:

    ./blast2go_cli.run -properties cli.prop -loadfasta example_data/plant_nucleotide.fasta \
    -loadblast31 example_data/plant_blast_31.zip -loadips50 \
    example_data/plant_ips_50.xml -mapping -annotation \
    -statistics all -savebox -savereport

  9. Load an example dataset and export user-defined columns for each sequence. Please configure the following properties section:

    // ** GenericExportParameters **
    GenericExportParameters.columnSeparator=tabulator
    GenericExportParameters.itemSeparator=semicolon
    GenericExportParameters.itemsToExport=seq_name,blast_hit_count,mapping_genename,mapping_xref,mapping_goid,annot_goid,enzyme_code

    Command-line:

    ./blast2go_cli.run -properties cli.prop -loadbox example_data/plant_annotated.box \
    -exportgeneric blast_top_hit.txt
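
The exported table from example 9 can be summarised with standard tools. The sketch below rests on assumptions drawn only from the properties shown there: columns are separated by tabs, items within a column by semicolons, and columns appear in the order given in itemsToExport. Whether the file starts with a header row is not stated here; if it does, that row is counted like any other. count_items is a hypothetical helper, not a Blast2GO CLI command.

```shell
#!/bin/sh

# Hypothetical helper: for each row of a tab-separated file, print the first
# column and the number of ";"-separated items found in column COL.
# Usage: count_items FILE COL
count_items() {
    awk -F '\t' -v col="$2" '
        {
            # An empty column counts as zero items.
            n = ($col == "") ? 0 : split($col, items, ";")
            print $1 "\t" n
        }
    ' "$1"
}
```

With the itemsToExport list from example 9, annot_goid is the sixth column, so the number of annotated GO IDs per sequence could be listed with:

    count_items blast_top_hit.txt 6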