Use Case Examples

Before We Start

This section provides several example use cases for the Blast2GO Command Line. Please read the Setup chapter carefully and configure the GO Mapping database in your properties file.

Important things to consider:

  • If you are using MS Windows, all commands must be changed accordingly: replace
    ./blast2go_cli.run with blast2go_cli.exe
  • A properties file is always required. Create it with:
    ./blast2go_cli.run -createproperties cli.prop
  • GO Annotation, Enzyme Code Mapping, Statistics, GO Slim and various import and export functions make use of the obo file (-useobo). The CLI contains a default obo file. However, we recommend downloading the up-to-date version of the obo file.

    The obo file version should match the GO Mapping database version (see http://resources.biobam.com/b2g_res/obo_files/index.html).

    The latest version can always be found here: http://resources.biobam.com/b2g_res/obo_files/go_latest.obo.gz
    Pass this file with -useobo when executing a command:
    ./blast2go_cli.run -useobo go_latest.obo.gz -properties cli.prop -annotation ...

  • Make sure you have a working MongoDB server installed, with the GO Mapping database dump imported (see the Setup chapter).
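
The checks above can be collected in a small wrapper script. The following is a minimal sketch, not part of Blast2GO: the function names b2g_preflight and b2g_common_args are hypothetical, and only the options -createproperties, -properties and -useobo are taken from this section. It does not check the MongoDB server.

```shell
#!/bin/sh
# Sketch only: both helpers are hypothetical and not shipped with Blast2GO.

# Succeed if the properties file exists; otherwise print the command
# that creates it and return 1.
b2g_preflight() {
    if [ -f "$1" ]; then
        return 0
    fi
    echo "Missing $1; create it with: ./blast2go_cli.run -createproperties $1" >&2
    return 1
}

# Print the options shared by every call: -properties, preceded by
# -useobo when the given obo file has actually been downloaded.
b2g_common_args() {
    props=$1
    obo=$2
    if [ -n "$obo" ] && [ -f "$obo" ]; then
        echo "-useobo $obo -properties $props"
    else
        echo "-properties $props"
    fi
}
```

A call such as b2g_common_args cli.prop go_latest.obo.gz prints the option string to place in front of the remaining options. If the obo file is missing, -useobo is left out and the CLI falls back to its default obo file.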

Examples

  1. Load a DNA fasta file, add the corresponding BLAST results and perform GO Mapping
    and Annotation. Save the .b2g file and the PDF report in the
    current directory with the name "example".

    ./blast2go_cli.run -properties cli.prop -loadfasta \
    example_data/1000_plant.fasta -loadblast example_data/1000_plant_blastResult.xml \
    -mapping -annotation -saveb2g example.b2g -savereport example.pdf

  2. This example requires a local Swissprot database installation. Download and extract the file from ftp.ncbi.nlm.nih.gov/blast/db/swissprot.tar.gz. Load nucleotide sequences, run a local BLAST against the Swissprot database, then run GO Mapping and Annotation and create various statistics. Finally, the whole project is saved to the example_data folder in .b2g format with the chosen name prefix, together with the log file. The LocalBlastAlgoParameters have to be configured:

    // ** LocalBlastAlgoParameters **
    LocalBlastAlgoParameters.blastProgram=blastx-fast
    LocalBlastAlgoParameters.blastDbFile=/path/to/swissprot.pal
    LocalBlastAlgoParameters.blastXML2ResultEnable=true
    LocalBlastAlgoParameters.blastXML2Result=example_data/blast_xmls

    Command Line:
    ./blast2go_cli.run -properties cli.prop -loadfasta example_data/15_plant.fasta \
    -workspace example_data -nameprefix localblastSwissprot \
    -localblast -mapping -annotation -statistics \
    bspecdis,mdbresmap,aannotscore -saveb2g -savelog example_data/blast2go.log

  3. Load nucleotide sequences, import BLAST results (.json or .xml2) from a zip file, and run GO Mapping and Annotation. Save the whole project as example_json.b2g and its report as example_json_report.pdf.

    ./blast2go_cli.run -properties cli.prop -loadfasta example_data/15_plant.fasta \
    -loadblast31 example_data/json/02X9PD4T01R-Alignment.json.zip -mapping \
    -annotation -savereport example_data/example_json_report.pdf -saveb2g \
    example_data/example_json.b2g

  4. Load example.b2g from the second example and run InterProScan (online). We will save the project, as well as the InterProScan results. The following InterProScanAlgoParameters have to be configured:

    // ** InterProScanAlgoParameters **
    InterProScanAlgoParameters.ipsXML2Result=example_data/ips_xmls
    InterProScanAlgoParameters.ipsXML2ResultEnabled=true

    Command Line:

    ./blast2go_cli.run -properties cli.prop -loadb2g example_data/example.b2g \
    -ips <valid_email_address> -saveb2g example_data/example_withIPS.b2g

  5. Convert sequences to proteins and save them as a fasta file.

    ./blast2go_cli.run -properties cli.prop -useobo go_latest.obo -loadfasta \
    example_data/15_plant.fasta -savelorf example_data/15_plant_protein

  6. Load a .b2g file, apply the plant GO Slim and save the results as .b2g. The file will be saved with the default nameprefix "b2g_project" in the current directory.

    ./blast2go_cli.run -properties cli.prop -useobo go_latest.obo -loadb2g \
    example.b2g -goslim example_data/goslim_plant.obo -saveb2g

  7. To run this example, we need the protein fasta file produced in example 5. Run CloudBlast, GO Mapping and Annotation on the protein sequences and save the results as .b2g and in a customized annotation format. Please configure the following properties sections:

    // ** CloudBlastAlgoParameters **
    CloudBlastAlgoParameters.blastProgram=blastp-fast
    CloudBlastAlgoParameters.blastDB=nr_alias_viridiplantae
    CloudBlastAlgoParameters.blastXML2ResultEnable=true
    CloudBlastAlgoParameters.blastXML2Result=example_data/blast_xmls

    // ** ExportAnnotParameters **
    ExportAnnotParameters.format=custom
    ExportAnnotParameters.desc=true
    ExportAnnotParameters.go=category_and_id_and_term
    ExportAnnotParameters.goseparator=tabulator
    ExportAnnotParameters.column=tabulator
    ExportAnnotParameters.row=sequence

    Command Line:

    ./blast2go_cli.run -properties cli.prop -loadfasta \
    example_data/15_plant_protein.fasta -protein -cloudblast B2G-CloudBlastKey \
    -mapping -annotation -saveb2g example_data/15_plant_protein.b2g \
    -saveannot example_data/15_plant_annotation.txt

  8. Load a protein fasta file, add the corresponding BLAST results and execute GO Mapping and Annotation. All files (.b2g, .pdf, .annot and .txt) will be saved with the nameprefix "p53" in the "work_dir" folder in the current directory. The data distribution pie chart and the enzyme statistics will also be saved in the "work_dir" folder.

    Command Line:

    ./blast2go_cli.run -properties cli.prop -loadfasta \
    example_data/1000_seq_protein.fasta -protein -loadblast \
    example_data/1000_plant_protein_blastResult.xml -mapping -annotation \
    -workspace work_dir -nameprefix p53 -saveb2g -saveannot -savereport \
    -saveseqtable -statistics gdatadispie,aecdis

  9. Load a fasta file, a BLAST result file and InterProScan 5.0 files, perform GO Mapping and Annotation, then create all statistical charts. As a result we want to obtain the .b2g and the PDF report, which will be saved with the default nameprefix "b2g_project" in the current directory.

    Command Line:

    ./blast2go_cli.run -properties cli.prop -loadfasta example_data/1000_plant.fasta \
    -loadblast example_data/1000_plant_blastResult.xml -loadips50 \
    example_data/1000_seq_protein_ips50.xml -mapping -annotation \
    -statistics all -saveb2g -savereport

  10. Load an example dataset and export user-defined columns for each sequence. Please configure the following properties section:

    // ** GenericExportParameters **
    GenericExportParameters.columnSeparator=tabulator
    GenericExportParameters.itemSeparator=semicolon
    GenericExportParameters.itemsToExport=seq_name,blast_hit_count, \
    mapping_genename,mapping_xref,mapping_goid,annot_goid,enzyme_code

    Command Line:

    ./blast2go_cli.run -properties cli.prop -loadb2g example_data/example.b2g \
    -exportgeneric example_data/blast_top_hit.txt
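
When the same pipeline has to run on several input files, the options from Examples 1 and 8 can be generated by a loop. The following is a minimal sketch under stated assumptions: b2g_batch_cmd is a hypothetical helper, not part of Blast2GO. It assumes that each BLAST result file sits next to its fasta file and follows the <name>_blastResult.xml naming used in Example 1, and that paths contain no spaces. It only prints the commands, so they can be reviewed before being executed.

```shell
#!/bin/sh
# Sketch only: b2g_batch_cmd is a hypothetical helper, not part of Blast2GO.
# It prints (does not run) one command line per fasta file.

b2g_batch_cmd() {
    props=$1      # properties file, e.g. cli.prop
    workdir=$2    # value passed to -workspace
    shift 2
    for fasta in "$@"; do
        # Derive the name prefix from the file name: a/x.fasta -> x
        prefix=$(basename "$fasta" .fasta)
        # Assumed layout: a/x.fasta is paired with a/x_blastResult.xml
        blast="${fasta%.fasta}_blastResult.xml"
        printf '%s\n' "./blast2go_cli.run -properties $props -loadfasta $fasta -loadblast $blast -mapping -annotation -workspace $workdir -nameprefix $prefix -saveb2g -savereport"
    done
}

# Dry run for the data set from Example 1.
b2g_batch_cmd cli.prop work_dir example_data/1000_plant.fasta
```

Deriving the nameprefix from the input file name keeps the outputs of different runs apart inside one workspace folder. For protein input, -protein would have to be added after the fasta file, as in Example 8.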