Metagenomics

Modules included in this section

HUMAnN2

Authors

Menachem Sklarz

Affiliation

Bioinformatics core facility

Organization

National Institute of Biotechnology in the Negev, Ben Gurion University.

Note

This module was developed as part of a study led by Dr. Jacob Moran Gilad

A module for running HUMAnN2:

Requires

  • fastq files, either forward or single:

    • sample_data[<sample>]["fastq.F"]

    • sample_data[<sample>]["fastq.S"]

Output

  • Puts the HUMAnN2 output files in:

    • self.sample_data[sample]["HUMAnN2.genefamilies"] (Also in HUMAnN2.genefamilies.RPK)

    • self.sample_data[sample]["HUMAnN2.pathabundance"] (Also in HUMAnN2.pathabundance.RPK)

    • self.sample_data[sample]["HUMAnN2.pathcoverage"]

  • If humann2_renorm_table block is set in params, puts the normalized tables in:

    • self.sample_data[sample]["HUMAnN2.genefamilies"] (Also in HUMAnN2.genefamilies.<units>, where <units> is the value passed to --units)

    • self.sample_data[sample]["HUMAnN2.pathabundance"] (Also in HUMAnN2.pathabundance.<units>, where <units> is the value passed to --units)

  • If humann2_join_tables block is set in params, puts the joined tables in:

    • self.sample_data["project_data"]["HUMAnN2.genefamilies"]

    • self.sample_data["project_data"]["HUMAnN2.pathabundance"]

    • self.sample_data["project_data"]["HUMAnN2.pathcoverage"]

Note

If both humann2_renorm_table and humann2_join_tables blocks exist in params, humann2_join_tables operates on the normalized tables produced by humann2_renorm_table. To join the non-normalized tables, omit the humann2_renorm_table block.
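
The precedence described in this note can be illustrated with a minimal Python sketch. It is not the module's actual code: the function names humann2_sample_slots and join_input_units and the unit labels are hypothetical, and falling back to cpm when --units is not redirected is an assumption.

```python
TABLES_WITH_UNITS = ("genefamilies", "pathabundance")


def humann2_sample_slots(params):
    """Map each per-sample slot to the units of the table it would hold.

    Hypothetical helper for illustration only. `params` mimics one module
    instance from the parameter file, parsed into a dict.
    """
    slots = {}
    for table in TABLES_WITH_UNITS:
        # Raw HUMAnN2 output is in RPK and is stored under both names.
        slots["HUMAnN2." + table] = "RPK"
        slots["HUMAnN2." + table + ".RPK"] = "RPK"
    slots["HUMAnN2.pathcoverage"] = "coverage"

    if "humann2_renorm_table" in params:
        renorm = params["humann2_renorm_table"] or {}
        # Assumption: use cpm when --units is not given in redirects.
        units = (renorm.get("redirects") or {}).get("--units", "cpm")
        for table in TABLES_WITH_UNITS:
            # The main slot now refers to the normalized table.
            slots["HUMAnN2." + table] = units
            slots["HUMAnN2." + table + "." + units] = units
    return slots


def join_input_units(params):
    """Units of the per-sample tables that a join step would merge."""
    slots = humann2_sample_slots(params)
    tables = TABLES_WITH_UNITS + ("pathcoverage",)
    return {table: slots["HUMAnN2." + table] for table in tables}
```

With a humann2_renorm_table block that redirects --units to relab, join_input_units reports relab for genefamilies and pathabundance. Without the block, it reports RPK, which is the situation the note recommends for joining non-normalized tables.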

Parameters that can be set

Parameter               Values              Comments
humann2_join_tables                         Block containing path to humann2_join_tables, and a redirects block if necessary.
humann2_renorm_table                        Block containing path to humann2_renorm_table, and a redirects block if necessary.
protein-database        uniref50|uniref90   Protein database used for analysis.

Warning

The protein-database parameter records the protein database being used: uniref50 or uniref90. It is not used by this module but is required by the downstream module, HUMAnN2_further_processing. If you do not include it, you will not be able to add a HUMAnN2_further_processing instance for downstream analysis.
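
A parameter file can be checked for this requirement before a run. The following is a minimal sketch, assuming the module instance has been parsed into a Python dict; the function name check_protein_database is hypothetical and is not part of the module.

```python
ALLOWED_PROTEIN_DATABASES = ("uniref50", "uniref90")


def check_protein_database(params):
    """Return the recorded protein database, or raise if it is unusable.

    Hypothetical pre-flight check for illustration only.
    """
    value = params.get("protein-database")
    if value not in ALLOWED_PROTEIN_DATABASES:
        raise ValueError(
            "protein-database must be one of %s, got %r"
            % (", ".join(ALLOWED_PROTEIN_DATABASES), value)
        )
    return value
```

Calling check_protein_database({"protein-database": "uniref50"}) returns the string unchanged, while an instance that lacks the key raises ValueError.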

Lines for parameter file

HUMAnN2_uniref50_hardtrimmed_reads:
    module: HUMAnN2
    base: Trim_Galore
    script_path: '{Vars.Programs_path.humann2}'
    setenv: PERL5LIB="" mpa_dir=$CONDA_PREFIX/bin
    qsub_params:
        -pe: shared 30
    protein-database:   uniref50
    redirects:
        --gap-fill: 'on'
        --input-format: fastq
        --minpath: 'on'
        --nucleotide-database: '{Vars.databases.humann2.chocophlan}'
        --protein-database: '{Vars.databases.humann2.uniref50}'
        --threads: '30'
    humann2_join_tables:
        path: humann2_join_tables
    humann2_renorm_table:
        path: humann2_renorm_table
        redirects:
            --units: cpm

References

HUMAnN2 home page

kraken

Authors

Menachem Sklarz

Affiliation

Bioinformatics core facility

Organization

National Institute of Biotechnology in the Negev, Ben Gurion University.

Note

This module was developed as part of a study led by Dr. Jacob Moran Gilad

A module for running kraken:

Note that the kraken executable must be in the same folder as kraken-translate and kraken-report. This is the default for a kraken installation.

Pass the full path to the kraken executable in script_path.

Merging of sample kraken reports is done with krona. See the section on Parameters that can be set.

You can follow this module with the kraken_biom module to create a biom table from the reports.

Requires

  • fastq files, either paired end or single:

    • sample_data[<sample>]["fastq.F"]

    • sample_data[<sample>]["fastq.R"]

    • sample_data[<sample>]["fastq.S"]

Output

  • Puts the kraken output files in:

    • self.sample_data[<sample>]["raw_classification"]

    • self.sample_data[<sample>]["classification"]

    • self.sample_data[<sample>]["kraken.report"]

  • If ktImportTaxonomy_path parameter was passed, puts the krona reports in:

    • self.sample_data["project_data"]["krona"]

Parameters that can be set

Parameter               Values   Comments
ktImportTaxonomy_path            Path to ktImportTaxonomy. You can add additional ktImportTaxonomy parameters at the end of the path. If not passed, the krona report will not be built.
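
Because extra parameters may follow the path in the same value, the string holds both an executable and its arguments. A minimal sketch of separating the two with the standard shlex module is shown here; the function name split_tool_path is hypothetical, and the module itself may simply paste the whole string into the generated script.

```python
import shlex


def split_tool_path(value):
    """Split 'executable arg1 arg2 ...' into (executable, [args]).

    Hypothetical helper for illustration only.
    """
    tokens = shlex.split(value)
    if not tokens:
        raise ValueError("the path value is empty")
    return tokens[0], tokens[1:]
```

For the value used in the parameter file example, split_tool_path returns /path/to/ktImportTaxonomy as the executable and -u, http://krona.sourceforge.net as its arguments.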

Lines for parameter file

kraken1:
    module: kraken
    base: trim1
    script_path: '{Vars.paths.kraken}'
    qsub_params:
        -pe: shared 20
    ktImportTaxonomy_path: /path/to/ktImportTaxonomy  -u  http://krona.sourceforge.net
    redirects:
        --db: /path/to/kraken_std_db
        --preload: 
        --quick: 
        --threads: 20
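
In the example above, --preload and --quick have empty values while --db and --threads carry one. A minimal sketch of how such a redirects block could be turned into a command line follows. It assumes that empty values denote bare flags and that key order is preserved; the function name render_command is hypothetical, and the way the module actually assembles its scripts may differ.

```python
import shlex


def render_command(script_path, redirects):
    """Render a redirects mapping as a single command-line string.

    Hypothetical helper for illustration only. Keys with an empty value
    (None or '') are emitted as bare flags.
    """
    parts = [script_path]
    for flag, value in redirects.items():
        parts.append(flag)
        if value is not None and str(value) != "":
            # Quote the value so paths with spaces stay one argument.
            parts.append(shlex.quote(str(value)))
    return " ".join(parts)


example = render_command(
    "/path/to/kraken",
    {
        "--db": "/path/to/kraken_std_db",
        "--preload": None,
        "--quick": None,
        "--threads": 20,
    },
)
print(example)
```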

References

Wood, D.E. and Salzberg, S.L., 2014. Kraken: ultrafast metagenomic sequence classification using exact alignments. Genome biology, 15(3), p.R46.

kraken_biom

Authors

Menachem Sklarz

Affiliation

Bioinformatics core facility

Organization

National Institute of Biotechnology in the Negev, Ben Gurion University.

Note

This module was developed as part of a study led by Dr. Jacob Moran Gilad

A module for running kraken-biom (https://github.com/smdabdoub/kraken-biom)

Requires

  • Kraken reports:

    • sample_data[<sample>]["kraken.report"]

Output

  • Puts the resulting biom output files in:

    • self.sample_data["project_data"]["kraken.biom"]

    • self.sample_data["project_data"]["biom_table"]

    • self.sample_data["project_data"]["biom_table_tsv"] (if skip_tsv is not set)

Parameters that can be set

Parameter       Values          Comments
skip_tsv                        Set if you do not want to convert the report into tsv format.
skip_summary                    Set if you do not want to create a summary of the report.
biom_path       /path/to/biom   The path to biom. This is required for conversion to tsv and for producing the summary.
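
The interaction between these three parameters can be sketched in Python. This is an illustration under stated assumptions, not the module's code: the function name kraken_biom_project_slots is hypothetical, a parameter counts as set when its key is present even with an empty value, and no slot is listed for the summary because none is documented above.

```python
def kraken_biom_project_slots(params):
    """List the project-level slots expected for one kraken_biom instance.

    Hypothetical helper for illustration only.
    """
    wants_tsv = "skip_tsv" not in params
    wants_summary = "skip_summary" not in params
    # biom is needed for the tsv conversion and for the summary.
    if (wants_tsv or wants_summary) and not params.get("biom_path"):
        raise ValueError(
            "biom_path is required unless both skip_tsv and skip_summary are set"
        )
    slots = ["kraken.biom", "biom_table"]
    if wants_tsv:
        slots.append("biom_table_tsv")
    return slots
```

An instance with skip_tsv set (as in the commented-out line of the example below) would still need biom_path for the summary, unless skip_summary is set as well.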

Lines for parameter file

kraken_biom1:
    module:             kraken_biom
    base:               kraken1
    script_path:        '{Vars.paths.kraken_biom}'
    # skip_tsv:
    biom_path:          '{Vars.paths.biom}'
    redirects:
        --max:          D 
        --min:          S 
        --gzip:

References

https://github.com/smdabdoub/kraken-biom

metaphlan2

Authors

Menachem Sklarz

Affiliation

Bioinformatics core facility

Organization

National Institute of Biotechnology in the Negev, Ben Gurion University.

Note

This module was developed as part of a study led by Dr. Jacob Moran Gilad

A module for running metaphlan2:

Requires

  • fastq files, either paired end or single:

    • sample_data[<sample>]["fastq.F"]

    • sample_data[<sample>]["fastq.R"]

    • sample_data[<sample>]["fastq.S"]

Output

  • Puts the metaphlan2 output files in:

    • self.sample_data[<sample>]["raw_classification"]

  • If ktImportText_path parameter was passed, puts the krona reports in

    • self.sample_data["project_data"]["krona"]

  • If merge_metaphlan_tables was passed, puts the merged reports in

    • self.sample_data["project_data"]["merged_metaphlan2"]

  • If '--biom' is set in redirects, the biom table is put in:

    • self.sample_data[<sample>]["biom_table"]

  • If '--bowtie2out' is set in redirects, the SAM file is put in:

    • self.sample_data[<sample>]["sam"]

  • If metaphlan2krona_path is set, the output of metaphlan2krona.py is put in:

    • self.sample_data[<sample>]["classification"]

Parameters that can be set

Parameter                Values   Comments
ktImportText_path                 Path to ktImportText.
merge_metaphlan_tables            Path to merge_metaphlan_tables.py. If not specified, it is derived from the location of metaphlan2.
metaphlan2krona_path              Path to metaphlan2krona.py.
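
Deriving the merge_metaphlan_tables.py path from the metaphlan2 location could look like the sketch below. The exact rule used by the module is not stated here: the sketch assumes the script sits in a utils folder next to the metaphlan2 executable, and the function name resolve_merge_script is hypothetical. posixpath is used so the result does not depend on the operating system.

```python
import posixpath


def resolve_merge_script(params):
    """Return the merge_metaphlan_tables.py path for a module instance.

    Hypothetical helper for illustration only. An explicit, non-empty
    merge_metaphlan_tables value wins; otherwise the path is derived
    from script_path under the assumed '<install dir>/utils' layout.
    """
    explicit = params.get("merge_metaphlan_tables")
    if explicit:
        return explicit
    install_dir = posixpath.dirname(posixpath.normpath(params["script_path"]))
    return posixpath.join(install_dir, "utils", "merge_metaphlan_tables.py")
```

With script_path set to /opt/metaphlan2/metaphlan2.py and an empty merge_metaphlan_tables value, the sketch yields /opt/metaphlan2/utils/merge_metaphlan_tables.py.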

Lines for parameter file

metph1:
    module: metaphlan2
    base: trim1
    script_path: '{Vars.paths.metaphlan2}'
    ktImportText_path: /path/to/ktImportText
    merge_metaphlan_tables: 
    metaphlan2krona_path:   /path/to/metaphlan2krona.py
    redirects:
        --biom: 
        --bowtie2_exe: /path/to/bowtie2
        --bowtie2db: /path/to/database
        --bowtie2out:
        --input_type: fastq
        --mdelim: ';'
        --mpa_pkl: /path/to/mpa_v20_m200.pkl

References

Truong, D.T., Franzosa, E.A., Tickle, T.L., Scholz, M., Weingart, G., Pasolli, E., Tett, A., Huttenhower, C. and Segata, N., 2015. MetaPhlAn2 for enhanced metagenomic taxonomic profiling. Nature methods, 12(10), pp.902-903.

centrifuge

Authors

Menachem Sklarz

Affiliation

Bioinformatics core facility

Organization

National Institute of Biotechnology in the Negev, Ben Gurion University.

Note

This module was developed as part of a study led by Dr. Jacob Moran Gilad

A module for running centrifuge:

Pass the full path to the centrifuge executable in script_path.

Merging of sample centrifuge reports is done with krona. See the section on Parameters that can be set.

Requires

  • fastq files, either paired end or single:

    • sample_data[<sample>]["fastq.F"]

    • sample_data[<sample>]["fastq.R"]

    • sample_data[<sample>]["fastq.S"]
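
The choice between paired-end and single-end input can be sketched as follows. Centrifuge itself takes paired reads through -1 and -2 and unpaired reads through -U. How the module chooses between the slots is not described here, so the precedence below (paired first, then single) is an assumption, and the function name centrifuge_read_args is hypothetical.

```python
def centrifuge_read_args(sample_slots):
    """Build the read-input arguments for one sample.

    Hypothetical helper for illustration only. `sample_slots` mimics
    sample_data[<sample>] as a plain dict of slot name to file path.
    """
    forward = sample_slots.get("fastq.F")
    reverse = sample_slots.get("fastq.R")
    single = sample_slots.get("fastq.S")
    if forward and reverse:
        # Paired-end reads: both mates are required.
        return ["-1", forward, "-2", reverse]
    if single:
        return ["-U", single]
    raise ValueError("sample has neither paired-end nor single-end fastq files")
```

A sample that holds only fastq.F, without its mate or a single-end file, is rejected by this sketch rather than silently treated as single-end.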

Output

  • Puts the centrifuge output files in:

    • self.sample_data[<sample>]["raw_classification"]

    • self.sample_data[<sample>]["classification"]

    • self.sample_data[<sample>]["classification_report"]

  • If ktImportTaxonomy_path parameter was passed, puts the krona reports in

    • self.sample_data["project_data"]["krona"]

Parameters that can be set

Parameter               Values   Comments
ktImportTaxonomy_path            Path to ktImportTaxonomy. You can add additional ktImportTaxonomy parameters at the end of the path. If not passed, the krona report will not be built.

Lines for parameter file

Centrifuge:
    module:         centrifuge
    base:           trim1
    script_path:    '{Vars.paths.centrifuge}'
    qsub_params:
        -pe:        shared 20
    ktImportTaxonomy_path: /path/to/ktImportTaxonomy  -u  http://krona.sourceforge.net
    redirects:
        --db:       /path/to/centrifuge_db
        --preload: 
        --quick: 
        --threads:  20

References

Kim, D., Song, L., Breitwieser, F. P., & Salzberg, S. L. (2016). Centrifuge: rapid and sensitive classification of metagenomic sequences. Genome research, 26(12), 1721-1729.