Metagenomics¶

Modules included in this section

HUMAnN2
kraken
kraken_biom
metaphlan2
centrifuge

`HUMAnN2`¶

Authors:	Menachem Sklarz
Affiliation:	Bioinformatics core facility
Organization:	National Institute of Biotechnology in the Negev, Ben Gurion University.

Note

This module was developed as part of a study led by Dr. Jacob Moran Gilad

A module for running HUMAnN2.

Requires¶

fastq files, either forward or single:
- sample_data[<sample>]["fastq.F"]
- sample_data[<sample>]["fastq.S"]

Output¶

Puts the HUMAnN2 output files in:
- self.sample_data[sample]["HUMAnN2.genefamilies"] (Also in HUMAnN2.genefamilies.RPK)
- self.sample_data[sample]["HUMAnN2.pathabundance"] (Also in HUMAnN2.pathabundance.RPK)
- self.sample_data[sample]["HUMAnN2.pathcoverage"]
If humann2_renorm_table block is set in params, puts the normalized tables in:
- self.sample_data[sample]["HUMAnN2.genefamilies"] (Also in HUMAnN2.genefamilies.<units>, where <units> is the value passed to --units)
- self.sample_data[sample]["HUMAnN2.pathabundance"] (Also in HUMAnN2.pathabundance.<units>, where <units> is the value passed to --units)
If humann2_join_tables block is set in params, puts the joined tables in:
- self.sample_data["project_data"]["HUMAnN2.genefamilies"]
- self.sample_data["project_data"]["HUMAnN2.pathabundance"]
- self.sample_data["project_data"]["HUMAnN2.pathcoverage"]

Note

If both humann2_renorm_table and humann2_join_tables blocks exist in params, humann2_join_tables will work on the normalized tables produced by humann2_renorm_table. To join the non-normalized tables, omit the humann2_renorm_table block.
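
The interaction between the two optional blocks can be restated in plain Python. This is only a sketch of the rules documented above, not the module's own code; the function name `humann2_outputs` and the layout of the returned dictionary are made up for illustration.

```python
def humann2_outputs(params):
    """Restate the documented HUMAnN2 output rules for one step's params.

    Illustrative only: this function is not part of NeatSeq-Flow.
    """
    # Per-sample slots listed in the Output section.
    sample = [
        "HUMAnN2.genefamilies",
        "HUMAnN2.pathabundance",
        "HUMAnN2.pathcoverage",
        "HUMAnN2.genefamilies.RPK",
        "HUMAnN2.pathabundance.RPK",
    ]
    project = []
    joined_input = None

    has_renorm = "humann2_renorm_table" in params
    if has_renorm:
        # <units> is whatever was passed to --units in the block's redirects.
        block = params["humann2_renorm_table"] or {}
        units = (block.get("redirects") or {}).get("--units", "<units>")
        sample += [
            "HUMAnN2.genefamilies." + units,
            "HUMAnN2.pathabundance." + units,
        ]

    if "humann2_join_tables" in params:
        project = [
            "HUMAnN2.genefamilies",
            "HUMAnN2.pathabundance",
            "HUMAnN2.pathcoverage",
        ]
        # Joining picks up the normalized tables whenever they were produced.
        joined_input = "normalized" if has_renorm else "non-normalized"

    return {"sample": sample, "project": project, "joined_input": joined_input}


params = {
    "humann2_join_tables": {"path": "humann2_join_tables"},
    "humann2_renorm_table": {
        "path": "humann2_renorm_table",
        "redirects": {"--units": "cpm"},
    },
}
print(humann2_outputs(params)["joined_input"])  # normalized
```

Dropping the `humann2_renorm_table` key from `params` changes `joined_input` to `non-normalized`, which mirrors the advice in the note.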

Parameters that can be set¶

Parameter	Values	Comments
humann2_join_tables		Block containing `path` to `humann2_join_tables`, and a `redirects` block if necessary.
humann2_renorm_table		Block containing `path` to `humann2_renorm_table`, and a `redirects` block if necessary.
protein-database	uniref50 | uniref90	Protein database used for analysis.

Warning

The protein-database parameter records the protein database being used: uniref50 or uniref90. It is not used by this module but is required by the downstream module, HUMAnN2_further_processing. If you do not include it, you will not be able to add a HUMAnN2_further_processing instance for downstream analysis.
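
A missing or misspelled value only causes trouble later, when a HUMAnN2_further_processing instance is added, so it can be worth checking early. The helper below is a hypothetical pre-flight check, not part of NeatSeq-Flow; it assumes the step's parameters have already been loaded into a dictionary.

```python
ALLOWED_PROTEIN_DATABASES = ("uniref50", "uniref90")


def check_protein_database(step_params):
    """Return the protein-database value, or raise if it is absent or invalid."""
    value = step_params.get("protein-database")
    if value not in ALLOWED_PROTEIN_DATABASES:
        raise ValueError(
            "protein-database must be one of %s, got %r"
            % (", ".join(ALLOWED_PROTEIN_DATABASES), value)
        )
    return value


print(check_protein_database({"protein-database": "uniref50"}))  # uniref50
```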

Lines for parameter file¶

HUMAnN2_uniref50_hardtrimmed_reads:
    module: HUMAnN2
    base: Trim_Galore
    script_path: '{Vars.Programs_path.humann2}'
    setenv: PERL5LIB="" mpa_dir=$CONDA_PREFIX/bin
    qsub_params:
        -pe: shared 30
    protein-database:   uniref50
    redirects:
        --gap-fill: 'on'
        --input-format: fastq
        --minpath: 'on'
        --nucleotide-database: '{Vars.databases.humann2.chocophlan}'
        --protein-database: '{Vars.databases.humann2.uniref50}'
        --threads: '30'
    humann2_join_tables:
        path: humann2_join_tables
    humann2_renorm_table:
        path: humann2_renorm_table
        redirects:
            --units: cpm

References¶

HUMAnN2 home page

`kraken`¶

Authors:	Menachem Sklarz
Affiliation:	Bioinformatics core facility
Organization:	National Institute of Biotechnology in the Negev, Ben Gurion University.

Note

This module was developed as part of a study led by Dr. Jacob Moran Gilad

A module for running kraken:

Note that the kraken executable must be in the same folder as kraken-translate and kraken-report. This is the default for a kraken installation.

Pass the full path to the kraken executable in script_path.

Merging of sample kraken reports is done with krona. See the section on Parameters that can be set.

You can follow this module with the kraken_biom module to create a biom table from the reports.

Requires¶

fastq files, either paired end or single:
- sample_data[<sample>]["fastq.F"]
- sample_data[<sample>]["fastq.R"]
- sample_data[<sample>]["fastq.S"]

Output¶

Puts the kraken output files in:
- self.sample_data[<sample>]["raw_classification"]
- self.sample_data[<sample>]["classification"]
- self.sample_data[<sample>]["kraken.report"]
If the ktImportTaxonomy_path parameter was passed, puts the krona reports in:
- self.sample_data["project_data"]["krona"]

Parameters that can be set¶

Parameter	Values	Comments
ktImportTaxonomy_path		Path to ktImportTaxonomy. You can add additional `ktImportTaxonomy` parameters at the end of the path. If not passed, the `krona` report will not be built.
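
The ktImportTaxonomy_path value is therefore a single string holding the executable followed by any extra options. The sketch below shows how such a string reads when split with the standard `shlex` module. It is meant only to illustrate the layout of the value; how the module itself handles the string is not documented here, and `split_tool_path` is a made-up helper name.

```python
import shlex


def split_tool_path(value):
    """Split 'executable [options...]' into (executable, list_of_options)."""
    parts = shlex.split(value)
    if not parts:
        raise ValueError("empty path value")
    return parts[0], parts[1:]


exe, extra = split_tool_path(
    "/path/to/ktImportTaxonomy  -u  http://krona.sourceforge.net"
)
print(exe)    # /path/to/ktImportTaxonomy
print(extra)  # ['-u', 'http://krona.sourceforge.net']
```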

Lines for parameter file¶

kraken1:
    module: kraken
    base: trim1
    script_path: '{Vars.paths.kraken}'
    qsub_params:
        -pe: shared 20
    ktImportTaxonomy_path: /path/to/ktImportTaxonomy  -u  http://krona.sourceforge.net
    redirects:
        --db: /path/to/kraken_std_db
        --preload: 
        --quick: 
        --threads: 20

References¶

Wood, D.E. and Salzberg, S.L., 2014. Kraken: ultrafast metagenomic sequence classification using exact alignments. Genome biology, 15(3), p.R46.

`kraken_biom`¶

Authors:	Menachem Sklarz
Affiliation:	Bioinformatics core facility
Organization:	National Institute of Biotechnology in the Negev, Ben Gurion University.

Note

This module was developed as part of a study led by Dr. Jacob Moran Gilad

A module for running kraken-biom (https://github.com/smdabdoub/kraken-biom)

Requires¶

Kraken reports:
- sample_data[<sample>]["kraken.report"]

Output¶

Puts the resulting biom output files in:
- self.sample_data["project_data"]["kraken.biom"]
- self.sample_data["project_data"]["biom_table"]
- self.sample_data["project_data"]["biom_table_tsv"] (if skip_tsv is not set)

Parameters that can be set¶

Parameter	Values	Comments
skip_tsv		Set if you do not want to convert the report into tsv format.
skip_summary		Set if you do not want to create a summary of the report.
biom_path	/path/to/biom	The path to biom. This is required for conversion to tsv and for producing the summary.
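
How these three parameters combine can be sketched in Python. The function below is hypothetical and not part of NeatSeq-Flow. It assumes the step's parameters were loaded from YAML into a dictionary, and it reads the table as meaning that biom_path can be left out only when both the tsv conversion and the summary are skipped.

```python
def check_kraken_biom_params(params):
    """Return the project-level slots expected for a kraken_biom step."""
    # A flag such as `skip_tsv:` has no value in YAML (it loads as None),
    # so test for the key itself rather than for its value.
    skip_tsv = "skip_tsv" in params
    skip_summary = "skip_summary" in params

    # biom is needed for the tsv conversion and for the summary.
    if not (skip_tsv and skip_summary) and not params.get("biom_path"):
        raise ValueError(
            "biom_path is required unless skip_tsv and skip_summary are both set"
        )

    slots = ["kraken.biom", "biom_table"]
    if not skip_tsv:
        slots.append("biom_table_tsv")
    return slots


print(check_kraken_biom_params({"biom_path": "/path/to/biom"}))
# ['kraken.biom', 'biom_table', 'biom_table_tsv']
```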

Lines for parameter file¶

kraken_biom1:
    module:             kraken_biom
    base:               kraken1
    script_path:        '{Vars.paths.kraken_biom}'
    # skip_tsv:
    biom_path:          '{Vars.paths.biom}'
    redirects:
        --max:          D 
        --min:          S 
        --gzip:

References¶

https://github.com/smdabdoub/kraken-biom

`metaphlan2`¶

Authors:	Menachem Sklarz
Affiliation:	Bioinformatics core facility
Organization:	National Institute of Biotechnology in the Negev, Ben Gurion University.

Note

This module was developed as part of a study led by Dr. Jacob Moran Gilad

A module for running metaphlan2.

Requires¶

fastq files, either paired end or single:
- sample_data[<sample>]["fastq.F"]
- sample_data[<sample>]["fastq.R"]
- sample_data[<sample>]["fastq.S"]

Output¶

Puts the metaphlan2 output files in:
- self.sample_data[<sample>]["raw_classification"]
If the ktImportText_path parameter was passed, puts the krona reports in:
- self.sample_data["project_data"]["krona"]
If merge_metaphlan_tables was passed, puts the merged reports in:
- self.sample_data["project_data"]["merged_metaphlan2"]
If `--biom` is set in redirects, the biom table is put in:
- self.sample_data[<sample>]["biom_table"]
If `--bowtie2out` is set in redirects, the SAM file is put in:
- self.sample_data[<sample>]["sam"]
If metaphlan2krona_path is set, the metaphlan2krona.py output is put in:
- self.sample_data[<sample>]["classification"]
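
Because most of these slots are conditional, it can help to see the conditions side by side. The following is a plain-Python restatement of the list above, not code taken from the module; `metaphlan2_slots` is a made-up name, and `params` stands for the step's parameters as loaded from the parameter file.

```python
def metaphlan2_slots(params):
    """Return (sample_slots, project_slots) expected for a metaphlan2 step."""
    redirects = params.get("redirects") or {}

    sample = ["raw_classification"]  # always filled
    project = []

    # Flags without a value load as None, so test for key presence.
    if "--biom" in redirects:
        sample.append("biom_table")
    if "--bowtie2out" in redirects:
        sample.append("sam")
    if "metaphlan2krona_path" in params:
        sample.append("classification")

    if "ktImportText_path" in params:
        project.append("krona")
    if "merge_metaphlan_tables" in params:
        project.append("merged_metaphlan2")

    return sample, project


params = {
    "ktImportText_path": "/path/to/ktImportText",
    "merge_metaphlan_tables": None,
    "metaphlan2krona_path": "/path/to/metaphlan2krona.py",
    "redirects": {"--biom": None, "--bowtie2out": None, "--input_type": "fastq"},
}
sample, project = metaphlan2_slots(params)
print(sample)   # ['raw_classification', 'biom_table', 'sam', 'classification']
print(project)  # ['krona', 'merged_metaphlan2']
```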

Parameters that can be set¶

Parameter	Values	Comments
ktImportText_path		Path to ktImportText.
merge_metaphlan_tables		Path to merge_metaphlan_tables.py. If not specified, the path is derived from the location of `metaphlan2`.
metaphlan2krona_path		Path to metaphlan2krona.py

Lines for parameter file¶

metph1:
    module: metaphlan2
    base: trim1
    script_path: '{Vars.paths.metaphlan2}'
    ktImportText_path: /path/to/ktImportText
    merge_metaphlan_tables: 
    metaphlan2krona_path:   /path/to/metaphlan2krona.py
    redirects:
        --biom: 
        --bowtie2_exe: /path/to/bowtie2
        --bowtie2db: /path/to/database
        --bowtie2out:
        --input_type: fastq
        --mdelim: ';'
        --mpa_pkl: /path/to/mpa_v20_m200.pkl

References¶

Truong, D.T., Franzosa, E.A., Tickle, T.L., Scholz, M., Weingart, G., Pasolli, E., Tett, A., Huttenhower, C. and Segata, N., 2015. MetaPhlAn2 for enhanced metagenomic taxonomic profiling. Nature methods, 12(10), pp.902-903.

`centrifuge`¶

Authors:	Menachem Sklarz
Affiliation:	Bioinformatics core facility
Organization:	National Institute of Biotechnology in the Negev, Ben Gurion University.

Note

This module was developed as part of a study led by Dr. Jacob Moran Gilad

A module for running centrifuge:

Pass the full path to the centrifuge executable in script_path.

Merging of sample centrifuge reports is done with krona. See the section on Parameters that can be set.

Requires¶

fastq files, either paired end or single:
- sample_data[<sample>]["fastq.F"]
- sample_data[<sample>]["fastq.R"]
- sample_data[<sample>]["fastq.S"]

Output¶

Puts the centrifuge output files in:
- self.sample_data[<sample>]["raw_classification"]
- self.sample_data[<sample>]["classification"]
- self.sample_data[<sample>]["classification_report"]
If the ktImportTaxonomy_path parameter was passed, puts the krona reports in:
- self.sample_data["project_data"]["krona"]

Parameters that can be set¶

Parameter	Values	Comments
ktImportTaxonomy_path		Path to ktImportTaxonomy. You can add additional `ktImportTaxonomy` parameters at the end of the path. If not passed, the `krona` report will not be built.

Lines for parameter file¶

Centrifuge:
    module:         centrifuge
    base:           trim1
    script_path:    '{Vars.paths.centrifuge}'
    qsub_params:
        -pe:        shared 20
    ktImportTaxonomy_path: /path/to/ktImportTaxonomy  -u  http://krona.sourceforge.net
    redirects:
        --db:       /path/to/centrifuge_db
        --preload: 
        --quick: 
        --threads:  20

References¶

Kim, D., Song, L., Breitwieser, F.P. and Salzberg, S.L., 2016. Centrifuge: rapid and sensitive classification of metagenomic sequences. Genome research, 26(12), pp.1721-1729.
