Metagenomics

Modules included in this section

HUMAnN2
kraken
kraken_biom
metaphlan2
centrifuge

`HUMAnN2`

Authors: Menachem Sklarz
Affiliation: Bioinformatics core facility
Organization: National Institute of Biotechnology in the Negev, Ben Gurion University.

Note

This module was developed as part of a study led by Dr. Jacob Moran Gilad

A module for running HUMAnN2:

Requires

fastq files, either forward or single:
- sample_data[<sample>]["fastq.F"]
- sample_data[<sample>]["fastq.S"]

Output

Puts the HUMAnN2 output files in:
- self.sample_data[sample]["HUMAnN2.genefamilies"] (Also in HUMAnN2.genefamilies.RPK)
- self.sample_data[sample]["HUMAnN2.pathabundance"] (Also in HUMAnN2.pathabundance.RPK)
- self.sample_data[sample]["HUMAnN2.pathcoverage"]
If humann2_renorm_table block is set in params, puts the normalized tables in:
- self.sample_data[sample]["HUMAnN2.genefamilies"] (Also in HUMAnN2.genefamilies.<units>, where <units> is the value passed to --units)
- self.sample_data[sample]["HUMAnN2.pathabundance"] (Also in HUMAnN2.pathabundance.<units>, where <units> is the value passed to --units)
If humann2_join_tables block is set in params, puts the joined tables in:
- self.sample_data["project_data"]["HUMAnN2.genefamilies"]
- self.sample_data["project_data"]["HUMAnN2.pathabundance"]
- self.sample_data["project_data"]["HUMAnN2.pathcoverage"]

Note

If both humann2_renorm_table and humann2_join_tables blocks exist in params, humann2_join_tables works on the normalized tables produced by humann2_renorm_table. To join the non-normalized tables, omit the humann2_renorm_table block.
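
The rule in this note can be summarised in a few lines of Python. This is a minimal sketch of the documented behaviour, not the module's own code: the function name joined_table_units is hypothetical, and falling back to cpm when --units is absent is an assumption about the humann2_renorm_table default.

```python
def joined_table_units(params):
    """Return the units of the tables that humann2_join_tables would receive.

    Returns None when params has no humann2_join_tables block.
    """
    if "humann2_join_tables" not in params:
        return None
    renorm = params.get("humann2_renorm_table")
    if renorm is None:
        # No normalization step: the raw RPK tables are joined.
        return "RPK"
    redirects = renorm.get("redirects") or {}
    # Assumption: cpm is used when --units is not passed explicitly.
    return redirects.get("--units", "cpm")


example = {
    "humann2_join_tables": {"path": "humann2_join_tables"},
    "humann2_renorm_table": {
        "path": "humann2_renorm_table",
        "redirects": {"--units": "cpm"},
    },
}
print(joined_table_units(example))
```

The sketch covers the genefamilies and pathabundance tables, which are the two tables the Output section lists as normalized.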

Parameters that can be set

Parameter	Values	Comments
humann2_join_tables		Block containing `path` to `humann2_join_tables`, and a `redirects` block if necessary.
humann2_renorm_table		Block containing `path` to `humann2_renorm_table`, and a `redirects` block if necessary.
protein-database	uniref50|uniref90	Protein database used for analysis.

Warning

The protein-database parameter records the protein database being used: uniref50 or uniref90. It is not used by this module but is required by the downstream module, HUMAnN2_further_processing. If you do not include it, you will not be able to add a HUMAnN2_further_processing instance for downstream analysis.
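
A parameter file can be checked for this requirement before the workflow is built. The following is an illustrative sketch only: check_protein_database is a hypothetical helper, not part of the module, and params stands for the step's parameter block loaded as a plain dictionary.

```python
ALLOWED_PROTEIN_DATABASES = ("uniref50", "uniref90")


def check_protein_database(params):
    """Return the declared protein database, or raise if it is missing or invalid."""
    value = params.get("protein-database")
    if value not in ALLOWED_PROTEIN_DATABASES:
        raise ValueError(
            "protein-database must be one of %s, got %r"
            % (", ".join(ALLOWED_PROTEIN_DATABASES), value)
        )
    return value


print(check_protein_database({"module": "HUMAnN2", "protein-database": "uniref50"}))
```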

Lines for parameter file

HUMAnN2_uniref50_hardtrimmed_reads:
    module: HUMAnN2
    base: Trim_Galore
    script_path: '{Vars.Programs_path.humann2}'
    setenv: PERL5LIB="" mpa_dir=$CONDA_PREFIX/bin
    qsub_params:
        -pe: shared 30
    protein-database:   uniref50
    redirects:
        --gap-fill: 'on'
        --input-format: fastq
        --minpath: 'on'
        --nucleotide-database: '{Vars.databases.humann2.chocophlan}'
        --protein-database: '{Vars.databases.humann2.uniref50}'
        --threads: '30'
    humann2_join_tables:
        path: humann2_join_tables
    humann2_renorm_table:
        path: humann2_renorm_table
        redirects:
            --units: cpm

References

HUMAnN2 home page

`kraken`

Authors: Menachem Sklarz
Affiliation: Bioinformatics core facility
Organization: National Institute of Biotechnology in the Negev, Ben Gurion University.

Note

This module was developed as part of a study led by Dr. Jacob Moran Gilad

A module for running kraken:

The kraken executable must be in the same folder as kraken-translate and kraken-report. This is the default layout of a kraken installation.

Pass the full path to the kraken executable in script_path.

Merging of sample kraken reports is done with krona. See the section on Parameters that can be set.

You can follow this module with the kraken_biom module to create a biom table from the reports.

Requires

fastq files, either paired end or single:
- sample_data[<sample>]["fastq.F"]
- sample_data[<sample>]["fastq.R"]
- sample_data[<sample>]["fastq.S"]
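
The choice between paired-end and single-end input can be pictured as a lookup on the sample's slots. This is a hedged sketch: select_reads is a hypothetical helper, and preferring the pair when both kinds of reads exist is an assumption of the sketch, not documented module behaviour.

```python
def select_reads(sample_slots):
    """Return ("paired", [forward, reverse]) or ("single", [reads]) for one sample."""
    if "fastq.F" in sample_slots and "fastq.R" in sample_slots:
        return "paired", [sample_slots["fastq.F"], sample_slots["fastq.R"]]
    if "fastq.S" in sample_slots:
        return "single", [sample_slots["fastq.S"]]
    raise KeyError("sample has neither a fastq.F/fastq.R pair nor fastq.S")


# Hypothetical sample entry, shaped like sample_data[<sample>].
print(select_reads({"fastq.F": "s1_R1.fq", "fastq.R": "s1_R2.fq"}))
```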

Output

Puts the kraken output files in:
- self.sample_data[<sample>]["raw_classification"]
- self.sample_data[<sample>]["classification"]
- self.sample_data[<sample>]["kraken.report"]
If the ktImportTaxonomy_path parameter was passed, puts the krona reports in:
- self.sample_data["project_data"]["krona"]
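
The three per-sample slots plausibly correspond to the three kraken programs mentioned above. The sketch below builds the command lines under that assumption: the slot-to-program mapping is inferred from the slot names, the output file names are hypothetical, and kraken_commands is not a function of the module.

```python
import shlex


def kraken_commands(sample, reads, db, out_dir, threads=20):
    """Build one shell command per output slot for a single sample."""
    q = shlex.quote
    raw = f"{out_dir}/{sample}.kraken.out"        # hypothetical file name
    labels = f"{out_dir}/{sample}.kraken.labels"  # hypothetical file name
    report = f"{out_dir}/{sample}.kraken.report"  # hypothetical file name
    # Two read files are treated as a pair.
    paired = ["--paired"] if len(reads) == 2 else []
    classify = ["kraken", "--db", db, "--threads", str(threads),
                "--output", raw, *paired, *reads]
    return {
        "raw_classification": " ".join(q(arg) for arg in classify),
        "classification": f"kraken-translate --db {q(db)} {q(raw)} > {q(labels)}",
        "kraken.report": f"kraken-report --db {q(db)} {q(raw)} > {q(report)}",
    }


cmds = kraken_commands("s1", ["s1_R1.fq", "s1_R2.fq"], "/path/to/kraken_std_db", "out")
for slot, command in cmds.items():
    print(slot, "->", command)
```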

Parameters that can be set

Parameter	Values	Comments
ktImportTaxonomy_path		Path to ktImportTaxonomy. You can add additional `ktImportTaxonomy` parameters at the end of the path. If not passed, the `krona` report will not be built.
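
A value that carries extra parameters after the path reads as an executable followed by its arguments. The sketch below shows that reading with Python's shlex; split_tool_path is a hypothetical helper, and the module itself may simply paste the whole string into the generated script.

```python
import shlex


def split_tool_path(value):
    """Split a 'path plus extra parameters' string into (executable, extra_args)."""
    parts = shlex.split(value)
    if not parts:
        raise ValueError("empty tool path")
    return parts[0], parts[1:]


exe, extra = split_tool_path(
    "/path/to/ktImportTaxonomy  -u  http://krona.sourceforge.net"
)
print(exe)
print(extra)
```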

Lines for parameter file

kraken1:
    module: kraken
    base: trim1
    script_path: '{Vars.paths.kraken}'
    qsub_params:
        -pe: shared 20
    ktImportTaxonomy_path: /path/to/ktImportTaxonomy  -u  http://krona.sourceforge.net
    redirects:
        --db: /path/to/kraken_std_db
        --preload: 
        --quick: 
        --threads: 20

References

Wood, D.E. and Salzberg, S.L., 2014. Kraken: ultrafast metagenomic sequence classification using exact alignments. Genome biology, 15(3), p.R46.

`kraken_biom`

Authors: Menachem Sklarz
Affiliation: Bioinformatics core facility
Organization: National Institute of Biotechnology in the Negev, Ben Gurion University.

Note

This module was developed as part of a study led by Dr. Jacob Moran Gilad

A module for running kraken-biom (https://github.com/smdabdoub/kraken-biom)

Requires

Kraken reports:
- sample_data[<sample>]["kraken.report"]

Output

Puts the resulting biom output files in:
- self.sample_data["project_data"]["kraken.biom"]
- self.sample_data["project_data"]["biom_table"]
- self.sample_data["project_data"]["biom_table_tsv"] (if skip_tsv is not set)

Parameters that can be set

Parameter	Values	Comments
skip_tsv		Set if you do not want to convert the report into tsv format.
skip_summary		Set if you do not want to create a summary of the report.
biom_path	/path/to/biom	The path to biom. This is required for conversion to tsv and for producing the summary.
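
How the three parameters interact can be sketched as follows. This is an illustration under stated assumptions, not the module's code: kraken_biom_steps is a hypothetical helper, a skip flag counts as set whenever its key is present (even with an empty value), and the step labels name the kraken-biom and biom subcommands one would expect to run.

```python
def kraken_biom_steps(params):
    """List the processing steps implied by a kraken_biom parameter block."""
    steps = ["kraken-biom"]
    wants_tsv = "skip_tsv" not in params
    wants_summary = "skip_summary" not in params
    # biom is only needed when at least one of the two optional steps runs.
    if (wants_tsv or wants_summary) and not params.get("biom_path"):
        raise ValueError("biom_path is required for tsv conversion and for the summary")
    if wants_tsv:
        steps.append("biom convert --to-tsv")
    if wants_summary:
        steps.append("biom summarize-table")
    return steps


print(kraken_biom_steps({"biom_path": "/path/to/biom"}))
print(kraken_biom_steps({"skip_tsv": None, "skip_summary": None}))
```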

Lines for parameter file

kraken_biom1:
    module:             kraken_biom
    base:               kraken1
    script_path:        '{Vars.paths.kraken_biom}'
    # skip_tsv:
    biom_path:          '{Vars.paths.biom}'
    redirects:
        --max:          D 
        --min:          S 
        --gzip:

References

https://github.com/smdabdoub/kraken-biom

`metaphlan2`

Authors: Menachem Sklarz
Affiliation: Bioinformatics core facility
Organization: National Institute of Biotechnology in the Negev, Ben Gurion University.

Note

This module was developed as part of a study led by Dr. Jacob Moran Gilad

A module for running metaphlan2:

Requires

fastq files, either paired end or single:
- sample_data[<sample>]["fastq.F"]
- sample_data[<sample>]["fastq.R"]
- sample_data[<sample>]["fastq.S"]

Output

Puts the metaphlan2 output files in:
- self.sample_data[<sample>]["raw_classification"]
If the ktImportText_path parameter was passed, puts the krona reports in:
- self.sample_data["project_data"]["krona"]
If merge_metaphlan_tables was passed, puts the merged reports in:
- self.sample_data["project_data"]["merged_metaphlan2"]
If --biom is set in redirects, the biom table is put in:
- self.sample_data[<sample>]["biom_table"]
If --bowtie2out is set in redirects, the SAM file is put in:
- self.sample_data[<sample>]["sam"]
If metaphlan2krona_path is set, the metaphlan2krona.py output is put in:
- self.sample_data[<sample>]["classification"]
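
The conditions above can be collected into one lookup that predicts which slots a given parameter block will fill. This is a sketch of the documented rules only: metaphlan2_slots is a hypothetical helper, and a parameter counts as passed whenever its key is present, even with an empty value.

```python
def metaphlan2_slots(params):
    """Return (sample_slots, project_slots) expected for a metaphlan2 parameter block."""
    redirects = params.get("redirects") or {}
    sample_slots = ["raw_classification"]  # always produced
    project_slots = []
    if "ktImportText_path" in params:
        project_slots.append("krona")
    if "merge_metaphlan_tables" in params:
        project_slots.append("merged_metaphlan2")
    if "--biom" in redirects:
        sample_slots.append("biom_table")
    if "--bowtie2out" in redirects:
        sample_slots.append("sam")
    if "metaphlan2krona_path" in params:
        sample_slots.append("classification")
    return sample_slots, project_slots


example = {
    "ktImportText_path": "/path/to/ktImportText",
    "merge_metaphlan_tables": None,
    "metaphlan2krona_path": "/path/to/metaphlan2krona.py",
    "redirects": {"--biom": None, "--bowtie2out": None, "--input_type": "fastq"},
}
print(metaphlan2_slots(example))
```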

Parameters that can be set

Parameter	Values	Comments
ktImportText_path		Path to ktImportText.
merge_metaphlan_tables		Path to merge_metaphlan_tables.py. If no path is specified, it is derived from the location of `metaphlan2`.
metaphlan2krona_path		Path to metaphlan2krona.py

Lines for parameter file

metph1:
    module: metaphlan2
    base: trim1
    script_path: '{Vars.paths.metaphlan2}'
    ktImportText_path: /path/to/ktImportText
    merge_metaphlan_tables: 
    metaphlan2krona_path:   /path/to/metaphlan2krona.py
    redirects:
        --biom: 
        --bowtie2_exe: /path/to/bowtie2
        --bowtie2db: /path/to/database
        --bowtie2out:
        --input_type: fastq
        --mdelim: ';'
        --mpa_pkl: /path/to/mpa_v20_m200.pkl

References

Truong, D.T., Franzosa, E.A., Tickle, T.L., Scholz, M., Weingart, G., Pasolli, E., Tett, A., Huttenhower, C. and Segata, N., 2015. MetaPhlAn2 for enhanced metagenomic taxonomic profiling. Nature methods, 12(10), pp.902-903.

`centrifuge`

Authors: Menachem Sklarz
Affiliation: Bioinformatics core facility
Organization: National Institute of Biotechnology in the Negev, Ben Gurion University.

Note

This module was developed as part of a study led by Dr. Jacob Moran Gilad

A module for running centrifuge:

Pass the full path to the centrifuge executable in script_path.

Merging of sample centrifuge reports is done with krona. See the section on Parameters that can be set.

Requires

fastq files, either paired end or single:
- sample_data[<sample>]["fastq.F"]
- sample_data[<sample>]["fastq.R"]
- sample_data[<sample>]["fastq.S"]

Output

Puts the centrifuge output files in:
- self.sample_data[<sample>]["raw_classification"]
- self.sample_data[<sample>]["classification"]
- self.sample_data[<sample>]["classification_report"]
If the ktImportTaxonomy_path parameter was passed, puts the krona reports in:
- self.sample_data["project_data"]["krona"]

Parameters that can be set

Parameter	Values	Comments
ktImportTaxonomy_path		Path to ktImportTaxonomy. You can add additional `ktImportTaxonomy` parameters at the end of the path. If not passed, the `krona` report will not be built.

Lines for parameter file

Centrifuge:
    module:         centrifuge
    base:           trim1
    script_path:    '{Vars.paths.centrifuge}'
    qsub_params:
        -pe:        shared 20
    ktImportTaxonomy_path: /path/to/ktImportTaxonomy  -u  http://krona.sourceforge.net
    redirects:
        --db:       /path/to/centrifuge_db
        --preload: 
        --quick: 
        --threads:  20

References

Kim, D., Song, L., Breitwieser, F. P., & Salzberg, S. L. (2016). Centrifuge: rapid and sensitive classification of metagenomic sequences. Genome research, 26(12), 1721-1729.
