Metagenomics¶
Modules included in this section
HUMAnN2¶
Authors: | Menachem Sklarz |
---|---|
Affiliation: | Bioinformatics core facility |
Organization: | National Institute of Biotechnology in the Negev, Ben Gurion University. |
Note
This module was developed as part of a study led by Dr. Jacob Moran Gilad
A module for running HUMAnN2.
Requires¶
fastq files, either forward or single:
sample_data[<sample>]["fastq.F"]
sample_data[<sample>]["fastq.S"]
Output¶
Puts the HUMAnN2 output files in:
self.sample_data[sample]["HUMAnN2.genefamilies"] (also in HUMAnN2.genefamilies.RPK)
self.sample_data[sample]["HUMAnN2.pathabundance"] (also in HUMAnN2.pathabundance.RPK)
self.sample_data[sample]["HUMAnN2.pathcoverage"]
If a humann2_renorm_table block is set in params, puts the normalized tables in:
self.sample_data[sample]["HUMAnN2.genefamilies"] (also in HUMAnN2.genefamilies.<units>, where <units> is the value passed to --units)
self.sample_data[sample]["HUMAnN2.pathabundance"] (also in HUMAnN2.pathabundance.<units>, where <units> is the value passed to --units)
If a humann2_join_tables block is set in params, puts the joined tables in:
self.sample_data["project_data"]["HUMAnN2.genefamilies"]
self.sample_data["project_data"]["HUMAnN2.pathabundance"]
self.sample_data["project_data"]["HUMAnN2.pathcoverage"]
Note
If both humann2_renorm_table and humann2_join_tables blocks exist in params, humann2_join_tables will work on the normalized tables produced by humann2_renorm_table. To join the non-normalized tables, omit the humann2_renorm_table block.
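The rule in the note above can be sketched in Python as follows. This is only an illustration of the documented behaviour, not the module's code: the function name tables_joined and the plain dict standing in for the step parameters are hypothetical.

```python
def tables_joined(step_params):
    """Return which HUMAnN2 tables humann2_join_tables would operate on.

    step_params is a plain dict standing in for one step of the parameter
    file (a hypothetical stand-in, not the module's internal object).
    """
    if "humann2_join_tables" not in step_params:
        return None  # no joining requested
    if "humann2_renorm_table" in step_params:
        return "normalized"  # join works on the renormalized tables
    return "non-normalized"  # join works on the tables as produced by HUMAnN2


both = {
    "humann2_join_tables": {"path": "humann2_join_tables"},
    "humann2_renorm_table": {
        "path": "humann2_renorm_table",
        "redirects": {"--units": "cpm"},
    },
}
join_only = {"humann2_join_tables": {"path": "humann2_join_tables"}}

print(tables_joined(both))       # normalized
print(tables_joined(join_only))  # non-normalized
```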
Parameters that can be set¶
Parameter | Values | Comments |
---|---|---|
humann2_join_tables | Block containing path to humann2_join_tables, and a redirects block if necessary. | |
humann2_renorm_table | Block containing path to humann2_renorm_table, and a redirects block if necessary. | |
protein-database | uniref50 or uniref90 | Protein database used for analysis. |
Warning
The protein-database parameter records the protein database being used: uniref50 or uniref90. It is not used by this module but is required by the downstream module, HUMAnN2_further_processing. If you do not include it, you will not be able to add a HUMAnN2_further_processing instance for downstream analysis.
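A quick pre-flight check for this requirement might look like the sketch below. It assumes the step parameters have already been loaded into a plain dict; the function name check_protein_database is hypothetical and is not part of the module.

```python
VALID_DATABASES = {"uniref50", "uniref90"}


def check_protein_database(step_params):
    """Return the protein-database value, or raise if it is missing or invalid.

    step_params is a plain dict standing in for one HUMAnN2 step of the
    parameter file (an assumption made for this illustration).
    """
    value = step_params.get("protein-database")
    if value not in VALID_DATABASES:
        raise ValueError(
            "protein-database must be one of %s, got %r"
            % (sorted(VALID_DATABASES), value)
        )
    return value


step = {"module": "HUMAnN2", "protein-database": "uniref50"}
print(check_protein_database(step))  # uniref50
```

Running such a check before adding a HUMAnN2_further_processing step catches a missing parameter early.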
Lines for parameter file¶
HUMAnN2_uniref50_hardtrimmed_reads:
    module: HUMAnN2
    base: Trim_Galore
    script_path: '{Vars.Programs_path.humann2}'
    setenv: PERL5LIB="" mpa_dir=$CONDA_PREFIX/bin
    qsub_params:
        -pe: shared 30
    protein-database: uniref50
    redirects:
        --gap-fill: 'on'
        --input-format: fastq
        --minpath: 'on'
        --nucleotide-database: '{Vars.databases.humann2.chocophlan}'
        --protein-database: '{Vars.databases.humann2.uniref50}'
        --threads: '30'
    humann2_join_tables:
        path: humann2_join_tables
    humann2_renorm_table:
        path: humann2_renorm_table
        redirects:
            --units: cpm
References¶
kraken¶
Authors: | Menachem Sklarz |
---|---|
Affiliation: | Bioinformatics core facility |
Organization: | National Institute of Biotechnology in the Negev, Ben Gurion University. |
Note
This module was developed as part of a study led by Dr. Jacob Moran Gilad
A module for running kraken.
Note that the kraken executable must be in the same folder as kraken-translate and kraken-report. This is the default for a kraken installation.
Pass the full path to the kraken executable in script_path.
Merging of sample kraken reports is done with krona. See the section on Parameters that can be set.
You can follow this module with the kraken_biom module to create a biom table from the reports.
Requires¶
fastq files, either paired end or single:
sample_data[<sample>]["fastq.F"]
sample_data[<sample>]["fastq.R"]
sample_data[<sample>]["fastq.S"]
Output¶
Puts the kraken output files in:
self.sample_data[<sample>]["raw_classification"]
self.sample_data[<sample>]["classification"]
self.sample_data[<sample>]["kraken.report"]
If the ktImportTaxonomy_path parameter was passed, puts the krona reports in:
self.sample_data["project_data"]["krona"]
Parameters that can be set¶
Parameter | Values | Comments |
---|---|---|
ktImportTaxonomy_path | Path to ktImportTaxonomy. You can add additional ktImportTaxonomy parameters at the end of the path. If not passed, the krona report will not be built. | |
Lines for parameter file¶
kraken1:
    module: kraken
    base: trim1
    script_path: '{Vars.paths.kraken}'
    qsub_params:
        -pe: shared 20
    ktImportTaxonomy_path: /path/to/ktImportTaxonomy -u http://krona.sourceforge.net
    redirects:
        --db: /path/to/kraken_std_db
        --preload:
        --quick:
        --threads: 20
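To see how a redirects block such as the one above could end up on a command line, here is a minimal Python sketch. It assumes that a redirect with an empty value (such as --preload) is passed as a bare flag and that other redirects are passed as a flag followed by its value; the helper redirects_to_args is hypothetical, and how the module actually assembles the command is not shown in this document.

```python
def redirects_to_args(redirects):
    """Flatten a redirects mapping into a list of command-line arguments.

    Assumption for this sketch: a value of None (an empty YAML value)
    means the flag takes no argument.
    """
    args = []
    for flag, value in redirects.items():
        args.append(flag)
        if value is not None:
            args.append(str(value))
    return args


redirects = {
    "--db": "/path/to/kraken_std_db",
    "--preload": None,
    "--quick": None,
    "--threads": 20,
}

print(" ".join(redirects_to_args(redirects)))
# --db /path/to/kraken_std_db --preload --quick --threads 20
```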
References¶
Wood, D.E. and Salzberg, S.L., 2014. Kraken: ultrafast metagenomic sequence classification using exact alignments. Genome biology, 15(3), p.R46.
kraken_biom¶
Authors: | Menachem Sklarz |
---|---|
Affiliation: | Bioinformatics core facility |
Organization: | National Institute of Biotechnology in the Negev, Ben Gurion University. |
Note
This module was developed as part of a study led by Dr. Jacob Moran Gilad
A module for running kraken-biom (https://github.com/smdabdoub/kraken-biom).
Requires¶
Kraken reports:
sample_data[<sample>]["kraken.report"]
Output¶
Puts the resulting biom output files in:
self.sample_data["project_data"]["kraken.biom"]
self.sample_data["project_data"]["biom_table"]
self.sample_data["project_data"]["biom_table_tsv"] (if skip_tsv is not set)
Parameters that can be set¶
Parameter | Values | Comments |
---|---|---|
skip_tsv | Set if you do not want to convert the report into tsv format. | |
skip_summary | Set if you do not want to create a summary of the report. | |
biom_path | /path/to/biom | The path to biom. This is required for conversion to tsv and for producing the summary |
Lines for parameter file¶
kraken_biom1:
    module: kraken_biom
    base: kraken1
    script_path: '{Vars.paths.kraken_biom}'
    # skip_tsv:
    biom_path: '{Vars.paths.biom}'
    redirects:
        --max: D
        --min: S
        --gzip:
References¶
metaphlan2¶
Authors: | Menachem Sklarz |
---|---|
Affiliation: | Bioinformatics core facility |
Organization: | National Institute of Biotechnology in the Negev, Ben Gurion University. |
Note
This module was developed as part of a study led by Dr. Jacob Moran Gilad
A module for running metaphlan2.
Requires¶
fastq files, either paired end or single:
sample_data[<sample>]["fastq.F"]
sample_data[<sample>]["fastq.R"]
sample_data[<sample>]["fastq.S"]
Output¶
Puts the metaphlan2 output files in:
self.sample_data[<sample>]["raw_classification"]
If the ktImportText_path parameter was passed, puts the krona reports in:
self.sample_data["project_data"]["krona"]
If merge_metaphlan_tables was passed, puts the merged reports in:
self.sample_data["project_data"]["merged_metaphlan2"]
If --biom is set in redirects, the biom table is put in:
self.sample_data[<sample>]["biom_table"]
If --bowtie2out is set in redirects, the SAM file is put in:
self.sample_data[<sample>]["sam"]
If metaphlan2krona_path is set:
self.sample_data[<sample>]["classification"]
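The conditions above can be summarized as a small Python sketch that maps a step's parameters to the slots one would expect to be filled. It simply restates the rules listed in this section; the function name expected_metaphlan2_slots and the plain dict standing in for the step parameters are hypothetical, not part of the module.

```python
def expected_metaphlan2_slots(step_params):
    """Return (sample_slots, project_slots) expected for a metaphlan2 step.

    step_params is a plain dict standing in for one step of the parameter
    file (an assumption made for this illustration).
    """
    redirects = step_params.get("redirects") or {}
    sample_slots = ["raw_classification"]  # always produced
    project_slots = []

    if "ktImportText_path" in step_params:
        project_slots.append("krona")
    if "merge_metaphlan_tables" in step_params:
        project_slots.append("merged_metaphlan2")
    if "--biom" in redirects:
        sample_slots.append("biom_table")
    if "--bowtie2out" in redirects:
        sample_slots.append("sam")
    if "metaphlan2krona_path" in step_params:
        sample_slots.append("classification")
    return sample_slots, project_slots


step = {
    "module": "metaphlan2",
    "ktImportText_path": "/path/to/ktImportText",
    "merge_metaphlan_tables": None,
    "redirects": {"--biom": None, "--input_type": "fastq"},
}
sample_slots, project_slots = expected_metaphlan2_slots(step)
print(sample_slots)   # ['raw_classification', 'biom_table']
print(project_slots)  # ['krona', 'merged_metaphlan2']
```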
Parameters that can be set¶
Parameter | Values | Comments |
---|---|---|
ktImportText_path | Path to ktImportText. | |
merge_metaphlan_tables | Path to merge_metaphlan_tables.py. If not specified, will derive it from the location of metaphlan2. | |
metaphlan2krona_path | Path to metaphlan2krona.py. | |
Lines for parameter file¶
metph1:
    module: metaphlan2
    base: trim1
    script_path: '{Vars.paths.metaphlan2}'
    ktImportText_path: /path/to/ktImportText
    merge_metaphlan_tables:
    metaphlan2krona_path: /path/to/metaphlan2krona.py
    redirects:
        --biom:
        --bowtie2_exe: /path/to/bowtie2
        --bowtie2db: /path/to/database
        --bowtie2out:
        --input_type: fastq
        --mdelim: ';'
        --mpa_pkl: /path/to/mpa_v20_m200.pkl
References¶
Truong, D.T., Franzosa, E.A., Tickle, T.L., Scholz, M., Weingart, G., Pasolli, E., Tett, A., Huttenhower, C. and Segata, N., 2015. MetaPhlAn2 for enhanced metagenomic taxonomic profiling. Nature methods, 12(10), pp.902-903.
centrifuge¶
Authors: | Menachem Sklarz |
---|---|
Affiliation: | Bioinformatics core facility |
Organization: | National Institute of Biotechnology in the Negev, Ben Gurion University. |
Note
This module was developed as part of a study led by Dr. Jacob Moran Gilad
A module for running centrifuge.
Pass the full path to the centrifuge executable in script_path.
Merging of sample centrifuge reports is done with krona. See the section on Parameters that can be set.
Requires¶
fastq files, either paired end or single:
sample_data[<sample>]["fastq.F"]
sample_data[<sample>]["fastq.R"]
sample_data[<sample>]["fastq.S"]
Output¶
Puts the centrifuge output files in:
self.sample_data[<sample>]["raw_classification"]
self.sample_data[<sample>]["classification"]
self.sample_data[<sample>]["classification_report"]
If the ktImportTaxonomy_path parameter was passed, puts the krona reports in:
self.sample_data["project_data"]["krona"]
Parameters that can be set¶
Parameter | Values | Comments |
---|---|---|
ktImportTaxonomy_path | Path to ktImportTaxonomy. You can add additional ktImportTaxonomy parameters at the end of the path. If not passed, the krona report will not be built. | |
Lines for parameter file¶
Centrifuge:
    module: centrifuge
    base: trim1
    script_path: '{Vars.paths.centrifuge}'
    qsub_params:
        -pe: shared 20
    ktImportTaxonomy_path: /path/to/ktImportTaxonomy -u http://krona.sourceforge.net
    redirects:
        --db: /path/to/centrifuge_db
        --preload:
        --quick:
        --threads: 20
References¶
Kim, D., Song, L., Breitwieser, F.P. and Salzberg, S.L., 2016. Centrifuge: rapid and sensitive classification of metagenomic sequences. Genome research, 26(12), pp.1721-1729.