QIIME (version 1.9)

qiime_prep

Authors

Menachem Sklarz

Affiliation

Bioinformatics core facility

Organization

National Institute of Biotechnology in the Negev, Ben Gurion University.

Note

This module was developed as part of a study led by Dr. Jacob Moran Gilad

A module for preparing fastq reads for analysis with QIIME (1.9):

The reads stored in each sample are optionally joined and then placed in a directory in such a way that, downstream, the qiime_demult step can concatenate the sequences while retaining the sample of origin.

The directory will contain symbolic links to the files to be used by demult in the following step.
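As an illustration of the layout described above, the following minimal Python sketch builds a directory of symbolic links in which each link name carries the sample name. The function name link_reads and the <sample>.fastq naming scheme are assumptions made for this example; the names actually used by the module may differ.

```python
import os


def link_reads(sample_files, links_dir):
    """Create one symbolic link per sample inside links_dir.

    sample_files maps a sample name to the fastq file chosen for it.
    The <sample>.fastq link name is a hypothetical naming scheme.
    """
    os.makedirs(links_dir, exist_ok=True)
    links = {}
    for sample, path in sample_files.items():
        link = os.path.join(links_dir, sample + ".fastq")
        # Replace a stale link left over from a previous run.
        if os.path.lexists(link):
            os.remove(link)
        os.symlink(os.path.abspath(path), link)
        links[sample] = link
    return links
```

Because the files are only linked, not copied, the directory stays small and the original reads are left untouched.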

Requires

  • fastq files in one of the following slots:

    • sample_data[<sample>]["fastq.F"]

    • sample_data[<sample>]["fastq.R"]

    • sample_data[<sample>]["fastq.S"]

Output

  • Puts directory of links to files to use with QIIME:

    • self.sample_data["project_data"]["qiime.prep_links_dir"]

  • If join is performed:

    • puts the new joined reads in:

      • self.sample_data[<sample>]["fastq.J"]

    • puts the unjoined forward reads in:

      • self.sample_data[<sample>]["fastq.F"]

    • puts the unjoined reverse reads in:

      • self.sample_data[<sample>]["fastq.R"]

Parameters that can be set

Parameter | Values | Comments
join | none, join (or join_cat - not implemented) | Whether to join paired reads.
unjoined | forward, reverse, both or none | What to do with unjoined sequences: use only forward, only reverse, both or none. If join is none, use this parameter to indicate which reads to take for analysis.
join_algo | forward, reverse, both or none | What to do with unjoined sequences?
parameters | | Path to QIIME parameter file to be used downstream
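The interplay of join and unjoined can be sketched in Python as follows. This is one reading of the parameter descriptions, not the module's code: the function name reads_for_analysis is hypothetical, and join_cat is left out because it is not implemented.

```python
def reads_for_analysis(slots, join="none", unjoined="forward"):
    """Return the fastq files of one sample that would be passed on.

    slots is the per-sample dictionary holding fastq.J, fastq.F and
    fastq.R entries. The selection order is an assumption.
    """
    if join not in ("none", "join"):
        raise ValueError("join must be 'none' or 'join'")
    if unjoined not in ("forward", "reverse", "both", "none"):
        raise ValueError("unjoined must be forward, reverse, both or none")

    picked = []
    # Joined reads are only available when joining was requested.
    if join == "join" and "fastq.J" in slots:
        picked.append(slots["fastq.J"])
    # 'unjoined' selects which of the remaining reads are kept.
    if unjoined in ("forward", "both") and "fastq.F" in slots:
        picked.append(slots["fastq.F"])
    if unjoined in ("reverse", "both") and "fastq.R" in slots:
        picked.append(slots["fastq.R"])
    return picked
```

With join set to none, the same unjoined value simply chooses between the original forward and reverse files.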

Lines for parameter file

q_prep_1:
    module: qiime_prep
    base: merge1
    script_path: /path/to/join_paired_ends.py
    join: join
    unjoined: forward
    parameters: /path/to/qiime_params.txt
    redirects:
        --pe_join_method: fastq-join

References

Caporaso, J.G., Kuczynski, J., Stombaugh, J., Bittinger, K., Bushman, F.D., Costello, E.K., Fierer, N., Peña, A.G., Goodrich, J.K., Gordon, J.I. and Huttley, G.A., 2010. “QIIME allows analysis of high-throughput community sequencing data”. Nature methods, 7(5), pp.335-336.

qiime_demult

Authors

Menachem Sklarz

Affiliation

Bioinformatics core facility

Organization

National Institute of Biotechnology in the Negev, Ben Gurion University.

Note

This module was developed as part of a study led by Dr. Jacob Moran Gilad

A module for running QIIME’s multiple_split_libraries_fastq.py:

The reads from step qiime_prep are combined into one seqs.fna file.

Note

The module has not been tested on other types of data, such as undemultiplexed reads. It should work but there will probably be unexpected problems.

Requires

  • A directory of read files with sample names coded in the file names, such as the directory produced by qiime_prep:

    • sample_data["qiime.prep_links_dir"]

Output

  • Puts the resulting seqs.fna file in the following slots:

    • self.sample_data["project_data"]["qiime.demult_seqs"]

    • self.sample_data["project_data"]["qiime.fasta"]

    • self.sample_data["project_data"]["fasta.nucl"]
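A rough Python sketch of what combining per-sample reads into a single file amounts to: every sequence receives a label that starts with its sample name, so the sample of origin survives the concatenation. The <sample>_<n> label format is loosely modelled on the labels in QIIME's seqs.fna, and the function names are hypothetical; this is not QIIME's own code.

```python
def combine_reads(reads_by_sample):
    """Yield (label, sequence) pairs labelled <sample>_<running number>."""
    counter = 0
    for sample, sequences in reads_by_sample.items():
        for sequence in sequences:
            yield f"{sample}_{counter}", sequence
            counter += 1


def to_fasta(records):
    """Format (label, sequence) pairs as fasta text."""
    return "".join(f">{label}\n{sequence}\n" for label, sequence in records)
```

Downstream steps can then recover the sample by splitting each label at its last underscore.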

Lines for parameter file

q_demult_1:
    module: qiime_demult
    base: q_prep_1
    script_path: '/path/to/multiple_split_libraries_fastq.py'
    redirects:
        --demultiplexing_method: sampleid_by_file
        --include_input_dir_path: null
        --parameter_fp: /path/to/qiime_params
        --remove_filepath_in_name: null

References

Caporaso, J.G., Kuczynski, J., Stombaugh, J., Bittinger, K., Bushman, F.D., Costello, E.K., Fierer, N., Peña, A.G., Goodrich, J.K., Gordon, J.I. and Huttley, G.A., 2010. “QIIME allows analysis of high-throughput community sequencing data”. Nature methods, 7(5), pp.335-336.

qiime_chimera

Authors

Menachem Sklarz

Affiliation

Bioinformatics core facility

Organization

National Institute of Biotechnology in the Negev, Ben Gurion University.

Note

This module was developed as part of a study led by Dr. Jacob Moran Gilad

A module for running QIIME’s identify_chimeric_seqs.py:

The module can operate on the raw seqs.fna or on an aligned version. The latter is used for ChimeraSlayer and the former for usearch61.

Requires

  • A fasta file in:

    • sample_data["qiime.fasta"]

  • Alternatively, an aligned fasta file in:

    • sample_data["fasta.aligned"]
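The choice of input file per method might look like the Python sketch below. The function name chimera_input is hypothetical, and the rule used when no method is given (prefer the aligned file if it exists) is an assumption; the module only states that it guesses from the existing files.

```python
def chimera_input(project_data, method=None):
    """Return (method, fasta_path) for chimera detection.

    ChimeraSlayer reads the aligned fasta, usearch61 the raw one.
    """
    slot_for_method = {
        "ChimeraSlayer": "fasta.aligned",
        "usearch61": "qiime.fasta",
    }
    if method is None:
        # Assumed guessing rule: an aligned file implies ChimeraSlayer.
        method = "ChimeraSlayer" if "fasta.aligned" in project_data else "usearch61"
    if method not in slot_for_method:
        raise ValueError(f"unknown method: {method}")
    slot = slot_for_method[method]
    if slot not in project_data:
        raise KeyError(f"{method} needs slot '{slot}'")
    return method, project_data[slot]
```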

Output

  • Puts the resulting list of chimeras in

    • self.sample_data["project_data"]["chimeras"]

  • Puts the filtered fasta file in:

    • self.sample_data["project_data"]["fasta.chimera_removed"]

    • self.sample_data["project_data"]["fasta.nucl"]

Note

When using parallel_identify_chimeric_seqs.py, the module tries to build the scripts appropriately. It is wise to check the parallel scripts before running them…

Parameters that can be set

Parameter | Values | Comments
method | usearch61 or ChimeraSlayer | Method to use for the analysis (passed to the --chimera_detection_method option of identify_chimeric_seqs.py).

Lines for parameter file

q_chimera_usrch:
    module: qiime_chimera
    base: q_demult_1
    # script_path: '{Vars.qiime_path}/parallel_identify_chimeric_seqs.py'
    script_path: '{Vars.qiime_path}/identify_chimeric_seqs.py'
    method:         usearch61 # Or ChimeraSlayer. Will guess depending on existing files.
    redirects:
        # --jobs_to_start:              20
        --aligned_reference_seqs_fp:  /path/to/reference_files.otus_aligned
        --reference_seqs_fp:  /path/to/reference_files.otus

References

Caporaso, J.G., Kuczynski, J., Stombaugh, J., Bittinger, K., Bushman, F.D., Costello, E.K., Fierer, N., Peña, A.G., Goodrich, J.K., Gordon, J.I. and Huttley, G.A., 2010. “QIIME allows analysis of high-throughput community sequencing data”. Nature methods, 7(5), pp.335-336.

qiime_pick_otus

Authors

Menachem Sklarz

Affiliation

Bioinformatics core facility

Organization

National Institute of Biotechnology in the Negev, Ben Gurion University.

Note

This module was developed as part of a study led by Dr. Jacob Moran Gilad

A module for running QIIME’s pick_otus.py

Requires

  • A fasta file in:

    • sample_data["fasta.nucl"]

Output

  • Puts the resulting OTU table in:

    • self.sample_data["project_data"]["otu_table"]

Lines for parameter file

q_pick_otu_1:
    module: qiime_pick_otus
    base: q_chimera_usrch
    script_path: '{Vars.qiime_path}/pick_otus.py'
    setenv: {Vars.qiime_env}

References

Caporaso, J.G., Kuczynski, J., Stombaugh, J., Bittinger, K., Bushman, F.D., Costello, E.K., Fierer, N., Peña, A.G., Goodrich, J.K., Gordon, J.I. and Huttley, G.A., 2010. “QIIME allows analysis of high-throughput community sequencing data”. Nature methods, 7(5), pp.335-336.

qiime_pick_rep_set

Authors

Menachem Sklarz

Affiliation

Bioinformatics core facility

Organization

National Institute of Biotechnology in the Negev, Ben Gurion University.

Note

This module was developed as part of a study led by Dr. Jacob Moran Gilad

A module for running QIIME’s pick_rep_set.py

Requires

  • A fasta file in:

    • sample_data["fasta.nucl"]

  • An OTU table in:

    • sample_data["otu_table"]

Output

  • Puts the resulting fasta file in:

    • self.sample_data["project_data"]["fasta.nucl"]

  • Saves the original fasta file in:

    • self.sample_data["project_data"]["qiime.full_fasta"]

Lines for parameter file

q_rep_set_1:
    module: qiime_pick_rep_set
    base: q_pick_otu_1
    script_path: '{Vars.qiime_path}/pick_rep_set.py'
    setenv: {Vars.qiime_env}

References

Caporaso, J.G., Kuczynski, J., Stombaugh, J., Bittinger, K., Bushman, F.D., Costello, E.K., Fierer, N., Peña, A.G., Goodrich, J.K., Gordon, J.I. and Huttley, G.A., 2010. “QIIME allows analysis of high-throughput community sequencing data”. Nature methods, 7(5), pp.335-336.

qiime_align_seqs

Authors

Menachem Sklarz

Affiliation

Bioinformatics core facility

Organization

National Institute of Biotechnology in the Negev, Ben Gurion University.

Note

This module was developed as part of a study led by Dr. Jacob Moran Gilad

A module for running QIIME's align_seqs.py:

Can also be used to run the parallel version, parallel_align_seqs_pynast.py.

Requires

  • A fasta file in:

    • sample_data["fasta.nucl"]

Output

  • Puts the resulting aligned fasta file in:

    • self.sample_data["project_data"]["fasta.nucl"]

    • self.sample_data["project_data"]["fasta.aligned"]

  • Stores the old, unaligned version in:

    • self.sample_data["project_data"]["fasta.unaligned"]
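The slot bookkeeping listed above can be pictured with a small Python sketch. The function name record_alignment is hypothetical and the dictionary stands in for self.sample_data["project_data"]; it only illustrates how the generic fasta.nucl slot is repointed while the previous file stays reachable.

```python
def record_alignment(project_data, aligned_path):
    """Return a copy of project_data updated with the aligned fasta."""
    updated = dict(project_data)
    # Keep the old, unaligned file under its own slot.
    updated["fasta.unaligned"] = project_data["fasta.nucl"]
    # The aligned file is stored under a specific slot and also
    # becomes the generic fasta.nucl that downstream steps read.
    updated["fasta.aligned"] = aligned_path
    updated["fasta.nucl"] = aligned_path
    return updated
```

Since later modules read fasta.nucl, they pick up the aligned file without any extra configuration.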

Note

When using parallel_align_seqs_pynast.py, the module tries to build the scripts appropriately. It is wise to check the parallel scripts before running them…

Lines for parameter file

q_align_para:
    module: qiime_align_seqs
    base: q_rep_set_1
    script_path: '{Vars.qiime_path}/parallel_align_seqs_pynast.py'
    setenv: {Vars.qiime_env}
    redirects:
        --jobs_to_start: 5
        --retain_temp_files: 

References

Caporaso, J.G., Kuczynski, J., Stombaugh, J., Bittinger, K., Bushman, F.D., Costello, E.K., Fierer, N., Peña, A.G., Goodrich, J.K., Gordon, J.I. and Huttley, G.A., 2010. “QIIME allows analysis of high-throughput community sequencing data”. Nature methods, 7(5), pp.335-336.

qiime_filter_alignment

Authors

Menachem Sklarz

Affiliation

Bioinformatics core facility

Organization

National Institute of Biotechnology in the Negev, Ben Gurion University.

Note

This module was developed as part of a study led by Dr. Jacob Moran Gilad

A module for running QIIME’s filter_alignment.py

Requires

  • A fasta file in:

    • sample_data["fasta.nucl"]

Output

  • Puts the resulting aligned fasta file in:

    • self.sample_data["project_data"]["fasta.nucl"]

  • Saves the original, unfiltered aligned fasta file in:

    • self.sample_data["project_data"]["fasta.aligned_unfiltered"]

Lines for parameter file

q_filt_align_1:
    module: qiime_filter_alignment
    base: q_align_para
    script_path: '{Vars.qiime_path}/filter_alignment.py'
    setenv: {Vars.qiime_env}

References

Caporaso, J.G., Kuczynski, J., Stombaugh, J., Bittinger, K., Bushman, F.D., Costello, E.K., Fierer, N., Peña, A.G., Goodrich, J.K., Gordon, J.I. and Huttley, G.A., 2010. “QIIME allows analysis of high-throughput community sequencing data”. Nature methods, 7(5), pp.335-336.

qiime_assign_taxonomy

Authors

Menachem Sklarz

Affiliation

Bioinformatics core facility

Organization

National Institute of Biotechnology in the Negev, Ben Gurion University.

Note

This module was developed as part of a study led by Dr. Jacob Moran Gilad

A module for running QIIME’s assign_taxonomy.py

Can also be used to run the parallel versions of the program:

  • parallel_assign_taxonomy_blast.py

  • parallel_assign_taxonomy_rdp.py

  • parallel_assign_taxonomy_uclust.py

Requires

  • A fasta file in:

    • sample_data["fasta.nucl"]

Output

  • Puts the resulting taxonomy assignment in:

    • self.sample_data["project_data"]["taxonomy"]

Note

When using the parallel version, the module tries to build the scripts appropriately. It is wise to check the parallel scripts before running them…

Lines for parameter file

q_tax_asn_1:
    module: qiime_assign_taxonomy
    base: q_rep_set_1
    script_path: '{Vars.qiime_path}/parallel_assign_taxonomy_rdp.py'
    setenv: {Vars.qiime_env}
    redirects:
        --confidence: 0.5
        --id_to_taxonomy_fp: {Vars.reference_files.id_to_taxonomy}
        --jobs_to_start: 20
        --rdp_max_memory: 50000
        --reference_seqs_fp: {Vars.reference_files.otus}

References

Caporaso, J.G., Kuczynski, J., Stombaugh, J., Bittinger, K., Bushman, F.D., Costello, E.K., Fierer, N., Peña, A.G., Goodrich, J.K., Gordon, J.I. and Huttley, G.A., 2010. “QIIME allows analysis of high-throughput community sequencing data”. Nature methods, 7(5), pp.335-336.

qiime_make_phylogeny

Authors

Menachem Sklarz

Affiliation

Bioinformatics core facility

Organization

National Institute of Biotechnology in the Negev, Ben Gurion University.

Note

This module was developed as part of a study led by Dr. Jacob Moran Gilad

A module for running QIIME’s make_phylogeny.py

Requires

  • A fasta file in:

    • sample_data["fasta.nucl"]

Output

  • Puts the resulting phylogenetic tree in:

    • self.sample_data["project_data"]["phylotree"]

Lines for parameter file

q_phylo_1:
    module: qiime_make_phylogeny
    base: q_filt_align_1
    script_path: '{Vars.qiime_path}/make_phylogeny.py'
    setenv: {Vars.qiime_env}

References

Caporaso, J.G., Kuczynski, J., Stombaugh, J., Bittinger, K., Bushman, F.D., Costello, E.K., Fierer, N., Peña, A.G., Goodrich, J.K., Gordon, J.I. and Huttley, G.A., 2010. “QIIME allows analysis of high-throughput community sequencing data”. Nature methods, 7(5), pp.335-336.

qiime_make_otu_table

Authors

Menachem Sklarz

Affiliation

Bioinformatics core facility

Organization

National Institute of Biotechnology in the Negev, Ben Gurion University.

Note

This module was developed as part of a study led by Dr. Jacob Moran Gilad

A module for running QIIME’s make_otu_table.py:

The module creates a BIOM table based on the OTU table and, if available, a taxonomy assignment (which will be available if qiime_assign_taxonomy is in the branch).

If chimera checking has been performed, the suspected chimeric sequences will be removed from the BIOM table.

The module also adds code for creating a summary of the BIOM table and a tab-delimited version thereof.
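To make the counting step concrete, here is a minimal Python sketch that turns an OTU map (one tab-separated line per OTU: the OTU ID followed by its sequence IDs) into per-sample counts and skips excluded OTUs. It assumes sequence IDs of the form <sample>_<n> and that the chimera list holds identifiers matching OTU IDs; whether the module matches at OTU or sequence level is not stated here, and the function name otu_counts is hypothetical.

```python
from collections import defaultdict


def otu_counts(otu_map_lines, excluded_otus=()):
    """Build {otu_id: {sample: count}} from OTU-map lines."""
    excluded = set(excluded_otus)
    table = {}
    for line in otu_map_lines:
        if not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        otu_id, seq_ids = fields[0], fields[1:]
        if otu_id in excluded:
            # Suspected chimera: leave it out of the table.
            continue
        counts = defaultdict(int)
        for seq_id in seq_ids:
            # The sample name is everything before the last underscore.
            sample = seq_id.rsplit("_", 1)[0]
            counts[sample] += 1
        table[otu_id] = dict(counts)
    return table
```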

Requires

  • An OTU table:

    • sample_data["otu_table"]

Optional

  • A taxonomy assignment of the sequences:

    • sample_data["taxonomy"]

Output

  • Puts the BIOM table in

    • self.sample_data["project_data"]["biom_table"]

  • Puts the BIOM table summary in:

    • self.sample_data["project_data"]["biom_table_summary"]

  • Puts the BIOM table in tab-delimited format in:

    • self.sample_data["project_data"]["biom_table_tsv"]

  • If a fasta.chimera_removed file exists, will put the unfiltered BIOM table in:

    • self.sample_data["project_data"]["unfiltered_biom_table"]

Parameters that can be set

Parameter | Values | Comments
skip_summary | | If passed, will not create the BIOM table summary.
skip_tsv | | If passed, will not create the tsv version of the BIOM table.

Lines for parameter file

q_mk_otu_1:
    module: qiime_make_otu_table
    base: q_phylo_1
    script_path: '{Vars.qiime_path}/make_otu_table.py'
    setenv: {Vars.qiime_env}
    # skip_summary:
    # skip_tsv:
    redirects:
        --mapping_fp: /path/to/qiime1_mapping.txt

References

Caporaso, J.G., Kuczynski, J., Stombaugh, J., Bittinger, K., Bushman, F.D., Costello, E.K., Fierer, N., Peña, A.G., Goodrich, J.K., Gordon, J.I. and Huttley, G.A., 2010. “QIIME allows analysis of high-throughput community sequencing data”. Nature methods, 7(5), pp.335-336.

qiime_filter_samples_from_otu_table

Authors

Menachem Sklarz

Affiliation

Bioinformatics core facility

Organization

National Institute of Biotechnology in the Negev, Ben Gurion University.

Note

This module was developed as part of a study led by Dr. Jacob Moran Gilad

A module for running QIIME’s filter_samples_from_otu_table.py

Requires

  • A BIOM table in:

    • sample_data["biom_table"]

Output

  • Puts the resulting BIOM table in:

    • self.sample_data["project_data"]["biom_table"]

  • Puts the BIOM table summary in:

    • self.sample_data["project_data"]["biom_table_summary"]

  • Puts the BIOM table in tab-delimited format in:

    • self.sample_data["project_data"]["biom_table_tsv"]

  • Puts the unfiltered BIOM table in:

    • self.sample_data["project_data"]["prefilter_biom_table"]

Parameters that can be set

Parameter | Values | Comments
skip_summary | | If passed, will not create the BIOM table summary.
skip_tsv | | If passed, will not create the tsv version of the BIOM table.

Lines for parameter file

filt_samp_1:
    module: qiime_filter_samples_from_otu_table
    base: q_mk_otu_1
    script_path: '{Vars.qiime_path}/filter_samples_from_otu_table.py'
    setenv: {Vars.qiime_env}
    redirects:
        --mapping_fp: /path/to/mapping.txt
        --min_count: 100000
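
The effect of --min_count can be illustrated with a short Python sketch over a plain dictionary of counts. The {otu: {sample: count}} layout and the function name filter_samples are assumptions for the example; the real script operates on BIOM files.

```python
def filter_samples(table, min_count):
    """Keep only samples whose total count is at least min_count.

    table maps an OTU ID to a {sample: count} dictionary.
    """
    totals = {}
    for counts in table.values():
        for sample, n in counts.items():
            totals[sample] = totals.get(sample, 0) + n
    keep = {sample for sample, total in totals.items() if total >= min_count}
    # Rebuild the table with the retained samples only. OTUs left
    # without any counts are kept as empty rows in this sketch.
    return {
        otu: {s: n for s, n in counts.items() if s in keep}
        for otu, counts in table.items()
    }
```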

References

Caporaso, J.G., Kuczynski, J., Stombaugh, J., Bittinger, K., Bushman, F.D., Costello, E.K., Fierer, N., Peña, A.G., Goodrich, J.K., Gordon, J.I. and Huttley, G.A., 2010. “QIIME allows analysis of high-throughput community sequencing data”. Nature methods, 7(5), pp.335-336.

qiime_filter_otus

Authors

Menachem Sklarz

Affiliation

Bioinformatics core facility

Organization

National Institute of Biotechnology in the Negev, Ben Gurion University.

Note

This module was developed as part of a study led by Dr. Jacob Moran Gilad

A module for running QIIME’s filter_otus_from_otu_table.py

Requires

  • A BIOM table in:

    • sample_data["biom_table"]

Output

  • Puts the resulting BIOM table in:

    • self.sample_data["project_data"]["biom_table"]

  • Puts the BIOM table summary in:

    • self.sample_data["project_data"]["biom_table_summary"]

  • Puts the BIOM table in tab-delimited format in:

    • self.sample_data["project_data"]["biom_table_tsv"]

  • Puts the unfiltered BIOM table in:

    • self.sample_data["project_data"]["prefilter_biom_table"]

Parameters that can be set

Parameter | Values | Comments
skip_summary | | If passed, will not create the BIOM table summary.
skip_tsv | | If passed, will not create the tsv version of the BIOM table.

Lines for parameter file

q_filt_otus_1:
    module: qiime_filter_otus
    base: filt_samp_1
    script_path: '{Vars.qiime_path}/filter_otus_from_otu_table.py'
    setenv: {Vars.qiime_env}
    redirects:
        --min_count_fraction: 0.00005
        --min_samples: 10
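
As a hedged illustration of the two thresholds in this example, the Python sketch below applies them to a plain dictionary of counts: --min_count_fraction is taken as a fraction of the total count in the whole table, and --min_samples as the number of samples in which an OTU must be observed. The {otu: {sample: count}} layout and the function name filter_otus are assumptions; the real script operates on BIOM files.

```python
def filter_otus(table, min_count_fraction=0.0, min_samples=0):
    """Keep OTUs that pass both the abundance and the prevalence test.

    table maps an OTU ID to a {sample: count} dictionary.
    """
    grand_total = sum(sum(counts.values()) for counts in table.values())
    # Convert the fraction into an absolute minimum count per OTU.
    min_count = min_count_fraction * grand_total
    kept = {}
    for otu, counts in table.items():
        otu_total = sum(counts.values())
        observed_in = sum(1 for n in counts.values() if n > 0)
        if otu_total >= min_count and observed_in >= min_samples:
            kept[otu] = counts
    return kept
```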

References

Caporaso, J.G., Kuczynski, J., Stombaugh, J., Bittinger, K., Bushman, F.D., Costello, E.K., Fierer, N., Peña, A.G., Goodrich, J.K., Gordon, J.I. and Huttley, G.A., 2010. “QIIME allows analysis of high-throughput community sequencing data”. Nature methods, 7(5), pp.335-336.

qiime_sort_otu_table

Authors

Menachem Sklarz

Affiliation

Bioinformatics core facility

Organization

National Institute of Biotechnology in the Negev, Ben Gurion University.

Note

This module was developed as part of a study led by Dr. Jacob Moran Gilad

A module for running QIIME’s sort_otu_table.py

Requires

  • A BIOM table in:

    • sample_data["biom_table"]

Output

  • Puts the resulting BIOM table in:

    • self.sample_data["project_data"]["biom_table"]

  • Puts the BIOM table summary in:

    • self.sample_data["project_data"]["biom_table_summary"]

  • Puts the BIOM table in tab-delimited format in:

    • self.sample_data["project_data"]["biom_table_tsv"]

Parameters that can be set

Parameter | Values | Comments
skip_summary | | If passed, will not create the BIOM table summary.
skip_tsv | | If passed, will not create the tsv version of the BIOM table.

Lines for parameter file

q_sort_otus_1:
    module: qiime_sort_otu_table
    base: filt_samp_1
    script_path: '{Vars.qiime_path}/sort_otu_table.py'
    setenv: {Vars.qiime_env}
    redirects:
        --sort_field:   XXX

References

Caporaso, J.G., Kuczynski, J., Stombaugh, J., Bittinger, K., Bushman, F.D., Costello, E.K., Fierer, N., Peña, A.G., Goodrich, J.K., Gordon, J.I. and Huttley, G.A., 2010. “QIIME allows analysis of high-throughput community sequencing data”. Nature methods, 7(5), pp.335-336.

qiime_divers

Authors

Menachem Sklarz

Affiliation

Bioinformatics core facility

Organization

National Institute of Biotechnology in the Negev, Ben Gurion University.

Note

This module was developed as part of a study led by Dr. Jacob Moran Gilad

A module for running QIIME’s core_diversity_analyses.py:

The module creates a BIOM table based on the OTU table and, if available, a taxonomy assignment (which will be available if qiime_assign_taxonomy is in the branch).

If chimera checking has been performed, the suspected chimeric sequences will be removed from the BIOM table.

The module also adds code for creating a summary of the BIOM table and a tab-delimited version thereof.

Requires

  • A BIOM table:

    • sample_data["biom_table"]

Optional

  • A phylogenetic tree:

    • sample_data["phylotree"]

Output

  • Puts the core diversity directory name in

    • self.sample_data["project_data"]["diversity"]

Parameters that can be set

Parameter | Values | Comments
--mapping_fp | | A path to the qiime mapping file (if not set, will use the mapping file passed in qiime_prep).
--parameter_fp | | A path to a qiime parameter file.

Lines for parameter file

q_divers_1:
    module: qiime_divers
    base: q_filt_otus_1
    script_path: /path/to/QIIME/bin/core_diversity_analyses.py
    qsub_params:
        -pe: shared 20
    sampling_depth: 109897
    redirects:
        --categories: Disease,sex
        --parameter_fp: /path/to/parameter_file

References

Caporaso, J.G., Kuczynski, J., Stombaugh, J., Bittinger, K., Bushman, F.D., Costello, E.K., Fierer, N., Peña, A.G., Goodrich, J.K., Gordon, J.I. and Huttley, G.A., 2010. “QIIME allows analysis of high-throughput community sequencing data”. Nature methods, 7(5), pp.335-336.