Transcriptome Annotation¶
Modules included in this section
Trinotate¶
Authors: | Menachem Sklarz |
---|---|
Affiliation: | Bioinformatics core facility |
Organization: | National Institute of Biotechnology in the Negev, Ben Gurion University. |
A class that defines a module for annotating an RNA-seq assembly using Trinotate.
Note
This module will be updated in the future to support uploading of other sources of information such as RNAMMER output. See Trinotate documentation.
Requires¶
- A transcripts file in self.sample_data["project_data"]["transcripts.fasta.nucl"]
- A gene to transcript mapping file (produced by the Trinity_gene_to_trans_map module) in self.sample_data["project_data"]["gene_trans_map"]
- A protein fasta file (produced by TransDecoder) in self.sample_data["project_data"]["fasta.prot"]
- Results of blastp of the protein file against the swissprot database in self.sample_data["project_data"]["blast.prot"]
- Results of blastx of the transcripts file against the swissprot database in self.sample_data["project_data"]["blast.nucl"]
- Results of hmmscan of the protein file against the pfam database in self.sample_data["project_data"]["hmmscan.prot"]
Attention
If scope is set to sample, all of the above files should be in the sample scope!
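The requirements above can be checked before the step runs. The following is a minimal sketch, not part of the module: missing_trinotate_slots is a hypothetical helper, sample_data is assumed to be a plain nested dictionary keyed as in the list above, and the sample names are passed in explicitly rather than read from the pipeline.

```python
# Slot names are copied from the "Requires" list above.
REQUIRED_SLOTS = [
    "transcripts.fasta.nucl",
    "gene_trans_map",
    "fasta.prot",
    "blast.prot",
    "blast.nucl",
    "hmmscan.prot",
]


def missing_trinotate_slots(sample_data, scope, samples=()):
    """Return {container name: [missing slot names]} for the given scope.

    Hypothetical helper for illustration only. For project scope the
    slots are looked up under "project_data"; for sample scope they are
    looked up under each sample name given in `samples`.
    """
    if scope == "project":
        containers = ["project_data"]
    elif scope == "sample":
        containers = list(samples)
    else:
        raise ValueError("scope must be 'sample' or 'project'")

    missing = {}
    for name in containers:
        slots = sample_data.get(name, {})
        absent = [slot for slot in REQUIRED_SLOTS if slot not in slots]
        if absent:
            missing[name] = absent
    return missing
```

An empty result means every required file is registered in the requested scope. With scope set to sample, a file that exists only under "project_data" is reported as missing, which is the situation the Attention box warns about.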
Output:¶
Puts the Trinotate report file in:
- sample_data[<sample>]["trino.rep"] (scope = sample)
- sample_data["trino.rep"] (scope = project)
Parameters that can be set¶
Parameter | Values | Comments |
---|---|---|
scope | sample|project | |
sqlitedb | Path to Trinotate sqlitedb | |
cp_sqlitedb | | Create a local copy of the sqlitedb before loading the data (recommended) |
Lines for parameter file¶
trino_Trinotate:
    module: Trinotate
    base:
        - trino_blastp_sprot
        - trino_blastx_sprot
        - trino_hmmscan1
    script_path: {Vars.paths.Trinotate}
    scope: project
    sqlitedb: {Vars.databases.trinotate.sqlitedb}
    cp_sqlitedb:
References¶
Grabherr, M.G., Haas, B.J., Yassour, M., Levin, J.Z., Thompson, D.A., Amit, I., Adiconis, X., Fan, L., Raychowdhury, R., Zeng, Q. and Chen, Z., 2011. Trinity: reconstructing a full-length transcriptome without a genome from RNA-Seq data. Nature biotechnology, 29(7), p.644.
TransDecoder¶
Authors: | Menachem Sklarz |
---|---|
Affiliation: | Bioinformatics core facility |
Organization: | National Institute of Biotechnology in the Negev, Ben Gurion University. |
A module for running TransDecoder on a transcripts file.
Note
Tested on TransDecoder version 5.5.0. The main difference in this version is that an output directory can be specified on the command line.
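To illustrate what specifying an output directory looks like, here is a minimal sketch that builds the two TransDecoder command lines as argument lists. It assumes the TransDecoder.LongOrfs and TransDecoder.Predict executables with their -t and --output_dir options as found in version 5.5.0. The function name and file names are hypothetical, and the commands the module actually generates may differ.

```python
import shlex


def transdecoder_commands(transcripts, output_dir):
    """Build the two TransDecoder steps as argument lists.

    Hypothetical helper for illustration. Both steps are given the same
    transcripts file and the same output directory, so that the second
    step can find the results of the first.
    """
    common = ["-t", transcripts, "--output_dir", output_dir]
    return [
        ["TransDecoder.LongOrfs"] + common,
        ["TransDecoder.Predict"] + common,
    ]


# Print the commands instead of running them.
for cmd in transdecoder_commands("transcripts.fasta", "transdecoder_out"):
    print(shlex.join(cmd))
```

Building argument lists rather than a single string keeps paths with spaces intact if the commands are later passed to subprocess.run.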
Requires¶
fasta files in at least one of the following slots:
- sample_data[<sample>]["fasta.nucl"] (if scope = sample)
- sample_data["fasta.nucl"] (if scope = project)
Output:¶
If scope = project:
- Protein fasta in self.sample_data["project_data"]["fasta.prot"]
- Gene fasta in self.sample_data["project_data"]["fasta.nucl"]
- Original transcripts in self.sample_data["project_data"]["transcripts.fasta.nucl"]
- GFF file in self.sample_data["project_data"]["gff3"]

If scope = sample:
- Protein fasta in self.sample_data[<sample>]["fasta.prot"]
- Gene fasta in self.sample_data[<sample>]["fasta.nucl"]
- Original transcripts in self.sample_data[<sample>]["transcripts.fasta.nucl"]
- GFF file in self.sample_data[<sample>]["gff3"]
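Two of these output slots are among the inputs the Trinotate module requires. As a rough sketch, using only the slot names listed in this section, the following shows which Trinotate requirements TransDecoder covers and which must come from other steps. The function and constant names are hypothetical.

```python
# Slots written by TransDecoder, as listed in its "Output" section.
TRANSDECODER_OUTPUT_SLOTS = [
    "fasta.prot",
    "fasta.nucl",
    "transcripts.fasta.nucl",
    "gff3",
]

# Slots required by Trinotate, as listed in its "Requires" section.
TRINOTATE_REQUIRED_SLOTS = [
    "transcripts.fasta.nucl",
    "gene_trans_map",
    "fasta.prot",
    "blast.prot",
    "blast.nucl",
    "hmmscan.prot",
]


def slots_still_needed(produced, required):
    """Return the required slots that are not in `produced`, in order."""
    produced = set(produced)
    return [slot for slot in required if slot not in produced]


remaining = slots_still_needed(TRANSDECODER_OUTPUT_SLOTS, TRINOTATE_REQUIRED_SLOTS)
print(remaining)
```

The remaining slots correspond to the gene to transcript map and the blastp, blastx and hmmscan results, which is why the Trinotate example step lists the blast and hmmscan steps in its base.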
Parameters that can be set¶
Parameter | Values | Comments |
---|---|---|
scope | sample|project | Determines whether to use the sample or project transcripts file. |
Lines for parameter file¶
trino_Transdecode_highExpr:
    module: TransDecoder
    base: Split_Fasta
    script_path: {Vars.paths.TransDecoder}
    scope: sample