Transcriptome Annotation
Modules included in this section
Trinotate
- Authors: Menachem Sklarz
- Affiliation: Bioinformatics core facility
- Organization: National Institute of Biotechnology in the Negev, Ben Gurion University.
A class that defines a module for RNA-seq assembly annotation using Trinotate.
Note
This module will be updated in the future to support uploading of other sources of information such as RNAMMER output. See Trinotate documentation.
Requires
- A transcripts file in self.sample_data["project_data"]["transcripts.fasta.nucl"]
- A gene to transcript mapping file (produced by the Trinity_gene_to_trans_map module) in self.sample_data["project_data"]["gene_trans_map"]
- A protein fasta file (produced by TransDecoder) in self.sample_data["project_data"]["fasta.prot"]
- Results of blastp of the protein file against the swissprot database in self.sample_data["project_data"]["blast.prot"]
- Results of blastx of the transcripts file against the swissprot database in self.sample_data["project_data"]["blast.nucl"]
- Results of hmmscan of the protein file against the pfam database in self.sample_data["project_data"]["hmmscan.prot"]
- Results of signalp of the protein file using the signalp program [optional] in self.sample_data["project_data"]["signalp"]
- Results of rnammer/infernal of the transcripts file [optional; use Infernal with Trinotate-V4] in self.sample_data["project_data"]["rnammer"]
- Results of tmhmm of the protein file using the TmHMM program [optional] in self.sample_data["project_data"]["tmhmm"]
- Results of eggnog of the protein file using the EggnogMapper program [optional; Trinotate-V4 only] in self.sample_data["project_data"]["eggnog"]
Attention
If scope is set to sample, all of the above files should be in the sample scope!
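The requirements above amount to a set of slots that must be filled, in the chosen scope, before Trinotate can run. The following is a minimal sketch of such a check, assuming sample_data is a plain nested dictionary laid out as in the list above. The helper `missing_slots` and the constants `REQUIRED_SLOTS` and `OPTIONAL_SLOTS` are hypothetical names used for illustration and are not part of the module.

```python
# Hypothetical helper, not part of the Trinotate module: reports which
# required slots are absent for the chosen scope.
REQUIRED_SLOTS = [
    "transcripts.fasta.nucl",
    "gene_trans_map",
    "fasta.prot",
    "blast.prot",
    "blast.nucl",
    "hmmscan.prot",
]
# Listed for completeness; these are marked [optional] in the requirements.
OPTIONAL_SLOTS = ["signalp", "rnammer", "tmhmm", "eggnog"]


def missing_slots(sample_data, scope="project", sample=None):
    """Return the required slot names that are missing for the given scope."""
    if scope == "project":
        slots = sample_data.get("project_data", {})
    elif scope == "sample":
        slots = sample_data.get(sample, {})
    else:
        raise ValueError("scope must be 'sample' or 'project'")
    return [name for name in REQUIRED_SLOTS if name not in slots]


# Illustrative data only: the file names are made up.
example = {
    "project_data": {
        "transcripts.fasta.nucl": "Trinity.fasta",
        "gene_trans_map": "Trinity.fasta.gene_trans_map",
        "fasta.prot": "Trinity.fasta.transdecoder.pep",
        "blast.prot": "blastp.outfmt6",
        "blast.nucl": "blastx.outfmt6",
    }
}
print(missing_slots(example))  # the hmmscan result has not been added yet
```

With scope set to sample, the same lookup is made under the sample's own entry, which is why every input has to be present in the sample scope.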
Output:
Puts the Trinotate report file in:
- sample_data[<sample>]["trino.rep"] (scope = sample)
- sample_data["trino.rep"] (scope = project)
Parameters that can be set
Parameter | Values | Comments |
---|---|---|
scope | sample or project | |
sqlitedb | Path to Trinotate sqlitedb | |
cp_sqlitedb | Create local copy of the sqlitedb, before loading the data (recommended) | |
ver4 | Indicate you are using Trinotate V4 | |
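To illustrate what cp_sqlitedb is for: loading results writes into the sqlite database file, so working on a copy presumably keeps the original template database reusable for other runs. The sketch below shows the idea in plain Python. It is an assumption about the intent, not the module's actual implementation, and `local_sqlitedb_copy` is a hypothetical name.

```python
import shutil
from pathlib import Path


def local_sqlitedb_copy(sqlitedb, work_dir):
    """Copy the template sqlite database into work_dir and return the copy's path.

    Hypothetical helper for illustration; the module itself may perform
    the copy differently (for example inside the generated script).
    """
    work_dir = Path(work_dir)
    work_dir.mkdir(parents=True, exist_ok=True)
    target = work_dir / Path(sqlitedb).name
    shutil.copy2(sqlitedb, target)  # the original file is left untouched
    return target


# Usage (paths are placeholders):
# db = local_sqlitedb_copy("/path/to/Trinotate.sqlite", "trinotate_run")
# ...load blast and hmmscan results into `db` rather than the original...
```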
Lines for parameter file
trino_Trinotate:
    module: Trinotate
    base:
        - trino_blastp_sprot
        - trino_blastx_sprot
        - trino_hmmscan1
    script_path: {Vars.paths.Trinotate}
    scope: project
    sqlitedb: {Vars.databases.trinotate.sqlitedb}
    cp_sqlitedb:
    ver4:
References
Grabherr, M.G., Haas, B.J., Yassour, M., Levin, J.Z., Thompson, D.A., Amit, I., Adiconis, X., Fan, L., Raychowdhury, R., Zeng, Q. and Chen, Z., 2011. Trinity: reconstructing a full-length transcriptome without a genome from RNA-Seq data. Nature biotechnology, 29(7), p.644.
TransDecoder
- Authors: Menachem Sklarz
- Affiliation: Bioinformatics core facility
- Organization: National Institute of Biotechnology in the Negev, Ben Gurion University.
A module for running TransDecoder on a transcripts file.
Note
Tested on TransDecoder version 5.5.0. The main difference in this version is that an output directory can be specified on the command line.
Requires
fasta files in at least one of the following slots:
- sample_data[<sample>]["fasta.nucl"] (if scope = sample)
- sample_data["fasta.nucl"] (if scope = project)
Output:
If scope = project:
- Protein fasta in self.sample_data["project_data"]["fasta.prot"]
- Gene fasta in self.sample_data["project_data"]["fasta.nucl"]
- Original transcripts in self.sample_data["project_data"]["transcripts.fasta.nucl"]
- GFF file in self.sample_data["project_data"]["gff3"]
If scope = sample:
- Protein fasta in self.sample_data[<sample>]["fasta.prot"]
- Gene fasta in self.sample_data[<sample>]["fasta.nucl"]
- Original transcripts in self.sample_data[<sample>]["transcripts.fasta.nucl"]
- GFF file in self.sample_data[<sample>]["gff3"]
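Note that the fasta.nucl slot is reused: the input transcripts are moved to transcripts.fasta.nucl and fasta.nucl then points at the predicted coding sequences. The sketch below mimics that bookkeeping on a plain dictionary. It assumes TransDecoder's usual output naming (the input file name followed by .transdecoder.pep, .transdecoder.cds and .transdecoder.gff3) and POSIX-style paths; `register_transdecoder_outputs` is a hypothetical name and is not taken from the module.

```python
from pathlib import PurePosixPath


def register_transdecoder_outputs(slots, out_dir):
    """Update one scope's slot dictionary the way the Output section describes.

    `slots` stands for sample_data["project_data"] or sample_data[<sample>].
    The output file names are an assumption based on TransDecoder's usual
    '<input>.transdecoder.*' naming.
    """
    transcripts = slots["fasta.nucl"]
    base = PurePosixPath(out_dir) / (PurePosixPath(transcripts).name + ".transdecoder")
    # Keep a handle on the original transcripts before fasta.nucl is overwritten.
    slots["transcripts.fasta.nucl"] = transcripts
    slots["fasta.prot"] = f"{base}.pep"    # predicted proteins
    slots["fasta.nucl"] = f"{base}.cds"    # predicted coding sequences
    slots["gff3"] = f"{base}.gff3"         # ORF coordinates
    return slots


# Illustrative paths only.
project_slots = {"fasta.nucl": "/data/Trinity.fasta"}
register_transdecoder_outputs(project_slots, "/out")
```

Downstream modules that need the original assembly, such as Trinotate, therefore read transcripts.fasta.nucl rather than fasta.nucl.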
Parameters that can be set
Parameter | Values | Comments |
---|---|---|
scope | sample or project | Determine whether to use the sample or project transcripts file. |
Lines for parameter file
trino_Transdecode_highExpr:
    module: TransDecoder
    base: Split_Fasta
    script_path: {Vars.paths.TransDecoder}
    scope: sample