Transcriptome Annotation¶

Modules included in this section

Trinotate
TransDecoder

`Trinotate`¶

Authors:	Menachem Sklarz
Affiliation:	Bioinformatics core facility
Organization:	National Institute of Biotechnology in the Negev, Ben Gurion University.

A class that defines a module for RNA-seq assembly annotation using Trinotate.

Note

This module will be updated in the future to support uploading of other sources of information such as RNAMMER output. See Trinotate documentation.

Requires¶

A transcripts file in:
- self.sample_data["project_data"]["transcripts.fasta.nucl"]
A gene-to-transcript mapping file (produced by the Trinity_gene_to_trans_map module) in:
- self.sample_data["project_data"]["gene_trans_map"]
A protein fasta file (produced by TransDecoder) in:
- self.sample_data["project_data"]["fasta.prot"]
Results of blastp of the protein file against the swissprot database in:
- self.sample_data["project_data"]["blast.prot"]
Results of blastx of the transcripts file against the swissprot database in:
- self.sample_data["project_data"]["blast.nucl"]
Results of hmmscan of the protein file against the pfam database in:
- self.sample_data["project_data"]["hmmscan.prot"]

Attention

If scope is set to sample, all of the above files should be in the sample scope!
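
The requirements above can be checked before the step runs. The following is a minimal sketch and not part of NeatSeq-Flow itself: the helper `missing_trinotate_inputs`, the toy `example` dictionary and its file names are hypothetical, and only the slot names are taken from the list above. It assumes that sample-scope slots live under `sample_data[<sample>]`, as the note above implies.

```python
# Slot names taken from the "Requires" list above.
REQUIRED_SLOTS = (
    "transcripts.fasta.nucl",
    "gene_trans_map",
    "fasta.prot",
    "blast.prot",
    "blast.nucl",
    "hmmscan.prot",
)


def missing_trinotate_inputs(sample_data, scope="project", sample=None):
    """Return the required slots that are absent for the given scope.

    Hypothetical helper for illustration; not a NeatSeq-Flow function.
    """
    if scope == "project":
        slots = sample_data.get("project_data", {})
    elif scope == "sample":
        if sample is None:
            raise ValueError("a sample name is required when scope is 'sample'")
        slots = sample_data.get(sample, {})
    else:
        raise ValueError("scope must be 'sample' or 'project'")
    return [name for name in REQUIRED_SLOTS if name not in slots]


# Toy data (placeholder file names): every slot is present except the hmmscan results.
example = {
    "project_data": {
        "transcripts.fasta.nucl": "transcripts.fasta",
        "gene_trans_map": "gene_trans_map.txt",
        "fasta.prot": "proteins.fasta",
        "blast.prot": "blastp.out",
        "blast.nucl": "blastx.out",
    }
}
print(missing_trinotate_inputs(example, scope="project"))  # ['hmmscan.prot']
```

Calling the helper with `scope="sample"` and a sample name checks the same slot names under that sample instead of under `"project_data"`.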

Output:¶

Puts the Trinotate report file in:
- sample_data[<sample>]["trino.rep"] (scope = sample)
- sample_data["trino.rep"] (scope = project)

Parameters that can be set¶

Parameter	Values	Comments
scope	sample|project
sqlitedb		Path to Trinotate sqlitedb
cp_sqlitedb		Create a local copy of the sqlitedb before loading the data (recommended)

Lines for parameter file¶

trino_Trinotate:
    module:             Trinotate
    base:               
                        - trino_blastp_sprot
                        - trino_blastx_sprot
                        - trino_hmmscan1
    script_path:        {Vars.paths.Trinotate}
    scope:              project
    sqlitedb:           {Vars.databases.trinotate.sqlitedb}
    cp_sqlitedb:    

References¶

Grabherr, M.G., Haas, B.J., Yassour, M., Levin, J.Z., Thompson, D.A., Amit, I., Adiconis, X., Fan, L., Raychowdhury, R., Zeng, Q. and Chen, Z., 2011. Trinity: reconstructing a full-length transcriptome without a genome from RNA-Seq data. Nature biotechnology, 29(7), p.644.

`TransDecoder`¶

Authors:	Menachem Sklarz
Affiliation:	Bioinformatics core facility
Organization:	National Institute of Biotechnology in the Negev, Ben Gurion University.

A module for running TransDecoder on a transcripts file.

Note

Tested on TransDecoder version 5.5.0. The main difference in this version is that an output directory can be specified on the command line.

Requires¶

fasta files in at least one of the following slots:

sample_data[<sample>]["fasta.nucl"] (if scope = sample)

sample_data["fasta.nucl"] (if scope = project)

Output:¶

If scope = project:
- Protein fasta in self.sample_data["project_data"]["fasta.prot"]
- Gene fasta in self.sample_data["project_data"]["fasta.nucl"]
- Original transcripts in self.sample_data["project_data"]["transcripts.fasta.nucl"]
- GFF file in self.sample_data["project_data"]["gff3"]
If scope = sample:
- Protein fasta in self.sample_data[<sample>]["fasta.prot"]
- Gene fasta in self.sample_data[<sample>]["fasta.nucl"]
- Original transcripts in self.sample_data[<sample>]["transcripts.fasta.nucl"]
- GFF file in self.sample_data[<sample>]["gff3"]
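
The slot changes listed above can be sketched as a plain dictionary update. This is an illustration under stated assumptions, not the module's actual code: the function `apply_transdecoder_outputs` and all file names are hypothetical, and it assumes that the "original transcripts" slot is filled from the input `fasta.nucl` slot before that slot is replaced by the gene fasta.

```python
def apply_transdecoder_outputs(slots, protein_fasta, gene_fasta, gff3):
    """Return a copy of `slots` updated with the outputs listed above.

    `slots` stands for sample_data["project_data"] (scope = project) or
    sample_data[<sample>] (scope = sample). Hypothetical helper.
    """
    if "fasta.nucl" not in slots:
        raise KeyError("TransDecoder needs a 'fasta.nucl' slot as input")
    updated = dict(slots)
    # Assumption: the input transcripts are kept under a separate slot...
    updated["transcripts.fasta.nucl"] = slots["fasta.nucl"]
    # ...because "fasta.nucl" is then reused for the gene fasta.
    updated["fasta.nucl"] = gene_fasta
    updated["fasta.prot"] = protein_fasta
    updated["gff3"] = gff3
    return updated


# Placeholder file names, for illustration only.
before = {"fasta.nucl": "transcripts.fasta"}
after = apply_transdecoder_outputs(
    before,
    protein_fasta="proteins.fasta",
    gene_fasta="genes.fasta",
    gff3="genes.gff3",
)
print(after["transcripts.fasta.nucl"])  # transcripts.fasta
print(after["fasta.nucl"])              # genes.fasta
```

The practical consequence is that a downstream step reading `fasta.nucl` after TransDecoder sees the gene fasta, while the original transcripts remain available under `transcripts.fasta.nucl`.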

Parameters that can be set¶

Parameter	Values	Comments
scope	sample|project	Determines whether to use the sample or project transcripts file.

Lines for parameter file¶

trino_Transdecode_highExpr:
    module:             TransDecoder
    base:               Split_Fasta
    script_path:        {Vars.paths.TransDecoder}
    scope:              sample
