Hi All,
I am new to working with Nextflow's rnaseq pipeline, and am currently using outputs that were previously produced by it. The outputs from STAR_salmon include quantification matrices at the gene and transcript levels. However, when looking at my transcript outputs, there aren't any quantifications at the isoform level (e.g., I see ENST00000420443, but not ENST00000420443.1, ENST00000420443.2, etc.).
The original references (./Homo_sapiens.GRCh38.dna_sm.primary_assembly.fa.gz and ./homo_sapiens/Homo_sapiens.GRCh38.111.gtf.gz) look correct for what this pipeline requires, but unfortunately it looks like the reference transcriptome produced by the pipeline wasn't saved. I also don't have any of the quant.sf files, but I do have access to the downstream BAMs (*.markdup.sorted.bam) from STAR. These also appear to lack this isoform-level data.
Could anyone help me figure out why this may be, or whether it is possible to re-process these BAM files in an alternative way to extract this isoform info?
Different transcript isoforms from the same gene have different Ensembl IDs. For example, Ensembl gene ENSG00000139618 has transcript isoforms ENST00000380152.8, ENST00000530893.7, etc. (see here for the full list).
The ".8" and ".7" in the transcript IDs are the version numbers, referring to how many times the transcript annotation has been revised.
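To make the distinction concrete, here is a minimal Python sketch that separates the stable transcript ID from its version suffix. The function name `split_ensembl_id` is made up for illustration and is not part of the pipeline or any library.

```python
from typing import Optional, Tuple


def split_ensembl_id(identifier: str) -> Tuple[str, Optional[int]]:
    """Split an Ensembl ID into (stable ID, version).

    The version is None when the ID carries no numeric ".N" suffix.
    """
    stable_id, sep, version = identifier.partition(".")
    if sep and version.isdigit():
        return stable_id, int(version)
    return identifier, None


# Two isoforms of ENSG00000139618, plus an unversioned ID from the question.
for tx in ["ENST00000380152.8", "ENST00000530893.7", "ENST00000420443"]:
    stable_id, version = split_ensembl_id(tx)
    print(stable_id, version)
```

The two isoforms differ in their stable IDs (ENST00000380152 vs. ENST00000530893), not in the number after the dot, so an unversioned ID such as ENST00000420443 still identifies a single isoform.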
So unless I'm misunderstanding your question (always possible!), you do have transcript isoform quantification data in your results. Also, you should have quant.sf files for each of your samples. They will be in individual subdirectories corresponding to your sample names within the star_salmon/ output directory.
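If it helps, a quick way to check whether those per-sample files exist is sketched below. It assumes only the layout described above (star_salmon/<sample name>/quant.sf); the helper name `find_quant_files` and the `results` path are placeholders, so substitute your own pipeline output directory.

```python
from pathlib import Path
from typing import Dict


def find_quant_files(results_dir: str) -> Dict[str, Path]:
    """Map each sample name to its quant.sf under <results_dir>/star_salmon/."""
    star_salmon = Path(results_dir) / "star_salmon"
    # Each sample is expected to have its own subdirectory holding a quant.sf.
    return {p.parent.name: p for p in sorted(star_salmon.glob("*/quant.sf"))}


if __name__ == "__main__":
    # "results" is a placeholder for the pipeline's output directory.
    for sample, path in find_quant_files("results").items():
        print(sample, path)
```

An empty result would suggest the files were removed or the output directory is somewhere else, rather than that isoform-level quantification was never run.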