UTR Sequencing

Map both 5′ and 3′ transcript ends genome-wide at single-nucleotide resolution and identify isoforms

Precise mapping of transcription start sites (TSS) and polyadenylation sites (PAS) is critical to understanding gene regulation and transcript diversity. Alternative TSS and PAS usage can include or exclude key regulatory elements found in the untranslated regions (UTRs) at the ends of transcripts, with a profound impact on the molecular and cellular physiology of cells. Regulatory elements in 5′ and 3′ UTRs have been implicated in human disease and can function as therapeutic targets.

End-Seq applications for RNA drug discovery and clinical biomarkers

This new technology was developed with the RNA drug discovery and biomarker field in mind. Alternative TSS and poly-A-site usage resulting in transcript isoforms is a known hallmark of certain human diseases such as cancer. Being able to identify such usage genome-wide across multiple samples, at single nucleotide resolution, is a powerful tool in biomarker discovery. Similarly, TSS and poly-A-sites are excellent targets for both antisense oligos and small molecule drugs, both of which require the exact nucleotide locations of transcript ends.

The End-Seq technology enables mapping of the 5′ ends and 3′ ends of transcripts by sequencing from the TSS and PAS, respectively, revealing the complete UTR landscape of expressed transcripts. 5′ and 3′ End-Seq enrich for transcript ends to detect known and novel TSS and PAS at single nucleotide resolution.

5´End-Seq

5´ End Sequencing 5´ UTR Sequencing

3´End-Seq

3´ End-Sequencing 3´UTR Sequencing

Enrich for TSS and PAS to define 5′ and 3′ UTR Landscape

  • End-Seq enriches the sample for reads at transcript ends, so lower sequencing depth is required
  • Greater depth of coverage around the PAS and TSS allows for more precise detection of known and novel polyadenylation and transcription start sites

Genome-Wide End Calling at Single Nucleotide Resolution

High Confidence TSS and PAS

  • Defined read cliffs in 5’ End-Seq and 3’ End-Seq data allow application of the PureCLIP peak calling algorithm to call single nucleotide end positions near annotated TSS and PAS
  • 3′ End-Seq increases the confidence of detected PAS by removing any PAS reads caused by internal A-rich genomic regions
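
To illustrate the idea of discarding candidate PAS that sit next to A-rich genomic sequence, here is a minimal Python sketch. It is not the Eclipse analysis pipeline. The function names, the 20-nucleotide window and the 0.6 threshold are all assumptions chosen for illustration, and only plus-strand sites are handled (a minus-strand site would need the equivalent check for T-rich sequence on the other side).

```python
def downstream_a_fraction(genome_seq, site_pos, window=20):
    """Fraction of 'A' bases in the window immediately downstream of a
    plus-strand candidate site (0-based position). Hypothetical helper."""
    downstream = genome_seq[site_pos + 1 : site_pos + 1 + window].upper()
    if not downstream:
        return 0.0
    return downstream.count("A") / len(downstream)


def filter_a_rich_sites(genome_seq, candidate_sites, window=20, max_a_fraction=0.6):
    """Keep only candidate sites whose downstream window is not A-rich.
    The window size and threshold are illustrative assumptions."""
    kept = []
    for pos in candidate_sites:
        if downstream_a_fraction(genome_seq, pos, window) <= max_a_fraction:
            kept.append(pos)
    return kept


# Toy sequence: 10 mixed bases, a genomic stretch of 20 A's, then 20 non-A bases.
toy_genome = "ACGTACGTAC" + "A" * 20 + "CGTCGTCGTCGTCGTCGTCG"

# Position 9 is followed by the A stretch and is dropped;
# position 29 is followed by non-A sequence and is kept.
kept_sites = filter_a_rich_sites(toy_genome, [9, 29])
```

A real filter would also need to handle strand, chromosome boundaries and the read evidence itself; the sketch only shows the sequence-context check.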

5’ and 3’ End-Seq protocols are easy to perform and are compatible with just 3 µg of total RNA starting material. They take about three days to complete (both protocols in parallel) and provide genome-wide transcript end data at single nucleotide resolution.

How does this new approach compare to current transcript-end mapping methods? Take CAGE-Seq, a commonly used method for mapping 5’ ends, as an example. CAGE-Seq is a rather labor- and cost-intensive protocol: it requires 5 µg of total RNA, takes up to 8 days to complete and costs around 4000 Euro for an 8-sample kit, excluding the cost of sequencing and data analysis. In contrast, the Eclipse Bioinnovations method needs only 3 µg of total RNA, takes 3 days to complete and costs less than half as much for an 8-sample kit, including data analysis.

By contrast, 3’ end sequencing technologies are known for their low cost, fast protocols and low input requirements. However, many 3’ end protocols rely on oligo(dT)-primed reverse transcription (RT), which can produce low-confidence poly-A-sites when the primer anneals to internal A-rich sequences. Eclipse’s 3’ End-Seq avoids this by relying on efficient adapter ligations instead of oligo(dT)-primed RT. In addition, the Eclipse data analysis package filters out false positive 3’ ends. The most distinctive aspect of the new approach is that the sequencing reads start at the poly-A-sites and read toward the 3’ UTR, rather than starting somewhere in the 3’ UTR and sequencing toward the poly-A-sites, as other 3’ end sequencing technologies such as MACE-Seq do. This enables poly-A-site definition at single nucleotide resolution.

Eclipse BioInnovations’ End-Seq Data Analysis Packages enable the end user to detect all active transcription start sites (TSS) and poly-A-sites, detect novel poly-A-sites with high confidence based on false positive poly-A-site filtering, and further use this data to understand differential gene expression and potentially detect biomarkers in diseased samples. The analysis package uses the FASTQ files of raw sequencing reads that you provide. You will receive an HTML report detailing the following data:

  • Single Nucleotide 5´/3´ End Read Coverage (bedGraph)
  • Single Nucleotide 5´/3´ End Calls (BED)
  • Filtered Single Nucleotide 5´/3´ End Calls (false positive 5´/3´ ends are filtered out bioinformatically)
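
As a rough picture of what single-nucleotide end coverage in bedGraph form looks like, the following Python sketch counts read-start positions from a list of alignments and prints them as bedGraph lines. It is an illustration under stated assumptions, not the Eclipse analysis package: the function names and the input tuples are hypothetical, coordinates are taken to be 0-based and half-open as in BED and bedGraph, and the first base of each read is assumed to mark the transcript end.

```python
from collections import Counter


def read_start_counts(alignments):
    """Count reads by the genomic position of their first sequenced base.

    Each alignment is a (chrom, start, end, strand) tuple with 0-based,
    half-open coordinates. Hypothetical input format for illustration.
    """
    counts = Counter()
    for chrom, start, end, strand in alignments:
        # The first base of a read is `start` on the plus strand
        # and `end - 1` on the minus strand.
        pos = start if strand == "+" else end - 1
        counts[(chrom, pos, strand)] += 1
    return counts


def to_bedgraph_lines(counts, strand):
    """Format the counts for one strand as bedGraph lines (chrom, start, end, value)."""
    lines = []
    for (chrom, pos, site_strand), n in sorted(counts.items()):
        if site_strand == strand:
            lines.append(f"{chrom}\t{pos}\t{pos + 1}\t{n}")
    return lines


# Toy alignments: two plus-strand reads share a start at chr1:100.
toy_alignments = [
    ("chr1", 100, 150, "+"),
    ("chr1", 100, 140, "+"),
    ("chr1", 105, 160, "+"),
    ("chr1", 200, 250, "-"),
]

counts = read_start_counts(toy_alignments)
plus_lines = to_bedgraph_lines(counts, "+")
minus_lines = to_bedgraph_lines(counts, "-")
print("\n".join(plus_lines + minus_lines))
```

Whether a read aligns to the same strand as its transcript depends on the library design, so the strand handling here would need to be adapted to the actual protocol.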