Using transcriptome data to understand the evolution of alternative polyadenylation in mammals

D'ANGELO, RACHELE MARIA
2021/2022

Abstract

Pre-mRNA processing is generally recognized as a key step that influences the expression levels of the mature mRNA, with all the functional implications that this entails. One such processing mechanism is polyadenylation, which can occur at alternative sites in the transcript, either in the terminal exon or in introns. This produces mature transcripts of different lengths, which therefore differ in stability and in how they are regulated. This study tests the feasibility of analyzing mammalian evolution by studying the evolution of alternative polyadenylation (APA) from the standard gene expression data provided by Kaessmann’s lab. Using the DaPars algorithm, we estimated a replicability of ~90% for APA events and of ~60-70% for APA sites, which indicates what proportion of APA change is due to intra-specific variability among individuals. In the comparison among species, we found that conservation of the APA site is, as expected, inversely related to the evolutionary distance between species. DaPars also proves reliable in estimating the usage of the distal versus the proximal APA site, quantified through the PDUI value. APA is known to be driven by specialized protein factors and by specific sequences, such as the PAS motif, to which these proteins are recruited more or less strongly. Our results show that the PDUI value provided by the algorithm does reflect the presence and the relative 'strength' of a PAS motif in the vicinity of the predicted proximal APA site. The algorithm also has limitations, however: according to our results, it systematically shifts the estimated APA site upstream of its actual location.
We then applied the treeSeg algorithm to the DaPars output for a cross-section of brain transcriptomes from different mammalian species, obtaining the set of genes for which APA was estimated to have diverged at a given evolutionary node. A subsequent GO enrichment analysis identified overrepresented terms (response to amino acid / acid chemical, and regulation of (cellular) protein metabolic process) that are highly relevant to the evolution of a complex organ such as the brain.
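The two quantities the abstract relies on, the PDUI value and the presence of a PAS motif near a predicted proximal site, can be illustrated with a minimal sketch. It is not the thesis pipeline and not DaPars code. It assumes that PDUI is the fraction of a gene's transcripts that use the distal site (long-isoform abundance over the sum of long and short), that the PAS hexamers of interest are the canonical AATAAA and the common variant ATTAAA in DNA notation, and that a 40-nt upstream window is a reasonable search region. The function names `pdui` and `find_pas` and the `window` parameter are hypothetical.

```python
import re

# Assumed PAS hexamers (DNA notation): canonical signal and one common variant.
PAS_MOTIFS = ("AATAAA", "ATTAAA")


def pdui(long_abundance, short_abundance):
    """Fraction of transcripts using the distal site (assumed PDUI definition)."""
    total = long_abundance + short_abundance
    if total <= 0:
        raise ValueError("total isoform abundance must be positive")
    return long_abundance / total


def find_pas(utr_seq, site, window=40):
    """List (motif, distance) pairs for PAS hexamers upstream of `site`.

    `site` is a 0-based position in `utr_seq`; `distance` is the number of
    bases from the first base of the motif to the site. The window size is
    an illustrative assumption, not a value taken from the thesis.
    """
    start = max(0, site - window)
    region = utr_seq[start:site].upper()
    hits = []
    for motif in PAS_MOTIFS:
        # A lookahead pattern also reports overlapping occurrences.
        for match in re.finditer(f"(?={motif})", region):
            hits.append((motif, site - (start + match.start())))
    return sorted(hits, key=lambda hit: hit[1])


if __name__ == "__main__":
    print(pdui(30.0, 10.0))
    toy_utr = "C" * 20 + "AATAAA" + "C" * 20
    print(find_pas(toy_utr, site=len(toy_utr)))
```

Because the abstract reports that DaPars tends to place the estimated site upstream of the true one, a real analysis would probably also need to search some distance downstream of the predicted position; the sketch looks upstream only to keep the example short.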
Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14240/84932