BioC2024

Statistical modelling of microRNA-seq data
07-24, 13:50–13:58 (US/Eastern), Tomatis Auditorium

MicroRNAs (miRNAs) play a pivotal role in regulating gene expression and influence disease progression. Despite this role, statistical methods for miRNA analysis are less developed than those for messenger RNAs (mRNAs). Techniques designed for mRNAs are commonly repurposed for miRNA data without accounting for the unique characteristics of miRNAs. This study critically examines and challenges the assumptions of mRNA-based methods when applied to miRNAs, highlighting the competitive nature of miRNA expression. We introduce novel statistical methods and modelling strategies tailored to miRNA sequencing data. These approaches account for the distinctive sources of variability in miRNA data, including competition for expression, variation in library size, and data sparsity. We demonstrate the efficacy of our models through validation on both microRNAome datasets and simulated data. Parameter estimation relies on autograd, while inference employs a Laplace approximation to the Bayesian posterior distribution. Our investigation not only questions prevailing practices in miRNA data analysis but also provides a foundation for more accurate and specific miRNA studies through the examination of sequence-level data.
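
To make the Laplace approximation step concrete, here is a minimal sketch on a toy model. It is not the authors' model. It assumes a single miRNA whose counts follow a Poisson distribution with mean equal to library size times exp(theta), and a Normal prior on theta. The function name, the prior standard deviation, and the example data are all hypothetical. The derivatives are written by hand so the sketch needs only the standard library; in the workflow the abstract describes, they would presumably come from autograd instead.

```python
import math


def laplace_poisson_log_rate(counts, lib_sizes, prior_sd=10.0, tol=1e-10, max_iter=100):
    """Laplace approximation for theta in the toy model (an assumption, not the talk's model):
    y_i ~ Poisson(s_i * exp(theta)),  theta ~ Normal(0, prior_sd^2).
    Returns the posterior mode and the approximate posterior standard deviation.
    """
    total_y = sum(counts)
    total_s = sum(lib_sizes)
    prior_prec = 1.0 / prior_sd ** 2

    # Start near the maximum-likelihood estimate of the log rate.
    theta = math.log((total_y + 0.5) / total_s)

    # Newton's method on the log posterior, which is concave in theta.
    for _ in range(max_iter):
        mu = total_s * math.exp(theta)
        grad = total_y - mu - prior_prec * theta   # first derivative of the log posterior
        hess = -mu - prior_prec                    # second derivative (always negative)
        step = grad / hess
        theta -= step
        if abs(step) < tol:
            break

    # Laplace step: approximate the posterior by a Normal centred at the mode,
    # with variance equal to the negative inverse of the second derivative there.
    hess = -total_s * math.exp(theta) - prior_prec
    return theta, math.sqrt(-1.0 / hess)


# Hypothetical data: counts of one miRNA in four libraries of different sizes.
counts = [3, 0, 7, 2]
lib_sizes = [1.0e6, 0.8e6, 1.5e6, 0.9e6]

theta_hat, theta_sd = laplace_poisson_log_rate(counts, lib_sizes)
print(f"posterior mode of log rate: {theta_hat:.3f}, approximate sd: {theta_sd:.3f}")
```

The zero count and the unequal library sizes enter only through the two totals here, which is a property of this toy Poisson model and not a claim about the models presented in the talk.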