BioC2024

Ahmad Al Ajami

Unraveling Immunogenomic Diversity in Single-Cell Data

Alexey Sergushichev

Easy access, interactive exploration and analysis of public gene expression datasets with Phantasus

Andrea Baran

miRglmm: modeling isomiR-level counts improves estimation of miRNA-level differential expression and uncovers variable differential expression between isomiRs

Anthony Christidis

Dr. Anthony Christidis is a Computational Scientist at the Center for Computational Biomedicine at Harvard Medical School where he is a member of multiple research teams. Originally from Canada, he earned his PhD in Statistical Machine Learning from the University of British Columbia (UBC) and an MSc in the same field from the University of Toronto. During his doctoral studies, he developed a new ensemble learning framework to model high-dimensional data which resulted in multiple publications in computational statistics journals. Following his PhD, Dr. Christidis was a Postdoctoral Research Fellow in the Department of Statistics at UBC where he developed new robust computational methods for the analysis of multi-omics data. He has also taught undergraduate and graduate courses in probability, statistics, data science and signal processing at UBC. Dr. Christidis regularly publishes software libraries implementing the statistical and computational methods he develops, and he has held various software development jobs in research institutes and in collaboration with the private sector. His research interests include machine learning, optimization, scientific computing, and the application of computational methods to single-cell and RNA-seq data.

scDiagnostics: diagnostic functions to assess the quality of cell type annotations in single-cell RNA-seq data

Arnab Mukherjee

An ardent and dedicated doctoral researcher with a burning desire to bridge the realms of cancer genomics and systems biology, embarking on a thrilling journey to reveal promising and precise therapeutic avenues.

Unraveling the Intricate Molecular Landscape and Potential Biomarkers in Lung Adenocarcinoma through Integrative Epigenomic and Transcriptomic Data Analysis

Asier Ortega Legarreta

Passionate about Data Science, Bioinformatics and Genomics. PhD student at the Translational Bioinformatic Unit at Navarrabiomed.

GeneSetCluster 2.0: an upgraded comprehensive toolset for summarizing and integrating gene-sets analysis

Astrid Deschênes

I have an academic background in both computer science and engineering. I have had several professional experiences in various environments such as research centers and private companies. All those assignments have in common that they are related to research and development. I have accumulated more than 10 years of experience as a computational analyst, and I have had the opportunity to work in various fields such as plant genomics, epigenetic inheritance, and cancer. Since joining the Tuveson Laboratory at Cold Spring Harbor Laboratory, I have been collaborating with biologists to conduct rigorous computational analyses in order to achieve a better understanding of pancreatic cancer mechanisms. I am particularly interested in the development of novel bioinformatics methods and software.

Visualization of functional enrichment results into biological networks with Bioconductor enrichViewNet package

Boyi Guo

Boyi Guo is an applied statistician and biomedical data scientist working at the intersection of machine learning, computational omics, and population health. His research concentrates on developing statistically rigorous and computationally scalable machine-learning methods, as well as open-source software, that integrate population-scale multi-omics data to uncover functional mechanisms that explain disease heterogeneity.

Scalable count-based models for unsupervised detection of spatially variable genes

Changqing Wang

Igniting full-length isoform and mutation analysis of single-cell RNA-seq data with FLAMES

Charles Deng

Charles is an associate bioinformatician in the Beckmann Lab at Mount Sinai. After graduating from Brown in 2019 with a major in Applied Math-Economics, he worked for three years building trading algorithms in quantitative finance before leaving his job to pursue a career in medicine. He will be starting medical school this fall of 2024.

A scalable and flexible network-based approach to identify, diagnose, and resolve mislabeled samples in molecular data

Charlotte Hutchings

I am a third-year PhD student working between AstraZeneca and Prof. Kathryn Lilley's laboratory at the University of Cambridge. My project uses novel proteomics methods to map changes in protein abundance and location within HEK293 cells when these cells are being used as factories to produce recombinant adeno-associated viruses (rAAVs). Such viruses are used as DNA delivery vehicles in both research and gene therapies. By increasing our understanding of the viral production process, I aim to identify routes to enhance the manufacturing process.

Throughout my PhD I have generated large and complex proteomics datasets and, as such, have come to enjoy the challenge of large-scale data exploration and analysis. My excitement for data has led me to publish multiple data processing workflows and write/teach workshops on the use of R for these workflows. Moving forwards, I intend to continue into a career that involves active research (both wet lab and bioinformatic), teaching and promoting open, reproducible science.

Using high-throughput spatial proteomics as a platform to elucidate protein relocalisation events during viral production

Conor Ryan

Assigning treatment regimens to Irish patients in head and neck squamous cell carcinoma with large language models

Erdal Cosgun

Cloud Methods Working Group

Erica Feick

Closing Remarks

Ernesto Aparicio-Puerta

Assessing differential expression strategies for small RNA sequencing using real and simulated data

Farhan Ameen

Context is important! Identifying context aware spatial relationships with Kontextual.

Frederic Bertrand

As a full professor at the University of Technology of Troyes, I am mainly interested in applying statistics and machine learning to high-dimensional, process, and network data.

Sobol4RV: sensitivity in random settings

Gauri Vaidya

Assigning treatment regimens to Irish patients in head and neck squamous cell carcinoma with large language models

Hannah Swan

A hierarchical Bayesian model for the identification of technical length variants in miRNA sequencing data

Hiba Ben Aribi

Exvar: An R Package for Gene Expression And Genetic Variation Data Analysis And Visualization

Jacques Serizay

Applying tidy principles to investigating chromatin composition and architecture

James Eapen

iscream: Fast and memory efficient (sc)WGBS data handler

Janani Ravi

I am an Assistant Professor at the University of Colorado Anschutz Medical Campus, Dept. of Biomedical Informatics, Center for Health Artificial Intelligence (with ties to Dept. of Immunology and Microbiology). I completed my PhD in Computational Biology at Virginia Tech and postdoctoral research at the Public Health Research Institute, Rutgers Biomedical and Health Sciences. I recently moved from the Depts. of Pathobiology & Diagnostic Investigation, Microbiology & Molecular Genetics, Michigan State University.

We develop general-purpose computational approaches that integrate large-scale heterogeneous public datasets that lead to the mechanistic understanding of microbial genotypes, phenotypes, and diseases.
Specifically, we focus on two key questions:
- How do we link microbial genotypes to phenotypic traits?
We use a combination of protein sequence-structure-function relationships, comparative genomics, and machine learning to bridge the genotype-phenotype gap (e.g., phenotypes, antimicrobial resistance, host-specificity, microbial pathogenesis).
- How do we delineate molecular mechanisms underlying host response to infection and discover host-directed therapeutics?
We use comparative transcriptomics, disease-drug signatures, and machine learning to learn about host response and drug repurposing.
Our methods are generally pathogen- and disease-agnostic. We also release open data/software and easy-to-use web applications for wide use by the biomedical community.

I am also actively engaged in training, education, and outreach, and committed to creating and sustaining a diverse and inclusive ecosystem in data science and R for learners and professionals alike, focusing on increasing the participation of underrepresented minorities in data science and R programming. Towards this effort, I founded R-Ladies East Lansing and R-Ladies Aurora, and co-founded Women+ Data Science and AsiaR. I also co-chair the R/Bioconductor Community Advisory Board.

amR: an R package to predict and explore the top antimicrobial resistance features

Jayaram Kancherla

Interoperability between R and Python using BiocPy Workshop

Joseph Lee

BANKSY unifies cell typing and tissue domain segmentation for scalable spatial omics data analysis

Joseph Rich

The impact of package selection and versioning on single-cell RNA-seq analysis

Kaushal Bhavsar

Assigning treatment regimens to Irish patients in head and neck squamous cell carcinoma with large language models

Kelly Street

Assistant Professor of Population and Public Health Sciences in the Division of Biostatistics at the University of Southern California.
Previously: Dana-Farber Cancer Institute (postdoc), UC Berkeley (PhD), and UChicago (undergrad).

Multi-omic analysis methods for identifying phenotypic plasticity

Lambda Moses

Exploratory spatial data analysis from single molecules to multiple samples

Lukas Weber

Assistant Professor, Department of Biostatistics, Boston University

Identification of spatial domains by smoothing for compositional analyses in spatial transcriptomics data

Meghana Kshirsagar

Meghana Kshirsagar is an Associate Professor in AI/ML in the Department of Computer Science and Information Systems at University of Limerick, Ireland. She has extensive experience as an Associate Professor in the Computer Science department at Government Engineering College Aurangabad, India. During 2019-2022 she worked as a postdoctoral researcher on the Science Foundation Ireland Project “Automatic Design of Digital Circuits” along with industrial partner Intel. From 2022-2023 she worked as a research fellow in “Limerick Digital Cancer Research Centre”. Her research interests include large language models, machine learning, digital twins, bioinformatics and blockchains. She is Principal Investigator for the National Challenge Fund, digital for resilience for the project “ALTER: Unleashing the power of Artificial Intelligence and Digital Twins for Emergency Care: A vision for the future”. She is Deputy Director of Biocomputing Developmental Systems Research Group at University of Limerick. She serves on the “Equity, Diversity and Inclusion” committee. She has authored 80 papers in prestigious machine learning, artificial intelligence and bioinformatics conferences and journals. She serves as both reviewer and guest editor in “Frontiers of Blockchain” journal. She has delivered workshops and webinars in high-profile international events and workshops.

Assigning treatment regimens to Irish patients in head and neck squamous cell carcinoma with large language models

Mengbo Li

Cell type co-localization and cell type-specific microenvironment analysis on spatial transcriptomics data

Mercedeh Movassagh

PathSeeker: A Statistical Package for Enhanced Pathogen Identification and Characterization in RNA Sequencing Data

Michael Totty

Currently an F32 NRSA Postdoctoral Fellow in the Department of Biostatistics at Johns Hopkins Bloomberg School of Public Health working with Dr. Stephanie Hicks and Dr. Keri Martinowich. Interested in developing novel therapeutics for psychiatric disorders via reverse translation by combining next-generation sequencing across species with sophisticated behavioral design in preclinical models. PhD training from the Texas A&M Institute for Neuroscience.

SpotSweeper: spatially-aware quality control for the removal of technical artifacts and local outliers in spatial transcriptomics

My Nguyen

scHiCcompare - differential analysis of single-cell Hi-C data

Myriam Maumy

As an associate professor at the University of Technology of Troyes, I am interested in statistics and machine learning.

Patterns: Deciphering Biological Networks with Patterned Heterogeneous (multiOmics) Measurements

Ning Shen

vmrseq: Probabilistic Modeling of Single-cell Methylation Heterogeneity

Pascal Belleau

Computational framework for inference of genetic ancestry from challenging human molecular data

Peter Huang

bamSliceR: Estimation of Transcripts origin of Variants by a Bayesian Approach Using RNA-sequencing data.

Pratheepa Jeganathan

Statistical Methods for the Tissue Microenvironment of Multiplex Images in a Clinical-relevant Manner

Raymond Lesiyon

I am an Informatics Research Professional at the University of Colorado Anschutz Medical Campus. I am under the mentorship of Dr. Janani Ravi and Dr. Nina Wale. I graduated with a Bachelor's in Biosystem Engineering and a Master's in Computational Mathematics Science and Engineering, both from Michigan State University. My current research focuses on understanding microbial phenotypic traits using interpretable machine learning algorithms.

microgenomeR: an R workflow for integrating genomic metadata and bacterial phenotypes

Ryan Thompson

Ryan C. Thompson, PhD is an Assistant Professor in the Division of Data Driven and Digital Medicine (D3M) in the Department of Medicine in the Icahn School of Medicine at Mount Sinai. He completed his PhD at Scripps Research in La Jolla, CA. He has a long history of applying and adapting bioinformatics methods to address both anticipated and unanticipated research challenges, even developing new methods when the need arises. His broad methodological background includes normalization, differential expression, machine learning, gene set testing, and data visualization with both sequencing and array-based data. In addition, he has a broad background in general biology and in immunology in particular, along with a strong foundation in statistics. In his current work at Mount Sinai, he has developed a colored graph visualization tool to aid in quick, accurate identification and correction of mislabeled samples in a data set of 2385 RNA-seq and whole-genome sequencing samples. His broad understanding of commonly used statistical and computational methods enables him to continue adapting to and overcoming similar unexpected analysis challenges that may arise in the course of the proposed research. Outside the lab, Ryan considers it a success if his D&D group has to pull out a physics textbook to figure out what happens next.

A scalable and flexible network-based approach to identify, diagnose, and resolve mislabeled samples in molecular data

Sehyun Oh

I am an Assistant Professor at CUNY SPH, with expertise in both experimental biology and bioinformatics. As a molecular biologist by training, I studied DNA repair and telomere maintenance mechanisms during my doctoral and postdoctoral research. As a bench scientist, I started to notice the limitations of arguing the extent to which my findings in cell lines were actually happening in living organisms and relevant to public health, and this made me interested in the potential of large public datasets. I made a career transition from a bench scientist to a bioinformatics scientist and joined Dr. Waldron’s lab at CUNY SPH as a postdoctoral researcher in 2017. Since then, I have worked on many research projects, published papers, and developed a wide collaborative network and profound experience and understanding of large public omics data analysis, statistical method development for high-dimensional data, Cloud-based computing, AnVIL workspace and workflow developments, and user-friendly software development. Currently, I am working on an NIH-funded project to construct an omics data repository designed for the easy application of Artificial Intelligence and Machine Learning tools. My over-arching career goal is to facilitate interdisciplinary research through the development of intuitive bioinformatics infrastructure and user-friendly tools that lower barriers across different disciplines and resources. In my free time, I enjoy ballroom dancing and exploring different neighborhoods in New York.

OmicsMLRepo: Ontology-leveraged metadata harmonization to improve AI/ML-readiness of omics data in Bioconductor

Seong-Hwan Jun

Statistical modelling of microRNA-seq data

Shian Su

Long-read methylation data analysis with NanoMethViz and Bioconductor

Svetlana Ugarcina Perovic

Svetlana Ugarcina Perovic is a microbiome scientist in the Segata lab, who supports open science and open source initiatives: Microbiome digest, Microbiome Virtual International Forum, NSURP, Outreachy, BugSigDB and curatedMetagenomicData.

Inclusive internships in genomic data science: Outreachy and Bioconductor

Tim Triche

Live-fire reproducible research: htmlwidgets, observable, webR, and Bioconductor
Introduction to Package Development

Vincent Carey

Ontologies for Genomics: new approaches with Bioconductor's ontoProc

Vipul Singhal

I am a computational biologist at the Genome Institute of Singapore. I am interested in ML/DL, with applications to neuroscience and genomics.

BANKSY unifies cell typing and tissue domain segmentation for scalable spatial omics data analysis

Xinyue Cui

Statistical Methods for the Tissue Microenvironment of Multiplex Images in a Clinical-relevant Manner

Zachary DeBruine

Zach DeBruine is Assistant Professor of Computing at Grand Valley State University, and Research Scientist in the GVSU Applied Computing Institute. He completed his Ph.D. and postdoc studies at Van Andel Institute, first in structural biology, then bioinformatics, and finally high-performance machine learning. His lab is currently building large multimodal foundation models on genomics and biobank data.

Engineering Foundation Models of Single-cell Transcriptomics Data
Singlet: Fast and Interpretable Dimension Reduction of Single-cell Data