Computational Challenges in a Densely Sequenced Tree of Life

Video Link: https://www.youtube.com/watch?v=6AasNpwIRmc



Duration: 43:30


Katie Pollard (Gladstone Institute of Data Science & Biotechnology)
Computational Challenges in Very Large-Scale 'Omics'

Genome sequencing and assembly have exploded since 2015. Today, many lineages contain closely related species, as well as species with multiple diverse genome sequences. Having more genomes seems like a good thing for studying ecology and evolution across the tree of life. However, sequence alignment, the workhorse algorithm for genomic studies, is breaking down in terms of both computational efficiency and accuracy. We explore these issues using metagenomic applications in which microbial communities are sequenced as a pool and alignment is used to map reads to the correct species and genomic site before downstream bioinformatics applications such as abundance estimation and genotyping. We quantify alignment errors and computational barriers across a broad range of scenarios, including lineages in which a commonly used operational definition of the species boundary (greater than 95% average nucleotide identity) is blurred. Then, we propose several actionable and aspirational solutions to problems such as genome redundancy, reference bias, and cross-mapping. This work demonstrates that efficient algorithms and data structures are essential to maintain access to genomic and metagenomic data science for researchers without massive high-performance computing resources and to ensure that read mapping is accurate on a densely sequenced tree of life.
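As a rough illustration of the species-boundary rule mentioned in the abstract, the sketch below computes percent identity between two sequences that are assumed to be already aligned, then applies the "greater than 95%" cutoff. This is a toy stand-in, not the method used in the talk: real average nucleotide identity is estimated across whole genomes with dedicated tools, and the function names `percent_identity` and `same_species` are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over columns where both sequences have a base.

    Assumes the inputs are pre-aligned (equal length, '-' marks a gap).
    Columns with a gap in either sequence are skipped.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no aligned columns to compare")
    return 100.0 * matches / compared


def same_species(identity_pct: float, threshold: float = 95.0) -> bool:
    """Apply the operational rule: strictly greater than the threshold."""
    return identity_pct > threshold


if __name__ == "__main__":
    pid = percent_identity("ACGTACGTAC", "ACGTACGTAA")
    print(pid, same_species(pid))
```

A pair sitting close to the cutoff is the blurred case the abstract describes: a small change in which regions are compared can move the pair across the boundary.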




Other Videos By Simons Institute for the Theory of Computing


2022-07-22 Profiling of Antibody Repertoires and Immunoglobulin Loci Enables Large-Scale Analysis of...
2022-07-22 Leveraging Molecular Data for Drug Discovery
2022-07-22 The Rewards and Challenges of Constructing Patient Registries in Mexico
2022-07-22 Whole Genome Methylation Patterns as a Biomarkers for EHR Imputation
2022-07-22 Biological Discovery and Consumer Genomics Databases Activate Latent Privacy Risk in...
2022-07-22 How Do We Deliver Precision Health at Scale for All?
2022-07-22 Longitudinal Phenotypes and Disease Trajectories at Population Scale
2022-07-22 Towards Making Identification of Noncoding Causes of Human Disease Routine
2022-07-22 Nanopore Basecalling for Directed Evolution
2022-07-22 Breaking the Winner's Curse in Mendelian Randomization: Rerandomized Inverse Variance...
2022-07-22 Computational Challenges in a Densely Sequenced Tree of Life
2022-07-16 On the Concept of History (in Foundation Models)
2022-07-16 Race Beyond Perception: Analysing Race in Post-visual Regimes
2022-07-15 Designing Human-Aware Learning Agents: Understanding the Relationship between Interactions...
2022-07-15 Reimagining the machine learning life cycle in education
2022-07-15 Aligning Robot Representations with Humans
2022-07-15 ‘The Future of Good Decisions’: a research paradigm for quality in automated decision-making
2022-07-15 The Flaws of Policies Requiring Human Oversight of Government Algorithms
2022-07-15 Assistive Teaching of Motor Control Tasks to Humans
2022-07-15 Where Does the Understanding Come From When Explaining Automated Decision-making Systems?
2022-07-14 From Optimizing Engagement to Measuring Value



Tags:
Simons Institute
theoretical computer science
UC Berkeley
Computer Science
Theory of Computation
Theory of Computing
Computational Challenges in Very Large-Scale 'Omics'
Katie Pollard