Sequence Bioinformatics at Large Scale: Petabase-Scale Sequence Alignment Catalyses Viral Discovery

Video Link: https://www.youtube.com/watch?v=OeagWAjpXBo



Duration: 28:45


Rayan Chikhi (Institut Pasteur)
https://simons.berkeley.edu/talks/sequence-bioinformatics-large-scale-petabase-scale-sequence-alignment-catalyses-viral
Computational Challenges in Very Large-Scale 'Omics'

Petabytes of valuable sequencing data reside in public repositories, doubling in size every two years. They contain a wealth of genetic information about viruses that could help us monitor spillovers and anticipate future pandemics. We recently developed a cloud bioinformatics infrastructure, named Serratus, to perform petabase-scale sequence alignment. With it, we analyzed all available RNA-seq samples (5.7 million samples, 10 petabytes) and discovered 10 times more RNA viruses than were previously known, including a new family of coronaviruses (Edgar et al., Nature, 2022). In this talk, I will present the computational infrastructure and some of the biological analyses.
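
To give a feel for the numbers in the abstract, the following is a rough back-of-the-envelope sketch, not part of the talk or of Serratus. It assumes steady exponential growth with a two-year doubling time and uses decimal units (1 PB = 1e6 GB). The function names are hypothetical, and starting the projection from the 10 PB analyzed in the study is an illustrative assumption only.

```python
def projected_size_pb(current_pb: float, years: float, doubling_years: float = 2.0) -> float:
    """Projected data volume in petabytes after `years`,
    assuming steady exponential growth (an idealization)."""
    return current_pb * 2 ** (years / doubling_years)


def mean_sample_size_gb(total_pb: float, n_samples: float) -> float:
    """Average per-sample size in gigabytes, using decimal units (1 PB = 1e6 GB)."""
    return total_pb * 1e6 / n_samples


if __name__ == "__main__":
    # Figures from the abstract: 5.7 million samples, 10 petabytes.
    print(f"mean sample size: {mean_sample_size_gb(10, 5.7e6):.2f} GB")
    # Illustrative projection: 10 PB growing for 4 years at the stated doubling rate.
    print(f"size after 4 years: {projected_size_pb(10, 4):.0f} PB")
```

Under these assumptions the average sample is a little under 2 GB, and a collection of this size would quadruple within four years, which is why the analysis infrastructure has to scale along with the repositories.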




Other Videos By Simons Institute for the Theory of Computing


2022-07-22 Benchmarking, Inference, and in Silico Controls in Single-Cell and Spatial Omics Data Science
2022-07-22 Learning Gene Association Networks Using Single-Cell RNA-Seq Data: A Graphical Model Approach
2022-07-22 Mapping Gene Regulatory Dependencies with Single-Cell Resolution
2022-07-22 Harnessing Multimodal Single-Cell Sequencing Data for Integrative Analysis
2022-07-22 Learning From Large-Scale (Single-Cell) ‘Omics’
2022-07-22 Panel Discussion
2022-07-22 Exploratory and Model-Based Analysis of ScHi-C Data
2022-07-22 The Earth Biogenome Project: Progress and the Challenges Ahead
2022-07-22 Multiple Sequence Alignment for Predicting Antigen-Antibody Interactions
2022-07-22 Evolution of Germline Mutation Spectrum in Humans
2022-07-22 Sequence Bioinformatics at Large Scale: Petabase-Scale Sequence Alignment Catalyses Viral Discovery
2022-07-22 Long-Read Transcriptome Complexity and Cell-Type Regulatory Signatures in ENCODE4
2022-07-22 Leveraging Long Reads Sequencing for Developing a Functional Iso-Transcriptomics Analysis Framework
2022-07-22 Multi-Omic Integration for Understanding Disease
2022-07-22 The Epigenetic Logic of Gene Activation
2022-07-22 Profiling of Antibody Repertoires and Immunoglobulin Loci Enables Large-Scale Analysis of...
2022-07-22 Leveraging Molecular Data for Drug Discovery
2022-07-22 The Rewards and Challenges of Constructing Patient Registries in Mexico
2022-07-22 Whole Genome Methylation Patterns as a Biomarkers for EHR Imputation
2022-07-22 Biological Discovery and Consumer Genomics Databases Activate Latent Privacy Risk in...
2022-07-22 How Do We Deliver Precision Health at Scale for All?