Long-Read Transcriptome Complexity and Cell-Type Regulatory Signatures in ENCODE4

Video Link: https://www.youtube.com/watch?v=CivPG_qL2MU



Duration: 32:25


Ali Mortazavi (University of California, Irvine)
https://simons.berkeley.edu/talks/long-read-transcriptome-complexity-and-cell-type-regulatory-signatures-encode4
Computational Challenges in Very Large-Scale 'Omics'

A significant proportion of mammalian genes encode multiple transcript isoforms that result from differential promoter usage, changes in internal splicing, and 3’ end choice. Comprehensively characterizing transcript diversity across tissues, cell types, and species has been challenging because transcripts are much longer than the reads normally used for RNA-seq. Long-read RNA-seq (lrRNA-seq) allows the complete structure of each transcript to be identified. As part of the final phase of the ENCODE Consortium, we sequenced 216 lrRNA-seq libraries totaling 1 billion circular consensus sequencing (CCS) reads for 60 unique human and mouse samples. We detected and quantified 94.4% of GENCODE protein coding genes as well as 42.6% of known protein coding transcripts. Overall, we detected over 100,000 full-length transcripts, one third of which are novel. We then defined a new reference set of the transcription start sites (TSSs), transcription end sites (TESs), and intron chains that are used for each gene across diverse tissues and cell types. Finally, we developed new metrics to characterize the transcriptional diversity of each gene in terms of alternative TSS choice, TES choice, and internal splicing, and demonstrated that this diversity varies on a per-gene basis across tissues, cell lines, and species. Our results represent the first comprehensive survey of human and mouse transcriptomes using full-length long reads and will serve as a foundation for further transcript-centric analyses.

Genomic regulation after birth contributes significantly to tissue and organ maturation, but it has been studied far less than prenatal development in mouse, for which genomic catalogues already exist. As part of ENCODE4, we generated the first comprehensive bulk and single-cell atlas of postnatal regulatory events across a diverse set of mouse tissues. The collection encompassed seven postnatal time points spanning the human equivalent of childhood through adolescence and adulthood, and focused on adrenal glands, gastrocnemius muscle, heart, hippocampus, and cortex. To allow for allele-specific analyses, we used C57BL/6J x Castaneus F1 hybrid mice. Our analysis revealed novel dynamics of cell type composition, including new sex-specific cell populations and new commonalities in cell types shared among tissues. We also identified genomic regulatory signatures associated with dynamics of cell type composition, specialization of sub-cell types, and switching between cell states during postnatal development across 21 different cell types broken down into 68 sub-cell types. We provide an organizational framework to describe transcription factors (TFs) that are re-purposed in regulatory signatures of cell type identity in different tissues. Together, these analyses provide a foundation for understanding the postnatal development of diverse tissues.
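To make the idea of per-gene diversity in TSS choice, TES choice, and internal splicing more concrete, here is a minimal Python sketch. It is not the metric presented in the talk: it only counts the distinct TSSs, intron chains, and TESs observed for each gene and reports each count as a fraction of their sum. The function name `gene_diversity`, the input tuple layout, and the toy coordinates are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def gene_diversity(transcripts):
    """Count distinct TSSs, intron chains, and TESs per gene.

    `transcripts` is an iterable of (gene, tss, intron_chain, tes) tuples,
    where intron_chain is a sequence of (start, end) intron coordinates.
    This layout is an assumption made for the sketch, not a format from the talk.
    """
    tss = defaultdict(set)
    chains = defaultdict(set)
    tes = defaultdict(set)
    for gene, start, chain, end in transcripts:
        tss[gene].add(start)
        chains[gene].add(tuple(chain))  # tuple so the chain is hashable
        tes[gene].add(end)

    summary = {}
    for gene in tss:
        counts = (len(tss[gene]), len(chains[gene]), len(tes[gene]))
        total = sum(counts)
        summary[gene] = {
            "n_tss": counts[0],
            "n_intron_chains": counts[1],
            "n_tes": counts[2],
            # Share of the gene's diversity attributable to each feature type.
            "fractions": tuple(c / total for c in counts),
        }
    return summary


# Toy input: geneA has two TSSs, two intron chains, and a single TES.
toy = [
    ("geneA", 100, [(200, 300)], 900),
    ("geneA", 150, [(200, 300), (400, 500)], 900),
    ("geneA", 150, [(200, 300)], 900),
    ("geneB", 10, [(20, 30)], 90),
]
result = gene_diversity(toy)
print(result["geneA"])
```

In this toy input, geneA's diversity comes from its start sites and splicing rather than its end site, while geneB has a single isoform and so an even split across the three features.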




Other Videos By Simons Institute for the Theory of Computing


2022-07-22 Learning Gene Association Networks Using Single-Cell RNA-Seq Data: A Graphical Model Approach
2022-07-22 Mapping Gene Regulatory Dependencies with Single-Cell Resolution
2022-07-22 Harnessing Multimodal Single-Cell Sequencing Data for Integrative Analysis
2022-07-22 Learning From Large-Scale (Single-Cell) ‘Omics’
2022-07-22 Panel Discussion
2022-07-22 Exploratory and Model-Based Analysis of ScHi-C Data
2022-07-22 The Earth Biogenome Project: Progress and the Challenges Ahead
2022-07-22 Multiple Sequence Alignment for Predicting Antigen-Antibody Interactions
2022-07-22 Evolution of Germline Mutation Spectrum in Humans
2022-07-22 Sequence Bioinformatics at Large Scale: Petabase-Scale Sequence Alignment Catalyses Viral Discovery
2022-07-22 Long-Read Transcriptome Complexity and Cell-Type Regulatory Signatures in ENCODE4
2022-07-22 Leveraging Long Reads Sequencing for Developing a Functional Iso-Transcriptomics Analysis Framework
2022-07-22 Multi-Omic Integration for Understanding Disease
2022-07-22 The Epigenetic Logic of Gene Activation
2022-07-22 Profiling of Antibody Repertoires and Immunoglobulin Loci Enables Large-Scale Analysis of...
2022-07-22 Leveraging Molecular Data for Drug Discovery
2022-07-22 The Rewards and Challenges of Constructing Patient Registries in Mexico
2022-07-22 Whole Genome Methylation Patterns as a Biomarkers for EHR Imputation
2022-07-22 Biological Discovery and Consumer Genomics Databases Activate Latent Privacy Risk in...
2022-07-22 How Do We Deliver Precision Health at Scale for All?
2022-07-22 Longitudinal Phenotypes and Disease Trajectories at Population Scale



Tags:
Simons Institute
theoretical computer science
UC Berkeley
Computer Science
Theory of Computation
Theory of Computing
Computational Challenges in Very Large-Scale 'Omics'
Ali Mortazavi