Predicting the Deleteriousness of Genomic Variants – Big and Small

Channel:

Simons Institute for the Theory of Computing

Subscribers:

68,700

Published on July 12, 2022 3:46:46 AM ● Video Link: https://www.youtube.com/watch?v=kDOvaeqY4i0

Duration: 26:55

142 views

Martin Kircher (BIH @ Charité / University of Luebeck)
https://simons.berkeley.edu/talks/predicting-deleteriousness-genomic-variants-big-and-small
From Algorithms to Discovery in Genome-Scale Biology and Medicine

Approaches for identifying disease-causal mutations are widely applied in research and clinical settings, but interpreting and ranking the resulting variants remains challenging. Combined Annotation Dependent Depletion (CADD, https://cadd.gs.washington.edu/) integrates annotations by contrasting variants that survived purifying selection along the human lineage with simulated mutations to score short sequence variants (SNVs, InDels, multi-allelic substitutions). Since its publication (Kircher, Witten et al. Nat Genet. 2014), CADD has been widely adopted by the community, and minor adjustments and fixes have been released, including native support for both the GRCh37 and GRCh38 assemblies (Rentzsch et al. NAR 2019).

Recently, we assessed existing deep neural network (DNN) models for splice effects using the Multiplexed Functional Assay of Splicing using Sort-seq dataset (MFASS, Cheung et al. Mol Cell. 2019). For integration into CADD, we selected the two best-performing models that rely only on genomic sequence, MMSplice and SpliceAI (Rentzsch et al. Genome Med. 2021). The DNN scores boosted CADD's predictions for splice effects. We also noted that, although the DNN scores have superior performance on splice variants, they fail to account for nonsense and missense effects of the same variants. This suggests that variant prioritization will improve with more domain-specific information and underlines the importance of identifying additional such features, e.g. for regulatory sequences.

With rapid advances in the identification of structural variants (SVs), we decided to apply the general concept of CADD to score them (CADD-SV, https://cadd-sv.bihealth.org/). While methods based on individual mechanistic principles, such as the deletion of coding sequence or the disruption of 3D architecture, were available, a comprehensive tool that uses the broad spectrum of available SV annotations was missing. We show that CADD-SV scores are predictive of pathogenicity and population frequency and that CADD-SV's ability to prioritize pathogenic variants exceeds that of existing methods like SVScore and AnnotSV (Kleinert & Kircher, Genome Res. 2022). Our results highlight advantages of the CADD approach, such as profiting from a large training data set that covers diverse and rare feature annotations without major ascertainment effects from historic and ongoing variant collections.
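
The training contrast described in the abstract (variants that survived purifying selection versus simulated mutations) can be illustrated with a toy classifier. The following is only a minimal sketch under loud assumptions: the two annotation features, their distributions, the helper names (`simulate_variants`, `train_logistic`, `raw_score`), and the choice of plain logistic regression are all hypothetical and are not taken from the CADD publications or code.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def simulate_variants(n, mean, rng):
    """Draw n toy variants, each with two made-up annotation values
    (hypothetical stand-ins for features such as conservation)."""
    return [[rng.gauss(mean, 1.0), rng.gauss(mean, 1.0)] for _ in range(n)]


def train_logistic(features, labels, epochs=200, lr=0.1):
    """Fit logistic regression by batch gradient descent.
    Returns (weights, bias)."""
    n = len(features)
    n_feat = len(features[0])
    w = [0.0] * n_feat
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_feat
        grad_b = 0.0
        for x, y in zip(features, labels):
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            err = p - y
            for j in range(n_feat):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def raw_score(x, w, b):
    """Higher values mean the variant looks more like the simulated class."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


rng = random.Random(0)
# Label 0: proxy-neutral variants (assumed to have survived selection).
neutral = simulate_variants(200, 0.0, rng)
# Label 1: simulated mutations (assumed shifted toward "damaging" annotations).
simulated = simulate_variants(200, 1.0, rng)

w, b = train_logistic(neutral + simulated, [0] * 200 + [1] * 200)
print("weights:", w, "bias:", b)
```

In this toy setup, a variant with high annotation values receives a higher raw score than one with low values, which mirrors the idea of ranking variants by how strongly they resemble the simulated class rather than the class that survived selection.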

Tags:

Simons Institute

theoretical computer science

UC Berkeley

Computer Science

Theory of Computation

Theory of Computing

From Algorithms to Discovery in Genome-Scale Biology and Medicine

Martin Kircher