IMS-Microsoft Research Workshop: Foundations of Data Science - False Discovery Rates - a new deal

Video Link: https://www.youtube.com/watch?v=fHrEM-MWr1U



Duration: 30:09


Session Chair Intro - Rafael Irizarry, Harvard University
Session Chair Intro: Statistical and Computational Challenges in Biology

Matthew Stephens, University of Chicago
False Discovery Rates - a new deal

False Discovery Rate (FDR) methodology, first put forward by Benjamini and Hochberg, and further developed by many authors - including Storey, Tibshirani, and Efron - is now one of the most widely used statistical methods in large-scale scientific data analysis, particularly in genomics. A typical genomics workflow consists of i) estimating thousands of effects, and their associated p values; ii) feeding these p values to software (e.g. the widely used qvalue package) to estimate the FDR for any given significance threshold.

In this talk we take a fresh look at this problem, and highlight two deficiencies of this standard pipeline that we believe could be improved. First, current methods, being based directly on p values (or z scores), fail to fully account for the fact that some measurements are more precise than others. Second, current methods assume that the least significant p values (those near 1) are all null - something that initially appears intuitive, but will not necessarily hold in practice. We suggest simple approaches to address both issues, and demonstrate the potential for these methods to increase the number of discoveries at a given FDR threshold. We also discuss the connection between this problem and shrinkage estimation, and problems involving sparsity more generally.
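The standard pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the method proposed in the talk, and it is not the qvalue package: it is a plain implementation of the classical Benjamini-Hochberg step-up procedure, using only the standard library. The function names `two_sided_p` and `benjamini_hochberg`, the normal approximation for the test statistic, and the example numbers are all assumptions made for illustration.

```python
import math


def two_sided_p(beta_hat, se):
    """Two-sided p value for an effect estimate, assuming z = beta_hat / se is
    standard normal under the null (an assumption of this sketch)."""
    z = beta_hat / se
    return math.erfc(abs(z) / math.sqrt(2))


def benjamini_hochberg(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up procedure.

    Returns a list of booleans, True where the hypothesis is rejected
    while targeting a false discovery rate of alpha.
    """
    m = len(p_values)
    # Indices of the p values, ordered from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= (k / m) * alpha.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            k_max = rank
    # Reject every hypothesis ranked at or below k_max.
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected


# Illustrative (made-up) p values, one per tested effect.
example_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
discoveries = benjamini_hochberg(example_p, alpha=0.05)

# Two hypothetical measurements with very different precision but the same
# z score: they receive the same p value, so a p-value-only pipeline cannot
# tell them apart.
p_noisy = two_sided_p(2.0, 1.0)
p_precise = two_sided_p(0.2, 0.1)
```

The last two lines relate to the first deficiency named in the abstract: once the estimate and its standard error are collapsed into a single z score or p value, the information about which measurement was more precise is no longer available to the FDR step.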







Tags:
microsoft research
program languages and software engineering