Perspectives on Cross-Validation

Published on 2020-02-03
Video Link: https://www.youtube.com/watch?v=AU6OS_uq0mU



Duration: 54:36


Cross-validation is probably the most widely used method for risk estimation in machine learning and statistics. However, analyzing it and comparing it to the data-splitting estimator has proved difficult. In the first part of the talk, I will present a new analysis which characterizes the exact asymptotics of cross-validation in the form of a central limit theorem for estimators that satisfy certain stability conditions. In particular, parametric estimators automatically satisfy these conditions, and the theorems fully characterize the cross-validated risk for such estimators. I will demonstrate that they exhibit a wide variety of behaviours: in the case of a parametric empirical risk minimizer, the folds behave as if independent when the evaluation loss is the same as the training loss; however, if a surrogate loss is used, different behaviours may occur.

In the second part, I will move on to discuss issues that arise when using cross-validation for high-dimensional estimators. In the regime where the number of parameters is comparable to the number of observations, cross-validation (and data splitting) may introduce serious bias in the risk estimate when the amount of data left out is large (i.e. the number of folds is small). A natural way to alleviate this problem is to leave out as little data as possible: a single observation, leading to leave-one-out cross-validation (LOOCV). I will show that such a result indeed holds: the LOOCV estimator is consistent in the high-dimensional asymptotic regime. Unfortunately, the LOOCV estimator is computationally prohibitive and cannot be used in practice. Finally, I will discuss a general framework, approximate LOOCV, from which closed-form approximate estimators can be derived for penalized GLMs, including non-smooth ones such as the LASSO or SVMs.
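The idea that LOOCV can be evaluated in closed form, without n refits, has a classical exact instance for linear smoothers such as ridge regression. The sketch below (illustrative only, not code from the talk) compares brute-force LOOCV with the shortcut that computes the leave-one-out residual as (y_i − ŷ_i) / (1 − H_ii), where H = X (XᵀX + λI)⁻¹Xᵀ is the hat matrix; all names and the simulated data are assumptions for the demo.

```python
import numpy as np

# Illustrative sketch: for ridge regression, LOOCV has an exact closed form
# via the hat-matrix diagonal, avoiding n separate model refits.
rng = np.random.default_rng(0)
n, p = 50, 5
X = rng.standard_normal((n, p))
beta = rng.standard_normal(p)
y = X @ beta + 0.5 * rng.standard_normal(n)
lam = 1.0  # ridge penalty (hypothetical choice for the demo)

# Closed-form shortcut: H = X (X'X + lam*I)^{-1} X',
# LOO residual_i = (y_i - yhat_i) / (1 - H_ii)
H = X @ np.linalg.solve(X.T @ X + lam * np.eye(p), X.T)
yhat = H @ y
loo_shortcut = np.mean(((y - yhat) / (1 - np.diag(H))) ** 2)

# Brute-force LOOCV: refit n times, score the held-out observation
errs = []
for i in range(n):
    mask = np.arange(n) != i
    Xi, yi = X[mask], y[mask]
    b = np.linalg.solve(Xi.T @ Xi + lam * np.eye(p), Xi.T @ yi)
    errs.append((y[i] - X[i] @ b) ** 2)
loo_brute = np.mean(errs)

print(np.isclose(loo_shortcut, loo_brute))  # the two estimates agree
```

For penalized GLMs with non-quadratic or non-smooth losses (e.g. the LASSO, SVMs) no such exact identity exists, which is where the approximate-LOOCV framework discussed in the talk comes in.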

See more at https://www.microsoft.com/en-us/research/video/perspectives-on-cross-validation/




Other Videos By Microsoft Research


2020-03-04  AI, Azure and the future of healthcare with Dr. Peter Lee | Podcast
2020-02-27  Towards Mainstream Brain-Computer Interfaces (BCIs)
2020-02-27  Exploring Massively Multilingual, Massive Neural Machine Translation
2020-02-27  Fireside Chat with Maarten de Rijke
2020-02-26  Neural architecture search, imitation learning and the optimized pipeline with Dr. Debadeepta Dey
2020-02-21  Information Agents: Directions and Futures (2001)
2020-02-19  Democratizing data, thinking backwards and setting North Star goals with Dr. Donald Kossmann
2020-02-19  Behind the scenes on Team Explorer’s practice run at Microsoft for the DARPA SubT Urban Challenge
2020-02-12  Microsoft Scheduler and dawn of Intelligent PDAs with Dr. Pamela Bhattacharya | Podcast
2020-02-05  Responsible AI with Dr. Saleema Amershi | Podcast
2020-02-03  Perspectives on Cross-Validation
2020-01-30  Data Science Summer School 2019 - Replicating "An Empirical Analysis of Racial Differences in Po..."
2020-01-29  Going deep on deep learning with Dr. Jianfeng Gao | Podcast
2020-01-22  Innovating in India with Dr. Sriram Rajamani [Podcast]
2020-01-17  Underestimating the challenge of cognitive disabilities (and digital literacy)
2020-01-17  Understanding Knowledge Distillation in Neural Sequence Generation
2020-01-17  'F' to 'A' on the N.Y. Regents Science Exams: An Overview of the Aristo Project
2020-01-07  Private AI Bootcamp Keynote – Sreekanth Kannepalli
2020-01-07  Introduction to CKKS (Approximate Homomorphic Encryption)
2020-01-07  Private AI Bootcamp Competition: Team 3
2020-01-07  Conversations Based on Search Engine Result Pages



Tags:
Cross-validation
machine learning and statistics
data splitting
LOOCV
AI
microsoft research