Machine Learning Day 2013 - Clustering; Geometry Preserving Non-Linear Dimension Reduction

Video Link: https://www.youtube.com/watch?v=jZH4e56YEc0



Duration: 1:06:29


Clustering: Probably Approximately Useless?, Rich Caruana (MSR)

Clustering never seems to live up to the hype. To paraphrase the popular saying, clustering looks good in theory, yet often fails to deliver in practice. Why? You would think that something so simple and elegant as finding groups of similar items in data would be incredibly useful. Yet often it isn't. The problem is that clustering rarely finds the groups *you* want, or expected, or that are most useful for the task at hand. There are so many good ways to cluster a dataset that the odds of coming up with the clustering that is best for what you're doing are small. How do we fix this and make clustering more useful in practice? How do we make clustering do what we want, while still giving it the freedom to "do its own thing" and surprise us?

Geometry preserving non-linear dimension reduction, Marina Meila (UW Statistics)

In recent years, manifold learning has become increasingly popular as a tool for performing non-linear dimensionality reduction. This has led to the development of numerous algorithms of varying degrees of complexity that aim to recover the underlying low-dimensional parameters of the data using either local or global features. It is also widely recognized that the low-dimensional parametrizations will typically distort the geometric properties of the original data, such as distances, angles, and areas. These distortions depend both on the data and on the algorithm used.

Building on the Laplacian Eigenmap framework, we propose a new paradigm that offers a guarantee, under reasonable assumptions, that *any* manifold learning algorithm will preserve the geometry of a data set. Our approach is based on augmenting the output of an algorithm with geometric information, embodied in the Riemannian metric of the manifold. The Riemannian metric allows us to compute geometric quantities (such as angle, length, or volume) for any coordinate system or embedding of the manifold. This geometric faithfulness, which is not guaranteed for most algorithms, allows us to define geometric measurements that are independent of the algorithm used, and hence move seamlessly from one algorithm to another.

In this work, we provide an algorithm for estimating the Riemannian metric from data and demonstrate the advantages of our approach in a variety of examples. As an application of this new framework, we develop a new, principled, unsupervised approach to selecting the scale parameter in manifold learning, based on optimizing the geometric self-consistency with respect to the scale. This talk will not require any knowledge of advanced mathematics or manifold learning. Joint work with Dominique Perrault-Joncas.
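To make the role of the Riemannian metric concrete, here is a minimal sketch, not the estimation algorithm from the talk. It assumes a hypothetical embedding that distorts coordinates by a fixed linear map A (a stretch plus a shear), and shows that angles and lengths measured naively in the distorted coordinates are wrong, while the same measurements taken with the transported metric recover the original values. All names (metric_inner, metric_angle, A, G_embedded, and so on) are illustrative assumptions.

```python
import math

def metric_inner(G, u, v):
    """Inner product u^T G v of tangent vectors u, v under the metric G."""
    n = len(u)
    return sum(u[i] * G[i][j] * v[j] for i in range(n) for j in range(n))

def metric_length(G, v):
    """Length of tangent vector v under the metric G."""
    return math.sqrt(metric_inner(G, v, v))

def metric_angle(G, u, v):
    """Angle in radians between u and v under the metric G."""
    c = metric_inner(G, u, v) / (metric_length(G, u) * metric_length(G, v))
    return math.acos(max(-1.0, min(1.0, c)))  # clamp against rounding error

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(row) for row in zip(*A)]

def inverse_2x2(A):
    (a, b), (c, d) = A
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def apply(A, v):
    return [sum(A[i][j] * v[j] for j in range(len(v))) for i in range(len(A))]

# Original coordinates: Euclidean metric, two orthogonal unit tangent vectors.
G_original = [[1.0, 0.0], [0.0, 1.0]]
u, v = [1.0, 0.0], [0.0, 1.0]

# Hypothetical distortion introduced by an embedding (assumption for this sketch).
A = [[2.0, 1.0], [0.0, 0.5]]
A_inv = inverse_2x2(A)

# The embedding maps tangent vectors by A and the metric by A^{-T} G A^{-1}.
u_emb, v_emb = apply(A, u), apply(A, v)
G_embedded = matmul(transpose(A_inv), matmul(G_original, A_inv))

# Naive measurement: pretend the embedded coordinates are Euclidean.
naive_angle = metric_angle(G_original, u_emb, v_emb)
# Corrected measurement: use the metric carried along with the embedding.
corrected_angle = metric_angle(G_embedded, u_emb, v_emb)
corrected_length = metric_length(G_embedded, u_emb)

print("naive angle (deg):    ", math.degrees(naive_angle))
print("corrected angle (deg):", math.degrees(corrected_angle))
print("corrected length of u:", corrected_length)
```

In this toy setting the metric is known exactly because A is known. In practice the distortion is unknown, which is why the abstract emphasizes estimating the metric from the data.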







Tags:
microsoft research