Compositional sparsity and learnability

Video Link: https://www.youtube.com/watch?v=5R_KqZn0bXs


Tomaso Poggio (Massachusetts Institute of Technology)
https://simons.berkeley.edu/talks/tomaso-poggio-massachusetts-institute-technology-2025-07-22
Smale@95: A Conference in Honor of Steve Smale

Is mathematics invented or discovered? Recent progress in formulating the fundamental principles underlying the stunning success of deep learning (DL) sheds new light on this age-old question. I will discuss why deep networks seem to escape the curse of dimensionality. The answer lies in a key property of all functions that are efficiently Turing computable: they are compositionally sparse. This property enables the effective use of deep (and sparse) networks, the engine powering deep learning and related systems such as LLMs. It is, however, difficult to select a "good" decomposition exploiting sparse compositionality: each efficiently Turing computable function admits many sparse decompositions. How then can deep networks learn reusable sparse decompositions? One way is to impose a curriculum, similar to a chain of thought, in which the constituent functions are common across different tasks.
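As a minimal illustration of compositional sparsity (the symbols g_i and h_i below are illustrative, not taken from the talk): a function of d = 8 variables written as

  f(x_1, ..., x_8) = h_3( h_1( g_1(x_1, x_2), g_2(x_3, x_4) ), h_2( g_3(x_5, x_6), g_4(x_7, x_8) ) )

is compositionally sparse, since each constituent function depends on only two arguments. A deep network whose connectivity mirrors this binary-tree structure only has to approximate two-dimensional constituents, so (under standard smoothness assumptions on the constituents) its approximation rate is governed by the constituent dimensionality, here 2, rather than by d. This is one way to state how compositionally sparse targets let deep, sparse networks escape the curse of dimensionality, whereas a shallow network approximating f directly faces the full d-dimensional problem.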