Prediction builds representations! Fixed Bias speeds up learning!
We perceive high-dimensional stimuli, yet we do not seem to store memories in this high-dimensional space. Instead, we appear to have an abstraction space where memory is stored in a semantically meaningful way. How can we best store memory? I highlight that next-token prediction via self-supervised learning may hold the key to learning representations, and hence may be the reason why Transformers work so well. I also highlight that the autoencoder reconstruction loss may be a poor way to abstract meaningful relations.
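The contrast between the two objectives can be sketched in a toy PyTorch example (all names and sizes here are made up for illustration): an autoencoder is trained to reproduce its own input, while next-token prediction is trained to predict what comes next, which forces the representation to capture relations between tokens rather than just the tokens themselves.

```python
import torch
import torch.nn as nn

# Toy setup: a batch of random token sequences (purely illustrative sizes).
vocab_size, d_model, seq_len = 50, 16, 8
tokens = torch.randint(0, vocab_size, (4, seq_len))

embed = nn.Embedding(vocab_size, d_model)

# (a) Autoencoder-style objective: reconstruct the input itself.
decoder = nn.Linear(d_model, vocab_size)
recon_logits = decoder(embed(tokens))
recon_loss = nn.functional.cross_entropy(
    recon_logits.reshape(-1, vocab_size), tokens.reshape(-1)
)

# (b) Next-token prediction: predict token t+1 from token t.
predictor = nn.Linear(d_model, vocab_size)
pred_logits = predictor(embed(tokens[:, :-1]))  # inputs: positions 0..T-2
targets = tokens[:, 1:]                         # targets: positions 1..T-1
pred_loss = nn.functional.cross_entropy(
    pred_logits.reshape(-1, vocab_size), targets.reshape(-1)
)
```

In (a) the target is the input, so the network can succeed by memorizing identity; in (b) the target is the future, so the embedding must encode what one token implies about the next.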
Perhaps we store memory in different forms for different tasks. We need a way to retrieve memories and adapt them to the current situation. The embedding from input space to abstraction space may be learnable at first, but should later be fixed to allow for memory reuse. We may also have natural fixed biases, such as the frequency selectivity of the cochlear cilia and the rods and cones in our eyes, which restrict the inputs we receive and may help build better semantic representations. There is a lot to think about, and I still have not found the answer. Next week, we will talk more about how hierarchical representations may help with memory storage.
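The "learnable at first, then fixed" idea can be sketched in PyTorch by freezing the embedding after initial training, so that downstream layers keep adapting while the abstraction space (and any memories keyed by it) stays stable. This is only a minimal illustration of the concept, not the talk's specific proposal:

```python
import torch
import torch.nn as nn

# Hypothetical sketch: an input-to-abstraction embedding that is learnable
# at first, then fixed so previously stored memories remain addressable.
embed = nn.Embedding(1000, 32)  # maps raw input ids to abstraction space
head = nn.Linear(32, 10)        # task-specific readout, still adaptable

# ... after some joint training, fix the embedding:
embed.weight.requires_grad_(False)

# Later optimization only touches parameters that still require gradients,
# so the embedding space no longer drifts under new tasks.
trainable = [p for p in list(embed.parameters()) + list(head.parameters())
             if p.requires_grad]
optimizer = torch.optim.Adam(trainable, lr=1e-3)
```

Once frozen, the same input always maps to the same point in abstraction space, which is what makes reuse of stored memories possible.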
~~~~~~~~~~~~~~~~~~~~~
References:
See Part 2 here: https://www.youtube.com/watch?v=1x049Dmxes0
Slides: https://github.com/tanchongmin/TensorFlow-Implementations/blob/main/Paper_Reviews/Representation%20Learning.pdf
Two-kitten experiment (earliest version by Held and Hein, showing that movement is necessary for learning interactions with the world): http://wexler.free.fr/library/files/held%20%281963%29%20movement-produced%20stimulation%20in%20the%20development%20of%20visually%20guided%20behavior.pdf
Learning, Fast and Slow: https://www.youtube.com/watch?app=desktop&v=Hr9zW7Usb7I
~~~~~~~~~~~~~~~~~~~~~
0:00 Introduction
1:20 Autoencoders: Representation via Reconstruction
8:08 Do you need to predict everything?
11:26 Transformers: Representation via Prediction
18:08 Self-Supervised Learning learns manifolds without human labels
21:30 Action Prediction is all you need
37:58 JEPA: Prediction in latent space
48:40 Why Contrastive Loss is bad
55:29 World Models: Do we really reconstruct pixel space?
1:13:10 Natural Fixed Biases: Faster learning by constraints
1:25:50 Information Pipeline incorporating fixed bias
1:28:52 Discussion
~~~~~~~~~~~~~~~~~~~~~~
AI and ML enthusiast. I like to think about the essence behind AI breakthroughs and explain them in a simple and relatable way. I am also an avid game creator.
Discord: https://discord.gg/bzp87AHJy5
LinkedIn: https://www.linkedin.com/in/chong-min-tan-94652288/
Online AI blog: https://delvingintotech.wordpress.com/
Twitter: https://twitter.com/johntanchongmin
Try out my games here: https://simmer.io/@chongmin