Hierarchy! The future of AI: How it helps representations and why it is important.

Video Link: https://www.youtube.com/watch?v=1x049Dmxes0
Duration: 2:26:39

The world is largely hierarchical in structure. Learning to represent this hierarchy allows reuse of components and also enables solving a difficult problem iteratively - from the broad level down to the specific level. This supports a divide-and-conquer approach, as we split the problem into subgoals and sub-subgoals.
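
To make the divide-and-conquer idea concrete, here is a minimal Python sketch of recursive goal decomposition. The decompose() function and its rules are purely hypothetical stand-ins for whatever learned or hand-crafted decomposer one might use:

```python
def decompose(goal: str) -> list[str]:
    """Hypothetical decomposer mapping a broad goal to narrower subgoals."""
    rules = {
        "make dinner": ["choose recipe", "prepare ingredients", "cook"],
        "prepare ingredients": ["wash vegetables", "chop vegetables"],
    }
    return rules.get(goal, [])  # empty list => goal is primitive


def solve(goal: str, depth: int = 0) -> None:
    # Solve broad-to-specific: handle the goal, then recurse into subgoals.
    print("  " * depth + goal)
    for subgoal in decompose(goal):
        solve(subgoal, depth + 1)


solve("make dinner")
```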

Memory can be said to be in vector form, with an initial reference point and some movement. I posit that as we go down the hierarchy, the movement would be mapped by some learnable function conditioned on the current state and the meta-level state, and would correspond to increasingly fine-grained, interpretable details.
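
As a minimal sketch of this idea (my own illustrative construction, not from any specific paper), a learnable function f could take the current state and the meta-level state and output a movement vector, which is added to the reference point to descend one level of the hierarchy:

```python
import torch
import torch.nn as nn

dim = 32

# Learnable movement function, conditioned on current and meta-level states.
f = nn.Sequential(
    nn.Linear(2 * dim, dim),
    nn.Tanh(),
    nn.Linear(dim, dim),
)

reference = torch.randn(dim)   # initial reference point in memory space
state = torch.randn(dim)       # current state at this level
meta_state = torch.randn(dim)  # broader meta-level state

movement = f(torch.cat([state, meta_state]))
child = reference + movement   # one step down the hierarchy
```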

In audio, we have residual vector quantization. In images, we have Feature Pyramid Networks. In reinforcement learning, we have SayCan for hierarchical goal setting, and bottleneck search for discovering subgoals. For Large Language Models, we have context conditioning to refine broad-level plans into more specific ones. The brain has plenty of feedback connections in addition to feedforward connections; perhaps the feedback conditions the bottom layers on the context of the top layers. All these domains share some form of conditioning from the broad levels down to the specific levels - perhaps that is how intelligence arises: through a hierarchy.
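
Residual vector quantization is a good concrete example of broad-to-specific structure: each codebook level quantizes the residual left over by the previous level, so early levels capture coarse structure and later levels refine the details. A minimal sketch with random (rather than learned) codebooks:

```python
import numpy as np

rng = np.random.default_rng(0)
dim, codebook_size, n_levels = 8, 16, 3
# In practice codebooks are learned; random ones suffice to show the mechanics.
codebooks = [rng.normal(size=(codebook_size, dim)) for _ in range(n_levels)]


def rvq_encode(x):
    codes, residual = [], x.copy()
    for cb in codebooks:
        idx = int(np.argmin(((residual - cb) ** 2).sum(axis=1)))  # nearest code
        codes.append(idx)
        residual = residual - cb[idx]  # pass the remainder to the next level
    return codes


def rvq_decode(codes):
    return sum(cb[i] for cb, i in zip(codebooks, codes))


x = rng.normal(size=dim)
codes = rvq_encode(x)
print(codes, np.linalg.norm(x - rvq_decode(codes)))  # error shrinks per level
```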

Transformers may not have explicit feedback connections for conditioning, but I posit that the skip connections could already play such a role: they pass the original input token embeddings through unchanged (apart from LayerNorm, which affects all tokens in the same way), allowing them to be conditioned on context built up by iterative self-attention within each layer. Having multiple heads helps with multiple ways of conditioning. As such, the hierarchical stacking of layers may matter a lot for Transformers.
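
To see why the skip connection can play this conditioning role, consider a single pre-LayerNorm Transformer block (a generic sketch; the names are mine, not from any specific codebase): the output is the original embedding plus an attention-derived update, so the input survives intact for later layers to condition on.

```python
import torch
import torch.nn as nn

dim, n_heads = 64, 4
ln = nn.LayerNorm(dim)
attn = nn.MultiheadAttention(dim, n_heads, batch_first=True)

x = torch.randn(1, 10, dim)  # (batch, tokens, dim) token embeddings
h = ln(x)                    # LayerNorm affects all tokens the same way
update, _ = attn(h, h, h)    # context built by self-attention over the tokens
x = x + update               # skip connection: the original x passes through
```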

We are barely scratching the surface of how to abstract into hierarchy, and I do not know the answer myself. Let us explore the various approaches that have been tried before and see if we can find an answer together!

~~~~~~~~~~~~~~~~~~~~~~~~

References:
Slides: https://github.com/tanchongmin/TensorFlow-Implementations/blob/main/Paper_Reviews/Representation%20Learning.pdf
See Part 1 here: https://www.youtube.com/watch?v=cK5TaIz4-eQ

Predictive Coding Hierarchy in the Brain: https://www.nature.com/articles/s41562-022-01516-2
GLOM: How to represent part-whole hierarchies in a neural network (Hinton): https://arxiv.org/pdf/2102.12627.pdf
A Path Towards Autonomous Machine Intelligence - Hierarchical JEPA (Yann LeCun): https://openreview.net/pdf?id=BZ5a1r-kVsf
Learning, Fast and Slow (my own paper on learning with memory + neural networks): https://arxiv.org/abs/2301.13758
Transformers - Attention is all you need: https://arxiv.org/abs/1706.03762
Jukebox: A Generative Model for Music: https://arxiv.org/abs/2005.00341
Residual Vector Quantization: https://arxiv.org/abs/1509.05195
Feature Pyramid Networks for Object Detection (Visual): https://arxiv.org/abs/1612.03144
Generative Agents: Interactive Simulacra of Human Behavior (Memory/LLM): https://arxiv.org/abs/2304.03442
SayCan (RL/LLM): https://say-can.github.io/
Option Learning (RL): https://arxiv.org/abs/2112.03097
Action Chunking (RL): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9274316/
ARC Challenge: https://lab42.global/arc/
Memorizing Transformers: https://arxiv.org/abs/2203.08913

~~~~~~~~~~~~~~~~~~~~~~~~

0:00 Introduction and Recap
10:15 Hierarchical Representational Space
16:14 Memory Representation and Generalisation
24:00 Hierarchy and why it is important
31:45 Jukebox (Audio) - Hierarchical Conditioning on earlier layers
48:10 Residual Vector Quantization (Audio) - Creating Hierarchical Representations
1:05:38 Feature Pyramid Network (Visual) - Bottom-Up Context Building and Top-Down Context Conditioning
1:22:56 Hierarchical JEPA (RL) - Goals and Subgoals
1:28:10 Hierarchical Planning (LLMs) - Using LLMs to generate broad and specific actions
1:35:11 Hierarchical RL (RL) - Finding chunked actions, finding bottleneck states, finding subgoals
1:50:22 Hierarchical Planning (LLMs) - ARC Challenge
1:53:08 Can a Transformer perform hierarchical generation?
2:03:01 Discussion

~~~~~~~~~~~~~~~~~~~~~~~~~

AI and ML enthusiast. Likes to think about the essences behind AI breakthroughs and explain them in a simple and relatable way. Also an avid game creator.

Discord: https://discord.gg/bzp87AHJy5
LinkedIn: https://www.linkedin.com/in/chong-min-tan-94652288/
Online AI blog: https://delvingintotech.wordpress.com/
Twitter: https://twitter.com/johntanchongmin
Try out my games here: https://simmer.io/@chongmin



