ChatGPT - the Chatbot that Follows Instructions - DRT S2E9
In this episode we talked about:
- how things have changed since 2017 with transformers: BERT-based models as encoders, and GPT-based models as decoders
- how scaling laws for LMs reached a point where emergent multi-task behavior appeared beyond certain model capacities
- how prompt engineering emerged as a field for controlling the behavior of LLMs, and the problems associated with it (see the sketch after this list)
- how instruction finetuning has led to a promising solution to the problems associated with prompt engineering
- how the OpenAI team made interesting product decisions in releasing the model as a chatbot, and the impact that had on creating hype
- the opportunities founders and other builders have to create vertical GPT-based products
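
To make the prompt-engineering vs. instruction-following contrast we discussed concrete, here is a minimal sketch using the pre-1.0 `openai` Python client. The model names (`davinci`, `gpt-3.5-turbo`) and the `OPENAI_API_KEY` environment variable are assumptions for illustration, not something from the episode, and these endpoints and models have since been deprecated:

```python
import os

import openai  # pre-1.0 client, e.g. `pip install openai==0.28`

openai.api_key = os.environ["OPENAI_API_KEY"]  # assumed to be set

# Prompt engineering on a base LM: pack examples into the context
# window so the model (hopefully) continues the pattern. Small wording
# changes in this prompt can swing the output, which is the fragility
# discussed in the episode.
few_shot_prompt = """Translate English to French.

sea otter => loutre de mer
cheese => fromage
peppermint =>"""

completion = openai.Completion.create(
    model="davinci",        # assumed base (non-instruction-tuned) model
    prompt=few_shot_prompt,
    max_tokens=16,
    temperature=0,
    stop="\n",              # stop after the single completed line
)
print(completion.choices[0].text.strip())

# Instruction following via the chat endpoint: state the task directly
# instead of engineering a pattern for the model to complete.
chat = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",  # assumed instruction-tuned chat model
    messages=[{"role": "user", "content": "Translate 'peppermint' to French."}],
)
print(chat.choices[0].message["content"])
```

The design difference is the point: the first call relies on carefully shaped context to coax the desired behavior out of a base model, while the second relies on instruction finetuning to let the user simply say what they want.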
Summary: I think it's incredible that they've been able to capture the attention of people outside the machine learning field, and I commend the product team for their insight. However, I want to add a word of caution about the hype around the model: it still makes basic language model errors. For example, when I asked it about the smallest congressional district in Canada, it gave me an incorrect answer. While this model represents significant technical progress, it still has many shortcomings, and when working with these models it's crucial to think about the problem we're trying to solve and the right system design for achieving that goal.