Understanding AI #4 - What does Transformer mean in Generative Pre-trained Transformer (GPT)?
Welcome to my new series called Understanding AI, where I research and talk about various topics within Artificial Intelligence.
In this episode I talk about the term Transformer, both in GPT models and in models generally. I also explain associated terms that are important to understanding transformers.
Here are some links to relevant documents for further study:
Neural Machine Translation by Jointly Learning to Align and Translate, D. Bahdanau, K. Cho, Y. Bengio (2015) - https://arxiv.org/abs/1409.0473v5
Attention Is All You Need, A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser, I. Polosukhin (2017) - https://arxiv.org/abs/1706.03762
ImageNet Classification with Deep Convolutional Neural Networks, A. Krizhevsky, I. Sutskever, G. E. Hinton (2012) - https://papers.nips.cc/paper_files/paper/2012/hash/c399862d3b9d6b76c8436e924a68c45b-Abstract.html