How to Take Advantage of the New Disruptive AI Technology Called Transformers
Transformer neural networks are shaking up Artificial Intelligence
July 15, 2021
Since 2017, Transformers have driven impressive progress in deep learning. Many of us consider them the most important development in the field in recent years, and the one with the greatest potential. For this reason, I believe it is worthwhile to keep a close eye on their progress.
The new normal that changes the way we do NLP
Transformers were introduced in the seminal paper “Attention Is All You Need” by Vaswani et al. (2017). Attention mechanisms already existed in earlier sequence models; the paper’s contribution was an architecture built entirely on attention, specifically self-attention, with no recurrence or convolution. That idea has quickly become one of the most influential in deep learning applied to the NLP domain.
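To make the idea of attention more concrete, here is a minimal sketch of scaled dot-product attention, the core operation of the Vaswani et al. paper, written in plain Python. The toy token vectors are made up for illustration, and the learned projection matrices and multiple heads of a real Transformer are left out, so treat it as a sketch rather than a faithful implementation.

```python
import math

def softmax(scores):
    """Turn a list of raw scores into weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def scaled_dot_product_attention(queries, keys, values):
    """For each query, return a weighted average of the value vectors."""
    d_k = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        # Blend the value vectors using the attention weights.
        blended = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(blended)
        all_weights.append(weights)
    return outputs, all_weights

# Toy input: three tokens, each a 2-dimensional vector (made-up numbers).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# In self-attention, queries, keys and values all come from the same tokens.
# A real Transformer first applies learned projections; they are omitted here.
outputs, weights = scaled_dot_product_attention(tokens, tokens, tokens)

for row in weights:
    print([round(w, 3) for w in row])
```

Each printed row shows how much one token attends to every token in the sequence, including itself. Every output vector is a mix of all the input vectors, which is how each position gets direct access to the whole sequence.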
It can be applied to other domains like computer vision
The same attention mechanisms that make Transformers so effective for language models also carry over to other domains, and Transformers have started to find tremendous success in areas such as computer vision.
One of the advantages of Transformers is their ability to learn without labeled data. For example, a Transformer can build representations through unsupervised (more precisely, self-supervised) pretraining on raw text, where the training signal comes from the text itself. It can then apply those representations to fill in the blanks in incomplete sentences or to generate coherent text from a prompt.
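The fill-in-the-blanks idea can be sketched in a few lines of Python. The example below shows how a training pair is produced from raw text alone: some words are hidden, and the hidden words themselves become the targets. The function name `make_masked_example`, the `[MASK]` placeholder, the whitespace tokenization and the 15% masking rate are all assumptions chosen for illustration, not the recipe of any particular model.

```python
import random

MASK = "[MASK]"  # hypothetical placeholder token

def make_masked_example(sentence, mask_prob=0.15, seed=0):
    """Hide some words; the hidden words become the prediction targets."""
    rng = random.Random(seed)
    tokens = sentence.split()  # naive whitespace tokenization
    inputs, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            inputs.append(MASK)
            labels.append(tok)   # the model must recover this word
        else:
            inputs.append(tok)
            labels.append(None)  # nothing to predict at this position
    if MASK not in inputs:
        # Guarantee at least one blank so the example is usable.
        i = rng.randrange(len(tokens))
        inputs[i], labels[i] = MASK, tokens[i]
    return inputs, labels

text = "Transformers learn useful representations from raw text"
inputs, labels = make_masked_example(text, mask_prob=0.3)
print(inputs)
print(labels)
```

No human annotation is involved: the labels are simply the words that were hidden, which is why this kind of pretraining scales to very large text collections.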
Computing cost of training Transformers
However, training and deploying large Transformers remains largely a privilege of the big technology companies with access to vast data sources and compute resources. For example, OpenAI’s popular GPT-3 model is estimated to have cost around 10 million dollars to train, an amount of…