The authors of Llama 2 have made a big step forward by updating the pre-training dataset, which is now cleaner and larger at 2 trillion tokens. They also improved the model architecture by adding grouped-query attention (GQA) in the larger models, which speeds up inference. The context length has been doubled to 4 thousand tokens. Training was conducted in several stages: pre-training, supervised fine-tuning, and reinforcement learning from human feedback (RLHF).
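To make the grouped-query attention idea concrete, here is a minimal pure-Python sketch. It is not Meta's implementation: the function names, the single-token query, and the toy shapes are all assumptions chosen for illustration. It only shows the core trick, which is that several query heads share one key/value head, so fewer keys and values need to be stored during generation.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attend(q, keys, values):
    """Scaled dot-product attention for one query vector over a sequence."""
    scale = 1.0 / math.sqrt(len(q))
    scores = [scale * sum(a * b for a, b in zip(q, k)) for k in keys]
    weights = softmax(scores)
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]


def grouped_query_attention(queries, keys, values):
    """Hypothetical helper illustrating GQA for a single query position.

    queries: [n_q_heads][head_dim]
    keys, values: [n_kv_heads][seq_len][head_dim]
    """
    n_q_heads, n_kv_heads = len(queries), len(keys)
    if n_q_heads % n_kv_heads != 0:
        raise ValueError("n_q_heads must be a multiple of n_kv_heads")
    group_size = n_q_heads // n_kv_heads
    outputs = []
    for head, q in enumerate(queries):
        kv_head = head // group_size  # heads in the same group share K/V
        outputs.append(attend(q, keys[kv_head], values[kv_head]))
    return outputs


# Toy example: 4 query heads share 2 key/value heads (groups of 2).
queries = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
keys = [[[1.0, 0.0], [0.0, 1.0]], [[0.5, 0.5], [1.0, 1.0]]]
values = [[[1.0, 2.0], [3.0, 4.0]], [[5.0, 6.0], [7.0, 8.0]]]
outputs = grouped_query_attention(queries, keys, values)
```

With as many key/value heads as query heads this reduces to ordinary multi-head attention, and with a single key/value head it becomes multi-query attention; GQA sits between the two.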
An interesting observation is that reinforcement learning (RL) not only affects probability calibration (as noted by OpenAI researchers) but also appears to rescale the model's effective temperature depending on the prompt: answers to factual questions stay consistent, while responses to creative prompts remain diverse.
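For readers unfamiliar with the term, the sketch below shows what sampling temperature does to a next-token distribution. It is only an illustration of ordinary temperature scaling, not the effect measured in the paper; the logits and function names are made up for the example.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, sharpened or flattened by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy in nats; higher means a more diverse distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


# Hypothetical logits for three candidate tokens.
logits = [2.0, 1.0, 0.1]
sharp = softmax_with_temperature(logits, 0.5)   # low temperature: near-deterministic
neutral = softmax_with_temperature(logits, 1.0)
flat = softmax_with_temperature(logits, 2.0)    # high temperature: more varied
```

A low temperature concentrates probability on the top token, which suits factual answers, while a high temperature spreads it out, which suits creative writing. The observation above is that the RL-tuned model seems to shift along this axis on its own depending on the prompt.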
Most importantly, Llama 2 was released openly, under a license that allows commercial use!
📝 Paper: https://ai.meta.com/research/publications/llama-2-open-foundation-and-fine-tuned-chat-models/
👨‍💻 GitHub: https://github.com/a16z-infra/llama2-chatbot
