Event:
AI Conf 2024
Language:
English

Tags

  • Generative AI
  • Artificial Intelligence
  • LLM

Bigger models or more data? The new scaling laws for LLMs

The incredibly famous Chinchilla paper changed the way we train LLMs. Its authors - including the current Mistral CEO - derived scaling laws for maximising model performance under a fixed compute budget by balancing the number of parameters against the number of training tokens.
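To give a feel for the trade-off the talk discusses, here is a minimal Python sketch of a compute-optimal split, assuming the commonly cited approximation of training compute C ≈ 6·N·D and the roughly 20-tokens-per-parameter rule of thumb associated with Chinchilla; the function name and the example budget are illustrative, not taken from the talk.

    import math

    def chinchilla_optimal(compute_flops, tokens_per_param=20.0):
        # Assumes training compute C ~ 6 * N * D and a fixed tokens/parameter ratio.
        # Solve 6 * N * (tokens_per_param * N) = C for the parameter count N.
        n_params = math.sqrt(compute_flops / (6.0 * tokens_per_param))
        n_tokens = tokens_per_param * n_params
        return n_params, n_tokens

    # Hypothetical budget of 1e24 training FLOPs.
    params, tokens = chinchilla_optimal(1e24)
    print(f"~{params / 1e9:.0f}B parameters, ~{tokens / 1e12:.1f}T tokens")

Under these assumptions a 1e24 FLOP budget lands at roughly 90B parameters and 1.8T tokens; changing the tokens-per-parameter ratio shifts the balance between model size and data.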

Today, these heuristics are in jeopardy. LLaMA-3, for one, was trained on an unreasonably large number of tokens by the Chinchilla recipe - but that is a big part of why it's so good. How much data do we actually need to train LLMs? This talk will shed light on the latest trends in model training and perhaps suggest newer scaling laws.