Large language models (LLMs)

Models trained to predict the next token of text, which turns out to be a powerful way to read, write, and reason over language.

What it is

  • An LLM is a neural network trained on large amounts of text to predict the next token (a word or word-piece) given what came before.
  • That simple objective, at scale, produces models that can summarize, translate, answer questions, write code, and follow instructions.
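
To make the next-token objective concrete, here is a minimal sketch that uses bigram counts in place of a neural network. The corpus, the function names `train_bigram` and `next_token_probs`, and the whitespace tokenization are all illustrative assumptions. Real LLMs learn from far more context than one preceding token and use subword tokenizers.

```python
from collections import Counter, defaultdict

def train_bigram(tokens):
    """Count how often each token follows each preceding token."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts

def next_token_probs(counts, prev):
    """Turn the follow-up counts for `prev` into a probability distribution."""
    total = sum(counts[prev].values())
    return {tok: n / total for tok, n in counts[prev].items()}

# Toy corpus (an assumption for illustration), split on whitespace.
corpus = "the cat sat on the mat and the cat slept".split()
model = train_bigram(corpus)

# "the" is followed by "cat" twice and "mat" once in the corpus.
probs = next_token_probs(model, "the")
print(probs)
```

The idea carries over to a real model: given what came before, assign a probability to each possible next token. An LLM differs in that it conditions on the whole preceding context and learns the distribution with a neural network instead of a count table.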

How it works

  • Most modern LLMs are built on the Transformer architecture, which uses attention to weigh how strongly each token relates to the others.
  • They are pretrained on broad text, then often fine-tuned and aligned (e.g. with human feedback) to follow instructions helpfully and safely.
  • At use time, the model is given a prompt and generates a response token by token.
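
The two mechanisms above, attention and token-by-token generation, can be sketched in plain Python. This is a simplified illustration under stated assumptions. The `attention` function is single-head scaled dot-product attention without learned projections or causal masking. The `generate` loop uses greedy decoding, and `toy_logits` is a hypothetical lookup table standing in for a real model.

```python
import math

def softmax(xs):
    """Convert raw scores into weights that sum to 1."""
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Single-head scaled dot-product attention over lists of vectors."""
    d = len(keys[0])
    outputs = []
    for q in queries:
        # Score each key against the query, scaled by sqrt(dimension).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        # Output is the attention-weighted average of the value vectors.
        outputs.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return outputs

def generate(next_token_logits, prompt, max_new_tokens, eos="<eos>"):
    """Greedy decoding: repeatedly append the highest-scoring next token."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        logits = next_token_logits(tokens)
        tok = max(logits, key=logits.get)
        if tok == eos:
            break
        tokens.append(tok)
    return tokens

def toy_logits(tokens):
    """Hypothetical stand-in for a model: scores depend only on the last token."""
    table = {
        "hello": {"world": 2.0, "<eos>": 0.1},
        "world": {"<eos>": 1.0, "hello": 0.2},
    }
    return table.get(tokens[-1], {"<eos>": 1.0})

attended = attention([[1.0, 0.0]], [[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
generated = generate(toy_logits, ["hello"], max_new_tokens=5)
```

In this example the query matches the first key more closely, so the first value vector receives the larger weight. Production systems typically sample from the distribution (for example with a temperature setting) instead of always taking the top token. Greedy decoding is used here only because it is the simplest case to follow.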

Trade-offs

  • Strong general capability, but no built-in notion of truth: they can be fluent and wrong.
  • Bigger models are more capable but cost more to run; smaller models can be cheaper and faster for narrow tasks.
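
One rough way to see the cost side of this trade-off is to estimate the memory needed just to hold a model's weights: parameter count times bytes per parameter. The sketch below assumes hypothetical model sizes and covers weights only. Activations, caches, and other runtime overhead are not included, so actual requirements are higher.

```python
def weight_memory_gb(n_params, bytes_per_param):
    """Approximate memory for model weights alone, in decimal gigabytes."""
    return n_params * bytes_per_param / 1e9

# Hypothetical model sizes, chosen only for illustration.
for name, n_params in [("small", 1e9), ("medium", 7e9), ("large", 70e9)]:
    at_16_bit = weight_memory_gb(n_params, 2)  # 2 bytes per parameter
    at_8_bit = weight_memory_gb(n_params, 1)   # 1 byte per parameter
    print(f"{name}: {at_16_bit:.0f} GB at 16-bit, {at_8_bit:.0f} GB at 8-bit")
```

Even this crude estimate shows why a smaller model can be the practical choice for a narrow task: the weights of the hypothetical "large" model alone need ten times the memory of the "medium" one.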

When to use it

  • Tasks involving understanding or generating natural language, code, or structured text.
  • When flexibility matters more than guaranteed correctness, and you can put checks around the output.
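
As one example of a check around the output, the sketch below validates that a model's response is well-formed JSON with the expected fields before anything downstream uses it. The function name `parse_model_output`, the field names, and the sample responses are assumptions made for illustration.

```python
import json

def parse_model_output(raw, required_keys):
    """Return the parsed object if it is a JSON object with all required keys, else None."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None  # not valid JSON at all
    if not isinstance(data, dict):
        return None  # valid JSON, but not an object
    if any(key not in data for key in required_keys):
        return None  # object is missing an expected field
    return data

# Hypothetical model responses.
good = parse_model_output('{"sentiment": "positive", "score": 0.9}', ["sentiment", "score"])
bad = parse_model_output("Sure! The sentiment is positive.", ["sentiment", "score"])
```

A caller can retry, fall back, or escalate to a person when the result is `None`. A check like this confirms structure only, not truth. Whether the content is correct still needs separate verification.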

Common pitfalls

  • Treating output as fact without verification.
  • Assuming a bigger model is always the right answer when a smaller one would do.
