Large language models (LLMs)

Models trained to predict the next token of text, which turns out to be a powerful way to read, write, and reason over language.

What it is

An LLM is a neural network trained on large amounts of text to predict the next token (a word or word-piece) given what came before.
That simple objective, at scale, produces models that can summarize, translate, answer questions, write code, and follow instructions.
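To make the next-token objective concrete, here is a minimal sketch that assumes a toy corpus and whitespace tokens. It swaps the neural network for a simple count-based bigram model, so it only illustrates the objective, not how a real LLM is trained. The names `train_bigram` and `next_token_probs` are hypothetical.

```python
from collections import Counter, defaultdict


def train_bigram(tokens):
    """Count how often each token follows each other token."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts


def next_token_probs(counts, prev):
    """Turn the follow-up counts for `prev` into a probability distribution."""
    total = sum(counts[prev].values())
    return {tok: n / total for tok, n in counts[prev].items()}


# Toy corpus split on whitespace (an assumption; real tokenizers use word-pieces).
corpus = "the cat sat on the mat and the cat slept".split()
counts = train_bigram(corpus)

# "the" is followed by "cat" twice and "mat" once in this corpus.
probs = next_token_probs(counts, "the")
print(probs)
```

A real LLM conditions on the whole preceding context rather than only the previous token, and learns its probabilities by gradient descent instead of counting.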

How it works

Most modern LLMs are built on the Transformer architecture, which uses attention to weigh how strongly each token relates to the others in the context.
They are pretrained on broad text, then often fine-tuned and aligned (e.g. with human feedback) to follow instructions helpfully and safely.
At use time, the model is given a prompt and generates a response token by token.
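The token-by-token generation described above can be sketched as a loop: score every vocabulary entry, convert the scores to probabilities, sample one token, append it, and repeat. This is a minimal sketch under stated assumptions: `toy_logits` is a hypothetical, hard-coded stand-in for a trained model, and `VOCAB`, `generate`, and the `<eos>` stop token are illustrative names, not any library's API.

```python
import math
import random

VOCAB = ["<eos>", "hello", "world", "again"]


def toy_logits(context):
    """Hypothetical stand-in for a trained model: one score per vocabulary entry.

    A real LLM would compute these scores with a Transformer over the full context.
    """
    last = context[-1]
    if last == "hello":
        return [0.0, -2.0, 3.0, 0.5]
    if last == "world":
        return [2.0, -1.0, -1.0, 1.0]
    return [3.0, 0.0, 0.0, -1.0]


def softmax(logits, temperature=1.0):
    """Convert scores to probabilities; temperature must be greater than zero."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def generate(prompt, max_new_tokens=10, temperature=1.0, seed=0):
    """Extend a non-empty list of prompt tokens one sampled token at a time."""
    rng = random.Random(seed)
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        probs = softmax(toy_logits(tokens), temperature)
        next_token = rng.choices(VOCAB, weights=probs, k=1)[0]
        if next_token == "<eos>":  # the model signals that it is done
            break
        tokens.append(next_token)
    return tokens


print(generate(["hello"]))
```

Lower temperatures concentrate probability on the highest-scoring token, while higher temperatures make the sampling more varied.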

Trade-offs

LLMs have strong general capability but no built-in notion of truth: their output can be fluent and wrong.
Bigger models are more capable but cost more to run; smaller models can be cheaper and faster for narrow tasks.

When to use it

Tasks involving understanding or generating natural language, code, or structured text.
When flexibility matters more than guaranteed correctness, and you can put checks around the output.
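One way to put checks around the output is to ask for a structured reply and validate it before anything downstream relies on it. The sketch below assumes a sentiment-classification task with a JSON reply. `call_model` is a hypothetical callable that takes a prompt string and returns the model's raw text, and the label set and field names are illustrative choices.

```python
import json

ALLOWED_LABELS = {"positive", "negative", "neutral"}


def validate_sentiment(raw_output):
    """Return the parsed result if it passes every check, else raise ValueError."""
    try:
        data = json.loads(raw_output)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    label = data.get("label")
    if not isinstance(label, str) or label not in ALLOWED_LABELS:
        raise ValueError(f"unexpected label: {label!r}")
    confidence = data.get("confidence")
    is_number = isinstance(confidence, (int, float)) and not isinstance(confidence, bool)
    if not is_number or not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence must be a number in [0, 1]: {confidence!r}")
    return {"label": label, "confidence": float(confidence)}


def classify_with_checks(call_model, text, max_attempts=3):
    """Retry until the (hypothetical) model returns output that passes validation."""
    last_error = None
    for _ in range(max_attempts):
        raw = call_model(f"Classify the sentiment of: {text}")
        try:
            return validate_sentiment(raw)
        except ValueError as exc:
            last_error = exc
    raise RuntimeError(f"no valid output after {max_attempts} attempts: {last_error}")


# Fake model for demonstration: the first reply is fluent but unusable, the second is valid.
replies = iter(["Sure! It is positive.", '{"label": "positive", "confidence": 0.9}'])
result = classify_with_checks(lambda prompt: next(replies), "Great service")
print(result)
```

Validation like this catches malformed or out-of-range output, but it cannot tell whether a well-formed answer is actually true, so factual claims still need separate verification.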

Common pitfalls

Treating output as fact without verification.
Assuming a bigger model is always the right answer when a smaller one would do.

Quick check

At its core, what does a large language model actually do to generate text?

Related concepts

Tokens and context windows
Prompting
Hallucinations

Large language models (LLMs): explained · SDEN