- Explain how large language models generate text
- Describe what tokens and parameters are
- Identify what LLMs can and cannot do well
- Explain why LLMs sometimes produce wrong answers
When you type a question into an AI assistant, you get a human-like response in seconds. But how does it actually work? Large language models (LLMs) are the technology behind these tools. This tutorial explains them in plain English, with no technical background needed.
What is a large language model?
A large language model is a type of AI that has been trained on enormous amounts of text: books, websites, articles, and conversations, amounting to billions of pages of written material.
By reading all this text, the model learns patterns in language. It learns which words tend to follow other words, how sentences are structured, and how ideas connect. Rather than storing the text word for word, it mostly learns the patterns within it.
Think of it like learning to cook by reading thousands of recipes. You would start to notice patterns: roast dinners follow certain steps, and cakes need specific ratios of ingredients. Eventually, you could write a new recipe that makes sense, even for a dish you have never made.
The key insight
An LLM does not understand language the way you do. It predicts the most likely next word based on patterns it has learned. It does this so well that the results often seem like genuine understanding.
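The idea of predicting the next word from learned patterns can be shown with a toy sketch. The Python below counts which word follows which in a tiny made-up corpus, then predicts the most frequent follower. It is only an illustration: real LLMs use neural networks rather than count tables, and the corpus and the function names `learn_patterns` and `predict_next` are assumptions invented for this example.

```python
from collections import Counter, defaultdict

# Tiny made-up corpus, purely for illustration.
corpus = "the cat sat on the mat . the cat ate the fish . the cat sat down ."


def learn_patterns(text):
    """Count how often each word follows each other word."""
    words = text.split()
    followers = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        followers[current][nxt] += 1
    return followers


def predict_next(followers, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    if word not in followers:
        return None
    return followers[word].most_common(1)[0][0]


patterns = learn_patterns(corpus)
print(predict_next(patterns, "the"))  # cat
print(predict_next(patterns, "cat"))  # sat
```

The sketch never checks what a cat is. It only knows that "cat" followed "the" more often than anything else did in the text it saw.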
How LLMs generate text
When you send a prompt to an LLM, here is what happens:
- Your text is broken into tokens. A token is a word or a piece of a word. The word "understanding" might be split into "under" and "standing". The model works with these tokens, not whole words.
- The model reads your tokens. It processes them through layers of mathematical calculations called a neural network. Each layer looks at the tokens in a different way.
- It predicts the next token. Based on everything it has learned, the model calculates how likely each possible next token is and picks one, usually from among the most likely. Then it adds that token to the text and predicts the next one, and the next. This continues until the response is complete.
- You see the result. The tokens are combined back into readable text and shown to you.
This process happens incredibly fast. A model might generate hundreds of words in seconds.
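The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how any real model is built: the vocabulary `VOCAB`, the lookup table `NEXT_TOKEN` (standing in for the neural network), and the `<end>` marker are all hypothetical.

```python
# Hypothetical vocabulary of known pieces; real tokenizers learn theirs from data.
VOCAB = ["under", "standing", "stand", "ing", "un", "der"]

# Hypothetical stand-in for the neural network: it maps the last token to a
# next token. A real model looks at all the tokens so far, not just the last.
NEXT_TOKEN = {"how": "are", "are": "you", "you": "today", "today": "<end>"}


def tokenize(word):
    """Split a word into the longest known pieces, working left to right."""
    tokens = []
    i = 0
    while i < len(word):
        # Pick the longest vocabulary piece that matches at position i,
        # falling back to a single character if nothing matches.
        match = max(
            (piece for piece in VOCAB if word.startswith(piece, i)),
            key=len,
            default=word[i],
        )
        tokens.append(match)
        i += len(match)
    return tokens


def generate(prompt_tokens, max_new_tokens=10):
    """Add one predicted token at a time until the end marker or the limit."""
    tokens = list(prompt_tokens)
    for _ in range(max_new_tokens):
        next_token = NEXT_TOKEN.get(tokens[-1], "<end>")
        if next_token == "<end>":
            break
        tokens.append(next_token)
    return " ".join(tokens)


print(tokenize("understanding"))  # ['under', 'standing']
print(generate(["how"]))          # how are you today
```

The loop in `generate` is the important part: predict one token, add it to the text, and repeat until the response is complete.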
Why LLMs are "large"
The "large" in large language model refers to two things:
- Training data: LLMs are trained on vast amounts of text, on the order of billions of web pages and books.
- Parameters: the model contains billions of numerical values that are adjusted during training. These parameters store the patterns the model has learned. More parameters generally mean the model can handle more complex tasks.
The largest modern LLMs have hundreds of billions of parameters. Training them requires enormous computing power and can cost millions of pounds.
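To get a feel for these numbers, here is a small arithmetic sketch in Python. It counts the parameters in a tiny, simple network and estimates how much memory a given number of parameters would occupy. The layer sizes, the choice of a plain fully connected network, and the figure of 2 bytes per parameter are assumptions for illustration; real LLMs are structured differently and may store parameters at other sizes.

```python
def count_parameters(layer_sizes):
    """Count the weights and biases in a simple fully connected network."""
    total = 0
    for inputs, outputs in zip(layer_sizes, layer_sizes[1:]):
        total += inputs * outputs  # one weight per connection
        total += outputs           # one bias per output
    return total


def memory_gigabytes(num_parameters, bytes_per_parameter=2):
    """Estimate storage size, assuming 2 bytes per parameter by default."""
    return num_parameters * bytes_per_parameter / 1e9


# A toy network: 4 inputs, 8 values in the middle layer, 2 outputs.
print(count_parameters([4, 8, 2]))        # 58

# A hypothetical model with 100 billion parameters.
print(memory_gigabytes(100_000_000_000))  # 200.0
```

Even this toy network has 58 adjustable numbers. Under the 2-byte assumption, a model with 100 billion parameters would need about 200 gigabytes just to hold them.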
What LLMs can and cannot do
LLMs are good at:
- Writing and editing text in many styles
- Answering questions based on their training data
- Translating between languages
- Summarising long documents
- Following instructions and completing structured tasks
LLMs struggle with:
- Facts and accuracy: they predict likely text, not verified facts. They can produce confident-sounding but wrong information.
- Maths: they are language models, not calculators. Complex maths can trip them up.
- Recent events: their knowledge has a cut-off date based on when they were trained.
- Reasoning: they can mimic reasoning patterns but do not truly think through problems the way humans do.
LLMs predict; they do not know
An LLM does not have beliefs, memories, or understanding. It produces text that is statistically likely to be a good response. This is why it sometimes gives wrong answers with complete confidence.
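A short Python sketch shows how a confident wrong answer can arise. The probabilities below are made up for illustration and are not taken from any real model; the prompt and the function name `pick_most_likely` are also assumptions.

```python
# Made-up probabilities for the next token after the prompt
# "The capital of Australia is". Not taken from any real model.
next_token_probs = {"Sydney": 0.55, "Canberra": 0.40, "Melbourne": 0.05}


def pick_most_likely(probs):
    """Return the token with the highest probability."""
    return max(probs, key=probs.get)


answer = pick_most_likely(next_token_probs)
print(answer)  # Sydney
```

In this invented example the pick is "Sydney", although the capital of Australia is Canberra. Nothing in the process checks the answer against the facts: the most likely text is simply chosen.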
Key takeaways
- Large language models learn language patterns from billions of pages of text
- They generate text by predicting a likely next token, one token at a time
- They are very good at producing human-like text but do not truly understand it
- "Large" refers to both the training data and the billions of parameters in the model
- Always verify important information from LLMs, because they predict rather than know