How Do LLMs Use Context?
Context windows, attention, and why better input compounds

Every time you send a message to an AI tool, something happens before it responds. The model doesn't just read your question - it reads everything available to it: the conversation so far, any instructions it was given, documents you've uploaded, anything else in its working space.
That entire package is context. Understanding how AI tools actually use it explains why they behave the way they do - and why better context produces dramatically better results.
What does an LLM actually see?
When you send a message, the model receives a single block of text containing everything relevant to your request:
- Standing instructions for the session (who you are, how to respond, what to focus on)
- The full conversation history up to that point
- Your current message
- Any documents, data, or tool outputs that have been loaded
The model reads all of it, generates a response, and that response gets added to the history. On your next message, it reads everything again - including its own previous response.
No selective recall. No searching stored memories. The model processes what's in front of it, every time.
Does an LLM actually remember things?
No. This is one of the most counterintuitive things about how AI tools work.
When Claude or ChatGPT seems to "remember" something you said three messages ago, it's not retrieving a stored fact. It's re-reading the entire conversation from the beginning. Every response involves processing the full history from scratch.
This creates a useful illusion of memory - as long as the conversation fits in the context window. Once it grows long enough that earlier messages fall outside that limit, those details are gone. The model can no longer see them.
What is a context window?
The maximum amount of text an LLM can process in a single request. Think of it as the model's desk: everything on the desk is visible and usable; anything that doesn't fit gets left off.
Context windows are measured in tokens - roughly three-quarters of a word each. A 200,000-token window holds about 150,000 words, or 500 pages. Modern models have gotten significantly larger; several crossed the 1-million-token mark in 2024-2025.
For everyday use, limits rarely cause problems. They matter with very long documents, extended research sessions, or complex agent workflows that accumulate data across many steps.
Why does context quality matter so much?
The model works with what's in the context window. Vague context, vague output. Accurate and specific context, accurate and specific output.
Ask an AI tool "what should I do next?" with no context. Generic suggestions. Ask the same question after giving it your role, your current project, the decision you're facing, and your constraints - the answer changes completely. Not because the model got smarter. Because it has more to work with.
This is why the same AI tool can feel dramatically different depending on setup. The model is identical. The context is different.
What kinds of context help most?
Who you are - role, background, expertise. Calibrates the level of detail and frame of reference for every response. An AI that knows you're a technical founder answers differently than one assuming you're a student.
What you're working on - current project, specific problem, decision in front of you. Without it, responses optimize for a generic version of your question, not your actual situation.
How you want responses structured - length, format, tone, what to avoid. Set once as standing instructions, not restated every session.
Relevant documents and data - anything the model needs to give a specific answer: specs, briefs, research, previous work.
The first two - who you are and what you're doing - are personal context. Also the most commonly missing, because there's no automatic way for an AI tool to know them.
What happens when context is missing or wrong?
The model fills in the gaps. Makes assumptions based on the most likely interpretation of your question - optimized for the average user, not for you.
Results are usually technically correct but imprecise. Generic advice. Explanations at the wrong level. Recommendations that don't fit your situation. The model isn't wrong - it's just answering a slightly different question than the one you meant to ask.
Incorrect context is worse. An old role, a finished project, a shifted priority - the model applies it confidently to every response.
How do you give an LLM better context?
Structured personal context that loads automatically at session start. Not typed from scratch each time - extracted from the sources that already reflect who you are, served to whatever tool you're using.
Re-explaining yourself every session vs. having accurate context pre-loaded isn't a marginal difference. It's the difference between a general-purpose tool and one that consistently gives relevant, specific, immediately useful answers.
→ What personal context looks like: What Is Personal Context for AI?
→ How to build it: How to Build Personal Context for AI