Minimum AI Knowledge for Practical Use, Part 3: Tokens and Context Windows
A practical explanation of tokens, context windows, context selection, and how to work with larger codebases without overwhelming the model.
In the previous post, I wrote that context matters more than prompt wording.
The next question is obvious:
"Should I just give the model as much context as possible?"
The answer is partly yes, but mostly: be careful.
AI models have a limit on how much information they can consider at once. This is where tokens and context windows come in.
Tokens Are Small Units of Text
A token is a small unit of text that the model reads and writes.
It is not always the same as a word. A word, symbol, space, or part of a word can become a token depending on the tokenizer.
For practical use, this mental model is enough:
text
-> smaller chunks the model can process
-> input and output are handled through those chunks
Tokens matter for two reasons:
1. There is a limit to how much input and output can fit.
2. More tokens can mean more cost and more latency.
So tokens are not just a detail of model internals. They affect speed, cost, and quality.
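To make "a token is not always a word" concrete, here is a minimal sketch. It is not a real tokenizer: `toy_chunks` is a hypothetical splitter that treats words, punctuation marks, and whitespace runs as separate chunks, and `estimate_tokens` uses an assumed ratio of about four characters per token. Real tokenizers split differently, and the ratio varies by model and language.

```python
import math
import re

# Toy splitter, NOT a real tokenizer: each word, each punctuation mark,
# and each run of whitespace becomes one chunk.
CHUNK_PATTERN = re.compile(r"\w+|[^\w\s]|\s+")


def toy_chunks(text: str) -> list[str]:
    """Split text into word, punctuation, and whitespace chunks."""
    return CHUNK_PATTERN.findall(text)


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough size estimate. chars_per_token is an assumed rule of thumb."""
    if not text:
        return 0
    return math.ceil(len(text) / chars_per_token)


sample = "deleteComment(id);"
print(toy_chunks(sample))       # one "word" of code becomes several chunks
print(estimate_tokens(sample))  # rough size, good enough for budgeting
```

An estimate like this is only useful for deciding whether something is small, large, or far too large. For exact counts, use the tokenizer that belongs to the model you are calling.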
A Context Window Is the Workspace
The context window is the total amount of input and output, measured in tokens, that the model can handle in one run.
For development work, it may include:
| Context Item | Example |
|---|---|
| User request | "Add comment deletion" |
| Conversation history | Previously agreed design decisions |
| File contents | API handler, frontend script, CSS |
| Logs | Build error, API response |
| Command results | Test pass or failure |
| Current output | The answer being generated |
One detail is easy to forget: output also uses the context window.
If the input fills most of the window, there is little room left for a useful answer.
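The arithmetic is simple, but it helps to write it down. This is a sketch with made-up numbers: the 8,000-token window and the function names are assumptions for illustration, not the limits of any particular model.

```python
def output_room(context_window: int, input_tokens: int) -> int:
    """Tokens left for the answer once the input is counted."""
    return max(0, context_window - input_tokens)


def max_input(context_window: int, reserved_output: int) -> int:
    """Largest input that still leaves reserved_output tokens for the answer."""
    return max(0, context_window - reserved_output)


WINDOW = 8_000  # hypothetical window size, in tokens

print(output_room(WINDOW, 7_500))  # 500 tokens left for the answer
print(max_input(WINDOW, 2_000))    # 6000 tokens of input at most
```

Reserving output space first, then filling what remains with input, tends to be safer than doing it the other way around.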
Why More Is Not Always Better
It is tempting to paste the whole codebase.
But that can cause problems:
1. Important files get buried under irrelevant ones.
2. Old and new information can conflict.
3. Cost and latency go up.
4. The answer can become unfocused.
If I am fixing comment deletion, I do not need every Markdown post, every CSS rule, and every unrelated migration in the first pass.
The model needs the files that can change the decision.
How I Decide What to Include
I usually use this filter:
| Item | Include? | Rule |
|---|---|---|
| Direct edit target | Yes | The model may modify it |
| Call chain | Yes | It affects behavior |
| Error logs | Yes | They narrow the cause |
| Config files | Maybe | Include if build/deploy/auth depends on them |
| Old conversation | Summarize | Keep decisions, remove noise |
| Unrelated files | No | Exclude unless they affect the task |
The core question is:
Can this information change the model's decision?
If not, it can usually stay out.
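The filter table can be written as a small decision function. This is only a sketch of my own rule of thumb: the `Candidate` type, the `kind` labels, and the `affects_task` flag are names invented for this example.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    INCLUDE = "include"
    SUMMARIZE = "summarize"
    EXCLUDE = "exclude"


@dataclass(frozen=True)
class Candidate:
    name: str
    # One of: "edit_target", "call_chain", "error_log",
    # "config", "old_conversation", "unrelated" (labels are hypothetical).
    kind: str
    # True when build, deploy, auth, or the task itself depends on this item.
    affects_task: bool = False


ALWAYS_INCLUDE = {"edit_target", "call_chain", "error_log"}


def decide(candidate: Candidate) -> Decision:
    """Apply the filter table: can this item change the model's decision?"""
    if candidate.kind in ALWAYS_INCLUDE:
        return Decision.INCLUDE
    if candidate.kind == "old_conversation":
        return Decision.SUMMARIZE
    # Config files and unrelated files come in only when the task depends on them.
    return Decision.INCLUDE if candidate.affects_task else Decision.EXCLUDE


print(decide(Candidate("src/api/comments.js", "edit_target")))
print(decide(Candidate("deploy config", "config", affects_task=False)))
```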
How I Split the Token Budget
Even with a large context window, I roughly prioritize the space like this:
| Priority | Information | Why |
|---|---|---|
| 1 | Goal and done criteria | They define direction and stopping point |
| 2 | Direct edit targets and call flow | They affect implementation decisions most |
| 3 | Error logs and reproduction steps | They narrow the cause |
| 4 | Constraints and do-not rules | They reduce cost, security, and ops mistakes |
| 5 | Summary of previous decisions | Only keep decisions that still matter |
I try not to include these in the first pass:
- repeated copies of the same log
- designs that were already abandoned
- full file lists unrelated to the change
- memories phrased as "I think it was..."
- decorative explanation that does not affect verification
Being able to read a long input is not the same as reliably finding the important parts inside it.
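One way to apply the priority table is a greedy fill: take items in priority order and skip anything that no longer fits. The sketch below assumes that approach. The token counts and the 5,000-token budget are invented, and `pack` is a hypothetical helper, not part of any tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    priority: int  # 1 is most important, as in the table above
    label: str
    tokens: int    # estimated size; the numbers below are made up


def pack(items: list[Item], budget: int) -> tuple[list[Item], int]:
    """Fill the budget in priority order, skipping items that do not fit."""
    chosen: list[Item] = []
    remaining = budget
    for item in sorted(items, key=lambda i: i.priority):
        if item.tokens <= remaining:
            chosen.append(item)
            remaining -= item.tokens
    return chosen, remaining


items = [
    Item(5, "Summary of previous decisions", 800),
    Item(1, "Goal and done criteria", 200),
    Item(3, "Error logs and reproduction steps", 1_500),
    Item(2, "Direct edit targets and call flow", 3_000),
    Item(4, "Constraints and do-not rules", 300),
]

chosen, remaining = pack(items, budget=5_000)
for item in chosen:
    print(item.priority, item.label, item.tokens)
print("left over:", remaining)
```

With these numbers the first four priorities fit exactly and the summary is left out, which is a signal to shorten it rather than to drop the goal or the logs.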
Summaries Are Useful, But Not Perfect
Long documents can be summarized.
For example:
Design summary:
- Posts are managed as document files.
- Only published posts are public.
- Drafts are visible only in a private review area.
- Comments, likes, and views are handled by separate APIs.
- The main homepage should not link to the blog yet.
This can be enough for many tasks.
But summaries can drop important constraints. For sensitive work like auth, billing, deploys, or deletion, I prefer checking the original file too.
Think in Flows, Not Just Files
For coding tasks, file selection should follow the behavior flow.
For comment deletion, the flow might be:
1. Article page renders comments
2. User clicks delete
3. Frontend sends API request
4. The database is updated
5. UI refreshes
That tells me which files matter:
public/assets/article.js
src/api/comments.js
migrations/comment-schema.sql
scripts/build-posts.mjs
The exact list changes by project, but the method stays the same.
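The method can be sketched as a lookup from flow steps to files. The file paths are the ones listed above, but which file belongs to which step is my assumption for this example. In a real project that mapping comes from reading the code.

```python
# Each flow step paired with the files assumed to implement it.
# The step-to-file mapping is a guess made for illustration.
FLOW: list[tuple[str, list[str]]] = [
    ("Article page renders comments",
     ["scripts/build-posts.mjs", "public/assets/article.js"]),
    ("User clicks delete", ["public/assets/article.js"]),
    ("Frontend sends API request",
     ["public/assets/article.js", "src/api/comments.js"]),
    ("The database is updated",
     ["src/api/comments.js", "migrations/comment-schema.sql"]),
    ("UI refreshes", ["public/assets/article.js"]),
]


def files_for_flow(flow: list[tuple[str, list[str]]]) -> list[str]:
    """Collect each file once, in the order the flow first touches it."""
    seen: list[str] = []
    for _step, paths in flow:
        for path in paths:
            if path not in seen:
                seen.append(path)
    return seen


for path in files_for_flow(FLOW):
    print(path)
```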
Practical Habits
Good context management is not about being short. It is about removing noise.
Useful habits:
- Do not paste the same log repeatedly.
- Summarize old decisions.
- Keep unrelated files out of the first pass.
- Separate "decided" from "open question."
- Let the agent search first, then read targeted files.
Large codebases are easier to handle when the model can search, inspect, edit, and verify in steps.
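"Search first, then read" can be sketched with a keyword search over an in-memory stand-in for a repository. The file contents and the `public/assets/site.css` path are invented for this example. A real agent would use its own search tool or something like grep.

```python
def search(files: dict[str, str], keyword: str) -> list[str]:
    """Return the paths whose contents mention the keyword (case-insensitive)."""
    needle = keyword.lower()
    return sorted(path for path, text in files.items() if needle in text.lower())


# Hypothetical repository contents, reduced to one line per file.
repo = {
    "src/api/comments.js": "export async function deleteComment(id) {}",
    "public/assets/article.js": "button.onclick = () => deleteComment(id);",
    "public/assets/site.css": ".comment { margin: 0; }",
}

# Step 1: search. Step 2: read only the files that matched.
targets = search(repo, "deleteComment")
for path in targets:
    print(path, "->", repo[path])
```

Only the two files that mention `deleteComment` are read. The stylesheet stays out of the context until something points to it.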
Signs of Too Little or Too Much Context
These answers often mean the model needs more context:
- It only says "generally..."
- It cannot identify the files involved.
- It asks again about constraints.
- It suggests a cause that does not match the logs.
These answers often mean the context is too noisy:
- It keeps mentioning unrelated files.
- It mixes old and current decisions.
- The answer is long but the execution order is vague.
- Explanation grows faster than verification criteria.
In that case, cutting the input down can help more than adding more files.
Summary
Tokens and context windows are practical engineering constraints.
1. Tokens are the chunks of text models read and write.
2. A context window is the workspace available for one run.
3. Relevant context matters more than maximum context.
4. Summaries help, but sensitive work still needs original sources.
5. For code, select context by behavior flow.
In the next post, I will turn this into a concrete input package for coding tasks.