2026.05.17 AI AI Basics en

Minimum AI Knowledge for Practical Use, Part 3: Tokens and Context Windows

A practical explanation of tokens, context windows, context selection, and how to work with larger codebases without overwhelming the model.

#AI #LLM #Token #Context Window #AI Basics

Kouji Operations Notes

8 min read 2026.05.17

In the previous post, I wrote that context matters more than prompt wording.

The next question is obvious:

"Should I just give the model as much context as possible?"

The answer is partly yes, but mostly: be careful.

AI models have a limit on how much information they can consider at once. This is where tokens and context windows come in.

[Figure: Token budget triage for deciding what to include, summarize, or leave out]

Tokens Are Small Units of Text

A token is a small unit of text that the model reads and writes.

It is not always the same as a word. A word, symbol, space, or part of a word can become a token depending on the tokenizer.

For practical use, this mental model is enough:

text
-> smaller chunks the model can process
-> input and output are handled through those chunks

Tokens matter for two reasons:

1. There is a limit to how much input and output can fit.
2. More tokens can mean more cost and more latency.

So tokens are not just a detail of model internals. They affect speed, cost, and quality.
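
To get a feel for sizes before sending anything, a rough estimate is often enough. The sketch below assumes about four characters per token, which is only a commonly cited rule of thumb for English text; real tokenizers differ by model and language, and the function name is my own.

```python
import math

# Assumption: ~4 characters per token is a rough rule of thumb for English.
# It is not how any real tokenizer works; use the model's own tokenizer
# when you need exact counts.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Return a rough token estimate based on character count."""
    if not text:
        return 0
    return math.ceil(len(text) / CHARS_PER_TOKEN)


if __name__ == "__main__":
    request = "Add comment deletion"
    print(f"{len(request)} characters, roughly {estimate_tokens(request)} tokens")
```

An estimate like this is good for triage (is this log 500 tokens or 50,000?), not for billing.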

A Context Window Is the Workspace

The context window is the total amount of input and output, measured in tokens, that the model can handle in one run.

For development work, it may include:

Context Item	Example
User request	"Add comment deletion"
Conversation history	Previously agreed design decisions
File contents	API handler, frontend script, CSS
Logs	Build error, API response
Command results	Test pass or failure
Current output	The answer being generated

One detail is easy to forget: output also uses the context window.

If the input fills most of the window, there is little room left for a useful answer.
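
The arithmetic is simple, but writing it down makes the trade-off visible. This is a minimal sketch; the window size and the reserved output size are made-up numbers, not the limits of any particular model.

```python
def input_budget(window_tokens: int, reserved_output_tokens: int) -> int:
    """Tokens left for input after reserving room for the answer."""
    if reserved_output_tokens > window_tokens:
        raise ValueError("reserved output exceeds the context window")
    return window_tokens - reserved_output_tokens


def fits(input_tokens: int, window_tokens: int, reserved_output_tokens: int) -> bool:
    """True if the input still leaves the reserved room for output."""
    return input_tokens <= input_budget(window_tokens, reserved_output_tokens)


if __name__ == "__main__":
    # Hypothetical numbers for illustration only.
    window = 100_000
    reserved = 8_000
    print(input_budget(window, reserved))
    print(fits(95_000, window, reserved))
```

Reserving the output space first, then filling what is left, keeps a long input from crowding out the answer.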

Why More Is Not Always Better

It is tempting to paste the whole codebase.

But that can cause problems:

1. Important files get buried under irrelevant ones.
2. Old and new information can conflict.
3. Cost and latency go up.
4. The answer can become unfocused.

If I am fixing comment deletion, I do not need every Markdown post, every CSS rule, and every unrelated migration in the first pass.

The model needs the files that can change the decision.

How I Decide What to Include

I usually use this filter:

Item	Include?	Rule
Direct edit target	Yes	The model may modify it
Call chain	Yes	It affects behavior
Error logs	Yes	They narrow the cause
Config files	Maybe	Include if build/deploy/auth depends on them
Old conversation	Summarize	Keep decisions, remove noise
Unrelated files	No	Exclude unless they affect the task

The core question is:

Can this information change the model's decision?

If not, it can usually stay out.
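
The filter table can be written as a small lookup. This is only a sketch of my own rule of thumb: the category names are hypothetical labels for the rows above, and `affects_task` stands in for the core question of whether the item can change the decision.

```python
from enum import Enum


class Decision(Enum):
    INCLUDE = "include"
    MAYBE = "maybe"
    SUMMARIZE = "summarize"
    EXCLUDE = "exclude"


# Hypothetical category labels, one per row of the filter table.
RULES = {
    "edit_target": Decision.INCLUDE,
    "call_chain": Decision.INCLUDE,
    "error_log": Decision.INCLUDE,
    "config": Decision.MAYBE,
    "old_conversation": Decision.SUMMARIZE,
    "unrelated": Decision.EXCLUDE,
}


def decide(kind: str, affects_task: bool = False) -> Decision:
    """Apply the table; unknown kinds are treated as unrelated (an assumption)."""
    decision = RULES.get(kind, Decision.EXCLUDE)
    # "Maybe" and "No" rows flip to include when the item affects the task.
    if decision in (Decision.MAYBE, Decision.EXCLUDE) and affects_task:
        return Decision.INCLUDE
    return decision


if __name__ == "__main__":
    print(decide("config"))
    print(decide("config", affects_task=True))
```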

How I Split the Token Budget

Even with a large context window, I roughly prioritize the space like this:

Priority	Information	Why
1	Goal and done criteria	They define direction and stopping point
2	Direct edit targets and call flow	They affect implementation decisions most
3	Error logs and reproduction steps	They narrow the cause
4	Constraints and do-not rules	They reduce cost, security, and ops mistakes
5	Summary of previous decisions	Only keep decisions that still matter

I try not to include these in the first pass:

- repeated copies of the same log
- designs that were already abandoned
- full file lists unrelated to the change
- memories phrased as "I think it was..."
- decorative explanation that does not affect verification

Being able to read a long input is not the same as reliably finding the important parts inside it.
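
One way to apply the priority table is a simple greedy pass: take items in priority order and skip anything that no longer fits. The sketch below assumes token sizes have already been estimated; the item names and sizes are invented for illustration, and a real workflow might summarize an item instead of skipping it.

```python
from dataclasses import dataclass


@dataclass
class ContextItem:
    name: str
    priority: int  # 1 = most important, matching the table above
    tokens: int    # estimated size


def pack(items: list[ContextItem], budget: int) -> tuple[list[ContextItem], int]:
    """Greedily pick items by priority until the token budget is used up."""
    chosen: list[ContextItem] = []
    used = 0
    for item in sorted(items, key=lambda i: i.priority):
        if used + item.tokens <= budget:
            chosen.append(item)
            used += item.tokens
    return chosen, used


if __name__ == "__main__":
    # Hypothetical sizes, for illustration only.
    items = [
        ContextItem("summary of previous decisions", 5, 300),
        ContextItem("goal and done criteria", 1, 100),
        ContextItem("edit targets and call flow", 2, 700),
        ContextItem("error logs", 3, 400),
    ]
    chosen, used = pack(items, budget=1_000)
    for item in chosen:
        print(item.priority, item.name, item.tokens)
    print("used:", used)
```

Greedy packing is not optimal in general, but it matches the intent here: the top priorities go in first, and lower ones only use what is left.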

Summaries Are Useful, But Not Perfect

Long documents can be summarized.

For example:

Design summary:
- Posts are managed as document files.
- Only published posts are public.
- Drafts are visible only in a private review area.
- Comments, likes, and views are handled by separate APIs.
- The main homepage should not link to the blog yet.

This can be enough for many tasks.

But summaries can drop important constraints. For sensitive work like auth, billing, deploys, or deletion, I prefer checking the original file too.
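
That preference can be reduced to a one-line check. This is a minimal sketch: the area labels come from the sentence above, while the function name and the idea of tagging a task with areas are my own assumptions.

```python
# Areas where I prefer the original file over a summary.
SENSITIVE_AREAS = {"auth", "billing", "deploy", "deletion"}


def needs_original(task_areas) -> bool:
    """True if the task touches any area where a summary alone is risky."""
    return bool(SENSITIVE_AREAS & set(task_areas))


if __name__ == "__main__":
    print(needs_original({"comments", "deletion"}))
    print(needs_original({"styling"}))
```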

Think in Flows, Not Just Files

For coding tasks, file selection should follow the behavior flow.

For comment deletion, the flow might be:

1. Article page renders comments
2. User clicks delete
3. Frontend sends API request
4. The database is updated
5. UI refreshes

That tells me which files matter:

public/assets/article.js
src/api/comments.js
migrations/comment-schema.sql
scripts/build-posts.mjs

The exact list changes by project, but the method stays the same.
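
The method can be sketched as a walk over the flow that collects each step's files once, in the order they first appear. The step-to-file mapping below is a hypothetical guess for illustration; in a real project it comes from reading or searching the code.

```python
# Hypothetical mapping from flow steps to the files that may be involved.
FLOW = [
    ("Article page renders comments",
     ["scripts/build-posts.mjs", "public/assets/article.js"]),
    ("User clicks delete", ["public/assets/article.js"]),
    ("Frontend sends API request",
     ["public/assets/article.js", "src/api/comments.js"]),
    ("The database is updated",
     ["src/api/comments.js", "migrations/comment-schema.sql"]),
    ("UI refreshes", ["public/assets/article.js"]),
]


def files_for_flow(flow) -> list[str]:
    """Collect files in first-seen order, without duplicates."""
    seen: dict[str, None] = {}
    for _step, files in flow:
        for path in files:
            seen.setdefault(path, None)
    return list(seen)


if __name__ == "__main__":
    for path in files_for_flow(FLOW):
        print(path)
```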

Practical Habits

Good context management is not about being short. It is about removing noise.

Useful habits:

- Do not paste the same log repeatedly.
- Summarize old decisions.
- Keep unrelated files out of the first pass.
- Separate "decided" from "open question."
- Let the agent search first, then read targeted files.

Large codebases are easier to handle when the model can search, inspect, edit, and verify in steps.
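
The first habit is easy to automate. The sketch below collapses exact duplicate log lines into one line with a count; the sample log lines are made up, and near-duplicates (for example, lines that differ only by timestamp) would need extra normalization that is not shown here.

```python
from collections import Counter


def collapse_repeats(lines: list[str]) -> list[str]:
    """Keep the first occurrence of each line and append a repeat count."""
    counts = Counter(lines)
    seen: set[str] = set()
    result: list[str] = []
    for line in lines:
        if line in seen:
            continue
        seen.add(line)
        n = counts[line]
        result.append(line if n == 1 else f"{line} (x{n})")
    return result


if __name__ == "__main__":
    # Made-up log lines for illustration.
    log = ["Build failed", "Build failed", "Retrying", "Build failed"]
    for line in collapse_repeats(log):
        print(line)
```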

Signs of Too Little or Too Much Context

These answers often mean the model needs more context:

- It only says "generally..."
- It cannot identify the files involved.
- It asks again about constraints.
- It suggests a cause that does not match the logs.

These answers often mean the context is too noisy:

- It keeps mentioning unrelated files.
- It mixes old and current decisions.
- The answer is long but the execution order is vague.
- Explanation grows faster than verification criteria.

In that case, cutting the input down can help more than adding more files.

Summary

Tokens and context windows are practical engineering constraints.

1. Tokens are the chunks of text models read and write.
2. A context window is the workspace available for one run.
3. Relevant context matters more than maximum context.
4. Summaries help, but sensitive work still needs original sources.
5. For code, select context by behavior flow.

In the next post, I will turn this into a concrete input package for coding tasks.
