2026.05.19 AI AI Basics en

Minimum AI Knowledge for Practical Use, Part 5: RAG and Search Indexes

A practical introduction to RAG, search indexes, embeddings, chunking, retrieval quality, and how they fit into developer workflows.

#AI #RAG #Embedding #Search #AI Basics

Kouji Operations Notes

9 min read 2026.05.19

In the previous post, I wrote about giving AI a better input package.

But preparing that package by hand every time does not scale.

As documents, code, logs, and decisions pile up, it becomes harder to know what to include. Sometimes the problem is not writing the prompt. The problem is finding the right material.

That is where RAG comes in.

RAG stands for Retrieval-Augmented Generation. The name sounds heavier than the idea.

[Figure: RAG flow from question to retrieval, context pack, and verified answer]

RAG Means "Search, Then Answer"

A basic LLM flow looks like this:

user question
-> model
-> answer

With RAG, a retrieval step is added:

user question
-> search relevant documents
-> add search results to context
-> model
-> answer

For example:

What was the rule for keeping private drafts out of public indexes?

If the model does not have the project context, it may guess.

If retrieval can find design notes and generator code, the answer can be grounded in facts:

- Only public-status documents enter public listings
- Drafts are visible only in private review screens
- RSS, sitemap, and public data must not include draft identifiers
- Generated public output is checked for draft identifiers

That is the practical value of RAG.
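
The search-then-answer flow above can be sketched in a few lines of Python. This is a minimal illustration, not a real system: the `DOCS` corpus, the word-overlap scoring, and the `model` callable are all assumptions made up for this example, and `echo_model` is only a stand-in for an actual LLM call.

```python
import re
from typing import Callable

# Hypothetical in-memory corpus; titles and texts are illustrative only.
DOCS = {
    "publishing-rules": "Only public-status documents enter public listings.",
    "draft-review": "Drafts are visible only in private review screens.",
    "comment-policy": "Deleted comments display as a placeholder.",
}

def words(text: str) -> set[str]:
    """Lowercase the text and split it into alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def search(question: str, k: int = 2) -> list[tuple[str, str]]:
    """Rank documents by naive word overlap with the question."""
    q = words(question)
    scored = [(len(q & words(text)), title, text) for title, text in DOCS.items()]
    scored = [item for item in scored if item[0] > 0]
    scored.sort(reverse=True)
    return [(title, text) for _, title, text in scored[:k]]

def build_prompt(question: str, hits: list[tuple[str, str]]) -> str:
    """Put the retrieved texts in front of the question as labeled sources."""
    sources = "\n".join(f"[{title}] {text}" for title, text in hits)
    return f"Answer using only these sources.\n{sources}\n\nQuestion: {question}"

def answer(question: str, model: Callable[[str], str]) -> str:
    """Search first, then hand the question plus sources to the model."""
    return model(build_prompt(question, search(question)))

# Stand-in for a real model call: it just returns the prompt it was given.
echo_model = lambda prompt: prompt
print(answer("Where are drafts visible?", echo_model))
```

The point of the sketch is the shape: the model never sees the whole corpus, only what `search` decided to pass along.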

Why Search Indexes Matter

If there are only a few documents, opening files manually is fine.

But once there are many posts, docs, logs, and code files, we need a way to find relevant pieces quickly.

A search index is the structure that makes that possible.

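Keyword search relies on an index, typically an inverted index that maps each term to the documents containing it. Semantic search using embeddings relies on an index too, in that case an index over vectors.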

Search Type	Good At	Weak At
Keyword search	Exact words, function names, error messages	Different wording
Semantic search	Meaning-based matches	Exact identifiers
Hybrid search	Combining both	More implementation work

In development work, both are useful.

Error messages and function names are perfect for keyword search. Similar design decisions are often easier to find with semantic search.
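
To make the keyword side concrete, here is a small sketch of an inverted index, the structure most keyword search is built on. The file names and texts are hypothetical, and a real search engine adds ranking, stemming, and much more.

```python
import re
from collections import defaultdict

def tokenize(text: str) -> list[str]:
    """Lowercase and split into terms; underscores are kept so identifiers stay whole."""
    return re.findall(r"[a-z0-9_]+", text.lower())

def build_index(docs: dict[str, str]) -> dict[str, set[str]]:
    """Map each term to the set of document ids that contain it."""
    index: dict[str, set[str]] = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index

def lookup(index: dict[str, set[str]], query: str) -> set[str]:
    """Return ids of documents that contain every term in the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result

# Hypothetical documents for illustration.
docs = {
    "notes/errors.md": "TypeError raised in render_sitemap when draft_id is None",
    "notes/design.md": "Drafts must stay out of the public sitemap",
}
index = build_index(docs)
print(lookup(index, "render_sitemap"))     # exact identifier: finds errors.md
print(lookup(index, "unpublished posts"))  # different wording: finds nothing
```

The second lookup shows the weakness from the table: the design note is about unpublished posts, but it never uses those words, so keyword search misses it.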

Embeddings Turn Text Into Vectors

RAG often uses embeddings.

An embedding turns text into a vector:

"private drafts must not appear in public indexes"
-> [0.12, -0.03, 0.44, ...]

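The vector represents meaning in a way that can be compared with other vectors: texts with similar meaning end up close together, and closeness is commonly measured with cosine similarity.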

The retrieval flow becomes:

question embedding
-> find nearby document chunks
-> inject top results into context
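
As a rough sketch of the "find nearby" step, the example below ranks stored vectors by cosine similarity to a query vector. The three-dimensional vectors are hand-made for illustration and do not come from any real embedding model; real embeddings have hundreds or thousands of dimensions.

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity: 1.0 means same direction, 0.0 means unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k(query_vec: list[float], chunks: list[tuple[str, list[float]]], k: int = 2) -> list[str]:
    """Return the texts of the k chunks whose vectors are closest to the query."""
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, c[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

# Toy vectors, made up by hand (assumption: first axis is roughly "publishing").
CHUNKS = [
    ("Drafts are excluded from public listings", [0.9, 0.1, 0.0]),
    ("Comments use password-based deletion", [0.0, 0.2, 0.9]),
    ("RSS and sitemap must not include draft identifiers", [0.8, 0.3, 0.1]),
]
query_vec = [1.0, 0.2, 0.0]  # pretend embedding of a question about drafts

for text in top_k(query_vec, CHUNKS):
    print(text)
```

With these toy numbers the two draft-related texts rank first and the comment text is left out.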

A chunk is a piece of a larger document.

If a document is embedded as one huge block, the meaning can become too broad. If it is split into tiny sentences, useful context may be lost. Choosing chunk size is part of RAG design.
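
One common way to balance those two failure modes is fixed-size chunks with a small overlap, so a sentence cut at a boundary still appears whole in one of the neighbors. The sketch below splits on words; the default `size` and `overlap` values are arbitrary assumptions, not recommendations.

```python
def chunk_words(text: str, size: int = 50, overlap: int = 10) -> list[str]:
    """Split text into chunks of up to `size` words, each sharing `overlap` words with the previous one."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last chunk already reached the end of the text
    return chunks

sample = " ".join(str(i) for i in range(10))
print(chunk_words(sample, size=4, overlap=1))
# ['0 1 2 3', '3 4 5 6', '6 7 8 9']
```

Splitting on headings or paragraphs often works better for prose than a fixed word count, but the overlap idea carries over.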

Common RAG Problems

RAG is not automatically correct.

Common problems include:

1. Retrieving the wrong documents
2. Adding too many documents
3. Mixing old and current information
4. Failing to cite or use the retrieved material
5. Filling gaps with unsupported claims

So RAG needs verification too.

Check	Question
Retrieval quality	Did the right documents rank high?
Freshness	Did old docs override current ones?
Source	What documents support the answer?
Coverage	Were important constraints missed?
Hallucination	Did the answer invent facts?

RAG is not "AI magically knows my documents." It is a search-and-answer pipeline that needs quality control.

[Figure: RAG quality checks across retrieval relevance, freshness, sources, coverage, and hallucination]

What to Prepare Before RAG

Starting with RAG does not mean starting with a vector database.

The first useful step is document hygiene.

Preparation	Check
Document scope	Is one document focused on one topic?
Title	Does the search result make sense from the title?
Date	Can old and current decisions be separated?
Tags	Can related material be found again?
Status	Are draft, deprecated, and published states clear?

If documents are messy, embeddings will not magically make them useful.

A good RAG system starts with findable documents.
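
A hygiene pass like the table above can start as a tiny script. This sketch assumes each document's metadata has already been parsed into a dict; the field names and the allowed status values mirror the table but are otherwise my own choice for illustration.

```python
REQUIRED = ("title", "date", "tags", "status")
ALLOWED_STATUS = {"draft", "deprecated", "published"}

def hygiene_issues(meta: dict) -> list[str]:
    """List the metadata problems that would make a document hard to retrieve."""
    issues = [f"missing {field}" for field in REQUIRED if not meta.get(field)]
    status = meta.get("status")
    if status and status not in ALLOWED_STATUS:
        issues.append(f"unknown status: {status}")
    return issues

print(hygiene_issues({"title": "Draft rules", "status": "wip"}))
# ['missing date', 'missing tags', 'unknown status: wip']
```

Running something like this over the whole document folder gives a to-do list before any search tooling is involved.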

Start Small

Personal projects do not need a huge vector database on day one.

A small approach can work:

1. Collect Markdown/docs files.
2. Extract title, slug, tags, and summary.
3. Use keyword search first.
4. Add embedding search later if needed.
5. Pass retrieved results as evidence.

The first useful step is not infrastructure.

It is making documents findable.

Clear titles, focused tags, and well-scoped posts improve retrieval quality more than people expect.

How I Would Start Small

For a personal project, I would start like this:

1. Build a document inventory.
2. Extract title, date, tags, status, and summary.
3. Use keyword search first.
4. Pass only the top 3-5 results to the model.
5. Keep the document titles used in the answer.
6. When retrieval is wrong, improve titles/tags/summaries.
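
Here is a minimal sketch of that loop, assuming a hand-built inventory list. The documents, the field weights, and the `retrieve` function are all hypothetical; the weights in particular are arbitrary starting values to tune.

```python
import re

# Hypothetical document inventory (step 1 and 2).
INVENTORY = [
    {"title": "Draft visibility rules", "tags": ["drafts", "publishing"], "status": "published",
     "summary": "Drafts stay out of public listings, RSS, and sitemap."},
    {"title": "Comment deletion policy", "tags": ["comments"], "status": "published",
     "summary": "Comments use password-based edit and delete."},
    {"title": "Old sitemap plan", "tags": ["sitemap"], "status": "deprecated",
     "summary": "Earlier idea for sitemap generation."},
]

def tokens(text: str) -> set[str]:
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def score(doc: dict, terms: set[str]) -> int:
    """Title and tag matches count more than summary matches (arbitrary weights)."""
    return (3 * len(terms & tokens(doc["title"]))
            + 2 * len(terms & tokens(" ".join(doc["tags"])))
            + len(terms & tokens(doc["summary"])))

def retrieve(query: str, inventory: list[dict], k: int = 5) -> list[dict]:
    """Keyword search over published documents, returning at most k hits."""
    terms = tokens(query)
    scored = [(score(d, terms), d) for d in inventory if d["status"] == "published"]
    scored = [(s, d) for s, d in scored if s > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for _, d in scored[:k]]

hits = retrieve("How are drafts kept out of the sitemap?", INVENTORY)
used_titles = [d["title"] for d in hits]  # keep these alongside the answer (step 5)
print(used_titles)  # ['Draft visibility rules']
```

When the wrong document wins, the fix is usually visible in the inventory itself: a vague title, a missing tag, or a summary that does not use the words people search for.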

This is already useful.

Jumping directly to embeddings, vector databases, rerankers, and evaluation pipelines can be fun, but it can also add operational weight too early.

RAG is not just a feature. It is an operating loop. When retrieval is wrong, we need to know what to fix.

RAG in Development Work

For a project question:

How should comment deletion behave?

The search target may include:

- design docs
- database migrations
- server/API code
- previous blog notes

The retrieved facts might be:

- comments use password-based edit/delete
- deleted comments display as "deleted comment"
- private comments are visible only to admin
- admin can hide/delete from article pages

Now the model can produce an answer that is easier to check:

- policy summary
- files to inspect
- implementation checklist
- verification steps

That is much better than asking the model to remember or guess.

How I Review a RAG Answer

A RAG answer is tempting to trust because it came with documents.

But retrieved context does not automatically mean the answer is correct.

I check:

Check	Question
Source	Is the answer actually grounded in the document?
Freshness	Is an old decision being used as current policy?
Scope	Do the retrieved docs match the question?
Coverage	Were important exceptions missed?
Actionability	Does the answer lead to a next step?

For development work, actionability matters a lot.

A useful RAG answer should not stop at "here is the explanation." It should point to the files to inspect, possible changes, and verification criteria.
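
Part of the source check can be mechanical. The sketch below assumes the model was asked to cite sources as `[Document title]` in its answer, which is a convention I am inventing for this example, and then flags citations that were never retrieved.

```python
import re

def check_citations(answer: str, retrieved_titles: set[str]) -> dict:
    """Compare bracketed citations in the answer against the retrieved document titles."""
    cited = set(re.findall(r"\[([^\]]+)\]", answer))
    return {
        "has_citation": bool(cited),
        "unknown_sources": sorted(cited - retrieved_titles),
    }

# Hypothetical answer and retrieval set.
answer_text = "Drafts are hidden [Draft visibility rules]. Admins can purge [Retention memo]."
print(check_citations(answer_text, {"Draft visibility rules"}))
# {'has_citation': True, 'unknown_sources': ['Retention memo']}
```

This catches invented sources, not wrong readings of real ones. Whether the cited document actually supports the claim still needs a human look.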

Summary

RAG is not a magic button. It is a structure for finding useful material and adding it to context.

1. RAG means search before answering.
2. Search indexes make documents findable.
3. Embeddings support meaning-based search.
4. Chunk size and freshness affect answer quality.
5. RAG still needs retrieval and source verification.

In the next post, I will move from "AI reads documents" to "AI executes actions" with tool calling and agents.
