2026.05.19 AI AI Basics en

Minimum AI Knowledge for Practical Use, Part 5: RAG and Search Indexes

A practical introduction to RAG, search indexes, embeddings, chunking, retrieval quality, and how they fit into developer workflows.

In the previous post, I wrote about giving AI a better input package.

But manually preparing that package every time has a limit.

As documents, code, logs, and decisions pile up, it becomes harder to know what to include. Sometimes the problem is not writing the prompt. The problem is finding the right material.

That is where RAG comes in.

RAG stands for Retrieval-Augmented Generation. The name sounds heavier than the idea.

[Figure: RAG flow from question to retrieval, context pack, and verified answer]

RAG Means "Search, Then Answer"

A basic LLM flow looks like this:

user question
-> model
-> answer

With RAG, a retrieval step is added:

user question
-> search relevant documents
-> add search results to context
-> model
-> answer
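
The flow above can be sketched in a few lines. This is a minimal illustration, not a real integration: `answer_with_rag`, `search`, and `model` are hypothetical names, and both callables are stand-ins for whatever search function and LLM client a project actually uses.

```python
from typing import Callable


def answer_with_rag(
    question: str,
    search: Callable[[str], list[str]],  # hypothetical: returns passages, best first
    model: Callable[[str], str],         # hypothetical: wraps an LLM call
    top_k: int = 3,
) -> str:
    """Search first, then ask the model with the results as context."""
    # Step 1: search relevant documents and keep only the best few.
    passages = search(question)[:top_k]

    # Step 2: add the search results to the context, numbered for reference.
    context = "\n\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    prompt = (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )

    # Step 3: the model answers from the enriched prompt.
    return model(prompt)
```

The only difference from the basic flow is the prompt: the model sees retrieved passages before it sees the question.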

For example:

What was the rule for keeping private drafts out of public indexes?

If the model does not have the project context, it may guess.

If retrieval can find design notes and generator code, the answer can be grounded in facts:

- Only public-status documents enter public listings
- Drafts are visible only in private review screens
- RSS, sitemap, and public data must not include draft identifiers
- Generated public output is checked for draft identifiers

That is the practical value of RAG.

Why Search Indexes Matter

If there are only a few documents, opening files manually is fine.

But once there are many posts, docs, logs, and code files, we need a way to find relevant pieces quickly.

A search index is the structure that makes that possible.

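Keyword search runs on an index, usually an inverted index that maps each word to the documents containing it. Semantic search using embeddings runs on an index too: a vector index.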

| Search Type | Good At | Weak At |
| --- | --- | --- |
| Keyword search | Exact words, function names, error messages | Different wording |
| Semantic search | Meaning-based matches | Exact identifiers |
| Hybrid search | Combining both | More implementation work |

In development work, both are useful.

Error messages and function names are perfect for keyword search. Similar design decisions are often easier to find with semantic search.
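
To make the keyword side concrete, here is a small sketch of an inverted index in plain Python. The function names and the sample documents are made up for illustration, and the scoring (count of matching query words) is deliberately naive compared with what real search engines do.

```python
import re
from collections import defaultdict


def tokenize(text: str) -> list[str]:
    # Lowercase and keep identifier-like tokens such as delete_comment.
    return re.findall(r"[a-z0-9_]+", text.lower())


def build_index(docs: dict[str, str]) -> dict[str, set[str]]:
    """Map each token to the set of document ids that contain it."""
    index: dict[str, set[str]] = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def keyword_search(index: dict[str, set[str]], query: str) -> list[str]:
    """Rank documents by how many distinct query tokens they contain."""
    scores: dict[str, int] = defaultdict(int)
    for token in set(tokenize(query)):
        for doc_id in index.get(token, ()):
            scores[doc_id] += 1
    # Highest score first; ties broken by document id for stable output.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical documents, for illustration only.
docs = {
    "a": "delete_comment raises PermissionError",
    "b": "comments can be hidden by admin",
}
index = build_index(docs)
print(keyword_search(index, "PermissionError in delete_comment"))  # ['a']
print(keyword_search(index, "removing remarks"))                   # []
```

The second query shows the weak spot from the table: the wording differs, so nothing matches even though document `b` is related.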

Embeddings Turn Text Into Vectors

RAG often uses embeddings.

An embedding turns text into a vector:

"private drafts must not appear in public indexes"
-> [0.12, -0.03, 0.44, ...]

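The vector represents meaning in a way that can be compared with other vectors: texts with similar meaning tend to land close together, so the distance between vectors can serve as a rough measure of relevance.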

The retrieval flow becomes:

question embedding
-> find nearby document chunks
-> inject top results into context
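
"Nearby" is commonly measured with cosine similarity. The sketch below assumes the vectors already exist; the three-dimensional numbers are invented toy values (real embeddings have hundreds or thousands of dimensions and come from an embedding model), and `nearest_chunks` is a hypothetical helper name.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity: 1.0 means same direction, 0.0 means unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def nearest_chunks(
    query_vec: list[float],
    chunk_vecs: dict[str, list[float]],
    top_k: int = 2,
) -> list[str]:
    """Return the ids of the chunks most similar to the query vector."""
    ranked = sorted(
        chunk_vecs,
        key=lambda cid: cosine(query_vec, chunk_vecs[cid]),
        reverse=True,
    )
    return ranked[:top_k]


# Toy vectors, invented for illustration; not output of a real model.
chunks = {
    "draft-rule": [0.9, 0.1, 0.0],
    "comment-policy": [0.1, 0.9, 0.0],
    "sitemap": [0.7, 0.2, 0.1],
}
print(nearest_chunks([1.0, 0.0, 0.0], chunks))  # ['draft-rule', 'sitemap']
```

A linear scan like this is fine for a few thousand chunks. Vector databases exist to make the same lookup fast at much larger scale.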

A chunk is a piece of a larger document.

If a document is embedded as one huge block, the meaning can become too broad. If it is split into tiny sentences, useful context may be lost. Choosing chunk size is part of RAG design.
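
One common way to balance the two extremes is fixed-size chunks with a small overlap, so a sentence cut at a boundary still appears whole in one of the neighbors. This is a sketch under simple assumptions: it splits on whitespace, and the default `size` and `overlap` values are arbitrary starting points, not recommendations.

```python
def chunk_words(text: str, size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into chunks of `size` words, sharing `overlap` words
    between consecutive chunks."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")

    words = text.split()
    step = size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


sample = " ".join(f"w{i}" for i in range(10))
print(chunk_words(sample, size=4, overlap=1))
# ['w0 w1 w2 w3', 'w3 w4 w5 w6', 'w6 w7 w8 w9']
```

Splitting on headings or paragraphs often works better for structured docs; word windows are just the simplest baseline to compare against.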

Common RAG Problems

RAG is not automatically correct.

Common problems include:

1. Retrieving the wrong documents
2. Adding too many documents
3. Mixing old and current information
4. Failing to cite or use the retrieved material
5. Filling gaps with unsupported claims

So RAG needs verification too.

| Check | Question |
| --- | --- |
| Retrieval quality | Did the right documents rank high? |
| Freshness | Did old docs override current ones? |
| Source | What documents support the answer? |
| Coverage | Were important constraints missed? |
| Hallucination | Did the answer invent facts? |

RAG is not "AI magically knows my documents." It is a search-and-answer pipeline that needs quality control.
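
The retrieval-quality check can start as a handful of hand-written question and expected-document pairs. The sketch below computes a simple hit rate at k; `hit_rate_at_k` is a hypothetical helper, and `search` stands for any function that returns document ids in ranked order.

```python
from typing import Callable


def hit_rate_at_k(
    cases: list[tuple[str, str]],         # (question, expected document id)
    search: Callable[[str], list[str]],   # hypothetical: ranked document ids
    k: int = 3,
) -> float:
    """Fraction of questions whose expected document is in the top k results."""
    if not cases:
        return 0.0
    hits = sum(
        1 for question, expected in cases if expected in search(question)[:k]
    )
    return hits / len(cases)
```

Even five or ten cases are enough to notice when a change to chunking, titles, or tags makes retrieval worse.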

[Figure: RAG quality checks across retrieval relevance, freshness, sources, coverage, and hallucination]

What to Prepare Before RAG

Starting with RAG does not mean starting with a vector database.

The first useful step is document hygiene.

| Preparation | Check |
| --- | --- |
| Document scope | Is one document focused on one topic? |
| Title | Does the search result make sense from the title? |
| Date | Can old and current decisions be separated? |
| Tags | Can related material be found again? |
| Status | Are draft, deprecated, and published states clear? |

If documents are messy, embeddings will not magically make them useful.

A good RAG system starts with findable documents.
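
Part of that checklist can be automated. The sketch below assumes each document's metadata has already been loaded into a dict; the field names and the three status values mirror the table above, but the exact schema is an assumption, not a standard.

```python
# Assumed metadata schema, based on the preparation checklist.
REQUIRED_FIELDS = ("title", "date", "tags", "status")
KNOWN_STATUSES = {"draft", "deprecated", "published"}


def hygiene_issues(meta: dict) -> list[str]:
    """List the metadata problems that would make a document hard to find
    or hard to trust once retrieved."""
    # Missing or empty fields.
    issues = [f"missing {field}" for field in REQUIRED_FIELDS if not meta.get(field)]

    # A status outside the known set is as unhelpful as no status.
    status = meta.get("status")
    if status and status not in KNOWN_STATUSES:
        issues.append(f"unknown status: {status}")
    return issues


print(hygiene_issues({"title": "Comment policy", "status": "wip"}))
# ['missing date', 'missing tags', 'unknown status: wip']
```

Document scope ("one document, one topic") still needs a human eye; a script like this only catches the mechanical gaps.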

Start Small

Personal projects do not need a huge vector database on day one.

A small approach can work:

1. Collect Markdown/docs files.
2. Extract title, slug, tags, and summary.
3. Use keyword search first.
4. Add embedding search later if needed.
5. Pass retrieved results as evidence.

The first useful step is not infrastructure.

It is making documents findable.

Clear titles, focused tags, and well-scoped posts improve retrieval quality more than people expect.
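
Step 2 in the list above can be a short script. This sketch assumes the Markdown files begin with a simple front matter block of `key: value` lines between `---` markers; that layout is an assumption about the files, and nested YAML would need a real YAML parser instead.

```python
def parse_front_matter(markdown: str) -> dict[str, str]:
    """Read flat `key: value` pairs from a leading `---` ... `---` block."""
    lines = markdown.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}  # no front matter at the top of the file

    meta: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of the front matter block
        key, sep, value = line.partition(":")
        if sep:  # skip lines that are not key: value pairs
            meta[key.strip()] = value.strip()
    return meta


# Hypothetical file content, for illustration only.
doc = """---
title: RAG and Search Indexes
slug: rag-search-indexes
tags: rag, search
---
Body text starts here.
"""
print(parse_front_matter(doc)["title"])  # RAG and Search Indexes
```

The resulting dicts are enough to feed a keyword search over titles, tags, and summaries before any embedding work starts.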

How I Would Start Small

For a personal project, I would start like this:

1. Build a document inventory.
2. Extract title, date, tags, status, and summary.
3. Use keyword search first.
4. Pass only the top 3-5 results to the model.
5. Keep the document titles used in the answer.
6. When retrieval is wrong, improve titles/tags/summaries.

This is already useful.
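
Steps 4 and 5 fit in one small function: keep only the best few results and remember which titles went into the context. The `Hit` type, its fields, and the scores are hypothetical; any search that returns scored results would work the same way.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One search result (hypothetical shape for this sketch)."""
    title: str
    summary: str
    score: float


def build_evidence(hits: list[Hit], top_k: int = 3) -> tuple[str, list[str]]:
    """Return the context text for the model and the titles it was built from."""
    top = sorted(hits, key=lambda h: h.score, reverse=True)[:top_k]
    context = "\n".join(f"- {h.title}: {h.summary}" for h in top)
    return context, [h.title for h in top]
```

Keeping the title list next to the answer is what makes step 6 possible: when an answer is wrong, the titles show whether retrieval or the model was at fault.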

Jumping directly to embeddings, vector databases, rerankers, and evaluation pipelines can be fun, but it can also add operational weight too early.

RAG is not just a feature. It is an operating loop. When retrieval is wrong, we need to know what to fix.

RAG in Development Work

For a project question:

How should comment deletion behave?

The search target may include:

- design docs
- database migrations
- server/API code
- previous blog notes

The retrieved facts might be:

- comments use password-based edit/delete
- deleted comments display as "deleted comment"
- private comments are visible only to admin
- admin can hide/delete from article pages

Now the model can produce a better-grounded answer that is easier to check:

- policy summary
- files to inspect
- implementation checklist
- verification steps

That is much better than asking the model to remember or guess.

How I Review a RAG Answer

A RAG answer is tempting to trust because it came with documents.

But retrieved context does not automatically mean the answer is correct.

I check:

| Check | Question |
| --- | --- |
| Source | Is the answer actually grounded in the document? |
| Freshness | Is an old decision being used as current policy? |
| Scope | Do the retrieved docs match the question? |
| Coverage | Were important exceptions missed? |
| Actionability | Does the answer lead to a next step? |
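
The freshness row is the easiest one to automate. This sketch flags retrieved documents that are marked deprecated or older than a cutoff; the dict keys, the one-year default, and the sample documents are all assumptions chosen for illustration.

```python
from datetime import date


def stale_sources(
    docs: list[dict],          # assumed keys: "title", "date", optional "status"
    today: date,
    max_age_days: int = 365,   # arbitrary cutoff; tune per project
) -> list[str]:
    """Titles of retrieved documents that may describe an outdated decision."""
    stale = []
    for doc in docs:
        age_days = (today - doc["date"]).days
        if doc.get("status") == "deprecated" or age_days > max_age_days:
            stale.append(doc["title"])
    return stale


# Hypothetical retrieved documents.
retrieved = [
    {"title": "Comment policy v1", "date": date(2024, 1, 10), "status": "deprecated"},
    {"title": "Comment policy v2", "date": date(2026, 3, 1), "status": "published"},
]
print(stale_sources(retrieved, today=date(2026, 5, 19)))  # ['Comment policy v1']
```

A flagged title does not mean the answer is wrong, only that the source deserves a second look before it is treated as current policy.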

For development work, actionability matters a lot.

A useful RAG answer should not stop at "here is the explanation." It should point to the files to inspect, possible changes, and verification criteria.

Summary

RAG is not a magic button. It is a structure for finding useful material and adding it to context.

1. RAG means search before answering.
2. Search indexes make documents findable.
3. Embeddings support meaning-based search.
4. Chunk size and freshness affect answer quality.
5. RAG still needs retrieval and source verification.

In the next post, I will move from "AI reads documents" to "AI executes actions" with tool calling and agents.
