Minimum AI Knowledge for Practical Use, Part 5: RAG and Search Indexes
A practical introduction to RAG, search indexes, embeddings, chunking, retrieval quality, and how they fit into developer workflows.
In the previous post, I wrote about giving AI a better input package.
But manually preparing that package every time has a limit.
As documents, code, logs, and decisions pile up, it becomes harder to know what to include. Sometimes the problem is not writing the prompt. The problem is finding the right material.
That is where RAG comes in.
RAG stands for Retrieval-Augmented Generation. The name sounds heavier than the idea.
RAG Means "Search, Then Answer"
A basic LLM flow looks like this:
user question
-> model
-> answer
With RAG, a retrieval step is added:
user question
-> search relevant documents
-> add search results to context
-> model
-> answer
For example:
What was the rule for keeping private drafts out of public indexes?
If the model does not have the project context, it may guess.
If retrieval can find design notes and generator code, the answer can be grounded in facts:
- Only public-status documents enter public listings
- Drafts are visible only in private review screens
- RSS, sitemap, and public data must not include draft identifiers
- Generated public output is checked for draft identifiers
That is the practical value of RAG.
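The "search, then answer" flow can be sketched in a few lines. This is a minimal illustration, not a definitive implementation: `retrieve` and `generate` are hypothetical callables standing in for whatever search function and model client a project actually uses, and the prompt wording is only an assumption.

```python
from typing import Callable, List


def build_prompt(question: str, passages: List[str]) -> str:
    """Place retrieved passages ahead of the question as numbered evidence."""
    evidence = "\n".join(f"[{i}] {text}" for i, text in enumerate(passages, start=1))
    return (
        "Answer using only the evidence below. "
        "If the evidence is not enough, say so.\n\n"
        f"Evidence:\n{evidence}\n\n"
        f"Question: {question}"
    )


def rag_answer(
    question: str,
    retrieve: Callable[[str], List[str]],  # hypothetical search function
    generate: Callable[[str], str],        # hypothetical model call
    top_k: int = 3,
) -> str:
    """Search first, then ask the model with the top results in context."""
    passages = retrieve(question)[:top_k]
    return generate(build_prompt(question, passages))
```

Passing `retrieve` and `generate` as arguments keeps the pipeline testable: either side can be replaced with a stub while the other is being tuned.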
Why Search Indexes Matter
If there are only a few documents, opening files manually is fine.
But once there are many posts, docs, logs, and code files, we need a way to find relevant pieces quickly.
A search index is the structure that makes that possible.
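Keyword search relies on an index, typically an inverted index that maps each term to the documents containing it. Semantic search with embeddings relies on an index too, in that case one built over vectors.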
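As a rough sketch of the keyword side, the simplest index maps each token to the set of documents that contain it. The tokenizer and the scoring (count of distinct matching query tokens) are simplifying assumptions; real engines add ranking functions, stemming, and more.

```python
import re
from collections import defaultdict
from typing import Dict, List, Set


def tokenize(text: str) -> List[str]:
    """Lowercase and split on anything that is not a letter, digit, or underscore."""
    return re.findall(r"[a-z0-9_]+", text.lower())


def build_index(docs: Dict[str, str]) -> Dict[str, Set[str]]:
    """Map each token to the set of document ids that contain it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def keyword_search(index: Dict[str, Set[str]], query: str) -> List[str]:
    """Rank documents by how many distinct query tokens they contain."""
    scores: Dict[str, int] = defaultdict(int)
    for token in set(tokenize(query)):
        for doc_id in index.get(token, ()):
            scores[doc_id] += 1
    # Highest score first; ties broken by document id for stable output.
    return sorted(scores, key=lambda d: (-scores[d], d))
```

With this tokenizer, a query for `build_sitemap` finds the file that defines it exactly, while a query for `draft` will not match a note that only says "Drafts". That gap is where meaning-based search helps.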
| Search Type | Good At | Weak At |
|---|---|---|
| Keyword search | Exact words, function names, error messages | Different wording |
| Semantic search | Meaning-based matches | Exact identifiers |
| Hybrid search | Combining both | More implementation work |
In development work, both are useful.
Error messages and function names are perfect for keyword search. Similar design decisions are often easier to find with semantic search.
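One common way to combine the two is reciprocal rank fusion, which merges ranked lists using only each document's position in each list. The sketch below assumes both searches return lists of document ids, best first; the constant `k = 60` is a conventional default, not a tuned value.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Merge several ranked lists; each hit contributes 1 / (k + rank)."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document id.
    return sorted(scores, key=lambda d: (-scores[d], d))
```

Because only ranks are used, the keyword and semantic scores never need to be put on the same scale.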
Embeddings Turn Text Into Vectors
RAG often uses embeddings.
An embedding turns text into a vector:
"private drafts must not appear in public indexes"
-> [0.12, -0.03, 0.44, ...]
Texts with similar meaning map to vectors that lie close together, so similarity can be measured numerically, for example with cosine similarity.
The retrieval flow becomes:
question embedding
-> find nearby document chunks
-> inject top results into context
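"Find nearby document chunks" usually means ranking chunks by a similarity measure such as cosine similarity. The sketch below assumes the vectors already exist; in practice they would come from an embedding model, which is left out here, and real vectors have hundreds of dimensions rather than three.

```python
import math
from typing import Dict, List


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_chunks(
    query_vec: List[float], chunk_vecs: Dict[str, List[float]], top_k: int = 2
) -> List[str]:
    """Return the ids of the chunks most similar to the query vector."""
    ranked = sorted(
        chunk_vecs,
        key=lambda chunk_id: cosine_similarity(query_vec, chunk_vecs[chunk_id]),
        reverse=True,
    )
    return ranked[:top_k]


# Made-up three-dimensional vectors, for illustration only.
toy_chunks = {
    "drafts-policy": [0.9, 0.1, 0.0],
    "comment-deletion": [0.1, 0.9, 0.1],
    "rss-sitemap": [0.7, 0.2, 0.3],
}
toy_query = [0.8, 0.1, 0.1]
top_two = nearest_chunks(toy_query, toy_chunks)  # drafts-policy, then rss-sitemap
```

A linear scan like this is fine for a few thousand chunks. Dedicated vector indexes matter once the collection grows well beyond that.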
A chunk is a piece of a larger document.
If a document is embedded as one huge block, the meaning can become too broad. If it is split into tiny sentences, useful context may be lost. Choosing chunk size is part of RAG design.
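One simple way to experiment with chunk size is a fixed-size word window with some overlap, so that a sentence cut at a boundary still appears whole in a neighboring chunk. The default sizes below are arbitrary assumptions, and splitting on headings or paragraphs is often a better fit for structured documents.

```python
from typing import List


def chunk_words(text: str, chunk_size: int = 50, overlap: int = 10) -> List[str]:
    """Split text into word windows of chunk_size, sharing overlap words."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")

    words = text.split()
    step = chunk_size - overlap
    chunks: List[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once the window has reached the end of the text.
        if start + chunk_size >= len(words):
            break
    return chunks
```

Changing `chunk_size` and `overlap` and re-running the same test questions is a cheap way to see how chunking affects what gets retrieved.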
Common RAG Problems
RAG is not automatically correct.
Common problems include:
1. Retrieving the wrong documents
2. Adding too many documents
3. Mixing old and current information
4. Failing to cite or use the retrieved material
5. Filling gaps with unsupported claims
So RAG needs verification too.
| Check | Question |
|---|---|
| Retrieval quality | Did the right documents rank high? |
| Freshness | Did old docs override current ones? |
| Source | What documents support the answer? |
| Coverage | Were important constraints missed? |
| Hallucination | Did the answer invent facts? |
RAG is not "AI magically knows my documents." It is a search-and-answer pipeline that needs quality control.
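The retrieval-quality check can be made concrete with a handful of questions whose relevant documents are known in advance. The sketch below computes recall at k, the fraction of known-relevant documents that appear in the top k results. The `search` callable and the test cases are hypothetical placeholders.

```python
from typing import Callable, Dict, List, Set


def recall_at_k(retrieved: List[str], relevant: Set[str], k: int) -> float:
    """Fraction of the relevant documents found in the top k results."""
    if not relevant:
        return 0.0
    return len(set(retrieved[:k]) & relevant) / len(relevant)


def mean_recall_at_k(
    cases: Dict[str, Set[str]],          # question -> ids of documents that should be found
    search: Callable[[str], List[str]],  # hypothetical search function, best result first
    k: int = 5,
) -> float:
    """Average recall at k over a small set of hand-labeled questions."""
    if not cases:
        return 0.0
    total = sum(recall_at_k(search(q), relevant, k) for q, relevant in cases.items())
    return total / len(cases)
```

Even ten labeled questions are enough to notice when a change to titles, tags, or chunking makes retrieval worse.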
What to Prepare Before RAG
Starting with RAG does not mean starting with a vector database.
The first useful step is document hygiene.
| Preparation | Check |
|---|---|
| Document scope | Is one document focused on one topic? |
| Title | Does the search result make sense from the title? |
| Date | Can old and current decisions be separated? |
| Tags | Can related material be found again? |
| Status | Are draft, deprecated, and published states clear? |
If documents are messy, embeddings will not magically make them useful.
A good RAG system starts with findable documents.
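Part of that checklist can be automated with a small lint pass over document metadata. This is only a sketch: the `DocMeta` fields and the set of allowed statuses are assumptions modeled on the table above, not a fixed schema, and a check like "is one document focused on one topic?" still needs a human.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional

# Assumed status vocabulary; adjust to whatever the project uses.
ALLOWED_STATUSES = {"draft", "published", "deprecated"}


@dataclass
class DocMeta:
    path: str
    title: str = ""
    updated: Optional[date] = None
    tags: List[str] = field(default_factory=list)
    status: str = ""


def hygiene_issues(doc: DocMeta) -> List[str]:
    """List the metadata problems that would make this document hard to find."""
    issues: List[str] = []
    if not doc.title.strip():
        issues.append("missing title")
    if doc.updated is None:
        issues.append("missing date")
    if not doc.tags:
        issues.append("missing tags")
    if doc.status not in ALLOWED_STATUSES:
        issues.append(f"unknown status: {doc.status!r}")
    return issues
```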
Start Small
Personal projects do not need a huge vector database on day one.
A small approach can work:
1. Collect Markdown/docs files.
2. Extract title, slug, tags, and summary.
3. Use keyword search first.
4. Add embedding search later if needed.
5. Pass retrieved results as evidence.
The first useful step is not infrastructure.
It is making documents findable.
Clear titles, focused tags, and well-scoped posts improve retrieval quality more than people expect.
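Step 2 can be sketched as a tiny metadata reader. It assumes each Markdown file starts with a simple `key: value` block between two `---` lines. That layout is an assumption about the files, and this is not a full YAML parser: nested values and lists are returned as plain strings.

```python
from typing import Dict


def parse_front_matter(markdown: str) -> Dict[str, str]:
    """Read flat key: value pairs from a leading block delimited by --- lines."""
    lines = markdown.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}

    meta: Dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return meta
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    # No closing delimiter was found, so treat the file as having no front matter.
    return {}
```

Files that come back with an empty result are a useful to-do list in themselves: they are the documents that search will struggle to surface.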
How I Would Start Small
For a personal project, I would start like this:
1. Build a document inventory.
2. Extract title, date, tags, status, and summary.
3. Use keyword search first.
4. Pass only the top 3-5 results to the model.
5. Keep the document titles used in the answer.
6. When retrieval is wrong, improve titles/tags/summaries.
This is already useful.
Jumping directly to embeddings, vector databases, rerankers, and evaluation pipelines can be fun, but it can also add operational weight too early.
RAG is not just a feature. It is an operating loop. When retrieval is wrong, we need to know what to fix.
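The loop above can be sketched as a keyword search over the inventory that returns only a few results and keeps their titles. The field weights, the `Doc` shape, and the choice to skip documents that are not published are all assumptions for illustration.

```python
import re
from dataclasses import dataclass
from typing import List, Set

# Assumed weights: a match in the title counts more than one in the summary.
FIELD_WEIGHTS = {"title": 3, "tags": 2, "summary": 1}


@dataclass
class Doc:
    title: str
    tags: List[str]
    summary: str
    status: str = "published"


def _tokens(text: str) -> Set[str]:
    return set(re.findall(r"[a-z0-9_]+", text.lower()))


def score(doc: Doc, query: str) -> int:
    """Weighted count of query tokens found in each metadata field."""
    query_tokens = _tokens(query)
    fields = {"title": doc.title, "tags": " ".join(doc.tags), "summary": doc.summary}
    return sum(
        weight * len(query_tokens & _tokens(fields[name]))
        for name, weight in FIELD_WEIGHTS.items()
    )


def search(docs: List[Doc], query: str, top_k: int = 5) -> List[Doc]:
    """Return up to top_k published documents with a non-zero score."""
    scored = [(score(d, query), d) for d in docs if d.status == "published"]
    scored = [(s, d) for s, d in scored if s > 0]
    scored.sort(key=lambda pair: (-pair[0], pair[1].title))
    return [d for _, d in scored[:top_k]]


def titles_used(results: List[Doc]) -> List[str]:
    """Titles to keep alongside the answer so it can be traced back."""
    return [d.title for d in results]
```

When a question returns the wrong document, the fix is usually visible in the inputs to `score`: a vague title, a missing tag, or a summary that never mentions the key term.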
RAG in Development Work
For a project question:
How should comment deletion behave?
The search target may include:
- design docs
- database migrations
- server/API code
- previous blog notes
The retrieved facts might be:
- comments use password-based edit/delete
- deleted comments display as "deleted comment"
- private comments are visible only to admin
- admin can hide/delete from article pages
Now the model can produce an answer that is easier to check:
- policy summary
- files to inspect
- implementation checklist
- verification steps
That is much better than asking the model to remember or guess.
How I Review a RAG Answer
A RAG answer is tempting to trust because it came with documents.
But retrieved context does not automatically mean the answer is correct.
I check:
| Check | Question |
|---|---|
| Source | Is the answer actually grounded in the document? |
| Freshness | Is an old decision being used as current policy? |
| Scope | Do the retrieved docs match the question? |
| Coverage | Were important exceptions missed? |
| Actionability | Does the answer lead to a next step? |
For development work, actionability matters a lot.
A useful RAG answer should not stop at "here is the explanation." It should point to the files to inspect, possible changes, and verification criteria.
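The source check can be partly mechanized if answers are asked to cite in a fixed format. The sketch below assumes a hypothetical convention of `[source: Title]` markers in the answer text and flags any cited title that was not among the retrieved documents. It does not verify that the cited document actually supports the claim; that part still needs a human read.

```python
import re
from typing import Iterable, Set

# Hypothetical citation format: [source: Document title]
CITATION_PATTERN = re.compile(r"\[source: ([^\]]+)\]")


def cited_titles(answer: str) -> Set[str]:
    """Collect every title cited in the answer."""
    return {title.strip() for title in CITATION_PATTERN.findall(answer)}


def unsupported_citations(answer: str, retrieved_titles: Iterable[str]) -> Set[str]:
    """Titles the answer cites that were never retrieved."""
    return cited_titles(answer) - set(retrieved_titles)


def has_no_citations(answer: str) -> bool:
    """True when the answer cites nothing, which is itself worth a second look."""
    return not cited_titles(answer)
```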
Summary
RAG is not a magic button. It is a structure for finding useful material and adding it to context.
1. RAG means search before answering.
2. Search indexes make documents findable.
3. Embeddings support meaning-based search.
4. Chunk size and freshness affect answer quality.
5. RAG still needs retrieval and source verification.
In the next post, I will move from "AI reads documents" to "AI executes actions" with tool calling and agents.