RAG Readiness Playbook

A practical RAG readiness playbook: document prep, chunking, citations, evaluation, and rollout. Built for operators and engineers.

Last updated: February 27, 2026

Summary

This playbook helps you decide if you are actually ready for RAG, or if you are about to build a search problem with extra steps. It is designed for operators and engineers who want a practical path: what to check, what to fix, and what to measure. Use the LLM Safety Review Checklist for data and output guardrails, and the AI Toolkit for tool picks. Run this before you commit to a RAG build, and again before you roll it out to real users.

Who it is for

Operators and engineers deciding whether to commit to a RAG build, or preparing one for rollout to real users.

What you get

A six-step readiness check, three copy-paste templates (readiness snapshot, test questions, acceptance criteria), and a measurable bar for shipping an MVP.

Steps

  1. Define the job to be done, not the tech.
    Write one sentence:
    • “Users want to [task] using [source], and the result must be [quality bar].”
    If you cannot state the task, you cannot evaluate success.
  2. Inventory the sources and the truth level.
    List each source and label it:
    • Authoritative: policies, contracts, system docs, verified runbooks
    • Helpful: meeting notes, tickets, chat logs
    • Untrusted: random docs, outdated files, duplicated content
    RAG is only as trustworthy as the sources you feed it.
  3. Fix content quality before you build retrieval.
    Do a quick cleanup pass:
    • Remove duplicates
    • Remove dead docs and old versions
    • Standardize titles and section headers
    • Add “last updated” where possible
    If the library is messy, retrieval will be messy.
  4. Choose a minimum viable retrieval design.
    Start simple:
    • Chunk by headings, not arbitrary size
    • Store source URL and section title per chunk
    • Retrieve top K and cite them
    • Refuse when there is no evidence
    You can add reranking later. Do not start with complexity. (A minimal sketch of this design appears after this list.)
  5. Define evaluation before you ship.
    Create a test set of 25 questions:
    • 10 easy (direct lookup)
    • 10 medium (needs synthesis across sections)
    • 5 hard (edge cases, ambiguous)
    For each question, define what a correct answer must cite.
  6. Ship with guardrails and logging.
    Your first version should:
    • Show citations for every claim
    • Say “Not in library” when it cannot prove it
    • Log the question, top docs, and whether the user was satisfied
    This is how you improve relevance and trust over time.
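
Below is a minimal sketch of steps 4 and 6 together: chunk by headings, store source URL and section title per chunk, retrieve top K, refuse when there is no evidence, and log every question. It is Python with only the standard library; the names (Chunk, retrieve_or_refuse, MIN_SCORE) and the word-overlap scorer are illustrative assumptions, not any particular framework's API.

    import re
    from dataclasses import dataclass

    @dataclass
    class Chunk:
        source_url: str      # stored per chunk, per step 4
        section_title: str   # stored per chunk, per step 4
        text: str

    def chunk_by_headings(doc_text: str, source_url: str) -> list[Chunk]:
        """Split on markdown-style headings, not on arbitrary size."""
        chunks, title, buf = [], "Untitled", []
        for line in doc_text.splitlines():
            m = re.match(r"#+\s+(.+)", line)
            if m:
                if "".join(buf).strip():
                    chunks.append(Chunk(source_url, title, "\n".join(buf).strip()))
                title, buf = m.group(1).strip(), []
            else:
                buf.append(line)
        if "".join(buf).strip():
            chunks.append(Chunk(source_url, title, "\n".join(buf).strip()))
        return chunks

    def score(question: str, chunk: Chunk) -> float:
        """Naive word overlap; swap in real vector search later."""
        q = set(question.lower().split())
        return len(q & set(chunk.text.lower().split())) / (len(q) or 1)

    MIN_SCORE = 0.2  # assumed refusal threshold; tune it on your eval set

    def log(question, citations, satisfied=None):
        """Step 6 logging: question, top docs, user satisfaction (filled in later)."""
        print({"q": question, "docs": citations, "satisfied": satisfied})

    def retrieve_or_refuse(question: str, library: list[Chunk], k: int = 3) -> dict:
        ranked = sorted(library, key=lambda c: score(question, c), reverse=True)[:k]
        evidence = [c for c in ranked if score(question, c) >= MIN_SCORE]
        citations = [f"{c.source_url}#{c.section_title}" for c in evidence]
        log(question, citations)
        if not evidence:
            return {"text": "Not in library.", "citations": []}
        # Generation is deliberately out of scope: hand the evidence to your
        # model and require a citation for every factual claim.
        return {"text": None, "evidence": evidence, "citations": citations}

The naive scorer is the swappable part: the chunking, citation, refusal, and logging contract stays stable while retrieval quality improves underneath.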

Templates

Copy these into your doc, then fill them with your actual system details.

Template 1: RAG readiness snapshot

One page that says if you are ready and why.

RAG readiness snapshot
    
    Job to be done:
    Users:
    Primary sources:
    Success criteria:
    
    Sources inventory:
    - Source | type (authoritative/helpful/untrusted) | owner | last updated
    
    Data quality issues:
    - duplicates:
    - outdated:
    - missing metadata:
    - access problems:
    
    MVP retrieval plan:
    - chunking strategy:
    - top K:
    - citations:
    - refusal behavior:
    
    Evaluation plan:
    - test questions count:
    - acceptance criteria:
    - metrics tracked:
    
    Decision:
    - Ready / Not ready
    - Next actions:
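
If you keep the sources inventory as data instead of prose, you can flag gaps mechanically. A sketch in Python; the example rows and the 180-day staleness cutoff are made-up assumptions, not rules.

    from datetime import date

    # Example rows; names, owners, and dates are invented for illustration.
    SOURCES = [
        ("Refund policy",   "authoritative", "legal",  date(2025, 11, 3)),
        ("Access runbook",  "helpful",       "it-ops", date(2024, 1, 15)),
        ("Old wiki export", "untrusted",     None,     None),
    ]

    STALE_AFTER_DAYS = 180  # assumed cutoff; pick your own

    for name, kind, owner, updated in SOURCES:
        issues = []
        if kind == "untrusted":
            issues.append("exclude or re-verify before indexing")
        if owner is None:
            issues.append("no owner")
        if updated is None or (date.today() - updated).days > STALE_AFTER_DAYS:
            issues.append("stale or missing last-updated")
        if issues:
            print(f"{name}: {', '.join(issues)}")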

Template 2: Test questions table

Use this to build your eval set.

Test questions
    
    Format:
    Question | difficulty | expected sources | must-cite sections | answer notes
    
    Examples:
    - What is our refund policy? | easy | policy doc | section 2 | must cite
    - How do we handle access exceptions? | medium | runbook + policy | both | synthesize
    - What should we do if the system is down and logs are missing? | hard | incident doc | section 4 | edge case
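
Once the table is filled in, a short harness can score it against the must-cite column. A sketch that assumes the retrieve_or_refuse() function from the Steps section; both the function and the row shape are illustrative.

    def run_eval(test_set, library):
        """Each row mirrors the table: (question, difficulty, must_cite_sections)."""
        passed = 0
        for question, difficulty, must_cite in test_set:
            result = retrieve_or_refuse(question, library)
            ok = all(
                any(need.lower() in c.lower() for c in result["citations"])
                for need in must_cite
            )
            passed += ok
            print(f"[{difficulty}] {'PASS' if ok else 'FAIL'} {question}")
        print(f"{passed}/{len(test_set)} questions met their must-cite bar")

    # Example, using the first row above:
    # run_eval([("What is our refund policy?", "easy", ["section 2"])], library)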

Template 3: Acceptance criteria

Keep it measurable and strict.

Acceptance criteria (MVP)
    
    - Every answer includes citations for factual claims.
    - If no evidence is retrieved, the system refuses and says "Not in library."
    - Top 3 retrieved chunks are relevant for at least 70% of test questions.
    - Correct answer rate is at least 60% on the 25-question test set.
    - Users can report "wrong" and we capture: question, retrieved docs, and why it failed.
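
The two numeric bars above reduce to one pass/fail check per eval run. A minimal sketch; the two flag lists are judgments you record while grading the 25-question set.

    def meets_acceptance(correct_flags, top3_relevant_flags):
        """One bool per test question; recorded while grading the eval run."""
        correct = sum(correct_flags) / len(correct_flags)
        relevant = sum(top3_relevant_flags) / len(top3_relevant_flags)
        print(f"correct answers: {correct:.0%} (bar: 60%)")
        print(f"top-3 relevance: {relevant:.0%} (bar: 70%)")
        return correct >= 0.60 and relevant >= 0.70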
