Naive RAG (NRAG)

Foundational "Retrieve-Read" framework following the traditional indexing, retrieval, and generation process.

Complexity: Low

Knowledge Retrieval (RAG)

🎯 30-Second Overview

Pattern: Simple retrieve-then-read pipeline: query → retrieve documents → concatenate → generate

Why: Provides external knowledge access with minimal complexity - the foundational approach established by Lewis et al. (2020)

Key Insight: Direct concatenation of the top-k retrieved documents to the query prompt - no optimization or post-processing

⚡ Quick Implementation

1. Index: Create vector embeddings of the knowledge base
2. Retrieve: Find the top-k relevant documents via similarity search
3. Concatenate: Append the retrieved docs to the query prompt
4. Generate: Submit the combined prompt to the LLM
5. Return: Output the generated response directly

Example: query → vector_search → concat_context → llm_generate → response
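
A minimal sketch of this pipeline, assuming sentence-transformers and FAISS (both recommended below). The tiny in-memory corpus is illustrative only, and `generate_answer` is a hypothetical stand-in for whatever LLM call you use:

```python
import faiss
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

# 1. Index: embed the knowledge base and build a vector index.
docs = [
    "FAISS is a library for efficient similarity search over dense vectors.",
    "Naive RAG concatenates retrieved documents directly to the prompt.",
    "Sentence-transformers produce semantic embeddings for retrieval.",
]
embeddings = model.encode(docs, normalize_embeddings=True)
index = faiss.IndexFlatIP(embeddings.shape[1])  # inner product = cosine on unit vectors
index.add(np.asarray(embeddings, dtype="float32"))

def naive_rag(query: str, k: int = 3) -> str:
    # 2. Retrieve: top-k nearest documents by embedding similarity.
    q = model.encode([query], normalize_embeddings=True)
    _, ids = index.search(np.asarray(q, dtype="float32"), k)
    # 3. Concatenate: join docs with clear boundaries, then 4. Generate / 5. Return.
    context = "\n---\n".join(docs[i] for i in ids[0])
    prompt = f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    return generate_answer(prompt)  # hypothetical LLM call, not a real API
```

In a real system the corpus would be your chunked knowledge base, and the prompt template would carry source metadata for attribution.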

📋 Do's & Don'ts

✅ Use semantic embeddings (e.g., sentence-transformers)
✅ Implement a basic chunking strategy (fixed size, 512-1024 tokens; see the sketch after this list)
✅ Set clear top-k retrieval limits (typically 3-5 documents)
✅ Include document metadata and source attribution
✅ Use FAISS or similar for efficient vector search
❌ Skip query preprocessing or normalization
❌ Retrieve too many documents (causes context pollution)
❌ Ignore relevance scoring thresholds
❌ Concatenate without clear document boundaries
❌ Expect sophisticated reasoning from this basic approach

🚦 When to Use

Use When

  • Simple Q&A over documents
  • Proof-of-concept RAG systems
  • Limited technical complexity allowed
  • Small to medium knowledge bases
  • Straightforward factual queries

Avoid When

  • Complex multi-hop reasoning required
  • High-accuracy critical applications
  • Large-scale production systems
  • Noisy or contradictory knowledge bases
  • Real-time performance requirements

📊 Key Metrics

  • Retrieval Accuracy: Relevant docs in top-k (Recall@k; see the sketch after this list)
  • Answer Quality: BLEU/ROUGE scores vs. ground truth
  • Response Time: End-to-end latency (retrieval + generation)
  • Context Utilization: % of retrieved context used in the response
  • Hallucination Rate: % of responses with unsupported claims
  • Source Attribution: % of responses with correct source citations
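
A minimal sketch of Recall@k, the retrieval-accuracy metric above; the function and argument names are illustrative, not taken from a specific evaluation library:

```python
# Recall@k sketch: average, over queries, of the fraction of each query's
# labeled-relevant documents that appear among its top-k retrieved ids.
def recall_at_k(retrieved: list[list[int]], relevant: list[set[int]], k: int) -> float:
    scores = [
        len(set(got[:k]) & want) / len(want)
        for got, want in zip(retrieved, relevant)
        if want  # skip queries with no labeled relevant docs
    ]
    return sum(scores) / len(scores) if scores else 0.0

# e.g., recall_at_k([[3, 1, 7]], [{1, 2}], k=3) -> 0.5
```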

💡 Top Use Cases

Document Q&A: Simple factual questions over company documents or knowledge bases
FAQ Systems: Automated responses to frequently asked questions using existing documentation
Research Assistance: Basic information retrieval from academic papers or research collections
Customer Support: Level 1 support queries answerable from documentation and manuals
Educational Tools: Simple question-answering over textbooks and educational materials

References & Further Reading

Deepen your understanding with these curated resources:

  • Lewis et al. (2020), "Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks", NeurIPS 2020.

