Naive RAG (NRAG)

Foundational "Retrieve-Read" framework following the traditional indexing, retrieval, and generation process.

Complexity: Low

Knowledge Retrieval (RAG)

🎯 30-Second Overview

Pattern: Simple retrieve-then-read pipeline: query → retrieve documents → concatenate → generate

Why: Provides external knowledge access with minimal complexity - the foundational approach established by Lewis et al. (2020)

Key Insight: Direct concatenation of the top-k retrieved documents to the query prompt - no optimization or post-processing

⚡ Quick Implementation

1. Index: Create vector embeddings of the knowledge base
2. Retrieve: Find the top-k relevant documents via similarity search
3. Concatenate: Append the retrieved docs to the query prompt
4. Generate: Submit the combined prompt to the LLM
5. Return: Output the generated response directly

Example: query → vector_search → concat_context → llm_generate → response
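
A minimal sketch of this pipeline, assuming sentence-transformers and FAISS (both recommended below). The tiny in-memory corpus is illustrative only, and `generate_answer` is a hypothetical stand-in for whatever LLM call you use:

```python
import faiss
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

# 1. Index: embed the knowledge base and build a vector index.
docs = [
    "FAISS is a library for efficient similarity search over dense vectors.",
    "Naive RAG concatenates retrieved documents directly to the prompt.",
    "Sentence-transformers produce semantic embeddings for retrieval.",
]
embeddings = model.encode(docs, normalize_embeddings=True)
index = faiss.IndexFlatIP(embeddings.shape[1])  # inner product = cosine on unit vectors
index.add(np.asarray(embeddings, dtype="float32"))

def naive_rag(query: str, k: int = 3) -> str:
    # 2. Retrieve: top-k nearest documents by embedding similarity.
    q = model.encode([query], normalize_embeddings=True)
    _, ids = index.search(np.asarray(q, dtype="float32"), k)
    # 3. Concatenate: join docs with clear boundaries, then 4. Generate / 5. Return.
    context = "\n---\n".join(docs[i] for i in ids[0])
    prompt = f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    return generate_answer(prompt)  # hypothetical LLM call, not a real API
```

In a real system the corpus would be your chunked knowledge base, and the prompt template would carry source metadata for attribution.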

📋 Do's & Don'ts

✅ Use semantic embeddings (e.g., sentence-transformers)
✅ Implement a basic chunking strategy (fixed size, 512-1024 tokens; see the sketch after this list)
✅ Set clear top-k retrieval limits (typically 3-5 documents)
✅ Include document metadata and source attribution
✅ Use FAISS or similar for efficient vector search
❌ Skip query preprocessing or normalization
❌ Retrieve too many documents (causes context pollution)
❌ Ignore relevance scoring thresholds
❌ Concatenate without clear document boundaries
❌ Expect sophisticated reasoning from this basic approach

🚦 When to Use

Use When

  • Simple Q&A over documents
  • Proof-of-concept RAG systems
  • Limited technical complexity allowed
  • Small to medium knowledge bases
  • Straightforward factual queries

Avoid When

  • Complex multi-hop reasoning required
  • High-accuracy critical applications
  • Large-scale production systems
  • Noisy or contradictory knowledge bases
  • Real-time performance requirements

📊 Key Metrics

  • Retrieval Accuracy: Relevant docs in top-k (Recall@k; see the sketch after this list)
  • Answer Quality: BLEU/ROUGE scores vs. ground truth
  • Response Time: End-to-end latency (retrieval + generation)
  • Context Utilization: % of retrieved context used in the response
  • Hallucination Rate: % of responses with unsupported claims
  • Source Attribution: % of responses with correct source citations
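
A minimal sketch of Recall@k, the retrieval-accuracy metric above; the function and argument names are illustrative, not taken from a specific evaluation library:

```python
# Recall@k sketch: average, over queries, of the fraction of each query's
# labeled-relevant documents that appear among its top-k retrieved ids.
def recall_at_k(retrieved: list[list[int]], relevant: list[set[int]], k: int) -> float:
    scores = [
        len(set(got[:k]) & want) / len(want)
        for got, want in zip(retrieved, relevant)
        if want  # skip queries with no labeled relevant docs
    ]
    return sum(scores) / len(scores) if scores else 0.0

# e.g., recall_at_k([[3, 1, 7]], [{1, 2}], k=3) -> 0.5
```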

💡 Top Use Cases

Document Q&A: Simple factual questions over company documents or knowledge bases
FAQ Systems: Automated responses to frequently asked questions using existing documentation
Research Assistance: Basic information retrieval from academic papers or research collections
Customer Support: Level 1 support queries answerable from documentation and manuals
Educational Tools: Simple question-answering over textbooks and educational materials

References & Further Reading

Deepen your understanding with these curated resources:

  • Lewis et al. (2020), "Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks", NeurIPS 2020.

