🧩

Modular RAG (MRAG)

Flexible RAG architecture with interchangeable modules supporting iterative, adaptive, and non-sequential retrieval patterns

Complexity: high
Category: Knowledge Retrieval (RAG)

🎯 30-Second Overview

Pattern: Decomposed RAG architecture with independent, interchangeable modules connected through standardized interfaces

Why: Enables flexibility, maintainability, and team scalability by separating concerns into distinct, testable components

Key Insight: Module boundaries defined by function (retrieval, ranking, fusion, generation) with standardized APIs enabling hot-swapping
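The idea of a standardized interface that lets modules be swapped can be sketched in a few lines. This is an illustration under stated assumptions, not a reference implementation: the `Document` schema, the `Retriever` protocol, and both toy retrievers are hypothetical names invented for this example.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass
class Document:
    """A retrieved passage plus a relevance score (hypothetical schema)."""
    text: str
    score: float = 0.0


class Retriever(Protocol):
    """Standardized retrieval interface: any object with this method fits."""

    def retrieve(self, query: str, top_k: int) -> list[Document]: ...


class KeywordRetriever:
    """Toy retriever that scores documents by shared lowercase words."""

    def __init__(self, corpus: list[str]) -> None:
        self.corpus = corpus

    def retrieve(self, query: str, top_k: int) -> list[Document]:
        terms = set(query.lower().split())
        docs = [
            Document(text, float(len(terms & set(text.lower().split()))))
            for text in self.corpus
        ]
        docs.sort(key=lambda d: d.score, reverse=True)
        return docs[:top_k]


class SubstringRetriever:
    """Alternative toy retriever: keeps documents containing the whole query."""

    def __init__(self, corpus: list[str]) -> None:
        self.corpus = corpus

    def retrieve(self, query: str, top_k: int) -> list[Document]:
        q = query.lower()
        hits = [Document(text, 1.0) for text in self.corpus if q in text.lower()]
        return hits[:top_k]


def build_context(retriever: Retriever, query: str) -> list[str]:
    """Depends only on the interface, so either retriever can be passed in."""
    return [doc.text for doc in retriever.retrieve(query, top_k=2)]
```

Because `build_context` only relies on the `retrieve` signature, replacing `KeywordRetriever` with `SubstringRetriever` requires no change to the caller, which is the property that hot-swapping depends on.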

⚡ Quick Implementation

1. Module Design: Create independent retrieval, generation, and augmentation modules

2. Interface Definition: Define standardized APIs between modules

3. Pipeline Assembly: Compose modules into flexible processing pipelines

4. Dynamic Routing: Route queries to appropriate module combinations

5. Module Orchestration: Coordinate execution and data flow between modules

Example: query → module_router → [retrieval_module, rerank_module, fusion_module] → generation_module → response
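The example flow above could look roughly like the following sketch. The module names mirror the example line; everything else (the `State` record, the in-memory `CORPUS`, the word-count routing rule, and the string-formatting stand-in for an LLM call) is an assumption made only to keep the example runnable.

```python
from dataclasses import dataclass, replace
from typing import Callable


@dataclass(frozen=True)
class State:
    """Immutable record passed between modules (hypothetical schema)."""
    query: str
    docs: tuple[str, ...] = ()
    answer: str = ""


Module = Callable[[State], State]

# Tiny in-memory corpus standing in for a real document store.
CORPUS = (
    "naive rag uses one fixed pipeline",
    "modular rag swaps modules",
    "bananas are yellow",
)


def _overlap(query: str, doc: str) -> int:
    return len(set(query.lower().split()) & set(doc.lower().split()))


def retrieval_module(state: State) -> State:
    # Keep every document sharing at least one word with the query.
    hits = tuple(d for d in CORPUS if _overlap(state.query, d) > 0)
    return replace(state, docs=hits)


def rerank_module(state: State) -> State:
    # Order candidates by word overlap, highest first.
    ranked = sorted(state.docs, key=lambda d: -_overlap(state.query, d))
    return replace(state, docs=tuple(ranked))


def fusion_module(state: State) -> State:
    # Deduplicate while preserving order, then keep the top two.
    return replace(state, docs=tuple(dict.fromkeys(state.docs))[:2])


def generation_module(state: State) -> State:
    # Stand-in for an LLM call: formats the query and context into a string.
    context = " | ".join(state.docs) or "no context"
    return replace(state, answer=f"Q: {state.query} / context: {context}")


def module_router(query: str) -> list[Module]:
    # Hypothetical rule: very short queries skip retrieval entirely.
    if len(query.split()) < 3:
        return [generation_module]
    return [retrieval_module, rerank_module, fusion_module, generation_module]


def run_pipeline(query: str) -> State:
    state = State(query=query)
    for module in module_router(query):
        state = module(state)
    return state
```

Each module takes a `State` and returns a new one, so the orchestrator is a plain loop and no module mutates data owned by another.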

📋 Do's & Don'ts

✅ Design modules with clear input/output interfaces and standardized APIs

✅ Implement hot-swappable modules for A/B testing and gradual rollouts

✅ Use dependency injection for flexible module composition

✅ Cache module outputs at appropriate granularity levels

✅ Implement module health checks and fallback mechanisms

❌ Create tight coupling between modules or shared mutable state

❌ Skip module versioning and backward compatibility considerations

❌ Ignore module-level monitoring and observability

❌ Over-engineer module boundaries for simple use cases

❌ Neglect module performance isolation and resource limits
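Two of the recommendations above, dependency injection and health checks with fallbacks, can be combined in one small wrapper. This is a minimal sketch under assumptions: the `FallbackRetriever` class, the failure-count health rule, and the threshold of three are all hypothetical choices, not a prescribed design.

```python
from typing import Callable, Sequence

Retrieve = Callable[[str], list[str]]


class FallbackRetriever:
    """Tries injected retrievers in order and returns the first success."""

    def __init__(self, retrievers: Sequence[tuple[str, Retrieve]], max_failures: int = 3) -> None:
        # Retrievers are injected by the caller rather than constructed here.
        self.retrievers = list(retrievers)
        self.max_failures = max_failures
        self.failures: dict[str, int] = {name: 0 for name, _ in self.retrievers}

    def healthy(self, name: str) -> bool:
        # Simplistic health check: a module is unhealthy after too many failures.
        return self.failures[name] < self.max_failures

    def __call__(self, query: str) -> list[str]:
        for name, retrieve in self.retrievers:
            if not self.healthy(name):
                continue  # skip modules that have been marked unhealthy
            try:
                return retrieve(query)
            except Exception:
                # Broad catch is deliberate in this sketch; narrow it in real code.
                self.failures[name] += 1
        # Graceful degradation: no context rather than a failed request.
        return []
```

A usage example: `FallbackRetriever([("vector", vector_search), ("keyword", keyword_search)])`, where both functions are placeholders for whatever retrievers the system provides. A production version would likely also reset failure counts after a cool-down period, which this sketch omits.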

🚦 When to Use

Use When

• Large-scale RAG systems requiring flexibility and maintainability
• Multi-team development with different domain expertise
• Need for rapid experimentation with different approaches
• Systems requiring different behavior for different query types
• Production environments needing gradual rollouts and A/B testing

Avoid When

• Simple single-purpose RAG applications
• Resource-constrained environments with tight latency budgets
• Small teams without modular architecture experience
• Prototypes and proof-of-concept implementations
• Systems with stable, unchanging requirements

📊 Key Metrics

Module Isolation: Independence and replaceability of individual modules

Pipeline Flexibility: Number of supported module combinations and configurations

Development Velocity: Time to implement new modules or modify existing ones

System Reliability: Fault isolation and graceful degradation capabilities

Performance Scalability: Independent scaling of bottleneck modules

Configuration Complexity: Ease of pipeline composition and module orchestration
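Several of these metrics rest on per-module measurements. One way to collect call counts, error counts, and latency per module is a thin wrapper, sketched below using only the standard library. The `ModuleMetrics` class and its method names are hypothetical; a production system would more likely export to a tool such as Prometheus or OpenTelemetry.

```python
import time
from collections import defaultdict
from typing import Any, Callable


class ModuleMetrics:
    """Collects latency samples and error counts keyed by module name."""

    def __init__(self) -> None:
        self.latencies: dict[str, list[float]] = defaultdict(list)
        self.errors: dict[str, int] = defaultdict(int)

    def timed(self, name: str, fn: Callable[..., Any]) -> Callable[..., Any]:
        """Wrap a module callable so every call is timed and errors are counted."""

        def wrapper(*args: Any, **kwargs: Any) -> Any:
            start = time.perf_counter()
            try:
                return fn(*args, **kwargs)
            except Exception:
                self.errors[name] += 1
                raise  # re-raise so the caller still sees the failure
            finally:
                # Runs on both success and failure, so failed calls are timed too.
                self.latencies[name].append(time.perf_counter() - start)

        return wrapper

    def summary(self) -> dict[str, dict[str, float]]:
        return {
            name: {
                "calls": len(samples),
                "errors": self.errors[name],
                "mean_s": sum(samples) / len(samples),
            }
            for name, samples in self.latencies.items()
        }
```

Wrapping each module at pipeline-assembly time, for example `rerank = metrics.timed("rerank", rerank_fn)` with `rerank_fn` as a placeholder, keeps the measurement code out of the modules themselves.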

💡 Top Use Cases

Enterprise RAG Platforms: Modular architecture supporting multiple business units with different requirements

Research Infrastructure: Academic platforms allowing researchers to experiment with different RAG components

Multi-Domain Systems: E-commerce platforms with specialized modules for products, reviews, and support content

SaaS RAG Services: Cloud platforms offering configurable RAG pipelines to customers

Hybrid AI Systems: Complex architectures combining RAG with other AI capabilities like code generation and analysis

References & Further Reading

Deepen your understanding with these curated resources

Modular RAG Frameworks & Architecture

Modular RAG: Transforming RAG Systems into LEGO-like Reconfigurable Frameworks (Gao et al., 2024)

RAG and RAU: A Survey on Retrieval-Augmented Language Model in Natural Language Processing (Hu & Lu, 2024)

A Comprehensive Survey of Retrieval-Augmented Generation (RAG): Evolution, Current Landscape and Future Directions (Gupta et al., 2024)

Retrieval-Augmented Generation for Large Language Models: A Survey (Gao et al., 2023)

Module Design Patterns & APIs

LlamaIndex Modular Architecture: Building Blocks and Interfaces

LangChain Modular Components: Retrievers, Chains, and Tools

Haystack Pipeline Architecture: Modular NLP Framework

DSPy: Programming Foundation Models with Composable Modules

Microservices & Distributed Architecture

Building Microservices: Designing Fine-Grained Systems (Newman, 2021)

Microservices Patterns: With Examples in Java (Richardson, 2018)

The Twelve-Factor App: Methodology for Building SaaS Applications

Domain-Driven Design: Tackling Complexity in the Heart of Software (Evans, 2003)

Pipeline Orchestration & Workflow Management

Apache Airflow: Platform for Workflow Management and Scheduling

Kubeflow Pipelines: Machine Learning Workflows on Kubernetes

Prefect: Modern Workflow Orchestration Framework

LangGraph: Graph-Based Multi-Actor Applications with LangChain

Implementation Frameworks & Tools

Ray Serve: Scalable Model Serving with Python

FastAPI: Modern Web Framework for Building APIs

Pydantic: Data Validation Using Python Type Hints

Docker Compose: Multi-Container Application Definition

Monitoring & Observability

OpenTelemetry: Observability Framework for Cloud-Native Software

Jaeger: End-to-End Distributed Tracing

Prometheus: Monitoring System and Time Series Database

LangSmith: LLM Application Development Platform
