LLM-based Routing (LBR)

An intelligent query distribution system in which a specialized LLM router analyzes each incoming request and dynamically routes it to the most appropriate model, API endpoint, or processing pipeline. The router classifies the intent of the query and matches it against the capabilities of each target, with the aim of using resources efficiently while preserving response quality.

Complexity: medium | Routing

🎯 30-Second Overview

Pattern: Use an LLM to analyze the input and choose the routing path dynamically

Why: Handles complex, nuanced routing decisions that simple rules can't capture

Key Insight: Prompt engineering + structured outputs = reliable intent classification
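
A minimal sketch of how prompt construction and a closed label set can work together. The Intent labels, the few-shot examples, and the helper names build_router_prompt and parse_intent are illustrative assumptions, not part of any particular library; the model call itself is left out.

```python
import json
from enum import Enum
from typing import List, Tuple


class Intent(str, Enum):
    """Closed set of routing labels (hypothetical example categories)."""
    BOOKING = "booking"
    BILLING = "billing"
    OTHER = "other"


# Few-shot examples shown to the router model (illustrative only).
FEW_SHOT: List[Tuple[str, Intent]] = [
    ("Reserve a room for Friday", Intent.BOOKING),
    ("Why was I charged twice?", Intent.BILLING),
    ("What's the weather like?", Intent.OTHER),
]


def build_router_prompt(query: str) -> str:
    """Build a classification prompt that asks for JSON-only output."""
    labels = ", ".join(i.value for i in Intent)
    shots = "\n".join(
        f'Query: {q}\nAnswer: {{"intent": "{i.value}"}}' for q, i in FEW_SHOT
    )
    return (
        f"Classify the query into one of: {labels}.\n"
        'Answer with JSON only, e.g. {"intent": "booking"}.\n\n'
        f"{shots}\n\nQuery: {query}\nAnswer:"
    )


def parse_intent(raw: str) -> Intent:
    """Parse the model's JSON reply; raises ValueError on an unknown label."""
    return Intent(json.loads(raw)["intent"])
```

Because the reply is parsed into an enum rather than matched as free text, any label outside the allowed set fails loudly instead of silently selecting a route.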

⚡ Quick Implementation

1. Analyze: Prompt the LLM to classify intent/category
2. Extract: Parse the structured output (JSON/enum)
3. Map: Route decision → handler/agent
4. Execute: Invoke the selected workflow
5. Monitor: Track routing accuracy and latency
Example: analyze_query → "category: booking" → booking_agent.invoke(query)
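
The five steps above can be sketched end to end as follows. This is an illustration under stated assumptions: fake_llm is a hypothetical keyword-matching stand-in for a real model call, and the category names, HANDLERS table, and ROUTE_LOG list are invented for the example.

```python
import json
import time
from typing import Callable, Dict, List

ROUTER_PROMPT = (
    "Classify the query as one of: booking, billing, general. "
    'Reply with JSON such as {"category": "booking"}.\n'
    "Query: "
)


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real model call (keyword matching only)."""
    query = prompt.split("Query: ", 1)[-1].lower()
    if "book" in query or "reserve" in query:
        return '{"category": "booking"}'
    if "refund" in query or "invoice" in query:
        return '{"category": "billing"}'
    return '{"category": "general"}'


# Step 3 lookup table: category -> handler (handlers are placeholders).
HANDLERS: Dict[str, Callable[[str], str]] = {
    "booking": lambda q: f"[booking_agent] {q}",
    "billing": lambda q: f"[billing_agent] {q}",
    "general": lambda q: f"[general_agent] {q}",
}

ROUTE_LOG: List[dict] = []  # Step 5: one entry per routing decision


def route(query: str, llm: Callable[[str], str] = fake_llm) -> str:
    start = time.perf_counter()
    raw = llm(ROUTER_PROMPT + query)            # 1. Analyze
    category = json.loads(raw)["category"]      # 2. Extract
    handler = HANDLERS[category]                # 3. Map
    latency_ms = (time.perf_counter() - start) * 1000
    ROUTE_LOG.append({"category": category, "latency_ms": latency_ms})  # 5. Monitor
    return handler(query)                       # 4. Execute
```

The sketch deliberately omits validation and fallback handling, so a malformed reply or an unknown category raises an exception rather than being routed.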

📋 Do's & Don'ts

✅ Use structured outputs (JSON mode) for deterministic parsing
✅ Provide clear examples in the routing prompt
✅ Implement fallback routes for unclear classifications
✅ Cache routing decisions for identical queries
✅ Use temperature=0 for consistent routing
❌ Rely on free-form text parsing for routing
❌ Skip validation of LLM routing output
❌ Use high temperature for routing decisions
❌ Ignore edge cases and ambiguous inputs
❌ Route without confidence thresholds

🚦 When to Use

Use When

  • Complex intent classification needed
  • Natural language understanding required
  • Dynamic routing rules that evolve
  • Multi-dimensional routing criteria

Avoid When

  • Simple keyword-based routing suffices
  • Ultra-low latency requirements (<100 ms)
  • Deterministic routing is mandatory
  • Cost constraints are tight

📊 Key Metrics

Routing Accuracy: % of queries routed correctly
Classification Time: P50/P95 routing latency
Ambiguity Rate: % of queries needing clarification
Cost per Route: LLM tokens × price
Fallback Rate: % of queries routed to the default handler
Cache Hit Rate: % of routing decisions reused from cache
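
One way these metrics might be computed from a routing log is sketched below. The RouteRecord fields and the summarize helper are hypothetical names chosen for illustration; accuracy is computed only over records that carry a ground-truth label, and cost per route is taken as total router tokens times a flat per-token price, averaged over all decisions.

```python
from dataclasses import dataclass
from statistics import quantiles
from typing import Dict, List, Optional


@dataclass
class RouteRecord:
    predicted: str                # route chosen by the router
    expected: Optional[str]       # ground-truth route, if labelled
    latency_ms: float             # time spent on the routing decision
    tokens: int                   # router tokens consumed (0 on a cache hit)
    cache_hit: bool
    needed_clarification: bool


def summarize(records: List[RouteRecord], price_per_token: float,
              fallback_route: str = "fallback") -> Dict[str, float]:
    """Aggregate routing metrics; needs at least two records for percentiles."""
    n = len(records)
    labelled = [r for r in records if r.expected is not None]
    correct = sum(r.predicted == r.expected for r in labelled)
    # 99 cut points: index 49 is the median (P50), index 94 is P95.
    cuts = quantiles([r.latency_ms for r in records], n=100, method="inclusive")
    return {
        "routing_accuracy": correct / len(labelled) if labelled else float("nan"),
        "latency_p50_ms": cuts[49],
        "latency_p95_ms": cuts[94],
        "ambiguity_rate": sum(r.needed_clarification for r in records) / n,
        "cost_per_route": sum(r.tokens for r in records) * price_per_token / n,
        "fallback_rate": sum(r.predicted == fallback_route for r in records) / n,
        "cache_hit_rate": sum(r.cache_hit for r in records) / n,
    }
```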

💡 Top Use Cases

Customer Support: analyze intent → route to sales/tech/billing specialist
Multi-Tool Agents: parse request → select appropriate tool/API
Document Processing: classify type → apply correct parser/workflow
Query Routing: understand complexity → route to fast/powerful model
Workflow Selection: analyze task → choose sequential/parallel execution
