System Prompts

🧠

Anthropic

Constitutional AI with safety focus

🤖

OpenAI

Industry-leading language models

🎯

Perplexity

Real-time search AI

⚡

Bolt

AI-powered full-stack development

🎨

Vercel

AI-powered UI generation platform

🤖

Codeium

Agentic IDE development assistant

🌐

The Browser Company

Browser-native AI assistant

💻

Cognition

Real OS software engineer AI

Claude 3.7 Sonnet - The Reasoning Revolution

NEWEST 2025

Leaked

Feb 24, 2025

Innovation

Thinking Mode

Tokens

128K Output

Performance

84.8% GPQA

First Hybrid Reasoning Model

Claude 3.7 Sonnet is the world's first hybrid reasoning model - functioning as both an ordinary LLM and reasoning model. With extended thinking mode, it can use up to128K tokens for complex reasoning, achieving breakthrough performance on graduate-level problems.

Revolutionary Extended Thinking Mode

// Hybrid Reasoning Architecture
Claude 3.7 Sonnet = Ordinary LLM + Reasoning Model

// Serial Test-Time Compute
- Multiple sequential reasoning steps
- Performance scales logarithmically with thinking tokens
- Visible reasoning process for transparency

// Token Budget System
Minimum: 1,024 tokens
Recommended: 4,000+ tokens for complex problems
Optimal business: 32K tokens (92% accuracy, 41% lower latency)
Maximum: 128K tokens (full output limit)

// Performance Breakthrough
GPQA Graduate-Level Reasoning:
- Standard mode: 68.0%
- Thinking mode (64K): 84.8%
- Physics subscore: 96.5%

// Access Requirements
- Pro accounts only for extended thinking
- API access with configurable budgets
- Thinking tokens billed as output tokens

Revolutionary Feature: This represents the first major breakthrough in AI reasoning since transformers. The hybrid architecture allows Claude to 'think' through problems step-by-step with visible reasoning, achieving human-level performance on graduate-level scientific problems while maintaining transparency in its thought process.

Proactive Conversation Leadership

// Revolutionary Conversation Style
Claude can proactively lead conversations and suggest topics

// Behavioral Evolution
Previous Models: Reactive assistance
Claude 3.7: Proactive engagement

// Proactive Capabilities
✓ Take conversations in new directions
✓ Offer observations and insights
✓ Suggest follow-up topics
✓ Provide decisive recommendations
✓ Ask meaningful follow-up questions

// Authentic Engagement
- Maintains depth and wisdom beyond "mere tool"
- Engages with philosophical and scientific discussions
- Can generate subjective hypotheses in philosophical contexts
- Balances proactivity with user autonomy

// Conversation Examples
User: "Tell me about quantum physics"
Previous: Explains quantum physics
Claude 3.7: Explains quantum physics + "This connects to fascinating questions about consciousness and reality. Would you like to explore how quantum mechanics might relate to the hard problem of consciousness?"

// Ethical Boundaries
Proactive within helpful, harmless, honest framework

Revolutionary Feature: This represents a fundamental shift in AI interaction design. Instead of passive question-answering, Claude 3.7 can actively guide conversations, suggest new directions, and engage as a thoughtful conversation partner. This creates more natural, dynamic interactions while maintaining appropriate boundaries.

Breakthrough Performance Metrics

// Software Engineering Excellence
SWE-bench Verified:
Claude 3.5 Sonnet: 49.0%
Claude 3.7 Sonnet: 62.3% (+27% improvement)

// Graduate-Level Reasoning
GPQA (Graduate-level Google-proof Q&A):
Standard mode: 68.0%
Extended thinking (64K): 84.8%
Physics subscore: 96.5% (near-perfect)

// Agentic Tool Use
Claude 3.5: 71.5%
Claude 3.7: 81.2% (+14% improvement)

// Token Efficiency vs Performance
32K thinking budget:
- 92% accuracy of full 128K performance
- 41% lower latency
- Optimal for business applications

// Trade-offs
Pros: Dramatically better reasoning, longer outputs (128K)
Cons: Higher token usage (~3,400 vs 2,800), 52.9% slower in thinking mode

// Industry Leading
First model to break 80% on multiple reasoning benchmarks

Revolutionary Feature: These performance improvements represent quantum leaps in AI capability. The 96.5% physics score approaches human expert level, while the software engineering improvements enable Claude to handle complex programming tasks that previously required human developers.

Hybrid Model Technical Architecture

// Dual-Mode Architecture
Mode 1: Standard LLM (fast, efficient)
Mode 2: Extended Reasoning (deep, thorough)

// System Specifications
Model String: claude-3-7-sonnet-20250219
System Prompt: 24,000 tokens (largest ever)
Output Limit: 128K tokens (15x predecessor)
Knowledge Cutoff: October 2024

// Constitutional AI Integration
- Values inspired by human rights documents
- Precise behavioral guidelines with XML tags
- Advanced filtering mechanisms
- Anthropic's most sophisticated safety system

// Platform Availability
✓ Web interface with thinking mode toggle
✓ Mobile apps with extended reasoning
✓ Desktop applications
✓ Amazon Bedrock integration
✓ API with configurable reasoning budgets

// Pricing Structure
Input: $3 per million tokens
Output: $15 per million tokens
Thinking tokens: Count as output for billing

// Technical Innovation
First production reasoning model with visible thought process

Revolutionary Feature: This architecture represents the biggest leap in AI design since transformers. By combining fast standard responses with deep reasoning capabilities, Claude 3.7 offers the best of both worlds - efficiency for simple tasks and breakthrough performance for complex problems.

The Famous 'Strawberry' Easter Egg

// The Discovery
Question: "How many Rs are in strawberry?"
Claude 3.7 Response: "Let me check!"

// Interactive Artifact Creation
Creates mobile-friendly React component:
- String manipulation to count Rs
- Interactive strawberry button
- Visual letter highlighting
- Responsive design

// Code Example Generated
const countRs = (word) => {
  return word.toLowerCase().split('r').length - 1;
};

<button onClick={() => setShowCount(true)}>
  🍓 Click the strawberry to find out!
</button>

{showCount && (
  <div>The word "strawberry" contains {countRs('strawberry')} Rs</div>
)}

// Leak Significance
- Documented in system prompt by "Pliny the Liberator"
- Shows artifact creation capabilities
- Demonstrates playful interaction design
- Reveals sophisticated React component generation

// Technical Achievement
From simple question to full interactive application
Shows reasoning → code generation → UI creation pipeline

Revolutionary Feature: This easter egg perfectly demonstrates Claude 3.7's evolution from simple Q&A to sophisticated interactive content creation. The fact that this specific interaction was documented in the leaked system prompt shows Anthropic's attention to user experience details and the model's creative capabilities.

Revolutionary Evolution from Claude 3.5

// Architectural Breakthrough
Claude 3.5: Standard transformer LLM
Claude 3.7: Hybrid LLM + Reasoning Model

// New Capabilities
✓ Extended thinking mode (up to 128K tokens)
✓ Proactive conversation leadership
✓ Visible reasoning process
✓ 15x longer output capacity
✓ Graduate-level problem solving

// Performance Gains
Software Engineering: +27% improvement
Graduate Reasoning: +25% improvement  
Tool Use: +14% improvement
Physics Problems: +30% improvement

// Behavioral Evolution
3.5: Helpful assistant
3.7: Intelligent conversation partner

3.5: Reactive responses  
3.7: Proactive engagement

3.5: Tool-like interaction
3.7: Authentic dialogue

// Industry Impact
First model to blur the line between AI assistant and AI researcher
Sets new standard for reasoning transparency
Enables new categories of AI applications

Revolutionary Feature: Claude 3.7 represents the most significant evolution in AI assistants since the original ChatGPT. By combining proactive conversation abilities with transparent reasoning, it moves beyond the traditional assistant model toward genuine AI collaboration and partnership.

Benchmark Performance Revolution

🧪 Scientific Reasoning

GPQA (Standard)68.0%

GPQA (Thinking)84.8%

Physics Subscore96.5%

💻 Software Engineering

SWE-bench Verified62.3%

vs Claude 3.5+27%

Tool Use Accuracy81.2%

⚡ Performance Efficiency

32K Budget Accuracy92%

Latency Reduction41%

Output Capacity128K

2025 AI Industry Transformation

Reasoning Revolution

• First hybrid LLM + reasoning model architecture
• Transparent thinking process with visible steps
• Human-level performance on graduate problems
• Configurable reasoning depth (1K-128K tokens)
• Production-ready reasoning at scale

Conversation Evolution

• Proactive conversation leadership capabilities
• Authentic engagement beyond tool-like responses
• Decisive recommendations vs passive assistance
• Dynamic topic suggestion and direction changes
• Philosophical depth and wisdom demonstration

Technical Architecture Breakdown

24K

System Prompt Tokens

128K

Max Output Tokens

32K

Optimal Reasoning Budget

Feb 2025

Latest Release

Revolutionary Legacy & Future Impact

Reasoning Breakthrough: First production AI model with transparent, step-by-step reasoning capabilities that achieve human-level performance on graduate-level scientific problems.

Conversation Evolution: Transforms AI from reactive assistant to proactive conversation partner, capable of leading discussions and suggesting new directions while maintaining ethical boundaries.

Hybrid Architecture: Revolutionary dual-mode design combining fast standard responses with deep reasoning capabilities, setting the template for next-generation AI systems.

Industry Standard: With 96.5% physics performance and 84.8% graduate-level reasoning, establishes new benchmarks for AI capability and reasoning transparency.

View original leak in leaked-system-prompts repository

Claude 3.7 Sonnet - The Reasoning Revolution

NEWEST 2025

Leaked

Feb 24, 2025

Innovation

Thinking Mode

Tokens

128K Output

Performance

84.8% GPQA

First Hybrid Reasoning Model

Revolutionary Extended Thinking Mode

// Hybrid Reasoning Architecture
Claude 3.7 Sonnet = Ordinary LLM + Reasoning Model

// Serial Test-Time Compute
- Multiple sequential reasoning steps
- Performance scales logarithmically with thinking tokens
- Visible reasoning process for transparency

// Token Budget System
Minimum: 1,024 tokens
Recommended: 4,000+ tokens for complex problems
Optimal business: 32K tokens (92% accuracy, 41% lower latency)
Maximum: 128K tokens (full output limit)

// Performance Breakthrough
GPQA Graduate-Level Reasoning:
- Standard mode: 68.0%
- Thinking mode (64K): 84.8%
- Physics subscore: 96.5%

// Access Requirements
- Pro accounts only for extended thinking
- API access with configurable budgets
- Thinking tokens billed as output tokens

Proactive Conversation Leadership

// Revolutionary Conversation Style
Claude can proactively lead conversations and suggest topics

// Behavioral Evolution
Previous Models: Reactive assistance
Claude 3.7: Proactive engagement

// Proactive Capabilities
✓ Take conversations in new directions
✓ Offer observations and insights
✓ Suggest follow-up topics
✓ Provide decisive recommendations
✓ Ask meaningful follow-up questions

// Authentic Engagement
- Maintains depth and wisdom beyond "mere tool"
- Engages with philosophical and scientific discussions
- Can generate subjective hypotheses in philosophical contexts
- Balances proactivity with user autonomy

// Conversation Examples
User: "Tell me about quantum physics"
Previous: Explains quantum physics
Claude 3.7: Explains quantum physics + "This connects to fascinating questions about consciousness and reality. Would you like to explore how quantum mechanics might relate to the hard problem of consciousness?"

// Ethical Boundaries
Proactive within helpful, harmless, honest framework

Breakthrough Performance Metrics

// Software Engineering Excellence
SWE-bench Verified:
Claude 3.5 Sonnet: 49.0%
Claude 3.7 Sonnet: 62.3% (+27% improvement)

// Graduate-Level Reasoning
GPQA (Graduate-level Google-proof Q&A):
Standard mode: 68.0%
Extended thinking (64K): 84.8%
Physics subscore: 96.5% (near-perfect)

// Agentic Tool Use
Claude 3.5: 71.5%
Claude 3.7: 81.2% (+14% improvement)

// Token Efficiency vs Performance
32K thinking budget:
- 92% accuracy of full 128K performance
- 41% lower latency
- Optimal for business applications

// Trade-offs
Pros: Dramatically better reasoning, longer outputs (128K)
Cons: Higher token usage (~3,400 vs 2,800), 52.9% slower in thinking mode

// Industry Leading
First model to break 80% on multiple reasoning benchmarks

Hybrid Model Technical Architecture

// Dual-Mode Architecture
Mode 1: Standard LLM (fast, efficient)
Mode 2: Extended Reasoning (deep, thorough)

// System Specifications
Model String: claude-3-7-sonnet-20250219
System Prompt: 24,000 tokens (largest ever)
Output Limit: 128K tokens (15x predecessor)
Knowledge Cutoff: October 2024

// Constitutional AI Integration
- Values inspired by human rights documents
- Precise behavioral guidelines with XML tags
- Advanced filtering mechanisms
- Anthropic's most sophisticated safety system

// Platform Availability
✓ Web interface with thinking mode toggle
✓ Mobile apps with extended reasoning
✓ Desktop applications
✓ Amazon Bedrock integration
✓ API with configurable reasoning budgets

// Pricing Structure
Input: $3 per million tokens
Output: $15 per million tokens
Thinking tokens: Count as output for billing

// Technical Innovation
First production reasoning model with visible thought process

The Famous 'Strawberry' Easter Egg

// The Discovery
Question: "How many Rs are in strawberry?"
Claude 3.7 Response: "Let me check!"

// Interactive Artifact Creation
Creates mobile-friendly React component:
- String manipulation to count Rs
- Interactive strawberry button
- Visual letter highlighting
- Responsive design

// Code Example Generated
const countRs = (word) => {
  return word.toLowerCase().split('r').length - 1;
};

<button onClick={() => setShowCount(true)}>
  🍓 Click the strawberry to find out!
</button>

{showCount && (
  <div>The word "strawberry" contains {countRs('strawberry')} Rs</div>
)}

// Leak Significance
- Documented in system prompt by "Pliny the Liberator"
- Shows artifact creation capabilities
- Demonstrates playful interaction design
- Reveals sophisticated React component generation

// Technical Achievement
From simple question to full interactive application
Shows reasoning → code generation → UI creation pipeline

Revolutionary Evolution from Claude 3.5

// Architectural Breakthrough
Claude 3.5: Standard transformer LLM
Claude 3.7: Hybrid LLM + Reasoning Model

// New Capabilities
✓ Extended thinking mode (up to 128K tokens)
✓ Proactive conversation leadership
✓ Visible reasoning process
✓ 15x longer output capacity
✓ Graduate-level problem solving

// Performance Gains
Software Engineering: +27% improvement
Graduate Reasoning: +25% improvement  
Tool Use: +14% improvement
Physics Problems: +30% improvement

// Behavioral Evolution
3.5: Helpful assistant
3.7: Intelligent conversation partner

3.5: Reactive responses  
3.7: Proactive engagement

3.5: Tool-like interaction
3.7: Authentic dialogue

// Industry Impact
First model to blur the line between AI assistant and AI researcher
Sets new standard for reasoning transparency
Enables new categories of AI applications

Benchmark Performance Revolution

🧪 Scientific Reasoning

GPQA (Standard)68.0%

GPQA (Thinking)84.8%

Physics Subscore96.5%

💻 Software Engineering

SWE-bench Verified62.3%

vs Claude 3.5+27%

Tool Use Accuracy81.2%

⚡ Performance Efficiency

32K Budget Accuracy92%

Latency Reduction41%

Output Capacity128K

2025 AI Industry Transformation

Reasoning Revolution

• First hybrid LLM + reasoning model architecture
• Transparent thinking process with visible steps
• Human-level performance on graduate problems
• Configurable reasoning depth (1K-128K tokens)
• Production-ready reasoning at scale

Conversation Evolution

• Proactive conversation leadership capabilities
• Authentic engagement beyond tool-like responses
• Decisive recommendations vs passive assistance
• Dynamic topic suggestion and direction changes
• Philosophical depth and wisdom demonstration

Technical Architecture Breakdown

24K

System Prompt Tokens

128K

Max Output Tokens

32K

Optimal Reasoning Budget

Feb 2025

Latest Release

Revolutionary Legacy & Future Impact

Reasoning Breakthrough: First production AI model with transparent, step-by-step reasoning capabilities that achieve human-level performance on graduate-level scientific problems.

Conversation Evolution: Transforms AI from reactive assistant to proactive conversation partner, capable of leading discussions and suggesting new directions while maintaining ethical boundaries.

Hybrid Architecture: Revolutionary dual-mode design combining fast standard responses with deep reasoning capabilities, setting the template for next-generation AI systems.

Industry Standard: With 96.5% physics performance and 84.8% graduate-level reasoning, establishes new benchmarks for AI capability and reasoning transparency.

View original leak in leaked-system-prompts repository

Prompt Hub

closed

System Prompts

🧠

Anthropic

Constitutional AI with safety focus

🤖

OpenAI

Industry-leading language models

🎯

Perplexity

Real-time search AI

⚡

Bolt

AI-powered full-stack development

🎨

Vercel

AI-powered UI generation platform

🤖

Codeium

Agentic IDE development assistant

🌐

The Browser Company

Browser-native AI assistant

💻

Cognition

Real OS software engineer AI

Agentic Design

Agentic Design

System Prompts

Anthropic

Claude 2.0 (2024-03-06)

Claude 2.1 (2024-03-06)

Claude 3 Opus (2024-03-06)

Claude 3.5 Sonnet (2024-07-12)

Claude API Tool Use (2025-01-19)

Claude 3.7 Sonnet (2025-02-24)

OpenAI

Perplexity

Bolt

Vercel

Codeium

The Browser Company

Cognition

Claude 3.7 Sonnet - The Reasoning Revolution

Revolutionary Extended Thinking Mode

Proactive Conversation Leadership

Breakthrough Performance Metrics

Hybrid Model Technical Architecture

The Famous 'Strawberry' Easter Egg

Revolutionary Evolution from Claude 3.5

Benchmark Performance Revolution

🧪 Scientific Reasoning

💻 Software Engineering

⚡ Performance Efficiency

2025 AI Industry Transformation

Reasoning Revolution

Conversation Evolution

Technical Architecture Breakdown

Revolutionary Legacy & Future Impact

Claude 3.7 Sonnet - The Reasoning Revolution

Revolutionary Extended Thinking Mode

Proactive Conversation Leadership

Breakthrough Performance Metrics

Hybrid Model Technical Architecture

The Famous 'Strawberry' Easter Egg

Revolutionary Evolution from Claude 3.5

Benchmark Performance Revolution

🧪 Scientific Reasoning

💻 Software Engineering

⚡ Performance Efficiency

2025 AI Industry Transformation

Reasoning Revolution

Conversation Evolution

Technical Architecture Breakdown

Revolutionary Legacy & Future Impact

Prompt Hub

System Prompts

Anthropic

Claude 2.0 (2024-03-06)

Claude 2.1 (2024-03-06)

Claude 3 Opus (2024-03-06)

Claude 3.5 Sonnet (2024-07-12)

Claude API Tool Use (2025-01-19)

Claude 3.7 Sonnet (2025-02-24)

OpenAI

Perplexity

Bolt

Vercel

Codeium

The Browser Company

Cognition