17 — Comparisons
What Was Researched
Cross-cutting comparisons across all studied codebases, identifying common patterns, divergent approaches, and relative strengths/weaknesses.
Which Sources Were Used
All 10 local codebases + model landscape research from directory 13.
Key Findings
Master Comparison: Agent Frameworks
| Feature | Hermes | OpenClaw | Codex | Pi | LangGraph |
|---|---|---|---|---|---|
| Language | Python | TypeScript | Rust | TypeScript | Python |
| License | MIT | MIT | Apache-2.0 | MIT | MIT |
| Architecture | Monolithic + plugins | Gateway-centric | Modular crates | Minimal monorepo | Graph state machine |
| Agent loop | While-loop (sync) | While-loop | While-loop | While-loop | Graph traversal |
| Model-agnostic | ✅ 200+ via OpenRouter | ✅ Multi-provider | ⚠️ OpenAI-focused | ✅ Multi-provider | ✅ Via LangChain |
| Tools | 87+ files | First-class | Minimal core + MCP | Minimal | Via LangChain tools |
| MCP | ✅ Client + Server | ✅ Client | ✅ First-class | ❌ | ⚠️ Via integration |
| Skills | ✅ Autonomous | ✅ Manual | ❌ | ❌ | ❌ |
| Memory | ✅ 8 providers | ✅ Files | ❌ | ❌ | ✅ Checkpoints |
| Subagents | ✅ Advanced | ✅ Session-based | ❌ | ❌ | ✅ Subgraphs |
| Voice | ✅ Full pipeline | ✅ Wake + Talk | ❌ | ❌ | ❌ |
| Channels | 20+ platforms | 26+ platforms | CLI only | CLI only | N/A (library) |
| Sandboxing | ✅ 7 backends | ✅ Docker/SSH | ✅ OS-native | ✅ External | N/A |
| AGENTS.md | ✅ 71KB | ✅ | ✅ Creator | ✅ | ⚠️ |
| Cron/automation | ✅ | ✅ + webhooks | ❌ | ❌ | ❌ |
| Desktop app | ✅ Electron | ✅ Win/Mac/iOS/Android | ❌ | ❌ | N/A |
| Test suite | ~1,655 test files | CI | N/A | vitest | pytest |
| Community | Nous Research | 200+ contributors | OpenAI | Small team | LangChain Inc |
SDK / Proxy Comparison
| Feature | LiteLLM | OpenRouter SDK | LangChain | Open Responses |
|---|---|---|---|---|
| Language | Python | TypeScript | Python | Go |
| Approach | SDK + Proxy | SDK | SDK + integrations | API proxy |
| Providers | 100+ | 200+ (via OR) | 700+ integrations | Any OpenAI-compat |
| Wire format | OpenAI-compat | OpenAI-compat | Custom + adapters | OpenAI Responses |
| Provider translation | BaseConfig pattern | Speakeasy-gen | Provider packages | Go structs |
| Streaming | ✅ | ✅ ReusableStream | ✅ | ✅ |
| Caching | ✅ Redis | N/A | ⚠️ | ✅ Redis |
| Self-hostable | ✅ | N/A (client SDK) | N/A | ✅ |
| Production scale | ✅ 8ms P95 | ✅ | ✅ | ⚠️ Single monolith |
Model Tier Comparison (June 2026 pricing)
| Tier | Best Value | Best Quality | Best for Coding |
|---|---|---|---|
| Frontier | Nemotron 3 Ultra ($0.50/M in) | Claude Fable 5 ($10/M in) | Kimi K2.7 Code ($0.68/M in) |
| Flash | Gemini 3.5 Flash | GPT-5.4 Mini | DeepSeek V4 Flash |
| Nano | GPT-5.4 Nano | — | — |
| Embedding | Text Embedding 3 Small ($0.02/M) | Text Embedding 3 Large ($0.13/M) | — |
| Reranking | Rerank 4 Fast | Rerank 4 Pro | — |
Architecture Pattern Comparison
| Pattern | Proponent | Tradeoffs |
|---|---|---|
| Monolithic agent | Hermes | Simple deploy, hard to scale, huge files |
| Gateway-centric | OpenClaw | Scalable, distributed, complex setup |
| Modular crates | Codex | Maximum modularity, Rust learning curve |
| Minimal monorepo | Pi | Clean separation, limited features |
| Graph state machine | LangGraph | Best debugging, academic overhead |
| SDK + Proxy | LiteLLM | Flexible, dual deployment, two codebases |
What Is Confirmed
- No single framework has it all — Hermes leads in features, OpenClaw in channels, Codex in security, Pi in simplicity, LangGraph in debuggability
- Python + TypeScript is the winning stack — Python for agent logic, TypeScript for gateway/frontend
- OpenAI-compatible wire format is the universal choice
- AGENTS.md + SKILL.md are the emerging file conventions
- MCP is the extensibility standard — all major frameworks support it
- 20x pricing spread in models means intelligent routing is critical
How This Applies to Building a Modern Model-Agnostic Agent Harness
The ideal harness would combine:
- Hermes's learning loop (skills, memory, autonomous improvement)
- OpenClaw's gateway architecture (multi-channel, distributed)
- Codex's security model (sandboxing, context discipline)
- Pi's clean separation (AI → Agent → Application layers)
- LangGraph's state management (checkpointing, interrupts)
- OpenAI-compat model access (Ollama local, OpenRouter hosted, or optional LiteLLM proxy — not all three)