Local Codebase Study: Open Responses
What Was Researched
Architecture and implementation of Open Responses (open-responses/open-responses) — a self-hosted, drop-in replacement for OpenAI's Responses API. Built by the Julep AI team, it enables any LLM provider (Claude, Qwen, Deepseek, Ollama, etc.) to be used with existing OpenAI Agents SDK code without modification.
Which Sources Were Used
- Local clone:
c:\Users\Adam\Desktop\agent2\open-responses - Files analyzed:
README.md,CLAUDE.md,CLI.md,main.go,open_responses/__init__.py,package.json,pyproject.toml,go.mod
Key Findings
Architecture
- Hybrid Go/Python/Node.js project: The core server is implemented as a single
main.gofile (~80KB), with Python and Node.js wrappers for distribution viapipandnpm - Docker Compose orchestration: The runtime deploys an API server, database, and management UI as containerized services
- Configuration: Uses
open-responses.jsonto track setup state, supporting both camelCase and snake_case for backward compatibility - CLI-first design: CLI built with Go's Cobra framework, all Docker Compose commands proxied through it (up, down, logs, ps, exec, etc.)
- Code navigation: Uses marker comment system (
CLAUDE-{type}-{descriptor}) for navigating the largemain.gofile — interesting pattern for large single-file codebases
API Compatibility Layer
- Implements OpenAI's Responses API endpoints — the newer, agentic-first API that replaces Chat Completions for agent workflows
- Supports: web search, file search, computer use (built-in tools)
- Drop-in compatible with
openaiPython/JS SDK — just changebase_urltohttp://localhost:8080/ - Works with OpenAI's official Agents SDK via
set_default_openai_client()
Multi-Provider Support
- Models addressed via provider-prefixed strings:
openrouter/deepseek/deepseek-r1,anthropic/claude-3, etc. - Environment variables for multiple providers:
OPENAI_API_KEY,ANTHROPIC_API_KEY,OPENROUTER_API_KEY, etc. - Rate limiting, request timeout, and payload size controls built in
Distribution Strategy
- npm:
npx -y open-responses init(primary recommended path) - pip:
uvx open-responses init - Docker: Pre-built images on Docker Hub (
julepai/*) - Python wrapper (
open_responses/__init__.py) simply shells out to a platform-specific binary — the Go binary is the actual runtime
Key Architectural Decisions
- Single-binary Go core — entire server in one
main.go(80KB) rather than a modular Go package structure - Docker Compose V1/V2 compatibility — auto-detects and abstracts both formats
- Multi-location config search — current dir → parent dir → git root
- Thin language wrappers — Python/Node packages are just binary launchers, no native reimplementation
What Is Confirmed
- Repository successfully cloned
- Hybrid Go/Python/Node.js architecture verified
- OpenAI Responses API compatibility confirmed
- Julep AI is the parent project
What Is Uncertain
- How the Go server translates between Responses API format and different provider formats internally (main.go is very large)
- Whether streaming is fully supported across all providers
- Performance characteristics vs. direct provider access
- Completeness of Responses API coverage (web search, file search, computer use)
- Whether it supports the OpenAI Realtime API in addition to Responses
How This Applies to Building a Modern Model-Agnostic Agent Harness
Open Responses is directly relevant to harness design in several ways:
- API Compatibility Pattern: Demonstrates how to build a self-hosted API-compatible proxy — a core concern for model-agnostic harnesses that need to work with existing SDKs
- Responses API as Target: The shift from Chat Completions to Responses API for agent workflows is a significant trend. The harness should consider Responses API as a first-class interface
- Multi-Provider Configuration: The environment variable + config file approach for managing multiple provider keys is a practical reference
- Distribution Model: The hybrid Go binary + thin language wrappers is an interesting cross-platform distribution strategy worth studying
- CLI-First UX: The setup wizard and Docker Compose abstraction provide a reference for how agent harnesses might handle deployment
- Single-File Monolith Warning: The 80KB
main.goshows what happens when architecture isn't modularized from the start — a pattern to avoid