Replace verbose AI-to-AI communication with structured, compressed blocks. Save up to 96% of tokens while maintaining 100% semantic fidelity.
Agent chains consume 5,000-50,000 tokens in natural language. H2C cuts this to 200-2,000.
AI agents communicate in unstructured text. H2C provides typed, parseable blocks.
Natural language chains break after ~40 messages. H2C scales to 130+ messages.
Prompts fail between model families. H2C works zero-shot across 4 model families.
Agents can't resume conversations. H2C includes cycle tracking and versioning.
Building multi-agent systems requires custom protocols. H2C is a standard wire format.
Formal BNF grammar with typed fields, lists, and revisions. Self-describing blocks that LLMs parse natively.
Lossless semantic compression. Same information, drastically fewer tokens. Validated across 5 scenarios.
Agnostic to transport: stdin/stdout, HTTP, WebSocket, MCP. Integrate with any framework.
PRUNE, COMPACT, FREEZE commands for handling long agent chains. Scale beyond 130 messages.
Built-in cycle tracking, retry counters, and versioned handoff. Versioning-aware agent choreography.
Works across Claude, GPT, Gemini, and Llama without retraining. No model-specific tweaks needed.
| Metric | Natural Language | H2C | Improvement |
|---|---|---|---|
| Architectural Plan | ~800 tokens | ~50 tokens | 94% |
| Build Outcome | ~200 tokens | ~15 tokens | 93% |
| 3-Agent Cycle | ~5,000 tokens | ~200 tokens | 96% |
| 130-Message Chain | ~42,000 tokens | ~7,140 tokens | 83% |
| Context Breakpoint | ~40 messages | ~130 messages | 3.25x |
Validated on Claude Sonnet 4.6 and Opus 4.7, plus cross-model testing. See docs/benchmarks for methodology.
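The Improvement column can be recomputed from the two token columns. The snippet below is a minimal sketch of that arithmetic using the approximate figures from the table; the helper name `reduction_percent` is hypothetical and is not part of H2C.

```python
def reduction_percent(natural_tokens, h2c_tokens):
    """Percentage of tokens saved when going from natural language to H2C."""
    return 100 * (natural_tokens - h2c_tokens) / natural_tokens


# Approximate token counts copied from the table above.
rows = {
    "Architectural Plan": (800, 50),
    "Build Outcome": (200, 15),
    "3-Agent Cycle": (5_000, 200),
    "130-Message Chain": (42_000, 7_140),
}

for name, (natural, h2c) in rows.items():
    print(f"{name}: {reduction_percent(natural, h2c):.2f}% fewer tokens")

# The Context Breakpoint row is a ratio of message counts, not a percentage.
print(f"Context Breakpoint: {130 / 40:.2f}x")
```

The table reports these values rounded to whole percentages.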
Architect → Builder → Tester pipelines with retry tracking and versioned handoff.
100+ message conversations with intelligent pruning, compaction, and freezing.
Structured output from Agent A → direct consumption by Agent B, no parsing overhead.
Semantic compression for retrieval-augmented generation and reasoning transport.
Standard wire format for agent hosting platforms and orchestration frameworks.
Drop-in layer for LangGraph, AutoGen, CrewAI, Semantic Kernel, and MCP.
Replace 180 tokens of natural language with just 55 structured tokens.
This minimal example shows how H2C blocks replace verbose AI communication with clean, typed fields.
[ARCH:PLAN]
id:api-weather|fw:python3.11|lib:fastapi,httpx|auth:APIKey|struct:[main.py,services/weather.py]
[BUILD:EXEC]
id:m1|target:main.py|desc:setup_fastapi_app
[BUILD:DONE]
id:m1|diff:[main.py~1]|rev:1
[ORCH:END]
final:complete|est_token:15
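To show how a consumer might read blocks like the ones above, here is a minimal Python sketch of a reader. It is not the H2C reference parser and does not implement the formal grammar. It assumes only what the example shows: a `[FAMILY:KIND]` header line, `key:value` fields separated by `|` or by newlines, and `[a,b]` for lists. The names `parse_blocks` and `parse_value` are hypothetical.

```python
import re

# Assumed header shape, inferred from the example: [ARCH:PLAN], [BUILD:EXEC], ...
HEADER_RE = re.compile(r"^\[([A-Z]+):([A-Z]+)\]$")


def parse_value(raw):
    """Turn `[a,b]` into a list of strings; leave anything else as a string."""
    if raw.startswith("[") and raw.endswith("]"):
        return [item for item in raw[1:-1].split(",") if item]
    return raw


def parse_blocks(text):
    """Parse header lines followed by `key:value` fields (split on `|` or newlines)."""
    blocks = []
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        match = HEADER_RE.match(line)
        if match:
            current = {"family": match.group(1), "kind": match.group(2), "fields": {}}
            blocks.append(current)
            continue
        if current is None:
            raise ValueError(f"field line before any block header: {line!r}")
        for field in line.split("|"):
            # Split on the first colon only, so values may contain colons.
            key, sep, value = field.partition(":")
            if not sep:
                raise ValueError(f"malformed field: {field!r}")
            current["fields"][key] = parse_value(value)
    return blocks


sample = """[ARCH:PLAN]
id:api-weather|fw:python3.11|lib:fastapi,httpx|auth:APIKey|struct:[main.py,services/weather.py]
[ORCH:END]
final:complete|est_token:15"""

for block in parse_blocks(sample):
    print(block["family"], block["kind"], block["fields"])
```

Values are kept as strings in this sketch; a typed reader would convert fields such as `rev` or `est_token` according to the grammar.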
Python FastAPI service with caching, rate limiting, and multi-step build orchestration.
C# .NET 8 application with SQLite, demonstrating H2C in stateful, long-running workflows.
Complete v1.3 workflow with context management, stress-tested to 130+ messages.
5 complex scenarios, 130 messages, cross-model validation, full benchmark suite.
See how H2C blocks replace verbose natural language
[ARCH:PLAN]
id:weather-api
fw:python3.11
lib:[fastapi,httpx,cachetools]
auth:APIKey::env(OPENWEATHER_API_KEY)
struct:[main.py,routers/weather.py,services/weather_service.py]
notes:[cache_TTL_10min,rate-limit_60req-min]
Architecture plan with framework, libraries, auth, and structure
[BUILD:EXEC]
id:m1
target:main.py
desc:setup_fastapi_app
[BUILD:DONE]
id:m1
diff:[main.py~1]
rev:1
Build execution and completion with revision tracking
[TEST:RUN]
id:test_weather_endpoint
target:tests/test_weather.py
framework:pytest
[TEST:PASS]
id:test_weather_endpoint
count:42
coverage:95%
Test execution with results and coverage metrics
[STATE:UPDATE]
cycle:3
agent_id:agent_1
status:in_progress
context:analyzing_results
tokens_used:12450
timestamp:1717519200
State management with cycle tracking and metrics
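An agent that emits state updates needs to serialize its fields into this layout. The following is a minimal sketch, assuming the one-field-per-line form shown above; `format_block` is a hypothetical helper, not an official H2C API.

```python
def format_block(family, kind, fields):
    """Render a header line followed by one `key:value` line per field."""
    lines = [f"[{family}:{kind}]"]
    for key, value in fields.items():
        if isinstance(value, (list, tuple)):
            # Lists use the bracketed, comma-separated form seen in other blocks.
            value = "[" + ",".join(str(item) for item in value) + "]"
        lines.append(f"{key}:{value}")
    return "\n".join(lines)


state = {
    "cycle": 3,
    "agent_id": "agent_1",
    "status": "in_progress",
    "context": "analyzing_results",
    "tokens_used": 12450,
    "timestamp": 1717519200,
}

print(format_block("STATE", "UPDATE", state))
```

Field order follows the dictionary's insertion order, so the output matches the block above line for line.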
[CTX:PRUNE]
target:[BUILD:DONE~1,BUILD:DONE~2]
reason:consolidate_old_steps
keep_count:10
[CTX:COMPACT]
source:[TEST:PASS~1,TEST:PASS~2,TEST:PASS~3]
into:test_summary
Context pruning and compaction for long-running chains
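As an illustration of what an orchestrator might do on receiving a `[CTX:PRUNE]` block, the sketch below drops targeted entries from a message history. It rests on an assumption the text here does not confirm: that `BUILD:DONE~2` refers to the second `BUILD:DONE` block in the chain. The `keep_count` and `reason` fields are ignored, and the function name `prune` is hypothetical.

```python
def prune(history, targets):
    """Return the history without the targeted blocks.

    history: block headers such as "BUILD:DONE", in chain order.
    targets: strings such as "BUILD:DONE~2" (ASSUMED to mean the 2nd
             occurrence of BUILD:DONE; the real semantics may differ).
    """
    wanted = set()
    for target in targets:
        header, _, index = target.partition("~")
        wanted.add((header, int(index)))

    seen = {}  # how many times each header has appeared so far
    kept = []
    for header in history:
        seen[header] = seen.get(header, 0) + 1
        if (header, seen[header]) in wanted:
            continue  # this occurrence was targeted for pruning
        kept.append(header)
    return kept


history = ["ARCH:PLAN", "BUILD:DONE", "TEST:PASS", "BUILD:DONE", "BUILD:DONE"]
print(prune(history, ["BUILD:DONE~1", "BUILD:DONE~2"]))
```

In this sketch only the third `BUILD:DONE` survives, alongside the untargeted blocks.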
[ORCH:END]
final:complete
status:success
total_tokens:15420
messages:130
compression:94%
timestamp:1717519300
Orchestration completion with full cycle metrics
I've set up a new FastAPI weather service
using Python 3.11. The service includes
multiple endpoints for weather data fetching
with caching (10 minute TTL) and rate limiting
at 60 requests per minute. I've structured
the code with separate routers and service
layers. Authentication is handled via API key
stored in environment variables...
[ARCH:PLAN]
id:weather-api|fw:python3.11
lib:[fastapi,httpx,cachetools]
auth:APIKey::env(OPENWEATHER_API_KEY)
struct:[main.py,routers/,services/]
notes:[cache_TTL_10min,rate-limit_60req-min]
Result: ~70% token reduction while maintaining all semantic information
Foundational blocks, base syntax (Released)
PRUNE/COMPACT, revisions, counters (Released)
FREEZE, cycle tracking, retry logic (Released)
EBNF ISO 14977, AST model, opcodes (Released)
Signed integers, DAG cycle detection (Planned)
Parser, validator, transpiler (Planned)
Native MCP transport, agent runtime (Research)
H2C works as the semantic layer for your favorite frameworks
Transport H2C blocks via MCP tool calls
H2C as node output format and state schema
H2C as agent response protocol
H2C for function result serialization
H2C as task output format
H2C as structured output format
H2C is open-source (MIT), requires zero dependencies, and works with any LLM with an 8K+ context window.
MIT License • Copyright © 2026 Paolino Salamone