Semantic Compression for AI Agents

Replace verbose AI-to-AI communication with structured, compressed blocks. Save up to 96% of tokens while maintaining 100% semantic fidelity.

96% Token Savings
130+ Messages/Chain
4 Model Families
[ARCH:PLAN]
id:api|fw:python
[BUILD:EXEC]
id:m1|target:main
[TEST:RUN]
id:t1|suite:unit

Why H2C Exists

Token Waste

Agent chains consume 5,000-50,000 tokens in natural language. H2C cuts this to 200-2,000.

No Structured Protocol

AI agents communicate in unstructured text. H2C provides typed, parseable blocks.

Context Collapse

Natural language chains break after ~40 messages. H2C scales to 130+ messages.

Cross-Model Fragility

Prompts fail between model families. H2C works zero-shot across 4 model families.

No Versioned Handoff

Agents can't resume conversations. H2C includes cycle tracking and versioning.

Agent Orchestration

Building multi-agent systems requires custom protocols. H2C is a standard wire format.

Core Features

Structured Grammar

Formal BNF grammar with typed fields, lists, and revisions. Self-describing blocks that LLMs parse natively.

Maximum Compression

Lossless semantic compression. Same information, drastically fewer tokens. Validated across 5 scenarios.

Universal Transport

Agnostic to transport: stdin/stdout, HTTP, WebSocket, MCP. Integrate with any framework.

Context Management

PRUNE, COMPACT, FREEZE commands for handling long agent chains. Scale beyond 130 messages.

Agent Orchestration

Built-in cycle tracking, retry counters, and versioned handoff. Versioning-aware agent choreography.

Zero-Shot Cross-Model

Works across Claude, GPT, Gemini, and Llama without retraining. No model-specific tweaks needed.
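
As a minimal sketch of the transport-agnostic idea described under Universal Transport: if blocks travel as plain text lines, one splitter can read them from stdin, a file, or any other text stream. The helper name `iter_raw_blocks` and the header pattern are assumptions made for illustration, inferred from headers such as [ARCH:PLAN] shown on this page; they are not taken from a published H2C implementation.

```python
import io
import re
from typing import Iterator, TextIO

# Assumed header shape: [NAMESPACE:ACTION] in uppercase letters, alone on a line.
HEADER_RE = re.compile(r"^\[[A-Z]+:[A-Z]+\]$")


def iter_raw_blocks(stream: TextIO) -> Iterator[str]:
    """Yield one block (header line plus its field lines) at a time."""
    buffer: list[str] = []
    for raw in stream:
        line = raw.strip()
        if not line:
            continue  # blank lines are treated as optional separators
        if HEADER_RE.match(line) and buffer:
            # A new header closes the block collected so far.
            yield "\n".join(buffer)
            buffer = []
        buffer.append(line)
    if buffer:
        yield "\n".join(buffer)


# io.StringIO stands in for sys.stdin or any other text transport.
sample = io.StringIO(
    "[ARCH:PLAN]\nid:api|fw:python\n"
    "[BUILD:EXEC]\nid:m1|target:main\n"
    "[TEST:RUN]\nid:t1|suite:unit\n"
)
for block in iter_raw_blocks(sample):
    print(repr(block))
```

Because the function only needs an iterable of text lines, passing `sys.stdin` instead of the `StringIO` object would work the same way under these assumptions.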

Validated Benchmarks

Metric | Natural Language | H2C | Improvement
Architectural Plan | ~800 tokens | ~50 tokens | 94%
Build Outcome | ~200 tokens | ~15 tokens | 93%
3-Agent Cycle | ~5,000 tokens | ~200 tokens | 96%
130-Message Chain | ~42,000 tokens | ~7,140 tokens | 83%
Context Breakpoint | ~40 messages | ~130 messages | 3.25x

Validated on Claude Sonnet 4.6 and Opus 4.7, plus cross-model testing. See docs/benchmarks for methodology.
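
The improvement column can be reproduced from the approximate token counts in the table. The sketch below assumes savings are computed as 1 minus the ratio of H2C tokens to natural-language tokens; the function name `savings_percent` is illustrative. Because the inputs are rounded estimates, the results may differ slightly from the published figures (Build Outcome works out to about 92.5%, which the table shows as 93%).

```python
def savings_percent(natural_tokens: int, h2c_tokens: int) -> float:
    """Return the percentage of tokens saved by the compressed form."""
    if natural_tokens <= 0:
        raise ValueError("natural_tokens must be positive")
    return (1 - h2c_tokens / natural_tokens) * 100


# Approximate token counts copied from the benchmark table.
BENCHMARKS = {
    "Architectural Plan": (800, 50),
    "Build Outcome": (200, 15),
    "3-Agent Cycle": (5_000, 200),
    "130-Message Chain": (42_000, 7_140),
}

for name, (natural, h2c) in BENCHMARKS.items():
    print(f"{name}: {savings_percent(natural, h2c):.1f}% saved")

# The context breakpoint is a ratio of message counts, not a token saving.
print(f"Context Breakpoint: {130 / 40:.2f}x")
```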

Use Cases

01

Multi-Agent Orchestration

Architect → Builder → Tester pipelines with retry tracking and versioned handoff.

02

Long-Running Chains

100+ message conversations with intelligent pruning, compaction, and freezing.

03

LLM-to-LLM Handoff

Structured output from Agent A → direct consumption by Agent B, no parsing overhead.

04

Cognitive IR

Semantic compression for retrieval-augmented generation and reasoning transport.

05

Agent Runtime Protocol

Standard wire format for agent hosting platforms and orchestration frameworks.

06

Framework Integration

Drop-in layer for LangGraph, AutoGen, CrewAI, Semantic Kernel, and MCP.

Core Syntax

Minimal H2C Example

Replace about 180 tokens of natural language with about 55 structured tokens.

This minimal example shows how H2C blocks replace verbose AI communication with clean, typed fields.

[ARCH:PLAN]
id:api-weather|fw:python3.11|lib:fastapi,httpx|auth:APIKey|struct:[main.py,services/weather.py]

[BUILD:EXEC]
id:m1|target:main.py|desc:setup_fastapi_app

[BUILD:DONE]
id:m1|diff:[main.py~1]|rev:1

[ORCH:END]
final:complete|est_token:15
Natural Language ~180 tokens
H2C Blocks ~55 tokens
Savings ~70%
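
To show how a receiving agent might consume these blocks, here is a minimal parsing sketch. It is not the reference parser (that is listed as planned on the roadmap); the grammar it assumes is inferred only from the examples on this page: a `[NAMESPACE:ACTION]` header line, `key:value` fields separated by `|` or by newlines, and bracketed values as comma-separated lists. The names `parse_fields` and `parse_blocks` are hypothetical.

```python
import re

# Assumed header shape, inferred from examples such as [ARCH:PLAN].
HEADER_RE = re.compile(r"^\[([A-Z]+):([A-Z]+)\]$")


def parse_fields(line: str) -> dict:
    """Parse one line of key:value pairs separated by '|'."""
    fields = {}
    for pair in line.split("|"):
        # Split on the first ':' only, so values such as
        # APIKey::env(OPENWEATHER_API_KEY) stay intact.
        key, sep, value = pair.partition(":")
        if not sep:
            raise ValueError(f"field without ':' separator: {pair!r}")
        key, value = key.strip(), value.strip()
        if value.startswith("[") and value.endswith("]"):
            inner = value[1:-1]
            fields[key] = [item.strip() for item in inner.split(",")] if inner else []
        else:
            fields[key] = value
    return fields


def parse_blocks(text: str) -> list[dict]:
    """Parse a sequence of blocks into dictionaries."""
    blocks = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = HEADER_RE.match(line)
        if match:
            current = {"namespace": match.group(1), "action": match.group(2), "fields": {}}
            blocks.append(current)
        elif current is None:
            raise ValueError(f"field line before any block header: {line!r}")
        else:
            current["fields"].update(parse_fields(line))
    return blocks


example = """
[ARCH:PLAN]
id:api-weather|fw:python3.11|lib:fastapi,httpx|auth:APIKey|struct:[main.py,services/weather.py]

[BUILD:EXEC]
id:m1|target:main.py|desc:setup_fastapi_app

[BUILD:DONE]
id:m1|diff:[main.py~1]|rev:1

[ORCH:END]
final:complete|est_token:15
"""

for block in parse_blocks(example):
    print(block["namespace"], block["action"], block["fields"])
```

All values are kept as strings or lists of strings; converting fields such as `rev` to integers is left out because the page does not state the field types.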

Real-World Examples

65% Savings

Weather API Service

Python FastAPI service with caching, rate limiting, and multi-step build orchestration.

59% Savings

TODO Console App

C# .NET 8 application with SQLite, demonstrating H2C in stateful, long-running workflows.

80% Savings

PRUNE/COMPACT Chain

Complete v1.3 workflow with context management, stress-tested to 130+ messages.

78–83% Savings

Opus 4.7 Stress Test

5 complex scenarios, 130 messages, cross-model validation, full benchmark suite.

H2C Code Examples

See how H2C blocks replace verbose natural language

ARCH ~65% savings
[ARCH:PLAN]
id:weather-api
fw:python3.11
lib:[fastapi,httpx,cachetools]
auth:APIKey::env(OPENWEATHER_API_KEY)
struct:[main.py,routers/weather.py,services/weather_service.py]
notes:[cache_TTL_10min,rate-limit_60req-min]

Architecture plan with framework, libraries, auth, and structure

BUILD ~70% savings
[BUILD:EXEC]
id:m1
target:main.py
desc:setup_fastapi_app

[BUILD:DONE]
id:m1
diff:[main.py~1]
rev:1

Build execution and completion with revision tracking

TEST ~72% savings
[TEST:RUN]
id:test_weather_endpoint
target:tests/test_weather.py
framework:pytest

[TEST:PASS]
id:test_weather_endpoint
count:42
coverage:95%

Test execution with results and coverage metrics

STATE ~80% savings
[STATE:UPDATE]
cycle:3
agent_id:agent_1
status:in_progress
context:analyzing_results
tokens_used:12450
timestamp:1717519200

State management with cycle tracking and metrics

CTX ~85% savings
[CTX:PRUNE]
target:[BUILD:DONE~1,BUILD:DONE~2]
reason:consolidate_old_steps
keep_count:10

[CTX:COMPACT]
source:[TEST:PASS~1,TEST:PASS~2,TEST:PASS~3]
into:test_summary

Context pruning and compaction for long-running chains

ORCH ~78% savings
[ORCH:END]
final:complete
status:success
total_tokens:15420
messages:130
compression:94%
timestamp:1717519300

Orchestration completion with full cycle metrics
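
The blocks above could be produced from ordinary data structures on the sending side. The following is a minimal sketch of such a serializer, assuming the same field conventions visible in the examples (`key:value` fields, bracketed comma-separated lists, one field per line or `|`-separated on one line). The names `format_value` and `emit_block` are hypothetical and not part of any published H2C API.

```python
def format_value(value) -> str:
    """Render a field value; lists and tuples become [a,b,c]."""
    if isinstance(value, (list, tuple)):
        text = "[" + ",".join(str(item) for item in value) + "]"
    else:
        text = str(value)
    # Assumption: '|' and newlines are field separators, so reject them in values.
    if "|" in text or "\n" in text:
        raise ValueError(f"value contains a reserved separator: {text!r}")
    return text


def emit_block(namespace: str, action: str, fields: dict, inline: bool = False) -> str:
    """Serialize one block; inline=True joins fields with '|' on a single line."""
    header = f"[{namespace}:{action}]"
    pairs = [f"{key}:{format_value(value)}" for key, value in fields.items()]
    separator = "|" if inline else "\n"
    return header + "\n" + separator.join(pairs)


# Field names and values copied from the STATE example above.
state = {
    "cycle": 3,
    "agent_id": "agent_1",
    "status": "in_progress",
    "context": "analyzing_results",
    "tokens_used": 12450,
    "timestamp": 1717519200,
}
print(emit_block("STATE", "UPDATE", state))
print(emit_block("BUILD", "DONE", {"id": "m1", "diff": ["main.py~1"], "rev": 1}, inline=True))
```

Rejecting separator characters inside values is a design choice for this sketch; the page does not describe an escaping rule, so none is assumed here.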

Natural Language vs H2C

Natural Language (~180 tokens)

I've set up a new FastAPI weather service 
using Python 3.11. The service includes 
multiple endpoints for weather data fetching 
with caching (10 minute TTL) and rate limiting 
at 60 requests per minute. I've structured 
the code with separate routers and service 
layers. Authentication is handled via API key 
stored in environment variables...

H2C (~55 tokens)

[ARCH:PLAN]
id:weather-api|fw:python3.11
lib:[fastapi,httpx,cachetools]
auth:APIKey::env(OPENWEATHER_API_KEY)
struct:[main.py,routers/,services/]
notes:[cache_TTL_10min,rate-limit_60req-min]

Result: ~70% token reduction while maintaining all semantic information

Project Roadmap

v1.0 - Core Grammar

Foundational blocks, base syntax

Released

v1.1 - Context Management

PRUNE/COMPACT, revisions, counters

Released

v1.2 - State Machine

FREEZE, cycle tracking, retry logic

Released

v1.3 - Formal Specification

EBNF (ISO/IEC 14977), AST model, opcodes

Released

v1.4 - Advanced Grammar

Signed integers, DAG cycle detection

Planned

v2.0 - Reference Implementation

Parser, validator, transpiler

Planned

v3.0 - Runtime & Compiler

Native MCP transport, agent runtime

Research

Ecosystem Integration

H2C works as the semantic layer for your favorite frameworks

MCP

Transport H2C blocks via MCP tool calls

LangGraph

H2C as node output format and state schema

AutoGen

H2C as agent response protocol

Semantic Kernel

H2C for function result serialization

CrewAI

H2C as task output format

OpenAI Agents SDK

H2C as structured output format

Ready to Compress Your AI Workflows?

H2C is open source (MIT), has zero dependencies, and works with any LLM that has a context window of 8K tokens or more.

MIT License • Copyright © 2026 Paolino Salamone