Real-Time Stream Routing
Private Preview

HZRelay

The routing layer between your app and every real-time provider. Voice, tokens, webhooks — normalized to one session model, one SDK.

Why Teams Use HZRelay

Real-time integrations share the same three problems: codec mismatches between voice providers, SSE streams that need fan-out to multiple subscribers, and an N×M matrix of webhook parsers, one for every provider pair. HZRelay solves all three once: you own the logic, we own the plumbing.

Voice Stream Routing

Twilio sends mulaw 8kHz, Deepgram wants PCM 16kHz, ElevenLabs returns PCM. HZRelay transcodes, reconnects, and normalizes across providers automatically.
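
To make the codec step concrete, here is a minimal sketch of a mulaw 8kHz to PCM 16kHz conversion. It is not HZRelay's transcoder: the function names are hypothetical, the decoder follows the standard G.711 mu-law expansion, and the 2x upsampling uses plain linear interpolation where a production resampler would normally apply a proper low-pass filter.

```python
import struct
from typing import List


def ulaw_to_linear(u: int) -> int:
    """Decode one G.711 mu-law byte into a signed 16-bit PCM sample."""
    u = ~u & 0xFF                 # mu-law bytes are transmitted bit-inverted
    sign = u & 0x80
    exponent = (u >> 4) & 0x07
    mantissa = u & 0x0F
    magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84
    return -magnitude if sign else magnitude


def upsample_2x(samples: List[int]) -> List[int]:
    """Double the sample rate (8 kHz -> 16 kHz) by linear interpolation."""
    out: List[int] = []
    for i, s in enumerate(samples):
        # The last sample has no successor, so it is simply repeated.
        nxt = samples[i + 1] if i + 1 < len(samples) else s
        out.append(s)
        out.append((s + nxt) // 2)
    return out


def mulaw8k_to_pcm16k(frame: bytes) -> bytes:
    """Hypothetical helper: mu-law 8 kHz frame in, little-endian PCM16 16 kHz out."""
    pcm = upsample_2x([ulaw_to_linear(b) for b in frame])
    return struct.pack(f"<{len(pcm)}h", *pcm)
```

A 20 ms frame at 8 kHz is 160 mu-law bytes; after conversion it becomes 320 samples, or 640 bytes of 16-bit PCM.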

Token Stream Fan-out

LLM streams are SSE. Your UI wants WebSocket. Route one source to web, mobile, and logs simultaneously — no re-architecture required.
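
The fan-out idea can be sketched with plain asyncio: one token source, one queue per subscriber. Everything here is an illustrative assumption (the `fan_out`, `fake_llm_stream`, and `drain` names, and the `None` end-of-stream sentinel); it does not use the HZRelay SDK, and the queues stand in for real WebSocket or log sinks.

```python
import asyncio
from typing import AsyncIterator, List, Optional


async def fan_out(source: AsyncIterator[str],
                  sinks: List["asyncio.Queue[Optional[str]]"]) -> None:
    """Copy every token from one source to every subscriber queue."""
    async for token in source:
        for q in sinks:
            await q.put(token)
    for q in sinks:
        await q.put(None)  # sentinel: tells each subscriber the stream ended


async def fake_llm_stream() -> AsyncIterator[str]:
    """Stand-in for an SSE token stream; no network involved."""
    for token in ["Hel", "lo", "!"]:
        yield token


async def drain(q: "asyncio.Queue[Optional[str]]") -> str:
    """Stand-in subscriber: collect tokens until the sentinel arrives."""
    parts: List[str] = []
    while (token := await q.get()) is not None:
        parts.append(token)
    return "".join(parts)


async def demo() -> List[str]:
    web, mobile, logs = asyncio.Queue(), asyncio.Queue(), asyncio.Queue()
    results = await asyncio.gather(
        fan_out(fake_llm_stream(), [web, mobile, logs]),
        drain(web), drain(mobile), drain(logs),
    )
    return list(results[1:])  # drop fan_out's None result
```

Running `asyncio.run(demo())` gives each of the three subscribers the same complete token sequence.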

Webhook Normalization

One inbound event, any number of targets. HZRelay handles signature verification, payload normalization, and fan-out across Stripe, GitHub, Twilio, and more.
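
As a rough illustration of the verify, normalize, fan-out sequence, the sketch below checks a GitHub-style `sha256=<hex>` HMAC signature over the raw body, wraps the payload in a common envelope, and hands it to each target. The envelope fields and function names are assumptions made for this example, not HZRelay's schema. Other providers sign differently (Stripe, for instance, includes a timestamp in the signed payload), so each would need its own verifier.

```python
import hashlib
import hmac
import json
from typing import Any, Callable, Dict, List


def verify_signature(secret: bytes, body: bytes, header: str) -> bool:
    """Check a 'sha256=<hexdigest>' HMAC header against the raw request body."""
    expected = "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, header)  # constant-time comparison


def normalize(provider: str, event_type: str, body: bytes) -> Dict[str, Any]:
    """Hypothetical common envelope shared by all providers."""
    return {"provider": provider, "type": event_type, "payload": json.loads(body)}


def deliver(event: Dict[str, Any],
            targets: List[Callable[[Dict[str, Any]], None]]) -> int:
    """Send one normalized event to every target; returns the delivery count."""
    for target in targets:
        target(event)
    return len(targets)
```

Verification runs on the raw bytes before any JSON parsing, since re-serializing the payload could change the bytes that were signed.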

Per-Session Observability

Standard APM is blind to real-time streams. HZRelay traces every stage per session — which provider added latency, when, and by how much.

Auto-Failover

Set SLA thresholds per provider. When the primary exceeds them, HZRelay routes to the backup automatically — no code change, no downtime.
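
One way to picture threshold-based failover is a rolling latency window per provider. The sketch below is an assumption about how such a rule could work, not HZRelay's routing logic: the `FailoverRouter` class, the mean-over-window rule, and the window size are all hypothetical, and recovery back to the primary is left out.

```python
from collections import deque
from statistics import mean
from typing import Deque


class FailoverRouter:
    """Route to the backup when the primary's recent mean latency exceeds the SLA."""

    def __init__(self, primary: str, backup: str,
                 max_latency_ms: float, window: int = 5) -> None:
        self.primary = primary
        self.backup = backup
        self.max_latency_ms = max_latency_ms
        # Only the most recent `window` measurements are kept.
        self.samples: Deque[float] = deque(maxlen=window)

    def record(self, latency_ms: float) -> None:
        """Record one latency measurement for the primary provider."""
        self.samples.append(latency_ms)

    def choose(self) -> str:
        """Return the provider name the next request should use."""
        if self.samples and mean(self.samples) > self.max_latency_ms:
            return self.backup
        return self.primary
```

For example, with a 400 ms threshold and a window of 3, samples of 350, 900, and 900 ms average about 717 ms, so `choose()` returns the backup.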

One SDK, Any Provider

Swap Deepgram for Cartesia or OpenAI for Anthropic in config. The same session model works across all adapters in the registry.
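
The config-driven swap can be pictured as a name-to-adapter lookup. This is a toy sketch under stated assumptions: the `ADAPTERS` registry, the `register` decorator, and the `build_session` helper are invented for illustration, and the adapters are stand-ins that make no network calls.

```python
from typing import Callable, Dict

Adapter = Callable[[str], str]

# Hypothetical registry: adapter name -> callable with a shared signature.
ADAPTERS: Dict[str, Adapter] = {}


def register(name: str) -> Callable[[Adapter], Adapter]:
    """Decorator that adds an adapter to the registry under `name`."""
    def wrap(fn: Adapter) -> Adapter:
        ADAPTERS[name] = fn
        return fn
    return wrap


@register("openai")
def openai_llm(prompt: str) -> str:
    return f"[openai] {prompt}"      # stand-in, no real API call


@register("anthropic")
def anthropic_llm(prompt: str) -> str:
    return f"[anthropic] {prompt}"   # stand-in, no real API call


def build_session(config: Dict[str, str]) -> Dict[str, Adapter]:
    """Resolve each role in the config to its registered adapter."""
    return {role: ADAPTERS[name] for role, name in config.items()}
```

Because every adapter shares one signature, changing `{"llm": "openai"}` to `{"llm": "anthropic"}` swaps the provider without touching the calling code.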

Getting Started Path

Step 1

Read the Docs

Review the stream types, the adapter registry, and the SDK quickstart at hzrelay.mecverse.com.

Step 2

Pick Your Depth

Start with managed routing (Depth 2) or wire HZRelay into your own loop (Depth 1). Switch anytime.

Step 3

Connect Providers

Add adapters from the registry. Route voice, tokens, or webhooks through one session without touching codec or reconnect logic.

HZRelay in Action

Session routing, provider observability, and stream event model across voice, token, and webhook workloads.

Session Router

Inbound sources → HZRelay core → outbound sinks, with live log strip.

[ SOURCES ]

Twilio (PSTN)
Plivo (PSTN)
WebRTC (BROWSER)
WebSocket (RAW)
Stripe (WEBHOOK)
GitHub (WEBHOOK)

HZRELAY

CODEC
RECONNECT
ROUTE
VAD
OBSERVE

[ SINKS ]

TTS: ElevenLabs
STT: Deepgram
LLM: OpenAI
TTS: Cartesia
LLM: Anthropic
CUSTOM: Your App

WIRE  twilio:mulaw8k → pcm16k → deepgram
EVT   stripe:payment_intent.succeeded → [3 sinks]
TOK   openai:stream → ws:client_a + ws:client_b
VAD   speech.end +680ms → transcript routed
OK    session a3f9b2c1 ▪ 762ms ▪ 3 adapters live

Latency Trace

Per-session, per-stage breakdown — not aggregate averages.

[ LATENCY_TRACE ] session: a3f9b2c1 (LIVE)

audio_received    +0ms
stt_start         +12ms    DEEPGRAM
stt_final         +310ms
llm_start         +312ms   OPENAI
llm_first_token   +680ms
tts_start         +685ms   ELEVENLABS
tts_first_audio   +760ms
audio_sent        +762ms   ← CALLER HEARS

STT   298ms
LLM   370ms
TTS   75ms

mouth → ear total: 762ms
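
The per-stage figures can be derived from the trace offsets by subtracting each stage's start marker from its end marker. The sketch below does that arithmetic on the sample trace; the dictionary layout and the start/end pairing per stage are assumptions made for illustration. With this pairing the LLM stage comes out at 368ms (llm_start to llm_first_token); the 370ms shown in the summary would match measuring from stt_final (+310ms) instead.

```python
from typing import Dict, Tuple

# Offsets in milliseconds, copied from the sample trace for session a3f9b2c1.
trace_ms: Dict[str, int] = {
    "audio_received": 0,
    "stt_start": 12,
    "stt_final": 310,
    "llm_start": 312,
    "llm_first_token": 680,
    "tts_start": 685,
    "tts_first_audio": 760,
    "audio_sent": 762,
}

# Assumed pairing of start and end markers for each stage.
STAGES: Dict[str, Tuple[str, str]] = {
    "STT": ("stt_start", "stt_final"),
    "LLM": ("llm_start", "llm_first_token"),
    "TTS": ("tts_start", "tts_first_audio"),
}


def stage_durations(trace: Dict[str, int]) -> Dict[str, int]:
    """Duration of each stage: end offset minus start offset."""
    return {name: trace[end] - trace[start] for name, (start, end) in STAGES.items()}


durations = stage_durations(trace_ms)
total = trace_ms["audio_sent"] - trace_ms["audio_received"]
# Time not attributed to any stage (hand-offs between adapters).
overhead = total - sum(durations.values())
```

The gap between the end-to-end total and the sum of the stages is the hand-off time between adapters, which is the kind of detail an aggregate average hides.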

Stream Events

Typed events emitted at each pipeline stage — subscribe from any adapter.

EVENT                WHEN
session.created      Session ready, pipeline active
speech.start         VAD detects speech began
speech.end           VAD detects silence threshold hit
transcript.interim   Streaming partial STT result
transcript.final     Final STT result; triggers LLM
llm.token_start      First LLM token received
tts.audio_start      First TTS audio frame ready
tts.interrupted      Barge-in flushed TTS buffer
error                Adapter error; check retryable
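
A subscription model for these events might look like the sketch below. The event names come from the table above; everything else (the `StreamEvent` dataclass, the `EventBus` class, its `on` and `emit` methods, and the `retryable` field on error payloads) is a hypothetical shape for illustration, not the actual SDK surface.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Any, Callable, DefaultDict, Dict, List


@dataclass
class StreamEvent:
    type: str                      # e.g. "transcript.final"
    session_id: str
    data: Dict[str, Any] = field(default_factory=dict)


Handler = Callable[[StreamEvent], None]


class EventBus:
    """Minimal in-process dispatcher keyed by event type."""

    def __init__(self) -> None:
        self._handlers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def on(self, event_type: str, handler: Handler) -> None:
        """Subscribe a handler to one event type."""
        self._handlers[event_type].append(handler)

    def emit(self, event: StreamEvent) -> int:
        """Call every handler for the event's type; returns how many ran."""
        handlers = self._handlers.get(event.type, [])
        for handler in handlers:
            handler(event)
        return len(handlers)


def should_retry(event: StreamEvent) -> bool:
    """Assumes error events carry a boolean 'retryable' field."""
    return event.type == "error" and bool(event.data.get("retryable", False))
```

A handler registered for `transcript.final` could then start the LLM call, while an `error` handler checks `should_retry` before reconnecting.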