Architecture
End-to-end audio path, sidecar flow, SORA helper layer, and lifecycle of the Gemini Live Discord voice bridge.
System map
Discord voice user
↓
Discord Voice UDP
↓ Opus decode / discord-ext-voice-recv
48 kHz PCM stereo
↓ downsample
16 kHz PCM mono
↓ WebSocket
Gemini Live model
↓ WebSocket
24 kHz PCM mono
↓ upsample
48 kHz PCM stereo
↓ Discord AudioSource
Discord Voice
Manual screenshot / feeder → local sidecar 127.0.0.1:18943 → Gemini Live
SORA helpers → preflight · grill · goal synth · redact → Hermes tools
Audio path
Discord Voice (Opus)
↓ discord-ext-voice-recv decode
48 kHz PCM stereo (16-bit)
↓ VoiceListener / downsample
16 kHz PCM mono
↓ Gemini Live WebSocket input
Gemini Live API
↓ Gemini Live WebSocket output
24 kHz PCM mono (PCM16)
↓ LiveAudioSource / upsample
48 kHz PCM stereo
↓ Discord AudioSource
Discord Voice (Opus encode)
The important correction: input to Gemini is 16 kHz mono PCM; output from Gemini is 24 kHz mono PCM; Discord playback is 48 kHz stereo.
Sidecar path
The sidecar is local-first and is meant for the plugin, the frame feeder, and local diagnostics, not public internet traffic.
| Route | Purpose |
|---|---|
| GET /health | Bridge health, metrics, connection state. |
| POST /frame | Push a JPEG/PNG/WebP frame into Gemini Live. |
| GET /say?text=... | Inject text into the live Gemini session. |
| GET /notes?limit=50 | Read recent notes/transcript events. |
| GET/POST /notify | Trigger notification breakout. |
| GET /stop / GET /leave | Stop the active bridge. |
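A local client for the GET routes above might look like the sketch below. It uses only the Python standard library; the helper names `sidecar_url` and `sidecar_get` are hypothetical, and the response format is not documented here, so the body is returned as raw text. POST /frame is left out because the expected request body encoding is not specified in this section.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Address taken from the system map; the sidecar is local-only.
SIDECAR = "http://127.0.0.1:18943"


def sidecar_url(route: str, **params) -> str:
    """Build a sidecar URL, URL-encoding any query parameters."""
    query = f"?{urlencode(params)}" if params else ""
    return f"{SIDECAR}{route}{query}"


def sidecar_get(route: str, timeout: float = 5.0, **params) -> str:
    """Issue a GET against the sidecar and return the response body as text."""
    with urlopen(sidecar_url(route, **params), timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

With a bridge running, `sidecar_get("/health")` would fetch the health report and `sidecar_get("/say", text="hello world")` would inject text into the session. Without a running sidecar, `urlopen` raises `urllib.error.URLError`.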
SORA helper layer
Transcript / live call notes
↓
sora_redact → strips tokens/webhooks/JWTs before reuse
↓
sora_live_grill → forces objective, constraints, owner, risk, next command, verification test
↓
sora_goal_synth → emits Discord-safe /goal and /subgoal blocks
↓
weaker model / autonomous agent / Discord operator handoff
SORA bridge elements are helper tools, not Vapi, Dograh, or MCP support.
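To illustrate the kind of scrubbing the first step performs, here is a rough regex-based sketch. It is not the implementation of sora_redact: the function name `redact_sketch`, the placeholder strings, and both patterns are assumptions chosen for illustration, covering only Discord webhook URLs and JWT-shaped strings.

```python
import re

# Assumed pattern: Discord webhook URLs of the form /api/webhooks/<id>/<token>.
WEBHOOK_RE = re.compile(
    r"https://(?:canary\.|ptb\.)?discord(?:app)?\.com/api/webhooks/\d+/[A-Za-z0-9_-]+"
)
# Assumed pattern: three base64url segments, the first starting with "eyJ"
# (the base64url encoding of '{"', which opens a JSON JWT header).
JWT_RE = re.compile(r"\beyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+")


def redact_sketch(text: str) -> str:
    """Replace webhook URLs and JWT-like strings with fixed placeholders."""
    text = WEBHOOK_RE.sub("[REDACTED_WEBHOOK]", text)
    return JWT_RE.sub("[REDACTED_JWT]", text)
```

Pattern matching of this kind is best-effort: it catches well-known secret shapes but will miss arbitrary API keys, so redacted transcripts should still be treated with care before they are handed to another model or operator.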
Integration boundaries
| System | Boundary |
|---|---|
| Gemini Live | Primary transport in this repository. |
| SORA | Helper layer imported into Gemini bridge; not a replacement transport. |
| Vapi | Sibling transport if installed elsewhere; not bundled here. |
| MCP | Research/adapter target; no first-class MCP server/client in this repo yet. |
| Dograh | External comparison/integration target; not bundled here. |
Hermes Live v2 · MIT · github.com/Capslockb/hermes-live-discord-agent-plugin