Discord voice agent · Gemini Live · SORA bridge helpers
Hermes Live
A self-hosted Discord voice bridge powered by Gemini Live, now documented with a truthful SORA release map.
status
v2 cross-exam pass.
The public docs now separate working Gemini bridge features from partial/backend-dependent systems and research targets. Vapi, MCP, and Dograh are not described as bundled features unless code and tests land in this repository.
Architecture
Discord Voice → Opus Decode → 48 kHz PCM → 16 kHz mono → Gemini Live WSS
Gemini Live WSS → 24 kHz PCM → 48 kHz stereo → Discord AudioSource
Manual frame / screenshot → 127.0.0.1:18943 /frame → Gemini Live
SORA tools → preflight · grill · goal synth · redact → Hermes tools
Truth map
| Area | Status | Scope |
|---|---|---|
| Gemini Live Discord voice | Working | Voice join/leave, audio RX/TX, Gemini Live WSS. |
| Sidecar API | Working | Local endpoints on 127.0.0.1:18943: health, frame, say, notes, notify, stop/leave. |
| Manual vision | Working with constraint | Frames can be pushed; Discord screenshare/camera is not automatically visible to bots. |
| SORA bridge elements | Included | Preflight, Live Grill Mode, goal/subgoal synthesis, and redaction. |
| Vapi | Sibling / not bundled | Sibling project; not shipped inside this repo and not documented as such. |
| MCP | Research target | No first-class MCP server/client in this repo yet. |
| Dograh | Research target | External comparison/integration target, not bundled. |
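The audio path under Architecture implies two rate conversions: 48 kHz down to 16 kHz mono on the way to Gemini Live, and 24 kHz up to 48 kHz stereo on the way back to Discord. The sketch below illustrates that arithmetic only and is not the plugin's actual resampler. It assumes 16-bit little-endian PCM and interleaved stereo on the Discord side, and the function names are hypothetical. It uses a crude box filter and sample repetition where a real bridge would likely use a proper resampling filter.

```python
import sys
from array import array


def _to_samples(pcm: bytes) -> array:
    """Decode s16le bytes into an array of signed 16-bit samples."""
    samples = array("h")
    samples.frombytes(pcm)
    if sys.byteorder == "big":
        samples.byteswap()
    return samples


def _to_bytes(samples: array) -> bytes:
    """Encode an array of signed 16-bit samples back to s16le bytes."""
    if sys.byteorder == "big":
        samples.byteswap()
    return samples.tobytes()


def downmix_48k_stereo_to_16k_mono(pcm: bytes) -> bytes:
    """48 kHz interleaved stereo -> 16 kHz mono (assumed input layout)."""
    samples = _to_samples(pcm)
    # Average each left/right pair into one mono sample.
    mono = [(samples[i] + samples[i + 1]) // 2
            for i in range(0, len(samples) - 1, 2)]
    # Average each group of 3 mono samples: 48 kHz / 3 = 16 kHz.
    out = array("h", (sum(mono[i:i + 3]) // 3
                      for i in range(0, len(mono) - 2, 3)))
    return _to_bytes(out)


def upmix_24k_mono_to_48k_stereo(pcm: bytes) -> bytes:
    """24 kHz mono -> 48 kHz interleaved stereo by sample repetition."""
    out = array("h")
    for s in _to_samples(pcm):
        # Two time steps (24 kHz -> 48 kHz) times two channels (L, R).
        out.extend((s, s, s, s))
    return _to_bytes(out)
```

Under these assumptions a 20 ms frame is 3840 bytes at 48 kHz stereo, 640 bytes at 16 kHz mono, and 960 bytes at 24 kHz mono, so the first function shrinks a frame by a factor of 6 and the second grows one by a factor of 4.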
Get started
git clone https://github.com/Capslockb/hermes-live-discord-agent-plugin.git
cd hermes-live-discord-agent-plugin/installer
./install.py
cd ..
python3 installer/enable_sora_bridge_elements.py
python3 -m py_compile plugin/sora_bridge_elements.py plugin/__init__.py
systemctl --user restart hermes-gateway
Then join a Discord voice channel and run /voice-live. Stop with /voice-live-leave.
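After the restart, it can help to confirm that the local sidecar is answering before you join a voice channel. The sketch below is a minimal probe using only the Python standard library. The host and port come from the Architecture section. The `/health` path and the function names are assumptions made for illustration, so check the sidecar documentation for the actual route and response format.

```python
import urllib.error
import urllib.request

SIDECAR_HOST = "127.0.0.1"
SIDECAR_PORT = 18943  # address shown in the Architecture section


def sidecar_url(path: str, port: int = SIDECAR_PORT) -> str:
    """Build a sidecar URL for the given path."""
    return f"http://{SIDECAR_HOST}:{port}/{path.lstrip('/')}"


def sidecar_is_up(path: str = "/health",  # assumed route, not confirmed
                  port: int = SIDECAR_PORT,
                  timeout: float = 2.0) -> bool:
    """Return True if the sidecar answers with a 2xx status."""
    try:
        with urllib.request.urlopen(sidecar_url(path, port), timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx HTTP error.
        return False
```

Calling `sidecar_is_up()` returns `False` rather than raising when nothing is listening, which makes it easy to use in a startup script or a retry loop.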
Documentation
Quick start →
Install, restart, first Discord voice session, local health check.
Architecture →
Audio path, sidecar flow, lifecycle, and integration boundaries.
SORA bridge elements →
Preflight, transcript grilling, goal synthesis, and redaction.
Release truth table →
Working, partial, sibling, and research claims separated cleanly.
Video feeder →
Manual frame feeder and the Discord screenshare limitation.
Environment variables →
Every key, default, and optional backend configuration.
Troubleshooting →
Bridge failures, sidecar checks, logs, and common runtime errors.
Changelog →
Release history and load-bearing fixes.
Release checks
voice_live_status
voice_live_notes limit=10
sora_bridge_preflight
sora_redact text="Authorization: Bearer fake.fake.fake"
sora_live_grill text="migrate SORA bridge features into Gemini bridge"
sora_goal_synth text="migrate SORA bridge features into Gemini bridge"
License
MIT. Free to fork, host, and extend.
Hermes Live v2 · MIT · github.com/Capslockb/hermes-live-discord-agent-plugin