Python (LiveKit Agents SDK) / TypeScript (Hono.js) / Next.js (dashboard)

Soniq

Multi-tenant AI voice agent platform

Overview

A multi-tenant AI voice agent platform where inbound phone calls are handled by an AI agent that can check availability, create bookings, process orders, and escalate to humans. Built as a full-stack product: Python voice agent, TypeScript API with 22 route modules, Next.js dashboard, and 19 database migrations.

Architecture

1

Inbound SIP calls from SignalWire hit a LiveKit room. The Python agent connects, extracts sip.trunkPhoneNumber to identify the tenant, fetches per-tenant config (system prompt, voice settings, operating hours, escalation rules) from the API, then starts an AgentSession.
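The tenant-resolution step can be sketched roughly as follows. This is a minimal illustration rather than the project's actual code: the `TenantConfig` fields, the `normalize_number` helper, and the injected `fetch_config` callable are all hypothetical, and only the `sip.trunkPhoneNumber` attribute name comes from the description above.

```python
from dataclasses import dataclass
from typing import Callable, Mapping


@dataclass(frozen=True)
class TenantConfig:
    # Hypothetical shape of the per-tenant config returned by the API.
    tenant_id: str
    system_prompt: str
    voice_id: str
    max_call_seconds: int


def normalize_number(raw: str) -> str:
    """Strip formatting so the number matches its stored form (assumed E.164-like)."""
    digits = "".join(ch for ch in raw if ch.isdigit())
    return "+" + digits


def resolve_tenant(
    sip_attributes: Mapping[str, str],
    fetch_config: Callable[[str], TenantConfig],
) -> TenantConfig:
    """Identify the tenant from the dialed trunk number, then load its config."""
    raw = sip_attributes.get("sip.trunkPhoneNumber")
    if not raw:
        raise LookupError("participant has no sip.trunkPhoneNumber attribute")
    return fetch_config(normalize_number(raw))
```

Passing `fetch_config` in as a callable keeps the lookup testable without a running API; in the agent it would presumably wrap an HTTP request made before the AgentSession starts.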

2

The voice pipeline uses Deepgram Nova-3 for STT (multilingual), OpenAI gpt-4.1-mini as the reasoning LLM, and Cartesia Sonic-3 for TTS. Silero VAD is prewarmed at process startup for fast voice activity detection.

3

Six function tools (check_availability, create_booking, create_order, transfer_to_human, end_call, log_note) all route through a single _call_tool() helper that POSTs to /internal/voice-tools/:action on the API. Business logic lives centrally in the API, not in the agent.

4

Multi-tenant isolation starts with tenant lookup by phone_number (unique, indexed). All tables (calls, bookings, callback_queue, sms_messages) carry a tenant_id foreign key with ON DELETE CASCADE, so removing a tenant removes its data. System prompts, voice config (Cartesia voice_id), greetings, and escalation settings are all stored per tenant.
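The isolation scheme can be illustrated with a small, self-contained sketch. It uses Python's built-in `sqlite3` as a stand-in for Supabase/PostgreSQL, and the column lists are trimmed, hypothetical versions of the real tables; only the table names, `phone_number`, `tenant_id`, and the cascade behavior come from the description.

```python
import sqlite3
from typing import Optional


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with a tenants table and one tenant-scoped table."""
    conn = sqlite3.connect(":memory:")
    # SQLite enforces foreign keys only when asked; PostgreSQL does so by default.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        CREATE TABLE tenants (
            id           INTEGER PRIMARY KEY,
            phone_number TEXT NOT NULL UNIQUE  -- UNIQUE also provides the lookup index
        );
        CREATE TABLE calls (
            id        INTEGER PRIMARY KEY,
            tenant_id INTEGER NOT NULL REFERENCES tenants(id) ON DELETE CASCADE
        );
        """
    )
    return conn


def tenant_for_number(conn: sqlite3.Connection, number: str) -> Optional[int]:
    """Resolve the dialed number to a tenant id, or None if no tenant owns it."""
    row = conn.execute(
        "SELECT id FROM tenants WHERE phone_number = ?", (number,)
    ).fetchone()
    return row[0] if row else None


def call_count(conn: sqlite3.Connection, tenant_id: int) -> int:
    """Count calls for one tenant; every query is scoped by tenant_id."""
    return conn.execute(
        "SELECT COUNT(*) FROM calls WHERE tenant_id = ?", (tenant_id,)
    ).fetchone()[0]
```

Deleting a row from `tenants` then removes that tenant's `calls` rows automatically, which is the behavior the CASCADE constraint is meant to guarantee.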

5

A duration watchdog enforces the maximum call length: it warns at T-2min, issues a final warning at T-30s, and forces a SIP REFER transfer at T-0. Call logging is fire-and-forget with a 5s timeout to avoid blocking session teardown.
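The watchdog timing can be sketched with plain `asyncio`. The function names, the event labels, and the injectable `sleep` parameter are hypothetical; the thresholds (T-2min, T-30s, T-0) are the ones stated above, and the actual SIP REFER transfer is left to the `on_event` callback.

```python
import asyncio
from typing import Awaitable, Callable


def watchdog_schedule(max_seconds: float) -> list[tuple[float, str]]:
    """Return (seconds since call start, event) pairs in firing order.

    Warnings that would fall at or before the start of the call are skipped.
    """
    points = [
        (max_seconds - 120, "warn"),
        (max_seconds - 30, "final_warning"),
        (max_seconds, "transfer"),
    ]
    return [(at, name) for at, name in points if at > 0]


async def run_watchdog(
    max_seconds: float,
    on_event: Callable[[str], Awaitable[None]],
    sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
) -> None:
    """Sleep until each threshold in turn and notify the caller."""
    elapsed = 0.0
    for at, name in watchdog_schedule(max_seconds):
        await sleep(at - elapsed)  # wait only the gap since the previous event
        elapsed = at
        await on_event(name)
```

In practice this coroutine would likely run as a background task that is cancelled when the caller hangs up early, so no warning fires on a call that has already ended.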

Design Decisions

Centralized tool execution in API over in-agent logic
All tool calls (booking, ordering, etc.) route from the agent to the API via HTTP. This adds ~50ms of latency per tool call but keeps business logic in one place: the API is the single source of truth for booking availability, order state, and escalation rules. Running that logic in the agent would require syncing state bidirectionally.
gpt-4.1-mini over larger models
Voice conversations need sub-200ms response times. gpt-4.1-mini gives adequate reasoning quality for booking/ordering tasks at much lower latency and cost than gpt-4o. The quality gap matters less in structured, task-oriented conversations than in open-ended chat.
Fire-and-forget call logging over transactional logging
Call transcripts are logged with a 5s timeout after the call ends. If the API does not respond within that window, the transcript is dropped. This prevents a slow logging call from blocking agent cleanup and delaying the next call. Call metadata (duration, outcome) is always captured; the full transcript is best-effort.
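A best-effort logging path along these lines could be sketched as follows. The function names and the `send` callable are hypothetical placeholders for whatever request actually ships the transcript; the 5s default mirrors the timeout described above.

```python
import asyncio
from typing import Awaitable, Callable

# Strong references to in-flight tasks so they are not garbage-collected early.
_background: set = set()


async def log_transcript_best_effort(
    send: Callable[[], Awaitable[None]],
    timeout: float = 5.0,
) -> bool:
    """Try to ship the transcript; return False (dropping it) on timeout or I/O error."""
    try:
        await asyncio.wait_for(send(), timeout=timeout)
        return True
    except (asyncio.TimeoutError, OSError):
        return False


def schedule_log(
    send: Callable[[], Awaitable[None]],
    timeout: float = 5.0,
) -> asyncio.Task:
    """Fire-and-forget: start the log task and return without awaiting it."""
    task = asyncio.create_task(log_transcript_best_effort(send, timeout))
    _background.add(task)
    task.add_done_callback(_background.discard)
    return task
```

Session teardown would call `schedule_log(...)` from inside the running event loop and continue immediately, so a slow API costs at most one dropped transcript rather than a delayed cleanup.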

Tech Stack

Python (LiveKit Agents SDK) / TypeScript (Hono.js) / Next.js (dashboard) / Deepgram Nova-3 (STT) / Cartesia Sonic-3 (TTS) / OpenAI gpt-4.1-mini / Supabase/PostgreSQL / Docker Compose