How We Built Voice AI Agents That Actually Answer Business Calls

The Problem

Every small business owner knows the pain: a customer calls after hours, gets voicemail, and calls your competitor instead. Traditional IVR systems ("Press 1 for sales, press 2 for support...") frustrate callers and feel robotic. Hiring a 24/7 receptionist costs $3,000 to $5,000 per month.

We built HelloCalls to solve this. It is an AI voice agent platform where businesses deploy intelligent phone agents that answer calls naturally, book appointments, qualify leads, and route calls by intent. Here's how the technology works.

The Architecture

At its core, the system connects three things in real time: speech recognition (what the caller says), an LLM (which decides how to respond), and text-to-speech (which speaks the response back). The challenge is doing all of this fast enough that the conversation feels natural.

Twilio ConversationRelay

We use Twilio's ConversationRelay protocol, a bidirectional WebSocket connection that handles the telephony layer. When a call comes in:

1. Twilio answers and opens a WebSocket to our server
2. Caller speech is transcribed to text (STT) on Twilio's side
3. We receive the text and send it to our LLM pipeline
4. Our LLM generates a response (streamed)
5. We send the response text back through the WebSocket
6. Twilio converts it to speech (TTS) and plays it to the caller

This happens in under a second. The key insight: streaming the LLM response rather than waiting for the complete answer. The caller starts hearing the response while the LLM is still generating it.
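
To make the streaming step concrete, here is a minimal sketch of steps 3 to 5. It is not our production code. The JSON message shapes ("prompt" with a voicePrompt field coming in, "text" with token and last fields going out) reflect our reading of the ConversationRelay message format and should be checked against Twilio's reference; handle_incoming, relay_messages, and the generate callback are hypothetical names used only for illustration.

```python
import json
from typing import Callable, Iterable, Iterator


def relay_messages(token_stream: Iterable[str]) -> Iterator[str]:
    """Wrap streamed LLM chunks as JSON text messages for the WebSocket.

    Each chunk is forwarded as soon as it arrives, so speech synthesis can
    start before the LLM has finished generating. A final empty message
    with last=True marks the end of the turn (assumed message shape).
    """
    for chunk in token_stream:
        if chunk:
            yield json.dumps({"type": "text", "token": chunk, "last": False})
    yield json.dumps({"type": "text", "token": "", "last": True})


def handle_incoming(
    raw: str, generate: Callable[[str], Iterable[str]]
) -> Iterator[str]:
    """Turn one incoming WebSocket message into zero or more outgoing ones.

    Only transcribed caller speech ("prompt" messages, assumed) triggers the
    LLM; other message types such as call setup are ignored in this sketch.
    """
    message = json.loads(raw)
    if message.get("type") != "prompt":
        return
    yield from relay_messages(generate(message.get("voicePrompt", "")))
```

In a real server, each string yielded by handle_incoming would be written to the WebSocket immediately instead of being collected first.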

The LLM Gateway

We don't rely on a single LLM provider. We built an intelligent routing layer that scores providers based on:

Cost (30% weight) — voice calls burn tokens fast
Health (30%) — if a provider is returning errors, skip it
Latency (20%) — for voice, speed is everything
Task match (15%) — some models handle tool-calling better
Capabilities (5%) — streaming support, context window

The router tries the highest-scoring provider first and automatically fails over to the next one if a call returns an error. In practice, this means Groq handles most voice calls (ultra-low latency), with OpenRouter as the fallback for complex reasoning tasks.
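
The scoring and failover logic can be sketched roughly as follows. The weights are the ones listed above; everything else is an assumption made for illustration: the Provider class, the idea that each metric is already normalized to a 0.0 to 1.0 range where higher is better, and the call function standing in for a real provider client.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Weights from the list above; they sum to 1.0.
WEIGHTS: Dict[str, float] = {
    "cost": 0.30,
    "health": 0.30,
    "latency": 0.20,
    "task_match": 0.15,
    "capabilities": 0.05,
}


@dataclass
class Provider:
    name: str
    # Assumption: each metric is normalized to 0.0 (worst) .. 1.0 (best).
    metrics: Dict[str, float]
    # Hypothetical stand-in for a real provider client call.
    call: Callable[[str], str]


def score(provider: Provider) -> float:
    """Weighted sum of the provider's normalized metrics."""
    return sum(w * provider.metrics.get(key, 0.0) for key, w in WEIGHTS.items())


def route(providers: List[Provider], prompt: str) -> str:
    """Try providers from best to worst score, failing over on any error."""
    errors = []
    for provider in sorted(providers, key=score, reverse=True):
        try:
            return provider.call(prompt)
        except Exception as exc:  # any provider error triggers failover
            errors.append(f"{provider.name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))
```

Because health carries as much weight as cost, a provider that starts returning errors drops down the ranking on later calls, in addition to being skipped by the failover loop on the current one.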

Agentic Tool-Calling

The AI doesn't just talk — it takes action. During a live call, the agent can:

Book appointments with calendar validation (check availability, respect business hours, prevent double-booking)
Capture leads with contact details and intent classification
Transfer calls to specific departments or phone numbers
Route by intent using a hybrid approach: fast keyword matching first, LLM classification as fallback with confidence scoring
Query knowledge bases to answer business-specific questions
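
The hybrid intent routing mentioned in the list above might look something like this sketch. The keyword table, the 0.7 confidence threshold, the "unknown" fallback label, and the llm_classify callback (which is assumed to return an intent and a confidence between 0 and 1) are all illustrative assumptions, not our actual configuration.

```python
import re
from typing import Callable, Dict, List, Tuple

# Hypothetical keyword table; a real deployment would load this per business.
KEYWORDS: Dict[str, List[str]] = {
    "booking": ["appointment", "book", "schedule"],
    "support": ["broken", "problem", "not working"],
    "sales": ["price", "quote", "cost"],
}


def classify_intent(
    text: str,
    llm_classify: Callable[[str], Tuple[str, float]],
    min_confidence: float = 0.7,
) -> str:
    """Return an intent label: keyword match first, LLM only as fallback."""
    lowered = text.lower()
    for intent, words in KEYWORDS.items():
        # Word boundaries stop "book" from matching inside "facebook".
        if any(re.search(r"\b" + re.escape(w) + r"\b", lowered) for w in words):
            return intent
    # No keyword hit: pay for an LLM call and trust it only above threshold.
    intent, confidence = llm_classify(text)
    return intent if confidence >= min_confidence else "unknown"
```

The fast path never touches the LLM, which keeps the common cases cheap and quick. An "unknown" result would typically lead the agent to ask a clarifying question.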

Each tool is defined with a schema that the LLM understands. When the model decides to use a tool, we execute it server-side and feed the result back into the conversation.
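
As an illustration of that loop, the sketch below defines one tool and a dispatcher. The schema follows the JSON Schema style used by OpenAI-compatible function-calling APIs, which is an assumption about the format rather than a description of our exact setup. book_appointment, the in-memory BOOKED set, and execute_tool_call are hypothetical and stand in for real calendar logic.

```python
import json
from typing import Any, Callable, Dict

# Tool description handed to the LLM (assumed OpenAI-style function schema).
BOOK_APPOINTMENT_SCHEMA = {
    "type": "function",
    "function": {
        "name": "book_appointment",
        "description": "Book an appointment slot for the caller.",
        "parameters": {
            "type": "object",
            "properties": {
                "name": {"type": "string"},
                "start": {"type": "string", "description": "ISO 8601 start time"},
            },
            "required": ["name", "start"],
        },
    },
}

BOOKED = set()  # stand-in for a real calendar backend


def book_appointment(name: str, start: str) -> Dict[str, Any]:
    """Reserve a slot, refusing double-bookings."""
    if start in BOOKED:
        return {"ok": False, "reason": "slot already taken"}
    BOOKED.add(start)
    return {"ok": True, "name": name, "start": start}


TOOLS: Dict[str, Callable[..., Dict[str, Any]]] = {
    "book_appointment": book_appointment,
}


def execute_tool_call(name: str, arguments_json: str) -> Dict[str, str]:
    """Run a tool the model asked for and wrap the result as a tool message."""
    handler = TOOLS.get(name)
    if handler is None:
        result = {"ok": False, "reason": f"unknown tool: {name}"}
    else:
        try:
            result = handler(**json.loads(arguments_json))
        except (TypeError, ValueError) as exc:  # bad JSON or wrong arguments
            result = {"ok": False, "reason": f"bad arguments: {exc}"}
    return {"role": "tool", "name": name, "content": json.dumps(result)}
```

Failures are returned to the model as ordinary results instead of raised, so the agent can tell the caller that a slot is taken and offer another.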

What We Learned

Latency is everything. In a phone call, even 200 ms of extra delay feels wrong. We optimized every layer: connection pooling to LLM providers, streaming responses, pre-warming TTS engines.

Interruption handling matters. Humans don't wait for the AI to finish speaking before they respond. ConversationRelay handles "barge-in": when the caller starts talking, the AI stops and listens.

Industry-specific prompts are essential. A dental office receptionist AI needs different knowledge than a plumber's dispatch. We built 20+ industry templates with pre-configured system prompts, tool configurations, and knowledge modules.

Multi-language support is non-negotiable for Canada. We added auto-language detection: the AI identifies the caller's language within the first few seconds and switches to respond in kind. 15+ languages are supported.
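
On interruption handling: ConversationRelay stops playback on barge-in, but a server may also want to stop generating text the caller will never hear. The sketch below shows one way to do that with asyncio task cancellation. It is an assumption about how such a server could be structured, not a description of our implementation; CallSession, start_reply, and interrupt are hypothetical names.

```python
import asyncio
from typing import AsyncIterator, Awaitable, Callable, Optional


class CallSession:
    """Tracks the reply currently being streamed for one call."""

    def __init__(self, send: Callable[[str], Awaitable[None]]) -> None:
        self._send = send  # coroutine that writes one chunk to the socket
        self._speaking: Optional[asyncio.Task] = None

    async def _speak(self, chunks: AsyncIterator[str]) -> None:
        async for chunk in chunks:
            await self._send(chunk)

    def start_reply(self, chunks: AsyncIterator[str]) -> asyncio.Task:
        """Begin streaming a reply; must be called from a running event loop."""
        self.interrupt()  # never let two replies overlap
        self._speaking = asyncio.create_task(self._speak(chunks))
        return self._speaking

    def interrupt(self) -> None:
        """Called when the caller barges in: cancel the in-flight reply."""
        if self._speaking is not None and not self._speaking.done():
            self._speaking.cancel()
```

Cancelling the task also stops the upstream LLM stream at its next await point, so no further tokens are consumed for a reply that was cut off.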

The Result

HelloCalls is live on the App Store and Google Play. Businesses can deploy a voice AI agent in minutes, not months. The same engineering approach (real-time systems, multi-provider resilience, agentic tool-calling) goes into every product we build. If you're building something that needs voice AI, or any AI integration into your existing product, let's talk.