branch:
README.md
3473 bytesRaw
# @cloudflare/voice-twilio
Twilio Media Streams adapter for the [Cloudflare Agents](https://github.com/cloudflare/agents) voice pipeline. Connects phone calls to your `VoiceAgent` — the same agent that handles web voice, text chat, and email can now answer the phone.
## How it works
```
Phone call → Twilio → Media Streams WebSocket → TwilioAdapter → VoiceAgent (Durable Object)
↓
STT → LLM → TTS
↓
Phone speaker ← Twilio ← mulaw 8kHz audio ← TwilioAdapter ← VoiceAgent
```
The adapter bridges Twilio's bidirectional Media Streams protocol (mulaw 8kHz, base64 JSON) to VoiceAgent's binary PCM protocol (16kHz, 16-bit LE). Audio resampling and encoding conversion happen automatically.
## Install
```bash
npm install @cloudflare/voice-twilio
```
## Usage
### 1. Add the adapter to your Worker
```typescript
import { Agent, routeAgentRequest } from "agents";
import { withVoice, type VoiceTurnContext } from "@cloudflare/voice";
import { TwilioAdapter } from "@cloudflare/voice-twilio";
const VoiceAgent = withVoice(Agent);
export class MyAgent extends VoiceAgent<Env> {
async onTurn(transcript: string, context: VoiceTurnContext) {
// Same agent handles both web and phone calls
return "Hello! How can I help you?";
}
}
export default {
async fetch(request: Request, env: Env) {
const url = new URL(request.url);
// Twilio sends WebSocket connections to this path
if (url.pathname === "/twilio") {
return TwilioAdapter.handleRequest(request, env, "MyAgent");
}
// Normal agent routing for web clients
return (
(await routeAgentRequest(request, env)) ??
new Response("Not found", { status: 404 })
);
}
};
```
### 2. Configure Twilio
In your Twilio console, set up a TwiML Bin or webhook that streams media to your Worker:
```xml
<Response>
<Connect>
<Stream url="wss://your-worker.your-account.workers.dev/twilio" />
</Connect>
</Response>
```
### 3. Assign a phone number
Attach the TwiML to a Twilio phone number. When someone calls that number, Twilio streams the audio to your Worker, which routes it to your VoiceAgent.
## Options
```typescript
TwilioAdapter.handleRequest(request, env, "MyAgent", {
// Use a custom instance name instead of the Twilio Call SID
instanceName: "shared-agent"
});
```
By default, each phone call creates a new VoiceAgent instance (using the Twilio Call SID as the instance name). Set `instanceName` to route multiple calls to the same agent instance.
## Limitations
- **TTS output format**: VoiceAgent's default TTS (Workers AI Deepgram Aura) outputs MP3. Twilio expects mulaw 8kHz. The adapter currently handles inbound audio conversion (mulaw → PCM) but outbound audio conversion (MP3 → mulaw) requires an MP3 decoder. For production use, configure a TTS provider that outputs mulaw or PCM directly, or use the `beforeSynthesize`/`afterSynthesize` hooks to handle format conversion.
## Same agent, every channel
The same `VoiceAgent` instance can handle:
- **Web voice** via `VoiceClient` / `useVoiceAgent`
- **Phone calls** via this Twilio adapter
- **Text chat** via `sendText()`
- **Email** via `routeAgentEmail()`
All channels share the same conversation history (SQLite), state, tools, and scheduling.