Web SDK setup
Step-by-step guide to integrating the Pipecat client SDK with Daily transport in your frontend application.
Install dependencies
Install the Pipecat client SDK and Daily transport packages:
# npm
npm install @pipecat-ai/client-js @pipecat-ai/daily-transport @daily-co/daily-js
# pnpm
pnpm add @pipecat-ai/client-js @pipecat-ai/daily-transport @daily-co/daily-js
# yarn
yarn add @pipecat-ai/client-js @pipecat-ai/daily-transport @daily-co/daily-js

TypeScript Support
All packages ship with built-in TypeScript declarations. No additional @types/* packages are needed.
Initialize the client
Create a PipecatClient with DailyTransport and register your event callbacks:
import { PipecatClient } from '@pipecat-ai/client-js';
import { DailyTransport } from '@pipecat-ai/daily-transport';
//
const client = new PipecatClient({
transport: new DailyTransport(),
enableMic: false,
enableCam: false,
callbacks: {
onConnected: () => client.enableMic(true),
onUserTranscript: (data) => {
if (data.final) appendToTranscript('user', data.text);
},
onBotLlmText: (data) => appendToStream('bot', data.text),
onBotStoppedSpeaking: () => finaliseStream('bot'),
onDisconnected: () => showStatus('ended')
}
});

| Option | Type | Description |
|---|---|---|
| transport | Transport | Transport instance — use DailyTransport for WebRTC |
| enableMic | boolean | Start with the mic off; enable it on onConnected so the bot’s greeting plays cleanly |
| enableCam | boolean | Always false for voice-only sessions |
| callbacks | PipecatClientCallbacks | Event handlers (full list in RTVI events) |
Connect to a session
Push a lead via the Leads API, then hand the returned lead_id to startBotAndConnect. The Breeze Buddy backend provisions the Daily room, spawns the bot, and responds with credentials that the SDK uses to join the room.
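If the browser holds a token that is permitted to create leads, that call can happen client-side too (otherwise see the server-proxied pattern below). A minimal sketch; the request body mirrors the server-side example further down, and the template/payload values are placeholders:

// Hedged sketch: create the lead from the browser. The body mirrors the
// server-proxied example below; template/payload values are placeholders.
const lead = await fetch('https://your-api.example.com/agent/voice/breeze-buddy/leads', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    Authorization: `Bearer ${apiToken}`
  },
  body: JSON.stringify({
    reseller_id: 'your-reseller',
    merchant_id: 'your-merchant',
    template: 'your-template',
    payload: { /* lead details */ },
    execution_mode: 'DAILY'
  })
}).then((r) => r.json());
// lead.lead_call_tracker_id is the lead_id you pass below.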
// Request mic permission once before the first connect.
await client.initDevices();
//
const endpoint = 'https://your-api.example.com/agent/voice/breeze-buddy/connect';
const headers = new Headers({
Authorization: `Bearer ${apiToken}`
});
//
await client.startBotAndConnect({
endpoint,
headers,
requestData: { lead_id: 'your-lead-id' }
});
//
// …later, when the conversation is complete:
await client.disconnect();

| Field | Required | Description |
|---|---|---|
| endpoint | Yes | Absolute URL of POST /agent/voice/breeze-buddy/connect on your Breeze Buddy backend. |
| headers | Yes | Pass your API token here (Authorization: Bearer …). See Authentication. |
| requestData.lead_id | Yes | The lead_call_tracker_id returned by the Leads API. |
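For reference, the connect endpoint's response carries the Daily credentials the SDK joins with. Per the server-proxied example below, the SDK consumes this minimal shape (additional fields may be present):

// Minimal response shape the browser's PipecatClient consumes
// (see the server-proxied example below).
interface ConnectResponse {
  room_url: string; // Daily room the bot has joined
  token: string; // short-lived Daily token for this participant
}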
Execution mode lives on the lead, not the client
The client always calls the same endpoint. The bot’s behaviour is decided by the lead’s execution_mode (DAILY, DAILY_TEST, or DAILY_STREAM) — see Build with Daily — Execution modes.
Server-proxied connect
The example above puts a Breeze Buddy API token in the browser. If you’d rather keep your S2S token on your server — and have your server own lead creation too — point startBotAndConnect at your backend instead. Your server calls Breeze Buddy with its S2S token and relays the Daily credentials back.
// Server endpoint the browser hits. Uses your S2S token; the browser never sees it.
export async function POST({ request }) {
const { agent_type } = await request.json();
const agent = getAgent(agent_type); // your lookup: { template, payload }
//
const lead = await fetch(`${BACKEND_URL}/agent/voice/breeze-buddy/leads`, {
method: 'POST',
headers: {
'Content-Type': 'application/json',
Authorization: `Bearer ${process.env.BREEZE_S2S_TOKEN}`
},
body: JSON.stringify({
reseller_id: 'your-reseller',
merchant_id: 'your-merchant',
template: agent.template,
payload: agent.payload,
execution_mode: 'DAILY_TEST'
})
}).then((r) => r.json());
//
const connect = await fetch(`${BACKEND_URL}/agent/voice/breeze-buddy/connect`, {
method: 'POST',
headers: {
'Content-Type': 'application/json',
Authorization: `Bearer ${process.env.BREEZE_S2S_TOKEN}`
},
body: JSON.stringify({ lead_id: lead.lead_call_tracker_id })
}).then((r) => r.json());
//
// The browser's PipecatClient expects { room_url, token } in the body.
return Response.json(connect);
}

// Point the SDK at your server; no Authorization header from the browser.
await client.initDevices();
await client.startBotAndConnect({
endpoint: '/api/voice/start',
requestData: { agent_type: 'appointment-reminder' }
});

When to pick this pattern
Use server-proxied connect when any of these apply:
- you can’t expose an API token in the browser;
- the lead payload is sensitive or derived server-side;
- you want a single integration seam (your own /api/voice/start) instead of two Breeze Buddy endpoints.
Direct connect is simpler when the browser already has a short-lived scoped token.
Event handling
Callbacks are registered in the constructor (above). The full catalog — user/bot speech, LLM tokens, function calls, metrics, errors, custom server messages — lives in RTVI events. A couple of common ones:
// Inside the PipecatClient(...) callbacks object:
onUserTranscript: (data) => {
if (data.final) appendToTranscript('user', data.text);
else renderInterim(data.text);
},
onBotLlmText: (data) => appendToStream('bot', data.text),
onBotStoppedSpeaking: () => finaliseStream('bot'),
onLLMFunctionCall: (data) => {
if (data.function_name === 'appointment_confirmed') {
showConfirmation(data.arguments);
}
},
onTransportStateChanged: (state) => {
if (state === 'error') showError('Connection lost');
}
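If you prefer wiring handlers after construction, the client also exposes an event-emitter interface. A sketch, assuming the RTVIEvent export and on/off methods from @pipecat-ai/client-js:

import { RTVIEvent } from '@pipecat-ai/client-js';

// Assumed emitter API: subscribe after construction, detach on teardown.
const onTranscript = (data) => {
  if (data.final) appendToTranscript('user', data.text);
};
client.on(RTVIEvent.UserTranscript, onTranscript);
// ...later:
client.off(RTVIEvent.UserTranscript, onTranscript);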
Sending messages to the bot
Client → server messages go through sendClientMessage(type, data). Today Breeze Buddy handles a single message type end-to-end: tts-speak, available in DAILY_STREAM mode.
// Tell the bot to speak a specific utterance (stream mode only).
await client.sendClientMessage('tts-speak', {
text: 'Hello! Let me confirm your appointment details.'
});
//
// Chain multiple — they queue FIFO in the pipeline.
await client.sendClientMessage('tts-speak', { text: 'I see you booked April 15.' });
await client.sendClientMessage('tts-speak', { text: 'Shall I go ahead and confirm?' });

Rules of the road
- Only works when the lead’s execution_mode is DAILY_STREAM. In agent modes (DAILY, DAILY_TEST) the LLM owns what the bot says — the message is ignored.
- Queues after the current utterance (no interruption). For barge-in, speak into the mic — VAD will handle it.
- Max 2000 characters per call; longer strings are silently truncated server-side.
- Each tts-speak fires the normal onBotTts*, onBotStartedSpeaking, and onBotStoppedSpeaking events — reflect the spoken text in your transcript UI from those listeners (see the sketch below).
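A sketch of driving the transcript from those TTS events, assuming the payload carries a text field like the other callbacks:

// Inside the PipecatClient callbacks object: mirror what the bot actually
// speaks (including queued tts-speak text) into the transcript UI.
onBotTtsText: (data) => appendToStream('bot', data.text),
onBotStoppedSpeaking: () => finaliseStream('bot')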
See the full event reference for details.
Audio visualization
The remote bot audio track arrives on onTrackStarted — attach an AudioContext analyser to it for waveform or volume visuals:
// Inside the PipecatClient callbacks object:
onTrackStarted: (track, participant) => {
if (track.kind !== 'audio' || participant?.local) return;
const ctx = new AudioContext();
const source = ctx.createMediaStreamSource(new MediaStream([track]));
const analyser = ctx.createAnalyser();
analyser.fftSize = 256;
source.connect(analyser);
const bins = new Uint8Array(analyser.frequencyBinCount);
const draw = () => {
analyser.getByteFrequencyData(bins);
renderBars(bins);
requestAnimationFrame(draw);
};
draw();
}

Error handling
Handle network drops, permission denials, and pipeline errors via the transport-state and custom-message channels:
// Inside the PipecatClient callbacks object:
onTransportStateChanged: (state) => {
if (state === 'error') showError('Unable to connect. Check your network.');
},
onDisconnected: () => {
if (!intentionalDisconnect) {
showError('Connection lost. Please refresh to reconnect.');
}
},
onServerMessage: (msg) => {
if (msg.type === 'pipeline-error') {
showError('The voice assistant encountered an error. Reconnecting...');
setTimeout(() => reconnect(currentLeadId), 2000);
}
}
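The reconnect(currentLeadId) helper above is not part of the SDK. A minimal sketch, reusing the endpoint and headers from the connect example earlier:

// Hypothetical helper: tear down any existing session, then start a fresh
// one for the same lead. Not an SDK function.
async function reconnect(leadId) {
  try {
    await client.disconnect();
  } catch {
    // already disconnected
  }
  await client.startBotAndConnect({
    endpoint,
    headers,
    requestData: { lead_id: leadId }
  });
}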
Mic permission denied
Browser microphone denials surface as a rejected promise. Catch the rejection from the initDevices() / startBotAndConnect() await and prompt the user to grant access.
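A sketch of catching the denial, assuming the underlying getUserMedia error surfaces unchanged:

try {
  await client.initDevices();
  await client.startBotAndConnect({ endpoint, headers, requestData: { lead_id: leadId } });
} catch (err) {
  // NotAllowedError is what getUserMedia throws on a denied permission prompt.
  if (err instanceof DOMException && err.name === 'NotAllowedError') {
    showError('Microphone access is required. Please allow it and try again.');
  } else {
    showError('Unable to start the voice session.');
  }
}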
React integration
Example React hook wrapping the client lifecycle:
import { useEffect, useRef, useState, useCallback } from 'react';
import { PipecatClient } from '@pipecat-ai/client-js';
import { DailyTransport } from '@pipecat-ai/daily-transport';
//
type Status = 'idle' | 'connecting' | 'active' | 'ended' | 'error';
//
export function useVoiceSession(baseUrl: string, apiToken: string) {
const clientRef = useRef<PipecatClient | null>(null);
const [status, setStatus] = useState<Status>('idle');
const [transcript, setTranscript] = useState<string[]>([]);
//
useEffect(() => {
const client = new PipecatClient({
transport: new DailyTransport(),
enableMic: false,
enableCam: false,
callbacks: {
onConnected: () => { client.enableMic(true); setStatus('active'); },
onDisconnected: () => setStatus('ended'),
onTransportStateChanged: (s) => { if (s === 'error') setStatus('error'); },
onUserTranscript: (d) => {
if (d.final) setTranscript((t) => [...t, `You: ${d.text}`]);
},
onBotLlmText: (d) => {
  // Append streamed tokens to the latest bot line, not one entry per token.
  setTranscript((t) =>
    t[t.length - 1]?.startsWith('Bot: ')
      ? [...t.slice(0, -1), t[t.length - 1] + d.text]
      : [...t, `Bot: ${d.text}`]
  );
}
}
});
clientRef.current = client;
return () => { client.disconnect(); };
}, []); // create the client once; connect() reads baseUrl and apiToken at call time
//
const connect = useCallback(async (leadId: string) => {
setStatus('connecting');
await clientRef.current?.initDevices();
await clientRef.current?.startBotAndConnect({
endpoint: `${baseUrl}/agent/voice/breeze-buddy/connect`,
headers: new Headers({ Authorization: `Bearer ${apiToken}` }),
requestData: { lead_id: leadId }
});
}, [baseUrl, apiToken]);
//
const disconnect = useCallback(async () => {
await clientRef.current?.disconnect();
}, []);
//
return { status, transcript, connect, disconnect };
}

Vanilla JavaScript
For non-framework projects, the same SDK works directly:
<script type="module">
import { PipecatClient } from 'https://esm.sh/@pipecat-ai/client-js';
import { DailyTransport } from 'https://esm.sh/@pipecat-ai/daily-transport';
//
const client = new PipecatClient({
transport: new DailyTransport(),
enableMic: false,
enableCam: false,
callbacks: {
onConnected: () => client.enableMic(true),
onUserTranscript: (d) => {
if (!d.final) return;
document.getElementById('transcript').innerHTML +=
'<p><strong>User:</strong> ' + d.text + '</p>';
},
onBotLlmText: (d) => {
document.getElementById('transcript').innerHTML +=
'<p><strong>Bot:</strong> ' + d.text + '</p>';
}
}
});
//
document.getElementById('connect-btn').onclick = async () => {
await client.initDevices();
await client.startBotAndConnect({
endpoint: 'https://your-api.example.com/agent/voice/breeze-buddy/connect',
headers: new Headers({ Authorization: 'Bearer ' + apiToken }),
requestData: { lead_id: document.getElementById('lead-input').value }
});
};
document.getElementById('disconnect-btn').onclick = () => client.disconnect();
</script>

Mobile Considerations
- iOS Safari requires a user gesture (tap) before AudioContext can start. Call client.initDevices() and client.startBotAndConnect() inside a click/tap handler, not on page load.
- Android WebView — ensure WebSettings.setMediaPlaybackRequiresUserGesture(false) is set and microphone permissions are granted via onPermissionRequest.
- Background audio — mobile browsers may suspend WebRTC connections when the app is backgrounded. Listen for visibilitychange events and notify users if the session may be interrupted (see the sketch after this list).
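A sketch of the visibilitychange handling; sessionActive and showStatus stand in for your own state and UI helpers:

// Track session state in your onConnected / onDisconnected callbacks.
let sessionActive = false;

document.addEventListener('visibilitychange', () => {
  if (document.hidden && sessionActive) {
    // The mobile browser may suspend the WebRTC connection while hidden.
    showStatus('The session may pause while the app is in the background.');
  }
});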