Web SDK setup

Step-by-step guide to integrating the Pipecat client SDK with Daily transport in your frontend application.

Install dependencies

Install the Pipecat client SDK and Daily transport packages:

terminal
bash
# npm
npm install @pipecat-ai/client-js @pipecat-ai/daily-transport @daily-co/daily-js
# pnpm
pnpm add @pipecat-ai/client-js @pipecat-ai/daily-transport @daily-co/daily-js
# yarn
yarn add @pipecat-ai/client-js @pipecat-ai/daily-transport @daily-co/daily-js

TypeScript support

All packages ship with built-in TypeScript declarations. No additional @types/* packages are needed.

Initialize the client

Create a PipecatClient with DailyTransport and register your event callbacks:

voice-client.ts
typescript
import { PipecatClient } from '@pipecat-ai/client-js';
import { DailyTransport } from '@pipecat-ai/daily-transport';
//
const client = new PipecatClient({
  transport: new DailyTransport(),
  enableMic: false,
  enableCam: false,
  callbacks: {
    onConnected: () => client.enableMic(true),
    onUserTranscript: (data) => {
      if (data.final) appendToTranscript('user', data.text);
    },
    onBotLlmText: (data) => appendToStream('bot', data.text),
    onBotStoppedSpeaking: () => finaliseStream('bot'),
    onDisconnected: () => showStatus('ended')
  }
});
Option | Type | Description
--- | --- | ---
transport | Transport | Transport instance — use DailyTransport for WebRTC
enableMic | boolean | Start with the mic off; enable it on onConnected so the bot’s greeting plays cleanly
enableCam | boolean | Always false for voice-only sessions
callbacks | PipecatClientCallbacks | Event handlers (full list in RTVI events)
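
Because the client is constructed with enableMic: false, a push-to-talk or mute button only needs to flip enableMic. A minimal sketch — makeMicToggle and the MicClient interface are illustrative helpers, not SDK API:

```typescript
// Minimal slice of the client surface this sketch touches.
interface MicClient {
  enableMic(enable: boolean): void;
}

// Returns a toggle that tracks mute state locally and reports the new
// state so the UI can update its button label.
function makeMicToggle(client: MicClient): () => boolean {
  let micOn = false; // matches enableMic: false at construction
  return () => {
    micOn = !micOn;
    client.enableMic(micOn);
    return micOn;
  };
}
```

Wire it to a button, e.g. `muteBtn.onclick = () => { muteBtn.textContent = toggle() ? 'Mute' : 'Unmute'; }`.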

Connect to a session

Push a lead via the Leads API, then pass the returned lead_id to startBotAndConnect. The Breeze Buddy backend provisions the Daily room, spawns the bot, and responds with credentials that the SDK uses to join the room.

session.ts
typescript
// Request mic permission once before the first connect.
await client.initDevices();
//
const endpoint = 'https://your-api.example.com/agent/voice/breeze-buddy/connect';
const headers = new Headers({
  Authorization: `Bearer ${apiToken}`
});
//
await client.startBotAndConnect({
  endpoint,
  headers,
  requestData: { lead_id: 'your-lead-id' }
});
//
// …later, when the conversation is complete:
await client.disconnect();
Field | Required | Description
--- | --- | ---
endpoint | Yes | Absolute URL of POST /agent/voice/breeze-buddy/connect on your Breeze Buddy backend.
headers | Yes | Pass your API token here (Authorization: Bearer …). See Authentication.
requestData.lead_id | Yes | The lead_call_tracker_id returned by the Leads API.

Execution mode lives on the lead, not the client

The client always calls the same endpoint. The bot’s behaviour is decided by the lead’s execution_mode (DAILY, DAILY_TEST, or DAILY_STREAM) — see Build with Daily — Execution modes.

Server-proxied connect

The example above puts a Breeze Buddy API token in the browser. If you’d rather keep your S2S token on your server — and have your server own lead creation too — point startBotAndConnect at your backend instead. Your server calls Breeze Buddy with its S2S token and relays the Daily credentials back.

1. Browser: startBotAndConnect(endpoint: /api/voice/start)
2. Your server: POST /leads (S2S token)
3. Your server: POST /connect (S2S token)
4. Your server: relays { room_url, token } back to the browser
5. Browser joins the Daily room

api/voice/start.server.ts
typescript
// Server endpoint the browser hits. Uses your S2S token; the browser never sees it.
export async function POST({ request }) {
  const { agent_type } = await request.json();
  const agent = getAgent(agent_type); // your lookup: { template, payload }
  //
  const lead = await fetch(`${BACKEND_URL}/agent/voice/breeze-buddy/leads`, {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      Authorization: `Bearer ${process.env.BREEZE_S2S_TOKEN}`
    },
    body: JSON.stringify({
      reseller_id: 'your-reseller',
      merchant_id: 'your-merchant',
      template: agent.template,
      payload: agent.payload,
      execution_mode: 'DAILY_TEST'
    })
  }).then((r) => r.json());
  //
  const connect = await fetch(`${BACKEND_URL}/agent/voice/breeze-buddy/connect`, {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      Authorization: `Bearer ${process.env.BREEZE_S2S_TOKEN}`
    },
    body: JSON.stringify({ lead_id: lead.lead_call_tracker_id })
  }).then((r) => r.json());
  //
  // The browser's PipecatClient expects { room_url, token } in the body.
  return Response.json(connect);
}
client.ts
typescript
// Point the SDK at your server; no Authorization header from the browser.
await client.initDevices();
await client.startBotAndConnect({
  endpoint: '/api/voice/start',
  requestData: { agent_type: 'appointment-reminder' }
});

When to pick this pattern

Use server-proxied connect when any of these apply:

  • You can’t expose an API token in the browser.
  • The lead payload is sensitive or derived server-side.
  • You want a single integration seam (your own /api/voice/start) instead of two Breeze Buddy endpoints.

Direct connect is simpler when the browser already has a short-lived scoped token.

Event handling

Callbacks are registered in the constructor (above). The full catalog — user/bot speech, LLM tokens, function calls, metrics, errors, custom server messages — lives in RTVI events. A couple of common ones:

events.ts
typescript
// Inside the PipecatClient(...) callbacks object:
onUserTranscript: (data) => {
  if (data.final) appendToTranscript('user', data.text);
  else renderInterim(data.text);
},
onBotLlmText: (data) => appendToStream('bot', data.text),
onBotStoppedSpeaking: () => finaliseStream('bot'),
onLLMFunctionCall: (data) => {
  if (data.function_name === 'appointment_confirmed') {
    showConfirmation(data.arguments);
  }
},
onTransportStateChanged: (state) => {
  if (state === 'error') showError('Connection lost');
}
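
The appendToStream and finaliseStream helpers above are your own UI code, not SDK exports. One possible shape is a per-speaker buffer that accumulates streamed chunks and commits a full transcript line when the bot stops speaking — a sketch under those assumptions:

```typescript
// Finalised lines land in `transcript`; in-flight streamed text lives in `streams`.
const transcript: string[] = [];
const streams = new Map<string, string>();

// Accumulate a streamed chunk (e.g. LLM tokens) for a speaker.
function appendToStream(speaker: string, chunk: string): void {
  streams.set(speaker, (streams.get(speaker) ?? '') + chunk);
}

// Commit the buffered text as one transcript line and reset the buffer.
function finaliseStream(speaker: string): void {
  const text = streams.get(speaker);
  if (text) transcript.push(`${speaker}: ${text}`);
  streams.delete(speaker);
}
```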

Sending messages to the bot

Client → server messages go through sendClientMessage(type, data). Today Breeze Buddy handles a single message type end-to-end: tts-speak, available in DAILY_STREAM mode.

client-messages.ts
typescript
// Tell the bot to speak a specific utterance (stream mode only).
await client.sendClientMessage('tts-speak', {
  text: 'Hello! Let me confirm your appointment details.'
});
//
// Chain multiple — they queue FIFO in the pipeline.
await client.sendClientMessage('tts-speak', { text: 'I see you booked April 15.' });
await client.sendClientMessage('tts-speak', { text: 'Shall I go ahead and confirm?' });

Rules of the road

  • Only works when the lead’s execution_mode is DAILY_STREAM. In agent modes (DAILY, DAILY_TEST) the LLM owns what the bot says — the message is ignored.
  • Queues after the current utterance (no interruption). For barge-in, speak into the mic — VAD will handle it.
  • Max 2000 characters per call; longer strings are silently truncated server-side.
  • Each tts-speak fires the normal onBotTts*, onBotStartedSpeaking, and onBotStoppedSpeaking events — reflect the spoken text in your transcript UI from those listeners.
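
Given the 2000-character truncation, one defensive pattern is to split long text client-side and queue one tts-speak per chunk — since sends queue FIFO, the utterances play in order. chunkForTts below is an illustrative helper, not part of the SDK:

```typescript
// The server truncates past 2000 characters, so break long text into
// chunks at word boundaries and send each separately.
const TTS_MAX = 2000;

function chunkForTts(text: string, max: number = TTS_MAX): string[] {
  const chunks: string[] = [];
  let rest = text.trim();
  while (rest.length > max) {
    // Prefer to break at the last space inside the limit.
    let cut = rest.lastIndexOf(' ', max);
    if (cut <= 0) cut = max;
    chunks.push(rest.slice(0, cut));
    rest = rest.slice(cut).trimStart();
  }
  if (rest) chunks.push(rest);
  return chunks;
}

// Usage sketch (stream mode only):
// for (const part of chunkForTts(longText)) {
//   await client.sendClientMessage('tts-speak', { text: part });
// }
```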

See the full event reference for details.

Audio visualization

The remote bot audio track arrives on onTrackStarted — attach an AudioContext analyser to it for waveform or volume visuals:

visualizer.ts
typescript
// Inside the PipecatClient callbacks object:
onTrackStarted: (track, participant) => {
  if (track.kind !== 'audio' || participant?.local) return;
  const ctx = new AudioContext();
  const source = ctx.createMediaStreamSource(new MediaStream([track]));
  const analyser = ctx.createAnalyser();
  analyser.fftSize = 256;
  source.connect(analyser);
  const bins = new Uint8Array(analyser.frequencyBinCount);
  const draw = () => {
    analyser.getByteFrequencyData(bins);
    renderBars(bins);
    requestAnimationFrame(draw);
  };
  draw();
}
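
If you want a single pulsing volume meter rather than bars, the same bins reduce to one 0–1 level per frame. volumeLevel is an illustrative helper you would call inside draw() in place of renderBars:

```typescript
// Reduce byte frequency bins (0-255 each) to one volume level in [0, 1].
function volumeLevel(bins: Uint8Array): number {
  if (bins.length === 0) return 0;
  let sum = 0;
  for (let i = 0; i < bins.length; i++) sum += bins[i];
  return sum / bins.length / 255;
}
```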

Error handling

Handle network drops, permission denials, and pipeline errors via the transport-state and custom-message channels:

error-handling.ts
typescript
// Inside the PipecatClient callbacks object:
onTransportStateChanged: (state) => {
  if (state === 'error') showError('Unable to connect. Check your network.');
},
onDisconnected: () => {
  if (!intentionalDisconnect) {
    showError('Connection lost. Please refresh to reconnect.');
  }
},
onServerMessage: (msg) => {
  if (msg.type === 'pipeline-error') {
    showError('The voice assistant encountered an error. Reconnecting...');
    setTimeout(() => reconnect(currentLeadId), 2000);
  }
}

Mic permission denied

Browser microphone denials surface as rejected promises. Wrap the initDevices() / startBotAndConnect() awaits in a try/catch and prompt the user to grant access.
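
A try/catch around the connect flow can then branch on the rejection's name. The message strings and the micErrorMessage helper below are illustrative, not SDK API:

```typescript
// Map getUserMedia-style rejection names to user-facing guidance.
function micErrorMessage(err: { name?: string }): string {
  switch (err.name) {
    case 'NotAllowedError':
      return 'Microphone access was denied. Enable it in your browser settings and retry.';
    case 'NotFoundError':
      return 'No microphone was found. Plug one in and retry.';
    default:
      return 'Could not start audio. Please check your devices and retry.';
  }
}

// Usage sketch around the SDK calls:
// try {
//   await client.initDevices();
//   await client.startBotAndConnect({ endpoint, headers, requestData });
// } catch (err) {
//   showError(micErrorMessage(err as { name?: string }));
// }
```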

React integration

Example React hook wrapping the client lifecycle:

useVoiceSession.tsx
tsx
import { useEffect, useRef, useState, useCallback } from 'react';
import { PipecatClient } from '@pipecat-ai/client-js';
import { DailyTransport } from '@pipecat-ai/daily-transport';
//
type Status = 'idle' | 'connecting' | 'active' | 'ended' | 'error';
//
export function useVoiceSession(baseUrl: string, apiToken: string) {
  const clientRef = useRef<PipecatClient | null>(null);
  const [status, setStatus] = useState<Status>('idle');
  const [transcript, setTranscript] = useState<string[]>([]);
  //
  useEffect(() => {
    const client = new PipecatClient({
      transport: new DailyTransport(),
      enableMic: false,
      enableCam: false,
      callbacks: {
        onConnected: () => { client.enableMic(true); setStatus('active'); },
        onDisconnected: () => setStatus('ended'),
        onTransportStateChanged: (s) => { if (s === 'error') setStatus('error'); },
        onUserTranscript: (d) => {
          if (d.final) setTranscript((t) => [...t, `You: ${d.text}`]);
        },
        onBotLlmText: (d) => {
          // Naive: each streamed LLM chunk becomes its own transcript line;
          // buffer chunks and flush on onBotStoppedSpeaking in production.
          setTranscript((t) => [...t, `Bot: ${d.text}`]);
        }
      }
    });
    clientRef.current = client;
    return () => { client.disconnect(); };
  }, []); // the client is created once per mount; connect() reads baseUrl/apiToken
  //
  const connect = useCallback(async (leadId: string) => {
    setStatus('connecting');
    await clientRef.current?.initDevices();
    await clientRef.current?.startBotAndConnect({
      endpoint: `${baseUrl}/agent/voice/breeze-buddy/connect`,
      headers: new Headers({ Authorization: `Bearer ${apiToken}` }),
      requestData: { lead_id: leadId }
    });
  }, [baseUrl, apiToken]);
  //
  const disconnect = useCallback(async () => {
    await clientRef.current?.disconnect();
  }, []);
  //
  return { status, transcript, connect, disconnect };
}

Vanilla JavaScript

For non-framework projects, the same SDK works directly:

index.html
html
<script type="module">
  import { PipecatClient } from 'https://esm.sh/@pipecat-ai/client-js';
  import { DailyTransport } from 'https://esm.sh/@pipecat-ai/daily-transport';
  //
  const client = new PipecatClient({
    transport: new DailyTransport(),
    enableMic: false,
    enableCam: false,
    callbacks: {
      onConnected: () => client.enableMic(true),
      onUserTranscript: (d) => {
        if (!d.final) return;
        document.getElementById('transcript').innerHTML +=
          '<p><strong>User:</strong> ' + d.text + '</p>';
      },
      onBotLlmText: (d) => {
        document.getElementById('transcript').innerHTML +=
          '<p><strong>Bot:</strong> ' + d.text + '</p>';
      }
    }
  });
  //
  document.getElementById('connect-btn').onclick = async () => {
    await client.initDevices();
    await client.startBotAndConnect({
      endpoint: 'https://your-api.example.com/agent/voice/breeze-buddy/connect',
      headers: new Headers({ Authorization: 'Bearer ' + apiToken }),
      requestData: { lead_id: document.getElementById('lead-input').value }
    });
  };
  document.getElementById('disconnect-btn').onclick = () => client.disconnect();
</script>

Mobile considerations

iOS Safari requires a user gesture (tap) before AudioContext can start. Call client.initDevices() + client.startBotAndConnect() inside a click/tap handler, not on page load.

Android WebView — ensure WebSettings.setMediaPlaybackRequiresUserGesture(false) is set and microphone permissions are granted via onPermissionRequest.

Background audio — mobile browsers may suspend WebRTC connections when the app is backgrounded. Listen for visibilitychange events and notify users if the session may be interrupted.
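
A visibilitychange watcher stays testable if the document-like object is injected; in a page you would pass document and surface the warning through your own UI helper (sketch, names are illustrative):

```typescript
// Minimal document surface this sketch relies on.
type VisibilityDoc = {
  visibilityState: string;
  addEventListener(type: string, cb: () => void): void;
};

// Invoke onBackground whenever the page becomes hidden.
function watchBackground(doc: VisibilityDoc, onBackground: () => void): void {
  doc.addEventListener('visibilitychange', () => {
    if (doc.visibilityState === 'hidden') onBackground();
  });
}

// Usage sketch in the page (showStatus is your own UI helper):
// watchBackground(document, () =>
//   showStatus('Audio may pause while the app is in the background.'));
```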
