How-to

Debug a failed call

The short list of places to look when a call ended badly: lead record, Langfuse trace, recording, webhook delivery.

A call went wrong. The user hung up, the bot hallucinated, or the call never dialled. This guide lists the four places to look — in the right order. You’ll find the cause in one of them.

Prerequisites

The lead_id or call_sid for the failed call.
Langfuse access on the configured base URL.
Log access (Loguru structured logs, typically in your observability stack).
Webhook delivery logs for your reporting webhook, if you have one.

Step 1 — Read the lead record

Fetch the lead with GET /agent/voice/breeze-buddy/leads/{lead_id}. This record is the single source of truth for what happened at the lead level.

Look at:

status — one of BACKLOG, PROCESSING, RETRY, FINISHED. FINISHED is terminal; read outcome for what actually happened.
outcome — confirmed, rescheduled, PRECHECK_FAILED, no_answer, call_error, etc.
metaData.transcription — full back-and-forth turn by turn.
metaData.errors — array of errors captured by handlers; empty on clean runs.
metaData.call_ended_by — agent, customer, timeout, error.
metaData.node_traversal — ordered list of nodes visited; useful when the call ended on the wrong node.

If status = BACKLOG and it’s been there a while, check the template’s IST calling hours, per-number rate limits, and that payload.customer_mobile_number was set. A missing customer_mobile_number is the single most common reason a lead sits in BACKLOG.
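The checks in this step can be scripted. The following is a minimal sketch, not the official client: the base URL, the bearer-token auth header, and the assumption that the response is a JSON object with status, outcome, payload, and metaData at the top level are all placeholders to adapt to your deployment. Only the endpoint path and the field names come from this guide.

```python
import json
import urllib.request
from typing import Any

# Assumption: replace with the base URL of your own deployment.
BASE_URL = "https://breeze-buddy.example.com"


def fetch_lead(lead_id: str, token: str) -> dict[str, Any]:
    """GET the lead record. Bearer auth is an assumption, not a documented scheme."""
    req = urllib.request.Request(
        f"{BASE_URL}/agent/voice/breeze-buddy/leads/{lead_id}",
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)


def summarise_lead(lead: dict[str, Any]) -> dict[str, Any]:
    """Pull out the fields listed above, tolerating missing keys."""
    meta = lead.get("metaData") or {}
    payload = lead.get("payload") or {}  # assumed location of the payload
    summary = {
        "status": lead.get("status"),
        "outcome": lead.get("outcome"),
        "call_ended_by": meta.get("call_ended_by"),
        "errors": meta.get("errors") or [],
        "node_traversal": meta.get("node_traversal") or [],
        "hints": [],
    }
    # Flag the BACKLOG cause this guide calls the most common one.
    if summary["status"] == "BACKLOG" and not payload.get("customer_mobile_number"):
        summary["hints"].append("payload.customer_mobile_number is missing")
    return summary
```

Calling summarise_lead(fetch_lead(lead_id, token)) gives a compact view to paste into an incident note. Calling hours and rate limits still need checking by hand against the template.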

Step 2 — Open the Langfuse trace

Every call generates a trace keyed on the call SID. The lead record exposes this as call_id (it’s the same value). Paste lead.call_id into Langfuse’s filter to pull up the trace — that’s your single correlation key between a lead and its LLM run.

In the trace you see:

Every LLM call with its full prompt and response.
Function calls and their arguments.
Evaluator scores (if configured).
Latency per node.

Common patterns:

Wrong function called — tighten the function description or node task_messages.
Hallucinated data — add an explicit constraint in task_instructions.
Stuck in a loop — look for missing transition_to on a function or a function returning the same outcome repeatedly.
Empty response — often an STT issue, not an LLM issue. Continue to step 3.
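For the "stuck in a loop" pattern, a quick pre-check against the lead record can tell you which node to focus on before you open the trace. This sketch assumes metaData.node_traversal is a list of node-name strings and that a looping call records the same node more than once; neither is confirmed here, and the threshold of 3 is arbitrary.

```python
from collections import Counter


def repeated_nodes(node_traversal: list[str], threshold: int = 3) -> dict[str, int]:
    """Return nodes visited at least `threshold` times, with their visit counts."""
    counts = Counter(node_traversal)
    return {node: n for node, n in counts.items() if n >= threshold}
```

A node that shows up here is a good place to start looking for a missing transition_to or a function that keeps returning the same outcome.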

Step 3 — Listen to the recording

The lead record includes a recording URL when the call completed with audio exchange. Listen to the last 30–60 seconds — 90% of “what actually happened” becomes obvious.

If the recording is silent or cuts off abruptly, the cause is likely a telephony or VAD issue. Check:

VAD timings — start_secs, stop_secs in the VAD config. Too tight = clipping; too loose = long pauses.
STT events — transcription events in the trace. Missing or malformed transcriptions point to STT problems.
Provider callback — the provider webhook log for this call.

Step 4 — Check webhook delivery

If you rely on the reporting webhook for downstream processing, verify it was delivered. The lead record’s metaData may include retry history.

If the webhook failed repeatedly, look at the receiving system — often authentication or TLS issues, not Breeze Buddy.
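To separate authentication failures from TLS failures on the receiving side, you can send a test POST to your own endpoint and classify the error. This is a rough sketch using only the Python standard library; the category names are illustrative, and the probe payload and headers are whatever your receiver expects, not a Breeze Buddy format.

```python
import json
import ssl
import urllib.error
import urllib.request


def classify_delivery_error(exc: Exception) -> str:
    """Map an exception raised by urllib.request.urlopen to a rough category."""
    # HTTPError subclasses URLError, so it must be checked first.
    if isinstance(exc, urllib.error.HTTPError):
        if exc.code in (401, 403):
            return "auth"
        return "server" if exc.code >= 500 else "client"
    if isinstance(exc, urllib.error.URLError):
        # urlopen wraps handshake failures in URLError(reason=SSLError).
        return "tls" if isinstance(exc.reason, ssl.SSLError) else "network"
    if isinstance(exc, ssl.SSLError):
        return "tls"
    return "unknown"


def probe(url: str, payload: dict, headers: dict[str, str]) -> str:
    """POST a test payload to your receiver and report what happened."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", **headers},
        method="POST",
    )
    try:
        with urllib.request.urlopen(req, timeout=10) as resp:
            return f"ok ({resp.status})"
    except Exception as exc:  # broad on purpose: this is a diagnostic tool
        return classify_delivery_error(exc)
```

A result of "auth" or "tls" points at the receiving system's credentials or certificate chain rather than at the sender.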

Fast diagnostic matrix

Symptom	First place to look
Call never dialled	Lead record `status`. Check calling hours, rate limits, pre-checks.
Bot hallucinated	Langfuse trace → LLM prompt/response.
Bot stuck in loop	Langfuse trace → function calls and transitions.
User hung up abruptly	Recording first, then Langfuse.
No audio at all	VAD config + STT trace events.
Transfer failed	Lead record `metaData.errors` + provider callback log.
Webhook didn’t fire	Webhook delivery log + receiving system.

Keep notes in the trace

If you diagnose a real issue, add a comment in Langfuse on the trace. Future you (or future on-call) will find it during pattern matching.

Next steps

Observability: How tracing is wired.
Langfuse auto-evaluation: The alert loop that tells you a call failed.
Leads API reference: Full lead record shape.
Test locally: Reproduce the failure against a local backend.
