How to build a no-touch pipeline from sales calls to CRM

Published on May 15, 2026
by Ani Ghazaryan

TL;DR: Manual CRM entry breaks sales intelligence pipelines because reps skip fields and misremember details, creating corrupted deal data that spreads into forecasts, coaching scores, and follow-up tasks. The bottleneck in fixing this isn't the CRM API or the LLM prompt; it's the transcription layer, since a high word error rate corrupts every entity Claude extracts downstream. This tutorial walks through a production-ready pipeline using Gladia's async STT for transcription, Claude for entity extraction, and n8n for orchestration, with most teams reaching production in under 24 hours. Gladia's Solaria-1 model delivers on average 29% lower WER than alternatives on conversational speech, directly protecting the accuracy of every deal record written to the CRM.

Most engineering teams building sales intelligence pipelines obsess over the Claude prompt while ignoring the transcript quality underneath. If your STT layer produces high WER on an accented call, Claude extracts a corrupted company name, wrong budget figure, or garbled next step. That error propagates to the CRM, the coaching scorecard, and the forecast, and by the time the AE notices the deal record is wrong, you've pushed bad data three systems deep.

This tutorial covers exactly how to build a no-touch pipeline that extracts structured deal data from recorded sales calls and pushes it to Attio or HubSpot. We'll use Gladia for async transcription and diarization, Claude for structured entity extraction, and n8n for orchestration. The stack runs in production without custom infrastructure, and most teams are live in under a day.

Automate CRM updates from sales calls

Sales intelligence CRM automation converts raw call audio into structured deal data and writes it into your CRM without manual intervention. You build a pipeline that reliably extracts specific fields, such as company name, deal stage, budget, decision-maker, and next steps, and maps them to the correct CRM objects on every call, not just the ones with clean audio.

The gap between "we have call recordings" and "our CRM reflects what was said" is exactly where engineering teams build or buy. For a sales-specific CRM pipeline, you need speaker attribution to know whether the rep or the prospect stated the budget, and you need sub-3% WER to trust the extracted values. We covered the broader pattern in our async transcription architecture guide. The same principles apply here with tighter schema requirements.

No-touch CRM integration architecture

The pipeline connects four services in sequence:

  1. Raw audio: A call recording file (WAV, M4A, FLAC, AAC, or a URL) arrives via webhook or cloud storage trigger.
  2. Gladia async transcription: You submit the file to the pre-recorded transcription endpoint with diarization enabled, and Gladia returns a JSON transcript with speaker labels, word-level timestamps, and language detection.
  3. Claude entity extraction: You pass the diarized transcript to Claude with a structured JSON schema prompt, and Claude returns a clean deal object.
  4. n8n orchestration: An n8n workflow connects all three stages, runs validation logic, deduplicates against the CRM, and writes the deal record via upsert.

Using n8n for orchestration means you don't maintain custom microservices for each integration. When the CRM API changes or you swap Claude model versions, you update one workflow node. The n8n HTTP Request node handles both Gladia and Anthropic endpoints without custom code.
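
For reference, the same four-stage sequence can be sketched in Python with each stage injected as a callable, which keeps the sequencing logic testable without live credentials. All function names here are illustrative, not part of any SDK:

```python
def run_pipeline(audio_url, transcribe, extract, upsert_crm):
    """Run transcription -> extraction -> CRM write for one recording.

    transcribe, extract, and upsert_crm are injected callables standing in
    for the Gladia, Claude, and CRM stages described in this tutorial.
    """
    transcript = transcribe(audio_url)      # stage 2: Gladia async job
    deal = extract(transcript)              # stage 3: Claude entity extraction
    if deal.get("company_name") is None:    # minimal validation gate
        return {"status": "needs_review", "deal": deal}
    return {"status": "written", "record": upsert_crm(deal)}  # stage 4
```

Swapping any stage (a different CRM, a different LLM) means replacing one callable, which is the same property the n8n workflow gives you at the node level.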

Evaluate build vs. buy costs

Self-hosting open-source STT models is the most common alternative teams consider before evaluating managed APIs. The total cost of ownership is consistently higher than the headline "free" label suggests.

| Component | Managed stack (Gladia + Claude + n8n) | Self-hosted (open-source STT + open-source LLM) |
| --- | --- | --- |
| STT | $0.20-$0.61/hr (all features included) | GPU EC2 instance (approximately $3,600/month for three instances) + DevOps |
| Diarization | Included in base rate | Not included in self-hosted open-source STT |
| LLM | Claude API (pay per token) | Self-hosted Llama: server + MLOps engineer overhead |
| Orchestration | n8n Cloud from $20/month | Infrastructure + maintenance |
| WER on messy audio | On average 29% lower than alternatives | Varies by dataset and audio conditions |

We compared this trade-off in detail in our self-hosted vs. managed STT analysis. Beyond cost, self-hosted open-source STT models have documented limitations including lack of native diarization support and known hallucination issues on certain audio segments. These issues can corrupt your CRM data silently.

Fast PoC to production readiness

The managed stack compresses evaluation time significantly. You can submit a test recording to the Gladia async API, inspect the diarized JSON output, build the Claude extraction prompt against it, and wire up the n8n workflow in a single afternoon.

Initial configuration: connect your pipeline

Before writing any workflow logic, you need four credentials: a Gladia API key, an Anthropic API key, a CRM API token, and an n8n instance. None of these require a sales call to access.

API keys for pipeline automation

  • Gladia: Sign up at the Gladia platform (app.gladia.io) and generate a key from the dashboard. You can access the Gladia API reference immediately, no sales demo required.
  • Anthropic: Log in at the Anthropic console (console.anthropic.com), navigate to API Keys in the sidebar, and create a key.
  • HubSpot: In your HubSpot account, go to Settings > Integrations > Private Apps. Create a private app and configure deal write permissions.
  • Attio: Generate an access token from Workspace settings > Developers tab and set the record write scope.

How to install n8n for automation

Two deployment paths work well here:

  1. n8n Cloud: Sign up at app.n8n.cloud for a managed instance. This is the fastest path to a working PoC.
  2. Self-hosted via Docker: Run docker run -it --rm --name n8n -p 5678:5678 n8nio/n8n to start a local instance. Self-hosting gives you full control over data routing, which matters for regulated call audio.

CRM for deal data: Attio vs. HubSpot

Both CRMs handle deal records, but their data models differ in ways that affect your n8n field mapping logic.

| Dimension | Attio | HubSpot |
| --- | --- | --- |
| Object model | Flexible custom objects | Standardized objects model |
| API auth | Bearer token | Private app token |
| Record creation | POST /v2/objects/{object}/records | Native n8n node (no raw HTTP needed) |
| Common fit | Custom deal attributes or flexible schemas | Existing HubSpot deployments |

The HubSpot native n8n node supports creating, updating, and searching deals without raw HTTP requests. Attio typically requires HTTP Request nodes with the Attio record creation endpoint, but its flexible object model makes it easier to map domain-specific fields like deal_risk_indicator without fighting a predefined schema.

Step 1: Automate sales call transcription with Gladia

Once a call recording is available, the first pipeline stage submits it to Gladia's async endpoint and polls for the completed transcript. You win or lose accuracy at this stage, and everything Claude extracts in step 2 depends on what Gladia returns here.

Configure the Gladia API endpoint

Submit a POST request to the Gladia pre-recorded endpoint with the following headers and body:

POST https://api.gladia.io/v2/pre-recorded
Headers:
  Content-Type: application/json
  x-gladia-key: YOUR_GLADIA_API_KEY

Body:
{
  "audio_url": "https://your-storage.com/call-recording.m4a",
  "diarization": true,
  "diarization_config": {
    "min_speakers": 1,
    "max_speakers": 5
  }
}

The endpoint returns a result_url that you poll with a GET request until the status field returns "done". In n8n, set up two HTTP Request nodes: one to initiate the job and one inside a Wait + HTTP Request loop to poll for completion. For a one-hour call, the transcript returns in under 60 seconds of wall-clock time from initial submission.

Accepted formats include common audio files such as WAV, M4A, FLAC, and AAC. The audio intelligence docs cover all available parameters including custom vocabulary, named entity recognition, and optional PII redaction (you must explicitly enable this feature; it does not activate by default).
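
Outside n8n, the submit-and-poll pattern looks like this in Python. The endpoint, headers, and body mirror the request above; the post and get callables are injected so the sketch runs against fakes rather than the live API:

```python
import time

GLADIA_URL = "https://api.gladia.io/v2/pre-recorded"

def submit_and_poll(audio_url, api_key, post, get,
                    interval_s=2.0, timeout_s=300.0, sleep=time.sleep):
    """Submit an async transcription job and poll until it finishes.

    post(url, headers, json) and get(url, headers) are injected callables
    returning parsed JSON, so the control flow stays testable offline.
    """
    headers = {"Content-Type": "application/json", "x-gladia-key": api_key}
    body = {
        "audio_url": audio_url,
        "diarization": True,
        "diarization_config": {"min_speakers": 1, "max_speakers": 5},
    }
    job = post(GLADIA_URL, headers=headers, json=body)
    result_url = job["result_url"]
    waited = 0.0
    while waited < timeout_s:
        result = get(result_url, headers=headers)
        if result["status"] == "done":
            return result
        if result["status"] == "error":
            raise RuntimeError(f"transcription failed: {result}")
        sleep(interval_s)
        waited += interval_s
    raise TimeoutError(f"no result after {timeout_s}s for {audio_url}")
```

The same interval/timeout parameters map onto the Wait node settings in the n8n version of the loop.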

Segment speakers in call transcripts

Diarization separates a usable sales call transcript from an unusable one. Without speaker attribution, Claude can't distinguish whether the budget figure came from the rep or the prospect, and that distinction determines how you populate the deal record.

Gladia powers diarization with pyannoteAI's Precision-2 technology in async workflows. When you enable it, every utterance in the JSON output includes a speaker field with an index (0, 1, 2...) assigned by order of first appearance. A typical diarized utterance looks like this:

{
  "text": "We have a budget of around fifty thousand for Q3.",
  "start": 142.4,
  "end": 147.1,
  "confidence": 0.94,
  "speaker": 1,
  "language": "en"
}

The speaker diarization docs cover how to configure number_of_speakers when you know the expected participant count, which improves attribution accuracy on calls with multiple stakeholders. Gladia's async benchmark shows on average 3x lower DER than alternatives on conversational speech. That reduction in diarization errors translates directly to fewer misattributed deal fields in your CRM.
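
As a small illustration of why the speaker index matters downstream, here is a sketch that groups diarized utterances by speaker before extraction. The utterance fields match the JSON example above; the helper itself is illustrative:

```python
from collections import defaultdict

def text_by_speaker(utterances):
    """Concatenate utterance text per speaker index, preserving order,
    so each speaker's statements can be inspected or attributed."""
    grouped = defaultdict(list)
    for u in utterances:
        grouped[u["speaker"]].append(u["text"])
    return {speaker: " ".join(parts) for speaker, parts in grouped.items()}
```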

Real-time vs. batch for CRM data

Use async (batch) transcription for post-call CRM updates. The reasons are technical, not preferential.

Async transcription processes the full audio file before generating any output, so Gladia applies full-utterance context to language detection, accent handling, and speaker attribution in a single pass. You've already recorded the call before the workflow triggers, so no latency constraint justifies trading accuracy for speed. A 10-minute sales call processes in under a minute, well within any reasonable post-call pipeline SLA.

Managing Gladia API errors

Two error categories surface most frequently in production:

  • Unsupported format or corrupted file: Returns a 400 from the initiation call. In n8n, add an Error Trigger node after the HTTP Request and route to a Slack notification that includes the audio_url and error response body.
  • Polling timeout: If a job doesn't reach status: "done" within your polling window, check the Gladia status page for incidents before assuming a bug. Increase your polling interval for long files rather than hammering the result endpoint.

Step 2: Extract deal data using Claude AI

With a diarized transcript in JSON format, you can write a straightforward extraction prompt. Claude reads the utterances, attributes statements to the correct speaker, and returns a flat JSON object your n8n workflow maps directly to CRM fields.

Optimize your extraction prompt

Use this system prompt structure to get clean JSON output reliably:

System: You are a sales analyst. Extract structured deal information from the following 
sales call transcript. The transcript contains speaker-attributed utterances in JSON format.

Return a JSON object with exactly these fields:
- company_name: string or null
- contact_person: string or null
- budget_amount: number (USD) or null
- timeline: string or null
- next_steps: string or null
- pain_points: array of strings or null
- deal_summary: string (max 150 words)

If a field is not mentioned in the transcript, return null for that field.
Return only valid JSON with no additional text, markdown, or explanation.

Transcript:
{{TRANSCRIPT_JSON}}

Pass the full utterances array from Gladia's JSON response as TRANSCRIPT_JSON. Including the speaker index in each utterance lets Claude correctly attribute budget mentions to the prospect (speaker 1) rather than the rep (speaker 0).
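
A minimal helper for that injection step might look like this, assuming the template string is the system prompt shown above with the literal {{TRANSCRIPT_JSON}} placeholder:

```python
import json

def build_extraction_prompt(template, utterances):
    """Inject Gladia's utterances array into the extraction prompt.

    template must contain the literal placeholder {{TRANSCRIPT_JSON}};
    utterances is the array from Gladia's JSON response.
    """
    return template.replace(
        "{{TRANSCRIPT_JSON}}",
        json.dumps(utterances, ensure_ascii=False),
    )
```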

Define CRM deal data schema

Each extracted field maps to a specific CRM purpose:

| Field | CRM purpose | Why it matters |
| --- | --- | --- |
| company_name | Deal name / company association | Primary deduplication identifier |
| contact_person | Contact association | Links deal to person record |
| budget_amount | Deal value estimate | Forecast accuracy |
| timeline | Expected close date range | Pipeline stage assignment |
| next_steps | Follow-up task creation | Keeps rep accountable |
| pain_points | Deal notes / tags | Sales coaching and qualification |
| deal_summary | Deal description | Context for non-participants |

The pain_points array connects directly to sales coaching workflows. When multiple calls surface the same pain point string, you can aggregate across deals to identify patterns, which connects this pipeline to the sales intelligence use cases covered in building outbound sales systems with AI.

Automate CRM field population

In n8n, use an HTTP Request node to call the Anthropic Messages API:

POST https://api.anthropic.com/v1/messages
Headers:
  x-api-key: YOUR_ANTHROPIC_API_KEY
  anthropic-version: 2023-06-01
  content-type: application/json

Body:
{
  "model": "claude-opus-4-5-20251101",
  "max_tokens": 1024,
  "messages": [
    {
      "role": "user",
      "content": "{{your_prompt_with_transcript}}"
    }
  ]
}

Parse response.content[0].text to get the JSON string, then use n8n's JSON Parse node to convert it to a structured object. From that point, every field is accessible as {{ $json.company_name }}, {{ $json.budget_amount }}, and so on.
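
The equivalent parsing step in Python, with a guard for the case where the model ignores the JSON-only instruction, could be sketched as:

```python
import json

def parse_claude_deal(response_body):
    """Pull the JSON deal object out of an Anthropic Messages API response.

    Expects the assistant's text to be a bare JSON object, as the
    extraction prompt requests; raises ValueError on anything else.
    """
    text = response_body["content"][0]["text"]
    try:
        deal = json.loads(text)
    except json.JSONDecodeError as exc:
        raise ValueError(f"Claude did not return valid JSON: {exc}") from exc
    if not isinstance(deal, dict):
        raise ValueError("expected a JSON object at the top level")
    return deal
```

Raising on malformed output (rather than writing whatever came back) is what lets the error route feed a review queue instead of the CRM.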

Validate extracted deal data

Before writing to the CRM, add a validation IF node that checks for minimum required fields:

Condition: {{ $json.company_name === null || $json.next_steps === null }}
True path → Slack alert with transcript ID and missing fields
False path → Continue to CRM write

This step prevents hallucinated nulls from creating empty deal records. Claude will occasionally return null for a field mentioned ambiguously. Route those cases to a human review queue rather than writing them to the CRM, and your pipeline's data quality stays high.
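
The IF-node condition reduces to a few lines of Python. The required-field tuple mirrors the condition above and is easy to extend:

```python
REQUIRED_FIELDS = ("company_name", "next_steps")  # mirrors the IF node above

def route_deal(deal, required=REQUIRED_FIELDS):
    """Return ("review", missing_fields) when any required field is null,
    otherwise ("crm", []) to continue to the CRM write."""
    missing = [f for f in required if deal.get(f) is None]
    return ("review", missing) if missing else ("crm", missing)
```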

Step 3: Link all pipeline services with n8n

n8n acts as the control plane for the entire pipeline. It holds the sequencing logic, manages retries, routes validation failures, and triggers the CRM write. Building this in n8n rather than custom microservices means your team doesn't maintain a separate deployment for orchestration.

Set up your workflow's starting event

Two trigger patterns work well here:

  1. Webhook trigger: Configure your call recording platform to POST to an n8n webhook URL when a recording completes. The payload typically includes a recording URL, rep name, CRM deal ID, and call duration. n8n's webhook node generates a unique URL you paste directly into your recording platform's notification settings.
  2. Storage trigger: Poll an S3 bucket or Google Cloud Storage prefix on a schedule. Set the interval based on your call volume and processing requirements.

Automate call data extraction

Pass the transcript JSON from the Gladia poll node into the Claude HTTP Request node using an n8n expression. Stringify it before injecting into the Claude prompt body:

{{ JSON.stringify($('Poll Gladia Result').item.json.result.transcription.utterances) }}

This ensures valid JSON formatting without any custom code. The n8n expressions engine handles the serialization automatically.

Add conditional logic and routing

n8n's IF node enables nuanced deal routing beyond the null-field validation in step 2:

  • Budget above threshold: If {{ $json.budget_amount > 50000 }}, tag the deal and notify sales leadership in Slack.
  • Specific pain point detected: If pain_points contains "procurement delay", route to a dedicated deal stage.
  • Non-English call: If the diarized transcript contains utterances with language !== "en", add a language tag to the CRM record for rep assignment.

This conditional logic is where Gladia's per-speaker attribution pays off. You can write conditions like "if the prospect mentioned budget and the rep mentioned a timeline, mark deal as qualified" by checking speaker index against field presence.
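
The three routing conditions can be expressed together as one pure function. The threshold and tag names here are illustrative stand-ins for your own routing vocabulary:

```python
def routing_tags(deal, utterances, budget_threshold=50000):
    """Derive routing tags from the extracted deal and diarized transcript,
    mirroring the three IF-node conditions above."""
    tags = []
    budget = deal.get("budget_amount")
    if budget is not None and budget > budget_threshold:
        tags.append("notify_leadership")
    if "procurement delay" in (deal.get("pain_points") or []):
        tags.append("procurement_stage")
    # Any detected language besides English marks the call multilingual.
    if {u.get("language") for u in utterances} - {"en", None}:
        tags.append("multilingual")
    return tags
```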

Detect pipeline issues instantly

Add an Error Trigger node at the workflow level. When any node fails, the trigger fires with the node name, error message, and input data. Route this to a Slack channel with a message template that includes the node that failed, the call recording URL, the error message, and a timestamp. This gives your team a complete audit trail without building a custom logging system.

Step 4: Automate CRM updates with extracted deal data

With the validated Claude JSON in hand, the final step writes the structured data to your CRM and handles both new deals and updates to existing ones.

Map extracted fields to CRM objects

The mapping logic differs slightly between the two CRMs, but the n8n expression syntax is identical.

For Attio, send a POST to the Attio deals endpoint (https://api.attio.com/v2/objects/deals/records) with a Bearer token header:

{
  "data": {
    "values": {
      "name": "{{ $('Claude Extract').item.json.company_name }}",
      "budget": {{ $('Claude Extract').item.json.budget_amount }},
      "timeline": "{{ $('Claude Extract').item.json.timeline }}",
      "next_steps": "{{ $('Claude Extract').item.json.next_steps }}",
      "summary": "{{ $('Claude Extract').item.json.deal_summary }}"
    }
  }
}

Custom attributes like pain_points map to a multi-select attribute you define in Attio's schema editor. The Attio record creation endpoint accepts arbitrary values keys as long as the attribute slug exists in your workspace.

For HubSpot, use the native n8n node with resource "Deal" and operation "Create" or "Update", mapping fields as {{ $('Claude Extract').item.json.company_name }}. For the pain_points array, concatenate with {{ $('Claude Extract').item.json.pain_points.join(', ') }} and write to a custom multi-line text property. The n8n HubSpot node docs cover all supported Deal object properties.

Validating incoming deal uniqueness

Before creating a new deal, search for an existing company by domain. In n8n, add a HubSpot Search node before the Create node with a filter on the company domain property. If the search returns results, capture the company ID. If not, create the company first and then associate the deal. This prevents duplicate company records from accumulating when the same prospect appears on multiple calls.

Implementing upsert logic for deals

Route the uniqueness check through an IF node using {{ $('Search Companies').item.json.data.length > 0 }} as a clean boolean condition:

  • True (company exists): Use the "Update" operation on the existing deal record, preserving the original creation date and ownership.
  • False (new company): Use the "Create" operation, assign to the rep from the call metadata, and set the initial deal stage based on the timeline value Claude extracted.
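
The branch logic reduces to a small decision function. search_results stands in for the data array returned by the company search, and rep and initial_stage stand in for call metadata and your stage-mapping rules:

```python
def plan_crm_write(search_results, deal, rep, initial_stage="discovery"):
    """Decide between update and create, mirroring the IF-node branch above."""
    if search_results:  # company exists -> update, preserve ownership
        return {"operation": "update",
                "company_id": search_results[0]["id"],
                "fields": deal}
    return {"operation": "create",  # new company -> assign rep and stage
            "owner": rep,
            "stage": initial_stage,
            "fields": deal}
```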

Deploying your automated deal data flow

Moving from a working PoC to production requires a few additional configurations beyond the core pipeline logic.

Data residency options for calls

Sales call audio often contains legally sensitive information: pricing, contract terms, competitive intelligence. Your data handling decisions matter for enterprise deployments.

We are SOC 2 Type II, ISO 27001, HIPAA, and GDPR certified, all documented at the Gladia compliance hub. You can configure data residency to EU or US regions. On the Growth plan, we never use customer audio to retrain our models, and you don't need to opt out. On the Starter plan, audio data may be used for model training, so Growth is the appropriate tier for any pipeline processing actual customer calls.

For organizations with strict data perimeter requirements, Gladia supports Enterprise data residency configurations. This requires an Enterprise contract. The security documentation covers encryption in transit (TLS) and DPA options available for EU-headquartered customers.

Optimizing call transcription throughput

Gladia scales without pre-provisioning. Higher concurrency is available through capacity requests. Per-hour billing makes cost forecasting straightforward: multiply your monthly call hours by the Growth plan rate.

Managing costs for high call volumes

Gladia charges per hour of audio duration. Audio intelligence features including diarization, translation, sentiment analysis, named entity recognition, and summarization are included in the base rate on Starter and Growth plans with no add-on fees.

| Plan | Async rate | Data training | Best for |
| --- | --- | --- | --- |
| Starter | $0.61/hr (10 free hours/month) | May be used for training | Testing only |
| Growth | From $0.20/hr | Never used | Production pipelines with real call data |
| Enterprise | Custom | Never used | High volume, custom models, annual commitment |

For a pipeline processing 1,000 hours of calls per month on Growth, the minimum cost is $200/month at $0.20/hr. The pricing breakdown lists current tier details.
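
That per-hour billing makes the spend model a one-liner (the default rate here is the Growth plan floor cited above; your actual rate depends on plan and volume):

```python
def monthly_stt_cost(hours, rate_per_hour=0.20):
    """Estimate monthly STT spend at a flat per-hour rate.

    rate_per_hour defaults to the Growth plan floor quoted in this
    article; it is an input, not a guaranteed price.
    """
    return round(hours * rate_per_hour, 2)
```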

Optimizing your automated deal data flow

Once the pipeline is live, these configurations improve accuracy and resilience for edge cases that appear at scale.

Evaluating WER for accented call audio

A higher WER on a sales call means more transcription errors, and if even one of those errors lands on a company name or a dollar figure, the CRM record is corrupted.

Our Solaria-1 model achieves on average 29% lower WER than alternatives on conversational speech, benchmarked using open methodology. For sales calls specifically, the model's handling of accented speech is a direct pipeline reliability factor.

To validate accuracy against your own call audio, run 20-30 representative recordings through the API before going live and manually review transcripts for entity accuracy on company names, numbers, and product names. Custom vocabulary (available in the API request body) helps when your sales calls use proprietary product names a general-purpose model might misspell.
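
For those spot checks, a minimal word-level WER calculator is enough to score a transcript against a hand-corrected reference. This is a standard edit-distance sketch, not the benchmark methodology cited above:

```python
def word_error_rate(reference, hypothesis):
    """Word-level WER via edit distance:
    (substitutions + deletions + insertions) / reference word count."""
    ref, hyp = reference.split(), hypothesis.split()
    # Standard dynamic-programming Levenshtein distance over words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution / match
        prev = curr
    return prev[-1] / max(len(ref), 1)
```

Run it over your 20-30 review transcripts and pay special attention to errors on entities, since a single substituted company name costs more than several substituted filler words.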

Automating multilingual call transcription

For global sales teams, the pipeline handles multilingual calls without configuration changes. Our code-switching support covers mid-conversation language changes across 100+ languages. When a bilingual prospect switches from English to Spanish mid-call, the transcript maintains continuity rather than producing garbled output or breaking the session.

To enable code-switching explicitly, set "enable_code_switching": true in your transcription request body. Each utterance in the output includes language detection information, which you pass to Claude as context for extraction.
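
A small helper can collect those per-utterance language labels into the context note for Claude. The function is illustrative; it assumes each utterance carries the language field shown in the diarized JSON example earlier:

```python
def language_summary(utterances):
    """Collect detected languages in order of first appearance, so a
    multilingual note can be prepended to the extraction prompt."""
    seen = []
    for u in utterances:
        lang = u.get("language")
        if lang and lang not in seen:
            seen.append(lang)
    return seen
```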

Start with 10 free hours and validate the pipeline with your own call recordings. To validate accuracy claims against your own call audio before committing to a plan, review the async benchmark methodology for the dataset composition and test conditions behind the WER and DER figures cited in this tutorial.

FAQs

What audio file formats does the Gladia async API accept?

The async endpoint accepts WAV, MP3, M4A, FLAC, and AAC files, as well as audio URLs, with a maximum file size of 1,000MB and maximum duration of 135 minutes per job.

Does Gladia support speaker diarization for real-time transcription in this pipeline?

Diarization is available in async (batch) workflows. For post-call CRM pipelines, this doesn't limit you since you've already recorded the call before the workflow triggers, and async processing produces higher diarization accuracy than real-time systems.

Will Gladia use my sales call audio to retrain its models?

On Growth and Enterprise plans, we never use customer audio for model training and you don't need to opt out. On the Starter plan, audio data may be used for training. For any pipeline processing real customer conversations, Growth is the appropriate tier.

What happens if Claude returns null for a required CRM field?

Add an IF node after the Claude extraction step that checks for null values on required fields (at minimum, company_name). Route records with missing required fields to a Slack alert with the recording URL rather than writing an incomplete record to the CRM. This keeps data clean while giving your team visibility into edge cases.

How does this pipeline handle calls where the rep and prospect switch languages mid-conversation?

Enable "enable_code_switching": true in the Gladia API request body. The diarized transcript labels each utterance with the detected language, and the text is transcribed in the language spoken. Pass this to Claude with a note in the system prompt that utterances may contain multiple languages, and Claude extracts the structured fields regardless of which language the speaker used for each field.

What's the all-in cost for 1,000 hours of call audio per month with diarization enabled?

On our Growth plan at $0.20/hr, 1,000 hours costs $200/month. Audio intelligence features are available on paid plans. Add Claude API token costs (billed per token consumed, not per call) and n8n Cloud fees, and the full Gladia pricing page has the current tier details to model your total spend accurately.

How long does it take to transcribe a 30-minute sales call with Gladia?

A 30-minute call typically produces a complete transcript in well under a minute from the initial API submission.

Key terms glossary

Word error rate (WER): The percentage of words in a transcript that differ from the ground truth, typically calculated as (substitutions + deletions + insertions) / total reference words.

Diarization error rate (DER): A metric for speaker attribution accuracy in multi-speaker audio, measuring errors in speaker assignment and segment detection.

Speaker diarization: The process of segmenting an audio recording by speaker identity, assigning each utterance to a specific speaker index.

Code-switching: Mid-conversation language changes where a speaker transitions from one language to another within a single turn or across adjacent turns.

Async (batch) transcription: A transcription workflow that processes a complete audio file after recording ends, enabling full-context accuracy, diarization, and multilingual handling before the system generates output.

Upsert: A database write operation that updates an existing record if found by a unique identifier, or creates a new record if no match exists. This pattern prevents duplicate CRM entries when the same company appears across multiple calls.

Data residency: The requirement that you store and process data within a specific geographic region, enforced through cloud region selection or on-premises deployment to satisfy regulatory or contractual obligations.

PII redaction: An optional feature that replaces personally identifiable information (names, phone numbers, email addresses) with placeholder tokens in the transcript output. You must explicitly enable this in the API request body. It does not activate by default.
