4 powerful use cases with Gladia's pre-recorded transcription API

Published on May 7, 2026
By Emma Genthon

TL;DR: Pre-recorded transcription is often thought of as just "uploading audio and getting a transcript back." But with Gladia's Audio Intelligence layer sitting on top of the transcription pipeline, a single API call can return sentiment, summaries, anonymized text, translations, and more.

We'll build four things in this article, each with a single transcribe() call: a per-speaker emotional breakdown of a customer call, a bullet-point meeting summary, a GDPR-redacted transcript, and a multilingual YouTube video translated into English. Each comes with working code, sample output, and links to the docs.

Before you start

Install the Python SDK and grab your free API key:

pip install gladiaio-sdk

Get your API key in 30 seconds. All four examples below use the same entry point:

from gladiaio_sdk import GladiaClient

gladia_client = GladiaClient(api_key="GLADIA_API_KEY").prerecorded()

1. Sentiment analysis for call centers

Every word in a customer call carries an emotional signal. Gladia's sentiment analysis detects the sentiment (positive, negative, neutral) and emotion (anger, joy, frustration, etc.) of each sentence in the transcript. Enable it together with speaker diarization and each sentence's sentiment and emotion are attributed to the speaker who said it, giving you a per-speaker emotional breakdown of the whole conversation. You can pinpoint exactly where a call turned sour, or where an agent handled a difficult moment well.

This is directly actionable for QA teams, coaching workflows, and CSAT prediction.

Code sample

from gladiaio_sdk import GladiaClient
import ast

gladia_client = GladiaClient(api_key="GLADIA_API_KEY").prerecorded()

transcription = gladia_client.transcribe(
    audio_url="https://www.youtube.com/watch?v=cVQxknk53LA",
    options={
        "language_config": {
            "languages": ["en"],
        },
        "sentiment_analysis": True,
        "diarization": True,
        "diarization_config": {
            "number_of_speakers": 2,
        },
    },
)

sentiments = transcription.result.sentiment_analysis.results

# Depending on the SDK version, results may come back as a serialized string
if isinstance(sentiments, str):
    sentiments = ast.literal_eval(sentiments)

for r in sentiments:
    print(f"Speaker {r['speaker']}: [{r['sentiment']}] {r['emotion']}")
    print(f'  "{r["text"]}"')
    print(f"  {r['start']:.2f}s - {r['end']:.2f}s")

Sample output

Speaker 0: [negative] frustration
  "I've been waiting for three weeks and nobody has called me back."
  2.40s - 6.10s

Speaker 1: [positive] empathy
  "I completely understand, and I sincerely apologize for the delay."
  6.50s - 9.80s
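Once you have the per-sentence results, a quick aggregation turns them into per-speaker statistics for QA dashboards. A minimal sketch in plain Python, assuming results shaped like the sample above (the `records` list here is hypothetical mock data, not API output):

```python
from collections import Counter, defaultdict

# Hypothetical records mirroring the per-sentence results shown above
records = [
    {"speaker": 0, "sentiment": "negative", "emotion": "frustration"},
    {"speaker": 1, "sentiment": "positive", "emotion": "empathy"},
    {"speaker": 0, "sentiment": "neutral", "emotion": "neutral"},
    {"speaker": 1, "sentiment": "positive", "emotion": "joy"},
]

def per_speaker_breakdown(results):
    """Count sentiment labels per speaker to spot where a call trended negative."""
    breakdown = defaultdict(Counter)
    for r in results:
        breakdown[r["speaker"]][r["sentiment"]] += 1
    return dict(breakdown)

print(per_speaker_breakdown(records))
# {0: Counter({'negative': 1, 'neutral': 1}), 1: Counter({'positive': 2})}
```

The same loop extends naturally to emotions or to time ranges if you keep the `start`/`end` fields.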

2. Automated meeting summaries

Long meeting recordings are valuable but impractical to review in full. Gladia's summarization feature generates a structured summary of the transcript, either as a paragraph or as a bullet-point list of key topics and decisions. One API call replaces the entire workflow of transcribing, reading, and summarizing manually.

This is ideal for meeting assistants, async collaboration tools, and post-call CRM note generation.

Code sample

from gladiaio_sdk import GladiaClient

gladia_client = GladiaClient(api_key="GLADIA_API_KEY").prerecorded()

transcription = gladia_client.transcribe(
    audio_url="https://www.youtube.com/watch?v=3WrZMzqpFTc",
    options={
        "summarization": True,
        "summarization_config": {
            "type": "bullet_points"  
        },
    },
)

print(transcription.result.summarization)

Sample output

- Customer reported a 3-week delay in receiving a callback after submitting a support ticket.
- Agent acknowledged the error and escalated the case to the logistics team.
- Resolution timeline set at 48 hours; agent to follow up by email.
- Customer expressed conditional satisfaction pending resolution.
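For downstream use such as CRM notes or task trackers, the bullet-point string can be split into individual items. A minimal sketch, assuming the summary comes back as plain text shaped like the sample above:

```python
# Hypothetical summary string shaped like the sample output above
summary = """- Customer reported a 3-week delay in receiving a callback after submitting a support ticket.
- Agent acknowledged the error and escalated the case to the logistics team.
- Resolution timeline set at 48 hours; agent to follow up by email.
- Customer expressed conditional satisfaction pending resolution."""

def summary_to_items(text):
    """Split a bullet-point summary into a list of clean item strings."""
    items = []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("- "):
            items.append(line[2:])
    return items

items = summary_to_items(summary)
print(len(items))  # 4
```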


3. Anonymized call transcripts (PII redaction)

Regulatory compliance (GDPR, HIPAA, PCI-DSS) often requires that customer data be stripped from transcripts before storage, sharing, or analysis. Gladia's PII redaction feature detects and removes (or masks) personally identifiable information directly in the transcript, without requiring a separate processing step.

You can target specific entity types (names, addresses, credit card numbers, phone numbers, etc.) or use the "GDPR" preset to cover the most common regulated categories. The output is a clean, safe transcript ready for storage or downstream processing.

Code sample

from gladiaio_sdk import GladiaClient

gladia_client = GladiaClient(api_key="GLADIA_API_KEY").prerecorded()

transcription = gladia_client.transcribe(
    audio_url="https://www.youtube.com/watch?v=cVQxknk53LA",
    options={
        "pii_redaction": True,
        "pii_redaction_config": {
            "entity_types": ["GDPR"],
            "processed_text_type": "MARKER", 
        },
    },
)

print(transcription.result.transcription.full_transcript)

Sample output

"Hi, my name is [PERSON] and I'm calling about an order I placed on [DATE].
My email is [EMAIL_ADDRESS] and my phone number is [PHONE_NUMBER]."
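Before storing a redacted transcript, it can be useful to check which marker types are present. A minimal sketch using a regex over marker-style output shaped like the sample above (the `redacted` string is hypothetical mock data):

```python
import re

# Hypothetical redacted transcript shaped like the sample output above
redacted = (
    "Hi, my name is [PERSON] and I'm calling about an order I placed on [DATE]. "
    "My email is [EMAIL_ADDRESS] and my phone number is [PHONE_NUMBER]."
)

def redaction_markers(text):
    """Return the set of PII marker types found in a redacted transcript."""
    return set(re.findall(r"\[([A-Z_]+)\]", text))

print(sorted(redaction_markers(redacted)))
# ['DATE', 'EMAIL_ADDRESS', 'PERSON', 'PHONE_NUMBER']
```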

4. Multilingual YouTube video translation

Not all content lives in one language. This example transcribes a YouTube video that switches between several languages (English, Korean, Chinese, Mongolian, Russian, Japanese), then translates the full transcript into English. Two features make this work together:

  • Code switching: Gladia detects when the speaker changes language mid-conversation and handles each segment correctly, rather than forcing a single language onto the whole audio.
  • Translation: the transcript is then translated into any target language in the same API call.

The example also uses custom vocabulary to ensure domain-specific terms (food names, proper nouns) are correctly recognized across languages.

Code sample

from gladiaio_sdk import GladiaClient

gladia_client = GladiaClient(api_key="GLADIA_API_KEY").prerecorded()

transcription = gladia_client.transcribe(
    audio_url="https://www.youtube.com/watch?v=hbhTVIa9arE",
    options={
        "language_config": {
            "languages": ["en", "ko", "zh", "mn", "ru", "ja"],
            "code_switching": True,
        },
        "custom_vocabulary_config": {
            "vocabulary": [
                "aaruul",
                {"value": "mutton"},
                {
                    "value": "Misha",
                    "pronunciations": ["micha, misha, mi cha, mi sha"],
                    "intensity": 0.4,
                    "language": "ko",
                },
            ],
            "default_intensity": 0.6,
        },
        "translation": True,
        "translation_config": {
            "target_languages": ["en"],
        },
    },
)
print("Transcription: ", transcription.result.transcription.full_transcript)
print("--------------------------------")
print("Translation: ", transcription.result.translation.results[0].full_transcript)

Sample output

Transcription:  안녕하세요! 오늘은 몽골 전통 음식을 소개합니다... aaruul は乾燥させた...
--------------------------------
Translation:  Hello! Today I'm introducing traditional Mongolian food... Aaruul is a dried...


Summary

Use Cases & Key Features

Use case              | Key features used                                | Best for
----------------------|--------------------------------------------------|------------------------------
Call center sentiment | sentiment_analysis + diarization                 | QA, coaching, CSAT prediction
Meeting summary       | summarization                                    | Meeting assistants, CRM notes
Anonymized call       | pii_redaction                                    | GDPR / HIPAA compliance
YouTube translation   | translation + code_switching + custom_vocabulary | Media, multilingual content

All four examples use a single synchronous transcribe() call — no pipelines, no intermediate steps. Gladia handles the audio intelligence layer alongside the transcription so you get structured, enriched output in one round trip.

FAQs

Does Gladia's pre-recorded API require separate calls for transcription and Audio Intelligence features? 

No. Gladia's pre-recorded transcription API runs transcription and Audio Intelligence features in a single synchronous transcribe() call. Sentiment analysis, summarization, PII redaction, translation, code switching, and custom vocabulary are enabled as options on the same request, and Gladia returns the enriched output in one round trip with no pipelines or intermediate steps.

How does Gladia produce a per-speaker sentiment and emotion breakdown of a conversation? 

Gladia produces a per-speaker emotional breakdown when sentiment analysis and speaker diarization are enabled together on the same transcribe() call. With both features on, Gladia returns the sentiment (positive, negative, neutral) and emotion (anger, joy, frustration, etc.) for each sentence in the transcript, attributed to the speaker who said it.

What summary formats does Gladia's summarization feature support? 

Gladia's summarization feature supports two formats: a paragraph summary, or a bullet-point list of key topics and decisions. The format is selected via the summarization_config.type option, where "bullet_points" returns a structured list.

How can Gladia anonymize transcripts for GDPR or HIPAA compliance? 

Gladia's PII redaction feature detects and removes or masks personally identifiable information directly in the transcript, with no separate processing step. The algorithm can target specific entity types (names, addresses, credit card numbers, phone numbers, and others) or use the built-in "GDPR" preset to cover the most common regulated categories. The output is a clean transcript safe for storage, sharing, or downstream analysis.

Does Gladia handle audio that switches between multiple languages? 

Yes. Gladia's code switching feature detects when a speaker changes language mid-conversation and transcribes each segment in its correct language, rather than forcing a single language onto the whole audio. It is enabled by setting code_switching: true in language_config and listing the languages present in the recording.

Can Gladia translate a transcript in the same API call? 

Yes. Gladia's pre-recorded API can transcribe and translate audio in a single transcribe() call by enabling the translation option and setting target languages in translation_config.target_languages. Translation also works alongside code switching, so multilingual audio can be transcribed in its original languages and translated into a single target language in one request.

How does Gladia improve recognition of domain-specific terms or proper nouns? 

Gladia supports custom vocabulary through the custom_vocabulary_config.vocabulary option. Terms can be passed as plain strings, or as objects with pronunciations, intensity, and language fields for finer control over how each term is recognized.
