Use case

Virtual Meetings

Every online meeting is a source of knowledge

With Gladia's audio and video transcription API, your virtual meetings become efficient, productive, and secure. Save time, improve customer service, and gain valuable insights from each and every discussion.


Top features

Speech analytics

Analyze speech patterns, identify keywords and phrases such as customer and product names, and detect emotions to gain valuable insights into customer behavior and sentiment.


Transcribe any virtual meeting, conference or webinar asynchronously or in real time. An essential prerequisite for any virtual platform's user experience, speech-to-text can unlock a series of new features for your platform, including note-taking, semantic search and user analytics.


Translate your international meetings in real time to and from 99 languages. A must-have feature for the global enterprise, allowing teams to communicate seamlessly in their preferred language.
Code-switching supported.


Get snapshot summaries of key talking points, decisions made, and action items. Output length can be customized with a prompt, from 100 to 1,500 words.

Audio Indexing & NER

As audio data becomes transcribed and labeled, you can easily search and review specific parts of the meeting. Essential for teams that count on retrieving information from a large volume of files quickly.
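As a rough illustration of what searchable, labeled transcripts make possible, the sketch below looks up a keyword in a list of timestamped transcript segments. The `Segment` structure, its field names, and the `find_mentions` helper are assumptions made for this example, not Gladia's actual response format.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    # Hypothetical transcript segment; field names are illustrative only.
    start: float   # seconds from the beginning of the recording
    end: float
    speaker: str
    text: str


def find_mentions(segments, keyword):
    """Return the segments whose text contains the keyword, ignoring case."""
    needle = keyword.casefold()
    return [s for s in segments if needle in s.text.casefold()]


def format_timestamp(seconds):
    """Render a position in seconds as MM:SS for display."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes:02d}:{secs:02d}"


meeting = [
    Segment(0.0, 6.2, "speaker_0", "Welcome everyone, let's review the roadmap."),
    Segment(6.2, 14.8, "speaker_1", "Before that, can we talk about pricing?"),
    Segment(75.0, 83.1, "speaker_0", "Sure, the new Pricing page ships next week."),
]

for hit in find_mentions(meeting, "pricing"):
    print(format_timestamp(hit.start), hit.speaker, hit.text)
```

Keeping the start time on every segment is what lets a reviewer jump straight to the relevant moment instead of replaying the whole recording.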

Some stats on performance

boost in sales
saved processing calls
more informed decisions

for your needs


The Gladia API uses automatic speech recognition to convert audio files, video files, or URLs to text. It transcribes one hour of audio in less than 60 seconds.


Based on a proprietary algorithm, speaker diarization automatically partitions an audio recording into segments corresponding to different speakers.
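Once a recording has been partitioned by speaker as described above, simple meeting metrics follow directly. The sketch below totals talk time per speaker. The `(speaker, start, end)` tuple layout and the speaker labels are assumptions for illustration, not the format the API returns.

```python
from collections import defaultdict


def talk_time_by_speaker(segments):
    """Sum the duration in seconds of every segment attributed to each speaker."""
    totals = defaultdict(float)
    for speaker, start, end in segments:
        totals[speaker] += end - start
    return dict(totals)


# Hypothetical speaker-labeled segments: (speaker label, start, end) in seconds.
segments = [
    ("speaker_0", 0.0, 12.5),
    ("speaker_1", 12.5, 20.0),
    ("speaker_0", 20.0, 31.0),
]

for speaker, seconds in talk_time_by_speaker(segments).items():
    print(f"{speaker}: {seconds:.1f}s")
```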

Topic classification

Categorizes content into one of 698 predefined topic categories for easier content indexing.

Sentiment analysis

Determines the sentiment or opinion behind a piece of audio, such as a conversation or dialogue, using natural language processing.

Speech moderation

Automatically identifies and flags hate speech or other inappropriate and offensive verbal content according to predetermined parameters.

Emotion detection

Our emotion recognition system is built upon the latest research and aims to accurately identify and distinguish between 27 human emotions.



Perfect for developers, early-stage startups, and individuals



(10h/month included)


Designed to grow with scaling digital companies



+ $0.00004 / sec for live transcription
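To put the per-second live transcription rate in more familiar terms, the sketch below converts it into a cost for a given duration. It covers only the $0.00004 per second live add-on quoted above; the base plan price is not stated here and is not included. The function name is illustrative.

```python
from decimal import Decimal

# Live transcription add-on rate quoted above, in US dollars per second.
LIVE_RATE_PER_SECOND = Decimal("0.00004")


def live_addon_cost(hours):
    """Cost in dollars of the live add-on alone for the given number of hours."""
    seconds = Decimal(hours) * 3600
    return seconds * LIVE_RATE_PER_SECOND


print(live_addon_cost(1))    # add-on cost for one hour of live audio
print(live_addon_cost(10))   # add-on cost for ten hours of live audio
```

Using `Decimal` rather than floats keeps the small per-second rate exact when it is multiplied across many seconds.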


Custom plan tailored to the modern enterprise

Contact us

We initially attempted to host Whisper AI, which required significant effort to scale. Switching to Gladia's transcription service brought a welcome change.

Robin Lambert, CPO, Livestorm

Read more


How to integrate live transcription API with Twilio to transcribe calls in real time.

Twilio, used by hundreds of thousands of businesses and more than ten million developers worldwide, can now integrate with our live transcription API. The integration makes it easier for users to natively transcribe any phone call in real time while using Twilio. With transcribed text at your disposal, you'll then be able to analyze, archive, and act upon voice data more effectively.


Best speech-to-text APIs in 2023

Speech-to-text (STT), also known as automatic speech or voice recognition, is a type of AI technology that recognizes human speech in audio or video and transcribes it into written output. In the form of an API, it can power a variety of applications, ranging from call bots to voice assistants to AI-powered virtual meeting platforms.


How to build a voice-to-text Discord bot with Gladia real-time transcription API

Discord, the leading communication platform for gamers and communities, is designed for seamless communication with other users, be it through text channels, DMs, 1-1 calls or even collective voice channels.