What is the difference between Solaria-1 and Solaria-3?

Solaria-1 offers maximum language coverage across 100+ languages and excels on clean read-speech and formal institutional audio. Solaria-3 is optimised for real-world audio in five European languages (EN, FR, DE, ES, and IT): it reaches 9.6% WER on real English customer audio and ranks #1 on both Earnings22 (6.4% WER) and Switchboard (33.9% WER).

How can I try Solaria-3 for free?

Redeem the code TRY-SOLARIA-3 in the Gladia app at app.gladia.io for $200 worth of async transcription hours. The code can be redeemed once per account and must be used by June 21st at the latest. Select Solaria-3 as the model; full pricing applies afterwards.

When should I choose Solaria-1 over Solaria-3?

If formal institutional speech or clean read-speech is your primary use case (the kind of audio covered by the VoxPopuli Cleaned AA and Multilingual LibriSpeech benchmarks), Solaria-1 remains the better choice. Solaria-3 is optimised for European real-world audio quality.

Real customer audio, English (Gladia internal production dataset)
Provider	WER (%)	Rank
Solaria-3	9.6	#1
ElevenLabs Scribe v2	9.9	#2
AssemblyAI	10.0	#3
Deepgram Nova-3	10.7	#4
Mistral Voxtral	12.2	#5
Solaria-1	12.9	#6

Earnings22 Cleaned AA (financial calls, curated by Artificial Analysis)
Provider	WER (%)	Rank
Solaria-3	6.4	#1
AssemblyAI	6.9	#2
ElevenLabs Scribe v2	7.7	#3
Speechmatics	7.8	#4
Mistral Voxtral	7.9	#5
Solaria-1	8.1	#6
Deepgram Nova-3	12.0	#7

Switchboard (conversational speech)
Provider	WER (%)	Rank
Solaria-3	33.9	#1
Solaria-1	37.3	#2
AssemblyAI	42.3	#3
Speechmatics	46.0	#4
Mistral Voxtral	48.1	#5
Deepgram Nova-3	49.8	#6
ElevenLabs Scribe v2	55.2	#7

Noisy audio (degraded production audio)
Provider	WER (%)	Rank
Mistral Voxtral	1.0	#1
Solaria-3	1.4	#2
Solaria-1	1.9	#3
Speechmatics	1.9	#3
AssemblyAI	2.1	#5
Deepgram Nova-3	3.2	#6
ElevenLabs Scribe v2	4.0	#7
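The Rank column in the noisy-audio table jumps from #3 to #5 because two providers tie at 1.9% WER. A minimal sketch of that tie-aware ranking, assuming standard competition ranking is what the table uses; the function name is illustrative and not part of any Gladia tooling:

```python
def competition_ranks(wer_by_provider: dict[str, float]) -> dict[str, int]:
    """Rank by ascending WER; equal WERs share a rank and the next rank is skipped."""
    ordered = sorted(wer_by_provider.items(), key=lambda item: item[1])
    ranks: dict[str, int] = {}
    previous_wer, previous_rank = None, 0
    for position, (provider, value) in enumerate(ordered, start=1):
        # A tie reuses the previous rank; otherwise the rank is the 1-based position.
        rank = previous_rank if value == previous_wer else position
        ranks[provider] = rank
        previous_wer, previous_rank = value, rank
    return ranks


# WER values (%) copied from the noisy-audio table.
noisy_audio = {
    "Mistral Voxtral": 1.0,
    "Solaria-3": 1.4,
    "Solaria-1": 1.9,
    "Speechmatics": 1.9,
    "AssemblyAI": 2.1,
    "Deepgram Nova-3": 3.2,
    "ElevenLabs Scribe v2": 4.0,
}

ranks = competition_ranks(noisy_audio)
print(ranks["Solaria-1"], ranks["Speechmatics"], ranks["AssemblyAI"])  # 3 3 5
```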

Solaria-3 vs Solaria-1: accuracy by benchmark (WER %, lower is better)
Benchmark	Solaria-3 WER (%)	Solaria-1 WER (%)	Solaria-3 rank / note
Earnings22 Cleaned AA	6.4	8.1	#1 (−21% vs Solaria-1)
Switchboard	33.9	37.3	#1 (−9% vs Solaria-1)
Noisy audio	1.4	1.9	#2 (−26% vs Solaria-1)
Common Voice 24	6.9	8.2	−16% vs Solaria-1
FLEURS	3.7	3.9	−5% vs Solaria-1
VoxPopuli	2.9	2.2	Regression: +32% vs Solaria-1 (formal parliamentary speech)
Multilingual LibriSpeech	8.0	5.9	Regression: +36% vs Solaria-1 (clean read-speech)
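The percentages in the note column can be reproduced from the two WER columns. The sketch below assumes they are relative changes, (Solaria-3 − Solaria-1) / Solaria-1, computed from the rounded WER figures shown in the table; the names are illustrative.

```python
# (Solaria-3 WER %, Solaria-1 WER %), copied from the table above.
RESULTS = {
    "Earnings22 Cleaned AA": (6.4, 8.1),
    "Switchboard": (33.9, 37.3),
    "Noisy audio": (1.4, 1.9),
    "Common Voice 24": (6.9, 8.2),
    "FLEURS": (3.7, 3.9),
    "VoxPopuli": (2.9, 2.2),
    "Multilingual LibriSpeech": (8.0, 5.9),
}


def relative_change(new_wer: float, old_wer: float) -> float:
    """Relative WER change in percent; negative means the new model is better."""
    return (new_wer - old_wer) / old_wer * 100


for benchmark, (solaria_3, solaria_1) in RESULTS.items():
    print(f"{benchmark}: {relative_change(solaria_3, solaria_1):+.0f}%")
```

Under that assumption the output matches the table: −21%, −9%, −26%, −16%, −5%, +32%, and +36%. A relative change is not a difference in percentage points: the VoxPopuli regression of +32% corresponds to 0.7 points of WER.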

New model

Solaria-3

Solaria-3 is built for production audio — noisy, fast-paced, and conversational. Best-in-class on real customer recordings in English and core European languages, with higher precision on the names, terms, and entities that matter most in business scenarios.

Try Solaria-3

#1 on English production audio with real customer calls

#1 on business audio & conversational call center speech

5 languages with superior performance: EN, FR, DE, ES, IT

Why Solaria-3

Accurate where it counts

Solaria-3 excels with the challenging production audio that breaks other models.

Best on real English audio

The truest test of a speech model isn't a curated benchmark — it's real customer calls. On Gladia's internal English dataset, drawn from real production recordings annotated by humans, Solaria-3 hits 9.6% WER — at the very top of the field, and 26% better than Solaria-1.

Deepgram — 10.7% WER

"hello ladies and gentlemen thank you for standing by for cugen's third quarter twenty twenty one earnings…"

Solaria-3 — 4.2% WER

"Hello, ladies and gentlemen. Thank you for standing by. Qudian's third quarter 2021 earnings conference…"

In the Deepgram output, the company name is mangled and the numbers are written out as words. On a 15-minute call, errors like these compound.

#1 on business calls and telephone speech

Production audio is never clean. Call center recordings, mobile interviews, field audio — they all carry background noise, compression artifacts, and variable microphone quality. Solaria-3 handles them all, with leading WER on Earnings22 and Switchboard.

Solaria-1, Speechmatics, ElevenLabs — 100% WER

"Yeah, not even that much, probably. Well, that would be a good time."

Solaria-3 — 0.0% WER

"Yeah, not even that much probably. Yeah."

On degraded phone audio, three providers hallucinate an entire sentence that was never spoken.

The most accurate model for European languages

Multilingual accuracy has been core to Gladia since day one. Solaria-3 delivers on that commitment with its strongest European performance yet, tested on real customer calls across English, French, German, Spanish, and Italian.

Solaria-1 & Mistral — 50% WER

"Thus the bison tens were focused to fight lone."

Solaria-3 — 0.0% WER

"Thus, the Byzantines were forced to fight alone."

Three independent errors on an 8-word sentence. A proper noun, a verb, and an adverb — all wrong at once.
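For readers who want to check the per-example figures, here is a minimal sketch of how WER is conventionally computed: word-level edit distance (substitutions, insertions, deletions) divided by the number of reference words. Two things are assumptions here, since the page does not describe Gladia's scoring pipeline: the text normalization (lowercasing and stripping punctuation), and the use of the Solaria-3 outputs quoted at 0.0% WER as the reference transcripts.

```python
import re


def normalize(text: str) -> list[str]:
    """Lowercase, drop punctuation (keeping apostrophes), and split into words."""
    return re.sub(r"[^\w\s']", "", text.lower()).split()


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance divided by reference length."""
    ref, hyp = normalize(reference), normalize(hypothesis)
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] = edit distance between the reference words seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,                          # deletion
                curr[j - 1] + 1,                      # insertion
                prev[j - 1] + (ref_word != hyp_word), # substitution or match
            ))
        prev = curr
    return prev[-1] / len(ref)


# Assumed references: the Solaria-3 outputs quoted above at 0.0% WER.
byzantine_ref = "Thus, the Byzantines were forced to fight alone."
byzantine_hyp = "Thus the bison tens were focused to fight lone."
phone_ref = "Yeah, not even that much probably. Yeah."
phone_hyp = "Yeah, not even that much, probably. Well, that would be a good time."

print(f"{wer(byzantine_ref, byzantine_hyp):.0%}")  # 50%
print(f"{wer(phone_ref, phone_hyp):.0%}")          # 100%
```

In the Byzantine example, the three mistakes cost four word-level edits, because "bison tens" is one substitution plus one insertion; 4 edits over 8 reference words gives 50%. In the phone example, the hallucinated sentence adds one substitution and six insertions against a 7-word reference, which gives 100%.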

Benchmarks

The numbers, unfiltered

WER by language. Lower is better. Results include regressions — we publish the full picture.

Benchmark summary (WER — lower is better)

Dataset	Solaria-3	Rank
Real customer audio — English	9.6%	#1
Earnings22 Cleaned AA	6.4%	#1
Switchboard	33.9%	#1
Noisy audio	1.4%	#2

Solaria-3 vs. Solaria-1 — all European languages

Relative WER improvement. Negative = better.

Language	Real customer audio	Common Voice 24
English (EN)	−26%	−16%
French (FR)	−18%	−19%
Italian (IT)	−10%	−12%
Spanish (ES)	−9%	≈ flat
German (DE)	−3%	−13%

Real customer audio = Gladia's internal production dataset, annotated by humans.

Solaria-1

Where Solaria-1 is still stronger

Solaria-3 steps back on two benchmarks vs. Solaria-1: VoxPopuli Cleaned AA (+32% — formal parliamentary speech) and Multilingual LibriSpeech (+36% — clean read-speech). We're publishing this openly. If formal institutional speech or clean read-speech is your primary use case, Solaria-1 remains the better choice.

Compare

Solaria-1 vs. Solaria-3 — Which one is right for you

Two models, two jobs. Solaria-3 for real-world European audio quality. Solaria-1 for full breadth — 100+ languages, code-switching, formal speech.

	Solaria-3	Solaria-1
Best for
Primary use case	Highest accuracy on European real-world audio	Maximum language coverage across any domain
Recommended for	Business audio, call centers, noisy recordings, real-world European speech	Global multilingual, rare languages, clean read-speech, formal/institutional audio
Language coverage
Languages	Optimized for EN, FR, DE, ES, IT	100+ languages incl. 42 exclusive to Gladia
Code-switching	Limited	✓ Supported
Auto language detection	✓ Supported	✓ Supported
Accuracy (WER, lower is better)
Earnings22 Cleaned AA	#1, 6.4% (−21%)	8.1%
Switchboard	#1, 33.9% (−9%)	37.3%
Noisy audio	#2, 1.4% (−26%)	1.9%
Common Voice 24	6.9% (−16%)	8.2%
FLEURS	3.7% (−5%)	3.9%
VoxPopuli	2.9% (+32%)	2.2%
Multilingual LibriSpeech	8.0% (+36%)	5.9% (stronger)
Architecture & performance
On-premise deployment	Available	Available
Real-time streaming	Async only, for now	<103ms partials
Availability
Status	Free for 5 days → GA	Generally available

Try Solaria-3 for free

VoxPopuli and Multilingual LibriSpeech regressions published openly. For formal institutional speech or clean read-speech, Solaria-1 remains the better choice.
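The comparison above can be condensed into a simple decision rule. The sketch below is a hypothetical helper, not part of any Gladia SDK: the `Workload` fields and the returned strings are illustrative display names rather than API identifiers, and the rule only encodes what the comparison states (Solaria-3 for EN, FR, DE, ES, and IT async workloads; Solaria-1 for other languages, streaming, code-switching, and clean or formal speech).

```python
from dataclasses import dataclass

# Languages Solaria-3 is optimized for, per the comparison above.
SOLARIA_3_LANGUAGES = {"en", "fr", "de", "es", "it"}


@dataclass
class Workload:
    language: str                         # ISO 639-1 code, e.g. "en"
    needs_streaming: bool = False         # real-time partial transcripts
    needs_code_switching: bool = False    # several languages in one recording
    clean_or_formal_speech: bool = False  # read-speech, parliamentary audio


def recommend_model(workload: Workload) -> str:
    """Return an illustrative model name for the given workload."""
    if workload.language.lower() not in SOLARIA_3_LANGUAGES:
        return "Solaria-1"  # outside the five optimized languages
    if workload.needs_streaming or workload.needs_code_switching:
        return "Solaria-1"  # Solaria-3 is async only and limited on code-switching
    if workload.clean_or_formal_speech:
        return "Solaria-1"  # VoxPopuli and Multilingual LibriSpeech regressions
    return "Solaria-3"


print(recommend_model(Workload(language="fr")))                        # Solaria-3
print(recommend_model(Workload(language="en", needs_streaming=True)))  # Solaria-1
print(recommend_model(Workload(language="pt")))                        # Solaria-1
```

Real workloads are rarely this clear-cut, so a rule like this is a starting point; running both models on a sample of your own audio is the more reliable test.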

Try Solaria-3

Join 2,000+ enterprise teams building with Gladia.

Try it now

Solaria-3 ASR Benchmark Results — Word Error Rate (WER %, lower is better)

Real customer audio — English (Gladia internal production dataset)

Earnings22 Cleaned AA — Financial Calls (Curated by Artificial Analysis)

Switchboard — Conversational Speech

Noisy audio — Degraded production audio

Solaria-3 vs Solaria-1 — European language WER improvement (negative = Solaria-3 better)

Solaria-3 vs Solaria-1 — accuracy by benchmark (WER %, lower is better)