
TL;DR
- Mistral launches Voxtral, an open-source family of audio models for production use.
- Understands up to 40 minutes of audio and transcribes up to 30, with summaries, Q&A, and API actions triggered by voice.
- Two model variants: Voxtral Small (24B) and Voxtral Mini (3B), plus a budget version for transcription.
- Multilingual support across 8 global languages.
- Available as open weights on Hugging Face and for testing in Mistral’s chatbot Le Chat; the API starts at $0.001/minute.
Mistral Takes on Audio AI With Voxtral
French AI company Mistral has released Voxtral, its first family of open-source audio models, entering the competitive voice-tech market against offerings such as OpenAI Whisper and ElevenLabs Scribe.
Designed to bring enterprise-grade functionality to developers without the constraints of proprietary systems, Voxtral covers transcription, audio comprehension, and API execution driven by voice.
Unlike closed offerings, its open weights give businesses full control over deployment, and Mistral positions the models as significantly cheaper to operate.
Speech-to-Action for the Real World
At the core of Voxtral is Mistral Small 3.1, a powerful LLM backbone that enables the model to:
- Transcribe up to 30 minutes of audio
- Understand up to 40 minutes of audio context
- Provide voice-based summarization and Q&A
- Execute functions via API calls in response to voice input
This enables Voxtral to support use cases such as:
- Automated meeting summaries
- Voice-powered applications
- Multilingual transcription for media and customer service
- AI agents with voice-command functionality
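The speech-to-action idea can be pictured as a small dispatch step: the model turns a spoken request into a structured function call, and the application runs the matching function. The sketch below is illustrative only. The JSON shape, the `TOOLS` registry, and the `create_reminder` and `get_weather` helpers are hypothetical and are not taken from Mistral's API; in a real integration the call would come from the model's response rather than a hard-coded string.

```python
import json
from typing import Any, Callable, Dict

# Hypothetical application functions a voice agent might expose.
def create_reminder(text: str, minutes_from_now: int) -> str:
    return f"Reminder set in {minutes_from_now} min: {text}"

def get_weather(city: str) -> str:
    # Stub: a real implementation would query a weather service.
    return f"(stub) Weather lookup requested for {city}"

# Registry mapping function names to callables (assumed structure).
TOOLS: Dict[str, Callable[..., str]] = {
    "create_reminder": create_reminder,
    "get_weather": get_weather,
}

def dispatch(call_json: str) -> str:
    """Run the function named in a model-produced call (assumed JSON shape)."""
    call: Dict[str, Any] = json.loads(call_json)
    name = call.get("name")
    if name not in TOOLS:
        raise ValueError(f"Unknown function: {name!r}")
    return TOOLS[name](**call.get("arguments", {}))

# Stand-in for what a model might emit after hearing
# "remind me in ten minutes to call the client".
example_call = (
    '{"name": "create_reminder", '
    '"arguments": {"text": "call the client", "minutes_from_now": 10}}'
)
result = dispatch(example_call)
print(result)
```

Keeping the registry explicit means the application, not the model, decides which functions can ever be executed, which is a common safeguard for voice-driven agents.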
Open Weights, Edge Models, and API Pricing
Voxtral comes in two main variants:
- Voxtral Small (24B parameters): For enterprise-scale production deployments.
- Voxtral Mini (3B parameters): For local and on-device inference.
Additionally, Mistral has released Voxtral Mini Transcribe, a stripped-down API variant optimized for transcription-only use cases. Mistral claims it outperforms Whisper at less than half the cost.
Developers can download Voxtral’s open weights for free from Hugging Face, or test the models live through Mistral’s chatbot Le Chat. The API starts at $0.001 per minute.
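To put the per-minute price in perspective, here is a rough back-of-the-envelope sketch. It assumes billing is linear in audio minutes at the quoted starting price, with no rounding, minimums, or tiering, and it treats the 30-minute transcription figure as a per-request limit. Both are assumptions for illustration; actual billing and limits should be checked against Mistral's pricing page.

```python
import math

PRICE_PER_MINUTE_USD = 0.001  # starting API price quoted in the article
MAX_TRANSCRIBE_MINUTES = 30   # transcription limit quoted in the article (assumed per request)

def estimate_cost_usd(audio_minutes: float,
                      price_per_minute: float = PRICE_PER_MINUTE_USD) -> float:
    """Estimated cost, assuming simple linear per-minute billing."""
    if audio_minutes < 0:
        raise ValueError("audio_minutes must be non-negative")
    return audio_minutes * price_per_minute

def segments_needed(audio_minutes: float,
                    max_minutes: int = MAX_TRANSCRIBE_MINUTES) -> int:
    """Number of requests needed if each one handles at most max_minutes of audio."""
    if audio_minutes < 0:
        raise ValueError("audio_minutes must be non-negative")
    return math.ceil(audio_minutes / max_minutes)

# Example: an archive of 1,000 hours of recordings.
archive_minutes = 1000 * 60
print(f"Estimated cost: ${estimate_cost_usd(archive_minutes):.2f}")
print(f"Requests for a 90-minute meeting: {segments_needed(90)}")
```

Under these assumptions, cost scales only with total audio length, so splitting a long recording into several requests changes the request count but not the estimated price.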
Voxtral by the Numbers
| Metric | Value | Source |
| --- | --- | --- |
| Audio comprehension limit | 40 minutes | Mistral Launch Blog |
| Model size (Voxtral Small) | 24 billion parameters | Hugging Face – Voxtral |
| Pricing (API usage) | $0.001 per minute | Mistral API Pricing |
| Supported languages | 8 (English, Spanish, French, Hindi, etc.) | Voxtral Launch |
| Primary competitors | OpenAI Whisper, ElevenLabs Scribe | TechCrunch Report |
A Step Toward Open Audio Standards
The launch of Voxtral follows the company’s introduction of Magistral, a reasoning model family released last month.
As one of Europe’s most prominent AI firms, Mistral continues to champion open-source alternatives, offering developers the freedom to build complex AI systems without vendor lock-in. With growing investor interest, including a rumored $1B raise led by MGX, Mistral’s pace of AI innovation appears to be accelerating.