Speech to Text Recognition Model

News

aiOla drops ultra-fast ‘multi-head’ speech recognition model, beats ...

To develop the Whisper-Medusa speech recognition model, aiOla modified Whisper’s architecture to add a multi-head attention mechanism.

Geeky Gadgets · 3mon

NVIDIA Parakeet 2 vs OpenAI Whisper: Which AI Speech Recognition Model ...

What if the race to perfect AI speech recognition wasn’t just about accuracy but also speed and usability? In a world where audio-to-text transcription powers everything from virtual meetings to ...

Tech Xplore on MSN · 12d

Researchers develop privacy-focused speech recognition for children

From the voice-to-text feature on your phone to the captions that make videos more accessible, speech transcription is ...

InfoQ · 10mon

Meta Spirit LM Integrates Speech and Text in New Multimodal GenAI Model

Presented in a recent paper, Spirit LM enables the creation of pipelines that mix spoken and written text to integrate speech and text in the same multimodal model. According to Meta, their ...

Qifu Technology's Paper Achieves ASRU 2025, Demonstrating Strong Self-Research Capabilities in Speech Technology

The intelligent voice team at Qifu Technology has brought more good news — the multimodal emotion recognition research paper ...

10d

How to Choose a Speech-to-Text Converter

Key features, accuracy, and usability factors to consider when selecting the right speech-to-text converter for your needs ...

CU Boulder News & Events · 6mon

Fine-tuning a Strong Language Model to Enable Classroom Speech Recognition

Combining audio, images, and text helps the model better understand speech context. To improve its performance, we fine-tune a strong language model by blending unsupervised learning with multimodal ...

VentureBeat · 5mon

A new, enterprise-specific AI speech model is here: Jargonic from aiOla ...

Jargonic is available immediately to enterprise customers via API, allowing them to integrate the model’s speech recognition capabilities into their own workflows, applications, or customer ...

Hosted on MSN · 1mon

Mistral launches Voxtral speech recognition model - MSN

Apache-licensed plan takes aim at costlier options. Mistral has released an open automatic speech recognition (ASR) software bundle called Voxtral in a bid to undercut rivals on price and quality ...
