Overview
FunASR is a fundamental speech recognition toolkit developed by Alibaba DAMO Academy's Speech Lab, built to bridge the gap between academic research and production-grade industrial applications. Widely adopted for multilingual processing, it centers on the Paraformer model, a non-autoregressive transformer that delivers strong accuracy while significantly reducing inference latency compared with autoregressive approaches such as RNN-T or Whisper-style decoders. The framework is highly modular: it integrates voice activity detection (VAD) via FSMN-VAD, punctuation restoration through CT-Transformer, and speaker diarization with the CAM++ model.

FunASR is optimized for long-form audio processing and real-time streaming, and offers hotword customization (SeACo-Paraformer) for biasing recognition toward technical jargon and proper nouns. With deployment paths covering ONNX, TensorRT, and a range of edge devices, it gives enterprises a privacy-first, self-hosted alternative to proprietary APIs. It is particularly strong in the Asia-Pacific market thanks to its handling of Mandarin-English code-switching and diverse Chinese dialects, which makes it well suited to cross-border communication and localized customer-service automation.
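As a rough sketch of how the modular pipeline described above is typically wired together, the snippet below uses FunASR's `AutoModel` interface to combine the Paraformer ASR model with FSMN-VAD and CT-Transformer punctuation. The specific model identifiers (`paraformer-zh`, `fsmn-vad`, `ct-punc`), the hotword phrase, and the audio path are assumptions that depend on the installed FunASR version and its model zoo; the block degrades gracefully when the `funasr` package is not available.

```python
# Minimal sketch of a FunASR pipeline: ASR + VAD + punctuation restoration.
# Model names, the hotword, and the audio path are illustrative assumptions;
# check your FunASR version's model zoo for the identifiers it actually ships.
try:
    from funasr import AutoModel
except ImportError:  # funasr not installed; keep the sketch importable
    AutoModel = None


def transcribe(path: str):
    """Run the modular pipeline on one audio file; return text or None."""
    if AutoModel is None:
        return None
    model = AutoModel(
        model="paraformer-zh",  # Paraformer ASR (assumed identifier)
        vad_model="fsmn-vad",   # FSMN-VAD segments long-form audio
        punc_model="ct-punc",   # CT-Transformer restores punctuation
    )
    try:
        # Hotword biasing in the SeACo-Paraformer style; phrase is illustrative.
        result = model.generate(input=path, hotword="Paraformer")
    except Exception:
        return None  # placeholder path may not exist in this environment
    return result[0]["text"]


text = transcribe("meeting.wav")  # "meeting.wav" is a placeholder path
```

Because VAD, punctuation, and hotword biasing are separate components, each can be dropped or swapped independently, which is the practical payoff of the modular design described above.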
