Qwen3-ASR Demo

Model: Qwen3-ASR-1.7B with Qwen3-ForcedAligner-0.6B

Qwen3-ASR is a state-of-the-art automatic speech recognition model that supports 52+ languages and dialects. This demo showcases the 1.7B model, which provides multilingual recognition, paired with Qwen3-ForcedAligner-0.6B for timestamp alignment.

Features:

Multi-language ASR (Chinese, English, Japanese, Korean, and more; 52+ languages and dialects in total)
Word/character-level timestamp alignment
Interactive timestamp visualization: click to hear each word/character segment!
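
To illustrate how word/character-level timestamps can drive per-segment playback, here is a minimal sketch. It does not call the Qwen3-ASR or forced-aligner APIs; the `AlignedUnit` type, the `slice_segment` helper, and the sample timestamps are hypothetical and only assume that alignment yields a text unit with start and end times in seconds.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class AlignedUnit:
    """Hypothetical container for one aligned word or character."""
    text: str
    start: float  # start time in seconds
    end: float    # end time in seconds


def slice_segment(samples: Sequence[float], sample_rate: int,
                  unit: AlignedUnit) -> Sequence[float]:
    """Return the audio samples that fall inside the unit's time span."""
    # Convert seconds to sample indices and clamp to the audio bounds.
    first = max(0, int(round(unit.start * sample_rate)))
    last = min(len(samples), int(round(unit.end * sample_rate)))
    return samples[first:last]


# Illustrative data only: two seconds of silence at 16 kHz and made-up timestamps.
sample_rate = 16000
samples: List[float] = [0.0] * (sample_rate * 2)
units = [
    AlignedUnit("hello", 0.10, 0.55),
    AlignedUnit("world", 0.60, 1.20),
]

# One audio slice per aligned unit; a UI could play each slice on click.
segments = [slice_segment(samples, sample_rate, u) for u in units]
for unit, segment in zip(units, segments):
    print(unit.text, len(segment) / sample_rate)
```

Each slice can then be handed to whatever audio player the interface uses, so that selecting a word or character plays only its portion of the recording.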