Chinese tech giant Alibaba has unveiled Qwen3, a new family of artificial intelligence models that, according to the company, rival — and in some cases surpass — the top-performing models from OpenAI and Google.
The Qwen3 model family — most of which are, or will soon be, available under an open license — can be accessed via AI development platforms like Hugging Face and GitHub. These models vary in size, ranging from 0.6 billion to 235 billion parameters. Parameter count is a rough proxy for a model's capability: more parameters generally mean stronger performance, though at a higher compute cost.
The emergence of sophisticated AI models developed in China, like Qwen3, has increased pressure on U.S.-based AI labs, prompting U.S. policymakers to further tighten restrictions on the export of advanced chips to Chinese firms.
According to Alibaba, Qwen3 models are "hybrid" models, capable of rapidly answering simple questions while taking extra time to "reason through" more complex problems. This reasoning capability enables the model to internally fact-check its own answers — similar to OpenAI’s o3 model — although this comes with increased latency.
"We have seamlessly integrated thinking and non-thinking modes, giving users the flexibility to manage their reasoning budget," the Qwen team wrote in a blog post. "This design makes it easier to configure task-specific budgets."
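The hybrid design can be illustrated with a toy dispatcher. This is a sketch, not Alibaba's API: the function name, the `enable_thinking` flag, and the `thinking_budget` parameter are all hypothetical stand-ins for whatever controls the real models expose. Simple requests take a fast path; harder ones take an explicit multi-step "reasoning" path capped by a token budget.

```python
from dataclasses import dataclass

@dataclass
class Reply:
    text: str
    thinking_tokens_used: int

def answer(question: str, enable_thinking: bool, thinking_budget: int = 0) -> Reply:
    """Toy hybrid model: a fast path for simple queries and a
    budget-capped 'reasoning' path for harder ones (names hypothetical)."""
    if not enable_thinking or thinking_budget <= 0:
        # Fast path: respond immediately, spending no reasoning tokens.
        return Reply(text=f"Quick answer to: {question}", thinking_tokens_used=0)
    # Reasoning path: pretend each step consumes 10 tokens until the budget runs out.
    used, steps = 0, []
    while used + 10 <= thinking_budget:
        steps.append(f"step {len(steps) + 1}")
        used += 10
    return Reply(text=f"Considered {len(steps)} steps for: {question}",
                 thinking_tokens_used=used)

fast = answer("What is 2 + 2?", enable_thinking=False)
slow = answer("Prove the claim.", enable_thinking=True, thinking_budget=50)
```

The point of the toggle is exactly the trade-off the article describes: the fast path minimizes latency, while the budgeted path trades latency for more deliberate answers.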
Some Qwen3 models are also built on a Mixture of Experts (MoE) architecture, which improves computational efficiency: a learned router sends each input to a small subset of specialized "expert" sub-networks, so only a fraction of the model's total parameters is active at any one time.
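A minimal sketch of the routing idea, using toy functions as "experts" (this is a generic MoE illustration, not Qwen3's actual implementation): a gate scores every expert, only the top-k highest-scoring experts run, and their outputs are averaged with renormalized gate probabilities.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def moe_forward(x, experts, gate_weights, top_k=2):
    """Minimal MoE sketch: score all experts, run only the top_k,
    and combine their outputs weighted by renormalized gate probs."""
    scores = [sum(w * xi for w, xi in zip(ws, x)) for ws in gate_weights]
    probs = softmax(scores)
    chosen = sorted(range(len(experts)), key=lambda i: probs[i], reverse=True)[:top_k]
    norm = sum(probs[i] for i in chosen)
    out = 0.0
    for i in chosen:  # only the selected experts are ever evaluated
        out += (probs[i] / norm) * experts[i](x)
    return out, chosen

# Four toy "experts", each a simple function of the input vector.
experts = [
    lambda x: sum(x),           # expert 0
    lambda x: max(x),           # expert 1
    lambda x: min(x),           # expert 2
    lambda x: sum(x) / len(x),  # expert 3
]
gate = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
y, active = moe_forward([2.0, 1.0], experts, gate, top_k=2)
```

The efficiency gain is that unselected experts are never evaluated, which is how a model like Qwen3's 235B-parameter MoE variant can run with only a fraction of its parameters active per token.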
Qwen3 models support 119 languages and have been trained on a dataset of around 36 trillion tokens — raw units of data the model processes (1 million tokens ≈ 750,000 words). The training data includes textbooks, question-answer pairs, code snippets, and AI-generated content, among other sources.
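At the stated ratio of roughly 0.75 words per token, the scale of that training corpus is easy to estimate:

```python
# Back-of-the-envelope conversion using the ~0.75 words/token ratio above.
TOKENS = 36_000_000_000_000  # ~36 trillion training tokens
WORDS_PER_TOKEN = 0.75       # 1 million tokens ~= 750,000 words

approx_words = TOKENS * WORDS_PER_TOKEN
print(f"~{approx_words / 1e12:.0f} trillion words")
```

That works out to roughly 27 trillion words of text.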
These enhancements significantly improve performance compared to the Qwen2 generation. While Qwen3 models aren’t definitively superior to state-of-the-art models like OpenAI’s o3 or o4-mini, they hold their own in benchmark comparisons.
On the Codeforces competitive programming platform, the largest model, Qwen3-235B-A22B (235 billion total parameters, of which roughly 22 billion are active per token thanks to its MoE design), slightly outperforms both OpenAI's o3-mini and Google's Gemini 2.5 Pro. It also beats o3-mini on the AIME math benchmark and on BFCL, the Berkeley Function-Calling Leaderboard, which evaluates a model's tool-use ability. However, this particular model isn't publicly available, at least not yet.
The largest publicly available model, Qwen3-32B, is still competitive with several other proprietary and open-source models. Notably, it outperforms OpenAI’s o1 model in various tasks, including the LiveCodeBench coding benchmark.
Alibaba highlights that Qwen3 models excel in tool-calling, instruction following, and data format replication. In addition to open downloads, Qwen3 is also accessible via cloud platforms, including Fireworks AI and Hyperbolic.
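"Tool-calling" generally means the model emits a structured request, often JSON, naming a function and its arguments, and the host application executes it. A minimal dispatch loop looks like the sketch below; the tool names and JSON shape here are hypothetical, not Qwen3's actual wire format.

```python
import json

# Hypothetical registry of tools the host application exposes to the model.
TOOLS = {
    "get_weather": lambda city: f"Sunny in {city}",
    "add": lambda a, b: a + b,
}

def dispatch(model_output: str):
    """Parse a JSON tool call emitted by the model and run the matching tool."""
    call = json.loads(model_output)
    fn = TOOLS.get(call["name"])
    if fn is None:
        raise ValueError(f"unknown tool: {call['name']}")
    return fn(**call["arguments"])

# A model that "excels in tool-calling" reliably produces parseable calls like this:
result = dispatch('{"name": "add", "arguments": {"a": 2, "b": 3}}')
```

The benchmark question for tool-calling (what BFCL measures) is essentially how often the model's emitted JSON names a real tool with valid arguments, since a malformed call fails at the `json.loads` or lookup step.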