Alibaba releases an ‘open’ challenger to OpenAI’s o1 reasoning model

Röya Yusifbəyli

alibaba OpenAI

28 November 2024 21:29

1015 views

Alibaba releases an ‘open’ challenger to OpenAI’s o1 reasoning model

A new so-called “reasoning” AI model, QwQ-32B-Preview, has arrived on the scene. It’s one of the few to rival OpenAI’s o1, and it’s the first available to download under a permissive license.

Developed by Alibaba’s Qwen team, QwQ-32B-Preview contains 32.5 billion parameters and can consider prompts up ~32,000 words in length; it performs better on certain benchmarks than o1-preview and o1-mini, the two reasoning models that OpenAI has released so far. (Parameters roughly correspond to a model’s problem-solving skills, and models with more parameters generally perform better than those with fewer parameters. OpenAI does not disclose the parameter count for its models.)

Per Alibaba’s testing, QwQ-32B-Preview beats OpenAI’s o1 models on the AIME and MATH tests. AIME uses other AI models to evaluate a model’s performance, while MATH is a collection of word problems.

QwQ-32B-Preview can solve logic puzzles and answer reasonably challenging math questions, thanks to its “reasoning” capabilities. But it isn’t perfect. Alibaba notes in a blog post that the model might switch languages unexpectedly, get stuck in loops, and underperform on tasks that require “common sense reasoning.”

Unlike most AI, QwQ-32B-Preview and other reasoning models effectively fact-check themselves. This helps them avoid some of the pitfalls that normally trip up models, with the downside being that they often take longer to arrive at solutions. Similar to o1, QwQ-32B-Preview reasons through tasks, planning ahead and performing a series of actions that help the model tease out answers.

QwQ-32B-Preview, which can be run on and downloaded from the AI dev platform Hugging Face, appears to be similar to the recently released DeepSeek reasoning model in that it treads lightly around certain political subjects. Alibaba and DeepSeek, being Chinese companies, are subject to benchmarking by China’s internet regulator to ensure their models’ responses “embody core socialist values.” Many Chinese AI systems decline to respond to topics that might raise the ire of regulators, like speculation about the Xi Jinping regime.