DeepSeek's updated R1 model captured the AI community's attention this week, but the company also released a smaller, distilled version, DeepSeek-R1-0528-Qwen3-8B, that stands out by outperforming comparably sized models on certain benchmarks.
The distilled model is built on Alibaba's Qwen3-8B, which launched in May. According to DeepSeek, it beats Google's Gemini 2.5 Flash on AIME 2025, a challenging math benchmark, and performs nearly on par with Microsoft's Phi-4 Reasoning Plus on another math-focused test, HMMT.
Distilled models typically offer lower performance than their full-scale counterparts, but their major advantage is that they require far less computing power. According to cloud platform NodeShift, the 8B model can run on a single GPU with 40GB–80GB of memory, such as an Nvidia H100. The full-sized R1, by contrast, demands around a dozen 80GB GPUs.
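Those figures line up with a rough weights-only estimate. A minimal sketch, assuming the 8B model is held in 16-bit precision and using full R1's published size of roughly 671 billion parameters at 8-bit precision (figures not stated in this article):

```python
# Rough weights-only GPU memory estimate. Assumptions (not from the article):
# Qwen3-8B in 16-bit precision (2 bytes/parameter); full DeepSeek-R1 at its
# published ~671B parameters in 8-bit precision (1 byte/parameter). KV cache,
# activations, and parallelism overhead come on top of these numbers, which
# is why the practical requirement lands around a dozen 80GB GPUs.

def weight_memory_gb(params_billions: float, bytes_per_param: float) -> float:
    """Memory needed just to hold the weights, in gigabytes."""
    return params_billions * bytes_per_param  # 1e9 params * bytes / 1e9 bytes/GB

distilled_gb = weight_memory_gb(8, 2.0)    # ~16 GB -> fits one 40-80GB GPU
full_r1_gb = weight_memory_gb(671, 1.0)    # ~671 GB -> multi-GPU territory

print(f"Distilled 8B weights: ~{distilled_gb:.0f} GB")
print(f"Full R1 weights:      ~{full_r1_gb:.0f} GB "
      f"(~{full_r1_gb / 80:.1f} 80GB GPUs, weights alone)")
```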
DeepSeek-R1-0528-Qwen3-8B was created by collecting outputs from the full R1 model and using them to fine-tune Qwen3-8B.
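DeepSeek has not detailed its pipeline, but the general recipe that description implies is standard supervised fine-tuning on teacher-generated text. A minimal sketch using the Hugging Face transformers workflow (the example pair, hyperparameters, and training setup below are illustrative placeholders, not DeepSeek's actual data or configuration):

```python
# Illustrative sketch of output-based distillation: sample reasoning traces
# from the large "teacher" model offline, then fine-tune the small "student"
# on them with a standard causal-LM objective. Everything below except the
# model ID is a placeholder, not DeepSeek's actual setup.
import torch
from torch.utils.data import Dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          Trainer, TrainingArguments)

STUDENT_ID = "Qwen/Qwen3-8B"

# Step 1 (done offline with the full R1 model): collect prompt -> output
# pairs. A single stand-in example here; in practice these are large sets
# of teacher-generated reasoning traces.
teacher_data = [
    {"prompt": "Solve: 2x + 3 = 11. Show your reasoning.",
     "output": "<think>2x = 8, so x = 4.</think> x = 4"},
]

tokenizer = AutoTokenizer.from_pretrained(STUDENT_ID)

class DistillDataset(Dataset):
    """Tokenizes prompt+output pairs for supervised fine-tuning."""
    def __init__(self, pairs, max_len=1024):
        self.examples = []
        for p in pairs:
            text = p["prompt"] + "\n" + p["output"] + tokenizer.eos_token
            ids = tokenizer(text, truncation=True, max_length=max_len,
                            return_tensors="pt").input_ids[0]
            # Causal-LM objective: labels are the input ids themselves.
            self.examples.append({"input_ids": ids, "labels": ids.clone()})

    def __len__(self):
        return len(self.examples)

    def __getitem__(self, i):
        return self.examples[i]

# Step 2: fine-tune the student on the teacher's outputs.
student = AutoModelForCausalLM.from_pretrained(STUDENT_ID,
                                               torch_dtype=torch.bfloat16)
trainer = Trainer(
    model=student,
    args=TrainingArguments(output_dir="r1-distilled-qwen3-8b",
                           per_device_train_batch_size=1,
                           num_train_epochs=1),
    train_dataset=DistillDataset(teacher_data),
)
trainer.train()
```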
On the model's official page on the AI platform Hugging Face, DeepSeek notes that it is intended for both academic research and small-scale industrial applications.
Additionally, the model is released under the permissive MIT license, allowing for unrestricted commercial use. Several platforms, including LM Studio, already offer API access to the model.
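LM Studio, for instance, serves downloaded models through an OpenAI-compatible local endpoint, so querying the distilled model can look something like the sketch below (localhost:1234 is LM Studio's default server address; the model identifier is a placeholder that depends on how the model is named in your instance):

```python
# Querying the distilled model via LM Studio's OpenAI-compatible local
# server. The base URL is LM Studio's default; the api_key is ignored by
# local servers but required by the client. The model identifier is a
# placeholder -- use the name shown in your LM Studio instance.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

response = client.chat.completions.create(
    model="deepseek-r1-0528-qwen3-8b",  # placeholder identifier
    messages=[{"role": "user",
               "content": "How many prime numbers are there below 50?"}],
)
print(response.choices[0].message.content)
```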