DeepSeek just released V3.2, matching OpenAI's GPT-5 performance while using 90% fewer computational resources. The Hangzhou-based lab trained the model with just 2.8 million GPU hours, compared with the 30+ million typically needed for frontier models.
The breakthrough comes from DeepSeek Sparse Attention, which attends only to the most relevant tokens rather than weighting every token equally (a rough sketch of the idea follows below). V3.2-Speciale even achieved a gold-medal score on the 2025 International Mathematical Olympiad, rivaling Google's Gemini 3 Pro.
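To make the idea concrete, here is a minimal top-k sparse-attention sketch in NumPy. It is a generic illustration, not DeepSeek's published implementation: the function name `topk_sparse_attention`, the `top_k` parameter, and the toy tensor shapes are assumptions for demonstration only.

```python
# Illustrative top-k sparse attention (not DeepSeek's actual DSA kernel).
# Each query attends to only the k highest-scoring keys instead of all tokens.
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def topk_sparse_attention(q, k, v, top_k=4):
    """q, k, v: (seq_len, d) arrays; top_k: keys kept per query (assumed)."""
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)                        # (seq, seq) similarities
    # Keep only the top_k highest-scoring keys per query; mask the rest.
    kth = np.partition(scores, -top_k, axis=-1)[:, -top_k][:, None]
    scores = np.where(scores >= kth, scores, -np.inf)
    weights = softmax(scores, axis=-1)                   # attention over kept keys only
    return weights @ v                                   # (seq, d) output

# Toy usage: each of the 8 query tokens mixes values from just 4 keys.
rng = np.random.default_rng(0)
q = k = v = rng.standard_normal((8, 16))
print(topk_sparse_attention(q, k, v, top_k=4).shape)     # (8, 16)
```

Note that this naive sketch still computes the full score matrix, so it only illustrates the selection step; a real sparse-attention system saves compute by deciding which tokens to score cheaply up front rather than scoring everything and discarding most of it.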
This challenges the assumption that frontier AI requires massive compute budgets. If smaller labs can achieve similar results with smarter architecture, what does this mean for the AI race?