DeepSeek Unveils V4 at Rock-Bottom Prices With Full Support From Huawei Chips

A year after shocking global markets with its V3 model, the Hangzhou-based startup DeepSeek is back, dropping V4 on the world with 1.6 trillion parameters, a million-token context window, open-source weights, and pricing that makes every major US lab look wildly overpriced. Oh, and it runs on Huawei chips. NVIDIA’s CEO is not pleased.

What Is DeepSeek V4 and Why Does It Matter?

DeepSeek V4 is the fourth-generation flagship large language model (LLM) family from DeepSeek, the Chinese AI startup that burst into the global spotlight in early 2025 when its V3 and R1 models proved frontier-class AI performance didn’t require a frontier-class budget. V4 is the first major new model the lab has released since R1 briefly crashed US tech stocks by more than $1 trillion in market value in January 2025.

The Hangzhou-based company released two versions simultaneously on April 24, 2026: DeepSeek-V4-Pro, a heavyweight 1.6 trillion-parameter model designed to rival the world’s top closed-source systems, and DeepSeek-V4-Flash, a leaner 284 billion-parameter variant aimed at speed and cost-efficiency. Both are available on Hugging Face with open weights, through the DeepSeek API, and on chat.deepseek.com.

What makes V4 different from a typical model launch, and what’s already sending shockwaves through markets and policy circles alike, is the trifecta it hits simultaneously: near-frontier performance, prices that undercut the competition by as much as 90%, and full deployment on Huawei’s Ascend AI chips. Together, these signal that China’s AI stack may be closer to full independence from US silicon than anyone in Washington had hoped.

The Pricing Is the Real Story

DeepSeek has always competed on price. But with V4, the gap between China and the US AI industry is getting harder to ignore.

Output Pricing Comparison Per 1 Million Tokens

Model              | Provider    | Price / 1M Output Tokens
GPT-5.4            | OpenAI      | $30.00
Claude Opus 4.6    | Anthropic   | $25.00
Kimi K26           | Moonshot AI | $4.00
DeepSeek V4-Pro    | DeepSeek    | $3.48
DeepSeek V4-Flash  | DeepSeek    | $0.28

For context: OpenAI charges $30 per million output tokens for its GPT-5.4 model, and Anthropic charges $25 for Claude Opus 4.6. DeepSeek’s V4-Pro comes in at $3.48. That’s nearly a 9x price gap against OpenAI for a model that, by coding benchmarks at least, performs within striking distance.

The gap is even more striking when you realize that both OpenAI and Anthropic have been moving in the opposite direction, recently hiking prices and imposing rate limits to manage surging demand. Other Chinese developers have followed suit, raising prices and removing unlimited usage plans. DeepSeek is not just holding the line; it’s still cutting.

And prices could go even lower. DeepSeek has stated it expects to reduce V4-Pro pricing once Huawei ramps up production of its new Ascend 950 AI processors in the second half of 2026. That’s an important signal: cheaper hardware leads to cheaper inference, which leads to cheaper APIs.

How DeepSeek V4 Actually Works

DeepSeek V4 uses a Mixture-of-Experts (MoE) architecture, a design in which only a fraction of the model’s parameters is activated for any given token. V4-Pro has 1.6 trillion total parameters but activates only 49 billion per token (about 3%). V4-Flash has 284 billion total parameters with 13 billion active per token. This selective activation is the key to the model’s efficiency: it’s how DeepSeek can offer massive scale at dramatically lower compute costs.
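DeepSeek hasn’t published V4’s routing internals, but a toy top-k MoE layer shows the mechanism those parameter counts imply. This is a sketch under assumed values; the expert count, hidden size, and k below are arbitrary, not V4’s real configuration:

```python
import torch
import torch.nn as nn

class ToyMoELayer(nn.Module):
    """Minimal top-k Mixture-of-Experts layer (illustrative only).

    A router scores every expert per token and keeps the top k, so only
    a small fraction of total parameters does work on any given token.
    """

    def __init__(self, d_model=64, n_experts=8, k=2):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList(
            nn.Linear(d_model, d_model) for _ in range(n_experts)
        )
        self.k = k

    def forward(self, x):  # x: (tokens, d_model)
        scores = self.router(x)                     # (tokens, n_experts)
        weights, idx = scores.topk(self.k, dim=-1)  # keep k experts per token
        weights = weights.softmax(dim=-1)           # renormalize kept scores
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e            # tokens routed to expert e
                if mask.any():
                    out[mask] += weights[mask, slot:slot + 1] * expert(x[mask])
        return out

# Only k/n_experts of the expert parameters run per token, which is how
# 1.6T total parameters can cost roughly 49B parameters' worth of compute.
layer = ToyMoELayer()
print(layer(torch.randn(10, 64)).shape)  # torch.Size([10, 64])
```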

The Pro version was pre-trained on 33 trillion tokens; Flash was trained on 32 trillion. Both support a one-million-token context window (roughly equivalent to 750,000 words), which DeepSeek describes as a native default, not an add-on. Most LLMs treat long context as a bolt-on capability that gets expensive fast. DeepSeek built it into the architecture from the ground up, using a hybrid of Compressed Sparse Attention (CSA) techniques that radically reduce the compute and memory required to process extremely long inputs.

The result: at one million tokens, V4-Pro uses only 27% of the per-token inference compute that DeepSeek’s previous V3.2 model required. V4-Flash pushes that efficiency further, to just 10%. That’s the underlying reason the pricing can be this aggressive: the model genuinely costs less to run.
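The exact CSA hybrid isn’t public, so the following is only a schematic illustration of why compressing the key/value sequence cuts long-context cost: pooling keys into block summaries shrinks the attention score matrix from n × n to n × n/block. The block size and mean-pooling here are assumptions for illustration, not DeepSeek’s actual method:

```python
import torch
import torch.nn.functional as F

def compressed_attention(q, k, v, block=16):
    """Toy compressed attention: queries attend to block-pooled keys.

    Pooling n keys into n/block summaries cuts the score matrix from
    O(n^2) to O(n^2 / block), which is the general shape of the savings
    any compressed/sparse attention scheme is after at long context.
    """
    n, d = k.shape
    k_c = k.view(n // block, block, d).mean(dim=1)  # (n/block, d) pooled keys
    v_c = v.view(n // block, block, d).mean(dim=1)  # (n/block, d) pooled values
    scores = q @ k_c.T / d ** 0.5                   # (n, n/block), not (n, n)
    return F.softmax(scores, dim=-1) @ v_c

n, d = 1024, 64
q, k, v = (torch.randn(n, d) for _ in range(3))
out = compressed_attention(q, k, v)
print(out.shape)  # torch.Size([1024, 64]) at ~1/16 the score-matrix cost
```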

Both V4 models are also built with agentic use cases at their core. DeepSeek says V4-Pro already serves as the internal agentic coding assistant for its own employees, with internal feedback placing its performance above Anthropic’s Claude Sonnet 4.5 and close to Claude Opus 4.6 in non-thinking mode. If you’re a developer building AI-powered workflows, these are numbers worth taking seriously. Our deep dive on DeepSeek’s full AI suite covers how to integrate earlier models into your stack.

So How Good Is It, Really?

DeepSeek’s benchmarks are strong enough to get attention, but the company is also honest about where it stands.

According to DeepSeek’s official technical report, V4 “falls marginally short of GPT-5.4 and Gemini 3.1-Pro, suggesting a developmental trajectory that trails state-of-the-art frontier models by approximately 3 to 6 months.” That’s a rare and refreshing level of candor in a press cycle that usually involves only superlative claims.

That said, the specific benchmarks are genuinely impressive. On SWE-bench Verified, a widely used coding evaluation, V4-Pro scores 80.6%, within 0.2 percentage points of Claude Opus 4.6. On the competitive programming platform Codeforces, V4-Pro achieves a rating of 3,206, placing it 23rd among human competitors. DeepSeek claims V4-Pro achieves the best results of any open-source model in agentic coding evaluations. For reasoning tasks, the model’s “thinking mode” shows particularly dramatic gains: its Humanity’s Last Exam (HLE) score jumps from 7.7 in non-thinking mode to 37.7 in thinking mode, a near-fivefold improvement.

What’s notable is why these numbers have NVIDIA’s CEO rattled: the concern isn’t just that China has a capable model. It’s that the model was trained and is running on non-NVIDIA hardware. The training-side shift to Huawei silicon matters as much as the performance numbers themselves.

The Huawei Chip Story Is the Bigger Deal

DeepSeek worked closely with Huawei for months to ensure V4 runs and was trained on Huawei’s Ascend AI processors. On the day of V4’s release, Huawei announced that its Ascend supernodes would offer “full support” for the new model. Semiconductor Manufacturing International Corporation (SMIC), which manufactures Huawei’s Ascend chips, saw its shares jump 10% in Hong Kong trading on the news. Competing Chinese AI labs MiniMax and Knowledge Atlas each dropped more than 9%.

This is significant for multiple reasons. US export controls have blocked Chinese developers from buying the most advanced AI chips from companies like NVIDIA and AMD. Rather than crippling China’s AI development as intended, those restrictions appear to have done something different: they forced Chinese developers to innovate around hardware constraints, producing models that extract more performance from less silicon.

DeepSeek explicitly gave Huawei’s Ascend chips exclusive early optimization access to V4 during development, reportedly denying NVIDIA and AMD the same head start. That’s a deliberate strategic signal. If China can train and deploy frontier-class models on domestic chips, the leverage that US chip controls provide begins to shrink substantially. The Ascend 950, which analysts place broadly between NVIDIA’s H100 and H200 in capability, is already in production. Huawei’s next-generation 960 and 970 chips are in the pipeline, each targeting roughly double the performance of its predecessor. The trajectory here is upward, and fast.

This also has direct implications for where the AI supply chain is heading. As we covered in our analysis of the Stanford 2026 AI Index, China’s lead over the US in some AI capability metrics is narrowing faster than most experts predicted just two years ago.

The Geopolitical Flashpoint Nobody Is Ignoring

The same day DeepSeek dropped V4, US science advisor Michael Kratsios published a White House memo accusing Chinese AI developers of running “industrial-scale campaigns” to copy US technology. The specific claim is that Chinese labs including DeepSeek have been conducting “illicit distillation attacks,” meaning they train their models on the outputs of US models like GPT and Claude rather than from scratch. OpenAI and Anthropic have both leveled similar accusations.

China’s foreign ministry called those claims “groundless” and characterized them as a smear against China’s AI achievements. The dispute is unlikely to be resolved anytime soon, and the absence of independent verification cuts both ways: the accusations are unproven, but so are DeepSeek’s counter-claims of pure domestic development. What’s not in dispute is the outcome: a Chinese lab has produced a model that, by many measures, performs within a few months of the global frontier, at a fraction of the cost.

The fundraising news adds another layer. Both the Financial Times and The Information report that DeepSeek is in talks to raise money from Tencent and Alibaba in a funding round valuing the lab at $20 billion. The motivation isn’t cash; parent company High-Flyer, a Chinese hedge fund, is reportedly well-funded. The goal is talent retention. With competitor AI labs offering larger valuations and equity packages, DeepSeek needs to give its top researchers a reason to stay.

This is a familiar dynamic from the US AI race (see the massive capital poured into the $25 billion Amazon-Anthropic investment), now playing out at scale in China. The war for AI talent is as global as the war for AI chips.

Real-World Implications for Developers and Businesses

DeepSeek V4 is a direct opportunity for any developer or company that currently uses a commercial LLM for coding assistance, document processing, or agentic workflows. The pricing math is straightforward: the same work that costs $30 per million tokens with OpenAI costs $3.48 with DeepSeek and less than a dollar with V4-Flash.
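To make that math concrete, here’s the arithmetic for a hypothetical workload of 50 million output tokens per month, using the output prices quoted above (input-token pricing, caching discounts, and rate limits are ignored for simplicity):

```python
# Back-of-envelope cost comparison from the quoted output prices.
PRICES_PER_1M_OUTPUT = {
    "GPT-5.4": 30.00,
    "Claude Opus 4.6": 25.00,
    "DeepSeek V4-Pro": 3.48,
    "DeepSeek V4-Flash": 0.28,
}

monthly_output_tokens = 50_000_000  # hypothetical workload

for model, price in PRICES_PER_1M_OUTPUT.items():
    cost = monthly_output_tokens / 1_000_000 * price
    print(f"{model:>18}: ${cost:>9,.2f}/month")

# Output:
#            GPT-5.4: $ 1,500.00/month
#    Claude Opus 4.6: $ 1,250.00/month
#    DeepSeek V4-Pro: $   174.00/month
#  DeepSeek V4-Flash: $    14.00/month
```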

The one million token context window is also genuinely useful at the application level. Developers can pass entire codebases, lengthy legal documents, or extensive research corpora to the model in a single call without chunking. This is where the architectural efficiency of V4 translates into real product capability, not just benchmark bragging rights. If you’re building AI-powered developer tools, this context length is a game-changer worth testing. Our overview of the best AI developer tools has additional context on how to evaluate these models for real workflows.
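As a sketch of what that looks like in practice: DeepSeek’s API has historically been OpenAI-SDK compatible, so a single whole-codebase call could look like the following. The base URL follows DeepSeek’s published pattern, but the "deepseek-v4-pro" model ID is an assumption, not a confirmed identifier:

```python
# Hypothetical single-call codebase review; no chunking or retrieval needed
# if the concatenated repo fits inside the 1M-token window.
from pathlib import Path
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_DEEPSEEK_API_KEY",
    base_url="https://api.deepseek.com",
)

# Concatenate every Python file in the repo into one prompt.
codebase = "\n\n".join(
    f"# FILE: {p}\n{p.read_text()}" for p in Path("my_project").rglob("*.py")
)

response = client.chat.completions.create(
    model="deepseek-v4-pro",  # assumed model ID, not confirmed
    messages=[
        {"role": "system", "content": "You are a senior code reviewer."},
        {"role": "user", "content": f"Review this codebase for bugs:\n{codebase}"},
    ],
)
print(response.choices[0].message.content)
```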

That said, businesses with US government contracts, defense adjacency, or strong data residency requirements should proceed carefully. The DeepSeek API routes through servers based in China. For many enterprise use cases, that’s a dealbreaker regardless of the pricing advantage. The open-source weights mean you can run V4 locally, but at 1.6 trillion parameters, local deployment is a serious infrastructure commitment that most teams aren’t equipped for.
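The back-of-envelope memory math makes the point. Note that MoE sparsity saves compute, not memory: all 1.6 trillion parameters still have to be resident. A minimal sketch, assuming FP8 weights (1 byte per parameter) and ignoring KV cache, activations, and serving overhead:

```python
# Rough weight-memory estimate for local deployment (assumptions above).
GPU_MEMORY_GB = 80  # e.g., one H100-class accelerator

for name, params in [("V4-Pro", 1.6e12), ("V4-Flash", 284e9)]:
    weights_gb = params / 1e9                    # 1 byte/param at FP8
    gpus = -(-weights_gb // GPU_MEMORY_GB)       # ceiling division
    print(f"{name}: ~{weights_gb:,.0f} GB of weights, >= {gpus:.0f} GPUs")

# Output:
# V4-Pro: ~1,600 GB of weights, >= 20 GPUs
# V4-Flash: ~284 GB of weights, >= 4 GPUs
```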

What’s Next and What It Means for the AI Race

DeepSeek releasing V4 into a landscape where OpenAI just shipped GPT-5.5 and Anthropic is testing its unreleased Mythos model with select enterprise partners is no coincidence. This is what competition at the frontier looks like in 2026: continuous pressure from multiple directions, including one that operates entirely outside the US regulatory and capital ecosystem.

The more immediate story is hardware. As Huawei scales its Ascend 950 supernodes through the second half of this year, DeepSeek’s API prices will likely fall further. If the Ascend 960 and 970 chips hit their performance targets (roughly double the Ascend 950), the cost efficiency of running Chinese models on Chinese chips could become a structural advantage, not just a temporary pricing play.

The US strategy of export controls was always a bet that limiting chip access would limit Chinese AI capability. DeepSeek V4 makes that bet look shakier by the quarter. Whether you’re a developer looking for cost savings, a business evaluating your AI vendor stack, or a policymaker watching the US-China tech gap, V4 is a model release that deserves close attention, not because it beats the best American models outright, but because it’s close enough, cheap enough, and independent enough to change the calculus for a lot of people.

Frequently Asked Questions

What is DeepSeek V4 and when was it released?

DeepSeek V4 is a family of open-source large language models released on April 24, 2026 by DeepSeek, a Chinese AI startup. It comes in two versions: V4-Pro (1.6 trillion parameters) and V4-Flash (284 billion parameters). Both support 1 million token context windows and are available for free download on Hugging Face under the MIT License.

How does DeepSeek V4 pricing compare to OpenAI and Anthropic?

DeepSeek V4-Pro costs $3.48 per million output tokens, roughly 9x cheaper than OpenAI’s GPT-5.4 at $30 and 7x cheaper than Anthropic’s Claude Opus 4.6 at $25. V4-Flash costs just $0.28 per million tokens. DeepSeek expects prices to drop further as Huawei scales production of its Ascend 950 chips.

Does DeepSeek V4 run on Huawei chips?

Yes. DeepSeek V4 was both trained on and is deployed using Huawei’s Ascend AI processors. Huawei announced “full support” for DeepSeek V4 on the day of the model’s launch. This marks a significant shift away from NVIDIA hardware and is seen as a key step toward a self-contained Chinese AI supply chain.

How does DeepSeek V4-Pro perform compared to GPT-5 and Claude?

DeepSeek’s own technical report states that V4-Pro trails GPT-5.4 and Gemini 3.1-Pro by roughly 3–6 months in capability. On SWE-bench Verified (coding), V4-Pro scores 80.6%, within 0.2 points of Claude Opus 4.6. DeepSeek also claims the top score among open-source models in agentic coding evaluations.

Is DeepSeek V4 safe to use for business applications?

DeepSeek V4 is open-source and can be run locally, which addresses data residency concerns. However, using the cloud API routes data through servers in China. For companies with US government contracts, defense ties, or strict data compliance requirements, local deployment of the open weights is the safer path, though it requires significant compute infrastructure.