{"id":96482,"date":"2026-04-24T15:42:43","date_gmt":"2026-04-24T12:42:43","guid":{"rendered":"https:\/\/u1f987.com\/en\/?p=96482"},"modified":"2026-04-24T15:45:21","modified_gmt":"2026-04-24T12:45:21","slug":"deepseek-unveils-a-rival-to-claude-chatgpt-and-gemini","status":"publish","type":"post","link":"https:\/\/u1f987.com\/en\/deepseek-unveils-a-rival-to-claude-chatgpt-and-gemini\/","title":{"rendered":"DeepSeek Unveils a Rival to Claude, ChatGPT, and Gemini"},"content":{"rendered":"<p>Chinese AI startup DeepSeek has released a preview of its new line of language models. The flagship V4-Pro has surpassed Claude Opus 4.6 and GPT-5.4, becoming the best open system available.\u00a0<\/p>\n<blockquote class=\"twitter-tweet\">\n<p lang=\"en\" dir=\"ltr\">\ud83d\ude80 DeepSeek-V4 Preview is officially live &#038; open-sourced! Welcome to the era of cost-effective 1M context length.<\/p>\n<p>\ud83d\udd39 DeepSeek-V4-Pro: 1.6T total \/ 49B active params. Performance rivaling the world&#8217;s top closed-source models.<br \/>\ud83d\udd39 DeepSeek-V4-Flash: 284B total \/ 13B active params.\u2026 <a href=\"https:\/\/t.co\/n1AgwMIymu\">pic.twitter.com\/n1AgwMIymu<\/a><\/p>\n<p>\u2014 DeepSeek (@deepseek_ai) <a href=\"https:\/\/twitter.com\/deepseek_ai\/status\/2047516922263285776?ref_src=twsrc%5Etfw\">April 24, 2026<\/a><\/p><\/blockquote>\n<p> <script async src=\"https:\/\/platform.twitter.com\/widgets.js\" charset=\"utf-8\"><\/script><\/p>\n<h2 class=\"wp-block-heading\">Architecture and Scale\u00a0<\/h2>\n<p>The V4-Pro model comprises approximately 1.6 trillion parameters, but only 49 billion are used at each step. The second version, V4-Flash, has a total scale of 284 billion, with 13 billion activated.\u00a0<\/p>\n<p>Both models are built on a Mixture of Experts (MoE) architecture: only the subnetworks relevant to the task are activated for each token. 
This approach is more cost-effective than fully dense architectures while maintaining performance.<\/p>\n<p>Pre-training was conducted on a corpus exceeding 32 trillion tokens. Developers then fine-tuned the models in stages, with dedicated phases for coding, mathematics, logic, and instruction following. The final version integrates these skills through distillation.<\/p>\n<h2 class=\"wp-block-heading\">Long Context Made Affordable<\/h2>\n<p>A key distinction of V4 is its optimization for processing long sequences. Other models also offer a 1-million-token context window, but using it typically incurs high costs and latency.\u00a0<\/p>\n<p>DeepSeek announced that the new version significantly reduced the resource demands of such operations. Compared to V3.2, V4-Pro requires about 27% of the compute and 10% of the <span data-descr=\"an optimization method in large language models that retains intermediate 'Key' (K) and 'Value' (V) tensors for previous tokens\" class=\"old_tooltip\">KV cache<\/span> memory when working at maximum context. For V4-Flash, the figures are approximately 10% and 7%, respectively.<\/p>\n<figure class=\"wp-block-image size-full\"><img decoding=\"async\" src=\"https:\/\/u1f987.com\/wp-content\/uploads\/img-e021e2138aaa63ab-2635810424543458.webp\" alt=\"image\" class=\"wp-image-278950\"\/><figcaption class=\"wp-element-caption\">Source: <a href=\"https:\/\/huggingface.co\/deepseek-ai\/DeepSeek-V4-Pro\/blob\/main\/DeepSeek_V4.pdf\">Hugging Face<\/a>.\u00a0<\/figcaption><\/figure>\n<p>The team achieved this through a hybrid attention architecture: two mechanisms compress data and reduce load when handling long texts. 
Special hyperconnections were also used for stability, and the Muon optimizer accelerated training.<\/p>\n<h2 class=\"wp-block-heading\">Reasoning Modes and Agent Capabilities<\/h2>\n<p>DeepSeek V4 supports three reasoning modes:<\/p>\n<ol class=\"wp-block-list\">\n<li>Non-think \u2014 quick responses to simple questions without additional analysis.\u00a0<\/li>\n<li>Think High \u2014 deep analysis for complex tasks and planning.\u00a0<\/li>\n<li>Think Max \u2014 maximum mode: the model outlines each step and checks all options.\u00a0<\/li>\n<\/ol>\n<p>In agent tasks, the Max mode now retains the chain of intermediate steps within a single task. In the previous version, some of this context was lost during user interaction.\u00a0<\/p>\n<h2 class=\"wp-block-heading\">Testing Results<\/h2>\n<p>According to DeepSeek, the flagship version shows results comparable to leading systems in several areas:<\/p>\n<ul class=\"wp-block-list\">\n<li>In programming tasks on Codeforces, the model achieved a rating of 3206 \u2014 23rd place among active programmers worldwide, on par with GPT-5.4;<\/li>\n<li>In mathematics, it scored 95.2 on HMMT 2026 and 89.8 on IMOAnswerBench, surpassing most competitors;<\/li>\n<li>On the SimpleQA Verified knowledge benchmark, it scored 57.9, ahead of Opus 4.6 (46.2) but behind Gemini 3.1 Pro (75.6);<\/li>\n<li>In reasoning benchmarks, the models lag behind GPT-5.4 and Gemini 3.1 Pro by an estimated three to six months;<\/li>\n<li>In DeepSeek&#8217;s internal test, covering development, debugging, and refactoring tasks, the model achieved 67% \u2014 between Sonnet 4.5 (47%) and Opus 4.5 (70%);<\/li>\n<li>In agent scenarios and development tasks, V4-Pro-Max demonstrated 80.6% on SWE Verified and 67.9% on Terminal Bench.<\/li>\n<\/ul>\n<figure class=\"wp-block-image size-full\"><img decoding=\"async\" src=\"https:\/\/u1f987.com\/wp-content\/uploads\/img-790dff6c8eb7427c-2635829965260762.webp\" alt=\"image\" class=\"wp-image-278951\"\/><figcaption 
class=\"wp-element-caption\">Source: Hugging Face.\u00a0<\/figcaption><\/figure>\n<p>V4 was specifically trained on real-world scenarios: data analysis, reports, document editing, and internet searches with iterative tool use.<\/p>\n<p>To assess the model&#8217;s suitability for real development, the startup conducted internal testing on tasks from its engineers. In a survey of 85 developers and researchers, 52% stated they are ready to use V4-Pro as their primary coding model, while another 39% indicated they are inclined to do so.<\/p>\n<p>Back in April, OpenAI <a href=\"https:\/\/u1f987.com\/en\/news\/openai-unveils-gpt-5-5-its-flagship-ai-model\">released<\/a> GPT-5.5. The model is positioned as &#8220;a new level of intelligence for real work and agent management.&#8221;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Chinese AI startup DeepSeek has released a preview of its new line of language models. The flagship V4-Pro has surpassed Claude Opus 4.6 and GPT-5.4.<\/p>\n","protected":false},"author":1,"featured_media":96483,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"select":"1","news_style_id":"1","cryptorium_level":"","_short_excerpt_text":"DeepSeek unveils its new language models, surpassing Claude Opus 4.6 and GPT-5.4.","creation_source":"","_metatest_mainpost_news_update":false,"footnotes":""},"categories":[3],"tags":[438,1743],"class_list":["post-96482","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-news-and-analysis","tag-artificial-intelligence","tag-deepseek"],"aioseo_notices":[],"amp_enabled":true,"views":"19","promo_type":"1","layout_type":"1","short_excerpt":"DeepSeek unveils its new language models, surpassing Claude Opus 4.6 and 
GPT-5.4.","is_update":"","_links":{"self":[{"href":"https:\/\/u1f987.com\/en\/wp-json\/wp\/v2\/posts\/96482","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/u1f987.com\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/u1f987.com\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/u1f987.com\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/u1f987.com\/en\/wp-json\/wp\/v2\/comments?post=96482"}],"version-history":[{"count":1,"href":"https:\/\/u1f987.com\/en\/wp-json\/wp\/v2\/posts\/96482\/revisions"}],"predecessor-version":[{"id":96484,"href":"https:\/\/u1f987.com\/en\/wp-json\/wp\/v2\/posts\/96482\/revisions\/96484"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/u1f987.com\/en\/wp-json\/wp\/v2\/media\/96483"}],"wp:attachment":[{"href":"https:\/\/u1f987.com\/en\/wp-json\/wp\/v2\/media?parent=96482"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/u1f987.com\/en\/wp-json\/wp\/v2\/categories?post=96482"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/u1f987.com\/en\/wp-json\/wp\/v2\/tags?post=96482"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}