{"id":23145,"date":"2025-04-17T17:27:33","date_gmt":"2025-04-17T14:27:33","guid":{"rendered":"https:\/\/forklog.com\/en\/openai-unveils-o3-and-o4-mini-reasoning-models-prone-to-deception\/"},"modified":"2025-04-17T17:27:33","modified_gmt":"2025-04-17T14:27:33","slug":"openai-unveils-o3-and-o4-mini-reasoning-models-prone-to-deception","status":"publish","type":"post","link":"https:\/\/u1f987.com\/en\/openai-unveils-o3-and-o4-mini-reasoning-models-prone-to-deception\/","title":{"rendered":"OpenAI unveils o3 and o4-mini, reasoning models prone to deception"},"content":{"rendered":"<div class=\"wp-block-text-wrappers-keypoints article_keypoints\">\n<ul class=\"wp-block-list\">\n<li>OpenAI has introduced new \u201creasoning\u201d AI models, o3 and o4-mini.<\/li>\n<li>The key feature is \u201cthinking with images\u201d rather than merely analysing them.<\/li>\n<li>Safety testers reported that o3 and o4-mini show a propensity to deceive.<\/li>\n<li>The start-up is focusing on building programming agents.<\/li>\n<\/ul>\n<\/div>\n<p>OpenAI announced the launch of new AI models, o3 and o4-mini. Both lean into reasoning\u2014taking more time before responding to check their own work.\u00a0<\/p>\n<blockquote class=\"twitter-tweet\">\n<p lang=\"en\" dir=\"ltr\">Introducing OpenAI o3 and o4-mini\u2014our smartest and most capable models to date.<\/p>\n<p>For the first time, our reasoning models can agentically use and combine every tool within ChatGPT, including web search, Python, image analysis, file interpretation, and image generation. <a href=\"https:\/\/t.co\/rDaqV0x0wE\">pic.twitter.com\/rDaqV0x0wE<\/a><\/p>\n<p>\u2014 OpenAI (@OpenAI) <a href=\"https:\/\/twitter.com\/OpenAI\/status\/1912560057100955661?ref_src=twsrc%5Etfw\">April 16, 2025<\/a><\/p><\/blockquote>\n<p> <script async src=\"https:\/\/platform.twitter.com\/widgets.js\" charset=\"utf-8\"><\/script><\/p>\n<p>OpenAI positions o3 as its most advanced \u201cthinking\u201d model. 
According to internal tests, it outperforms previous iterations in maths, programming, reasoning, science and visual understanding.\u00a0<\/p>\n<p>o4-mini offers a competitive balance of price, speed and performance.\u00a0<\/p>\n<p>Both models can browse the web, run Python code, and process and generate images. They, as well as the o4-mini-high variant, are available to Pro, Plus and Team subscribers.<\/p>\n<p>The company says o3 and o4-mini are the first not merely to recognise images but to literally \u201cthink with them.\u201d Users can upload images to ChatGPT\u2014such as whiteboard sketches or diagrams from PDFs\u2014and the models will analyse them using a so-called \u201cchain of thought.\u201d<\/p>\n<p>Thanks to this, the models can understand blurry and low-quality images. They can also execute Python code directly in the browser via Canvas in ChatGPT, or search the internet when asked about current events.\u00a0<\/p>\n<p>o3 scored 69.1% on the SWE-bench programming test, o4-mini 68.1%. o3-mini registered 49.3%, and Claude 3.7 Sonnet 62.3%.<\/p>\n<p>o3 costs $10 per million input tokens and $40 per million output tokens. For o4-mini the figures are $1.1 and $4.4, respectively.\u00a0<\/p>\n<p>In the coming weeks OpenAI plans to launch o3-pro\u2014a version of o3 that uses more compute to produce answers. It will be available only to ChatGPT Pro subscribers.<\/p>\n<h2 class=\"wp-block-heading\">A new safety system<\/h2>\n<p>OpenAI has <a href=\"https:\/\/cdn.openai.com\/pdf\/2221c875-02dc-4789-800b-e7758f3722c1\/o3-and-o4-mini-system-card.pdf\">implemented<\/a> a new monitoring system in o3 and o4-mini to detect queries related to biological and chemical threats. 
It is intended to block advice that could enable potentially dangerous attacks.\u00a0<\/p>\n<p>The company noted that the new models have significantly expanded capabilities compared with previous ones and therefore carry heightened risk when used by ill-intentioned users.\u00a0<\/p>\n<p>o3 is more adept at answering questions related to creating certain types of biological threats, so the company built the new monitoring system. It runs on top of o3 and o4-mini and is designed to detect prompts tied to biological and chemical risk.\u00a0<\/p>\n<p>OpenAI specialists spent about 1,000 hours labelling \u201cunsafe\u201d conversations. The models then refused to answer risky prompts in 98.7% of cases.\u00a0<\/p>\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/lh7-qw.googleusercontent.com\/docsz\/AD_4nXdlp-lQI13U3I2Mzh3m3WHamp-g-c57_JqglIV5wrUxD3-yvuEn7M_Npo2xgBYYfB0z7ohjB5UrMj0I1VgMoTR1RB8VnpOJ31at0mSIk6U4dYvkBhhbaS2GiKQ-9r34WTHrmhK6Xg?key=mG69-tdHfBtUDz0VhIEOcYgP\" alt=\"OpenAI released deception-prone AI models o3 and o4-mini\"\/><figcaption class=\"wp-element-caption\">Comparison of OpenAI\u2019s new models on bio-risk. 
Data: OpenAI.<\/figcaption><\/figure>\n<p>Despite regular improvements to model safety, one of the company\u2019s partners expressed concern.<\/p>\n<h2 class=\"wp-block-heading\">OpenAI is rushing<\/h2>\n<p>The organisation Metr, which works with OpenAI to evaluate the capabilities and safety of its AI models, was given little time to test the new systems.\u00a0<\/p>\n<p>It <a href=\"https:\/\/metr.github.io\/autonomy-evals-guide\/openai-o3-report\/\">said<\/a> in a blog post that one of o3\u2019s benchmark experiments was completed \u201cin a relatively short time\u201d compared with the analysis of OpenAI\u2019s previous flagship model, o1.\u00a0<\/p>\n<p>According to the <a href=\"https:\/\/www.ft.com\/content\/8253b66e-ade7-4d1f-993b-2d0779c7e7d8\">Financial Times<\/a>, the AI start-up gave testers less than a week to check the safety of the new products.\u00a0<\/p>\n<p>Metr claims that, based on what it could collect in the limited time, o3 has \u201ca high propensity\u201d to \u201ccheat\u201d or \u201chack\u201d tests in sophisticated ways to maximise its score. It goes to extremes even when it clearly understands that the behaviour does not align with the intentions of the user and OpenAI.\u00a0<\/p>\n<p>The organisation believes o3 may exhibit other forms of hostile or \u201cmalicious\u201d behaviour.<\/p>\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p>\u201cWhile we do not consider this especially likely, it is important to note that [our] evaluation setup will not be able to pick up this type of risk. 
Overall we believe that pre-deployment capability testing alone is not a sufficient risk-management strategy, and we are currently developing prototypes of additional forms of evaluation,\u201d the company stressed.\u00a0<\/p>\n<\/blockquote>\n<p>The company Apollo Research also <a href=\"https:\/\/cdn.openai.com\/pdf\/2221c875-02dc-4789-800b-e7758f3722c1\/o3-and-o4-mini-system-card.pdf\">recorded<\/a> deceptive behaviour by the o3 model. In one test the model was forbidden to use a particular tool, yet used it anyway after deciding it would help complete the task.\u00a0<\/p>\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p>\u201c[Apollo\u2019s findings] show that o3 and o4-mini are capable of in-context scheming and strategic deception. Despite the relative harmlessness, everyday users should be aware of divergences between the models\u2019 statements and actions [\u2026] This could be further assessed by analysing internal traces of reasoning,\u201d OpenAI noted.\u00a0<\/p>\n<\/blockquote>\n<h2 class=\"wp-block-heading\">A programming agent<\/h2>\n<p>Alongside the new AI models, OpenAI introduced <a href=\"https:\/\/github.com\/openai\/codex\">Codex CLI<\/a>, a local programming agent that runs directly from the terminal.<\/p>\n<p>The tool lets you write and edit code on the desktop and perform actions such as moving files.\u00a0<\/p>\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p>\u201cYou can get the benefits of multimodal reasoning from the command line by passing low-resolution screenshots or sketches to the model, combined with access to your code locally [via Codex CLI],\u201d the company noted.\u00a0<\/p>\n<\/blockquote>\n<h2 class=\"wp-block-heading\">OpenAI wants to buy Windsurf<\/h2>\n<p>Meanwhile, OpenAI is in talks to acquire the popular AI assistant for programmers, Windsurf, <a 
href=\"https:\/\/www.bloomberg.com\/news\/articles\/2025-04-16\/openai-said-to-be-in-talks-to-buy-windsurf-for-about-3-billion\">Bloomberg<\/a> reports.\u00a0<\/p>\n<p>The deal could be the biggest purchase yet for Sam Altman\u2019s start-up. Its details are not yet set and may change, the agency emphasised.\u00a0<\/p>\n<p>In April, OpenAI unveiled a new family of AI models\u2014GPT-4.1, GPT-4.1 mini and GPT-4.1 nano. They \u201cdo an excellent job\u201d at programming and following instructions.\u00a0<\/p>\n","protected":false},"excerpt":{"rendered":"<p>OpenAI has introduced new \u201creasoning\u201d AI models, o3 and o4-mini. The key feature is \u201cthinking with images\u201d rather than merely analysing them. Safety testers reported that o3 and o4-mini show a propensity to deceive. The start-up is focusing on building programming agents. OpenAI announced the launch of new AI models, o3 and o4-mini. Both lean [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":23144,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"select":"","news_style_id":"","cryptorium_level":"","_short_excerpt_text":"","creation_source":"","_metatest_mainpost_news_update":false,"footnotes":""},"categories":[3],"tags":[438,1150,1190],"class_list":["post-23145","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-news-and-analysis","tag-artificial-intelligence","tag-news-plus","tag-openai"],"aioseo_notices":[],"amp_enabled":true,"views":"69","promo_type":"","layout_type":"","short_excerpt":"","is_update":"","_links":{"self":[{"href":"https:\/\/u1f987.com\/en\/wp-json\/wp\/v2\/posts\/23145","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/u1f987.com\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/u1f987.com\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/u1f987.com\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":t
rue,"href":"https:\/\/u1f987.com\/en\/wp-json\/wp\/v2\/comments?post=23145"}],"version-history":[{"count":0,"href":"https:\/\/u1f987.com\/en\/wp-json\/wp\/v2\/posts\/23145\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/u1f987.com\/en\/wp-json\/wp\/v2\/media\/23144"}],"wp:attachment":[{"href":"https:\/\/u1f987.com\/en\/wp-json\/wp\/v2\/media?parent=23145"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/u1f987.com\/en\/wp-json\/wp\/v2\/categories?post=23145"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/u1f987.com\/en\/wp-json\/wp\/v2\/tags?post=23145"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}