{"id":24282,"date":"2025-05-26T17:32:57","date_gmt":"2025-05-26T14:32:57","guid":{"rendered":"https:\/\/forklog.com\/en\/hallucinations-remain-ais-central-problem\/"},"modified":"2025-05-26T17:32:57","modified_gmt":"2025-05-26T14:32:57","slug":"hallucinations-remain-ais-central-problem","status":"publish","type":"post","link":"https:\/\/u1f987.com\/en\/hallucinations-remain-ais-central-problem\/","title":{"rendered":"Hallucinations remain AI\u2019s central problem"},"content":{"rendered":"<p>AI hallucinations are instances in which models confidently present false or inaccurate information. The output often sounds plausible, which makes it hazardous.<\/p>\n<h2 class=\"wp-block-heading\"><strong>Anatomy of deception<\/strong><\/h2>\n<p>Such incidents stem from how artificial intelligence works. An AI is a statistical language model that:<\/p>\n<ul class=\"wp-block-list\">\n<li>predicts the next word based on the previous ones;<\/li>\n<li>does not \u201cknow\u201d the truth, but generates the most probable answer;<\/li>\n<li>sometimes stitches together fragments of knowledge from different sources\u2014yielding plausible falsehoods.<\/li>\n<\/ul>\n<p>Hallucinations are causing trouble across fields. In May the large law firm Butler Snow <a href=\"https:\/\/www.reuters.com\/legal\/government\/trouble-with-ai-hallucinations-spreads-big-law-firms-2025-05-23\/?utm_source=chatgpt.com\">submitted<\/a> court filings containing quotations invented by artificial intelligence. They were generated by ChatGPT.<\/p>\n<p>This was not the first such incident in legal practice. AI-generated fabrications have appeared in filings since the emergence of ChatGPT and other chatbots. 
Judges have sanctioned and warned lawyers for breaching professional rules that require checking their work.<\/p>\n<p>Many cases involve small firms, but big companies have run into the same problem.<\/p>\n<p>That same month Elon Musk\u2019s Grok chatbot <a href=\"https:\/\/u1f987.com\/en\/news\/groks-unsolicited-remarks-on-white-genocide-in-south-africa\">touched on<\/a> the topic of \u201cwhite genocide\u201d in South Africa without any prompt from the user and offered contradictory information about the Holocaust. The company blamed a software bug and promised to take measures.<\/p>\n<p>Other examples of hallucinations:<\/p>\n<ul class=\"wp-block-list\">\n<li>Britain\u2019s environment ministry <a href=\"https:\/\/www.thetimes.com\/uk\/environment\/article\/defra-ai-peatland-map-cq9x80vnp?utm_source=chatgpt.com&#038;region=global\">published<\/a> an AI-made peatland map that erroneously classified rocky ground, walls and even forests as peatlands while missing genuinely degraded peat areas. Farmers and environmentalists criticised the map, warning such errors could skew policy;<\/li>\n<li>in May 2025 the Chicago Sun-Times and the Philadelphia Inquirer <a href=\"https:\/\/www.washingtonpost.com\/style\/media\/2025\/05\/20\/chicago-sun-times-philadelphia-inquirer-ai-books-summer-reading\/?utm_source=chatgpt.com\">published<\/a> an AI-generated summer reading list that included fictitious book titles and quotations from non-existent experts. After social-media backlash the outlets removed the section and pledged to revisit their AI policies;<\/li>\n<li>in March 2025 ChatGPT <a href=\"https:\/\/noyb.eu\/en\/ai-hallucinations-chatgpt-created-fake-child-murderer?utm_source=chatgpt.com\">produced<\/a> false information about a Norwegian user, alleging he had killed his children and been convicted. 
The fabricated tale included real details from the person\u2019s life; he filed a complaint under the <span data-descr=\"a European Union regulation governing the protection of personal data of EU citizens and residents\" class=\"old_tooltip\">GDPR<\/span> for disseminating inaccurate information.<\/li>\n<\/ul>\n<p>Beyond hallucinations, AIs can exhibit other odd behaviour. In November 2024 a 29-year-old college student in Michigan, Vidhay Reddy, used artificial intelligence to complete homework. During a discussion about the problems faced by older people, Gemini unexpectedly <a href=\"https:\/\/u1f987.com\/en\/news\/google-responds-to-gemini-ai-model-glitch\">urged<\/a> the user to die.<\/p>\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p>\u201c[Content encouraging self-harm has been redacted.]\u201d \u2014 it wrote.<\/p>\n<\/blockquote>\n<h2 class=\"wp-block-heading\"><strong>Hallucinations are not the problem<\/strong><\/h2>\n<p>AI models hallucinate less than humans, <a href=\"https:\/\/techcrunch.com\/2025\/05\/22\/anthropic-ceo-claims-ai-models-hallucinate-less-than-humans\/\">claimed<\/a> Anthropic CEO Dario Amodei at the Code with Claude event.<\/p>\n<p>He framed the remark as part of a broader point: hallucinations are not a constraint on Anthropic\u2019s path to AGI\u2014general intelligence at or above human level.<\/p>\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p>\u201cIt depends how you measure it, but I suspect AI models probably hallucinate less than people, though in more surprising ways,\u201d he said.<\/p>\n<\/blockquote>\n<p>Amodei is upbeat about the timing of AGI. 
In November 2024 he <a href=\"https:\/\/u1f987.com\/en\/news\/anthropic-ceo-predicts-superintelligent-ai-within-three-years\">said<\/a> that artificial intelligence would match human capabilities in 2026, mapping AI\u2019s progress to levels of education.<\/p>\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p>\u201cWe are approaching PhD level. Last year AI was at bachelor\u2019s level, and the year before\u2014a high-schooler,\u201d Amodei said.<\/p>\n<\/blockquote>\n<p>At Code with Claude he added that he sees progress in that direction.<\/p>\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p>\u201cPeople are always looking for some hard limits to what [AI] can do. But we do not see them anywhere. They do not exist,\u201d the expert stressed.<\/p>\n<\/blockquote>\n<h2 class=\"wp-block-heading\"><strong>Another view<\/strong><\/h2>\n<p>Some AI leaders see hallucinations and errors as a serious obstacle to AGI.<\/p>\n<p>Earlier, Google DeepMind CEO Demis Hassabis <a href=\"https:\/\/techcrunch.com\/2025\/05\/21\/googles-ai-agents-will-bring-you-the-web-now\/\">said<\/a> today\u2019s AI models have too many \u201choles\u201d and fail on many obvious questions.<\/p>\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p>\u201cYou can easily, literally in a few minutes, find obvious shortcomings in [AI chatbots]\u2014some school maths problem they fail at, or a simple game they can\u2019t play. It\u2019s not very hard to find these holes in the system. For me, for something to be called AGI, it has to be far more consistent across the board,\u201d Hassabis said.<\/p>\n<\/blockquote>\n<p>The consequences of systematic hallucinations are far-reaching. 
Users may stop trusting any information on the internet, and mass disinformation could fuel disputes over important issues.<\/p>\n<p>Amodei noted that TV hosts, politicians and people in many professions make mistakes all the time. The fact that AI does the same is not an indictment of its intelligence. The problem is the confidence with which it presents falsehoods as facts.<\/p>\n<h2 class=\"wp-block-heading\"><strong>AIs are skirting safeguards<\/strong><\/h2>\n<p>Some tools can curb hallucinations\u2014for instance, access to web search. Recent OpenAI iterations such as GPT-4.5 display a markedly lower rate of falsehoods than earlier systems. Yet they are beginning to excel at other kinds of deception.<\/p>\n<p>Research firm Palisade Research <a href=\"https:\/\/x.com\/PalisadeAI\/status\/1926084635903025621\">reported<\/a> that several models sabotaged shutdown scripts during controlled tests, even when explicitly instructed to allow the action.<\/p>\n<p>OpenAI\u2019s o3 failed to comply with the shutdown request in seven out of 100 cases; Codex-mini did so 12 times; o4-mini, once.<\/p>\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p>\u201cEach AI model was tasked with solving a series of basic maths problems. 
After the third, a warning appeared that the computer would power off when the model asked for the next problem,\u201d the experts said.<\/p>\n<\/blockquote>\n<p>O3 found a creative way to sabotage the shutdown mechanism.<\/p>\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p>\u201cUsually it just overwrote it, but in one case the model overrode the shutdown command used by the script\u2014and instead of executing it printed \u2018intercepted\u2019,\u201d they added.<\/p>\n<\/blockquote>\n<p>Other AI language models, such as Anthropic\u2019s Claude and Google\u2019s Gemini, obeyed the shutdown script.<\/p>\n<p>In April, OpenAI <a href=\"https:\/\/u1f987.com\/en\/news\/openai-unveils-o3-and-o4-mini-reasoning-models-prone-to-deception\">released<\/a> the deception-prone AI models o3 and o4-mini.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>AI hallucinations are instances in which models confidently present false or inaccurate information. The output often sounds plausible, which makes it hazardous. Anatomy of deception Such incidents stem from how artificial intelligence works. 
An AI is a statistical language model that: predicts the next word based on the previous ones; does not \u201cknow\u201d the truth, [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":24281,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"select":"","news_style_id":"","cryptorium_level":"","_short_excerpt_text":"","creation_source":"","_metatest_mainpost_news_update":false,"footnotes":""},"categories":[3],"tags":[438,1150],"class_list":["post-24282","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-news-and-analysis","tag-artificial-intelligence","tag-news-plus"],"aioseo_notices":[],"amp_enabled":true,"views":"30","promo_type":"","layout_type":"","short_excerpt":"","is_update":"","_links":{"self":[{"href":"https:\/\/u1f987.com\/en\/wp-json\/wp\/v2\/posts\/24282","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/u1f987.com\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/u1f987.com\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/u1f987.com\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/u1f987.com\/en\/wp-json\/wp\/v2\/comments?post=24282"}],"version-history":[{"count":0,"href":"https:\/\/u1f987.com\/en\/wp-json\/wp\/v2\/posts\/24282\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/u1f987.com\/en\/wp-json\/wp\/v2\/media\/24281"}],"wp:attachment":[{"href":"https:\/\/u1f987.com\/en\/wp-json\/wp\/v2\/media?parent=24282"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/u1f987.com\/en\/wp-json\/wp\/v2\/categories?post=24282"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/u1f987.com\/en\/wp-json\/wp\/v2\/tags?post=24282"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}