{"id":27557,"date":"2025-07-23T14:44:40","date_gmt":"2025-07-23T14:44:40","guid":{"rendered":"https:\/\/www.tun.com\/home\/?p=27557"},"modified":"2025-07-23T14:44:41","modified_gmt":"2025-07-23T14:44:41","slug":"ai-chatbots-overestimate-abilities-and-lack-self-awareness-study-reveals","status":"publish","type":"post","link":"https:\/\/www.tun.com\/home\/ai-chatbots-overestimate-abilities-and-lack-self-awareness-study-reveals\/","title":{"rendered":"AI Chatbots Overestimate Abilities and Lack Self-Awareness, Study Reveals"},"content":{"rendered":"\n<div class=\"wp-block-group\"><div class=\"wp-block-group__inner-container is-layout-constrained wp-block-group-is-layout-constrained\">\n<div class=\"wp-block-uagb-blockquote uagb-block-e7eb3fc3 uagb-blockquote__skin-border uagb-blockquote__stack-img-none\"><blockquote class=\"uagb-blockquote\"><div class=\"uagb-blockquote__content\">A recent study from Carnegie Mellon University reveals that AI chatbots often overestimate their abilities and struggle with self-awareness. 
The findings underscore the importance of scrutinizing AI-generated information and highlight avenues for future improvements in artificial intelligence.<\/div><footer><div class=\"uagb-blockquote__author-wrap uagb-blockquote__author-at-left\"><\/div><\/footer><\/blockquote><\/div>\n\n\n\n<div class=\"wp-block-group is-content-justification-space-between is-nowrap is-layout-flex wp-container-core-group-is-layout-0dfbf163 wp-block-group-is-layout-flex\"><div style=\"font-size:16px;\" class=\"has-text-align-left wp-block-post-author\"><div class=\"wp-block-post-author__content\"><p class=\"wp-block-post-author__name\">The University Network<\/p><\/div><\/div>\n<\/div>\n<\/div><\/div>\n\n\n\n<p>Artificial intelligence chatbots have swiftly integrated into various aspects of digital life, from customer service interactions to online searches. However, new research from Carnegie Mellon University highlights a critical flaw: these AI systems tend to be overly confident in their abilities, even when they&#8217;re wrong.<\/p>\n\n\n\n<p>The study, <a href=\"https:\/\/link.springer.com\/article\/10.3758\/s13421-025-01755-4\" target=\"_blank\" rel=\"noopener\" title=\"\">published<\/a> in the journal Memory &amp; Cognition, delved into the self-assessment capabilities of large language models (LLMs), comparing their confidence levels with those of human participants. <\/p>\n\n\n\n<p>Participants and LLMs were asked how confident they felt answering trivia questions, predicting NFL game outcomes, or participating in a Pictionary-like image identification game. 
Both groups displayed similar success rates but differed significantly in their self-assessments post-task.<\/p>\n\n\n\n<p>\u201cSay the people told us they were going to get 18 questions right, and they ended up getting 15 questions right. Typically, their estimate afterwards would be something like 16 correct answers,\u201d lead author Trent Cash, a recent doctoral graduate from Carnegie Mellon, said in a news release. \u201cSo, they\u2019d still be a little bit overconfident, but not as overconfident.\u201d<\/p>\n\n\n\n<p>In contrast, the AI models, which included ChatGPT, Bard\/Gemini, Sonnet and Haiku, did not adjust their confidence levels downward after poor performance. <\/p>\n\n\n\n<p>\u201cThey tended, if anything, to get more overconfident, even when they didn\u2019t do so well on the task,\u201d Cash added.<\/p>\n\n\n\n<p>This discovery has profound implications for the integration of AI chatbots into everyday activities. <\/p>\n\n\n\n<p>Misplaced user trust in overconfident AI responses can have serious repercussions, particularly in areas requiring high accuracy. For example, a BBC study found significant inaccuracies in more than half of the AI-generated news responses it reviewed. <\/p>\n\n\n\n<p>Similarly, other studies have reported frequent \u201challucinations\u201d in legal queries, where LLMs produce incorrect information.<\/p>\n\n\n\n<p>Co-author Danny Oppenheimer, a professor in CMU\u2019s Department of Social and Decision Sciences, emphasized that AI lacks the intuitive confidence cues humans typically rely on. <\/p>\n\n\n\n<p>\u201cHumans have evolved over time and practiced since birth to interpret the confidence cues given off by other humans. 
If my brow furrows or I\u2019m slow to answer, you might realize I\u2019m not necessarily sure about what I\u2019m saying, but with AI, we don\u2019t have as many cues about whether it knows what it\u2019s talking about,\u201d Oppenheimer said in the news release.<\/p>\n\n\n\n<p>The study underscores the importance of questioning AI responses, particularly when the stakes are high. By asking the AI for its level of confidence, users can gauge the reliability of the information, though the LLM\u2019s self-assessment may not always be accurate.<\/p>\n\n\n\n<p>Highlighting the potential for future improvements, Oppenheimer suggested that larger datasets might help AI develop better self-awareness. <\/p>\n\n\n\n<p>\u201cMaybe if it had thousands or millions of trials, it would do better,\u201d he added.<\/p>\n\n\n\n<p>The study also found variability in overconfidence levels among different LLMs. For instance, Sonnet tended to be less overconfident than its peers, while ChatGPT-4 achieved near-human performance in certain tasks.<\/p>\n\n\n\n<p>Exposing these weaknesses is crucial for developing more reliable AI systems. <\/p>\n\n\n\n<p>\u201cIf LLMs can recursively determine that they were wrong, then that fixes a lot of the problem,\u201d added Cash.<\/p>\n\n\n\n<div style=\"height:16px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<p><strong>Source:<\/strong> <a href=\"https:\/\/www.cmu.edu\/dietrich\/news\/news-stories\/2025\/july\/trent-cash-ai-overconfidence.html\" target=\"_blank\" rel=\"noopener\" title=\"\">Carnegie Mellon University<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Artificial intelligence chatbots have swiftly integrated into various aspects of digital life, from customer service interactions to online searches. However, new research from Carnegie Mellon University highlights a critical flaw: these AI systems tend to be overly confident in their abilities, even when they&#8217;re wrong. 
The study, published in the journal Memory &amp; Cognition, delved [&hellip;]<\/p>\n","protected":false},"author":3,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"single-no-separators","format":"standard","meta":{"_acf_changed":false,"_uag_custom_page_level_css":"","_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"footnotes":""},"categories":[8],"tags":[149],"class_list":["post-27557","post","type-post","status-publish","format-standard","hentry","category-ai","tag-carnegie-mellon-university"],"acf":[],"aioseo_notices":[],"uagb_featured_image_src":{"full":false,"thumbnail":false,"medium":false,"medium_large":false,"large":false,"1536x1536":false,"2048x2048":false},"uagb_author_info":{"display_name":"The University Network","author_link":"https:\/\/www.tun.com\/home\/author\/funky_junkie\/"},"uagb_comment_info":0,"uagb_excerpt":"Artificial intelligence chatbots have swiftly integrated into various aspects of digital life, from customer service interactions to online searches. However, new research from Carnegie Mellon University highlights a critical flaw: these AI systems tend to be overly confident in their abilities, even when they&#8217;re wrong. 
The study, published in the journal Memory &amp; Cognition, delved&hellip;","_links":{"self":[{"href":"https:\/\/www.tun.com\/home\/wp-json\/wp\/v2\/posts\/27557","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.tun.com\/home\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.tun.com\/home\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.tun.com\/home\/wp-json\/wp\/v2\/users\/3"}],"replies":[{"embeddable":true,"href":"https:\/\/www.tun.com\/home\/wp-json\/wp\/v2\/comments?post=27557"}],"version-history":[{"count":8,"href":"https:\/\/www.tun.com\/home\/wp-json\/wp\/v2\/posts\/27557\/revisions"}],"predecessor-version":[{"id":27569,"href":"https:\/\/www.tun.com\/home\/wp-json\/wp\/v2\/posts\/27557\/revisions\/27569"}],"wp:attachment":[{"href":"https:\/\/www.tun.com\/home\/wp-json\/wp\/v2\/media?parent=27557"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.tun.com\/home\/wp-json\/wp\/v2\/categories?post=27557"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.tun.com\/home\/wp-json\/wp\/v2\/tags?post=27557"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}