{"id":9684,"date":"2024-11-05T22:00:49","date_gmt":"2024-11-05T22:00:49","guid":{"rendered":"https:\/\/www.tun.com\/home\/?p=9684"},"modified":"2024-11-05T22:00:50","modified_gmt":"2024-11-05T22:00:50","slug":"new-study-unveils-generative-ais-gaps-in-world-understanding","status":"publish","type":"post","link":"https:\/\/www.tun.com\/home\/new-study-unveils-generative-ais-gaps-in-world-understanding\/","title":{"rendered":"New Study Unveils Generative AI\u2019s Gaps in World Understanding"},"content":{"rendered":"\n<div class=\"wp-block-group\"><div class=\"wp-block-group__inner-container is-layout-constrained wp-block-group-is-layout-constrained\">\n<div class=\"wp-block-uagb-blockquote uagb-block-e7eb3fc3 uagb-blockquote__skin-border uagb-blockquote__stack-img-none\"><blockquote class=\"uagb-blockquote\"><div class=\"uagb-blockquote__content\">Despite their impressive outputs, generative AI models, including GPT-4, fail to understand real-world complexities, a new study reveals. Researchers have found significant flaws in the internal world models used by these AI systems, demonstrating that they can falter under changing conditions and emphasizing the need for more robust evaluation metrics.<\/div><footer><div class=\"uagb-blockquote__author-wrap uagb-blockquote__author-at-left\"><\/div><\/footer><\/blockquote><\/div>\n\n\n\n<div class=\"wp-block-group is-content-justification-space-between is-nowrap is-layout-flex wp-container-core-group-is-layout-0dfbf163 wp-block-group-is-layout-flex\"><div style=\"font-size:16px;\" class=\"has-text-align-left wp-block-post-author\"><div class=\"wp-block-post-author__content\"><p class=\"wp-block-post-author__name\">The University Network<\/p><\/div><\/div>\n\n\n<div class=\"wp-block-uagb-social-share uagb-social-share__outer-wrap uagb-social-share__layout-horizontal uagb-block-ee584a31\">\n<div class=\"wp-block-uagb-social-share-child uagb-ss-repeater uagb-ss__wrapper uagb-block-ec619ce7\"><span class=\"uagb-ss__link\" data-href=\"https:\/\/www.facebook.com\/sharer.php?u=\" tabindex=\"0\" role=\"button\" aria-label=\"facebook\"><span class=\"uagb-ss__source-wrap\"><span class=\"uagb-ss__source-icon\"><svg xmlns=\"https:\/\/www.w3.org\/2000\/svg\" viewBox=\"0 0 512 512\"><path d=\"M504 256C504 119 393 8 256 8S8 119 8 256c0 123.8 90.69 226.4 209.3 245V327.7h-63V256h63v-54.64c0-62.15 37-96.48 93.67-96.48 27.14 0 55.52 4.84 55.52 4.84v61h-31.28c-30.8 0-40.41 19.12-40.41 38.73V256h68.78l-11 71.69h-57.78V501C413.3 482.4 504 379.8 504 256z\"><\/path><\/svg><\/span><\/span><\/span><\/div>\n\n\n\n<div class=\"wp-block-uagb-social-share-child uagb-ss-repeater uagb-ss__wrapper uagb-block-32d99934\"><span class=\"uagb-ss__link\" data-href=\"https:\/\/twitter.com\/share?url=\" tabindex=\"0\" role=\"button\" aria-label=\"twitter\"><span class=\"uagb-ss__source-wrap\"><span class=\"uagb-ss__source-icon\"><svg xmlns=\"https:\/\/www.w3.org\/2000\/svg\" viewBox=\"0 0 512 512\"><path d=\"M389.2 48h70.6L305.6 224.2 487 464H345L233.7 318.6 106.5 464H35.8L200.7 275.5 26.8 48H172.4L272.9 180.9 389.2 48zM364.4 421.8h39.1L151.1 88h-42L364.4 421.8z\"><\/path><\/svg><\/span><\/span><\/span><\/div>\n\n\n\n<div class=\"wp-block-uagb-social-share-child uagb-ss-repeater uagb-ss__wrapper uagb-block-1d136f14\"><span class=\"uagb-ss__link\" data-href=\"https:\/\/www.linkedin.com\/shareArticle?url=\" tabindex=\"0\" role=\"button\" aria-label=\"linkedin\"><span class=\"uagb-ss__source-wrap\"><span class=\"uagb-ss__source-icon\"><svg xmlns=\"https:\/\/www.w3.org\/2000\/svg\" viewBox=\"0 0 448 512\"><path d=\"M416 32H31.9C14.3 32 0 46.5 0 64.3v383.4C0 465.5 14.3 480 31.9 480H416c17.6 0 32-14.5 32-32.3V64.3c0-17.8-14.4-32.3-32-32.3zM135.4 416H69V202.2h66.5V416zm-33.2-243c-21.3 0-38.5-17.3-38.5-38.5S80.9 96 102.2 96c21.2 0 38.5 17.3 38.5 38.5 0 21.3-17.2 38.5-38.5 38.5zm282.1 243h-66.4V312c0-24.8-.5-56.7-34.5-56.7-34.6 0-39.9 27-39.9 54.9V416h-66.4V202.2h63.7v29.2h.9c8.9-16.8 30.6-34.5 62.9-34.5 67.2 0 79.7 44.3 79.7 101.9V416z\"><\/path><\/svg><\/span><\/span><\/span><\/div>\n<\/div>\n<\/div>\n<\/div><\/div>\n\n\n\n<p>A new <a href=\"https:\/\/arxiv.org\/pdf\/2406.03689\" title=\"\">study<\/a> reveals that generative AI, despite its impressive capabilities, lacks a coherent understanding of the world. Researchers have uncovered that these models can complete tasks such as providing precise turn-by-turn driving directions in New York City but falter dramatically when faced with new variables, such as detours or road closures.<\/p>\n\n\n\n<p>Generative AI, particularly large language models (LLMs) like GPT-4, are primarily trained to predict the next word in a text sequence. Their phenomenal performance, often mimicking human-like understanding, has led many to believe that these models might possess an implicit comprehension of the real world. However, a new study challenges this notion, highlighting significant flaws in the internal models that AI uses to navigate tasks.<\/p>\n\n\n\n<p>\u201cWe needed test beds where we know what the world model is. Now, we can rigorously think about what it means to recover that world model,\u201d lead author Keyon Vafa, a postdoc at Harvard University, said in a <a href=\"https:\/\/news.mit.edu\/2024\/generative-ai-lacks-coherent-world-understanding-1105\" title=\"\">news release<\/a>.<\/p>\n\n\n\n<p><\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Concerning Findings on AI World Models<\/h2>\n\n\n\n<p>The research team focused their examination on a specific type of generative AI known as transformers, the backbone of many prominent LLMs. These models were tested using deterministic finite automations (DFAs), a problem class that involves rules-based sequential states, like navigating city streets or playing a board game like Othello.<\/p>\n\n\n\n<p>Their findings were striking. Transformers could accurately generate directions and valid game moves, yet when elements of the task environment were altered, their performance deteriorated quickly. <\/p>\n\n\n\n<p>\u201cI was surprised by how quickly the performance deteriorated as soon as we added a detour. If we close just 1% of the possible streets, accuracy immediately plummets from nearly 100% to just 67%,\u201d added Vafa.<\/p>\n\n\n\n<p>When the team reviewed the AI-generated maps of New York City, the maps bore little resemblance to reality. They were overlaid with imaginary streets and flyovers that do not exist, making the navigation models fail under real-world changes.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">New Metrics for Evaluating AI Understanding<\/h2>\n\n\n\n<p><span style=\"font-weight: 400;\">To address these inconsistencies, the researchers developed two new metrics to evaluate if a transformer has a coherent world model: sequence distinction and sequence compression. <\/span><\/p>\n\n\n\n<p><span style=\"font-weight: 400;\">Sequence distinction assesses whether the AI can differentiate between different states, such as distinct Othello boards. <\/span><\/p>\n\n\n\n<p><span style=\"font-weight: 400;\">Sequence compression evaluates if the AI recognizes identical states and understands the sequence of possible next steps from these states.<\/span><\/p>\n\n\n\n<p><span style=\"font-weight: 400;\">The study found that transformers, which learned from randomly-generated data, formed somewhat more accurate world models than those trained on strategy-based data. However, none of the tested AI models demonstrated a coherent world model for tasks involving wayfinding.<\/span><\/p>\n\n\n\n<p><span style=\"font-weight: 400;\">\u201cOne hope is that, because LLMs can accomplish all these amazing things in language, maybe we could use these same tools in other parts of science, as well. But the question of whether LLMs are learning coherent world models is very important if we want to use these techniques to make new discoveries,\u201d senior author Ashesh Rambachan, an assistant professor of economics at MIT<\/span> and a principal investigator in the MIT Laboratory for Information and Decision Systems (LIDS), said in the news release<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Implications and Future Directions<\/h2>\n\n\n\n<p><span style=\"font-weight: 400;\">These findings underscore a pivotal limitation in current generative AI models, which could have serious implications for their deployment in real-world scenarios. As LLMs become integrated into various sectors, from autonomous vehicles to scientific research, understanding their capabilities and limitations is crucial.<\/span><\/p>\n\n\n\n<p><span style=\"font-weight: 400;\">Future research aims to address these challenges by tackling more diverse problems, including those with partially known rules, and applying these evaluation metrics to real-world scientific issues.<\/span><\/p>\n\n\n\n<p><span style=\"font-weight: 400;\">\u201cOften, we see these models do impressive things and think they must have understood something about the world. I hope we can convince people that this is a question to think very carefully about, and we don\u2019t have to rely on our own intuitions to answer it,\u201d Rambachan added.<\/span><\/p>\n\n\n\n<p><\/p>\n\n\n\n<p><\/p>\n\n\n\n<p>&nbsp;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>A new study reveals that generative AI, despite its impressive capabilities, lacks a coherent understanding of the world. Researchers have uncovered that these models can complete tasks such as providing precise turn-by-turn driving directions in New York City but falter dramatically when faced with new variables, such as detours or road closures. Generative AI, particularly [&hellip;]<\/p>\n","protected":false},"author":3,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"single-no-separators","format":"standard","meta":{"_acf_changed":false,"_uag_custom_page_level_css":"","_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"footnotes":""},"categories":[8],"tags":[],"class_list":["post-9684","post","type-post","status-publish","format-standard","hentry","category-ai"],"acf":[],"aioseo_notices":[],"uagb_featured_image_src":{"full":false,"thumbnail":false,"medium":false,"medium_large":false,"large":false,"1536x1536":false,"2048x2048":false},"uagb_author_info":{"display_name":"The University Network","author_link":"https:\/\/www.tun.com\/home\/author\/funky_junkie\/"},"uagb_comment_info":0,"uagb_excerpt":"A new study reveals that generative AI, despite its impressive capabilities, lacks a coherent understanding of the world. Researchers have uncovered that these models can complete tasks such as providing precise turn-by-turn driving directions in New York City but falter dramatically when faced with new variables, such as detours or road closures. Generative AI, particularly&hellip;","_links":{"self":[{"href":"https:\/\/www.tun.com\/home\/wp-json\/wp\/v2\/posts\/9684","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.tun.com\/home\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.tun.com\/home\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.tun.com\/home\/wp-json\/wp\/v2\/users\/3"}],"replies":[{"embeddable":true,"href":"https:\/\/www.tun.com\/home\/wp-json\/wp\/v2\/comments?post=9684"}],"version-history":[{"count":12,"href":"https:\/\/www.tun.com\/home\/wp-json\/wp\/v2\/posts\/9684\/revisions"}],"predecessor-version":[{"id":9728,"href":"https:\/\/www.tun.com\/home\/wp-json\/wp\/v2\/posts\/9684\/revisions\/9728"}],"wp:attachment":[{"href":"https:\/\/www.tun.com\/home\/wp-json\/wp\/v2\/media?parent=9684"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.tun.com\/home\/wp-json\/wp\/v2\/categories?post=9684"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.tun.com\/home\/wp-json\/wp\/v2\/tags?post=9684"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}