{"id":8625,"date":"2024-10-25T21:02:36","date_gmt":"2024-10-25T21:02:36","guid":{"rendered":"https:\/\/www.tun.com\/home\/?p=8625"},"modified":"2024-10-25T21:09:01","modified_gmt":"2024-10-25T21:09:01","slug":"mits-new-tool-simplifies-ai-verification-process","status":"publish","type":"post","link":"https:\/\/www.tun.com\/home\/mits-new-tool-simplifies-ai-verification-process\/","title":{"rendered":"MIT&#8217;s New Tool Simplifies AI Verification Process"},"content":{"rendered":"\n<div class=\"wp-block-group\"><div class=\"wp-block-group__inner-container is-layout-constrained wp-block-group-is-layout-constrained\">\n<div class=\"wp-block-uagb-blockquote uagb-block-e7eb3fc3 uagb-blockquote__skin-border uagb-blockquote__stack-img-none\"><blockquote class=\"uagb-blockquote\"><div class=\"uagb-blockquote__content\">MIT researchers introduced SymGen, a groundbreaking tool that accelerates the verification of AI responses by 20%. This innovation aims to enhance the reliability of large language models, aiding human validators across diverse fields.<\/div><footer><div class=\"uagb-blockquote__author-wrap uagb-blockquote__author-at-left\"><\/div><\/footer><\/blockquote><\/div>\n\n\n\n<div class=\"wp-block-group is-content-justification-space-between is-nowrap is-layout-flex wp-container-core-group-is-layout-0dfbf163 wp-block-group-is-layout-flex\"><div style=\"font-size:16px;\" class=\"has-text-align-left wp-block-post-author\"><div class=\"wp-block-post-author__content\"><p class=\"wp-block-post-author__name\">The University Network<\/p><\/div><\/div>\n\n\n<div class=\"wp-block-uagb-social-share uagb-social-share__outer-wrap uagb-social-share__layout-horizontal uagb-block-ee584a31\">\n<div class=\"wp-block-uagb-social-share-child uagb-ss-repeater uagb-ss__wrapper uagb-block-ec619ce7\"><span class=\"uagb-ss__link\" data-href=\"https:\/\/www.facebook.com\/sharer.php?u=\" tabindex=\"0\" role=\"button\" aria-label=\"facebook\"><span class=\"uagb-ss__source-wrap\"><span 
class=\"uagb-ss__source-icon\"><svg xmlns=\"https:\/\/www.w3.org\/2000\/svg\" viewBox=\"0 0 512 512\"><path d=\"M504 256C504 119 393 8 256 8S8 119 8 256c0 123.8 90.69 226.4 209.3 245V327.7h-63V256h63v-54.64c0-62.15 37-96.48 93.67-96.48 27.14 0 55.52 4.84 55.52 4.84v61h-31.28c-30.8 0-40.41 19.12-40.41 38.73V256h68.78l-11 71.69h-57.78V501C413.3 482.4 504 379.8 504 256z\"><\/path><\/svg><\/span><\/span><\/span><\/div>\n\n\n\n<div class=\"wp-block-uagb-social-share-child uagb-ss-repeater uagb-ss__wrapper uagb-block-32d99934\"><span class=\"uagb-ss__link\" data-href=\"https:\/\/twitter.com\/share?url=\" tabindex=\"0\" role=\"button\" aria-label=\"twitter\"><span class=\"uagb-ss__source-wrap\"><span class=\"uagb-ss__source-icon\"><svg xmlns=\"https:\/\/www.w3.org\/2000\/svg\" viewBox=\"0 0 512 512\"><path d=\"M389.2 48h70.6L305.6 224.2 487 464H345L233.7 318.6 106.5 464H35.8L200.7 275.5 26.8 48H172.4L272.9 180.9 389.2 48zM364.4 421.8h39.1L151.1 88h-42L364.4 421.8z\"><\/path><\/svg><\/span><\/span><\/span><\/div>\n\n\n\n<div class=\"wp-block-uagb-social-share-child uagb-ss-repeater uagb-ss__wrapper uagb-block-1d136f14\"><span class=\"uagb-ss__link\" data-href=\"https:\/\/www.linkedin.com\/shareArticle?url=\" tabindex=\"0\" role=\"button\" aria-label=\"linkedin\"><span class=\"uagb-ss__source-wrap\"><span class=\"uagb-ss__source-icon\"><svg xmlns=\"https:\/\/www.w3.org\/2000\/svg\" viewBox=\"0 0 448 512\"><path d=\"M416 32H31.9C14.3 32 0 46.5 0 64.3v383.4C0 465.5 14.3 480 31.9 480H416c17.6 0 32-14.5 32-32.3V64.3c0-17.8-14.4-32.3-32-32.3zM135.4 416H69V202.2h66.5V416zm-33.2-243c-21.3 0-38.5-17.3-38.5-38.5S80.9 96 102.2 96c21.2 0 38.5 17.3 38.5 38.5 0 21.3-17.2 38.5-38.5 38.5zm282.1 243h-66.4V312c0-24.8-.5-56.7-34.5-56.7-34.6 0-39.9 27-39.9 54.9V416h-66.4V202.2h63.7v29.2h.9c8.9-16.8 30.6-34.5 62.9-34.5 67.2 0 79.7 44.3 79.7 101.9V416z\"><\/path><\/svg><\/span><\/span><\/span><\/div>\n<\/div>\n<\/div>\n<\/div><\/div>\n\n\n\n<p>Large language models (LLMs), the backbone of 
modern artificial intelligence (AI), have showcased remarkable abilities but are not free from flaws. One critical issue they face is &#8220;hallucination,&#8221; where the AI fabricates incorrect or unsupported details.<\/p>\n\n\n\n<p>Traditionally, human validators play a crucial role in detecting these inaccuracies, especially in sensitive fields like healthcare and finance. However, the conventional method involves painstakingly cross-checking long documents, a process that is not only cumbersome but also prone to human error. This labor-intensive task may even deter some users from leveraging generative AI models altogether.<\/p>\n\n\n\n<p>To address this challenge, researchers from MIT have created a tool called SymGen that simplifies and speeds up the validation of AI-generated responses. SymGen generates responses with citations pointing exactly to the relevant information in a source document, such as a specific cell in a table. Users can then hover over the highlighted portions of the text to see the underlying data, streamlining the verification process.<\/p>\n\n\n\n<p>\u201cWe give people the ability to selectively focus on parts of the text they need to be more worried about. In the end, SymGen can give people higher confidence in a model\u2019s responses because they can easily take a closer look to ensure that the information is verified,\u201d co-lead author Shannon Shen, an electrical engineering and computer science graduate student at MIT, said in a <a href=\"https:\/\/news.mit.edu\/2024\/making-it-easier-verify-ai-models-responses-1021\" title=\"\">news release<\/a>.<\/p>\n\n\n\n<p>Through a user study, Shen and his team found that SymGen reduced verification times by about 20%, significantly enhancing the efficiency of validation processes for LLMs. 
This advancement has the potential to revolutionize various real-world applications, from generating clinical notes to summarizing financial market reports.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Innovative Symbolic References<\/h2>\n\n\n\n<p>Typically, LLMs generate citations linking to external documents to allow users to verify their language-based responses. However, these citation systems are usually an afterthought, often demanding extensive effort from users to sift through references.<\/p>\n\n\n\n<p>&#8220;Generative AI is intended to reduce the user&#8217;s time to complete a task. If you need to spend hours reading through all these documents to verify the model is saying something reasonable, then it\u2019s less helpful to have the generations in practice,&#8221; Shen added.<\/p>\n\n\n\n<p>The researchers approached the validation problem from the perspective of the human validators. A SymGen user begins by providing the LLM with structured data, like a table containing specific statistics. Instead of having the model immediately complete a task, such as generating a game summary, the researchers prompt it to generate responses in a symbolic format. Each cited word or phrase is linked to a specific cell in the data table, allowing for precise references.<\/p>\n\n\n\n<p>&#8220;Because we have this intermediate step that has the text in a symbolic format, we are able to have really fine-grained references. 
We can say, for every single span of text in the output, this is exactly where in the data it corresponds to,&#8221; co-lead author Lucas Torroba Hennigen, also an electrical engineering and computer science graduate student at MIT, said in the news release.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Streamlined and Error-Free<\/h2>\n\n\n\n<p>SymGen uses a rule-based tool to resolve each symbolic reference by copying the exact text from the data table, ensuring the cited information is free from errors.<\/p>\n\n\n\n<p>&#8220;This way, we know it is a verbatim copy, so we know there will not be any errors in the part of the text that corresponds to the actual data variable,&#8221; Shen added.<\/p>\n\n\n\n<p>While SymGen shows promising results, it does have limitations. The quality of its output depends on the source data, and the system currently operates only with structured data sets.<\/p>\n\n\n\n<p>The study is available <a href=\"https:\/\/arxiv.org\/pdf\/2311.09188\" title=\"\">here<\/a>.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Future Prospects<\/h2>\n\n\n\n<p>The MIT team plans to extend SymGen\u2019s capabilities to handle arbitrary text and diverse data types, potentially aiding in the validation of AI-generated legal documents and clinical summaries.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Large language models (LLMs), the backbone of modern artificial intelligence (AI), have showcased remarkable abilities but are not free from flaws. One critical issue they face is &#8220;hallucination,&#8221; where the AI fabricates incorrect or unsupported details. Traditionally, human validators play a crucial role in detecting these inaccuracies, especially in sensitive fields like healthcare and finance. 
[&hellip;]<\/p>\n","protected":false},"author":3,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"single-no-separators","format":"standard","meta":{"_acf_changed":false,"_uag_custom_page_level_css":"","_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"footnotes":""},"categories":[8],"tags":[],"class_list":["post-8625","post","type-post","status-publish","format-standard","hentry","category-ai"],"acf":[],"aioseo_notices":[],"uagb_featured_image_src":{"full":false,"thumbnail":false,"medium":false,"medium_large":false,"large":false,"1536x1536":false,"2048x2048":false},"uagb_author_info":{"display_name":"The University Network","author_link":"https:\/\/www.tun.com\/home\/author\/funky_junkie\/"},"uagb_comment_info":0,"uagb_excerpt":"Large language models (LLMs), the backbone of modern artificial intelligence (AI), have showcased remarkable abilities but are not free from flaws. One critical issue they face is &#8220;hallucination,&#8221; where the AI fabricates incorrect or unsupported details. 
Traditionally, human validators play a crucial role in detecting these inaccuracies, especially in sensitive fields like healthcare and finance.&hellip;","_links":{"self":[{"href":"https:\/\/www.tun.com\/home\/wp-json\/wp\/v2\/posts\/8625","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.tun.com\/home\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.tun.com\/home\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.tun.com\/home\/wp-json\/wp\/v2\/users\/3"}],"replies":[{"embeddable":true,"href":"https:\/\/www.tun.com\/home\/wp-json\/wp\/v2\/comments?post=8625"}],"version-history":[{"count":8,"href":"https:\/\/www.tun.com\/home\/wp-json\/wp\/v2\/posts\/8625\/revisions"}],"predecessor-version":[{"id":8638,"href":"https:\/\/www.tun.com\/home\/wp-json\/wp\/v2\/posts\/8625\/revisions\/8638"}],"wp:attachment":[{"href":"https:\/\/www.tun.com\/home\/wp-json\/wp\/v2\/media?parent=8625"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.tun.com\/home\/wp-json\/wp\/v2\/categories?post=8625"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.tun.com\/home\/wp-json\/wp\/v2\/tags?post=8625"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}