{"id":24087,"date":"2025-05-12T15:00:11","date_gmt":"2025-05-12T15:00:11","guid":{"rendered":"https:\/\/www.tun.com\/home\/?p=24087"},"modified":"2025-05-12T15:00:12","modified_gmt":"2025-05-12T15:00:12","slug":"innovative-ai-headphones-translate-multiple-speakers-in-real-time","status":"publish","type":"post","link":"https:\/\/www.tun.com\/home\/innovative-ai-headphones-translate-multiple-speakers-in-real-time\/","title":{"rendered":"Innovative AI Headphones Translate Multiple Speakers in Real Time"},"content":{"rendered":"\n<div class=\"wp-block-group\"><div class=\"wp-block-group__inner-container is-layout-constrained wp-block-group-is-layout-constrained\">\n<div class=\"wp-block-uagb-blockquote uagb-block-e7eb3fc3 uagb-blockquote__skin-border uagb-blockquote__stack-img-none\"><blockquote class=\"uagb-blockquote\"><div class=\"uagb-blockquote__content\">University of Washington researchers unveil AI headphones that translate multiple speakers in real-time, maintaining their unique voice qualities. 
This groundbreaking technology could revolutionize communication in diverse languages.<\/div><footer><div class=\"uagb-blockquote__author-wrap uagb-blockquote__author-at-left\"><\/div><\/footer><\/blockquote><\/div>\n\n\n\n<div class=\"wp-block-group is-content-justification-space-between is-nowrap is-layout-flex wp-container-core-group-is-layout-0dfbf163 wp-block-group-is-layout-flex\"><div style=\"font-size:16px;\" class=\"has-text-align-left wp-block-post-author\"><div class=\"wp-block-post-author__content\"><p class=\"wp-block-post-author__name\">The University Network<\/p><\/div><\/div>\n<\/div>\n<\/div><\/div>\n\n\n\n<p>Researchers at the University of Washington (UW) have developed groundbreaking AI-powered headphones that can translate multiple speakers simultaneously while preserving the unique qualities and directions of their voices. This innovative system, known as Spatial Speech Translation, promises a significant advancement in real-time language translation technology.<\/p>\n\n\n\n<p>Tuochao Chen, a UW doctoral student in the Paul G. Allen School of Computer Science &amp; Engineering, recently faced a common barrier during a museum tour in Mexico: the inability to understand Spanish amidst the surrounding noise when using a translation app on a phone. The experience underscored the limitations of current translation apps, which are often overwhelmed by background sounds. 
<\/p>\n\n\n\n<p>Inspired by this challenge, Chen and his team set out to create a solution that could transcend these limitations.<\/p>\n\n\n\n<p>&#8220;Other translation tech is built on the assumption that only one person is speaking,&#8221; senior author Shyam Gollakota, a UW professor in the Allen School, said in a news release. &#8220;But in the real world, you can\u2019t have just one robotic voice talking for multiple people in a room. For the first time, we\u2019ve preserved the sound of each person\u2019s voice and the direction it\u2019s coming from.&#8221;<\/p>\n\n\n\n<p>The Spatial Speech Translation system employs off-the-shelf noise-canceling headphones fitted with microphones. The system\u2019s algorithms work like radar, scanning the environment in 360 degrees to detect and track multiple speakers, translating their speech with a 2-4 second delay. <\/p>\n\n\n\n<p>This approach ensures that each speaker&#8217;s voice is preserved authentically, maintaining its expressive qualities and volume.<\/p>\n\n\n\n<p>\u201cOur algorithms work a little like radar,\u201d added Chen. 
\u201cSo it&#8217;s scanning the space in 360 degrees and constantly determining and updating whether there\u2019s one person or six or seven.\u201d<\/p>\n\n\n\n<div style=\"height:18px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<figure class=\"wp-block-embed aligncenter is-type-video is-provider-youtube wp-block-embed-youtube wp-embed-aspect-16-9 wp-has-aspect-ratio\"><div class=\"wp-block-embed__wrapper\">\n<iframe loading=\"lazy\" title=\"Spatial speech translation\" width=\"500\" height=\"281\" src=\"https:\/\/www.youtube.com\/embed\/zxs5QQgengs?feature=oembed\" frameborder=\"0\" allow=\"accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share\" referrerpolicy=\"strict-origin-when-cross-origin\" allowfullscreen><\/iframe>\n<\/div><\/figure>\n\n\n\n<div style=\"height:8px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<p>The research team <a href=\"https:\/\/programs.sigchi.org\/chi\/2025\/program\/content\/189450\" target=\"_blank\" rel=\"noopener\" title=\"\">presented their findings<\/a> at the ACM CHI Conference on Human Factors in Computing Systems in Yokohama, Japan. The code for the proof-of-concept device is open-source, allowing others to build and expand on this pioneering work.<\/p>\n\n\n\n<p>The system runs on devices with an Apple M2 chip, such as laptops and the Apple Vision Pro, and processes speech on the device rather than in the cloud, addressing privacy concerns related to voice cloning. When tested in 10 different indoor and outdoor environments, users consistently favored the new system over traditional models that did not track speakers through space. <\/p>\n\n\n\n<p>In one of the user tests, participants preferred a 3-4 second delay, as the system made fewer errors than it did with a 1-2 second delay. <\/p>\n\n\n\n<p>While the device currently handles common speech rather than technical jargon, it has been successfully tested with Spanish, German and French. 
The success of previous translation models suggests that it could eventually be trained to handle roughly 100 languages.<\/p>\n\n\n\n<p>\u201cThis is a step toward breaking down the language barriers between cultures,\u201d Chen added. \u201cSo if I\u2019m walking down the street in Mexico, even though I don\u2019t speak Spanish, I can translate all the people\u2019s voices and know who said what.\u201d<\/p>\n\n\n\n<div style=\"height:8px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<p><strong>Source: <\/strong><a href=\"https:\/\/www.washington.edu\/news\/2025\/05\/09\/ai-headphones-translate-multiple-speakers-at-once-cloning-their-voices-in-3d-sound\/\" target=\"_blank\" rel=\"noopener\" title=\"\">University of Washington<\/a><\/p>\n\n\n\n<p>&nbsp;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Researchers at the University of Washington (UW) have developed groundbreaking AI-powered headphones that can translate multiple speakers simultaneously while preserving the unique qualities and directions of their voices. This innovative system, known as Spatial Speech Translation, promises a significant advancement in real-time language translation technology. Tuochao Chen, a UW doctoral student in the Paul G. 
[&hellip;]<\/p>\n","protected":false},"author":3,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"single-no-separators","format":"standard","meta":{"_acf_changed":false,"_uag_custom_page_level_css":"","_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"footnotes":""},"categories":[8,17],"tags":[161],"class_list":["post-24087","post","type-post","status-publish","format-standard","hentry","category-ai","category-tech","tag-university-of-washington"],"acf":[],"aioseo_notices":[],"uagb_featured_image_src":{"full":false,"thumbnail":false,"medium":false,"medium_large":false,"large":false,"1536x1536":false,"2048x2048":false},"uagb_author_info":{"display_name":"The University Network","author_link":"https:\/\/www.tun.com\/home\/author\/funky_junkie\/"},"uagb_comment_info":0,"uagb_excerpt":"Researchers at the University of Washington (UW) have developed groundbreaking AI-powered headphones that can translate multiple speakers simultaneously while preserving the unique qualities and directions of their voices. This innovative system, known as Spatial Speech Translation, promises a significant advancement in real-time language translation technology. 
Tuochao Chen, a UW doctoral student in the Paul G.&hellip;","_links":{"self":[{"href":"https:\/\/www.tun.com\/home\/wp-json\/wp\/v2\/posts\/24087","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.tun.com\/home\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.tun.com\/home\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.tun.com\/home\/wp-json\/wp\/v2\/users\/3"}],"replies":[{"embeddable":true,"href":"https:\/\/www.tun.com\/home\/wp-json\/wp\/v2\/comments?post=24087"}],"version-history":[{"count":6,"href":"https:\/\/www.tun.com\/home\/wp-json\/wp\/v2\/posts\/24087\/revisions"}],"predecessor-version":[{"id":24133,"href":"https:\/\/www.tun.com\/home\/wp-json\/wp\/v2\/posts\/24087\/revisions\/24133"}],"wp:attachment":[{"href":"https:\/\/www.tun.com\/home\/wp-json\/wp\/v2\/media?parent=24087"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.tun.com\/home\/wp-json\/wp\/v2\/categories?post=24087"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.tun.com\/home\/wp-json\/wp\/v2\/tags?post=24087"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}