Study Reveals How Brain Works During Speech

Every time we speak, we engage nearly 100 different muscles in our lips, jaw, tongue and throat. Now, a new UC San Francisco study shows how the brain works during speech and suggests promising avenues for creating prosthetic devices for those who are unable to speak.

The new study is published in Neuron.

More than just sound

For a long time, linguists have divided speech into abstract units of sound called “phonemes,” treating the /k/ sound in “keep” as the same as the /k/ in “coop.”

But in reality, according to the UCSF team, our mouth forms the sound differently in these two words to prepare for the different vowels that follow.

At least to the brain regions responsible for producing speech, this physical difference appears to be more important than the theoretical sameness of the phoneme.

Picking up from their previous study on how the brain interprets the sounds of isolated, single syllables, the researchers dug deeper into how the brain works during fluent speech.

“In our most recent study, we were curious as to how the complexity of speech is represented in the brain with respect to the actual movements of the vocal tract,” said Josh Chartier, a doctoral candidate in the UC Berkeley and UCSF Joint Program in Bioengineering and co-author of the study.

The study

The researchers used electrocorticography, or ECoG, a method used during brain surgery in which high-density electrode arrays are placed on the surface of patients’ brains to record electrical activity in critical areas, such as those involved in language.

“It’s a unique means of looking at thousands of neurons activating in unison,” Chartier said in a statement.

With ECoG electrodes placed over the ventral sensorimotor cortex, a key center of speech production, five volunteers awaiting surgery were asked to read aloud a collection of 460 natural sentences constructed to encapsulate nearly all the possible articulatory contexts in American English.

This comprehensiveness was crucial to capture the complete range of “coarticulation,” the blending of phonemes in natural speech.

“Without coarticulation, our speech would be blocky and segmented to the point where we couldn’t really understand it,” Chartier said in a statement.

The researchers did not directly measure the volunteers’ tongue, mouth and larynx movements alongside their neural activity. Instead, they recorded audio of the volunteers speaking and developed a new deep learning algorithm to estimate which movements were made during specific speaking tasks.
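The study does not publish its model code, but the general idea of estimating articulator movements from sound (acoustic-to-articulatory inversion) can be sketched with a small recurrent network. The sketch below is purely illustrative: the choice of MFCC-like audio features, the number of articulator trajectories, and the network shape are all assumptions, not the authors’ architecture.

```python
# Illustrative sketch only: a small recurrent network that maps per-frame
# audio features to estimated vocal tract articulator trajectories
# (lips, jaw, tongue, larynx). All dimensions and layer sizes are
# assumptions for illustration, not the study's actual model.
import torch
import torch.nn as nn

N_AUDIO_FEATS = 25      # assumed per-frame acoustic features (e.g., MFCCs)
N_ARTICULATORS = 12     # assumed number of articulator trajectories

class AcousticToArticulatory(nn.Module):
    def __init__(self, hidden=128):
        super().__init__()
        # A bidirectional LSTM lets each frame's estimate use surrounding
        # context, which matters because of coarticulation.
        self.rnn = nn.LSTM(N_AUDIO_FEATS, hidden, num_layers=2,
                           batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden, N_ARTICULATORS)

    def forward(self, audio_feats):          # (batch, time, N_AUDIO_FEATS)
        context, _ = self.rnn(audio_feats)
        return self.out(context)             # (batch, time, N_ARTICULATORS)

# Toy training loop on random stand-in data, just to show the shapes.
model = AcousticToArticulatory()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

audio = torch.randn(8, 200, N_AUDIO_FEATS)        # 8 utterances, 200 frames
movements = torch.randn(8, 200, N_ARTICULATORS)   # reference trajectories

for step in range(3):
    optimizer.zero_grad()
    loss = loss_fn(model(audio), movements)
    loss.backward()
    optimizer.step()
```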

The result

The researchers found that the brain’s speech centers are organized more according to the physical movements of the vocal tract as it produces speech than by how the speech sounds.

They identified four emergent groups of neurons that appeared to be responsible for coordinating the muscles of the lips, tongue and throat into the four main vocal tract configurations used in American English.

They also identified neural populations associated with specific phonetic classes, such as different types of consonants and vowels, but these groupings appeared to be a byproduct of more natural groupings based on types of muscle movement rather than on sound.

“It’s really made me think twice about how phonemes fit in — in a sense, these units of speech that we pin so much of our research on are just byproducts of a sensorimotor signal,” Gopala K. Anumanchipalli, a doctoral candidate in the Department of Neurological Surgery at UCSF and co-author of the study, said in a statement.

Regarding coarticulation, the researchers discovered that our brains’ speech centers coordinate different muscle movement patterns based on the context of what’s being said, and the order in which different sounds occur.

For example, the jaw opens more to say the word “tap” than to say the word “has” even though they both have the same vowel sound (/ae/). This is because the mouth has to get ready to close to make the /z/ sound in “has,” but not in “tap.”

They found that neurons in the ventral sensorimotor cortex, the brain area that controls speech, were highly attuned to this and other coarticulatory features of English. This suggests that these brain cells are tuned to produce fluid, context-dependent speech rather than reading out discrete speech segments in consecutive order.

The researchers speculate that, in languages other than English, brain activity patterns would similarly reflect the dominant vocal tract movements used in that language.

“This study highlights why we need to take into account vocal tract movements and not just linguistic features like phonemes when studying speech production,” Chartier said in a statement.

The next step

According to Chartier, the researchers will expand their study to look at native speakers of other languages.

In the long term, they hope this study will pave the way for building speech prosthetics.

“We know now that the sensorimotor cortex encodes vocal tract movements, so we can use that knowledge to decode cortical activity and translate that via a speech prosthetic,” Chartier said in a statement. “This would give voice to people who can’t speak but have intact neural functions.”
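Chartier’s quote describes a two-stage decoding idea: map cortical activity to vocal tract movements, then map those movements to sound. A minimal sketch of that pipeline, assuming simple linear decoders and made-up feature dimensions (the study does not specify a prosthetic implementation), might look like the following.

```python
# Illustrative two-stage decoding pipeline, as suggested by the quote:
# cortical activity -> estimated vocal tract movements -> acoustic output.
# The ridge-regression decoders and all dimensions are assumptions for
# illustration, not the study's method.
import numpy as np
from sklearn.linear_model import Ridge

N_ELECTRODES = 256    # assumed ECoG channels
N_ARTICULATORS = 12   # assumed vocal tract movement features
N_ACOUSTIC = 32       # assumed acoustic features (e.g., spectral bands)
T = 1000              # time samples

rng = np.random.default_rng(0)
ecog = rng.standard_normal((T, N_ELECTRODES))          # recorded neural activity
movements = rng.standard_normal((T, N_ARTICULATORS))   # reference articulator traces
acoustics = rng.standard_normal((T, N_ACOUSTIC))       # reference acoustic features

# Stage 1: decode vocal tract movements from cortical activity.
brain_to_movement = Ridge(alpha=1.0).fit(ecog, movements)

# Stage 2: synthesize acoustic features from the decoded movements.
movement_to_sound = Ridge(alpha=1.0).fit(movements, acoustics)

# At "run time," chain the two stages to go from brain activity to sound.
decoded_movements = brain_to_movement.predict(ecog)
decoded_acoustics = movement_to_sound.predict(decoded_movements)
print(decoded_acoustics.shape)   # (1000, 32)
```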

The University Network