Whether through lively classroom discussions, profound conversations with friends, or political debates at family dinners, speech resides at the crux of human connection. Tragic, life-altering events such as stroke, traumatic brain injury, and neurodegenerative diseases such as ALS can cause a loss of speech, often through vocal paralysis. When the nervous system is ravaged by disease or injury, severe paralysis and consequent locked-in syndrome can occur: a patient loses nearly all motor abilities and can communicate only by blinking or other minimal movements, rendering traditional speech aids such as typing and writing tools useless. While years of research have produced several assistive speech devices for patients with severe paralysis, these devices are often extremely limited in vocabulary and offer output that is choppy, slow, and inauthentic. Although they allow patients to communicate to some degree, their shortcomings strip away much of the character and connection a person derives from speech, leaving patients feeling socially and emotionally isolated (Ramsey and Crone, 2023).
A recent study published in Nature has partially succeeded in remedying this issue. Through the development and use of brain-computer interfaces (BCIs), scientists have created a pathway that translates a patient's neural signals into personalized text, speech audio, and an animated, expressive facial avatar. In a case study involving a 47-year-old woman with severe paralysis and complete loss of speech resulting from a brainstem stroke sustained 18 years earlier, researchers Metzger et al. designed and implanted a BCI into the left hemisphere of the patient's brain, centered on her central sulcus and spanning the regions associated with speech production and language perception. The first of its kind, this BCI harnesses electrocorticography (ECoG) to decode neural signals for attempted speech and vocal tract movements into corresponding words, phrases, and facial expressions. Like the more familiar and established electroencephalography (EEG), which uses small electrodes attached to the scalp to monitor the brain's electrophysiological activity, electrocorticography records electrical signals, but its electrodes are placed directly on the exposed surface of the brain ("Electrocorticography"). The BCI used in this study contains a high-density array of 253 individual ECoG electrodes connected to a percutaneous pedestal connector, which carries the signals out to a computer interface where they can be decoded and displayed (Metzger et al., 2023).
Following surgical implantation, the BCI and its connector were hooked up to a computer running deep learning models that had been trained to predict, from the patient's neural activity, the probabilities of individual phones (speech sounds), silences, and articulatory gestures, which could then be assembled into words and phrases. The decoding network also incorporated a connectionist temporal classification (CTC) loss function, a probability-based technique that aligns the network's predictions with the timing of the attempted speech, allowing word order and pauses between words to be identified. After processing through this system, the decoded signals could be translated into letters, discrete speech units, or discrete articulatory gestures, producing artificial text, synthesized speech, and facial expressions, respectively (Fig. 1, Metzger et al., 2023).
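To give a concrete sense of what CTC-style decoding involves, the sketch below is a minimal, illustrative example (not the authors' actual code): a CTC-trained network emits, at each time step, its best guess among output units plus a special "blank" symbol, and the decoder recovers the sequence by collapsing consecutive repeats and dropping blanks. The symbol names and the letter-level example here are hypothetical choices for illustration.

```python
# Illustrative sketch of greedy CTC decoding (not the study's implementation).
# At each time step, the network's most probable symbol is taken; the CTC
# rule then collapses consecutive repeats and removes the blank symbol,
# so variable-length neural activity maps onto a shorter output sequence.

BLANK = "_"  # hypothetical blank symbol marking "no new unit emitted"

def ctc_greedy_decode(per_step_symbols):
    """Collapse repeated symbols, then drop blanks, per the CTC rule."""
    decoded = []
    prev = None
    for symbol in per_step_symbols:
        # Emit a symbol only when it differs from the previous step
        # (repeats are one unit stretched over time) and is not a blank.
        if symbol != prev and symbol != BLANK:
            decoded.append(symbol)
        prev = symbol
    return "".join(decoded)

# Example: ten time steps of per-step predictions collapse to one word.
steps = ["h", "h", "_", "e", "l", "l", "_", "l", "o", "o"]
print(ctc_greedy_decode(steps))  # -> "hello"
```

Note how the blank between the two "l" runs lets the decoder distinguish a genuinely doubled letter from one letter held across several time steps; an analogous mechanism lets a CTC-trained decoder mark pauses between words without needing hand-labeled timing.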
This novel neuroprosthesis provides an unprecedented degree of personalization and authenticity in communication for patients with severe paralysis. While previous assistive speech technology could at best reach a rate of 14 words per minute, the BCI used in the case study averaged 78 words per minute, far closer to the typical adult speaking rate of roughly 150 words per minute. Additionally, the synthesized speech can be personalized to resemble the patient's voice from before her vocal paralysis, and the decoded facial expressions can be projected onto an avatar resembling the patient, adding another layer of personalization (Metzger et al., 2023).
Innovations in speech-targeted neuroprosthetic technology have the potential to change the lives of thousands of people living with severe paralysis. Naturally, these solutions are not perfect: the error rate of this particular BCI is approximately 20% for direct text decoding and 50% for direct speech decoding (Ramsey and Crone, 2023). Although these figures may sound high, they represent a remarkable improvement over more established technologies and point to the vast potential of BCIs in the years ahead. Additionally, while this work is still in its early stages, several similar BCIs are being developed for the same purpose (Willett et al., 2023). As research continues, it will be fascinating to observe the effects of this revolutionary technology on patients' lives.
References
Electrocorticography: An overview. (n.d.). ScienceDirect Topics. Retrieved November 4, 2023, from https://www.sciencedirect.com/topics/neuroscience/electrocorticography
Metzger, S. L., Littlejohn, K. T., Silva, A. B., Moses, D. A., Seaton, M. P., Wang, R., Dougherty, M. E., Liu, J. R., Wu, P., Berger, M. A., Zhuravleva, I., Tu-Chan, A., Ganguly, K., Anumanchipalli, G. K., & Chang, E. F. (2023). A high-performance neuroprosthesis for speech decoding and avatar control. Nature, 620(7976), 1037–1046. https://doi.org/10.1038/s41586-023-06443-4
Ramsey, N. F., & Crone, N. E. (2023). Brain implants that enable speech pass performance milestones. Nature, 620(7976), 954–955. https://doi.org/10.1038/d41586-023-02546-0
Willett, F. R., Kunz, E. M., Fan, C., Avansino, D. T., Wilson, G. H., Choi, E. Y., Kamdar, F., Glasser, M. F., Hochberg, L. R., Druckmann, S., Shenoy, K. V., & Henderson, J. M. (2023). A high-performance speech neuroprosthesis. Nature, 620(7976), 1031–1036. https://doi.org/10.1038/s41586-023-06377-x