
Bowdoin Science Journal


Computer Science and Tech

AI for Language and Cultural Preservation

December 11, 2025 by Wing Kiu Lau '26


Abstract

Nearly half of the world’s languages face extinction, threatening irreplaceable knowledge and cultural connections. This paper examines how artificial intelligence can support endangered language documentation and revitalization when guided by community priorities. Through case studies, from Hawaiian speech recognition to Cherokee learning platforms, the paper identifies both opportunities (improved access, engaging tools, cross-distance connection) and challenges (privacy risks, cultural appropriation, misinformation, sustainability). The central argument: effective preservation requires community leadership, robust consent frameworks, and sustained support rather than commodified technological quick fixes. The paper concludes with principles for responsible AI use that strengthens living languages and their cultural contexts.

Introduction

Approximately 40% of the world’s 6,700 languages risk extinction as speaker populations decline (Jampel, 2025). This crisis extends beyond communication loss. Languages embody cultural identity, historical memory, and community bonds. When languages disappear, speakers lose direct access to ancestral knowledge, particularly where oral histories predominate. Research suggests linguistic heritage connection correlates with improved adolescent mental health outcomes and reduced rates of certain chronic conditions.

Artificial intelligence has emerged as one approach to preservation, offering unprecedented documentation scale and interactive learning platforms. However, concerns persist, ranging from environmental costs to whether technology can authentically serve community needs.

This paper argues that while AI provides powerful tools for endangered language work through natural language processing and speech recognition, success depends on careful integration with Indigenous communities’ priorities, values, and active participation.

It first examines AI’s technical foundations, real-world applications and case studies, diverse stakeholder perspectives, and future promises and challenges facing the field.

Technical Foundations: How AI Works in Language Preservation 

To understand how AI contributes to language preservation, it is helpful to see how Natural Language Processing provides the foundational methods for analyzing language, while modern language models apply these methods to learn from data and produce meaningful representations and outputs that support language preservation-focused tasks.

Natural Language Processing

Natural Language Processing (NLP) sits at the core of how AI is used to process and manipulate text. Two key areas inform language documentation: Computational Linguistics (developing tools and methods for analyzing language data) and Semantics (studying how meaning operates in language).

Semantics broadly concerns deriving meaning from language. It spans the linguistic side, where it handles lexical and grammatical meaning tied to computational linguistics, and the philosophical side, which examines distinctions between fact and fiction, emotional tone (e.g., positive, neutral, negative), and relationships between different corpora (Ali et al., 2025, p. 133).

Semantics can address challenges like word ambiguity, where "lie" could mean a falsehood or the act of resting horizontally. Computational Linguistics tools like Bidirectional Encoder Representations from Transformers (BERT) use contextual analysis to disambiguate such terms. Other common challenges include capturing idiomatic meanings in translation (Ali et al., 2025, p. 134).

Computational Linguistics began in the 1940s-50s, but recent advancements have driven major developments in NLP through Machine Learning and Deep Learning. Neural networks, which are data-processing structures inspired by the human brain, allow machines to learn from sample data to perform complex tasks by recognizing, classifying, and correlating patterns.

In more recent years, Generative AI has gained prominence. It relies on transformer architectures, a type of neural network that analyzes entire sequences simultaneously to determine which parts are most important, enabling effective learning from large datasets.

In short, NLP implementation involves preprocessing textual data through steps such as tokenization (breaking text into smaller units), stemming or lemmatization (reducing words to their root forms, e.g., talking → talk), and stop-word removal (eliminating common or low-value words like and, for, with). The processed data is then used to train models for specific tasks.
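To make these steps concrete, here is a minimal, self-contained sketch of the pipeline described above (a toy illustration; real projects typically rely on NLP libraries such as NLTK or spaCy, and the stop-word list and suffix rules here are deliberately simplified assumptions):

```python
# Toy sketch of tokenization, lemmatization, and stop-word removal.
# The stop-word list and suffix rules are simplified stand-ins for what an
# NLP library would provide.
import re

STOP_WORDS = {"and", "for", "with", "the", "was", "about"}  # tiny illustrative list

def tokenize(text: str) -> list[str]:
    """Tokenization: break text into lowercase word units."""
    return re.findall(r"[a-z]+", text.lower())

def lemmatize(token: str) -> str:
    """Crude lemmatization stand-in: strip a few common suffixes (talking -> talk)."""
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token

def preprocess(text: str) -> list[str]:
    tokens = tokenize(text)
    lemmas = [lemmatize(t) for t in tokens]
    return [t for t in lemmas if t not in STOP_WORDS]  # stop-word removal

print(preprocess("She was talking with the elders about the old stories."))
# -> ['she', 'talk', 'elder', 'old', 'storie']
```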

Common NLP applications relevant to language preservation include part-of-speech tagging, which labels words in a sentence based on their grammatical roles (e.g., nouns, verbs, adjectives, adverbs); word-sense disambiguation, which resolves multiple possible meanings of a word; speech recognition, which converts spoken language into text; machine translation, which enables translation between languages; sentiment analysis, which identifies emotional tone in text; and automatic resource mining, which involves the automated collection of linguistic resources (Amazon Web Services, n.d.).

Language Models 

BERT, developed by Google, is trained mainly with masked language modeling, where it predicts missing words from surrounding context. The original BERT also included a next sentence prediction task to judge whether one sentence follows another, although many modern variants modify or omit this objective (BERT, n.d.). Multilingual BERT (MBERT) extends this ability to multiple languages (Ali et al., 2025, p. 136).
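For readers curious what masked language modeling looks like in practice, the Hugging Face transformers library exposes pretrained BERT through a fill-mask pipeline. The sketch below is illustrative only; the example sentence and the model's completions are not drawn from any of the cited work:

```python
# Masked language modeling with a pretrained BERT
# (assumes `pip install transformers torch`; downloads the model on first run).
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

# BERT predicts the hidden [MASK] token from its bidirectional context.
for candidate in fill_mask("Nearly half of the world's [MASK] face extinction."):
    print(f"{candidate['token_str']:>12}  score={candidate['score']:.3f}")
```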

Building on these advances, Cherokee researchers are applying and extending NLP techniques to advance language preservation and revitalization. According to Dr. David Montgomery, a citizen of the Cherokee Nation, “It would be a great service to Cherokee language learners to have a translation tool as well as an ability to draft a translation of documents for first-language Cherokee speakers to edit as part of their translation tasks” (Zhang et al., 2022, p. 1535).

To realize this potential, the research effort focuses on adapting existing NLP frameworks and creating tools specifically suited to Cherokee. Effective data collection and processing depend on capabilities such as automatic language identification and multilingual embedding models. For example, aligning Cherokee and English texts requires projecting sentences from both languages into a common semantic space to evaluate their similarity. These are capabilities that most standard NLP tools don’t provide and must be custom-built for this context (Zhang et al., 2022, p. 1535).
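One common way to project sentences from two languages into a shared semantic space is with a multilingual sentence-embedding model and cosine similarity. The sketch below uses an off-the-shelf multilingual model and Spanish sentences purely as stand-ins; as the authors note, the actual Cherokee-English models must be custom-built:

```python
# Cross-lingual sentence alignment via a shared embedding space
# (assumes `pip install sentence-transformers`; the model and sentences are
# illustrative stand-ins, not the custom Cherokee-English setup described above).
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")

english = ["The children are learning the old songs.",
           "The river rises in spring."]
candidates = ["Los niños están aprendiendo las canciones antiguas.",
              "El río crece en primavera."]

emb_en = model.encode(english, convert_to_tensor=True)
emb_cand = model.encode(candidates, convert_to_tensor=True)

# Cosine similarity in the shared space scores likely translation pairs.
print(util.cos_sim(emb_en, emb_cand))
```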

Real-World Applications and Case Studies 

Broadly speaking, researchers and developers are creating innovative AI solutions to support language preservation across communities.

For example, the First Languages A.I. Reality (FLAIR) Initiative develops adaptable AI tools for Indigenous language revitalization worldwide. Co-founder Michael Running Wolf (Northern Cheyenne Tribe) describes the project’s goal as increasing the number of active speakers through accessible technologies. One notable product, “Language in a Box,” is a portable, voice-based learning system that delivers customizable guided lessons for different languages (Jampel, 2025).

Indigenous scientists are also creating culturally grounded AI tools for youth engagement. Danielle Boyer developed Skobot, a talking robot designed to speak Indigenous languages (Jampel, 2025), while Jacqueline Brixey created Masheli, a chatbot that communicates in both English and Choctaw. Brixey notes that despite more than 220,000 enrolled Choctaw Nation members, fewer than 7,000 are fluent speakers today (Brixey, 2025).

Students with Skobots on their shoulders stand next to Danielle Boyer (The STEAM Connection, n.d.)

Hawaiian Language Revitalization – ASR 

A collaboration between The MITRE Corporation, University of Hawai‘i at Hilo, and University of Oxford explored Automatic Speech Recognition (ASR) for Hawaiian, a low-resource language. Using dozens of hours of labeled audio and millions of pages of digitized Hawaiian newspaper text, researchers fine-tuned models such as Whisper (large and large-v2), achieving a Word Error Rate (WER) of about 22% (Chaparala et al., 2024, p. 4). This is promising for research and assisted workflows, but it remains challenging for beginner and intermediate learners without human review.
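Word Error Rate is the word-level edit distance between the model's transcript and a human reference, divided by the length of the reference. A minimal sketch of the computation (standard dynamic programming; the example phrase is invented, and this is not the evaluation code from the Hawaiian study):

```python
# Word Error Rate: Levenshtein distance over words, divided by reference length.
def word_error_rate(reference: str, hypothesis: str) -> float:
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between the first i reference words and first j hypothesis words
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,         # deletion
                           dp[i][j - 1] + 1,         # insertion
                           dp[i - 1][j - 1] + cost)  # substitution
    return dp[len(ref)][len(hyp)] / len(ref)

# A WER near 0.22 means roughly one word in five needs correction.
print(word_error_rate("ka ua nui o hilo", "ka ua nui hilo"))  # 0.2
```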

The models struggled with key phonetic features, particularly the glottal stop (ʻokina ⟨ʻ⟩) and vowel length distinctions, due to their subtle acoustic properties. Occasionally, the model substituted spaces for glottal stops, potentially due to English linguistic patterns where glottal stops naturally occur before vowels that begin words. Hawaiian’s success with Whisper benefited from available training data, including 338 hours of Hawaiian and 1,381 hours of Māori, and its Latin-based alphabet. Other under-resourced languages lacking such advantages may face greater transcription challenges (Chaparala et al., 2024, p. 4).

Missing Scripts Initiative – Input Methods 

The Missing Scripts Initiative, led by ANRT (National School of Art and Design, France) in collaboration with UC Berkeley’s Script Encoding Initiative and the University of Applied Sciences, Mainz, addresses a major gap: nearly half of the world’s writing systems lack digital representation.

Launched in 2024 as part of the International Decade of Indigenous Languages, the initiative recognizes that beyond simply encoding these scripts into standard formats, there is the need to create functional input methods that allow users to type and interact with these writing systems. Developing these digital typefaces requires collaboration among linguists, developers, and native speakers. The initiative’s primary objectives involve encoding these scripts, a standardization process that assigns unique numerical identifiers to each character, and producing digital fonts. This work supports UNESCO’s global efforts to preserve and revitalize Indigenous linguistic heritage (UNESCO, n.d.).
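Encoding a script means assigning each character a stable numerical identifier (a Unicode code point) that fonts, keyboards, and software can all agree on. The snippet below illustrates what that provides using the Cherokee syllabary, a script that is already encoded; the Missing Scripts work targets scripts that still lack such assignments:

```python
# Each encoded character carries a unique code point that software can rely on.
# The Cherokee syllabary (already in Unicode) illustrates what encoding provides.
for ch in "ᏣᎳᎩ":  # "Tsalagi" (Cherokee) written in the syllabary
    print(f"{ch}  U+{ord(ch):04X}  utf-8: {ch.encode('utf-8').hex(' ')}")
```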

Full computational process of translating Afáka, the script of the Ndyuka language (an English-based creole of Suriname), into English at the Missing Scripts Program (The Missing Scripts, n.d.)

Cherokee Case Study – Tokenization & Community-based Language Learning

Researchers at UNC Chapel Hill found that Cherokee’s strong morphological structure, where a single word can express an entire English sentence, poses unique NLP challenges. Character-level modeling using Latin script proved more effective than traditional word-level tokenization. Moreover, because Cherokee’s word order varies depending on discourse context, translating entire documents at once may be more effective than translating one sentence at a time (Zhang et al., 2022, p. 1535-1536).
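The contrast between word-level and character-level tokenization is easy to see in a toy example: every new inflected form is a brand-new symbol to a word-level model, while a character-level model keeps reusing a small alphabet. The transliterated strings below are invented stand-ins, not vetted Cherokee words:

```python
# Toy contrast between word-level and character-level vocabularies.
# The transliterated forms are illustrative stand-ins, not vetted Cherokee words.
corpus = ["gadanvtesgvi", "gadanvtesgv", "uwadanvtesgvi"]

word_vocab = set(corpus)                             # grows with every new inflected form
char_vocab = {ch for word in corpus for ch in word}  # stays small and reusable

print(len(word_vocab), sorted(word_vocab))
print(len(char_vocab), sorted(char_vocab))
```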

Beyond technical modeling, researchers emphasized community-driven learning platforms that combine human input with AI. Inspired by systems like Wikipedia and Duolingo, these collaborative tools crowdsource content from speakers and learners. These platforms address two critical challenges simultaneously: the scarcity of training data for endangered languages and the resulting limitations in model performance. This approach transforms language learning from an individual task into a collective effort aimed at cultural preservation (Zhang et al., 2022, p. 1532).

Community Perspectives: Strengths and Concerns 

A study by Akdeniz University researchers examined community perspectives on AI for language preservation, highlighting both benefits and challenges. 

Strengths

Community members emphasized the transformative role of mobile apps in democratizing access: “Mobile apps have democratized access to our language, allowing learners from geographically dispersed areas to engage with it daily.” Interactive games and voice recognition tools make learning more engaging and accessible, while digital platforms foster connection and belonging among geographically dispersed speakers.

Translation tools and automated content generation have also proven valuable, with one linguist commenting that these technologies have been “game-changers in making our stories universally accessible.” Participants also underscored the value of cross-disciplinary collaboration, with one project manager noting that partnerships between tech developers and Indigenous communities have “opened new pathways for innovation.” AI’s adaptability was seen as another strength, allowing solutions to be customized for each language community, for example by prioritizing translation tools over transcription systems depending on local needs (Soylu & Şahin, 2024, p. 15).

Concerns

Participants also voiced serious concerns about ethics, privacy, and cultural sensitivity. One community leader stressed the importance of ensuring that “these technologies respect our cultural values and the integrity of our languages.” Limited internet infrastructure, funding instability, and intergenerational gaps remain ongoing barriers. As another participant observed, “Bridging the gap between our elders and technology is ongoing work.” Long-term sustainability depends on reliable funding and culturally informed consent practices (Soylu & Şahin, 2024, p. 15).

The Human and Cultural Dimensions 

Focusing on specific themes and perspectives, Indigenous innovators emphasize that AI cannot replace human elders and tradition keepers. Technology should complement traditional practices like classes and intergenerational transmission. “Language is a living thing,” requiring living speakers, cultural context, and human relationships (Jampel, 2025).

Language preservation carries profound emotional and cultural significance. It is not merely the deployment of ‘fancy technology’ but usually a response to the deep wounds caused by historical oppression, including forced assimilation, the systematic suppression of Indigenous languages, and the displacement of communities from their ancestral lands (Brixey, 2025). For many, language revitalization is not just an educational effort but an act of cultural healing and the restoration of what was forcibly taken.

Critical Concerns and Emerging Risks

Beyond community-identified challenges, broader concerns about AI’s role in language preservation have emerged, particularly regarding quality control and misinformation. In December 2024, the Montreal Gazette reported the sale of AI-generated “how-to” books for endangered languages, including Abenaki, Mi’kmaq, Mohawk, and Omok (a Siberian language extinct since the 18th century). These books contained inaccurate translations and fabricated content, which Abenaki community members described as demeaning and harmful, undermining both learners’ efforts and trust in legitimate revitalization work (Jiang, 2025). 

Many Indigenous communities also remain cautious about adopting AI. Jon Corbett, a Nehiyaw-Métis computational media artist and professor at Simon Fraser University, noted that some communities “don’t see the relevance to our culture, and they’re skeptical and wary of their contribution. Part of that is that for Indigenous people in North America, their language has been suppressed and their culture oppressed, so they’re weary of technology and what it can do” (Jiang, 2025). This caution reflects historical trauma and highlights critical questions about control, ownership, and ethical deployment of AI in cultural contexts.

Toward Ethical and Decolonized Approaches

Scholars emphasize decolonizing speech technology—respecting Indigenous knowledge systems rather than imposing Western frameworks. In 2019, Onowa McIvor and Jessica Ball, affiliated with the University of Victoria in Canada, underscored community-level initiatives supported by coherent policy and government backing (Soylu & Şahin, 2024, p. 13).

Before developing computational tools, speaker communities’ basic needs must be met: “respect, reciprocity, and understanding.” Researchers must avoid treating languages as commodities or prioritizing dataset size over community wellbeing. Common goals must be established before research begins. Only through such groundwork can AI technologies truly serve language revitalization rather than becoming another tool of extraction and exploitation (Zhang et al., 2022, p. 1531).

These perspectives reveal that while technology offers promising pathways for language revitalization, success depends fundamentally on addressing both technical and sociocultural barriers through genuinely community-centered approaches that honor the living, relational nature of language itself. 

Future Challenges and Considerations

The Low-Resource Language Challenge

A key obstacle in applying AI to endangered languages is the lack of large training datasets. High-resource languages like English and Spanish rely on millions of parallel sentence pairs for accurate translation (Jampel, 2025), but many endangered languages have limited or no written resources. Some lack a script entirely, requiring more intensive dataset curation and multimodal approaches.

To address this, Professor Jacqueline Brixey and Dr. Ron Artstein compiled a dataset combining audio, video, and text, with many texts translated into English, allowing models to leverage multiple modalities (Brixey, 2025). Similarly, Jared Coleman at Loyola Marymount University is developing translation tools for Owens Valley Paiute, a “no-resource” language with no public datasets. His system first teaches grammar and vocabulary to the model, then has it translate using this foundation, mimicking human strategies when working with limited data. Coleman emphasizes: “Our goal isn’t perfect translation but producing outputs that accurately convey the user’s intended meaning” (Jiang, 2025).  

Capturing Linguistic and Cultural Features

Major models like ChatGPT perform poorly with Indigenous languages. Brixey notes: “ChatGPT could be good in Choctaw, but it’s currently ungrammatical; it shares misinformation about the tribe” (Jampel, 2025). Models fail to understand cultural nuance or privilege dominant culture perspectives, potentially mishandling sensitive information. These failures underscore the need for better security controls and validation mechanisms to mitigate the potential harm of linguistic misinformation. 

Technical challenges extend to basic digitization processes as well. For example, most Cherokee textual materials exist as physical manuscripts or printed books, which are readable by humans but not machine-processable. This limits applications such as automated language-learning tools. Optical Character Recognition (OCR), using systems like Tesseract-OCR and Google Vision OCR, can convert these materials into machine-readable text with reasonable accuracy. However, OCR performance is highly sensitive to image quality. Texts with cluttered layouts or illustrations, common in children’s books, often yield lower recognition rates, posing ongoing challenges for digitization and digital preservation efforts (Zhang et al., 2022, p. 1536).
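In practice, running OCR over a scanned page can be done through Tesseract's Python wrapper. The sketch below is generic: the file path is a placeholder, and transcribing Cherokee-language material would require a Cherokee-trained recognition model rather than the English one shown here:

```python
# Minimal OCR sketch with pytesseract (assumes `pip install pytesseract pillow`
# and a local Tesseract installation). Path and language code are placeholders.
from PIL import Image
import pytesseract

page = Image.open("scanned_page.png")  # hypothetical scanned manuscript page
text = pytesseract.image_to_string(page, lang="eng")  # swap in a Cherokee-trained model for Cherokee text
print(text)
```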

Ethical and Governance Issues

The exploitation of Indigenous languages has deep historical roots that continue to shape debates on AI development. In 1890, anthropologist Jesse Walter Fewkes recorded Passamaquoddy stories and songs, some sacred and meant to remain private, but the community was denied access for nearly a century, highlighting longstanding issues of linguistic sovereignty (Jampel, 2025).

More recently, in late 2024, the Standing Rock Sioux Tribe sued an educational company for exploiting Lakota recordings without consent, profiting from tribal knowledge, and demanding extra fees to restore access (Jampel, 2025). 

In response, researchers like Brixey and Boyer implement protective measures, allowing participants to withdraw recordings and exclude their knowledge from AI development. These practices uphold data sovereignty, ensuring Indigenous communities retain control over their cultural knowledge and limiting commercialization. There is also a strong emphasis on keeping these technologies within Indigenous communities, preventing them from being commercialized or sold externally (Jampel, 2025).

As such, AI for language preservation requires clear policies for data governance and ethics. Some projects illustrate how AI can be ethically applied. New Zealand’s Te Hiku Media “Kōrero Māori” project uses AI for Māori language preservation under the Kaitiakitanga license, which forbids misuse of local data. CTO Keoni Mahelona emphasizes working with elders to record voices for transcription, demonstrating that AI tools can support Indigenous languages while respecting cultural values and community control (Jiang, 2025). Balancing technological openness with cultural sensitivity remains essential.

Resource and Infrastructure Needs

Beyond technical and ethical challenges, practical resource constraints significantly limit the scope and sustainability of language preservation initiatives. Securing funding for long-term projects remains one of the most persistent obstacles, as language revitalization requires sustained commitment over decades rather than short-term grant cycles. Training represents another critical need: communities require skilled teachers, technology experts, and materials developers who understand both the technical systems and the cultural context.

Infrastructure gaps pose fundamental barriers to participation. Many Indigenous communities lack reliable internet access and technology availability, limiting who can engage with digital language tools. Even when technologies are developed, communities need training to use and maintain AI tools independently, ensuring that these systems serve rather than create dependencies. Addressing these resource and infrastructure needs is essential for moving from pilot projects to sustainable, community-controlled language preservation ecosystems.

Conclusion

AI and NLP technologies hold significant promise for language preservation, addressing a critical need as many languages approach extinction due to declining numbers of speakers.

However, these technologies face inherent technical limitations. Low-resource languages often lack sufficient written materials or even a formal script, making model training difficult. LLMs trained primarily on English and other major languages struggle to capture the lexical, grammatical, and semantic nuances of endangered languages.

Equally important is the role of communities. Successful preservation depends on Indigenous leadership, ethical oversight, sustained collaboration, and adequate funding. AI should not be seen as a replacement for human knowledge but as one tool among many in a broader preservation toolkit.

Ultimately, digital preservation empowers communities to maintain and revitalize their linguistic heritage. Languages are living systems that thrive through active human relationships, and technology’s role is to support, not replace, these connections between people, language, and culture.

Bibliography 

Ali, M., Bhatti, Z. I., & Abbas, T. (2025). Exploring the Linguistic Capabilities and Limitations of AI for Endangered Language preservation. Journal of Development and Social Sciences, 6(2), 132–140. https://doi.org/10.47205/jdss.2025(6-II)12

BERT. (n.d.). Retrieved November 6, 2025, from https://huggingface.co/docs/transformers/en/model_doc/bert

Brixey, J. (Lina). (2025, January 22). Using Artificial Intelligence to Preserve Indigenous Languages—Institute for Creative Technologies. https://ict.usc.edu/news/essays/using-artificial-intelligence-to-preserve-indigenous-languages/

Chaparala, K., Zarrella, G., Fischer, B. T., Kimura, L., & Jones, O. P. (2024). Mai Ho’omāuna i ka ’Ai: Language Models Improve Automatic Speech Recognition in Hawaiian (arXiv:2404.03073). arXiv. https://doi.org/10.48550/arXiv.2404.03073

UNESCO. (n.d.). Digital preservation of Indigenous languages: At the intersection of technology and culture. Retrieved November 6, 2025, from https://www.unesco.org/en/articles/digital-preservation-indigenous-languages-intersection-technology-and-culture

Jampel, S. (2025, July 31). Can A.I. Help Revitalize Indigenous Languages? Smithsonian Magazine. https://www.smithsonianmag.com/science-nature/can-ai-help-revitalize-indigenous-languages-180987060/

Jiang, M. (2025, February 22). Preserving the Past: AI in Indigenous Language Preservation. Viterbi Conversations in Ethics. https://vce.usc.edu/weekly-news-profile/preserving-the-past-ai-in-indigenous-language-preservation/

Soylu, D., & Şahin, A. (2024). The Role of AI in Supporting Indigenous Languages. AI and Tech in Behavioral and Social Sciences, 2(4), 11–18. https://doi.org/10.61838/kman.aitech.2.4.2

Students with Skobots on their shoulders stand next to Danielle Boyer. The STEAM Connection. (n.d.). [Graphic]. Retrieved November 6, 2025, from https://th-thumbnailer.cdn-si-edu.com/8iThtG8bZkWMxUq0goedXRXlzio=/fit-in/1072×0/filters:focal(616×411:617×412)/https://tf-cmsv2-smithsonianmag-media.s3.amazonaws.com/filer_public/a5/f3/a5f3877c-f738-423d-bcff-8a44efcbe48f/danielle-boyer-and-student-wearing-skobots_web.jpg

The Missing Scripts. (n.d.). [Graphic]. Retrieved November 6, 2025, from https://sei.berkeley.edu/the-missing-scripts/

Amazon Web Services. (n.d.). What is NLP? – Natural Language Processing Explained. Retrieved November 6, 2025, from https://aws.amazon.com/what-is/nlp/

Zhang, S., Frey, B., & Bansal, M. (2022). How can NLP Help Revitalize Endangered Languages? A Case Study and Roadmap for the Cherokee Language. In S. Muresan, P. Nakov, & A. Villavicencio (Eds.), Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (pp. 1529–1541). Association for Computational Linguistics. https://doi.org/10.18653/v1/2022.acl-long.108

 

Filed Under: Computer Science and Tech Tagged With: AI, AI ethics, artificial intelligence, Computer Science and Tech, nlp, Technology

Ethical ramifications of AI-powered medical diagnoses

December 7, 2025 by Mauricio Cuba Almeida '27

Incredible advancements in artificial intelligence (AI) have recently paved the way for the use of AI in healthcare settings. Implementation of AI has the potential to address worker shortages in the medical field, lead to the discovery of new drugs, or improve diagnoses (Bajwa et al., 2021). Benji Feldheim, a writer for the American Medical Association, applauds AI for restoring the “human side” of medicine. AI scribes, for example, ease the documentation burden doctors face—reducing burnout and improving doctors’ interactions with patients as a result (Feldheim, 2025). Another example is the AI model developed by Shmatko et al. (2025), known as Delphi-2M, which is capable of accurately predicting a patient’s next 20 years of disease burden (i.e., what diseases they would contract and when). Evidently, AI is a promising technology already capable of improving lives; however, there are reasons to be skeptical, as these uses of AI also raise concerns about fairness and clinical safety. After a brief synopsis of Shmatko et al.’s Delphi-2M, I evaluate the ethical ramifications of AI-powered diagnoses and related clinical tools.

Delphi-2M is an AI model trained on over 400,000 patient histories from a UK database to forecast an individual’s 20-year disease trajectory. Similar to chatbots like ChatGPT, Delphi-2M is a large language model (LLM), a type of AI that can recognize and reproduce patterns from large amounts of data. Just as chatbots pick up on which words are likely to appear with other words in order to form sentences, Delphi-2M learns from its vast training set of medical records to predict a patient’s disease trajectory from real-world patterns. As Yonghui Wu puts it in her summary of Shmatko et al.’s work, it is much like how becoming a smoker may be followed by a future diagnosis of lung cancer—these are the patterns Delphi-2M recognizes. To do this, Delphi-2M is fed “tokens” that link diseases or health factors to specific times in a person’s life, like chickenpox at age 2 or smoking at age 41 (Figure 1). Then, Delphi-2M outputs new tokens that predict which diseases will occur in an individual’s life and when, like the onset of respiratory disorders at age 71 as a result of smoking. After being trained, Delphi-2M was tested by predicting the medical histories of 1.9 million patients not included in the original training set. Shmatko et al. demonstrate this AI to have great success in accurately predicting disease trajectory, as it partially predicts patterns in individuals’ diagnoses in 97% of cases.

Visualization of Delphi-2M input and output (Wu, 2025).
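The token idea is easier to picture with a small sketch. The encoding below is hypothetical and only illustrates the concept described above; it is not Shmatko et al.'s actual data format:

```python
# Hypothetical sketch of a patient history as (age, event) tokens.
# This illustrates the idea described in the text, not Delphi-2M's real format.
history = [
    (2, "chickenpox"),
    (41, "smoking"),
    (55, "hypertension"),
]

# A generative model trained on many such sequences is asked to continue it,
# e.g. predicting something like (71, "respiratory disorder") as the next token.
prompt = [f"age_{age}:{event}" for age, event in history]
print(prompt)
```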

Nonetheless, we must hold AI used to diagnose patients to a higher level of scrutiny than AI used commercially. LLMs are not perfect, as they are subject to algorithmic bias and misuse beginning even before their creation. Shmatko et al. (2025), for example, address some shortcomings of the training data used for Delphi-2M. Notably, they explain that the data, drawn from a mostly white, older subset of the UK population, is not entirely generalizable to very different demographics. Though Shmatko et al. found success testing the model against a Danish database after training it on UK patients, I am still concerned about how Delphi-2M would perform on non-European and younger demographics, or on groups underrepresented in the training data. Facial recognition is a prime example of where AI underperforms when training datasets lack diverse representation: systems designed to recognize faces have historically underperformed on individuals with feminine features or darker skin due to unrepresentative training data (Hardesty, 2018). With this in mind, it is important that training data for diagnostic AI be representative of all demographics prior to widespread implementation.

Furthermore, Cabitza et al. (2017) wrote on some of the unintended consequences of machine learning in healthcare, postulating that widespread implementation of these tools also has the potential to reduce the skill of physicians. Though convenient in the short run, overreliance on AI concerns Cabitza et al., as studies show that physicians aided by AI were less sensitive and accurate in diagnosing patients. Mammogram readers, for instance, were 14% less sensitive in their diagnostics when presented with images marked by computer-aided detection (Povyakalo et al., 2013). Though this study focused on image diagnoses, it is easy to see how widespread use of Delphi-2M could lead to the same problems of deskilling in physicians. Delphi-2M is also exclusively a text-based model, which, as Cabitza et al. detail, means that these diagnosis algorithms do not incorporate crucial contextual elements that are “psychological, relational, social, and organizational” in nature. A real-world example Cabitza et al. described involved an AI model that predicted a lower mortality risk for patients with both pneumonia and asthma than for pneumonia patients without asthma. Knowing that asthma is not a protective factor for pneumonia patients, the researchers traced the discrepant AI output to hospital procedures that admitted pneumonia patients with asthma directly to intensive care, giving them better health outcomes. This missing piece of crucial information, which was difficult to represent in these prognostic models, led to an error a physician would not make. Thus, AI is limited by what information it can train on.

Though these new advancements in healthcare AI are promising, they have their limits. Tools like Delphi-2M spot patterns across vast clinical histories that no single clinician could feasibly track, yet the benefits depend on who is represented in the data, how predictions are explained and used, and whether safeguards are in place when they fail. Before AI is implemented in healthcare, we must demand representative training sets, validation across diverse populations, clear disclosures of uncertainty and limitations, and constant human involvement in a process that resists automation bias and deskilling. In short, diagnostic AI should supplement—not replace—clinical judgment, and it should be developed with privacy, equity, and patient trust at the forefront. Only then will these systems reliably improve care rather than merely appear to.

 

References

Bajwa, J., Munir, U., Nori, A., & Williams, B. (2021). Artificial intelligence in healthcare: transforming the practice of medicine. Future Healthcare Journal, 8(2), e188–e194. https://doi.org/10.7861/fhj.2021-0095

Cabitza, F., Rasoini, R., & Gensini, G. F. (2017). Unintended consequences of machine learning in medicine. JAMA, 318(6), 517. https://doi.org/10.1001/jama.2017.7797

Feldheim, B. (2025, June 12). AI scribes save 15,000 hours—and restore the human side of medicine. American Medical Association. https://www.ama-assn.org/practice-management/digital-health/ai-scribes-save-15000-hours-and-restore-human-side-medicine

Hardesty, L. (2018, February 11). Study finds gender and skin-type bias in commercial artificial-intelligence systems. MIT News. https://news.mit.edu/2018/study-finds-gender-skin-type-bias-artificial-intelligence-systems-0212

Povyakalo, A. A., Alberdi, E., Strigini, L., & Ayton, P. (2013). How to Discriminate between Computer-Aided and Computer-Hindered Decisions. Medical Decision Making, 33(1), 98–107. https://doi.org/10.1177/0272989x12465490

Wu, Y. (2025). AI uses medical records to accurately predict onset of disease 20 years into the future. Nature, 647(8088), 44–45. https://doi.org/10.1038/d41586-025-02971-3

Filed Under: Biology, Computer Science and Tech, Psychology and Neuroscience, Science

Biological ChatGPT: Rewriting Life With Evo 2

May 4, 2025 by Jenna Lam '28

What makes life life? Is there underlying code that, when written or altered, can be used to replicate or even create life? On February 19th 2025, scientists from Arc Institute, NVIDIA, Stanford, Berkeley, and UC San Francisco released Evo 2, a generative machine learning model that may help answer these questions. Unlike its precursor Evo 1, which was released a year earlier, Evo 2 is trained on genomic data of eukaryotes as well as prokaryotes. In total, it is trained on 9.3 trillion nucleotides from over 130,000 genomes, making it the largest AI model in biology. You can think of it as ChatGPT for creating genetic code—only it “thinks” in the language of DNA rather than human language, and it is being used to solve the most pressing health and disease challenges (rather than calculus homework).

Computers, defined broadly, are devices that store, process, and display information. Digital computers, such as your laptop or phone, function based on binary code—the most basic form of computer data composed of 0s and 1s, representing a current that is on or off. Evo 2 centers around the idea that DNA functions as nature’s “code,” which, through protein expression and organismal development, creates “computers” of life. Rather than binary, organisms function according to genetic code, made up of A, T, C, G, and U–the five major nucleotide bases that constitute DNA and RNA.

Although Evo 2 can potentially design code for artificial life, it has not yet designed an entire genome and is not being used to create artificial organisms. Instead, Evo 2 is being used to (1) predict genetic abnormalities and (2) generate genetic code.

Functions of Evo 2 in biology at the cellular/organismal, protein, RNA, and epigenome levels. Adapted from https://www.biorxiv.org/content/10.1101/2025.02.18.638918v1.full

Accurate over 90% of the time, Evo 2 can predict which BRCA1 (a gene central to understanding breast cancer) mutations are benign versus potentially pathogenic. This is big, since each gene is composed of hundreds or thousands of nucleotides, and a mutation in any single nucleotide (termed a Single Nucleotide Variant, or SNV) could have drastic consequences for the protein’s structure and function. Thus, being able to computationally pinpoint dangerous mutations reduces the amount of time and money spent testing each mutation in a lab, and paves the way for developing more targeted drugs.

Secondly, Evo 2 can design genetic code for highly specialized and controlled proteins, which provides many fruitful possibilities for synthetic biology (making synthetic molecules using biological systems), from pharmaceuticals to plastic-degrading enzymes. It can generate entire mitochondrial genomes, minimal bacterial genomes, and entire yeast chromosomes–a feat that had not previously been accomplished.

A notable complexity of eukaryotic genomes is their many-layered epigenomic interactions: the complex power of the environment in controlling gene expression. Evo 2 works around this by using models of epigenomic structures, made possible through inference-time scaling. Put simply, inference-time scaling is a technique developed by NVIDIA that allows AI models to take time to “think” by evaluating multiple solutions before selecting the best one.

How is Evo 2 so knowledgeable, despite only being one year old? The answer lies in deep learning.

Just as in Large Language Models, or LLMs (think: ChatGPT, Gemini, etc.), Evo 2 decides what genes should look like by “training” on massive amounts of previously known data. Where LLMs train on previous text, Evo 2 trains on entire genomes of over 130,000 organisms. This training—the processing of mass amounts of data—is central to deep learning. In training, individual pieces of data called tokens are fed into a “neural network”—a fancy name for a collection of software functions that communicate data to one another. As their name suggests, neural networks are modeled after the human nervous system, which is made up of individual neurons that are analogous to software functions. Just like brain cells, “neurons” in the network can both take in information and produce output by communicating with other neurons. Each neural network has multiple layers, each with a certain number of neurons. Within each layer, each neuron sends information to every neuron in the next layer, allowing the model to process and distill large amounts of data. The more neurons involved, the more fine-tuned the final output will be.

This neural network then attempts to solve a problem. Since practice makes perfect, the network attempts the problem over and over; each time, it strengthens the successful neural connections while diminishing others. This is called adjusting parameters: the variables within a model that dictate how it behaves and what it produces. This process minimizes error and increases accuracy. Evo 2 was trained at 7B- and 40B-parameter scales with a 1-million-token context window, meaning the genomic data was fed through many neurons and fine-tuned many times.

Example neural network modeled using TensorFlow, adapted from playground.tensorflow.org
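To make the "predict, measure error, adjust parameters" loop concrete, here is a deliberately tiny numerical sketch of a single layer nudging its weights to reduce error. It bears no resemblance to Evo 2's actual architecture, which is a far larger transformer-style model; the data here is random toy data:

```python
# A deliberately tiny sketch of one layer adjusting its parameters.
# Evo 2 itself is a large transformer-style model; this only illustrates the
# general "predict, measure error, nudge weights" loop described above.
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 4))             # 8 toy examples, 4 input features each
y = (x.sum(axis=1) > 0).astype(float)   # toy target to learn

w = rng.normal(size=4)                  # the layer's adjustable parameters
b = 0.0
lr = 0.1

for step in range(200):
    pred = 1 / (1 + np.exp(-(x @ w + b)))  # forward pass through the "neurons"
    error = pred - y
    w -= lr * x.T @ error / len(y)         # adjust parameters to reduce error
    b -= lr * error.mean()

print(np.round(pred, 2))
print(y)
```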

The idea of anyone being able to create genetic code may spark fear; however, Evo 2’s developers have prevented the model from returning productive answers to inquiries about pathogens, and the dataset was carefully chosen to exclude pathogens that infect humans and other complex organisms. Furthermore, the positive possibilities of Evo 2 likely extend far beyond what we are currently aware of: scientists believe Evo 2 will advance our understanding of biological systems by generalizing across massive genomic data of known biology. This may reveal higher-level patterns and unearth more biological truths from a bird’s-eye view.

It’s important to note that Evo 2 is a foundational model, emphasizing generalist capabilities over task-specific optimization. It was intended to be a foundation for scientists to build upon and alter for their own projects. Being open source, anyone can access the model code and training data. Anyone (even you!) can generate their own strings of genetic code with Evo Designer.

Biotechnology is rapidly advancing. For example, DNA origami allows scientists to fold DNA into highly specialized nanostructures of nearly any shape–including smiley faces and a map of China–potentially allowing scientists to use DNA code to design biological robots much smaller than any robot we have today. These tiny robots could target highly specific areas of the body, such as receptors on cancer cells. Evo 2, with its designing abilities, opens up many possibilities for DNA origami design. From gene therapy, to mutation prediction, to miniature smiley faces, it is clear that computation is becoming increasingly important in understanding the most obscure intricacies of life—and we are just at the start.

 

Garyk Brixi, Matthew G. Durrant, Jerome Ku, Michael Poli, Greg Brockman, Daniel Chang, Gabriel A. Gonzalez, Samuel H. King, David B. Li, Aditi T. Merchant, Mohsen Naghipourfar, Eric Nguyen, Chiara Ricci-Tam, David W. Romero, Gwanggyu Sun, Ali Taghibakshi, Anton Vorontsov, Brandon Yang, Myra Deng, Liv Gorton, Nam Nguyen, Nicholas K. Wang, Etowah Adams, Stephen A. Baccus, Steven Dillmann, Stefano Ermon, Daniel Guo, Rajesh Ilango, Ken Janik, Amy X. Lu, Reshma Mehta, Mohammad R.K. Mofrad, Madelena Y. Ng, Jaspreet Pannu, Christopher Ré, Jonathan C. Schmok, John St. John, Jeremy Sullivan, Kevin Zhu, Greg Zynda, Daniel Balsam, Patrick Collison, Anthony B. Costa, Tina Hernandez-Boussard, Eric Ho, Ming-Yu Liu, Thomas McGrath, Kimberly Powell, Dave P. Burke, Hani Goodarzi, Patrick D. Hsu, Brian L. Hie (2025). Genome modeling and design across all domains of life with Evo 2. bioRxiv preprint doi: https://doi.org/10.1101/2025.02.18.638918.

 

Filed Under: Biology, Computer Science and Tech, Science Tagged With: AI, Computational biology

Unsupervised Thematic Clustering for Genre Classification in Literary Texts

May 4, 2025 by Wing Kiu Lau '26

Book genres (Chapterly, 2022)

Summary

In the last decade, computational literary studies have expanded, yet computational thematics remains less explored than areas like stylometry, which focuses on identifying stylistic similarities between texts. A 2024 study by researchers from the Max Planck Institute and the Polish Academy of Sciences investigated the most effective computational methods for measuring thematic similarity in literary texts, aiming to improve automated genre clustering.

Key Findings and Assumptions

  • Key Assumptions: 
    • Text pre-processing to emphasize thematic content over stylistic features could improve genre clustering. 
    • Unsupervised clustering would offer a more scalable and objective approach to genre categorization than manual tagging by humans.
    • Four genres were selected (detective, fantasy, romance, science fiction) because they are comparably broad categories.
    • If the genres are truly distinct in terms of themes, computers should be able to separate them into clusters.
  • Best Performance: The best algorithms were 66-70% accurate at grouping books by genre, showing that unsupervised genre clustering is feasible despite the complexity of literary texts.
  • Text Pre-Processing: Medium and strong levels of text pre-processing significantly improved clustering, while weak pre-processing performed poorly.
  • Which methods worked best: Doc2vec, a method that captures word meaning and context, performed the best overall, followed by LDA (Latent Dirichlet Allocation), which finds major topics in texts. Even the simpler bag-of-words method, which just counts how often words appear, gave solid results.
  • Best way to compare genres: Jensen-Shannon divergence, which compares probability distributions, was the most effective metric, while simpler metrics like Euclidean distance performed poorly for genre clustering.

Methodology 

Sample Selection

The researchers selected canonical books from each of the four genres, ensuring they were from the same time period to control for language consistency.

Sample Pre-Processing and Analysis 

The researchers analyzed all 291 combinations of the techniques in each of the three stages: text pre-processing, feature extraction, and measuring text similarity. 

Stage 1: Different Levels of Text Pre-Processing  

  • The extent to which the text is simplified and cleaned up.
    • Weak → lemmatizing (reducing words to their base or dictionary form, e.g., “running” to “run”) and removing the 100 Most Frequent Words
    • Medium → lemmatizing, using only nouns, adjectives, verbs, and adverbs, and removing character names
    • Strong → same as medium, but also replacing complex words with simpler versions

Stage 2: Identifying Key Text Features through Extraction Methods

  • Transforming pre-processed texts into feature lists (a short code sketch of two of these methods follows this list).
    • Bag-of-Words → Counts how often each word appears.
    • Latent Dirichlet Allocation (LDA) → Tries to discover dominant topics across books.
    • Weighted Gene Co-expression Network Analysis (WGCNA) → A method borrowed from genetics to find clusters of related words.
    • Document-Level Embeddings (doc2vec) → Captures semantic relationships (connections between words based on their meanings (e.g., “dog” and “cat”)) for similarity assessment.
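Below is a small sketch of two of these feature-extraction methods (bag-of-words counts and LDA topic mixtures) using scikit-learn; the three miniature "books" are invented stand-ins for full novels:

```python
# Sketch of bag-of-words and LDA feature extraction with scikit-learn.
# The three "documents" are invented stand-ins for full novels.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

docs = [
    "detective clue murder inspector alibi",
    "dragon quest sword kingdom magic",
    "love letter wedding heart longing",
]

bow = CountVectorizer()
X = bow.fit_transform(docs)                  # bag-of-words: word counts per "book"

lda = LatentDirichletAllocation(n_components=2, random_state=0)
topics = lda.fit_transform(X)                # per-book topic mixtures
print(topics.round(2))
```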

Stage 3: Distance metric (Measuring Text Similarity)

  • Quantifying similarity with metrics. 6 metrics were chosen: 
    • Euclidean, Manhattan, Delta, Cosine Delta, Cosine, Jensen-Shannon divergence 

To minimize the influence of individual books on the clustering results, rather than analyzing the full corpus at once, the researchers used multiple smaller samples. Each sample consisted of 30 books per genre (120 books total), and this sampling process was repeated 100 times for each combination. Additionally, models requiring training (LDA, WGCNA, and doc2vec) were retrained for each sample to reduce potential biases.

Clustering and Validation

The researchers applied Ward’s clustering algorithm to the distances, grouping texts into four clusters based on genre similarity. They then checked how well these clusters matched the actual genres of the books. To do this, they used a scoring system called the Adjusted Rand Index (ARI), which gives a number from 0 (least accurate) to 1 (most accurate).
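The clustering-and-scoring step can be sketched with standard scikit-learn tools (Ward linkage plus the Adjusted Rand Index). The feature matrix below is random stand-in data, not the study's actual features, so the ARI it prints will hover near zero:

```python
# Sketch of the clustering + validation step with scikit-learn.
# X is random stand-in data; in the study it would be per-book feature vectors.
import numpy as np
from sklearn.cluster import AgglomerativeClustering
from sklearn.metrics import adjusted_rand_score

rng = np.random.default_rng(0)
X = rng.normal(size=(120, 50))             # 120 "books", 50 features each
true_genres = np.repeat([0, 1, 2, 3], 30)  # 30 books per genre

labels = AgglomerativeClustering(n_clusters=4, linkage="ward").fit_predict(X)
print(adjusted_rand_score(true_genres, labels))  # near 0 for random data; 1 = perfect match
```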

The results were visualized using a map projection, grouping similar books closer together, and revealing the underlying thematic structures and relationships among the novels.

Core Findings and Figures  

Results  

The best algorithms grouped literary texts with 66-70% accuracy, demonstrating that unsupervised clustering of fiction genres is feasible despite text complexity. Successful methods consistently used strong text pre-processing, emphasizing the importance of text cleaning and simplification to focus more on a book’s themes rather than its writing style.

Among the top features, six of the ten were based on LDA topics, proving its effectiveness in genre classification. Additionally, eight of the best distance metrics used Jensen–Shannon divergence, suggesting it is highly effective for genre differentiation.

Generalizability  

To assess generalizability, five statistical tests were used to analyze interactions between text pre-processing, feature extraction methods, distance metrics, and other factors. These models provided insights into the broader effectiveness of various methods for thematic analysis.

Text Pre-Processing and Genre Clustering  

Text pre-processing improves genre clustering, with low pre-processing performing the worst across all feature types. Medium and strong pre-processing showed similar results, suggesting replacing complex words with simpler words offers minimal improvements in genre recognition. 

The benefits of strong text pre-processing for document embeddings, LDA, and bag-of-words were minimal and inconsistent. The figure below suggests a positive correlation between the number of Most Frequent Words and ARI, as well as between the degree of text pre-processing and ARI. This demonstrates that how we prepare texts matters just as much as what algorithms we use. Moreover, researchers can save time by skipping the step of replacing complex words with simpler ones, since medium and strong pre-processing show similar results.

Fig 1. The influence of the number of Most Frequent Words, used as text features, on the model’s ability to detect themes, measured with ARI (Sobchuk and Šeļa, 2024, Figure 6).

Feature Types and Their Performance  

Doc2vec, which looks at how words relate to each other in meaning, performed best on average, followed by LDA, which remained stable across various settings, such as topic numbers and the number of Most Frequent Words. Perhaps researchers can use this method without excessive parameter tuning. The simple bag-of-words approach performed well despite its low computational cost, perhaps suggesting even basic approaches can compete with more complex models. WGCNA performed the worst on average, suggesting methods from other fields need careful adaptation before use.

LDA Performance and Parameter Sensitivity  

The performance of LDA did not significantly depend on the number of topics or the number of Most Frequent Words being tracked. The key factor influencing thematic classification was text pre-processing, with weak pre-processing significantly reducing ARI scores. Hence, this underscores the need for further research on text pre-processing, given its key role in the effectiveness of LDA and overall genre classification.  

Bag-of-Words Optimization

The effectiveness of Bag-of-Words depended on a balance between text pre-processing and how many Most Frequent Words are tracked. While increases in Most Frequent Words from 1,000 to 5,000 and medium text pre-processing significantly improved accuracy scores, further increases provided minimal gains. This ‘sweet spot’ means projects can achieve good results without maxing out computational resources, making computational thematics more accessible to smaller research teams and institutions.

Best and Worst Distance Metrics for Genre Recognition  

Jensen–Shannon divergence, which compares probability distributions, was the best choice for grouping similar genres, especially when used with LDA and bag-of-words. The Delta and Manhattan methods also worked reasonably well. Euclidean was the worst performer across LDA, bag-of-words, and WGCNA despite its widespread use in text analysis, suggesting further research is needed to replace industry-standard metrics. Cosine distance, while effective for authorship attribution, was not ideal for measuring LDA topic distances. Doc2vec is less affected by the comparison method used. 

Fig 2. The influence of distance metrics on ARI scores for each feature type (Sobchuk and Šeļa, 2024, Figure 3).
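For readers unfamiliar with the metric, Jensen-Shannon divergence compares two probability distributions, such as two books' LDA topic mixtures. A small sketch using SciPy (the topic vectors are invented for illustration; SciPy's function returns the Jensen-Shannon distance, the square root of the divergence):

```python
# Jensen-Shannon distance between two toy topic distributions.
# The vectors are invented; real inputs would be per-book LDA topic mixtures.
import numpy as np
from scipy.spatial.distance import jensenshannon

detective = np.array([0.55, 0.25, 0.15, 0.05])  # toy topic mixture, book A
fantasy = np.array([0.10, 0.20, 0.30, 0.40])    # toy topic mixture, book B

print(jensenshannon(detective, fantasy))        # smaller value = thematically closer
```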

Main Findings  

Unsupervised learning can detect thematic similarities, though performance varies. Methods like cosine distance, used in authorship attribution, are less effective for thematic analysis when used with minimal preprocessing and a small number of Most Frequent Words.

Reliable thematic analysis can help address large-scale problems such as inconsistent manual genre tagging in digital libraries, and can aid in identifying unclassified or undiscovered genres. Additionally, it can enhance book recommendation systems by enabling content-based similarity detection instead of relying solely on user behavior, much like how Spotify suggests songs based on acoustic features.

Conclusion  

This study demonstrates the value of computational methods in literary analysis, showing how thematic clustering can enhance genre classification and the study of literary evolution. It establishes a foundation for future large-scale literary studies.

Limitations  

Key limitations include the simplification of complex literary relationships inherent in clustering: while this reduces those relationships to more manageable structures, the approach may not work the same way with different settings or capture every important textual feature.

The study also did not separate thematic content from elements like narrative perspective. Additionally, genre classification remains subjective and ambiguous, and future work could explore alternative approaches, such as user-generated tags from sites like Goodreads.

Implications and Future Research  

This research provides a computational framework for thematic analysis, offering the potential for improving genre classification and book recommendation systems. Future work should incorporate techniques like BERTopic and Top2Vec, test these methods on larger and more diverse datasets, and further explore text simplification and clustering strategies.

Bibliography 

Sobchuk, O., Šeļa, A. Computational thematics: comparing algorithms for clustering the genres of literary fiction. Humanit Soc Sci Commun 11, 438 (2024). https://doi.org/10.1057/s41599-024-02933-6

Book genres. (2022). Chapterly. Retrieved May 4, 2025, from https://www.chapterly.com/blog/popular-and-lucrative-book-genres-for-authors.

Filed Under: Computer Science and Tech Tagged With: Computational Analysis, Computer Science, Computer Science and Tech, Machine Learning

Motor Brain-Computer Interface Reanimates Paralyzed Hand

May 4, 2025 by Mauricio Cuba Almeida '27

Over five million people in the United States live with paralysis (Armour et al., 2016), representing a large portion of the US population. Though the extent of paralysis varies from person to person, most people with paralysis experience unmet needs that detract from their overall life satisfaction. A survey of those with paralysis revealed “peer support, support for family caregivers, [and] sports activities” as domains where individuals with paralysis experienced less fulfillment—with lower household income predicting a higher likelihood of unmet needs (Trezzini et al., 2019). Consequently, individuals with sufficient motor function have turned to video games as a means to meet some of these needs, as video games are sources of recreation, artistic expression, social connectedness, and enablement (Cairns et al., 2019). However, these individuals are often limited in which games they can engage with, as they tend to “avoid multiplayer games with able-bodied players” (Willsey et al., 2025). Thus, Willsey and colleagues (2025) explore brain-computer interfaces as a valuable potential solution for restoring more sophisticated motor control not just of video games, but of digital interfaces used for social networking or remote work.

Brain-computer interfaces (BCIs) are devices that read and analyze brain activity in order to produce commands that are then relayed to output devices, with the intent of restoring useful bodily function (Shih et al., 2012). Willsey et al. explain how current motor BCIs are unable to distinguish between the brain activity corresponding to the movement of different fingers, so BCIs have instead relied on detecting the more general movement of grasping a hand (where the fingers are treated as one group). This limits BCIs to controlling fewer dimensions of an instrument: for example, moving a computer’s point-and-click cursor rather than typing on a keyboard. Hence, Willsey et al. seek to expand BCIs to allow for greater object manipulation—implementing finger decoding that differentiates the brain output signals for different fingers, allowing for “typing, playing a musical instrument or manipulating a multieffector digital interface such as a video game controller.” Improving BCIs would also involve continuous finger decoding, as finger decoding has mostly been done retrospectively, where finger signals are not classified and read until after the brain data is analyzed.

Willsey et al. developed a BCI system capable of decoding three independent finger groups (with the thumb decoded in two dimensions), allowing for four total dimensions of control. By training on the participant’s brain activity over nine days as they attempted to move individual fingers, the BCI learned to distinguish the patterns of activity that correspond to different finger movements. These four dimensions of control are well demonstrated in a quadcopter simulation, in which a participant with an implanted BCI manipulates a virtual hand to fly a quadcopter drone through the hoops of an obstacle course. Many applications, even beyond video games, are apparent: these finger controls could be extended to a robotic hand or used to reanimate a paralyzed limb.
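
To make the idea of continuous finger decoding concrete, here is a minimal, hypothetical sketch: a linear model mapping simulated multichannel neural activity to four finger-group velocities. This is not the decoder Willsey et al. used; the data are synthetic and the model is deliberately simple, only illustrating the general idea of learning a mapping from recorded brain activity to several independent control dimensions.

```python
# Illustrative sketch only: a linear decoder from neural features to
# continuous finger-group velocities (NOT the decoder of Willsey et al.).
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)

# Synthetic stand-ins: 5,000 time bins of firing rates from 96 channels,
# and 4 target dimensions (thumb x, thumb y, index-middle, ring-little).
n_bins, n_channels, n_dims = 5000, 96, 4
true_mapping = rng.normal(size=(n_channels, n_dims))
neural = rng.normal(size=(n_bins, n_channels))
finger_velocity = neural @ true_mapping + 0.5 * rng.normal(size=(n_bins, n_dims))

# Fit on the first 4,000 bins, evaluate on the held-out 1,000 bins.
decoder = Ridge(alpha=1.0).fit(neural[:4000], finger_velocity[:4000])
print("held-out R^2:", decoder.score(neural[4000:], finger_velocity[4000:]))
```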

Finger movement is decoded into three distinct groups, differentiated by color (Willsey et al., 2025).

The participant navigates a quadcopter through a hoop using decoded finger movements (Willsey et al., 2025).


The patient’s feelings of social connectedness, enablement, and recreation improved greatly. Willsey et al. note that the patient often looked forward to the quadcopter sessions, frequently “[asking] when the next quadcopter session was.” Not only did the patient find enjoyment in controlling the quadcopter, but they also found the training not to be tedious and the controls intuitive. To date, this finger BCI is the most capable motor BCI, and it will serve as a valuable model for non-motor BCIs, such as Brain2Char, a system for decoding text from brain recordings.

However, BCIs raise significant ethical considerations that must be addressed alongside their development. Are users responsible for all outputs from a BCI, even when those outputs are unintended? Given that BCIs decode brain signals and train on data from a very controlled setting, there is always the potential for natural “noise” that may upset a delicate BCI model. Ideally, BCIs would be trained on a participant’s brain activity in a variety of circumstances to mitigate these errors. Furthermore, BCIs may further stigmatize motor disabilities by encouraging individuals to restore “normal” abilities. I am particularly concerned about the cost of this technology. As with most new clinical technologies, implementation is expensive and ends up pricing out individuals of lower socioeconomic status, who are often the people with the greatest need for technologies like BCIs. As mentioned earlier, lower household income predicts more unmet needs for individuals with paralysis. Nonetheless, so long as they are developed responsibly and efforts are made to ensure their affordability, motor BCIs hold great promise.

 

References

Armour, B. S., Courtney-Long, E. A., Fox, M. H., Fredine, H., & Cahill, A. (2016). Prevalence and Causes of Paralysis—United States, 2013. American Journal of Public Health, 106(10), 1855–1857. https://doi.org/10.2105/ajph.2016.303270

Cairns, P., Power, C., Barlet, M., Haynes, G., Kaufman, C., & Beeston, J. (2019). Enabled players: The value of accessible digital games. Games and Culture, 16(2), 262–282. https://doi.org/10.1177/1555412019893877

Shih, J. J., Krusienski, D. J., & Wolpaw, J. R. (2012). Brain-Computer interfaces in medicine. Mayo Clinic Proceedings, 87(3), 268–279. https://doi.org/10.1016/j.mayocp.2011.12.008

Trezzini, B., Brach, M., Post, M., & Gemperli, A. (2019). Prevalence of and factors associated with expressed and unmet service needs reported by persons with spinal cord injury living in the community. Spinal Cord, 57(6), 490–500. https://doi.org/10.1038/s41393-019-0243-y

Willsey, M. S., Shah, N. P., Avansino, D. T., Hahn, N. V., Jamiolkowski, R. M., Kamdar, F. B., Hochberg, L. R., Willett, F. R., & Henderson, J. M. (2025). A high-performance brain–computer interface for finger decoding and quadcopter game control in an individual with paralysis. Nature Medicine. https://doi.org/10.1038/s41591-024-03341-8


Computer Vision Ethics

May 4, 2025 by Madina Sotvoldieva '28

Computer vision (CV) is a field of computer science that allows computers to “see” or, in more technical terms, recognize, analyze, and respond to visual data, such as videos and images. CV is widely used in our daily lives, from something as simple as recognizing handwritten text to something as complex as analyzing and interpreting MRI scans. With the advent of AI in the last few years, CV has also been improving rapidly. However, just like any subfield of AI nowadays, CV has its own set of ethical, social, and political implications, especially when used to analyze people’s visual data.

Although CV has been around for some time, there is limited work on its ethical limitations within the general AI field. In the existing literature, authors have categorized six ethical themes: espionage, identity theft, malicious attacks, copyright infringement, discrimination, and misinformation [1]. As seen in Figure 1, one of the main CV applications is face recognition, which can also raise issues of error, function creep (the expansion of technology beyond its original purposes), and privacy [2].

Figure 1: Specific applications of CV that could be used for identity theft.

To discuss CV’s ethics, the authors of the article take a critical approach, evaluating its implications through the framework of power dynamics. The three types of power analyzed are dispositional, episodic, and systemic power [3].

Dispositional Power

Dispositional power is defined as the ability to bring about a significant outcome [4]. When people gain that power, they feel empowered to explore new opportunities, and their scope of agency increases (they become more independent in their actions) [5]. However, CV can threaten this dispositional power in several ways, ultimately reducing people’s autonomy.

One way CV disempowers people is by limiting their control over information. Since CV works with both pre-existing and real-time camera footage, people are often unaware that they are being recorded and often cannot avoid it. The technology makes it hard for people to control the data being gathered about them, and protecting their personal information may require measures as extreme as hiding their faces.

Beyond being limited in controlling what data is gathered about them, advanced technologies make it extremely difficult for an average person to know what specific information can be retrieved from visual data. CV can also keep people from following their own judgment by communicating who they are on their behalf (automatically inferring people’s race, gender, and mood), by creating a forced moral environment (where people act from fear of being watched rather than from their own intentions), and by encouraging over-dependence on computers (e.g., relying on face recognition to interpret emotions).

In all these and other ways, CV undermines the foundation of dispositional power by limiting people’s ability to control their information, make independent decisions, express themselves, and act freely.

Episodic Power

Episodic power, often referred to as power-over, is the direct exercise of power by one individual or group over another. CV can both create new forms of power and make existing ones more efficient [6]. While this isn’t always a bad thing (for example, parents watching over children), problems arise when CV makes that control too invasive or one-sided, especially in ways that limit people’s freedom to act independently.

With CV taking security cameras to the next level, opportunities such as baby-room monitoring or fall detection for elderly people open up to us. However, it also leads to surveillance automation, which can result in over-enforcement at scales ranging from private individuals to larger organizations (workplaces, insurance companies, etc.). Another power shift that needs to be considered arises when smart doorbells capture far more than the person at the door, violating neighbors’ privacy and creating peer-to-peer surveillance.

These examples show that while CV may offer convenience or safety, it can also tip power balances in ways that reduce personal freedom and undermine one’s autonomy.

Systemic Power

Systemic power is not viewed as an individual exercise of power, but rather as a set of societal norms and practices that affect people’s autonomy by determining what opportunities people have, what values they hold, and what choices they make. CV can strengthen systemic power by making law enforcement more efficient through smart cameras and by increasing businesses’ profits through business intelligence tools.

However, CV can also reinforce pre-existing systemic societal injustices. One example is flawed facial recognition, where algorithms are more likely to correctly recognize White people and males [7], which has led to a number of false arrests. This can result in people receiving unequal opportunities (when biased systems are used in hiring) or harm their self-worth (when they are falsely recognized as a criminal).

Another matter of systemic power is the environmental cost of CV. AI systems rely on vast amounts of data, which require intensive energy for processing and storage. As societies become increasingly dependent on AI technologies like CV, those trying to protect the environment have little ability to resist or reshape these damaging practices. The power lies with tech companies and industries, leaving citizens without the means to challenge the system. When the system becomes harder to challenge or change, ethical concerns about CV arise.

Conclusion

Computer vision is a powerful tool that keeps evolving each year. We already see numerous applications of it in our daily lives, from self-checkouts in stores and smart doorbells to autonomous vehicles and tumor detection. Alongside CV’s potential to improve our lives and make them safer, there are a number of ethical limitations that should be considered. We need to critically examine how CV affects people’s autonomy, creates one-sided power dynamics, and reinforces societal prejudices. As we rapidly transition into an AI-driven world, there is more to come in the field of computer vision. However, in the pursuit of innovation, we should ensure that progress does not come at the cost of our ethical values.

References:

[1] Lauronen, M.: Ethical issues in topical computer vision applications. Information Systems, Master’s Thesis. University of Jyväskylä. (2017). https://jyx.jyu.fi/bitstream/handle/123456789/55806/URN%3aNBN%3afi%3ajyu-201711084167.pdf?sequence=1&isAllowed=y

[2] Brey, P.: Ethical aspects of facial recognition systems in public places. J. Inf. Commun. Ethics Soc. 2(2), 97–109 (2004). https://doi.org/10.1108/14779960480000246

[3] Haugaard, M.: Power: a “family resemblance concept.” Eur. J. Cult. Stud. 13(4), 419–438 (2010)

[4] Morriss, P.: Power: a philosophical analysis. Manchester University Press, Manchester, New York (2002)

[5] Morriss, P.: Power: a philosophical analysis. Manchester University Press, Manchester, New York (2002)

[6] Brey, P.: Ethical aspects of facial recognition systems in public places. J. Inf. Commun. Ethics Soc. 2(2), 97–109 (2004). https://doi.org/10.1108/14779960480000246

[7] Buolamwini, J., Gebru, T.: Gender shades: intersectional accuracy disparities in commercial gender classification. Conference on Fairness, Accountability, and Transparency, pp. 77–91 (2018)

Coeckelbergh, M.: AI ethics. MIT Press (2020)


AI – save or ruin the environment?

December 8, 2024 by Madina Sotvoldieva '28

Given the rapid pace at which AI is developing, it has the potential to help alleviate one of the most pressing problems of our time: climate change. AI applications, such as smart electricity grids and sustainable agriculture, are predicted to mitigate environmental issues. On the flip side, the integration of AI in this field can also be counterproductive because of the systems’ high energy demand. If AI helps us transition to a more sustainable lifestyle, the question is: at what cost?

The last decade saw exponential growth in data demand and the development of Large Language Models (LLMs), computational models such as ChatGPT designed to generate natural language. These algorithms increased energy consumption because of the large data volumes and computational power they require, as well as the water consumption needed to cool the data centers that house that data. This in turn leads to higher greenhouse gas emissions (Fig. 1). For example, training GPT-3 on a 500-billion-word database produced around 550 tons of carbon dioxide, equivalent to flying 33 times from Australia to the UK [1]. Moreover, information and communications technology (ICT) accounts for 3.9% of global greenhouse gas emissions (surpassing global air travel) [2]. As the number of training parameters grows, so does energy consumption, which is expected to exceed 30% of the world’s total energy consumption by 2030. These environmental concerns about AI implementation led to a new term: Green AI.

Fig. 1: CO2-equivalent emissions for training ML models (blue) and real-life cases (violet). The number of parameters (in billions) for each model is given in brackets [3].
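
For a sense of where such emission figures come from, the sketch below walks through the standard back-of-the-envelope arithmetic: energy is device power times device-hours times a datacenter overhead factor, and emissions are energy times the grid’s carbon intensity. All the numbers are illustrative assumptions, not the actual GPT-3 accounting.

```python
# Back-of-the-envelope training-emissions estimate (illustrative numbers only):
# energy (kWh)   = device power x device-hours x datacenter overhead (PUE)
# emissions (kg) = energy x grid carbon intensity
gpu_power_kw = 0.3        # assumed average draw per accelerator (kW)
gpu_hours = 3_000_000     # assumed total accelerator-hours for a large model
pue = 1.1                 # assumed datacenter power usage effectiveness
grid_intensity = 0.4      # assumed kg CO2e per kWh of electricity

energy_kwh = gpu_power_kw * gpu_hours * pue
emissions_tonnes = energy_kwh * grid_intensity / 1000
print(f"{energy_kwh:,.0f} kWh  ->  {emissions_tonnes:,.0f} tonnes CO2e")
```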

Green algorithms are defined in two ways: green-in and green-by AI (Fig. 2). Algorithms that support the use of technology to tackle environmental issues are referred to as green-by AI. Green-in-design algorithms (green-in AI), on the other hand, are those that maximize energy efficiency to reduce the environmental impact of AI. 

 

Fig. 2. Overview of green-in vs. green-by algorithms.

 

Green-by AI has the potential to reduce greenhouse gas emissions by enhancing efficiency across many sectors, such as agriculture, biodiversity management, transportation, smart mobility, etc. 

  • Energy Efficiency. Machine Learning (ML) algorithms can optimize heating, air conditioning, and lighting by analyzing data from smart buildings, making them more energy efficient [4][5]. 
  • Smart Mobility. AI can predict and avoid traffic congestion by analyzing the current traffic patterns and optimizing routes. Moreover, ML contributes to Autonomous Vehicles by executing tasks like road following and obstacle detection, which improves overall road safety [6].
  • Sustainable agriculture. Data from sensors and satellites analyzed by ML can give farmers insights into crop health, soil conditions, and irrigation needs. This enables them to use resources with precision and reduce environmental impacts. Moreover, predictive analytics minimize crop loss by allowing farmers to address diseases in time [7].
  • Climate Change. Computer-vision technologies can detect methane leaks in gas pipes, reducing emissions from fossil fuels. AI also plays a crucial role in reducing electricity usage by predicting demand and supply from solar and wind power.
  • Environmental Policies. AI’s ability to process data, identify trends, and predict outcomes will enable policymakers to come up with effective strategies to combat environmental issues [8].

Green-in AI, on the other hand, is energy-efficient AI with a low carbon footprint, better-quality data, and logical transparency. To ensure people’s trust, it offers clear and rational decision-making processes, which also makes it socially sustainable. Several promising approaches to achieving green-in AI include algorithm, hardware, and data center optimization. Specifically, more efficient graphics processing units (GPUs) or parallelization (distributing computation among several processing cores) can reduce the environmental impact of training AI. Anthony et al. showed that increasing the number of processing units to 15 decreased greenhouse gas emissions [9]. However, the reduction in runtime must be significant enough for parallelization not to become counterproductive: when the reduction in execution time is smaller than the increase in the number of cores, emissions worsen. Other methods include performing computation at the locations where data is collected, to avoid data transmission, and limiting the number of times an algorithm is run.
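
The parallelization caveat can be made concrete with a toy model: assume each job has a fixed power overhead plus a per-core draw, and that runtime follows an Amdahl-style speedup. Under these assumptions (which are illustrative, not Anthony et al.’s measurements), total energy first falls and then rises as cores are added.

```python
# Toy model of the parallelization trade-off: a fixed overhead (cooling,
# memory, idle node power) plus per-core draw means adding cores lowers
# total energy only while the runtime savings outpace the extra draw.
# All figures are illustrative assumptions.
def energy_kwh(n_cores, base_runtime_h=100.0, overhead_kw=1.0,
               core_power_kw=0.1, parallel_fraction=0.95):
    # Amdahl-style runtime: a fraction of the work cannot be parallelized.
    runtime_h = base_runtime_h * ((1 - parallel_fraction)
                                  + parallel_fraction / n_cores)
    return (overhead_kw + n_cores * core_power_kw) * runtime_h

for n in (1, 4, 15, 64):
    print(f"{n:3d} cores -> {energy_kwh(n):6.1f} kWh")
```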

Now that we know about AI’s impact and the ways to reduce it, what trends can we expect in the future? 

  • Hardware: Innovation in hardware design is focused on creating both eco-friendly and powerful AI accelerators, which can minimize energy consumption [10].
  • Neuromorphic computing is an emerging area in the computing technology field, aiming to create more efficient computing systems. It draws inspiration from the human brain, which performs complex tasks with much less energy than conventional computers. 
  • Energy-harvesting AI devices. Researchers are exploring ways in which AI devices can harvest energy from their surroundings, for example from light or heat [11]. This way, AI can rely less on external power and become self-sufficient.

In conclusion, while AI holds great potential in alleviating many environmental issues, we should not forget about its own negative impact. While training AI models results in excessive greenhouse gas emissions, there are many ways to reduce energy consumption and make AI more environmentally friendly. Although we discussed several future trends in green-in AI, it is important to remember this field is still continuously evolving and new innovations will emerge in the future.

References:

[1] D. Patterson, J. Gonzalez, Q. Le, C. Liang, L.-M. Munguia, D. Rothchild, D. So, M. Texier, J. Dean, Carbon emissions and large neural network training, 2021, arXiv:2104.10350.

[2] Knowles, Bran. “ACM TPC TechBrief on Computing and Carbon Emissions.” Association for Computing Machinery, Nov. 2021. www.acm.org/media-center/2021/october/tpc-tech-brief-climate-change

[3] Nestor Maslej, Loredana Fattorini, Raymond Perrault, Vanessa Parli, Anka Reuel, Erik Brynjolfsson, John Etchemendy, Katrina Ligett, Terah Lyons, James Manyika, Juan Carlos Niebles, Yoav Shoham, Russell Wald, and Jack Clark, “The AI Index 2024 Annual Report,” AI Index Steering Committee, Institute for Human-Centered AI, Stanford University, Stanford, CA, April 2024. 

[4] N. Milojevic-Dupont, F. Creutzig, Machine learning for geographically differentiated climate change mitigation in urban areas, Sustainable Cities Soc. 64 (2021) 102526.

[5] T.M. Ghazal, M.K. Hasan, M. Ahmad, H.M. Alzoubi, M. Alshurideh, Machine learning approaches for sustainable cities using internet of things, in: The Effect of Information Technology on Business and Marketing Intelligence Systems, Springer, 2023, pp. 1969–1986.

[6] M. Bojarski, D. Del Testa, D. Dworakowski, B. Firner, B. Flepp, P. Goyal, L.D. Jackel, M. Monfort, U. Muller, J. Zhang, et al., End to end learning for self-driving cars, 2016, arXiv preprint arXiv:1604.07316. 

[7] R. Sharma, S.S. Kamble, A. Gunasekaran, V. Kumar, A. Kumar, A systematic literature review on machine learning applications for sustainable agriculture supply chain performance, Comput. Oper. Res. 119 (2020) 104926.

[8] N. Sánchez-Maroño, A. Rodríguez Arias, I. Lema-Lago, B. Guijarro-Berdiñas, A. Dumitru, A. Alonso-Betanzos, How agent-based modeling can help to foster sustainability projects, in: 26th International Conference on Knowledge-Based and Intelligent Information and Engineering Systems, KES, 2022.

[9] L.F.W. Anthony, B. Kanding, R. Selvan, Carbontracker: Tracking and predicting the carbon footprint of training deep learning models, 2020, arXiv preprint arXiv:2007.03051. 

[10] H. Rahmani, D. Shetty, M. Wagih, Y. Ghasempour, V. Palazzi, N.B. Carvalho, R. Correia, A. Costanzo, D. Vital, F. Alimenti, et al., Next-generation IoT devices: Sustainable eco-friendly manufacturing, energy harvesting, and wireless connectivity, IEEE J. Microw. 3 (1) (2023) 237–255.

[11] Divya S., Panda S., Hajra S., Jeyaraj R., Paul A., Park S.H., Kim H.J., Oh T.H.: Smart data processing for energy harvesting systems using artificial intelligence.


Machine learning and algorithmic bias

December 8, 2024 by Mauricio Cuba Almeida '27

Algorithms permeate modern society, especially AI algorithms. Artificial intelligence (AI) is built with various techniques, like machine learning, deep learning, and natural language processing, that train it to mimic humans at a given task. Healthcare, loan approval, and security surveillance are a few industries that have begun using AI (Alowais et al., 2023; Purificato et al., 2022; Choung et al., 2024). Most people will inadvertently continue to interact with AI on a daily basis.

However, what are the problems faced by an increasing algorithmic society? Authors Sina Fazelpour and David Danks, in their article, explore this question in the context of algorithmic bias. Indeed, the problem they identify is that AI perpetuates bias. At its most neutral, Fazelpour and Danks (2021) explain that algorithmic bias is some “systematic deviation in algorithm output, performance, or impact, relative to some norm or standard,” suggesting that algorithms can be biased against a moral, statistical, or social norm. Fazelpour and Danks use a running example of a university training an AI algorithm with past student data to predict future student success. Thus, this algorithm exhibits a statistical bias if student success predictions are discordant with what has happened historically (in training data). Similarly, the algorithm exhibits a moral bias if it illegitimately depends on the student’s gender to produce a prediction. This is seen already in facial recognition algorithms that “perform worse for people with feminine features or darker skin” or recidivism prediction models that rate people of color as higher risk (Fazelpour & Danks, 2021). Clearly, algorithmic biases have the potential to preserve or exacerbate existing injustices under the guise of being “objective.” 

Algorithmic bias can manifest through different means. As Fazelpour and Danks discuss, harmful bias can be evident even before an algorithm is created if values and norms are not carefully considered. In the example of a student-success prediction model, universities must make value judgments, specifying which target variables define “student success,” whether grades, respect from peers, or post-graduation salary. The more complex the goal, the more difficult and contested the choice of target variables becomes. Indeed, choosing target variables is itself a source of algorithmic bias. As Fazelpour and Danks explain, enrollment or financial aid decisions based on predictions of student success may discriminate against minority students if first-year performance is used in that prediction, since minority students may face additional challenges.

Using training data that is biased will also lead to bias in an AI algorithm. In other words, bias in the measured world will be reflected in AI algorithms that mimic our world. For example, recruiting AI that reviews resumes is often trained on employees already hired by the company. In many cases, so-called gender-blind recruiting AI have discriminated against women by using gendered information on a resume that was absent from the resumes of a majority-male workplace (Pisanelli, 2022; Parasurama & Sedoc, 2021). Fazelpour and Danks also mention that biased data can arise from limitations and biases in measurement methods. This is what happens when facial recognition systems are trained predominantly on white faces. Consequently, these facial recognition systems are less effective when individuals do not look like the data the algorithm has been trained on.

Alternatively, users’ misinterpretations of predictive algorithms may produce biased results, Fazelpour and Danks argue. An algorithm is optimized for one purpose, and users may unknowingly apply it to another. A user could inadvertently interpret predicted “student success” as a measure of grades rather than what the algorithm was actually optimized to predict (e.g., likelihood of dropping out). Decisions stemming from misinterpretations of algorithmic predictions are doomed to be biased, and not just for the aforementioned reasons. Misunderstandings of algorithmic predictions also lead to poor decisions if the variables that predict an outcome are assumed to cause that outcome. Students in advanced courses may be predicted to have higher student success, but as Fazelpour and Danks put it, we shouldn’t enroll every underachieving student in an advanced course. Such models should also be applied in contexts similar to those in which the historical data was collected; this becomes more important the longer a model is used, as present data drifts away from the historical training data. In other words, a student success model created for a small private college should not be deployed at a large public university, nor many years later.

Fazelpour and Danks establish that algorithmic bias is nearly impossible to eliminate—solutions often must engage with the complexities of our society. The authors delve into several technical solutions, such as optimizing an algorithm using “fairness” as a constraint or training an algorithm on corrected historical data. This quickly reveals itself to be problematic, as determining fairness is a difficult value judgment. Nonetheless, algorithms provide tremendous benefit to us, even in moral and social ways. Algorithms can identify biases and serve as better alternatives to human practices. Fazelpour and Danks conclude that algorithms should continue to be studied in order to identify, mitigate, and prevent bias.
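
As one concrete, if simplified, example of how bias might be quantified, the sketch below computes the gap in positive-prediction rates between two groups, sometimes called demographic parity. The predictions and group labels are made-up stand-ins, not data from any real admissions or student-success system, and this is only one of several contested definitions of fairness.

```python
# Minimal sketch of one (contested) bias measure: the gap in positive-
# prediction rates between two groups ("demographic parity difference").
# All data below are synthetic stand-ins.
import numpy as np

rng = np.random.default_rng(1)
group = rng.choice(["A", "B"], size=1000, p=[0.7, 0.3])

# Hypothetical model outputs: group B is flagged "likely to succeed" less often.
predicted_success = np.where(group == "A",
                             rng.random(1000) < 0.60,
                             rng.random(1000) < 0.45)

rate_a = predicted_success[group == "A"].mean()
rate_b = predicted_success[group == "B"].mean()
print(f"selection rate A={rate_a:.2f}, B={rate_b:.2f}, gap={rate_a - rate_b:.2f}")
```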

References

Alowais, S. A., Alghamdi, S. S., Alsuhebany, N., Alqahtani, T., Alshaya, A. I., Almohareb, S. N., Aldairem, A., Alrashed, M., Saleh, K. B., Badreldin, H. A., Yami, M. S. A., Harbi, S. A., & Albekairy, A. M. (2023). Revolutionizing healthcare: the role of artificial intelligence in clinical practice. BMC Medical Education, 23(1). https://doi.org/10.1186/s12909-023-04698-z

Choung, H., David, P., & Ling, T. (2024). Acceptance of AI-Powered Facial Recognition Technology in Surveillance scenarios: Role of trust, security, and privacy perceptions. Technology in Society, 102721. https://doi.org/10.1016/j.techsoc.2024.102721

Fazelpour, S., & Danks, D. (2021). Algorithmic bias: Senses, sources, solutions. Philosophy Compass, 16(8). https://doi.org/10.1111/phc3.12760

Parasurama, P., & Sedoc, J. (2021, December 16). Degendering resumes for fair algorithmic resume screening. arXiv.org. https://arxiv.org/abs/2112.08910

Pisanelli, E. (2022). Your resume is your gatekeeper: Automated resume screening as a strategy to reduce gender gaps in hiring. Economics Letters, 221, 110892. https://doi.org/10.1016/j.econlet.2022.110892

Purificato, E., Lorenzo, F., Fallucchi, F., & De Luca, E. W. (2022). The use of responsible artificial intelligence techniques in the context of loan approval processes. International Journal of Human-Computer Interaction, 39(7), 1543–1562. https://doi.org/10.1080/10447318.2022.2081284


Getting the Big Picture: Satellite Altimetry and the Future of Sea Level Rise Research

May 3, 2024 by Alexander Ordentlich '26

Anthropogenic climate change is drastically affecting the natural processes of the Earth at unprecedented rates. Increased fossil fuel emissions coupled with global deforestation have altered Earth’s energy budget, creating the potential for positive feedback loops to further warm our planet. While some of this warming manifests through glacier melting, powerful storm systems, and rising global temperatures, it is estimated that 93% of the total energy gained from the greenhouse effect is stored in the ocean, with the remaining 7% contributing to atmospheric warming (Cazenave et al. 2018, as cited in von Schuckmann et al. 2016). This storage of heat in the ocean drives oceanic thermal expansion and, in combination with glacier melt, is contributing to global sea level rise. Currently, an estimated 230 million people live less than 1 m above the high tide line, and if we do not curb emissions, sea level rise projections range from 1.1 to 2.1 m by 2100 (Kulp et al. 2019, Sweet et al. 2022). Sea level rise’s global impact has thus been a prominent area of scientific research, with leading methods utilizing satellite altimetry to measure the ocean’s height globally over time.

Since the 1990s, sea surface height data have been recorded by a multitude of satellites, amassing information on subseasonal to multi-decadal time scales (Cazenave et al. 2018). NASA’s sea level change portal reports this data sub-annually, recording a current sea level rise of 103.8 mm since 1993 (NASA). Seeking more information on current trends in satellite altimetry, I reached out to French geophysicist Dr. Anny Cazenave of the French space agency CNES, director of the Laboratoire d’Etudes en Geophysique et Oceanographie Spatiale (LEGOS) in Toulouse, France. Dr. Cazenave is a pioneer in geodesy, has worked as one of the leading scientists on numerous altimetry missions, was a lead author of the sea level rise chapter for two Intergovernmental Panel on Climate Change (IPCC) reports, and won the prestigious Vetlesen Prize in 2020 (European Space Sciences Committee).

When asked about recent advancements in altimetry technology, Dr. Cazenave directed me to the recent international Surface Water and Ocean Topography (SWOT) satellite mission launched in 2022. SWOT can detect ocean features with ten times the resolution of current technology, enabling fine-scale analysis of oceans, lakes, rivers, and much more (NASA SWOT). For measuring sea level rise specifically, SWOT utilizes the Ka-band Radar Interferometer (KaRIn), which is capable of measuring the elevation of almost all bodies of water on Earth. KaRIn operates by measuring microwave signals reflected off Earth’s surface using two antennas separated by 10 meters, enabling the generation of a detailed topographic image of the surface (NASA SWOT). With SWOT’s high-resolution capability for mapping sea level anomalies close to shore, more accurate estimates of how sea level rise will affect coastal communities will become available.

The figure above displays the difference in resolution between Copernicus Marine Service of ESA (European Space Agency) data and SWOT surface height anomaly data (NASA SWOT).
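
The measurement principle behind KaRIn can be sketched with a highly simplified cross-track interferometry geometry: the phase difference between the two antennas constrains the look angle, and the look angle plus the measured range give the surface height. The numbers below are illustrative assumptions (flat-Earth geometry, no baseline tilt, phase wrapping ignored), not SWOT’s actual processing.

```python
# Simplified cross-track interferometry sketch (illustrative numbers only).
import numpy as np

wavelength = 0.0084      # Ka-band wavelength, ~8.4 mm
baseline = 10.0          # antenna separation in meters (as described above)
altitude = 890_000.0     # assumed satellite altitude (m)
slant_range = 891_200.0  # assumed measured range to a surface point (m)

# Path-length difference between the two antennas for look angle theta is
# roughly baseline * sin(theta). Here we synthesize an (unwrapped) phase
# from an assumed true look angle of 3 degrees, then invert it.
measured_phase = 2 * np.pi * baseline * np.sin(np.radians(3.0)) / wavelength

theta = np.arcsin(measured_phase * wavelength / (2 * np.pi * baseline))
surface_height = altitude - slant_range * np.cos(theta)
print(f"look angle {np.degrees(theta):.2f} deg, surface height {surface_height:.1f} m")
```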

Finally, in light of recent developments in AI and machine learning, Dr. Cazenave noted the power of these computational methods for analyzing large data sets. The high-precision data provided by SWOT require advanced methods of analysis to physically represent sea level changes, posing a challenge for researchers (Stanley 2023). A few recent papers have already highlighted the use of neural networks trained on current altimetry and sea surface temperature data (Xiao et al. 2023, Martin et al. 2023). These neural networks can then decipher the high-resolution data, enabling a greater understanding of ocean dynamics and sea surface anomalies. Dr. Cazenave explained that the key questions to answer regarding sea level rise are: (1) how will ice sheets contribute to future sea level rise, (2) how much will sea level rise in coastal regions, and (3) how will rising sea levels contribute to shoreline erosion and retreat. With novel computational analysis techniques and advanced sea surface monitoring, many of these questions are being answered with greater accuracy. As we navigate the effects of climate change, combining science and policy will allow us to design multifaceted solutions that enable a sustainable future for all.

References

  1. Anny Cazenave​. European Space Sciences Committee. (n.d.). https://www.essc.esf.org/panels-members/anny-cazenave%E2%80%8B/
  2. Cazenave, A., Palanisamy, H., & Ablain, M. (2018). Contemporary sea level changes from satellite altimetry: What have we learned? What are the new challenges? Advances in Space Research, 62(7), 1639–1653. https://doi.org/10.1016/j.asr.2018.07.017
  3. Home. (n.d.). NASA Sea Level Change Portal. Retrieved April 24, 2024, from https://sealevel.nasa.gov/
  4. Joint NASA, CNES Water-Tracking Satellite Reveals First Stunning Views. (n.d.). NASA SWOT. Retrieved April 24, 2024, from https://swot.jpl.nasa.gov/news/99/joint-nasa-cnes-water-tracking-satellite-reveals-first-stunning-views
  5. Kulp, S. A., & Strauss, B. H. (2019). New elevation data triple estimates of global vulnerability to sea-level rise and coastal flooding. Nature Communications, 10(1), 4844. https://doi.org/10.1038/s41467-019-12808-z
  6. Martin, S. A., Manucharyan, G. E., & Klein, P. (2023). Synthesizing Sea Surface Temperature and Satellite Altimetry Observations Using Deep Learning Improves the Accuracy and Resolution of Gridded Sea Surface Height Anomalies. Journal of Advances in Modeling Earth Systems, 15(5), e2022MS003589. https://doi.org/10.1029/2022MS003589
  7. Stanley, S. (2023, October 17). Machine Learning Provides a Clearer Window into Ocean Motion. Eos. http://eos.org/research-spotlights/machine-learning-provides-a-clearer-window-into-ocean-motion
  8. Xiao, Q., Balwada, D., Jones, C. S., Herrero-González, M., Smith, K. S., & Abernathey, R. (2023). Reconstruction of Surface Kinematics From Sea Surface Height Using Neural Networks. Journal of Advances in Modeling Earth Systems, 15(10), e2023MS003709. https://doi.org/10.1029/2023MS003709
  9. von Schuckmann, K., Palmer, M., Trenberth, K. et al. An imperative to monitor Earth’s energy imbalance. Nature Clim Change 6, 138–144 (2016). https://doi.org/10.1038/nclimate2876


Navigating the Unseen: Wireless Muon Technology Revolutionizes Indoor Positioning and Beyond

December 6, 2023 by Alexander Ordentlich '26

Cosmic rays have captivated scientists due to their enigmatic origins, imperceptibility, and natural abundance. Originating from celestial bodies ranging from as close as our sun to as far as distant galaxies, these particles bombard the Earth at speeds close to the speed of light. While these particles contribute to aurora borealis displays in the Arctic, for the most part they go unnoticed and have mainly been researched in the context of astronomy and astrophysics (Howell 2018). However, recent developments in muon tomography and research from Professor Hiroyuki Tanaka’s group at the University of Tokyo have produced a wireless muometric navigation system (MuWNS) capable of using muons as an indoor positioning system (Tanaka 2022).

Formation of muons from particle showers (Vlasov, 2023).

Muons are natural subatomic particles created when cosmic rays interact with atoms in the atmosphere. With a mass around 207 times that of the electron, muons are capable of penetrating solid materials and water (Gururaj 2023). This unique property has allowed muons to be used to map the interiors of hard-to-access places such as volcanoes, tropical storm cells, and even Egyptian pyramids (Morishima, 2017). Professor Tanaka’s team has now focused on supplementing the currently limited GPS system with a wireless muon detection system capable of navigation in places the radio waves used in GPS cannot reach. This makes it an ideal technology for underground and underwater navigation, natural disaster relief, exploration of caves on other planets, and much more.

While the initial principle behind MuWNS involved precisely measuring the timing and direction of cosmic-ray-generated muons with reference detectors, Professor Tanaka’s team had trouble synchronizing time between the reference and receiver detectors (Tanaka, 2022). This time synchronization issue showed in their 2022 MuWNS prototype, which had a navigation accuracy of 2–14 m, which Professor Tanaka describes as “far from the level required for the practical indoor navigation applications.” In a more recent article published in September 2023, Professor Tanaka shifted his focus from the timing of muons to the directional vectors of incoming muons. Thus, instead of using the time of muon travel between the reference and receiver detectors for navigation, the next-generation vector muPS (muometric positioning system) uses the angles of incoming muons through the reference and receiver detectors to locate the receiver’s position. In essence, matching the angles of muons entering the two detectors confirms that they observed the same muon event. Once the same muon event is identified, the angle and path of the muon are used to determine the position of the receiver detector without relying on timing mechanisms. This approach minimizes the effects of time synchronization errors, resulting in what he predicts will be centimeter-level accuracy (Tanaka 2023). This new development has been greeted with excitement, earning Professor Tanaka’s team a spot in Time Magazine’s “The Best Inventions of 2023” (Stokel-Walker 2023).

This image is from Professor Tanaka’s article on wireless muometric navigation systems. Image A depicts underwater navigation with floating reference detectors and muons marked as red lines. Image B depicts underground navigation using surface reference detectors to control the receiver. (Tanaka, 2022).
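
The geometric idea behind the vector approach can be sketched as follows: each matched muon defines a line from a reference detector along the muon’s direction, and the receiver should sit near where those lines intersect, which can be found by least squares. The detector layout, noise level, and solver below are illustrative assumptions, not Tanaka’s implementation.

```python
# Sketch of vector-muPS geometry: estimate a receiver position from several
# muon "lines" (reference detector + muon direction). Illustrative only.
import numpy as np

rng = np.random.default_rng(2)
receiver = np.array([3.0, -2.0, -15.0])           # true underground position (m)
references = np.array([[0, 0, 0], [20, 5, 0],     # surface reference detectors (m)
                       [-10, 15, 0], [5, -20, 0]], dtype=float)

# Each muon travels from a reference detector through the receiver; add a
# little angular noise to mimic finite detector resolution.
directions = receiver - references
directions /= np.linalg.norm(directions, axis=1, keepdims=True)
directions += 0.005 * rng.normal(size=directions.shape)
directions /= np.linalg.norm(directions, axis=1, keepdims=True)

# Least-squares point closest to all lines: sum_i (I - d_i d_i^T)(x - p_i) = 0.
A = np.zeros((3, 3)); b = np.zeros(3)
for p, d in zip(references, directions):
    proj = np.eye(3) - np.outer(d, d)
    A += proj
    b += proj @ p
estimate = np.linalg.solve(A, b)
print("estimated receiver position:", np.round(estimate, 2))
```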

After being intrigued by Professor Tanaka’s work published in Nature (Tanaka 2023), I reached out to him with a few questions for this article. The first question I asked was about the presence of muons and whether muon tomography could work on other celestial bodies. His response highlighted that muons are in fact generated in the dust deposits on the surfaces of the Moon and Mars. Specifically, Professor Tanaka discussed how muons could be used to explore caves within the Moon. This would involve deploying a muPS-navigating robot that uses muons generated in the regolith to navigate underground, allowing us to explore hard-to-examine places on other worlds without requiring human presence.

The second question involved the application of muPS within cell phones. Tanaka explains that our phones currently have a GPS receiver inside them, allowing us to track their location when they are lost. However, if a cellphone is lost in an elevator, basement, cave, or room with limited GPS signal, muPS could locate the phone instead. With 6.92 billion smartphone users worldwide, this application could be useful in natural disasters where individuals may be trapped under rubble and GPS signals cannot locate their phones (Zippia 2023).

Finally, I asked Professor Tanaka what makes him excited about muPS. He responded by discussing the limitations of our present indoor and underground navigation systems, which all rely on laser, sound, or radio waves to guide devices through obstacles. This method, he claims, is not technically navigation because it does not provide coordinate information and is thus un-programmable. Tanaka states that “muPS is [the] only technique that provides the coordinate information besides GPS,” and it can be used in locations where GPS is unavailable.

In future technology, muon-based positioning systems may provide the opportunity to open new navigational and observational possibilities, propelling us into a world of new discoveries and exploration on Earth and beyond. 

 

Works Cited

  1. Gururaj, T. (2023, June 16). World’s first cosmic-ray GPS can detect underground movement. Interesting Engineering. https://interestingengineering.com/innovation/cosmic-ray-gps-underground-movement-disaster-management-muons 
  2. Howell, E. (2018, May 11). What are cosmic rays?. Space.com. https://www.space.com/32644-cosmic-rays.html 
  3. Morishima, K., Kuno, M., Nishio, A. et al. (2017). Discovery of a big void in Khufu’s Pyramid by observation of cosmic-ray muons. Nature 552, 386–390.. https://doi.org/10.1038/nature24647
  4. Stokel-Walker, C. (2023, October 24). Muon Positioning System: The 200 best inventions of 2023. Time. https://time.com/collection/best-inventions-2023/6326412/muon-positioning-system/ 
  5. Tanaka, H.K.M. Wireless muometric navigation system. Sci Rep 12, 10114 (2022). https://doi.org/10.1038/s41598-022-13280-4
  6. Tanaka, H.K.M. Muometric positioning system (muPS) utilizing direction vectors of cosmic-ray muons for wireless indoor navigation at a centimeter-level accuracy. Sci Rep 13, 15272 (2023). https://doi.org/10.1038/s41598-023-41910-y
  7. Vlasov, A. (2023, April 14). Muon Imaging: How Cosmic Rays help us see inside pyramids and volcanoes. IAEA. https://www.iaea.org/newscenter/news/muon-imaging-how-cosmic-rays-help-us-see-inside-pyramids-and-volcanoes 
  8. Zippia. 20 Vital Smartphone Usage Statistics [2023]: Facts, Data, and Trends On Mobile Use In The U.S. Zippia.com. Apr. 3, 2023, https://www.zippia.com/advice/smartphone-usage-statistics/

