
Bowdoin Science Journal



Biological ChatGPT: Rewriting Life With Evo 2

May 4, 2025 by Jenna Lam

What makes life life? Is there underlying code that, when written or altered, can be used to replicate or even create life? On February 19th, 2025, scientists from Arc Institute, NVIDIA, Stanford, Berkeley, and UC San Francisco released Evo 2, a generative machine learning model that may help answer these questions. Unlike its precursor Evo 1, which was released a year earlier, Evo 2 is trained on genomic data from eukaryotes as well as prokaryotes. In total, it is trained on 9.3 trillion nucleotides from over 130,000 genomes, making it the largest AI model in biology. You can think of it as a ChatGPT for genetic code: it "thinks" in the language of DNA rather than human language, and it is being used to tackle the most pressing health and disease challenges (rather than calculus homework).

Computers, defined broadly, are devices that store, process, and display information. Digital computers, such as your laptop or phone, run on binary code, the most basic form of computer data, composed of 0s and 1s that represent a current switched on or off. Evo 2 centers on the idea that DNA functions as nature's "code," which, through protein expression and organismal development, builds the "computers" of life. Rather than binary, organisms run on a genetic code made up of A, T, C, G, and U, the five major nucleotide bases that constitute DNA and RNA.
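To make the analogy concrete, here is a minimal sketch (in Python, purely for illustration) of how a string of nucleotides can be turned into numeric tokens, the same general step a genomic language model performs before it can learn from DNA. The vocabulary mapping below is an assumption for demonstration, not Evo 2's actual tokenization scheme.

```python
# Illustrative only: map a DNA/RNA sequence onto integer tokens, the general idea
# a genomic language model uses to turn nucleotides into model inputs.
# This vocabulary is an assumed placeholder, not Evo 2's real scheme.
VOCAB = {"A": 0, "C": 1, "G": 2, "T": 3, "U": 4}

def tokenize(sequence: str) -> list[int]:
    """Convert a nucleotide string into a list of integer tokens."""
    return [VOCAB[base] for base in sequence.upper()]

print(tokenize("ATGCGU"))  # [0, 3, 2, 1, 2, 4]
```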

Although Evo 2 can potentially design code for artificial life, it has not yet designed an entire genome and is not being used to create artificial organisms. Instead, Evo 2 is being used to (1) predict genetic abnormalities and (2) generate genetic code.

Figure: Functions of Evo 2 in biology at the cellular/organismal, protein, RNA, and epigenome levels. Adapted from https://www.biorxiv.org/content/10.1101/2025.02.18.638918v1.full

Accurate over 90% of the time, Evo 2 can predict which mutations in BRCA1 (a gene central to understanding breast cancer) are benign and which are potentially pathogenic. This is significant because each gene is composed of hundreds to thousands of nucleotides, and a mutation in even a single nucleotide (termed a single nucleotide variant, or SNV) can have drastic consequences for protein structure and function. Being able to computationally pinpoint dangerous mutations therefore reduces the time and money spent testing each mutation in a lab, and paves the way for developing more targeted drugs.
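The general technique behind this kind of prediction is likelihood-based variant scoring: the model is asked how plausible it finds the normal (reference) sequence versus the mutated one, and a large drop in plausibility flags a potentially harmful change. The sketch below illustrates only that idea; score_sequence is a toy stand-in, not Evo 2's real interface.

```python
import math

# Sketch of likelihood-based variant scoring. score_sequence() is a toy stand-in
# for a trained genomic language model (here it scores each base against a fixed
# probability table); a real model would use learned, context-aware probabilities.
TOY_BASE_PROBS = {"A": 0.3, "T": 0.3, "C": 0.2, "G": 0.2}

def score_sequence(seq: str) -> float:
    """Toy log-likelihood of a DNA string under a fixed base distribution."""
    return sum(math.log(TOY_BASE_PROBS[base]) for base in seq)

def variant_effect(reference: str, position: int, alt_base: str) -> float:
    """Log-likelihood change caused by a single nucleotide variant (SNV).
    Strongly negative values suggest the variant disrupts patterns the model
    learned from functional genomes, i.e. is potentially pathogenic."""
    mutated = reference[:position] + alt_base + reference[position + 1:]
    return score_sequence(mutated) - score_sequence(reference)

print(variant_effect("ATGACCTTGA", position=3, alt_base="G"))
```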

Secondly, Evo 2 can design genetic code for highly specialized and tightly controlled proteins, which opens many possibilities for synthetic biology (making synthetic molecules using biological systems), from pharmaceuticals to plastic-degrading enzymes. It can generate entire mitochondrial genomes, minimal bacterial genomes, and entire yeast chromosomes, a feat no model had previously accomplished.

A notable complexity of eukaryotic genomes is their many-layered epigenomic interactions: the powerful role of the environment in controlling gene expression. Evo 2 addresses this by using models of epigenomic structures, made possible through inference-time scaling. Put simply, inference-time scaling is a technique developed by NVIDIA that allows AI models to take time to "think" by evaluating multiple solutions before selecting the best one.
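In code, that "evaluate multiple solutions and pick the best" idea looks like a best-of-N search. The sketch below is a generic illustration with placeholder generate and score functions (the score here simply rewards GC content); it is not the actual implementation used for Evo 2.

```python
import random

# Generic best-of-N search, the basic shape of inference-time scaling:
# spend extra compute generating candidates, then keep the best-scoring one.
# generate_candidate() and score() are illustrative placeholders.

def generate_candidate(length: int) -> str:
    return "".join(random.choice("ACGT") for _ in range(length))

def score(candidate: str) -> float:
    # Placeholder objective: in Evo 2's case, candidates are judged against
    # predicted epigenomic features; here we just reward GC content.
    return candidate.count("G") + candidate.count("C")

def best_of_n(n: int, length: int) -> str:
    candidates = [generate_candidate(length) for _ in range(n)]
    return max(candidates, key=score)

print(best_of_n(n=16, length=40))
```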

How is Evo 2 so knowledgeable, despite only being one year old? The answer lies in deep learning.

Just like Large Language Models, or LLMs (think: ChatGPT, Gemini, etc.), Evo 2 decides what genes should look like by "training" on massive amounts of previously known data. Where LLMs train on existing text, Evo 2 trains on the entire genomes of over 130,000 organisms. This training, the processing of massive amounts of data, is central to deep learning. In training, individual pieces of data called tokens are fed into a "neural network," a collection of software functions that communicate data to one another. As the name suggests, neural networks are modeled after the human nervous system, which is made up of individual neurons analogous to software functions. Just like brain cells, "neurons" in the network can both take in information and produce output by communicating with other neurons. Each neural network has multiple layers, each with a certain number of neurons. Within each layer, each neuron sends information to every neuron in the next layer, allowing the model to process and distill large amounts of data. The more neurons involved, the more fine-tuned the final output will be.
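As a rough picture of that layered structure, the sketch below wires up a tiny fully connected network in NumPy, where every neuron in one layer passes its output to every neuron in the next. The layer sizes and random weights are arbitrary choices for illustration, not anything resembling Evo 2's architecture.

```python
import numpy as np

# A tiny fully connected network: every neuron in one layer sends its output
# to every neuron in the next layer. Sizes and weights are arbitrary.
rng = np.random.default_rng(0)
layer_sizes = [8, 16, 16, 4]          # input layer, two hidden layers, output layer
weights = [rng.normal(size=(m, n)) for m, n in zip(layer_sizes[:-1], layer_sizes[1:])]

def forward(x: np.ndarray) -> np.ndarray:
    for w in weights:
        x = np.tanh(x @ w)            # each layer transforms the data and passes it onward
    return x

print(forward(rng.normal(size=8)))    # four output values
```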

This neural network then attempts to solve a problem. Since practice makes perfect, the network attempts the problem over and over; each time, it strengthens the successful neural connections while diminishing others. This process is called adjusting parameters: parameters are the variables within a model that dictate how it behaves and what it produces, and tuning them minimizes error and increases accuracy. Evo 2 was trained at 7 billion and 40 billion parameters with a context window of up to 1 million tokens, meaning the genomic data was fed through many neurons and fine-tuned many times.
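A minimal sketch of that trial-and-error loop is shown below: the model makes a prediction, measures its error, and nudges its parameter to shrink the error on the next attempt. This toy example fits a single parameter and only illustrates the idea of adjusting parameters; it bears no resemblance to the scale of Evo 2's training.

```python
# Toy training loop: repeatedly attempt the problem, measure the error, and
# adjust the parameter a little in the direction that reduces the error.
data = [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0)]   # inputs x with targets y = 3x
w = 0.0                                        # a single adjustable parameter
learning_rate = 0.01

for step in range(200):
    for x, y in data:
        prediction = w * x
        error = prediction - y
        w -= learning_rate * error * x         # gradient step for squared error

print(round(w, 3))  # approaches 3.0 as training minimizes the error
```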

Figure: Example neural network modeled using TensorFlow. Adapted from playground.tensorflow.org.

The idea of anyone being able to create genetic code may spark fear; however, Evo 2's developers have prevented the model from returning productive answers to inquiries about pathogens, and the training data was carefully chosen to exclude pathogens that infect humans and other complex organisms. Furthermore, the positive possibilities of Evo 2 likely extend well beyond what we are currently aware of: scientists believe Evo 2 will advance our understanding of biological systems by generalizing across massive genomic data of known biology. This may reveal higher-level patterns and unearth more biological truths from a bird's-eye view.

It's important to note that Evo 2 is a foundation model, emphasizing generalist capabilities over task-specific optimization. It was intended to be a foundation for scientists to build upon and adapt for their own projects. Because it is open source, anyone can access the model code and training data, and anyone (even you!) can generate their own strings of genetic code with Evo Designer.

Biotechnology is rapidly advancing. For example, DNA origami allows scientists to fold DNA into highly specialized nanostructures of nearly any shape, including smiley faces and maps of China, potentially allowing scientists to use DNA code to design biological robots far smaller than any robot we have today. These tiny robots could target highly specific areas of the body, such as receptors on cancer cells. Evo 2, with its design abilities, opens up many possibilities for DNA origami. From gene therapy, to mutation prediction, to miniature smiley faces, it is clear that computation is becoming increasingly important in understanding the most obscure intricacies of life, and we are just at the start.

 

Garyk Brixi, Matthew G. Durrant, Jerome Ku, Michael Poli, Greg Brockman, Daniel Chang, Gabriel A. Gonzalez, Samuel H. King, David B. Li, Aditi T. Merchant, Mohsen Naghipourfar, Eric Nguyen, Chiara Ricci-Tam, David W. Romero, Gwanggyu Sun, Ali Taghibakshi, Anton Vorontsov, Brandon Yang, Myra Deng, Liv Gorton, Nam Nguyen, Nicholas K. Wang, Etowah Adams, Stephen A. Baccus, Steven Dillmann, Stefano Ermon, Daniel Guo, Rajesh Ilango, Ken Janik, Amy X. Lu, Reshma Mehta, Mohammad R.K. Mofrad, Madelena Y. Ng, Jaspreet Pannu, Christopher Ré, Jonathan C. Schmok, John St. John, Jeremy Sullivan, Kevin Zhu, Greg Zynda, Daniel Balsam, Patrick Collison, Anthony B. Costa, Tina Hernandez-Boussard, Eric Ho, Ming-Yu Liu, Thomas McGrath, Kimberly Powell, Dave P. Burke, Hani Goodarzi, Patrick D. Hsu, Brian L. Hie (2025). Genome modeling and design across all domains of life with Evo 2. bioRxiv preprint doi: https://doi.org/10.1101/2025.02.18.638918.

 

Filed Under: Biology, Computer Science and Tech, Science Tagged With: AI, Computational biology

Computer Vision Ethics

May 4, 2025 by Madina Sotvoldieva

Computer vision (CV) is a field of computer science that allows computers to “see” or, in more technical terms, recognize, analyze, and respond to visual data, such as videos and images. CV is widely used in our daily lives, from something as simple as recognizing handwritten text to something as complex as analyzing and interpreting MRI scans. With the advent of AI in the last few years, CV has also been improving rapidly. However, just like any subfield of AI nowadays, CV has its own set of ethical, social, and political implications, especially when used to analyze people’s visual data.

Although CV has been around for some time, there is limited work on its ethical limitations within the broader AI field. Among the existing literature, authors have categorized six ethical themes: espionage, identity theft, malicious attacks, copyright infringement, discrimination, and misinformation [1]. As seen in Figure 1, one of the main CV applications is face recognition, which can also lead to issues of error, function creep (the expansion of technology beyond its original purposes), and privacy [2].

Figure 1: Specific applications of CV that could be used for identity theft.

To discuss CV's ethics, the authors of the article take a critical approach, evaluating its implications through the framework of power dynamics. They analyze three types of power: dispositional, episodic, and systemic [3].

Dispositional Power

Dispositional power is defined as the ability to bring about a significant outcome [4]. When people gain that power, they feel empowered to explore new opportunities, and their scope of agency increases (they become more independent in their actions) [5]. However, CV can threaten this dispositional power in several ways, ultimately reducing people's autonomy.

One way CV disempowers people is by limiting their control over their information. Since CV works with both pre-existing and real-time camera footage, people are often unaware that they are being recorded and often cannot avoid it. The technology therefore makes it hard for people to control the data gathered about them, and protecting their personal information can become as extreme as hiding their faces.

Apart from limiting people's control over what data is gathered about them, advanced technologies make it extremely difficult for an average person to know what specific information can be retrieved from visual data. CV can also disempower people from following their own judgment by communicating who they are for them (automatically inferring their race, gender, and mood), creating a forced moral environment (where people act out of fear of being watched rather than from their own intentions), and potentially fostering over-dependence on computers (e.g., relying on face recognition to interpret emotions).

In all these and other ways, CV undermines the foundation of dispositional power by limiting people’s ability to control their information, make independent decisions, express themselves, and act freely.

Episodic Power

Episodic power, often referred to as power-over, is the direct exercise of power by one individual or group over another. CV can both grant new power and improve the efficiency of existing power [6]. While this isn't always a bad thing (for example, parents watching over children), problems arise when CV makes that control too invasive or one-sided, especially in ways that limit people's freedom to act independently.

With CV taking security cameras to the next level, opportunities such as baby-room monitoring or fall detection for elderly people open up to us. However, it also raises the issue of automated surveillance, which can lead to over-enforcement at scales ranging from private individuals to larger organizations (workplaces, insurance companies, etc.). Other shifts in power dynamics need to be considered as well: smart doorbells, for example, capture far more than the person at the door and can violate a neighbor's privacy by creating peer-to-peer surveillance.

These examples show that while CV may offer convenience or safety, it can also tip power balances in ways that reduce personal freedom and undermine one’s autonomy.

Systemic Power

Systemic power is not viewed as an individual exercise of power, but rather as a set of societal norms and practices that affect people's autonomy by determining what opportunities people have, what values they hold, and what choices they make. CV can strengthen systemic power by making law enforcement more efficient through smart cameras and by increasing businesses' profits through business intelligence tools.

However, CV can also reinforce pre-existing systemic injustices. One example is flawed facial recognition: algorithms are more likely to correctly recognize White and male faces [7], which has already led to a number of false arrests. Such bias can leave people with unequal opportunities (when biased systems are used in hiring) or harm their self-worth (when they are falsely identified as criminals).

Another matter of systemic power is the environmental cost of CV. AI systems rely on vast amounts of data, which require intensive energy to process and store. As societies become increasingly dependent on AI technologies like CV, those trying to protect the environment have little ability to resist or reshape these damaging practices. The power lies with tech companies and industries, leaving citizens without the means to challenge the system. When the system becomes harder to challenge or change, ethical concerns about CV arise.

Conclusion

Computer vision is a powerful tool that keeps evolving each year. We already see numerous applications of it in our daily lives, from self-checkouts in stores and smart doorbells to autonomous vehicles and tumor detection. Alongside the potential CV holds for improving our lives and making them safer, there are a number of ethical limitations that should be considered. We need to critically examine how CV affects people's autonomy, creates one-sided power dynamics, and reinforces societal prejudices. As we rapidly transition into an AI-driven world, there is more to come in the field of computer vision. In the pursuit of innovation, however, we should ensure that progress does not come at the cost of our ethical values.

References:

[1] Lauronen, M.: Ethical issues in topical computer vision applications. Information Systems, Master’s Thesis. University of Jyväskylä. (2017). https://jyx.jyu.fi/bitstream/handle/123456789/55806/URN%3aNBN%3afi%3ajyu-201711084167.pdf?sequence=1&isAllowed=y

[2] Brey, P.: Ethical aspects of facial recognition systems in public places. J. Inf. Commun. Ethics Soc. 2(2), 97–109 (2004). https://doi.org/10.1108/14779960480000246

[3] Haugaard, M.: Power: a “family resemblance concept.” Eur. J. Cult. Stud. 13(4), 419–438 (2010)

[4] Morriss, P.: Power: a philosophical analysis. Manchester University Press, Manchester, New York (2002)

[5] Morriss, P.: Power: a philosophical analysis. Manchester University Press, Manchester, New York (2002)

[6] Brey, P.: Ethical aspects of facial recognition systems in public places. J. Inf. Commun. Ethics Soc. 2(2), 97–109 (2004). https://doi.org/10.1108/14779960480000246

[7] Buolamwini, J., Gebru, T.: Gender shades: intersectional accuracy disparities in commercial gender classification. Conference on Fairness, Accountability, and Transparency, pp. 77–91 (2018)

Coeckelbergh, M.: AI ethics. MIT Press (2020)

Filed Under: Computer Science and Tech, Science Tagged With: AI, AI ethics, artificial intelligence, Computer Science and Tech, Computer Vision, Ethics, Technology

AI – save or ruin the environment?

December 8, 2024 by Madina Sotvoldieva

With the fast speed that AI is currently developing, it has the potential to alleviate one of the most pressing problems—climate change. AI applications, such as smart electricity grids and sustainable agriculture, are predicted to mitigate environmental issues. On the flip side, the integration of AI in this field can also be counterproductive because of the high energy demand of the systems. If AI helps us to transition to a more sustainable lifestyle, the question is, at what cost?

The last decade saw exponential growth in data demand and the development of Large Language Models (LLMs), computational models such as ChatGPT designed to generate natural language. These algorithms increased energy consumption because of the large data volumes and computational power they require, as well as the water needed to cool the data centers that house that data. This in turn leads to higher greenhouse gas emissions (Fig. 1). For example, training GPT-3 on a 500-billion-word database produced around 550 tons of carbon dioxide, equivalent to flying 33 times from Australia to the UK [1]. Moreover, information and communications technology (ICT) accounts for 3.9% of global greenhouse gas emissions (surpassing global air travel) [2]. As the number of training parameters grows, so does the energy consumption, which is expected to reach over 30% of the world's total energy consumption by 2030. These environmental concerns about AI implementation led to a new term: Green AI.
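Estimates like these typically come from a simple back-of-the-envelope calculation: hardware power draw times training time, multiplied by a data-center overhead factor and the carbon intensity of the local grid. The numbers in the sketch below are illustrative placeholders, not GPT-3's actual training statistics.

```python
# Illustrative back-of-the-envelope estimate of training emissions.
# All numbers are placeholder assumptions, not measured values for any model.
gpu_count = 1000            # accelerators used for training
gpu_power_kw = 0.3          # average draw per accelerator, in kilowatts
training_hours = 24 * 30    # one month of training
pue = 1.5                   # data-center overhead (power usage effectiveness)
grid_kg_co2_per_kwh = 0.4   # carbon intensity of the electricity grid

energy_kwh = gpu_count * gpu_power_kw * training_hours * pue
emissions_tonnes = energy_kwh * grid_kg_co2_per_kwh / 1000
print(f"{energy_kwh:,.0f} kWh ~ {emissions_tonnes:,.0f} tonnes CO2e")
```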

Fig 1: CO2-equivalent emissions for training ML models (blue) and real-life reference cases (violet). The number of parameters (in billions) is shown in brackets for each model [3].

Green algorithms are defined in two ways: green-in and green-by AI (Fig. 2). Algorithms that support the use of technology to tackle environmental issues are referred to as green-by AI. Green-in-design algorithms (green-in AI), on the other hand, are those that maximize energy efficiency to reduce the environmental impact of AI. 

 

Fig. 2. Overview of green-in vs. green-by algorithms.

 

Green-by AI has the potential to reduce greenhouse gas emissions by enhancing efficiency across many sectors, such as agriculture, biodiversity management, transportation, smart mobility, etc. 

  • Energy Efficiency. Machine Learning (ML) algorithms can optimize heating, air conditioning, and lighting by analyzing data from smart buildings, making them more energy efficient [4][5].
  • Smart Mobility. AI can predict and avoid traffic congestion by analyzing the current traffic patterns and optimizing routes. Moreover, ML contributes to Autonomous Vehicles by executing tasks like road following and obstacle detection, which improves overall road safety [6].
  • Sustainable agriculture. Data from sensors and satellites analyzed by ML can give farmers insights into crop health, soil conditions, and irrigation needs. This enables them to use resources with precision and reduce environmental impacts. Moreover, predictive analytics minimize crop loss by allowing farmers to address diseases in time [7].
  • Climate Change. Computer-vision technologies can detect methane leaks in gas pipes, reducing emissions from fossil fuels. AI also plays a crucial role in reducing electricity usage by predicting demand and supply from solar and wind power.
  • Environmental Policies. AI’s ability to process data, identify trends, and predict outcomes will enable policymakers to come up with effective strategies to combat environmental issues [8].

Green-in AI, on the other hand, is energy-efficient AI with a low carbon footprint, better-quality data, and logical transparency. To earn people's trust, it offers clear and rational decision-making processes, which also makes it socially sustainable. Several promising approaches to achieving green-in AI include algorithm, hardware, and data-center optimization. Specifically, more efficient graphics processing units (GPUs) or parallelization (distributing computation among several processing cores) can reduce the environmental impact of training AI. Anthony et al. showed that increasing the number of processing units up to 15 decreased greenhouse gas emissions [9]. However, the reduction in runtime must be large enough for parallelization not to become counterproductive: if the runtime falls proportionally less than the number of cores increases, emissions actually rise. Other methods include computing at the locations where the data is collected, to avoid data transmission, and limiting the number of times an algorithm is run.
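That parallelization caveat can be made concrete with a simple energy model: total energy scales roughly with the number of cores times the runtime, so adding cores only pays off if runtime falls at least proportionally. The figures in the sketch below are assumed values chosen purely to illustrate the tradeoff.

```python
# Illustrative tradeoff behind parallelization: energy ~ cores * power * runtime.
# If doubling the cores does not cut the runtime enough, emissions go up.
power_per_core_kw = 0.25

def energy_kwh(cores: int, runtime_hours: float) -> float:
    return cores * power_per_core_kw * runtime_hours

baseline = energy_kwh(cores=8, runtime_hours=10.0)       # 20 kWh
good_scaling = energy_kwh(cores=16, runtime_hours=4.5)   # 18 kWh: parallelization pays off
poor_scaling = energy_kwh(cores=16, runtime_hours=7.0)   # 28 kWh: counterproductive
print(baseline, good_scaling, poor_scaling)
```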

Now that we know about AI’s impact and the ways to reduce it, what trends can we expect in the future? 

  • Hardware: Innovation in hardware design is focused on creating both eco-friendly and powerful AI accelerators, which can minimize energy consumption [10].
  • Neuromorphic computing is an emerging area in the computing technology field, aiming to create more efficient computing systems. It draws inspiration from the human brain, which performs complex tasks with much less energy than conventional computers. 
  • Energy-harvesting AI devices. Researchers are exploring ways in which AI devices can harvest energy from their surroundings, for example from ambient light or heat [11]. This way, AI can rely less on external power and become more self-sufficient.

In conclusion, while AI holds great potential in alleviating many environmental issues, we should not forget about its own negative impact. While training AI models results in excessive greenhouse gas emissions, there are many ways to reduce energy consumption and make AI more environmentally friendly. Although we discussed several future trends in green-in AI, it is important to remember this field is still continuously evolving and new innovations will emerge in the future.

References:

[1] D. Patterson, J. Gonzalez, Q. Le, C. Liang, L.-M. Munguia, D. Rothchild, D. So, M. Texier, J. Dean, Carbon emissions and large neural network training, 2021, arXiv:2104.10350.

[2] Knowles, Bran. "ACM TPC TechBrief on Computing and Carbon Emissions." Association for Computing Machinery, Nov. 2021. www.acm.org/media-center/2021/october/tpc-tech-brief-climate-change

[3] Nestor Maslej, Loredana Fattorini, Raymond Perrault, Vanessa Parli, Anka Reuel, Erik Brynjolfsson, John Etchemendy, Katrina Ligett, Terah Lyons, James Manyika, Juan Carlos Niebles, Yoav Shoham, Russell Wald, and Jack Clark, “The AI Index 2024 Annual Report,” AI Index Steering Committee, Institute for Human-Centered AI, Stanford University, Stanford, CA, April 2024. 

[4] N. Milojevic-Dupont, F. Creutzig, Machine learning for geographically differentiated climate change mitigation in urban areas, Sustainable Cities Soc. 64 (2021) 102526.

[5] T.M. Ghazal, M.K. Hasan, M. Ahmad, H.M. Alzoubi, M. Alshurideh, Machine learning approaches for sustainable cities using internet of things, in: The Effect of Information Technology on Business and Marketing Intelligence Systems, Springer, 2023, pp. 1969–1986.

[6] M. Bojarski, D. Del Testa, D. Dworakowski, B. Firner, B. Flepp, P. Goyal, L.D. Jackel, M. Monfort, U. Muller, J. Zhang, et al., End to end learning for self-driving cars, 2016, arXiv preprint arXiv:1604.07316. 

[7] R. Sharma, S.S. Kamble, A. Gunasekaran, V. Kumar, A. Kumar, A systematic literature review on machine learning applications for sustainable agriculture supply chain performance, Comput. Oper. Res. 119 (2020) 104926.

[8] N. Sánchez-Maroño, A. Rodríguez Arias, I. Lema-Lago, B. Guijarro-Berdiñas, A. Dumitru, A. Alonso-Betanzos, How agent-based modeling can help to foster sustainability projects, in: 26th International Conference on Knowledge-Based and Intelligent Information and Engineering Systems, KES, 2022.

[9] L.F.W. Anthony, B. Kanding, R. Selvan, Carbontracker: Tracking and predicting the carbon footprint of training deep learning models, 2020, arXiv preprint arXiv:2007.03051. 

[10] H. Rahmani, D. Shetty, M. Wagih, Y. Ghasempour, V. Palazzi, N.B. Carvalho, R. Correia, A. Costanzo, D. Vital, F. Alimenti, et al., Next-generation IoT devices: Sustainable eco-friendly manufacturing, energy harvesting, and wireless connectivity, IEEE J. Microw. 3 (1) (2023) 237–255.

[11] Divya S., Panda S., Hajra S., Jeyaraj R., Paul A., Park S.H., Kim H.J., Oh T.H. Smart data processing for energy harvesting systems using artificial intelligence.

Filed Under: Computer Science and Tech Tagged With: AI, climate change, emissions, green-by AI, green-in AI, Language Models, sustainability, Technology

Machine learning and algorithmic bias

December 8, 2024 by Mauricio Cuba Almeida

Algorithms permeate modern society, especially AI algorithms. Artificial intelligence (AI) is built with various techniques, like machine learning, deep learning, or natural language processing, that train AI to mimic humans at a certain task. Healthcare, loan approval, and security surveillance are a few industries that have begun using AI (Alowais et al., 2023; Purificato et al., 2022; Choung et al., 2024). Most people will continue to interact with AI on a daily basis, often without realizing it.

However, what are the problems faced by an increasing algorithmic society? Authors Sina Fazelpour and David Danks, in their article, explore this question in the context of algorithmic bias. Indeed, the problem they identify is that AI perpetuates bias. At its most neutral, Fazelpour and Danks (2021) explain that algorithmic bias is some “systematic deviation in algorithm output, performance, or impact, relative to some norm or standard,” suggesting that algorithms can be biased against a moral, statistical, or social norm. Fazelpour and Danks use a running example of a university training an AI algorithm with past student data to predict future student success. Thus, this algorithm exhibits a statistical bias if student success predictions are discordant with what has happened historically (in training data). Similarly, the algorithm exhibits a moral bias if it illegitimately depends on the student’s gender to produce a prediction. This is seen already in facial recognition algorithms that “perform worse for people with feminine features or darker skin” or recidivism prediction models that rate people of color as higher risk (Fazelpour & Danks, 2021). Clearly, algorithmic biases have the potential to preserve or exacerbate existing injustices under the guise of being “objective.” 

Algorithmic bias can manifest through different means. As Fazelpour and Danks discuss, harmful bias can appear even before an algorithm is built if values and norms are not carefully considered. In the example of a student-success prediction model, universities must make value judgments, specifying which target variables define "student success," whether that is grades, respect from peers, or post-graduation salary. The more complex the goal, the more difficult and contested the choice of target variables becomes. Indeed, choosing target variables is itself a source of algorithmic bias. As Fazelpour and Danks explain, enrollment or financial aid decisions based on predicted student success may discriminate against minority students if first-year performance is used in that prediction, since minority students may face additional challenges.

Using biased training data will also lead to bias in an AI algorithm. In other words, bias in the measured world will be reflected in AI algorithms that mimic our world. For example, recruiting AI that reviews resumes is often trained on the resumes of employees already hired by the company. In many cases, so-called gender-blind recruiting AI has discriminated against women by picking up on gendered information in a resume that was absent from the resumes of a majority-male workplace (Pisanelli, 2022; Parasurama & Sedoc, 2021). Fazelpour and Danks also note that biased data can arise from limitations and biases in measurement methods. This is what happens when facial recognition systems are trained predominantly on white faces: the systems are less effective for individuals who do not look like the data the algorithm was trained on.
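One standard way to surface this kind of bias is to compare a model's accuracy across demographic groups, in the spirit of the facial recognition audits mentioned above. The sketch below computes per-group accuracy from a handful of made-up predictions; a large gap between groups is the warning sign.

```python
from collections import defaultdict

# Audit sketch: compare a model's accuracy across demographic groups.
# The records below are hypothetical, made up purely for illustration.
records = [
    # (group, true_label, predicted_label)
    ("group_a", 1, 1), ("group_a", 0, 0), ("group_a", 1, 1), ("group_a", 0, 1),
    ("group_b", 1, 0), ("group_b", 0, 0), ("group_b", 1, 0), ("group_b", 0, 1),
]

correct, total = defaultdict(int), defaultdict(int)
for group, truth, prediction in records:
    total[group] += 1
    correct[group] += int(truth == prediction)

for group in total:
    print(group, correct[group] / total[group])   # a large gap signals biased performance
```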

Alternatively, users' misinterpretations of predictive algorithms may produce biased results, Fazelpour and Danks argue. An algorithm is optimized for one purpose, and, without even knowing it, users may deploy that algorithm for another. A user could inadvertently interpret predicted "student success" as a metric for grades rather than what the algorithm is actually optimized to predict (e.g., likelihood of dropping out). Decisions stemming from misinterpretations of algorithmic predictions are doomed to be biased, and not just for the aforementioned reasons. Misunderstandings of algorithmic predictions also lead to poor decisions when the variables predicting an outcome are assumed to cause that outcome. Students in advanced courses may be predicted to have higher student success, but as Fazelpour and Danks put it, we shouldn't enroll every underachieving student in an advanced course. Such models should also be applied in contexts similar to the one in which the historical data was collected, and this matters more the longer a model is used, as present data drifts away from the historical training data. In other words, a student success model created for a small private college should not be deployed at a large public university, nor many years later.

Fazelpour and Danks establish that algorithmic bias is nearly impossible to eliminate—solutions often must engage with the complexities of our society. The authors delve into several technical solutions, such as optimizing an algorithm using “fairness” as a constraint or training an algorithm on corrected historical data. This quickly reveals itself to be problematic, as determining fairness is a difficult value judgment. Nonetheless, algorithms provide tremendous benefit to us, even in moral and social ways. Algorithms can identify biases and serve as better alternatives to human practices. Fazelpour and Danks conclude that algorithms should continue to be studied in order to identify, mitigate, and prevent bias.

References

Alowais, S. A., Alghamdi, S. S., Alsuhebany, N., Alqahtani, T., Alshaya, A. I., Almohareb, S. N., Aldairem, A., Alrashed, M., Saleh, K. B., Badreldin, H. A., Yami, M. S. A., Harbi, S. A., & Albekairy, A. M. (2023). Revolutionizing healthcare: the role of artificial intelligence in clinical practice. BMC Medical Education, 23(1). https://doi.org/10.1186/s12909-023-04698-z

Choung, H., David, P., & Ling, T. (2024). Acceptance of AI-Powered Facial Recognition Technology in Surveillance scenarios: Role of trust, security, and privacy perceptions. Technology in Society, 102721. https://doi.org/10.1016/j.techsoc.2024.102721

Fazelpour, S., & Danks, D. (2021). Algorithmic bias: Senses, sources, solutions. Philosophy Compass, 16(8). https://doi.org/10.1111/phc3.12760

Parasurama, P., & Sedoc, J. (2021, December 16). Degendering resumes for fair algorithmic resume screening. arXiv.org. https://arxiv.org/abs/2112.08910

Pisanelli, E. (2022). Your resume is your gatekeeper: Automated resume screening as a strategy to reduce gender gaps in hiring. Economics Letters, 221, 110892. https://doi.org/10.1016/j.econlet.2022.110892

Purificato, E., Lorenzo, F., Fallucchi, F., & De Luca, E. W. (2022). The use of responsible artificial intelligence techniques in the context of loan approval processes. International Journal of Human-Computer Interaction, 39(7), 1543–1562. https://doi.org/10.1080/10447318.2022.2081284

Filed Under: Computer Science and Tech Tagged With: AI, AI ethics, artificial intelligence, Ethics

ChatGPT Beats Humans in Emotional Awareness Test: What’s Next?

December 3, 2023 by Nicholas Enbar-Salo '27

In recent times, it can seem like everything revolves around artificial intelligence (AI). From AI-powered robots performing surgery to facial recognition on smartphones, AI has become an integral part of modern life. While AI has affected nearly every industry, most have been slowly integrating it into their field while trying to minimize the risks involved. One field with particularly great potential is mental health care. Indeed, some studies have already explored uses of AI to assist mental health work. For instance, one study used AI to predict the probability of suicide from users' health insurance records (Choi et al., 2018), while another showed that AI could identify people with depression based on their social media posts (Aldarwish & Ahmad, 2017).

Perhaps the most widespread AI technology is ChatGPT, a public natural language processing chatbot that can help with a plethora of tasks, from writing an essay to playing chess. There has been much discussion about the potential of such chatbots in mental health care and therapy, but few studies have been published on the matter. However, a study by Zohar Elyoseph and colleagues has started the conversation about chatbots' potential, specifically ChatGPT's, in therapy. In this study, the team gave ChatGPT the Levels of Emotional Awareness Scale (LEAS) to measure its capacity for emotional awareness (EA), a core part of empathy and an essential skill for therapists (Elyoseph et al., 2023). The LEAS presents 20 scenarios in which someone experiences an event that elicits an emotional response, and the test-taker must describe what emotions that person is likely feeling. Two examinations of the LEAS, one month apart, were given to ChatGPT to test two different versions of the model and to see whether updates during that month would improve its performance. On both examinations, two licensed psychologists scored ChatGPT's responses to ensure the reliability of its score. On the first examination, in January 2023, ChatGPT achieved a score of 85 out of 100, compared to averages of 56.21 for French men and 58.94 for French women. On the second examination, in February 2023, ChatGPT achieved a score of 98: nearly a perfect score, a significant improvement over the already high score of 85 a month prior, and higher than most licensed psychologists (Elyoseph et al., 2023).

This study shows that, not only is ChatGPT more capable than humans at EA, but it is also rapidly improving at it. This has massive implications for in-person therapy. While there is more to being a good therapist than just emotional awareness, it is a major part of it. Therefore, based on this study, there is potential for chatbots like ChatGPT to rival, or possibly even replace, therapists if developers are able to develop the other interpersonal traits of good therapists. 

However, more work needs to be done before ChatGPT and AI can really be implemented into the mental health field in this manner. To start, while AI is capable of the technical aspects of therapy, such as giving sound advice and validating a client's emotions, ChatGPT and other chatbots sometimes give "illusory responses," or fabricated responses that they present as legitimate (Hagendorff et al., 2023). For example, ChatGPT will sometimes say "5 + 5 = 11" if you ask what 5 + 5 is, even though the answer is clearly wrong. While this is a very obvious example of an illusory response, harm can be done if the user cannot distinguish between real and illusory responses on more complex subjects. These responses can be extremely harmful in situations such as therapy, since clients rely on a therapist for guidance, and if such guidance were fabricated, it could harm rather than help the client. Furthermore, there are concerns regarding the dehumanization of therapy, the loss of jobs for therapists, and the breach of a client's privacy if AI were to replace therapists (Abrams, 2023).

Fig 1. Sample conversation with Woebot, which provides basic therapy to users. Adapted from Darcy et al., 2021. 

However, rudimentary AI programs that try to bolster the mental health infrastructure are already emerging. Replika, for instance, is an avatar-based chatbot that offers therapeutic conversation with the user and saves previous conversations to remember them in the future. Woebot provides a similar service (Figure 1), offering cognitive-behavioral therapy (CBT) for anxiety and depression (Pham et al., 2022). While some are wary of applications like these, the technologies should be embraced: as they become more refined, they could provide a low-commitment, accessible source of mental health care for those who are unable to reach out to a therapist, such as people who are nervous about contacting a real therapist, who live in rural environments without convenient access to one, or who lack the financial means for mental health support. AI can also be used as a tool for therapists in the office. For example, a natural language processing application, Eleos, can take notes and highlight themes and risks for therapists to review after the session (Abrams, 2023).

There are certainly drawbacks of AI in therapy, such as the dehumanization of therapy, that may not have a solution and could therefore limit AI's influence in the field. There is certainly a chance that some people would never trust AI to give them empathetic advice. However, people said the same when robotic surgeries began being used in clinical settings, and most have since embraced them due to their superb success rate. Regardless of whether these problems are resolved, AI in the mental health industry has massive potential, and we must ensure that the risks and drawbacks of such technology are addressed so that we can make the most of this potential and bring better options to those who need them.

 

Citations

Abrams, Z. (2023, July 1). AI is changing every aspect of psychology. Here’s what to watch for. Monitor on Psychology, 54(5). https://www.apa.org/monitor/2023/07/psychology-embracing-ai

 

Aldarwish MM, Ahmad HF. Predicting Depression Levels Using Social Media Posts. Proc – 2017 IEEE 13th Int Symp Auton Decentralized Syst ISADS 2017 2017;277–80.

 

Choi SB, Lee W, Yoon JH, Won JU, Kim DW. Ten-year prediction of suicide death using Cox regression and machine learning in a nationwide retrospective cohort study in South Korea. J Affect Disord. 2018;231(January):8–14.

 

Darcy, Alison & Daniels, Jade & Salinger, David & Wicks, Paul & Robinson, Athena. (2021). Evidence of Human-Level Bonds Established With a Digital Conversational Agent: Cross-sectional, Retrospective Observational Study. JMIR Formative Research. 5. e27868. 10.2196/27868. 

 

Elyoseph, Z., Hadar-Shoval, D., Asraf, K., & Lvovsky, M. (2023). ChatGPT outperforms humans in emotional awareness evaluations. Frontiers in psychology, 14, 1199058. 

 

Hagendorff, T., Fabi, S. & Kosinski, M. Human-like intuitive behavior and reasoning biases emerged in large language models but disappeared in ChatGPT. Nat Comput Sci 3, 833–838 (2023).

 

Pham K. T., Nabizadeh A., Selek S. (2022). Artificial intelligence and chatbots in psychiatry. Psychiatry Q. 93, 249–253.



Filed Under: Computer Science and Tech, Psychology and Neuroscience, Science Tagged With: AI, AI ethics, ChatGPT, therapy

Mimicking the Human Brain: The Role of Heterogeneity in Artificial Intelligence

April 10, 2022 by Jenna Albanese '24

Picture this: you’re in the passenger seat of a car, weaving through an urban metropolis – say New York City. As expected, you see plenty of people: those who are rushed, lingering, tourists, locals, old, young, et cetera. But let’s zoom in: take just about any one of those individuals in the city, and you will find 86 billion nerve cells, or neurons, in their brain carrying them through daily life. For comparison, this means that the number of neurons in the human brain is about ten thousand times the number of residents in New York City.

But let’s zoom in even further: each one of those 86 billion neurons in the brain is ever-so-slightly different from one another. For example, while some neurons work extremely quickly in making decisions that guide basic processes in the brain, others work more slowly, basing their decisions off surrounding neurons’ activity. This difference in decision-making time among our neurons is called heterogeneity. Previously, researchers were unsure of heterogeneity’s importance in our lives, but its existence was certain. This is just one example of the almost incomprehensible detail of the brain that makes human thinking so complex, and even difficult to fully understand for modern researchers.

Now, let’s zoom in again, but this time not on the person’s brain. Instead, let’s zoom into the cell phone this individual might have in their pocket or their hand. While a cell phone does not function exactly the same as the human brain, aspects of the device are certainly modeled after human thinking. Virtual assistants, like Siri or Cortana, for instance, compose responses to general inquiries that resemble human interaction.

This type of highly advanced digital experience is the result of artificial intelligence. Since the 1940s, elements of artificial intelligence have been modeled after features of the human brain,  fashioned as a neural network composed of nodes, some serving as inputs and others as outputs. The nodes are comparable to brain cells, and they communicate with each other through a series of algorithms to produce outputs. However, in these technological brain models, every node is typically modeled in the same way in terms of the time they take to respond to a given situation (Science Daily 2021). This is quite unlike the human brain, where heterogeneity ensures that each neuron responds to stimuli at different speeds. But does this even matter? Do intricate qualities of the brain like heterogeneity really make a difference in our thinking, or in digital functioning if incorporated into artificial intelligence?

The short answer is yes, at least in the case of heterogeneity. Researchers have recently investigated how heterogeneity influences an artificial neural network's performance on visual and auditory classification tasks. In the study, cells within a neural network were given different "time constants," a measure of how long a cell takes to respond to a situation given the responses of nearby cells. In essence, the researchers varied the heterogeneity of the artificial neural networks. The results were astonishing: once heterogeneity was introduced, the artificial neural networks completed the tasks more efficiently and accurately. The strongest result revealed a 15-20% improvement on auditory tasks as soon as heterogeneity was introduced to the artificial neural network (Perez-Nieves et al. 2021).
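To give a sense of what a "time constant" means for a model neuron, the sketch below simulates simple leaky-integrator units: each neuron relaxes toward its input at a speed set by its own time constant, and heterogeneity just means those speeds differ across the population. It follows the spirit of Perez-Nieves et al. (2021) but is not their actual model or code.

```python
import numpy as np

# Leaky-integrator units: each neuron relaxes toward its input at a speed set by
# its time constant tau. Heterogeneity means a different tau per neuron;
# homogeneous networks give every neuron the same tau.
rng = np.random.default_rng(1)
n_neurons, dt = 5, 1.0

tau_homogeneous = np.full(n_neurons, 20.0)
tau_heterogeneous = rng.uniform(5.0, 40.0, size=n_neurons)   # varied response speeds

def simulate(tau: np.ndarray, drive: float = 1.0, steps: int = 50) -> np.ndarray:
    state = np.zeros(n_neurons)
    for _ in range(steps):
        state += (dt / tau) * (drive - state)   # faster neurons (small tau) react sooner
    return state

print(simulate(tau_homogeneous))    # all neurons respond identically
print(simulate(tau_heterogeneous))  # a spread of response speeds
```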

This result indicates that heterogeneity helps us think systematically, improve our task performance, and learn in changing conditions (Perez-Nieves et al. 2021). So perhaps it would be advantageous to incorporate heterogeneity into standard artificial intelligence models. With this change, technology’s way of “thinking” will come one step closer to functioning like a human brain, adopting a similar level of complexity and intricacy.

So, why does this matter? If parts of artificial intelligence are modeled closer and closer to how the human brain works, real-world benefits abound, and we’re talking on a level grander than virtual assistants. One prominent example is in head and neck cancer prognosis. Clinical predictors of head and neck cancer prognosis include factors like age, pathological findings, HPV status, and tobacco and alcohol consumption (Chinnery et al. 2020). With a multitude of factors at play, physicians spend excessive amounts of time analyzing head and neck cancer patients’ lifestyles in order to deduce an accurate prognosis. Alternatively, artificial intelligence could be used to model this complex web of factors for these cancer patients, and physicians’ time could be spent on other endeavors.

This type of clinical application is still far from implementation, but remains in sight for modern researchers. As the brain is further explored and understood, more and more of the elements that comprise advanced human thinking can be incorporated into technology. Now, put yourself in the shoes of our New York City passerby: how would you feel if the small cell phone in your pocket was just as intelligent and efficient as the 86 billion neurons in your head? How about if that cell phone solved problems like you do and thought like you think, in essence serving as a smaller version of your own brain? It is almost unfathomable! Yet, by harnessing heterogeneity, researchers have come one step closer toward realizing this goal.

 

References

Chinnery, T., Arifin, A., Tay, K. Y., Leung, A., Nichols, A. C., Palma, D. A., Mattonen, S. A., & Lang, P. (2020). Utilizing artificial intelligence for head and neck cancer outcomes prediction from imaging. Canadian Association of Radiologists Journal, 72(1), 73–85. https://doi.org/10.1177/0846537120942134.

Perez-Nieves, N., Leung, V. C. H., Dragotti, P. L., & Goodman, D. F. M. (2021). Neural heterogeneity promotes robust learning. Nature Communications, 12(1). https://doi.org/10.1038/s41467-021-26022-3. 

ScienceDaily. (2021, October 6). Brain cell differences could be key to learning in humans and AI. ScienceDaily. Retrieved February 27, 2022, from https://www.sciencedaily.com/releases/2021/10/211006112626.htm.

Filed Under: Computer Science and Tech, Psychology and Neuroscience Tagged With: AI, heterogeneity, neural network

What is more urgent for AI research: long-term or short-term concerns? Experts disagree

April 26, 2021 by Micaela Simeone '22

In a 2015 TED Talk, philosopher and Founding Director of the Oxford Future of Humanity Institute Nick Bostrom discusses the prospect of machine superintelligence: AI that would supersede human-level general intelligence. He begins by noting that with the advent of machine-learning models, we have shifted into a new paradigm of algorithms that learn—often from raw data, similar to the human infant (Bostrom, “What Happens” 3:26 – 3:49).

We are, of course, still in the era of narrow AI: the human brain possesses many capabilities beyond those of the most powerful AI. However, Bostrom notes that artificial general intelligence (AGI)—AI that can perform any intellectual task a human can—has been projected by many experts to arrive around mid- to late-century (Müller and Bostrom, 1) and that the period in between the development of AGI and whatever comes next may not be long at all.

Of course, Bostrom notes, the ultimate limits to information processing in the machine substrate lie far outside the limits of biological tissue due to factors such as size and speed difference (“What Happens” 5:05 – 5:43). So, Bostrom says, the potential for superintelligence lies dormant for now, but in this century, scientists may unlock a new path forward in AI. We might then see an intelligence explosion constituting a new shift in the knowledge substrate, and resulting in superintelligence (6:00 – 6:09).

What we should worry about, Bostrom explains, are the consequences (which reach as far as existential risk) of creating an immensely powerful intelligence guided wholly by processes of optimization. Bostrom imagines that a superintelligent AI tasked with, for example, solving a highly complex mathematical problem, might view human morals as threats to a strictly mathematical approach. In this scenario, our future would be shaped by the preferences of the AI, for better or for worse (Bostrom, “What Happens” 10:02 – 10:28).

For Bostrom, then, the answer is to figure out how to create AI that uses its intelligence to learn what we value and is motivated to perform actions that it would predict we will approve of. We would thus leverage this intelligence as much as possible to solve the control problem: “the initial conditions for the intelligence explosion might need to be set up in just the right way, if we are to have a controlled detonation,” Bostrom says (“What Happens” 14:33 – 14:41). 

Thinking too far ahead?

Experts disagree about what solutions are urgently needed in AI

Many academics think that concerns about superintelligence are too indefinite and too far in the future to merit much discussion. These thinkers usually also argue that our energies are better spent focused on short-term AI concerns, given that AI is already reshaping our lives in profound and not always positive ways. In a 2015 article, Oxford Internet Institute professor Luciano Floridi called discussions about a possible intelligence explosion “irresponsibly distracting,” arguing that we need to take care of the “serious and pressing problems” of present-day digital technologies (“Singularitarians” 9-10).

Beneficence versus non-maleficence

In conversations about how we can design AI systems that will better serve the interests of humanity and promote the common good, a distinction is often made between the negative principle (“do no harm”) and the positive principle (“do good”). Put another way, approaches toward developing principled AI can be either about ensuring that those systems are beneficent or ensuring they are non-maleficent. In the news, as one article points out, the two mindsets can mean the difference between headlines like “Using AI to eliminate bias from hiring” and “AI-assisted hiring is biased. Here’s how to make it more fair” (Bodnari, 2020).

Thinkers, like Bostrom, concerned with long-term AI worries such as superintelligence tend to structure their arguments more around the negative principle of non-maleficence. Though Bostrom does present a “common good principle” (312) in his 2014 book, Superintelligence: Paths, Dangers, Strategies, suggestions like this one are made more alongside the broader consideration that we need to be very careful with AI development in order to avoid the wide-ranging harm possible with general machine intelligence. 

In an article from last year, Floridi once again accuses those concerned with superintelligence of alarmism and irresponsibility, arguing that their worries mislead public opinion to be fearful of AI progress rather than knowledgeable about the potential and much-needed solutions AI could bring about. Echoing the beneficence principle, Floridi writes, “we need all the good technology that we can design, develop, and deploy to cope with these challenges, and all human intelligence we can exercise to put this technology in the service of a better future” (“New Winter” 2).

In his afterword, Bostrom echoes the non-maleficence principle when he writes, “I just happen to think that, at this point in history, whereas we might get by with a vague sense that there are (astronomically) great things to hope for if the machine intelligence transition goes well, it seems more urgent that we develop a precise detailed understanding of what specific things could go wrong—so that we can make sure to avoid them” (Superintelligence 324).

Considerations regarding the two principles within the field of bioethics (where they originated), can be transferred to conversations about AI. In taking the beneficence approach (do good = help the patient), one worry in the medical community is that doctors risk negatively interfering in their patients’ lives or overstepping boundaries such as privacy. Similarly, with the superintelligence debate, perhaps the short-term, “do good now” camp risks sidelining, for example, preventative AI safety mechanisms in the pursuit of other more pressing beneficent outcomes such as problem-solving or human rights compliance.

There are many other complications involved. If we take the beneficence approach, the loaded questions of “whose common good?” and of who is making the decisions are paramount. On the other hand, taking an approach that centers doing good arguably also centers humanity and compassion, whereas non-maleficence may lead to more mathematical or impersonal calculations of how best to avoid specific risks or outcomes. 

Bridging the gap

The different perspectives around hopes for AI and possible connections between them are outlined in a 2019 paper by Stephen Cave and Seán S. ÓhÉigeartaigh from the Leverhulme Centre for the Future of Intelligence at the University of Cambridge called “Bridging near- and long-term concerns about AI.”

The authors explain that researchers focused on the near-term prioritize immediate or imminent challenges such as privacy, accountability, algorithmic bias, and the safety of systems that are close to deployment. On the other hand, those working on the long-term examine concerns that are less certain, such as wide-scale job loss, superintelligence, and “fundamental questions about humanity’s place in a world with intelligent machines” (Cave and ÓhÉigeartaigh, 5).

Ultimately, Cave and ÓhÉigeartaigh argue that the disconnect between the two groups is a mistake, and that thinkers focused on one set of issues have good reasons to take seriously work done on the other.

The authors point to many possible benefits available to long-term research with insight from the present. For example, they write that immediate AI concerns will grow in importance as increasingly powerful systems are deployed. Technical safety research done now, they explain, could provide fundamental frameworks for future systems (5).

In considering what the long-term conversation has to offer us today, the authors write that “perhaps the most important point is that the medium to long term has a way of becoming the present. And it can do so unpredictably” (6). They emphasize that the impacts of both current and future AI systems might depend more on tipping points than even progressions, writing, “what the mainstream perceives to be distant-future speculation could therefore become reality sooner than expected” (6).

Regardless of the controversies over whether we should take the prospect of superintelligence seriously, support for investments in AI safety research unites many experts across the board. At the least, simply joining the conversation means asking one question which we might all agree is important: What does it mean to be human in a world increasingly shaped by the internet, digital technologies, algorithms, and machine-learning?

Works Cited

Bodnari, Andreea. “AI Ethics: First Do No Harm.” Towards Data Science, Sep 7, 2020, https://towardsdatascience.com/ai-ethics-first-do-no-harm-23fbff93017a 

Bostrom, Nick. Superintelligence: Paths, Dangers, Strategies, 2014, Oxford University Press. 

Bostrom, Nick. “What happens when our computers get smarter than we are?” Ted, March 2015, video, https://www.ted.com/talks/nick_bostrom_what_happens_when_our_computers_get_smarter_than_we_are 

Cave, Stephen and ÓhÉigeartaigh, Seán S. “Bridging near- and long-term concerns about AI,” Nature Machine Intelligence, vol. 1, 2019, pp. 5-6. https://www.nature.com/articles/s42256-018-0003-2 

Floridi, Luciano. “AI and Its New Winter: from Myths to Realities,” Philosophy & Technology, vol. 33, 2020, pp. 1-3, SpringerLink. https://link.springer.com/article/10.1007/s13347-020-00396-6 

Floridi, Luciano. “Singularitarians, AItheists, and Why the Problem with Artificial Intelligence Is H.A.L. (Humanity At Large), Not HAL.” APA Newsletter on Philosophy and Computers, vol. 14, no. 2, Spring 2015, pp. 8-10. https://www.academia.edu/15037984/ 

Müller, Vincent C. and Bostrom, Nick. "Future progress in artificial intelligence: A Survey of Expert Opinion." Fundamental Issues of Artificial Intelligence. Synthese Library; Berlin: Springer, 2014, www.nickbostrom.com

Filed Under: Computer Science and Tech, Science Tagged With: AI, AI ethics, superintelligence

‘The Scariest Deepfake of All’: AI-Generated Text & GPT-3

March 1, 2021 by Micaela Simeone '22

Recent advances in machine-learning systems have led to both exciting and unnerving technologies—personal assistance bots, email spam filtering, and search engine algorithms are just a few omnipresent examples of technology made possible through these systems. Deepfakes (deep learning fakes), or, algorithm-generated synthetic media, constitute one example of a still-emerging and tremendously consequential development in machine-learning. WIRED recently called AI-generated text “the scariest deepfake of all”, turning heads to one of the most powerful text generators out there: artificial intelligence research lab OpenAI’s Generative Pre-Trained Transformer (GPT-3) language model.

GPT-3 is an autoregressive language model that uses its deep-learning experience to produce human-like text. Put simply, GPT-3 is directed to study the statistical patterns in a dataset of about a trillion words collected from the web and digitized books. GPT-3 then uses its digest of that massive corpus to respond to text prompts by generating new text with similar statistical patterns, endowing it with the ability to compose news articles, satire, and even poetry. 
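
To make the idea of autoregressive generation concrete, the sketch below uses the smaller, publicly released GPT-2 model from the open-source Hugging Face transformers library as a stand-in, since GPT-3’s own weights are not public; the prompt and sampling settings here are illustrative choices, not anything specified by OpenAI.

```python
# Minimal sketch of autoregressive text generation, using GPT-2 as a
# publicly available stand-in for GPT-3 (whose weights are not released).
# Requires: pip install transformers torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

prompt = "The history of the Internet began"
inputs = tokenizer(prompt, return_tensors="pt")

# The model repeatedly predicts a likely next token given everything
# generated so far -- this is what "autoregressive" means.
outputs = model.generate(
    **inputs,
    max_length=60,     # total length in tokens, prompt included
    do_sample=True,    # sample from the predicted distribution
    top_p=0.9,         # nucleus sampling: keep only the most probable tokens
    temperature=0.8,   # lower values make the continuation more conservative
)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Each new token is appended to the text and fed back into the model, so the statistical patterns absorbed during training are all that guide what comes next.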

GPT-3’s creators designed the AI to learn language patterns and immediately saw GPT-3 scoring exceptionally well on reading-comprehension tests. But when OpenAI researchers configured the system to generate strikingly human-like text, they began to imagine how these generative capabilities could be used for harmful purposes. Previously, OpenAI had often released full code with its publications on new models. This time, GPT-3’s creators decided to withhold its underlying code from the public, not wanting to disseminate the full model or the millions of web pages used to train the system. In OpenAI’s research paper on GPT-3, the authors note that “any socially harmful activity that relies on generating text could be augmented by powerful language models,” and “the misuse potential of language models increases as the quality of text synthesis improves.”

Just as humans are prone to internalizing the belief systems “fed” to us, machine-learning systems mimic what is in their training data. In GPT-3’s case, biases present in the vast training corpus of Internet text led the AI to generate stereotyped and prejudiced content. Preliminary testing at OpenAI has shown that GPT-3-generated content reflects gendered stereotypes and reproduces racial and religious biases. With trust online already fragmented and polarization pervasive, Internet users find it increasingly difficult to trust what they read, and widespread GPT-3-generated text would require us to be even more critical consumers of online content. GPT-3’s ability to mirror societal biases and prejudices in its generated text means that, online, it might only give more voice to our darkest emotional, civic, and social tendencies.

Because GPT-3’s underlying code remains in the hands of OpenAI, and its API (the interface through which users can partially work with and test GPT-3) is not freely accessible to the public, many concerns over its implications focus on a possible future in which its synthetic text becomes ubiquitous online. Given GPT-3’s frighteningly successful “conception” of natural language, its creative capabilities, and its bias-susceptible processes, many worry that a GPT-3-populated Internet could do real harm to our information ecosystem. However, GPT-3 has limitations as well as powerful affordances, and experts are asking us not to project too many fears about human-level AI onto GPT-3 just yet.

GPT-3: Online Journalist

GPT-3-generated news article that research participants had the greatest difficulty distinguishing from a human-written article

Fundamentally, concerns about GPT-3-generated text online come from an awareness of just how different a threat synthetic text poses compared with other forms of synthetic media. In a recent article, WIRED contributor Renee DiResta writes that, over the development of Photoshop and other image-editing CGI tools, we learned a healthy skepticism toward edited photos without fully disbelieving them, because “we understand that each picture is rooted in reality.” She points out that generated media, such as deepfaked video or GPT-3 output, is different because there is no unaltered original, and we will have to adjust to a new level of unreality. In addition, synthetic text “will be easy to generate in high volume, and with fewer tells to enable detection.” Right now, it is possible to detect repetitive or recycled comments that reuse the same snippets of text to flood a comment section or persuade audiences. However, if such comments had been generated independently by an AI, DiResta notes, these manipulation campaigns would have been much harder to detect:

“Undetectable textfakes—masked as regular chatter on Twitter, Facebook, Reddit, and the like—have the potential to be far more subtle, far more prevalent, and far more sinister … The ability to manufacture a majority opinion, or create a fake commenter arms race—with minimal potential for detection—would enable sophisticated, extensive influence campaigns.” – Renee DiResta, WIRED
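
To make DiResta’s contrast concrete, the rough sketch below shows the kind of check that can catch today’s copy-pasted comment floods: recycled comments reuse the same word sequences and therefore share many n-grams, a fingerprint that independently generated synthetic comments would lack. The 5-gram window and 0.5 threshold are arbitrary choices for illustration, not any platform’s actual detection pipeline.

```python
# Rough sketch: flag pairs of comments that reuse the same word sequences.
# The 5-gram window and 0.5 overlap threshold are illustrative only.
import re

def ngrams(text, n=5):
    """Return the set of n-word sequences in a comment."""
    words = re.findall(r"[a-z']+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def overlap(a, b, n=5):
    """Fraction of n-grams shared, relative to the smaller comment."""
    ga, gb = ngrams(a, n), ngrams(b, n)
    if not ga or not gb:
        return 0.0
    return len(ga & gb) / min(len(ga), len(gb))

comments = [
    "This policy is a disaster and everyone knows it, wake up people.",
    "Wake up people, this policy is a disaster and everyone knows it.",
    "I actually think the policy has some merits worth discussing calmly.",
]

for i in range(len(comments)):
    for j in range(i + 1, len(comments)):
        if overlap(comments[i], comments[j]) > 0.5:
            print(f"Comments {i} and {j} look recycled")
```

Text generated fresh each time by a model like GPT-3 would leave few or no shared n-grams behind, which is exactly why DiResta expects such textfakes to be far harder to detect.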

In their paper “Language Models are Few-Shot Learners,” GPT-3’s developers discuss the potential for misuse and threat actors—those seeking to use GPT-3 for malicious or harmful purposes. The paper states that threat actors can be organized by skill and resource levels, “ranging from low or moderately skilled and resourced actors who may be able to build a malicious product to … highly skilled and well resourced (e.g. state-sponsored) groups with long-term agendas.” Interestingly, OpenAI researchers write that threat actor agendas are “influenced by economic factors like scalability and ease of deployment” and that ease of use is another significant incentive for malicious use of AI. It seems that the very principles that guide the development of many emerging AI models like GPT-3—scalability, accessibility, and stable infrastructure—could also be what position these models as perfect options for threat actors seeking to undermine personal and collective agency online.

Staying with the projected scenario of GPT-3 text becoming widespread online, it is useful to consider the already algorithmic nature of our interactions online. In her article, DiResta writes about the Internet that “algorithmically generated content receives algorithmically generated responses, which feeds into algorithmically mediated curation systems that surface information based on engagement.” Introducing an AI “voice” into this environment could make our online interactions even less human. One example of a possible algorithmic accomplice of GPT-3 is Google’s Autocomplete, which internalizes queries and often reflects “-ism” statements and biases as it generates suggestions based on common searches. A flood of AI-generated text could feed Google’s algorithms even more problematic content and further narrow our control over how we acquire neutral, unbiased knowledge.

An Emotional Problem

Talk of GPT-3 passing the Turing Test reflects many concerns about creating increasingly powerful AI. GPT-3 seems to hint at the possibility of a future where AI is able to replicate the attributes we might hope are exclusively human—traits like creativity, ingenuity, and, of course, understanding language. As Microsoft AI Blog contributor Jennifer Langston writes in a recent post, “designing AI models that one day understand the world more like people starts with language, a critical component to understanding human intent.”

Of course, as a machine-learning model, GPT-3 relies on a neural network (inspired by neural pathways in the human brain) that can process language. Importantly, GPT-3 represents a massive acceleration in scale and computing power (rather than novel ML techniques), which gives it the ability to exhibit something eerily close to human intelligence. A recent Vox article on the subject asks, “is human-level intelligence something that will require a fundamentally new approach, or is it something that emerges of its own accord as we pump more and more computing power into simple machine learning models?” For some, the idea that the only thing distinguishing human intelligence from our algorithms is our relative “computing power” is more than a little uncomfortable.

As mentioned earlier, GPT-3 has been able to exhibit creative and artistic qualities, generating a trove of literary content including poetry and satire. The attributes we’ve long understood to be distinctly human are now proving to be replicable by AI, raising new anxieties about humanity, identity, and the future.

GPT-3’s recreation of Allen Ginsberg’s “Howl”

GPT-3’s Limitations

While GPT-3 can generate impressively human-like text, most researchers maintain that this text is often “unmoored from reality,” and that, even with GPT-3, we are still far from reaching artificial general intelligence. In a recent MIT Technology Review article, author Will Douglas Heaven points out that GPT-3 often returns contradictions or nonsense because its process is not guided by any true understanding of reality. Ultimately, researchers believe that GPT-3’s human-like output and versatility are the results of excellent engineering, not genuine intelligence. GPT-3 uses many of its parameters to memorize Internet text that does not generalize easily, and essentially parrots back “some well-known facts, some half-truths, and some straight lies, strung together in what first looks like a smooth narrative,” according to Douglas Heaven. As it stands today, GPT-3 is just an early glimpse of AI’s world-altering potential, and it remains a narrowly intelligent tool made by humans and reflecting our conceptions of the world.

A final point of optimism is that the field of ethical AI is ever-expanding, and developers at OpenAI are looking into the possibility of automatic discriminators that may have greater success than human evaluators at detecting AI model-generated text. In their research paper, the developers write that “automatic detection of these models may be a promising area of future research.” Improving our ability to detect AI-generated text might be one way to regain agency in a possible future with bias-reproducing AI “journalists” or undetectable deepfaked text spreading misinformation online.
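
The paper does not specify how such discriminators would work, but one simple family of approaches, similar in spirit to research tools like GLTR, flags text whose tokens are unusually predictable under a language model. The sketch below only illustrates that idea, using the publicly available GPT-2 model as the scorer; the threshold value is an assumption for demonstration, not a published detection criterion.

```python
# Illustrative sketch of one possible "automatic discriminator": flag text
# whose tokens are unusually predictable (low perplexity) under a language
# model, a hint that the text may itself be machine-generated. The model
# choice and the threshold of 20 are assumptions for demonstration only.
# Requires: pip install transformers torch
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def perplexity(text):
    """Average 'surprise' of the model at each token of the text."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(ids, labels=ids).loss  # mean cross-entropy per token
    return torch.exp(loss).item()

PERPLEXITY_THRESHOLD = 20.0  # arbitrary cutoff for this sketch

sample = "The stock market closed higher today after a strong jobs report."
score = perplexity(sample)
label = ("possibly machine-generated" if score < PERPLEXITY_THRESHOLD
         else "likely human-written")
print(f"perplexity={score:.1f} -> {label}")
```

A discriminator trained directly to classify human versus machine text would likely outperform a single-threshold heuristic like this, which is presumably the kind of follow-up research the paper has in mind.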

Ultimately, GPT-3 suggests that language is more predictable than many people assume, and it challenges common assumptions about what makes humans unique. Moreover, exactly what’s going on inside GPT-3 isn’t entirely clear, pushing us to keep thinking about the AI “black box” problem and about methods for figuring out just how GPT-3 reiterates natural language after digesting millions of snippets of Internet text. Perhaps, though, GPT-3 gives us an opportunity to decide for ourselves whether even the most powerful of future text generators could undermine the distinctly human conception of the world and of poetry, language, and conversation. A tweet Douglas Heaven quotes in his article from user @mark_riedl provides one possible way to frame both our worries and hopes about tech like GPT-3: “Remember…the Turing Test is not for AI to pass, but for humans to fail.”

Filed Under: Computer Science and Tech, Science Tagged With: AI, AI ethics, artificial intelligence, GPT-3, online journalists, textfakes
