The Security Risks of Generative AI: From Identification and Mitigation to Responsible Use
By Mihai Christodorescu (Visa Research), Somesh Jha (University of Wisconsin-Madison), Rebecca Wright (Barnard College), John Mitchell (Stanford University), Matt Turek (Defence Advanced Research Projects Agency)
We hosted a panel discussion on GenAI opportunities and risks at the 2024 CRA Conference at Snowbird, which we summarize below. We thank Divesh Srivastava for the invitation, Tracy Camp and Janine Myszka for organizing and hosting the session, and the CRA for creating this fantastic venue for idea exchange.
Generative AI (GenAI) techniques, such as large language models (LLMs) and diffusion models, have shown remarkable capabilities such as in-context learning, code-completion, and text-to-image generation and editing. These excitingly broad capabilities also present a dual-use dilemma—the new technology has the potential to be used not only for good but also by malicious actors to improve and accelerate their attacks. In this context, we wanted to discuss the risks of GenAI and ways to mitigate them as GenAI continues to revolutionize computing research, computer science broadly considered, and many other fields.
We had the pleasure of debating with three leading computer scientists on our panel:
- Rebecca Wright, Druckenmiller Professor and Chair of Computer Science at Barnard College, Director of Vagelos Computational Science Center, and Co-chair of the Cybersecurity Research Center in the Data Science Institute at Columbia University
- John Mitchell, Mary and Gordon Crary Family Professor of Computer Science and (by courtesy) Electrical Engineering and Education at Stanford University
- Matt Turek, Deputy Director of the Information Innovation Office (I2O) at DARPA
We acted as moderators, with the help of a lively audience who drove the debate.
Before diving into the details of the panel discussion, we highlight several topics that arose during the discussion:
- We do not have a foundational theory on evaluating GenAI models. Thus a key imperative is developing evaluation approaches for GenAI and GenAI-based systems, including their impact as sociotechnical systems.
- Threat models for GenAI are currently driven by empirical observations of attacks, and significant research is needed to understand adversaries and their potential uses for GenAI.
- While the GenAI boom is largely sustained by industry, academia plays a key role in driving open-ended research to answer fundamental questions for broad societal impact.
In the opening remarks, Somesh mentioned that GenAI amplifies creativity and productivity because of all the amazing things it can do, but it also exposes the potential for dual use, where the same GenAI technology can be used both for good and bad purposes. Rebecca saw GenAI as a promising technology that brings new threats we have not addressed yet, from deep fakes, to helping attackers create new exploits, and to privacy risks. John mentioned how GenAI can help automate some basic teaching activities, based on his experience with education through online courses, where GenAI can help with grading assignments and with better feedback. However, even in that context, students may try to trick AI tools associated with a course into revealing answers to homework and exams. Matt introduced DARPA as part of the US Department of Defense that is uniquely open to collaboration across industry and academia. He also made the case that some of the risks of GenAI come from the speed and scale at which things can go wrong, and thus the goal must be to build AI systems that you can bet your life on.
Panel Discussion
The panel focused on three questions:
- Which GenAI directions are most exciting? How to realize them?
- Which GenAI risks are most likely? How to mitigate them?
- What is the role for the computing research community?
Somesh kicked off the discussion with a question about which GenAI capabilities most excite the panelists, and what the biggest risks are keeping them up at night. John described several experiences in classroom settings, where AI models turned out to be quite good at summarizing and extracting subtle points from conversations, and also quite powerful in providing information on many topics thanks to the background knowledge from training data. And yet models are not good conversationalists as they have no sense of when to intervene in a conversation, thus making them not yet ready to help people work together constructively in team/group scenarios. The properties that models need to have in this context are not measured in current model leaderboards. John also worried about adversarial manipulation of such systems, where they can be tricked into making the wrong decisions of various kinds.
Rebecca saw a future enabled by GenAI where everything gets better and easier, as people get to do jobs that are more interesting because the less interesting jobs are taken care of. This assumes that we avoid a more dystopian future where human creativity is replaced by GenAI. She also worried about becoming too quickly overly dependent on responses from GenAI based systems and drawing incorrect conclusions or having incorrect behavior, and about misinformation and disinformation amplified by GenAI.
Matt was excited about the potential of GenAI to improve software systems by finding vulnerabilities and fixing them (a goal of the currently ongoing DARPA AI Cyber Challenge), maybe by working alongside formal method systems to generate code and proofs, transforming how we create robust software. On the flip side, he thought the evaluation of AI systems needs to be solved, from establishing trust into an AI system, to understanding which classes of questions AIs can answer, and to determining which AI can optimally solve a particular class of problems (goals for DARPA’s In the Moment program and AI Quantified program).
Audience Questions
The audience drove the remainder of the panel discussion, with a series of questions that covered a lot of ground.
On the question of the use of AI in wars, Matt mentioned that one use is to have AI systems assist humans in decision making (he gave the example of battlefield triage). In such settings, the fundamental question is how to build a quantitative alignment framework, such that the AI can take into account human attributes and core values. Ongoing work at DARPA involves not just AI researchers, but also cognitive scientists, subject matter experts, and decision making researchers.
A question from the audience touched on the risk of bad AI-generated content polluting the Internet and in turn being used in training new AI models, creating a downward spiral of worse and worse AI. John mentioned research work from Stanford that this problem, termed “model collapse”, can be triggered by some threshold in the AI-generated content on the Internet. This is an open problem without a clear path to a solution.
Another audience question was about AI impact on scale, where even imperfect AI models can be combined together to be more efficient than a human (say, 1000 copies of an AI model can replace one human). Rebecca observed that efficiency is an easy goal for companies and institutions to optimize for, and that scaling can also help adversaries, in which case maybe we can look at AI to help defenders scale as well (“use AI to fight AI”). Matt noted that even when adversaries do not use AI, there is a concern that they can bring more human power to an attack, basically to overwhelm defenses, and this creates unique and interesting challenges for US DOD.
A follow-up question continued the topic of model collapse due to dataset pollution, and wondered whether “out of band” mechanisms such as network analysis of content and content provenance, both building on the social nature of Internet content, may be useful in this space. Matt pointed out that in this line of thinking, DARPA projects such as Semantic Forensics (SemaFor) looked at questions of attribution (“Does this media come from where it claims to?”) and characterization (“Was the media generated for malicious purposes?”). Rebecca brought up the topic of accountability, as an alternative to provenance, especially to the degree that accountability can be combined (or traded off) against privacy. The key insight is that accountability needs to be aligned with incentives to make attacks more expensive and less scalable, possibly by purposefully slowing down the spread of inflammatory content. John discussed how research in fact checking uncovered that professional fact checkers already consider provenance first and foremost, before even looking at the content itself. He also pointed out that we should not expect LLMs to do everything for us, but rather we should integrate LLMs into larger systems that give us better trust guarantees.
An interesting question focused on the unequal distribution of GenAI benefits and risks, where underprivileged groups may be affected negatively by lack of access to GenAI. Rebecca discussed that inequities of all kinds can arise from GenAI adoption, especially that GenAI R&D is often focused towards monetizable audiences. From the education standpoint, John commented that design characteristics of the GenAI can lead to different outcomes when rolling out such capabilities to students. The announcement of “hey, AI is out there to help you” has a different effect based on the level of academic development. From the national security perspective, Matt observed that US DOD is the world’s second largest employer, with a diverse mix of employees from various disadvantaged backgrounds, and as such they are actively interested in equitable GenAI.
The discussion then focused on how the computing research community (academia and industry) can best engage with the GenAI space. Rebecca discussed how computing education must change to include GenAI use, and this will require people from multiple backgrounds, multiple disciplinary, economic, racial, and ethnic backgrounds — to be able to talk to each other, to work together, and to draw on different ways of thinking, learning, creating, and building. John thought that to solve problems that are compelling to researchers and students and have social significance, we may need to change the structure of conferences, as such topics may not fit squarely in the tradition of existing conferences. The other point he made for the department chairs in the audience is that AI research requires lots of resources, so the academic starter packages must account for that.
A final question from the audience touched on the system view of GenAI threats and how to model them. We discussed some of the attacks that have appeared up to now, including the use of GenAI for creating spearfishing, deep fakes, and better exploits. Additionally threats related to the prompt injections, to hallucinations, unpredictability, and data-feedback loops of AI are on everyone’s radar. John drew a parallel between the problem of prompt injection and the field of network security, where we have reasonably well established strategies, from modeling the attacker to analyzing the network system against that model. He thought this is a solvable problem, though it may take some time to address it.
Conclusion
In their closing comments, the panelists put forth key topics where they think the computing research community can focus in the GenAI space, especially given the GenAI development is driven by industry at a rapid pace. Matt saw a need to develop foundational theories on evaluating AI models and AI-based systems, and consider their impact as sociotechnical systems. Rebecca supported the need for foundational work to understand adversaries and potential defenses, and emphasized the need to explore how attackers’ capabilities are amplified by AI. John underlined the fact that a lot of the open-ended research comes from universities, focused on broad societal themes and supported by open publications, and that it is important this model of research needs to be sustained.
The panelists and the audience explored a wide set of topics around the risks of GenAI and where the research community can best play a role in shaping the development, evaluation, and deployment of GenAI-based systems. Many parallels can be drawn with the general field of cybersecurity, where technologies often have both benign and malicious uses as they amplify the capabilities of users and adversaries and can result in uneven impact due to inequities and unexpected consequences.
Explore More from the 2024 CRA Conference at Snowbird
For those interested in diving deeper into the discussions and topics from this year’s CRA Conference at Snowbird, you can explore the session slides and other resources available on the conference webpage. Visit our CRA 2024 Conference Resources page to access PDF slide decks from various sessions, including the panel on GenAI.