CRA-I Blog
The CRA-I Blog frequently shares news, timely information about the computing research industry community, and items of interest to the general community. Subscribe to blog emails here to stay connected.
The CRA-I Blog frequently shares news, timely information about the computing research industry community, and items of interest to the general community. Subscribe to blog emails here to stay connected.
CRA-I GenAI for Research and Science Roundtable
/in Community Updates, CRA-I Announcements, CRA-I EventRecently, the Computing Research Association – Industry (CRA-I) held a dynamic roundtable event on “Generative AI (GenAI) for Research and Science,” bringing together industry leaders, researchers, and experts to delve into the transformative potential of GenAI, a subset of AI that generates new content by learning data patterns, across various scientific disciplines.
CRA-I Council member Elizabeth Bruce from Microsoft moderated the roundtable, and she emphasized its significance in automating creative processes and its broad applicability across sectors. The panelists were Travis Johnson (Director of Bioinformatics at the Indiana BioSciences Research Institute), Jing Liu (Executive Director of the Michigan Institute for Data Science), Vijay Murugesan (Staff Scientist at Pacific Northwest National Lab), and Neil Thompson (Director of the MIT FutureTech Lab).
Johnson shared insights into how GenAI is revolutionizing drug discovery by generating new molecules based on existing data, expediting the identification of novel drug targets. Liu highlighted the potential of GenAI in facilitating interdisciplinary research and enhancing the scale of scientific endeavors. She also emphasized its role in distinguishing between valuable research and spurious results. Murugesan discussed a collaborative project with Microsoft leveraging GenAI to accelerate materials discovery for battery technology. He outlined how AI-driven screening of millions of materials drastically reduced the time and resources required for research. Thompson provided an economic perspective on the productivity gains enabled by GenAI. He underscored its role in empowering scientists to leverage computation more effectively and its potential to revolutionize scientific understanding through data-driven exploration.
At the end of the roundtable, Bruce asked the panelists what advice they would give young researchers today. Thompson initiated the final segment with sage advice from economist Hal Varian, emphasizing the importance of identifying areas complementary to emerging technologies like GenAI. He encouraged young researchers to focus on domains that can leverage AI advancements effectively, such as material science and biology, and to continuously adapt and evolve their research focus in response to technological advancements. Echoing Thompson’s sentiments, Murugesan underscored the significance of adaptability in the rapidly evolving research landscape. He highlighted the transformative impact of GenAI on traditional research paradigms and emphasized the need for researchers to stay agile and adaptable in their thinking and research focus to remain relevant and impactful. Liu shared a poignant reflection from her postdoctoral mentor, emphasizing the importance of mindful contribution to science amidst technological advancements. She encouraged young researchers to reflect on their research goals and aspirations, prioritizing rigor and meaningful contributions to scientific knowledge while leveraging evolving tools and methodologies. Finally, Johnson concluded the discussion by emphasizing the importance of responsible AI utilization in research. While advocating for the use of GenAI as a powerful tool for hypothesis generation and data analysis, he cautioned against over-reliance on AI-generated insights. Johnson urged young researchers to maintain expertise in their respective fields while leveraging AI as a supplementary tool to enhance research outcomes.
Overall, the CRA-I roundtable served as a forum for thought-provoking dialogue and collaboration, shedding light on the transformative impact of Generative AI in driving scientific innovation. See the full recording here.
ACM SIGARCH BLOG: AI Software Should be More Like Plain Old Software
/in Community UpdatesThe following is a repost from the ACM SIGARCH blog. It was written by Emery Berger (UMass) and CRA-Industry Co-Chair Ben Zorn (Microsoft Research) on Apr 23, 2024.
Large Language Models (LLMs) and other foundation models have revolutionized what is possible to implement in software. On a regular basis, new AI models with ever greater capabilities, such as converting text to video, are rolled out. This disruption is so striking that new terminology is needed. We refer to traditional software – the kind that does not call LLMs at runtime – as Plain Old Software (POSW). We call software that exploits LLMs during execution as AI Software (AISW).
One key reason we distinguish these two types is that, even though AISW greatly expands the kind of software that is possible to implement, AISW relinquishes the guarantees we have become accustomed to with POSW. Systems researchers have invested decades of effort into ensuring that POSW has robustness, privacy, security, and other guarantees that are implemented throughout the system stack. For example, hardware supports a separation of code and data with an “execute bit” that can successfully prevent many code exploit attacks. But AISW is susceptible to analogous attacks. AISWs are driven by prompts. If a prompt includes both a task description (“summarize this document”) and data (the document itself), AISWs can suffer from a “prompt injection” attack because they cannot easily determine if the document also contains additional potentially adversarial commands.
Carrying over the guarantees of POSW to AISW will require engagement and innovation from the research community and other disciplines across computer science including HCI, AI, etc. Only through a deep collaboration between these communities can these challenges be overcome. We outline here some of the implications of the shift from POSW to AISW to inform researchers on the needs and challenges of building a robust AISW infrastructure going forward.
We believe the familiar system stack that hasn’t changed dramatically in many decades must be reinvented with AISW in mind:
The parts of this new stack are analogous with the old stack, but also different in important ways. The LLM inference engine interfaces the LLM to the stack above it via an interface that includes the prompt context and a “generate next token” instruction. In traditional hardware, think of the prompt context as defining the location of the program counter and the “generate token” action as executing the next instruction. The AI controller manages what “instructions” can be executed by explicitly limiting what tokens the LLM is allowed to generate, much as an OS can prevent certain instructions in privileged mode. The prompt runtime is a set of services that are used to create the final prompt sent to the LLM and parse the resulting generation. Retrieval Augmented Generation (RAG), where additional content from external sources (like documents or the web) is added to the prompt, is an important part of every prompt runtime. Validating generations for correctness, when possible, is another important element of the prompt runtime.
To understand how this new stack differs from the old one, consider this simple AISW application that can be implemented in single sentence: “Given this research paper <attach the paper.pdf> and an organization <name a university, program committee, etc.>, give me 5 individuals in that organization that would be interested in reading this paper.”
This “program” can be implemented in ChatGPT or other existing LLMs with an interactive chat dialog. Having the user in the loop (who understands the standard disclaimers about hallucinations, etc.) provides a level of robustness to the program making it usable. But such a level of review and human intervention defeats the purpose of automating such tasks and is only necessary because the infrastructure supporting robust AISW is only starting to emerge. Right now, it is not obvious how one would transform this chat session application into a robust service. Our goal is a world where, from the perspective of a user, AISW “just works” like POSW.
Two key approaches that existing systems use to solve this problem are specification/verification and standardization. Let’s consider each issue in turn.
First, how do we specify and verify the output of an LLM? Several approaches to specifying both the syntax and semantics of LLM generations have been proposed and implemented. Constraining the output of an LLM to a particular well-formed syntax has clear value when generations are expected to be in JSON format, for example; this approach is supported by libraries like Pydantic Validators. Newly proposed languages like LMQL and Guidance provide “prompt-as-programs” approaches that specify constraints over LLM output as well as conditional execution paths in prompts, depending on the results produced by a previous round of prompting the LLM. These approaches constitute the evolution of the Prompt/Prompt Runtime and AI Controller parts of System Stack 2.0 shown above.
Syntax validation is only a small part of the deeper challenge of specifying and verifying constraints on LLM outputs that are not expressible using existing mathematics. How do you measure whether an LLM summary is faithful to the original content, or that an individual would truly be interested in reading your paper? These are not new questions and ideas for solutions pre-date the rise of LLMs. More than 10 years ago, the Automan project considered how to leverage crowdsourcing to implement functions that could not be computed with POSW. The solution acknowledged that the source of information (humans in the crowd) was inconsistent and potentially unreliable, much as LLMs are today. We believe that increasing the reliability of LLM generations will also involve some of the statistical techniques explored in Automan as well as leveraging potentially many different AI models focusing on different aspects of correctness. Techniques to go beyond ensuring syntactic constraints have also recently been explored, including the concept of Monitor-Guided Decoding (MGD) which uses static analysis to impose semantic constraints on generations.
Standardization in different layers of a stack allows innovation above and below the interfaces. We are familiar with existing HW instruction set architectures, like ARM and x86. These standard interfaces have existed and supported incredible changes in the technology both above and below them for over 40 years. What are the new standard interfaces for AISW? We believe one possible layer in System Stack 2.0 is a token-by-token interface between the prompt language runtime and the LLM. While at the level of ChatGPT, we see an entire prompt and response from the LLM, the interaction could instead happen a token at a time. Each time a token is generated, the calling program would have the opportunity to constrain what the LLM can generate.
Controlling the LLM at this level in a standard way is the focus of the recently released AI Controller Interface project. For example, using a controller like AICI, a prompt can specify that the output must be syntactically correct JSON (or any other language specified in a context-free grammar) and guarantee that it is. Similarly, the prompt can request the model to extract an exact quote from a document and guarantee that the LLM response does not hallucinate the output.
We believe that the co-development of emerging prompt-as-programming languages, like LMQL; standard interfaces, like AICI; and open source implementations of the underlying AI inference stack (such as the Berkeley vLLM project) are emerging technologies that are beginning to define what System Stack 2.0 will be.
This is a call to action for the systems community to embrace the importance of AISW on the future of systems and to focus effort across the community on ensuring that we incorporate state-of-the-art design and implementation as quickly as possible into these systems. The explosive growth of ChatGPT usage illustrates how quickly new AI technology can be invented and deployed. We anticipate rapid advances in the AI state-of-the-art going forward, which highlights the need for the systems research community to continue defining strong, durable abstractions around such systems that can form a basis for ensuring their security, robustness, etc. independently as the models evolve.
AISW will evolve and embed itself increasingly in many aspects of society and, as a discipline, the systems research community has a once-in-a-lifetime opportunity to ensure that future AISW is even more robust, secure, and safe than our existing POSW.
Heather Stephens (Oracle) Joins CRA-Industry Council
/in Community Updates, Council, CRA-I AnnouncementsCRA-Industry (CRA-I) is thrilled to share that Heather Stephens (Oracle) is now a member of the CRA-I Council. Heather joins a dynamic group of Council members, led by CRA-I Council Chair Divesh Srivastava (AT&T), who are dedicated to collaborating with the CRA-I Steering Committee. Together, they will steer the direction of future committees, engage with the community, and drive towards the goals of CRA-I.
Heather Stephens is a Senior Director at Oracle Corporation. She started her career in physics research at NASA and DOE working on large scale data models and analysis for satellites and environmental research prototypes and devices. She switched over to high tech to advance technology for real world use and be closer to the people that use it. She has spent the bulk of her career working in various startups as well as larger companies like Microsoft on products ranging from language development to IDEs to various Cloud platform/services to Connected Cars. She is actively engaged with the Leadership Institute at her alma mater, Montana State University, to foster innovation and technical leadership by creating an interdisciplinary science and technology extracurricular program. This experience led her to take a role at Oracle to build out a program that supports the education sector in advancing to modern Java. She is learning many things about the challenges educators face and how difficult it is to stay abreast of changes in modern technology. She wants to be a force for good to develop methods for tech and education to stay current, so students are excited about coding and so they have the necessary knowledge to build the next wave of technology that helps the world.
Please help the industry research community by continuing to nominate outstanding colleagues for the CRA-I Council. Read more here and send nominations to industryinfo@cra.org.
Welcome, Heather!