CRA-I Blog

The CRA-I Blog frequently shares news, timely information about the computing research industry community, and items of interest to the general community. Subscribe to blog emails here to stay connected.

Exploring Trust, Technology, and AI in Healthcare Data Sharing: Insights from the CRA-I Workshop

Workshop participants watching a panel.

CRA-Industry (CRA-I) hosted a Sharing Healthcare Data workshop in October in Washington, DC. Over 35 healthcare professionals, academics, industry leaders, and government representatives convened to explore the intersection of healthcare data sharing, trust-building, and the evolving role of AI in patient care. The discussions highlighted crucial themes of inclusivity, patient-centered innovation, and the vital need for diverse perspectives to shape the future of healthcare data. The full agenda is available here. This workshop grew out of the very successful CRA-I Sharing Healthcare Data Roundtable held in December 2023.

The workshop was honored to feature keynote speakers Deborah Estrin from Cornell Tech and Tom Kalil from Renaissance Philanthropy. Estrin presented on Patient-Generated Data Sharing: Advancing Hybrid, Longitudinal Patient Care with Digital Biomarkers and Therapeutics (DBx, DTx), emphasizing the potential of patient-generated data in shaping comprehensive care. Kalil focused on the need for concrete, actionable steps in policy, urging the identification of specific data needs and detailed, agency-specific recommendations for policymakers. 

A key theme that emerged throughout the workshop was trust in healthcare data sharing, particularly for underserved communities. While technology can improve access, culturally competent, human-centered approaches remain essential for building lasting trust. Challenges raised in this area included regulatory, legal, and consent barriers, as well as the need for clear AI policies in healthcare.

The workshop emphasized that transforming healthcare data sharing, while achievable, requires thoughtful approaches. By expanding stakeholder involvement, refining data-sharing policies, and supporting ongoing innovation, we can create a healthcare ecosystem that respects patient autonomy and serves the needs of all. Moving forward, our efforts should aim to build frameworks that not only advance healthcare technology but also ensure it is used responsibly, fairly, and in a way that fosters trust across all communities. Please keep an eye out for the workshop report, which will be posted here in the next few months.

Call for Participation: Breadth of Partnership in Academia/Industry Relationships

The following CRA-I workshop is organized by Mary Hall, Amit Jain, and Vivek Sarkar.

Motivation and Goals

Collaboration between academia and industry is crucial for driving technological innovation and fostering economic growth. By identifying successful partnership models and a range of effective practices, stakeholders can strengthen these connections to stimulate new ways of engagement between academia and industry. 

In support of this goal, CRA-Industry (CRA-I) is hosting a small, 40-person workshop on “Breadth of Practices in Academia/Industry Relationships” in Seattle, WA, on March 20-21, 2025. Scott DeBoer, Executive Vice President of Micron, will be the keynote speaker. The workshop aims to bring together computing research stakeholders from both academia and industry to explore key questions, including:

  • Utilizing Industry Resources: How can academia effectively utilize industry resources and expertise to improve research outcomes?
  • Shaping Curriculum: How can industry play a more active role in helping universities align curriculum with evolving industry needs and trends?
  • Incentivizing Collaboration: What incentives can universities offer faculty and researchers to engage in collaborative projects with industry partners?
  • Promoting Entrepreneurship: How can universities better facilitate technology transfer and entrepreneurship to turn academic research into real-world applications?
  • Ensuring Equitable Partnerships: What strategies can be implemented to ensure fair distribution of benefits and recognition in academia-industry partnerships?
  • Prioritizing Research: How can input from industry on real-world applications help academia conduct research that is more adaptable and applicable in practice?

Participants will engage in meaningful discussions, share best practices, and contribute to the development of a “breadth of practices” document to be shared with the broader computing research community.

How to Participate

If you are interested in attending, please email Helen Wright (hwright@cra.org) by December 6th. If you are interested in participating as a speaker, please submit a brief paragraph highlighting your involvement with academia/industry partnerships. Due to space limitations and the goal of fostering open discussions among a small group, the workshop planning committee will select participants based on the relevance of their submissions to the workshop topics.

As part of our commitment to making this event accessible, CRA-I charges no registration fee to attend. Participants are only responsible for their travel and hotel expenses (a link for discounted hotel rates will be provided during registration). If you have any questions about these policies, please reach out to Helen Wright at hwright@cra.org.

CRA-I and the workshop organizers look forward to your submission!

ACM SIGPLAN BLOG: Prompts are Programs

The following is a repost from the ACM SIGPLAN Blog: PL Perspectives. It was written by Tommy Guy, Peli de Halleux, Reshabh K Sharma, and past CRA-I Co-Chair Ben Zorn on October 22, 2024. Please see the original post here. 

 

In this post, we highlight just how important it is to understand that an AI model prompt has much in common with a traditional software program.  Taking this perspective creates important opportunities and challenges for the programming language and software engineering communities, and we urge these communities to undertake new research agendas to address them.

Moving Beyond Chat

ChatGPT, released in November 2022, had a huge impact on our understanding of what large language models (LLMs) can do and how we can use them.  The millions of people who have used it understand what a prompt is and how powerful it can be.  We marvel at the breadth and depth of the AI model's ability to understand and respond to what we say, and at its ability to hold an informed conversation that allows us to refine its responses as needed.

Having said that, many chatbot users have experienced challenges in getting LLMs to do what they want.  Skill is required in phrasing the input to the chatbot so that it correctly interprets the user's intent.  Similarly, the user may have very specific expectations of what the chatbot produces (for example, data formatted in a particular way, such as a JSON object) that are important to capture in the prompt.

Also, chat interactions with LLMs have significant limitations beyond challenges in phrasing a prompt.  Unlike writing and debugging a piece of code, having an interactive chat session does not result in an artifact that can then be reused, shared, parameterized, etc.  So, for one-off uses chat is a good experience, but for repeated application of a solution, chat falls short.

Prompts are Programs

The shortcomings of chatbots are overcome when LLM interactions are embedded into software systems that support automation, reuse, etc.  We call such systems AI Software systems (AISW) to distinguish them from software that does not leverage an LLM at runtime (which we call Plain Ordinary Software, POSW).  In this context, LLM prompts have to be considered part of the broader software system and must meet the same robustness, security, and related requirements that any software has.  In a related blog, we've outlined how much the evolution of AISW will impact the entire system stack.  In this post, we focus on how important prompts are in this new software ecosystem and what new challenges they present to our existing approaches to creating robust software.

Before proceeding, we clarify what we mean by a "prompt".  First, our most familiar experience with prompting is what we type into a chatbot.  We call the direct input to the chatbot the user prompt.  Another, more complex prompt is the one written to process the user prompt, which is often called the system prompt.  The system prompt contains application-specific directions (such as "You are a chatbot…") and is combined with other inputs (such as the user prompt, documents, etc.) before being sent to the LLM.  The system prompt is a fixed set of instructions that define the nature of the task to be completed, what other inputs are expected, and how the output should be generated.  In that way, the system prompt guides the execution of the LLM to compute a specific result, much as any software function does. In the following discussion, our focus is mainly on thinking of system prompts as programs, but many of the observations apply directly to user prompts as well.
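To make the distinction concrete, here is a small sketch (ours, not from the original post) of how a fixed system prompt might be combined with a user prompt and other inputs before being sent to the LLM. The message roles follow the common "chat messages" convention; the prompt wording and the build_messages helper are hypothetical.

```python
# Sketch: combining a fixed system prompt with the user prompt and other inputs.
# The prompt wording and the helper function are illustrative, not from the post.

SYSTEM_PROMPT = (
    "You are a chatbot for a document question-answering application. "
    "Use only the provided documents and answer concisely."
)

def build_messages(user_prompt: str, documents: list[str]) -> list[dict]:
    """Assemble the full input sent to the LLM for one request."""
    context = "\n\n".join(documents)
    return [
        {"role": "system", "content": SYSTEM_PROMPT},             # fixed, application-specific instructions
        {"role": "system", "content": f"Documents:\n{context}"},  # other inputs combined in
        {"role": "user", "content": user_prompt},                 # what the end user typed
    ]

messages = build_messages(
    "Which workshop does this announcement describe?",
    documents=["CRA-I is hosting a workshop on academia/industry relationships in Seattle."],
)
```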

An Example of a Prompt

We use the following prompt as an example, loosely adapted from a recent paper on prompt optimization to illustrate our discussion.

You are given two items: 1) a sentence and 2) a word contained in that sentence.
Return the part of speech tag for the given word in the sentence.

This system prompt describes the input it expects (in this case, a pair consisting of a sentence, such as "The cat ate the hat.", and a word from it, such as "hat"), the transformation to perform, and the expected structure of the output.  With this example, it is easy to see that all the approaches we take to creating robust software should now be rethought in terms of how they apply to prompts.
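Viewed as a program, this prompt can be wrapped in an ordinary function whose parameters are the sentence and the word. The sketch below is our illustration, not code from the paper the prompt was adapted from; it assumes the OpenAI Python client, and the function and model names are ours.

```python
# Sketch: the part-of-speech prompt above, packaged as a reusable, parameterized function.
# Assumes the `openai` package and an OPENAI_API_KEY in the environment; the model name is illustrative.
from openai import OpenAI

client = OpenAI()

SYSTEM_PROMPT = (
    "You are given two items: 1) a sentence and 2) a word contained in that sentence.\n"
    "Return the part of speech tag for the given word in the sentence."
)

def pos_tag(sentence: str, word: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative choice of model
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": f"Sentence: {sentence}\nWord: {word}"},
        ],
    )
    return response.choices[0].message.content.strip()

print(pos_tag("The cat ate the hat.", "hat"))  # e.g., "NOUN"
```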

If Prompts are Programs, What is the Programming Language?

There are many open questions about the best way to prompt language models, and this is a topic of active PL and AI research.  Expressing prompts purely in natural language can be effective in practice.  In addition, best-practice guidelines for writing prompts often recommend structuring them using traditional document-structuring mechanisms (such as markdown) and clearly delineating sections, such as a section of examples, output specifications, etc.  Templating, where parts of prompts can be substituted programmatically, is also popular.  Approaches to controlling the structure and content of the output, both in model training and through external specifications such as OpenAI's JSON mode or Pydantic validators, have also been effective.
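To illustrate templating plus systematic checking of output structure, here is a minimal sketch of our own (it assumes Pydantic v2; call_llm is a hypothetical stand-in for any function that sends a prompt to a model and returns its raw text):

```python
# Sketch: a templated prompt whose JSON output is validated against a schema.
# Assumes the `pydantic` package (v2); `call_llm` is a hypothetical model-call helper.
from pydantic import BaseModel, ValidationError

PROMPT_TEMPLATE = (
    "You are given a sentence and a word contained in that sentence.\n"
    "Return a JSON object with fields 'word' and 'pos_tag'.\n"
    "Sentence: {sentence}\nWord: {word}"
)

class PosResult(BaseModel):
    word: str
    pos_tag: str

def tag_word(sentence: str, word: str, call_llm) -> PosResult:
    prompt = PROMPT_TEMPLATE.format(sentence=sentence, word=word)  # programmatic substitution
    raw = call_llm(prompt)
    try:
        return PosResult.model_validate_json(raw)  # reject malformed or incomplete output
    except ValidationError as err:
        raise ValueError(f"Model output did not match the expected schema: {err}") from err
```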

Efforts have also been made to more deeply integrate programming language constructs into the prompts themselves, including the Guidance and LMQL languages, which allow additional specifications.  All of these methods (1) observe the value of more explicit and precise specifications in the prompt and (2) leverage any opportunity to apply systematic checking to the resulting model output.

Prompting in natural language will evolve as the rich set of infrastructures that LLMs can interact with becomes available.  Tools that extend the abilities of LLMs to take actions (such as retrieval-augmented generation, search, or code execution) become abstractions that are available to the LLM but must be expressed in the prompt so that the user's intent to leverage them is clear.  Much PL research is required to define such tool abstractions, help LLMs choose them effectively, and help prompt writers express their intent effectively.

Software Engineering for Prompts

If we understand that prompts are programs, then how do we transition our knowledge and tools for building POSW so that we can create robust and effective prompts?  Tooling for authoring, debugging, deploying and maintaining prompts is required and existing tools for POSW do not directly transfer.

One major difference between prompts and traditional software is that the underlying engine that interprets prompts, the LLM, is not deterministic, so the same prompt can produce different results on different calls, even with the same LLM.  Also, because the types and varieties of LLMs are proliferating, it is even harder to ensure that the same prompt will produce the same result across different LLMs.  In fact, LLMs are evolving rapidly, and there are important tradeoffs to be made between inference cost, output quality, and local versus cloud-hosted models. The implication is that when the underlying model changes, the prompt may require changes as well, which suggests that prompts will need continuous tweaking as models evolve.
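One simple way to observe this non-determinism is to re-run an identical prompt and compare the outputs. The sketch below is ours; call_llm again stands in for any function that sends a prompt to a fixed model and returns its text.

```python
# Sketch: probing non-determinism by repeating the same prompt against the same model.
# `call_llm` is a hypothetical stand-in for a chat-completion call.
from collections import Counter

def output_distribution(prompt: str, call_llm, runs: int = 10) -> Counter:
    """Count how often each distinct output appears across repeated calls."""
    return Counter(call_llm(prompt) for _ in range(runs))

# A deterministic interpreter would yield a single entry; in practice several
# variants usually appear, which prompt tests and tooling must tolerate.
```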

There are a number of existing research approaches to automatically optimizing and updating prompts, such as DSPy, but such technologies are still in their infancy.  Also, a given AI software application may choose to use different models at different times for efficiency, so, much like binary formats that support multiple ISAs (e.g., the Apple Universal binary format), prompts may require structure that supports multiple target LLMs.
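One lightweight way to give a prompt that kind of multi-target structure, loosely analogous to a universal binary, is to keep per-model variants of a logical prompt behind a single lookup. The sketch below is our illustration; the model identifiers and wording differences are hypothetical.

```python
# Sketch: one logical prompt with per-model variants, selected at call time.
# Model identifiers and the wording of each variant are hypothetical.
POS_TAG_PROMPTS = {
    "small-local-model": (
        "Return only the part-of-speech tag for the given word, "
        "as a single token such as NOUN or VERB."
    ),
    "large-hosted-model": (
        "You are given a sentence and a word contained in it. "
        "Return the part-of-speech tag for that word as a JSON object "
        "with fields 'word' and 'pos_tag'."
    ),
}

def prompt_for(model_name: str) -> str:
    """Pick the prompt variant that targets the given LLM."""
    return POS_TAG_PROMPTS[model_name]
```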

Ultimately, tools that support testing, debugging, and optimizing the prompt/model pairing will be necessary and will become widely used.  Because standards have not yet been adopted for how prompts are represented, or even for how prompts are integrated into existing software applications, research into the most effective approaches to these problems is needed.
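Even without agreed standards, a prompt can already be covered by ordinary regression tests. The sketch below is ours, written in pytest style; the model call is stubbed so the test runs offline, and the tag set is illustrative.

```python
# Sketch: a regression test for a prompt, written like any other software test.
# The stub replaces a real model call so the test runs offline; the tag set is illustrative.
VALID_TAGS = {"NOUN", "VERB", "ADJ", "ADV", "DET", "PRON", "ADP", "CONJ"}

def pos_tag_stub(sentence: str, word: str) -> str:
    return "NOUN"  # stand-in for the real prompt/model pairing during testing

def test_pos_tag_returns_a_known_tag():
    tag = pos_tag_stub("The cat ate the hat.", "hat")
    assert tag in VALID_TAGS  # guards against format drift when the prompt or model changes
```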

Next Steps for Prompt Research

Because prompts are programs, the software engineering and programming languages communities have much to offer in improving our understanding of, and ability to create, expressive, effective, efficient, and easy-to-write prompts.  There are incredible research opportunities to explore, and the results will inform the next generation of software systems built on AISW.  Moreover, because writing prompts is much more accessible to non-programmers, an entirely new set of challenges relates to how our research can support individuals who are not professional developers in leveraging LLMs by writing effective, expressive, robust, and reusable prompts.

In this post, we’ve considered how a single prompt should be considered a program but, in practice, many applications that leverage AI contain multiple prompts that are chained together with traditional software.  Multi-prompt systems introduce even greater software engineering challenges, such as how to ensure that a composition of prompts is robust and predictable.   And this field is moving very fast.  Agentic systems, such as AutoGen and Swarm, where AI-based agents are defined and interact with each other, are already widely available.  How does our existing understanding of building robust software translate to these new scenarios?  Learning what such systems are capable of and how we can construct them robustly is increasingly important for the research community to explore.

The challenges and effective strategies for creating robust prompts are not well understood and will evolve as rapidly as the underlying LLMs and systems do.  The PL and SE communities must be agile and eager to bring decades of research and experience building languages and tools for robust software development to this new and important domain.

Biographies:
Tommy Guy is a Principal Architect on the Copilot AI team at Microsoft. His research interests include AI-assisted data mining, large-scale A/B testing, and the productization of AI.

Peli de Halleux is a Principal Research Software Developer Engineer in Redmond, Washington, working in the Research in Software Engineering (RiSE) group. His research interests include empowering individuals to build LLM-powered applications more efficiently.

Reshabh K Sharma is a PhD student at the University of Washington. His research lies at the intersection of programming languages and security, focusing on developing infrastructure for creating secure systems and improving existing systems using software-based mitigations to address various vulnerabilities, including those in LLM-based systems.

Ben Zorn is a Partner Researcher at Microsoft Research in Redmond, Washington, working in (and previously having managed) the Research in Software Engineering (RiSE) group. His research interests include programming language design and implementation, end-user programming, and empowering individuals with responsible uses of artificial intelligence.

Disclaimer: These posts are written by individual contributors to share their thoughts on the SIGPLAN blog for the benefit of the community. Any views or opinions represented in this blog are personal, belong solely to the blog author and do not represent those of ACM SIGPLAN or its parent organization, ACM.