What is Data Science in the 21st Century?

Last July, a distinguished panel of computer scientists, David Culler (UC Berkeley), Rayid Ghani (U of Chicago), Rahel Jhirad (Hearst), and Rob Rutenbar (UIUC), discussed this question with approximately 100 attendees of the CRA Conference at Snowbird. There was agreement that data science is an interdisciplinary field that combines techniques from machine learning, natural language processing, data mining, algorithms, information retrieval, and related areas.

Computing Research and the Emerging Field of Data Science

Our ability to collect, manipulate, analyze, and act on vast amounts of data is having a profound impact on all aspects of society. This transformation has led to the emergence of data science as a new discipline. The explosive growth of interest in this area has been driven by research in the social, natural, and physical sciences with access to data at an unprecedented scale and variety; by industry assembling huge amounts of operational and behavioral information to create new services and sources of revenue; and by government, social services, and non-profits leveraging data for social good. This emerging discipline relies on a novel mix of mathematical and statistical modeling, computational thinking and methods, data representation and management, and domain expertise. While computing fields already provide many principles, tools, and techniques to support data science applications and use cases, the computer science community also has the opportunity to contribute to the new research needed to drive the field's further development. In addition, the community has an obligation to engage in developing guidelines for the responsible use of data science.