BigGIS: A Continuous Refinement Approach to Master Heterogeneity and Uncertainty in Spatio-Temporal Big Data
The following Great Innovative Idea is from Patrick Wiener of Karlsruhe University of Applied Sciences. Wiener and his coauthors, Manuel Stein and Daniel Seebacher from the University of Konstanz; Julian Bruns, Matthias T. Frank, Viliam Simko, and Stefan Zander from the FZI Research Center for Information Technology; and Jens Nimis from Karlsruhe University of Applied Sciences, were among the winners of the Computing Community Consortium (CCC) sponsored Blue Sky Ideas Track Competition at the ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems 2016 (SIGSPATIAL 2016) in San Francisco, CA. Their winning paper is titled BigGIS: A Continuous Refinement Approach to Master Heterogeneity and Uncertainty in Spatio-Temporal Big Data.
The Innovative Idea
Geographic information systems (GIS) are important for decision support based on spatial data. Due to technical and economic progress, an ever-increasing number of data sources is available, leading to a rapidly growing, fast-moving and unreliable body of data that can be beneficial (1) in approximating multivariate and causal predictions of future values and (2) in robust and proactive decision-making processes. However, today’s GIS are not designed for such big data demands and require new methodologies to effectively model uncertainty and generate meaningful knowledge. We therefore introduce BigGIS, a predictive and prescriptive spatio-temporal analytics platform that symbiotically combines big data analytics, semantic web technologies and visual analytics methodologies. We propose the continuous refinement model in BigGIS, which will, on the one hand, steadily improve the analysis results, e.g., by updating deployed machine learning models, and, on the other hand, build the user’s trust in these results by creating awareness of underlying uncertainties and data provenance. This trust is key to providing meaningful predictive and prescriptive decision support in various fields. We consider uncertainty to be reciprocally related to generating new insights and, consequently, knowledge. Thus, modeling uncertainty is a crucial task in BigGIS. From a high-level perspective, our approach consists of an integrated analytics pipeline that blends big data analytics and semantic web services on the system side with domain expert knowledge on the human side, thereby modeling uncertainty to continuously refine results and generate new knowledge.
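As a rough illustration of refining a result while keeping its uncertainty and provenance visible, the following minimal Python sketch updates a running estimate one reading at a time. It is not the BigGIS implementation: the class name RefinedEstimate, the source labels and the sample values are all hypothetical, and the standard error used here assumes independent readings, which real sensor streams may not satisfy.

```python
import math
from dataclasses import dataclass, field


@dataclass
class RefinedEstimate:
    """Running estimate of one measured quantity, refined reading by reading."""
    count: int = 0
    mean: float = 0.0
    m2: float = 0.0  # sum of squared deviations from the running mean
    sources: list = field(default_factory=list)  # provenance of each reading

    def update(self, value: float, source: str) -> None:
        """Fold in one new reading using Welford's online algorithm."""
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (value - self.mean)
        self.sources.append(source)

    @property
    def standard_error(self) -> float:
        """Uncertainty of the mean; infinite until two readings exist."""
        if self.count < 2:
            return math.inf
        variance = self.m2 / (self.count - 1)
        return math.sqrt(variance / self.count)


# Hypothetical readings: (value, name of the reporting source).
readings = [(10.0, "uav-1"), (12.0, "uav-1"), (11.0, "station-7"), (13.0, "uav-2")]

estimate = RefinedEstimate()
for value, source in readings:
    estimate.update(value, source)
    print(f"mean={estimate.mean:.2f} +/- {estimate.standard_error:.2f} "
          f"from {len(estimate.sources)} readings")
```

Each new reading updates the estimate without reprocessing earlier data, and the reported uncertainty and source list travel with the result, so a user can see how much to trust it and where it came from.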
Impact
Current GIS solutions mostly tackle big data related requirements in terms of data volume or data velocity. In the era of cloud computing, leveraging cloud-based resources is a widely adopted pattern. In addition, with the advent of big data analytics, performing massively parallel analytical tasks on large-scale data at rest or data in motion is becoming a feasible approach that shapes the design of today’s GIS. Although scaling out enables GIS to tackle these requirements, two major issues remain open. Firstly, dealing with varying data types across multiple data sources (variety) leads to data and schema heterogeneity, e.g., when locations are described by addresses, relative spatial relationships or different coordinate reference systems. Secondly, the inherent uncertainties in data (veracity) need to be modeled, e.g., real-world noise and erroneous values that result from the data collection process. Both are crucial tasks in data management and analytics that directly affect the quality of information retrieval and decision-making and, moreover, the knowledge generated on the human side (value). By leveraging the continuous refinement model, we present a holistic approach that explicitly deals with all big data dimensions. By integrating the user in the process, computers can learn from the cognitive and perceptive skills of human analysts to establish hidden connections between data and the problem domain. This helps to decrease noise and uncertainty and builds trust in the analysis results on the user side, which will eventually increase the likelihood of relevant findings and generated knowledge.
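To make the variety problem concrete, the following minimal Python sketch maps locations given in three different notations to one common form, longitude and latitude in decimal degrees. The record layout with its "kind" field is a hypothetical convention chosen for this illustration and is not taken from BigGIS. Addresses and relative spatial relationships are left out, since they would require a geocoding service.

```python
import math

EARTH_RADIUS_M = 6378137.0  # sphere radius used by Web Mercator (EPSG:3857)


def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees/minutes/seconds plus hemisphere letter to decimal degrees."""
    value = degrees + minutes / 60.0 + seconds / 3600.0
    return -value if hemisphere in ("S", "W") else value


def web_mercator_to_wgs84(x, y):
    """Invert the spherical Web Mercator projection (meters to degrees)."""
    lon = math.degrees(x / EARTH_RADIUS_M)
    lat = math.degrees(2.0 * math.atan(math.exp(y / EARTH_RADIUS_M)) - math.pi / 2.0)
    return lon, lat


def normalize(record):
    """Map a tagged record (hypothetical layout) to (lon, lat) in decimal degrees."""
    kind = record["kind"]
    if kind == "decimal":
        return record["lon"], record["lat"]
    if kind == "dms":
        return dms_to_decimal(*record["lon"]), dms_to_decimal(*record["lat"])
    if kind == "web_mercator":
        return web_mercator_to_wgs84(record["x"], record["y"])
    raise ValueError(f"unsupported location kind: {kind}")


# Hypothetical records from three sources that use different notations.
records = [
    {"kind": "decimal", "lon": 8.4, "lat": 49.01},
    {"kind": "dms", "lon": (8, 24, 0, "E"), "lat": (49, 0, 36, "N")},
    {"kind": "web_mercator", "x": 0.0, "y": 0.0},
]
for record in records:
    print(record["kind"], normalize(record))
```

In practice, a projection library would handle the full range of coordinate reference systems. The hand-written inverse here only covers the spherical Web Mercator case to keep the sketch self-contained.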
Other Research
Our research team in BigGIS focuses on three distinct application domains, (1) disaster management, (2) smart city and health and (3) environmental management, where we see great benefit in the use of a predictive and prescriptive GIS. These scenarios represent diverse categories of application domains. One illustrative example is the disaster management scenario, in which we try to support rescue forces in assessing and managing large-scale and complex chemical disasters by providing an in-depth overview of the current situation within a short time frame, in order to prevent the surrounding population from being exposed to any hazardous substances. In this scenario, our main idea is to leverage recent developments in the field of mobile robotics, which enable in-situ components such as unmanned aerial vehicles (UAVs). Equipped with hyperspectral cameras, a UAV can scan the affected area more quickly and flexibly, thereby creating a huge data stream that BigGIS needs to efficiently process and analyze in order to predict, e.g., the type of substance(s) and/or the dispersion of the hazardous clouds. To this end, other data sources, e.g., meteorological data, can be leveraged to further enrich the data stream with relevant features for our machine learning models. Our research focuses on an integrated view of heterogeneous spatio-temporal data from unstructured and unreliable datasets, as well as on the design of an analytical pipeline for predictive, prescriptive and visual analysis.
Researcher’s Background
Since June 2015, I have been a research assistant at the Institute of Applied Research and the Faculty of Management Science and Engineering at Karlsruhe University of Applied Sciences, where I am responsible for designing and building an infrastructure for geospatial analytics at scale as part of the collaborative research project “BigGIS – Predictive and prescriptive GIS based on high-dimensional geo-temporal data structures”, which is funded by the German Federal Ministry of Education and Research (reference number: 01IS14012). I hold an M.Sc. in Management Science and Engineering from Karlsruhe University of Applied Sciences in Germany. My research interests are container-based infrastructure designs and distributed computing on big data frameworks. My research goals focus on managing container-based infrastructure at large scale.
Links
For further information about our research, please check our website at http://biggis-project.eu/.