-
Visual Progression Analysis of Student Records Data
Authors:
Mohammad Raji,
John Duggan,
Blaise DeCotes,
Jian Huang,
Bradley Vander Zanden
Abstract:
University curricula, at both the campus level and the per-major level, are affected in complex ways by the many decisions of administrators and faculty over time. As universities across the United States share an urgency to significantly improve student success and success retention, there is a pressing need to better understand how the student population is progressing through the curriculum, and how to provide better supporting infrastructure and refine the curriculum to improve student outcomes. This work has developed a visual knowledge discovery system called eCamp that pulls together a variety of population-scale data products, including student grades, major descriptions, and graduation records. These datasets were previously disconnected, and each was available only to the independent campus office that maintained it. The framework models and analyzes the multi-level relationships hidden within these data products, and visualizes the student flow patterns through individual majors as well as through a hierarchy of majors. These results support analytical tasks involving student outcomes, student retention, and curriculum design. It is shown how eCamp has revealed student progression information that was previously unavailable.
Submitted 18 October, 2017;
originally announced October 2017.
-
BigDAWG Polystore Release and Demonstration
Authors:
Kyle OBrien,
Vijay Gadepally,
Jennie Duggan,
Adam Dziedzic,
Aaron Elmore,
Jeremy Kepner,
Samuel Madden,
Tim Mattson,
Zuohao She,
Michael Stonebraker
Abstract:
The Intel Science and Technology Center for Big Data is developing a reference implementation of a Polystore database. The BigDAWG (Big Data Working Group) system supports "many sizes" of database engines, multiple programming languages and complex analytics for a variety of workloads. Our recent efforts include application of BigDAWG to an ocean metagenomics problem and containerization of BigDAWG. We intend to release an open source BigDAWG v1.0 in the Spring of 2017. In this article, we will demonstrate a number of polystore applications developed with oceanographic researchers at MIT and describe our forthcoming open source release of the BigDAWG system.
Submitted 18 January, 2017;
originally announced January 2017.
-
The BigDAWG Polystore System and Architecture
Authors:
Vijay Gadepally,
Peinan Chen,
Jennie Duggan,
Aaron Elmore,
Brandon Haynes,
Jeremy Kepner,
Samuel Madden,
Tim Mattson,
Michael Stonebraker
Abstract:
Organizations are often faced with the challenge of providing data management solutions for large, heterogeneous datasets that may have different underlying data and programming models. For example, a medical dataset may have unstructured text, relational data, time series waveforms and imagery. Trying to fit such datasets in a single data management system can have adverse performance and efficiency effects. As a part of the Intel Science and Technology Center on Big Data, we are developing a polystore system designed for such problems. BigDAWG (short for the Big Data Analytics Working Group) is a polystore system designed to work on complex problems that naturally span different processing or storage engines. BigDAWG provides an architecture that supports diverse database systems working with different data models, support for the competing notions of location transparency and semantic completeness via islands, and a middleware that provides a uniform multi-island interface. Initial results from a prototype of the BigDAWG system applied to a medical dataset validate polystore concepts. In this article, we will describe polystore databases, the current BigDAWG architecture and its application on the MIMIC II medical dataset, initial performance results and our future development plans.
Submitted 23 September, 2016;
originally announced September 2016.
-
The BigDAWG Architecture
Authors:
Vijay Gadepally,
Jennie Duggan,
Aaron Elmore,
Jeremy Kepner,
Samuel Madden,
Tim Mattson,
Michael Stonebraker
Abstract:
BigDAWG is a polystore system designed to work on complex problems that naturally span different processing or storage engines. BigDAWG provides an architecture that supports diverse database systems working with different data models, support for the competing notions of location transparency and semantic completeness via islands of information, and a middleware that provides a uniform multi-island interface. In this article, we describe the current architecture of BigDAWG, its application on the MIMIC II medical dataset, and our plans for the mechanics of cross-system queries. During the presentation, we will also deliver a brief demonstration of the current version of BigDAWG.
Submitted 28 February, 2016;
originally announced February 2016.