AMIC: An Adaptive Information Theoretic Method to Identify Multi-Scale Temporal Correlations in Big Time Series Data -- Accepted Version

Ho, Nguyen; Vo, Huy; Vu, Mai; Pedersen, Torben Bach

doi:10.1109/TBDATA.2019.2907987

arXiv:1906.09995 (cs)

[Submitted on 24 Jun 2019 (v1), last revised 7 Jul 2019 (this version, v2)]

Abstract: Recent developments in computing, sensing, and crowd-sourced data have resulted in an explosion in the availability of quantitative information. The possibilities for analyzing this so-called Big Data to inform research and decision-making are virtually endless. In general, analyses must span multiple data sets to extract the most value from Big Data, and an important first step is to identify temporal correlations between data sets. Given the volume and velocity of Big Data, techniques that identify correlations need to be fast and scalable, and they also need to help users order the correlations across temporal scales so that they can focus on important relationships. In this paper, we present AMIC (Adaptive Mutual Information-based Correlation), a method based on mutual information to identify correlations at multiple temporal scales in large time series. Discovered correlations are suggested to users in order of the strength of the relationships. Our method supports an adaptive streaming technique that minimizes duplicated computation and is implemented on top of Apache Spark for scalability. We also provide a comprehensive evaluation of the effectiveness and scalability of AMIC using both synthetic and real-world data sets.
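
To make the idea of ranking mutual-information scores across temporal scales concrete, the following is a minimal sketch and not the AMIC algorithm itself. It assumes a simple histogram (plug-in) estimator, fixed non-overlapping windows, and in-memory lists; the adaptive streaming technique and the Apache Spark implementation described in the abstract are not reproduced. The function names `mutual_information` and `rank_windows` and the `bins` parameter are hypothetical, chosen only for illustration.

```python
import math
from collections import Counter


def mutual_information(xs, ys, bins=4):
    """Histogram (plug-in) estimate of I(X;Y) in bits for two equal-length sequences.

    Assumption: equal-width binning; the paper's estimator may differ.
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("sequences must be non-empty and of equal length")

    def discretize(values):
        lo, hi = min(values), max(values)
        width = (hi - lo) / bins or 1.0  # a constant series falls into a single bin
        return [min(int((v - lo) / width), bins - 1) for v in values]

    bx, by = discretize(xs), discretize(ys)
    n = len(xs)
    joint = Counter(zip(bx, by))
    px, py = Counter(bx), Counter(by)
    # Sum over observed cells of p(x,y) * log2(p(x,y) / (p(x) * p(y))).
    return sum(
        (c / n) * math.log2(c * n / (px[i] * py[j]))
        for (i, j), c in joint.items()
    )


def rank_windows(xs, ys, window_sizes, bins=4):
    """Score each non-overlapping window at each scale; return strongest first.

    Each result is a tuple (mi_bits, start_index, window_size).
    """
    scored = []
    for w in window_sizes:
        for start in range(0, len(xs) - w + 1, w):
            mi = mutual_information(xs[start:start + w], ys[start:start + w], bins)
            scored.append((mi, start, w))
    return sorted(scored, reverse=True)


if __name__ == "__main__":
    # Two toy series: identical in the first half, unrelated (constant) in the second.
    xs = list(range(32))
    ys = list(range(16)) + [0] * 16
    for mi, start, w in rank_windows(xs, ys, window_sizes=[8, 16]):
        print(f"window size {w:2d} at offset {start:2d}: {mi:.3f} bits")
```

Sorting all windows from all scales into one list mirrors the abstract's point that correlations should be presented in order of strength, so a user sees the strongest relationships first regardless of the scale at which they occur.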

Subjects:	Distributed, Parallel, and Cluster Computing (cs.DC)
Cite as:	arXiv:1906.09995 [cs.DC]
	(or arXiv:1906.09995v2 [cs.DC] for this version)
	https://doi.org/10.48550/arXiv.1906.09995
Related DOI:	https://doi.org/10.1109/TBDATA.2019.2907987

Submission history

From: Nguyen Ho Ms.
[v1] Mon, 24 Jun 2019 14:31:46 UTC (2,856 KB)
[v2] Sun, 7 Jul 2019 21:28:02 UTC (2,856 KB)
