Showing 1–12 of 12 results for author: Veeramachaneni, K

  1. arXiv:2212.13558  [pdf, other]

    cs.LG stat.ML

    AER: Auto-Encoder with Regression for Time Series Anomaly Detection

    Authors: Lawrence Wong, Dongyu Liu, Laure Berti-Equille, Sarah Alnegheimish, Kalyan Veeramachaneni

    Abstract: Anomaly detection on time series data is increasingly common across various industrial domains that monitor metrics in order to prevent potential accidents and economic losses. However, a scarcity of labeled data and ambiguous definitions of anomalies can complicate these efforts. Recent unsupervised machine learning methods have made remarkable progress in tackling this problem using either singl…

    Submitted 27 December, 2022; originally announced December 2022.

    Comments: This work is accepted by IEEE BigData 2022. The paper contains 10 pages, 6 figures, and 4 tables

  2. arXiv:2010.00509  [pdf, other]

    cs.LG stat.ML

    Cardea: An Open Automated Machine Learning Framework for Electronic Health Records

    Authors: Sarah Alnegheimish, Najat Alrashed, Faisal Aleissa, Shahad Althobaiti, Dongyu Liu, Mansour Alsaleh, Kalyan Veeramachaneni

    Abstract: An estimated 180 papers focusing on deep learning and EHR were published between 2010 and 2018. Despite the common workflow structure appearing in these publications, no trusted and verified software framework exists, forcing researchers to arduously repeat previous work. In this paper, we propose Cardea, an extensible open-source automated machine learning framework encapsulating common predictio…

    Submitted 1 October, 2020; originally announced October 2020.

  3. arXiv:2009.07769  [pdf, other]

    cs.LG stat.ML

    TadGAN: Time Series Anomaly Detection Using Generative Adversarial Networks

    Authors: Alexander Geiger, Dongyu Liu, Sarah Alnegheimish, Alfredo Cuesta-Infante, Kalyan Veeramachaneni

    Abstract: Time series anomalies can offer information relevant to critical situations facing various fields, from finance and aerospace to the IT, security, and medical domains. However, detecting anomalies in time series data is particularly challenging due to the vague definition of anomalies and said data's frequent lack of labels and highly complex temporal correlations. Current state-of-the-art unsuper…

    Submitted 14 November, 2020; v1 submitted 16 September, 2020; originally announced September 2020.

    Comments: Alexander Geiger and Dongyu Liu contributed equally. To appear in the proceedings of IEEE International Conference on Big Data

  4. arXiv:1908.07009  [pdf, other]

    cs.LG stat.ML

    Towards Reducing Biases in Combining Multiple Experts Online

    Authors: Yi Sun, Ivan Ramirez, Alfredo Cuesta-Infante, Kalyan Veeramachaneni

    Abstract: In many real life situations, including job and loan applications, gatekeepers must make justified and fair real-time decisions about a person's fitness for a particular opportunity. In this paper, we aim to accomplish approximate group fairness in an online stochastic decision-making process, where the fairness metric we consider is equalized odds. Our work follows from the classical learning-fro…

    Submitted 24 May, 2021; v1 submitted 19 August, 2019; originally announced August 2019.

    Comments: Accepted to IJCAI 2021

  5. arXiv:1907.00503  [pdf, other]

    cs.LG stat.ML

    Modeling Tabular data using Conditional GAN

    Authors: Lei Xu, Maria Skoularidou, Alfredo Cuesta-Infante, Kalyan Veeramachaneni

    Abstract: Modeling the probability distribution of rows in tabular data and generating realistic synthetic data is a non-trivial task. Tabular data usually contains a mix of discrete and continuous columns. Continuous columns may have multiple modes whereas discrete columns are sometimes imbalanced making the modeling difficult. Existing statistical and deep neural network models fail to properly model this…

    Submitted 27 October, 2019; v1 submitted 30 June, 2019; originally announced July 2019.

    Comments: Accepted to NeurIPS 2019

  6. arXiv:1906.12348  [pdf, other]

    cs.LG cs.IR stat.ML

    MLFriend: Interactive Prediction Task Recommendation for Event-Driven Time-Series Data

    Authors: Lei Xu, Shubhra Kanti Karmaker Santu, Kalyan Veeramachaneni

    Abstract: Most automation in machine learning focuses on model selection and hyper parameter tuning, and many overlook the challenge of automatically defining predictive tasks. We still heavily rely on human experts to define prediction tasks, and generate labels by aggregating raw data. In this paper, we tackle the challenge of defining useful prediction problems on event-driven time-series data. We introd…

    Submitted 28 June, 2019; originally announced June 2019.

    Comments: 12 pages

  7. arXiv:1905.08942  [pdf, other]

    cs.SE cs.LG stat.ML

    The Machine Learning Bazaar: Harnessing the ML Ecosystem for Effective System Development

    Authors: Micah J. Smith, Carles Sala, James Max Kanter, Kalyan Veeramachaneni

    Abstract: As machine learning is applied more widely, data scientists often struggle to find or create end-to-end machine learning systems for specific tasks. The proliferation of libraries and frameworks and the complexity of the tasks have led to the emergence of "pipeline jungles" - brittle, ad hoc ML systems. To address these problems, we introduce the Machine Learning Bazaar, a new framework for develo…

    Submitted 7 April, 2020; v1 submitted 21 May, 2019; originally announced May 2019.

    Comments: To appear in SIGMOD '20

    Journal ref: In Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data (SIGMOD '20). Association for Computing Machinery, New York, NY, USA, 785-800

  8. arXiv:1902.05009  [pdf, other]

    cs.LG cs.HC stat.ML

    ATMSeer: Increasing Transparency and Controllability in Automated Machine Learning

    Authors: Qianwen Wang, Yao Ming, Zhihua Jin, Qiaomu Shen, Dongyu Liu, Micah J. Smith, Kalyan Veeramachaneni, Huamin Qu

    Abstract: To relieve the pain of manually selecting machine learning algorithms and tuning hyperparameters, automated machine learning (AutoML) methods have been developed to automatically search for good models. Due to the huge model search space, it is impossible to try all models. Users tend to distrust automatic results and increase the search budget as much as they can, thereby undermining the efficien…

    Submitted 13 February, 2019; originally announced February 2019.

    Comments: Published in the ACM Conference on Human Factors in Computing Systems (CHI), 2019, Glasgow, Scotland UK

    Journal ref: In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems (CHI '19). Association for Computing Machinery, New York, NY, USA, Paper 681, 1-12

  9. arXiv:1901.03892  [pdf, other]

    cs.CV cs.LG cs.MM stat.ML

    SteganoGAN: High Capacity Image Steganography with GANs

    Authors: Kevin Alex Zhang, Alfredo Cuesta-Infante, Lei Xu, Kalyan Veeramachaneni

    Abstract: Image steganography is a procedure for hiding messages inside pictures. While other techniques such as cryptography aim to prevent adversaries from reading the secret message, steganography aims to hide the presence of the message itself. In this paper, we propose a novel technique for hiding arbitrary binary data in images using generative adversarial networks which allow us to optimize the perce…

    Submitted 29 January, 2019; v1 submitted 12 January, 2019; originally announced January 2019.

  10. arXiv:1812.01226  [pdf, other]

    cs.LG stat.ML

    Learning Vine Copula Models For Synthetic Data Generation

    Authors: Yi Sun, Alfredo Cuesta-Infante, Kalyan Veeramachaneni

    Abstract: A vine copula model is a flexible high-dimensional dependence model which uses only bivariate building blocks. However, the number of possible configurations of a vine copula grows exponentially as the number of variables increases, making model selection a major challenge in development. In this work, we formulate a vine structure learning problem with both vector and reinforcement learning repre…

    Submitted 4 December, 2018; originally announced December 2018.

  11. arXiv:1811.11960  [pdf, other]

    cs.LG stat.ML

    Prediction Factory: automated development and collaborative evaluation of predictive models

    Authors: Gaurav Sheni, Benjamin Schreck, Roy Wedge, James Max Kanter, Kalyan Veeramachaneni

    Abstract: In this paper, we present a data science automation system called Prediction Factory. The system uses several key automation algorithms to enable data scientists to rapidly develop predictive models and share them with domain experts. To assess the system's impact, we implemented 3 different interfaces for creating predictive modeling projects: baseline automation, full automation, and optional au…

    Submitted 28 November, 2018; originally announced November 2018.

  12. arXiv:1811.11264  [pdf, other]

    cs.LG stat.ML

    Synthesizing Tabular Data using Generative Adversarial Networks

    Authors: Lei Xu, Kalyan Veeramachaneni

    Abstract: Generative adversarial networks (GANs) implicitly learn the probability distribution of a dataset and can draw samples from the distribution. This paper presents, Tabular GAN (TGAN), a generative adversarial network which can generate tabular data like medical or educational records. Using the power of deep neural networks, TGAN generates high-quality and fully synthetic tables while simultaneousl…

    Submitted 27 November, 2018; originally announced November 2018.