Showing 1–22 of 22 results for author: Chung, J

Searching in archive stat. Search in all archives.
  1. arXiv:2502.01634  [pdf, other]

    cs.LG cs.AI cs.CR stat.ML

    Online Gradient Boosting Decision Tree: In-Place Updates for Efficient Adding/Deleting Data

    Authors: Huawei Lin, Jun Woo Chung, Yingjie Lao, Weijie Zhao

    Abstract: Gradient Boosting Decision Tree (GBDT) is one of the most popular machine learning models in various applications. However, in the traditional settings, all data should be simultaneously accessed in the training procedure: it does not allow to add or delete any data instances after training. In this paper, we propose an efficient online learning framework for GBDT supporting both incremental and d…

    Submitted 3 February, 2025; originally announced February 2025.

    Comments: 25 pages, 11 figures, 16 tables. Keywords: Decremental Learning, Incremental Learning, Machine Unlearning, Online Learning, Gradient Boosting Decision Trees, GBDTs

  2. arXiv:2310.16058  [pdf, other]

    cs.LG stat.AP

    A Sparse Bayesian Learning for Diagnosis of Nonstationary and Spatially Correlated Faults with Application to Multistation Assembly Systems

    Authors: Jihoon Chung, Zhenyu Kong

    Abstract: Sensor technology developments provide a basis for effective fault diagnosis in manufacturing systems. However, the limited number of sensors due to physical constraints or undue costs hinders the accurate diagnosis in the actual process. In addition, time-varying operational conditions that generate nonstationary process faults and the correlation information in the process require to consider fo…

    Submitted 20 October, 2023; originally announced October 2023.

  3. arXiv:2307.13868  [pdf, other]

    stat.ME cs.LG stat.ML

    Learning sources of variability from high-dimensional observational studies

    Authors: Eric W. Bridgeford, Jaewon Chung, Brian Gilbert, Sambit Panda, Adam Li, Cencheng Shen, Alexandra Badea, Brian Caffo, Joshua T. Vogelstein

    Abstract: Causal inference studies whether the presence of a variable influences an observed outcome. As measured by quantities such as the "average treatment effect," this paradigm is employed across numerous biological fields, from vaccine and drug development to policy interventions. Unfortunately, the majority of these methods are often limited to univariate outcomes. Our work generalizes causal estiman…

    Submitted 28 November, 2023; v1 submitted 25 July, 2023; originally announced July 2023.

  4. arXiv:2303.06815  [pdf, other]

    cs.LG stat.ML

    On Model Compression for Neural Networks: Framework, Algorithm, and Convergence Guarantee

    Authors: Chenyang Li, Jihoon Chung, Mengnan Du, Haimin Wang, Xianlian Zhou, Bo Shen

    Abstract: Model compression is a crucial part of deploying neural networks (NNs), especially when the memory and storage of computing devices are limited in many applications. This paper focuses on two model compression techniques: low-rank approximation and weight pruning in neural networks, which are very popular nowadays. However, training NN with low-rank approximation and weight pruning always suffers…

    Submitted 15 August, 2024; v1 submitted 12 March, 2023; originally announced March 2023.

    Comments: 44 pages

  5. arXiv:2210.17274  [pdf, other]

    cs.LG stat.AP

    Anomaly Detection in Additive Manufacturing Processes using Supervised Classification with Imbalanced Sensor Data based on Generative Adversarial Network

    Authors: Jihoon Chung, Bo Shen, Zhenyu Kong

    Abstract: Supervised classification methods have been widely utilized for the quality assurance of the advanced manufacturing process, such as additive manufacturing (AM) for anomaly (defects) detection. However, since abnormal states (with defects) occur much less frequently than normal ones (without defects) in a manufacturing process, the number of sensor data samples collected from a normal state is usu…

    Submitted 25 November, 2022; v1 submitted 28 October, 2022; originally announced October 2022.

  6. arXiv:2210.17272  [pdf]

    cs.LG stat.AP

    Reinforcement Learning-based Defect Mitigation for Quality Assurance of Additive Manufacturing

    Authors: Jihoon Chung, Bo Shen, Andrew Chung Chee Law, Zhenyu Kong

    Abstract: Additive Manufacturing (AM) is a powerful technology that produces complex 3D geometries using various materials in a layer-by-layer fashion. However, quality assurance is the main challenge in AM industry due to the possible time-varying processing conditions during AM process. Notably, new defects may occur during printing, which cannot be mitigated by offline analysis tools that focus on existi…

    Submitted 28 October, 2022; originally announced October 2022.

  7. arXiv:2210.16176  [pdf, other]

    stat.AP cs.LG

    A Novel Sparse Bayesian Learning and Its Application to Fault Diagnosis for Multistation Assembly Systems

    Authors: Jihoon Chung, Bo Shen, Zhenyu Kong

    Abstract: This paper addresses the problem of fault diagnosis in multistation assembly systems. Fault diagnosis is to identify process faults that cause the excessive dimensional variation of the product using dimensional measurements. For such problems, the challenge is solving an underdetermined system caused by a common phenomenon in practice; namely, the number of measurements is less than that of the p…

    Submitted 28 October, 2022; originally announced October 2022.

  8. arXiv:2109.14002  [pdf, other]

    cs.LG math.NA stat.ML

    slimTrain -- A Stochastic Approximation Method for Training Separable Deep Neural Networks

    Authors: Elizabeth Newman, Julianne Chung, Matthias Chung, Lars Ruthotto

    Abstract: Deep neural networks (DNNs) have shown their success as high-dimensional function approximators in many applications; however, training DNNs can be challenging in general. DNN training is commonly phrased as a stochastic optimization problem whose challenges include non-convexity, non-smoothness, insufficient regularization, and complicated data distributions. Hence, the performance of DNNs on a g…

    Submitted 28 September, 2021; originally announced September 2021.

    Comments: 26 pages, 10 figures, 1 table

    MSC Class: 68T07; 65K99; 65C20 ACM Class: G.1.6

  9. arXiv:2011.14990  [pdf, other]

    q-bio.NC stat.ME

    Multiscale Comparative Connectomics

    Authors: Vivek Gopalakrishnan, Jaewon Chung, Eric Bridgeford, Benjamin D. Pedigo, Jesús Arroyo, Lucy Upchurch, G. Allan Johnson, Nian Wang, Youngser Park, Carey E. Priebe, Joshua T. Vogelstein

    Abstract: The connectome, a map of the structural and/or functional connections in the brain, provides a complex representation of the neurobiological phenotypes on which it supervenes. This information-rich data modality has the potential to transform our understanding of the relationship between patterns in brain connectivity and neurological processes, disorders, and diseases. However, existing computati…

    Submitted 2 December, 2024; v1 submitted 30 November, 2020; originally announced November 2020.

  10. arXiv:2007.10504  [pdf, other]

    cs.AI cs.LG stat.ML

    Battlesnake Challenge: A Multi-agent Reinforcement Learning Playground with Human-in-the-loop

    Authors: Jonathan Chung, Anna Luo, Xavier Raffin, Scott Perry

    Abstract: We present the Battlesnake Challenge, a framework for multi-agent reinforcement learning with Human-In-the-Loop Learning (HILL). It is developed upon Battlesnake, a multiplayer extension of the traditional Snake game in which 2 or more snakes compete for the final survival. The Battlesnake Challenge consists of an offline module for model training and an online module for live competitions. We dev…

    Submitted 20 July, 2020; originally announced July 2020.

  11. arXiv:2006.04088  [pdf, other]

    stat.ML cs.LG

    An Efficient Framework for Clustered Federated Learning

    Authors: Avishek Ghosh, Jichan Chung, Dong Yin, Kannan Ramchandran

    Abstract: We address the problem of federated learning (FL) where users are distributed and partitioned into clusters. This setup captures settings where different groups of users have their own objectives (learning tasks) but by aggregating their data with others in the same cluster (same learning task), they can leverage the strength in numbers in order to perform more efficient federated learning. For th…

    Submitted 8 June, 2021; v1 submitted 7 June, 2020; originally announced June 2020.

    Comments: Preliminary results appeared at NeurIPS 2020

  12. arXiv:1912.02591  [pdf, other]

    eess.AS cs.LG cs.MM cs.SD stat.ML

    Investigating U-Nets with various Intermediate Blocks for Spectrogram-based Singing Voice Separation

    Authors: Woosung Choi, Minseok Kim, Jaehwa Chung, Daewon Lee, Soonyoung Jung

    Abstract: Singing Voice Separation (SVS) tries to separate singing voice from a given mixed musical signal. Recently, many U-Net-based models have been proposed for the SVS task, but there were no existing works that evaluate and compare various types of intermediate blocks that can be used in the U-Net architecture. In this paper, we introduce a variety of intermediate spectrogram transformation blocks. We…

    Submitted 8 October, 2020; v1 submitted 2 December, 2019; originally announced December 2019.

    Comments: 8 pages 4 tables 6 figures, accepted to ISMIR 2020

  13. arXiv:1912.02522  [pdf, other]

    cs.SD cs.LG eess.AS stat.ML

    VoxSRC 2019: The first VoxCeleb Speaker Recognition Challenge

    Authors: Joon Son Chung, Arsha Nagrani, Ernesto Coto, Weidi Xie, Mitchell McLaren, Douglas A Reynolds, Andrew Zisserman

    Abstract: The VoxCeleb Speaker Recognition Challenge 2019 aimed to assess how well current speaker recognition technology is able to identify speakers in unconstrained or `in the wild' data. It consisted of: (i) a publicly available speaker recognition dataset from YouTube videos together with ground truth annotation and standardised evaluation software; and (ii) a public challenge and workshop held at Inte…

    Submitted 5 December, 2019; originally announced December 2019.

    Comments: ISCA Archive

  14. Valid Two-Sample Graph Testing via Optimal Transport Procrustes and Multiscale Graph Correlation with Applications in Connectomics

    Authors: Jaewon Chung, Bijan Varjavand, Jesus Arroyo, Anton Alyakin, Joshua Agterberg, Minh Tang, Joshua T. Vogelstein, Carey E. Priebe

    Abstract: Testing whether two graphs come from the same distribution is of interest in many real world scenarios, including brain network analysis. Under the random dot product graph model, the nonparametric hypothesis testing framework consists of embedding the graphs using the adjacency spectral embedding (ASE), followed by aligning the embeddings using the median flip heuristic, and finally applying the…

    Submitted 13 September, 2021; v1 submitted 6 November, 2019; originally announced November 2019.

    Comments: 12 pages, 3 figures

  15. arXiv:1911.01458  [pdf, other]

    eess.IV cs.LG physics.med-ph stat.ML

    Dual-domain Cascade of U-nets for Multi-channel Magnetic Resonance Image Reconstruction

    Authors: Roberto Souza, Mariana Bento, Nikita Nogovitsyn, Kevin J. Chung, R. Marc Lebel, Richard Frayne

    Abstract: The U-net is a deep-learning network model that has been used to solve a number of inverse problems. In this work, the concatenation of two-element U-nets, termed the W-net, operating in k-space (K) and image (I) domains, were evaluated for multi-channel magnetic resonance (MR) image reconstruction. The two element network combinations were evaluated for the four possible image-k-space domain conf…

    Submitted 4 November, 2019; originally announced November 2019.

  16. arXiv:1908.06486  [pdf, other]

    stat.ML cs.LG stat.ME

    Independence Testing for Temporal Data

    Authors: Cencheng Shen, Jaewon Chung, Ronak Mehta, Ting Xu, Joshua T. Vogelstein

    Abstract: Temporal data are increasingly prevalent in modern data science. A fundamental question is whether two time series are related or not. Existing approaches often have limitations, such as relying on parametric assumptions, detecting only linear associations, and requiring multiple tests and corrections. While many non-parametric and universally consistent dependence measures have recently been prop…

    Submitted 27 May, 2024; v1 submitted 18 August, 2019; originally announced August 2019.

    Comments: 19 pages main + 6 pages appendix

    Journal ref: Transactions on Machine Learning Research, 2024

  17. arXiv:1904.05329  [pdf, other]

    cs.SI stat.ML stat.OT

    GraSPy: Graph Statistics in Python

    Authors: Jaewon Chung, Benjamin D. Pedigo, Eric W. Bridgeford, Bijan K. Varjavand, Hayden S. Helm, Joshua T. Vogelstein

    Abstract: We introduce GraSPy, a Python library devoted to statistical inference, machine learning, and visualization of random graphs and graph populations. This package provides flexible and easy-to-use algorithms for analyzing and understanding graphs with a scikit-learn compliant API. GraSPy can be downloaded from Python Package Index (PyPi), and is released under the Apache 2.0 open-source license. The…

    Submitted 14 August, 2019; v1 submitted 29 March, 2019; originally announced April 2019.

    Journal ref: Journal of Machine Learning Research 20.158 (2019): 1-7

  18. arXiv:1702.07367  [pdf, other]

    math.NA stat.ML

    Stochastic Newton and Quasi-Newton Methods for Large Linear Least-squares Problems

    Authors: Julianne Chung, Matthias Chung, J. Tanner Slagel, Luis Tenorio

    Abstract: We describe stochastic Newton and stochastic quasi-Newton approaches to efficiently solve large linear least-squares problems where the very large data sets present a significant computational burden (e.g., the size may exceed computer memory or data are collected in real-time). In our proposed framework, stochasticity is introduced in two different frameworks as a means to overcome these computat…

    Submitted 23 February, 2017; originally announced February 2017.

  19. arXiv:1511.06382  [pdf, other]

    cs.LG stat.ML

    Iterative Refinement of the Approximate Posterior for Directed Belief Networks

    Authors: R Devon Hjelm, Kyunghyun Cho, Junyoung Chung, Russ Salakhutdinov, Vince Calhoun, Nebojsa Jojic

    Abstract: Variational methods that rely on a recognition network to approximate the posterior of directed graphical models offer better inference and learning than previous methods. Recent advances that exploit the capacity and flexibility in this approach have expanded what kinds of models can be trained. However, as a proposal for the posterior, the capacity of the recognition network is limited, which ca…

    Submitted 20 February, 2018; v1 submitted 19 November, 2015; originally announced November 2015.

  20. arXiv:1506.03410  [pdf, other]

    stat.ML cs.LG

    Sparse Projection Oblique Randomer Forests

    Authors: Tyler M. Tomita, James Browne, Cencheng Shen, Jaewon Chung, Jesse L. Patsolic, Benjamin Falk, Jason Yim, Carey E. Priebe, Randal Burns, Mauro Maggioni, Joshua T. Vogelstein

    Abstract: Decision forests, including Random Forests and Gradient Boosting Trees, have recently demonstrated state-of-the-art performance in a variety of machine learning settings. Decision forests are typically ensembles of axis-aligned decision trees; that is, trees that split only along feature dimensions. In contrast, many recent extensions to decision forests are based on axis-oblique splits. Unfortuna…

    Submitted 3 October, 2019; v1 submitted 10 June, 2015; originally announced June 2015.

    Comments: 31 pages; submitted to Journal of Machine Learning Research for review

    MSC Class: 68T10 ACM Class: I.5.2

    Journal ref: Journal of Machine Learning Research 21(104), 1-39, 2020

  21. arXiv:1502.02367  [pdf, other]

    cs.NE cs.LG stat.ML

    Gated Feedback Recurrent Neural Networks

    Authors: Junyoung Chung, Caglar Gulcehre, Kyunghyun Cho, Yoshua Bengio

    Abstract: In this work, we propose a novel recurrent neural network (RNN) architecture. The proposed RNN, gated-feedback RNN (GF-RNN), extends the existing approach of stacking multiple recurrent layers by allowing and controlling signals flowing from upper recurrent layers to lower layers using a global gating unit for each pair of layers. The recurrent signals exchanged between layers are gated adaptively…

    Submitted 17 June, 2015; v1 submitted 9 February, 2015; originally announced February 2015.

    Comments: 9 pages, removed appendix

  22. arXiv:1211.2881   

    cs.CV cs.LG stat.ML

    Deep Attribute Networks

    Authors: Junyoung Chung, Donghoon Lee, Youngjoo Seo, Chang D. Yoo

    Abstract: Obtaining compact and discriminative features is one of the major challenges in many of the real-world image classification tasks such as face verification and object recognition. One possible approach is to represent input image on the basis of high-level features that carry semantic meaning which humans can understand. In this paper, a model coined deep attribute network (DAN) is proposed to add…

    Submitted 28 November, 2012; v1 submitted 12 November, 2012; originally announced November 2012.

    Comments: This paper has been withdrawn by the author due to crucial grammatical errors