
Showing 1–21 of 21 results for author: Jang, J

Searching in archive stat.
  1. arXiv:2411.10801  [pdf, other]

    stat.ME

    Mixing Samples to Address Weak Overlap in Causal Inference

    Authors: Jaehyuk Jang, Suehyun Kim, Kwonsang Lee

    Abstract: In observational studies, the assumption of sufficient overlap (positivity) is fundamental for the identification and estimation of causal effects. Failing to account for this assumption yields inaccurate and potentially infeasible estimators. To address this issue, we introduce a simple yet novel approach, \textit{mixing}, which mitigates overlap violations by constructing a synthetic treated gro…

    Submitted 4 April, 2025; v1 submitted 16 November, 2024; originally announced November 2024.

    Comments: 37 pages, 5 figures

  2. arXiv:2410.14866  [pdf, other]

    stat.ME

    Fast and Optimal Changepoint Detection and Localization using Bonferroni Triplets

    Authors: Jayoon Jang, Guenther Walther

    Abstract: The paper considers the problem of detecting and localizing changepoints in a sequence of independent observations. We propose to evaluate a local test statistic on a triplet of time points, for each such triplet in a particular collection. This collection is sparse enough so that the results of the local tests can simply be combined with a weighted Bonferroni correction. This results in a simple…

    Submitted 18 October, 2024; originally announced October 2024.

  3. arXiv:2307.08594  [pdf, other]

    stat.ME

    Tight Distribution-Free Confidence Intervals for Local Quantile Regression

    Authors: Jayoon Jang, Emmanuel Candès

    Abstract: It is well known that it is impossible to construct useful confidence intervals (CIs) about the mean or median of a response $Y$ conditional on features $X = x$ without making strong assumptions about the joint distribution of $X$ and $Y$. This paper introduces a new framework for reasoning about problems of this kind by casting the conditional problem at different levels of resolution, ranging fr…

    Submitted 26 January, 2024; v1 submitted 17 July, 2023; originally announced July 2023.

    Comments: 50 pages, 46 figures

  4. arXiv:2302.09656  [pdf, other]

    cs.LG stat.ML

    Credal Bayesian Deep Learning

    Authors: Michele Caprio, Souradeep Dutta, Kuk Jin Jang, Vivian Lin, Radoslav Ivanov, Oleg Sokolsky, Insup Lee

    Abstract: Uncertainty quantification and robustness to distribution shifts are important goals in machine learning and artificial intelligence. Although Bayesian Neural Networks (BNNs) allow for uncertainty in the predictions to be assessed, different sources of predictive uncertainty cannot be distinguished properly. We present Credal Bayesian Deep Learning (CBDL). Heuristically, CBDL allows to train an (u…

    Submitted 22 October, 2024; v1 submitted 19 February, 2023; originally announced February 2023.

    MSC Class: Primary: 68T37; Secondary: 68T05; 68W25

    Journal ref: Transactions on Machine Learning Research, 2024, ISSN: 2835-8856

  5. arXiv:2012.08855  [pdf, other]

    cs.LG stat.ML

    Time-Aware Tensor Decomposition for Missing Entry Prediction

    Authors: Dawon Ahn, Jun-Gi Jang, U Kang

    Abstract: Given a time-evolving tensor with missing entries, how can we effectively factorize it for precisely predicting the missing entries? Tensor factorization has been extensively utilized for analyzing various multi-dimensional real-world data. However, existing models for tensor factorization have disregarded the temporal property for tensor factorization while most real-world data are closely relate…

    Submitted 16 December, 2020; originally announced December 2020.

    Comments: 20 pages

  6. arXiv:2012.04181  [pdf, other]

    q-fin.TR q-fin.RM q-fin.ST stat.AP

    Systemic Risk in Market Microstructure of Crude Oil and Gasoline Futures Prices: A Hawkes Flocking Model Approach

    Authors: Hyun Jin Jang, Kiseop Lee, Kyungsub Lee

    Abstract: We propose the Hawkes flocking model that assesses systemic risk in high-frequency processes at the two perspectives -- endogeneity and interactivity. We examine the futures markets of WTI crude oil and gasoline for the past decade, and perform a comparative analysis with conditional value-at-risk as a benchmark measure. In terms of high-frequency structure, we derive the empirical findings. The e…

    Submitted 7 December, 2020; originally announced December 2020.

    Journal ref: Journal of Futures Markets, 40, 2020, 247-275

  7. arXiv:2009.00093  [pdf, other]

    cs.LG cs.CV stat.ML

    Online Class-Incremental Continual Learning with Adversarial Shapley Value

    Authors: Dongsub Shim, Zheda Mai, Jihwan Jeong, Scott Sanner, Hyunwoo Kim, Jongseong Jang

    Abstract: As image-based deep learning becomes pervasive on every device, from cell phones to smart watches, there is a growing need to develop methods that continually learn from data while minimizing memory footprint and power consumption. While memory replay techniques have shown exceptional promise for this task of continual learning, the best method for selecting which buffered images to replay is stil…

    Submitted 22 March, 2021; v1 submitted 31 August, 2020; originally announced September 2020.

    Comments: Proceedings of the 35th AAAI Conference on Artificial Intelligence (AAAI-21)

  8. arXiv:2008.12559  [pdf, other]

    cs.LG stat.ML

    Fast Partial Fourier Transform

    Authors: Yong-chan Park, Jun-Gi Jang, U Kang

    Abstract: Given a time series vector, how can we efficiently compute a specified part of Fourier coefficients? Fast Fourier transform (FFT) is a widely used algorithm that computes the discrete Fourier transform in many machine learning applications. Despite its pervasive use, all known FFT algorithms do not provide a fine-tuning option for the user to specify one's demand, that is, the output size (the num…

    Submitted 28 August, 2020; originally announced August 2020.

    Comments: 15 pages, 3 figures

  9. arXiv:2007.04758  [pdf, ps, other]

    q-fin.RM stat.OT

    A Bivariate Compound Dynamic Contagion Process for Cyber Insurance

    Authors: Jiwook Jang, Rosy Oh

    Abstract: As corporates and governments become more digital, they become vulnerable to various forms of cyber attack. Cyber insurance products have been used as risk management tools, yet their pricing does not reflect actual risk, including that of multiple, catastrophic and contagious losses. For the modelling of aggregate losses from cyber events, in this paper we introduce a bivariate compound dynamic c…

    Submitted 12 June, 2020; originally announced July 2020.

  10. arXiv:2006.06743  [pdf, other]

    cs.LG stat.ML

    Faster DBSCAN via subsampled similarity queries

    Authors: Heinrich Jiang, Jennifer Jang, Jakub Łącki

    Abstract: DBSCAN is a popular density-based clustering algorithm. It computes the $ε$-neighborhood graph of a dataset and uses the connected components of the high-degree nodes to decide the clusters. However, the full neighborhood graph may be too costly to compute with a worst-case complexity of $O(n^2)$. In this paper, we propose a simple variant called SNG-DBSCAN, which clusters based on a subsampled…

    Submitted 21 October, 2020; v1 submitted 11 June, 2020; originally announced June 2020.

    Comments: 34th Conference on Neural Information Processing Systems (NeurIPS 2020)

  11. arXiv:2004.03133  [pdf, other]

    cs.CL cs.LG stat.ML

    Neutralizing Gender Bias in Word Embedding with Latent Disentanglement and Counterfactual Generation

    Authors: Seungjae Shin, Kyungwoo Song, JoonHo Jang, Hyemi Kim, Weonyoung Joo, Il-Chul Moon

    Abstract: Recent research demonstrates that word embeddings, trained on the human-generated corpus, have strong gender biases in embedding spaces, and these biases can result in the discriminative results from the various downstream tasks. Whereas the previous methods project word embeddings into a linear subspace for debiasing, we introduce a \textit{Latent Disentanglement} method with a siamese auto-encod…

    Submitted 3 November, 2020; v1 submitted 7 April, 2020; originally announced April 2020.

    Comments: Findings of EMNLP2020

  12. arXiv:1910.04500  [pdf, other]

    cs.LG eess.AS stat.ML

    Orthogonality Constrained Multi-Head Attention For Keyword Spotting

    Authors: Mingu Lee, Jinkyu Lee, Hye Jin Jang, Byeonggeun Kim, Wonil Chang, Kyuwoong Hwang

    Abstract: Multi-head attention mechanism is capable of learning various representations from sequential data while paying attention to different subsequences, e.g., word-pieces or syllables in a spoken word. From the subsequences, it retrieves richer information than a single-head attention which only summarizes the whole sequence into one context vector. However, a naive use of the multi-head attention doe…

    Submitted 10 October, 2019; originally announced October 2019.

    Comments: Accepted to ASRU 2019

  13. arXiv:1905.10521  [pdf, other]

    cs.LG stat.ML

    Bivariate Beta-LSTM

    Authors: Kyungwoo Song, JoonHo Jang, Seung jae Shin, Il-Chul Moon

    Abstract: Long Short-Term Memory (LSTM) infers the long term dependency through a cell state maintained by the input and the forget gate structures, which models a gate output as a value in [0,1] through a sigmoid function. However, due to the graduality of the sigmoid function, the sigmoid gate is not flexible in representing multi-modality or skewness. Besides, the previous models lack modeling on the cor…

    Submitted 16 November, 2019; v1 submitted 25 May, 2019; originally announced May 2019.

    Comments: AAAI 2020

  14. arXiv:1904.00816  [pdf, other]

    cs.CV cs.LG stat.ML

    k-Same-Siamese-GAN: k-Same Algorithm with Generative Adversarial Network for Facial Image De-identification with Hyperparameter Tuning and Mixed Precision Training

    Authors: Yi-Lun Pan, Min-Jhih Huang, Kuo-Teng Ding, Ja-Ling Wu, Jyh-Shing Jang

    Abstract: For a data holder, such as a hospital or a government entity, who has a privately held collection of personal data, in which the revealing and/or processing of the personal identifiable data is restricted and prohibited by law. Then, "how can we ensure the data holder does conceal the identity of each individual in the imagery of personal data while still preserving certain useful aspects of the d…

    Submitted 17 September, 2019; v1 submitted 27 March, 2019; originally announced April 2019.

  15. arXiv:1810.13105  [pdf, other]

    cs.LG stat.ML

    DBSCAN++: Towards fast and scalable density clustering

    Authors: Jennifer Jang, Heinrich Jiang

    Abstract: DBSCAN is a classical density-based clustering procedure with tremendous practical relevance. However, DBSCAN implicitly needs to compute the empirical density for each sample point, leading to a quadratic worst-case time complexity, which is too slow on large datasets. We propose DBSCAN++, a simple modification of DBSCAN which only requires computing the densities for a chosen subset of points. W…

    Submitted 17 May, 2019; v1 submitted 31 October, 2018; originally announced October 2018.

  16. arXiv:1809.03721  [pdf, other]

    cs.LG cs.CV cs.NE eess.SP stat.ML

    Deep Asymmetric Networks with a Set of Node-wise Variant Activation Functions

    Authors: Jinhyeok Jang, Hyunjoong Cho, Jaehong Kim, Jaeyeon Lee, Seungjoon Yang

    Abstract: This work presents deep asymmetric networks with a set of node-wise variant activation functions. The nodes' sensitivities are affected by activation function selections such that the nodes with smaller indices become increasingly more sensitive. As a result, features learned by the nodes are sorted by the node indices in the order of their importance. Asymmetric networks not only learn input feat…

    Submitted 17 May, 2019; v1 submitted 11 September, 2018; originally announced September 2018.

  17. arXiv:1807.11655  [pdf, other]

    cs.CR cs.LG stat.ML

    Security and Privacy Issues in Deep Learning

    Authors: Ho Bae, Jaehee Jang, Dahuin Jung, Hyemi Jang, Heonseok Ha, Hyungyu Lee, Sungroh Yoon

    Abstract: To promote secure and private artificial intelligence (SPAI), we review studies on the model security and data privacy of DNNs. Model security allows system to behave as intended without being affected by malicious external influences that can compromise its integrity and efficiency. Security attacks can be divided based on when they occur: if an attack occurs during training, it is known as a poi…

    Submitted 9 March, 2021; v1 submitted 31 July, 2018; originally announced July 2018.

  18. arXiv:1805.09621  [pdf, other]

    cs.LG cs.CV stat.ML

    Backpropagation with N-D Vector-Valued Neurons Using Arbitrary Bilinear Products

    Authors: Zhe-Cheng Fan, Tak-Shing T. Chan, Yi-Hsuan Yang, Jyh-Shing R. Jang

    Abstract: Vector-valued neural learning has emerged as a promising direction in deep learning recently. Traditionally, training data for neural networks (NNs) are formulated as a vector of scalars; however, its performance may not be optimal since associations among adjacent scalars are not modeled. In this paper, we propose a new vector neural architecture called the Arbitrary BIlinear Product Neural Netwo…

    Submitted 24 May, 2018; originally announced May 2018.

    Comments: 14 pages, 8 figures, 3 tables

  19. arXiv:1805.07978  [pdf, other]

    cs.LG stat.ML

    Energy-Efficient Inference Accelerator for Memory-Augmented Neural Networks on an FPGA

    Authors: Seongsik Park, Jaehee Jang, Seijoon Kim, Sungroh Yoon

    Abstract: Memory-augmented neural networks (MANNs) are designed for question-answering tasks. It is difficult to run a MANN effectively on accelerators designed for other neural networks (NNs), in particular on mobile devices, because MANNs require recurrent data paths and various types of operations related to external memory access. We implement an accelerator for MANNs on a field-programmable gate array…

    Submitted 11 February, 2019; v1 submitted 21 May, 2018; originally announced May 2018.

    Comments: Accepted to DATE 2019

  20. arXiv:1805.07909  [pdf, other]

    cs.LG stat.ML

    Quickshift++: Provably Good Initializations for Sample-Based Mean Shift

    Authors: Heinrich Jiang, Jennifer Jang, Samory Kpotufe

    Abstract: We provide initial seedings to the Quick Shift clustering algorithm, which approximate the locally high-density regions of the data. Such seedings act as more stable and expressive cluster-cores than the singleton modes found by Quick Shift. We establish statistical consistency guarantees for this modification. We then show strong clustering performance on real datasets as well as promising applic…

    Submitted 21 May, 2018; originally announced May 2018.

    Comments: ICML 2018. Code release: https://github.com/google/quickshift

  21. arXiv:1702.02741  [pdf, other]

    cs.CV stat.ML

    Automatic Estimation of Fetal Abdominal Circumference from Ultrasound Images

    Authors: Jaeseong Jang, Yejin Park, Bukweon Kim, Sung Min Lee, Ja-Young Kwon, Jin Keun Seo

    Abstract: Ultrasound diagnosis is routinely used in obstetrics and gynecology for fetal biometry, and owing to its time-consuming process, there has been a great demand for automatic estimation. However, the automated analysis of ultrasound images is complicated because they are patient-specific, operator-dependent, and machine-specific. Among various types of fetal biometry, the accurate estimation of abdo…

    Submitted 2 November, 2017; v1 submitted 9 February, 2017; originally announced February 2017.