Showing 1–46 of 46 results for author: Jin, H

Searching in archive stat.
  1. arXiv:2501.18912  [pdf, other]

    stat.AP cs.SI

    Analyzing Classroom Interaction Data Using Prompt Engineering and Network Analysis

    Authors: Gwanghee Kim, Ick Hoon Jin, Minjeong Jeon

    Abstract: Classroom interactions play a vital role in developing critical thinking, collaborative problem-solving abilities, and enhanced learning outcomes. While analyzing these interactions is crucial for improving educational practices, the examination of classroom dialogues presents significant challenges due to the complexity and high-dimensionality of conversational data. This study presents an integr…

    Submitted 31 January, 2025; originally announced January 2025.

  2. arXiv:2408.09632  [pdf, other]

    cs.LG cs.CL stat.ML

    MoDeGPT: Modular Decomposition for Large Language Model Compression

    Authors: Chi-Heng Lin, Shangqian Gao, James Seale Smith, Abhishek Patel, Shikhar Tuli, Yilin Shen, Hongxia Jin, Yen-Chang Hsu

    Abstract: Large Language Models (LLMs) have reshaped the landscape of artificial intelligence by demonstrating exceptional performance across various tasks. However, substantial computational requirements make their deployment challenging on devices with limited resources. Recently, compression methods using low-rank matrix techniques have shown promise, yet these often lead to degraded accuracy or introduc…

    Submitted 2 May, 2025; v1 submitted 18 August, 2024; originally announced August 2024.

    Comments: ICLR 2025 Oral

    MSC Class: 15A23 (Primary) ACM Class: I.2.7

  3. arXiv:2408.01582  [pdf, other]

    stat.ML cs.AI cs.LG stat.ME

    Conformal Diffusion Models for Individual Treatment Effect Estimation and Inference

    Authors: Hengrui Cai, Huaqing Jin, Lexin Li

    Abstract: Estimating treatment effects from observational data is of central interest across numerous application domains. Individual treatment effect offers the most granular measure of treatment effect on an individual level, and is the most useful to facilitate personalized care. However, its estimation and inference remain underdeveloped due to several challenges. In this article, we propose a novel con…

    Submitted 2 August, 2024; originally announced August 2024.

  4. arXiv:2406.02847  [pdf, other]

    cs.LG stat.ML

    Exact Conversion of In-Context Learning to Model Weights in Linearized-Attention Transformers

    Authors: Brian K Chen, Tianyang Hu, Hui Jin, Hwee Kuan Lee, Kenji Kawaguchi

    Abstract: In-Context Learning (ICL) has been a powerful emergent property of large language models that has attracted increasing attention in recent years. In contrast to regular gradient-based learning, ICL is highly interpretable and does not require parameter updates. In this paper, we show that, for linearized transformer networks, ICL can be made explicit and permanent through the inclusion of bias ter…

    Submitted 6 June, 2024; v1 submitted 4 June, 2024; originally announced June 2024.

    Comments: Accepted to ICML 2024

  5. arXiv:2405.08912  [pdf, other]

    stat.ME

    High dimensional test for functional covariates

    Authors: Huaqing Jin, Fei Jiang

    Abstract: As medical devices become more complex, they routinely collect extensive and complicated data. While classical regressions typically examine the relationship between an outcome and a vector of predictors, it becomes imperative to identify the relationship with predictors possessing functional structures. In this article, we introduce a novel inference procedure for examining the relationship betwe…

    Submitted 14 May, 2024; originally announced May 2024.

    Comments: 35 pages, 4 figures, 4 tables

  6. arXiv:2405.04026  [pdf, other]

    stat.ML cs.LG

    Federated Control in Markov Decision Processes

    Authors: Hao Jin, Yang Peng, Liangyu Zhang, Zhihua Zhang

    Abstract: We study problems of federated control in Markov Decision Processes. To solve an MDP with large state space, multiple learning agents are introduced to collaboratively learn its optimal policy without communication of locally collected experience. In our settings, these agents have limited capabilities, which means they are restricted within different regions of the overall state space during the…

    Submitted 7 May, 2024; originally announced May 2024.

  7. arXiv:2405.03236  [pdf, other]

    cs.LG stat.ML

    Federated Reinforcement Learning with Constraint Heterogeneity

    Authors: Hao Jin, Liangyu Zhang, Zhihua Zhang

    Abstract: We study a Federated Reinforcement Learning (FedRL) problem with constraint heterogeneity. In our setting, we aim to solve a reinforcement learning problem with multiple constraints while $N$ training agents are located in $N$ different environments with limited access to the constraint signals and they are expected to collaboratively learn a policy satisfying all constraint signals. Such learning…

    Submitted 6 May, 2024; originally announced May 2024.

  8. arXiv:2403.14908  [pdf, other]

    stat.AP

    Analysis of Log Data from an International Online Educational Assessment System: A Multi-state Survival Modeling Approach to Reaction Time between and across Action Sequence

    Authors: Jina Park, Ick Hoon Jin, Minjeong Jeon

    Abstract: With increasingly available computer-based or online assessments, researchers have shown keen interest in analyzing log data to improve our understanding of test takers' problem-solving processes. In this paper, we propose a multi-state survival model (MSM) to action sequence data from log files, focusing on modeling test takers' reaction times between actions, in order to investigate which factor…

    Submitted 25 May, 2025; v1 submitted 21 March, 2024; originally announced March 2024.

  9. arXiv:2401.17855  [pdf, other]

    stat.AP cs.HC cs.IR

    Network-based Topic Structure Visualization

    Authors: Yeseul Jeon, Jina Park, Ick Hoon Jin, Dongjun Chung

    Abstract: In the real world, many topics are inter-correlated, making it challenging to investigate their structure and relationships. Understanding the interplay between topics and their relevance can provide valuable insights for researchers, guiding their studies and informing the direction of research. In this paper, we utilize the topic-words distribution, obtained from topic models, as item-response d…

    Submitted 31 January, 2024; originally announced January 2024.

  10. arXiv:2306.02106  [pdf, other]

    stat.AP

    Impacts of Innovation School System in Korea: A Latent Space Item Response Model with Neyman-Scott Point Process

    Authors: Seorim Yi, Minkyu Kim, Jaewoo Park, Minjeong Jeon, Ick Hoon Jin

    Abstract: South Korea's educational system has faced criticism for its lack of focus on critical thinking and creativity, resulting in high levels of stress and anxiety among students. As part of the government's effort to improve the educational system, the innovation school system was introduced in 2009, which aims to develop students' creativity as well as their non-cognitive skills. To better understand…

    Submitted 27 May, 2024; v1 submitted 3 June, 2023; originally announced June 2023.

  11. arXiv:2208.12435  [pdf, other]

    stat.ME stat.AP

    Comparing multiple latent space embeddings using topological analysis

    Authors: Kisung You, Ilmun Kim, Ick Hoon Jin, Minjeong Jeon, Dennis Shung

    Abstract: The latent space model is one of the well-known methods for statistical inference of network data. While the model has been much studied for a single network, it has not attracted much attention to analyze collectively when multiple networks and their latent embeddings are present. We adopt a topology-based representation of latent space embeddings to learn over a population of network model fits,…

    Submitted 26 August, 2022; originally announced August 2022.

    Comments: 46 pages, 11 figures

  12. arXiv:2205.06989  [pdf, other]

    stat.ME stat.CO

    lsirm12pl: An R package for latent space item response modeling

    Authors: Dongyoung Go, Gwanghee Kim, Jina Park, Junyong Park, Minjeong Jeon, Ick Hoon Jin

    Abstract: The item response model in latent space (LSIRM; Jeon et al., 2021) uncovers unobserved interactions between respondents and items in the item response data by embedding both in a shared latent metric space. The R package lsirm12pl implements Bayesian estimation of the LSIRM and its extensions for various response types, base model specifications, and missing data handling. Furthermore, lsirm12pl p…

    Submitted 7 March, 2025; v1 submitted 14 May, 2022; originally announced May 2022.

  13. arXiv:2205.05838  [pdf, other]

    cs.LG stat.ML

    Orthogonal Gromov-Wasserstein Discrepancy with Efficient Lower Bound

    Authors: Hongwei Jin, Zishun Yu, Xinhua Zhang

    Abstract: Comparing structured data from possibly different metric-measure spaces is a fundamental task in machine learning, with applications in, e.g., graph classification. The Gromov-Wasserstein (GW) discrepancy formulates a coupling between the structured data based on optimal transportation, tackling the incomparability between different structures by aligning the intra-relational geometries. Although…

    Submitted 10 July, 2022; v1 submitted 11 May, 2022; originally announced May 2022.

    Comments: Published as a conference paper in UAI 2022

  14. arXiv:2204.02634  [pdf, other]

    cs.LG stat.ML

    Federated Reinforcement Learning with Environment Heterogeneity

    Authors: Hao Jin, Yang Peng, Wenhao Yang, Shusen Wang, Zhihua Zhang

    Abstract: We study a Federated Reinforcement Learning (FedRL) problem in which $n$ agents collaboratively learn a single policy without sharing the trajectories they collected during agent-environment interaction. We stress the constraint of environment heterogeneity, which means $n$ environments corresponding to these $n$ agents have different state transitions. To obtain a value function or a policy funct…

    Submitted 6 April, 2022; originally announced April 2022.

    Comments: Artificial Intelligence and Statistics 2022

  15. arXiv:2203.14306  [pdf, other]

    stat.ME stat.AP

    A Latent Space Accumulator Model for Response Time: Applications to Cognitive Assessment Data

    Authors: Ick Hoon Jin, Jonghyun Yun, Hyunjoo Kim, Minjeong Jeon

    Abstract: Response time has attracted increased interest in educational and psychological assessment for, e.g., measuring test takers' processing speed, improving the measurement accuracy of ability, and understanding aberrant response behavior. Most models for response time analysis are based on a parametric assumption about the response time distribution. The Cox proportional hazard model has been utilize…

    Submitted 20 June, 2023; v1 submitted 27 March, 2022; originally announced March 2022.

  16. arXiv:2203.06830  [pdf, other]

    stat.AP

    A Bayesian Precision Response-adaptive Phase II Clinical Trial Design for Radiotherapies with Competing Risk Survival Outcomes

    Authors: Jina Park, Wenjing Hu, Ick Hoon Jin, Hao Liu, Yong Zang

    Abstract: Many phase II clinical trials have used survival outcomes as the primary endpoints in recent decades. Suppose the radiotherapy is evaluated in a phase II trial using survival outcomes. In that case, the competing risk issue often arises because the time to disease progression can be censored by the time to normal tissue complications, and vice versa. Besides, much literature has examined that pati…

    Submitted 13 March, 2022; originally announced March 2022.

  17. arXiv:2203.00173  [pdf, other]

    stat.AP stat.CO

    Oncology Dose Finding Using Approximate Bayesian Computation Design

    Authors: Huaqing Jin, Wenbin Du, Guosheng Yin

    Abstract: In the development of new cancer treatment, an essential step is to determine the maximum tolerated dose (MTD) via phase I clinical trials. Generally speaking, phase I trial designs can be classified as either model-based or algorithm-based approaches. Model-based phase I designs are typically more efficient by using all observed data, while there is a potential risk of model misspecification that…

    Submitted 28 February, 2022; originally announced March 2022.

    Comments: 4 figures and 3 tables

  18. arXiv:2201.05132  [pdf, other]

    stat.ML cs.LG

    Hyperparameter Importance for Machine Learning Algorithms

    Authors: Honghe Jin

    Abstract: Hyperparameter plays an essential role in the fitting of supervised machine learning algorithms. However, it is computationally expensive to tune all the tunable hyperparameters simultaneously especially for large data sets. In this paper, we give a definition of hyperparameter importance that can be estimated by subsampling procedures. According to the importance, hyperparameters can then be tune…

    Submitted 13 January, 2022; originally announced January 2022.

  19. arXiv:2112.12904  [pdf, other]

    stat.ME

    Quantile Regression with Multiple Proxy Variables

    Authors: Dongyoung Go, Jongho Im, Ick Hoon Jin

    Abstract: Data integration has become increasingly popular owing to the availability of multiple data sources. This study considered quantile regression estimation when a key covariate had multiple proxies across several datasets. In a unified estimation procedure, the proposed method incorporates multiple proxies that have various relationships with the unobserved covariates. The proposed approach allows t…

    Submitted 21 October, 2022; v1 submitted 23 December, 2021; originally announced December 2021.

  20. How social networks influence human behavior: An integrated latent space approach for differential social influence

    Authors: Jina Park, Ick Hoon Jin, Minjeong Jeon

    Abstract: How social networks influence human behavior has been an interesting topic in applied research. Existing methods often utilized scale-level behavioral data to estimate the influence of a social network on human behavior. This study proposes a novel approach to studying social influence that utilizes item-level behavioral measures. Under the latent space modeling framework, we integrate the two int…

    Submitted 27 February, 2023; v1 submitted 11 September, 2021; originally announced September 2021.

    Journal ref: Psychometrika 88 (2023) 1529-1555

  21. arXiv:2106.07374  [pdf, other]

    cs.IR stat.AP

    Network-based Topic Interaction Map for Big Data Mining of COVID-19 Biomedical Literature

    Authors: Yeseul Jeon, Dongjun Chung, Jina Park, Ick Hoon Jin

    Abstract: Since the emergence of the worldwide pandemic of COVID-19, relevant research has been published at a dazzling pace, which yields an abundant amount of big data in biomedical literature. Due to the high volume of relevant literature, it is practically impossible to follow up the research manually. Topic modeling is a well-known unsupervised learning that aims to reveal latent topics from text data.…

    Submitted 8 December, 2022; v1 submitted 7 June, 2021; originally announced June 2021.

  22. arXiv:2102.00796  [pdf, other]

    stat.ME

    Unit Information Prior for Adaptive Information Borrowing from Multiple Historical Datasets

    Authors: Huaqing Jin, Guosheng Yin

    Abstract: In clinical trials, there often exist multiple historical studies for the same or related treatment investigated in the current trial. Incorporating historical data in the analysis of the current study is of great importance, as it can help to gain more information, improve efficiency, and provide a more comprehensive evaluation of treatment. Enlightened by the unit information prior (UIP) concept…

    Submitted 1 February, 2021; originally announced February 2021.

    Comments: 4 figures; 2 tables in manuscript. 2 figures and one table in supplementary

  23. Mapping unobserved item-respondent interactions: A latent space item response model with interaction map

    Authors: Minjeong Jeon, Ick Hoon Jin, Michael Schweinberger, Samuel Baugh

    Abstract: Classic item response models assume that all items with the same difficulty have the same response probability among all respondents with the same ability. These assumptions, however, may very well be violated in practice, and it is not straightforward to assess whether these assumptions are violated, because neither the abilities of respondents nor the difficulties of items are observed. An examp…

    Submitted 15 November, 2020; v1 submitted 16 July, 2020; originally announced July 2020.

    Journal ref: Psychometrika (2021)

  24. arXiv:2007.07224  [pdf, other]

    cs.IR cs.LG stat.ML

    AutoRec: An Automated Recommender System

    Authors: Ting-Hsiang Wang, Qingquan Song, Xiaotian Han, Zirui Liu, Haifeng Jin, Xia Hu

    Abstract: Realistic recommender systems are often required to adapt to ever-changing data and tasks or to explore different models systematically. To address the need, we present AutoRec, an open-source automated machine learning (AutoML) platform extended from the TensorFlow ecosystem and, to our knowledge, the first framework to leverage AutoML for model search and hyperparameter tuning in deep recommenda…

    Submitted 26 June, 2020; originally announced July 2020.

  25. arXiv:2006.13698  [pdf, other]

    stat.ME stat.CO

    Bayesian Shrinkage for Functional Network Models, with Applications to Longitudinal Item Response Data

    Authors: Jaewoo Park, Yeseul Jeon, Minsuk Shin, Minjeong Jeon, Ick Hoon Jin

    Abstract: Longitudinal item response data are common in social science, educational science, and psychology, among other disciplines. Studying the time-varying relationships between items is crucial for educational assessment or designing marketing strategies from survey questions. Although dynamic network models have been widely developed, we cannot apply them directly to item response data because there a…

    Submitted 22 October, 2021; v1 submitted 24 June, 2020; originally announced June 2020.

  26. arXiv:2006.11321  [pdf, other]

    cs.LG stat.ML

    AutoOD: Automated Outlier Detection via Curiosity-guided Search and Self-imitation Learning

    Authors: Yuening Li, Zhengzhang Chen, Daochen Zha, Kaixiong Zhou, Haifeng Jin, Haifeng Chen, Xia Hu

    Abstract: Outlier detection is an important data mining task with numerous practical applications such as intrusion detection, credit card fraud detection, and video surveillance. However, given a specific complicated task with big data, the process of building a powerful deep learning based system for outlier detection still highly relies on human expertise and laboring trials. Although Neural Architecture…

    Submitted 19 June, 2020; originally announced June 2020.

  27. arXiv:2006.07356  [pdf, other]

    stat.ML cs.LG

    Implicit Bias of Gradient Descent for Mean Squared Error Regression with Two-Layer Wide Neural Networks

    Authors: Hui Jin, Guido Montúfar

    Abstract: We investigate gradient descent training of wide neural networks and the corresponding implicit bias in function space. For univariate regression, we show that the solution of training a width-$n$ shallow ReLU network is within $n^{- 1/2}$ of the function which fits the training data and whose difference from the initial function has the smallest 2-norm of the second derivative weighted by a curva…

    Submitted 28 May, 2023; v1 submitted 12 June, 2020; originally announced June 2020.

    Comments: 97 pages, 14 figures. Added the discussion of SGD and implications to generalization

    MSC Class: 68Q32; 68T05 ACM Class: I.2.6; G.3

  28. arXiv:2005.10970  [pdf, other]

    cs.LG stat.ML

    A Complex KBQA System using Multiple Reasoning Paths

    Authors: Kechen Qin, Yu Wang, Cheng Li, Kalpa Gunaratna, Hongxia Jin, Virgil Pavlu, Javed A. Aslam

    Abstract: Multi-hop knowledge based question answering (KBQA) is a complex task for natural language understanding. Many KBQA approaches have been proposed in recent years, and most of them are trained based on labeled reasoning path. This hinders the system's performance as many correct reasoning paths are not labeled as ground truth, and thus they cannot be learned. In this paper, we introduce an end-to-e…

    Submitted 21 May, 2020; originally announced May 2020.

  29. arXiv:2003.07657  [pdf, other]

    stat.AP

    Applying the Network Item Response Model to Student Assessment Data

    Authors: Alex Brodersen, Ick Hoon Jin, Ying Cheng, Minjeong Jeon

    Abstract: This study discusses an alternative tool for modeling student assessment data. The model constructs networks from a matrix of item responses and attempts to represent these data in low dimensional Euclidean space. This procedure has advantages over common methods used for modeling student assessment data such as Item Response Theory because it relaxes the highly restrictive local-independence assumpt…

    Submitted 17 March, 2020; originally announced March 2020.

  30. arXiv:1911.07142  [pdf, other]

    stat.CO stat.AP

    Bayesian Model Selection for High-Dimensional Ising Models, With Applications to Educational Data

    Authors: Jaewoo Park, Ick Hoon Jin, Michael Schweinberger

    Abstract: Doubly-intractable posterior distributions arise in many applications of statistics concerned with discrete and dependent data, including physics, spatial statistics, machine learning, the social sciences, and other fields. A specific example is psychometrics, which has adapted high-dimensional Ising models from machine learning, with a view to studying the interactions among binary item responses…

    Submitted 19 May, 2021; v1 submitted 16 November, 2019; originally announced November 2019.

  31. arXiv:1910.13601  [pdf, other]

    cs.LG stat.ML

    Deep Weakly-supervised Anomaly Detection

    Authors: Guansong Pang, Chunhua Shen, Huidong Jin, Anton van den Hengel

    Abstract: Recent semi-supervised anomaly detection methods that are trained using small labeled anomaly examples and large unlabeled data (mostly normal data) have shown largely improved performance over unsupervised methods. However, these methods often focus on fitting abnormalities illustrated by the given anomaly examples only (i.e., seen anomalies), and consequently they fail to generalize to those th…

    Submitted 5 June, 2023; v1 submitted 29 October, 2019; originally announced October 2019.

    Comments: Accepted to KDD 2023

  32. arXiv:1910.06538  [pdf, other]

    stat.ME

    Network Mediation Analysis Using Model-based Eigenvalue Decomposition

    Authors: Chang Che, Ick Hoon Jin, Zhiyong Zhang

    Abstract: This paper proposes a new two-stage network mediation method based on the use of a latent network approach -- model-based eigenvalue decomposition -- for analyzing social network data with nodal covariates. In the decomposition stage of the observed network, no assumption on the metric of the latent space structure is required. In the mediation stage, the most important eigenvectors of a network a…

    Submitted 15 October, 2019; originally announced October 2019.

  33. arXiv:1908.06395  [pdf, other]

    stat.ML cs.LG

    Towards Better Generalization: BP-SVRG in Training Deep Neural Networks

    Authors: Hao Jin, Dachao Lin, Zhihua Zhang

    Abstract: Stochastic variance-reduced gradient (SVRG) is a classical optimization method. Although it is theoretically proved to have better convergence performance than stochastic gradient descent (SGD), the generalization performance of SVRG remains open. In this paper we investigate the effects of some training techniques, mini-batching and learning rate decay, on the generalization performance of SVRG,…

    Submitted 18 August, 2019; originally announced August 2019.

  34. arXiv:1901.00546  [pdf, other]

    cs.LG cs.CR stat.ML

    Multi-Label Adversarial Perturbations

    Authors: Qingquan Song, Haifeng Jin, Xiao Huang, Xia Hu

    Abstract: Adversarial examples are delicately perturbed inputs, which aim to mislead machine learning models towards incorrect outputs. While most of the existing work focuses on generating adversarial perturbations in multi-class classification problems, many real-world applications fall into the multi-label setting in which one instance could be associated with more than one label. For example, a spammer…

    Submitted 2 January, 2019; originally announced January 2019.

  35. arXiv:1812.10234  [pdf, other]

    cs.LG cs.CL stat.ML

    A New Concept of Deep Reinforcement Learning based Augmented General Sequence Tagging System

    Authors: Yu Wang, Abhishek Patel, Hongxia Jin

    Abstract: In this paper, a new deep reinforcement learning based augmented general sequence tagging system is proposed. The new system contains two parts: a deep neural network (DNN) based sequence tagging model and a deep reinforcement learning (DRL) based augmented tagger. The augmented tagger helps improve system performance by modeling the data with minority tags. The new system is evaluated on SLU and…

    Submitted 26 December, 2018; originally announced December 2018.

    Comments: Published at 2018 COLING

  36. arXiv:1811.07308  [pdf, other]

    cs.LG cs.AI cs.CV stat.ML

    A Variational Dirichlet Framework for Out-of-Distribution Detection

    Authors: Wenhu Chen, Yilin Shen, Hongxia Jin, William Wang

    Abstract: With the recently rapid development in deep learning, deep neural networks have been widely adopted in many real-life applications. However, deep neural networks are also known to have very little control over its uncertainty for unseen examples, which potentially causes very harmful and annoying consequences in practical scenarios. In this paper, we are particularly interested in designing a high…

    Submitted 20 April, 2019; v1 submitted 18 November, 2018; originally announced November 2018.

    Comments: Tech Report

  37. arXiv:1810.07876  [pdf, other]

    stat.AP

    Multilevel Network Item Response Modeling for Discovering Differences Between Innovation and Regular School Systems in Korea

    Authors: Ick Hoon Jin, Minjeong Jeon, Michael Schweinberger, Jonghyun Yun, Lizhen Lin

    Abstract: The innovation school system in South Korea has been developed in response to the traditional high-pressure school system in South Korea, with a view to cultivating a bottom-up and student-centered educational culture. Despite its ambitious goals, questions have been raised about the success of the innovation school system. Leveraging data from the Gyeonggi Education Panel Study (GEPS) along with…

    Submitted 20 January, 2022; v1 submitted 17 October, 2018; originally announced October 2018.

  38. arXiv:1810.05297  [pdf, other]

    stat.AP

    Bayesian Hierarchical Spatial Model for Small Area Estimation with Non-ignorable Nonresponses and Its Applications to the NHANES Dental Caries Assessments

    Authors: Ick Hoon Jin, Fang Liu, Evercita C. Eugenio, Kisung You, Suyu Liu

    Abstract: The National Health and Nutrition Examination Survey (NHANES) is a major program of the National Center for Health Statistics, designed to assess the health and nutritional status of adults and children in the United States. The analysis of NHANES dental caries data faces several challenges, including (1) the data were collected using a complex, multistage, stratified, unequal-probability sampling…

    Submitted 14 October, 2019; v1 submitted 11 October, 2018; originally announced October 2018.

  39. arXiv:1810.04811  [pdf, other]

    stat.CO

    Stochastic Approximation Hamiltonian Monte Carlo

    Authors: Jonghyun Yun, Minsuk Shin, Ick Hoon Jin, Faming Liang

    Abstract: Recently, the Hamilton Monte Carlo (HMC) has become widespread as one of the more reliable approaches to efficient sample generation processes. However, HMC is difficult to sample in a multimodal posterior distribution because the HMC chain cannot cross energy barrier between modes due to the energy conservation property. In this paper, we propose a Stochastic Approximate Hamilton Monte Carlo (SAH…

    Submitted 19 June, 2020; v1 submitted 10 October, 2018; originally announced October 2018.

  40. Social Network Mediation Analysis: a Latent Space Approach

    Authors: Haiyan Liu, Ick Hoon Jin, Zhiyong Zhang, Ying Yuan

    Abstract: Social networks contain data on both actor attributes and social connections among them. Such connections reflect the dependence among social actors, which is important for individual's mental health and social development. To investigate the potential mediation role of a social network, we propose a mediation model with a social network as a mediator. In the model, dependence among actors is acco…

    Submitted 24 June, 2020; v1 submitted 8 October, 2018; originally announced October 2018.

    Journal ref: Psychometrika 86 (2021) 272-298

  41. arXiv:1807.06756  [pdf, other]

    cs.LG cs.AI cs.CR stat.ML

    SySeVR: A Framework for Using Deep Learning to Detect Software Vulnerabilities

    Authors: Zhen Li, Deqing Zou, Shouhuai Xu, Hai Jin, Yawei Zhu, Zhaoxuan Chen

    Abstract: The detection of software vulnerabilities (or vulnerabilities for short) is an important problem that has yet to be tackled, as manifested by the many vulnerabilities reported on a daily basis. This calls for machine learning methods for vulnerability detection. Deep learning is attractive for this purpose because it alleviates the requirement to manually define features. Despite the tremendous su…

    Submitted 11 January, 2021; v1 submitted 17 July, 2018; originally announced July 2018.

    Comments: To be published in IEEE TDSC

  42. arXiv:1806.10282  [pdf, other]

    cs.LG cs.AI stat.ML

    Auto-Keras: An Efficient Neural Architecture Search System

    Authors: Haifeng Jin, Qingquan Song, Xia Hu

    Abstract: Neural architecture search (NAS) has been proposed to automatically tune deep neural networks, but existing search algorithms, e.g., NASNet, PNAS, usually suffer from expensive computational cost. Network morphism, which keeps the functionality of a neural network while changing its neural architecture, could be helpful for NAS by enabling more efficient training during the search. In this paper,…

    Submitted 26 March, 2019; v1 submitted 26 June, 2018; originally announced June 2018.

    Comments: The code of Auto-Keras is available at https://autokeras.com

  43. arXiv:1710.07850  [pdf, other]

    stat.ML cs.AI cs.LG

    Deep Neural Network Approximation using Tensor Sketching

    Authors: Shiva Prasad Kasiviswanathan, Nina Narodytska, Hongxia Jin

    Abstract: Deep neural networks are powerful learning models that achieve state-of-the-art performance on many computer vision, speech, and language processing tasks. In this paper, we study a fundamental question that arises when designing deep network architectures: Given a target network architecture can we design a smaller network architecture that approximates the operation of the target network? The qu…

    Submitted 21 October, 2017; originally announced October 2017.

    Comments: 19 pages

  44. arXiv:1704.06033  [pdf]

    cs.CV cs.AI stat.ML

    Predicting Cognitive Decline with Deep Learning of Brain Metabolism and Amyloid Imaging

    Authors: Hongyoon Choi, Kyong Hwan Jin

    Abstract: For effective treatment of Alzheimer disease (AD), it is important to identify subjects who are most likely to exhibit rapid cognitive decline. Herein, we developed a novel framework based on a deep convolutional neural network which can predict future cognitive decline in mild cognitive impairment (MCI) patients using fluorodeoxyglucose and florbetapir positron emission tomography (PET). The archi…

    Submitted 20 April, 2017; originally announced April 2017.

    Comments: 24 pages

  45. arXiv:1701.01093  [pdf, ps, other]

    cs.DS cs.CR stat.ML

    Private Incremental Regression

    Authors: Shiva Prasad Kasiviswanathan, Kobbi Nissim, Hongxia Jin

    Abstract: Data is continuously generated by modern data sources, and a recent challenge in machine learning has been to develop techniques that perform well in an incremental (streaming) setting. In this paper, we investigate the problem of private machine learning, where as common in practice, the data is not given at once, but rather arrives incrementally over time. We introduce the problems of private…

    Submitted 4 January, 2017; originally announced January 2017.

    Comments: To appear in PODS 2017

  46. A Doubly Latent Space Joint Model for Local Item and Person Dependence in the Analysis of Item Response Data

    Authors: Ick Hoon Jin, Minjeong Jeon

    Abstract: Item response theory (IRT) models explain an observed item response as a function of a respondent's latent trait and the item's property. IRT is one of the most widely utilized tools for item response analysis; however, local item and person independence, which is a critical assumption for IRT, is often violated in real testing situations. In this article, we propose a new type of analytical appro…

    Submitted 1 June, 2018; v1 submitted 20 December, 2016; originally announced December 2016.

    Journal ref: Psychometrika 84 (2019) 236-260