Showing 1–12 of 12 results for author: Hu, D

Searching in archive stat.
  1. arXiv:2506.17248  [pdf, ps, other]

    cs.LG cs.AI stat.ML

    Efficient Quantification of Multimodal Interaction at Sample Level

    Authors: Zequn Yang, Hongfa Wang, Di Hu

    Abstract: Interactions between modalities -- redundancy, uniqueness, and synergy -- collectively determine the composition of multimodal information. Understanding these interactions is crucial for analyzing information dynamics in multimodal systems, yet their accurate sample-level quantification presents significant theoretical and computational challenges. To address this, we introduce the Lightweight Sa…

    Submitted 7 June, 2025; originally announced June 2025.

    Comments: Accepted to ICML 2025

  2. arXiv:2502.00818  [pdf, other]

    stat.ML cs.LG

    Error-quantified Conformal Inference for Time Series

    Authors: Junxi Wu, Dongjian Hu, Yajie Bao, Shu-Tao Xia, Changliang Zou

    Abstract: Uncertainty quantification in time series prediction is challenging due to the temporal dependence and distribution shift on sequential data. Conformal inference provides a pivotal and flexible instrument for assessing the uncertainty of machine learning models through prediction sets. Recently, a series of online conformal inference methods updated thresholds of prediction sets by performing onli…

    Submitted 2 February, 2025; originally announced February 2025.

    Comments: ICLR 2025 camera version

  3. arXiv:2411.17402  [pdf, other]

    stat.ME

    Receiver operating characteristic curve analysis with non-ignorable missing disease status

    Authors: Dingding Hu, Tao Yu, Pengfei Li

    Abstract: This article considers the receiver operating characteristic (ROC) curve analysis for medical data with non-ignorable missingness in the disease status. In the framework of the logistic regression models for both the disease status and the verification status, we first establish the identifiability of model parameters, and then propose a likelihood method to estimate the model parameters, the ROC…

    Submitted 26 November, 2024; originally announced November 2024.

    Comments: 20 pages, 1 figure

  4. arXiv:2409.06473  [pdf, other]

    stat.AP

    Some statistical aspects of the Covid-19 response

    Authors: Simon N. Wood, Ernst C. Wit, Paul M. McKeigue, Danshu Hu, Beth Flood, Lauren Corcoran, Thea Abou Jawad

    Abstract: This paper discusses some statistical aspects of the U.K. Covid-19 pandemic response, focussing particularly on cases where we believe that a statistically questionable approach or presentation has had a substantial impact on public perception, or government policy, or both. We discuss the presentation of statistics relating to Covid risk, and the risk of the response measures, arguing that biases…

    Submitted 5 February, 2025; v1 submitted 10 September, 2024; originally announced September 2024.

    Comments: Version finally accepted by Journal of the Royal Statistical Society (Series A) as a discussion paper

  5. arXiv:2205.00505  [pdf, ps, other]

    stat.ME

    Statistical inference for the two-sample problem under likelihood ratio ordering, with application to the ROC curve estimation

    Authors: Dingding Hu, Meng Yuan, Tao Yu, Pengfei Li

    Abstract: The receiver operating characteristic (ROC) curve is a powerful statistical tool and has been widely applied in medical research. In the ROC curve estimation, a commonly used assumption is that larger the biomarker value, greater severity the disease. In this paper, we mathematically interpret "greater severity of the disease" as "larger probability of being diseased". This in turn is equivalent…

    Submitted 22 February, 2023; v1 submitted 1 May, 2022; originally announced May 2022.

    Comments: 35 pages, 2 figures

  6. arXiv:2106.05001  [pdf, other]

    cs.LG cs.CV cs.DC stat.ML

    No Fear of Heterogeneity: Classifier Calibration for Federated Learning with Non-IID Data

    Authors: Mi Luo, Fei Chen, Dapeng Hu, Yifan Zhang, Jian Liang, Jiashi Feng

    Abstract: A central challenge in training classification models in the real-world federated system is learning with non-IID data. To cope with this, most of the existing works involve enforcing regularization in local optimization or improving the model aggregation scheme at the server. Other works also share public datasets or synthesized samples to supplement the training of under-represented classes or i…

    Submitted 28 October, 2021; v1 submitted 9 June, 2021; originally announced June 2021.

    Comments: 22 pages, NeurIPS 2021

  7. arXiv:2006.08690  [pdf, other]

    cs.LG stat.ML

    Generalized and Scalable Optimal Sparse Decision Trees

    Authors: Jimmy Lin, Chudi Zhong, Diane Hu, Cynthia Rudin, Margo Seltzer

    Abstract: Decision tree optimization is notoriously difficult from a computational perspective but essential for the field of interpretable machine learning. Despite efforts over the past 40 years, only recently have optimization breakthroughs been made that have allowed practical algorithms to find optimal decision trees. These new techniques have the potential to trigger a paradigm shift where it is possi…

    Submitted 22 November, 2022; v1 submitted 15 June, 2020; originally announced June 2020.

    Comments: This paper was published in ICML 2020

    ACM Class: I.2.6

  8. arXiv:2005.09624  [pdf, other]

    cs.LG cs.AI stat.ML

    Batch-Augmented Multi-Agent Reinforcement Learning for Efficient Traffic Signal Optimization

    Authors: Yueh-Hua Wu, I-Hau Yeh, David Hu, Hong-Yuan Mark Liao

    Abstract: The goal of this work is to provide a viable solution based on reinforcement learning for traffic signal control problems. Although the state-of-the-art reinforcement learning approaches have yielded great success in a variety of domains, directly applying it to alleviate traffic congestion can be challenging, considering the requirement of high sample efficiency and how training data is gathered.…

    Submitted 19 May, 2020; originally announced May 2020.

  9. arXiv:2004.09031  [pdf, other]

    cs.LG stat.ML

    Learning Low-rank Deep Neural Networks via Singular Vector Orthogonality Regularization and Singular Value Sparsification

    Authors: Huanrui Yang, Minxue Tang, Wei Wen, Feng Yan, Daniel Hu, Ang Li, Hai Li, Yiran Chen

    Abstract: Modern deep neural networks (DNNs) often require high memory consumption and large computational loads. In order to deploy DNN algorithms efficiently on edge or mobile devices, a series of DNN compression algorithms have been explored, including factorization methods. Factorization methods approximate the weight matrix of a DNN layer with the multiplication of two or multiple low-rank matrices. Ho…

    Submitted 19 April, 2020; originally announced April 2020.

    Comments: In proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW). To be presented at the EDLCV 2020 workshop co-located with CVPR 2020

  10. Joint Embedding Learning and Low-Rank Approximation: A Framework for Incomplete Multi-view Learning

    Authors: Hong Tao, Chenping Hou, Dongyun Yi, Jubo Zhu, Dewen Hu

    Abstract: In real-world applications, not all instances in multi-view data are fully represented. To deal with incomplete data, Incomplete Multi-view Learning (IML) rises. In this paper, we propose the Joint Embedding Learning and Low-Rank Approximation (JELLA) framework for IML. The JELLA framework approximates the incomplete data by a set of low-rank matrices and learns a full and common embedding by line…

    Submitted 16 December, 2019; v1 submitted 24 December, 2018; originally announced December 2018.

  11. arXiv:1812.04407  [pdf, other]

    cs.IR cs.LG stat.ML

    Learning Item-Interaction Embeddings for User Recommendations

    Authors: Xiaoting Zhao, Raphael Louca, Diane Hu, Liangjie Hong

    Abstract: Industry-scale recommendation systems have become a cornerstone of the e-commerce shopping experience. For Etsy, an online marketplace with over 50 million handmade and vintage items, users come to rely on personalized recommendations to surface relevant items from its massive inventory. One hallmark of Etsy's shopping experience is the multitude of ways in which a user can interact with an item t…

    Submitted 11 December, 2018; originally announced December 2018.

  12. arXiv:1811.05544  [pdf, other]

    cs.CL cs.LG stat.ML

    An Introductory Survey on Attention Mechanisms in NLP Problems

    Authors: Dichao Hu

    Abstract: First derived from human intuition, later adapted to machine translation for automatic token alignment, attention mechanism, a simple method that can be used for encoding sequence data based on the importance score each element is assigned, has been widely applied to and attained significant improvement in various tasks in natural language processing, including sentiment classification, text summa…

    Submitted 12 November, 2018; originally announced November 2018.

    Comments: 9 pages