Showing 1–22 of 22 results for author: Gan, K

Searching in archive cs.

  1. arXiv:2505.02238  [pdf]

    cs.LG

    Federated Causal Inference in Healthcare: Methods, Challenges, and Applications

    Authors: Haoyang Li, Jie Xu, Kyra Gan, Fei Wang, Chengxi Zang

    Abstract: Federated causal inference enables multi-site treatment effect estimation without sharing individual-level data, offering a privacy-preserving solution for real-world evidence generation. However, data heterogeneity across sites, manifested in differences in covariate, treatment, and outcome, poses significant challenges for unbiased and efficient estimation. In this paper, we present a comprehens…

    Submitted 4 May, 2025; originally announced May 2025.

  2. arXiv:2504.20908  [pdf, other]

    cs.LG

    MOSIC: Model-Agnostic Optimal Subgroup Identification with Multi-Constraint for Improved Reliability

    Authors: Wenxin Chen, Weishen Pan, Kyra Gan, Fei Wang

    Abstract: Identifying subgroups that benefit from specific treatments using observational data is a critical challenge in personalized medicine. Most existing approaches solely focus on identifying a subgroup with an improved treatment effect. However, practical considerations, such as ensuring a minimum subgroup size for representativeness or achieving sufficient confounder balance for reliability, are als…

    Submitted 29 April, 2025; originally announced April 2025.

  3. arXiv:2502.05145  [pdf, other]

    cs.LG

    From Restless to Contextual: A Thresholding Bandit Approach to Improve Finite-horizon Performance

    Authors: Jiamin Xu, Ivan Nazarov, Aditya Rastogi, África Periáñez, Kyra Gan

    Abstract: Online restless bandits extend classic contextual bandits by incorporating state transitions and budget constraints, representing each agent as a Markov Decision Process (MDP). This framework is crucial for finite-horizon strategic resource allocation, optimizing limited costly interventions for long-term benefits. However, learning the underlying MDP for each agent poses a major challenge in fini…

    Submitted 3 March, 2025; v1 submitted 7 February, 2025; originally announced February 2025.

  4. arXiv:2410.15564  [pdf, other]

    cs.LG stat.ME stat.ML

    Reward Maximization for Pure Exploration: Minimax Optimal Good Arm Identification for Nonparametric Multi-Armed Bandits

    Authors: Brian Cho, Dominik Meier, Kyra Gan, Nathan Kallus

    Abstract: In multi-armed bandits, the tasks of reward maximization and pure exploration are often at odds with each other. The former focuses on exploiting arms with the highest means, while the latter may require constant exploration across all arms. In this work, we focus on good arm identification (GAI), a practical bandit inference objective that aims to label arms with means above a threshold as quickl…

    Submitted 20 October, 2024; originally announced October 2024.

  5. arXiv:2410.11759  [pdf, other]

    cs.LG

    LoSAM: Local Search in Additive Noise Models with Mixed Mechanisms and General Noise for Global Causal Discovery

    Authors: Sujai Hiremath, Promit Ghosal, Kyra Gan

    Abstract: Inferring causal relationships from observational data is crucial when experiments are costly or infeasible. Additive noise models (ANMs) enable unique directed acyclic graph (DAG) identification, but existing ANM methods often rely on restrictive assumptions on the data generating process, limiting their applicability to real-world settings. We propose local search in additive noise models, LoSAM…

    Submitted 12 February, 2025; v1 submitted 15 October, 2024; originally announced October 2024.

  6. arXiv:2408.12004  [pdf, other]

    cs.LG stat.ME stat.ML

    CSPI-MT: Calibrated Safe Policy Improvement with Multiple Testing for Threshold Policies

    Authors: Brian M Cho, Ana-Roxana Pop, Kyra Gan, Sam Corbett-Davies, Israel Nir, Ariel Evnine, Nathan Kallus

    Abstract: When modifying existing policies in high-risk settings, it is often necessary to ensure with high certainty that the newly proposed policy improves upon a baseline, such as the status quo. In this work, we consider the problem of safe policy improvement, where one only adopts a new policy if it is deemed to be better than the specified baseline with at least pre-specified probability. We focus on…

    Submitted 21 August, 2024; originally announced August 2024.

  7. arXiv:2406.13187  [pdf, other]

    cs.LG

    Boosting Consistency in Dual Training for Long-Tailed Semi-Supervised Learning

    Authors: Kai Gan, Tong Wei, Min-Ling Zhang

    Abstract: While long-tailed semi-supervised learning (LTSSL) has received tremendous attention in many real-world classification problems, existing LTSSL algorithms typically assume that the class distributions of labeled and unlabeled data are almost identical. Those LTSSL algorithms built upon the assumption can severely suffer when the class distributions of labeled and unlabeled data are mismatched sinc…

    Submitted 18 June, 2024; originally announced June 2024.

  8. arXiv:2405.14848  [pdf, other]

    stat.ML cs.LG

    Local Causal Discovery for Structural Evidence of Direct Discrimination

    Authors: Jacqueline Maasch, Kyra Gan, Violet Chen, Agni Orfanoudaki, Nil-Jana Akpinar, Fei Wang

    Abstract: Identifying the causal pathways of unfairness is a critical objective for improving policy design and algorithmic decision-making. Prior work in causal fairness analysis often requires knowledge of the causal graph, hindering practical applications in complex or low-knowledge domains. Moreover, global discovery methods that learn causal structure from data can display unstable performance on finit…

    Submitted 19 December, 2024; v1 submitted 23 May, 2024; originally announced May 2024.

    Journal ref: The 39th Annual AAAI Conference on Artificial Intelligence (AAAI 2025)

  9. arXiv:2405.14496  [pdf, other]

    cs.LG

    Hybrid Top-Down Global Causal Discovery with Local Search for Linear and Nonlinear Additive Noise Models

    Authors: Sujai Hiremath, Jacqueline R. M. A. Maasch, Mengxiao Gao, Promit Ghosal, Kyra Gan

    Abstract: Learning the unique directed acyclic graph corresponding to an unknown causal model is a challenging task. Methods based on functional causal models can identify a unique graph, but either suffer from the curse of dimensionality or impose strong parametric assumptions. To address these challenges, we propose a novel hybrid approach for global causal discovery in observational data that leverages l…

    Submitted 13 January, 2025; v1 submitted 23 May, 2024; originally announced May 2024.

    Comments: To appear at the Thirty-Eighth Annual Conference on Neural Information Processing Systems (NeurIPS 2024)

  10. arXiv:2405.11756  [pdf, other]

    cs.LG

    Erasing the Bias: Fine-Tuning Foundation Models for Semi-Supervised Learning

    Authors: Kai Gan, Tong Wei

    Abstract: Semi-supervised learning (SSL) has witnessed remarkable progress, resulting in the emergence of numerous method variations. However, practitioners often encounter challenges when attempting to deploy these methods due to their subpar performance. In this paper, we present a novel SSL approach named FineSSL that significantly addresses this limitation by adapting pre-trained foundation models. We i…

    Submitted 19 May, 2024; originally announced May 2024.

    Comments: Accepted to ICML 2024

  11. arXiv:2402.06122  [pdf, other]

    stat.ME cs.LG stat.ML

    Peeking with PEAK: Sequential, Nonparametric Composite Hypothesis Tests for Means of Multiple Data Streams

    Authors: Brian Cho, Kyra Gan, Nathan Kallus

    Abstract: We propose a novel nonparametric sequential test for composite hypotheses for means of multiple data streams. Our proposed method, peeking with expectation-based averaged capital (PEAK), builds upon the testing-by-betting framework and provides a non-asymptotic $α$-level test across any stopping time. Our contributions are two-fold: (1) we propose a novel betting scheme and provide theoreti…

    Submitted 2 June, 2024; v1 submitted 8 February, 2024; originally announced February 2024.

    Comments: To appear at the Forty-first International Conference on Machine Learning (ICML 2024)

  12. arXiv:2402.01995  [pdf, other]

    cs.LG math.OC

    Online Uniform Sampling: Randomized Learning-Augmented Approximation Algorithms with Application to Digital Health

    Authors: Xueqing Liu, Kyra Gan, Esmaeil Keyvanshokooh, Susan Murphy

    Abstract: Motivated by applications in digital health, this work studies the novel problem of online uniform sampling (OUS), where the goal is to distribute a sampling budget uniformly across unknown decision times. In the OUS problem, the algorithm is given a budget $b$ and a time horizon $T$, and an adversary then chooses a value $τ^* \in [b,T]$, which is revealed to the algorithm online. At each decision…

    Submitted 19 October, 2024; v1 submitted 2 February, 2024; originally announced February 2024.

  13. arXiv:2310.17816  [pdf, other]

    stat.ML cs.LG stat.ME

    Local Discovery by Partitioning: Polynomial-Time Causal Discovery Around Exposure-Outcome Pairs

    Authors: Jacqueline Maasch, Weishen Pan, Shantanu Gupta, Volodymyr Kuleshov, Kyra Gan, Fei Wang

    Abstract: Causal discovery is crucial for causal inference in observational studies, as it can enable the identification of valid adjustment sets (VAS) for unbiased effect estimation. However, global causal discovery is notoriously hard in the nonparametric setting, with exponential time and sample complexity in the worst case. To address this, we propose local discovery by partitioning (LDP): a local causa…

    Submitted 1 June, 2024; v1 submitted 25 October, 2023; originally announced October 2023.

    Journal ref: Proceedings of the Fortieth Conference on Uncertainty in Artificial Intelligence (2024)

  14. arXiv:2309.11930  [pdf, other]

    cs.LG cs.CV

    Bridging the Gap: Learning Pace Synchronization for Open-World Semi-Supervised Learning

    Authors: Bo Ye, Kai Gan, Tong Wei, Min-Ling Zhang

    Abstract: In open-world semi-supervised learning, a machine learning model is tasked with uncovering novel categories from unlabeled data while maintaining performance on seen categories from labeled data. The central challenge is the substantial learning gap between seen and novel categories, as the model learns the former faster due to accurate supervisory information. Moreover, capturing the semantics of…

    Submitted 17 April, 2024; v1 submitted 21 September, 2023; originally announced September 2023.

  15. arXiv:2305.18511  [pdf, other]

    cs.LG math.OC

    Contextual Bandits with Budgeted Information Reveal

    Authors: Kyra Gan, Esmaeil Keyvanshokooh, Xueqing Liu, Susan Murphy

    Abstract: Contextual bandit algorithms are commonly used in digital health to recommend personalized treatments. However, to ensure the effectiveness of the treatments, patients are often requested to take actions that have no immediate benefit to them, which we refer to as pro-treatment actions. In practice, clinicians have a limited budget to encourage patients to take these actions and collect additional…

    Submitted 13 March, 2024; v1 submitted 29 May, 2023; originally announced May 2023.

    Comments: International Conference on Artificial Intelligence and Statistics, 2024

  16. arXiv:2305.13680  [pdf, other]

    cs.SE

    ChatGPT, Can You Generate Solutions for my Coding Exercises? An Evaluation on its Effectiveness in an undergraduate Java Programming Course

    Authors: Eng Lieh Ouh, Benjamin Kok Siew Gan, Kyong Jin Shim, Swavek Wlodkowski

    Abstract: In this study, we assess the efficacy of employing the ChatGPT language model to generate solutions for coding exercises within an undergraduate Java programming course. ChatGPT, a large-scale, deep learning-driven natural language processing model, is capable of producing programming code based on textual input. Our evaluation involves analyzing ChatGPT-generated solutions for 80 diverse programm…

    Submitted 23 May, 2023; originally announced May 2023.

  17. arXiv:2305.13662  [pdf, other]

    cs.SE

    Are you cloud-certified? Preparing Computing Undergraduates for Cloud Certification with Experiential Learning

    Authors: Eng Lieh Ouh, Benjamin Kok Siew Gan

    Abstract: Cloud Computing skills have been increasing in demand. Many software engineers are learning these skills and taking cloud certification examinations to be job competitive. Preparing undergraduates to be cloud-certified remains challenging as cloud computing is a relatively new topic in the computing curriculum, and many of these certifications require working experience. In this paper, we report o…

    Submitted 23 May, 2023; originally announced May 2023.

  18. arXiv:2208.11087  [pdf]

    eess.SP cs.AI cs.CV cs.HC cs.LG

    Locally temporal-spatial pattern learning with graph attention mechanism for EEG-based emotion recognition

    Authors: Yiwen Zhu, Kaiyu Gan, Zhong Yin

    Abstract: Technique of emotion recognition enables computers to classify human affective states into discrete categories. However, the emotion may fluctuate instead of maintaining a stable state even within a short time interval. There is also a difficulty to take the full use of the EEG spatial distribution due to its 3-D topology structure. To tackle the above issues, we proposed a locally temporal-spatia…

    Submitted 19 August, 2022; originally announced August 2022.

  19. ITSS: Interactive Web-Based Authoring and Playback Integrated Environment for Programming Tutorials

    Authors: Eng Lieh Ouh, Benjamin Kok Siew Gan, David Lo

    Abstract: Video-based programming tutorials are a popular form of tutorial used by authors to guide learners to code. Still, the interactivity of these videos is limited primarily to control video flow. There are existing works with increased interactivity that are shown to improve the learning experience. Still, these solutions require setting up a custom recording environment and are not well-integrated w…

    Submitted 18 April, 2022; originally announced April 2022.

  20. arXiv:2103.04250  [pdf, other]

    cs.LG stat.ML

    Greedy Approximation Algorithms for Active Sequential Hypothesis Testing

    Authors: Kyra Gan, Su Jia, Andrew Li

    Abstract: In the problem of active sequential hypothesis testing (ASHT), a learner seeks to identify the true hypothesis from among a known set of hypotheses. The learner is given a set of actions and knows the random distribution of the outcome of any action under any true hypothesis. Given a target error $δ>0$, the goal is to sequentially select the fewest number of actions so as to identify the true hypo…

    Submitted 6 October, 2021; v1 submitted 6 March, 2021; originally announced March 2021.

  21. Extracting Semantic Concepts and Relations from Scientific Publications by Using Deep Learning

    Authors: Fatima N. AL-Aswadi, Huah Yong Chan, Keng Hoon Gan

    Abstract: With the large volume of unstructured data that increases constantly on the web, the motivation of representing the knowledge in this data in the machine-understandable form is increased. Ontology is one of the major cornerstones of representing the information in a more meaningful way on the semantic Web. The current ontology repositories are quite limited either for their scope or for currentnes…

    Submitted 4 September, 2020; v1 submitted 1 September, 2020; originally announced September 2020.

    Comments: Proposal

  22. arXiv:2002.11096  [pdf, other]

    stat.ML cs.LG math.OC

    Causal Inference With Selectively Deconfounded Data

    Authors: Kyra Gan, Andrew A. Li, Zachary C. Lipton, Sridhar Tayur

    Abstract: Given only data generated by a standard confounding graph with unobserved confounder, the Average Treatment Effect (ATE) is not identifiable. To estimate the ATE, a practitioner must then either (a) collect deconfounded data; (b) run a clinical trial; or (c) elucidate further properties of the causal graph that might render the ATE identifiable. In this paper, we consider the benefit of incorporati…

    Submitted 6 March, 2021; v1 submitted 25 February, 2020; originally announced February 2020.

    Journal ref: Proceedings of the 24th International Conference on Artificial Intelligence and Statistics (AISTATS) 2021, San Diego, California, USA. PMLR: Volume 130. Copyright 2021 by the author(s)