Showing 1–24 of 24 results for author: Si, N

Searching in archive cs.
  1. arXiv:2509.14203

    math.OC cs.LG

    Bellman Optimality of Average-Reward Robust Markov Decision Processes with a Constant Gain

    Authors: Shengbo Wang, Nian Si

    Abstract: Learning and optimal control under robust Markov decision processes (MDPs) have received increasing attention, yet most existing theory, algorithms, and applications focus on finite-horizon or discounted models. The average-reward formulation, while natural in many operations research and management contexts, remains underexplored. This is primarily because the dynamic programming foundations are…

    Submitted 17 September, 2025; originally announced September 2025.

  2. arXiv:2509.09371

    stat.ME cs.LG

    Representation-Aware Distributionally Robust Optimization: A Knowledge Transfer Framework

    Authors: Zitao Wang, Nian Si, Molei Liu

    Abstract: We propose REpresentation-Aware Distributionally Robust Estimation (READ), a novel framework for Wasserstein distributionally robust learning that accounts for predictive representations when guarding against distributional shifts. Unlike classical approaches that treat all feature perturbations equally, READ embeds a multidimensional alignment parameter into the transport cost, allowing the model…

    Submitted 11 September, 2025; originally announced September 2025.

  3. arXiv:2505.12202

    cs.LG stat.ML

    Near-Optimal Sample Complexities of Divergence-based S-rectangular Distributionally Robust Reinforcement Learning

    Authors: Zhenghao Li, Shengbo Wang, Nian Si

    Abstract: Distributionally robust reinforcement learning (DR-RL) has recently gained significant attention as a principled approach that addresses discrepancies between training and testing environments. To balance robustness, conservatism, and computational tractability, the literature has introduced DR-RL models with SA-rectangular and S-rectangular adversaries. While most existing statistical analyses fo…

    Submitted 2 October, 2025; v1 submitted 17 May, 2025; originally announced May 2025.

  4. arXiv:2505.10007

    cs.LG math.OC stat.ML

    Sample Complexity of Distributionally Robust Average-Reward Reinforcement Learning

    Authors: Zijun Chen, Shengbo Wang, Nian Si

    Abstract: Motivated by practical applications where stable long-term performance is critical, such as robotics, operations research, and healthcare, we study the problem of distributionally robust (DR) average-reward reinforcement learning. We propose two algorithms that achieve near-optimal sample complexity. The first reduces the problem to a DR discounted Markov decision process (MDP), while the second, An…

    Submitted 15 May, 2025; originally announced May 2025.

  5. RemoteChess: Enhancing Older Adults' Social Connectedness via Designing a Virtual Reality Chinese Chess (Xiangqi) Community

    Authors: Qianjie Wei, Xiaoying Wei, Yiqi Liang, Fan Lin, Nuonan Si, Mingming Fan

    Abstract: The decline of social connectedness caused by distance and physical limitations severely affects older adults' well-being and mental health. While virtual reality (VR) is promising for older adults to socialize remotely, existing social VR designs primarily focus on verbal communication (e.g., reminiscent, chat). Actively engaging in shared activities is also an important aspect of social connecti…

    Submitted 17 February, 2025; originally announced February 2025.

    Comments: 15 pages, 8 figures

  6. arXiv:2502.08146

    cs.LG stat.ME stat.ML

    Knowledge-Guided Wasserstein Distributionally Robust Optimization

    Authors: Zitao Wang, Ziyuan Wang, Molei Liu, Nian Si

    Abstract: Transfer learning is a popular strategy to leverage external knowledge and improve statistical efficiency, particularly with a limited target sample. We propose a novel knowledge-guided Wasserstein Distributionally Robust Optimization (KG-WDRO) framework that adaptively incorporates multiple sources of external knowledge to overcome the conservativeness of vanilla WDRO, which often results in over…

    Submitted 12 February, 2025; originally announced February 2025.

  7. arXiv:2406.19619

    stat.ML cs.LG math.ST

    ScoreFusion: Fusing Score-based Generative Models via Kullback-Leibler Barycenters

    Authors: Hao Liu, Junze Tony Ye, Jose Blanchet, Nian Si

    Abstract: We introduce ScoreFusion, a theoretically grounded method for fusing multiple pre-trained diffusion models that are assumed to generate from auxiliary populations. ScoreFusion is particularly useful for enhancing the generative modeling of a target population with limited observed data. Our starting point considers the family of KL barycenters of the auxiliary populations, which is proven to be an…

    Submitted 16 April, 2025; v1 submitted 27 June, 2024; originally announced June 2024.

    Comments: 41 pages, 21 figures. Accepted as an Oral (top 2%) paper by AISTATS 2025

  8. arXiv:2406.11281

    stat.ML cs.LG

    Statistical Learning of Distributionally Robust Stochastic Control in Continuous State Spaces

    Authors: Shengbo Wang, Nian Si, Jose Blanchet, Zhengyuan Zhou

    Abstract: We explore the control of stochastic systems with potentially continuous state and action spaces, characterized by the state dynamics $X_{t+1} = f(X_t, A_t, W_t)$. Here, $X$, $A$, and $W$ represent the state, action, and exogenous random noise processes, respectively, with $f$ denoting a known function that describes state transitions. Traditionally, the noise process $\{W_t, t \geq 0\}$ is assume…

    Submitted 17 June, 2024; originally announced June 2024.

  9. arXiv:2401.16692

    cs.LG

    Calibration-then-Calculation: A Variance Reduced Metric Framework in Deep Click-Through Rate Prediction Models

    Authors: Yewen Fan, Nian Si, Xiangchen Song, Kun Zhang

    Abstract: The adoption of deep learning across various fields has been extensive, yet there is a lack of focus on evaluating the performance of deep learning pipelines. Typically, with the increased use of large datasets and complex models, the training process is run only once and the result is compared to previous benchmarks. This practice can lead to imprecise comparisons due to the variance in neural ne…

    Submitted 17 May, 2024; v1 submitted 29 January, 2024; originally announced January 2024.

  10. arXiv:2401.15811

    stat.ME cs.IR

    Seller-Side Experiments under Interference Induced by Feedback Loops in Two-Sided Platforms

    Authors: Zhihua Zhu, Zheng Cai, Liang Zheng, Nian Si

    Abstract: Two-sided platforms are central to modern commerce and content sharing and often utilize A/B testing for developing new features. While user-side experiments are common, seller-side experiments become crucial for specific interventions and metrics. This paper investigates the effects of interference caused by feedback loops on seller-side experiments in two-sided platforms, with a particular focus…

    Submitted 9 February, 2024; v1 submitted 28 January, 2024; originally announced January 2024.

  11. arXiv:2401.03190

    cs.CL cs.AI cs.CV

    MPN: Leveraging Multilingual Patch Neuron for Cross-lingual Model Editing

    Authors: Nianwen Si, Hao Zhang, Weiqiang Zhang

    Abstract: Large language models are known for encoding a vast amount of factual knowledge, but this knowledge often becomes outdated due to the ever-changing nature of external information. A promising solution to this challenge is the utilization of model editing methods to update the knowledge in an efficient manner. However, the majority of existing model editing techniques are limited to monolingual frameworks, t…

    Submitted 6 January, 2024; originally announced January 2024.

    Comments: Work in progress

  12. arXiv:2311.15766

    cs.CL

    Knowledge Unlearning for LLMs: Tasks, Methods, and Challenges

    Authors: Nianwen Si, Hao Zhang, Heyu Chang, Wenlin Zhang, Dan Qu, Weiqiang Zhang

    Abstract: In recent years, large language models (LLMs) have spurred a new research paradigm in natural language processing. Despite their excellent capability in knowledge-based question answering and reasoning, their potential to retain faulty or even harmful knowledge poses risks of malicious application. The challenge of mitigating this issue and transforming these models into purer assistants is crucia…

    Submitted 7 December, 2023; v1 submitted 27 November, 2023; originally announced November 2023.

    Comments: Work in progress

  13. arXiv:2311.09018

    cs.LG eess.SY math.OC stat.ML

    On the Foundation of Distributionally Robust Reinforcement Learning

    Authors: Shengbo Wang, Nian Si, Jose Blanchet, Zhengyuan Zhou

    Abstract: Motivated by the need for a robust policy in the face of environment shifts between training and deployment, we contribute to the theoretical foundation of distributionally robust reinforcement learning (DRRL). This is accomplished through a comprehensive modeling framework centered around robust Markov decision processes (RMDPs). This framework obliges the decision maker to choose an optimal poli…

    Submitted 24 August, 2025; v1 submitted 15 November, 2023; originally announced November 2023.

  14. arXiv:2310.17496

    stat.ME cs.LG econ.EM

    Tackling Interference Induced by Data Training Loops in A/B Tests: A Weighted Training Approach

    Authors: Nian Si

    Abstract: In modern recommendation systems, the standard pipeline involves training machine learning models on historical data to predict user behaviors and improve recommendations continuously. However, these data training loops can introduce interference in A/B tests, where data generated by control and treatment algorithms, potentially with different distributions, are combined. To address these challeng…

    Submitted 4 April, 2024; v1 submitted 26 October, 2023; originally announced October 2023.

  15. arXiv:2310.02050

    cs.CL cs.CV

    Tuning Large language model for End-to-end Speech Translation

    Authors: Hao Zhang, Nianwen Si, Yaqi Chen, Wenlin Zhang, Xukui Yang, Dan Qu, Xiaolin Jiao

    Abstract: With the emergence of large language models (LLMs), multimodal models based on LLMs have demonstrated significant potential. Models such as LLaSM, X-LLM, and SpeechGPT exhibit an impressive ability to comprehend and generate human instructions. However, their performance often falters when faced with complex tasks like end-to-end speech translation (E2E-ST), a cross-language and cross-modal transl…

    Submitted 3 October, 2023; originally announced October 2023.

  16. arXiv:2309.11651

    eess.SY cs.LG math.AP math.OC

    Drift Control of High-Dimensional RBM: A Computational Method Based on Neural Networks

    Authors: Baris Ata, J. Michael Harrison, Nian Si

    Abstract: Motivated by applications in queueing theory, we consider a stochastic control problem whose state space is the $d$-dimensional positive orthant. The controlled process $Z$ evolves as a reflected Brownian motion whose covariance matrix is exogenously specified, as are its directions of reflection from the orthant's boundary surfaces. A system manager chooses a drift vector $\theta(t)$ at each time $t$…

    Submitted 7 August, 2024; v1 submitted 20 September, 2023; originally announced September 2023.

  17. arXiv:2305.18420

    cs.LG math.OC stat.ML

    Sample Complexity of Variance-reduced Distributionally Robust Q-learning

    Authors: Shengbo Wang, Nian Si, Jose Blanchet, Zhengyuan Zhou

    Abstract: Dynamic decision-making under distributional shifts is of fundamental interest in theory and applications of reinforcement learning: The distribution of the environment in which the data is collected can differ from that of the environment in which the model is deployed. This paper presents two novel model-free algorithms, namely the distributionally robust Q-learning and its variance-reduced coun…

    Submitted 4 September, 2024; v1 submitted 28 May, 2023; originally announced May 2023.

  18. Improving Speech Translation by Cross-Modal Multi-Grained Contrastive Learning

    Authors: Hao Zhang, Nianwen Si, Yaqi Chen, Wenlin Zhang, Xukui Yang, Dan Qu, Wei-Qiang Zhang

    Abstract: The end-to-end speech translation (E2E-ST) model has gradually become a mainstream paradigm due to its low latency and reduced error propagation. However, it is non-trivial to train such a model well due to the task complexity and data scarcity. Because of the differences between the speech and text modalities, the E2E-ST model usually performs worse than the corresponding machine translation (MT) model. Based o…

    Submitted 20 April, 2023; originally announced April 2023.

    Journal ref: IEEE/ACM TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL. 31, 2023

  19. arXiv:2304.10295

    cs.CL cs.SD eess.AS

    Decouple Non-parametric Knowledge Distillation For End-to-end Speech Translation

    Authors: Hao Zhang, Nianwen Si, Yaqi Chen, Wenlin Zhang, Xukui Yang, Dan Qu, Zhen Li

    Abstract: Existing techniques often attempt to transfer knowledge from a powerful machine translation (MT) model to a speech translation (ST) model using elaborate methods, which often require transcriptions as extra input during training. However, transcriptions are not always available, and how to improve the ST model performance without transcription, i.e., data efficiency, has rarely been studied in…

    Submitted 20 April, 2023; originally announced April 2023.

    Comments: Accepted by ICASSP 2023

  20. arXiv:2302.13203

    cs.LG stat.ML

    A Finite Sample Complexity Bound for Distributionally Robust Q-learning

    Authors: Shengbo Wang, Nian Si, Jose Blanchet, Zhengyuan Zhou

    Abstract: We consider a reinforcement learning setting in which the deployment environment is different from the training environment. Applying a robust Markov decision process formulation, we extend the distributionally robust $Q$-learning framework studied in Liu et al. [2022]. Further, we improve the design and analysis of their multi-level Monte Carlo estimator. Assuming access to a simulator, we prov…

    Submitted 31 July, 2024; v1 submitted 25 February, 2023; originally announced February 2023.

    Comments: Accepted by AISTATS 2023

  21. arXiv:2205.09809

    cs.LG stat.ME

    Calibration Matters: Tackling Maximization Bias in Large-scale Advertising Recommendation Systems

    Authors: Yewen Fan, Nian Si, Kun Zhang

    Abstract: Calibration is defined as the ratio of the average predicted click rate to the true click rate. The optimization of calibration is essential to many online advertising recommendation systems because it directly affects the downstream bids in ads auctions and the amount of money charged to advertisers. Despite its importance, calibration optimization often suffers from a problem called "maximizatio…

    Submitted 21 March, 2023; v1 submitted 19 May, 2022; originally announced May 2022.

    Comments: Accepted in ICLR 2023

  22. arXiv:2106.01070

    stat.ML cs.CY cs.LG math.ST

    Testing Group Fairness via Optimal Transport Projections

    Authors: Nian Si, Karthyek Murthy, Jose Blanchet, Viet Anh Nguyen

    Abstract: We present a statistical testing framework to detect if a given machine learning classifier fails to satisfy a wide range of group fairness notions. The proposed test is a flexible, interpretable, and statistically rigorous tool for auditing whether exhibited biases are intrinsic to the algorithm or due to the randomness in the data. The statistical challenges, which may arise from multiple impact…

    Submitted 2 June, 2021; originally announced June 2021.

    Journal ref: International Conference on Machine Learning 2021

  23. arXiv:2007.04458

    cs.LG stat.ML

    Robust Bayesian Classification Using an Optimistic Score Ratio

    Authors: Viet Anh Nguyen, Nian Si, Jose Blanchet

    Abstract: We build a Bayesian contextual classification model using an optimistic score ratio for robust binary classification when there is limited information on the class-conditional, or contextual, distribution. The optimistic score searches for the distribution that is most plausible to explain the observed outcomes in the testing sample among all distributions belonging to the contextual ambiguity set…

    Submitted 8 July, 2020; originally announced July 2020.

  24. arXiv:2006.05630

    cs.LG math.OC math.ST stat.ML

    Distributionally Robust Batch Contextual Bandits

    Authors: Nian Si, Fan Zhang, Zhengyuan Zhou, Jose Blanchet

    Abstract: Policy learning using historical observational data is an important problem that has found widespread applications. Examples include selecting offers, prices, advertisements to send to customers, as well as selecting which medication to prescribe to a patient. However, existing literature rests on the crucial assumption that the future environment where the learned policy will be deployed is the s…

    Submitted 11 September, 2023; v1 submitted 9 June, 2020; originally announced June 2020.

    Comments: The short version has been accepted in ICML 2020