Showing 1–20 of 20 results for author: Fu, S

Searching in archive stat.
  1. arXiv:2503.21352  [pdf]

    cs.AI stat.AP

    Using large language models to produce literature reviews: Usages and systematic biases of microphysics parametrizations in 2699 publications

    Authors: Tianhang Zhang, Shengnan Fu, David M. Schultz, Zhonghua Zheng

    Abstract: Large language models afford opportunities for using computers for intensive tasks, realizing research opportunities that have not been considered before. One such opportunity could be a systematic interrogation of the scientific literature. Here, we show how a large language model can be used to construct a literature review of 2699 publications associated with microphysics parametrizations in th…

    Submitted 27 March, 2025; originally announced March 2025.

  2. arXiv:2502.04204  [pdf, ps, other]

    cs.LG cs.CR stat.ML

    Short-length Adversarial Training Helps LLMs Defend Long-length Jailbreak Attacks: Theoretical and Empirical Evidence

    Authors: Shaopeng Fu, Liang Ding, Jingfeng Zhang, Di Wang

    Abstract: Jailbreak attacks against large language models (LLMs) aim to induce harmful behaviors in LLMs through carefully crafted adversarial prompts. To mitigate attacks, one way is to perform adversarial training (AT)-based alignment, i.e., training LLMs on some of the most adversarial prompts to help them learn how to behave safely under attacks. During AT, the length of adversarial prompts plays a crit…

    Submitted 7 June, 2025; v1 submitted 6 February, 2025; originally announced February 2025.

  3. arXiv:2410.11444  [pdf, other]

    cs.LG cs.AI stat.ML

    A Theoretical Survey on Foundation Models

    Authors: Shi Fu, Yuzhu Chen, Yingjie Wang, Dacheng Tao

    Abstract: Understanding the inner mechanisms of black-box foundation models (FMs) is essential yet challenging in artificial intelligence and its applications. Over the last decade, the long-running focus has been on their explainability, leading to the development of post-hoc explainable methods to rationalize the specific decisions already made by black-box FMs. However, these explainable methods have cer…

    Submitted 24 November, 2024; v1 submitted 15 October, 2024; originally announced October 2024.

    Comments: 63 pages, 16 figures

  4. arXiv:2310.06112  [pdf, other]

    cs.LG stat.ML

    Theoretical Analysis of Robust Overfitting for Wide DNNs: An NTK Approach

    Authors: Shaopeng Fu, Di Wang

    Abstract: Adversarial training (AT) is a canonical method for enhancing the robustness of deep neural networks (DNNs). However, recent studies empirically demonstrated that it suffers from robust overfitting, i.e., a long time AT can be detrimental to the robustness of DNNs. This paper presents a theoretical explanation of robust overfitting for DNNs. Specifically, we non-trivially extend the neural tangent…

    Submitted 4 February, 2024; v1 submitted 9 October, 2023; originally announced October 2023.

    Comments: In Twelfth International Conference on Learning Representations (ICLR 2024)

  5. arXiv:2308.11676  [pdf, other]

    stat.ME cs.AI cs.LG

    Does Misclassifying Non-confounding Covariates as Confounders Affect the Causal Inference within the Potential Outcomes Framework?

    Authors: Yonghe Zhao, Qiang Huang, Shuai Fu, Huiyan Sun

    Abstract: The Potential Outcome Framework (POF) plays a prominent role in the field of causal inference. Most causal inference models based on the POF (CIMs-POF) are designed for eliminating confounding bias and default to an underlying assumption of Confounding Covariates. This assumption posits that the covariates consist solely of confounders. However, the assumption of Confounding Covariates is challeng…

    Submitted 4 September, 2023; v1 submitted 22 August, 2023; originally announced August 2023.

    Comments: 12 pages, 4 figures

  6. arXiv:2203.12964  [pdf, other]

    cs.LG cs.AI stat.ML

    Knowledge Removal in Sampling-based Bayesian Inference

    Authors: Shaopeng Fu, Fengxiang He, Dacheng Tao

    Abstract: The right to be forgotten has been legislated in many countries, but its enforcement in the AI industry would cause unbearable costs. When single data deletion requests come, companies may need to delete the whole models learned with massive resources. Existing works propose methods to remove knowledge learned from data for explicitly parameterized models, which however are not applicable to the sa…

    Submitted 24 March, 2022; originally announced March 2022.

    Comments: In International Conference on Learning Representations, 2022

  7. arXiv:2104.03743  [pdf, other]

    cs.LG cs.CE stat.AP

    Residual Gaussian Process: A Tractable Nonparametric Bayesian Emulator for Multi-fidelity Simulations

    Authors: Wei W. Xing, Akeel A. Shah, Peng Wang, Shandian Zhe, Qian Fu, Robert M. Kirby

    Abstract: Challenges in multi-fidelity modeling relate to accuracy, uncertainty estimation and high-dimensionality. A novel additive structure is introduced in which the highest fidelity solution is written as a sum of the lowest fidelity solution and residuals between the solutions at successive fidelity levels, with Gaussian process priors placed over the low fidelity solution and each of the residuals. T…

    Submitted 8 April, 2021; originally announced April 2021.

  8. arXiv:2101.06417  [pdf, other]

    cs.LG cs.AI stat.ML

    Bayesian Inference Forgetting

    Authors: Shaopeng Fu, Fengxiang He, Yue Xu, Dacheng Tao

    Abstract: The right to be forgotten has been legislated in many countries but the enforcement in machine learning would cause unbearable costs: companies may need to delete whole models learned from massive resources due to single individual requests. Existing works propose to remove the knowledge learned from the requested data via its influence function which is no longer naturally well-defined in Bayesia…

    Submitted 18 February, 2021; v1 submitted 16 January, 2021; originally announced January 2021.

  9. arXiv:2012.13573  [pdf, other]

    cs.LG cs.AI cs.CR stat.ML

    Robustness, Privacy, and Generalization of Adversarial Training

    Authors: Fengxiang He, Shaopeng Fu, Bohan Wang, Dacheng Tao

    Abstract: Adversarial training can considerably robustify deep neural networks to resist adversarial attacks. However, some works suggested that adversarial training might compromise the privacy-preserving and generalization abilities. This paper establishes and quantifies the privacy-robustness trade-off and generalization-robustness trade-off in adversarial training from both theoretical and empirical aspec…

    Submitted 25 December, 2020; originally announced December 2020.

  10. arXiv:2007.07177  [pdf, other]

    cs.LG cs.CV cs.GR cs.IR stat.ML

    MosAIc: Finding Artistic Connections across Culture with Conditional Image Retrieval

    Authors: Mark Hamilton, Stephanie Fu, Mindren Lu, Johnny Bui, Darius Bopp, Zhenbang Chen, Felix Tran, Margaret Wang, Marina Rogers, Lei Zhang, Chris Hoder, William T. Freeman

    Abstract: We introduce MosAIc, an interactive web app that allows users to find pairs of semantically related artworks that span different cultures, media, and millennia. To create this application, we introduce Conditional Image Retrieval (CIR) which combines visual similarity search with user supplied filters or "conditions". This technique allows one to find pairs of similar images that span distinct sub…

    Submitted 27 February, 2021; v1 submitted 14 July, 2020; originally announced July 2020.

  11. arXiv:2002.00426  [pdf]

    q-bio.PE stat.AP

    A Simple Prediction Model for the Development Trend of 2019-nCov Epidemics Based on Medical Observations

    Authors: Ye Liang, Dan Xu, Shang Fu, Kewa Gao, Jingjing Huan, Linyong Xu, Jia-da Li

    Abstract: In order to predict the development trend of the 2019 coronavirus (2019-nCov), we established a prediction model to predict the number of diagnosed cases in China except Hubei Province. From January 25 to January 29, 2020, we optimized 6 prediction models, 5 of them based on the number of medical observations, to predict that the peak time of confirmed diagnoses will appear in the period of morning of…

    Submitted 2 February, 2020; originally announced February 2020.

    Comments: Written on February 1, 2020 at 15:00 (GMT+08:00) 12 pages, 7 figures

  12. arXiv:1912.11464  [pdf, other]

    cs.LG stat.ML

    Attack-Resistant Federated Learning with Residual-based Reweighting

    Authors: Shuhao Fu, Chulin Xie, Bo Li, Qifeng Chen

    Abstract: Federated learning has a variety of applications in multiple domains by utilizing private training data stored on different devices. However, the aggregation process in federated learning is highly vulnerable to adversarial attacks so that the global model may behave abnormally under attacks. To tackle this challenge, we present a novel aggregation algorithm with residual-based reweighting to defe…

    Submitted 8 January, 2021; v1 submitted 24 December, 2019; originally announced December 2019.

    Comments: 8 pages, 6 figures and 4 tables

  13. arXiv:1906.01078  [pdf, other]

    eess.AS cs.LG cs.SD stat.ML

    Increasing Compactness Of Deep Learning Based Speech Enhancement Models With Parameter Pruning And Quantization Techniques

    Authors: Jyun-Yi Wu, Cheng Yu, Szu-Wei Fu, Chih-Ting Liu, Shao-Yi Chien, Yu Tsao

    Abstract: Most recent studies on deep learning based speech enhancement (SE) focused on improving denoising performance. However, successful SE applications require striking a desirable balance between denoising performance and computational cost in real scenarios. In this study, we propose a novel parameter pruning (PP) technique, which removes redundant channels in a neural network. In addition, a paramet…

    Submitted 31 July, 2019; v1 submitted 31 May, 2019; originally announced June 2019.

    Comments: 4 pages, 6 figures

  14. arXiv:1901.03749  [pdf]

    cs.CV cs.LG stat.ML

    Translating SAR to Optical Images for Assisted Interpretation

    Authors: Shilei Fu, Feng Xu, Ya-Qiu Jin

    Abstract: Despite the advantages of all-weather and all-day high-resolution imaging, SAR remote sensing images are much less viewed and used by general people because human vision is not adapted to microwave scattering phenomenon. However, expert interpreters can be trained by comparing side-by-side SAR and optical images to learn the translation rules from SAR to optical. This paper attempts to develop machi…

    Submitted 8 January, 2019; originally announced January 2019.

    Comments: 4 pages, 5 figures, 2 tables, conference

  15. arXiv:1806.00446  [pdf, other]

    stat.ME

    Bayesian Logistic Regression for Small Areas with Numerous Households

    Authors: Balgobin Nandram, Lu Chen, Shuting Fu, Binod Manandhar

    Abstract: We analyze binary data, available for a relatively large number (big data) of families (or households), which are within small areas, from a population-based survey. Inference is required for the finite population proportion of individuals with a specific character for each area. To accommodate the binary data and important features of all sampled individuals, we use a hierarchical Bayesian logist…

    Submitted 1 June, 2018; originally announced June 2018.

    Comments: 36 pages, 11 figures

  16. arXiv:1709.03658  [pdf]

    stat.ML cs.LG cs.SD

    End-to-End Waveform Utterance Enhancement for Direct Evaluation Metrics Optimization by Fully Convolutional Neural Networks

    Authors: Szu-Wei Fu, Tao-Wei Wang, Yu Tsao, Xugang Lu, Hisashi Kawai

    Abstract: Speech enhancement model is used to map a noisy speech to a clean speech. In the training stage, an objective function is often adopted to optimize the model parameters. However, in most studies, there is an inconsistency between the model optimization criterion and the evaluation criterion on the enhanced speech. For example, in measuring speech intelligibility, most of the evaluation metric is b…

    Submitted 15 March, 2018; v1 submitted 11 September, 2017; originally announced September 2017.

    Comments: Accepted in IEEE Transactions on Audio, Speech and Language Processing (TASLP)

  17. arXiv:1704.08504  [pdf]

    stat.ML cs.LG cs.SD

    Complex spectrogram enhancement by convolutional neural network with multi-metrics learning

    Authors: Szu-Wei Fu, Ting-yao Hu, Yu Tsao, Xugang Lu

    Abstract: This paper aims to address two issues existing in the current speech enhancement methods: 1) the difficulty of phase estimations; 2) a single objective function cannot consider multiple metrics simultaneously. To solve the first problem, we propose a novel convolutional neural network (CNN) model for complex spectrogram enhancement, namely estimating clean real and imaginary (RI) spectrograms from…

    Submitted 9 September, 2017; v1 submitted 27 April, 2017; originally announced April 2017.

  18. arXiv:1703.02205  [pdf]

    stat.ML cs.LG cs.SD

    Raw Waveform-based Speech Enhancement by Fully Convolutional Networks

    Authors: Szu-Wei Fu, Yu Tsao, Xugang Lu, Hisashi Kawai

    Abstract: This study proposes a fully convolutional network (FCN) model for raw waveform-based speech enhancement. The proposed system performs speech enhancement in an end-to-end (i.e., waveform-in and waveform-out) manner, which differs from most existing denoising methods that process the magnitude spectrum (e.g., log power spectrum (LPS)) only. Because the fully connected layers, which are involved in…

    Submitted 15 June, 2017; v1 submitted 6 March, 2017; originally announced March 2017.

  19. arXiv:1508.05628  [pdf, other]

    stat.ME

    An adaptive kriging method for solving nonlinear inverse statistical problems

    Authors: Shuai Fu, Mathieu Couplet, Nicolas Bousquet

    Abstract: In various industrial contexts, estimating the distribution of unobserved random vectors Xi from some noisy indirect observations H(Xi) + Ui is required. If the relation between Xi and the quantity H(Xi), measured with the error Ui, is implemented by a CPU-consuming computer model H, a major practical difficulty is to perform the statistical inference with a relatively small number of runs of H. F…

    Submitted 23 August, 2015; originally announced August 2015.

  20. Estimating Discrete Markov Models From Various Incomplete Data Schemes

    Authors: Alberto Pasanisi, Shuai Fu, Nicolas Bousquet

    Abstract: The parameters of a discrete stationary Markov model are transition probabilities between states. Traditionally, data consist in sequences of observed states for a given number of individuals over the whole observation period. In such a case, the estimation of transition probabilities is straightforwardly made by counting one-step moves from a given state to another. In many real-life problems, ho…

    Submitted 22 February, 2012; v1 submitted 7 September, 2010; originally announced September 2010.

    Comments: 26 pages - preprint accepted on 20 February 2012 for publication in Computational Statistics and Data Analysis (please cite the journal's paper)

    Journal ref: Computational Statistics and Data Analysis - Volume 56, Issue 9, September 2012, Pages 2609-2625