Showing 1–50 of 96 results for author: Pham, N

  1. arXiv:2505.08146  [pdf, ps, other]

    cs.DS cs.LG

    Tensor Sketch: Fast and Scalable Polynomial Kernel Approximation

    Authors: Ninh Pham, Rasmus Pagh

    Abstract: Approximation of non-linear kernels using random feature maps has become a powerful technique for scaling kernel methods to large datasets. We propose Tensor Sketch, an efficient random feature map for approximating polynomial kernels. Given $n$ training samples in $\mathbb{R}^d$, Tensor Sketch computes low-dimensional embeddings in $\mathbb{R}^D$ in time $O(n(d + D \log D))$, making it well-suited for hi…

    Submitted 12 May, 2025; originally announced May 2025.

    Comments: Extension of KDD 2013 and correcting the variance bound

  2. Development and evaluation of a deep learning algorithm for German word recognition from lip movements

    Authors: Dinh Nam Pham, Torsten Rahne

    Abstract: When reading lips, many people benefit from additional visual information from the lip movements of the speaker, which is, however, very error-prone. Algorithms for lip reading with artificial intelligence based on artificial neural networks significantly improve word recognition but are not available for the German language. A total of 1806 video clips with only one German-speaking person each we…

    Submitted 22 April, 2025; originally announced April 2025.

    Comments: English version of journal article in HNO 2022

    Journal ref: HNO 70, 456-465 (2022)

  3. arXiv:2503.13429  [pdf, other]

    cs.CV

    Escaping Plato's Cave: Robust Conceptual Reasoning through Interpretable 3D Neural Object Volumes

    Authors: Nhi Pham, Bernt Schiele, Adam Kortylewski, Jonas Fischer

    Abstract: With the rise of neural networks, especially in high-stakes applications, these networks need two properties (i) robustness and (ii) interpretability to ensure their safety. Recent advances in classifiers with 3D volumetric object representations have demonstrated a greatly enhanced robustness in out-of-distribution data. However, these 3D-aware classifiers have not been studied from the perspecti…

    Submitted 17 March, 2025; originally announced March 2025.

  4. On the State of Coherence in the Land of Type Classes

    Authors: Dimi Racordon, Eugene Flesselle, Cao Nguyen Pham

    Abstract: Type classes are a popular tool for implementing generic algorithms and data structures without loss of efficiency, bridging the gap between parametric and ad-hoc polymorphism. Since their initial development in Haskell, they now feature prominently in numerous other industry-ready programming languages, notably including Swift, Rust, and Scala. The success of type classes hinges in large part on…

    Submitted 27 February, 2025; originally announced February 2025.

    Journal ref: The Art, Science, and Engineering of Programming, 2025, Vol. 10, Issue 1, Article 15

  5. arXiv:2502.16152  [pdf, other]

    cs.LG cs.GT

    DUPRE: Data Utility Prediction for Efficient Data Valuation

    Authors: Kieu Thao Nguyen Pham, Rachael Hwee Ling Sim, Quoc Phong Nguyen, See Kiong Ng, Bryan Kian Hsiang Low

    Abstract: Data valuation is increasingly used in machine learning (ML) to decide the fair compensation for data owners and identify valuable or harmful data for improving ML models. Cooperative game theory-based data valuation, such as Data Shapley, requires evaluating the data utility (e.g., validation accuracy) and retraining the ML model for multiple data subsets. While most existing works on efficient e…

    Submitted 22 February, 2025; originally announced February 2025.

    Comments: 16 pages, 7 figures; accepted at AAMAS 2025

  6. arXiv:2502.06759  [pdf, other]

    cs.CL cs.AI cs.DB

    Rationalization Models for Text-to-SQL

    Authors: Gaetano Rossiello, Nhan Pham, Michael Glass, Junkyu Lee, Dharmashankar Subramanian

    Abstract: We introduce a framework for generating Chain-of-Thought (CoT) rationales to enhance text-to-SQL model fine-tuning. These rationales consist of intermediate SQL statements and explanations, serving as incremental steps toward constructing the final SQL query. The process begins with manually annotating a small set of examples, which are then used to prompt a large language model in an iterative, d…

    Submitted 20 March, 2025; v1 submitted 10 February, 2025; originally announced February 2025.

    Comments: Published at ICLR 2025 Workshop on Reasoning and Planning for LLMs

  7. arXiv:2501.09512  [pdf, ps, other]

    cs.CL cs.LG

    PIER: A Novel Metric for Evaluating What Matters in Code-Switching

    Authors: Enes Yavuz Ugan, Ngoc-Quan Pham, Leonard Bärmann, Alex Waibel

    Abstract: Code-switching, the alternation of languages within a single discourse, presents a significant challenge for Automatic Speech Recognition. Despite the unique nature of the task, performance is commonly measured with established metrics such as Word-Error-Rate (WER). However, in this paper, we question whether these general metrics accurately assess performance on code-switching. Specifically, usin…

    Submitted 21 January, 2025; v1 submitted 16 January, 2025; originally announced January 2025.

    Comments: Accepted at ICASSP 2025

  8. arXiv:2412.17241  [pdf, other]

    cs.CV cs.AI

    QTSeg: A Query Token-Based Dual-Mix Attention Framework with Multi-Level Feature Distribution for Medical Image Segmentation

    Authors: Phuong-Nam Tran, Nhat Truong Pham, Duc Ngoc Minh Dang, Eui-Nam Huh, Choong Seon Hong

    Abstract: Medical image segmentation plays a crucial role in assisting healthcare professionals with accurate diagnoses and enabling automated diagnostic processes. Traditional convolutional neural networks (CNNs) often struggle with capturing long-range dependencies, while transformer-based architectures, despite their effectiveness, come with increased computational complexity. Recent efforts have focused…

    Submitted 13 February, 2025; v1 submitted 22 December, 2024; originally announced December 2024.

  9. arXiv:2411.04077  [pdf, other]

    cs.CV

    H-POPE: Hierarchical Polling-based Probing Evaluation of Hallucinations in Large Vision-Language Models

    Authors: Nhi Pham, Michael Schott

    Abstract: By leveraging both texts and images, large vision language models (LVLMs) have shown significant progress in various multi-modal tasks. Nevertheless, these models often suffer from hallucinations, e.g., they exhibit inconsistencies between the visual input and the textual output. To address this, we propose H-POPE, a coarse-to-fine-grained benchmark that systematically assesses hallucination in ob…

    Submitted 6 November, 2024; originally announced November 2024.

    Comments: Poster at https://sites.google.com/berkeley.edu/bb-stat/home

  10. arXiv:2410.14997  [pdf, other]

    cs.SD cs.AI eess.AS

    Improving Pronunciation and Accent Conversion through Knowledge Distillation And Synthetic Ground-Truth from Native TTS

    Authors: Tuan Nam Nguyen, Seymanur Akti, Ngoc Quan Pham, Alexander Waibel

    Abstract: Previous approaches to accent conversion (AC) mainly aimed at making non-native speech sound more native while maintaining the original content and speaker identity. However, non-native speakers sometimes have pronunciation issues, which can make it difficult for listeners to understand them. Hence, we developed a new AC approach that not only focuses on accent conversion but also improves pronunc…

    Submitted 4 March, 2025; v1 submitted 19 October, 2024; originally announced October 2024.

    Comments: Accepted at ICASSP 2025

  11. arXiv:2410.08229  [pdf, other]

    cs.CV cs.NE eess.IV

    Improvement of Spiking Neural Network with Bit Planes and Color Models

    Authors: Nhan T. Luu, Duong T. Luu, Nam N. Pham, Thang C. Truong

    Abstract: Spiking neural networks (SNNs) have emerged as a promising paradigm in computational neuroscience and artificial intelligence, offering advantages such as low energy consumption and small memory footprint. However, their practical adoption is constrained by several challenges, prominent among them being performance optimization. In this study, we present a novel approach to enhance the performance…

    Submitted 8 November, 2024; v1 submitted 28 September, 2024; originally announced October 2024.

  12. arXiv:2410.06423  [pdf, other]

    cs.LG cs.AI

    FAIREDU: A Multiple Regression-Based Method for Enhancing Fairness in Machine Learning Models for Educational Applications

    Authors: Nga Pham, Minh Kha Do, Tran Vu Dai, Pham Ngoc Hung, Anh Nguyen-Duc

    Abstract: Fairness in artificial intelligence and machine learning (AI/ML) models is becoming critically important, especially as decisions made by these systems impact diverse groups. In education, a vital sector for all countries, the widespread application of AI/ML systems raises specific concerns regarding fairness. Current research predominantly focuses on fairness for individual sensitive features, wh…

    Submitted 8 October, 2024; originally announced October 2024.

  13. arXiv:2410.03734  [pdf, other]

    cs.SD cs.CL eess.AS

    Accent conversion using discrete units with parallel data synthesized from controllable accented TTS

    Authors: Tuan Nam Nguyen, Ngoc Quan Pham, Alexander Waibel

    Abstract: The goal of accent conversion (AC) is to convert speech accents while preserving content and speaker identity. Previous methods either required reference utterances during inference, did not preserve speaker identity well, or used one-to-one systems that could only be trained for each non-native accent. This paper presents a promising AC model that can convert many accents into native to overcome…

    Submitted 30 September, 2024; originally announced October 2024.

    Comments: Accepted at Syndata4genAI

  14. arXiv:2409.04415  [pdf, other]

    cs.AI

    Improved Parallel Algorithm for Non-Monotone Submodular Maximization under Knapsack Constraint

    Authors: Tan D. Tran, Canh V. Pham, Dung T. K. Ha, Phuong N. H. Pham

    Abstract: This work proposes an efficient parallel algorithm for non-monotone submodular maximization under a knapsack constraint over a ground set of size $n$. Our algorithm improves the best approximation factor of the existing parallel one from $8+\epsilon$ to $7+\epsilon$ with $O(\log n)$ adaptive complexity. The key idea of our approach is to create a new alternate threshold algorithmic framework. This s…

    Submitted 6 September, 2024; originally announced September 2024.

    Comments: In Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence (IJCAI), Main Track

  15. arXiv:2408.13850  [pdf, other]

    cs.LG cs.AI

    Condensed Sample-Guided Model Inversion for Knowledge Distillation

    Authors: Kuluhan Binici, Shivam Aggarwal, Cihan Acar, Nam Trung Pham, Karianto Leman, Gim Hee Lee, Tulika Mitra

    Abstract: Knowledge distillation (KD) is a key element in neural network compression that allows knowledge transfer from a pre-trained teacher model to a more compact student model. KD relies on access to the training dataset, which may not always be fully available due to privacy concerns or logistical issues related to the size of the data. To address this, "data-free" KD methods use synthetic data, gener…

    Submitted 25 August, 2024; originally announced August 2024.

  16. arXiv:2408.12480  [pdf, other]

    cs.LG cs.CL

    Vintern-1B: An Efficient Multimodal Large Language Model for Vietnamese

    Authors: Khang T. Doan, Bao G. Huynh, Dung T. Hoang, Thuc D. Pham, Nhat H. Pham, Quan T. M. Nguyen, Bang Q. Vo, Suong N. Hoang

    Abstract: In this report, we introduce Vintern-1B, a reliable 1-billion-parameter multimodal large language model (MLLM) for Vietnamese language tasks. By integrating the Qwen2-0.5B-Instruct language model with the InternViT-300M-448px visual model, Vintern-1B is optimized for a range of applications, including optical character recognition (OCR), document extraction, and general question-answering in Viet…

    Submitted 23 August, 2024; v1 submitted 22 August, 2024; originally announced August 2024.

  17. arXiv:2408.02290  [pdf, other]

    cs.CL

    Decoupled Vocabulary Learning Enables Zero-Shot Translation from Unseen Languages

    Authors: Carlos Mullov, Ngoc-Quan Pham, Alexander Waibel

    Abstract: Multilingual neural machine translation systems learn to map sentences of different languages into a common representation space. Intuitively, with a growing number of seen languages the encoder sentence representation grows more flexible and easily adaptable to new languages. In this work, we test this hypothesis by zero-shot translating from unseen languages. To deal with unknown vocabularies fr…

    Submitted 5 August, 2024; originally announced August 2024.

    Comments: Accepted to ACL 2024

  18. Segment-Based Test Case Prioritization: A Multi-objective Approach

    Authors: Hieu Huynh, Nhu Pham, Tien N. Nguyen, Vu Nguyen

    Abstract: Regression testing of software is a crucial but time-consuming task, especially in the context of user interface (UI) testing where multiple microservices must be validated simultaneously. Test case prioritization (TCP) is a cost-efficient solution to address this by scheduling test cases in an execution order that maximizes an objective function, generally aimed at increasing the fault detection…

    Submitted 1 August, 2024; originally announced August 2024.

    Comments: ISSTA 2024

  19. arXiv:2406.16777  [pdf, other]

    cs.CL cs.AI

    Blending LLMs into Cascaded Speech Translation: KIT's Offline Speech Translation System for IWSLT 2024

    Authors: Sai Koneru, Thai-Binh Nguyen, Ngoc-Quan Pham, Danni Liu, Zhaolin Li, Alexander Waibel, Jan Niehues

    Abstract: Large Language Models (LLMs) are currently under exploration for various tasks, including Automatic Speech Recognition (ASR), Machine Translation (MT), and even End-to-End Speech Translation (ST). In this paper, we present KIT's offline submission in the constrained + LLM track by incorporating recently proposed techniques that can be added to any cascaded speech translation. Specifically, we inte…

    Submitted 24 June, 2024; originally announced June 2024.

  20. arXiv:2402.15679  [pdf, ps, other]

    cs.LG cs.CV

    Scalable Density-based Clustering with Random Projections

    Authors: Haochuan Xu, Ninh Pham

    Abstract: We present sDBSCAN, a scalable density-based clustering algorithm in high dimensions with cosine distance. Utilizing the neighborhood-preserving property of random projections, sDBSCAN can quickly identify core points and their neighborhoods, the primary hurdle of density-based clustering. Theoretically, sDBSCAN outputs a clustering structure similar to DBSCAN under mild conditions with high proba…

    Submitted 23 February, 2024; originally announced February 2024.

  21. arXiv:2402.09264  [pdf, other]

    cs.LG cs.HC

    UR2M: Uncertainty and Resource-Aware Event Detection on Microcontrollers

    Authors: Hong Jia, Young D. Kwon, Dong Ma, Nhat Pham, Lorena Qendro, Tam Vu, Cecilia Mascolo

    Abstract: Traditional machine learning techniques are prone to generating inaccurate predictions when confronted with shifts in the distribution of data between the training and testing phases. This vulnerability can lead to severe consequences, especially in applications such as mobile healthcare. Uncertainty estimation has the potential to mitigate this issue by assessing the reliability of a model's outp…

    Submitted 12 March, 2024; v1 submitted 14 February, 2024; originally announced February 2024.

  22. arXiv:2401.11487  [pdf, other]

    cs.CL cs.CY

    Towards Better Inclusivity: A Diverse Tweet Corpus of English Varieties

    Authors: Nhi Pham, Lachlan Pham, Adam L. Meyers

    Abstract: The prevalence of social media presents a growing opportunity to collect and analyse examples of English varieties. Whilst these varieties were, and in many cases still are, used only in spoken contexts or hard-to-access private messages, social media sites like Twitter provide a platform for users to communicate informally in a scrapeable format. Notably, Indian English (Hinglish), Sin…

    Submitted 21 January, 2024; originally announced January 2024.

    Comments: 10 pages (including limitations, references and appendices), 2 figures

  23. arXiv:2401.05425  [pdf]

    eess.SP cs.LG

    An Unobtrusive and Lightweight Ear-worn System for Continuous Epileptic Seizure Detection

    Authors: Abdul Aziz, Nhat Pham, Neel Vora, Cody Reynolds, Jaime Lehnen, Pooja Venkatesh, Zhuoran Yao, Jay Harvey, Tam Vu, Kan Ding, Phuc Nguyen

    Abstract: Epilepsy is one of the most common neurological diseases globally (around 50 million people worldwide). Fortunately, up to 70% of people with epilepsy could live seizure-free if properly diagnosed and treated, and a reliable technique to monitor the onset of seizures could improve the quality of life of patients who are constantly facing the fear of random seizure attacks. The scalp-based EEG test…

    Submitted 24 October, 2024; v1 submitted 1 January, 2024; originally announced January 2024.

  24. arXiv:2401.01108  [pdf, other]

    cs.CL

    Unveiling Comparative Sentiments in Vietnamese Product Reviews: A Sequential Classification Framework

    Authors: Ha Le, Bao Tran, Phuong Le, Tan Nguyen, Dac Nguyen, Ngoan Pham, Dang Huynh

    Abstract: Comparative opinion mining is a specialized field of sentiment analysis that aims to identify and extract sentiments expressed comparatively. To address this task, we propose an approach that consists of solving three sequential sub-tasks: (i) identifying comparative sentences, i.e., whether a sentence has a comparative meaning, (ii) extracting comparative elements, i.e., what are comparison subjects, o…

    Submitted 2 January, 2024; originally announced January 2024.

    Comments: Accepted manuscript at VLSP 2023

  25. arXiv:2312.09877  [pdf, other]

    cs.LG cs.AI cs.DC stat.ML

    Distributed Learning of Mixtures of Experts

    Authors: Faïcel Chamroukhi, Nhat Thien Pham

    Abstract: In modern machine learning problems, we deal with datasets that are either distributed by nature or potentially large, for which distributing the computations is usually a standard way to proceed, since centralized algorithms are in general ineffective. We propose a distributed learning approach for mixtures of experts (MoE) models with an aggregation strategy to construct a reduction estimator from…

    Submitted 15 December, 2023; originally announced December 2023.

  26. arXiv:2311.11096  [pdf, other]

    eess.IV cs.CV

    On the Out of Distribution Robustness of Foundation Models in Medical Image Segmentation

    Authors: Duy Minh Ho Nguyen, Tan Ngoc Pham, Nghiem Tuong Diep, Nghi Quoc Phan, Quang Pham, Vinh Tong, Binh T. Nguyen, Ngan Hoang Le, Nhat Ho, Pengtao Xie, Daniel Sonntag, Mathias Niepert

    Abstract: Constructing a robust model that can effectively generalize to test samples under distribution shifts remains a significant challenge in the field of medical imaging. The foundational models for vision and language, pre-trained on extensive sets of natural image and text data, have emerged as a promising approach. They showcase impressive learning abilities across different tasks with the need for…

    Submitted 18 November, 2023; originally announced November 2023.

    Comments: Advances in Neural Information Processing Systems (NeurIPS) 2023, Workshop on robustness of zero/few-shot learning in foundation models

  27. arXiv:2310.14434  [pdf, other]

    cs.CR

    Enhancing Accuracy-Privacy Trade-off in Differentially Private Split Learning

    Authors: Ngoc Duy Pham, Khoa Tran Phan, Naveen Chilamkurti

    Abstract: Split learning (SL) aims to protect user data privacy by distributing deep models between client-server and keeping private data locally. Only processed or 'smashed' data can be transmitted from the clients to the server during the SL process. However, recently proposed model inversion attacks can recover the original data from the smashed data. In order to enhance privacy protection against such…

    Submitted 15 October, 2024; v1 submitted 22 October, 2023; originally announced October 2023.

  28. arXiv:2309.11506  [pdf, other]

    cs.IR cs.AI cs.CL

    Matching Table Metadata with Business Glossaries Using Large Language Models

    Authors: Elita Lobo, Oktie Hassanzadeh, Nhan Pham, Nandana Mihindukulasooriya, Dharmashankar Subramanian, Horst Samulowitz

    Abstract: Enterprises often own large collections of structured data in the form of large databases or an enterprise data lake. Such data collections come with limited metadata and strict access policies that could limit access to the data contents and, therefore, limit the application of classic retrieval and analysis solutions. As a result, there is a need for solutions that can effectively utilize the av…

    Submitted 7 September, 2023; originally announced September 2023.

    Comments: This paper is a work in progress with findings based on limited evidence. Please exercise discretion when interpreting the findings

  29. arXiv:2308.03415  [pdf, other]

    cs.CL cs.AI

    End-to-End Evaluation for Low-Latency Simultaneous Speech Translation

    Authors: Christian Huber, Tu Anh Dinh, Carlos Mullov, Ngoc Quan Pham, Thai Binh Nguyen, Fabian Retkowski, Stefan Constantin, Enes Yavuz Ugan, Danni Liu, Zhaolin Li, Sai Koneru, Jan Niehues, Alexander Waibel

    Abstract: The challenge of low-latency speech translation has recently drawn significant interest in the research community as shown by several publications and shared tasks. Therefore, it is essential to evaluate these different approaches in realistic scenarios. However, currently only specific aspects of the systems are evaluated and often it is not possible to compare different approaches. In this work…

    Submitted 17 July, 2024; v1 submitted 7 August, 2023; originally announced August 2023.

    Comments: Demo paper at EMNLP 2023

  30. arXiv:2306.11925  [pdf, other]

    cs.CV

    LVM-Med: Learning Large-Scale Self-Supervised Vision Models for Medical Imaging via Second-order Graph Matching

    Authors: Duy M. H. Nguyen, Hoang Nguyen, Nghiem T. Diep, Tan N. Pham, Tri Cao, Binh T. Nguyen, Paul Swoboda, Nhat Ho, Shadi Albarqouni, Pengtao Xie, Daniel Sonntag, Mathias Niepert

    Abstract: Obtaining large pre-trained models that can be fine-tuned to new tasks with limited annotated samples has remained an open challenge for medical imaging data. While pre-trained deep networks on ImageNet and vision-language foundation models trained on web-scale data are prevailing approaches, their effectiveness on medical tasks is limited due to the significant domain shift between natural and me…

    Submitted 18 November, 2023; v1 submitted 20 June, 2023; originally announced June 2023.

    Comments: Accepted at NeurIPS 2023

  31. arXiv:2306.05320  [pdf, other]

    cs.CL cs.SD

    KIT's Multilingual Speech Translation System for IWSLT 2023

    Authors: Danni Liu, Thai Binh Nguyen, Sai Koneru, Enes Yavuz Ugan, Ngoc-Quan Pham, Tuan-Nam Nguyen, Tu Anh Dinh, Carlos Mullov, Alexander Waibel, Jan Niehues

    Abstract: Many existing speech translation benchmarks focus on native-English speech in high-quality recording conditions, which often do not match the conditions in real-life use-cases. In this paper, we describe our speech translation system for the multilingual track of IWSLT 2023, which evaluates translation quality on scientific conference talks. The test condition features accented input speech and te…

    Submitted 12 July, 2023; v1 submitted 8 June, 2023; originally announced June 2023.

    Comments: IWSLT 2023

  32. arXiv:2305.06044  [pdf, other]

    cs.LG stat.ML

    Correlation visualization under missing values: a comparison between imputation and direct parameter estimation methods

    Authors: Nhat-Hao Pham, Khanh-Linh Vo, Mai Anh Vu, Thu Nguyen, Michael A. Riegler, Pål Halvorsen, Binh T. Nguyen

    Abstract: Correlation matrix visualization is essential for understanding the relationships between variables in a dataset, but missing data can pose a significant challenge in estimating correlation coefficients. In this paper, we compare the effects of various missing data methods on the correlation plot, focusing on two common missing patterns: random and monotone. We aim to provide practical strategies…

    Submitted 5 September, 2023; v1 submitted 10 May, 2023; originally announced May 2023.

  33. arXiv:2304.08252  [pdf, other]

    cs.RO

    PaaS: Planning as a Service for reactive driving in CARLA Leaderboard

    Authors: Nhat Hao Truong, Huu Thien Mai, Tuan Anh Tran, Minh Quang Tran, Duc Duy Nguyen, Ngoc Viet Phuong Pham

    Abstract: End-to-end deep learning approaches have been proven to be efficient in autonomous driving and robotics. By using deep learning techniques for decision-making, those systems are often referred to as a black box, and the result is driven by data. In this paper, we propose PaaS (Planning as a Service), a vanilla module to generate local trajectory planning for autonomous driving in CARLA simulation.…

    Submitted 14 June, 2023; v1 submitted 17 April, 2023; originally announced April 2023.

    Comments: Accepted on 05.06.2023, revised on 15.06.2023, to be published at ICSSE 2023

  34. arXiv:2301.10439  [pdf, other]

    cs.CL cs.LG

    ViDeBERTa: A powerful pre-trained language model for Vietnamese

    Authors: Cong Dao Tran, Nhut Huy Pham, Anh Nguyen, Truong Son Hy, Tu Vu

    Abstract: This paper presents ViDeBERTa, a new pre-trained monolingual language model for Vietnamese, with three versions - ViDeBERTa_xsmall, ViDeBERTa_base, and ViDeBERTa_large, which are pre-trained on a large-scale corpus of high-quality and diverse Vietnamese texts using the DeBERTa architecture. Although many successful pre-trained language models based on Transformer have been widely proposed for the Engl…

    Submitted 10 February, 2023; v1 submitted 25 January, 2023; originally announced January 2023.

  35. arXiv:2212.00250  [pdf, other]

    cs.CR cs.DC

    Split Learning without Local Weight Sharing to Enhance Client-side Data Privacy

    Authors: Ngoc Duy Pham, Tran Khoa Phan, Alsharif Abuadbba, Yansong Gao, Doan Nguyen, Naveen Chilamkurti

    Abstract: Split learning (SL) aims to protect user data privacy by distributing deep models between client-server and keeping private data locally. In SL training with multiple clients, the local model weights are shared among the clients for local model update. This paper first reveals data privacy leakage exacerbated by local weight sharing among the clients in SL through model inversion attacks. Then,…

    Submitted 21 July, 2024; v1 submitted 30 November, 2022; originally announced December 2022.

  36. arXiv:2211.11703  [pdf, other]

    cs.CL cs.SD eess.AS

    Towards continually learning new languages

    Authors: Ngoc-Quan Pham, Jan Niehues, Alexander Waibel

    Abstract: Multilingual speech recognition with neural networks is often implemented with batch-learning, when all of the languages are available before training. An ability to add new languages after the prior training sessions can be economically beneficial, but the main challenge is catastrophic forgetting. In this work, we combine the qualities of weight factorization and elastic weight consolidation in…

    Submitted 17 July, 2024; v1 submitted 21 November, 2022; originally announced November 2022.

    Comments: Work in progress

  37. arXiv:2209.14494  [pdf, other]

    cs.CL

    Multi-stage Information Retrieval for Vietnamese Legal Texts

    Authors: Nhat-Minh Pham, Ha-Thanh Nguyen, Trong-Hop Do

    Abstract: This study deals with the problem of information retrieval (IR) for Vietnamese legal texts. Despite being well researched in many languages, information retrieval has still not received much attention from the Vietnamese research community. This is especially true for the case of legal documents, which are hard to process. This study proposes a new approach for information retrieval for Vietnamese…

    Submitted 11 November, 2022; v1 submitted 28 September, 2022; originally announced September 2022.

    Comments: Presented at PKAW 2022 (arXiv:2211.03888)

    Report number: PKAW/2022/01

  38. arXiv:2209.09649  [pdf, other]

    q-fin.ST cs.LG

    Predicting Mutual Funds' Performance using Deep Learning and Ensemble Techniques

    Authors: Nghia Chu, Binh Dao, Nga Pham, Huy Nguyen, Hien Tran

    Abstract: Predicting fund performance is beneficial to both investors and fund managers, and yet is a challenging task. In this paper, we have tested whether deep learning models can predict fund performance more accurately than traditional statistical techniques. Fund performance is typically evaluated by the Sharpe ratio, which represents the risk-adjusted performance to ensure meaningful comparability ac…

    Submitted 31 July, 2023; v1 submitted 18 September, 2022; originally announced September 2022.

    Comments: 16 pages, 4 figures, 4 tables

  39. vieCap4H-VLSP 2021: Vietnamese Image Captioning for Healthcare Domain using Swin Transformer and Attention-based LSTM

    Authors: Thanh Tin Nguyen, Long H. Nguyen, Nhat Truong Pham, Liu Tai Nguyen, Van Huong Do, Hai Nguyen, Ngoc Duy Nguyen

    Abstract: This study presents our approach to automatic Vietnamese image captioning for the healthcare domain in the text processing tasks of the Vietnamese Language and Speech Processing (VLSP) Challenge 2021, as shown in Figure 1. In recent years, image captioning often employs a convolutional neural network-based architecture as an encoder and a long short-term memory (LSTM) as a decoder to generate sentences. T…

    Submitted 2 September, 2022; originally announced September 2022.

    Comments: Accepted for publication in the VNU Journal of Science: Computer Science and Communication Engineering

    Journal ref: VNU Journal of Science: Computer Science and Communication Engineering, 38(2), 2022

  40. arXiv:2206.04864  [pdf, other]

    cs.LG cs.CR

    Binarizing Split Learning for Data Privacy Enhancement and Computation Reduction

    Authors: Ngoc Duy Pham, Alsharif Abuadbba, Yansong Gao, Tran Khoa Phan, Naveen Chilamkurti

    Abstract: Split learning (SL) enables data privacy preservation by allowing clients to collaboratively train a deep learning model with the server without sharing raw data. However, SL still has limitations such as potential data privacy leakage and high computation at clients. In this study, we propose to binarize the SL local layers for faster computation (up to 17.5 times less forward-propagation time in…

    Submitted 10 June, 2022; originally announced June 2022.

  41. arXiv:2206.01382  [pdf, ps, other]

    cs.DS cs.CV

    Falconn++: A Locality-sensitive Filtering Approach for Approximate Nearest Neighbor Search

    Authors: Ninh Pham, Tao Liu

    Abstract: We present Falconn++, a novel locality-sensitive filtering approach for approximate nearest neighbor search on angular distance. Falconn++ can filter out potential far away points in any hash bucket before querying, which results in higher quality candidates compared to other hashing-based solutions. Theoretically, Falconn++ asymptotically achieves lower query time complexity than Falconn…

    Submitted 22 October, 2022; v1 submitted 3 June, 2022; originally announced June 2022.

    Comments: To appear in NeurIPS 2022

  42. arXiv:2205.12304  [pdf, ps, other]

    cs.CL cs.SD eess.AS

    Adaptive multilingual speech recognition with pretrained models

    Authors: Ngoc-Quan Pham, Alex Waibel, Jan Niehues

    Abstract: Multilingual speech recognition with supervised learning has achieved great results as reflected in recent research. With the development of pretraining methods on audio and text data, it is imperative to transfer the knowledge from unsupervised multilingual models to facilitate recognition, especially in many languages with limited data. Our work investigated the effectiveness of using two pretra…

    Submitted 24 May, 2022; originally announced May 2022.

    Comments: Submitted to INTERSPEECH 2022

  43. arXiv:2202.13934  [pdf, ps, other]

    stat.ML cs.AI cs.LG

    Functional mixture-of-experts for classification

    Authors: Nhat Thien Pham, Faicel Chamroukhi

    Abstract: We develop a mixtures-of-experts (ME) approach to the multiclass classification where the predictors are univariate functions. It consists of a ME model in which both the gating network and the experts network are constructed upon multinomial logistic activation functions with functional inputs. We perform a regularized maximum likelihood estimation in which the coefficient functions enjoy interpr…

    Submitted 28 February, 2022; originally announced February 2022.

    Comments: Submitted to the 53èmes Journées de la Société Française de Statistique

  44. arXiv:2202.03558  [pdf, other]

    cs.LG cs.AI

    Attacking c-MARL More Effectively: A Data Driven Approach

    Authors: Nhan H. Pham, Lam M. Nguyen, Jie Chen, Hoang Thanh Lam, Subhro Das, Tsui-Wei Weng

    Abstract: In recent years, a proliferation of methods has been developed for cooperative multi-agent reinforcement learning (c-MARL). However, the robustness of c-MARL agents against adversarial attacks has rarely been explored. In this paper, we propose to evaluate the robustness of c-MARL agents via a model-based approach, named c-MBA. Our proposed formulation can craft much stronger adversarial state perturb…

    Submitted 10 September, 2023; v1 submitted 7 February, 2022; originally announced February 2022.

  45. arXiv:2202.02249  [pdf, other]

    stat.ME cs.LG stat.CO stat.ML

    Functional Mixtures-of-Experts

    Authors: Faïcel Chamroukhi, Nhat Thien Pham, Van Hà Hoang, Geoffrey J. McLachlan

    Abstract: We consider the statistical analysis of heterogeneous data for prediction in situations where the observations include functions, typically time series. We extend the modeling with Mixtures-of-Experts (ME), as a framework of choice in modeling heterogeneity in data for prediction with vectorial observations, to this functional data analysis context. We first present a new family of ME models, name…

    Submitted 20 December, 2023; v1 submitted 4 February, 2022; originally announced February 2022.

    MSC Class: 62-XX; 62R10 ACM Class: G.3

  46. arXiv:2201.06806  [pdf, ps, other]

    cs.LG

    An Efficient Hashing-based Ensemble Method for Collaborative Outlier Detection

    Authors: Kitty Li, Ninh Pham

    Abstract: In collaborative outlier detection, multiple participants exchange their local detectors trained on decentralized devices without exchanging their own data. A key problem of collaborative outlier detection is efficiently aggregating multiple local detectors to form a global detector without breaching the privacy of participants' data and degrading the detection accuracy. We study locality-sensitiv…

    Submitted 18 January, 2022; originally announced January 2022.

  47. arXiv:2201.03019  [pdf, other]

    cs.LG cs.AI

    Robust and Resource-Efficient Data-Free Knowledge Distillation by Generative Pseudo Replay

    Authors: Kuluhan Binici, Shivam Aggarwal, Nam Trung Pham, Karianto Leman, Tulika Mitra

    Abstract: Data-Free Knowledge Distillation (KD) allows knowledge transfer from a trained neural network (teacher) to a more compact one (student) in the absence of original training data. Existing works use a validation set to monitor the accuracy of the student over real data and report the highest performance throughout the entire process. However, validation data may not be available at distillation time…

    Submitted 29 July, 2024; v1 submitted 9 January, 2022; originally announced January 2022.

    Comments: AAAI Conference on Artificial Intelligence

  48. arXiv:2110.11293  [pdf, other]

    cs.CV

    An Empirical Study on GANs with Margin Cosine Loss and Relativistic Discriminator

    Authors: Cuong V. Nguyen, Tien-Dung Cao, Tram Truong-Huu, Khanh N. Pham, Binh T. Nguyen

    Abstract: Generative Adversarial Networks (GANs) have emerged as useful generative models, which are capable of implicitly learning data distributions of arbitrarily complex dimensions. However, the training of GANs is empirically well-known for being highly unstable and sensitive. The loss functions of both the discriminator and generator concerning their parameters tend to oscillate wildly during training…

    Submitted 21 October, 2021; v1 submitted 21 October, 2021; originally announced October 2021.

    Comments: 16 pages, 5 figures

  49. arXiv:2109.09026  [pdf, other]

    cs.SD cs.HC cs.LG eess.AS

    Hybrid Data Augmentation and Deep Attention-based Dilated Convolutional-Recurrent Neural Networks for Speech Emotion Recognition

    Authors: Nhat Truong Pham, Duc Ngoc Minh Dang, Sy Dzung Nguyen

    Abstract: Speech emotion recognition (SER) has been one of the significant tasks in Human-Computer Interaction (HCI) applications. However, it is hard to choose the optimal features and deal with imbalanced labeled data. In this article, we investigate hybrid data augmentation (HDA) methods to generate and balance data based on traditional and generative adversarial networks (GAN) methods. To evaluate the ef…

    Submitted 18 September, 2021; originally announced September 2021.

    Comments: 12 pages, 16 figures, 6 tables

  50. arXiv:2109.08860   

    cs.GT

    Groups Influence with Minimum Cost in Social Networks

    Authors: Phuong N. H. Pham, Canh V. Pham, Hieu V. Duong, Thanh T. Nguyen, My T. Thai

    Abstract: This paper studies a Group Influence with Minimum cost which aims to find a seed set with smallest cost that can influence all target groups, where each user is associated with a cost and a group is influenced if the total score of the influenced users belonging to the group is at least a certain threshold. As the group-influence function is neither submodular nor supermodular, theoretical bounds…

    Submitted 14 December, 2022; v1 submitted 18 September, 2021; originally announced September 2021.

    Comments: The paper contains some errors