
Showing 1–50 of 56 results for author: Nguyen, T M

Searching in archive cs.
  1. arXiv:2506.10371  [pdf, ps, other]

    cs.CV cs.LG

    Revisiting Transformers with Insights from Image Filtering

    Authors: Laziz U. Abdullaev, Maksim Tkachenko, Tan M. Nguyen

    Abstract: The self-attention mechanism, a cornerstone of Transformer-based state-of-the-art deep learning architectures, is largely heuristic-driven and fundamentally challenging to interpret. Establishing a robust theoretical foundation to explain its remarkable success and limitations has therefore become an increasingly prominent focus in recent research. Some notable directions have explored understandi…

    Submitted 12 June, 2025; originally announced June 2025.

    Comments: 12 pages, 6 figures

  2. arXiv:2506.07719  [pdf, ps, other]

    cs.CL

    Multilingual Grammatical Error Annotation: Combining Language-Agnostic Framework with Language-Specific Flexibility

    Authors: Mengyang Qiu, Tran Minh Nguyen, Zihao Huang, Zelong Li, Yang Gu, Qingyu Gao, Siliang Liu, Jungyeul Park

    Abstract: Grammatical Error Correction (GEC) relies on accurate error annotation and evaluation, yet existing frameworks, such as $\texttt{errant}$, face limitations when extended to typologically diverse languages. In this paper, we introduce a standardized, modular framework for multilingual grammatical error annotation. Our approach combines a language-agnostic foundation with structured language-specifi…

    Submitted 9 June, 2025; originally announced June 2025.

    Comments: BEA2025

  3. ViMRHP: A Vietnamese Benchmark Dataset for Multimodal Review Helpfulness Prediction via Human-AI Collaborative Annotation

    Authors: Truc Mai-Thanh Nguyen, Dat Minh Nguyen, Son T. Luu, Kiet Van Nguyen

    Abstract: Multimodal Review Helpfulness Prediction (MRHP) is an essential task in recommender systems, particularly in E-commerce platforms. Determining the helpfulness of user-generated reviews enhances user experience and improves consumer decision-making. However, existing datasets focus predominantly on English and Indonesian, resulting in a lack of linguistic diversity, especially for low-resource lang…

    Submitted 4 July, 2025; v1 submitted 12 May, 2025; originally announced May 2025.

    Comments: Accepted at NLDB 2025

  4. arXiv:2505.02508  [pdf, ps, other]

    stat.ML cs.LG math.ST

    Resolving Memorization in Empirical Diffusion Model for Manifold Data in High-Dimensional Spaces

    Authors: Yang Lyu, Yuchun Qian, Tan Minh Nguyen, Xin T. Tong

    Abstract: Diffusion models is a popular computational tool to generate new data samples. It utilizes a forward diffusion process that add noise to the data distribution and then use a reverse process to remove noises to produce samples from the data distribution. However, when the empirical data distribution consists of $n$ data point, using the empirical diffusion model will necessarily produce one of the…

    Submitted 6 May, 2025; v1 submitted 5 May, 2025; originally announced May 2025.

  5. arXiv:2505.00968  [pdf, ps, other]

    cs.LG cs.AI

    Tree-Sliced Wasserstein Distance with Nonlinear Projection

    Authors: Thanh Tran, Viet-Hoang Tran, Thanh Chu, Trang Pham, Laurent El Ghaoui, Tam Le, Tan M. Nguyen

    Abstract: Tree-Sliced methods have recently emerged as an alternative to the traditional Sliced Wasserstein (SW) distance, replacing one-dimensional lines with tree-based metric spaces and incorporating a splitting mechanism for projecting measures. This approach enhances the ability to capture the topological structures of integration domains in Sliced Optimal Transport while maintaining low computational…

    Submitted 9 June, 2025; v1 submitted 1 May, 2025; originally announced May 2025.

    Comments: Accepted at ICML 2025

  6. arXiv:2504.00977  [pdf, ps, other]

    cs.CL

    Chinese Grammatical Error Correction: A Survey

    Authors: Mengyang Qiu, Qingyu Gao, Linxuan Yang, Yang Gu, Tran Minh Nguyen, Zihao Huang, Jungyeul Park

    Abstract: Chinese Grammatical Error Correction (CGEC) is a critical task in Natural Language Processing, addressing the growing demand for automated writing assistance in both second-language (L2) and native (L1) Chinese writing. While L2 learners struggle with mastering complex grammatical structures, L1 users also benefit from CGEC in academic, professional, and formal contexts where writing precision is…

    Submitted 1 April, 2025; originally announced April 2025.

  7. arXiv:2503.11249  [pdf, other]

    cs.LG cs.AI

    Spherical Tree-Sliced Wasserstein Distance

    Authors: Viet-Hoang Tran, Thanh T. Chu, Khoi N. M. Nguyen, Trang Pham, Tam Le, Tan M. Nguyen

    Abstract: Sliced Optimal Transport (OT) simplifies the OT problem in high-dimensional spaces by projecting supports of input measures onto one-dimensional lines and then exploiting the closed-form expression of the univariate OT to reduce the computational burden of OT. Recently, the Tree-Sliced method has been introduced to replace these lines with more intricate structures, known as tree systems. This app…

    Submitted 20 March, 2025; v1 submitted 14 March, 2025; originally announced March 2025.

  8. arXiv:2503.11144  [pdf, other]

    cs.CL cs.AI cs.CV cs.LG

    MoLEx: Mixture of Layer Experts for Finetuning with Sparse Upcycling

    Authors: Rachel S. Y. Teo, Tan M. Nguyen

    Abstract: Large-scale pre-training of deep models, followed by fine-tuning them, has become the cornerstone of natural language processing (NLP). The prevalence of data coupled with computational resources has led to large models with a considerable number of parameters. While the massive size of these models has led to remarkable success in many NLP tasks, a detriment is the expense required to retrain all…

    Submitted 14 March, 2025; originally announced March 2025.

  9. arXiv:2503.11050  [pdf, other]

    cs.LG cs.AI

    Distance-Based Tree-Sliced Wasserstein Distance

    Authors: Hoang V. Tran, Khoi N. M. Nguyen, Trang Pham, Thanh T. Chu, Tam Le, Tan M. Nguyen

    Abstract: To overcome computational challenges of Optimal Transport (OT), several variants of Sliced Wasserstein (SW) has been developed in the literature. These approaches exploit the closed-form expression of the univariate OT by projecting measures onto (one-dimensional) lines. However, projecting measures onto low-dimensional spaces can lead to a loss of topological information. Tree-Sliced Wasserstein…

    Submitted 13 March, 2025; originally announced March 2025.

  10. arXiv:2503.00687  [pdf, other]

    cs.LG

    Transformer Meets Twicing: Harnessing Unattended Residual Information

    Authors: Laziz Abdullaev, Tan M. Nguyen

    Abstract: Transformer-based deep learning models have achieved state-of-the-art performance across numerous language and vision tasks. While the self-attention mechanism, a core component of transformers, has proven capable of handling complex data patterns, it has been observed that the representational capacity of the attention matrix degrades significantly across transformer layers, thereby hurting its o…

    Submitted 7 March, 2025; v1 submitted 1 March, 2025; originally announced March 2025.

    Comments: 10 pages in the main text. Published at ICLR 2025

  11. arXiv:2502.20525  [pdf, other]

    cs.LG cs.AI

    Revisiting Kernel Attention with Correlated Gaussian Process Representation

    Authors: Long Minh Bui, Tho Tran Huu, Duy Dinh, Tan Minh Nguyen, Trong Nghia Hoang

    Abstract: Transformers have increasingly become the de facto method to model sequential data with state-of-the-art performance. Due to its widespread use, being able to estimate and calibrate its modeling uncertainty is important to understand and design robust transformer models. To achieve this, previous works have used Gaussian processes (GPs) to perform uncertainty calibration for the attention units of…

    Submitted 27 February, 2025; originally announced February 2025.

    Comments: 21 pages, 4 figures

    Journal ref: The 40th Conference on Uncertainty in Artificial Intelligence, 2024

  12. arXiv:2502.18821  [pdf, other]

    cs.LG

    CAMEx: Curvature-aware Merging of Experts

    Authors: Dung V. Nguyen, Minh H. Nguyen, Luc Q. Nguyen, Rachel S. Y. Teo, Tan M. Nguyen, Linh Duy Tran

    Abstract: Existing methods for merging experts during model training and fine-tuning predominantly rely on Euclidean geometry, which assumes a flat parameter space. This assumption can limit the model's generalization ability, especially during the pre-training phase, where the parameter manifold might exhibit more complex curvature. Curvature-aware merging methods typically require additional information a…

    Submitted 3 March, 2025; v1 submitted 25 February, 2025; originally announced February 2025.

    Comments: 10 pages, 5 Figures, 7 Tables. Published at ICLR 2025

  13. arXiv:2502.15315  [pdf, other]

    cs.LG

    Tight Clusters Make Specialized Experts

    Authors: Stefan K. Nielsen, Rachel S. Y. Teo, Laziz U. Abdullaev, Tan M. Nguyen

    Abstract: Sparse Mixture-of-Experts (MoE) architectures have emerged as a promising approach to decoupling model capacity from computational cost. At the core of the MoE model is the router, which learns the underlying clustering structure of the input distribution in order to send input tokens to appropriate experts. However, latent clusters may be unidentifiable in high dimension, which causes slow conver…

    Submitted 1 March, 2025; v1 submitted 21 February, 2025; originally announced February 2025.

  14. arXiv:2411.14765  [pdf, other]

    cs.LG

    An Attention-based Framework for Fair Contrastive Learning

    Authors: Stefan K. Nielsen, Tan M. Nguyen

    Abstract: Contrastive learning has proven instrumental in learning unbiased representations of data, especially in complex environments characterized by high-cardinality and high-dimensional sensitive information. However, existing approaches within this setting require predefined modelling assumptions of bias-causing interactions that limit the model's ability to learn debiased representations. In this wor…

    Submitted 22 November, 2024; originally announced November 2024.

  15. arXiv:2411.04323  [pdf, other]

    cs.LG cond-mat.mtrl-sci

    Efficient Symmetry-Aware Materials Generation via Hierarchical Generative Flow Networks

    Authors: Tri Minh Nguyen, Sherif Abdulkader Tawfik, Truyen Tran, Sunil Gupta, Santu Rana, Svetha Venkatesh

    Abstract: Discovering new solid-state materials requires rapidly exploring the vast space of crystal structures and locating stable regions. Generating stable materials with desired properties and compositions is extremely difficult as we search for very small isolated pockets in the exponentially many possibilities, considering elements from the periodic table and their 3D arrangements in crystal lattices.…

    Submitted 6 November, 2024; originally announced November 2024.

  16. arXiv:2410.14574  [pdf, other]

    cs.LG cs.AI cs.CL cs.CV stat.ML

    MomentumSMoE: Integrating Momentum into Sparse Mixture of Experts

    Authors: Rachel S. Y. Teo, Tan M. Nguyen

    Abstract: Sparse Mixture of Experts (SMoE) has become the key to unlocking unparalleled scalability in deep learning. SMoE has the potential to exponentially increase parameter count while maintaining the efficiency of the model by only activating a small subset of these parameters for a given sample. However, it has been observed that SMoE suffers from unstable training and has difficulty adapting to new d…

    Submitted 18 October, 2024; originally announced October 2024.

    Comments: 10 pages in the main text. Published at NeurIPS 2024. The code is available at https://github.com/rachtsy/MomentumSMoE

  17. arXiv:2410.04692  [pdf, other]

    cs.LG stat.ML

    A Clifford Algebraic Approach to E(n)-Equivariant High-order Graph Neural Networks

    Authors: Viet-Hoang Tran, Thieu N. Vo, Tho Tran Huu, Tan Minh Nguyen

    Abstract: Designing neural network architectures that can handle data symmetry is crucial. This is especially important for geometric graphs whose properties are equivariance under Euclidean transformations. Current equivariant graph neural networks (EGNNs), particularly those using message passing, have a limitation in expressive power. Recent high-order graph neural networks can overcome this limitation,…

    Submitted 13 March, 2025; v1 submitted 6 October, 2024; originally announced October 2024.

  18. arXiv:2410.04213  [pdf, ps, other]

    cs.LG

    Equivariant Polynomial Functional Networks

    Authors: Thieu N. Vo, Viet-Hoang Tran, Tho Tran Huu, An Nguyen The, Thanh Tran, Minh-Khoi Nguyen-Nhat, Duy-Tung Pham, Tan Minh Nguyen

    Abstract: Neural Functional Networks (NFNs) have gained increasing interest due to their wide range of applications, including extracting information from implicit representations of data, editing network weights, and evaluating policies. A key design principle of NFNs is their adherence to the permutation and scaling symmetries inherent in the connectionist structure of the input neural networks. Recent NF…

    Submitted 5 October, 2024; originally announced October 2024.

  19. arXiv:2410.04209  [pdf, other]

    cs.LG

    Equivariant Neural Functional Networks for Transformers

    Authors: Viet-Hoang Tran, Thieu N. Vo, An Nguyen The, Tho Tran Huu, Minh-Khoi Nguyen-Nhat, Thanh Tran, Duy-Tung Pham, Tan Minh Nguyen

    Abstract: This paper systematically explores neural functional networks (NFN) for transformer architectures. NFN are specialized neural networks that treat the weights, gradients, or sparsity patterns of a deep neural network (DNN) as input data and have proven valuable for tasks such as learnable optimizers, implicit data representations, and weight editing. While NFN have been extensively developed for ML…

    Submitted 7 March, 2025; v1 submitted 5 October, 2024; originally announced October 2024.

    Comments: Accepted in ICLR 2025

  20. arXiv:2410.03292  [pdf, other]

    cs.LG

    Demystifying the Token Dynamics of Deep Selective State Space Models

    Authors: Thieu N Vo, Tung D. Pham, Xin T. Tong, Tan Minh Nguyen

    Abstract: Selective state space models (SSM), such as Mamba, have gained prominence for their effectiveness in modeling sequential data. Despite their outstanding empirical performance, a comprehensive theoretical understanding of deep selective SSM remains elusive, hindering their further development and adoption for applications that need high fidelity. In this paper, we investigate the dynamical properti…

    Submitted 7 March, 2025; v1 submitted 4 October, 2024; originally announced October 2024.

    Comments: Accepted at ICLR 2025 (spotlight)

  21. arXiv:2409.11697  [pdf, other]

    cs.LG

    Monomial Matrix Group Equivariant Neural Functional Networks

    Authors: Viet-Hoang Tran, Thieu N. Vo, Tho H. Tran, An T. Nguyen, Tan M. Nguyen

    Abstract: Neural functional networks (NFNs) have recently gained significant attention due to their diverse applications, ranging from predicting network generalization and network editing to classifying implicit neural representation. Previous NFN designs often depend on permutation symmetries in neural networks' weights, which traditionally arise from the unordered arrangement of neurons in hidden layers.…

    Submitted 13 March, 2025; v1 submitted 18 September, 2024; originally announced September 2024.

    Comments: 10 pages in the main text. Published at NeurIPS 2024. The code is available at https://github.com/MathematicalAI-NUS/Monomial-NFN

  22. arXiv:2408.12480  [pdf, other]

    cs.LG cs.CL

    Vintern-1B: An Efficient Multimodal Large Language Model for Vietnamese

    Authors: Khang T. Doan, Bao G. Huynh, Dung T. Hoang, Thuc D. Pham, Nhat H. Pham, Quan T. M. Nguyen, Bang Q. Vo, Suong N. Hoang

    Abstract: In this report, we introduce Vintern-1B, a reliable 1-billion-parameters multimodal large language model (MLLM) for Vietnamese language tasks. By integrating the Qwen2-0.5B-Instruct language model with the InternViT-300M-448px visual model, Vintern-1B is optimized for a range of applications, including optical character recognition (OCR), document extraction, and general question-answering in Viet…

    Submitted 23 August, 2024; v1 submitted 22 August, 2024; originally announced August 2024.

  23. arXiv:2406.13781  [pdf, other]

    cs.LG cs.AI cs.CL cs.CV stat.ML

    A Primal-Dual Framework for Transformers and Neural Networks

    Authors: Tan M. Nguyen, Tam Nguyen, Nhat Ho, Andrea L. Bertozzi, Richard G. Baraniuk, Stanley J. Osher

    Abstract: Self-attention is key to the remarkable success of transformers in sequence modeling tasks including many applications in natural language processing and computer vision. Like neural network layers, these attention mechanisms are often developed by heuristics and experience. To provide a principled framework for constructing attention layers in transformers, we show that the self-attention corresp…

    Submitted 19 June, 2024; originally announced June 2024.

    Comments: Accepted to ICLR 2023, 26 pages, 4 figures, 14 tables

  24. arXiv:2406.13770  [pdf, other]

    cs.LG cs.AI cs.CL cs.CV stat.ML

    Elliptical Attention

    Authors: Stefan K. Nielsen, Laziz U. Abdullaev, Rachel S. Y. Teo, Tan M. Nguyen

    Abstract: Pairwise dot-product self-attention is key to the success of transformers that achieve state-of-the-art performance across a variety of applications in language and vision. This dot-product self-attention computes attention weights among the input tokens using Euclidean distance, which makes the model prone to representation collapse and vulnerable to contaminated samples. In this paper, we propos…

    Submitted 31 October, 2024; v1 submitted 19 June, 2024; originally announced June 2024.

    Comments: 10 pages in the main text. Published at NeurIPS 2024. The code is available at https://github.com/stefvk/Elliptical-Attention

  25. arXiv:2406.13762  [pdf, other]

    cs.LG cs.AI cs.CL cs.CV stat.ML

    Unveiling the Hidden Structure of Self-Attention via Kernel Principal Component Analysis

    Authors: Rachel S. Y. Teo, Tan M. Nguyen

    Abstract: The remarkable success of transformers in sequence modeling tasks, spanning various applications in natural language processing and computer vision, is attributed to the critical role of self-attention. Similar to the development of most deep learning models, the construction of these attention mechanisms relies on heuristics and experience. In our work, we derive self-attention from kernel princi…

    Submitted 30 October, 2024; v1 submitted 19 June, 2024; originally announced June 2024.

    Comments: 10 pages in the main text. Published at NeurIPS 2024. The code is available at https://github.com/rachtsy/KPCA_code

  26. arXiv:2406.13725  [pdf, ps, other]

    cs.LG cs.AI stat.ML

    Tree-Sliced Wasserstein Distance: A Geometric Perspective

    Authors: Viet-Hoang Tran, Trang Pham, Tho Tran, Minh Khoi Nguyen Nhat, Thanh Chu, Tam Le, Tan M. Nguyen

    Abstract: Many variants of Optimal Transport (OT) have been developed to address its heavy computation. Among them, notably, Sliced Wasserstein (SW) is widely used for application domains by projecting the OT problem onto one-dimensional lines, and leveraging the closed-form expression of the univariate OT to reduce the computational burden. However, projecting measures onto low-dimensional spaces can lead…

    Submitted 9 June, 2025; v1 submitted 19 June, 2024; originally announced June 2024.

    Comments: Accepted to ICML 2025

  27. arXiv:2402.15989  [pdf, other]

    cs.AI eess.SY

    PIDformer: Transformer Meets Control Theory

    Authors: Tam Nguyen, César A. Uribe, Tan M. Nguyen, Richard G. Baraniuk

    Abstract: In this work, we address two main shortcomings of transformer architectures: input corruption and rank collapse in their output representation. We unveil self-attention as an autonomous state-space model that inherently promotes smoothness in its solutions, leading to lower-rank outputs and diminished representation capacity. Moreover, the steady-state solution of the model is sensitive to input p…

    Submitted 25 February, 2024; originally announced February 2024.

  28. arXiv:2402.11813  [pdf, other]

    cs.RO cs.AI

    A novel framework for adaptive stress testing of autonomous vehicles in multi-lane roads

    Authors: Linh Trinh, Quang-Hung Luu, Thai M. Nguyen, Hai L. Vu

    Abstract: Stress testing is an approach for evaluating the reliability of systems under extreme conditions which help reveal vulnerable scenarios that standard testing may overlook. Identifying such scenarios is of great importance in autonomous vehicles (AV) and other safety-critical systems. Since failure events are rare, naive random search approaches require a large number of vehicle operation hours to…

    Submitted 19 September, 2024; v1 submitted 18 February, 2024; originally announced February 2024.

  29. arXiv:2401.08613  [pdf, other]

    cs.NI

    Digital Infrastructure for Connected and Automated Vehicles

    Authors: Quang-Hung Luu, Thai M. Nguyen, Nan Zheng, Hai L. Vu

    Abstract: Connected and automated vehicles (CAV) are expected to deliver a much safer, more efficient, and eco-friendlier mobility. Being an indispensable component of the future transportation, their key driving features of CAVs include not only the automated functionality but also the cooperative capability. Despite the CAVs themselves are emerging and active research areas, there is a lack of a comprehen…

    Submitted 30 November, 2023; originally announced January 2024.

    Comments: 24 pages, 2 figures, 1 table

  30. arXiv:2312.00751  [pdf, other]

    cs.CL cs.AI

    Mitigating Over-smoothing in Transformers via Regularized Nonlocal Functionals

    Authors: Tam Nguyen, Tan M. Nguyen, Richard G. Baraniuk

    Abstract: Transformers have achieved remarkable success in a wide range of natural language processing and computer vision applications. However, the representation capacity of a deep transformer model is degraded due to the over-smoothing issue in which the token representations become identical when the model's depth grows. In this work, we show that self-attention layers in transformers minimize a functi…

    Submitted 1 December, 2023; originally announced December 2023.

    Comments: 24 pages

  31. arXiv:2311.03260  [pdf, other]

    cs.LG cs.AI stat.ML

    From Coupled Oscillators to Graph Neural Networks: Reducing Over-smoothing via a Kuramoto Model-based Approach

    Authors: Tuan Nguyen, Hirotada Honda, Takashi Sano, Vinh Nguyen, Shugo Nakamura, Tan M. Nguyen

    Abstract: We propose the Kuramoto Graph Neural Network (KuramotoGNN), a novel class of continuous-depth graph neural networks (GNNs) that employs the Kuramoto model to mitigate the over-smoothing phenomenon, in which node features in GNNs become indistinguishable as the number of layers increases. The Kuramoto model captures the synchronization behavior of non-linear coupled oscillators. Under the view of c…

    Submitted 5 March, 2024; v1 submitted 6 November, 2023; originally announced November 2023.

  32. arXiv:2311.03235  [pdf, other]

    cs.LG cs.CL stat.ML

    p-Laplacian Transformer

    Authors: Tuan Nguyen, Tam Nguyen, Vinh Nguyen, Tan M. Nguyen

    Abstract: $p$-Laplacian regularization, rooted in graph and image signal processing, introduces a parameter $p$ to control the regularization effect on these data. Smaller values of $p$ promote sparsity and interpretability, while larger values encourage smoother solutions. In this paper, we first show that the self-attention mechanism obtains the minimal Laplacian regularization ($p=2…

    Submitted 6 November, 2023; originally announced November 2023.

  33. A Text-based Approach For Link Prediction on Wikipedia Articles

    Authors: Anh Hoang Tran, Tam Minh Nguyen, Son T. Luu

    Abstract: This paper present our work in the DSAA 2023 Challenge about Link Prediction for Wikipedia Articles. We use traditional machine learning models with POS tags (part-of-speech tags) features extracted from text to train the classification model for predicting whether two nodes has the link. Then, we use these tags to test on various machine learning models. We obtained the results by F1 score at 0.9…

    Submitted 6 November, 2023; v1 submitted 1 September, 2023; originally announced September 2023.

    Comments: Accepted by DSAA 2023 Conference in the DSAA Student Competition Section

  34. arXiv:2307.12522  [pdf, other]

    cs.SE cs.HC

    Automated Mapping of Adaptive App GUIs from Phones to TVs

    Authors: Han Hu, Ruiqi Dong, John Grundy, Thai Minh Nguyen, Huaxiao Liu, Chunyang Chen

    Abstract: With the increasing interconnection of smart devices, users often desire to adopt the same app on quite different devices for identical tasks, such as watching the same movies on both their smartphones and TVs. However, the significant differences in screen size, aspect ratio, and interaction styles make it challenging to adapt Graphical User Interfaces (GUIs) across these devices. Although there…

    Submitted 5 November, 2023; v1 submitted 24 July, 2023; originally announced July 2023.

    Comments: 30 pages, 15 figures

  35. arXiv:2306.06620  [pdf, other]

    cs.SE cs.AI

    ARIST: An Effective API Argument Recommendation Approach

    Authors: Son Nguyen, Cuong Tran Manh, Kien T. Tran, Tan M. Nguyen, Thu-Trang Nguyen, Kien-Tuan Ngo, Hieu Dinh Vo

    Abstract: Learning and remembering to use APIs are difficult. Several techniques have been proposed to assist developers in using APIs. Most existing techniques focus on recommending the right API methods to call, but very few techniques focus on recommending API arguments. In this paper, we propose ARIST, a novel automated argument recommendation approach which suggests arguments by predicting developers'…

    Submitted 11 June, 2023; originally announced June 2023.

  36. arXiv:2211.04454  [pdf, other]

    cs.CL cs.LG

    SLATE: A Sequence Labeling Approach for Task Extraction from Free-form Inked Content

    Authors: Apurva Gandhi, Ryan Serrao, Biyi Fang, Gilbert Antonius, Jenna Hong, Tra My Nguyen, Sheng Yi, Ehi Nosakhare, Irene Shaffer, Soundararajan Srinivasan, Vivek Gupta

    Abstract: We present SLATE, a sequence labeling approach for extracting tasks from free-form content such as digitally handwritten (or "inked") notes on a virtual whiteboard. Our approach allows us to create a single, low-latency model to simultaneously perform sentence segmentation and classification of these sentences into task/non-task sentences. SLATE greatly outperforms a baseline two-model (sentence s…

    Submitted 17 November, 2022; v1 submitted 8 November, 2022; originally announced November 2022.

    Comments: Accepted at EMNLP 2022 as an Industry Track paper

  37. arXiv:2210.05794  [pdf, other]

    cs.LG cs.CL cs.CV

    Designing Robust Transformers using Robust Kernel Density Estimation

    Authors: Xing Han, Tongzheng Ren, Tan Minh Nguyen, Khai Nguyen, Joydeep Ghosh, Nhat Ho

    Abstract: Recent advances in Transformer architectures have empowered their empirical success in a variety of tasks across different domains. However, existing works mainly focus on predictive accuracy and computational cost, without considering other practical issues, such as robustness to contaminated samples. Recent work by Nguyen et al., (2022) has shown that the self-attention mechanism, which is the c…

    Submitted 8 November, 2023; v1 submitted 11 October, 2022; originally announced October 2022.

    Comments: Accepted by NeurIPS 2023 as a poster; 23 pages, 5 figures, 11 tables

  38. arXiv:2202.07096  [pdf, other]

    cs.AI

    Learning to Discover Medicines

    Authors: Tri Minh Nguyen, Thin Nguyen, Truyen Tran

    Abstract: Discovering new medicines is the hallmark of human endeavor to live a better and longer life. Yet the pace of discovery has slowed down as we need to venture into more wildly unexplored biomedical space to find one that matches today's high standard. Modern AI-enabled by powerful computing, large biomedical databases, and breakthroughs in deep learning-offers a new hope to break this loop as AI is…

    Submitted 14 February, 2022; originally announced February 2022.

  39. arXiv:2202.01195  [pdf, other]

    q-bio.BM cs.LG

    Mitigating cold start problems in drug-target affinity prediction with interaction knowledge transferring

    Authors: Tri Minh Nguyen, Thin Nguyen, Truyen Tran

    Abstract: Motivation: Predicting the drug-target interaction is crucial for drug discovery as well as drug repurposing. Machine learning is commonly used in drug-target affinity (DTA) problem. However, machine learning model faces the cold-start problem where the model performance drops when predicting the interaction of a novel drug or target. Previous works try to solve the cold start problem by learning…

    Submitted 16 January, 2022; originally announced February 2022.

  40. arXiv:2110.08678  [pdf, other]

    cs.LG cs.CL stat.ML

    Improving Transformers with Probabilistic Attention Keys

    Authors: Tam Nguyen, Tan M. Nguyen, Dung D. Le, Duy Khuong Nguyen, Viet-Anh Tran, Richard G. Baraniuk, Nhat Ho, Stanley J. Osher

    Abstract: Multi-head attention is a driving force behind state-of-the-art transformers, which achieve remarkable performance across a variety of natural language processing (NLP) and computer vision tasks. It has been observed that for many applications, those attention heads learn redundant embedding, and most of them can be removed without degrading the performance of the model. Inspired by this observati…

    Submitted 12 June, 2022; v1 submitted 16 October, 2021; originally announced October 2021.

    Comments: 27 pages, 16 figures, 10 tables

    Journal ref: Proceedings of the 39th International Conference on Machine Learning, Baltimore, Maryland, USA, PMLR 162, 2022

  41. arXiv:2110.04840  [pdf, other]

    cs.LG cs.AI math.DS math.NA

    Heavy Ball Neural Ordinary Differential Equations

    Authors: Hedi Xia, Vai Suliafu, Hangjie Ji, Tan M. Nguyen, Andrea L. Bertozzi, Stanley J. Osher, Bao Wang

    Abstract: We propose heavy ball neural ordinary differential equations (HBNODEs), leveraging the continuous limit of the classical momentum accelerated gradient descent, to improve neural ODEs (NODEs) training and inference. HBNODEs have two properties that imply practical advantages over NODEs: (i) The adjoint state of an HBNODE also satisfies an HBNODE, accelerating both forward and backward ODE solvers,…

    Submitted 10 October, 2021; originally announced October 2021.

    Comments: 23 pages, 9 figures, Accepted for publication at Advances in Neural Information Processing Systems (NeurIPS) 2021

    MSC Class: 68T07 ACM Class: I.2

  42. arXiv:2109.12777  [pdf, other]

    cs.LG cs.CL

    ReINTEL Challenge 2020: A Comparative Study of Hybrid Deep Neural Network for Reliable Intelligence Identification on Vietnamese SNSs

    Authors: Hoang Viet Trinh, Tung Tien Bui, Tam Minh Nguyen, Huy Quang Dao, Quang Huu Pham, Ngoc N. Tran, Ta Minh Thanh

    Abstract: The overwhelming abundance of data has created a misinformation crisis. Unverified sensationalism that is designed to grab the readers' short attention span, when crafted with malice, has caused irreparable damage to our society's structure. As a result, determining the reliability of an article has become a crucial task. After various ablation studies, we propose a multi-input model that can effe…

    Submitted 26 September, 2021; originally announced September 2021.

    Journal ref: Proceedings of the 7th International Workshop on Vietnamese Language and Speech Processing (VLSP), Hanoi, Vietnam, 2020, pp. 6-12

  43. arXiv:2108.02347  [pdf, other]

    cs.LG cs.AI math.NA

    FMMformer: Efficient and Flexible Transformer via Decomposed Near-field and Far-field Attention

    Authors: Tan M. Nguyen, Vai Suliafu, Stanley J. Osher, Long Chen, Bao Wang

    Abstract: We propose FMMformers, a class of efficient and flexible transformers inspired by the celebrated fast multipole method (FMM) for accelerating interacting particle simulation. FMM decomposes particle-particle interaction into near-field and far-field components and then performs direct and coarse-grained computation, respectively. Similarly, FMMformers decompose the attention into near-field and fa…

    Submitted 4 August, 2021; originally announced August 2021.

    Comments: 18 pages, 8 figures

    MSC Class: 68T07 ACM Class: I.2

  44. arXiv:2104.12255  [pdf, other]

    cs.CR

    0

    Authors: Quan Thoi Minh Nguyen

    Abstract: What is the funniest number in cryptography? 0. The reason is that for all x, x*0 = 0, i.e., the equation is always satisfied no matter what x is. This article discusses crypto bugs in four BLS signatures' libraries (ethereum/py_ecc, supranational/blst, herumi/bls, sigp/milagro_bls) that revolve around 0. Furthermore, we develop "splitting zero" attacks to show a weakness in the proof-of-possessio…

    Submitted 20 April, 2021; originally announced April 2021.

  45. arXiv:2103.12983  [pdf, other]

    cs.AI cs.LG

    Counterfactual Explanation with Multi-Agent Reinforcement Learning for Drug Target Prediction

    Authors: Tri Minh Nguyen, Thomas P Quinn, Thin Nguyen, Truyen Tran

    Abstract: Motivation: Many high-performance DTA models have been proposed, but they are mostly black-box and thus lack human interpretability. Explainable AI (XAI) can make DTA models more trustworthy, and can also enable scientists to distill biological knowledge from the models. Counterfactual explanation is one popular approach to explaining the behaviour of a deep neural network, which works by systemat…

    Submitted 1 June, 2021; v1 submitted 24 March, 2021; originally announced March 2021.

  46. Interpreting the Latent Space of Generative Adversarial Networks using Supervised Learning

    Authors: Toan Pham Van, Tam Minh Nguyen, Ngoc N. Tran, Hoai Viet Nguyen, Linh Bao Doan, Huy Quang Dao, Thanh Ta Minh

    Abstract: With great progress in the development of Generative Adversarial Networks (GANs) in recent years, the quest for insights into understanding and manipulating the latent space of GANs has gained more and more attention due to its wide range of applications. While most research on this task has focused on unsupervised learning methods, which induce difficulties in training and limitation in r…

    Submitted 24 February, 2021; originally announced February 2021.

    Comments: Published in 2020 International Conference on Advanced Computing and Applications (ACOMP)

    Journal ref: 2020 International Conference on Advanced Computing and Applications (ACOMP), Quy Nhon, Vietnam, 2020, pp. 49-54

  47. arXiv:2009.12146  [pdf, other]

    cs.LG stat.ML

    GEFA: Early Fusion Approach in Drug-Target Affinity Prediction

    Authors: Tri Minh Nguyen, Thin Nguyen, Thao Minh Le, Truyen Tran

    Abstract: Predicting the interaction between a compound and a target is crucial for rapid drug repurposing. Deep learning has been successfully applied to the drug-target affinity (DTA) problem. However, previous deep learning-based methods ignore modeling the direct interactions between drug and protein residues. This would lead to inaccurate learning of target representation which may change due to the drug b…

    Submitted 27 September, 2020; v1 submitted 25 September, 2020; originally announced September 2020.

  48. arXiv:2007.06493  [pdf, ps, other]

    cs.CL

    HSD Shared Task in VLSP Campaign 2019: Hate Speech Detection for Social Good

    Authors: Xuan-Son Vu, Thanh Vu, Mai-Vu Tran, Thanh Le-Cong, Huyen T M. Nguyen

    Abstract: The paper describes the organisation of the "HateSpeech Detection" (HSD) task at the VLSP workshop 2019 on detecting the fine-grained presence of hate speech in Vietnamese textual items (i.e., messages) extracted from Facebook, which is the most popular social network site (SNS) in Vietnam. The task is organised as a multi-class classification task and based on a large-scale dataset containing 25,…

    Submitted 13 July, 2020; originally announced July 2020.

  49. arXiv:2006.06919  [pdf, other]

    cs.LG math.DS stat.ML

    MomentumRNN: Integrating Momentum into Recurrent Neural Networks

    Authors: Tan M. Nguyen, Richard G. Baraniuk, Andrea L. Bertozzi, Stanley J. Osher, Bao Wang

    Abstract: Designing deep neural networks is an art that often involves an expensive search over candidate architectures. To overcome this for recurrent neural nets (RNNs), we establish a connection between the hidden state dynamics in an RNN and gradient descent (GD). We then integrate momentum into this framework and propose a new family of RNNs, called MomentumRNNs. We theoretically prove and numeri…

    Submitted 11 October, 2020; v1 submitted 11 June, 2020; originally announced June 2020.

    Comments: 21 pages, 11 figures, Accepted for publication at Advances in Neural Information Processing Systems (NeurIPS) 2020

    MSC Class: 68T07 ACM Class: I.2

    Journal ref: Advances in Neural Information Processing Systems (NeurIPS) 2020

  50. arXiv:2004.01403  [pdf, ps, other]

    cs.CR

    A "Final" Security Bug

    Authors: Quan Thoi Minh Nguyen

    Abstract: This article discusses a fixed critical security bug in Google Tink's Ed25519 Java implementation. The bug allows remote attackers to extract the private key with only two Ed25519 signatures. The vulnerability comes from a misunderstanding of what "final" means in the Java programming language. The bug was discovered during security review before Google Tink was officially released. It reinforces th…

    Submitted 3 April, 2020; originally announced April 2020.