Showing 1–26 of 26 results for author: Ngo, N

Searching in archive cs.
  1. arXiv:2502.05321  [pdf]

    cs.LG

    Using Federated Machine Learning in Predictive Maintenance of Jet Engines

    Authors: Asaph Matheus Barbosa, Thao Vy Nhat Ngo, Elaheh Jafarigol, Theodore B. Trafalis, Emuobosa P. Ojoboh

    Abstract: The goal of this paper is to predict the Remaining Useful Life (RUL) of turbine jet engines using a federated machine learning framework. Federated Learning enables multiple edge devices/nodes or servers to collaboratively train a shared model without sharing sensitive data, thus preserving data privacy and security. By implementing a nonlinear model, the system aims to capture complex relationshi…

    Submitted 7 February, 2025; originally announced February 2025.

  2. arXiv:2501.06545  [pdf, other]

    cs.IT eess.SP

    Energy-Aware Resource Allocation for Energy Harvesting Powered Wireless Sensor Nodes

    Authors: Ngoc M. Ngo, Trung T. Nguyen, Phuc H. Nguyen, Van-Dinh Nguyen

    Abstract: Low harvested energy poses a significant challenge to sustaining continuous communication in energy harvesting (EH)-powered wireless sensor networks. This is mainly due to intermittent and limited power availability from radio frequency signals. In this paper, we introduce a novel energy-aware resource allocation problem aimed at enabling the asynchronous accumulate-then-transmit protocol, offerin…

    Submitted 11 January, 2025; originally announced January 2025.

    Comments: To appear in IEEE Communications Letters

  3. arXiv:2501.00874  [pdf, other]

    cs.CL cs.IR

    LUSIFER: Language Universal Space Integration for Enhanced Multilingual Embeddings with Large Language Models

    Authors: Hieu Man, Nghia Trung Ngo, Viet Dac Lai, Ryan A. Rossi, Franck Dernoncourt, Thien Huu Nguyen

    Abstract: Recent advancements in large language models (LLMs) based embedding models have established new state-of-the-art benchmarks for text embedding tasks, particularly in dense vector-based retrieval. However, these models predominantly focus on English, leaving multilingual embedding capabilities largely unexplored. To address this limitation, we present LUSIFER, a novel zero-shot approach that adapts…

    Submitted 7 May, 2025; v1 submitted 1 January, 2025; originally announced January 2025.

  4. arXiv:2411.11265  [pdf, other]

    cs.LG q-bio.QM

    GROOT: Effective Design of Biological Sequences with Limited Experimental Data

    Authors: Thanh V. T. Tran, Nhat Khang Ngo, Viet Anh Nguyen, Truong Son Hy

    Abstract: Latent space optimization (LSO) is a powerful method for designing discrete, high-dimensional biological sequences that maximize expensive black-box functions, such as wet lab experiments. This is accomplished by learning a latent space from available data and using a surrogate model to guide optimization algorithms toward optimal outputs. However, existing methods struggle when labeled data is li…

    Submitted 17 November, 2024; originally announced November 2024.

  5. arXiv:2411.09213  [pdf, other]

    cs.CL cs.AI cs.IR

    Comprehensive and Practical Evaluation of Retrieval-Augmented Generation Systems for Medical Question Answering

    Authors: Nghia Trung Ngo, Chien Van Nguyen, Franck Dernoncourt, Thien Huu Nguyen

    Abstract: Retrieval-augmented generation (RAG) has emerged as a promising approach to enhance the performance of large language models (LLMs) in knowledge-intensive tasks such as those in the medical domain. However, the sensitive nature of the medical domain necessitates a completely accurate and trustworthy system. While existing RAG benchmarks primarily focus on the standard retrieve-answer setting, they o…

    Submitted 14 November, 2024; originally announced November 2024.

  6. arXiv:2411.08785  [pdf, other]

    cs.CL cs.AI

    Zero-shot Cross-lingual Transfer Learning with Multiple Source and Target Languages for Information Extraction: Language Selection and Adversarial Training

    Authors: Nghia Trung Ngo, Thien Huu Nguyen

    Abstract: The majority of previous research addressing multi-lingual IE is limited to the zero-shot cross-lingual single-transfer (one-to-one) setting, with high-resource languages predominantly as source training data. As a result, these works provide little understanding and benefit for the realistic goal of developing a multi-lingual IE system that can generalize to as many languages as possible. Our stud…

    Submitted 13 November, 2024; originally announced November 2024.

  7. arXiv:2409.19117  [pdf, other]

    cs.LG eess.SP

    Range-aware Positional Encoding via High-order Pretraining: Theory and Practice

    Authors: Viet Anh Nguyen, Nhat Khang Ngo, Truong Son Hy

    Abstract: Unsupervised pre-training on vast amounts of graph data is critical in real-world applications wherein labeled data is limited, such as molecule properties prediction or materials science. Existing approaches pre-train models for specific graph domains, neglecting the inherent connections within networks. This limits their ability to transfer knowledge to various supervised tasks. In this work, we…

    Submitted 27 September, 2024; originally announced September 2024.

  8. arXiv:2408.03402  [pdf, other]

    cs.CL cs.IR

    ULLME: A Unified Framework for Large Language Model Embeddings with Generation-Augmented Learning

    Authors: Hieu Man, Nghia Trung Ngo, Franck Dernoncourt, Thien Huu Nguyen

    Abstract: Large Language Models (LLMs) excel in various natural language processing tasks, but leveraging them for dense passage embedding remains challenging. This is due to their causal attention mechanism and the misalignment between their pre-training objectives and the text ranking tasks. Despite some recent efforts to address these issues, existing frameworks for LLM-based text embeddings have been li…

    Submitted 6 August, 2024; originally announced August 2024.

  9. Private Blockchain-based Procurement and Asset Management System with QR Code

    Authors: Alonel A. Hugo, Gerard Nathaniel C. Ngo

    Abstract: The developed system aims to incorporate a private blockchain technology in the procurement process for the supply office. The procurement process includes the canvassing, purchasing, delivery and inspection of items, inventory, and disposal. The blockchain-based system includes a distributed ledger technology, peer-to-peer network, Proof-of-Authority consensus mechanism, and SHA3-512 cryptographi…

    Submitted 12 July, 2024; originally announced July 2024.

    Journal ref: Hugo, Alonel A.; Ngo, Gerard Nathaniel C. Private Blockchain-based Procurement and Asset Management System with QR Code. International Journal of Computing Sciences Research, [S.l.], v. 8, p. 2971-2983, July 2024

  10. arXiv:2406.05349  [pdf, other]

    cs.CV

    Blurry-Consistency Segmentation Framework with Selective Stacking on Differential Interference Contrast 3D Breast Cancer Spheroid

    Authors: Thanh-Huy Nguyen, Thi Kim Ngan Ngo, Mai Anh Vu, Ting-Yuan Tu

    Abstract: The ability of three-dimensional (3D) spheroid modeling to study the invasive behavior of breast cancer cells has drawn increased attention. The deep learning-based image processing framework is very effective at speeding up the cell morphological analysis process. Out-of-focus photos taken while capturing 3D cells under several z-slices, however, could negatively impact the deep learning model. I…

    Submitted 8 June, 2024; originally announced June 2024.

  11. arXiv:2402.16634  [pdf, other]

    eess.IV cs.CV q-bio.QM

    Boosting Skull-Stripping Performance for Pediatric Brain Images

    Authors: William Kelley, Nathan Ngo, Adrian V. Dalca, Bruce Fischl, Lilla Zöllei, Malte Hoffmann

    Abstract: Skull-stripping is the removal of background and non-brain anatomical features from brain images. While many skull-stripping tools exist, few target pediatric populations. With the emergence of multi-institutional pediatric data acquisition efforts to broaden the understanding of perinatal brain development, it is essential to develop robust and well-tested tools ready for the relevant data proces…

    Submitted 26 February, 2024; originally announced February 2024.

    Comments: 5 pages, 5 figures, 1 table, skull-stripping, brain extraction, newborn, infant, toddler, pediatric MRI, machine learning, accepted by the IEEE International Symposium on Biomedical Imaging

  12. arXiv:2402.04821  [pdf, other]

    cs.LG

    E(3)-Equivariant Mesh Neural Networks

    Authors: Thuan Trang, Nhat Khang Ngo, Daniel Levy, Thieu N. Vo, Siamak Ravanbakhsh, Truong Son Hy

    Abstract: Triangular meshes are widely used to represent three-dimensional objects. As a result, many recent works have addressed the need for geometric deep learning on 3D meshes. However, we observe that the complexities in many of these architectures do not translate to practical performance, and simple deep models for geometric graphs are competitive in practice. Motivated by this observation, we minimall…

    Submitted 18 February, 2024; v1 submitted 7 February, 2024; originally announced February 2024.

  13. arXiv:2309.16685  [pdf, other]

    q-bio.BM cs.LG

    Target-aware Variational Auto-encoders for Ligand Generation with Multimodal Protein Representation Learning

    Authors: Nhat Khang Ngo, Truong Son Hy

    Abstract: Without knowledge of specific pockets, generating ligands based on the global structure of a protein target plays a crucial role in drug discovery as it helps reduce the search space for potential drug-like candidates in the pipeline. However, contemporary methods require optimizing tailored networks for each protein, which is arduous and costly. To address this issue, we introduce TargetVAE, a ta…

    Submitted 2 August, 2023; originally announced September 2023.

  14. arXiv:2309.09400  [pdf, other]

    cs.CL cs.AI

    CulturaX: A Cleaned, Enormous, and Multilingual Dataset for Large Language Models in 167 Languages

    Authors: Thuat Nguyen, Chien Van Nguyen, Viet Dac Lai, Hieu Man, Nghia Trung Ngo, Franck Dernoncourt, Ryan A. Rossi, Thien Huu Nguyen

    Abstract: The driving factors behind the development of large language models (LLMs) with impressive learning capabilities are their colossal model sizes and extensive training datasets. Along with the progress in natural language processing, LLMs have been frequently made accessible to the public to foster deeper investigation and applications. However, when it comes to training datasets for these LLMs, es…

    Submitted 17 September, 2023; originally announced September 2023.

    Comments: Ongoing Work

  15. arXiv:2307.16039  [pdf, other]

    cs.CL cs.LG

    Okapi: Instruction-tuned Large Language Models in Multiple Languages with Reinforcement Learning from Human Feedback

    Authors: Viet Dac Lai, Chien Van Nguyen, Nghia Trung Ngo, Thuat Nguyen, Franck Dernoncourt, Ryan A. Rossi, Thien Huu Nguyen

    Abstract: A key technology for the development of large language models (LLMs) involves instruction tuning that helps align the models' responses with human expectations to realize impressive learning abilities. Two major approaches for instruction tuning are supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF), which are currently applied to produce the best commercia…

    Submitted 1 August, 2023; v1 submitted 29 July, 2023; originally announced July 2023.

  16. arXiv:2304.05613  [pdf, other]

    cs.CL cs.AI

    ChatGPT Beyond English: Towards a Comprehensive Evaluation of Large Language Models in Multilingual Learning

    Authors: Viet Dac Lai, Nghia Trung Ngo, Amir Pouran Ben Veyseh, Hieu Man, Franck Dernoncourt, Trung Bui, Thien Huu Nguyen

    Abstract: Over the last few years, large language models (LLMs) have emerged as the most important breakthroughs in natural language processing (NLP), fundamentally transforming research and development in the field. ChatGPT represents one of the most exciting LLM systems developed recently, showcasing impressive skills for language generation and attracting considerable public attention. Among various exciting ap…

    Submitted 12 April, 2023; originally announced April 2023.

  17. arXiv:2302.08680  [pdf, other]

    cs.LG

    Modeling Polypharmacy and Predicting Drug-Drug Interactions using Deep Generative Models on Multimodal Graphs

    Authors: Nhat Khang Ngo, Truong Son Hy, Risi Kondor

    Abstract: Latent representations of drugs and their targets produced by contemporary graph autoencoder models have proved useful in predicting many types of node-pair interactions on large networks, including drug-drug, drug-target, and target-target interactions. However, most existing approaches model either the node's latent spaces in which node distributions are rigid or do not effectively capture the i…

    Submitted 16 February, 2023; originally announced February 2023.

    Comments: arXiv admin note: substantial text overlap with arXiv:2209.09941

  18. Multiresolution Graph Transformers and Wavelet Positional Encoding for Learning Hierarchical Structures

    Authors: Nhat Khang Ngo, Truong Son Hy, Risi Kondor

    Abstract: Contemporary graph learning algorithms are not well-defined for large molecules since they do not consider the hierarchical interactions among the atoms, which are essential to determine the molecular properties of macromolecules. In this work, we propose Multiresolution Graph Transformers (MGT), the first graph transformer architecture that can learn to represent large molecules at multiple scale…

    Submitted 21 July, 2023; v1 submitted 16 February, 2023; originally announced February 2023.

  19. arXiv:2209.09941  [pdf, other]

    q-bio.BM cs.LG

    Predicting Drug-Drug Interactions using Deep Generative Models on Graphs

    Authors: Nhat Khang Ngo, Truong Son Hy, Risi Kondor

    Abstract: Latent representations of drugs and their targets produced by contemporary graph autoencoder-based models have proved useful in predicting many types of node-pair interactions on large networks, including drug-drug, drug-target, and target-target interactions. However, most existing approaches model the node's latent spaces in which node distributions are rigid and disjoint; these limitations hind…

    Submitted 30 October, 2022; v1 submitted 14 September, 2022; originally announced September 2022.

  20. arXiv:2207.10780  [pdf, ps, other]

    cs.CR

    Cryptographic and Financial Fairness

    Authors: Daniele Friolo, Fabio Massacci, Chan Nam Ngo, Daniele Venturi

    Abstract: A recent trend in multi-party computation is to achieve cryptographic fairness via monetary penalties, i.e. each honest player either obtains the output or receives a compensation in the form of a cryptocurrency. We pioneer another type of fairness, financial fairness, that is closer to the real-world valuation of financial transactions. Intuitively, a penalty protocol is financially fair if the n…

    Submitted 11 August, 2022; v1 submitted 21 July, 2022; originally announced July 2022.

  21. arXiv:2207.04945  [pdf, other]

    cs.CV cs.GR cs.MM

    SHREC'22 Track: Sketch-Based 3D Shape Retrieval in the Wild

    Authors: Jie Qin, Shuaihang Yuan, Jiaxin Chen, Boulbaba Ben Amor, Yi Fang, Nhat Hoang-Xuan, Chi-Bien Chu, Khoi-Nguyen Nguyen-Ngoc, Thien-Tri Cao, Nhat-Khang Ngo, Tuan-Luc Huynh, Hai-Dang Nguyen, Minh-Triet Tran, Haoyang Luo, Jianning Wang, Zheng Zhang, Zihao Xin, Yang Wang, Feng Wang, Ying Tang, Haiqin Chen, Yan Wang, Qunying Zhou, Ji Zhang, Hongyuan Wang

    Abstract: Sketch-based 3D shape retrieval (SBSR) is an important yet challenging task, which has drawn more and more attention in recent years. Existing approaches address the problem in a restricted setting, without appropriately simulating real application scenarios. To mimic the realistic setting, in this track, we adopt large-scale sketches drawn by amateurs of different levels of drawing skills, as wel…

    Submitted 11 July, 2022; originally announced July 2022.

  22. arXiv:2202.08316  [pdf, other]

    cs.CL

    FAMIE: A Fast Active Learning Framework for Multilingual Information Extraction

    Authors: Minh Van Nguyen, Nghia Trung Ngo, Bonan Min, Thien Huu Nguyen

    Abstract: This paper presents FAMIE, a comprehensive and efficient active learning (AL) toolkit for multilingual information extraction. FAMIE is designed to address a fundamental problem in existing AL frameworks where annotators need to wait for a long time between annotation batches due to the time-consuming nature of model training and data selection at each AL iteration. This hinders the engagement, pr…

    Submitted 4 May, 2022; v1 submitted 16 February, 2022; originally announced February 2022.

    Comments: Accepted to NAACL 2022 (System Demonstrations)

  23. arXiv:2011.04349  [pdf, other]

    cs.CV cs.AI

    MAGNeto: An Efficient Deep Learning Method for the Extractive Tags Summarization Problem

    Authors: Hieu Trong Phung, Anh Tuan Vu, Tung Dinh Nguyen, Lam Thanh Do, Giang Nam Ngo, Trung Thanh Tran, Ngoc C. Lê

    Abstract: In this work, we study a new image annotation task named Extractive Tags Summarization (ETS). The goal is to extract important tags from the context lying in an image and its corresponding tags. We adjust some state-of-the-art deep learning models to utilize both visual and textual information. Our proposed solution consists of different widely used blocks like convolutional and self-attention lay…

    Submitted 9 November, 2020; originally announced November 2020.

  24. arXiv:2002.03223  [pdf, other]

    stat.ML cs.LG stat.ME

    Conjoined Dirichlet Process

    Authors: Michelle N. Ngo, Dustin S. Pluta, Alexander N. Ngo, Babak Shahbaba

    Abstract: Biclustering is a class of techniques that simultaneously clusters the rows and columns of a matrix to sort heterogeneous data into homogeneous blocks. Although many algorithms have been proposed to find biclusters, existing methods suffer from the pre-specification of the number of biclusters or place constraints on the model structure. To address these issues, we develop a novel, non-parametric…

    Submitted 8 February, 2020; originally announced February 2020.

  25. arXiv:1910.01842  [pdf, other]

    cs.CV cs.LG stat.ML

    SELF: Learning to Filter Noisy Labels with Self-Ensembling

    Authors: Duc Tam Nguyen, Chaithanya Kumar Mummadi, Thi Phuong Nhung Ngo, Thi Hoai Phuong Nguyen, Laura Beggel, Thomas Brox

    Abstract: Deep neural networks (DNNs) have been shown to over-fit a dataset when being trained with noisy labels for a long enough time. To overcome this problem, we present a simple and effective method, self-ensemble label filtering (SELF), to progressively filter out the wrong labels during training. Our method improves the task performance by gradually allowing supervision only from the potentially non-no…

    Submitted 4 October, 2019; originally announced October 2019.

  26. arXiv:1909.13055  [pdf, other]

    cs.CV cs.LG eess.IV

    DeepUSPS: Deep Robust Unsupervised Saliency Prediction With Self-Supervision

    Authors: Duc Tam Nguyen, Maximilian Dax, Chaithanya Kumar Mummadi, Thi Phuong Nhung Ngo, Thi Hoai Phuong Nguyen, Zhongyu Lou, Thomas Brox

    Abstract: Deep neural network (DNN) based salient object detection in images based on high-quality labels is expensive. Alternative unsupervised approaches rely on careful selection of multiple handcrafted saliency methods to generate noisy pseudo-ground-truth labels. In this work, we propose a two-stage mechanism for robust unsupervised object saliency prediction, where the first stage involves refinement…

    Submitted 15 March, 2021; v1 submitted 28 September, 2019; originally announced September 2019.

    Comments: NeurIPS 2019 (Vancouver, Canada): camera-ready version