
Showing 1–50 of 51 results for author: Huynh, D

  1. arXiv:2507.04741  [pdf, ps, other]

    cs.CV

    Vision-Language Models Can't See the Obvious

    Authors: Yasser Dahou, Ngoc Dung Huynh, Phuc H. Le-Khac, Wamiq Reyaz Para, Ankit Singh, Sanath Narayan

    Abstract: We present Saliency Benchmark (SalBench), a novel benchmark designed to assess the capability of Large Vision-Language Models (LVLM) in detecting visually salient features that are readily apparent to humans, such as a large circle amidst a grid of smaller ones. This benchmark focuses on low-level features including color, intensity, and orientation, which are fundamental to human visual processin…

    Submitted 7 July, 2025; originally announced July 2025.

  2. arXiv:2505.15123  [pdf, ps, other]

    cs.CV cs.AI

    Seeing the Trees for the Forest: Rethinking Weakly-Supervised Medical Visual Grounding

    Authors: Ta Duc Huy, Duy Anh Huynh, Yutong Xie, Yuankai Qi, Qi Chen, Phi Le Nguyen, Sen Kim Tran, Son Lam Phung, Anton van den Hengel, Zhibin Liao, Minh-Son To, Johan W. Verjans, Vu Minh Hieu Phan

    Abstract: Visual grounding (VG) is the capability to identify the specific regions in an image associated with a particular text description. In medical imaging, VG enhances interpretability by highlighting relevant pathological features corresponding to textual descriptions, improving model transparency and trustworthiness for wider adoption of deep learning models in clinical practice. Current models stru…

    Submitted 21 May, 2025; originally announced May 2025.

    Comments: Under Review

  3. arXiv:2505.03214  [pdf, other]

    cs.SE cs.AI

    DocSpiral: A Platform for Integrated Assistive Document Annotation through Human-in-the-Spiral

    Authors: Qiang Sun, Sirui Li, Tingting Bi, Du Huynh, Mark Reynolds, Yuanyi Luo, Wei Liu

    Abstract: Acquiring structured data from domain-specific, image-based documents such as scanned reports is crucial for many downstream tasks but remains challenging due to document variability. Many of these documents exist as images rather than as machine-readable text, which requires human annotation to train automated extraction systems. We present DocSpiral, the first Human-in-the-Spiral assistive docum…

    Submitted 6 May, 2025; originally announced May 2025.

  4. arXiv:2505.00495  [pdf, other]

    cs.LG cs.PF

    Enhancing Tropical Cyclone Path Forecasting with an Improved Transformer Network

    Authors: Nguyen Van Thanh, Nguyen Dang Huynh, Nguyen Ngoc Tan, Nguyen Thai Minh, Nguyen Nam Hoang

    Abstract: A storm is a type of extreme weather. Therefore, forecasting the path of a storm is extremely important for protecting human life and property. However, storm forecasting is very challenging because storm trajectories frequently change. In this study, we propose an improved deep learning method using a Transformer network to predict the movement trajectory of a storm over the next 6 hours. The sto…

    Submitted 1 May, 2025; originally announced May 2025.

  5. arXiv:2503.24164  [pdf, ps, other]

    cs.MM

    SVLA: A Unified Speech-Vision-Language Assistant with Multimodal Reasoning and Speech Generation

    Authors: Ngoc Dung Huynh, Mohamed Reda Bouadjenek, Imran Razzak, Hakim Hacid, Sunil Aryal

    Abstract: Large vision and language models show strong performance in tasks like image captioning, visual question answering, and retrieval. However, challenges remain in integrating speech, text, and vision into a unified model, especially for spoken tasks. Speech generation methods vary: some produce speech directly, others generate it through text, but the impact of this choice on quality is unclear. Evaluation often relies on…

    Submitted 7 July, 2025; v1 submitted 31 March, 2025; originally announced March 2025.

    Comments: 21 pages

  6. arXiv:2503.10693  [pdf, other]

    cs.CV eess.IV

    Knowledge Consultation for Semi-Supervised Semantic Segmentation

    Authors: Thuan Than, Nhat-Anh Nguyen-Dang, Dung Nguyen, Salwa K. Al Khatib, Ahmed Elhagry, Hai Phan, Yihui He, Zhiqiang Shen, Marios Savvides, Dang Huynh

    Abstract: Semi-Supervised Semantic Segmentation reduces reliance on extensive annotations by using unlabeled data and state-of-the-art models to improve overall performance. Despite the success of deep co-training methods, their underlying mechanisms remain underexplored. This work revisits Cross Pseudo Supervision with dual heterogeneous backbones and introduces Knowledge Consultation (SegKC) to further en…

    Submitted 12 March, 2025; originally announced March 2025.

  7. arXiv:2501.04343  [pdf, other]

    cs.LO cs.AI cs.CL

    TimelineKGQA: A Comprehensive Question-Answer Pair Generator for Temporal Knowledge Graphs

    Authors: Qiang Sun, Sirui Li, Du Huynh, Mark Reynolds, Wei Liu

    Abstract: Question answering over temporal knowledge graphs (TKGs) is crucial for understanding evolving facts and relationships, yet its development is hindered by limited datasets and difficulties in generating custom QA pairs. We propose a novel categorization framework based on timeline-context relationships, along with TimelineKGQA, a universal temporal QA generator applicable to any TKGs. The…

    Submitted 8 January, 2025; originally announced January 2025.

  8. arXiv:2501.03939  [pdf, other]

    cs.CV cs.MM

    Visual question answering: from early developments to recent advances -- a survey

    Authors: Ngoc Dung Huynh, Mohamed Reda Bouadjenek, Sunil Aryal, Imran Razzak, Hakim Hacid

    Abstract: Visual Question Answering (VQA) is an evolving research field aimed at enabling machines to answer questions about visual content by integrating image and language processing techniques such as feature extraction, object detection, text embedding, natural language understanding, and language generation. With the growth of multimodal data research, VQA has gained significant attention due to its br…

    Submitted 11 January, 2025; v1 submitted 7 January, 2025; originally announced January 2025.

    Comments: 20 papers

  9. Serial Scammers and Attack of the Clones: How Scammers Coordinate Multiple Rug Pulls on Decentralized Exchanges

    Authors: Phuong Duy Huynh, Son Hoang Dau, Nicholas Huppert, Joshua Cervenjak, Hoonie Sun, Hong Yen Tran, Xiaodong Li, Emanuele Viterbo

    Abstract: We explored the ubiquitous phenomenon of serial scammers, each of whom deployed dozens to thousands of addresses to conduct a series of similar Rug Pulls on popular decentralized exchanges. We first constructed two datasets of around 384,000 scammer addresses behind all one-day Simple Rug Pulls on Uniswap (Ethereum) and Pancakeswap (BSC), and identified distinctive scam patterns including star, ch…

    Submitted 10 February, 2025; v1 submitted 14 December, 2024; originally announced December 2024.

    Comments: Accepted by WWW'25

  10. arXiv:2410.22648  [pdf, other]

    cs.CV

    SimpsonsVQA: Enhancing Inquiry-Based Learning with a Tailored Dataset

    Authors: Ngoc Dung Huynh, Mohamed Reda Bouadjenek, Sunil Aryal, Imran Razzak, Hakim Hacid

    Abstract: Visual Question Answering (VQA) has emerged as a promising area of research to develop AI-based systems for enabling interactive and immersive learning. Numerous VQA datasets have been introduced to facilitate various tasks, such as answering questions or identifying unanswerable ones. However, most of these datasets are constructed using real-world images, leaving the performance of existing mode…

    Submitted 29 October, 2024; originally announced October 2024.

  11. arXiv:2405.00571  [pdf, other]

    cs.CV cs.AI

    Spherical Linear Interpolation and Text-Anchoring for Zero-shot Composed Image Retrieval

    Authors: Young Kyun Jang, Dat Huynh, Ashish Shah, Wen-Kai Chen, Ser-Nam Lim

    Abstract: Composed Image Retrieval (CIR) is a complex task that retrieves images using a query, which is configured with an image and a caption that describes desired modifications to that image. Supervised CIR approaches have shown strong performance, but their reliance on expensive manually-annotated datasets restricts their scalability and broader applicability. To address these issues, previous studies…

    Submitted 1 May, 2024; originally announced May 2024.

  12. arXiv:2404.15516  [pdf, other]

    cs.CV cs.AI

    Visual Delta Generator with Large Multi-modal Models for Semi-supervised Composed Image Retrieval

    Authors: Young Kyun Jang, Donghyun Kim, Zihang Meng, Dat Huynh, Ser-Nam Lim

    Abstract: Composed Image Retrieval (CIR) is a task that retrieves images similar to a query, based on a provided textual modification. Current techniques rely on supervised learning for CIR models using labeled triplets of the reference image, text, target image. These specific triplets are not as commonly available as simple image-text pairs, limiting the widespread use of CIR and its scalability. On the o…

    Submitted 23 April, 2024; originally announced April 2024.

    Comments: 15 pages

  13. arXiv:2401.01108  [pdf, other]

    cs.CL

    Unveiling Comparative Sentiments in Vietnamese Product Reviews: A Sequential Classification Framework

    Authors: Ha Le, Bao Tran, Phuong Le, Tan Nguyen, Dac Nguyen, Ngoan Pham, Dang Huynh

    Abstract: Comparative opinion mining is a specialized field of sentiment analysis that aims to identify and extract sentiments expressed comparatively. To address this task, we propose an approach that consists of solving three sequential sub-tasks: (i) identifying comparative sentences, i.e., if a sentence has a comparative meaning, (ii) extracting comparative elements, i.e., what are comparison subjects, o…

    Submitted 2 January, 2024; originally announced January 2024.

    Comments: Accepted manuscript at VLSP 2023

  14. The Impact of COVID-19 on Chronic Pain: Multidimensional Clustering Reveals Deep Insights into Spinal Cord Stimulation Patients

    Authors: Sara Berger, Carla Agurto, Guillermo Cecchi, Elif Eyigoz, Brad Hershey, Kristen Lechleiter, NAVITAS/ENVISION studies physician author group, Dat Huynh, Matt McDonald, Jeffrey L Rogers

    Abstract: The emergence of COVID-19 offered a unique opportunity to study chronic pain patients as they responded to sudden changes in social environments, increased community stress, and reduced access to care. We report findings from n=70 Spinal Cord Stimulation (SCS) patients before and during initial pandemic stages resulting from advances in home monitoring and artificial intelligence that produced nov…

    Submitted 2 October, 2023; originally announced October 2023.

    Comments: 9 pages

  15. arXiv:2309.08579  [pdf, ps, other]

    cs.CE

    Polytopal composite finite elements for modeling concrete fracture based on nonlocal damage models

    Authors: Hai D. Huynh, S. Natarajan, H. Nguyen-Xuan, Xiaoying Zhuang

    Abstract: The paper presents an assumed strain formulation over polygonal meshes to accurately evaluate the strain fields in nonlocal damage models. An assumed strain technique based on the Hu-Washizu variational principle is employed to generate a new strain approximation instead of direct derivation from the basis functions and the displacement fields. The underlying idea embedded in arbitrary finite pol…

    Submitted 11 July, 2023; originally announced September 2023.

  16. arXiv:2309.04700  [pdf, other]

    cs.CR

    From Programming Bugs to Multimillion-Dollar Scams: An Analysis of Trapdoor Tokens on Uniswap

    Authors: Phuong Duy Huynh, Thisal De Silva, Son Hoang Dau, Xiaodong Li, Iqbal Gondal, Emanuele Viterbo

    Abstract: We investigate in this work a recently emerged type of scam ERC-20 token called Trapdoor, which has cost investors billions of US dollars on Uniswap, the largest decentralised exchange on Ethereum, from 2020 to 2023. In essence, Trapdoor tokens allow users to buy but prevent them from selling by embedding logical bugs and/or owner-only features in their smart contracts. By manually inspecting a…

    Submitted 19 December, 2024; v1 submitted 9 September, 2023; originally announced September 2023.

    Comments: 22 pages, 11 figures

  17. arXiv:2309.03918  [pdf, other]

    cs.AI cs.CY cs.LG

    A recommender for the management of chronic pain in patients undergoing spinal cord stimulation

    Authors: Tigran Tchrakian, Mykhaylo Zayats, Alessandra Pascale, Dat Huynh, Pritish Parida, Carla Agurto Rios, Sergiy Zhuk, Jeffrey L. Rogers, ENVISION Studies Physician Author Group, Boston Scientific Research Scientists Consortium

    Abstract: Spinal cord stimulation (SCS) is a therapeutic approach used for the management of chronic pain. It involves the delivery of electrical impulses to the spinal cord via an implanted device, which when given suitable stimulus parameters can mask or block pain signals. Selection of optimal stimulation parameters usually happens in the clinic under the care of a provider whereas at-home SCS optimizati…

    Submitted 6 September, 2023; originally announced September 2023.

  18. arXiv:2308.16391  [pdf, other]

    cs.CR cs.CE cs.LG q-fin.ST

    Improving the Accuracy of Transaction-Based Ponzi Detection on Ethereum

    Authors: Phuong Duy Huynh, Son Hoang Dau, Xiaodong Li, Phuc Luong, Emanuele Viterbo

    Abstract: The Ponzi scheme, an old-fashioned fraud, is now popular on the Ethereum blockchain, causing considerable financial losses to many crypto investors. A few Ponzi detection methods have been proposed in the literature, most of which detect a Ponzi scheme based on its smart contract source code. This contract-code-based approach, while achieving very high accuracy, is not robust because a Ponzi devel…

    Submitted 17 July, 2024; v1 submitted 30 August, 2023; originally announced August 2023.

    Comments: 17 pages, 9 figures, 4 tables

  19. arXiv:2307.03871  [pdf]

    cs.CV

    HUMS2023 Data Challenge Result Submission

    Authors: Dhiraj Neupane, Lakpa Dorje Tamang, Ngoc Dung Huynh, Mohamed Reda Bouadjenek, Sunil Aryal

    Abstract: We implemented a simple method for early detection in this research. The implemented methods are plotting the given mat files and analyzing scalogram images generated by performing Continuous Wavelet Transform (CWT) on the samples. Finding the mean, standard deviation (STD), and peak-to-peak (P2P) values from each signal also helped detect faulty signs. We have implemented the autoregressive…

    Submitted 14 July, 2023; v1 submitted 7 July, 2023; originally announced July 2023.

    Comments: This report is being submitted as part of the Data Challenge organized by HUmS2023

  20. arXiv:2306.10484  [pdf, other]

    eess.IV cs.CV

    The STOIC2021 COVID-19 AI challenge: applying reusable training methodologies to private data

    Authors: Luuk H. Boulogne, Julian Lorenz, Daniel Kienzle, Robin Schon, Katja Ludwig, Rainer Lienhart, Simon Jegou, Guang Li, Cong Chen, Qi Wang, Derik Shi, Mayug Maniparambil, Dominik Muller, Silvan Mertes, Niklas Schroter, Fabio Hellmann, Miriam Elia, Ine Dirks, Matias Nicolas Bossa, Abel Diaz Berenguer, Tanmoy Mukherjee, Jef Vandemeulebroucke, Hichem Sahli, Nikos Deligiannis, Panagiotis Gonidakis, et al. (13 additional authors not shown)

    Abstract: Challenges drive the state-of-the-art of automated medical image analysis. The quantity of public training data that they provide can limit the performance of their solutions. Public access to the training methodology for these solutions remains absent. This study implements the Type Three (T3) challenge format, which allows for training solutions on private data and guarantees reusable training m…

    Submitted 25 June, 2023; v1 submitted 18 June, 2023; originally announced June 2023.

  21. Why Are Conversational Assistants Still Black Boxes? The Case For Transparency

    Authors: Trung Dong Huynh, William Seymour, Luc Moreau, Jose Such

    Abstract: Much has been written about privacy in the context of conversational and voice assistants. Yet, there have been remarkably few developments in terms of the actual privacy offered by these devices. But how much of this is due to the technical and design limitations of speech as an interaction modality? In this paper, we set out to reframe the discussion on why commercial conversational assistants d…

    Submitted 8 June, 2023; originally announced June 2023.

    Comments: To appear in the Proceedings of the 2023 ACM conference on Conversational User Interfaces (CUI 23)

  22. arXiv:2305.18330  [pdf, other]

    cs.IR cs.AI cs.CL

    #REVAL: a semantic evaluation framework for hashtag recommendation

    Authors: Areej Alsini, Du Q. Huynh, Amitava Datta

    Abstract: Automatic evaluation of hashtag recommendation models is a fundamental task in many online social network systems. In the traditional evaluation method, the recommended hashtags from an algorithm are firstly compared with the ground truth hashtags for exact correspondences. The number of exact matches is then used to calculate the hit rate, hit ratio, precision, recall, or F1-score. This way of ev…

    Submitted 24 May, 2023; originally announced May 2023.

    Comments: 18 pages, 4 figures

    ACM Class: I.2.7

  23. arXiv:2206.06251  [pdf]

    cs.SE cs.AI cs.CY

    A Methodology and Software Architecture to Support Explainability-by-Design

    Authors: Trung Dong Huynh, Niko Tsakalakis, Ayah Helal, Sophie Stalla-Bourdillon, Luc Moreau

    Abstract: Algorithms play a crucial role in many technological systems that control or affect various aspects of our lives. As a result, providing explanations for their decisions to address the needs of users and organisations is increasingly expected by laws, regulations, codes of conduct, and the public. However, as laws and regulations do not prescribe how to meet such expectations, organisations are of…

    Submitted 25 May, 2023; v1 submitted 13 June, 2022; originally announced June 2022.

  24. arXiv:2206.04438  [pdf, other]

    cs.AI cs.CY

    A taxonomy of explanations to support Explainability-by-Design

    Authors: Niko Tsakalakis, Sophie Stalla-Bourdillon, Trung Dong Huynh, Luc Moreau

    Abstract: As automated decision-making solutions are increasingly applied to all aspects of everyday life, capabilities to generate meaningful explanations for a variety of stakeholders (i.e., decision-makers, recipients of decisions, auditors, regulators...) become crucial. In this paper, we present a taxonomy of explanations that was developed as part of a holistic 'Explainability-by-Design' approach for…

    Submitted 14 November, 2024; v1 submitted 9 June, 2022; originally announced June 2022.

  25. arXiv:2205.05211  [pdf, other]

    cs.DS cs.CR math.CO

    TreePIR: Efficient Private Retrieval of Merkle Proofs via Tree Colorings with Fast Indexing and Zero Storage Overhead

    Authors: Son Hoang Dau, Quang Cao, Rinaldo Gagiano, Duy Huynh, Xun Yi, Phuc Lu Le, Quang-Hung Luu, Emanuele Viterbo, Yu-Chih Huang, Jingge Zhu, Mohammad M. Jalalzai, Chen Feng

    Abstract: A Batch Private Information Retrieval (batch-PIR) scheme allows a client to retrieve multiple data items from a database without revealing them to the storage server(s). Most existing approaches for batch-PIR are based on batch codes, in particular, probabilistic batch codes (PBC) (Angel et al. S&P'18), which incur large storage overheads. In this work, we show that zero storage overhead…

    Submitted 4 June, 2024; v1 submitted 10 May, 2022; originally announced May 2022.

    Comments: 25 pages

    MSC Class: 05C05; 05C15; 05C85; 05C90; ACM Class: G.2.2; F.2.0; E.1

  26. arXiv:2202.10594  [pdf, other]

    cs.SD cs.CR cs.LG eess.AS

    Adversarial Attacks on Speech Recognition Systems for Mission-Critical Applications: A Survey

    Authors: Ngoc Dung Huynh, Mohamed Reda Bouadjenek, Imran Razzak, Kevin Lee, Chetan Arora, Ali Hassani, Arkady Zaslavsky

    Abstract: A Mission-Critical Application is a system that is fundamentally necessary to the success of specific and sensitive operations such as search and recovery, rescue, military, and emergency management actions. Recent advances in Machine Learning, Natural Language Processing, voice recognition, and speech processing technologies have naturally allowed the development and deployment of speech-based co…

    Submitted 21 February, 2022; originally announced February 2022.

  27. arXiv:2111.12698  [pdf, other]

    cs.CV

    Open-Vocabulary Instance Segmentation via Robust Cross-Modal Pseudo-Labeling

    Authors: Dat Huynh, Jason Kuen, Zhe Lin, Jiuxiang Gu, Ehsan Elhamifar

    Abstract: Open-vocabulary instance segmentation aims at segmenting novel classes without mask annotations. It is an important step toward reducing laborious human supervision. Most existing works first pretrain a model on captioned images covering many novel classes and then finetune it on limited base classes with mask annotations. However, the high-level textual information learned from caption pretrainin…

    Submitted 19 April, 2022; v1 submitted 24 November, 2021; originally announced November 2021.

  28. arXiv:2108.01808  [pdf, other]

    cs.CV cs.AI cs.LG

    Leaf Recognition Using Convolutional Neural Networks Based Features

    Authors: Boi M. Quach, Dinh V. Cuong, Nhung Pham, Dang Huynh, Binh T. Nguyen

    Abstract: There is a warning light for the loss of plant habitats worldwide that entails concerted efforts to conserve plant biodiversity. Thus, plant species classification is of crucial importance to address this environmental challenge. In recent years, there has been a considerable increase in the number of studies related to plant taxonomy. While some researchers try to improve their recognition performance…

    Submitted 2 September, 2021; v1 submitted 3 August, 2021; originally announced August 2021.

    Comments: 20 pages; 9 figures; 5 tables

  29. arXiv:2105.10438  [pdf, other]

    cs.CV

    Compositional Fine-Grained Low-Shot Learning

    Authors: Dat Huynh, Ehsan Elhamifar

    Abstract: We develop a novel compositional generative model for zero- and few-shot learning to recognize fine-grained classes with a few or no training samples. Our key observation is that generating holistic features for fine-grained classes fails to capture small attribute differences between classes. Therefore, we propose a feature composition framework that learns to extract attribute features from trai…

    Submitted 21 May, 2021; originally announced May 2021.

  30. arXiv:2010.10343  [pdf, other]

    cs.LG cs.AI cs.DB

    Provenance Graph Kernel

    Authors: David Kohan Marzagão, Trung Dong Huynh, Ayah Helal, Sean Baccas, Luc Moreau

    Abstract: Provenance is a record that describes how entities, activities, and agents have influenced a piece of data; it is commonly represented as graphs with relevant labels on both their nodes and edges. With the growing adoption of provenance in a wide range of application domains, users are increasingly confronted with an abundance of graph data, which may prove challenging to process. Graph kernels, o…

    Submitted 14 September, 2021; v1 submitted 20 October, 2020; originally announced October 2020.

    Comments: 14 pages

    ACM Class: I.2.6

  31. Deep learning for detection and segmentation of artefact and disease instances in gastrointestinal endoscopy

    Authors: Sharib Ali, Mariia Dmitrieva, Noha Ghatwary, Sophia Bano, Gorkem Polat, Alptekin Temizel, Adrian Krenzer, Amar Hekalo, Yun Bo Guo, Bogdan Matuszewski, Mourad Gridach, Irina Voiculescu, Vishnusai Yoganand, Arnav Chavan, Aryan Raj, Nhan T. Nguyen, Dat Q. Tran, Le Duy Huynh, Nicolas Boutry, Shahadate Rezvy, Haijian Chen, Yoon Ho Choi, Anand Subramanian, Velmurugan Balasubramanian, Xiaohong W. Gao, et al. (12 additional authors not shown)

    Abstract: The Endoscopy Computer Vision Challenge (EndoCV) is a crowd-sourcing initiative to address eminent problems in developing reliable computer aided detection and diagnosis endoscopy systems and suggest a pathway for clinical translation of technologies. Whilst endoscopy is a widely used diagnostic and treatment tool for hollow-organs, there are several core challenges often faced by endoscopists, ma…

    Submitted 17 February, 2021; v1 submitted 12 October, 2020; originally announced October 2020.

    Comments: 32 pages

  32. arXiv:2010.05507  [pdf, other]

    cs.CV

    Scene Gated Social Graph: Pedestrian Trajectory Prediction Based on Dynamic Social Graphs and Scene Constraints

    Authors: Hao Xue, Du Q. Huynh, Mark Reynolds

    Abstract: Pedestrian trajectory prediction is valuable for understanding human motion behaviors and it is challenging because of the social influence from other pedestrians, the scene constraints and the multimodal possibilities of predicted trajectories. Most existing methods only focus on two of the above three key elements. In order to jointly consider all these elements, we propose a novel trajectory pr…

    Submitted 12 October, 2020; originally announced October 2020.

  33. arXiv:2010.01258  [pdf, ps, other]

    cs.IR cs.LG

    Hit ratio: An Evaluation Metric for Hashtag Recommendation

    Authors: Areej Alsini, Du Q. Huynh, Amitava Datta

    Abstract: Hashtag recommendation is a crucial task, especially with an increase of interest in using social media platforms such as Twitter in the last decade. Hashtag recommendation systems automatically suggest hashtags to a user while writing a tweet. Most of the research in the area of hashtag recommendation has used classical metrics such as hit rate, precision, recall, and F1-score to measure the acc…

    Submitted 2 October, 2020; originally announced October 2020.

  34. arXiv:2009.13060  [pdf, other]

    cs.CL

    A Simple and Efficient Ensemble Classifier Combining Multiple Neural Network Models on Social Media Datasets in Vietnamese

    Authors: Huy Duc Huynh, Hang Thi-Thuy Do, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen

    Abstract: Text classification is a popular topic of natural language processing, which has currently attracted numerous research efforts worldwide. The significant increase of data in social media requires the vast attention of researchers to analyze such data. There are various studies in this field in many languages, but studies on the Vietnamese language are limited. Therefore, this study aims to classify Vietnamese…

    Submitted 28 September, 2020; v1 submitted 28 September, 2020; originally announced September 2020.

    Comments: Accepted by The 34th Pacific Asia Conference on Language, Information and Computation (PACLIC2020)

  35. arXiv:2006.08299  [pdf, ps, other]

    cs.LG stat.ML

    Cryptotree: fast and accurate predictions on encrypted structured data

    Authors: Daniel Huynh

    Abstract: Applying machine learning algorithms to private data, such as financial or medical data, while preserving their confidentiality, is a difficult task. Homomorphic Encryption (HE) is acknowledged for its ability to allow computation on encrypted data, where both the input and output are encrypted, which therefore enables secure inference on private data. Nonetheless, because of the constraints of HE…

    Submitted 15 June, 2020; originally announced June 2020.

  36. arXiv:2005.06305  [pdf, other]

    cs.CV cs.LG cs.NE

    Binarizing MobileNet via Evolution-based Searching

    Authors: Hai Phan, Zechun Liu, Dang Huynh, Marios Savvides, Kwang-Ting Cheng, Zhiqiang Shen

    Abstract: Binary Neural Networks (BNNs), known to be one among the effectively compact network architectures, have achieved great outcomes in the visual tasks. Designing efficient binary architectures is not trivial due to the binary nature of the network. In this paper, we propose a use of evolutionary search to facilitate the construction and training scheme when binarizing MobileNet, a compact network wi…

    Submitted 15 May, 2020; v1 submitted 13 May, 2020; originally announced May 2020.

    Comments: Accepted by CVPR2020

  37. arXiv:2004.09760  [pdf, other]

    cs.CV

    Take a NAP: Non-Autoregressive Prediction for Pedestrian Trajectories

    Authors: Hao Xue, Du Q. Huynh, Mark Reynolds

    Abstract: Pedestrian trajectory prediction is a challenging task as there are three properties of human movement behaviors which need to be addressed, namely, the social influence from other pedestrians, the scene constraints, and the multimodal (multiroute) nature of predictions. Although existing methods have explored these key properties, the prediction process of these methods is autoregressive. This me…

    Submitted 21 April, 2020; originally announced April 2020.

  38. VerSe: A Vertebrae Labelling and Segmentation Benchmark for Multi-detector CT Images

    Authors: Anjany Sekuboyina, Malek E. Husseini, Amirhossein Bayat, Maximilian Löffler, Hans Liebl, Hongwei Li, Giles Tetteh, Jan Kukačka, Christian Payer, Darko Štern, Martin Urschler, Maodong Chen, Dalong Cheng, Nikolas Lessmann, Yujin Hu, Tianfu Wang, Dong Yang, Daguang Xu, Felix Ambellan, Tamaz Amiranashvili, Moritz Ehlke, Hans Lamecker, Sebastian Lehnert, Marilia Lirio, Nicolás Pérez de Olaguer, et al. (44 additional authors not shown)

    Abstract: Vertebral labelling and segmentation are two fundamental tasks in an automated spine processing pipeline. Reliable and accurate processing of spine images is expected to benefit clinical decision-support systems for diagnosis, surgery planning, and population-based analysis on spine and bone health. However, designing automated algorithms for spine processing is challenging predominantly due to co…

    Submitted 5 April, 2022; v1 submitted 24 January, 2020; originally announced January 2020.

    Comments: Challenge report for the VerSe 2019 and 2020. Published in Medical Image Analysis (DOI: https://doi.org/10.1016/j.media.2021.102166)

    Journal ref: Medical Image Analysis, Volume 73, October 2021, 102166

  39. arXiv:1911.03648  [pdf, other]

    cs.CL cs.LG

    Hate Speech Detection on Vietnamese Social Media Text using the Bidirectional-LSTM Model

    Authors: Hang Thi-Thuy Do, Huy Duc Huynh, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen, Anh Gia-Tuan Nguyen

    Abstract: In this paper, we describe our system which participates in the shared task of Hate Speech Detection on Social Networks of VLSP 2019 evaluation campaign. We are provided with the pre-labeled dataset and an unlabeled dataset for social media comments or posts. Our mission is to pre-process and build machine learning models to classify comments/posts. In this report, we use Bidirectional Long Short-…

    Submitted 9 November, 2019; originally announced November 2019.

    Journal ref: VLSP Workshop 2019

  40. arXiv:1907.12629  [pdf, other]

    cs.CV cs.LG

    MoBiNet: A Mobile Binary Network for Image Classification

    Authors: Hai Phan, Dang Huynh, Yihui He, Marios Savvides, Zhiqiang Shen

    Abstract: MobileNet and Binary Neural Networks are two among the most widely used techniques to construct deep learning models for performing a variety of tasks on mobile and embedded platforms. In this paper, we present a simple yet efficient scheme to exploit MobileNet binarization at activation function and model weights. However, training a binary network from scratch with separable depth-wise and point-…

    Submitted 30 July, 2019; v1 submitted 29 July, 2019; originally announced July 2019.

  41. arXiv:1906.11465  [pdf, other]

    cs.CV cs.LG

    Loss Switching Fusion with Similarity Search for Video Classification

    Authors: Lei Wang, Du Q. Huynh, Moussa Reda Mansour

    Abstract: From video streaming to security and surveillance applications, video data play an important role in our daily living today. However, managing a large amount of video data and retrieving the most useful information for the user remain a challenging task. In this paper, we propose a novel video classification system that would benefit the scene understanding task. We define our classification probl…

    Submitted 27 June, 2019; originally announced June 2019.

    Comments: Accepted by ICIP 2019

  42. A Comparative Review of Recent Kinect-based Action Recognition Algorithms

    Authors: Lei Wang, Du Q. Huynh, Piotr Koniusz

    Abstract: Video-based human action recognition is currently one of the most active research areas in computer vision. Various research studies indicate that the performance of action recognition is highly dependent on the type of features being extracted and how the actions are represented. Since the release of the Kinect camera, a large number of Kinect-based human action recognition techniques have been p…

    Submitted 24 June, 2019; originally announced June 2019.

    Comments: Accepted by the IEEE Transactions on Image Processing

  43. arXiv:1906.05910  [pdf, other]

    cs.CV

    Hallucinating IDT Descriptors and I3D Optical Flow Features for Action Recognition with CNNs

    Authors: Lei Wang, Piotr Koniusz, Du Q. Huynh

    Abstract: In this paper, we revive the use of old-fashioned handcrafted video representations for action recognition and put new life into these techniques via a CNN-based hallucination step. Despite the use of RGB and optical flow frames, the I3D model (amongst others) thrives on combining its output with the Improved Dense Trajectory (IDT) and extracted with its low-level video descriptors encoded via…

    Submitted 18 August, 2019; v1 submitted 13 June, 2019; originally announced June 2019.

    Comments: First two authors contributed equally. This paper is accepted by ICCV'19

    Journal ref: ICCV 2019

  44. arXiv:1903.05160  [pdf, other]

    math.NA cs.CE

    An extended polygonal finite element method for large deformation fracture analysis

    Authors: Hai D. Huynh, Phuong Tran, Xiaoying Zhuang, H. Nguyen-Xuan

    Abstract: The modeling of large deformation fracture mechanics has been a challenging problem regarding the accuracy of numerical methods and their ability to deal with considerable changes in the deformation of meshes in the presence of cracks. This paper further investigates the extended finite element method (XFEM) for the simulation of large strain fracture for hyper-elastic materials, in particu…

    Submitted 20 March, 2019; v1 submitted 9 March, 2019; originally announced March 2019.

  45. arXiv:1809.07895  [pdf, other]

    cs.CV

    Large-Scale Video Classification with Feature Space Augmentation coupled with Learned Label Relations and Ensembling

    Authors: Choongyeun Cho, Benjamin Antin, Sanchit Arora, Shwan Ashrafi, Peilin Duan, Dang The Huynh, Lee James, Hang Tuan Nguyen, Mojtaba Solgi, Cuong Van Than

    Abstract: This paper presents Axon AI's solution to the 2nd YouTube-8M Video Understanding Challenge, achieving the final global average precision (GAP) of 88.733% on the private test set (ranked 3rd among 394 teams, not considering the model size constraint), and 87.287% using a model that meets the size requirement. Two sets of 7 individual models belonging to 3 different families were trained separately.…

    Submitted 20 September, 2018; originally announced September 2018.

  46. LiveRank: How to Refresh Old Datasets

    Authors: The Dang Huynh, Fabien Mathieu, Laurent Viennot

    Abstract: This paper considers the problem of refreshing a dataset. More precisely, given a collection of nodes gathered at some time (Web pages, users from an online social network) along with some structure (hyperlinks, social relationships), we want to identify a significant fraction of the nodes that still exist at present time. The liveness of an old node can be tested through an online query at prese…

    Submitted 6 January, 2016; originally announced January 2016.

  47. arXiv:1501.06350  [pdf, ps, other]

    cs.DS

    D-Iteration: diffusion approach for solving PageRank

    Authors: Dohy Hong, The Dang Huynh, Fabien Mathieu

    Abstract: In this paper we present a new method that can accelerate the computation of the PageRank importance vector. Our method, called D-Iteration (DI), is based on the decomposition of the matrix-vector product that can be seen as a fluid diffusion model and is potentially adapted to asynchronous implementation. We give theoretical results about the convergence of our algorithm and we show through exper…

    Submitted 6 May, 2015; v1 submitted 26 January, 2015; originally announced January 2015.

  48. arXiv:1412.1820  [pdf, other]

    cs.CL

    Context-Dependent Fine-Grained Entity Type Tagging

    Authors: Dan Gillick, Nevena Lazic, Kuzman Ganchev, Jesse Kirchner, David Huynh

    Abstract: Entity type tagging is the task of assigning category labels to each mention of an entity in a document. While standard systems focus on a small set of types, recent work (Ling and Weld, 2012) suggests that using a large fine-grained label set can lead to dramatic improvements in downstream tasks. In the absence of labeled training data, existing fine-grained tagging systems obtain examples automa…

    Submitted 1 August, 2016; v1 submitted 3 December, 2014; originally announced December 2014.

  49. arXiv:1409.6813  [pdf, other]

    cs.CV

    Histogram of Oriented Principal Components for Cross-View Action Recognition

    Authors: Hossein Rahmani, Arif Mahmood, Du Huynh, Ajmal Mian

    Abstract: Existing techniques for 3D action recognition are sensitive to viewpoint variations because they extract features from depth images which are viewpoint dependent. In contrast, we directly process pointclouds for cross-view action recognition from unknown and unseen views. We propose the Histogram of Oriented Principal Components (HOPC) descriptor that is robust to noise, viewpoint, scale and actio…

    Submitted 3 September, 2015; v1 submitted 23 September, 2014; originally announced September 2014.

  50. arXiv:1408.3810  [pdf, other]

    cs.CV

    Action Classification with Locality-constrained Linear Coding

    Authors: Hossein Rahmani, Arif Mahmood, Du Huynh, Ajmal Mian

    Abstract: We propose an action classification algorithm which uses Locality-constrained Linear Coding (LLC) to capture discriminative information of human body variations in each spatiotemporal subsequence of a video sequence. Our proposed method divides the input video into equally spaced overlapping spatiotemporal subsequences, each of which is decomposed into blocks and then cells. We use the Histogram o…

    Submitted 22 September, 2014; v1 submitted 17 August, 2014; originally announced August 2014.

    Comments: ICPR 2014