-
Hierarchical Neural Collapse Detection Transformer for Class Incremental Object Detection
Authors:
Duc Thanh Pham,
Hong Dang Nguyen,
Nhat Minh Nguyen Quoc,
Linh Ngo Van,
Sang Dinh Viet,
Duc Anh Nguyen
Abstract:
Recently, object detection models have witnessed notable performance improvements, particularly with transformer-based models. However, new objects frequently appear in the real world, requiring detection models to continually learn without suffering from catastrophic forgetting. Although Incremental Object Detection (IOD) has emerged to address this challenge, existing models are still impractical due to their limited performance and prolonged inference time. In this paper, we introduce a novel framework for IOD, called Hier-DETR: Hierarchical Neural Collapse Detection Transformer, which ensures both efficiency and competitive performance by leveraging Neural Collapse for imbalanced datasets and the hierarchical relations among class labels.
Submitted 10 June, 2025;
originally announced June 2025.
-
Humanity's Last Exam
Authors:
Long Phan,
Alice Gatti,
Ziwen Han,
Nathaniel Li,
Josephina Hu,
Hugh Zhang,
Chen Bo Calvin Zhang,
Mohamed Shaaban,
John Ling,
Sean Shi,
Michael Choi,
Anish Agrawal,
Arnav Chopra,
Adam Khoja,
Ryan Kim,
Richard Ren,
Jason Hausenloy,
Oliver Zhang,
Mantas Mazeika,
Dmitry Dodonov,
Tung Nguyen,
Jaeho Lee,
Daron Anderson,
Mikhail Doroshenko,
Alun Cennyth Stokes
, et al. (1084 additional authors not shown)
Abstract:
Benchmarks are important tools for tracking the rapid advancements in large language model (LLM) capabilities. However, benchmarks are not keeping pace in difficulty: LLMs now achieve over 90% accuracy on popular benchmarks like MMLU, limiting informed measurement of state-of-the-art LLM capabilities. In response, we introduce Humanity's Last Exam (HLE), a multi-modal benchmark at the frontier of human knowledge, designed to be the final closed-ended academic benchmark of its kind with broad subject coverage. HLE consists of 2,500 questions across dozens of subjects, including mathematics, humanities, and the natural sciences. HLE is developed globally by subject-matter experts and consists of multiple-choice and short-answer questions suitable for automated grading. Each question has a known solution that is unambiguous and easily verifiable, but cannot be quickly answered via internet retrieval. State-of-the-art LLMs demonstrate low accuracy and calibration on HLE, highlighting a significant gap between current LLM capabilities and the expert human frontier on closed-ended academic questions. To inform research and policymaking upon a clear understanding of model capabilities, we publicly release HLE at https://lastexam.ai.
Submitted 19 April, 2025; v1 submitted 24 January, 2025;
originally announced January 2025.
-
Gendec: A Machine Learning-based Framework for Gender Detection from Japanese Names
Authors:
Duong Tien Pham,
Luan Thanh Nguyen
Abstract:
Every human has their own name, a fundamental aspect of their identity and cultural heritage. The name often conveys a wealth of information, including details about an individual's background, ethnicity, and, especially, their gender. By detecting gender through the analysis of names, researchers can unlock valuable insights into linguistic patterns and cultural norms, which can be applied to practical applications. Hence, this work presents a novel dataset for Japanese name gender detection comprising 64,139 full names in romaji, hiragana, and kanji forms, along with their biological genders. Moreover, we propose Gendec, a framework for gender detection from Japanese names that leverages diverse approaches, including traditional machine learning techniques or cutting-edge transfer learning models, to predict the gender associated with Japanese names accurately. Through a thorough investigation, the proposed framework is expected to be effective and serve potential applications in various domains.
Submitted 18 November, 2023;
originally announced November 2023.
-
AlertTrap: A study on object detection in remote insects trap monitoring system using on-the-edge deep learning platform
Authors:
An D. Le,
Duy A. Pham,
Dong T. Pham,
Hien B. Vo
Abstract:
Fruit flies are one of the most harmful insect species to fruit yields. In AlertTrap, implementing the Single-Shot Multibox Detector (SSD) architecture with different state-of-the-art backbone feature extractors such as MobileNetV1 and MobileNetV2 appears to be a potential solution for the real-time detection problem. SSD-MobileNetV1 and SSD-MobileNetV2 perform well, achieving AP at 0.5 of 0.957 and 1.0, respectively. You Only Look Once (YOLO) v4-tiny outperforms the SSD family with 1.0 in AP at 0.5; however, its throughput is considerably lower, which shows SSD models are better candidates for real-time implementation. We also tested the models with synthetic test sets simulating expected environmental disturbances. YOLOv4-tiny had better tolerance to these disturbances than the SSD models. The Raspberry Pi system successfully gathered environmental data and pest counts, sending them via email over 4G. However, running the full YOLO version in real time on Raspberry Pi is not feasible, indicating the need for a lighter object detection algorithm in future research. Among model candidates, YOLOv4-tiny generally performs best, with SSD-MobileNetV2 also comparable and sometimes better, especially in scenarios with synthetic disturbances. SSD models excel in processing time, enabling real-time, high-accuracy detection.
Submitted 10 April, 2025; v1 submitted 26 December, 2021;
originally announced December 2021.
-
Generative Adversarial Network (GAN) and Enhanced Root Mean Square Error (ERMSE): Deep Learning for Stock Price Movement Prediction
Authors:
Ashish Kumar,
Abeer Alsadoon,
P. W. C. Prasad,
Salma Abdullah,
Tarik A. Rashid,
Duong Thu Hang Pham,
Tran Quoc Vinh Nguyen
Abstract:
The prediction of stock price movement direction is significant in financial and academic circles. Stock prices contain complex, incomplete, and fuzzy information, which makes predicting their development trend extremely difficult. Predicting and analysing financial data is a nonlinear, time-dependent problem. With rapid development in machine learning and deep learning, this task can be performed more effectively by a purposely designed network. This paper aims to improve prediction accuracy and minimize forecasting error loss through a deep learning architecture based on Generative Adversarial Networks. A generic model is proposed that consists of the Phase-space Reconstruction (PSR) method for reconstructing price series and a Generative Adversarial Network (GAN), which combines two neural networks, a Long Short-Term Memory (LSTM) network as the generative model and a Convolutional Neural Network (CNN) as the discriminative model, for adversarial training to forecast the stock market. The LSTM generates new instances based on historical basic indicator information, and the CNN then estimates whether the data is predicted by the LSTM or is real. It was found that the GAN performed well on the enhanced root mean square error relative to LSTM, as it was 4.35% more accurate in predicting the direction and reduced processing time and RMSE by 78 seconds and 0.029, respectively. The proposed system concentrates on minimizing the root mean square error and processing time and improving the direction prediction accuracy, and provides a better result in the accuracy of the stock index.
Submitted 30 November, 2021;
originally announced December 2021.
-
Supporting Multiprocessor Resource Synchronization Protocols in RTEMS
Authors:
Junjie Shi,
Jan Duy Thien Pham,
Malte Münch,
Jan Viktor Hafemeister,
Jian-Jia Chen,
Kuan-Hsun Chen
Abstract:
When considering recurrent tasks in real-time systems, concurrent accesses to shared resources can cause race conditions or data corruption. This problem has been extensively studied since the 1990s, and numerous resource synchronization protocols have been developed for both uni-processor and multiprocessor real-time systems, with the assumption that the implementation overheads are negligible. However, in practice, the implementation overheads may impact the performance of different protocols depending upon the scenario, e.g., whether resources are accessed locally or remotely, and whether tasks spin or suspend themselves when the requested resources are not available. In this paper, to show the applicability of different protocols in real-world systems, we detail the implementation of several state-of-the-art multiprocessor resource synchronization protocols in RTEMS. To study the impact of the implementation overheads, we deploy these implemented protocols on a real platform with synthetic task sets. The measured results illustrate that the developed resource synchronization protocols in RTEMS are comparable to the existing protocol, i.e., MrsP.
Submitted 20 June, 2022; v1 submitted 13 April, 2021;
originally announced April 2021.
-
Development of a Fuzzy-based Patrol Robot Using in Building Automation System
Authors:
Thi Thanh Van Nguyen,
Manh Duong Phung,
Dinh Tuan Pham,
Quang Vinh Tran
Abstract:
A Building Automation System (BAS) has functions of monitoring and controlling the operation of all building sub-systems such as HVAC (Heating-Ventilation, Air-conditioning Control), electric consumption management, fire alarm control, security and access control, and appliance switching control. In the BAS, almost all operations are automatically performed at the control centre, so the building security must be strictly protected. In the traditional system, security is usually ensured by a number of cameras installed at fixed positions, which may result in limited vision. To overcome this disadvantage, our paper presents a novel security system in which a mobile robot is used as a patrol. The robot is equipped with fuzzy-based algorithms that allow it to avoid obstacles in an unknown environment, as well as other mechanisms necessary for its patrol mission. The experimental results show that the system satisfies the requirements for monitoring and securing the building.
Submitted 13 May, 2020;
originally announced June 2020.
-
Recognition of 26 Degrees of Freedom of Hands Using Model-based approach and Depth-Color Images
Authors:
Cong Hoang Quach,
Minh Trien Pham,
Anh Viet Dang,
Dinh Tuan Pham,
Thuan Hoang Tran,
Manh Duong Phung
Abstract:
In this study, we present a model-based approach to recognize the full 26 degrees of freedom of a human hand. Input data include RGB-D images acquired from a Kinect camera and a 3D model of the hand constructed from its anatomy and graphical matrices. A cost function is then defined so that its minimum value is achieved when the model and observation images are matched. To solve the optimization problem in 26-dimensional space, the particle swarm optimization algorithm with improvements is used. In addition, parallel computation on graphical processing units (GPU) is utilized to handle computationally expensive tasks. Simulation and experimental results show that the system can recognize 26 degrees of freedom of hands with a processing time of 0.8 seconds per frame. The algorithm is robust to noise and the hardware requirement is simple, with a single camera.
Submitted 13 May, 2020;
originally announced May 2020.
-
Adaptive neural network based dynamic surface control for uncertain dual arm robots
Authors:
Dung Tien Pham,
Thai Van Nguyen,
Hai Xuan Le,
Linh Nguyen,
Nguyen Huu Thai,
Tuan Anh Phan,
Hai Tuan Pham,
Anh Hoai Duong
Abstract:
The paper discusses an adaptive strategy to effectively control nonlinear manipulation motions of a dual arm robot (DAR) under system uncertainties including parameter variations, actuator nonlinearities and external disturbances. It is proposed that the control scheme is first derived from the dynamic surface control (DSC) method, which allows the robot's end-effectors to robustly track the desired trajectories. Moreover, since exactly determining the DAR system's dynamics is impractical due to the system uncertainties, the uncertain system parameters are then proposed to be adaptively estimated by the use of the radial basis function network (RBFN). The adaptation mechanism is derived from the Lyapunov theory, which theoretically guarantees stability of the closed-loop control system. The effectiveness of the proposed RBFN-DSC approach is demonstrated by implementing the algorithm in a synthetic environment with realistic parameters, where the obtained results are highly promising.
Submitted 8 May, 2019;
originally announced May 2019.