
PIGUIQA: A Physical Imaging Guided Perceptual Framework for Underwater Image Quality Assessment

Authors: Weizhi Xian, Mingliang Zhou, Leong Hou U, Lang Shujun, Bin Fang, Tao Xiang, Zhaowei Shang, Weijia Jia

Abstract: In this paper, we propose a Physical Imaging Guided perceptual framework for Underwater Image Quality Assessment (UIQA), termed PIGUIQA. First, we formulate UIQA as a comprehensive problem that considers the combined effects of direct transmission attenuation and backward scattering on image perception. By leveraging underwater radiative transfer theory, we systematically integrate physics-based imaging estimations to establish quantitative metrics for these distortions. Second, recognizing spatial variations in image content significance and human perceptual sensitivity to distortions, we design a module built upon a neighborhood attention mechanism for local perception of images. This module effectively captures subtle features in images, thereby enhancing the adaptive perception of distortions on the basis of local information. Third, by employing a global perceptual aggregator that further integrates holistic image scene with underwater distortion information, the proposed model accurately predicts image quality scores. Extensive experiments across multiple benchmarks demonstrate that PIGUIQA achieves state-of-the-art performance while maintaining robust cross-dataset generalizability. The implementation is publicly available at https://anonymous.4open.science/r/PIGUIQA-A465/

Submitted 5 March, 2025; v1 submitted 19 December, 2024; originally announced December 2024.

arXiv:2412.15302 [pdf, other]

doi 10.1609/aaai.v39i12.33466

Tokenphormer: Structure-aware Multi-token Graph Transformer for Node Classification

Authors: Zijie Zhou, Zhaoqi Lu, Xuekai Wei, Rongqin Chen, Shenghui Zhang, Pak Lon Ip, Leong Hou U

Abstract: Graph Neural Networks (GNNs) are widely used in graph data mining tasks. Traditional GNNs follow a message passing scheme that can effectively utilize local and structural information. However, the phenomena of over-smoothing and over-squashing limit the receptive field in message passing processes. Graph Transformers were introduced to address these issues, achieving a global receptive field but suffering from the noise of irrelevant nodes and loss of structural information. Therefore, drawing inspiration from fine-grained token-based representation learning in Natural Language Processing (NLP), we propose the Structure-aware Multi-token Graph Transformer (Tokenphormer), which generates multiple tokens to effectively capture local and structural information and explore global information at different levels of granularity. Specifically, we first introduce the walk-token generated by mixed walks consisting of four walk types to explore the graph and capture structure and contextual information flexibly. To ensure local and global information coverage, we also introduce the SGPM-token (obtained through the Self-supervised Graph Pre-train Model, SGPM) and the hop-token, extending the length and density limit of the walk-token, respectively. Finally, these expressive tokens are fed into the Transformer model to learn node representations collaboratively. Experimental results demonstrate that the proposed Tokenphormer achieves state-of-the-art performance on node classification tasks.

Submitted 11 April, 2025; v1 submitted 19 December, 2024; originally announced December 2024.

Comments: Accepted by AAAI 2025

Report number: Vol. 39 No. 12: AAAI-25 Technical Tracks 12

Journal ref: Proc. AAAI Conf. Artif. Intell. 39 (2025) 13428-13436

arXiv:2411.00874 [pdf, other]

VecCity: A Taxonomy-guided Library for Map Entity Representation Learning

Authors: Wentao Zhang, Jingyuan Wang, Yifan Yang, Leong Hou U

Abstract: Electronic maps consist of diverse entities, such as points of interest (POIs), road networks, and land parcels, playing a vital role in applications like ITS and LBS. Map entity representation learning (MapRL) generates versatile and reusable data representations, providing essential tools for efficiently managing and utilizing map entity data. Despite the progress in MapRL, two key challenges constrain further development. First, existing research is fragmented, with models classified by the type of map entity, limiting the reusability of techniques across different tasks. Second, the lack of unified benchmarks makes systematic evaluation and comparison of models difficult. To address these challenges, we propose a novel taxonomy for MapRL that organizes models based on functional modules (such as encoders, pre-training tasks, and downstream tasks) rather than by entity type. Building on this taxonomy, we present a taxonomy-driven library, VecCity, which offers easy-to-use interfaces for encoding, pre-training, fine-tuning, and evaluation. The library integrates datasets from nine cities and reproduces 21 mainstream MapRL models, establishing the first standardized benchmarks for the field. VecCity also allows users to modify and extend models through modular components, facilitating seamless experimentation. Our comprehensive experiments cover multiple types of map entities and evaluate 21 VecCity pre-built models across various downstream tasks. Experimental results demonstrate the effectiveness of VecCity in streamlining model development and provide insights into the impact of various components on performance. By promoting modular design and reusability, VecCity offers a unified framework to advance research and innovation in MapRL. The code is available at https://github.com/Bigscity-VecCity/VecCity.

Submitted 7 May, 2025; v1 submitted 31 October, 2024; originally announced November 2024.

arXiv:2410.20487 [pdf, other]

Efficient Diversity-based Experience Replay for Deep Reinforcement Learning

Authors: Kaiyan Zhao, Yiming Wang, Yuyang Chen, Yan Li, Leong Hou U, Xiaoguang Niu

Abstract: Experience replay is widely used to improve learning efficiency in reinforcement learning by leveraging past experiences. However, existing experience replay methods, whether based on uniform or prioritized sampling, often suffer from low efficiency, particularly in real-world scenarios with high-dimensional state spaces. To address this limitation, we propose a novel approach, Efficient Diversity-based Experience Replay (EDER). EDER employs a determinantal point process to model the diversity between samples and prioritizes replay based on that diversity. To further enhance learning efficiency, we incorporate Cholesky decomposition for handling large state spaces in realistic environments. Additionally, rejection sampling is applied to select samples with higher diversity, thereby improving overall learning efficacy. Extensive experiments are conducted on robotic manipulation tasks in MuJoCo, Atari games, and realistic indoor environments in Habitat. The results demonstrate that our approach not only significantly improves learning efficiency but also achieves superior performance in high-dimensional, realistic environments.

Submitted 18 May, 2025; v1 submitted 27 October, 2024; originally announced October 2024.

arXiv:2407.00936 [pdf, other]

Large Language Model Enhanced Knowledge Representation Learning: A Survey

Authors: Xin Wang, Zirui Chen, Haofen Wang, Leong Hou U, Zhao Li, Wenbin Guo

Abstract: Knowledge Representation Learning (KRL) is crucial for enabling applications of symbolic knowledge from Knowledge Graphs (KGs) to downstream tasks by projecting knowledge facts into vector spaces. Despite their effectiveness in modeling KG structural information, KRL methods suffer from the sparseness of KGs. The rise of Large Language Models (LLMs) built on the Transformer architecture presents promising opportunities for enhancing KRL by incorporating textual information to address information sparsity in KGs. LLM-enhanced KRL methods comprise three key approaches: encoder-based methods that leverage detailed contextual information, encoder-decoder-based methods that utilize a unified Seq2Seq model for comprehensive encoding and decoding, and decoder-based methods that utilize extensive knowledge from large corpora. These methods have significantly advanced the effectiveness and generalization of KRL in addressing a wide range of downstream tasks. This work provides a broad overview of downstream tasks while simultaneously identifying emerging research directions in these evolving domains.

Submitted 8 April, 2025; v1 submitted 30 June, 2024; originally announced July 2024.

arXiv:2305.15932 [pdf, other]

BUCA: A Binary Classification Approach to Unsupervised Commonsense Question Answering

Authors: Jie He, Simon Chi Lok U, Víctor Gutiérrez-Basulto, Jeff Z. Pan

Abstract: Unsupervised commonsense reasoning (UCR) is becoming increasingly popular as the construction of commonsense reasoning datasets is expensive, and they are inevitably limited in their scope. A popular approach to UCR is to fine-tune language models with external knowledge (e.g., knowledge graphs), but this usually requires a large number of training examples. In this paper, we propose to transform the downstream multiple choice question answering task into a simpler binary classification task by ranking all candidate answers according to their reasonableness. To this end, for training the model, we convert the knowledge graph triples into reasonable and unreasonable texts. Extensive experimental results show the effectiveness of our approach on various multiple choice question answering benchmarks. Furthermore, compared with existing UCR approaches using KGs, ours is less data hungry. Our code is available at https://github.com/probe2/BUCA.

Submitted 11 April, 2025; v1 submitted 25 May, 2023; originally announced May 2023.

Comments: ACL 2023

arXiv:2203.16864 [pdf]

Saving lives: design and implementation of lifeline emergency ad hoc network

Authors: Se-Hang Cheong, Yain-Whar Si, Leong-Hou U

Abstract: This paper proposes a system for automatically forming ad hoc networks using mobile phones and battery-powered wireless routers in emergency situations. The system also provides functions to send emergency messages and identify the location of victims based on the network topology information. The Optimized Link State Routing protocol is used to instantly form an ad hoc emergency network based on WiFi signals from the victims' mobile phones, backup battery-powered wireless routers preinstalled in buildings, and mobile devices deployed by search and rescue teams. The proposed system is also designed to recover from partial network crashes and lost nodes. Experimental results demonstrate the effectiveness of the proposed system in terms of battery life, transmission distance, and noise. A novel message routing schedule is proposed for conserving battery life. A novel function to estimate the location of a mobile device that sent an emergency message is also proposed.

Submitted 31 March, 2022; originally announced March 2022.

arXiv:2203.16857 [pdf]

Lifeline: Emergency Ad Hoc Network

Authors: Se-Hang Cheong, Kai-Ip Lee, Yain-Whar Si, Leong-Hou U

Abstract: Lifeline is a group of systems designed for mobile phones and battery-powered wireless routers to form emergency ad hoc networks. Devices with the Lifeline program installed can automatically form ad hoc networks when the cellular signal is unavailable or disrupted during natural disasters. For instance, large-scale earthquakes can cause extensive damage to land-based telecommunication infrastructure. In such circumstances, victims trapped under collapsed buildings can use mobile phones with the Lifeline program installed to send emergency messages. In addition, Lifeline provides a function for rescuers to estimate the positions of the victims based on network propagation techniques. Lifeline can also recover from partial network crashes and lost nodes.

Submitted 31 March, 2022; originally announced March 2022.

arXiv:1911.01042 [pdf, other]

A General Early-Stopping Module for Crowdsourced Ranking

Authors: Caihua Shan, Leong Hou U, Nikos Mamoulis, Reynold Cheng, Xiang Li

Abstract: Crowdsourcing can be used to determine a total order for an object set (e.g., the top-10 NBA players) based on crowd opinions. This ranking problem is often decomposed into a set of microtasks (e.g., pairwise comparisons). These microtasks are passed to a large number of workers and their answers are aggregated to infer the ranking. The number of microtasks depends on the budget allocated for the problem. Intuitively, the higher the number of microtask answers, the more accurate the ranking becomes. However, it is often hard to decide the budget required for an accurate ranking. We study how a ranking process can be terminated early, and yet achieve a high-quality ranking and great savings in the budget. We use statistical tools to estimate the quality of the ranking result at any stage of the crowdsourcing process and terminate the process as soon as the desired quality is achieved. Our proposed early-stopping module can be seamlessly integrated with most existing inference algorithms and task assignment methods. We conduct extensive experiments and show that our early-stopping module is better than other existing general stopping criteria. We also implement a prototype system to demonstrate the usability and effectiveness of our approach in practice.

Submitted 4 November, 2019; originally announced November 2019.

Showing 1–9 of 9 results for author: U, L