Search | arXiv e-print repository

Can Large Language Model Agents Balance Energy Systems?

Authors: Xinxing Ren, Chun Sing Lai, Gareth Taylor, Zekun Guo

Abstract: This paper presents a hybrid approach that integrates Large Language Models (LLMs) with a multi-scenario Stochastic Unit Commitment (SUC) framework to enhance both efficiency and reliability under high wind generation uncertainties. In a 10-trial study on the test energy system, the traditional SUC approach incurs an average total cost of 187.68 million dollars, whereas the LLM-assisted SUC (LLM-SUC) achieves a mean cost of 185.58 million dollars (range: 182.61 to 188.65 million dollars), corresponding to a cost reduction of 1.1 to 2.7 percent. Furthermore, LLM-SUC reduces load curtailment by 26.3 percent (2.24 plus/minus 0.31 GWh versus 3.04 GWh for SUC), while both methods maintain zero wind curtailment. Detailed temporal analysis shows that LLM-SUC achieves lower costs in the majority of time intervals and consistently outperforms SUC in 90 percent of cases, with solutions clustering in a favorable cost-reliability region (Coefficient of Variation = 0.93 percent for total cost and 13.8 percent for load curtailment). By leveraging an LLM agent to guide generator commitment decisions and dynamically adjust to stochastic conditions, the proposed framework improves demand fulfillment and operational resilience.

Submitted 30 March, 2025; v1 submitted 14 February, 2025; originally announced February 2025.

arXiv:2412.00085 [pdf]

Residual Attention Single-Head Vision Transformer Network for Rolling Bearing Fault Diagnosis in Noisy Environments

Authors: Songjiang Lai, Tsun-Hin Cheung, Jiayi Zhao, Kaiwen Xue, Ka-Chun Fung, Kin-Man Lam

Abstract: Rolling bearings play a crucial role in industrial machinery, directly influencing equipment performance, durability, and safety. However, harsh operating conditions, such as high speeds and temperatures, often lead to bearing malfunctions, resulting in downtime, economic losses, and safety hazards. This paper proposes the Residual Attention Single-Head Vision Transformer Network (RA-SHViT-Net) for fault diagnosis in rolling bearings. Vibration signals are transformed from the time to frequency domain using the Fast Fourier Transform (FFT) before being processed by RA-SHViT-Net. The model employs the Single-Head Vision Transformer (SHViT) to capture local and global features, balancing computational efficiency and predictive accuracy. To enhance feature extraction, the Adaptive Hybrid Attention Block (AHAB) integrates channel and spatial attention mechanisms. The network architecture includes Depthwise Convolution, Single-Head Self-Attention, Residual Feed-Forward Networks (Res-FFN), and AHAB modules, ensuring robust feature representation and mitigating gradient vanishing issues. Evaluation on the Case Western Reserve University and Paderborn University datasets demonstrates the RA-SHViT-Net's superior accuracy and robustness in complex, noisy environments. Ablation studies further validate the contributions of individual components, establishing RA-SHViT-Net as an effective tool for early fault detection and classification, promoting efficient maintenance strategies in industrial settings.
Keywords: rolling bearings, fault diagnosis, Vision Transformer, attention mechanism, noisy environments, Fast Fourier Transform (FFT)

Submitted 26 November, 2024; originally announced December 2024.

Comments: 24 pages, 14 figures, 3 tables

arXiv:2411.18003 [pdf, other]

HAAT: Hybrid Attention Aggregation Transformer for Image Super-Resolution

Authors: Song-Jiang Lai, Tsun-Hin Cheung, Ka-Chun Fung, Kai-wen Xue, Kin-Man Lam

Abstract: In the research area of image super-resolution, Swin-transformer-based models are favored for their global spatial modeling and shifting window attention mechanism. However, existing methods often limit self-attention to non-overlapping windows to cut costs and ignore the useful information that exists across channels. To address this issue, this paper introduces a novel model, the Hybrid Attention Aggregation Transformer (HAAT), designed to better leverage feature information. HAAT is constructed by integrating Swin-Dense-Residual-Connected Blocks (SDRCB) with Hybrid Grid Attention Blocks (HGAB). SDRCB expands the receptive field while maintaining a streamlined architecture, resulting in enhanced performance. HGAB incorporates channel attention, sparse attention, and window attention to improve nonlocal feature fusion and achieve more visually compelling results. Experimental evaluations demonstrate that HAAT surpasses state-of-the-art methods on benchmark datasets.
Keywords: Image super-resolution, Computer vision, Attention mechanism, Transformer

Submitted 10 December, 2024; v1 submitted 26 November, 2024; originally announced November 2024.

Comments: 6 pages, 2 figures, 1 table

arXiv:2406.12299 [pdf, other]

Exploiting and Securing ML Solutions in Near-RT RIC: A Perspective of an xApp

Authors: Thusitha Dayaratne, Viet Vo, Shangqi Lai, Sharif Abuadbba, Blake Haydon, Hajime Suzuki, Xingliang Yuan, Carsten Rudolph

Abstract: Open Radio Access Networks (O-RAN) are emerging as a disruptive technology, revolutionising traditional mobile network architecture and deployments in the current 5G and the upcoming 6G era. Disaggregation of network architecture, inherent support for AI/ML workflows, cloud-native principles, scalability, and interoperability make O-RAN attractive to network providers for beyond-5G and 6G deployments. Notably, the ability to deploy custom applications, including Machine Learning (ML) solutions as xApps or rApps on the RAN Intelligent Controllers (RICs), has immense potential for network function and resource optimisation. However, the openness, nascent standards, and distributed architecture of O-RAN and RICs introduce numerous vulnerabilities exploitable through multiple attack vectors, which have not yet been fully explored. To address this gap and ensure robust systems before large-scale deployments, this work analyses the security of ML-based applications deployed on the RIC platform. We focus on potential attacks and defence mechanisms, and pave the way for future research towards a more robust RIC platform.

Submitted 18 June, 2024; originally announced June 2024.

arXiv:2306.04032 [pdf, other]

BokehOrNot: Transforming Bokeh Effect with Image Transformer and Lens Metadata Embedding

Authors: Zhihao Yang, Wenyi Lian, Siyuan Lai

Abstract: Bokeh effect is an optical phenomenon that offers a pleasant visual experience, typically generated by high-end cameras with wide aperture lenses. The task of bokeh effect transformation aims to produce a desired effect in one set of lenses and apertures based on another combination. Current models are limited in their ability to render a specific set of bokeh effects, primarily transformations from sharp to blur. In this paper, we propose a novel universal method for embedding lens metadata into the model and introduce a loss calculation method using alpha masks from the newly released Bokeh Effect Transformation Dataset (BETD) [3]. Based on the above techniques, we propose the BokehOrNot model, which is capable of producing both blur-to-sharp and sharp-to-blur bokeh effects with various combinations of lenses and aperture sizes. Our proposed model outperforms current leading bokeh rendering and image restoration models and renders visually natural bokeh effects. Our code is available at: https://github.com/indicator0/bokehornot.

Submitted 6 June, 2023; originally announced June 2023.

arXiv:2306.00188 [pdf, other]

Multi-environment lifelong deep reinforcement learning for medical imaging

Authors: Guangyao Zheng, Shuhao Lai, Vladimir Braverman, Michael A. Jacobs, Vishwa S. Parekh

Abstract: Deep reinforcement learning (DRL) is increasingly being explored in medical imaging. However, the environments for medical imaging tasks are constantly evolving in terms of imaging orientations, imaging sequences, and pathologies. To that end, we developed a Lifelong DRL framework, SERIL, to continually learn new tasks in changing imaging environments without catastrophic forgetting. SERIL was developed using a selective experience replay based lifelong learning technique for the localization of five anatomical landmarks in brain MRI on a sequence of twenty-four different imaging environments. Compared to two baseline setups, MERT (multi-environment-best-case) and SERT (single-environment-worst-case), SERIL demonstrated excellent performance with an average distance of $9.90\pm7.35$ pixels from the desired landmark across all 120 tasks, compared to $10.29\pm9.07$ for MERT and $36.37\pm22.41$ for SERT ($p<0.05$), demonstrating the excellent potential for continuously learning multiple tasks across dynamically changing imaging environments.

Submitted 31 May, 2023; originally announced June 2023.

arXiv:2206.10397 [pdf, other]

doi 10.1109/TRO.2023.3331064

Neural Moving Horizon Estimation for Robust Flight Control

Authors: Bingheng Wang, Zhengtian Ma, Shupeng Lai, Lin Zhao

Abstract: Estimating and reacting to disturbances is crucial for robust flight control of quadrotors. Existing estimators typically require significant tuning for a specific flight scenario or training with extensive ground-truth disturbance data to achieve satisfactory performance. In this paper, we propose a neural moving horizon estimator (NeuroMHE) that can automatically tune its key parameters modeled by a neural network and adapt to different flight scenarios. We achieve this by deriving the analytical gradients of the MHE estimates with respect to the MHE weighting matrices, which enables a seamless embedding of the MHE as a learnable layer into the neural network for highly effective learning. Interestingly, we show that the gradients can be computed efficiently using a Kalman filter in a recursive form. Moreover, we develop a model-based policy gradient algorithm to train NeuroMHE directly from the quadrotor trajectory tracking error without needing the ground-truth disturbance data. The effectiveness of NeuroMHE is verified extensively via both simulations and physical experiments on quadrotors in various challenging flights. Notably, NeuroMHE outperforms a state-of-the-art neural network-based estimator, reducing force estimation errors by up to 76.7%, while using a portable neural network that has only 7.7% of the learnable parameters of the latter. The proposed method is general and can be applied to robust adaptive control of other robotic systems.

Submitted 14 November, 2023; v1 submitted 21 June, 2022; originally announced June 2022.

Comments: This paper (not the final version) has been accepted for publication in the IEEE Transactions on Robotics

arXiv:2204.00630 [pdf, other]

Extremely Low-light Image Enhancement with Scene Text Restoration

Authors: Pohao Hsu, Che-Tsung Lin, Chun Chet Ng, Jie-Long Kew, Mei Yih Tan, Shang-Hong Lai, Chee Seng Chan, Christopher Zach

Abstract: Deep learning-based methods have made impressive progress in enhancing extremely low-light images - the image quality of the reconstructed images has generally improved. However, we found that most of these methods could not sufficiently recover the image details, for instance, the texts in the scene. In this paper, a novel image enhancement framework is proposed to precisely restore the scene texts, as well as the overall quality of the image, simultaneously under extremely low-light conditions. Mainly, we employed a self-regularised attention map, an edge map, and a novel text detection loss. In addition, leveraging synthetic low-light images is beneficial for image enhancement on the genuine ones in terms of text detection. The quantitative and qualitative experimental results have shown that the proposed model outperforms state-of-the-art methods in image restoration, text detection, and text spotting on See In the Dark and ICDAR15 datasets.

Submitted 1 April, 2022; originally announced April 2022.

arXiv:2112.10001 [pdf, other]

Cross-Domain Federated Learning in Medical Imaging

Authors: Vishwa S Parekh, Shuhao Lai, Vladimir Braverman, Jeff Leal, Steven Rowe, Jay J Pillai, Michael A Jacobs

Abstract: Federated learning is increasingly being explored in the field of medical imaging to train deep learning models on large-scale datasets distributed across different data centers while preserving privacy by avoiding the need to transfer sensitive patient information. In this manuscript, we explore federated learning in a multi-domain, multi-task setting wherein different participating nodes may contain datasets sourced from different domains and are trained to solve different tasks. We evaluated cross-domain federated learning for the tasks of object detection and segmentation across two different experimental settings: multi-modal and multi-organ. The results from our experiments on the cross-domain federated learning framework were very encouraging, with an overlap similarity of 0.79 for organ localization and 0.65 for lesion segmentation. Our results demonstrate the potential of federated learning in developing multi-domain, multi-task deep learning models without sharing data from different domains.

Submitted 18 December, 2021; originally announced December 2021.

Comments: Under Review for MIDL 2022

arXiv:2108.03212 [pdf, other]

Differentiable Moving Horizon Estimation for Robust Flight Control

Authors: Bingheng Wang, Zhengtian Ma, Shupeng Lai, Lin Zhao, Tong Heng Lee

Abstract: Estimating and reacting to external disturbances is of fundamental importance for robust control of quadrotors. Existing estimators typically require significant tuning or training with a large amount of data, including the ground truth, to achieve satisfactory performance. This paper proposes a data-efficient differentiable moving horizon estimation (DMHE) algorithm that can automatically tune the MHE parameters online and also adapt to different scenarios. We achieve this by deriving the analytical gradient of the estimated trajectory from MHE with respect to the tuning parameters, enabling end-to-end learning for auto-tuning. Most interestingly, we show that the gradient can be calculated efficiently from a Kalman filter in a recursive form. Moreover, we develop a model-based policy gradient algorithm to learn the parameters directly from the trajectory tracking errors without the need for the ground truth. The proposed DMHE can be further embedded as a layer with other neural networks for joint optimization. Finally, we demonstrate the effectiveness of the proposed method via both simulation and experiments on quadrotors, where challenging scenarios such as sudden payload change and flying in downwash are examined.

Submitted 29 May, 2022; v1 submitted 6 August, 2021; originally announced August 2021.

Comments: This paper was accepted for presentation at the 60th IEEE Conference on Decision and Control (CDC2021). The extended version here contains the experiment results and an appendix with brief proofs for the two lemmas

arXiv:1802.00285 [pdf, other]

Virtual-to-Real: Learning to Control in Visual Semantic Segmentation

Authors: Zhang-Wei Hong, Chen Yu-Ming, Shih-Yang Su, Tzu-Yun Shann, Yi-Hsiang Chang, Hsuan-Kung Yang, Brian Hsi-Lin Ho, Chih-Chieh Tu, Yueh-Chuan Chang, Tsu-Ching Hsiao, Hsin-Wei Hsiao, Sih-Pin Lai, Chun-Yi Lee

Abstract: Collecting training data from the physical world is usually time-consuming and even dangerous for fragile robots, and thus, recent advances in robot learning advocate the use of simulators as the training platform. Unfortunately, the reality gap between synthetic and real visual data prohibits direct migration of the models trained in virtual worlds to the real world. This paper proposes a modular architecture for tackling the virtual-to-real problem. The proposed architecture separates the learning model into a perception module and a control policy module, and uses semantic image segmentation as the meta representation for relating these two modules. The perception module translates the perceived RGB image to semantic image segmentation. The control policy module is implemented as a deep reinforcement learning agent, which performs actions based on the translated image segmentation. Our architecture is evaluated in an obstacle avoidance task and a target following task. Experimental results show that our architecture significantly outperforms all of the baseline methods in both virtual and real environments, and demonstrates a faster learning curve than them. We also present a detailed analysis for a variety of variant configurations, and validate the transferability of our modular architecture.

Submitted 28 October, 2018; v1 submitted 1 February, 2018; originally announced February 2018.

Comments: 7 pages, accepted by IJCAI-18

arXiv:1609.06000 [pdf]

Levelized Cost of Energy for PV and Grid Scale Energy Storage Systems

Authors: Chun Sing Lai, Malcolm D McCulloch

Abstract: With the increasing penetration of renewable energy sources and energy storage devices in the power system, it is important to evaluate the cost of the system by using Levelized Cost of Energy (LCOE). In this paper, a new metric, Levelized Cost of Delivery (LCOD), is proposed to calculate the LCOE for the energy storage. Recent definitions of LCOE for renewable energy systems are reviewed. From fundamental principles, it is demonstrated that there is a need to introduce a new method to evaluate LCOE of the system, as the conventional LCOE is not applicable for renewable energy storage systems. Three years of solar irradiance data in Africa collected from Johannesburg and the national load data from Kenya are obtained for case studies. The proposed cost calculation methods are evaluated for two types of storage technologies (Vanadium Redox Battery (VRB) and Lithium-ion) with real-life data. It shows that the marginal LCOE and LCOD indices can be used to assist policymakers to consider the discount rate and the type of storage technology for a cost-effective renewable storage energy system.

Submitted 19 September, 2016; originally announced September 2016.

Comments: Pre-print submitted to Elsevier Applied Energy

Showing 1–12 of 12 results for author: Lai, S