-
Topology and geometry optimization of grid-shells under self-weight loading
Authors:
Helen E. Fairclough,
Karol Bolbotowski,
Linwei He,
Andrew Liew,
Matthew Gilbert
Abstract:
This manuscript presents an approach for simultaneously optimizing the connectivity and elevation of grid-shell structures acting in pure compression (or pure tension) under the combined effects of a prescribed external loading and the design-dependent self-weight of the structure itself. The method derived herein involves solving a second-order cone optimization problem, thereby ensuring convexity and obtaining globally optimal results for a given discretization of the design domain. Several numerical examples are presented, illustrating characteristics of this class of optimal structures. It is found that, as self-weight becomes more significant, both the optimal topology and the optimal elevation profile of the structure change, highlighting the importance of optimizing both topology and geometry simultaneously from the earliest stages of design. It is shown that this approach can obtain solutions with greater accuracy and several orders of magnitude more quickly than a standard 3D layout/truss topology optimization approach.
Submitted 13 May, 2025;
originally announced May 2025.
-
Graph Retrieval-Augmented LLM for Conversational Recommendation Systems
Authors:
Zhangchi Qiu,
Linhao Luo,
Zicheng Zhao,
Shirui Pan,
Alan Wee-Chung Liew
Abstract:
Conversational Recommender Systems (CRSs) have emerged as a transformative paradigm for offering personalized recommendations through natural language dialogue. However, they face challenges with knowledge sparsity, as users often provide brief, incomplete preference statements. While recent methods have integrated external knowledge sources to mitigate this, they still struggle with semantic understanding and complex preference reasoning. Recent Large Language Models (LLMs) demonstrate promising capabilities in natural language understanding and reasoning, showing significant potential for CRSs. Nevertheless, due to the lack of domain knowledge, existing LLM-based CRSs either produce hallucinated recommendations or demand expensive domain-specific training, which largely limits their applicability. In this work, we present G-CRS (Graph Retrieval-Augmented Large Language Model for Conversational Recommender Systems), a novel training-free framework that combines graph retrieval-augmented generation and in-context learning to enhance LLMs' recommendation capabilities. Specifically, G-CRS employs a two-stage retrieve-and-recommend architecture, where a GNN-based graph reasoner first identifies candidate items, followed by Personalized PageRank exploration to jointly discover potential items and similar user interactions. These retrieved contexts are then transformed into structured prompts for LLM reasoning, enabling contextually grounded recommendations without task-specific training. Extensive experiments on two public datasets show that G-CRS achieves superior recommendation performance compared to existing methods without requiring task-specific training.
Submitted 8 March, 2025;
originally announced March 2025.
-
A Label-Free Heterophily-Guided Approach for Unsupervised Graph Fraud Detection
Authors:
Junjun Pan,
Yixin Liu,
Xin Zheng,
Yizhen Zheng,
Alan Wee-Chung Liew,
Fuyi Li,
Shirui Pan
Abstract:
Graph fraud detection (GFD) has rapidly advanced in protecting online services by identifying malicious fraudsters. Recent supervised GFD research highlights that heterophilic connections between fraudsters and users can greatly impact detection performance, since fraudsters tend to camouflage themselves by building more connections to benign users. Despite the promising performance of supervised GFD methods, the reliance on labels limits their applications to unsupervised scenarios. Additionally, accurately capturing complex and diverse heterophily patterns without labels poses a further challenge. To fill the gap, we propose a Heterophily-guided Unsupervised Graph fraud dEtection approach (HUGE) for unsupervised GFD, which contains two essential components: a heterophily estimation module and an alignment-based fraud detection module. In the heterophily estimation module, we design a novel label-free heterophily metric called HALO, which captures the critical graph properties for GFD, enabling its outstanding ability to estimate heterophily from node attributes. In the alignment-based fraud detection module, we develop a joint MLP-GNN architecture with ranking loss and asymmetric alignment loss. The ranking loss aligns the predicted fraud score with the relative order of HALO, providing an extra robustness guarantee by comparing heterophily among non-adjacent nodes. Moreover, the asymmetric alignment loss effectively utilizes structural information while alleviating the feature-smooth effects of GNNs. Extensive experiments on 6 datasets demonstrate that HUGE significantly outperforms competitors, showcasing its effectiveness and robustness.
Submitted 19 March, 2025; v1 submitted 18 February, 2025;
originally announced February 2025.
-
Privacy-Preserving in Medical Image Analysis: A Review of Methods and Applications
Authors:
Yanming Zhu,
Xuefei Yin,
Alan Wee-Chung Liew,
Hui Tian
Abstract:
With the rapid advancement of artificial intelligence and deep learning, medical image analysis has become a critical tool in modern healthcare, significantly improving diagnostic accuracy and efficiency. However, AI-based methods also raise serious privacy concerns, as medical images often contain highly sensitive patient information. This review offers a comprehensive overview of privacy-preserving techniques in medical image analysis, including encryption, differential privacy, homomorphic encryption, federated learning, and generative adversarial networks. We explore the application of these techniques across various medical image analysis tasks, such as diagnosis, pathology, and telemedicine. Notably, we organize the review based on specific challenges and their corresponding solutions in different medical image analysis applications, so that technical applications are directly aligned with practical issues, addressing gaps in the current research landscape. Additionally, we discuss emerging trends, such as zero-knowledge proofs and secure multi-party computation, offering insights for future research. This review serves as a valuable resource for researchers and practitioners and can help advance privacy preservation in medical image analysis.
Submitted 5 December, 2024;
originally announced December 2024.
-
Unveiling User Preferences: A Knowledge Graph and LLM-Driven Approach for Conversational Recommendation
Authors:
Zhangchi Qiu,
Linhao Luo,
Shirui Pan,
Alan Wee-Chung Liew
Abstract:
Conversational Recommender Systems (CRSs) aim to provide personalized recommendations through dynamically capturing user preferences in interactive conversations. Conventional CRSs often extract user preferences as hidden representations, which are criticized for their lack of interpretability. This diminishes the transparency and trustworthiness of the recommendation process. Recent works have explored combining the impressive capabilities of Large Language Models (LLMs) with the domain-specific knowledge of Knowledge Graphs (KGs) to generate human-understandable recommendation explanations. Despite these efforts, the integration of LLMs and KGs for CRSs remains challenging due to the modality gap between unstructured dialogues and structured KGs. Moreover, LLMs pre-trained on large-scale corpora may not be well-suited for analyzing user preferences, which require domain-specific knowledge. In this paper, we propose COMPASS, a plug-and-play framework that synergizes LLMs and KGs to unveil user preferences, enhancing the performance and explainability of existing CRSs. To address integration challenges, COMPASS employs a two-stage training approach. First, it bridges the gap between the structured KG and natural language through an innovative graph entity captioning pre-training mechanism. This enables the LLM to transform KG entities into concise natural language descriptions, allowing it to comprehend domain-specific knowledge. Next, COMPASS optimizes user preference modeling via knowledge-aware instruction fine-tuning, where the LLM learns to reason and summarize user preferences from both dialogue histories and KG-augmented context. This enables COMPASS to perform knowledge-aware reasoning and generate comprehensive and interpretable user preferences that can seamlessly integrate with existing CRS models for improving recommendation performance and explainability.
Submitted 16 November, 2024;
originally announced November 2024.
-
GPU Based Differential Evolution: New Insights and Comparative Study
Authors:
Dylan Janssen,
Wayne Pullan,
Alan Wee-Chung Liew
Abstract:
Differential Evolution (DE) is a highly successful population-based global optimisation algorithm, commonly used for solving numerical optimisation problems. However, as the complexity of the objective function increases, the wall-clock run-time of the algorithm suffers as many fitness function evaluations must take place to effectively explore the search space. Due to the inherently parallel nature of the DE algorithm, graphics processing units (GPUs) have been used to effectively accelerate both the fitness evaluation and the DE algorithm. This work reviews the main architectural choices made in the literature for GPU-based DE algorithms and introduces a new GPU-based numerical optimisation benchmark to evaluate and compare GPU-based DE algorithms.
Submitted 26 May, 2024;
originally announced May 2024.
-
Large Language Models-guided Dynamic Adaptation for Temporal Knowledge Graph Reasoning
Authors:
Jiapu Wang,
Kai Sun,
Linhao Luo,
Wei Wei,
Yongli Hu,
Alan Wee-Chung Liew,
Shirui Pan,
Baocai Yin
Abstract:
Temporal Knowledge Graph Reasoning (TKGR) is the process of utilizing temporal information to capture complex relations within a Temporal Knowledge Graph (TKG) to infer new knowledge. Conventional methods in TKGR typically depend on deep learning algorithms or temporal logical rules. However, deep learning-based TKGRs often lack interpretability, whereas rule-based TKGRs struggle to effectively learn temporal rules that capture temporal patterns. Recently, Large Language Models (LLMs) have demonstrated extensive knowledge and remarkable proficiency in temporal reasoning. Consequently, the employment of LLMs for TKGR has sparked increasing interest among researchers. Nonetheless, LLMs are known to function as black boxes, making it challenging to comprehend their reasoning process. Additionally, due to the resource-intensive nature of fine-tuning, promptly updating LLMs to integrate evolving knowledge within TKGs for reasoning is impractical. To address these challenges, in this paper, we propose a Large Language Models-guided Dynamic Adaptation (LLM-DA) method for reasoning on TKGs. Specifically, LLM-DA harnesses the capabilities of LLMs to analyze historical data and extract temporal logical rules. These rules unveil temporal patterns and facilitate interpretable reasoning. To account for the evolving nature of TKGs, a dynamic adaptation strategy is proposed to update the LLM-generated rules with the latest events. This ensures that the extracted rules always incorporate the most recent knowledge and better generalize to the predictions on future events. Experimental results show that without the need for fine-tuning, LLM-DA significantly improves the accuracy of reasoning over several common datasets, providing a robust framework for TKGR tasks.
Submitted 29 December, 2024; v1 submitted 23 May, 2024;
originally announced May 2024.
-
Unsupervised Band Selection Using Fused HSI and LiDAR Attention Integrating With Autoencoder
Authors:
Judy X Yang,
Jun Zhou,
Jing Wang,
Hui Tian,
Alan Wee Chung Liew
Abstract:
Band selection in hyperspectral imaging (HSI) is critical for optimising data processing and enhancing analytical accuracy. Traditional approaches have predominantly concentrated on analysing spectral and pixel characteristics within individual bands independently. These approaches overlook the potential benefits of integrating multiple data sources, such as Light Detection and Ranging (LiDAR), and are further challenged by the limited availability of labeled data in HSI processing, which represents a significant obstacle. To address these challenges, this paper introduces a novel unsupervised band selection framework that incorporates attention mechanisms and an Autoencoder for reconstruction-based band selection. Our methodology distinctively integrates HSI with LiDAR data through an attention score, using a convolutional Autoencoder to process the combined feature mask. This fusion effectively captures essential spatial and spectral features and reduces redundancy in hyperspectral datasets. A comprehensive comparative analysis of our innovative fused band selection approach is performed against existing unsupervised band selection and fusion models. We used data sets such as Houston 2013, Trento, and MUUFL for our experiments. The results demonstrate that our method achieves superior classification accuracy and significantly outperforms existing models. This enhancement in HSI band selection, facilitated by the incorporation of LiDAR features, underscores the considerable advantages of integrating features from different sources.
Submitted 8 April, 2024;
originally announced April 2024.
-
LiDAR-Guided Cross-Attention Fusion for Hyperspectral Band Selection and Image Classification
Authors:
Judy X Yang,
Jun Zhou,
Jing Wang,
Hui Tian,
Alan Wee-Chung Liew
Abstract:
The fusion of hyperspectral and LiDAR data has been an active research topic. Existing fusion methods have ignored the high-dimensionality and redundancy challenges in hyperspectral images, even though band selection methods have been intensively studied for hyperspectral image (HSI) processing. This paper addresses this significant gap by introducing a cross-attention mechanism from the transformer architecture for the selection of HSI bands guided by LiDAR data. LiDAR provides high-resolution vertical structural information, which can be useful in distinguishing different types of land cover that may have similar spectral signatures but different structural profiles. In our approach, the LiDAR data are used as the "query" to search and identify the "key" from the HSI to choose the most pertinent bands for LiDAR. This method ensures that the selected HSI bands drastically reduce redundancy and computational requirements while working optimally with the LiDAR data. Extensive experiments have been undertaken on three paired HSI and LiDAR data sets: Houston 2013, Trento and MUUFL. The results highlight the superiority of the cross-attention mechanism, underlining the enhanced classification accuracy of the identified HSI bands when fused with the LiDAR features. The results also show that the use of fewer bands combined with LiDAR surpasses the performance of state-of-the-art fusion models.
Submitted 15 April, 2024; v1 submitted 5 April, 2024;
originally announced April 2024.
-
HSIMamba: Hyperspectral Imaging Efficient Feature Learning with Bidirectional State Space for Classification
Authors:
Judy X Yang,
Jun Zhou,
Jing Wang,
Hui Tian,
Alan Wee Chung Liew
Abstract:
Classifying hyperspectral images is a difficult task in remote sensing, due to their complex high-dimensional data. To address this challenge, we propose HSIMamba, a novel framework that uses bidirectional reversed convolutional neural network pathways to extract spectral features more efficiently. Additionally, it incorporates a specialized block for spatial analysis. Our approach combines the operational efficiency of CNNs with the dynamic feature extraction capability of attention mechanisms found in Transformers. However, it avoids the associated high computational demands. HSIMamba is designed to process data bidirectionally, significantly enhancing the extraction of spectral features and integrating them with spatial information for comprehensive analysis. This approach improves classification accuracy beyond current benchmarks and addresses computational inefficiencies encountered with advanced models like Transformers. HSIMamba was tested on three widely recognized datasets (Houston 2013, Indian Pines, and Pavia University) and demonstrated exceptional performance, surpassing existing state-of-the-art models in HSI classification. This method highlights the methodological innovation of HSIMamba and its practical implications, which are particularly valuable in contexts where computational resources are limited. HSIMamba redefines the standards of efficiency and accuracy in HSI classification, thereby enhancing the capabilities of remote sensing applications. Hyperspectral imaging has become a crucial tool for environmental surveillance, agriculture, and other critical areas that require detailed analysis of the Earth's surface. Please see our code in HSIMamba for more details.
Submitted 30 March, 2024;
originally announced April 2024.
-
Machine learning for structural design models of continuous beam systems via influence zones
Authors:
Adrien Gallet,
Andrew Liew,
Iman Hajirasouliha,
Danny Smyl
Abstract:
This work develops a machine learned structural design model for continuous beam systems from the inverse problem perspective. After demarcating between forward, optimisation and inverse machine learned operators, the investigation proposes a novel methodology based on the recently developed influence zone concept which represents a fundamental shift in approach compared to traditional structural design methods. The aim of this approach is to conceptualise a non-iterative structural design model that predicts cross-section requirements for continuous beam systems of arbitrary system size. After generating a dataset of known solutions, an appropriate neural network architecture is identified, trained, and tested against unseen data. The results show a mean absolute percentage testing error of 1.6% for cross-section property predictions, along with a good ability of the neural network to generalise to structural systems of variable size. The CBeamXP dataset generated in this work and an associated Python-based neural network training script are available at an open-source data repository to allow for the reproducibility of results and to encourage further investigations.
Submitted 14 March, 2024;
originally announced March 2024.
-
Knowledge Graphs and Pre-trained Language Models enhanced Representation Learning for Conversational Recommender Systems
Authors:
Zhangchi Qiu,
Ye Tao,
Shirui Pan,
Alan Wee-Chung Liew
Abstract:
Conversational recommender systems (CRSs) utilize natural language interactions and dialogue history to infer user preferences and provide accurate recommendations. Due to the limited conversation context and background knowledge, existing CRSs rely on external sources such as knowledge graphs to enrich the context and model entities based on their inter-relations. However, these methods ignore the rich intrinsic information within entities. To address this, we introduce the Knowledge-Enhanced Entity Representation Learning (KERL) framework, which leverages both the knowledge graph and a pre-trained language model to improve the semantic understanding of entities for CRSs. In our KERL framework, entity textual descriptions are encoded via a pre-trained language model, while a knowledge graph helps reinforce the representation of these entities. We also employ positional encoding to effectively capture the temporal information of entities in a conversation. The enhanced entity representation is then used to develop a recommender component that fuses both entity and contextual representations for more informed recommendations, as well as a dialogue component that generates informative entity-related information in the response text. A high-quality knowledge graph with aligned entity descriptions is constructed to facilitate our study, namely the Wiki Movie Knowledge Graph (WikiMKG). The experimental results show that KERL achieves state-of-the-art results in both recommendation and response generation tasks.
Submitted 1 May, 2024; v1 submitted 18 December, 2023;
originally announced December 2023.
-
Towards Data-centric Graph Machine Learning: Review and Outlook
Authors:
Xin Zheng,
Yixin Liu,
Zhifeng Bao,
Meng Fang,
Xia Hu,
Alan Wee-Chung Liew,
Shirui Pan
Abstract:
Data-centric AI, with its primary focus on the collection, management, and utilization of data to drive AI models and applications, has attracted increasing attention in recent years. In this article, we conduct an in-depth and comprehensive review, offering a forward-looking outlook on the current efforts in data-centric AI pertaining to graph data, the fundamental data structure for representing and capturing intricate dependencies among massive and diverse real-life entities. We introduce a systematic framework, Data-centric Graph Machine Learning (DC-GML), that encompasses all stages of the graph data lifecycle, including graph data collection, exploration, improvement, exploitation, and maintenance. A thorough taxonomy of each stage is presented to answer three critical graph-centric questions: (1) how to enhance graph data availability and quality; (2) how to learn from graph data with limited availability and low quality; (3) how to build graph MLOps systems from the graph data-centric view. Lastly, we pinpoint the future prospects of the DC-GML domain, providing insights to navigate its advancements and applications.
Submitted 19 September, 2023;
originally announced September 2023.
-
Influence zones for continuous beam systems
Authors:
Adrien Gallet,
Andrew Liew,
Iman Hajirasouliha,
Danny Smyl
Abstract:
Unlike influence lines, the concept of influence zones is remarkably absent within the field of structural engineering, despite its existence in the closely related domain of geotechnics. This paper proposes the novel concept of a structural influence zone in relation to continuous beam systems and explores its size numerically with various design constraints applicable to steel framed buildings. The key challenge involves explicitly defining the critical load arrangements, and is tackled by using the novel concepts of polarity sequences and polarity zones. These lead to the identification of flexural and (discovery of) shear load arrangements, with an equation demarcating when the latter arises. After developing algorithms that help identify both types of critical load arrangements, design data sets are generated and the influence zone values are extracted. The results indicate that the influence zone under ultimate state considerations is typically less than 3, rising to a maximum size of 5 adjacent members for any given continuous beam. Additional insights from the influence zone concept, specifically in comparison to influence lines, are highlighted, and the avenues for future research, such as in relation to the newly identified shear load arrangements, are discussed.
Submitted 24 February, 2023;
originally announced May 2023.
-
Denial-of-Service or Fine-Grained Control: Towards Flexible Model Poisoning Attacks on Federated Learning
Authors:
Hangtao Zhang,
Zeming Yao,
Leo Yu Zhang,
Shengshan Hu,
Chao Chen,
Alan Liew,
Zhetao Li
Abstract:
Federated learning (FL) is vulnerable to poisoning attacks, where adversaries corrupt the global aggregation results and cause denial-of-service (DoS). Unlike recent model poisoning attacks that optimize the amplitude of malicious perturbations along certain prescribed directions to cause DoS, we propose a Flexible Model Poisoning Attack (FMPA) that can achieve versatile attack goals. We consider a practical threat scenario where no extra knowledge about the FL system (e.g., aggregation rules or updates on benign devices) is available to adversaries. FMPA exploits the global historical information to construct an estimator that predicts the next round of the global model as a benign reference. It then fine-tunes the reference model to obtain the desired poisoned model with low accuracy and small perturbations. Besides the goal of causing DoS, FMPA can be naturally extended to launch a fine-grained controllable attack, making it possible to precisely reduce the global accuracy. Armed with precise control, malicious FL service providers can gain advantages over their competitors without getting noticed, hence opening a new attack surface in FL other than DoS. Even for the purpose of DoS, experiments show that FMPA significantly decreases the global accuracy, outperforming six state-of-the-art attacks.
Submitted 25 September, 2024; v1 submitted 21 April, 2023;
originally announced April 2023.
-
RADIFUSION: A multi-radiomics deep learning based breast cancer risk prediction model using sequential mammographic images with image attention and bilateral asymmetry refinement
Authors:
Hong Hui Yeoh,
Andrea Liew,
Raphaël Phan,
Fredrik Strand,
Kartini Rahmat,
Tuong Linh Nguyen,
John L. Hopper,
Maxine Tan
Abstract:
Breast cancer is a significant public health concern and early detection is critical for triaging high risk patients. Sequential screening mammograms can provide important spatiotemporal information about changes in breast tissue over time. In this study, we propose a deep learning architecture called RADIFUSION that utilizes sequential mammograms and incorporates a linear image attention mechanism, radiomic features, a new gating mechanism to combine different mammographic views, and bilateral asymmetry-based fine-tuning for breast cancer risk assessment. We evaluate our model on the Cohort of Screen-Aged Women (CSAW) screening dataset. Based on results obtained on the independent testing set consisting of 1,749 women, our approach achieved superior performance compared to other state-of-the-art models with area under the receiver operating characteristic curves (AUCs) of 0.905, 0.872 and 0.866 in the three respective metrics of 1-year AUC, 2-year AUC and > 2-year AUC. Our study highlights the importance of incorporating various deep learning mechanisms, such as image attention, radiomic features, gating mechanism, and bilateral asymmetry-based fine-tuning, to improve the accuracy of breast cancer risk assessment. We also demonstrate that our model's performance was enhanced by leveraging spatiotemporal information from sequential mammograms. Our findings suggest that RADIFUSION can provide clinicians with a powerful tool for breast cancer risk assessment.
Submitted 2 June, 2023; v1 submitted 1 April, 2023;
originally announced April 2023.
-
Using Large Language Models to Generate Engaging Captions for Data Visualizations
Authors:
Ashley Liew,
Klaus Mueller
Abstract:
Creating compelling captions for data visualizations has been a longstanding challenge. Visualization researchers are typically untrained in journalistic reporting, and hence the captions placed below data visualizations tend not to be very engaging and instead stick to basic observations about the data. In this work we explore the opportunities offered by the newly emerging crop of large language models (LLMs), which use sophisticated deep learning technology to produce human-like prose. We ask: can these powerful software devices be purposed to produce engaging captions for generic data visualizations such as a scatterplot? It turns out that the key challenge lies in designing the most effective prompt for the LLM, a task called prompt engineering. We report on first experiments using the popular LLM GPT-3 and deliver some promising results.
Submitted 27 December, 2022;
originally announced December 2022.
-
A Survey of Machine Unlearning
Authors:
Thanh Tam Nguyen,
Thanh Trung Huynh,
Zhao Ren,
Phi Le Nguyen,
Alan Wee-Chung Liew,
Hongzhi Yin,
Quoc Viet Hung Nguyen
Abstract:
Today, computer systems hold large amounts of personal data. Yet while such an abundance of data allows breakthroughs in artificial intelligence, and especially machine learning (ML), its existence can be a threat to user privacy, and it can weaken the bonds of trust between humans and AI. Recent regulations now require that, on request, private information about a user must be removed from both computer systems and ML models (i.e., "the right to be forgotten"). While removing data from back-end databases should be straightforward, it is not sufficient in the AI context, as ML models often "remember" the old data. Contemporary adversarial attacks on trained models have proven that we can learn whether an instance or an attribute belonged to the training data. This phenomenon calls for a new paradigm, namely machine unlearning, to make ML models forget about particular data. It turns out that recent works on machine unlearning have not been able to completely solve the problem due to the lack of common frameworks and resources. Therefore, this paper aspires to present a comprehensive examination of machine unlearning's concepts, scenarios, methods, and applications. Specifically, as a categorized collection of cutting-edge studies, the intention behind this article is to serve as a comprehensive resource for researchers and practitioners seeking an introduction to machine unlearning and its formulations, design criteria, removal requests, algorithms, and applications. In addition, we aim to highlight the key findings, current trends, and new research areas that have not yet featured the use of machine unlearning but could benefit greatly from it. We hope this survey serves as a valuable resource for ML researchers and those seeking to innovate privacy technologies. Our resources are publicly available at https://github.com/tamlhp/awesome-machine-unlearning.
Submitted 17 September, 2024; v1 submitted 6 September, 2022;
originally announced September 2022.
-
A hybrid privacy protection scheme for medical data
Authors:
Judy X Yang,
Hui Tian,
Alan Wee-Chung Liew,
Ernest Foo
Abstract:
Healthcare data contains sensitive information, and it is challenging to persuade healthcare data owners to share their information for research purposes without any privacy assurance. The proposed hybrid medical data privacy protection scheme explores the possibility of providing adaptive privacy protection and data utility levels. The evaluation result demonstrates that the scheme can provide adaptive privacy and data utility levels, and the data holder can choose their preferred risk level and data utility through the scheme. The evaluation results on the heart disease and diabetes data demonstrate that the scheme can provide a wide range of adaptive privacy protection and data utility levels to meet different privacy protection and data utility requirements.
Submitted 9 May, 2022; v1 submitted 29 April, 2022;
originally announced April 2022.
-
Regulating Ownership Verification for Deep Neural Networks: Scenarios, Protocols, and Prospects
Authors:
Fang-Qi Li,
Shi-Lin Wang,
Alan Wee-Chung Liew
Abstract:
With the broad application of deep neural networks, the necessity of protecting them as intellectual property has become evident. Numerous watermarking schemes have been proposed to identify the owner of a deep neural network and verify the ownership. However, most of them focus on watermark embedding rather than on the protocol for provable verification. To bridge the gap between those proposals and real-world demands, we study intellectual property protection for deep learning models in three scenarios: ownership proof, federated learning, and intellectual property transfer. We present one protocol for each scenario. These protocols raise several new requirements for the underlying watermarking schemes.
Submitted 20 August, 2021;
originally announced August 2021.
-
CASPIANET++: A Multidimensional Channel-Spatial Asymmetric Attention Network with Noisy Student Curriculum Learning Paradigm for Brain Tumor Segmentation
Authors:
Andrea Liew,
Chun Cheng Lee,
Boon Leong Lan,
Maxine Tan
Abstract:
Convolutional neural networks (CNNs) have been used quite successfully for semantic segmentation of brain tumors. However, current CNNs and attention mechanisms are stochastic in nature and neglect the morphological indicators used by radiologists to manually annotate regions of interest. In this paper, we introduce a channel- and spatial-wise asymmetric attention (CASPIAN) layer that leverages the inherent structure of tumors to detect regions of saliency. To demonstrate the efficacy of our proposed layer, we integrate it into a well-established CNN architecture to achieve higher Dice scores with fewer GPU resources. We also investigate the inclusion of auxiliary multiscale and multiplanar attention branches to increase the spatial context crucial in semantic segmentation tasks. The resulting architecture is the new CASPIANET++, which achieves Dice scores of 91.19% for whole tumor, 87.6% for tumor core and 81.03% for enhancing tumor. Furthermore, driven by the scarcity of brain tumor data, we investigate the Noisy Student method for segmentation tasks. Our new Noisy Student Curriculum Learning paradigm, which infuses noise incrementally to increase the complexity of the training images exposed to the network, further boosts the Dice score for the enhancing tumor region to 81.53%. Additional validation performed on the BraTS2020 data shows that the Noisy Student Curriculum Learning method works well without any additional training or fine-tuning.
Submitted 8 July, 2021;
originally announced July 2021.
-
Towards Practical Watermark for Deep Neural Networks in Federated Learning
Authors:
Fang-Qi Li,
Shi-Lin Wang,
Alan Wee-Chung Liew
Abstract:
With the wide application of deep neural networks, it is important to verify a host's possession of a deep neural network model and protect the model. To meet this goal, various mechanisms have been designed. By embedding extra information into a network and revealing it afterward, the watermark becomes a competitive candidate in proving integrity for deep learning systems. However, current watermarking schemes can hardly be adopted for emerging distributed learning paradigms that raise extra requirements during the ownership verification. A spearheading distributed learning paradigm is federated learning (FL), where many parties participate in training one single model. Each author participating in FL should be able to verify its ownership independently. Moreover, there are other potential threats and corresponding security requirements under this scenario. To meet those requirements, in this paper, we demonstrate a watermarking protocol for protecting deep neural networks in the setting of FL. By incorporating the state-of-the-art watermarking scheme and the cryptological primitive designed for distributed storage, the protocol meets the need for ownership verification in the FL scenario without violating the privacy of each participant. This work paves the way for generalizing the watermark as a practical security mechanism for protecting deep learning models in distributed learning platforms.
Submitted 16 July, 2021; v1 submitted 7 May, 2021;
originally announced May 2021.
-
Ensemble Learning based on Classifier Prediction Confidence and Comprehensive Learning Particle Swarm Optimisation for polyp localisation
Authors:
Truong Dang,
Thanh Nguyen,
John McCall,
Alan Wee-Chung Liew
Abstract:
Colorectal cancer (CRC) is the leading cause of death in many countries. CRC originates from small clumps of cells on the lining of the colon called polyps, which over time might grow and become malignant. Early detection and removal of polyps are therefore necessary for the prevention of colon cancer. In this paper, we introduce an ensemble of medical polyp segmentation algorithms. Based on the observation that different segmentation algorithms perform well on different subsets of examples, because of the nature and size of the training sets they have been exposed to and because of method-intrinsic factors, we propose to measure the confidence in the prediction of each algorithm and then use an associated threshold to determine whether the confidence is acceptable or not. An algorithm is selected for the ensemble if the confidence is below its associated threshold. The optimal threshold for each segmentation algorithm is found by using Comprehensive Learning Particle Swarm Optimization (CLPSO), a swarm intelligence algorithm. The Dice coefficient, a popular performance metric for image segmentation, is used as the fitness criterion. Experimental results on two polyp segmentation datasets, MICCAI2015 and Kvasir-SEG, confirm that our ensemble achieves better results compared to some well-known segmentation algorithms.
Submitted 10 April, 2021;
originally announced April 2021.
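The Dice coefficient that the abstract above names as its fitness criterion can be illustrated with a minimal Python sketch. The function name `dice_coefficient`, the flat 0/1 mask representation, and the convention of returning 1.0 when both masks are empty are assumptions made for illustration; none of them is taken from the paper.

```python
from typing import Sequence


def dice_coefficient(pred: Sequence[int], truth: Sequence[int]) -> float:
    """Dice = 2 * |A and B| / (|A| + |B|) for two binary masks of equal length."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same length")
    # Pixels marked as foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    # Foreground pixel counts of each mask, added together.
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    if total == 0:
        # Assumed convention: two empty masks agree perfectly.
        return 1.0
    return 2.0 * intersection / total


if __name__ == "__main__":
    predicted = [1, 1, 0, 0, 1]  # hypothetical flattened prediction mask
    reference = [1, 0, 0, 1, 1]  # hypothetical flattened ground-truth mask
    print(dice_coefficient(predicted, reference))
```

A score of 1.0 means the two masks coincide and 0.0 means they share no foreground pixel, which is why the metric can serve directly as a quantity to maximise.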
-
A new semi-supervised self-training method for lung cancer prediction
Authors:
Kelvin Shak,
Mundher Al-Shabi,
Andrea Liew,
Boon Leong Lan,
Wai Yee Chan,
Kwan Hoong Ng,
Maxine Tan
Abstract:
Background and Objective: Early detection of lung cancer is crucial as it has a high mortality rate, with patients commonly presenting with the disease at stage 3 and above. There are only relatively few methods that simultaneously detect and classify nodules from computed tomography (CT) scans. Furthermore, very few studies have used semi-supervised learning for lung cancer prediction. This study presents a complete end-to-end scheme to detect and classify lung nodules using the state-of-the-art Self-training with Noisy Student method on a comprehensive CT lung screening dataset of around 4,000 CT scans.
Methods: We used three datasets, namely LUNA16, LIDC and NLST, for this study. We first utilise a three-dimensional deep convolutional neural network model to detect lung nodules in the detection stage. The classification model known as Maxout Local-Global Network uses non-local networks to detect global features including shape features, residual blocks to detect local features including nodule texture, and a Maxout layer to detect nodule variations. We trained the first Self-training with Noisy Student model to predict lung cancer on the unlabelled NLST datasets. Then, we performed Mixup regularization to enhance our scheme and provide robustness to erroneous labels.
Results and Conclusions: Our new Mixup Maxout Local-Global network achieves an AUC of 0.87 on 2,005 completely independent testing scans from the NLST dataset. Our new scheme significantly outperformed the next highest performing method at the 5% significance level using DeLong's test (p = 0.0001). This study presents a new complete end-to-end scheme to predict lung cancer using Self-training with Noisy Student combined with Mixup regularization. On a completely independent dataset of 2,005 scans, we achieved state-of-the-art performance even with more images as compared to other methods.
Submitted 17 December, 2020;
originally announced December 2020.
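The Mixup regularization mentioned in the abstract above can be sketched in a few lines of Python. This is the generic Mixup recipe (a convex combination of two samples and of their labels, with the mixing weight drawn from a Beta distribution), not the authors' implementation; the function name `mixup_pair`, the list-based feature and label vectors, and the default `alpha=0.2` are assumptions chosen for illustration.

```python
import random
from typing import List, Optional, Tuple


def mixup_pair(
    x1: List[float],
    y1: List[float],
    x2: List[float],
    y2: List[float],
    alpha: float = 0.2,  # assumed value; the abstract does not state one
    rng: Optional[random.Random] = None,
) -> Tuple[List[float], List[float], float]:
    """Blend two (features, one-hot label) samples with weight lam ~ Beta(alpha, alpha)."""
    if len(x1) != len(x2) or len(y1) != len(y2):
        raise ValueError("both samples must have matching shapes")
    rng = rng or random.Random()
    lam = rng.betavariate(alpha, alpha)
    # The same weight is applied to the inputs and to the labels.
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y1, y2)]
    return x, y, lam


if __name__ == "__main__":
    rng = random.Random(0)
    mixed_x, mixed_y, lam = mixup_pair(
        [0.0, 2.0], [1.0, 0.0], [4.0, 6.0], [0.0, 1.0], alpha=0.4, rng=rng
    )
    print(lam, mixed_x, mixed_y)
```

Because the blended label is a soft target rather than a hard class, a single mislabelled sample contributes only part of the training signal, which is the usual intuition for why Mixup adds robustness to erroneous labels.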
-
Streaming Active Deep Forest for Evolving Data Stream Classification
Authors:
Anh Vu Luong,
Tien Thanh Nguyen,
Alan Wee-Chung Liew
Abstract:
In recent years, Deep Neural Networks (DNNs) have gained progressive momentum in many areas of machine learning. The layer-by-layer process of DNNs has inspired the development of many deep models, including deep ensembles. The most notable deep ensemble-based model is Deep Forest, which can achieve highly competitive performance while having far fewer hyper-parameters than DNNs. In spite of its huge success in the batch learning setting, no effort has been made to adapt Deep Forest to the context of evolving data streams. In this work, we introduce the Streaming Deep Forest (SDF) algorithm, a high-performance deep ensemble method specially adapted to stream classification. We also present the Augmented Variable Uncertainty (AVU) active learning strategy to reduce the labeling cost in the streaming context. We compare the proposed methods to state-of-the-art streaming algorithms on a wide range of datasets. The results show that by following the AVU active learning strategy, SDF with only 70% of the labeling budget significantly outperforms other methods trained with all instances.
Submitted 26 February, 2020;
originally announced February 2020.
-
Conditional Random Field and Deep Feature Learning for Hyperspectral Image Segmentation
Authors:
Fahim Irfan Alam,
Jun Zhou,
Alan Wee-Chung Liew,
Xiuping Jia,
Jocelyn Chanussot,
Yongsheng Gao
Abstract:
Image segmentation is considered to be one of the critical tasks in hyperspectral remote sensing image processing. Recently, the convolutional neural network (CNN) has established itself as a powerful model in segmentation and classification by demonstrating excellent performance. The use of a graphical model such as a conditional random field (CRF) contributes further to capturing contextual information and thus improving the segmentation performance. In this paper, we propose a method to segment hyperspectral images by considering both spectral and spatial information via a combined framework consisting of CNN and CRF. We use multiple spectral cubes to learn deep features using CNN, and then formulate deep CRF with CNN-based unary and pairwise potential functions to effectively extract the semantic correlations between patches consisting of three-dimensional data cubes. Effective piecewise training is applied in order to avoid the computationally expensive iterative CRF inference. Furthermore, we introduce a deep deconvolution network that improves the segmentation masks. We also introduce a new dataset and evaluate our proposed method on it, along with several widely adopted benchmark datasets, to assess the effectiveness of our method. By comparing our results with those from several state-of-the-art models, we show the promising potential of our method.
Submitted 27 December, 2017; v1 submitted 13 November, 2017;
originally announced November 2017.
-
An ensemble-based online learning algorithm for streaming data
Authors:
Tien Thanh Nguyen,
Thi Thu Thuy Nguyen,
Xuan Cuong Pham,
Alan Wee-Chung Liew,
James C. Bezdek
Abstract:
In this study, we introduce an ensemble-based approach for online machine learning. The ensemble of base classifiers in our approach is obtained by learning Naive Bayes classifiers on different training sets, which are generated by projecting the original training set to a lower-dimensional space. We propose a mechanism to learn sequences of data using a data-chunk paradigm. The experiments conducted on a number of UCI datasets and one synthetic dataset demonstrate that the proposed approach performs significantly better than some well-known online learning algorithms.
Submitted 25 April, 2017;
originally announced April 2017.
-
Aggregation of Classifiers: A Justifiable Information Granularity Approach
Authors:
Tien Thanh Nguyen,
Xuan Cuong Pham,
Alan Wee-Chung Liew,
Witold Pedrycz
Abstract:
In this study, we introduce a new approach to combine multiple classifiers in an ensemble system. Instead of using numeric membership values encountered in fixed combining rules, we construct interval membership values associated with each class prediction at the level of meta-data of observation by using concepts of information granules. In the proposed method, uncertainty (diversity) of findings produced by the base classifiers is quantified by interval-based information granules. The discriminative decision model is generated by considering both the bounds and the length of the obtained intervals. We select ten and then fifteen learning algorithms to build a heterogeneous ensemble system and then conduct experiments on a number of UCI datasets. The experimental results demonstrate that the proposed approach performs better than the benchmark algorithms, including six fixed combining methods, one trainable combining method, AdaBoost, Bagging, and Random Subspace.
Submitted 15 March, 2017;
originally announced March 2017.