Search | arXiv e-print repository

SiamNAS: Siamese Surrogate Model for Dominance Relation Prediction in Multi-objective Neural Architecture Search

Authors: Yuyang Zhou, Ferrante Neri, Yew-Soon Ong, Ruibin Bai

Abstract: Modern neural architecture search (NAS) is inherently multi-objective, balancing trade-offs such as accuracy, parameter count, and computational cost. This complexity makes NAS computationally expensive and nearly impossible to solve without efficient approximations. To address this, we propose a novel surrogate modelling approach that leverages an ensemble of Siamese network blocks to predict dominance relationships between candidate architectures. Lightweight and easy to train, the surrogate achieves 92% accuracy and replaces the crowding distance calculation in the survivor selection strategy with a heuristic rule based on model size. Integrated into a framework termed SiamNAS, this design eliminates costly evaluations during the search process. Experiments on NAS-Bench-201 demonstrate the framework's ability to identify Pareto-optimal solutions with significantly reduced computational costs. The proposed SiamNAS identified a final non-dominated set containing the best architecture in NAS-Bench-201 for CIFAR-10 and the second-best for ImageNet, in terms of test error rate, within 0.01 GPU days. This proof-of-concept study highlights the potential of the proposed Siamese network surrogate model to generalise to multi-tasking optimisation, enabling simultaneous optimisation across tasks. Additionally, it offers opportunities to extend the approach for generating Sets of Pareto Sets (SOS), providing diverse Pareto-optimal solutions for heterogeneous task settings.

Submitted 3 June, 2025; originally announced June 2025.

Comments: Genetic and Evolutionary Computation Conference (GECCO' 25)
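
The dominance relation that the SiamNAS abstract refers to can be illustrated with a minimal sketch. It assumes that every objective is minimised and that objective values are already known; the paper's surrogate instead predicts the relation without evaluating the architectures. The function names and the sample (test error, parameter count) pairs are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a Pareto-dominates b, assuming all objectives are minimised."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def non_dominated(points: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical candidates as (test error, parameter count in millions).
candidates = [(0.06, 1.2), (0.08, 0.4), (0.09, 0.9), (0.06, 1.5)]
front = non_dominated(candidates)
print(front)
```

In this sketch the third candidate is dominated by the second and the fourth by the first, so only the first two remain in the non-dominated set.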

arXiv:2506.02021 [pdf, ps, other]

Dynamic-Aware Video Distillation: Optimizing Temporal Resolution Based on Video Semantics

Authors: Yinjie Zhao, Heng Zhao, Bihan Wen, Yew-Soon Ong, Joey Tianyi Zhou

Abstract: With the rapid development of vision tasks and the scaling of datasets and models, redundancy reduction in vision datasets has become a key area of research. To address this issue, dataset distillation (DD) has emerged as a promising approach to generating highly compact synthetic datasets with significantly less redundancy while preserving essential information. However, while DD has been extensively studied for image datasets, DD on video datasets remains underexplored. Video datasets present unique challenges due to the presence of temporal information and varying levels of redundancy across different classes. Existing DD approaches assume a uniform level of temporal redundancy across all video semantics, which limits their effectiveness on video datasets. In this work, we propose Dynamic-Aware Video Distillation (DAViD), a Reinforcement Learning (RL) approach to predict the optimal Temporal Resolution of the synthetic videos. A teacher-in-the-loop reward function is proposed to update the RL agent policy. To the best of our knowledge, this is the first study to introduce adaptive temporal resolution based on video semantics in video dataset distillation. Our approach significantly outperforms existing DD methods, demonstrating substantial improvements in performance. This work paves the way for future research on more efficient and semantic-adaptive video dataset distillation.

Submitted 28 May, 2025; originally announced June 2025.

arXiv:2506.00490 [pdf, ps, other]

LLM-Driven Instance-Specific Heuristic Generation and Selection

Authors: Shaofeng Zhang, Shengcai Liu, Ning Lu, Jiahao Wu, Ji Liu, Yew-Soon Ong, Ke Tang

Abstract: Combinatorial optimization problems are widely encountered in real-world applications. Designing high-quality heuristic algorithms that efficiently approximate optimal solutions within a reasonable time is a critical research challenge. In recent years, many works have explored integrating Large Language Models (LLMs) with Evolutionary Algorithms to automate heuristic algorithm design through prompt engineering. However, these approaches generally adopt a problem-specific paradigm, applying a single algorithm across all problem instances and failing to account for the heterogeneity across instances. In this paper, we propose InstSpecHH, a novel framework that introduces the concept of instance-specific heuristic generation. InstSpecHH partitions the overall problem class into subclasses based on instance features and performs differentiated, automated heuristic design for each problem subclass. By tailoring heuristics to the unique features of different subclasses, InstSpecHH achieves better performance at the problem class level while avoiding redundant heuristic generation for similar instances, thus reducing computational overhead. This approach effectively balances the trade-off between the cost of automatic heuristic design and the quality of the obtained solutions. To evaluate the performance of InstSpecHH, we conduct experiments on 4,500 subclasses of the Online Bin Packing Problem (OBPP) and 365 subclasses of the Capacitated Vehicle Routing Problem (CVRP). Experimental results show that InstSpecHH demonstrates strong intra-subclass and inter-subclass generalization capabilities. Compared to previous problem-specific methods, InstSpecHH reduces the average optimality gap by more than 5.6% for OBPP and 0.9% for CVRP. These results highlight the potential of instance-aware automatic heuristic design to further enhance solution quality.

Submitted 2 June, 2025; v1 submitted 31 May, 2025; originally announced June 2025.

arXiv:2505.22967 [pdf, ps, other]

MermaidFlow: Redefining Agentic Workflow Generation via Safety-Constrained Evolutionary Programming

Authors: Chengqi Zheng, Jianda Chen, Yueming Lyu, Wen Zheng Terence Ng, Haopeng Zhang, Yew-Soon Ong, Ivor Tsang, Haiyan Yin

Abstract: Despite the promise of autonomous agentic reasoning, existing workflow generation methods frequently produce fragile, unexecutable plans due to unconstrained LLM-driven construction. We introduce MermaidFlow, a framework that redefines the agentic search space through safety-constrained graph evolution. At its core, MermaidFlow represents workflows as a verifiable intermediate representation using Mermaid, a structured and human-interpretable graph language. We formulate domain-aware evolutionary operators, i.e., crossover, mutation, insertion, and deletion, to preserve semantic correctness while promoting structural diversity, enabling efficient exploration of a high-quality, statically verifiable workflow space. Without modifying task settings or evaluation protocols, MermaidFlow achieves consistent improvements in success rates and faster convergence to executable plans on the agent reasoning benchmark. The experimental results demonstrate that safety-constrained graph evolution offers a scalable, modular foundation for robust and interpretable agentic reasoning systems.

Submitted 28 May, 2025; originally announced May 2025.

arXiv:2505.20648 [pdf, other]

Voronoi-grid-based Pareto Front Learning and Its Application to Collaborative Federated Learning

Authors: Mengmeng Chen, Xiaohu Wu, Qiqi Liu, Tiantian He, Yew-Soon Ong, Yaochu Jin, Qicheng Lao, Han Yu

Abstract: Multi-objective optimization (MOO) is widespread in machine learning and aims to find a set of Pareto-optimal solutions, called the Pareto front; for example, it is fundamental to multiple avenues of research in federated learning (FL). Pareto-Front Learning (PFL) is a powerful method implemented using Hypernetworks (PHNs) to approximate the Pareto front. This method enables the acquisition of a mapping function from a given preference vector to the solutions on the Pareto front. However, most existing PFL approaches still face two challenges: (a) sampling rays in high-dimensional spaces; (b) failing to cover the entire Pareto front which has a convex shape. Here, we introduce a novel PFL framework, called PHN-HVVS, which decomposes the design space into Voronoi grids and deploys a genetic algorithm (GA) for Voronoi grid partitioning within high-dimensional space. We put forward a new loss function, which effectively contributes to more extensive coverage of the resultant Pareto front and maximizes the HV Indicator. Experimental results on multiple MOO machine learning tasks demonstrate that PHN-HVVS significantly outperforms the baselines in generating the Pareto front. We also illustrate that PHN-HVVS advances the methodologies of several recent problems in the FL field. The code is available at https://github.com/buptcmm/phnhvvs.

Submitted 26 May, 2025; originally announced May 2025.

arXiv:2505.13529 [pdf, other]

BARREL: Boundary-Aware Reasoning for Factual and Reliable LRMs

Authors: Junxiao Yang, Jinzhe Tu, Haoran Liu, Xiaoce Wang, Chujie Zheng, Zhexin Zhang, Shiyao Cui, Caishun Chen, Tiantian He, Hongning Wang, Yew-Soon Ong, Minlie Huang

Abstract: Recent advances in Large Reasoning Models (LRMs) have shown impressive capabilities in mathematical and logical reasoning. However, current LRMs rarely admit ignorance or respond with "I don't know". Instead, they often produce incorrect answers while showing undue confidence, raising concerns about their factual reliability. In this work, we identify two pathological reasoning patterns characterized by overthinking that contribute to the overconfident and incorrect answers: last-minute guessing and second-thought spiraling. To address these issues, we propose BARREL, a novel framework that promotes concise and boundary-aware factual reasoning. Our experiments show that BARREL-training increases the reliability of DeepSeek-R1-Distill-Llama-8B from 39.33% to 61.48%, while still achieving accuracy comparable to models finetuned on reasoning data generated by R1. These results suggest that our pilot study can inspire the development of more reliable and factual System 2 LRMs.

Submitted 18 May, 2025; originally announced May 2025.

arXiv:2505.12038 [pdf, other]

Safe Delta: Consistently Preserving Safety when Fine-Tuning LLMs on Diverse Datasets

Authors: Ning Lu, Shengcai Liu, Jiahao Wu, Weiyu Chen, Zhirui Zhang, Yew-Soon Ong, Qi Wang, Ke Tang

Abstract: Large language models (LLMs) have shown great potential as general-purpose AI assistants across various domains. To fully leverage this potential in specific applications, many companies provide fine-tuning API services, enabling users to upload their own data for LLM customization. However, fine-tuning services introduce a new safety threat: user-uploaded data, whether harmful or benign, can break the model's alignment, leading to unsafe outputs. Moreover, existing defense methods struggle to address the diversity of fine-tuning datasets (e.g., varying sizes, tasks), often sacrificing utility for safety or vice versa. To address this issue, we propose Safe Delta, a safety-aware post-training defense method that adjusts the delta parameters (i.e., the parameter change before and after fine-tuning). Specifically, Safe Delta estimates the safety degradation, selects delta parameters to maximize utility while limiting overall safety loss, and applies a safety compensation vector to mitigate residual safety loss. Through extensive experiments on four diverse datasets with varying settings, our approach consistently preserves safety while ensuring that the utility gain from benign datasets remains unaffected.

Submitted 17 May, 2025; originally announced May 2025.

Comments: ICML 2025 Camera Ready

arXiv:2505.07972 [pdf, other]

GUP Effective Metric Without GUP: Implications for the Sign of GUP Parameter and Quantum Bounce

Authors: Yen Chin Ong

Abstract: The standard form of the generalized uncertainty principle (GUP) predicts that the Hawking temperature is modified near the Planck scale and that the Bekenstein-Hawking entropy receives a logarithmic correction, consistent with other approaches to quantum gravity. However, due to the heuristic arguments in most GUP literature, it is not clear how to obtain the Schwarzschild metric that incorporates the GUP correction. In this work, we try a different approach. We start with the entropy expression with the standard logarithmic correction term, and use the recently proposed "generalized entropy and varying-G correspondence" (GEVAG) to obtain the associated metric. We show that the Hawking temperature obtained from this metric matches the GUP version. In this sense, we have derived, in a consistent and reliable manner, a metric tensor that can describe the standard GUP physics, and we use it to clarify some shortcomings in the heuristic GUP approach itself. In particular, if the strict Bekenstein bound is imposed, then the GUP parameter is negative. We also speculate on the possibility that instead of a stable remnant, the final stage of black hole evaporation could be a "bounce" due to an effective gravitational repulsion, once higher order corrections are included.

Submitted 12 May, 2025; originally announced May 2025.

Comments: 8 pages, 3 figures

arXiv:2505.03907 [pdf, other]

Do Black Holes With Generalized Entropy Violate Bekenstein Bound?

Authors: Hengxin Lu, Yen Chin Ong

Abstract: In general yes, but also not quite. It is known that if the Bekenstein-Hawking entropy is replaced by some kind of generalized entropy, then the Bekenstein bound may be grossly violated. In this work, we show that this undesired violation can be avoided if we employ the equivalence between generalized entropy and varying-$G$ gravity (GEVAG). In this approach, modifying entropy necessarily also modifies gravity (as one should expect if gravity is indeed inherently tied to thermodynamics), which leads to an effective gravitational "constant" $G_\text{eff}$ that is area-dependent, and a thermodynamic energy that is distinct from the ADM mass. We show that a relaxed Bekenstein bound of the form $S \leqslant CRE$ is always satisfied, although the coefficient $C$ is no longer $2\pi$.

Submitted 6 May, 2025; originally announced May 2025.
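
For reference, the standard Bekenstein bound and the relaxed form quoted in the abstract above can be written side by side. This assumes natural units ($\hbar = c = k_B = 1$); the value of $C$ in the relaxed form is not given in the abstract and is left unspecified here.

```latex
\[
  S \leqslant 2\pi R E \quad \text{(standard Bekenstein bound)},
  \qquad
  S \leqslant C R E, \quad C \neq 2\pi \quad \text{(relaxed bound)}.
\]
```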

arXiv:2505.00998 [pdf, other]

Deterministic-to-Stochastic Diverse Latent Feature Mapping for Human Motion Synthesis

Authors: Yu Hua, Weiming Liu, Gui Xu, Yaqing Hou, Yew-Soon Ong, Qiang Zhang

Abstract: Human motion synthesis aims to generate plausible human motion sequences and has attracted widespread attention in computer animation. Recent score-based generative models (SGMs) have demonstrated impressive results on this task. However, their training process involves complex curvature trajectories, leading to an unstable training process. In this paper, we propose a Deterministic-to-Stochastic Diverse Latent Feature Mapping (DSDFM) method for human motion synthesis. DSDFM consists of two stages. The first human motion reconstruction stage aims to learn the latent space distribution of human motions. The second diverse motion generation stage aims to build connections between the Gaussian distribution and the latent space distribution of human motions, thereby enhancing the diversity and accuracy of the generated human motions. This stage is achieved by the designed deterministic feature mapping procedure with DerODE and stochastic diverse output generation procedure with DivSDE. DSDFM is easy to train compared to previous SGMs-based methods and can enhance diversity without introducing additional training parameters. Through qualitative and quantitative experiments, DSDFM achieves state-of-the-art results surpassing the latest methods, validating its superiority in human motion synthesis.

Submitted 2 May, 2025; originally announced May 2025.

Journal ref: The IEEE/CVF Conference on Computer Vision and Pattern Recognition 2025

arXiv:2504.11511 [pdf, other]

Position Paper: Rethinking Privacy in RL for Sequential Decision-making in the Age of LLMs

Authors: Flint Xiaofeng Fan, Cheston Tan, Roger Wattenhofer, Yew-Soon Ong

Abstract: The rise of reinforcement learning (RL) in critical real-world applications demands a fundamental rethinking of privacy in AI systems. Traditional privacy frameworks, designed to protect isolated data points, fall short for sequential decision-making systems where sensitive information emerges from temporal patterns, behavioral strategies, and collaborative dynamics. Modern RL paradigms, such as federated RL (FedRL) and RL with human feedback (RLHF) in large language models (LLMs), exacerbate these challenges by introducing complex, interactive, and context-dependent learning environments that traditional methods do not address. In this position paper, we argue for a new privacy paradigm built on four core principles: multi-scale protection, behavioral pattern protection, collaborative privacy preservation, and context-aware adaptation. These principles expose inherent tensions between privacy, utility, and interpretability that must be navigated as RL systems become more pervasive in high-stakes domains like healthcare, autonomous vehicles, and decision support systems powered by LLMs. To tackle these challenges, we call for the development of new theoretical frameworks, practical mechanisms, and rigorous evaluation methodologies that collectively enable effective privacy protection in sequential decision-making systems.

Submitted 15 April, 2025; originally announced April 2025.

Comments: Accepted to IJCNN 2025 Position Paper Track

arXiv:2503.08394 [pdf, other]

($\boldsymbol{\theta}_l, \boldsymbol{\theta}_u$)-Parametric Multi-Task Optimization: Joint Search in Solution and Infinite Task Spaces

Authors: Tingyang Wei, Jiao Liu, Abhishek Gupta, Puay Siew Tan, Yew-Soon Ong

Abstract: Multi-task optimization is typically characterized by a fixed and finite set of optimization tasks. The present paper relaxes this condition by considering a non-fixed and potentially infinite set of optimization tasks defined in a parameterized, continuous and bounded task space. We refer to this unique problem setting as parametric multi-task optimization (PMTO). Assuming the bounds of the task parameters to be ($\boldsymbol{\theta}_l$, $\boldsymbol{\theta}_u$), a novel ($\boldsymbol{\theta}_l$, $\boldsymbol{\theta}_u$)-PMTO algorithm is crafted to enable joint search over tasks and their solutions. This joint search is supported by two approximation models: (1) for mapping solutions to the objective spaces of all tasks, which provably accelerates convergence by acting as a conduit for inter-task knowledge transfers, and (2) for probabilistically mapping tasks to the solution space, which facilitates evolutionary exploration of under-explored regions of the task space. At the end of a full ($\boldsymbol{\theta}_l$, $\boldsymbol{\theta}_u$)-PMTO run, the acquired models enable rapid identification of optimized solutions for any task lying within the specified bounds. This outcome is validated on both synthetic test problems and practical case studies, with the significant real-world applicability of PMTO shown towards fast reconfiguration of robot controllers under changing task conditions. The potential of PMTO to vastly speed up the search for solutions to minimax optimization problems is also demonstrated through an example in robust engineering design.

Submitted 11 March, 2025; originally announced March 2025.

arXiv:2503.05246 [pdf, other]

Mastering Continual Reinforcement Learning through Fine-Grained Sparse Network Allocation and Dormant Neuron Exploration

Authors: Chengqi Zheng, Haiyan Yin, Jianda Chen, Terence Ng, Yew-Soon Ong, Ivor Tsang

Abstract: Continual Reinforcement Learning (CRL) is essential for developing agents that can learn, adapt, and accumulate knowledge over time. However, a fundamental challenge persists as agents must strike a delicate balance between plasticity, which enables rapid skill acquisition, and stability, which ensures long-term knowledge retention while preventing catastrophic forgetting. In this paper, we introduce SSDE, a novel structure-based approach that enhances plasticity through a fine-grained allocation strategy with Structured Sparsity and Dormant-guided Exploration. SSDE decomposes the parameter space into forward-transfer (frozen) parameters and task-specific (trainable) parameters. Crucially, these parameters are allocated by an efficient co-allocation scheme under sparse coding, ensuring sufficient trainable capacity for new tasks while promoting efficient forward transfer through frozen parameters. However, structure-based methods often suffer from rigidity due to the accumulation of non-trainable parameters, limiting exploration and adaptability. To address this, we further introduce a sensitivity-guided neuron reactivation mechanism that systematically identifies and resets dormant neurons, which exhibit minimal influence in the sparse policy network during inference. This approach effectively enhances exploration while preserving structural efficiency. Extensive experiments on the CW10-v1 Continual World benchmark demonstrate that SSDE achieves state-of-the-art performance, reaching a success rate of 95%, surpassing prior methods significantly in both plasticity and stability trade-offs (code is available at: https://github.com/chengqiArchy/SSDE).

Submitted 9 March, 2025; v1 submitted 7 March, 2025; originally announced March 2025.

arXiv:2502.15685 [pdf, other]

Active Large Language Model-based Knowledge Distillation for Session-based Recommendation

Authors: Yingpeng Du, Zhu Sun, Ziyan Wang, Haoyan Chua, Jie Zhang, Yew-Soon Ong

Abstract: Large language models (LLMs) provide a promising way for accurate session-based recommendation (SBR), but they demand substantial computational time and memory. Knowledge distillation (KD)-based methods can alleviate these issues by transferring the knowledge to a small student, which is trained on the predictions of a cumbersome teacher. However, these methods encounter difficulties for LLM-based KD in SBR. 1) It is expensive to make LLMs predict for all instances in KD. 2) LLMs may make ineffective predictions for some instances in KD, e.g., incorrect predictions for hard instances or predictions similar to those of existing recommenders for easy instances. In this paper, we propose an active LLM-based KD method in SBR, contributing to sustainable AI. To efficiently distill knowledge from LLMs with limited cost, we propose to extract a small proportion of instances predicted by LLMs. Meanwhile, for a more effective distillation, we propose an active learning strategy to extract instances that are as effective as possible for KD from a theoretical view. Specifically, we first formulate gains based on potential effects (e.g., effective, similar, and incorrect predictions by LLMs) and difficulties (e.g., easy or hard to fit) of instances for KD. Then, we propose to maximize the minimal gains of distillation to find the optimal selection policy for active learning, which can largely avoid extracting ineffective instances in KD. Experiments on real-world datasets show that our method significantly outperforms state-of-the-art methods for SBR.

Submitted 15 December, 2024; originally announced February 2025.

Comments: 14 pages, 4 figures

ACM Class: H.4

arXiv:2502.13180 [pdf, other]

Uncertain Multi-Objective Recommendation via Orthogonal Meta-Learning Enhanced Bayesian Optimization

Authors: Hongxu Wang, Zhu Sun, Yingpeng Du, Lu Zhang, Tiantian He, Yew-Soon Ong

Abstract: Recommender systems (RSs) play a crucial role in shaping our digital interactions, influencing how we access and engage with information across various domains. Traditional research has predominantly centered on maximizing recommendation accuracy, often leading to unintended side effects such as echo chambers and constrained user experiences. Drawing inspiration from autonomous driving, we introduce a novel framework that categorizes RS autonomy into five distinct levels, ranging from basic rule-based accuracy-driven systems to behavior-aware, uncertain multi-objective RSs - where users may have varying needs, such as accuracy, diversity, and fairness. In response, we propose an approach that dynamically identifies and optimizes multiple objectives based on individual user preferences, fostering more ethical and intelligent user-centric recommendations. To navigate the uncertainty inherent in multi-objective RSs, we develop a Bayesian optimization (BO) framework that captures personalized trade-offs between different objectives while accounting for their uncertain interdependencies. Furthermore, we introduce an orthogonal meta-learning paradigm to enhance BO efficiency and effectiveness by leveraging shared knowledge across similar tasks and mitigating conflicts among objectives through the discovery of orthogonal information. Finally, extensive empirical evaluations demonstrate the effectiveness of our method in optimizing uncertain multi-objectives for individual users, paving the way for more adaptive and user-focused RSs.

Submitted 18 February, 2025; originally announced February 2025.

arXiv:2502.01692 [pdf, other]

Fast Direct: Query-Efficient Online Black-box Guidance for Diffusion-model Target Generation

Authors: Kim Yong Tan, Yueming Lyu, Ivor Tsang, Yew-Soon Ong

Abstract: Guided diffusion-model generation is a promising direction for customizing the generation process of a pre-trained diffusion model to address specific downstream tasks. Existing guided diffusion models either rely on training the guidance model with pre-collected datasets or require the objective functions to be differentiable. However, for most real-world tasks, offline datasets are often unavailable, and their objective functions are often not differentiable, such as image generation with human preferences, molecular generation for drug discovery, and material design. Thus, we need an online algorithm capable of collecting data during runtime and supporting a black-box objective function. Moreover, the query efficiency of the algorithm is also critical because the objective evaluation of the query is often expensive in real-world scenarios. In this work, we propose a novel and simple algorithm, Fast Direct, for query-efficient online black-box target generation. Our Fast Direct builds a pseudo-target on the data manifold to update the noise sequence of the diffusion model with a universal direction, which is promising to perform query-efficient guided generation. Extensive experiments on twelve high-resolution ($1024 \times 1024$) image target generation tasks and six 3D-molecule target generation tasks show $6\times$ up to $10\times$ query efficiency improvement and $11\times$ up to $44\times$ query efficiency improvement, respectively. Our implementation is publicly available at: https://github.com/kimyong95/guide-stable-diffusion/tree/fast-direct

Submitted 29 March, 2025; v1 submitted 2 February, 2025; originally announced February 2025.

arXiv:2501.13337 [pdf, other]

doi 10.1109/TCYB.2022.3165044

Generative Multi-Form Bayesian Optimization

Authors: Zhendong Guo, Haitao Liu, Yew-Soon Ong, Xinghua Qu, Yuzhe Zhang, Jianmin Zheng

Abstract: Many real-world problems, such as airfoil design, involve optimizing a black-box expensive objective function over complex structured input space (e.g., discrete space or non-Euclidean space). By mapping the complex structured input space into a latent space of dozens of variables, a two-stage procedure labeled as generative model based optimization (GMO) in this paper, shows promise in solving such problems. However, the latent dimension of GMO is hard to determine, which may trigger the conflicting issue between desirable solution accuracy and convergence rate. To address the above issue, we propose a multi-form GMO approach, namely generative multi-form optimization (GMFoO), which conducts optimization over multiple latent spaces simultaneously to complement each other. More specifically, we devise a generative model which promotes positive correlation between latent spaces to facilitate effective knowledge transfer in GMFoO. And further, by using Bayesian optimization (BO) as the optimizer, we propose two strategies to exchange information between these latent spaces continuously. Experimental results are presented on airfoil and corbel design problems and an area maximization problem as well to demonstrate that our proposed GMFoO converges to better designs on a limited computational budget.

Submitted 22 January, 2025; originally announced January 2025.

Journal ref: in IEEE Transactions on Cybernetics, vol. 53, no. 7, pp. 4347-4360, July 2023

arXiv:2501.13332 [pdf, other]

doi 10.1109/TCYB.2022.3168551

Co-Learning Bayesian Optimization

Authors: Zhendong Guo, Yew-Soon Ong, Tiantian He, Haitao Liu

Abstract: Bayesian optimization (BO) is well known to be sample-efficient for solving black-box problems. However, the BO algorithms can sometimes get stuck in suboptimal solutions even with plenty of samples. Intrinsically, such suboptimal problem of BO can be attributed to the poor surrogate accuracy of the trained Gaussian process (GP), particularly in the regions where the optimal solutions are located. Hence, we propose to build multiple GP models instead of a single GP surrogate to complement each other and thus resolve the suboptimal problem of BO. Nevertheless, according to the bias-variance tradeoff equation, the individual prediction errors can increase when increasing the diversity of models, which may lead to even worse overall surrogate accuracy. On the other hand, based on the theory of Rademacher complexity, it has been proved that exploiting the agreement of models on unlabeled information can help to reduce the complexity of the hypothesis space, and therefore achieve the required surrogate accuracy with fewer samples. Such value of model agreement has been extensively demonstrated for co-training style algorithms to boost model accuracy with a small portion of samples. Inspired by the above, we propose a novel BO algorithm labeled as co-learning BO (CLBO), which exploits both model diversity and agreement on unlabeled information to improve the overall surrogate accuracy with limited samples, and therefore achieves more efficient global optimization.
Through tests on five numerical toy problems and three engineering benchmarks, the effectiveness of the proposed CLBO has been well demonstrated.

Submitted 22 January, 2025; originally announced January 2025.

Journal ref: in IEEE Transactions on Cybernetics, vol. 52, no. 9, pp. 9820-9833, Sept. 2022

arXiv:2501.08085 [pdf, other]

Dynamic Multimodal Sentiment Analysis: Leveraging Cross-Modal Attention for Enabled Classification

Authors: Hui Lee, Singh Suniljit, Yong Siang Ong

Abstract: This paper explores the development of a multimodal sentiment analysis model that integrates text, audio, and visual data to enhance sentiment classification. The goal is to improve emotion detection by capturing the complex interactions between these modalities, thereby enabling more accurate and nuanced sentiment interpretation. The study evaluates three feature fusion strategies -- late stage fusion, early stage fusion, and multi-headed attention -- within a transformer-based architecture. Experiments were conducted using the CMU-MOSEI dataset, which includes synchronized text, audio, and visual inputs labeled with sentiment scores. Results show that early stage fusion significantly outperforms late stage fusion, achieving an accuracy of 71.87%, while the multi-headed attention approach offers marginal improvement, reaching 72.39%. The findings suggest that integrating modalities early in the process enhances sentiment classification, while attention mechanisms may have limited impact within the current framework. Future work will focus on refining feature fusion techniques, incorporating temporal data, and exploring dynamic feature weighting to further improve model performance.

Submitted 14 January, 2025; originally announced January 2025.

arXiv:2501.06572 [pdf, other]

Evolutionary Optimization of Physics-Informed Neural Networks: Survey and Prospects

Authors: Jian Cheng Wong, Abhishek Gupta, Chin Chun Ooi, Pao-Hsiung Chiu, Jiao Liu, Yew-Soon Ong

Abstract: Deep learning models trained on finite data lack a complete understanding of the physical world. On the other hand, physics-informed neural networks (PINNs) are infused with such knowledge through the incorporation of mathematically expressible laws of nature into their training loss function. By complying with physical laws, PINNs provide advantages over purely data-driven models in limited-data regimes. This feature has propelled them to the forefront of scientific machine learning, a domain characterized by scarce and costly data. However, the vision of accurate physics-informed learning comes with significant challenges. This review examines PINNs for the first time in terms of model optimization and generalization, shedding light on the need for new algorithmic advances to overcome issues pertaining to the training speed, precision, and generalizability of today's PINN models. Of particular interest are the gradient-free methods of neuroevolution for optimizing the uniquely complex loss landscapes arising in PINN training. Methods synergizing gradient descent and neuroevolution for discovering bespoke neural architectures and balancing multiple conflicting terms in physics-informed learning objectives are positioned as important avenues for future research. Yet another exciting track is to cast neuroevolution as a meta-learner of generalizable PINN models.

Submitted 31 March, 2025; v1 submitted 11 January, 2025; originally announced January 2025.

Comments: 20 pages, 8 figures, 1 table

arXiv:2412.15538 [pdf, other]

FedRLHF: A Convergence-Guaranteed Federated Framework for Privacy-Preserving and Personalized RLHF

Authors: Flint Xiaofeng Fan, Cheston Tan, Yew-Soon Ong, Roger Wattenhofer, Wei-Tsang Ooi

Abstract: In the era of increasing privacy concerns and demand for personalized experiences, traditional Reinforcement Learning with Human Feedback (RLHF) frameworks face significant challenges due to their reliance on centralized data. We introduce Federated Reinforcement Learning with Human Feedback (FedRLHF), a novel framework that decentralizes the RLHF process. FedRLHF enables collaborative policy learning across multiple clients without necessitating the sharing of raw data or human feedback, thereby ensuring robust privacy preservation. Leveraging federated reinforcement learning, each client integrates human feedback locally into their reward functions and updates their policies through personalized RLHF processes. We establish rigorous theoretical foundations for FedRLHF, providing convergence guarantees, and deriving sample complexity bounds that scale efficiently with the number of clients. Empirical evaluations on the MovieLens and IMDb datasets demonstrate that FedRLHF not only preserves user privacy but also achieves performance on par with centralized RLHF, while enhancing personalization across diverse client environments.

Submitted 7 February, 2025; v1 submitted 19 December, 2024; originally announced December 2024.

Comments: Updated for AAMAS 2025 camera-ready. This preprint represents the full version of the paper, including all proofs, experimental details, and additional discussions

ACM Class: I.2.11

arXiv:2412.14500 [pdf, other]

The Digital Ecosystem of Beliefs: does evolution favour AI over humans?

Authors: David M. Bossens, Shanshan Feng, Yew-Soon Ong

Abstract: As AI systems are integrated into social networks, there are AI safety concerns that AI-generated content may dominate the web, e.g. in popularity or impact on beliefs. To understand such questions, this paper proposes the Digital Ecosystem of Beliefs (Digico), the first evolutionary framework for controlled experimentation with multi-population interactions in simulated social networks. The framework models a population of agents which change their messaging strategies due to evolutionary updates following a Universal Darwinism approach, interact via messages, influence each other's beliefs through dynamics based on a contagion model, and maintain their beliefs through cognitive Lamarckian inheritance. Initial experiments with an abstract implementation of Digico show that: a) when AIs have faster messaging, evolution, and more influence in the recommendation algorithm, they get 80% to 95% of the views, depending on the size of the influence benefit; b) AIs designed for propaganda can typically convince 50% of humans to adopt extreme beliefs, and up to 85% when agents believe only a limited number of channels; c) a penalty for content that violates agents' beliefs reduces propaganda effectiveness by up to 8%. We further discuss implications for control (e.g. legislation) and Digico as a means of studying evolutionary principles.

Submitted 8 January, 2025; v1 submitted 18 December, 2024; originally announced December 2024.

arXiv:2412.07796 [pdf, other]

MRP-LLM: Multitask Reflective Large Language Models for Privacy-Preserving Next POI Recommendation

Authors: Ziqing Wu, Zhu Sun, Dongxia Wang, Lu Zhang, Jie Zhang, Yew Soon Ong

Abstract: Large language models (LLMs) have shown promising potential for next Point-of-Interest (POI) recommendation. However, existing methods only perform direct zero-shot prompting, leading to ineffective extraction of user preferences, insufficient injection of collaborative signals, and a lack of user privacy protection. As such, we propose a novel Multitask Reflective Large Language Model for Privacy-preserving Next POI Recommendation (MRP-LLM), aiming to exploit LLMs for better next POI recommendation while preserving user privacy. Specifically, the Multitask Reflective Preference Extraction Module first utilizes LLMs to distill each user's fine-grained (i.e., categorical, temporal, and spatial) preferences into a knowledge base (KB). The Neighbor Preference Retrieval Module retrieves and summarizes the preferences of similar users from the KB to obtain collaborative signals. Subsequently, aggregating the user's preferences with those of similar users, the Multitask Next POI Recommendation Module generates the next POI recommendations via multitask prompting. Meanwhile, during data collection, a Privacy Transmission Module is specifically devised to preserve sensitive POI data. Extensive experiments on three real-world datasets demonstrate the efficacy of our proposed MRP-LLM in providing more accurate next POI recommendations with user privacy preserved.

Submitted 3 December, 2024; originally announced December 2024.

Comments: 14 pages, 7 figures

arXiv:2412.04826 [pdf, other]

Pushing Rendering Boundaries: Hard Gaussian Splatting

Authors: Qingshan Xu, Jiequan Cui, Xuanyu Yi, Yuxuan Wang, Yuan Zhou, Yew-Soon Ong, Hanwang Zhang

Abstract: 3D Gaussian Splatting (3DGS) has demonstrated impressive Novel View Synthesis (NVS) results in a real-time rendering manner. During training, it relies heavily on the average magnitude of view-space positional gradients to grow Gaussians to reduce rendering loss. However, this average operation smooths the positional gradients from different viewpoints and rendering errors from different pixels, hindering the growth and optimization of many defective Gaussians. This leads to strong spurious artifacts in some areas. To address this problem, we propose Hard Gaussian Splatting, dubbed HGS, which considers multi-view significant positional gradients and rendering errors to grow hard Gaussians that fill the gaps of classical Gaussian Splatting on 3D scenes, thus achieving superior NVS results. In detail, we present positional gradient driven HGS, which leverages multi-view significant positional gradients to uncover hard Gaussians. Moreover, we propose rendering error guided HGS, which identifies noticeable pixel rendering errors and potentially over-large Gaussians to jointly mine hard Gaussians. By growing and optimizing these hard Gaussians, our method helps to resolve blurring and needle-like artifacts. Experiments on various datasets demonstrate that our method achieves state-of-the-art rendering quality while maintaining real-time efficiency.

Submitted 6 December, 2024; originally announced December 2024.

arXiv:2412.00322 [pdf, other]

The Case For Black Hole Remnants: A Review

Authors: Yen Chin Ong

Abstract: It has been almost 40 years since the proposal of the idea that Hawking radiation of black holes does not lead to a complete evaporation but rather a "remnant" state. Though traditionally viewed with great criticisms especially from the high energy physics community, in recent years, various approaches have demonstrated that black hole remnants remain a viable possibility. In this review, which is primarily aimed as an introduction to the subject, we will discuss some possible routes to forming remnants and their respective properties and challenges.

Submitted 17 January, 2025; v1 submitted 29 November, 2024; originally announced December 2024.

Comments: Invited contribution for the book "The Black Hole Information Paradox" (Eds. Ali Akil and Cosimo Bambi, Springer Singapore, expected in 2025); ver.2: improved and expanded some discussions; fixed typos and grammatical errors

arXiv:2412.00111 [pdf, other]

Video Set Distillation: Information Diversification and Temporal Densification

Authors: Yinjie Zhao, Heng Zhao, Bihan Wen, Yew-Soon Ong, Joey Tianyi Zhou

Abstract: The rapid development of AI models has led to a growing emphasis on enhancing their capabilities for complex input data such as videos. While large-scale video datasets have been introduced to support this growth, the unique challenges of reducing redundancies in video sets have not been explored. Compared to image datasets or individual videos, video sets have a two-layer nested structure, where the outer layer is the collection of individual videos, and the inner layer contains the correlations among frame-level data points to provide temporal information. Video sets have two dimensions of redundancies: within-sample and inter-sample redundancies. Existing methods like key frame selection, dataset pruning or dataset distillation are not addressing the unique challenge of video sets since they aimed at reducing redundancies in only one of the dimensions. In this work, we are the first to study Video Set Distillation, which synthesizes optimized video data by jointly addressing within-sample and inter-sample redundancies. Our Information Diversification and Temporal Densification (IDTD) method jointly reduces redundancies across both dimensions. This is achieved through a Feature Pool and Feature Selectors mechanism to preserve inter-sample diversity, alongside a Temporal Fusor that maintains temporal information density within synthesized videos. Our method achieves state-of-the-art results in Video Dataset Distillation, paving the way for more effective redundancy reduction and efficient AI model training on video datasets.

Submitted 28 November, 2024; originally announced December 2024.

arXiv:2411.19490 [pdf]

Generative AI as a Tool or Leader? Exploring AI-Augmented Thinking in Student Programming Tasks

Authors: Tianlong Zhong, Gaoxia Zhu, Kang You Lim, Yew Soon Ong

Abstract: The increasing use of Generative Artificial Intelligence (GAI) tools in education highlights the need to understand their influence on individuals' thinking processes and agency. This research explored 20 university students' interaction with GAI during programming. Participants completed surveys, recorded their screens during an hour-long programming session, and reflected on their GAI use. To analyse the data, we developed an AI-augmented thinking coding scheme with four dimensions: Question Formulation, Solution Development, Solution Analysis and Evaluation, and Solution Refinement. Participants were categorised into human-led and AI-led groups based on the time ratio of human-generating source code versus copying source code from GAI. T-tests indicated that the human-led group spent significantly more time on Solution Development and Solution Refinement than the AI-led group. Sequential pattern mining revealed distinct patterns of the two groups: the human-led group often refined GAI outputs, while the AI-led group frequently relied on direct answers from GAI. Correlation analyses found that positive attitudes towards AI, critical thinking, and programming self-efficacy positively correlated with Question Formulation; critical thinking was positively related to Solution Refinement; and programming self-efficacy was negatively associated with Solution Analysis and Evaluation. This study enhances understanding of the thinking process in GAI-supported programming.

Submitted 29 November, 2024; originally announced November 2024.

arXiv:2411.12259 [pdf, other]

Prototype Optimization with Neural ODE for Few-Shot Learning

Authors: Baoquan Zhang, Shanshan Feng, Bingqi Shan, Xutao Li, Yunming Ye, Yew-Soon Ong

Abstract: Few-Shot Learning (FSL) is a challenging task, which aims to recognize novel classes with few examples. Pre-training based methods effectively tackle the problem by pre-training a feature extractor and then performing class prediction via a cosine classifier with mean-based prototypes. Nevertheless, due to the data scarcity, the mean-based prototypes are usually biased. In this paper, we attempt to diminish the prototype bias by regarding it as a prototype optimization problem. To this end, we propose a novel prototype optimization framework to rectify prototypes, i.e., introducing a meta-optimizer to optimize prototypes. Although the existing meta-optimizers can also be adapted to our framework, they all overlook a crucial gradient bias issue, i.e., the mean-based gradient estimation is also biased on sparse data. To address this issue, in this paper, we regard the gradient and its flow as meta-knowledge and then propose a novel Neural Ordinary Differential Equation (ODE)-based meta-optimizer to optimize prototypes, called MetaNODE. Although MetaNODE has shown superior performance, it suffers from a huge computational burden. To further improve its computation efficiency, we conduct a detailed analysis on MetaNODE and then design an effective and efficient MetaNODE extension version (called E2MetaNODE). It consists of two novel modules: E2GradNet and E2Solver, which aim to estimate accurate gradient flows and solve optimal prototypes in an effective and efficient manner, respectively.
Extensive experiments show that 1) our methods achieve superior performance over previous FSL methods and 2) our E2MetaNODE significantly improves computation efficiency without performance degradation.

Submitted 19 November, 2024; originally announced November 2024.

Comments: An extended version of metanode: prototype optimization as a neural ode for few-shot learning. arXiv admin note: text overlap with arXiv:2103.14341

arXiv:2411.10697 [pdf, other]

Language Model Evolutionary Algorithms for Recommender Systems: Benchmarks and Algorithm Comparisons

Authors: Jiao Liu, Zhu Sun, Shanshan Feng, Caishun Chen, Yew-Soon Ong

Abstract: In the evolutionary computing community, the remarkable language-handling capabilities and reasoning power of large language models (LLMs) have significantly enhanced the functionality of evolutionary algorithms (EAs), enabling them to tackle optimization problems involving structured language or program code. Although this field is still in its early stages, its impressive potential has led to the development of various LLM-based EAs. To effectively evaluate the performance and practical applicability of these LLM-based EAs, benchmarks with real-world relevance are essential. In this paper, we focus on LLM-based recommender systems (RSs) and introduce a benchmark problem set, named RSBench, specifically designed to assess the performance of LLM-based EAs in recommendation prompt optimization. RSBench emphasizes session-based recommendations, aiming to discover a set of Pareto optimal prompts that guide the recommendation process, providing accurate, diverse, and fair recommendations. We develop three LLM-based EAs based on established EA frameworks and experimentally evaluate their performance using RSBench. Our study offers valuable insights into the application of EAs in LLM-based RSs. Additionally, we explore key components that may influence the overall performance of the RS, providing meaningful guidance for future research on the development of LLM-based EAs in RSs.

Submitted 6 March, 2025; v1 submitted 15 November, 2024; originally announced November 2024.

arXiv:2411.06740 [pdf, other]

Dockformer: A transformer-based molecular docking paradigm for large-scale virtual screening

Authors: Zhangfan Yang, Junkai Ji, Shan He, Jianqiang Li, Tiantian He, Ruibin Bai, Zexuan Zhu, Yew Soon Ong

Abstract: Molecular docking is a crucial step in drug development, which enables the virtual screening of compound libraries to identify potential ligands that target proteins of interest. However, the computational complexity of traditional docking models increases as the size of the compound library increases. Recently, deep learning algorithms can provide data-driven research and development models to increase the speed of the docking process. Unfortunately, few models can achieve superior screening performance compared to that of traditional models. Therefore, a novel deep learning-based docking approach named Dockformer is introduced in this study. Dockformer leverages multimodal information to capture the geometric topology and structural knowledge of molecules and can directly generate binding conformations with the corresponding confidence measures in an end-to-end manner. The experimental results show that Dockformer achieves success rates of 90.53% and 82.71% on the PDBbind core set and PoseBusters benchmarks, respectively, and more than a 100-fold increase in the inference process speed, outperforming almost all state-of-the-art docking methods. In addition, the ability of Dockformer to identify the main protease inhibitors of coronaviruses is demonstrated in a real-world virtual screening scenario. Considering its high docking accuracy and screening efficiency, Dockformer can be regarded as a powerful and robust tool in the field of drug design.

Submitted 5 December, 2024; v1 submitted 11 November, 2024; originally announced November 2024.

Comments: 15 pages, 10 figures

arXiv:2411.02871 [pdf, other]

Enhancing Adversarial Robustness via Uncertainty-Aware Distributional Adversarial Training

Authors: Junhao Dong, Xinghua Qu, Z. Jane Wang, Yew-Soon Ong

Abstract: Despite remarkable achievements in deep learning across various domains, its inherent vulnerability to adversarial examples still remains a critical concern for practical deployment. Adversarial training has emerged as one of the most effective defensive techniques for improving model robustness against such malicious inputs. However, existing adversarial training schemes often lead to limited generalization ability against underlying adversaries with diversity due to their overreliance on a point-by-point augmentation strategy by mapping each clean example to its adversarial counterpart during training. In addition, adversarial examples can induce significant disruptions in the statistical information w.r.t. the target model, thereby introducing substantial uncertainty and challenges to modeling the distribution of adversarial examples. To circumvent these issues, in this paper, we propose a novel uncertainty-aware distributional adversarial training method, which enforces adversary modeling by leveraging both the statistical information of adversarial examples and its corresponding uncertainty estimation, with the goal of augmenting the diversity of adversaries. Considering the potentially negative impact induced by aligning adversaries to misclassified clean examples, we also refine the alignment reference based on the statistical proximity to clean examples during adversarial training, thereby reframing adversarial training within a distribution-to-distribution matching framework interacted between the clean and adversarial domains.
Furthermore, we design an introspective gradient alignment approach via matching input gradients between these domains without introducing external models. Extensive experiments across four benchmark datasets and various network architectures demonstrate that our approach achieves state-of-the-art adversarial robustness and maintains natural performance.

Submitted 5 November, 2024; originally announced November 2024.

arXiv:2411.02467 [pdf, other]

Towards Harmless Rawlsian Fairness Regardless of Demographic Prior

Authors: Xuanqian Wang, Jing Li, Ivor W. Tsang, Yew-Soon Ong

Abstract: Due to privacy and security concerns, recent advancements in group fairness advocate for model training regardless of demographic information. However, most methods still require prior knowledge of demographics. In this study, we explore the potential for achieving fairness without compromising utility when no prior demographics are provided to the training set, namely harmless Rawlsian fairness. We ascertain that such a fairness requirement with no prior demographic information essentially promotes training losses to exhibit a Dirac delta distribution. To this end, we propose a simple but effective method named VFair to minimize the variance of training losses inside the optimal set of empirical losses. This problem is then optimized by a tailored dynamic update approach that operates in both loss and gradient dimensions, directing the model towards relatively fairer solutions while preserving its utility intact. Our experimental findings indicate that regression tasks, which are relatively unexplored in the literature, can achieve significant fairness improvement through VFair regardless of any prior, whereas classification tasks usually do not because of their quantized utility measurements. The implementation of our method is publicly available at https://github.com/wxqpxw/VFair.

Submitted 8 November, 2024; v1 submitted 4 November, 2024; originally announced November 2024.

Comments: Accepted as a poster at NeurIPS 2024
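
The idea in the abstract above, that losses concentrated at a single value (a Dirac delta) have zero variance, can be illustrated with a minimal sketch. This is not the authors' VFair implementation: the function names, the `weight` parameter, and the weighted-sum objective are assumptions made for illustration, whereas the abstract describes minimizing variance inside the optimal set of empirical losses via a dynamic update.

```python
from statistics import fmean, pvariance


def loss_mean_and_variance(per_example_losses):
    """Return (mean, population variance) of per-example training losses."""
    return fmean(per_example_losses), pvariance(per_example_losses)


def variance_penalised_objective(per_example_losses, weight=1.0):
    """Hypothetical objective: average loss plus a weighted variance penalty.

    A fixed weighted sum is a simplification chosen for this sketch only.
    """
    mean, var = loss_mean_and_variance(per_example_losses)
    return mean + weight * var


# Two hypothetical loss profiles with the same average loss.
uniform = [0.5, 0.5, 0.5, 0.5]   # every example is served equally well
skewed = [0.1, 0.1, 0.1, 1.7]    # one example bears most of the loss

print(loss_mean_and_variance(uniform))
print(loss_mean_and_variance(skewed))
```

Both profiles have the same mean loss, so an average-loss criterion cannot tell them apart; only the variance term separates the evenly served profile from the skewed one.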

arXiv:2410.19580 [pdf, other]

Hybrid Memetic Search for Electric Vehicle Routing with Time Windows, Simultaneous Pickup-Delivery, and Partial Recharges

Authors: Zubin Zheng, Shengcai Liu, Yew-Soon Ong

Abstract: With growing environmental concerns, electric vehicles for logistics have gained significant attention within the computational intelligence community in recent years. This work addresses an emerging and significant extension of the electric vehicle routing problem (EVRP), namely EVRP with time windows, simultaneous pickup-delivery, and partial recharges (EVRP-TW-SPD), which has widespread real-world applications. We propose a hybrid memetic algorithm (HMA) for solving EVRP-TW-SPD. HMA incorporates two novel components: a parallel-sequential station insertion (PSSI) procedure for handling partial recharges that can better avoid local optima compared to purely sequential insertion, and a cross-domain neighborhood search (CDNS) that explores solution spaces of both electric and non-electric problem domains simultaneously. These components can also be easily applied to various EVRP variants. To bridge the gap between existing benchmarks and real-world scenarios, we introduce a new, large-scale EVRP-TW-SPD benchmark set derived from real-world applications, containing instances with many more customers and charging stations than existing benchmark instances. Extensive experiments demonstrate the significant performance advantages of HMA over existing algorithms across a wide range of problem instances. Both the benchmark set and HMA are to be made open-source to facilitate further research in this area.

Submitted 25 May, 2025; v1 submitted 25 October, 2024; originally announced October 2024.

Comments: Accepted to IEEE TETCI

arXiv:2410.19321 [pdf, other]

Free-Rider and Conflict Aware Collaboration Formation for Cross-Silo Federated Learning

Authors: Mengmeng Chen, Xiaohu Wu, Xiaoli Tang, Tiantian He, Yew-Soon Ong, Qiqi Liu, Qicheng Lao, Han Yu

Abstract: Federated learning (FL) is a machine learning paradigm that allows multiple FL participants (FL-PTs) to collaborate on training models without sharing private data. Due to data heterogeneity, negative transfer may occur in the FL training process. This necessitates FL-PT selection based on their data complementarity. In cross-silo FL, organizations that engage in business activities are key sources of FL-PTs. The resulting FL ecosystem has two features: (i) self-interest, and (ii) competition among FL-PTs. This requires the desirable FL-PT selection strategy to simultaneously mitigate the problems of free riders and conflicts of interest among competitors. To this end, we propose an optimal FL collaboration formation strategy -- FedEgoists -- which ensures that: (1) an FL-PT can benefit from FL if and only if it benefits the FL ecosystem, and (2) an FL-PT will not contribute to its competitors or their supporters. It provides an efficient clustering solution to group FL-PTs into coalitions, ensuring that within each coalition, FL-PTs share the same interest. We theoretically prove that the FL-PT coalitions formed are optimal since no coalitions can collaborate to improve the utility of any of their members. Extensive experiments on widely adopted benchmark datasets demonstrate the effectiveness of FedEgoists compared to nine state-of-the-art baseline methods, and its ability to establish efficient collaborative networks in cross-silo FL with FL-PTs that engage in business activities.

Submitted 31 January, 2025; v1 submitted 25 October, 2024; originally announced October 2024.

arXiv:2410.17839 [pdf, other]

Few-shot NeRF by Adaptive Rendering Loss Regularization

Authors: Qingshan Xu, Xuanyu Yi, Jianyao Xu, Wenbing Tao, Yew-Soon Ong, Hanwang Zhang

Abstract: Novel view synthesis with sparse inputs poses great challenges to Neural Radiance Field (NeRF). Recent works demonstrate that the frequency regularization of Positional Encoding (PE) can achieve promising results for few-shot NeRF. In this work, we reveal that there exists an inconsistency between the frequency regularization of PE and rendering loss. This prevents few-shot NeRF from synthesizing higher-quality novel views. To mitigate this inconsistency, we propose Adaptive Rendering loss regularization for few-shot NeRF, dubbed AR-NeRF. Specifically, we present a two-phase rendering supervision and an adaptive rendering loss weight learning strategy to align the frequency relationship between PE and 2D-pixel supervision. In this way, AR-NeRF can learn global structures better in the early training phase and adaptively learn local details throughout the training process. Extensive experiments show that our AR-NeRF achieves state-of-the-art performance on different datasets, including object-level and complex scenes.

Submitted 23 October, 2024; originally announced October 2024.

Comments: Accepted by ECCV 2024

arXiv:2410.10289 [pdf, other]

Fine-grained Abnormality Prompt Learning for Zero-shot Anomaly Detection

Authors: Jiawen Zhu, Yew-Soon Ong, Chunhua Shen, Guansong Pang

Abstract: Current zero-shot anomaly detection (ZSAD) methods show remarkable success in prompting large pre-trained vision-language models to detect anomalies in a target dataset without using any dataset-specific training or demonstration. However, these methods are often focused on crafting/learning prompts that capture only coarse-grained semantics of abnormality, e.g., high-level semantics like "damaged", "imperfect", or "defective" on carpet. They therefore have limited capability in recognizing diverse abnormality details with distinctive visual appearance, e.g., specific defect types like color stains, cuts, holes, and threads on carpet. To address this limitation, we propose FAPrompt, a novel framework designed to learn Fine-grained Abnormality Prompts for more accurate ZSAD. To this end, we introduce a novel compound abnormality prompting module in FAPrompt to learn a set of complementary, decomposed abnormality prompts, where each abnormality prompt is formed by a compound of shared normal tokens and a few learnable abnormal tokens. On the other hand, the fine-grained abnormality patterns can be very different from one dataset to another. To enhance their cross-dataset generalization, we further introduce a data-dependent abnormality prior module that learns to derive abnormality features from each query/test image as a sample-wise abnormality prior to ground the abnormality prompts in a given target dataset. Comprehensive experiments conducted across 19 real-world datasets, covering both industrial defects and medical anomalies, demonstrate that FAPrompt substantially outperforms state-of-the-art methods by at least 3%-5% AUC/AP in both image- and pixel-level ZSAD tasks. Code is available at https://github.com/mala-lab/FAPrompt.

Submitted 14 October, 2024; originally announced October 2024.

Comments: 27 pages, 19 figures

arXiv:2410.08638 [pdf]

Leveraging reconfigurable micro-resonator soliton crystals for Intensity-Modulated Direct Detection Data Transmission

Authors: Xavier X. Chia, Kenny Y. K. Ong, A. Aadhi, George F. R. Chen, Ju Won Choi, Byoung-Uk Sohn, Amdad Chowdury, Dawn T. H. Tan

Abstract: The perennial demand for highly efficient short-haul communications is evidenced by a sustained explosion of growth in data center infrastructure that is predicted to continue for the foreseeable future. In these relatively compact networks, cost-sensitivity is of particular importance, which limits options to direct detection schemes that are more cost efficient than their coherent counterparts. Since their initial demonstration, multi-soliton states in optical microresonators have been observed to manifest in self-organised ensembles where soliton pulses are equally spaced around the resonators. In the spectral domain, these states, dubbed soliton crystals (SCs), result in significant enhancements to individual comb lines depending on the crystal state, making them well suited to intensity-modulated direct detection (IMDD) schemes. In this work, we experimentally demonstrate adiabatic, deterministic access to lower-order soliton crystal states using an auxiliary-assisted cavity pumping method, attaining up to 19.6 dB enhancement of the comb lines in the 7-SC configuration compared to the single-soliton state. Seven comb lines across the C-band, each carrying 46 Gbaud four-level pulse amplitude modulation (PAM4), are transmitted over 4 km of fiber with bit-error rates (BER) as low as 5E-5. Our demonstration shows the promise of soliton crystal states as future integrated sources for highly stable Terabaud datacenter communications.

Submitted 11 October, 2024; originally announced October 2024.
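
The transmission figures quoted in the abstract above can be turned into a raw bit rate with a short calculation. The sketch below assumes only that PAM4 carries log2(4) = 2 bits per symbol; the function name is hypothetical, and the result is a raw line rate that ignores any coding or forward-error-correction overhead, which the abstract does not specify.

```python
import math


def raw_bit_rate_gbps(symbol_rate_gbaud, levels):
    """Raw bit rate in Gb/s for a PAM signal with the given number of levels."""
    bits_per_symbol = math.log2(levels)  # PAM4 -> 2 bits per symbol
    return symbol_rate_gbaud * bits_per_symbol


per_line = raw_bit_rate_gbps(46, 4)  # one comb line at 46 Gbaud PAM4
aggregate = 7 * per_line             # seven comb lines, as in the abstract

print(per_line, aggregate)
```

Under these assumptions each comb line carries 92 Gb/s raw, and the seven lines together carry 644 Gb/s raw.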

arXiv:2410.07286 [pdf, other]

Benchmarking Data Heterogeneity Evaluation Approaches for Personalized Federated Learning

Authors: Zhilong Li, Xiaohu Wu, Xiaoli Tang, Tiantian He, Yew-Soon Ong, Mengmeng Chen, Qiqi Liu, Qicheng Lao, Han Yu

Abstract: There is growing research interest in measuring the statistical heterogeneity of clients' local datasets. Such measurements are used to estimate the suitability for collaborative training of personalized federated learning (PFL) models. Currently, these research endeavors are taking place in silos and there is a lack of a unified benchmark to provide a fair and convenient comparison among various approaches in common settings. We aim to bridge this important gap in this paper. The proposed benchmarking framework currently includes six representative approaches. Extensive experiments have been conducted to compare these approaches under five standard non-IID FL settings, providing much-needed insights into which approaches are advantageous under which settings. The proposed framework offers useful guidance on the suitability of various data divergence measures in FL systems. It is beneficial for keeping related research activities on the right track in terms of: (1) designing PFL schemes, (2) selecting appropriate data heterogeneity evaluation approaches for specific FL application scenarios, and (3) addressing fairness issues in collaborative model training. The code is available at https://github.com/Xiaoni-61/DH-Benchmark.

Submitted 28 October, 2024; v1 submitted 9 October, 2024; originally announced October 2024.

Comments: Accepted to FL@FM-NeurIPS'24

arXiv:2409.08236 [pdf, other]

doi 10.1016/j.nuclphysb.2024.116711

Black Holes, Complex Curves, and Graph Theory: Revising a Conjecture by Kasner

Authors: Yen Chin Ong

Abstract: The ratios $\sqrt{8/9}=2\sqrt{2}/3\approx 0.9428$ and $\sqrt{3}/2 \approx 0.866$ appear in various contexts of black hole physics, as values of the charge-to-mass ratio $Q/M$ or the rotation parameter $a/M$ for Reissner-Nordström and Kerr black holes, respectively. In this work, in the Reissner-Nordström case, I relate these ratios with the quantization of the horizon area, or equivalently of the entropy. Furthermore, these ratios are related to a century-old work of Kasner, in which he conjectured that certain sequences arising from complex analysis may have a quantum interpretation. These numbers also appear in the case of Kerr black holes, but the explanation is not as straightforward. The Kasner ratio may also be relevant for understanding the random matrix and random graph approaches to black hole physics, such as fast scrambling of quantum information, via a bound related to Ramanujan graphs. Intriguingly, some other pure mathematical problems in complex analysis, notably complex interpolation in the unit disk, appear to share some mathematical expressions with the black hole problem and thus also involve the Kasner ratio.

Submitted 15 September, 2024; v1 submitted 12 September, 2024; originally announced September 2024.

Comments: V2: Fixed some typos and added minor clarification

Journal ref: Nucl.Phys.B 1008 (2024) 116711
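
The numerical values quoted in the abstract above are easy to check. The sketch below also evaluates the standard Reissner-Nordström horizon radii $r_\pm = M \pm \sqrt{M^2 - Q^2}$ in geometric units ($G = c = 1$) at $Q/M = \sqrt{8/9}$. That formula is textbook material included here as an assumption for illustration, not a reconstruction of the paper's argument, and the variable names are hypothetical.

```python
import math

# Ratios quoted in the abstract.
ratio_rn = math.sqrt(8 / 9)          # charge-to-mass ratio Q/M
ratio_rn_alt = 2 * math.sqrt(2) / 3  # the same number written as 2*sqrt(2)/3
ratio_kerr = math.sqrt(3) / 2        # rotation parameter a/M

# Standard Reissner-Nordstrom horizons in units with G = c = 1 and M = 1.
M = 1.0
Q = ratio_rn * M
root = math.sqrt(M**2 - Q**2)  # sqrt(1 - 8/9) = 1/3
r_plus = M + root              # outer horizon, 4M/3
r_minus = M - root             # inner horizon, 2M/3

print(round(ratio_rn, 4), round(ratio_kerr, 3))
print(r_plus, r_minus)
```

At this charge-to-mass ratio the outer horizon sits at $4M/3$ and the inner horizon at $2M/3$, so the two radii differ by exactly a factor of two.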

arXiv:2408.00779 [pdf, other]

Learning Structurally Stabilized Representations for Multi-modal Lossless DNA Storage

Authors: Ben Cao, Tiantian He, Xue Li, Bin Wang, Xiaohu Wu, Qiang Zhang, Yew-Soon Ong

Abstract: In this paper, we present Reed-Solomon coded single-stranded representation learning (RSRL), a novel end-to-end model for learning representations for multi-modal lossless DNA storage. In contrast to existing learning-based methods, the proposed RSRL is inspired by both error-correction codec and structural biology. Specifically, RSRL first learns the representations for the subsequent storage from the binary data transformed by the Reed-Solomon codec. Then, the representations are masked by an RS-code-informed mask to focus on correcting the burst errors occurring in the learning process. With the decoded representations with error corrections, a novel biologically stabilized loss is formulated to regularize the data representations to possess stable single-stranded structures. By incorporating these novel strategies, the proposed RSRL can learn highly durable, dense, and lossless representations for the subsequent storage tasks into DNA sequences. The proposed RSRL has been compared with a number of strong baselines in real-world tasks of multi-modal data storage. The experimental results obtained demonstrate that RSRL can store diverse types of data with much higher information density and durability but much lower error rates.

Submitted 17 July, 2024; originally announced August 2024.

arXiv:2407.21114 [pdf, other]

doi 10.1016/j.aop.2024.169919

Hawking Temperature and the Inverse-Radius Scale of the Horizon

Authors: Michael R. R. Good, Yen Chin Ong

Abstract: The Hawking temperature of a Schwarzschild black hole can be heuristically derived by identifying the temperature with the inverse radius of the horizon up to a multiplicative constant. This does not work for more general black holes such as the Kerr and Reissner-Nordström solutions. Expounding on the details of how it fails to work nevertheless uncovers interesting connections with the "spring constant" of black holes and with black hole thermodynamics.

Submitted 17 January, 2025; v1 submitted 30 July, 2024; originally announced July 2024.

Comments: published version

Journal ref: Annals of Phys. 474 (2025) 169919
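
As a worked illustration of the heuristic mentioned in the abstract above, the standard textbook expressions for the Schwarzschild case (not taken from the paper itself) show the temperature as a constant multiple of the inverse horizon radius. With Hawking temperature $T_H$ and Schwarzschild radius $r_s$,

$$T_H = \frac{\hbar c^3}{8\pi G M k_B}, \qquad r_s = \frac{2GM}{c^2} \quad\Longrightarrow\quad T_H = \frac{\hbar c}{4\pi k_B}\,\frac{1}{r_s}.$$

The last step substitutes $M = r_s c^2 / (2G)$ into the first expression, so the multiplicative constant is $\hbar c / (4\pi k_B)$.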

arXiv:2407.00484 [pdf, ps, other]

doi 10.1016/j.aop.2024.169914

Generalized Entropy Implies Varying-G: Horizon Area Dependent Field Equations and Black Hole-Cosmology Coupling

Authors: Hengxin Lu, Sofia Di Gennaro, Yen Chin Ong

Abstract: When the Bekenstein-Hawking entropy is modified, ambiguity often arises concerning whether the Hawking temperature or the thermodynamic mass should be modified. The common practice, however, is to keep the black hole solution the same as that in general relativity. On the other hand, if Jacobson's method of deriving Einstein equations from thermodynamics is valid in general settings, then given a generalized entropy one should first derive the corresponding modified gravity, and then look for the compatible black hole solution before investigating its thermodynamics. We comment on some properties and subtleties in this approach. In particular, we point out that generically generalized entropy would lead to a varying effective gravitational "constant" theory, where $G_\text{eff}$ depends on the horizon area. We discuss in what ways such theories are discernible from general relativity despite its seemingly jarring differences, and how to make sense of area-dependent field equations. As a consequence we show that in Jacobson's approach, the standard quantum gravitational logarithmic correction to Bekenstein-Hawking entropy is equivalent to a running gravitational "constant". A horizon area dependent $G_\text{eff}$ could also lead to a coupling between black hole masses and cosmological expansion, a scenario that has been studied recently in the literature, but so far lacks strong theoretical motivation. In the Tsallis case, we show that the thermodynamic mass for a Schwarzschild black hole is just a constant multiple of its ADM mass, which is considerably simpler than the approach not utilizing Jacobson's method.

Submitted 17 January, 2025; v1 submitted 29 June, 2024; originally announced July 2024.

Comments: Minor changes; also added/fixed some references; published version

Journal ref: Annals of Phys. 474 (2025) 169914

arXiv:2406.14917 [pdf, other]

LLM2FEA: Discover Novel Designs with Generative Evolutionary Multitasking

Authors: Melvin Wong, Jiao Liu, Thiago Rios, Stefan Menzel, Yew Soon Ong

Abstract: The rapid research and development of generative artificial intelligence has enabled the generation of high-quality images, text, and 3D models from text prompts. This advancement impels an inquiry into whether these models can be leveraged to create digital artifacts for both creative and engineering applications. Drawing on innovative designs from other domains may be one answer to this question, much like the historical practice of "bionics", where humans have sought inspiration from nature's exemplary designs. This raises the intriguing possibility of using generative models to simultaneously tackle design tasks across multiple domains, facilitating cross-domain learning and resulting in a series of innovative design solutions. In this paper, we propose LLM2FEA as the first attempt to discover novel designs in generative models by transferring knowledge across multiple domains. By utilizing a multi-factorial evolutionary algorithm (MFEA) to drive a large language model, LLM2FEA integrates knowledge from various fields to generate prompts that guide the generative model in discovering novel and practical objects. Experimental results in the context of 3D aerodynamic design verify the discovery capabilities of the proposed LLM2FEA. The designs generated by LLM2FEA not only satisfy practicality requirements to a certain degree but also feature novel and aesthetically pleasing shapes, demonstrating the potential applications of LLM2FEA in discovery tasks.

Submitted 21 June, 2024; originally announced June 2024.

Comments: This work has been submitted to the IEEE for possible publication

arXiv:2406.09143 [pdf, other]

doi 10.1109/CEC60901.2024.10611898

Generative AI-based Prompt Evolution Engineering Design Optimization With Vision-Language Model

Authors: Melvin Wong, Thiago Rios, Stefan Menzel, Yew Soon Ong

Abstract: Engineering design optimization requires an efficient combination of a 3D shape representation, an optimization algorithm, and a design performance evaluation method, which is often computationally expensive. We present a prompt evolution design optimization (PEDO) framework contextualized in a vehicle design scenario that leverages a vision-language model for penalizing impractical car designs synthesized by a generative model. The backbone of our framework is an evolutionary strategy coupled with an optimization objective function that comprises a physics-based solver and a vision-language model for practical or functional guidance in the generated car designs. In the prompt evolutionary search, the optimizer iteratively generates a population of text prompts, which embed user specifications on the aerodynamic performance and visual preferences of the 3D car designs. Then, in addition to the computational fluid dynamics simulations, the pre-trained vision-language model is used to penalize impractical designs and, thus, foster the evolutionary algorithm to seek more viable designs. Our investigations on a car design optimization problem show a wide spread of potential car designs generated at the early phase of the search, which indicates a good diversity of designs in the initial populations, and an increase of over 20% in the probability of generating practical designs compared to a baseline framework without using a vision-language model. Visual inspection of the designs against the performance results demonstrates prompt evolution as a very promising paradigm for finding novel designs with good optimization performance while providing ease of use in specifying design specifications and preferences via a natural language interface.

Submitted 14 June, 2024; v1 submitted 13 June, 2024; originally announced June 2024.

Comments: Accepted and to be published in IEEE Congress on Evolutionary Computation (CEC) 2024. Copyright 2024 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses

Journal ref: IEEE Congress on Evolutionary Computation (CEC), 2024, 1-8

arXiv:2406.04038 [pdf, other]

Road Network Representation Learning with the Third Law of Geography

Authors: Haicang Zhou, Weiming Huang, Yile Chen, Tiantian He, Gao Cong, Yew-Soon Ong

Abstract: Road network representation learning aims to learn compressed and effective vectorized representations for road segments that are applicable to numerous tasks. In this paper, we identify the limitations of existing methods, particularly their overemphasis on the distance effect as outlined in the First Law of Geography. In response, we propose to endow road network representation with the principles of the recent Third Law of Geography. To this end, we propose a novel graph contrastive learning framework that employs geographic configuration-aware graph augmentation and spectral negative sampling, ensuring that road segments with similar geographic configurations yield similar representations, and vice versa, aligning with the principles stated in the Third Law. The framework further fuses the Third Law with the First Law through a dual contrastive learning objective to effectively balance the implications of both laws. We evaluate our framework on two real-world datasets across three downstream tasks. The results show that the integration of the Third Law significantly improves the performance of road segment representations in downstream tasks.

Submitted 6 June, 2024; originally announced June 2024.

arXiv:2406.00812 [pdf, other]

Covariance-Adaptive Sequential Black-box Optimization for Diffusion Targeted Generation

Authors: Yueming Lyu, Kim Yong Tan, Yew Soon Ong, Ivor W. Tsang

Abstract: Diffusion models have demonstrated great potential in generating high-quality content for images, natural language, protein domains, etc. However, how to perform user-preferred targeted generation via diffusion models with only black-box target scores of users remains challenging. To address this issue, we first formulate the fine-tuning of the targeted reverse-time stochastic differential equation (SDE) associated with a pre-trained diffusion model as a sequential black-box optimization problem. Furthermore, we propose a novel covariance-adaptive sequential optimization algorithm to optimize cumulative black-box scores under unknown transition dynamics. Theoretically, we prove an $O(\frac{d^2}{\sqrt{T}})$ convergence rate for cumulative convex functions without smoothness or strong convexity assumptions. Empirically, experiments on both numerical test problems and target-guided 3D-molecule generation tasks show the superior performance of our method in achieving better target scores.

Submitted 8 June, 2024; v1 submitted 2 June, 2024; originally announced June 2024.

arXiv:2405.19062 [pdf, other]

SIG: Efficient Self-Interpretable Graph Neural Network for Continuous-time Dynamic Graphs

Authors: Lanting Fang, Yulian Yang, Kai Wang, Shanshan Feng, Kaiyu Feng, Jie Gui, Shuliang Wang, Yew-Soon Ong

Abstract: While dynamic graph neural networks have shown promise in various applications, explaining their predictions on continuous-time dynamic graphs (CTDGs) is difficult. This paper investigates a new research task: self-interpretable GNNs for CTDGs. We aim to predict future links within the dynamic graph while simultaneously providing causal explanations for these predictions. There are two key challenges: (1) capturing the underlying structural and temporal information that remains consistent across both independent and identically distributed (IID) and out-of-distribution (OOD) data, and (2) efficiently generating high-quality link prediction results and explanations. To tackle these challenges, we propose a novel causal inference model, namely the Independent and Confounded Causal Model (ICCM). ICCM is then integrated into a deep learning architecture that considers both effectiveness and efficiency. Extensive experiments demonstrate that our proposed model significantly outperforms existing methods across link prediction accuracy, explanation quality, and robustness to shortcut features. Our code and datasets are anonymously released at https://github.com/2024SIG/SIG.

Submitted 29 May, 2024; originally announced May 2024.

Comments: 19 pages

arXiv:2405.18884 [pdf]

Learning Mixture-of-Experts for General-Purpose Black-Box Discrete Optimization

Authors: Shengcai Liu, Zhiyuan Wang, Yew-Soon Ong, Xin Yao, Ke Tang

Abstract: Real-world applications involve various discrete optimization problems. Designing a specialized optimizer for each of these problems is challenging, typically requiring significant domain knowledge and human effort. Hence, developing general-purpose optimizers as an off-the-shelf tool for a wide range of problems has been a long-standing research target. This article introduces MEGO, a novel general-purpose neural optimizer trained through a fully data-driven learning-to-optimize (L2O) approach. MEGO consists of a mixture-of-experts trained on experiences from solving training problems and can be viewed as a foundation model for optimization problems with binary decision variables. When presented with a problem to solve, MEGO actively selects relevant expert models to generate high-quality solutions. MEGO can be used as a standalone sample-efficient optimizer or in conjunction with existing search methods as an initial solution generator. The generality of MEGO is validated across six problem classes, including three classic problem classes and three problem classes arising from real-world applications in compilers, network analysis, and 3D reconstruction. Trained solely on classic problem classes, MEGO performs very well on all six problem classes, significantly surpassing widely used general-purpose optimizers in both solution quality and efficiency. In some cases, MEGO even surpasses specialized state-of-the-art optimizers. Additionally, MEGO provides a similarity measure between problems, yielding a new perspective for problem classification. In the pursuit of general-purpose optimizers through L2O, MEGO represents an initial yet significant step forward.

Submitted 29 May, 2024; originally announced May 2024.

Comments: 34 pages, 6 figures

arXiv:2405.13048 [pdf]

Human-Generative AI Collaborative Problem Solving Who Leads and How Students Perceive the Interactions

Authors: Gaoxia Zhu, Vidya Sudarshan, Jason Fok Kow, Yew Soon Ong

Abstract: This research investigates distinct human-generative AI collaboration types and students' interaction experiences when collaborating with generative AI (i.e., ChatGPT) for problem-solving tasks and how these factors relate to students' sense of agency and perceived collaborative problem solving. By analyzing the surveys and reflections of 79 undergraduate students, we identified three human-generative AI collaboration types: even contribution, human leads, and AI leads. Notably, our study shows that 77.21% of students perceived that they led or contributed evenly to collaborative problem-solving when collaborating with ChatGPT. On the other hand, 15.19% of the human participants indicated that the collaborations were led by ChatGPT, indicating a potential tendency for students to rely on ChatGPT. Furthermore, 67.09% of students perceived their interaction experiences with ChatGPT to be positive or mixed. We also found a positive correlation between positive interaction experience and a sense of positive agency. The results of this study contribute to our understanding of the collaboration between students and generative AI and highlight the need to study further why some students let ChatGPT lead collaborative problem-solving and how to enhance their interaction experience through curriculum and technology design.

Submitted 18 May, 2024; originally announced May 2024.

Comments: This paper appears at the IEEE Conference on Artificial Intelligence (CAI) 2024

arXiv:2405.07215 [pdf, other]

doi 10.1007/s11433-024-2582-7

Testing Cotton gravity as dark matter substitute with weak lensing

Authors: Geyu Mo, Qingqing Wang, Xin Ren, Weitong Yan, Yen Chin Ong, Wentao Luo

Abstract: Harada proposed a modified theory of gravity called Cotton gravity, and argued that it successfully explains the rotation curves of $84$ galaxies without the need for dark matter. In this work we use the galaxy-galaxy lensing technique to test whether the modification effect of Cotton gravity can indeed be a viable substitute for dark matter. Using the spherically symmetric solution of Cotton gravity, we obtain the deflection angle via the Gauss-Bonnet theorem and the weak lensing shear. We use five galaxy catalogs, divided into five stellar mass bins, from the Sloan Digital Sky Survey Data Release 7 (SDSS DR7), each of which is further divided into blue star-forming galaxy and red passive galaxy sub-catalogs. We find that Cotton gravity on its own deviates significantly from the measured galaxy-galaxy lensing signals, and thus cannot replace the role of dark matter. If we consider the combination of dark matter and Cotton gravity, the modification is tightly constrained. Our analysis also applies to other modified gravity theories in which an additional linear term appears in the Schwarzschild solution.

Submitted 7 February, 2025; v1 submitted 12 May, 2024; originally announced May 2024.

Comments: 16 pages, 3 figures

Journal ref: SCIENCE CHINA Physics, Mechanics & Astronomy, Volume 68, Issue 4: 240412 (2025)

Showing 1–50 of 262 results for author: Ong, Y