-
STOAT: Spatial-Temporal Probabilistic Causal Inference Network
Authors:
Yang Yang,
Du Yin,
Hao Xue,
Flora Salim
Abstract:
Spatial-temporal causal time series (STC-TS) involve region-specific temporal observations driven by causally relevant covariates and interconnected across geographic or network-based spaces. Existing methods often model spatial and temporal dynamics independently and overlook causality-driven probabilistic forecasting, limiting their predictive power. To address this, we propose STOAT (Spatial-Temporal Probabilistic Causal Inference Network), a novel framework for probabilistic forecasting in STC-TS. The proposed method extends a causal inference approach by incorporating a spatial relation matrix that encodes interregional dependencies (e.g. proximity or connectivity), enabling spatially informed causal effect estimation. The resulting latent series are processed by deep probabilistic models to estimate the parameters of the distributions, enabling calibrated uncertainty modeling. We further explore multiple output distributions (e.g., Gaussian, Student's-$t$, Laplace) to capture region-specific variability. Experiments on COVID-19 data across six countries demonstrate that STOAT outperforms state-of-the-art probabilistic forecasting models (DeepAR, DeepVAR, Deep State Space Model, etc.) in key metrics, particularly in regions with strong spatial dependencies. By bridging causal inference and geospatial probabilistic forecasting, STOAT offers a generalizable framework for complex spatial-temporal tasks, such as epidemic management.
Submitted 12 June, 2025; v1 submitted 11 June, 2025;
originally announced June 2025.
-
BLUE: Bi-layer Heterogeneous Graph Fusion Network for Avian Influenza Forecasting
Authors:
Jing Du,
Haley Stone,
Yang Yang,
Ashna Desai,
Hao Xue,
Andreas Züfle,
Chandini Raina MacIntyre,
Flora D. Salim
Abstract:
Accurate forecasting of avian influenza outbreaks within wild bird populations requires models that account for complex, multi-scale transmission patterns driven by various factors. Spatio-temporal GNN-based models have recently gained traction for infection forecasting due to their ability to capture relations and flow between spatial regions, but most existing frameworks rely solely on spatial regions and their connections. This overlooks valuable genetic information at the case level, such as cases in one region being genetically descended from strains in another, which is essential for understanding how infectious diseases spread through epidemiological linkages beyond geography. We address this gap with BLUE, a Bi-Layer heterogeneous graph fUsion nEtwork designed to integrate genetic, spatial, and ecological data for accurate outbreak forecasting. The framework 1) builds heterogeneous graphs from multiple information sources and multiple layers, 2) smooths across relation types, 3) performs fusion while retaining structural patterns, and 4) predicts future outbreaks via an autoregressive graph sequence model that captures transmission dynamics over time. To facilitate further research, we introduce the Avian-US dataset, a dataset for avian influenza outbreak forecasting in the United States, incorporating genetic, spatial, and ecological data across locations. BLUE achieves superior performance over existing baselines, highlighting the value of incorporating multi-layer information into infectious disease forecasting.
Submitted 9 June, 2025; v1 submitted 28 May, 2025;
originally announced May 2025.
-
EMAC+: Embodied Multimodal Agent for Collaborative Planning with VLM+LLM
Authors:
Shuang Ao,
Flora D. Salim,
Simon Khan
Abstract:
Although LLMs demonstrate proficiency in several text-based reasoning and planning tasks, their implementation in robotics control is constrained by significant deficiencies: (1) LLM agents are designed to work mainly with textual inputs rather than visual conditions; (2) Current multimodal agents treat LLMs as static planners, which separates their reasoning from environment dynamics, resulting in actions that do not take domain-specific knowledge into account; and (3) LLMs are not designed to learn from visual interactions, which makes it harder for them to make better policies for specific domains. In this paper, we introduce EMAC+, an Embodied Multimodal Agent that collaboratively integrates LLM and VLM via a bidirectional training paradigm. Unlike existing methods, EMAC+ dynamically refines high-level textual plans generated by an LLM using real-time feedback from a VLM executing low-level visual control tasks. We address critical limitations of previous models by enabling the LLM to internalize visual environment dynamics directly through interactive experience, rather than relying solely on static symbolic mappings. Extensive experimental evaluations on ALFWorld and RT-1 benchmarks demonstrate that EMAC+ achieves superior task performance, robustness against noisy observations, and efficient learning. We also conduct thorough ablation studies and provide detailed analyses of success and failure cases.
Submitted 26 May, 2025;
originally announced May 2025.
-
Divide by Question, Conquer by Agent: SPLIT-RAG with Question-Driven Graph Partitioning
Authors:
Ruiyi Yang,
Hao Xue,
Imran Razzak,
Hakim Hacid,
Flora D. Salim
Abstract:
Retrieval-Augmented Generation (RAG) systems empower large language models (LLMs) with external knowledge, yet struggle with efficiency-accuracy trade-offs when scaling to large knowledge graphs. Existing approaches often rely on monolithic graph retrieval, incurring unnecessary latency for simple queries and fragmented reasoning for complex multi-hop questions. To address these challenges, this paper proposes SPLIT-RAG, a multi-agent RAG framework built on question-driven semantic graph partitioning and collaborative subgraph retrieval. The framework first creates a Semantic Partitioning of Linked Information, then uses the Type-Specialized knowledge base to achieve Multi-Agent RAG. The attribute-aware graph segmentation divides knowledge graphs into semantically coherent subgraphs, ensuring that subgraphs align with different query types. Lightweight LLM agents are assigned to the partitioned subgraphs, and only relevant partitions are activated during retrieval, reducing the search space while enhancing efficiency. Finally, a hierarchical merging module resolves inconsistencies across subgraph-derived answers through logical verification. Extensive experimental validation demonstrates considerable improvements compared to existing approaches.
Submitted 20 May, 2025;
originally announced May 2025.
-
SOCIA: An End-to-End Agentic Framework for Automated Cyber-Physical-Social Simulator Generation
Authors:
Yuncheng Hua,
Ji Miao,
Mehdi Jafari,
Jianxiang Xie,
Hao Xue,
Flora D. Salim
Abstract:
This paper introduces SOCIA (Simulation Orchestration for Cyber-physical-social Intelligence and Agents), a novel end-to-end framework leveraging Large Language Model (LLM)-based multi-agent systems to automate the generation of high-fidelity Cyber-Physical-Social (CPS) simulators. Addressing the challenges of labor-intensive manual simulator development and complex data calibration, SOCIA integrates a centralized orchestration manager that coordinates specialized agents for tasks including data comprehension, code generation, simulation execution, and iterative evaluation-feedback loops. Through empirical evaluations across diverse CPS tasks, such as mask adoption behavior simulation (social), personal mobility generation (physical), and user modeling (cyber), SOCIA demonstrates its ability to produce high-fidelity, scalable simulations with reduced human intervention. These results highlight SOCIA's potential to offer a scalable solution for studying complex CPS phenomena.
Submitted 23 May, 2025; v1 submitted 17 May, 2025;
originally announced May 2025.
-
Massive-STEPS: Massive Semantic Trajectories for Understanding POI Check-ins -- Dataset and Benchmarks
Authors:
Wilson Wongso,
Hao Xue,
Flora D. Salim
Abstract:
Understanding human mobility through Point-of-Interest (POI) recommendation is increasingly important for applications such as urban planning, personalized services, and generative agent simulation. However, progress in this field is hindered by two key challenges: the over-reliance on older datasets from 2012-2013 and the lack of reproducible, city-level check-in datasets that reflect diverse global regions. To address these gaps, we present Massive-STEPS (Massive Semantic Trajectories for Understanding POI Check-ins), a large-scale, publicly available benchmark dataset built upon the Semantic Trails dataset and enriched with semantic POI metadata. Massive-STEPS spans 12 geographically and culturally diverse cities and features more recent (2017-2018) and longer-duration (24 months) check-in data than prior datasets. We benchmarked a wide range of POI recommendation models on Massive-STEPS using both supervised and zero-shot approaches, and evaluated their performance across multiple urban contexts. By releasing Massive-STEPS, we aim to facilitate reproducible and equitable research in human mobility and POI recommendation. The dataset and benchmarking code are available at: https://github.com/cruiseresearchgroup/Massive-STEPS
Submitted 18 May, 2025; v1 submitted 16 May, 2025;
originally announced May 2025.
-
A data-driven approach for star formation parameterization using symbolic regression
Authors:
Diane M. Salim,
Matthew E. Orr,
Blakesley Burkhart,
Rachel S. Somerville,
Miles Cramner
Abstract:
Star formation (SF) in the interstellar medium (ISM) is fundamental to understanding galaxy evolution and planet formation. However, efforts to develop closed-form analytic expressions that link SF with key influencing physical variables, such as gas density and turbulence, remain challenging. In this work, we leverage recent advancements in machine learning (ML) and use symbolic regression (SR) techniques to produce the first data-driven, ML-discovered analytic expressions for SF using the publicly available FIRE-2 simulation suites. Employing a pipeline based on training the genetic algorithm of SR from an open software package called PySR, in tandem with a custom loss function and a model selection technique which compares candidate equations to analytic approaches to describing SF, we produce symbolic representations of a predictive model for the star formation rate surface density ($\Sigma_\mathrm{SFR}$) averaged over both 10 Myr and 100 Myr based on eight extracted variables from FIRE-2 galaxies. The model that PySR finds best describes SF, on both averaging timescales, features equations that incorporate the surface density of gas, $\Sigma_\mathrm{gas}$, the velocity dispersion of gas, $\sigma_{\mathrm{gas,~z}}$, and the surface density of stars, $\Sigma_\mathrm{*}$. Furthermore, we find that the equations found for the longer SFR timescale all converge to a scaling-relation-like equation, and all of them closely capture the intrinsic physical scatter of the data within the Kennicutt-Schmidt (KS) plane. This observed convergence to physically interpretable scaling relations at longer SFR timescales demonstrates that our method successfully identifies robust physical relationships rather than fitting to stochastic fluctuations.
Submitted 7 May, 2025;
originally announced May 2025.
-
Optimizing Electric Vehicle Charging Station Locations: A Data-driven System with Multi-source Fusion
Authors:
Lihuan Li,
Du Yin,
Hao Xue,
David Lillo-Trynes,
Flora Salim
Abstract:
With the growing demand for electric vehicle (EV) charging, urban planners face the challenge of providing charging infrastructure at optimal locations. For example, range anxiety during long-distance travel and the inadequate distribution of residential charging stations are major issues that many cities face. To achieve reasonable estimation of charging demand and deployment of charging stations, we develop a data-driven system based on existing EV trips in the state of New South Wales (NSW), Australia, incorporating multiple factors that enhance the geographical feasibility of recommended charging stations. Our system integrates data sources including EV trip data, geographical data such as route data and Local Government Area (LGA) boundaries, as well as features like fire and flood risks, and Points of Interest (POIs). We visualize our results to intuitively demonstrate the findings from our data-driven, multi-source fusion system, and evaluate them through case studies. The outcome of this work can provide a platform for discussion to develop new insights that could be used to give guidance on where to position future EV charging stations.
Submitted 18 April, 2025;
originally announced April 2025.
-
Evaluating the Bias in LLMs for Surveying Opinion and Decision Making in Healthcare
Authors:
Yonchanok Khaokaew,
Flora D. Salim,
Andreas Züfle,
Hao Xue,
Taylor Anderson,
C. Raina MacIntyre,
Matthew Scotch,
David J Heslop
Abstract:
Generative agents have been increasingly used to simulate human behaviour in silico, driven by large language models (LLMs). These simulacra serve as sandboxes for studying human behaviour without compromising privacy or safety. However, it remains unclear whether such agents can truly represent real individuals. This work compares survey data from the Understanding America Study (UAS) on healthcare decision-making with simulated responses from generative agents. Using demographic-based prompt engineering, we create digital twins of survey respondents and analyse how well different LLMs reproduce real-world behaviours. Our findings show that some LLMs fail to reflect realistic decision-making, such as predicting universal vaccine acceptance. In contrast, Llama 3 captures variations across race and income more accurately but also introduces biases not present in the UAS data. This study highlights the potential of generative agents for behavioural research while underscoring the risks of bias from both LLMs and prompting strategies.
Submitted 16 April, 2025; v1 submitted 11 April, 2025;
originally announced April 2025.
-
Information Retrieval for Climate Impact
Authors:
Maarten de Rijke,
Bart van den Hurk,
Flora Salim,
Alaa Al Khourdajie,
Nan Bai,
Renato Calzone,
Declan Curran,
Getnet Demil,
Lesley Frew,
Noah Gießing,
Mukesh Kumar Gupta,
Maria Heuss,
Sanaa Hobeichi,
David Huard,
Jingwei Kang,
Ana Lucic,
Tanwi Mallick,
Shruti Nath,
Andrew Okem,
Barbara Pernici,
Thilina Rajapakse,
Hira Saleem,
Harry Scells,
Nicole Schneider,
Damiano Spina
, et al. (6 additional authors not shown)
Abstract:
The purpose of the MANILA24 Workshop on information retrieval for climate impact was to bring together researchers from academia, industry, governments, and NGOs to identify and discuss core research problems in information retrieval to assess climate change impacts. The workshop aimed to foster collaboration by bringing communities together that have so far not been very well connected -- information retrieval, natural language processing, systematic reviews, impact assessments, and climate science. The workshop brought together a diverse set of researchers and practitioners interested in contributing to the development of a technical research agenda for information retrieval to assess climate change impacts.
Submitted 1 April, 2025;
originally announced April 2025.
-
Enforcing Consistency and Fairness in Multi-level Hierarchical Classification with a Mask-based Output Layer
Authors:
Shijing Chen,
Shoaib Jameel,
Mohamed Reda Bouadjenek,
Feilong Tang,
Usman Naseem,
Basem Suleiman,
Hakim Hacid,
Flora D. Salim,
Imran Razzak
Abstract:
Traditional Multi-level Hierarchical Classification (MLHC) classifiers often rely on backbone models with $n$ independent output layers. This structure tends to overlook the hierarchical relationships between classes, leading to inconsistent predictions that violate the underlying taxonomy. Additionally, once a backbone architecture for an MLHC classifier is selected, adapting the model to accommodate new tasks can be challenging. For example, incorporating fairness to protect sensitive attributes within a hierarchical classifier necessitates complex adjustments to maintain the class hierarchy while enforcing fairness constraints. In this paper, we extend this concept to hierarchical classification by introducing a fair, model-agnostic layer designed to enforce taxonomy and optimize specific objectives, including consistency, fairness, and exact match. Our evaluations demonstrate that the proposed layer not only improves the fairness of predictions but also enforces the taxonomy, resulting in consistent predictions and superior performance. Compared to Large Language Models (LLMs) employing in-processing de-biasing techniques and models without any bias correction, our approach achieves better outcomes in both fairness and accuracy, making it particularly valuable in sectors like e-commerce, healthcare, and education, where predictive reliability is crucial.
Submitted 19 March, 2025;
originally announced March 2025.
-
Embedding spatial context in urban traffic forecasting with contrastive pre-training
Authors:
Matthew Low,
Arian Prabowo,
Hao Xue,
Flora Salim
Abstract:
Urban traffic forecasting is a commonly encountered problem, with wide-ranging applications in fields such as urban planning, civil engineering and transport. In this paper, we study the enhancement of traffic forecasting with pre-training, focusing on spatio-temporal graph methods. While various machine learning methods to solve traffic forecasting problems have been explored and extensively studied, a more contextual approach is still lacking: studying how relevant non-traffic data can improve prediction performance on traffic forecasting problems. We call this data spatial context. We introduce a novel method of combining road and traffic information through the notion of a traffic quotient graph, a quotient graph formed from road geometry and traffic sensors. We also define a way to encode this relationship in the form of a geometric encoder, pre-trained using contrastive learning methods and enhanced with OpenStreetMap data. We introduce and discuss ways to integrate this geometric encoder with existing graph neural network (GNN)-based traffic forecasting models, using a contrastive pre-training paradigm. We demonstrate the potential for this hybrid model to improve generalisation and performance with zero additional traffic data. Code for this paper is available at https://github.com/mattchrlw/forecasting-on-new-roads.
Submitted 19 March, 2025;
originally announced March 2025.
-
Long Context Modeling with Ranked Memory-Augmented Retrieval
Authors:
Ghadir Alselwi,
Hao Xue,
Shoaib Jameel,
Basem Suleiman,
Flora D. Salim,
Imran Razzak
Abstract:
Effective long-term memory management is crucial for language models handling extended contexts. We introduce a novel framework that dynamically ranks memory entries based on relevance. Unlike previous works, our model introduces a novel relevance scoring and a pointwise re-ranking model for key-value embeddings, inspired by learning-to-rank techniques in information retrieval. Enhanced Ranked Memory Augmented Retrieval (ERMAR) achieves state-of-the-art results on standard benchmarks.
Submitted 18 March, 2025;
originally announced March 2025.
-
Beyond Single Pass, Looping Through Time: KG-IRAG with Iterative Knowledge Retrieval
Authors:
Ruiyi Yang,
Hao Xue,
Imran Razzak,
Hakim Hacid,
Flora D. Salim
Abstract:
Graph Retrieval-Augmented Generation (GraphRAG) has proven highly effective in enhancing the performance of Large Language Models (LLMs) on tasks that require external knowledge. By leveraging Knowledge Graphs (KGs), GraphRAG improves information retrieval for complex reasoning tasks, providing more precise and comprehensive retrieval and generating more accurate responses to questions. However, most RAG methods fall short in addressing multi-step reasoning, particularly when both information extraction and inference are necessary. To address this limitation, this paper presents Knowledge Graph-Based Iterative Retrieval-Augmented Generation (KG-IRAG), a novel framework that integrates KGs with iterative reasoning to improve LLMs' ability to handle queries involving temporal and logical dependencies. Through iterative retrieval steps, KG-IRAG incrementally gathers relevant data from external KGs, enabling step-by-step reasoning. The proposed approach is particularly suited for scenarios where reasoning is required alongside dynamic temporal data extraction, such as determining optimal travel times based on weather conditions or traffic patterns. Experimental results show that KG-IRAG improves accuracy in complex reasoning tasks by effectively integrating external knowledge with iterative, logic-based retrieval. Additionally, three new datasets (weatherQA-Irish, weatherQA-Sydney, and trafficQA-TFNSW) are constructed to evaluate KG-IRAG's performance, demonstrating its potential beyond traditional RAG applications.
Submitted 19 May, 2025; v1 submitted 18 March, 2025;
originally announced March 2025.
-
Foundation Models for Spatio-Temporal Data Science: A Tutorial and Survey
Authors:
Yuxuan Liang,
Haomin Wen,
Yutong Xia,
Ming Jin,
Bin Yang,
Flora Salim,
Qingsong Wen,
Shirui Pan,
Gao Cong
Abstract:
Spatio-Temporal (ST) data science, which includes sensing, managing, and mining large-scale data across space and time, is fundamental to understanding complex systems in domains such as urban computing, climate science, and intelligent transportation. Traditional deep learning approaches have significantly advanced this field, particularly in the stage of ST data mining. However, these models remain task-specific and often require extensive labeled data. Inspired by the success of Foundation Models (FM), especially large language models, researchers have begun exploring the concept of Spatio-Temporal Foundation Models (STFMs) to enhance adaptability and generalization across diverse ST tasks. Unlike prior architectures, STFMs empower the entire workflow of ST data science, from data sensing and management to mining, thereby offering a more holistic and scalable approach. Despite rapid progress, a systematic study of STFMs for ST data science remains lacking. This survey aims to provide a comprehensive review of STFMs, categorizing existing methodologies and identifying key research directions to advance ST general intelligence.
Submitted 12 March, 2025;
originally announced March 2025.
-
Harnessing Test-time Adaptation for NLU tasks Involving Dialects of English
Authors:
Duke Nguyen,
Aditya Joshi,
Flora Salim
Abstract:
Test-time adaptation (TTA) is a method that helps generalize models across domains, tasks, and distributions without the use of labeled datasets. TTA is therefore very useful for natural language processing (NLP) in the dialectal setting, since models are often trained on Standard American English (SAE) and evaluated on Indian English or Nigerian English, whose distributions differ significantly from that of SAE. This is especially useful since dialectal datasets are scarce. In this paper, we explore one of the most widely used TTA techniques, SHOT, in dialectal NLP. We finetune and evaluate SHOT on different combinations of dialectal GLUE. Our findings show that SHOT is a viable technique when labeled datasets are unavailable. We also theoretically propose the concept of dialectal gap and show that it has a positive correlation with the effectiveness of SHOT. We also find that in many cases, finetuning on SAE yields higher performance than finetuning on dialectal data. Our code is available at https://github.com/dukenguyenxyz/dialect-adaptation
Submitted 17 March, 2025;
originally announced March 2025.
-
COMODO: Cross-Modal Video-to-IMU Distillation for Efficient Egocentric Human Activity Recognition
Authors:
Baiyu Chen,
Wilson Wongso,
Zechen Li,
Yonchanok Khaokaew,
Hao Xue,
Flora Salim
Abstract:
Egocentric video-based models capture rich semantic information and have demonstrated strong performance in human activity recognition (HAR). However, their high power consumption, privacy concerns, and dependence on lighting conditions limit their feasibility for continuous on-device recognition. In contrast, inertial measurement unit (IMU) sensors offer an energy-efficient and privacy-preserving alternative, yet they suffer from limited large-scale annotated datasets, leading to weaker generalization in downstream tasks. To bridge this gap, we propose COMODO, a cross-modal self-supervised distillation framework that transfers rich semantic knowledge from the video modality to the IMU modality without requiring labeled annotations. COMODO leverages a pretrained and frozen video encoder to construct a dynamic instance queue, aligning the feature distributions of video and IMU embeddings. By distilling knowledge from video representations, our approach enables the IMU encoder to inherit rich semantic information from video while preserving its efficiency for real-world applications. Experiments on multiple egocentric HAR datasets demonstrate that COMODO consistently improves downstream classification performance, achieving results comparable to or exceeding fully supervised fine-tuned models. Moreover, COMODO exhibits strong cross-dataset generalization. Benefiting from its simplicity, our method is also generally applicable to various video and time-series pre-trained models, offering the potential to leverage more powerful teacher and student foundation models in future research. The code is available at https://github.com/Breezelled/COMODO .
Submitted 10 March, 2025;
originally announced March 2025.
-
TrajLLM: A Modular LLM-Enhanced Agent-Based Framework for Realistic Human Trajectory Simulation
Authors:
Chenlu Ju,
Jiaxin Liu,
Shobhit Sinha,
Hao Xue,
Flora Salim
Abstract:
This work leverages Large Language Models (LLMs) to simulate human mobility, addressing challenges like high costs and privacy concerns in traditional models. Our hierarchical framework integrates persona generation, activity selection, and destination prediction, using real-world demographic and psychological data to create realistic movement patterns. Both physical models and language models are employed to explore and demonstrate different methodologies for human mobility simulation. By structuring data with summarization and weighted density metrics, the system ensures scalable memory management while retaining actionable insights. Preliminary results indicate that LLM-driven simulations align with observed real-world patterns, offering scalable, interpretable insights for social problems such as urban planning, traffic management, and public health. The framework's ability to dynamically generate personas and activities enables it to provide adaptable and realistic daily routines. This study demonstrates the transformative potential of LLMs in advancing mobility modeling for societal and urban applications. The source code and interactive demo for our framework are available at https://github.com/cju0/TrajLLM.
Submitted 25 February, 2025;
originally announced February 2025.
-
Watch Out E-scooter Coming Through: Multimodal Sensing of Mixed Traffic Use and Conflicts Through Riders Ego-centric Views
Authors:
Hiruni Nuwanthika Kegalle,
Danula Hettiachchi,
Jeffrey Chan,
Mark Sanderson,
Flora D. Salim
Abstract:
E-scooters are becoming a popular means of urban transportation. However, this increased popularity brings challenges, such as road accidents and conflicts when sharing space with traditional transport modes. An in-depth understanding of e-scooter rider behaviour is crucial for ensuring rider safety, guiding infrastructure planning, and enforcing traffic rules. This study investigated rider behaviour through a naturalistic study with 23 participants equipped with a bike computer, eye-tracking glasses and cameras. They followed a pre-determined route, enabling multi-modal data collection. We analysed and compared gaze movements, speed, and video feeds across three transport infrastructure types: a pedestrian-shared path, a cycle lane and a roadway. Our findings reveal unique challenges e-scooter riders face, including difficulty keeping up with cyclists and motor vehicles due to speed limits on shared e-scooters, risks in signalling turns due to loss of control, and limited acceptance in mixed-use spaces. The cycle lane showed the highest average speed, the fewest speed change points, and the fewest head movements, supporting its suitability as dedicated infrastructure for e-scooters. These findings are facilitated through multimodal sensing and analysing the e-scooter riders' ego-centric view, which shows the efficacy of our method in discovering the behavioural dynamics of the riders in the wild. Our study highlights the critical need to align infrastructure with user behaviour to improve safety and emphasises the importance of targeted safety measures and regulations, especially when e-scooter riders share spaces with pedestrians or motor vehicles. The dataset and analysis code are available at https://github.com/HiruniNuwanthika/Electric-Scooter-Riders-Multi-Modal-Data-Analysis.git.
Submitted 23 February, 2025;
originally announced February 2025.
-
Enhancing Conversational Agents with Theory of Mind: Aligning Beliefs, Desires, and Intentions for Human-Like Interaction
Authors:
Mehdi Jafari,
Devin Yuncheng Hua,
Hao Xue,
Flora Salim
Abstract:
Natural language interaction with agentic Artificial Intelligence (AI), driven by Large Language Models (LLMs), is expected to remain a dominant paradigm in the near future. While humans instinctively align their communication with mental states -- an ability known as Theory of Mind (ToM) -- current LLM-powered systems exhibit significant limitations in this regard. This study examines the extent to which open-source language models (LLaMA) can capture and preserve ToM-related information and how effectively it contributes to consistent ToM reasoning in generated responses. We further investigate whether explicit manipulation of ToM-related components, such as beliefs, desires, and intentions, can enhance response alignment. Experiments on two LLaMA 3 variants demonstrate that incorporating ToM-informed alignment improves response quality, achieving win rates of 67 and 63 percent for the 3B and 8B models, respectively. These findings highlight the potential of ToM-driven strategies to improve alignment in LLM-based conversational agents.
Submitted 20 May, 2025; v1 submitted 19 February, 2025;
originally announced February 2025.
-
AI-Empowered Catalyst Discovery: A Survey from Classical Machine Learning Approaches to Large Language Models
Authors:
Yuanyuan Xu,
Hanchen Wang,
Wenjie Zhang,
Lexing Xie,
Yin Chen,
Flora Salim,
Ying Zhang,
Justin Gooding,
Toby Walsh
Abstract:
Catalysts are essential for accelerating chemical reactions and enhancing selectivity, which is crucial for the sustainable production of energy, materials, and bioactive compounds. Catalyst discovery is fundamental yet challenging in computational chemistry and has garnered significant attention due to the promising performance of advanced Artificial Intelligence (AI) techniques. The development of Large Language Models (LLMs) notably accelerates progress in the discovery of both homogeneous and heterogeneous catalysts, where their chemical reactions differ significantly in material phases, temperature, dynamics, etc. However, there is currently no comprehensive survey that discusses the progress and latest developments in both areas, particularly with the application of LLM techniques. To address this gap, this paper presents a thorough and systematic survey of AI-empowered catalyst discovery, employing a unified and general categorization for homogeneous and heterogeneous catalysts. We examine the progress of AI-empowered catalyst discovery, highlighting their individual advantages and disadvantages, and discuss the challenges faced in this field. Furthermore, we suggest potential directions for future research from the perspective of computer science. Our goal is to assist researchers in computational chemistry, computer science, and related fields in easily tracking the latest advancements, providing a clear overview and roadmap of this area. We also organize and make accessible relevant resources, including article lists and datasets, in an open repository at https://github.com/LuckyGirl-XU/Awesome-Artificial-Intelligence-Empowered-Catalyst-Discovery.
Submitted 19 February, 2025;
originally announced February 2025.
-
PAR-AdvGAN: Improving Adversarial Attack Capability with Progressive Auto-Regression AdvGAN
Authors:
Jiayu Zhang,
Zhiyu Zhu,
Xinyi Wang,
Silin Liao,
Zhibo Jin,
Flora D. Salim,
Huaming Chen
Abstract:
Deep neural networks have demonstrated remarkable performance across various domains. However, they are vulnerable to adversarial examples, which can lead to erroneous predictions. Generative Adversarial Networks (GANs) can leverage generator and discriminator models to quickly produce high-quality adversarial examples. Since both modules train in a competitive and simultaneous manner, GAN-based algorithms like AdvGAN can generate adversarial examples with better transferability compared to traditional methods. However, the generation of perturbations is usually limited to a single iteration, preventing these examples from fully exploiting the potential of the methods. To tackle this issue, we introduce a novel approach named Progressive Auto-Regression AdvGAN (PAR-AdvGAN). It incorporates an auto-regressive iteration mechanism within a progressive generation network to craft adversarial examples with enhanced attack capability. We thoroughly evaluate our PAR-AdvGAN method with a large-scale experiment, demonstrating its superior performance over various state-of-the-art black-box adversarial attacks, as well as the original AdvGAN. Moreover, PAR-AdvGAN significantly accelerates adversarial example generation, achieving speeds of up to 335.5 frames per second on the Inception-v3 model and outperforming the gradient-based transferable attack algorithms. Our code is available at: https://anonymous.4open.science/r/PAR-01BF/
Submitted 16 February, 2025;
originally announced February 2025.
-
RIDE: Enhancing Large Language Model Alignment through Restyled In-Context Learning Demonstration Exemplars
Authors:
Yuncheng Hua,
Lizhen Qu,
Zhuang Li,
Hao Xue,
Flora D. Salim,
Gholamreza Haffari
Abstract:
Alignment tuning is crucial for ensuring large language models (LLMs) behave ethically and helpfully. Current alignment approaches require high-quality annotations and significant training resources. This paper proposes a low-cost, tuning-free method using in-context learning (ICL) to enhance LLM alignment. Through an analysis of high-quality ICL demos, we identified style as a key factor influencing LLM alignment capabilities and explicitly restyled ICL exemplars based on this stylistic framework. Additionally, we combined the restyled demos to achieve a balance between the two conflicting aspects of LLM alignment--factuality and safety. We packaged the restyled examples as prompts to trigger few-shot learning, improving LLM alignment. Compared to the best baseline approach, with an average score of 5.00 as the maximum, our method achieves a maximum 0.10 increase on the Alpaca task (from 4.50 to 4.60), a 0.22 enhancement on the Just-eval benchmark (from 4.34 to 4.56), and a maximum improvement of 0.32 (from 3.53 to 3.85) on the MT-Bench dataset. We release the code and data at https://github.com/AnonymousCode-ComputerScience/RIDE.
Submitted 5 March, 2025; v1 submitted 17 February, 2025;
originally announced February 2025.
-
The Homework Wars: Exploring Emotions, Behaviours, and Conflicts in Parent-Child Homework Interactions
Authors:
Nan Gao,
Yibin Liu,
Xin Tang,
Yanyan Liu,
Chun Yu,
Yun Huang,
Yuntao Wang,
Flora D. Salim,
Xuhai Orson Xu,
Jun Wei,
Yuanchun Shi
Abstract:
Parental involvement in homework is a crucial aspect of family education, but it often leads to emotional strain and conflicts that can severely impact family well-being. This paper presents findings from a 4-week in situ study involving 78 families in China, where we collected and analyzed 602 valid audio recordings (totalling 475 hours) and daily surveys. Leveraging large language models (LLMs) to analyze parent-child conversations, we gained a nuanced understanding of emotional and behavioural dynamics that overcomes the limitations of traditional one-time surveys and interviews. Our findings reveal significant emotional shifts in parents before and after homework involvement and summarise a range of positive, neutral and negative parental behaviours. We also catalogue seven common conflicts, with Knowledge Conflict being the most frequent. Notably, we found that even well-intentioned parental behaviours, such as Unlabelled Praise, were significantly positively correlated with specific conflict types. This work advances ubiquitous computing's research to sense and understand complex family dynamics, while offering evidence-based insights for designing future ambient intelligent systems to support healthy family education environments.
Submitted 4 February, 2025; v1 submitted 3 February, 2025;
originally announced February 2025.
-
Leveraging Taxonomy and LLMs for Improved Multimodal Hierarchical Classification
Authors:
Shijing Chen,
Mohamed Reda Bouadjenek,
Shoaib Jameel,
Usman Naseem,
Basem Suleiman,
Flora D. Salim,
Hakim Hacid,
Imran Razzak
Abstract:
Multi-level Hierarchical Classification (MLHC) tackles the challenge of categorizing items within a complex, multi-layered class structure. However, traditional MLHC classifiers often rely on a backbone model with independent output layers, which tend to ignore the hierarchical relationships between classes. This oversight can lead to inconsistent predictions that violate the underlying taxonomy. Leveraging Large Language Models (LLMs), we propose a novel taxonomy-embedded transitional LLM-agnostic framework for multimodality classification. The cornerstone of this advancement is the ability of models to enforce consistency across hierarchical levels. Our evaluations on the MEP-3M dataset - a multi-modal e-commerce product dataset with various hierarchical levels - demonstrated a significant performance improvement compared to conventional LLM structures.
Submitted 12 January, 2025;
originally announced January 2025.
-
RCAEval: A Benchmark for Root Cause Analysis of Microservice Systems with Telemetry Data
Authors:
Luan Pham,
Hongyu Zhang,
Huong Ha,
Flora Salim,
Xiuzhen Zhang
Abstract:
Root cause analysis (RCA) for microservice systems has gained significant attention in recent years. However, there is still no standard benchmark that includes large-scale datasets and supports comprehensive evaluation environments. In this paper, we introduce RCAEval, an open-source benchmark that provides datasets and an evaluation environment for RCA in microservice systems. First, we introduce three comprehensive datasets comprising 735 failure cases collected from three microservice systems, covering various fault types observed in real-world failures. Second, we present a comprehensive evaluation framework that includes fifteen reproducible baselines covering a wide range of RCA approaches, with the ability to evaluate both coarse-grained and fine-grained RCA. We hope that this ready-to-use benchmark will enable researchers and practitioners to conduct extensive analysis and pave the way for robust new solutions for RCA of microservice systems.
Submitted 3 February, 2025; v1 submitted 22 December, 2024;
originally announced December 2024.
-
BiTSA: Leveraging Time Series Foundation Model for Building Energy Analytics
Authors:
Xiachong Lin,
Arian Prabowo,
Imran Razzak,
Hao Xue,
Matthew Amos,
Sam Behrens,
Flora D. Salim
Abstract:
Incorporating AI technologies into digital infrastructure offers transformative potential for energy management, particularly in enhancing energy efficiency and supporting net-zero objectives. However, the complexity of IoT-generated datasets often poses a significant challenge, hindering the translation of research insights into practical, real-world applications. This paper presents the design of an interactive visualization tool, BiTSA. The tool enables building managers to interpret complex energy data quickly and take immediate, data-driven actions based on real-time insights. By integrating advanced forecasting models with an intuitive visual interface, our solution facilitates proactive decision-making, optimizes energy consumption, and promotes sustainable building management practices. BiTSA will empower building managers to optimize energy consumption, control demand-side energy usage, and achieve sustainability goals.
Submitted 20 November, 2024;
originally announced December 2024.
-
Round and Communication Efficient Graph Coloring
Authors:
Yi-Jun Chang,
Gopinath Mishra,
Hung Thuan Nguyen,
Farrel D Salim
Abstract:
In the context of communication complexity, we explore protocols for graph coloring, focusing on the vertex and edge coloring problems in $n$-vertex graphs $G$ with a maximum degree $Δ$. We consider a scenario where the edges of $G$ are partitioned between two players.
Our first contribution is a randomized protocol that efficiently finds a $(Δ+ 1)$-vertex coloring of $G$, utilizing $O(n)$ bits of communication in expectation and completing in $O(\log \log n \cdot \log Δ)$ rounds in the worst case. This advancement represents a significant improvement over the work of Flin and Mittal [Distributed Computing 2025], who achieved the same communication cost but required $O(n)$ rounds in expectation, thereby making a significant reduction in the round complexity.
Our second contribution is a deterministic protocol to compute a $(2Δ- 1)$-edge coloring of $G$, which maintains the same $O(n)$ bits of communication and uses only $O(1)$ rounds. We complement the result with a tight $Ω(n)$-bit lower bound on the communication complexity of the $(2Δ-1)$-edge coloring problem, while a similar $Ω(n)$ lower bound for the $(Δ+1)$-vertex coloring problem has been established by Flin and Mittal [Distributed Computing 2025]. Our result implies a space lower bound of $Ω(n)$ bits for $(2Δ- 1)$-edge coloring in the $W$-streaming model, which is the first non-trivial space lower bound for edge coloring in the $W$-streaming model.
Submitted 8 May, 2025; v1 submitted 17 December, 2024;
originally announced December 2024.
-
Exploring Capabilities of Time Series Foundation Models in Building Analytics
Authors:
Xiachong Lin,
Arian Prabowo,
Imran Razzak,
Hao Xue,
Matthew Amos,
Sam Behrens,
Flora D. Salim
Abstract:
The growing integration of digitized infrastructure with Internet of Things (IoT) networks has transformed the management and optimization of building energy consumption. By leveraging IoT-based monitoring systems, stakeholders such as building managers, energy suppliers, and policymakers can make data-driven decisions to improve energy efficiency. However, accurate energy forecasting and analytics face persistent challenges, primarily due to the inherent physical constraints of buildings and the diverse, heterogeneous nature of IoT-generated data. In this study, we conduct a comprehensive benchmarking of two publicly available IoT datasets, evaluating the performance of time series foundation models in the context of building energy analytics. Our analysis shows that single-modal models demonstrate significant promise in overcoming the complexities of data variability and physical limitations in buildings, with future work focusing on optimizing multi-modal models for sustainable energy management.
Submitted 27 October, 2024;
originally announced November 2024.
-
ODEStream: A Buffer-Free Online Learning Framework with ODE-based Adaptor for Streaming Time Series Forecasting
Authors:
Futoon M. Abushaqra,
Hao Xue,
Yongli Ren,
Flora D. Salim
Abstract:
Addressing the challenges of irregularity and concept drift in streaming time series is crucial for real-world predictive modelling. Previous studies in time series continual learning often propose models that require buffering long sequences, potentially restricting the responsiveness of the inference system. Moreover, these models are typically designed for regularly sampled data, an unrealistic assumption in real-world scenarios. This paper introduces ODEStream, a novel buffer-free continual learning framework that incorporates a temporal isolation layer to capture temporal dependencies within the data. Simultaneously, it leverages the capability of neural ordinary differential equations to process irregular sequences and generate a continuous data representation, enabling seamless adaptation to changing dynamics in a data streaming scenario. Our approach focuses on learning how the dynamics and distribution of historical data change over time, facilitating direct processing of streaming sequences. Evaluations on benchmark real-world datasets demonstrate that ODEStream outperforms the state-of-the-art online learning and streaming analysis baseline models, providing accurate predictions over extended periods while minimising performance degradation over time by learning how the sequence dynamics change. The implementation of ODEStream is available at: https://github.com/FtoonAbushaqra/ODEStream.git.
Submitted 9 April, 2025; v1 submitted 11 November, 2024;
originally announced November 2024.
-
PACER: Physics Informed Uncertainty Aware Climate Emulator
Authors:
Hira Saleem,
Flora Salim,
Cormac Purcell
Abstract:
Climate models serve as critical tools for evaluating the effects of climate change and projecting future climate scenarios. However, the reliance on numerical simulations of physical equations renders them computationally intensive and inefficient. While deep learning methodologies have made significant progress in weather forecasting, they are still unstable for climate emulation tasks. Here, we propose PACER, a lightweight 684K parameter Physics Informed Uncertainty Aware Climate Emulator. PACER emulates temperature and precipitation stably for 86 years while only being trained on greenhouse gas emissions data. We incorporate a fundamental physical law of advection-diffusion in PACER accounting for boundary conditions and empirically estimating the diffusion co-efficient and flow velocities from emissions data. PACER has been trained on 15 climate models provided by ClimateSet outperforming baselines across most of the climate models and advancing a new state of the art in a climate diagnostic task.
Submitted 30 October, 2024; v1 submitted 28 October, 2024;
originally announced October 2024.
-
GenUP: Generative User Profilers as In-Context Learners for Next POI Recommender Systems
Authors:
Wilson Wongso,
Hao Xue,
Flora D. Salim
Abstract:
Traditional Point-of-Interest (POI) recommendation systems often lack transparency, interpretability, and scrutability due to their reliance on dense vector-based user embeddings. Furthermore, the cold-start problem -- where systems have insufficient data for new users -- limits their ability to generate accurate recommendations. Existing methods often address this by leveraging similar trajectories from other users, but this approach can be computationally expensive and increases the context length for LLM-based methods, making them difficult to scale. To address these limitations, we propose a method that generates natural language (NL) user profiles from large-scale, location-based social network (LBSN) check-ins, utilizing robust personality assessments and behavioral theories. These NL profiles capture user preferences, routines, and behaviors, improving POI prediction accuracy while offering enhanced transparency. By incorporating NL profiles as system prompts to LLMs, our approach reduces reliance on extensive historical data, while remaining flexible, easily updated, and computationally efficient. Our method is not only competitive with other LLM-based and complex agentic frameworks but is also more scalable for real-world POI recommender systems. Results demonstrate that our approach consistently outperforms baseline methods, offering a more interpretable and resource-efficient solution for POI recommendation systems. Our source code is available at: https://github.com/w11wo/GenUP/.
Submitted 12 March, 2025; v1 submitted 27 October, 2024;
originally announced October 2024.
-
Evaluating the Influences of Explanation Style on Human-AI Reliance
Authors:
Emma Casolin,
Flora D. Salim,
Ben Newell
Abstract:
Explainable AI (XAI) aims to support appropriate human-AI reliance by increasing the interpretability of complex model decisions. Despite the proliferation of proposed methods, there is mixed evidence surrounding the effects of different styles of XAI explanations on human-AI reliance. Interpreting these conflicting findings requires an understanding of the individual and combined qualities of different explanation styles that influence appropriate and inappropriate human-AI reliance, and the role of interpretability in this interaction. In this study, we investigate the influences of feature-based, example-based, and combined feature- and example-based XAI methods on human-AI reliance through a two-part experimental study with 274 participants comparing these explanation style conditions. Our findings suggest differences between feature-based and example-based explanation styles beyond interpretability that affect human-AI reliance patterns across differences in individual performance and task complexity. Our work highlights the importance of adapting explanations to their specific users and context over maximising broad interpretability.
Submitted 26 October, 2024;
originally announced October 2024.
-
SensorLLM: Human-Intuitive Alignment of Multivariate Sensor Data with LLMs for Activity Recognition
Authors:
Zechen Li,
Shohreh Deldari,
Linyao Chen,
Hao Xue,
Flora D. Salim
Abstract:
We introduce SensorLLM, a two-stage framework that enables Large Language Models (LLMs) to perform human activity recognition (HAR) from wearable sensor data. While LLMs excel at reasoning and generalization, they struggle with time-series inputs due to limited semantic context, numerical complexity, and sequence variability. To address these challenges, we construct SensorQA, a question-answering dataset of human-intuitive sensor-text pairs spanning diverse HAR scenarios. It supervises the Sensor-Language Alignment stage, where the model aligns sensor inputs with trend descriptions. Special tokens are introduced to mark channel boundaries. This alignment enables LLMs to interpret numerical patterns, channel-specific signals, and variable-length inputs--without requiring human annotation. In the subsequent Task-Aware Tuning stage, we adapt the model for multivariate HAR classification, achieving performance that matches or exceeds state-of-the-art methods. Our results show that, guided by human-intuitive alignment, SensorLLM becomes an effective sensor learner, reasoner, and classifier--generalizing across varied HAR settings and paving the way for foundation model research in time-series analysis.
Submitted 20 May, 2025; v1 submitted 14 October, 2024;
originally announced October 2024.
-
Inside Out or Not: Privacy Implications of Emotional Disclosure
Authors:
Elham Naghizade,
Kaixin Ji,
Benjamin Tag,
Flora Salim
Abstract:
Privacy is dynamic, sensitive, and contextual, much like our emotions. Previous studies have explored the interplay between privacy and context, privacy and emotion, and emotion and context. However, there remains a significant gap in understanding the interplay of these aspects simultaneously. In this paper, we present a preliminary study investigating the role of emotions in driving individuals' information sharing behaviour, particularly in relation to urban locations and social ties. We adopt a novel methodology that integrates context (location and time), emotion, and personal information sharing behaviour, providing a comprehensive analysis of how contextual emotions affect privacy. The emotions are assessed with both self-reporting and electrodermal activity (EDA). Our findings reveal that self-reported emotions influence personal information-sharing behaviour with distant social groups, while neutral emotions lead individuals to share less precise information with close social circles, a pattern that is potentially detectable with wrist-worn EDA. Our study helps lay the foundation for personalised emotion-aware strategies to mitigate oversharing risks and enhance user privacy in the digital age.
Submitted 18 September, 2024;
originally announced September 2024.
-
Characterizing the Molecular Gas in Infrared Bright Galaxies with CARMA
Authors:
Katherine Alatalo,
Andreea O. Petric,
Lauranne Lanz,
Kate Rowlands,
Vivian U,
Kirsten L. Larson,
Lee Armus,
Loreto Barcos-Muñoz,
Aaron S. Evans,
Jin Koda,
Yuanze Luo,
Anne M. Medling,
Kristina E. Nyland,
Justin A. Otter,
Pallavi Patil,
Fernando Peñaloza,
Diane Salim,
David B. Sanders,
Elizaveta Sazonova,
Maya Skarbinski,
Yiqing Song,
Ezequiel Treister,
C. Meg Urry
Abstract:
We present the CO(1-0) maps of 28 infrared-bright galaxies from the Great Observatories All-Sky Luminous Infrared Galaxy Survey (GOALS) taken with the Combined Array for Research in Millimeter Astronomy (CARMA). We detect 100 GHz continuum in 16 of 28 galaxies, which traces both active galactic nuclei (AGNs) and compact star-forming cores. The GOALS galaxies show a variety of molecular gas morphologies, though in the majority of cases, the average velocity fields show a gradient consistent with rotation. We fit the full continuum SED of each source using either MAGPHYS or SED3FIT (if there are signs of an AGN) to derive the total stellar mass, dust mass, and star formation rate of each object. We adopt a value determined from luminous and ultraluminous infrared galaxies (LIRGs and ULIRGs) of $\alpha_{\rm CO}=1.5^{+1.3}_{-0.8}~M_\odot$ (K km s$^{-1}$ pc$^2)^{-1}$, which leads to more physical values for $f_{\rm mol}$ and the gas-to-dust ratio. Mergers tend to have the highest gas-to-dust ratios. Assuming the cospatiality of the molecular gas and star formation, we plot the sample on the Schmidt-Kennicutt relation and find that the galaxies preferentially lie above the line set by normal star-forming galaxies. This hyper-efficiency is likely due to the increased turbulence in these systems, which decreases the freefall time compared to star-forming galaxies, leading to "enhanced" star formation efficiency. Line wings are present in a non-negligible subsample (11/28) of the CARMA GOALS sources and are likely due to outflows driven by AGNs or star formation, gas inflows, or additional decoupled gas components.
Submitted 13 September, 2024;
originally announced September 2024.
-
Exploring Large-Scale Language Models to Evaluate EEG-Based Multimodal Data for Mental Health
Authors:
Yongquan Hu,
Shuning Zhang,
Ting Dang,
Hong Jia,
Flora D. Salim,
Wen Hu,
Aaron J. Quigley
Abstract:
Integrating physiological signals such as electroencephalogram (EEG) with other data such as interview audio may offer valuable multimodal insights into psychological states or neurological disorders. Recent advancements with Large Language Models (LLMs) position them as prospective "health agents" for mental health assessment. However, current research predominantly focuses on single data modalities, presenting an opportunity to advance understanding through multimodal data. Our study aims to advance this approach by investigating multimodal data using LLMs for mental health assessment, specifically through zero-shot and few-shot prompting. Three datasets are adopted for depression and emotion classifications incorporating EEG, facial expressions, and audio (text). The results indicate that multimodal information confers substantial advantages over single modality approaches in mental health assessment. Notably, integrating EEG alongside commonly used LLM modalities such as audio and images demonstrates promising potential. Moreover, our findings reveal that 1-shot learning offers greater benefits compared to zero-shot learning methods.
Submitted 14 August, 2024;
originally announced August 2024.
-
WorkR: Occupation Inference for Intelligent Task Assistance
Authors:
Yonchanok Khaokaew,
Hao Xue,
Mohammad Saiedur Rahaman,
Flora D. Salim
Abstract:
Occupation information can be utilized by digital assistants to provide occupation-specific personalized task support, including interruption management, task planning, and recommendations. Prior research in the digital workplace assistant domain requires users to input their occupation information for effective support. However, as many individuals switch between multiple occupations daily, current solutions falter without continuous user input. To address this, this study introduces WorkR, a framework that leverages passive sensing to capture pervasive signals from various task activities, addressing three challenges: the lack of a passive sensing architecture, personalization of occupation characteristics, and discovering latent relationships among occupation variables. We argue that signals from application usage, movements, social interactions, and the environment can inform a user's occupation. WorkR uses a Variational Autoencoder (VAE) to derive latent features for training models to infer occupations. Our experiments with an anonymized, context-rich activity and task log dataset demonstrate that our models can accurately infer occupations with more than 91% accuracy across six ISO occupation categories.
Submitted 26 July, 2024;
originally announced July 2024.
-
Long-term Fairness in Ride-Hailing Platform
Authors:
Yufan Kang,
Jeffrey Chan,
Wei Shao,
Flora D. Salim,
Christopher Leckie
Abstract:
Matching in two-sided markets such as ride-hailing has recently received significant attention. However, existing studies on ride-hailing mainly focus on optimising efficiency, and fairness issues in ride-hailing have been neglected. Fairness issues in ride-hailing, including significant earning differences between drivers and variance of passenger waiting times among different locations, have potential impacts on economic and ethical aspects. The recent studies that focus on fairness in ride-hailing exploit traditional optimisation methods and the Markov Decision Process to balance efficiency and fairness. However, there are several issues in these existing studies, such as myopic short-term decision-making from traditional optimisation and instability of fairness in a comparably longer horizon from both traditional optimisation and Markov Decision Process-based methods. To address these issues, we propose a dynamic Markov Decision Process model to alleviate fairness issues currently faced by ride-hailing, and seek a balance between efficiency and fairness, with two distinct characteristics: (i) a prediction module to predict the number of requests that will be raised in the future from different locations, allowing the proposed method to consider long-term fairness based on the whole timeline instead of considering fairness only based on historical and current data patterns; (ii) a customised scalarisation function for multi-objective multi-agent Q Learning that aims to balance efficiency and fairness. Extensive experiments on a publicly available real-world dataset demonstrate that our proposed method outperforms existing state-of-the-art methods.
Submitted 25 July, 2024;
originally announced July 2024.
-
AuditNet: A Conversational AI-based Security Assistant [DEMO]
Authors:
Shohreh Deldari,
Mohammad Goudarzi,
Aditya Joshi,
Arash Shaghaghi,
Simon Finn,
Flora D. Salim,
Sanjay Jha
Abstract:
In the age of information overload, professionals across various fields face the challenge of navigating vast amounts of documentation and ever-evolving standards. Ensuring compliance with standards, regulations, and contractual obligations is a critical yet complex task across various professional fields. We propose a versatile conversational AI assistant framework designed to facilitate compliance checking on the go, in diverse domains, including but not limited to network infrastructure, legal contracts, educational standards, environmental regulations, and government policies. By leveraging retrieval-augmented generation using large language models, our framework automates the review, indexing, and retrieval of relevant, context-aware information, streamlining the process of verifying adherence to established guidelines and requirements. This AI assistant not only reduces the manual effort involved in compliance checks but also enhances accuracy and efficiency, supporting professionals in maintaining high standards of practice and ensuring regulatory compliance in their respective fields. We propose and demonstrate AuditNet, the first conversational AI security assistant designed to assist IoT network security experts by providing instant access to security standards, policies, and regulations.
Submitted 19 July, 2024;
originally announced July 2024.
-
Understanding Physiological Responses of Students Over Different Courses
Authors:
Soundariya Ananthan,
Nan Gao,
Flora D. Salim
Abstract:
Student engagement plays a vital role in academic success with high engagement often linked to positive educational outcomes. Traditionally, student engagement is measured through self-reports, which are both labour-intensive and not real-time. An emerging alternative is monitoring physiological signals such as Electrodermal Activity (EDA) and Inter-Beat Interval (IBI), which reflect students' emotional and cognitive states. In this research, we analyzed these signals from 23 students wearing Empatica E4 devices in real-world scenarios. Diverging from previous studies focused on lab settings or specific subjects, we examined physiological synchrony at the intra-student level across various courses. We also assessed how different courses influence physiological responses and identified consistent temporal patterns. Our findings show unique physiological response patterns among students, enhancing our understanding of student engagement dynamics. This opens up possibilities for tailoring educational strategies based on unobtrusive sensing data to optimize learning outcomes.
Submitted 19 July, 2024;
originally announced July 2024.
-
REVEAL-IT: REinforcement learning with Visibility of Evolving Agent poLicy for InTerpretability
Authors:
Shuang Ao,
Simon Khan,
Haris Aziz,
Flora D. Salim
Abstract:
Understanding the agent's learning process, particularly the factors that contribute to its success or failure post-training, is crucial for comprehending the rationale behind the agent's decision-making process. Prior methods clarify the learning process by creating a structural causal model (SCM) or visually representing the distribution of value functions. Nevertheless, these approaches are limited, as they function only in 2D environments or with uncomplicated transition dynamics. Understanding the agent's learning process in complicated environments or tasks is more challenging. In this paper, we propose REVEAL-IT, a novel framework for explaining the learning process of an agent in complex environments. Initially, we visualize the policy structure and the agent's learning process for various training tasks. By visualizing these findings, we can understand how much a particular training task or stage affects the agent's performance at test time. Then, a GNN-based explainer learns to highlight the most important section of the policy, providing a clearer and more robust explanation of the agent's learning process. The experiments demonstrate that explanations derived from this framework can effectively help in the optimization of the training tasks, resulting in improved learning efficiency and final performance.
Submitted 14 October, 2024; v1 submitted 20 June, 2024;
originally announced June 2024.
-
ViLCo-Bench: VIdeo Language COntinual learning Benchmark
Authors:
Tianqi Tang,
Shohreh Deldari,
Hao Xue,
Celso De Melo,
Flora D. Salim
Abstract:
Video language continual learning involves continuously adapting to information from video and text inputs, enhancing a model's ability to handle new tasks while retaining prior knowledge. This field is a relatively under-explored area, and establishing appropriate datasets is crucial for facilitating communication and research in this field. In this study, we present the first dedicated benchmark, ViLCo-Bench, designed to evaluate continual learning models across a range of video-text tasks. The dataset comprises ten-minute-long videos and corresponding language queries collected from publicly available datasets. Additionally, we introduce a novel memory-efficient framework that incorporates self-supervised learning and mimics long-term and short-term memory effects. This framework addresses challenges including memory complexity from long video clips, natural language complexity from open queries, and text-video misalignment. We posit that ViLCo-Bench, with greater complexity compared to existing continual learning benchmarks, would serve as a critical tool for exploring the video-language domain, extending beyond conventional class-incremental tasks, and addressing complex and limited annotation issues. The curated data, evaluations, and our novel method are available at https://github.com/cruiseresearchgroup/ViLCo.
Submitted 15 December, 2024; v1 submitted 18 June, 2024;
originally announced June 2024.
-
BTS: Building Timeseries Dataset: Empowering Large-Scale Building Analytics
Authors:
Arian Prabowo,
Xiachong Lin,
Imran Razzak,
Hao Xue,
Emily W. Yap,
Matthew Amos,
Flora D. Salim
Abstract:
Buildings play a crucial role in human well-being, influencing occupant comfort, health, and safety. Additionally, they contribute significantly to global energy consumption, accounting for one-third of total energy usage, and carbon emissions. Optimizing building performance presents a vital opportunity to combat climate change and promote human flourishing. However, research in building analytics has been hampered by the lack of accessible, available, and comprehensive real-world datasets on multiple building operations. In this paper, we introduce the Building TimeSeries (BTS) dataset. Our dataset covers three buildings over a three-year period, comprising more than ten thousand timeseries data points with hundreds of unique ontologies. Moreover, the metadata is standardized using the Brick schema. To demonstrate the utility of this dataset, we performed benchmarks on two tasks: timeseries ontology classification and zero-shot forecasting. These tasks represent an essential initial step in addressing challenges related to interoperability in building analytics. The dataset and the code used for benchmarking are available here: https://github.com/cruiseresearchgroup/DIEF_BTS .
Submitted 18 June, 2024; v1 submitted 13 June, 2024;
originally announced June 2024.
-
STEMO: Early Spatio-temporal Forecasting with Multi-Objective Reinforcement Learning
Authors:
Wei Shao,
Yufan Kang,
Ziyan Peng,
Xiao Xiao,
Lei Wang,
Yuhui Yang,
Flora D Salim
Abstract:
Accuracy and timeliness are often conflicting goals in prediction tasks. Premature predictions may yield a higher rate of false alarms, whereas delaying predictions to gather more information can render them too late to be useful. In applications such as wildfires, crimes, and traffic jams, timely forecasting is vital for safeguarding human life and property. Consequently, finding a balance between accuracy and timeliness is crucial. In this paper, we propose an early spatio-temporal forecasting model based on Multi-Objective reinforcement learning that can either implement an optimal policy given a preference or infer the preference based on a small number of samples. The model addresses two primary challenges: 1) enhancing the accuracy of early forecasting and 2) providing the optimal policy for determining the most suitable prediction time for each area. Our method demonstrates superior performance on three large-scale real-world datasets, surpassing existing methods in early spatio-temporal forecasting tasks.
Submitted 18 June, 2024; v1 submitted 6 June, 2024;
originally announced June 2024.
-
ST-DPGAN: A Privacy-preserving Framework for Spatiotemporal Data Generation
Authors:
Wei Shao,
Rongyi Zhu,
Cai Yang,
Chandra Thapa,
Muhammad Ejaz Ahmed,
Seyit Camtepe,
Rui Zhang,
DuYong Kim,
Hamid Menouar,
Flora D. Salim
Abstract:
Spatiotemporal data is prevalent in a wide range of edge devices, such as those used in personal communication and financial transactions. Recent advancements have sparked a growing interest in integrating spatiotemporal analysis with large-scale language models. However, spatiotemporal data often contains sensitive information, making it unsuitable for open third-party access. To address this challenge, we propose a Graph-GAN-based model for generating privacy-protected spatiotemporal data. Our approach incorporates spatial and temporal attention blocks in the discriminator and a spatiotemporal deconvolution structure in the generator. These enhancements enable efficient training under Gaussian noise to achieve differential privacy. Extensive experiments conducted on three real-world spatiotemporal datasets validate the efficacy of our model. Our method provides a privacy guarantee while maintaining the data utility. The prediction model trained on our generated data maintains a competitive performance compared to the model trained on the original data.
Submitted 4 June, 2024;
originally announced June 2024.
-
CAPRI-FAIR: Integration of Multi-sided Fairness in Contextual POI Recommendation Framework
Authors:
Francis Zac dela Cruz,
Flora D. Salim,
Yonchanok Khaokaew,
Jeffrey Chan
Abstract:
Point-of-interest (POI) recommendation considers spatio-temporal factors like distance, peak hours, and user check-ins. Given their influence on both consumer experience and POI business, it's crucial to consider fairness from multiple perspectives. Unfortunately, these systems often provide less accurate recommendations to inactive users and less exposure to unpopular POIs. This paper develops a post-filter method that includes provider and consumer fairness in existing models, aiming to balance fairness metrics like item exposure with performance metrics such as precision and distance. Experiments show that a linear scoring model for provider fairness in re-scoring items offers the best balance between performance and long-tail exposure, sometimes without much precision loss. Addressing consumer fairness by recommending more popular POIs to inactive users increased precision in some models and datasets. However, combinations that reached the Pareto front of consumer and provider fairness resulted in the lowest precision values, highlighting that tradeoffs depend greatly on the model and dataset.
Submitted 14 August, 2024; v1 submitted 5 June, 2024;
originally announced June 2024.
-
Identifying high resolution benchmark data needs and Novel data-driven methodologies for Climate Downscaling
Authors:
Declan Curran,
Hira Saleem,
Flora Salim
Abstract:
We address the essential role of information retrieval in enhancing climate downscaling, focusing on the need for high-resolution datasets and the application of deep learning models. We explore the requirements for acquiring detailed spatial and temporal climate data, crucial for accurate local forecasts, and discuss how deep learning (DL) techniques can significantly improve downscaling precision by modelling the complex relationships between climate variables. Additionally, we examine the specific challenges related to the retrieval of relevant climatic data, emphasizing methods for efficient data extraction and utilization to support advanced model training. This research underscores an integrated approach, combining information retrieval, deep learning, and climate science to refine the process of climate downscaling, aiming to produce more accurate and actionable local climate projections.
Submitted 23 May, 2024;
originally announced May 2024.
-
Promoting Two-sided Fairness in Dynamic Vehicle Routing Problem
Authors:
Yufan Kang,
Rongsheng Zhang,
Wei Shao,
Flora D. Salim,
Jeffrey Chan
Abstract:
The Dynamic Vehicle Routing Problem (DVRP) is an extension of the classic Vehicle Routing Problem (VRP), which is a fundamental problem in logistics and transportation. Typically, DVRPs involve two stakeholders: service providers that deliver services to customers and customers who raise requests from different locations. Many real-world applications, such as ridesharing and non-compliance capture, can be formulated as DVRPs. Apart from original objectives like optimising total utility or efficiency, DVRP should also consider fairness for all parties. Unfairness can induce service providers and customers to give up on the systems, leading to negative financial and social impacts. However, most existing DVRP-related applications focus on improving fairness from a single side, and there have been few works considering two-sided fairness and utility optimisation concurrently. To this end, we propose a novel framework, a Two-sided Fairness-aware Genetic Algorithm (named 2FairGA), which expands the genetic algorithm from the original objective solely focusing on utility to multi-objectives that incorporate two-sided fairness. Subsequently, the impact of injecting two fairness definitions into the utility-focused model and the correlation between any pair of the three objectives are explored. Extensive experiments demonstrate the superiority of our proposed framework compared to the state-of-the-art.
Submitted 29 May, 2024;
originally announced May 2024.
-
Spectraformer: A Unified Random Feature Framework for Transformer
Authors:
Duke Nguyen,
Aditya Joshi,
Flora Salim
Abstract:
Linearization of attention using various kernel approximation and kernel learning techniques has shown promise. Past methods use a subset of combinations of component functions and weight matrices within the random features paradigm. We identify the need for a systematic comparison of different combinations of weight matrices and component functions for attention learning in Transformer. In this work, we introduce Spectraformer, a unified framework for approximating and learning the kernel function in linearized attention of the Transformer. We experiment with broad classes of component functions and weight matrices for three textual tasks in the LRA benchmark. Our empirical findings indicate that different kernels are good at different tasks and that kernel choice is fundamental to performant models. Our code is available at: https://github.com/dukenguyenxyz/spectraformer .
Submitted 23 October, 2024; v1 submitted 24 May, 2024;
originally announced May 2024.