Showing 1–50 of 150 results for author: Salim, D

  1. arXiv:2506.09544  [pdf, ps, other]

    cs.LG

    STOAT: Spatial-Temporal Probabilistic Causal Inference Network

    Authors: Yang Yang, Du Yin, Hao Xue, Flora Salim

    Abstract: Spatial-temporal causal time series (STC-TS) involve region-specific temporal observations driven by causally relevant covariates and interconnected across geographic or network-based spaces. Existing methods often model spatial and temporal dynamics independently and overlook causality-driven probabilistic forecasting, limiting their predictive power. To address this, we propose STOAT (Spatial-Te…

    Submitted 12 June, 2025; v1 submitted 11 June, 2025; originally announced June 2025.

  2. arXiv:2505.22692  [pdf, ps, other]

    cs.SI

    BLUE: Bi-layer Heterogeneous Graph Fusion Network for Avian Influenza Forecasting

    Authors: Jing Du, Haley Stone, Yang Yang, Ashna Desai, Hao Xue, Andreas Züfle, Chandini Raina MacIntyre, Flora D. Salim

    Abstract: Accurate forecasting of avian influenza outbreaks within wild bird populations requires models that account for complex, multi-scale transmission patterns driven by various factors. Spatio-temporal GNN-based models have recently gained traction for infection forecasting due to their ability to capture relations and flow between spatial regions, but most existing frameworks rely solely on spatial c…

    Submitted 9 June, 2025; v1 submitted 28 May, 2025; originally announced May 2025.

    Comments: 21 pages, 3 figures, 9 tables. The paper is under review

  3. arXiv:2505.19905  [pdf, ps, other]

    cs.AI

    EMAC+: Embodied Multimodal Agent for Collaborative Planning with VLM+LLM

    Authors: Shuang Ao, Flora D. Salim, Simon Khan

    Abstract: Although LLMs demonstrate proficiency in several text-based reasoning and planning tasks, their implementation in robotics control is constrained by significant deficiencies: (1) LLM agents are designed to work mainly with textual inputs rather than visual conditions; (2) Current multimodal agents treat LLMs as static planners, which separates their reasoning from environment dynamics, resulting i…

    Submitted 26 May, 2025; originally announced May 2025.

  4. arXiv:2505.13994  [pdf, ps, other]

    cs.AI cs.IR cs.MA

    Divide by Question, Conquer by Agent: SPLIT-RAG with Question-Driven Graph Partitioning

    Authors: Ruiyi Yang, Hao Xue, Imran Razzak, Hakim Hacid, Flora D. Salim

    Abstract: Retrieval-Augmented Generation (RAG) systems empower large language models (LLMs) with external knowledge, yet struggle with efficiency-accuracy trade-offs when scaling to large knowledge graphs. Existing approaches often rely on monolithic graph retrieval, incurring unnecessary latency for simple queries and fragmented reasoning for complex multi-hop questions. To address these challenges, this p…

    Submitted 20 May, 2025; originally announced May 2025.

    Comments: 20 pages, 4 figures

  5. arXiv:2505.12006  [pdf, ps, other]

    cs.AI

    SOCIA: An End-to-End Agentic Framework for Automated Cyber-Physical-Social Simulator Generation

    Authors: Yuncheng Hua, Ji Miao, Mehdi Jafari, Jianxiang Xie, Hao Xue, Flora D. Salim

    Abstract: This paper introduces SOCIA (Simulation Orchestration for Cyber-physical-social Intelligence and Agents), a novel end-to-end framework leveraging Large Language Model (LLM)-based multi-agent systems to automate the generation of high-fidelity Cyber-Physical-Social (CPS) simulators. Addressing the challenges of labor-intensive manual simulator development and complex data calibration, SOCIA integra…

    Submitted 23 May, 2025; v1 submitted 17 May, 2025; originally announced May 2025.

    Comments: 55 pages, 3 figures, 3 tables. The paper is under review

    ACM Class: I.2.7

  6. arXiv:2505.11239  [pdf, other]

    cs.LG

    Massive-STEPS: Massive Semantic Trajectories for Understanding POI Check-ins -- Dataset and Benchmarks

    Authors: Wilson Wongso, Hao Xue, Flora D. Salim

    Abstract: Understanding human mobility through Point-of-Interest (POI) recommendation is increasingly important for applications such as urban planning, personalized services, and generative agent simulation. However, progress in this field is hindered by two key challenges: the over-reliance on older datasets from 2012-2013 and the lack of reproducible, city-level check-in datasets that reflect diverse glo…

    Submitted 18 May, 2025; v1 submitted 16 May, 2025; originally announced May 2025.

  7. arXiv:2505.04681  [pdf, other]

    astro-ph.GA astro-ph.IM

    A data-driven approach for star formation parameterization using symbolic regression

    Authors: Diane M. Salim, Matthew E. Orr, Blakesley Burkhart, Rachel S. Somerville, Miles Cramner

    Abstract: Star formation (SF) in the interstellar medium (ISM) is fundamental to understanding galaxy evolution and planet formation. However, efforts to develop closed-form analytic expressions that link SF with key influencing physical variables, such as gas density and turbulence, remain challenging. In this work, we leverage recent advancements in machine learning (ML) and use symbolic regression (SR) t…

    Submitted 7 May, 2025; originally announced May 2025.

    Comments: Submitted to the Astrophysical Journal; 29 pages, 11 figures, 5 tables

  8. arXiv:2504.13517  [pdf, other]

    cs.AI

    Optimizing Electric Vehicle Charging Station Locations: A Data-driven System with Multi-source Fusion

    Authors: Lihuan Li, Du Yin, Hao Xue, David Lillo-Trynes, Flora Salim

    Abstract: With the growing electric vehicles (EVs) charging demand, urban planners face the challenges of providing charging infrastructure at optimal locations. For example, range anxiety during long-distance travel and the inadequate distribution of residential charging stations are the major issues many cities face. To achieve reasonable estimation and deployment of the charging demand, we develop a data…

    Submitted 18 April, 2025; originally announced April 2025.

    Comments: 4-page short paper

  9. arXiv:2504.08260  [pdf, other]

    cs.CL

    Evaluating the Bias in LLMs for Surveying Opinion and Decision Making in Healthcare

    Authors: Yonchanok Khaokaew, Flora D. Salim, Andreas Züfle, Hao Xue, Taylor Anderson, C. Raina MacIntyre, Matthew Scotch, David J Heslop

    Abstract: Generative agents have been increasingly used to simulate human behaviour in silico, driven by large language models (LLMs). These simulacra serve as sandboxes for studying human behaviour without compromising privacy or safety. However, it remains unclear whether such agents can truly represent real individuals. This work compares survey data from the Understanding America Study (UAS) on healthca…

    Submitted 16 April, 2025; v1 submitted 11 April, 2025; originally announced April 2025.

  10. arXiv:2504.01162  [pdf, ps, other]

    cs.IR

    Information Retrieval for Climate Impact

    Authors: Maarten de Rijke, Bart van den Hurk, Flora Salim, Alaa Al Khourdajie, Nan Bai, Renato Calzone, Declan Curran, Getnet Demil, Lesley Frew, Noah Gießing, Mukesh Kumar Gupta, Maria Heuss, Sanaa Hobeichi, David Huard, Jingwei Kang, Ana Lucic, Tanwi Mallick, Shruti Nath, Andrew Okem, Barbara Pernici, Thilina Rajapakse, Hira Saleem, Harry Scells, Nicole Schneider, Damiano Spina, et al. (6 additional authors not shown)

    Abstract: The purpose of the MANILA24 Workshop on information retrieval for climate impact was to bring together researchers from academia, industry, governments, and NGOs to identify and discuss core research problems in information retrieval to assess climate change impacts. The workshop aimed to foster collaboration by bringing communities together that have so far not been very well connected -- informa…

    Submitted 1 April, 2025; originally announced April 2025.

    Comments: Report on the MANILA24 Workshop

    ACM Class: H.3.3

  11. arXiv:2503.15566  [pdf, other]

    cs.LG

    Enforcing Consistency and Fairness in Multi-level Hierarchical Classification with a Mask-based Output Layer

    Authors: Shijing Chen, Shoaib Jameel, Mohamed Reda Bouadjenek, Feilong Tang, Usman Naseem, Basem Suleiman, Hakim Hacid, Flora D. Salim, Imran Razzak

    Abstract: Traditional Multi-level Hierarchical Classification (MLHC) classifiers often rely on backbone models with $n$ independent output layers. This structure tends to overlook the hierarchical relationships between classes, leading to inconsistent predictions that violate the underlying taxonomy. Additionally, once a backbone architecture for an MLHC classifier is selected, adapting the model to accommo…

    Submitted 19 March, 2025; originally announced March 2025.

    Comments: 14 pages, 14 figures. arXiv admin note: text overlap with arXiv:2501.06827

  12. arXiv:2503.14980  [pdf, other]

    cs.LG

    Embedding spatial context in urban traffic forecasting with contrastive pre-training

    Authors: Matthew Low, Arian Prabowo, Hao Xue, Flora Salim

    Abstract: Urban traffic forecasting is a commonly encountered problem, with wide-ranging applications in fields such as urban planning, civil engineering and transport. In this paper, we study the enhancement of traffic forecasting with pre-training, focusing on spatio-temporal graph methods. While various machine learning methods to solve traffic forecasting problems have been explored and extensively stud…

    Submitted 19 March, 2025; originally announced March 2025.

    Comments: 21 pages with references, 10 figures

  13. arXiv:2503.14800  [pdf, other]

    cs.IR cs.AI cs.LG

    Long Context Modeling with Ranked Memory-Augmented Retrieval

    Authors: Ghadir Alselwi, Hao Xue, Shoaib Jameel, Basem Suleiman, Flora D. Salim, Imran Razzak

    Abstract: Effective long-term memory management is crucial for language models handling extended contexts. We introduce a novel framework that dynamically ranks memory entries based on relevance. Unlike previous works, our model introduces a novel relevance scoring and a pointwise re-ranking model for key-value embeddings, inspired by learning-to-rank techniques in information retrieval. Enhanced Ranked Mem…

    Submitted 18 March, 2025; originally announced March 2025.

  14. arXiv:2503.14234  [pdf, other]

    cs.AI cs.MA

    Beyond Single Pass, Looping Through Time: KG-IRAG with Iterative Knowledge Retrieval

    Authors: Ruiyi Yang, Hao Xue, Imran Razzak, Hakim Hacid, Flora D. Salim

    Abstract: Graph Retrieval-Augmented Generation (GraphRAG) has proven highly effective in enhancing the performance of Large Language Models (LLMs) on tasks that require external knowledge. By leveraging Knowledge Graphs (KGs), GraphRAG improves information retrieval for complex reasoning tasks, providing more precise and comprehensive retrieval and generating more accurate responses to QAs. However, most RA…

    Submitted 19 May, 2025; v1 submitted 18 March, 2025; originally announced March 2025.

    Comments: 15 pages, 3 figures

  15. arXiv:2503.13502  [pdf, other]

    cs.DB cs.LG

    Foundation Models for Spatio-Temporal Data Science: A Tutorial and Survey

    Authors: Yuxuan Liang, Haomin Wen, Yutong Xia, Ming Jin, Bin Yang, Flora Salim, Qingsong Wen, Shirui Pan, Gao Cong

    Abstract: Spatio-Temporal (ST) data science, which includes sensing, managing, and mining large-scale data across space and time, is fundamental to understanding complex systems in domains such as urban computing, climate science, and intelligent transportation. Traditional deep learning approaches have significantly advanced this field, particularly in the stage of ST data mining. However, these models rem…

    Submitted 12 March, 2025; originally announced March 2025.

  16. arXiv:2503.12858  [pdf, other]

    cs.CL cs.LG

    Harnessing Test-time Adaptation for NLU tasks Involving Dialects of English

    Authors: Duke Nguyen, Aditya Joshi, Flora Salim

    Abstract: Test-time adaptation (TTA) is an excellent method which helps generalize models across domains, tasks, and distributions without the use of labeled datasets. Thus, TTA is very useful in natural language processing (NLP) in the dialectal setting, since oftentimes, models are trained on Standard American English (SAE), evaluated on Indian English or Nigerian English, of which distribution differs si…

    Submitted 17 March, 2025; originally announced March 2025.

  17. arXiv:2503.07259  [pdf, other]

    cs.CV cs.AI cs.LG cs.MM

    COMODO: Cross-Modal Video-to-IMU Distillation for Efficient Egocentric Human Activity Recognition

    Authors: Baiyu Chen, Wilson Wongso, Zechen Li, Yonchanok Khaokaew, Hao Xue, Flora Salim

    Abstract: Egocentric video-based models capture rich semantic information and have demonstrated strong performance in human activity recognition (HAR). However, their high power consumption, privacy concerns, and dependence on lighting conditions limit their feasibility for continuous on-device recognition. In contrast, inertial measurement unit (IMU) sensors offer an energy-efficient and privacy-preserving…

    Submitted 10 March, 2025; originally announced March 2025.

  18. arXiv:2502.18712  [pdf, other]

    cs.AI cs.SI

    TrajLLM: A Modular LLM-Enhanced Agent-Based Framework for Realistic Human Trajectory Simulation

    Authors: Chenlu Ju, Jiaxin Liu, Shobhit Sinha, Hao Xue, Flora Salim

    Abstract: This work leverages Large Language Models (LLMs) to simulate human mobility, addressing challenges like high costs and privacy concerns in traditional models. Our hierarchical framework integrates persona generation, activity selection, and destination prediction, using real-world demographic and psychological data to create realistic movement patterns. Both physical models and language models are…

    Submitted 25 February, 2025; originally announced February 2025.

    Comments: Accepted WWW2025 Demo Paper

  19. arXiv:2502.16755  [pdf, other]

    cs.CY

    Watch Out E-scooter Coming Through: Multimodal Sensing of Mixed Traffic Use and Conflicts Through Riders Ego-centric Views

    Authors: Hiruni Nuwanthika Kegalle, Danula Hettiachchi, Jeffrey Chan, Mark Sanderson, Flora D. Salim

    Abstract: E-scooters are becoming a popular means of urban transportation. However, this increased popularity brings challenges, such as road accidents and conflicts when sharing space with traditional transport modes. An in-depth understanding of e-scooter rider behaviour is crucial for ensuring rider safety, guiding infrastructure planning, and enforcing traffic rules. This study investigated the rider be…

    Submitted 23 February, 2025; originally announced February 2025.

    Comments: Accepted in Proc. ACM Interactive, Mobile, Wearable and Ubiquitous Technologies (March 2025), 23 pages. https://doi.org/10.1145/3712284

  20. arXiv:2502.14171  [pdf, other]

    cs.CL

    Enhancing Conversational Agents with Theory of Mind: Aligning Beliefs, Desires, and Intentions for Human-Like Interaction

    Authors: Mehdi Jafari, Devin Yuncheng Hua, Hao Xue, Flora Salim

    Abstract: Natural language interaction with agentic Artificial Intelligence (AI), driven by Large Language Models (LLMs), is expected to remain a dominant paradigm in the near future. While humans instinctively align their communication with mental states -- an ability known as Theory of Mind (ToM), current LLM powered systems exhibit significant limitations in this regard. This study examines the extent to…

    Submitted 20 May, 2025; v1 submitted 19 February, 2025; originally announced February 2025.

    Comments: Accepted to Findings of ACL 2025

  21. arXiv:2502.13626  [pdf, other]

    cs.CE

    AI-Empowered Catalyst Discovery: A Survey from Classical Machine Learning Approaches to Large Language Models

    Authors: Yuanyuan Xu, Hanchen Wang, Wenjie Zhang, Lexing Xie, Yin Chen, Flora Salim, Ying Zhang, Justin Gooding, Toby Walsh

    Abstract: Catalysts are essential for accelerating chemical reactions and enhancing selectivity, which is crucial for the sustainable production of energy, materials, and bioactive compounds. Catalyst discovery is fundamental yet challenging in computational chemistry and has garnered significant attention due to the promising performance of advanced Artificial Intelligence (AI) techniques. The development…

    Submitted 19 February, 2025; originally announced February 2025.

  22. arXiv:2502.12207  [pdf, other]

    cs.LG cs.AI

    PAR-AdvGAN: Improving Adversarial Attack Capability with Progressive Auto-Regression AdvGAN

    Authors: Jiayu Zhang, Zhiyu Zhu, Xinyi Wang, Silin Liao, Zhibo Jin, Flora D. Salim, Huaming Chen

    Abstract: Deep neural networks have demonstrated remarkable performance across various domains. However, they are vulnerable to adversarial examples, which can lead to erroneous predictions. Generative Adversarial Networks (GANs) can leverage the generators and discriminators model to quickly produce high-quality adversarial examples. Since both modules train in a competitive and simultaneous manner, GAN-ba…

    Submitted 16 February, 2025; originally announced February 2025.

  23. arXiv:2502.11681  [pdf, other]

    cs.CL cs.AI

    RIDE: Enhancing Large Language Model Alignment through Restyled In-Context Learning Demonstration Exemplars

    Authors: Yuncheng Hua, Lizhen Qu, Zhuang Li, Hao Xue, Flora D. Salim, Gholamreza Haffari

    Abstract: Alignment tuning is crucial for ensuring large language models (LLMs) behave ethically and helpfully. Current alignment approaches require high-quality annotations and significant training resources. This paper proposes a low-cost, tuning-free method using in-context learning (ICL) to enhance LLM alignment. Through an analysis of high-quality ICL demos, we identified style as a key factor influenc…

    Submitted 5 March, 2025; v1 submitted 17 February, 2025; originally announced February 2025.

    Comments: 38 pages, 2 figures, 20 tables; The paper is under review in ARR

    ACM Class: I.2.7

  24. arXiv:2502.01325  [pdf, other]

    cs.HC

    The Homework Wars: Exploring Emotions, Behaviours, and Conflicts in Parent-Child Homework Interactions

    Authors: Nan Gao, Yibin Liu, Xin Tang, Yanyan Liu, Chun Yu, Yun Huang, Yuntao Wang, Flora D. Salim, Xuhai Orson Xu, Jun Wei, Yuanchun Shi

    Abstract: Parental involvement in homework is a crucial aspect of family education, but it often leads to emotional strain and conflicts that can severely impact family well-being. This paper presents findings from a 4-week in situ study involving 78 families in China, where we collected and analyzed 602 valid audio recordings (totalling 475 hours) and daily surveys. Leveraging large language models (LLMs)…

    Submitted 4 February, 2025; v1 submitted 3 February, 2025; originally announced February 2025.

  25. arXiv:2501.06827  [pdf, other]

    cs.AI

    Leveraging Taxonomy and LLMs for Improved Multimodal Hierarchical Classification

    Authors: Shijing Chen, Mohamed Reda Bouadjenek, Shoaib Jameel, Usman Naseem, Basem Suleiman, Flora D. Salim, Hakim Hacid, Imran Razzak

    Abstract: Multi-level Hierarchical Classification (MLHC) tackles the challenge of categorizing items within a complex, multi-layered class structure. However, traditional MLHC classifiers often rely on a backbone model with independent output layers, which tend to ignore the hierarchical relationships between classes. This oversight can lead to inconsistent predictions that violate the underlying taxonomy.…

    Submitted 12 January, 2025; originally announced January 2025.

    Comments: 11 pages, 7 figures, 2 tables, and accepted by COLING 2025

  26. arXiv:2412.17015  [pdf, other]

    cs.SE

    RCAEval: A Benchmark for Root Cause Analysis of Microservice Systems with Telemetry Data

    Authors: Luan Pham, Hongyu Zhang, Huong Ha, Flora Salim, Xiuzhen Zhang

    Abstract: Root cause analysis (RCA) for microservice systems has gained significant attention in recent years. However, there is still no standard benchmark that includes large-scale datasets and supports comprehensive evaluation environments. In this paper, we introduce RCAEval, an open-source benchmark that provides datasets and an evaluation environment for RCA in microservice systems. First, we introduc…

    Submitted 3 February, 2025; v1 submitted 22 December, 2024; originally announced December 2024.

  27. BiTSA: Leveraging Time Series Foundation Model for Building Energy Analytics

    Authors: Xiachong Lin, Arian Prabowo, Imran Razzak, Hao Xue, Matthew Amos, Sam Behrens, Flora D. Salim

    Abstract: Incorporating AI technologies into digital infrastructure offers transformative potential for energy management, particularly in enhancing energy efficiency and supporting net-zero objectives. However, the complexity of IoT-generated datasets often poses a significant challenge, hindering the translation of research insights into practical, real-world applications. This paper presents the design o…

    Submitted 20 November, 2024; originally announced December 2024.

    Comments: 4 pages, 4 figures, 3 tables

  28. arXiv:2412.12589  [pdf, ps, other]

    cs.DS cs.DC

    Round and Communication Efficient Graph Coloring

    Authors: Yi-Jun Chang, Gopinath Mishra, Hung Thuan Nguyen, Farrel D Salim

    Abstract: In the context of communication complexity, we explore protocols for graph coloring, focusing on the vertex and edge coloring problems in $n$-vertex graphs $G$ with a maximum degree $Δ$. We consider a scenario where the edges of $G$ are partitioned between two players. Our first contribution is a randomized protocol that efficiently finds a $(Δ + 1)$-vertex coloring of $G$, utilizing $O(n)$ bits…

    Submitted 8 May, 2025; v1 submitted 17 December, 2024; originally announced December 2024.

  29. arXiv:2411.08888  [pdf, other]

    cs.CY cs.AI

    Exploring Capabilities of Time Series Foundation Models in Building Analytics

    Authors: Xiachong Lin, Arian Prabowo, Imran Razzak, Hao Xue, Matthew Amos, Sam Behrens, Flora D. Salim

    Abstract: The growing integration of digitized infrastructure with Internet of Things (IoT) networks has transformed the management and optimization of building energy consumption. By leveraging IoT-based monitoring systems, stakeholders such as building managers, energy suppliers, and policymakers can make data-driven decisions to improve energy efficiency. However, accurate energy forecasting and analytic…

    Submitted 27 October, 2024; originally announced November 2024.

    Comments: 7 pages, 1 figure, and 4 tables

  30. arXiv:2411.07413  [pdf, other]

    cs.LG

    ODEStream: A Buffer-Free Online Learning Framework with ODE-based Adaptor for Streaming Time Series Forecasting

    Authors: Futoon M. Abushaqra, Hao Xue, Yongli Ren, Flora D. Salim

    Abstract: Addressing the challenges of irregularity and concept drift in streaming time series is crucial for real-world predictive modelling. Previous studies in time series continual learning often propose models that require buffering long sequences, potentially restricting the responsiveness of the inference system. Moreover, these models are typically designed for regularly sampled data, an unrealistic…

    Submitted 9 April, 2025; v1 submitted 11 November, 2024; originally announced November 2024.

  31. arXiv:2410.21657  [pdf, other]

    physics.ao-ph cs.AI cs.LG

    PACER: Physics Informed Uncertainty Aware Climate Emulator

    Authors: Hira Saleem, Flora Salim, Cormac Purcell

    Abstract: Climate models serve as critical tools for evaluating the effects of climate change and projecting future climate scenarios. However, the reliance on numerical simulations of physical equations renders them computationally intensive and inefficient. While deep learning methodologies have made significant progress in weather forecasting, they are still unstable for climate emulation tasks. Here, we…

    Submitted 30 October, 2024; v1 submitted 28 October, 2024; originally announced October 2024.

  32. arXiv:2410.20643  [pdf, other]

    cs.IR

    GenUP: Generative User Profilers as In-Context Learners for Next POI Recommender Systems

    Authors: Wilson Wongso, Hao Xue, Flora D. Salim

    Abstract: Traditional Point-of-Interest (POI) recommendation systems often lack transparency, interpretability, and scrutability due to their reliance on dense vector-based user embeddings. Furthermore, the cold-start problem -- where systems have insufficient data for new users -- limits their ability to generate accurate recommendations. Existing methods often address this by leveraging similar trajectori…

    Submitted 12 March, 2025; v1 submitted 27 October, 2024; originally announced October 2024.

  33. arXiv:2410.20067  [pdf, other]

    cs.HC

    Evaluating the Influences of Explanation Style on Human-AI Reliance

    Authors: Emma Casolin, Flora D. Salim, Ben Newell

    Abstract: Explainable AI (XAI) aims to support appropriate human-AI reliance by increasing the interpretability of complex model decisions. Despite the proliferation of proposed methods, there is mixed evidence surrounding the effects of different styles of XAI explanations on human-AI reliance. Interpreting these conflicting findings requires an understanding of the individual and combined qualities of dif…

    Submitted 26 October, 2024; originally announced October 2024.

    Comments: 20 pages

  34. arXiv:2410.10624  [pdf, other]

    cs.CL

    SensorLLM: Human-Intuitive Alignment of Multivariate Sensor Data with LLMs for Activity Recognition

    Authors: Zechen Li, Shohreh Deldari, Linyao Chen, Hao Xue, Flora D. Salim

    Abstract: We introduce SensorLLM, a two-stage framework that enables Large Language Models (LLMs) to perform human activity recognition (HAR) from wearable sensor data. While LLMs excel at reasoning and generalization, they struggle with time-series inputs due to limited semantic context, numerical complexity, and sequence variability. To address these challenges, we construct SensorQA, a question-answering…

    Submitted 20 May, 2025; v1 submitted 14 October, 2024; originally announced October 2024.

  35. Inside Out or Not: Privacy Implications of Emotional Disclosure

    Authors: Elham Naghizade, Kaixin Ji, Benjamin Tag, Flora Salim

    Abstract: Privacy is dynamic, sensitive, and contextual, much like our emotions. Previous studies have explored the interplay between privacy and context, privacy and emotion, and emotion and context. However, there remains a significant gap in understanding the interplay of these aspects simultaneously. In this paper, we present a preliminary study investigating the role of emotions in driving individuals'…

    Submitted 18 September, 2024; originally announced September 2024.

    Comments: see https://doi.org/10.1145/3675094.3677598

  36. arXiv:2409.09116  [pdf, other]

    astro-ph.GA

    Characterizing the Molecular Gas in Infrared Bright Galaxies with CARMA

    Authors: Katherine Alatalo, Andreea O. Petric, Lauranne Lanz, Kate Rowlands, Vivian U, Kirsten L. Larson, Lee Armus, Loreto Barcos-Muñoz, Aaron S. Evans, Jin Koda, Yuanze Luo, Anne M. Medling, Kristina E. Nyland, Justin A. Otter, Pallavi Patil, Fernando Peñaloza, Diane Salim, David B. Sanders, Elizaveta Sazonova, Maya Skarbinski, Yiqing Song, Ezequiel Treister, C. Meg Urry

    Abstract: We present the CO(1-0) maps of 28 infrared-bright galaxies from the Great Observatories All-Sky Luminous Infrared Galaxy Survey (GOALS) taken with the Combined Array for Research in Millimeter Astronomy (CARMA). We detect 100GHz continuum in 16 of 28 galaxies, which trace both active galactic nuclei (AGNs) and compact star-forming cores. The GOALS galaxies show a variety of molecular gas morpholog…

    Submitted 13 September, 2024; originally announced September 2024.

    Comments: 29 pages, 4 tables, 11 figures, Accepted by the Astrophysical Journal

  37. Exploring Large-Scale Language Models to Evaluate EEG-Based Multimodal Data for Mental Health

    Authors: Yongquan Hu, Shuning Zhang, Ting Dang, Hong Jia, Flora D. Salim, Wen Hu, Aaron J. Quigley

    Abstract: Integrating physiological signals such as electroencephalogram (EEG), with other data such as interview audio, may offer valuable multimodal insights into psychological states or neurological disorders. Recent advancements with Large Language Models (LLMs) position them as prospective "health agents" for mental health assessment. However, current research predominantly focuses on single data modal…

    Submitted 14 August, 2024; originally announced August 2024.

    Comments: 6 pages; UbiComp Companion '24, Companion of the 2024 ACM International Joint Conference on Pervasive and Ubiquitous Computing, October 5-9, 2024, Melbourne, VIC, Australia

  38. WorkR: Occupation Inference for Intelligent Task Assistance

    Authors: Yonchanok Khaokaew, Hao Xue, Mohammad Saiedur Rahaman, Flora D. Salim

    Abstract: Occupation information can be utilized by digital assistants to provide occupation-specific personalized task support, including interruption management, task planning, and recommendations. Prior research in the digital workplace assistant domain requires users to input their occupation information for effective support. However, as many individuals switch between multiple occupations daily, curre…

    Submitted 26 July, 2024; originally announced July 2024.

  39. arXiv:2407.17839  [pdf, other]

    cs.AI cs.LG

    Long-term Fairness in Ride-Hailing Platform

    Authors: Yufan Kang, Jeffrey Chan, Wei Shao, Flora D. Salim, Christopher Leckie

    Abstract: Matching in two-sided markets such as ride-hailing has recently received significant attention. However, existing studies on ride-hailing mainly focus on optimising efficiency, and fairness issues in ride-hailing have been neglected. Fairness issues in ride-hailing, including significant earning differences between drivers and variance of passenger waiting times among different locations, have pot…

    Submitted 25 July, 2024; originally announced July 2024.

    Comments: Accepted by ECML PKDD 2024

  40. arXiv:2407.14116  [pdf, other]

    cs.CR cs.LG

    AuditNet: A Conversational AI-based Security Assistant [DEMO]

    Authors: Shohreh Deldari, Mohammad Goudarzi, Aditya Joshi, Arash Shaghaghi, Simon Finn, Flora D. Salim, Sanjay Jha

    Abstract: In the age of information overload, professionals across various fields face the challenge of navigating vast amounts of documentation and ever-evolving standards. Ensuring compliance with standards, regulations, and contractual obligations is a critical yet complex task across various professional fields. We propose a versatile conversational AI assistant framework designed to facilitate complian…

    Submitted 19 July, 2024; originally announced July 2024.

  41. Understanding Physiological Responses of Students Over Different Courses

    Authors: Soundariya Ananthan, Nan Gao, Flora D. Salim

    Abstract: Student engagement plays a vital role in academic success with high engagement often linked to positive educational outcomes. Traditionally, student engagement is measured through self-reports, which are both labour-intensive and not real-time. An emerging alternative is monitoring physiological signals such as Electrodermal Activity (EDA) and Inter-Beat Interval (IBI), which reflect students' emo…

    Submitted 19 July, 2024; originally announced July 2024.

    Comments: This paper is published at ISWC '24

  42. arXiv:2406.14214  [pdf, other]

    cs.AI

    REVEAL-IT: REinforcement learning with Visibility of Evolving Agent poLicy for InTerpretability

    Authors: Shuang Ao, Simon Khan, Haris Aziz, Flora D. Salim

    Abstract: Understanding the agent's learning process, particularly the factors that contribute to its success or failure post-training, is crucial for comprehending the rationale behind the agent's decision-making process. Prior methods clarify the learning process by creating a structural causal model (SCM) or visually representing the distribution of value functions. Nevertheless, these approaches have co…

    Submitted 14 October, 2024; v1 submitted 20 June, 2024; originally announced June 2024.

  43. arXiv:2406.13123  [pdf, other]

    cs.AI cs.CV

    ViLCo-Bench: VIdeo Language COntinual learning Benchmark

    Authors: Tianqi Tang, Shohreh Deldari, Hao Xue, Celso De Melo, Flora D. Salim

    Abstract: Video language continual learning involves continuously adapting to information from video and text inputs, enhancing a model's ability to handle new tasks while retaining prior knowledge. This field is a relatively under-explored area, and establishing appropriate datasets is crucial for facilitating communication and research in this field. In this study, we present the first dedicated benchmark…

    Submitted 15 December, 2024; v1 submitted 18 June, 2024; originally announced June 2024.

    Comments: 14 pages, 4 figures, 8 tables, Accepted at NeurIPS Dataset and Benchmark Track 2024

  44. arXiv:2406.08990  [pdf, other]

    cs.LG

    BTS: Building Timeseries Dataset: Empowering Large-Scale Building Analytics

    Authors: Arian Prabowo, Xiachong Lin, Imran Razzak, Hao Xue, Emily W. Yap, Matthew Amos, Flora D. Salim

    Abstract: Buildings play a crucial role in human well-being, influencing occupant comfort, health, and safety. Additionally, they contribute significantly to global energy consumption, accounting for one-third of total energy usage, and carbon emissions. Optimizing building performance presents a vital opportunity to combat climate change and promote human flourishing. However, research in building analytic…

    Submitted 18 June, 2024; v1 submitted 13 June, 2024; originally announced June 2024.

    Comments: 21 pages, 2 figures, 9 tables, under review

  45. arXiv:2406.04035  [pdf, other]

    cs.LG cs.AI

    STEMO: Early Spatio-temporal Forecasting with Multi-Objective Reinforcement Learning

    Authors: Wei Shao, Yufan Kang, Ziyan Peng, Xiao Xiao, Lei Wang, Yuhui Yang, Flora D. Salim

    Abstract: Accuracy and timeliness are often conflicting goals in prediction tasks. Premature predictions may yield a higher rate of false alarms, whereas delaying predictions to gather more information can render them too late to be useful. In applications such as wildfires, crimes, and traffic jams, timely forecasting is vital for safeguarding human life and property. Consequently, finding a balanc…

    Submitted 18 June, 2024; v1 submitted 6 June, 2024; originally announced June 2024.

    Comments: Accepted paper in KDD 2024

  46. arXiv:2406.03404  [pdf, other]

    cs.LG cs.AI cs.CR

    ST-DPGAN: A Privacy-preserving Framework for Spatiotemporal Data Generation

    Authors: Wei Shao, Rongyi Zhu, Cai Yang, Chandra Thapa, Muhammad Ejaz Ahmed, Seyit Camtepe, Rui Zhang, DuYong Kim, Hamid Menouar, Flora D. Salim

    Abstract: Spatiotemporal data is prevalent in a wide range of edge devices, such as those used in personal communication and financial transactions. Recent advancements have sparked a growing interest in integrating spatiotemporal analysis with large-scale language models. However, spatiotemporal data often contains sensitive information, making it unsuitable for open third-party access. To address this cha…

    Submitted 4 June, 2024; originally announced June 2024.

  47. arXiv:2406.03109  [pdf, other]

    cs.IR

    CAPRI-FAIR: Integration of Multi-sided Fairness in Contextual POI Recommendation Framework

    Authors: Francis Zac dela Cruz, Flora D. Salim, Yonchanok Khaokaew, Jeffrey Chan

    Abstract: Point-of-interest (POI) recommendation considers spatio-temporal factors like distance, peak hours, and user check-ins. Given their influence on both consumer experience and POI business, it's crucial to consider fairness from multiple perspectives. Unfortunately, these systems often provide less accurate recommendations to inactive users and less exposure to unpopular POIs. This paper develops a…

    Submitted 14 August, 2024; v1 submitted 5 June, 2024; originally announced June 2024.

  48. arXiv:2405.20346  [pdf, other]

    physics.ao-ph

    Identifying high resolution benchmark data needs and Novel data-driven methodologies for Climate Downscaling

    Authors: Declan Curran, Hira Saleem, Flora Salim

    Abstract: We address the essential role of information retrieval in enhancing climate downscaling, focusing on the need for high-resolution datasets and the application of deep learning models. We explore the requirements for acquiring detailed spatial and temporal climate data, crucial for accurate local forecasts, and discuss how deep learning (DL) techniques can significantly improve downscaling precisio…

    Submitted 23 May, 2024; originally announced May 2024.

  49. Promoting Two-sided Fairness in Dynamic Vehicle Routing Problem

    Authors: Yufan Kang, Rongsheng Zhang, Wei Shao, Flora D. Salim, Jeffrey Chan

    Abstract: The Dynamic Vehicle Routing Problem (DVRP) is an extension of the classic Vehicle Routing Problem (VRP), which is a fundamental problem in logistics and transportation. Typically, DVRPs involve two stakeholders: service providers that deliver services to customers and customers who raise requests from different locations. Many real-world applications can be formulated as DVRPs, such as ridesharing and…

    Submitted 29 May, 2024; originally announced May 2024.

  50. arXiv:2405.15310  [pdf, other]

    cs.LG

    Spectraformer: A Unified Random Feature Framework for Transformer

    Authors: Duke Nguyen, Aditya Joshi, Flora Salim

    Abstract: Linearization of attention using various kernel approximation and kernel learning techniques has shown promise. Past methods use a subset of combinations of component functions and weight matrices within the random features paradigm. We identify the need for a systematic comparison of different combinations of weight matrices and component functions for attention learning in Transformer. In this w…

    Submitted 23 October, 2024; v1 submitted 24 May, 2024; originally announced May 2024.