Search | arXiv e-print repository

Domain Knowledge-Enhanced LLMs for Fraud and Concept Drift Detection

Authors: Ali Şenol, Garima Agrawal, Huan Liu

Abstract: Detecting deceptive conversations on dynamic platforms is increasingly difficult due to evolving language patterns and Concept Drift (CD)-i.e., semantic or topical shifts that alter the context or intent of interactions over time. These shifts can obscure malicious intent or mimic normal dialogue, making accurate classification challenging. While Large Language Models (LLMs) show strong performanc… ▽ More Detecting deceptive conversations on dynamic platforms is increasingly difficult due to evolving language patterns and Concept Drift (CD)-i.e., semantic or topical shifts that alter the context or intent of interactions over time. These shifts can obscure malicious intent or mimic normal dialogue, making accurate classification challenging. While Large Language Models (LLMs) show strong performance in natural language tasks, they often struggle with contextual ambiguity and hallucinations in risk-sensitive scenarios. To address these challenges, we present a Domain Knowledge (DK)-Enhanced LLM framework that integrates pretrained LLMs with structured, task-specific insights to perform fraud and concept drift detection. The proposed architecture consists of three main components: (1) a DK-LLM module to detect fake or deceptive conversations; (2) a drift detection unit (OCDD) to determine whether a semantic shift has occurred; and (3) a second DK-LLM module to classify the drift as either benign or fraudulent. We first validate the value of domain knowledge using a fake review dataset and then apply our full framework to SEConvo, a multiturn dialogue dataset that includes various types of fraud and spam attacks. Results show that our system detects fake conversations with high accuracy and effectively classifies the nature of drift. Guided by structured prompts, the LLaMA-based implementation achieves 98% classification accuracy. Comparative studies against zero-shot baselines demonstrate that incorporating domain knowledge and drift awareness significantly improves performance, interpretability, and robustness in high-stakes NLP applications. △ Less

Submitted 26 June, 2025; originally announced June 2025.

arXiv:2505.07852 [pdf, other]

Joint Detection of Fraud and Concept Drift inOnline Conversations with LLM-Assisted Judgment

Authors: Ali Senol, Garima Agrawal, Huan Liu

Abstract: Detecting fake interactions in digital communication platforms remains a challenging and insufficiently addressed problem. These interactions may appear as harmless spam or escalate into sophisticated scam attempts, making it difficult to flag malicious intent early. Traditional detection methods often rely on static anomaly detection techniques that fail to adapt to dynamic conversational shifts.… ▽ More Detecting fake interactions in digital communication platforms remains a challenging and insufficiently addressed problem. These interactions may appear as harmless spam or escalate into sophisticated scam attempts, making it difficult to flag malicious intent early. Traditional detection methods often rely on static anomaly detection techniques that fail to adapt to dynamic conversational shifts. One key limitation is the misinterpretation of benign topic transitions referred to as concept drift as fraudulent behavior, leading to either false alarms or missed threats. We propose a two stage detection framework that first identifies suspicious conversations using a tailored ensemble classification model. To improve the reliability of detection, we incorporate a concept drift analysis step using a One Class Drift Detector (OCDD) to isolate conversational shifts within flagged dialogues. When drift is detected, a large language model (LLM) assesses whether the shift indicates fraudulent manipulation or a legitimate topic change. In cases where no drift is found, the behavior is inferred to be spam like. We validate our framework using a dataset of social engineering chat scenarios and demonstrate its practical advantages in improving both accuracy and interpretability for real time fraud detection. To contextualize the trade offs, we compare our modular approach against a Dual LLM baseline that performs detection and judgment using different language models. △ Less

Submitted 7 May, 2025; originally announced May 2025.

arXiv:2412.18047 [pdf, ps, other]

Uncertainty-Aware Critic Augmentation for Hierarchical Multi-Agent EV Charging Control

Authors: Lo Pang-Yun Ting, Ali Şenol, Huan-Yang Wang, Hsu-Chao Lai, Kun-Ta Chuang, Huan Liu

Abstract: The advanced bidirectional EV charging and discharging technology, aimed at supporting grid stability and emergency operations, has driven a growing interest in workplace applications. It not only reduces electricity expenses but also enhances the resilience in handling practical matters, such as peak power limitation, fluctuating energy prices, and unpredictable EV departures. Considering these f… ▽ More The advanced bidirectional EV charging and discharging technology, aimed at supporting grid stability and emergency operations, has driven a growing interest in workplace applications. It not only reduces electricity expenses but also enhances the resilience in handling practical matters, such as peak power limitation, fluctuating energy prices, and unpredictable EV departures. Considering these factors systematically can benefit energy efficiency in office buildings and for EV users simultaneously. To employ AI to address these issues, we propose HUCA, a novel real-time charging control for regulating energy demands for both the building and EVs. HUCA employs hierarchical actor-critic networks to dynamically reduce electricity costs in buildings, accounting for the needs of EV charging in the dynamic pricing scenario. To tackle the uncertain EV departures, we introduce a new critic augmentation to account for departure uncertainties in evaluating the charging decisions, while maintaining the robustness of the charging control. Experiments on real-world electricity datasets under both simulated certain and uncertain departure scenarios demonstrate that HUCA outperforms baselines in terms of total electricity costs while maintaining competitive performance in fulfilling EV charging requirements. A case study also manifests that HUCA effectively balances energy supply between the building and EVs based on real-time information, showcasing its potential as a key AI-driven solution for vehicle charging control. △ Less

Submitted 16 June, 2025; v1 submitted 23 December, 2024; originally announced December 2024.

arXiv:2308.04887 [pdf, other]

Targeted and Troublesome: Tracking and Advertising on Children's Websites

Authors: Zahra Moti, Asuman Senol, Hamid Bostani, Frederik Zuiderveen Borgesius, Veelasha Moonsamy, Arunesh Mathur, Gunes Acar

Abstract: On the modern web, trackers and advertisers frequently construct and monetize users' detailed behavioral profiles without consent. Despite various studies on web tracking mechanisms and advertisements, there has been no rigorous study focusing on websites targeted at children. To address this gap, we present a measurement of tracking and (targeted) advertising on websites directed at children. Mot… ▽ More On the modern web, trackers and advertisers frequently construct and monetize users' detailed behavioral profiles without consent. Despite various studies on web tracking mechanisms and advertisements, there has been no rigorous study focusing on websites targeted at children. To address this gap, we present a measurement of tracking and (targeted) advertising on websites directed at children. Motivated by lacking a comprehensive list of child-directed (i.e., targeted at children) websites, we first build a multilingual classifier based on web page titles and descriptions. Applying this classifier to over two million pages, we compile a list of two thousand child-directed websites. Crawling these sites from five vantage points, we measure the prevalence of trackers, fingerprinting scripts, and advertisements. Our crawler detects ads displayed on child-directed websites and determines if ad targeting is enabled by scraping ad disclosure pages whenever available. Our results show that around 90% of child-directed websites embed one or more trackers, and about 27% contain targeted advertisements--a practice that should require verifiable parental consent. Next, we identify improper ads on child-directed websites by developing an ML pipeline that processes both images and text extracted from ads. The pipeline allows us to run semantic similarity queries for arbitrary search terms, revealing ads that promote services related to dating, weight loss, and mental health; as well as ads for sex toys and flirting chat services. Some of these ads feature repulsive and sexually explicit imagery. In summary, our findings indicate a trend of non-compliance with privacy regulations and troubling ad safety practices among many advertisers and child-directed websites. To protect children and create a safer online environment, regulators and stakeholders must adopt and enforce more stringent measures. △ Less

Submitted 10 December, 2023; v1 submitted 9 August, 2023; originally announced August 2023.

Comments: To appear at 45th IEEE Symposium on Security and Privacy, May 20-23 2024

arXiv:1602.03031 [pdf, other]

Coinami: A Cryptocurrency with DNA Sequence Alignment as Proof-of-work

Authors: Atalay M. Ileri, Halil I. Ozercan, Alper Gundogdu, Ahmet K. Senol, M. Yusuf Ozkaya, Can Alkan

Abstract: Rate of growth of the amount of data generated using the high throughput sequencing (HTS) platforms now exceeds the growth stipulated by Moore's Law. The HTS data is expected to surpass those of other "big data" domains such as astronomy, before the year 2025. In addition to sequencing genomes for research purposes, genome and exome sequencing in clinical settings will be a routine part of health… ▽ More Rate of growth of the amount of data generated using the high throughput sequencing (HTS) platforms now exceeds the growth stipulated by Moore's Law. The HTS data is expected to surpass those of other "big data" domains such as astronomy, before the year 2025. In addition to sequencing genomes for research purposes, genome and exome sequencing in clinical settings will be a routine part of health care. The analysis of such large amounts of data, however, is not without computational challenges. This burden is even more increased due to the periodic updates to reference genomes, which typically require re-analysis of existing data. Here we propose Coin-Application Mediator Interface (Coinami) to distribute the workload for mapping reads to reference genomes using a volunteer grid computer approach similar to Berkeley Open Infrastructure for Network Computing (BOINC). However, since HTS read mapping requires substantial computational resources and fast analysis turnout is desired, Coinami uses the HTS read mapping as proof-of-work to generate valid blocks to main its own cryptocurrency system, which may help motivate volunteers to dedicate more resources. The Coinami protocol includes mechanisms to ensure that jobs performed by volunteers are correct, and provides genomic data privacy. The prototype implementation of Coinami is available at http://coinami.github.io/. △ Less

Submitted 19 February, 2016; v1 submitted 9 February, 2016; originally announced February 2016.

Showing 1–5 of 5 results for author: Şenol, A