arXiv:2505.20707 [pdf, ps, other]

Dissecting Physics Reasoning in Small Language Models: A Multi-Dimensional Analysis from an Educational Perspective

Authors: Nicy Scaria, Silvester John Joseph Kennedy, Diksha Seth, Deepak Subramani

Abstract: Small Language Models (SLMs) offer computational efficiency and accessibility, making them promising for educational applications. However, their capacity for complex reasoning, particularly in domains such as physics, remains underexplored. This study investigates the high school physics reasoning capabilities of state-of-the-art SLMs (under 4 billion parameters), including instruct versions of Llama 3.2, Phi 4 Mini, Gemma 3, and Qwen series. We developed a comprehensive physics dataset from the OpenStax High School Physics textbook, annotated according to Bloom's Taxonomy, with LaTeX and plaintext mathematical notations. A novel cultural contextualization approach was applied to a subset, creating culturally adapted problems for Asian, African, and South American/Australian contexts while preserving core physics principles. Using an LLM-as-a-judge framework with Google's Gemini 2.5 Flash, we evaluated answer and reasoning chain correctness, along with calculation accuracy. The results reveal significant differences between the SLMs. Qwen 3 1.7B achieved high 'answer accuracy' (85%), but 'fully correct reasoning' was substantially low (38%). The format of the mathematical notation had a negligible impact on performance. SLMs exhibited varied performance across the physics topics and showed a decline in reasoning quality with increasing cognitive and knowledge complexity. In particular, the consistency of reasoning was largely maintained in diverse cultural contexts, especially by better performing models. These findings indicate that, while SLMs can often find correct answers, their underlying reasoning is frequently flawed, suggesting an overreliance on pattern recognition. For SLMs to become reliable educational tools in physics, future development must prioritize enhancing genuine understanding and the generation of sound, verifiable reasoning chains over mere answer accuracy.

Submitted 27 May, 2025; originally announced May 2025.

arXiv:2505.19414 [pdf, ps, other]

Toward Physics-Informed Machine Learning for Data Center Operations: A Tropical Case Study

Authors: Ruihang Wang, Zhiwei Cao, Qingang Zhang, Rui Tan, Yonggang Wen, Tommy Leung, Stuart Kennedy, Justin Teoh

Abstract: Data centers are the backbone of computing capacity. Operating data centers in the tropical regions faces unique challenges due to consistently high ambient temperature and elevated relative humidity throughout the year. These conditions result in increased cooling costs to maintain the reliability of the computing systems. While existing machine learning-based approaches have demonstrated potential to elevate operations to a more proactive and intelligent level, their deployment remains dubious due to concerns about model extrapolation capabilities and associated system safety issues. To address these concerns, this article proposes incorporating the physical characteristics of data centers into traditional data-driven machine learning solutions. We begin by introducing the data center system, including the relevant multiphysics processes and the data-physics availability. Next, we outline the associated modeling and optimization problems and propose an integrated, physics-informed machine learning system to address them. Using the proposed system, we present relevant applications across varying levels of operational intelligence. A case study on an industry-grade tropical data center is provided to demonstrate the effectiveness of our approach. Finally, we discuss key challenges and highlight potential future directions.

Submitted 25 May, 2025; originally announced May 2025.

arXiv:2505.02850 [pdf, other]

Harnessing Structured Knowledge: A Concept Map-Based Approach for High-Quality Multiple Choice Question Generation with Effective Distractors

Authors: Nicy Scaria, Silvester John Joseph Kennedy, Diksha Seth, Ananya Thakur, Deepak Subramani

Abstract: Generating high-quality MCQs, especially those targeting diverse cognitive levels and incorporating common misconceptions into distractor design, is time-consuming and expertise-intensive, making manual creation impractical at scale. Current automated approaches typically generate questions at lower cognitive levels and fail to incorporate domain-specific misconceptions. This paper presents a hierarchical concept map-based framework that provides structured knowledge to guide LLMs in generating MCQs with distractors. We chose high-school physics as our test domain and began by developing a hierarchical concept map covering major Physics topics and their interconnections with an efficient database design. Next, through an automated pipeline, topic-relevant sections of these concept maps are retrieved to serve as a structured context for the LLM to generate questions and distractors that specifically target common misconceptions. Lastly, an automated validation is completed to ensure that the generated MCQs meet the requirements provided. We evaluate our framework against two baseline approaches: a base LLM and a RAG-based generation. We conducted expert evaluations and student assessments of the generated MCQs. Expert evaluation shows that our method significantly outperforms the baseline approaches, achieving a success rate of 75.20% in meeting all quality criteria compared to approximately 37% for both baseline methods. Student assessment data reveal that our concept map-driven approach achieved a significantly lower guess success rate of 28.05% compared to 37.10% for the baselines, indicating a more effective assessment of conceptual understanding. The results demonstrate that our concept map-based approach enables robust assessment across cognitive levels and instant identification of conceptual gaps, facilitating faster feedback loops and targeted interventions at scale.

Submitted 2 May, 2025; originally announced May 2025.

arXiv:2502.11304 [pdf, other]

Leveraging Multimodal-LLMs Assisted by Instance Segmentation for Intelligent Traffic Monitoring

Authors: Murat Arda Onsu, Poonam Lohan, Burak Kantarci, Aisha Syed, Matthew Andrews, Sean Kennedy

Abstract: A robust and efficient traffic monitoring system is essential for smart cities and Intelligent Transportation Systems (ITS), using sensors and cameras to track vehicle movements, optimize traffic flow, reduce congestion, enhance road safety, and enable real-time adaptive traffic control. Traffic monitoring models must comprehensively understand dynamic urban conditions and provide an intuitive user interface for effective management. This research leverages the LLaVA visual grounding multimodal large language model (LLM) for traffic monitoring tasks on the real-time Quanser Interactive Lab simulation platform, covering scenarios like intersections, congestion, and collisions. Cameras placed at multiple urban locations collect real-time images from the simulation, which are fed into the LLaVA model with queries for analysis. An instance segmentation model integrated into the cameras highlights key elements such as vehicles and pedestrians, enhancing training and throughput. The system achieves 84.3% accuracy in recognizing vehicle locations and 76.4% in determining steering direction, outperforming traditional models.

Submitted 16 February, 2025; originally announced February 2025.

Comments: 6 pages, 7 figures, submitted to 30th IEEE International Symposium on Computers and Communications (ISCC) 2025

arXiv:2408.12226 [pdf, ps, other]

EvalYaks: Instruction Tuning Datasets and LoRA Fine-tuned Models for Automated Scoring of CEFR B2 Speaking Assessment Transcripts

Authors: Nicy Scaria, Silvester John Joseph Kennedy, Thomas Latinovich, Deepak Subramani

Abstract: Relying on human experts to evaluate CEFR speaking assessments in an e-learning environment creates scalability challenges, as it limits how quickly and widely assessments can be conducted. We aim to automate the evaluation of CEFR B2 English speaking assessments in e-learning environments from conversation transcripts. First, we evaluate the capability of leading open source and commercial Large Language Models (LLMs) to score a candidate's performance across various criteria in the CEFR B2 speaking exam in both global and India-specific contexts. Next, we create a new expert-validated, CEFR-aligned synthetic conversational dataset with transcripts that are rated at different assessment scores. In addition, new instruction-tuned datasets are developed from the English Vocabulary Profile (up to CEFR B2 level) and the CEFR-SP WikiAuto datasets. Finally, using these new datasets, we perform parameter efficient instruction tuning of Mistral Instruct 7B v0.2 to develop a family of models called EvalYaks. Four models in this family are for assessing the four sections of the CEFR B2 speaking exam, one for identifying the CEFR level of vocabulary and generating level-specific vocabulary, and another for detecting the CEFR level of text and generating level-specific text. EvalYaks achieved an average acceptable accuracy of 96%, a degree of variation of 0.35 levels, and performed 3 times better than the next best model. This demonstrates that a 7B parameter LLM instruction tuned with high-quality CEFR-aligned assessment data can effectively evaluate and score CEFR B2 English speaking assessments, offering a promising solution for scalable, automated language proficiency evaluation.

Submitted 30 May, 2025; v1 submitted 22 August, 2024; originally announced August 2024.

arXiv:2407.00996 [pdf, other]

Can Small Language Models Learn, Unlearn, and Retain Noise Patterns?

Authors: Nicy Scaria, Silvester John Joseph Kennedy, Deepak Subramani

Abstract: With the growing need for efficient language models in resource-constrained environments, Small Language Models (SLMs) have emerged as compact and practical alternatives to Large Language Models (LLMs). While studies have explored noise handling in LLMs, little is known about how SLMs handle noise, a critical factor for their reliable real-world deployment. This study investigates the ability of SLMs with parameters between 1 and 3 billion to learn, retain, and subsequently eliminate different types of noise (word flip, character flip, transliteration, irrelevant content, and contradictory information). Four pretrained SLMs (Olmo 1B, Qwen1.5 1.8B, Gemma1.1 2B, and Phi2 2.7B) were instruction-tuned on noise-free data and tested with in-context examples to assess noise learning. Subsequently, noise patterns were introduced in instruction tuning to assess their adaptability. The results revealed differences in how models handle noise, with smaller models like Olmo quickly adapting to noise patterns. Phi2's carefully curated, structured, and high-quality pretraining data enabled resistance to character level, transliteration, and counterfactual noise, while Gemma adapted successfully to transliteration noise through its multilingual pretraining. Subsequent clean data training effectively mitigated noise effects. These findings provide practical strategies for developing robust SLMs for real-world applications.

Submitted 27 May, 2025; v1 submitted 1 July, 2024; originally announced July 2024.

arXiv:2404.09981 [pdf, other]

Robot Positioning Using Torus Packing for Multisets

Authors: Chung Shue Chen, Peter Keevash, Sean Kennedy, Élie de Panafieu, Adrian Vetta

Abstract: We consider the design of a positioning system where a robot determines its position from local observations. This is a well-studied problem of considerable practical importance and mathematical interest. The dominant paradigm derives from the classical theory of de Bruijn sequences, where the robot has access to a window within a larger code and can determine its position if these windows are distinct. We propose an alternative model in which the robot has more limited observational powers, which we argue is more realistic in terms of engineering: the robot does not have access to the full pattern of colours (or letters) in the window, but only to the intensity of each colour (or the number of occurrences of each letter). This leads to a mathematically interesting problem with a different flavour to that arising in the classical paradigm, requiring new construction techniques. The parameters of our construction are optimal up to a constant factor, and computing the position requires only a constant number of arithmetic operations.

Submitted 15 April, 2024; originally announced April 2024.

Comments: 22 pages, accepted at ICALP 2024

ACM Class: G.2.1

arXiv:2403.00148 [pdf, ps, other]

Implications of Regulations on the Use of AI and Generative AI for Human-Centered Responsible Artificial Intelligence

Authors: Marios Constantinides, Mohammad Tahaei, Daniele Quercia, Simone Stumpf, Michael Madaio, Sean Kennedy, Lauren Wilcox, Jessica Vitak, Henriette Cramer, Edyta Bogucka, Ricardo Baeza-Yates, Ewa Luger, Jess Holbrook, Michael Muller, Ilana Golbin Blumenfeld, Giada Pistilli

Abstract: With the upcoming AI regulations (e.g., EU AI Act) and rapid advancements in generative AI, new challenges emerge in the area of Human-Centered Responsible Artificial Intelligence (HCR-AI). As AI becomes more ubiquitous, questions around decision-making authority, human oversight, accountability, sustainability, and the ethical and legal responsibilities of AI and their creators become paramount. Addressing these questions requires a collaborative approach. By involving stakeholders from various disciplines in the 2nd edition of the HCR-AI Special Interest Group (SIG) at CHI 2024, we aim to discuss the implications of regulations in HCI research, develop new theories, evaluation frameworks, and methods to navigate the complex nature of AI ethics, steering AI development in a direction that is beneficial and sustainable for all of humanity.

Submitted 29 February, 2024; originally announced March 2024.

Comments: 6 pages

arXiv:2402.11183 [pdf, other]

doi 10.1038/s42256-025-01017-7

Materiality and Risk in the Age of Pervasive AI Sensors

Authors: Mona Sloane, Emanuel Moss, Susan Kennedy, Matthew Stewart, Pete Warden, Brian Plancher, Vijay Janapa Reddi

Abstract: Artificial intelligence (AI) systems connected to sensor-laden devices are becoming pervasive, which has notable implications for a range of AI risks, including to privacy, the environment, autonomy and more. There is therefore a growing need for increased accountability around the responsible development and deployment of these technologies. Here we highlight the dimensions of risk associated with AI systems that arise from the material affordances of sensors and their underlying calculative models. We propose a sensor-sensitive framework for diagnosing these risks, complementing existing approaches such as the US National Institute of Standards and Technology AI Risk Management Framework and the European Union AI Act, and discuss its implementation. We conclude by advocating for increased attention to the materiality of algorithmic systems, and of on-device AI sensors in particular, and highlight the need for development of a sensor design paradigm that empowers users and communities and leads to a future of increased fairness, accountability and transparency.

Submitted 24 March, 2025; v1 submitted 16 February, 2024; originally announced February 2024.

Journal ref: Nature Machine Intelligence (2025): 1-12

arXiv:2309.09928 [pdf, other]

Evaluating Adversarial Robustness with Expected Viable Performance

Authors: Ryan McCoppin, Colin Dawson, Sean M. Kennedy, Leslie M. Blaha

Abstract: We introduce a metric for evaluating the robustness of a classifier, with particular attention to adversarial perturbations, in terms of expected functionality with respect to possible adversarial perturbations. A classifier is assumed to be non-functional (that is, has a functionality of zero) with respect to a perturbation bound if a conventional measure of performance, such as classification accuracy, is less than a minimally viable threshold when the classifier is tested on examples from that perturbation bound. Defining robustness in terms of an expected value is motivated by a domain general approach to robustness quantification.

Submitted 18 September, 2023; originally announced September 2023.

Comments: Accepted at the 22nd International Conference on Machine Learning and Applications (IEEE 2023)

arXiv:2306.05952 [pdf, other]

Overcoming Adversarial Attacks for Human-in-the-Loop Applications

Authors: Ryan McCoppin, Marla Kennedy, Platon Lukyanenko, Sean Kennedy

Abstract: Including human analysis has the potential to positively affect the robustness of Deep Neural Networks and is relatively unexplored in the Adversarial Machine Learning literature. Neural network visual explanation maps have been shown to be prone to adversarial attacks. Further research is needed in order to select robust visualizations of explanations for the image analyst to evaluate a given model. These factors greatly impact Human-In-The-Loop (HITL) evaluation tools due to their reliance on adversarial images, including explanation maps and measurements of robustness. We believe models of human visual attention may improve interpretability and robustness of human-machine imagery analysis systems. Our challenge remains, how can HITL evaluation be robust in this adversarial landscape?

Submitted 25 August, 2023; v1 submitted 9 June, 2023; originally announced June 2023.

Comments: New Frontiers in Adversarial Machine Learning, ICML 2022

arXiv:2305.02112 [pdf, ps, other]

doi 10.1109/LNET.2023.3283936

Heterogeneous GNN-RL Based Task Offloading for UAV-aided Smart Agriculture

Authors: Turgay Pamuklu, Aisha Syed, W. Sean Kennedy, Melike Erol-Kantarci

Abstract: Having unmanned aerial vehicles (UAVs) with edge computing capability hover over smart farmlands supports Internet of Things (IoT) devices with low processing capacity and power to accomplish their deadline-sensitive tasks efficiently and economically. In this work, we propose a graph neural network-based reinforcement learning solution to optimize the task offloading from these IoT devices to the UAVs. We conduct evaluations to show that our approach reduces task deadline violations while also increasing the mission time of the UAVs by optimizing their battery usage. Moreover, the proposed solution has increased robustness to network topology changes and is able to adapt to extreme cases, such as the failure of a UAV.

Submitted 8 June, 2023; v1 submitted 3 May, 2023; originally announced May 2023.

arXiv:2302.08157 [pdf, other]

doi 10.1145/3544549.3583178

Human-Centered Responsible Artificial Intelligence: Current & Future Trends

Authors: Mohammad Tahaei, Marios Constantinides, Daniele Quercia, Sean Kennedy, Michael Muller, Simone Stumpf, Q. Vera Liao, Ricardo Baeza-Yates, Lora Aroyo, Jess Holbrook, Ewa Luger, Michael Madaio, Ilana Golbin Blumenfeld, Maria De-Arteaga, Jessica Vitak, Alexandra Olteanu

Abstract: In recent years, the CHI community has seen significant growth in research on Human-Centered Responsible Artificial Intelligence. While different research communities may use different terminology to discuss similar topics, all of this work is ultimately aimed at developing AI that benefits humanity while being grounded in human rights and ethics, and reducing the potential harms of AI. In this special interest group, we aim to bring together researchers from academia and industry interested in these topics to map current and future research trends to advance this important area of research by fostering collaboration and sharing ideas.

Submitted 16 February, 2023; originally announced February 2023.

Comments: To appear in Extended Abstracts of the 2023 CHI Conference on Human Factors in Computing Systems

arXiv:2302.07399 [pdf, other]

To Risk or Not to Risk: Learning with Risk Quantification for IoT Task Offloading in UAVs

Authors: Anne Catherine Nguyen, Turgay Pamuklu, Aisha Syed, W. Sean Kennedy, Melike Erol-Kantarci

Abstract: A deep reinforcement learning technique is presented for task offloading decision-making algorithms for a multi-access edge computing (MEC) assisted unmanned aerial vehicle (UAV) network in a smart farm Internet of Things (IoT) environment. The task offloading technique uses financial concepts such as cost functions and conditional value at risk (CVaR) in order to quantify the damage that may be caused by each risky action. The approach was able to quantify potential risks to train the reinforcement learning agent to avoid risky behaviors that will lead to irreversible consequences for the farm. Such consequences include an undetected fire, pest infestation, or a UAV being unusable. The proposed CVaR-based technique was compared to other deep reinforcement learning techniques and two fixed rule-based techniques. The simulation results show that the CVaR-based risk quantifying method eliminated the most dangerous risk, which was exceeding the deadline for a fire detection task. As a result, it reduced the total number of deadline violations with a negligible increase in energy consumption.

Submitted 14 February, 2023; originally announced February 2023.

Comments: Accepted for ICC2023

arXiv:2209.07382 [pdf, ps, other]

doi 10.1109/TGCN.2022.3205330

IoT-Aerial Base Station Task Offloading with Risk-Sensitive Reinforcement Learning for Smart Agriculture

Authors: Turgay Pamuklu, Anne Catherine Nguyen, Aisha Syed, W. Sean Kennedy, Melike Erol-Kantarci

Abstract: Aerial base stations (ABSs) allow smart farms to offload processing responsibility of complex tasks from internet of things (IoT) devices to ABSs. IoT devices have limited energy and computing resources, thus it is required to provide an advanced solution for a system that requires the support of ABSs. This paper introduces a novel multi-actor-based risk-sensitive reinforcement learning approach for ABS task scheduling for smart agriculture. The problem is defined as task offloading with a strict condition on completing the IoT tasks before their deadlines. Moreover, the algorithm must also consider the limited energy capacity of the ABSs. The results show that our proposed approach outperforms several heuristics and the classic Q-Learning approach. Furthermore, we provide a mixed integer linear programming solution to determine a lower bound on the performance, and clarify the gap between our risk-sensitive solution and the optimal solution, as well. The comparison proves our extensive simulation results demonstrate that our method is a promising approach for providing a guaranteed task processing services for the IoT tasks in a smart farm, while increasing the hovering time of the ABSs in this farm.

Submitted 15 September, 2022; originally announced September 2022.

Comments: Accepted Paper

arXiv:2209.07367 [pdf, other]

Deep Reinforcement Learning for Task Offloading in UAV-Aided Smart Farm Networks

Authors: Anne Catherine Nguyen, Turgay Pamuklu, Aisha Syed, W. Sean Kennedy, Melike Erol-Kantarci

Abstract: The fifth and sixth generations of wireless communication networks are enabling tools such as internet of things devices, unmanned aerial vehicles (UAVs), and artificial intelligence, to improve the agricultural landscape using a network of devices to automatically monitor farmlands. Surveying a large area requires performing a lot of image classification tasks within a specific period of time in order to prevent damage to the farm in case of an incident, such as fire or flood. UAVs have limited energy and computing power, and may not be able to perform all of the intense image classification tasks locally and within an appropriate amount of time. Hence, it is assumed that the UAVs are able to partially offload their workload to nearby multi-access edge computing devices. The UAVs need a decision-making algorithm that will decide where the tasks will be performed, while also considering the time constraints and energy level of the other UAVs in the network. In this paper, we introduce a Deep Q-Learning (DQL) approach to solve this multi-objective problem. The proposed method is compared with Q-Learning and three heuristic baselines, and the simulation results show that our proposed DQL-based method achieves comparable results when it comes to the UAVs' remaining battery levels and percentage of deadline violations. In addition, our method is able to reach convergence 13 times faster than Q-Learning.

Submitted 15 September, 2022; originally announced September 2022.

Comments: Accepted Paper

arXiv:2201.10361 [pdf, other]

doi 10.1109/ICC45855.2022.9838500

Reinforcement Learning-Based Deadline and Battery-Aware Offloading in Smart Farm IoT-UAV Networks

Authors: Anne Catherine Nguyen, Turgay Pamuklu, Aisha Syed, W. Sean Kennedy, Melike Erol-Kantarci

Abstract: Unmanned aerial vehicles (UAVs) with mounted base stations are a promising technology for monitoring smart farms. They can provide communication and computation services to extensive agricultural regions. With the assistance of a Multi-Access Edge Computing infrastructure, an aerial base station (ABS) network can provide an energy-efficient solution for smart farms that need to process deadline critical tasks fed by IoT devices deployed on the field. In this paper, we introduce a multi-objective maximization problem and a Q-Learning based method which aim to process these tasks before their deadline while considering the UAVs' hover time. We also present three heuristic baselines to evaluate the performance of our approaches. In addition, we introduce an integer linear programming (ILP) model to define the upper bound of our objective function. The results show that Q-Learning outperforms the baselines in terms of remaining energy levels and percentage of delay violations.

Submitted 12 February, 2022; v1 submitted 25 January, 2022; originally announced January 2022.

Comments: Accepted Paper. Please check footnote in Page 1 for copyright

Journal ref: ICC 2022 - IEEE International Conference on Communications

arXiv:2106.04008 [pdf, other]

Widening Access to Applied Machine Learning with TinyML

Authors: Vijay Janapa Reddi, Brian Plancher, Susan Kennedy, Laurence Moroney, Pete Warden, Anant Agarwal, Colby Banbury, Massimo Banzi, Matthew Bennett, Benjamin Brown, Sharad Chitlangia, Radhika Ghosal, Sarah Grafman, Rupert Jaeger, Srivatsan Krishnan, Maximilian Lam, Daniel Leiker, Cara Mann, Mark Mazumder, Dominic Pajak, Dhilan Ramaprasad, J. Evan Smith, Matthew Stewart, Dustin Tingley

Abstract: Broadening access to both computational and educational resources is critical to diffusing machine-learning (ML) innovation. However, today, most ML resources and experts are siloed in a few countries and organizations. In this paper, we describe our pedagogical approach to increasing access to applied ML through a massive open online course (MOOC) on Tiny Machine Learning (TinyML). We suggest that TinyML, ML on resource-constrained embedded devices, is an attractive means to widen access because TinyML both leverages low-cost and globally accessible hardware, and encourages the development of complete, self-contained applications, from data collection to deployment. To this end, a collaboration between academia (Harvard University) and industry (Google) produced a four-part MOOC that provides application-oriented instruction on how to develop solutions using TinyML. The series is openly available on the edX MOOC platform, has no prerequisites beyond basic programming, and is designed for learners from a global variety of backgrounds. It introduces pupils to real-world applications, ML algorithms, data-set engineering, and the ethical considerations of these technologies via hands-on programming and deployment of TinyML applications in both the cloud and their own microcontrollers. To facilitate continued learning, community building, and collaboration beyond the courses, we launched a standalone website, a forum, a chat, and an optional course-project competition. We also released the course materials publicly, hoping they will inspire the next generation of ML practitioners and educators and further broaden access to cutting-edge ML technologies.

Submitted 9 June, 2021; v1 submitted 7 June, 2021; originally announced June 2021.

Comments: Understanding the underpinnings of the TinyML edX course series: https://www.edx.org/professional-certificate/harvardx-tiny-machine-learning

arXiv:2010.15296 [pdf, other]

doi 10.18653/v1/P19-2048

Fact or Factitious? Contextualized Opinion Spam Detection

Authors: Stefan Kennedy, Niall Walsh, Kirils Sloka, Jennifer Foster, Andrew McCarren

Abstract: In this paper we perform an analytic comparison of a number of techniques used to detect fake and deceptive online reviews. We apply a number of machine learning approaches found to be effective, and introduce our own approach by fine-tuning state of the art contextualised embeddings. The results we obtain show the potential of contextualised embeddings for fake review detection, and lay the groundwork for future research in this area.

Submitted 28 October, 2020; originally announced October 2020.

Comments: 6 pages, 3 figures, presented at the 2019 ACL Conference in Florence, Italy

Report number: P19-2048

ACM Class: I.2.7; I.2.6

arXiv:2005.09800 [pdf, other]

Fingerprinting Encrypted Voice Traffic on Smart Speakers with Deep Learning

Authors: Chenggang Wang, Sean Kennedy, Haipeng Li, King Hudson, Gowtham Atluri, Xuetao Wei, Wenhai Sun, Boyang Wang

Abstract: This paper investigates the privacy leakage of smart speakers under an encrypted traffic analysis attack, referred to as voice command fingerprinting. In this attack, an adversary can eavesdrop on both outgoing and incoming encrypted voice traffic of a smart speaker, and infer which voice command a user says over encrypted traffic. We first built an automatic voice traffic collection tool and collected two large-scale datasets on two smart speakers, Amazon Echo and Google Home. Then, we implemented proof-of-concept attacks by leveraging deep learning. Our experimental results over the two datasets indicate disturbing privacy concerns. Specifically, compared to 1% accuracy with random guess, our attacks can correctly infer voice commands over encrypted traffic with 92.89% accuracy on Amazon Echo. Despite variances that human voices may cause on outgoing traffic, our proof-of-concept attacks remain effective even when only leveraging incoming traffic (i.e., the traffic from the server). This is because the AI-based voice services running on the server side respond to commands in the same voice and in a deterministic or predictable manner in text, which leaves a distinguishable pattern over encrypted traffic. We also built a proof-of-concept defense to obfuscate encrypted traffic. Our results show that the defense can effectively mitigate attack accuracy on Amazon Echo to 32.18%.

Submitted 19 May, 2020; originally announced May 2020.

Journal ref: 13th ACM Conference on Security and Privacy in Wireless and Mobile Networks (WiSec '20), July 8--10, 2020, Linz (Virtual Event), Austria

arXiv:1804.03270 [pdf, other]

Towards Deep Cellular Phenotyping in Placental Histology

Authors: Michael Ferlaino, Craig A. Glastonbury, Carolina Motta-Mejia, Manu Vatish, Ingrid Granne, Stephen Kennedy, Cecilia M. Lindgren, Christoffer Nellåker

Abstract: The placenta is a complex organ, playing multiple roles during fetal development. Very little is known about the association between placental morphological abnormalities and fetal physiology. In this work, we present an open sourced, computationally tractable deep learning pipeline to analyse placenta histology at the level of the cell. By utilising two deep Convolutional Neural Network architectures and transfer learning, we can robustly localise and classify placental cells within five classes with an accuracy of 89%. Furthermore, we learn deep embeddings encoding phenotypic knowledge that are capable of both stratifying five distinct cell populations and learning intraclass phenotypic variance. We envisage that the automation of this pipeline to population scale studies of placenta histology has the potential to improve our understanding of basic cellular placental biology and its variations, particularly its role in predicting adverse birth outcomes.

Submitted 25 May, 2018; v1 submitted 9 April, 2018; originally announced April 2018.

Comments: Updated MRC funding material. Corrected typo that suggested ensembling and Inception accuracy were the same (updated to reflect the fact the ensemble model is 1% better than previously reported)

arXiv:1707.06903 [pdf, other]

A New Family of Near-metrics for Universal Similarity

Authors: Chu Wang, Iraj Saniee, William S. Kennedy, Chris A. White

Abstract: We propose a family of near-metrics based on local graph diffusion to capture similarity for a wide class of data sets. These quasi-metametrics, as their names suggest, dispense with one or two standard axioms of metric spaces, specifically distinguishability and symmetry, so that similarity between data points of arbitrary type and form could be measured broadly and effectively. The proposed near-metric family includes the forward k-step diffusion and its reverse, typically on the graph consisting of data objects and their features. By construction, this family of near-metrics is particularly appropriate for categorical data, continuous data, and vector representations of images and text extracted via deep learning approaches. We conduct extensive experiments to evaluate the performance of this family of similarity measures and compare and contrast with traditional measures of similarity used for each specific application and with the ground truth when available. We show that for structured data including categorical and continuous data, the near-metrics corresponding to normalized forward k-step diffusion (k small) work as one of the best performing similarity measures; for vector representations of text and images including those extracted from deep learning, the near-metrics derived from normalized and reverse k-step graph diffusion (k very small) exhibit outstanding ability to distinguish data points from different classes.

Submitted 17 October, 2017; v1 submitted 21 July, 2017; originally announced July 2017.

arXiv:1604.07359 [pdf, other]

Fast approximation algorithms for $p$-centres in large $δ$-hyperbolic graphs

Authors: Katherine Edwards, W. Sean Kennedy, Iraj Saniee

Abstract: We provide a quasilinear time algorithm for the $p$-center problem with an additive error less than or equal to 3 times the input graph's hyperbolic constant. Specifically, for the graph $G=(V,E)$ with $n$ vertices, $m$ edges and hyperbolic constant $δ$, we construct an algorithm for $p$-centers in time $O(p(δ+1)(n+m)\log(n))$ with radius not exceeding $r_p + δ$ when $p \leq 2$ and $r_p + 3δ$ when $p \geq 3$, where $r_p$ are the optimal radii. Prior work identified $p$-centers with accuracy $r_p+δ$ but with time complexity $O((n^3\log n + n^2m)\log(diam(G)))$ which is impractical for large graphs.

Submitted 25 April, 2016; originally announced April 2016.

Comments: 19 pages

arXiv:1411.2873 [pdf, other]

Improving Robustness of Next-Hop Routing

Authors: Glencora Borradaile, W. Sean Kennedy, Gordon Wilfong, Lisa Zhang

Abstract: A weakness of next-hop routing is that following a link or router failure there may be no routes between some source-destination pairs, or packets may get stuck in a routing loop as the protocol operates to establish new routes. In this article, we address these weaknesses by describing mechanisms to choose alternate next hops. Our first contribution is to model the scenario as the following Tree Augmentation problem. Consider a mixed graph where some edges are directed and some undirected. The directed edges form a spanning tree pointing towards the common destination node. Each directed edge represents the unique next hop in the routing protocol. Our goal is to direct the undirected edges so that the resulting graph remains acyclic and the number of nodes with outdegree two or more is maximized. These nodes represent those with alternative next hops in their routing paths. We show that Tree Augmentation is NP-hard in general and present a simple $\frac{1}{2}$-approximation algorithm. We also study 3 special cases. We give exact polynomial-time algorithms for when the input spanning tree consists of exactly 2 directed paths or when the input graph has bounded treewidth. For planar graphs, we present a polynomial-time approximation scheme when the input tree is a breadth-first search tree. To the best of our knowledge, Tree Augmentation has not been previously studied.

Submitted 11 November, 2014; originally announced November 2014.

arXiv:1404.5002 [pdf, ps, other]

A Geometric Distance Oracle for Large Real-World Graphs

Authors: Deepak Ajwani, W. Sean Kennedy, Alessandra Sala, Iraj Saniee

Abstract: Many graph processing algorithms require determination of shortest-path distances between arbitrary numbers of node pairs. Since computation of exact distances between all node-pairs of a large graph, e.g., 10M nodes and up, is prohibitively expensive both in computational time and storage space, distance approximation is often used in place of exact computation. In this paper, we present a novel and scalable distance oracle that leverages the hyperbolic core of real-world large graphs for fast and scalable distance approximation. We show empirically that the proposed oracle significantly outperforms prior oracles on a random set of test cases drawn from public domain graph libraries. There are two sets of prior work against which we benchmark our approach. The first set, which often outperforms other oracles, employs embedding of the graph into low dimensional Euclidean spaces with carefully constructed hyperbolic distances, but provides no guarantees on the distance estimation error. The second set leverages Gromov-type tree contraction of the graph with the additive error guaranteed not to exceed $2δ\log{n}$, where $δ$ is the hyperbolic constant of the graph. We show that our proposed oracle 1) is significantly faster than those oracles that use hyperbolic embedding (first set) with similar approximation error and, perhaps surprisingly, 2) exhibits substantially lower average estimation error compared to Gromov-like tree contractions (second set). We substantiate our claims through numerical computations on a collection of a dozen real world networks and synthetic test cases from multiple domains, ranging in size from tens of thousands to tens of millions of nodes.

Submitted 19 April, 2014; originally announced April 2014.

Comments: 15 pages, 9 figures, 3 tables

arXiv:1307.0031 [pdf, other]

On the Hyperbolicity of Large-Scale Networks

Authors: W. Sean Kennedy, Onuttom Narayan, Iraj Saniee

Abstract: Through detailed analysis of scores of publicly available data sets corresponding to a wide range of large-scale networks, from communication and road networks to various forms of social networks, we explore a little-studied geometric characteristic of real-life networks, namely their hyperbolicity. In smooth geometry, hyperbolicity captures the notion of negative curvature; within the more abstract context of metric spaces, it can be generalized as d-hyperbolicity. This generalized definition can be applied to graphs, which we explore in this report. We provide strong evidence that communication and social networks exhibit this fundamental property, and through extensive computations we quantify the degree of hyperbolicity of each network in comparison to its diameter. By contrast, and as evidence of the validity of the methodology, applying the same methods to the road networks shows that they are not hyperbolic, which is as expected. Finally, we present practical computational means for detection of hyperbolicity and show how the test itself may be scaled to much larger graphs than those we examined via renormalization group methodology. Using well-understood mechanisms, we provide evidence through synthetically generated graphs that hyperbolicity is preserved and indeed amplified by renormalization. This allows us to detect hyperbolicity in large networks efficiently, through much smaller renormalized versions. These observations indicate that d-hyperbolicity is a common feature of large-scale networks. We propose that d-hyperbolicity, in conjunction with other local characteristics of networks, such as the degree distribution and clustering coefficients, provides a more complete unifying picture of networks, and helps classify in a parsimonious way what is otherwise a bewildering and complex array of features and characteristics specific to each natural and man-made network.

Submitted 28 June, 2013; originally announced July 2013.

Comments: 22 pages, 25 figures

arXiv:1103.6222 [pdf, other]

Finding a smallest odd hole in a claw-free graph using global structure

Authors: W. Sean Kennedy, Andrew D. King

Abstract: A lemma of Fouquet implies that a claw-free graph contains an induced $C_5$, contains no odd hole, or is quasi-line. In this paper we use this result to give an improved shortest-odd-hole algorithm for claw-free graphs by exploiting the structural relationship between line graphs and quasi-line graphs suggested by Chudnovsky and Seymour's structure theorem for quasi-line graphs. Our approach involves reducing the problem to that of finding a shortest odd cycle of length $\geq 5$ in a graph. Our algorithm runs in $O(m^2+n^2\log n)$ time, improving upon Shrem, Stern, and Golumbic's recent $O(nm^2)$ algorithm, which uses a local approach. The best known recognition algorithms for claw-free graphs run in $O(m^{1.69}) \cap O(n^{3.5})$ time, or $O(m^2) \cap O(n^{3.5})$ without fast matrix multiplication.

Submitted 23 May, 2011; v1 submitted 31 March, 2011; originally announced March 2011.

Comments: 12 pages, 1 figure

Showing 1–27 of 27 results for author: Kennedy, S