-
PickLLM: Context-Aware RL-Assisted Large Language Model Routing
Authors:
Dimitrios Sikeridis,
Dennis Ramdass,
Pranay Pareek
Abstract:
Recently, the number of off-the-shelf Large Language Models (LLMs) has exploded with many open-source options. This creates a diverse landscape regarding both serving options (e.g., inference on local hardware vs. remote LLM APIs) and heterogeneous model expertise. However, it is hard for the user to optimize efficiently when considering operational cost (pricing structures, expensive LLMs-as-a-service for large querying volumes), efficiency, or even per-case specific measures such as response accuracy, bias, or toxicity. Also, existing LLM routing solutions focus mainly on cost reduction, with response accuracy optimizations relying on non-generalizable supervised training, and ensemble approaches necessitating output computation for every considered LLM candidate. In this work, we tackle the challenge of selecting the optimal LLM from a model pool for specific queries with customizable objectives. We propose PickLLM, a lightweight framework that relies on Reinforcement Learning (RL) to route on-the-fly queries to available models. We introduce a weighted reward function that considers per-query cost, inference latency, and model response accuracy as measured by a customizable scoring function. Regarding the learning algorithms, we explore two alternatives: the PickLLM router acting as a learning automaton that utilizes gradient ascent to select a specific LLM, or utilizing stateless Q-learning to explore the set of LLMs and perform selection with an $\epsilon$-greedy approach. The algorithm converges to a single LLM for the remaining session queries. To evaluate, we utilize a pool of four LLMs and benchmark prompt-response datasets with different contexts. A separate scoring function assesses response accuracy during the experiment. We demonstrate the speed of convergence for different learning rates and improvement in hard metrics such as cost per querying session and overall response latency.
Submitted 12 December, 2024;
originally announced December 2024.
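The stateless Q-learning alternative with $\epsilon$-greedy selection mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names `weighted_reward` and `StatelessQRouter`, the reward weights, the linear form of the reward, and the assumption that cost and latency are normalized to [0, 1] are all hypothetical choices made for illustration.

```python
import random


def weighted_reward(score, cost, latency, w_score=1.0, w_cost=0.5, w_latency=0.5):
    """Hypothetical weighted reward: reward accuracy, penalize cost and latency.

    Assumes all three inputs are already normalized to [0, 1]; the linear
    form and the default weights are illustrative assumptions.
    """
    return w_score * score - w_cost * cost - w_latency * latency


class StatelessQRouter:
    """Epsilon-greedy router that keeps one Q-value per candidate model."""

    def __init__(self, models, alpha=0.1, epsilon=0.1, seed=None):
        self.models = list(models)
        self.alpha = alpha        # learning rate
        self.epsilon = epsilon    # exploration probability
        self.q = {m: 0.0 for m in self.models}
        self.rng = random.Random(seed)

    def select(self):
        """Explore a random model with probability epsilon, else exploit."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.models)
        return max(self.models, key=self.q.get)

    def update(self, model, reward):
        """Stateless Q-learning update: Q(a) <- Q(a) + alpha * (r - Q(a))."""
        self.q[model] += self.alpha * (reward - self.q[model])


if __name__ == "__main__":
    # Hypothetical per-model (score, cost, latency) values, not measured data.
    pool = {"model-a": (0.9, 0.8, 0.6), "model-b": (0.7, 0.2, 0.3)}
    router = StatelessQRouter(pool, alpha=0.2, epsilon=0.1, seed=0)
    for _ in range(200):
        chosen = router.select()
        router.update(chosen, weighted_reward(*pool[chosen]))
    print(router.q)
```

In this sketch the router keeps exploring with probability epsilon for the whole session. Committing to a single LLM after convergence, as the abstract describes, would need an additional stopping rule that is not shown here.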
-
Smart City Defense Game: Strategic Resource Management during Socio-Cyber-Physical Attacks
Authors:
Dimitrios Sikeridis,
Michael Devetsikiotis
Abstract:
Ensuring public safety in a Smart City (SC) environment is a critical and increasingly complicated task due to the involvement of multiple agencies and the city's expansion across cyber and social layers. In this paper, we propose an extensive-form perfect-information game to model interactions and optimal city resource allocations when a Terrorist Organization (TO) performs attacks on multiple targets across two conceptual SC levels: a physical one and a cyber-social one. The Smart City Defense Game (SCDG) considers three players, each initially entitled to a specific finite budget. Two SC agencies, which have to defend their physical or social territories respectively, fight against a common enemy, the TO. Each layer consists of multiple targets, and the attack outcome depends on whether the resources allocated there by the associated agency exceed the TO's. Each player's utility is equal to the number of successfully defended targets. The two agencies are allowed to make budget transfers provided that it is beneficial for both. We completely characterize the Sub-game Perfect Nash Equilibrium (SPNE) of the SCDG, which consists of strategies for optimal resource exchanges between SC agencies and accounts for the TO's budget allocation across the physical and social targets. Also, we present numerical and comparative results demonstrating that when the SC players act according to the SPNE, they maximize the number of successfully defended targets. The SCDG is shown to be a promising solution for modeling critical resource allocations between SC parties in the face of multi-layer simultaneous terrorist attacks.
Submitted 27 January, 2022;
originally announced January 2022.
-
BLEBeacon: A Real-Subject Trial Dataset from Mobile Bluetooth Low Energy Beacons
Authors:
Dimitrios Sikeridis,
Ioannis Papapanagiotou,
Michael Devetsikiotis
Abstract:
The BLEBeacon dataset is a collection of Bluetooth Low Energy (BLE) advertisement packets/traces generated from BLE beacons carried by people following their daily routine inside a university building. A network of Raspberry Pi 3 (RPi)-based edge devices was deployed inside a multi-floor facility, continuously gathering BLE advertisement packets and storing them in a cloud-based environment. The data were collected during an IRB (Institutional Review Board for the Protection of Human Subjects in Research) approved one-month trial. Each facility occupant/participant was handed a BLE beacon to carry with them at all times. The focus is on presenting a real-life realization of a location-aware sensing infrastructure that can provide insights for smart sensing platforms, crowd-based applications, building management, and user-localization frameworks. This work describes and documents the published BLEBeacon dataset.
Submitted 9 May, 2019; v1 submitted 23 February, 2018;
originally announced February 2018.
-
A Comparative Taxonomy and Survey of Public Cloud Infrastructure Vendors
Authors:
Dimitrios Sikeridis,
Ioannis Papapanagiotou,
Bhaskar Prasad Rimal,
Michael Devetsikiotis
Abstract:
An increasing number of technology enterprises are adopting cloud-native architectures to offer their web-based products, by moving away from privately-owned data centers and relying exclusively on cloud service providers. As a result, the number of cloud vendors has lately increased, along with the estimated annual revenue they share. However, in the process of selecting a provider's cloud service over the competition, we observe a lack of universal common ground in terms of terminology, functionality of services, and billing models. This is an important gap, especially under the new reality of the industry, where each cloud provider has moved towards its own service taxonomy while the number of specialized services has grown exponentially. This work discusses cloud services offered by four cloud vendors that are dominant in terms of their current market share. We provide a taxonomy of their services and sub-services that designates major service families, namely computing, storage, databases, analytics, data pipelines, machine learning, and networking. The aim of such clustering is to indicate similarities, common design approaches, and functional differences of the offered services. The outcomes are essential both for individual researchers and for bigger enterprises in their attempt to identify the set of cloud services that will fully meet their needs without compromises. While we acknowledge that this is a dynamic industry, where new services arise constantly and old ones experience important updates, this study paints a solid image of the current offerings and gives prominence to the directions that cloud service providers are following.
Submitted 28 January, 2018; v1 submitted 4 October, 2017;
originally announced October 2017.