-
Multilinguality in LLM-Designed Reward Functions for Restless Bandits: Effects on Task Performance and Fairness
Authors:
Ambreesh Parthasarathy,
Chandrasekar Subramanian,
Ganesh Senrayan,
Shreyash Adappanavar,
Aparna Taneja,
Balaraman Ravindran,
Milind Tambe
Abstract:
Restless Multi-Armed Bandits (RMABs) have been successfully applied to resource allocation problems in a variety of settings, including public health. With the rapid development of powerful large language models (LLMs), they are increasingly used to design reward functions to better match human preferences. Recent work has shown that LLMs can be used to tailor automated allocation decisions to com…
▽ More
Restless Multi-Armed Bandits (RMABs) have been successfully applied to resource allocation problems in a variety of settings, including public health. With the rapid development of powerful large language models (LLMs), they are increasingly used to design reward functions to better match human preferences. Recent work has shown that LLMs can be used to tailor automated allocation decisions to community needs using language prompts. However, this has been studied primarily for English prompts and with a focus on task performance only. This can be an issue since grassroots workers, especially in developing countries like India, prefer to work in local languages, some of which are low-resource. Further, given the nature of the problem, biases along population groups unintended by the user are also undesirable. In this work, we study the effects on both task performance and fairness when the DLM algorithm, a recent work on using LLMs to design reward functions for RMABs, is prompted with non-English language commands. Specifically, we run the model on a synthetic environment for various prompts translated into multiple languages. The prompts themselves vary in complexity. Our results show that the LLM-proposed reward functions are significantly better when prompted in English compared to other languages. We also find that the exact phrasing of the prompt impacts task performance. Further, as prompt complexity increases, performance worsens for all languages; however, it is more robust with English prompts than with lower-resource languages. On the fairness side, we find that low-resource languages and more complex prompts are both highly likely to create unfairness along unintended dimensions.
△ Less
Submitted 20 January, 2025;
originally announced January 2025.
-
Participatory Approaches in AI Development and Governance: Case Studies
Authors:
Ambreesh Parthasarathy,
Aditya Phalnikar,
Gokul S Krishnan,
Ameen Jauhar,
Balaraman Ravindran
Abstract:
This paper forms the second of a two-part series on the value of a participatory approach to AI development and deployment. The first paper had crafted a principled, as well as pragmatic, justification for deploying participatory methods in these two exercises (that is, development and deployment of AI). The pragmatic justification is that it improves the quality of the overall algorithm by provid…
▽ More
This paper forms the second of a two-part series on the value of a participatory approach to AI development and deployment. The first paper had crafted a principled, as well as pragmatic, justification for deploying participatory methods in these two exercises (that is, development and deployment of AI). The pragmatic justification is that it improves the quality of the overall algorithm by providing more granular and minute information. The more principled justification is that it offers a voice to those who are going to be affected by the deployment of the algorithm, and through engagement attempts to build trust and buy-in for an AI system. By a participatory approach, we mean including various stakeholders (defined a certain way) in the actual decision making process through the life cycle of an AI system. Despite the justifications offered above, actual implementation depends crucially on how stakeholders in the entire process are identified, what information is elicited from them, and how it is incorporated. This paper will test these preliminary conclusions in two sectors, the use of facial recognition technology in the upkeep of law and order and the use of large language models in the healthcare sector. These sectors have been chosen for two primary reasons. Since Facial Recognition Technologies are a branch of AI solutions that are well-researched and the impact of which is well documented, it provides an established space to illustrate the various aspects of adapting PAI to an existing domain, especially one that has been quite contentious in the recent past. LLMs in healthcare provide a canvas for a relatively less explored space, and helps us illustrate how one could possibly envision enshrining the principles of PAI for a relatively new technology, in a space where innovation must always align with patient welfare.
△ Less
Submitted 3 June, 2024;
originally announced July 2024.
-
Participatory Approaches in AI Development and Governance: A Principled Approach
Authors:
Ambreesh Parthasarathy,
Aditya Phalnikar,
Ameen Jauhar,
Dhruv Somayajula,
Gokul S Krishnan,
Balaraman Ravindran
Abstract:
The widespread adoption of Artificial Intelligence (AI) technologies in the public and private sectors has resulted in them significantly impacting the lives of people in new and unexpected ways. In this context, it becomes important to inquire how their design, development and deployment takes place. Upon this inquiry, it is seen that persons who will be impacted by the deployment of these system…
▽ More
The widespread adoption of Artificial Intelligence (AI) technologies in the public and private sectors has resulted in them significantly impacting the lives of people in new and unexpected ways. In this context, it becomes important to inquire how their design, development and deployment takes place. Upon this inquiry, it is seen that persons who will be impacted by the deployment of these systems have little to no say in how they are developed. Seeing this as a lacuna, this research study advances the premise that a participatory approach is beneficial (both practically and normatively) to building and using more responsible, safe, and human-centric AI systems. Normatively, it enhances the fairness of the process and empowers citizens in voicing concerns to systems that may heavily impact their lives. Practically, it provides developers with new avenues of information which will be beneficial to them in improving the quality of the AI algorithm. The paper advances this argument first, by describing the life cycle of an AI system; second, by identifying criteria which may be used to identify relevant stakeholders for a participatory exercise; and third, by mapping relevant stakeholders to different stages of AI lifecycle. This paper forms the first part of a two-part series on participatory governance in AI. The second paper will expand upon and concretise the principles developed in this paper and apply the same to actual use cases of AI systems.
△ Less
Submitted 3 June, 2024;
originally announced July 2024.
-
Partitioning and Deployment of Deep Neural Networks on Edge Clusters
Authors:
Arjun Parthasarathy,
Bhaskar Krishnamachari
Abstract:
Edge inference has become more widespread, as its diverse applications range from retail to wearable technology. Clusters of networked resource-constrained edge devices are becoming common, yet no system exists to split a DNN across these clusters while maximizing the inference throughput of the system. Additionally, no production-ready orchestration system exists for deploying said models over su…
▽ More
Edge inference has become more widespread, as its diverse applications range from retail to wearable technology. Clusters of networked resource-constrained edge devices are becoming common, yet no system exists to split a DNN across these clusters while maximizing the inference throughput of the system. Additionally, no production-ready orchestration system exists for deploying said models over such edge networks which adopts the robustness and scalability of the cloud. We present an algorithm which partitions DNNs and distributes them across a set of edge devices with the goal of minimizing the bottleneck latency and therefore maximizing inference throughput. The system scales well to systems of different node memory capacities and numbers of nodes, while being node fault-tolerant. We find that we can reduce the bottleneck latency by 10x over a random algorithm and 35% over a greedy joint partitioning-placement algorithm, although the joint-partitioning algorithm outperforms our algorithm in most practical use-cases. Furthermore we find empirically that for the set of representative models we tested, the algorithm produces results within 9.2% of the optimal bottleneck latency. We then developed a standalone cluster network emulator on which we tested configurations of up to 20 nodes and found a steady increase in throughput and decrease in end-to-end latency as the cluster size scales. In these tests, we observed that our system has multi-node fault-tolerance as well as network and system IO fault-tolerance. We have implemented our framework in open-source software that is publicly available to the research community at https://github.com/ANRGUSC/SEIFER.
△ Less
Submitted 24 April, 2023;
originally announced April 2023.
-
Partitioning and Placement of Deep Neural Networks on Distributed Edge Devices to Maximize Inference Throughput
Authors:
Arjun Parthasarathy,
Bhaskar Krishnamachari
Abstract:
Edge inference has become more widespread, as its diverse applications range from retail to wearable technology. Clusters of networked resource-constrained edge devices are becoming common, yet no system exists to split a DNN across these clusters while maximizing the inference throughput of the system. We present an algorithm which partitions DNNs and distributes them across a set of edge devices…
▽ More
Edge inference has become more widespread, as its diverse applications range from retail to wearable technology. Clusters of networked resource-constrained edge devices are becoming common, yet no system exists to split a DNN across these clusters while maximizing the inference throughput of the system. We present an algorithm which partitions DNNs and distributes them across a set of edge devices with the goal of minimizing the bottleneck latency and therefore maximizing inference throughput. The system scales well to systems of different node memory capacities and numbers of nodes. We find that we can reduce the bottleneck latency by 10x over a random algorithm and 35% over a greedy joint partitioning-placement algorithm. Furthermore we find empirically that for the set of representative models we tested, the algorithm produces results within 9.2% of the optimal bottleneck latency.
△ Less
Submitted 21 October, 2022;
originally announced October 2022.
-
SEIFER: Scalable Edge Inference for Deep Neural Networks
Authors:
Arjun Parthasarathy,
Bhaskar Krishnamachari
Abstract:
Edge inference is becoming ever prevalent through its applications from retail to wearable technology. Clusters of networked resource-constrained edge devices are becoming common, yet there is no production-ready orchestration system for deploying deep learning models over such edge networks which adopts the robustness and scalability of the cloud. We present SEIFER, a framework utilizing a standa…
▽ More
Edge inference is becoming ever prevalent through its applications from retail to wearable technology. Clusters of networked resource-constrained edge devices are becoming common, yet there is no production-ready orchestration system for deploying deep learning models over such edge networks which adopts the robustness and scalability of the cloud. We present SEIFER, a framework utilizing a standalone Kubernetes cluster to partition a given DNN and place these partitions in a distributed manner across an edge network, with the goal of maximizing inference throughput. The system is node fault-tolerant and automatically updates deployments based on updates to the model's version. We provide a preliminary evaluation of a partitioning and placement algorithm that works within this framework, and show that we can improve the inference pipeline throughput by 200% by utilizing sufficient numbers of resource-constrained nodes. We have implemented SEIFER in open-source software that is publicly available to the research community.
△ Less
Submitted 17 November, 2022; v1 submitted 21 October, 2022;
originally announced October 2022.
-
DEFER: Distributed Edge Inference for Deep Neural Networks
Authors:
Arjun Parthasarathy,
Bhaskar Krishnamachari
Abstract:
Modern machine learning tools such as deep neural networks (DNNs) are playing a revolutionary role in many fields such as natural language processing, computer vision, and the internet of things. Once they are trained, deep learning models can be deployed on edge computers to perform classification and prediction on real-time data for these applications. Particularly for large models, the limited…
▽ More
Modern machine learning tools such as deep neural networks (DNNs) are playing a revolutionary role in many fields such as natural language processing, computer vision, and the internet of things. Once they are trained, deep learning models can be deployed on edge computers to perform classification and prediction on real-time data for these applications. Particularly for large models, the limited computational and memory resources on a single edge device can become the throughput bottleneck for an inference pipeline. To increase throughput and decrease per-device compute load, we present DEFER (Distributed Edge inFERence), a framework for distributed edge inference, which partitions deep neural networks into layers that can be spread across multiple compute nodes. The architecture consists of a single "dispatcher" node to distribute DNN partitions and inference data to respective compute nodes. The compute nodes are connected in a series pattern where each node's computed result is relayed to the subsequent node. The result is then returned to the Dispatcher. We quantify the throughput, energy consumption, network payload, and overhead for our framework under realistic network conditions using the CORE network emulator. We find that for the ResNet50 model, the inference throughput of DEFER with 8 compute nodes is 53% higher and per node energy consumption is 63% lower than single device inference. We further reduce network communication demands and energy consumption using the ZFP serialization and LZ4 compression algorithms. We have implemented DEFER in Python using the TensorFlow and Keras ML libraries, and have released DEFER as an open-source framework to benefit the research community.
△ Less
Submitted 18 January, 2022;
originally announced January 2022.
-
The Coming Age of Pervasive Data Processing
Authors:
Jan S. Rellermeyer,
Sobhan Omranian Khorasani,
Dan Graur,
Apourva Parthasarathy
Abstract:
Emerging Big Data analytics and machine learning applications require a significant amount of computational power. While there exists a plethora of large-scale data processing frameworks which thrive in handling the various complexities of data-intensive workloads, the ever-increasing demand of applications have made us reconsider the traditional ways of scaling (e.g., scale-out) and seek new oppo…
▽ More
Emerging Big Data analytics and machine learning applications require a significant amount of computational power. While there exists a plethora of large-scale data processing frameworks which thrive in handling the various complexities of data-intensive workloads, the ever-increasing demand of applications have made us reconsider the traditional ways of scaling (e.g., scale-out) and seek new opportunities for improving the performance. In order to prepare for an era where data collection and processing occur on a wide range of devices, from powerful HPC machines to small embedded devices, it is crucial to investigate and eliminate the potential sources of inefficiency in the current state of the art platforms. In this paper, we address the current and upcoming challenges of pervasive data processing and present directions for designing the next generation of large-scale data processing systems.
△ Less
Submitted 21 June, 2019;
originally announced June 2019.
-
Solving the stochastic Landau-Lifshitz-Gilbert-Slonczewski equation for monodomain nanomagnets : A survey and analysis of numerical techniques
Authors:
Sebastian Ament,
Nikhil Rangarajan,
Arun Parthasarathy,
Shaloo Rakheja
Abstract:
The stochastic Landau-Lifshitz-Gilbert-Slonczewski (s-LLGS) equation is widely used to study the temporal evolution of the macrospin subject to spin torque and thermal noise. The numerical simulation of the s-LLGS equation requires an appropriate choice of stochastic calculus and numerical integration scheme. In this paper, we comprehensively evaluate the accuracy and complexity of various numeric…
▽ More
The stochastic Landau-Lifshitz-Gilbert-Slonczewski (s-LLGS) equation is widely used to study the temporal evolution of the macrospin subject to spin torque and thermal noise. The numerical simulation of the s-LLGS equation requires an appropriate choice of stochastic calculus and numerical integration scheme. In this paper, we comprehensively evaluate the accuracy and complexity of various numerical techniques to solve the s-LLGS equation. We focus on implicit midpoint, Heun, and Euler-Heun methods that converge to the Stratonovich solution of the s-LLGS equation. By performing numerical tests for both strong (path-wise) and weak (statistical) convergence, we quantify the accuracy of various numerical schemes used to solve the s-LLGS equation. We demonstrate a new method intended to solve Stochastic Differential Equations (SDEs) with small noise (RK4-Heun), and test its capability to handle the s-LLGS equation. We also discuss the circuit implementation of nanomagnets for large-scale SPICE-based simulations. We evaluate the efficacy of SPICE in handling the stochastic dynamics of the multiplicative noise in the s-LLGS equation. Numerical schemes such as Euler and Gear, typically used by SPICE-based circuit simulators do not yield the expected outcome when solving the Stratonovich s-LLGS equation. While the trapezoidal method in SPICE does solve for the Stratonovich solution, its accuracy is limited by the minimum time step of integration in SPICE. We implement the s-LLGS equation in both its cartesian and spherical coordinates form in SPICE and compare the stability and accuracy of the two implementations. The results in this paper will serve as guidelines for researchers to understand the tradeoffs between accuracy and complexity of various numerical methods and the choice of appropriate calculus to solve the s-LLGS equation.
△ Less
Submitted 16 January, 2017; v1 submitted 15 July, 2016;
originally announced July 2016.
-
Improved Content Based Image Watermarking
Authors:
Arvind Parthasarathy
Abstract:
This paper presents a robust and transparent scheme of watermarking that exploits the human visual systems' sensitivity to frequency, along with local image characteristics obtained from the spatial domain. The underlying idea is generating a visual mask based on the visual systems' perception of image content. This mask is used to embed a decimal sequence while keeping its amplitude below the d…
▽ More
This paper presents a robust and transparent scheme of watermarking that exploits the human visual systems' sensitivity to frequency, along with local image characteristics obtained from the spatial domain. The underlying idea is generating a visual mask based on the visual systems' perception of image content. This mask is used to embed a decimal sequence while keeping its amplitude below the distortion sensitivity of the image pixel. We consider texture, luminance, corner and the edge information in the image to generate a mask that makes the addition of the watermark imperceptible to the human eye.
△ Less
Submitted 25 August, 2006;
originally announced August 2006.