-
A Case for Kolmogorov-Arnold Networks in Prefetching: Towards Low-Latency, Generalizable ML-Based Prefetchers
Authors:
Dhruv Kulkarni,
Bharat Bhammar,
Henil Thaker,
Pranav Dhobi,
R. P. Gohil,
Sai Manoj Pudukotai Dinkarrao
Abstract:
The memory wall problem arises from the disparity between fast processors and slower memory, causing significant delays in data access, even more so on edge devices. Data prefetching is a key strategy to address this, with traditional methods evolving to incorporate Machine Learning (ML) for improved accuracy. Modern prefetchers must balance high accuracy with low latency to be practical. We explore the applicability of Kolmogorov-Arnold Networks (KANs), which use learnable activation functions, through a prefetcher we implemented called KANBoost. KANs are novel, state-of-the-art models that break down continuous, bounded multivariate functions into functions of their constituent variables and use these constituent functions as activations on each individual neuron. Instead of relying on address-correlation prefetching, KANBoost predicts the next memory access by modeling deltas between consecutive addresses, offering a balance of accuracy and efficiency to mitigate the memory wall problem with minimal overhead. Initial results indicate that KAN-based prefetching reduces inference latency (18X lower than state-of-the-art ML prefetchers) while achieving moderate IPC improvements (2.5% over no-prefetching). While KANs still face challenges in capturing long-term dependencies, we propose that future research should explore hybrid models that combine KAN efficiency with stronger sequence modeling techniques, paving the way for practical ML-based prefetching in edge devices and beyond.
Submitted 12 April, 2025;
originally announced April 2025.
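The delta formulation mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names are hypothetical, and the "repeat the most frequent recent delta" rule is an assumed placeholder standing in for the KAN model that KANBoost actually uses.

```python
from collections import Counter


def address_deltas(addresses):
    """Return the differences between consecutive addresses in a trace."""
    return [b - a for a, b in zip(addresses, addresses[1:])]


def predict_next_address(addresses, history=4):
    """Predict the next address as the last address plus a predicted delta.

    Assumption: the predictor simply picks the most frequent delta among
    the last `history` deltas. A learned model would replace this step.
    """
    deltas = address_deltas(addresses)
    if not deltas:
        return None  # a single address gives no delta to learn from
    recent = deltas[-history:]
    most_common_delta, _count = Counter(recent).most_common(1)[0]
    return addresses[-1] + most_common_delta


# A strided access pattern: each access is 64 bytes after the previous one.
trace = [0x1000, 0x1040, 0x1080, 0x10C0]
print(address_deltas(trace))
print(hex(predict_next_address(trace)))
```

Working on deltas rather than raw addresses keeps the output space small, since a regular stride maps to a single repeated value regardless of where in memory the accesses occur.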
-
SProBench: Stream Processing Benchmark for High Performance Computing Infrastructure
Authors:
Apurv Deepak Kulkarni,
Siavash Ghiasvand
Abstract:
Recent advancements in data stream processing frameworks have improved real-time data handling; however, scalability remains a significant challenge affecting throughput and latency. While studies have explored this issue on local machines and cloud clusters, research on modern high performance computing (HPC) infrastructures is still limited due to the lack of scalable measurement tools. This work presents SProBench, a novel benchmark suite designed to evaluate the performance of data stream processing frameworks in large-scale computing systems. Building on best practices, SProBench incorporates a modular architecture, offers native support for SLURM-based clusters, and seamlessly integrates with popular stream processing frameworks such as Apache Flink, Apache Spark Streaming, and Apache Kafka Streams. Experiments conducted on HPC clusters demonstrate its exceptional scalability, delivering throughput that surpasses existing benchmarks by more than tenfold. The distinctive features of SProBench, including complete customization options, built-in automated experiment management tools, seamless interoperability, and an open-source license, distinguish it as an innovative benchmark suite tailored to meet the needs of modern data stream processing frameworks.
Submitted 3 April, 2025;
originally announced April 2025.
-
Fuzzy Convolution Neural Networks for Tabular Data Classification
Authors:
Arun D. Kulkarni
Abstract:
Recently, convolution neural networks (CNNs) have attracted a great deal of attention due to their remarkable performance in various domains, particularly in image and text classification tasks. However, their application to tabular data classification remains underexplored. There are many fields, such as bioinformatics, finance, and medicine, where nonimage data are prevalent. Adaptation of CNNs to classify nonimage data remains highly challenging. This paper investigates the efficacy of CNNs for tabular data classification, aiming to bridge the gap between traditional machine learning approaches and deep learning techniques. We propose a novel framework, the fuzzy convolution neural network (FCNN), tailored specifically for tabular data to capture local patterns within feature vectors. In our approach, we map feature values to fuzzy memberships. The fuzzy membership vectors are converted into images that are used to train the CNN model. The trained CNN model is used to classify unknown feature vectors. To validate our approach, we generated six complex noisy data sets. We used a randomly selected seventy percent of the samples from each data set for training and thirty percent for testing. The data sets were also classified using state-of-the-art machine learning algorithms such as the decision tree (DT), support vector machine (SVM), fuzzy neural network (FNN), Bayes classifier, and Random Forest (RF). Experimental results demonstrate that our proposed model can effectively learn meaningful representations from tabular data, achieving competitive or superior performance compared to existing methods. Overall, our findings suggest that the proposed FCNN model holds promise as a viable alternative for tabular data classification tasks, offering a fresh perspective and potentially unlocking new opportunities for leveraging deep learning in structured data analysis.
Submitted 14 October, 2024; v1 submitted 4 June, 2024;
originally announced June 2024.
-
On Decentralizing Federated Reinforcement Learning in Multi-Robot Scenarios
Authors:
Jayprakash S. Nair,
Divya D. Kulkarni,
Ajitem Joshi,
Sruthy Suresh
Abstract:
Federated Learning (FL) allows for collaboratively aggregating learned information across several computing devices and sharing it amongst them, thereby tackling issues of privacy and the need for huge bandwidth. FL techniques generally use a central server or cloud for aggregating the models received from the devices. Such centralized FL techniques suffer from inherent problems such as failure of the central node and bottlenecks in channel bandwidth. When FL is used in conjunction with connected robots serving as devices, a failure of the central controlling entity can lead to a chaotic situation. This paper describes a mobile agent based paradigm to decentralize FL in multi-robot scenarios. Using Webots, a popular free open-source robot simulator, and Tartarus, a mobile agent platform, we present a methodology to decentralize federated learning in a set of connected robots. With Webots running on different connected computing systems, we show how mobile agents can perform the task of Decentralized Federated Reinforcement Learning (dFRL). Results obtained from experiments carried out using Q-learning and SARSA, by aggregating their corresponding Q-tables, show the viability of using decentralized FL in the domain of robotics. Since the proposed work can be used in conjunction with other learning algorithms and also real robots, it can act as a vital tool for the study of decentralized FL using heterogeneous learning algorithms concurrently in multi-robot scenarios.
Submitted 7 September, 2022; v1 submitted 19 July, 2022;
originally announced July 2022.
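The abstract above mentions aggregating Q-tables across robots but does not state the aggregation rule. The sketch below assumes a plain element-wise mean over the entries each robot has learned; the function name, the dictionary layout keyed by (state, action), and the averaging rule itself are all assumptions made for illustration, not the method of the paper.

```python
def aggregate_q_tables(q_tables):
    """Merge Q-tables by averaging each (state, action) entry.

    Assumption: an entry missing from a robot's table is ignored rather
    than treated as zero, so only robots that visited it contribute.
    """
    keys = set().union(*(q.keys() for q in q_tables))
    merged = {}
    for key in keys:
        values = [q[key] for q in q_tables if key in q]
        merged[key] = sum(values) / len(values)
    return merged


# Two robots with partially overlapping experience.
robot_a = {("s0", "left"): 1.0, ("s0", "right"): 0.0}
robot_b = {("s0", "left"): 3.0, ("s1", "left"): 2.0}

shared = aggregate_q_tables([robot_a, robot_b])
print(shared[("s0", "left")])
```

In a decentralized setting the same function could be applied by whichever node currently holds the visiting tables, so no single central server is required.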
-
Employee Attrition Prediction
Authors:
Rahul Yedida,
Rahul Reddy,
Rakshit Vahi,
Rahul Jana,
Abhilash GV,
Deepti Kulkarni
Abstract:
We aim to predict whether an employee of a company will leave or not, using the k-Nearest Neighbors algorithm. We use evaluation of employee performance, average monthly hours at work and number of years spent in the company, among others, as our features. Other approaches to this problem include the use of ANNs, decision trees and logistic regression. The dataset was split, using 70% for training the algorithm and 30% for testing it, achieving an accuracy of 94.32%.
Submitted 19 June, 2018;
originally announced June 2018.
-
On an Immuno-inspired Distributed, Embodied Action-Evolution cum Selection Algorithm
Authors:
Tushar Semwal,
Divya D Kulkarni,
Shivashankar B. Nair
Abstract:
Traditional Evolutionary Robotics (ER) employs evolutionary techniques to search for a single monolithic controller which can aid a robot to learn a desired task. These techniques suffer from bootstrap and deception issues when the tasks are too complex for a single controller to learn. Behaviour-decomposition techniques have been used to divide a task into multiple subtasks and evolve separate subcontrollers for each subtask. However, these subcontrollers and the associated subcontroller arbitrator(s) are all evolved off-line. A distributed, fully embodied and evolutionary version of such approaches will greatly aid online learning and help reduce the reality gap. In this paper, we propose an immunology-inspired embodied action-evolution cum selection algorithm that can cater to distributed ER. This algorithm evolves different subcontrollers for different portions of the search space in a distributed manner just as antibodies are evolved and primed for different antigens in the antigenic space. Experimentation on a collective of real robots embodied with the algorithm showed that a repertoire of antibody-like subcontrollers was created, evolved and shared on-the-fly to cope with different environmental conditions. In addition, instead of the conventionally used approach of broadcasting for sharing, we present an Intelligent Packet Migration scheme that reduces energy consumption.
Submitted 26 June, 2018;
originally announced June 2018.
-
Self-Supervised Intrinsic Image Decomposition
Authors:
Michael Janner,
Jiajun Wu,
Tejas D. Kulkarni,
Ilker Yildirim,
Joshua B. Tenenbaum
Abstract:
Intrinsic decomposition from a single image is a highly challenging task, due to its inherent ambiguity and the scarcity of training data. In contrast to traditional fully supervised learning approaches, in this paper we propose learning intrinsic image decomposition by explaining the input image. Our model, the Rendered Intrinsics Network (RIN), joins together an image decomposition pipeline, which predicts reflectance, shape, and lighting conditions given a single image, with a recombination function, a learned shading model used to recompose the original input based off of intrinsic image predictions. Our network can then use unsupervised reconstruction error as an additional signal to improve its intermediate representations. This allows large-scale unlabeled data to be useful during training, and also enables transferring learned knowledge to images of unseen object categories, lighting conditions, and shapes. Extensive experiments demonstrate that our method performs well on both intrinsic image decomposition and knowledge transfer.
Submitted 5 February, 2018; v1 submitted 9 November, 2017;
originally announced November 2017.
-
Learning to Perform Physics Experiments via Deep Reinforcement Learning
Authors:
Misha Denil,
Pulkit Agrawal,
Tejas D Kulkarni,
Tom Erez,
Peter Battaglia,
Nando de Freitas
Abstract:
When encountering novel objects, humans are able to infer a wide range of physical properties such as mass, friction and deformability by interacting with them in a goal-driven way. This process of active interaction is in the same spirit as a scientist performing experiments to discover hidden facts. Recent advances in artificial intelligence have yielded machines that can achieve superhuman performance in Go, Atari, natural language processing, and complex control problems; however, it is not clear that these systems can rival the scientific intuition of even a young child. In this work we introduce a basic set of tasks that require agents to estimate properties such as mass and cohesion of objects in an interactive simulated environment where they can manipulate the objects and observe the consequences. We found that state-of-the-art deep reinforcement learning methods can learn to perform the experiments necessary to discover such hidden properties. By systematically manipulating the problem difficulty and the cost incurred by the agent for performing experiments, we found that agents learn different strategies that balance the cost of gathering information against the cost of making mistakes in different situations.
Submitted 17 August, 2017; v1 submitted 6 November, 2016;
originally announced November 2016.
-
Deep Successor Reinforcement Learning
Authors:
Tejas D. Kulkarni,
Ardavan Saeedi,
Simanta Gautam,
Samuel J. Gershman
Abstract:
Learning robust value functions given raw observations and rewards is now possible with model-free and model-based deep reinforcement learning algorithms. There is a third alternative, called Successor Representations (SR), which decomposes the value function into two components -- a reward predictor and a successor map. The successor map represents the expected future state occupancy from any given state and the reward predictor maps states to scalar rewards. The value function of a state can be computed as the inner product between the successor map and the reward weights. In this paper, we present DSR, which generalizes SR within an end-to-end deep reinforcement learning framework. DSR has several appealing properties including: increased sensitivity to distal reward changes due to factorization of reward and world dynamics, and the ability to extract bottleneck states (subgoals) given successor maps trained under a random policy. We show the efficacy of our approach on two diverse environments given raw pixel observations -- simple grid-world domains (MazeBase) and the Doom game engine.
Submitted 8 June, 2016;
originally announced June 2016.
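The statement in the abstract above that a state's value is the inner product of the successor map and the reward weights can be checked on a tiny tabular example. The sketch below is an assumption-laden illustration, not DSR itself: it uses a hand-written two-state transition matrix and a truncated series for the successor map, whereas DSR learns these quantities with deep networks from raw observations. All names are hypothetical.

```python
def successor_map(P, gamma, iters=200):
    """Approximate M = sum over t of (gamma^t * P^t) for a transition matrix P.

    M[s][j] is the expected discounted number of visits to state j when
    starting from state s (the start state itself counts as one visit).
    """
    n = len(P)
    M = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in M]  # holds gamma^t * P^t, starting at t = 0
    for _ in range(iters):
        term = [[gamma * sum(term[i][k] * P[k][j] for k in range(n))
                 for j in range(n)] for i in range(n)]
        M = [[M[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return M


def value(M, r, s):
    """Value of state s as the inner product of its successor row and r."""
    return sum(m * w for m, w in zip(M[s], r))


# Assumed toy chain: state 0 always moves to state 1; state 1 is absorbing.
P = [[0.0, 1.0],
     [0.0, 1.0]]
r = [0.0, 1.0]   # reward 1 for every step spent in state 1
gamma = 0.5

M = successor_map(P, gamma)
print(value(M, r, 0), value(M, r, 1))
```

Because the reward vector enters only through the final inner product, changing `r` changes the values without recomputing `M`, which is the factorization of reward and world dynamics the abstract refers to.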
-
Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation
Authors:
Tejas D. Kulkarni,
Karthik R. Narasimhan,
Ardavan Saeedi,
Joshua B. Tenenbaum
Abstract:
Learning goal-directed behavior in environments with sparse feedback is a major challenge for reinforcement learning algorithms. The primary difficulty arises due to insufficient exploration, resulting in an agent being unable to learn robust value functions. Intrinsically motivated agents can explore new behavior for its own sake rather than to directly solve problems. Such intrinsic behaviors could eventually help the agent solve tasks posed by the environment. We present hierarchical-DQN (h-DQN), a framework to integrate hierarchical value functions, operating at different temporal scales, with intrinsically motivated deep reinforcement learning. A top-level value function learns a policy over intrinsic goals, and a lower-level function learns a policy over atomic actions to satisfy the given goals. h-DQN allows for flexible goal specifications, such as functions over entities and relations. This provides an efficient space for exploration in complicated environments. We demonstrate the strength of our approach on two problems with very sparse, delayed feedback: (1) a complex discrete stochastic decision process, and (2) the classic ATARI game 'Montezuma's Revenge'.
Submitted 31 May, 2016; v1 submitted 20 April, 2016;
originally announced April 2016.
-
Speech Controlled Quadruped
Authors:
Devashish Kulkarni,
Sagar Paldhe,
Vinod Kamat
Abstract:
This project is based on voice recognition: we aim to create a four-legged robot that can acknowledge instructions given through vocal commands and perform the corresponding tasks. The main processing unit of the robot is an Arduino Uno. We use 8 servos to move its legs, with two servos required for each leg. The interface between the human and the robot is built with Python programming and the Eclipse software, and it is implemented using the Bluetooth module HC-06.
Submitted 24 June, 2015;
originally announced June 2015.
-
Deep Convolutional Inverse Graphics Network
Authors:
Tejas D. Kulkarni,
Will Whitney,
Pushmeet Kohli,
Joshua B. Tenenbaum
Abstract:
This paper presents the Deep Convolution Inverse Graphics Network (DC-IGN), a model that learns an interpretable representation of images. This representation is disentangled with respect to transformations such as out-of-plane rotations and lighting variations. The DC-IGN model is composed of multiple layers of convolution and de-convolution operators and is trained using the Stochastic Gradient Variational Bayes (SGVB) algorithm. We propose a training procedure to encourage neurons in the graphics code layer to represent a specific transformation (e.g. pose or light). Given a single input image, our model can generate new images of the same object with variations in pose and lighting. We present qualitative and quantitative results of the model's efficacy at learning a 3D rendering engine.
Submitted 21 June, 2015; v1 submitted 11 March, 2015;
originally announced March 2015.
-
Inverse Graphics with Probabilistic CAD Models
Authors:
Tejas D. Kulkarni,
Vikash K. Mansinghka,
Pushmeet Kohli,
Joshua B. Tenenbaum
Abstract:
Recently, multiple formulations of vision problems as probabilistic inversions of generative models based on computer graphics have been proposed. However, applications to 3D perception from natural images have focused on low-dimensional latent scenes, due to challenges in both modeling and inference. Accounting for the enormous variability in 3D object shape and 2D appearance via realistic generative models seems intractable, as does inverting even simple versions of the many-to-many computations that link 3D scenes to 2D images. This paper proposes and evaluates an approach that addresses key aspects of both these challenges. We show that it is possible to solve challenging, real-world 3D vision problems by approximate inference in generative models for images based on rendering the outputs of probabilistic CAD (PCAD) programs. Our PCAD object geometry priors generate deformable 3D meshes corresponding to plausible objects and apply affine transformations to place them in a scene. Image likelihoods are based on similarity in a feature space based on standard mid-level image representations from the vision literature. Our inference algorithm integrates single-site and locally blocked Metropolis-Hastings proposals, Hamiltonian Monte Carlo and discriminative data-driven proposals learned from training data generated from our models. We apply this approach to 3D human pose estimation and object shape reconstruction from single images, achieving quantitative and qualitative performance improvements over state-of-the-art baselines.
Submitted 4 July, 2014;
originally announced July 2014.
-
Variational Particle Approximations
Authors:
Ardavan Saeedi,
Tejas D Kulkarni,
Vikash Mansinghka,
Samuel Gershman
Abstract:
Approximate inference in high-dimensional, discrete probabilistic models is a central problem in computational statistics and machine learning. This paper describes discrete particle variational inference (DPVI), a new approach that combines key strengths of Monte Carlo, variational and search-based techniques. DPVI is based on a novel family of particle-based variational approximations that can be fit using simple, fast, deterministic search techniques. Like Monte Carlo, DPVI can handle multiple modes, and yields exact results in a well-defined limit. Like unstructured mean-field, DPVI is based on optimizing a lower bound on the partition function; when this quantity is not of intrinsic interest, it facilitates convergence assessment and debugging. Like both Monte Carlo and combinatorial search, DPVI can take advantage of factorization, sequential structure, and custom search operators. This paper defines the DPVI particle-based approximation family and partition function lower bounds, along with the sequential DPVI and local DPVI algorithm templates for optimizing them. DPVI is illustrated and evaluated via experiments on lattice Markov Random Fields, nonparametric Bayesian mixtures and block-models, and parametric as well as non-parametric hidden Markov models. Results include applications to real-world spike-sorting and relational modeling problems, and show that DPVI can offer appealing time/accuracy trade-offs as compared to multiple alternatives.
Submitted 5 December, 2015; v1 submitted 23 February, 2014;
originally announced February 2014.
-
Approximate Bayesian Image Interpretation using Generative Probabilistic Graphics Programs
Authors:
Vikash K. Mansinghka,
Tejas D. Kulkarni,
Yura N. Perov,
Joshua B. Tenenbaum
Abstract:
The idea of computer vision as the Bayesian inverse problem to computer graphics has a long history and an appealing elegance, but it has proved difficult to directly implement. Instead, most vision tasks are approached via complex bottom-up processing pipelines. Here we show that it is possible to write short, simple probabilistic graphics programs that define flexible generative models and to automatically invert them to interpret real-world images. Generative probabilistic graphics programs consist of a stochastic scene generator, a renderer based on graphics software, a stochastic likelihood model linking the renderer's output and the data, and latent variables that adjust the fidelity of the renderer and the tolerance of the likelihood model. Representations and algorithms from computer graphics, originally designed to produce high-quality images, are instead used as the deterministic backbone for highly approximate and stochastic generative models. This formulation combines probabilistic programming, computer graphics, and approximate Bayesian computation, and depends only on general-purpose, automatic inference techniques. We describe two applications: reading sequences of degraded and adversarially obscured alphanumeric characters, and inferring 3D road models from vehicle-mounted camera images. Each of the probabilistic graphics programs we present relies on under 20 lines of probabilistic code, and supports accurate, approximately Bayesian inferences about ambiguous real-world images.
Submitted 28 June, 2013;
originally announced July 2013.
-
Learning Context for Text Categorization
Authors:
Y. V. Haribhakta,
Dr. Parag Kulkarni
Abstract:
This paper describes our work on discovering context for text document categorization. The document categorization approach is derived from a combination of a learning paradigm known as relation extraction and a technique known as context discovery. We demonstrate the effectiveness of our categorization approach using the Reuters-21578 dataset and synthetic real world data from the sports domain. Our experimental results indicate that the learned context greatly improves the categorization performance as compared to traditional categorization approaches.
Submitted 9 December, 2011;
originally announced December 2011.
-
Finding Cliques of a Graph using Prime Numbers
Authors:
Dhananjay D. Kulkarni,
Shekhar Verma,
Prashant
Abstract:
This paper proposes a new algorithm for finding maximal cliques in simple undirected graphs using the theory of prime numbers. It uses a novel approach based on prime numbers to find cliques and ends with a discussion of the algorithm.
Submitted 18 January, 2007; v1 submitted 27 January, 2006;
originally announced January 2006.