-
Predictive AI with External Knowledge Infusion for Stocks
Authors:
Ambedkar Dukkipati,
Kawin Mayilvaghanan,
Naveen Kumar Pallekonda,
Sai Prakash Hadnoor,
Ranga Shaarad Ayyagari
Abstract:
Fluctuations in stock prices are influenced by a complex interplay of factors that go beyond mere historical data. These factors, themselves influenced by external forces, encompass inter-stock dynamics, broader economic factors, various government policy decisions, outbreaks of wars, etc. Furthermore, all of these factors are dynamic and exhibit changes over time. In this paper, for the first time, we tackle the forecasting problem under external influence by proposing learning mechanisms that not only learn from historical trends but also incorporate external knowledge from temporal knowledge graphs. Since there are no such datasets or temporal knowledge graphs available, we study this problem with stock market data, and we construct comprehensive temporal knowledge graph datasets. In our proposed approach, we model relations on external temporal knowledge graphs as events of a Hawkes process on graphs. With extensive experiments, we show that learned dynamic representations effectively rank stocks based on returns across multiple holding periods, outperforming related baselines on relevant metrics.
Submitted 14 April, 2025;
originally announced April 2025.
-
Active Reinforcement Learning Strategies for Offline Policy Improvement
Authors:
Ambedkar Dukkipati,
Ranga Shaarad Ayyagari,
Bodhisattwa Dasgupta,
Parag Dutta,
Prabhas Reddy Onteru
Abstract:
Learning agents that excel at sequential decision-making tasks must continuously resolve the problem of exploration and exploitation for optimal learning. However, such interactions with the environment online might be prohibitively expensive and may involve some constraints, such as a limited budget for agent-environment interactions and restricted exploration in certain regions of the state space. Examples include selecting candidates for medical trials and training agents in complex navigation environments. This problem necessitates the study of active reinforcement learning strategies that collect minimal additional experience trajectories by reusing existing offline data previously collected by some unknown behavior policy. In this work, we propose an active reinforcement learning method capable of collecting trajectories that can augment existing offline data. With extensive experimentation, we demonstrate that our proposed method reduces additional online interaction with the environment by up to 75% over competitive baselines across various continuous control environments such as Gym-MuJoCo locomotion environments as well as Maze2d, AntMaze, CARLA and IsaacSimGo1. To the best of our knowledge, this is the first work that addresses the active learning problem in the context of sequential decision-making and reinforcement learning.
Submitted 26 December, 2024; v1 submitted 17 December, 2024;
originally announced December 2024.
-
Temporal Abstraction in Reinforcement Learning with Offline Data
Authors:
Ranga Shaarad Ayyagari,
Anurita Ghosh,
Ambedkar Dukkipati
Abstract:
Standard reinforcement learning algorithms with a single policy perform poorly on tasks in complex environments involving sparse rewards, diverse behaviors, or long-term planning. This led to the study of algorithms that incorporate temporal abstraction by training a hierarchy of policies that plan over different time scales. The options framework was introduced to implement such temporal abstraction by learning low-level options that act as extended actions controlled by a high-level policy. The main challenge in applying these algorithms to real-world problems is that they suffer from high sample complexity to train multiple levels of the hierarchy, which is impossible in online settings. Motivated by this, in this paper, we propose an offline hierarchical RL method that can learn options from existing offline datasets collected by other unknown agents. This is a very challenging problem due to the distribution mismatch between the learned options and the policies responsible for the offline dataset, and to our knowledge, this is the first work in this direction. In this work, we propose a framework by which an online hierarchical reinforcement learning algorithm can be trained on an offline dataset of transitions collected by an unknown behavior policy. We validate our method on Gym MuJoCo locomotion environments and robotic gripper block-stacking tasks in the standard as well as transfer and goal-conditioned settings.
Submitted 21 July, 2024;
originally announced July 2024.
-
Label Noise Robustness for Domain-Agnostic Fair Corrections via Nearest Neighbors Label Spreading
Authors:
Nathan Stromberg,
Rohan Ayyagari,
Sanmi Koyejo,
Richard Nock,
Lalitha Sankar
Abstract:
Last-layer retraining methods have emerged as an efficient framework for correcting existing base models. Within this framework, several methods have been proposed to deal with correcting models for subgroup fairness with and without group membership information. Importantly, prior work has demonstrated that many methods are susceptible to noisy labels. To this end, we propose a drop-in correction for label noise in last-layer retraining, and demonstrate that it achieves state-of-the-art worst-group accuracy for a broad range of symmetric label noise and across a wide variety of datasets exhibiting spurious correlations. Our proposed approach uses label spreading on a latent nearest neighbors graph and has minimal computational overhead compared to existing methods.
Submitted 13 June, 2024;
originally announced June 2024.
-
Robustness to Subpopulation Shift with Domain Label Noise via Regularized Annotation of Domains
Authors:
Nathan Stromberg,
Rohan Ayyagari,
Monica Welfert,
Sanmi Koyejo,
Richard Nock,
Lalitha Sankar
Abstract:
Existing methods for last layer retraining that aim to optimize worst-group accuracy (WGA) rely heavily on well-annotated groups in the training data. We show, both in theory and practice, that annotation-based data augmentations using either downsampling or upweighting for WGA are susceptible to domain annotation noise, and in high-noise regimes approach the WGA of a model trained with vanilla empirical risk minimization. We introduce Regularized Annotation of Domains (RAD) in order to train robust last layer classifiers without the need for explicit domain annotations. Our results show that RAD is competitive with other recently proposed domain annotation-free techniques. Most importantly, RAD outperforms state-of-the-art annotation-reliant methods even with only 5% noise in the training data for several publicly available datasets.
Submitted 26 June, 2024; v1 submitted 16 February, 2024;
originally announced February 2024.
-
Markov Decision Processes under External Temporal Processes
Authors:
Ranga Shaarad Ayyagari,
Ambedkar Dukkipati
Abstract:
Most reinforcement learning algorithms treat the context under which they operate as a stationary, isolated, and undisturbed environment. However, in real world applications, environments constantly change due to a variety of external events. To address this problem, we study Markov Decision Processes (MDP) under the influence of an external temporal process. First, we formalize this notion and derive conditions under which the problem becomes tractable with suitable solutions. We propose a policy iteration algorithm to solve this problem and theoretically analyze its performance. Our analysis addresses the non-stationarity present in the MDP as a result of non-Markovian events, necessitating the formulation of policies that are contingent upon both the current state and a history of prior events. Additionally, we derive insights regarding the sample complexity of the algorithm and incorporate factors that define the exogenous temporal process into the established bounds. Finally, we perform experiments to demonstrate our findings within a traditional control environment.
Submitted 10 October, 2024; v1 submitted 25 May, 2023;
originally announced May 2023.
-
Learning Skills to Navigate without a Master: A Sequential Multi-Policy Reinforcement Learning Algorithm
Authors:
Ambedkar Dukkipati,
Rajarshi Banerjee,
Ranga Shaarad Ayyagari,
Dhaval Parmar Udaybhai
Abstract:
Solving complex problems using reinforcement learning necessitates breaking down the problem into manageable tasks and learning policies to solve these tasks. These policies, in turn, have to be controlled by a master policy that takes high-level decisions. Hence learning policies involves hierarchical decision structures. However, training such methods in practice may lead to poor generalization, with either sub-policies executing actions for too few time steps or devolving into a single policy altogether. In our work, we introduce an alternative approach to learn such skills sequentially without using an overarching hierarchical policy. We propose this method in the context of environments where a major component of the objective of a learning agent is to prolong the episode for as long as possible. We refer to our proposed method as Sequential Soft Option Critic. We demonstrate the utility of our approach on navigation and goal-based tasks in a flexible simulated 3D navigation environment that we have developed. We also show that our method outperforms prior methods such as Soft Actor-Critic and Soft Option Critic on various environments, including the Atari River Raid environment and the Gym-Duckietown self-driving car simulator.
Submitted 7 August, 2022; v1 submitted 30 January, 2021;
originally announced February 2021.
-
Cache Contention on Multicore Systems: An Ontology-based Approach
Authors:
Maruthi Rohit Ayyagari
Abstract:
Multicore processors have proved to be the right choice for both desktop and server systems because they can support high performance at an acceptable cost. In this work, we compare several studies of cache contention and find that they identify several factors in cache contention other than cache size, including the FSB, the memory controller, and prefetching hardware. We find that Distributed Intensity Online (DIO) is a very promising cache contention algorithm, since it can come within 2% of the optimal technique. Moreover, we propose a new framework for cache contention based on resource ontologies, in which ontology instances are used for communication between diverse processes instead of grasping schedules based on hardware.
Submitted 3 June, 2019;
originally announced June 2019.
-
Integrating Association Rules with Decision Trees in Object-Relational Databases
Authors:
Maruthi Rohit Ayyagari
Abstract:
Research has provided evidence that associative classification produces more accurate results than other classification models. Classification Based on Association (CBA) is a well-known associative classification algorithm that generates accurate classifiers. However, current associative classification algorithms reside outside databases, which reduces the flexibility of enterprise analytics systems. This paper implements CBA in the Oracle database using two variant models: hardcoding CBA in the Oracle Data Mining (ODM) package, and integrating the Oracle Apriori model with the Oracle Decision Tree model. We compared the performance of the proposed models with Naive Bayes, Support Vector Machine, Random Forests, and Decision Tree over 18 datasets from UCI. Results showed that our models outperformed the original CBA model by 1 percent and are competitive with the chosen classification models over benchmark datasets.
Submitted 21 April, 2019;
originally announced April 2019.
-
A new NS3 Implementation of CCNx 1.0 Protocol
Authors:
Marc Mosko,
Ramesh Ayyagari,
Priti Goel,
Eric Holmberg,
Mark Konezny
Abstract:
The ccns3Sim project is an open source implementation of the CCNx 1.0 protocols for the NS3 simulator. We describe the implementation and several important features including modularity and process delay simulation. The ccns3Sim implementation is a fresh NS3-specific implementation. Like NS3 itself, it uses C++98 standard, NS3 code style, NS3 smart pointers, NS3 xUnit, and integrates with the NS3 documentation and manual. A user or developer does not need to learn two systems. If one knows NS3, one should be able to get started with the CCNx code right away. A developer can easily use their own implementation of the layer 3 protocol, layer 4 protocol, forwarder, routing protocol, Pending Interest Table (PIT) or Forwarding Information Base (FIB) or Content Store (CS). A user may configure or specify a new implementation for any of these features at runtime in the simulation script. In this paper, we describe the software architecture and give examples of using the simulator. We evaluate the implementation with several example experiments on ICN caching.
Submitted 15 July, 2017;
originally announced July 2017.
-
Formation Control in Multi-Agent Systems Over Packet Dropping Links
Authors:
Seshadhri Srinivasan,
R. Ayyagari
Abstract:
One major challenge in implementing formation control stems from the packet loss that occurs in the shared communication channel. In the presence of packet loss, the coordination information among agents is lost. Moreover, there is a move to use wireless channels in formation control applications. It has been found in practice that packet losses are more pronounced in wireless channels than in their wired counterparts. In our analysis, we first show that packet loss may result in loss of rigidity, which in turn causes the entire formation to fail. We then present an estimation-based formation control algorithm that is robust to packet loss among agents. The proposed estimation algorithm employs a minimal spanning tree algorithm to compute the estimate of the node variables (coordination variables). Consequently, this reduces the communication overhead required for information exchange. Next, using simulation, we verify the data that is to be transmitted for optimal estimation of these variables in the event of a packet loss. Finally, the effectiveness of the proposed algorithm is illustrated using a suitable simulation example.
Submitted 26 June, 2015;
originally announced June 2015.
-
An analytical framework for analysis and design of networked control systems with random delays and packet losses
Authors:
M. Vallabhan,
S. Seshadhri,
S. Ashok,
S. Ramaswmay,
R. Ayyagari
Abstract:
Delays and data losses are undesirable from a control system perspective as they tend to adversely affect performance. Networked Control Systems (NCSs) are a class of control systems wherein control components exchange information using a shared communication channel. Delays and packet losses in the communication channels are usually random, thereby making the analysis and design of control loops more complex. The usual assumptions in classical control theory, such as delay-free sensing and synchronous actuation, assume lesser significance when it comes to NCSs. Hence, this necessitates a reformulation of the existing models used for NCS control loop analysis and design. In this paper, we study and present the reformulations required for NCSs to include random delays and packet loss in the channel. This paper thereby gives a complete overview of what has been accomplished thus far in NCS research and puts forth a unified framework for analyzing a host of problems that can be captured as NCSs subjected to random delays and packet losses.
Submitted 20 June, 2015;
originally announced June 2015.