Search | arXiv e-print repository

Fine-tuning Smaller Language Models for Question Answering over Financial Documents

Authors: Karmvir Singh Phogat, Sai Akhil Puranam, Sridhar Dasaratha, Chetan Harsha, Shashishekar Ramakrishna

Abstract: Recent research has shown that smaller language models can acquire substantial reasoning abilities when fine-tuned with reasoning exemplars crafted by a significantly larger teacher model. We explore this paradigm for the financial domain, focusing on the challenge of answering questions that require multi-hop numerical reasoning over financial texts. We assess the performance of several smaller models that have been fine-tuned to generate programs that encode the required financial reasoning and calculations. Our findings demonstrate that these fine-tuned smaller models approach the performance of the teacher model. To provide a granular analysis of model performance, we propose an approach to investigate the specific student model capabilities that are enhanced by fine-tuning. Our empirical analysis indicates that fine-tuning refines the student models' ability to express and apply the required financial concepts along with adapting the entity extraction for the specific data format. In addition, we hypothesize and demonstrate that comparable financial reasoning capability can be induced using relatively smaller datasets.

Submitted 22 August, 2024; originally announced August 2024.

arXiv:2311.14722 [pdf, other]

Zero-Shot Question Answering over Financial Documents using Large Language Models

Authors: Karmvir Singh Phogat, Chetan Harsha, Sridhar Dasaratha, Shashishekar Ramakrishna, Sai Akhil Puranam

Abstract: We introduce a large language model (LLM) based approach to answer complex questions requiring multi-hop numerical reasoning over financial reports. While LLMs have exhibited remarkable performance on various natural language and reasoning tasks, complex reasoning problems often rely on few-shot prompts that require carefully crafted examples. In contrast, our approach uses novel zero-shot prompts that guide the LLM to encode the required reasoning into a Python program or a domain-specific language. The generated program is then executed by a program interpreter, thus mitigating the limitations of LLMs in performing accurate arithmetic calculations. We evaluate the proposed approach on three financial datasets using some of the recently developed generative pretrained transformer (GPT) models and perform comparisons with various zero-shot baselines. The experimental results demonstrate that our approach significantly improves the accuracy for all the LLMs over their respective baselines. We provide a detailed analysis of the results, generating insights to support our findings. The success of our approach demonstrates the enormous potential to extract complex domain-specific numerical reasoning by designing zero-shot prompts to effectively exploit the knowledge embedded in LLMs.

Submitted 19 November, 2023; originally announced November 2023.
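
The abstract above describes having the LLM emit a program that an interpreter then executes, so that the arithmetic is not done by the model itself. A minimal sketch of that execution step follows. It is an illustration under stated assumptions, not the authors' pipeline: the program text, the variable names, the figures, and the convention that the result is stored in `ans` are all hypothetical, and no model is called.

```python
# Hypothetical program text standing in for what an LLM might generate for a
# question such as "What was the percentage change in revenue from 2018 to 2019?"
# The numbers and the `ans` convention are assumptions for illustration only.
generated_program = """
revenue_2019 = 120.0
revenue_2018 = 100.0
ans = (revenue_2019 - revenue_2018) / revenue_2018 * 100
"""


def run_generated_program(source: str) -> float:
    """Execute program text and return the value it assigns to `ans`."""
    # An empty __builtins__ mapping limits what the program can reach, but it
    # is NOT a real sandbox; untrusted code needs proper isolation.
    namespace = {"__builtins__": {}}
    exec(source, namespace)
    return float(namespace["ans"])


result = run_generated_program(generated_program)
print(round(result, 2))
```

The point of the design is that the interpreter, not the language model, carries out the calculation; the model only has to express the reasoning as code.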

arXiv:2109.13662 [pdf, ps, other]

DeepPSL: End-to-end perception and reasoning

Authors: Sridhar Dasaratha, Sai Akhil Puranam, Karmvir Singh Phogat, Sunil Reddy Tiyyagura, Nigel P. Duffy

Abstract: We introduce DeepPSL, a variant of probabilistic soft logic (PSL), to produce an end-to-end trainable system that integrates reasoning and perception. PSL represents first-order logic in terms of a convex graphical model -- hinge-loss Markov random fields (HL-MRFs). PSL stands out among probabilistic logic frameworks due to its tractability, having been applied to systems of more than 1 billion ground rules. The key to our approach is to represent predicates in first-order logic using deep neural networks and then to approximately back-propagate through the HL-MRF and thus train every aspect of the first-order system being represented. We believe that this approach represents an interesting direction for the integration of deep learning and reasoning techniques with applications to knowledge base learning, multi-task learning, and explainability. Evaluation on three different tasks demonstrates that DeepPSL significantly outperforms state-of-the-art neuro-symbolic methods on scalability while achieving comparable or better accuracy.

Submitted 4 February, 2023; v1 submitted 28 September, 2021; originally announced September 2021.

arXiv:2010.01290 [pdf, other]

Tracking Controller Design for Satellite Attitude Under Unknown Constant Disturbance Using Stable Embedding

Authors: Wonshick Ko, Karmvir Singh Phogat, Nicolas Petit, Dong Eui Chang

Abstract: We propose a tracking control law for the fully actuated rigid body system in the presence of any unknown constant disturbance by employing quaternions with the stable embedding technique and Lyapunov stability theory. The stable embedding technique extends the attitude dynamics from the set of unit quaternions to the set of quaternions, which is a Euclidean space, such that the set of unit quaternions is an invariant set of the extended dynamics. Such a stable extension of the system dynamics to a Euclidean space allows us to employ well-studied Lyapunov techniques in Euclidean spaces such as LaSalle-Yoshizawa's theorem. A robust tracking control law is proposed for the attitude dynamics subject to unknown constant disturbance, and the convergence properties of the tracking control law are rigorously proven. It is demonstrated with the help of numerical simulations that the proposed control law has a remarkable performance even in some challenging situations.

Submitted 3 October, 2020; originally announced October 2020.
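
The abstract above describes extending the quaternion attitude dynamics to all of four-dimensional Euclidean space so that the unit quaternions form an invariant set. The listing does not give the extended dynamics, so the sketch below assumes one common form: the usual kinematics q' = (1/2) q (0, w) plus a correction term -alpha (|q|^2 - 1) q that vanishes on the unit sphere and pulls nearby states toward it. The gain `alpha`, the angular velocity, the step size, and the explicit Euler integrator are all illustrative choices, not values from the paper.

```python
import math


def quat_mul(p, q):
    """Hamilton product of two quaternions given as (w, x, y, z)."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return (
        pw * qw - px * qx - py * qy - pz * qz,
        pw * qx + px * qw + py * qz - pz * qy,
        pw * qy - px * qz + py * qw + pz * qx,
        pw * qz + px * qy - py * qx + pz * qw,
    )


def extended_rate(q, omega, alpha):
    """Assumed extended kinematics: 0.5 * q * (0, omega) - alpha * (|q|^2 - 1) * q."""
    kinematic = quat_mul(q, (0.0, *omega))
    excess = sum(c * c for c in q) - 1.0  # zero exactly on the unit sphere
    return tuple(0.5 * k - alpha * excess * c for k, c in zip(kinematic, q))


def norm(q):
    return math.sqrt(sum(c * c for c in q))


q = (1.2, 0.0, 0.0, 0.0)      # deliberately NOT a unit quaternion
omega = (0.3, -0.2, 0.5)      # constant body angular velocity (illustrative)
alpha, dt = 1.0, 1e-3         # assumed gain and step size
initial_norm = norm(q)

for _ in range(10000):        # explicit Euler over 10 time units
    dq = extended_rate(q, omega, alpha)
    q = tuple(c + dt * d for c, d in zip(q, dq))

final_norm = norm(q)
print(initial_norm, final_norm)
```

Under these assumptions the norm starts at 1.2 and settles close to 1, which is the practical benefit of working in the ambient space: the state never has to be re-normalized by hand.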

arXiv:1912.12580 [pdf, ps, other]

Invariant extended Kalman filter on matrix Lie groups

Authors: Karmvir Singh Phogat, Dong Eui Chang

Abstract: We derive symmetry-preserving invariant extended Kalman filters (IEKF) on matrix Lie groups. These Kalman filters have an advantage over conventional extended Kalman filters as the error dynamics for such filters are independent of the group configuration which, in turn, provides a uniform estimate of the region of convergence. The proposed IEKF differs from existing techniques in the literature in that it is derived using minimal tools from differential geometry, which simplifies its representation and derivation to a large extent. The filter error dynamics is defined on the Lie algebra directly instead of identifying the Lie algebra with a Euclidean space or defining the error dynamics in local coordinates using the exponential map, and the associated differential Riccati equations are described on the corresponding space of linear operators using tensor algebra. The proposed filter is implemented for the attitude dynamics of the rigid body, which is a benchmark problem in control, and its performance is compared against a conventional extended Kalman filter (EKF). Numerical experiments support that the IEKF is computationally less intensive and gives better performance than the EKF.

Submitted 28 December, 2019; originally announced December 2019.

arXiv:1910.05669 [pdf, ps, other]

doi 10.1109/TAC.2019.2946231

Model Predictive Tracking Control for Invariant Systems on Matrix Lie Groups via Stable Embedding into Euclidean Spaces

Authors: Dong Eui Chang, Karmvir Singh Phogat, Jongeun Choi

Abstract: For controller design for systems on manifolds embedded in Euclidean space, it is convenient to utilize a theory that requires a single global coordinate system on the ambient Euclidean space rather than multiple local charts on the manifold or coordinate-free tools from differential geometry. In this article, we apply such a theory to design model predictive tracking controllers for systems whose dynamics evolve on manifolds and illustrate its efficacy with the fully actuated rigid body attitude control system.

Submitted 12 October, 2019; originally announced October 2019.

arXiv:1902.01038 [pdf, ps, other]

Exact isoholonomic motion of the planar Purcell's swimmer

Authors: Sudin Kadam, Karmvir Singh Phogat, Ravi N. Banavar, Debasish Chatterjee

Abstract: In this article we present the discrete-time isoholonomic problem of the planar Purcell's swimmer and solve it using the Discrete-time Pontryagin maximum principle. The 3-link Purcell's swimmer is a locomotion system moving in a low Reynolds number environment. The kinematics of the system evolves on a principal fiber bundle. A structure preserving discrete-time kinematic model of the system is obtained in terms of the local form of a discrete connection. An adapted version of the Discrete Maximum Principle on matrix Lie groups is then employed to come up with the necessary optimality conditions for an optimal state transfer while minimizing the control effort. These necessary conditions appear as a two-point boundary value problem and are solved using a numerical technique. Results from numerical experiments are presented to illustrate the algorithm.

Submitted 4 February, 2019; originally announced February 2019.

arXiv:1811.12819 [pdf, other]

Structure-Preserving Constrained Optimal Trajectory Planning of a Wheeled Inverted Pendulum

Authors: Klaus Albert, Karmvir Singh Phogat, Felix Anhalt, Ravi N Banavar, Debasish Chatterjee, Boris Lohmann

Abstract: The Wheeled Inverted Pendulum (WIP) is an underactuated, nonholonomic mechatronic system, and has been popularized commercially as the Segway. Designing a control law for motion planning that incorporates the state and control constraints while respecting the configuration manifold is a challenging problem. In this article we derive a discrete-time model of the WIP system using discrete mechanics and generate optimal trajectories for the WIP system by solving a discrete-time constrained optimal control problem. Further, we describe a nonlinear continuous-time model with parameters for designing a closed loop LQ-controller. A dual control architecture is implemented in which the designed optimal trajectory is then provided as a reference to the robot with the optimal control trajectory as a feedforward control action, and an LQ-controller in the feedback mode is employed to mitigate noise and disturbances for ensuring stable motion of the WIP system. While performing experiments on the WIP system involving aggressive maneuvers with fairly sharp turns, we found a high degree of congruence in the designed optimal trajectories and the path traced by the robot while tracking these trajectories. This corroborates the validity of the nonlinear model and the control scheme. Finally, these experiments demonstrate the highly nonlinear nature of the WIP system and robustness of the control scheme.

Submitted 2 October, 2019; v1 submitted 29 November, 2018; originally announced November 2018.

Comments: 12 pages, 8 figures, 1 table. arXiv admin note: text overlap with arXiv:1710.10932

arXiv:1803.03052 [pdf, ps, other]

A frequency-constrained geometric Pontryagin maximum principle on matrix Lie groups

Authors: Shruti Kotpalliwar, Pradyumna Paruchuri, Karmvir Singh Phogat, Debasish Chatterjee, Ravi Banavar

Abstract: In this article we present a geometric discrete-time Pontryagin maximum principle (PMP) on matrix Lie groups that incorporates frequency constraints on the controls in addition to pointwise constraints on the states and control actions directly at the stage of the problem formulation. This PMP gives first order necessary conditions for optimality, and leads to two-point boundary value problems that may be solved by shooting techniques to arrive at optimal trajectories. We validate our theoretical results with a numerical experiment on the attitude control of a spacecraft on the Lie group SO(3).

Submitted 27 March, 2019; v1 submitted 8 March, 2018; originally announced March 2018.

arXiv:1710.10932 [pdf, other]

Structure-preserving discrete-time optimal maneuvers of a wheeled inverted pendulum

Authors: Karmvir Singh Phogat, Ravi Banavar, Debasish Chatterjee

Abstract: The Wheeled Inverted Pendulum (WIP) is a nonholonomic, underactuated mechanical system, and has been popularized commercially as the Segway. Designing optimal control laws for point-to-point state-transfer for this autonomous mechanical system, while respecting momentum and torque constraints as well as the underlying manifold, continues to pose challenging problems. In this article we present a successful effort in this direction: We employ geometric mechanics to obtain a discrete-time model of the system, followed by the synthesis of an energy-optimal control based on a discrete-time maximum principle applicable to mechanical systems whose configuration manifold is a Lie group. Moreover, we incorporate state and momentum constraints into the discrete-time control directly at the synthesis stage. The control is implemented on a WIP with parameters obtained from an existing prototype; the results are highly encouraging, as demonstrated by numerical experiments.

Submitted 8 November, 2017; v1 submitted 30 October, 2017; originally announced October 2017.

arXiv:1612.08022 [pdf, other]

A discrete-time Pontryagin maximum principle on matrix Lie groups

Authors: Karmvir Singh Phogat, Debasish Chatterjee, Ravi Banavar

Abstract: In this article we derive a Pontryagin maximum principle (PMP) for discrete-time optimal control problems on matrix Lie groups. The PMP provides first order necessary conditions for optimality; these necessary conditions typically yield two point boundary value problems, and these boundary value problems can then be solved to extract optimal control trajectories. Constrained optimal control problems for mechanical systems, in general, can only be solved numerically, and this motivates the need to derive discrete-time models that are accurate and preserve the non-flat manifold structures of the underlying continuous-time controlled systems. The PMPs for discrete-time systems evolving on Euclidean spaces are not readily applicable to discrete-time models evolving on non-flat manifolds. In this article we bridge this lacuna and establish a discrete-time PMP on matrix Lie groups. Our discrete-time models are derived via discrete mechanics (a structure-preserving discretization scheme), leading to the preservation of the underlying manifold over time, thereby resulting in greater numerical accuracy of our technique. This PMP caters to a class of constrained optimal control problems that includes point-wise state and control action constraints, and encompasses a large class of control problems that arise in various fields of engineering and the applied sciences.

Submitted 6 August, 2018; v1 submitted 23 December, 2016; originally announced December 2016.

Comments: We have rectified the error in equation (3.21) and made subsequent changes
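
The abstract above stresses that a structure-preserving discretization keeps the state on the underlying manifold over time. The sketch below illustrates that general idea on SO(3) and is not the discrete-mechanics scheme of the paper: it compares a Lie-group update R_{k+1} = R_k exp(h hat(w)) with a plain Euler update R_{k+1} = R_k (I + h hat(w)) and measures how far each result drifts from orthogonality. The step size `h`, the angular velocity `omega`, and the number of steps are arbitrary illustrative values.

```python
import math


def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def hat(w):
    """Map a 3-vector to the corresponding skew-symmetric matrix."""
    x, y, z = w
    return [[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]]


def exp_so3(w):
    """Rodrigues' formula: I + (sin t / t) K + ((1 - cos t) / t^2) K^2, t = |w|."""
    theta = math.sqrt(sum(c * c for c in w))
    K = hat(w)
    K2 = matmul(K, K)
    if theta < 1e-12:
        a, b = 1.0, 0.5                      # small-angle limits
    else:
        a = math.sin(theta) / theta
        b = (1.0 - math.cos(theta)) / theta ** 2
    return [[(1.0 if i == j else 0.0) + a * K[i][j] + b * K2[i][j]
             for j in range(3)] for i in range(3)]


def orthogonality_error(R):
    """Largest entry of |R^T R - I|; zero for an exact rotation matrix."""
    Rt = [[R[j][i] for j in range(3)] for i in range(3)]
    RtR = matmul(Rt, R)
    return max(abs(RtR[i][j] - (1.0 if i == j else 0.0))
               for i in range(3) for j in range(3))


IDENTITY = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
h = 0.1                                       # illustrative step size
omega = (0.4, -0.7, 1.0)                      # illustrative angular velocity
increment = tuple(h * c for c in omega)
K_h = hat(increment)
euler_factor = [[IDENTITY[i][j] + K_h[i][j] for j in range(3)] for i in range(3)]

R_group, R_euler = IDENTITY, IDENTITY
for _ in range(200):
    R_group = matmul(R_group, exp_so3(increment))   # stays on SO(3)
    R_euler = matmul(R_euler, euler_factor)         # drifts off SO(3)

group_error = orthogonality_error(R_group)
euler_error = orthogonality_error(R_euler)
print(group_error, euler_error)
```

The group update remains orthogonal up to rounding error, while the Euler update accumulates a large orthogonality violation, which is one reason Euclidean-space discretizations are a poor fit for models on non-flat manifolds.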

arXiv:1509.04521 [pdf, ps, other]

doi 10.2514/1.G002861

Discrete-time optimal attitude control of spacecraft with momentum and control constraints

Authors: Karmvir Singh Phogat, Debasish Chatterjee, Ravi Banavar

Abstract: This article solves an optimal control problem arising in attitude control of a spacecraft under state and control constraints. We first derive the discrete-time attitude dynamics by employing discrete mechanics. The orientation transfer, with initial and final values of the orientation and momentum and the time duration being specified, is posed as an energy optimal control problem in discrete-time subject to momentum and control constraints. Using variational analysis directly on the Lie group SO(3), we derive first order necessary conditions for optimality that lead to a constrained two point boundary value problem. This two point boundary value problem is solved via a novel multiple shooting technique that employs a root finding Newton algorithm. Robustness of the multiple shooting technique is demonstrated through a few representative numerical experiments.

Submitted 29 May, 2016; v1 submitted 15 September, 2015; originally announced September 2015.

MSC Class: 49J21; 49N90

Showing 1–12 of 12 results for author: Phogat, K S