-
Multilingual Mathematical Reasoning: Advancing Open-Source LLMs in Hindi and English
Authors:
Avinash Anand,
Kritarth Prasad,
Chhavi Kirtani,
Ashwin R Nair,
Manvendra Kumar Nema,
Raj Jaiswal,
Rajiv Ratn Shah
Abstract:
Large Language Models (LLMs) excel in linguistic tasks but struggle with mathematical reasoning, particularly in non-English languages like Hindi. This research aims to enhance the mathematical reasoning skills of smaller, resource-efficient open-source LLMs in both Hindi and English. We evaluate models like OpenHathi 7B, LLaMA-2 7B, WizardMath 7B, Mistral 7B, LLeMMa 7B, MAmmoTH 7B, Gemini Pro, and GPT-4 using zero-shot and few-shot chain-of-thought (CoT) methods and supervised fine-tuning. Our approach incorporates curriculum learning, which progressively trains models on increasingly difficult problems; a novel Decomposition Strategy to simplify complex arithmetic operations; and a Structured Solution Design that divides solutions into phases. Our experiments yield notable performance gains. WizardMath 7B exceeds Gemini's accuracy on English datasets by 6% and matches Gemini's performance on Hindi datasets. Adopting a bilingual approach that combines English and Hindi samples achieves results comparable to individual language models, demonstrating the capability to learn mathematical reasoning in both languages. This research highlights the potential for improving mathematical reasoning in open-source LLMs.
Submitted 24 December, 2024;
originally announced December 2024.
-
Enhancing LLMs for Physics Problem-Solving using Reinforcement Learning with Human-AI Feedback
Authors:
Avinash Anand,
Kritarth Prasad,
Chhavi Kirtani,
Ashwin R Nair,
Mohit Gupta,
Saloni Garg,
Anurag Gautam,
Snehal Buldeo,
Rajiv Ratn Shah
Abstract:
Large Language Models (LLMs) have demonstrated strong capabilities in text-based tasks but struggle with the complex reasoning required for physics problems, particularly in advanced arithmetic and conceptual understanding. While some research has explored ways to enhance LLMs in physics education using techniques such as prompt engineering and Retrieval-Augmented Generation (RAG), not enough effort has been made to address their limitations in physics reasoning. This paper presents a novel approach to improving LLM performance on physics questions using Reinforcement Learning with Human and Artificial Intelligence Feedback (RLHAIF). We evaluate several reinforcement learning methods, including Proximal Policy Optimization (PPO), Direct Preference Optimization (DPO), and Remax optimization. These methods are chosen to investigate RL policy performance with different settings on the PhyQA dataset, which includes challenging physics problems from high school textbooks. Our RLHAIF model, tested on leading LLMs like LLaMA2 and Mistral, achieved superior results, notably with the MISTRAL-PPO model, demonstrating marked improvements in reasoning and accuracy. It achieved a METEOR score of 58.67 and a Reasoning score of 0.74, making it a strong example for future physics reasoning research.
Submitted 6 December, 2024;
originally announced December 2024.
-
Multiplierless In-filter Computing for tinyML Platforms
Authors:
Abhishek Ramdas Nair,
Pallab Kumar Nath,
Shantanu Chakrabartty,
Chetan Singh Thakur
Abstract:
Wildlife conservation using continuous monitoring of environmental factors and biomedical classification, which generate a vast amount of sensor data, is a challenge due to limited bandwidth in the case of remote monitoring. It becomes critical to perform classification where the data is generated, so that only classified data is used for monitoring. We present a novel multiplierless framework for in-filter acoustic classification using Margin Propagation (MP) approximation, intended for low-power edge devices deployable in remote areas with limited connectivity. The entire design of this classification framework is based on a template-based kernel machine, which includes feature extraction and inference, and uses basic primitives like addition/subtraction, shift, and comparator operations for hardware implementation. Unlike full-precision training methods for traditional classification, we use MP-based approximation for training, including backpropagation, which mitigates approximation errors. The proposed framework is general enough for acoustic classification. However, we demonstrate the hardware friendliness of this framework by implementing a parallel Finite Impulse Response (FIR) filter bank in a kernel machine classifier optimized for a Field Programmable Gate Array (FPGA). The FIR filter acts as the feature extractor and non-linear kernel for the kernel machine implemented using MP approximation and a downsampling method to reduce the order of the filters. The FPGA implementation on Spartan 7 shows that the MP-approximated in-filter kernel machine is more efficient than traditional classification frameworks, using fewer than 1K slices.
Submitted 24 April, 2023;
originally announced April 2023.
-
Band Gap Tuning of DC Reactively Sputtered ZnON Thin Films
Authors:
Kiran Jose,
J. G. Anjana,
Venu Anand,
Aswathi R. Nair
Abstract:
Zinc oxynitride (ZnO$_x$N$_y$) has recently emerged as a highly promising band gap-tunable semiconductor material for optoelectronic applications. In this study, a novel DC reactive sputtering protocol was developed to fabricate ZnO$_x$N$_y$ films with varying elemental concentrations, by precisely controlling the working pressure. The band gap was rigorously analyzed using UV-Visible spectroscopy, which was complemented by EDAX spectroscopy to determine the variations in the elemental composition. The correlation between the microstructure and band gap was investigated through the application of AFM, XRD, and Raman spectroscopy, while the Urbach theorem was used to evaluate the defect states. This study revealed the existence of intermediate structures formed during the tuning of the band gap, which can have important implications for future research aimed at developing heterostructures and 2D superlattices for photonics applications.
Submitted 16 February, 2023;
originally announced February 2023.
-
In-filter Computing For Designing Ultra-light Acoustic Pattern Recognizers
Authors:
Abhishek Ramdas Nair,
Shantanu Chakrabartty,
Chetan Singh Thakur
Abstract:
We present a novel in-filter computing framework that can be used for designing ultra-light acoustic classifiers for use in smart internet-of-things (IoTs). Unlike a conventional acoustic pattern recognizer, where the feature extraction and classification are designed independently, the proposed architecture integrates the convolution and nonlinear filtering operations directly into the kernels of a Support Vector Machine (SVM). The result of this integration is a template-based SVM whose memory and computational footprint (training and inference) is light enough to be implemented on an FPGA-based IoT platform. While the proposed in-filter computing framework is general enough, in this paper, we demonstrate this concept using a Cascade of Asymmetric Resonator with Inner Hair Cells (CAR-IHC) based acoustic feature extraction algorithm. The complete system has been optimized using time-multiplexing and parallel-pipeline techniques for a Xilinx Spartan 7 series Field Programmable Gate Array (FPGA). We show that the system can achieve robust classification performance on benchmark sound recognition tasks using only ~ 1.5k Look-Up Tables (LUTs) and ~ 2.8k Flip-Flops (FFs), a significant improvement over other approaches.
Submitted 11 September, 2021;
originally announced September 2021.
-
Multiplierless MP-Kernel Machine For Energy-efficient Edge Devices
Authors:
Abhishek Ramdas Nair,
Pallab Kumar Nath,
Shantanu Chakrabartty,
Chetan Singh Thakur
Abstract:
We present a novel framework for designing multiplierless kernel machines that can be used on resource-constrained platforms like intelligent edge devices. The framework uses a piecewise linear (PWL) approximation based on a margin propagation (MP) technique and uses only addition/subtraction, shift, comparison, and register underflow/overflow operations. We propose a hardware-friendly MP-based inference and online training algorithm that has been optimized for a Field Programmable Gate Array (FPGA) platform. Our FPGA implementation eliminates the need for DSP units and reduces the number of LUTs. By reusing the same hardware for inference and training, we show that the platform can overcome classification errors and local minima artifacts that result from the MP approximation. The implementation of this proposed multiplierless MP-kernel machine on FPGA results in an estimated energy consumption of 13.4 pJ and power consumption of 107 mW with ~9k LUTs and FFs each for a 256 x 32 kernel, making it superior in terms of power, performance, and area to comparable implementations.
Submitted 9 September, 2022; v1 submitted 3 June, 2021;
originally announced June 2021.
-
Neuromorphic In-Memory Computing Framework using Memtransistor Cross-bar based Support Vector Machines
Authors:
P. Kumar,
A. R. Nair,
O. Chatterjee,
T. Paul,
A. Ghosh,
S. Chakrabartty,
C. S. Thakur
Abstract:
This paper presents a novel framework for designing support vector machines (SVMs) that does not require the SVM kernel to be positive-definite and allows the user to define a memory constraint in terms of fixed template vectors. This makes the framework scalable and enables its implementation for low-power, high-density, and memory-constrained embedded applications. An efficient hardware implementation is also discussed, which utilizes a novel low-power memtransistor-based cross-bar architecture and is robust to device mismatch and randomness. Using memtransistor measurement data, we show that the designed SVMs can achieve classification accuracy comparable to traditional SVMs on both synthetic and real-world benchmark datasets. This framework would be beneficial for the design of SVM-based wake-up systems for internet of things (IoTs) and edge devices, where memtransistors can be used to optimize the system's energy efficiency and perform in-memory matrix-vector multiplication (MVM).
Submitted 29 May, 2019; v1 submitted 28 March, 2019;
originally announced March 2019.