Showing 1–2 of 2 results for author: Padarha, S
-
Enhancing Reasoning Capabilities in SLMs with Reward Guided Dataset Distillation
Authors:
Shreyansh Padarha
Abstract:
The push to compress and impart the proficiency of Large Language Models (LLMs) into more deployable and efficient Small Language Models (SLMs) has benefited from improvements in knowledge distillation (KD) techniques. These techniques allow a smaller student model to learn from a more capable and larger teacher model's responses. However, distillation often revolves around the student model merely copying the teacher's in-distribution responses, limiting its generalisability. This limitation is amplified on reasoning tasks and can be computationally expensive. In this study, we propose AdvDistill, a reward-guided dataset distillation framework. We utilise multiple generations (responses) from a teacher for each prompt and assign rewards based on rule-based verifiers. These varying and normally distributed rewards serve as weights when training student models. Our methods and their subsequent behavioural analysis demonstrate a significant improvement in student model performance for mathematical and complex reasoning tasks, showcasing the efficacy and benefits of incorporating a rewarding mechanism in dataset distillation processes.
Submitted 25 June, 2025;
originally announced July 2025.
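Illustrative note: the abstract above describes scoring several teacher responses per prompt with rule-based verifiers and using those rewards as weights when training the student. The abstract gives no formulas, so what follows is only a minimal sketch of one way such reward weighting could look, not the AdvDistill implementation. The function names, the exact-match verifier, the softmax weighting, and the `temperature` parameter are all assumptions made for illustration.

```python
import math


def rule_based_reward(response: str, reference_answer: str) -> float:
    """Hypothetical verifier: 1.0 if the last line equals the reference, else 0.0."""
    lines = response.strip().splitlines()
    final = lines[-1].strip() if lines else ""
    return 1.0 if final == reference_answer.strip() else 0.0


def reward_weights(rewards, temperature=1.0):
    """Turn one prompt's rewards into weights that sum to 1.

    Softmax is an assumed choice here; the abstract does not say how
    rewards are converted into training weights.
    """
    highest = max(rewards)  # subtract the max for numerical stability
    exps = [math.exp((r - highest) / temperature) for r in rewards]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_nll(student_logprobs, weights):
    """Reward-weighted negative log-likelihood over the teacher responses.

    student_logprobs[i] is the student's log-probability of response i.
    """
    return -sum(w * lp for w, lp in zip(weights, student_logprobs))


# Three teacher responses to the same prompt; the reference answer is "4".
responses = ["2 + 2\n4", "2 + 2\n5", "two plus two\n4"]
rewards = [rule_based_reward(r, "4") for r in responses]
weights = reward_weights(rewards)

# Placeholder student log-probabilities, invented for this example.
loss = weighted_nll([-1.2, -0.7, -2.0], weights)
```

In this sketch the two verified responses receive equal weight and the incorrect one a smaller weight, so the student is pulled mainly toward responses the verifier accepts rather than imitating every teacher output equally.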
-
Data-Driven Dystopia: an uninterrupted breach of ethics
Authors:
Shreyansh Padarha
Abstract:
This article discusses the risks and complexities associated with the exponential rise in data and the misuse of data by large corporations. The article presents instances of data breaches and data harvesting practices that violate user privacy. It also explores the concept of "Weapons Of Math Destruction" (WMDs), which refers to big data models that perpetuate inequality and discrimination. The article highlights the need for companies to take responsibility for safeguarding user information and the ethical use of data models, AI, and ML. The article also emphasises the significance of data privacy for individuals in their daily lives and the need for a more conscious and responsible approach towards data management.
Submitted 13 May, 2023;
originally announced May 2023.