-
Can LLMs $\textit{understand}$ Math? -- Exploring the Pitfalls in Mathematical Reasoning
Authors:
Tiasa Singha Roy,
Aditeya Baral,
Ayush Rajesh Jhaveri,
Yusuf Baig
Abstract:
Large language models (LLMs) demonstrate considerable potential in various natural language tasks but face significant challenges in mathematical reasoning, particularly in executing precise, multi-step logic. However, current evaluation frameworks judge their performance solely based on accuracy, which only accounts for the final answer. This study explores these pitfalls by employing a novel evaluation framework. We propose an evaluation metric called the MAPLE score, which holistically quantifies reasoning misalignment by integrating error rates, redundancy, and validity.
Submitted 21 May, 2025;
originally announced May 2025.
-
CMLFormer: A Dual Decoder Transformer with Switching Point Learning for Code-Mixed Language Modeling
Authors:
Aditeya Baral,
Allen George Ajith,
Roshan Nayak,
Mrityunjay Abhijeet Bhanja
Abstract:
Code-mixed languages, characterized by frequent within-sentence language transitions, present structural challenges that standard language models fail to address. In this work, we propose CMLFormer, an enhanced multi-layer dual-decoder Transformer with a shared encoder and synchronized decoder cross-attention, designed to model the linguistic and semantic dynamics of code-mixed text. CMLFormer is pre-trained on an augmented Hinglish corpus with switching point and translation annotations, using multiple new objectives specifically aimed at capturing switching behavior, cross-lingual structure, and code-mixing complexity. Our experiments show that CMLFormer improves F1 score, precision, and accuracy over other approaches on the HASOC-2021 benchmark under select pre-training setups. Attention analyses further show that it can identify and attend to switching points, validating its sensitivity to code-mixed structure. These results demonstrate the effectiveness of CMLFormer's architecture and multi-task pre-training strategy for modeling code-mixed languages.
Submitted 18 May, 2025;
originally announced May 2025.
-
Municipal cyber risk modeling using cryptographic computing to inform cyber policymaking
Authors:
Avital Baral,
Taylor Reynolds,
Lawrence Susskind,
Daniel J. Weitzner,
Angelina Wu
Abstract:
Municipalities are vulnerable to cyberattacks with devastating consequences, but they lack key information to evaluate their own risk and compare their security posture to peers. Using data from 83 municipalities collected via a cryptographically secure computation platform about their security posture, incidents, security control failures, and losses, we build data-driven cyber risk models and cyber security benchmarks for municipalities. We produce benchmarks of the security posture in a sector, the frequency of cyber incidents, forecasted annual losses for organizations based on their defensive posture, and a weighting of cyber controls based on their individual failure rates and associated losses. Combined, these four items can help guide cyber policymaking by quantifying the cyber risk in a sector, identifying gaps that need to be addressed, prioritizing policy interventions, and tracking progress of those interventions over time. In the case of the municipalities, these newly derived risk measures highlight the need for continuous measured improvement of cybersecurity readiness, show clear areas of weakness and strength, and provide governments with some early targets for policy focus such as security education, incident response, and focusing efforts first on municipalities at the lowest security levels that have the highest risk reduction per security dollar invested.
Submitted 5 February, 2024; v1 submitted 1 February, 2024;
originally announced February 2024.
-
Maximum-Width Rainbow-Bisecting Empty Annulus
Authors:
Sang Won Bae,
Sandip Banerjee,
Arpita Baral,
Priya Ranjan Sinha Mahapatra,
Sang Duk Yoon
Abstract:
Given a set of $n$ colored points with $k$ colors in the plane, we study the problem of computing a maximum-width rainbow-bisecting empty annulus for three shapes: axis-parallel square, axis-parallel rectangle, and circle. We call a region rainbow if it contains at least one point of each color. The maximum-width rainbow-bisecting empty annulus problem asks to find an annulus $A$ of a particular shape with maximum possible width such that $A$ does not contain any input points and it bisects the input point set into two parts, each of which is a rainbow. We compute a maximum-width rainbow-bisecting empty axis-parallel square, axis-parallel rectangular, and circular annulus in $O(n^3)$ time using $O(n)$ space, in $O(k^2n^2\log n)$ time using $O(n\log n)$ space, and in $O(n^3)$ time using $O(n^2)$ space, respectively.
Submitted 26 March, 2024; v1 submitted 16 May, 2023;
originally announced May 2023.
-
Maximum-Width Empty Square and Rectangular Annulus
Authors:
Sang Won Bae,
Arpita Baral,
Priya Ranjan Sinha Mahapatra
Abstract:
An annulus is, informally, a ring-shaped region, often described by two concentric circles. The maximum-width empty annulus problem asks to find an annulus of a certain shape with the maximum possible width that avoids a given set of $n$ points in the plane. This problem can also be interpreted as the problem of finding an optimal location of a ring-shaped obnoxious facility among the input points. In this paper, we study square and rectangular variants of the maximum-width empty annulus problem, and present the first nontrivial algorithms. Specifically, our algorithms run in $O(n^3)$ and $O(n^2 \log n)$ time for computing a maximum-width empty axis-parallel square and rectangular annulus, respectively. Both algorithms use only $O(n)$ space.
Submitted 15 November, 2018;
originally announced November 2018.
-
Maximum-width Axis-Parallel Empty Rectangular Annulus
Authors:
Arpita Baral,
Abhilash Gondane,
Sanjib Sadhu,
Priya Ranjan Sinha Mahapatra
Abstract:
Given a set $P$ of $n$ points in $\mathbb{R}^{2}$, we address the problem of computing an axis-parallel empty rectangular annulus $A$ of maximum width such that no point of $P$ lies inside $A$ but all points of $P$ must lie inside, outside and on the boundaries of two parallel rectangles forming the annulus $A$. We propose an $O(n^3)$ time and $O(n)$ space algorithm to solve the problem. In the particular case where the inner rectangle of an axis-parallel empty rectangular annulus reduces to an input point, we can solve the problem in $O(n \log n)$ time and $O(n)$ space.
Submitted 1 December, 2017;
originally announced December 2017.