-
Operational Control of a Multi-energy District Heating System: Comparison of Model-Predictive Control and Rule-Based Control
Authors:
Michael Nikhil Descamps,
Nicolas Lamaison,
Mathieu Vallee,
Roland Baviere
Abstract:
This study focuses on operational control strategies for a multi-energy District Heating Network (DHN). Two control strategies are investigated and compared: (i) a reactive rule-based control (RBC) and (ii) a model predictive control (MPC). For the purpose of the study, a small-scale district heating network is modelled using Modelica. The production plant combines a heat pump, a gas boiler and a thermal solar field with a storage tank for flexibility purposes. On the consumption side, the virtual buildings are aggregated into a single consumer. We use our co-simulation and control platform, called Pegase, to implement the studied strategies. For both strategies, the goal is to meet the consumers' demand while satisfying technical constraints. In addition, MPC aims to minimize the operational costs, taking into account variable electricity prices and the availability of the solar thermal resource. Different scenarios are also defined and compared to study the effect of the heat plant sizing and forecasting error. The operational cost is reduced when switching from RBC to MPC. As can be expected, MPC is more efficient when dealing with variable energy costs, intermittent solar energy and storage capabilities. This study also demonstrates how our tools enable an easy coupling of Modelica-based simulation with various control strategies. It especially supports the implementation and validation of complex MPC strategies in an efficient way, and yearly simulations are performed within 20 minutes.
Submitted 18 June, 2025;
originally announced June 2025.
-
An efficient co-simulation and control approach to tackle complex multi-domain energetic systems: concepts and applications of the PEGASE platform
Authors:
Mathieu Vallee,
Roland Baviere,
Valérie Seguin,
Valéry Vuillerme,
Nicolas Lamaison,
Michael Nikhil Descamps,
Antoine Aurousseau
Abstract:
In this paper, we present a novel research software, called PEGASE, suitable for the design, validation and deployment of advanced control strategies for complex multi-domain energy systems. PEGASE especially features a highly efficient cosimulation engine, together with integrated solutions for defining both rule-based control strategies and Model-Predictive Control (MPC). The main principle behind the PEGASE platform is divide-and-conquer. Indeed, rather than trying to solve a problem as a monolithic entity, which can be highly complex for multi-domain large-scale systems, it is often more efficient to decompose it into several domains or sub-problems, and to simulate them in a decoupled way. To provide its cosimulation capabilities, we based PEGASE on two main components. The first one is a framework for integrating simulation models, which can be either compatible with the FMI standard or interfaced through an Application Programming Interface (API). The second one is a multi-threaded sequencer enabling several simulation sequences with different time steps. To provide advanced control capabilities, we also equipped PEGASE with a framework for MPC combining a comprehensive management of predictions data and a modeler dedicated to the formulation of Mixed Integer Linear Programs. We implemented this framework in C++ providing low formulation and resolution times for typical applications. Connection to hardware is also available via standard industry protocols thereby allowing PEGASE to control real energy systems. In this paper, we show how these basic functionalities, combined with dedicated modeling tools, enable setting up simulation and control applications suitable for tackling the complexity of various kinds of energy systems. To illustrate this, we present four application examples from our recent research work. These examples cover several domains, from concentrated solar thermal plants to optimal control of district heating networks. 
The variety of examples demonstrates the robustness and genericity of the approach.
Submitted 18 June, 2025;
originally announced June 2025.
-
A Note on Normal Families Concerning Nonexceptional Functions and Differential Polynomials
Authors:
Nikhil Bharti,
Anil Singh
Abstract:
We study normality of a family of meromorphic functions, whose differential polynomials satisfy a certain condition, which significantly improves and generalizes some recent results of Chen (Filomat, 31(14) 2017, 4665-4671). Moreover, we demonstrate, with the help of examples, that the result is sharp.
Submitted 11 June, 2025;
originally announced June 2025.
-
A MUSCL-Hancock scheme for non-local conservation laws
Authors:
Nikhil Manoj,
G. D. Veerappa Gowda,
Sudarshan Kumar K
Abstract:
In this article, we propose a MUSCL-Hancock-type second-order scheme for the discretization of a general class of non-local conservation laws and present its convergence analysis. The main difficulty in designing a MUSCL-Hancock-type scheme for non-local equations lies in the discretization of the convolution term, which we carefully formulate to ensure second-order accuracy and facilitate rigorous convergence analysis. We derive several essential estimates including $\mathrm{L}^\infty$, bounded variation ($\mathrm{BV}$) and $\mathrm{L}^1$-Lipschitz continuity in time, which together with Kolmogorov's compactness theorem yield the convergence of the approximate solutions to a weak solution. Further, by incorporating a mesh-dependent modification in the slope limiter, we establish convergence to the entropy solution. Numerical experiments are provided to validate the theoretical results and to demonstrate the improved accuracy of the proposed scheme over its first-order counterpart.
Submitted 4 June, 2025;
originally announced June 2025.
-
optHIM: Hybrid Iterative Methods for Continuous Optimization in PyTorch
Authors:
Nikhil Sridhar,
Sajiv Shah
Abstract:
We introduce optHIM, an open-source library of continuous unconstrained optimization algorithms implemented in PyTorch for both CPU and GPU. By leveraging PyTorch's autograd, optHIM seamlessly integrates function, gradient, and Hessian information into flexible line-search and trust-region methods. We evaluate eleven state-of-the-art variants on benchmark problems spanning convex and non-convex landscapes. Through a suite of quantitative metrics and qualitative analyses, we demonstrate each method's strengths and trade-offs. optHIM aims to democratize advanced optimization by providing a transparent, extensible, and efficient framework for research and education.
Submitted 7 May, 2025;
originally announced May 2025.
-
Hardness of Finding Kings and Strong Kings
Authors:
Ziad Ismaili Alaoui,
Nikhil S. Mande
Abstract:
A king in a directed graph is a vertex $v$ such that every other vertex is reachable from $v$ via a path of length at most $2$. It is well known that every tournament (a complete graph where each edge has a direction) has at least one king. Our contributions in this work are:
- We show that the query complexity of determining existence of a king in arbitrary $n$-vertex digraphs is $Θ(n^2)$. This is in stark contrast to the case where the input is a tournament, where Shen, Sheng, and Wu [SICOMP'03] showed that a king can be found in $O(n^{3/2})$ queries.
- In an attempt to increase the "fairness" in the definition of tournament winners, Ho and Chang [IPL'03] defined a strong king to be a king $k$ such that, for every $v$ that dominates $k$, the number of length-$2$ paths from $k$ to $v$ is strictly larger than the number of length-$2$ paths from $v$ to $k$. We show that the query complexity of finding a strong king in a tournament is $Θ(n^2)$. This answers a question of Biswas, Jayapaul, Raman, and Satti [DAM'22] in the negative.
A key component in our proofs is the design of specific tournaments where every vertex is a king, and analyzing certain properties of these tournaments. We feel these constructions and properties are independently interesting and may lead to more interesting results about tournament solutions.
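As an illustration of the king definition above, the following minimal sketch checks by brute force which vertices of a small digraph are kings. The adjacency-set representation and the function name `kings` are assumptions made for this example and are not taken from the paper; the sketch reads the whole graph and says nothing about query complexity.

```python
def kings(adj):
    """Return the sorted list of kings of a digraph.

    adj maps each vertex to the set of its out-neighbours (an assumed
    representation). A vertex v is a king if every other vertex is
    reachable from v by a path of length at most 2.
    """
    result = []
    for v in adj:
        reach = set(adj[v])      # vertices reached by paths of length 1
        for u in adj[v]:
            reach |= adj[u]      # vertices reached by paths of length 2
        reach.add(v)             # v itself does not need to be reached
        if reach >= set(adj):
            result.append(v)
    return sorted(result)


# A 3-cycle tournament: 0 -> 1 -> 2 -> 0.
cycle = {0: {1}, 1: {2}, 2: {0}}
# A transitive tournament: 0 beats 1 and 2, and 1 beats 2.
transitive = {0: {1, 2}, 1: {2}, 2: set()}
```

In the 3-cycle every vertex is a king, while the transitive tournament has exactly one king, the vertex that beats all others.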
Submitted 27 April, 2025;
originally announced April 2025.
-
Sculpting Subspaces: Constrained Full Fine-Tuning in LLMs for Continual Learning
Authors:
Nikhil Shivakumar Nayak,
Krishnateja Killamsetty,
Ligong Han,
Abhishek Bhandwaldar,
Prateek Chanda,
Kai Xu,
Hao Wang,
Aldo Pareja,
Oleg Silkin,
Mustafa Eyceoz,
Akash Srivastava
Abstract:
Continual learning in large language models (LLMs) is prone to catastrophic forgetting, where adapting to new tasks significantly degrades performance on previously learned ones. Existing methods typically rely on low-rank, parameter-efficient updates that limit the model's expressivity and introduce additional parameters per task, leading to scalability issues. To address these limitations, we propose a novel continual full fine-tuning approach leveraging adaptive singular value decomposition (SVD). Our method dynamically identifies task-specific low-rank parameter subspaces and constrains updates to be orthogonal to critical directions associated with prior tasks, thus effectively minimizing interference without additional parameter overhead or storing previous task gradients. We evaluate our approach extensively on standard continual learning benchmarks using both encoder-decoder (T5-Large) and decoder-only (LLaMA-2 7B) models, spanning diverse tasks including classification, generation, and reasoning. Empirically, our method achieves state-of-the-art results, up to 7% higher average accuracy than recent baselines like O-LoRA, and notably maintains the model's general linguistic capabilities, instruction-following accuracy, and safety throughout the continual learning process by reducing forgetting to near-negligible levels. Our adaptive SVD framework effectively balances model plasticity and knowledge retention, providing a practical, theoretically grounded, and computationally scalable solution for continual learning scenarios in large language models.
Submitted 9 April, 2025;
originally announced April 2025.
-
Mathematical Modeling of Option Pricing with an Extended Black-Scholes Framework
Authors:
Nikhil Shivakumar Nayak
Abstract:
This study investigates enhancing option pricing by extending the Black-Scholes model to include stochastic volatility and interest rate variability within the Partial Differential Equation (PDE). The PDE is solved using the finite difference method. The extended Black-Scholes model and a machine learning-based LSTM model are developed and evaluated for pricing Google stock options. Both models were backtested using historical market data. While the LSTM model exhibited higher predictive accuracy, the finite difference method demonstrated superior computational efficiency. This work provides insights into model performance under varying market conditions and emphasizes the potential of hybrid approaches for robust financial modeling.
Submitted 13 April, 2025; v1 submitted 4 April, 2025;
originally announced April 2025.
-
Optimal Erasure Codes and Codes on Graphs
Authors:
Yeyuan Chen,
Mahdi Cheraghchi,
Nikhil Shagrithaya
Abstract:
We construct constant-sized ensembles of linear error-correcting codes over any fixed alphabet that can correct a given fraction of adversarial erasures at rates approaching the Singleton bound arbitrarily closely. We provide several applications of our results:
1. Explicit constructions of strong linear seeded symbol-fixing extractors and lossless condensers, over any fixed alphabet, with only a constant seed length and optimal output lengths;
2. A strongly explicit construction of erasure codes on bipartite graphs (more generally, linear codes on matrices of arbitrary dimensions) with optimal rate and erasure-correction trade-offs;
3. A strongly explicit construction of erasure codes on non-bipartite graphs (more generally, linear codes on symmetric square matrices) achieving improved rates;
4. A strongly explicit construction of linear nearly-MDS codes over constant-sized alphabets that can be encoded and decoded in quasi-linear time.
Submitted 3 April, 2025;
originally announced April 2025.
-
Graph Attention for Heterogeneous Graphs with Positional Encoding
Authors:
Nikhil Shivakumar Nayak
Abstract:
Graph Neural Networks (GNNs) have emerged as the de facto standard for modeling graph data, with attention mechanisms and transformers significantly enhancing their performance on graph-based tasks. Despite these advancements, the performance of GNNs on heterogeneous graphs often remains complex, with networks generally underperforming compared to their homogeneous counterparts. This work benchmarks various GNN architectures to identify the most effective methods for heterogeneous graphs, with a particular focus on node classification and link prediction. Our findings reveal that graph attention networks excel in these tasks. As a main contribution, we explore enhancements to these attention networks by integrating positional encodings for node embeddings. This involves utilizing the full Laplacian spectrum to accurately capture both the relative and absolute positions of each node within the graph, further enhancing performance on downstream tasks such as node classification and link prediction.
Submitted 3 April, 2025;
originally announced April 2025.
-
Residual-based Chebyshev filtered subspace iteration for sparse Hermitian eigenvalue problems tolerant to inexact matrix-vector products
Authors:
Nikhil Kodali,
Kartick Ramakrishnan,
Phani Motamarri
Abstract:
Chebyshev Filtered Subspace Iteration (ChFSI) has been widely adopted for computing a small subset of extreme eigenvalues in large sparse matrices. This work introduces a residual-based reformulation of ChFSI, referred to as R-ChFSI, designed to accommodate inexact matrix-vector products while maintaining robust convergence properties. By reformulating the traditional Chebyshev recurrence to operate on residuals rather than eigenvector estimates, the R-ChFSI approach effectively suppresses the errors made in matrix-vector products, improving the convergence behaviour for both standard and generalized eigenproblems. This ability of R-ChFSI to be tolerant to inexact matrix-vector products allows one to incorporate approximate inverses for large-scale generalized eigenproblems, making the method particularly attractive where exact matrix factorizations or iterative methods become computationally expensive for evaluating inverses. It also allows us to compute the matrix-vector products in lower-precision arithmetic allowing us to leverage modern hardware accelerators. Through extensive benchmarking, we demonstrate that R-ChFSI achieves desired residual tolerances while leveraging low-precision arithmetic. For problems with millions of degrees of freedom and thousands of eigenvalues, R-ChFSI attains final residual norms in the range of $10^{-12}$ to $10^{-14}$, even with FP32 and TF32 arithmetic, significantly outperforming standard ChFSI in similar settings. In generalized eigenproblems, where approximate inverses are used, R-ChFSI achieves residual tolerances up to ten orders of magnitude lower, demonstrating its robustness to approximation errors. Finally, R-ChFSI provides a scalable and computationally efficient alternative for solving large-scale eigenproblems in high-performance computing environments.
Submitted 14 April, 2025; v1 submitted 28 March, 2025;
originally announced March 2025.
-
Secant varieties of Segre-Veronese varieties $\mathbb{P}^m\times\mathbb{P}^n$ embedded by $\mathcal{O}(1,2)$ are non-defective for $n\gg m^3$, $m\geq3$
Authors:
Matěj Doležálek,
Nikhil Ken
Abstract:
We prove that for any $m\geq3$, $n\gg m^3$, all secant varieties of the Segre-Veronese variety $\mathbb{P}^m\times\mathbb{P}^n$ have the expected dimension. This was already proved by Abo and Brambilla in the subabundant case, hence we focus on the superabundant case. We generalize an approach due to Brambilla and Ottaviani into a construction we call the inductant. With this, the proof of non-defectivity reduces to checking a finite collection of base cases, which we verify using a computer-assisted proof.
Submitted 27 March, 2025;
originally announced March 2025.
-
The equivalence of two Pin(2)-equivariant Seiberg-Witten Floer homologies
Authors:
Nikhil Pandit
Abstract:
We show that for a rational homology 3-sphere $Y$ equipped with a self-conjugate spin$^c$-structure $\mathfrak s$, the $\operatorname{Pin}(2)$-equivariant monopole Floer homology of $(Y,\mathfrak s)$, as defined by Lin, is isomorphic to the $\operatorname{Pin}(2)$-equivariant Seiberg-Witten Floer homology of $(Y,\mathfrak s)$ defined by Manolescu.
Submitted 4 March, 2025;
originally announced March 2025.
-
Near-Optimal List-Recovery of Linear Code Families
Authors:
Ray Li,
Nikhil Shagrithaya
Abstract:
We prove several results on linear codes achieving list-recovery capacity. We show that random linear codes achieve list-recovery capacity with constant output list size (independent of the alphabet size and length). That is, over alphabets of size at least $\ell^{Ω(1/\varepsilon)}$, random linear codes of rate $R$ are $(1-R-\varepsilon, \ell, (\ell/\varepsilon)^{O(\ell/\varepsilon)})$-list-recoverable for all $R\in(0,1)$ and $\ell$. Together with a result of Levi, Mosheiff, and Shagrithaya, this implies that randomly punctured Reed-Solomon codes also achieve list-recovery capacity. We also prove that our output list size is near-optimal among all linear codes: all $(1-R-\varepsilon, \ell, L)$-list-recoverable linear codes must have $L\ge \ell^{Ω(R/\varepsilon)}$.
Our simple upper bound combines the Zyablov-Pinsker argument with recent bounds from Kopparty, Ron-Zewi, Saraf, Wootters, and Tamo on the maximum intersection of a "list-recovery ball" and a low-dimensional subspace with large distance. Our lower bound is inspired by a recent lower bound of Chen and Zhang.
Submitted 27 February, 2025; v1 submitted 19 February, 2025;
originally announced February 2025.
-
Analysis of a central MUSCL-type scheme for conservation laws with discontinuous flux
Authors:
Nikhil Manoj,
Sudarshan Kumar K
Abstract:
In this article, we propose a second-order central scheme of the Nessyahu-Tadmor-type for a class of scalar conservation laws with discontinuous flux and present its convergence analysis. Since solutions to problems with discontinuous flux generally do not belong to the space of bounded variation (BV), we employ the theory of compensated compactness to establish the convergence of approximate solutions. A major component of our analysis involves deriving the maximum principle and showing the $\mathrm{W}^{-1,2}_{\mathrm{loc}}$ compactness of a sequence constructed from approximate solutions. The latter is achieved through the derivation of several essential estimates on the approximate solutions. Furthermore, by incorporating a mesh-dependent correction term in the slope limiter, we show that the numerical solutions generated by the proposed second-order scheme converge to the entropy solution. Finally, we validate our theoretical results by presenting numerical examples.
Submitted 24 March, 2025; v1 submitted 8 January, 2025;
originally announced January 2025.
-
Ramanujan Graphs and Interlacing Families
Authors:
Nikhil Srivastava
Abstract:
This survey accompanies a lecture on the paper ``Interlacing Families I: Bipartite Ramanujan Graphs of All Degrees'' by A. Marcus, D. Spielman, and N. Srivastava at the 2024 International Congress of Basic Science (ICBS) in July, 2024. Its purpose is to explain the developments surrounding this work over the past ten or so years, with an emphasis on connections to other areas of mathematics. Earlier surveys about the interlacing families method by the same authors focused on applications in functional analysis, whereas the focus here is on applications in spectral graph theory.
Submitted 30 December, 2024;
originally announced December 2024.
-
A positivity preserving second-order scheme for multi-dimensional system of non-local conservation laws
Authors:
Nikhil Manoj,
G. D. Veerappa Gowda,
Sudarshan Kumar K
Abstract:
Non-local systems of conservation laws play a crucial role in modeling flow mechanisms across various scenarios. The well-posedness of such problems is typically established by demonstrating the convergence of robust first-order schemes. However, achieving more accurate solutions necessitates the development of higher-order schemes. In this article, we present a fully discrete, second-order scheme for a general class of non-local conservation law systems in multiple spatial dimensions. The method employs a MUSCL-type spatial reconstruction coupled with Runge-Kutta time integration. The proposed scheme is proven to preserve positivity in all the unknowns and exhibits L-infinity stability. Numerical experiments conducted on both the non-local scalar and system cases illustrate the importance of the second-order scheme when compared to its first-order counterpart.
Submitted 5 January, 2025; v1 submitted 24 December, 2024;
originally announced December 2024.
-
Results on normal harmonic and $\varphi$-normal harmonic mappings
Authors:
Nikhil Bharti,
Nguyen Van Thin
Abstract:
In this paper, we study the concepts of normal functions and $\varphi$-normal functions in the framework of planar harmonic mappings. We establish the harmonic mapping counterpart of the well-known Zalcman-Pang lemma and as a consequence, we prove that a harmonic mapping whose spherical derivative is bounded away from zero is normal. Furthermore, we introduce the concept of the extended spherical derivative for harmonic mappings and obtain several sufficient conditions for a harmonic mapping to be $\varphi$-normal.
Submitted 6 December, 2024;
originally announced December 2024.
-
Sparse Pseudospectral Shattering
Authors:
Rikhav Shah,
Nikhil Srivastava,
Edward Zeng
Abstract:
The eigenvalues and eigenvectors of nonnormal matrices can be unstable under perturbations of their entries. This renders an obstacle to the analysis of numerical algorithms for non-Hermitian eigenvalue problems. A recent technique to handle this issue is pseudospectral shattering [BGVKS23], showing that adding a random perturbation to any matrix has a regularizing effect on the stability of the eigenvalues and eigenvectors. Prior work has analyzed the regularizing effect of dense Gaussian perturbations, where independent noise is added to every entry of a given matrix [BVKS20, BGVKS23, BKMS21, JSS21].
We show that the same effect can be achieved by adding a sparse random perturbation. In particular, we show that given any $n\times n$ matrix $M$ of polynomially bounded norm: (a) perturbing $O(n\log^2(n))$ random entries of $M$ by adding i.i.d. complex Gaussians yields $\logκ_V(A)=O(\text{poly}\log(n))$ and $\log (1/η(A))=O(\text{poly}\log(n))$ with high probability; (b) perturbing $O(n^{1+α})$ random entries of $M$ for any constant $α>0$ yields $\logκ_V(A)=O_α(\log(n))$ and $\log(1/η(A))=O_α(\log(n))$ with high probability. Here, $κ_V(A)$ denotes the condition number of the eigenvectors of the perturbed matrix $A$ and $η(A)$ denotes its minimum eigenvalue gap.
A key mechanism of the proof is to reduce the study of $κ_V(A)$ to control of the pseudospectral area and minimum eigenvalue gap of $A$, which are further reduced to estimates on the least two singular values of shifts of $A$. We obtain the required least singular value estimates via a streamlining of an argument of Tao and Vu [TV07] specialized to the case of sparse complex Gaussian perturbations.
Submitted 17 June, 2025; v1 submitted 29 November, 2024;
originally announced November 2024.
-
Mahler's $\frac{3}{2}$ problem in $\mathbb{Z}^{+} $
Authors:
Nikhil S Kumar
Abstract:
Submitted 18 June, 2025; v1 submitted 5 November, 2024;
originally announced November 2024.
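The Z-number definition in the abstract above can be checked directly for positive integers. The following is a minimal sketch, not the paper's proof: the function name `first_violation` and the search bound `max_n` are illustrative assumptions, and exact rational arithmetic is used so that fractional parts are not affected by rounding.

```python
from fractions import Fraction


def first_violation(x, max_n=64):
    """Return the smallest n >= 0 with frac(x * (3/2)**n) >= 1/2.

    Returns None if no such n exists up to max_n (hypothetical search bound).
    """
    value = Fraction(x)
    half = Fraction(1, 2)
    for n in range(max_n + 1):
        # Fractional part of a non-negative rational number.
        frac = value - (value.numerator // value.denominator)
        if frac >= half:
            return n
        value *= Fraction(3, 2)
    return None


# x = 4 gives 4, 6, 9, 27/2, so the condition first fails at n = 3.
print(first_violation(4))
```

A number for which `first_violation` returns an index is not a Z-number, since the condition must hold for every $n \ge 0$.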
-
How Does Critical Batch Size Scale in Pre-training?
Authors:
Hanlin Zhang,
Depen Morwani,
Nikhil Vyas,
Jingfeng Wu,
Difan Zou,
Udaya Ghai,
Dean Foster,
Sham Kakade
Abstract:
Training large-scale models under given resources requires careful design of parallelism strategies. In particular, the efficiency notion of critical batch size (CBS), concerning the compromise between time and compute, marks the threshold beyond which greater data parallelism leads to diminishing returns. To operationalize it, we propose a measure of CBS and pre-train a series of auto-regressive language models, ranging from 85 million to 1.2 billion parameters, on the C4 dataset. Through extensive hyper-parameter sweeps and careful control of factors such as batch size, momentum, and learning rate along with its scheduling, we systematically investigate the impact of scale on CBS. Then we fit scaling laws with respect to model and data sizes to decouple their effects. Overall, our results demonstrate that CBS scales primarily with data size rather than model size, a finding we justify theoretically through the analysis of infinite-width limits of neural networks and infinite-dimensional least squares regression. Of independent interest, we highlight the importance of common hyper-parameter choices and strategies for studying large-scale pre-training beyond fixed training durations.
Submitted 21 April, 2025; v1 submitted 28 October, 2024;
originally announced October 2024.
-
Normal Families of Holomorphic Curves and Sharing of Moving Hyperplanes Wandering on $\mathbb{P}^n$
Authors:
Gopal Datt,
Naveen Gupta,
Nikhil Khanna,
Ritesh Pal
Abstract:
In this paper, we extend a result of Schwick concerning normality and sharing values in one complex variable for families of holomorphic curves taking values in $\mathbb{P}^n$. We consider wandering moving hyperplanes (i.e., depending on the respective holomorphic curve in the family under consideration), and establish a sufficient condition of normality concerning shared hyperplanes.
Submitted 17 October, 2024;
originally announced October 2024.
-
Khovanov homology and quantum error-correcting codes
Authors:
Milena Harned,
Pranav Venkata Konda,
Felix Shanglin Liu,
Nikhil Mudumbi,
Eric Yuang Shao,
Zheheng Xiao
Abstract:
Error-correcting codes for quantum computing are crucial to address the fundamental problem of communication in the presence of noise and imperfections. Audoux used Khovanov homology to define families of quantum error-correcting codes with desirable properties. We explore Khovanov homology and some of its many extensions, namely reduced, annular, and $\mathfrak{sl}_3$ homology, to generate new families of quantum codes and to establish several properties about codes that arise in this way, such as behavior of distance under Reidemeister moves or connected sums.
Submitted 15 October, 2024;
originally announced October 2024.
-
Fibonacci Partial Sums Tricks
Authors:
Nikhil Byrapuram,
Adam Ge,
Selena Ge,
Tanya Khovanova,
Sylvia Zia Lee,
Rajarshi Mandal,
Gordon Redwine,
Soham Samanta,
Daniel Wu,
Danyang Xu,
Ray Zhao
Abstract:
The following magic trick is at the center of this paper. While the audience writes the first ten terms of a Fibonacci-like sequence (the sequence following the same recursion as the Fibonacci sequence), the magician calculates the sum of these ten terms very fast by multiplying the 7th term by 11. This trick is based on the divisibility properties of partial sums of Fibonacci-like sequences. We find the maximum Fibonacci number that divides the sum of the Fibonacci numbers 1 through $n$. We discuss the generalization of the trick for other second-order recurrences. We show that a similar trick exists for Pell-like sequences and does not exist for Jacobsthal-like sequences.
Submitted 2 September, 2024;
originally announced September 2024.
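The trick described in the abstract above is easy to verify numerically. The sketch below is illustrative only: the helper name `fibonacci_like` and the starting values are assumptions, not taken from the paper. Writing the first two terms as $a$ and $b$, the ten terms sum to $55a + 88b$, which is $11$ times the 7th term $5a + 8b$.

```python
def fibonacci_like(first, second, count=10):
    """Return the first `count` terms of the sequence t(n) = t(n-1) + t(n-2)."""
    terms = [first, second]
    while len(terms) < count:
        terms.append(terms[-1] + terms[-2])
    return terms[:count]


# Arbitrary starting values chosen by the "audience".
terms = fibonacci_like(3, 7)

# The magician's shortcut: 11 times the 7th term (index 6) equals the sum.
print(sum(terms), 11 * terms[6])
```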
-
Representative Arm Identification: A fixed confidence approach to identify cluster representatives
Authors:
Sarvesh Gharat,
Aniket Yadav,
Nikhil Karamchandani,
Jayakrishnan Nair
Abstract:
We study the representative arm identification (RAI) problem in the multi-armed bandits (MAB) framework, wherein we have a collection of arms, each associated with an unknown reward distribution. An underlying instance is defined by a partitioning of the arms into clusters of predefined sizes, such that for any $j > i$, all arms in cluster $i$ have a larger mean reward than those in cluster $j$. The goal in RAI is to reliably identify a certain prespecified number of arms from each cluster, while using as few arm pulls as possible. The RAI problem covers as special cases several well-studied MAB problems such as identifying the best arm or any $M$ out of the top $K$, as well as both full and coarse ranking. We start by providing an instance-dependent lower bound on the sample complexity of any feasible algorithm for this setting. We then propose two algorithms, based on the idea of confidence intervals, and provide high probability upper bounds on their sample complexity, which orderwise match the lower bound. Finally, we do an empirical comparison of both algorithms along with an LUCB-type alternative on both synthetic and real-world datasets, and demonstrate the superior performance of our proposed schemes in most cases.
Submitted 26 August, 2024;
originally announced August 2024.
-
Weighted Yosida Mappings of Several Complex Variables
Authors:
Nikhil Bharti,
Nguyen Van Thin
Abstract:
Let $M$ be a complete complex Hermitian manifold with metric $E_{M}$ and let $\varphi: [0,\infty)\rightarrow (0,\infty)$ be a positive function such that $$γ_r=\sup\limits_{r\leq a<b}\left|(\varphi(a)-\varphi(b))/(a-b)\right|\leq C,~r\in (0,\infty),$$ for some $C\in (0,1],$ and $\lim_{r\rightarrow\infty}γ_r=0.$ A holomorphic mapping $f:\mathbb{C}^{m}\rightarrow M$ is said to be a weighted Yosida mapping if for any $z,~ξ\in\mathbb{C}^{m}$ with $\|ξ\|=1,$ the quantity $\varphi(\|z\|)E_{M}(f(z), df(z)(ξ))$ remains bounded above, where $df(z)$ is the map from $T_z(\mathbb{C}^{m})$ to $T_{f(z)}(M)$ induced by $f.$ We present several criteria of holomorphic mappings belonging to the class of all weighted Yosida mappings.
Submitted 13 August, 2024;
originally announced August 2024.
-
Quasi-Monte Carlo Beyond Hardy-Krause
Authors:
Nikhil Bansal,
Haotian Jiang
Abstract:
The classical approaches to numerically integrating a function $f$ are Monte Carlo (MC) and quasi-Monte Carlo (QMC) methods. MC methods use random samples to evaluate $f$ and have error $O(σ(f)/\sqrt{n})$, where $σ(f)$ is the standard deviation of $f$. QMC methods are based on evaluating $f$ at explicit point sets with low discrepancy, and as given by the classical Koksma-Hlawka inequality, they have error $\widetilde{O}(σ_{\mathsf{HK}}(f)/n)$, where $σ_{\mathsf{HK}}(f)$ is the variation of $f$ in the sense of Hardy and Krause. These two methods have distinctive advantages and shortcomings, and a fundamental question is to find a method that combines the advantages of both.
In this work, we give a simple randomized algorithm that produces QMC point sets with the following desirable features: (1) It achieves substantially better error than given by the classical Koksma-Hlawka inequality. In particular, it has error $\widetilde{O}(σ_{\mathsf{SO}}(f)/n)$, where $σ_{\mathsf{SO}}(f)$ is a new measure of variation that we introduce, which is substantially smaller than the Hardy-Krause variation. (2) The algorithm only requires random samples from the underlying distribution, which makes it as flexible as MC. (3) It automatically achieves the best of both MC and QMC (and the above improvement over Hardy-Krause variation) in an optimal way. (4) The algorithm is extremely efficient, with an amortized $\widetilde{O}(1)$ runtime per sample.
Our method is based on the classical transference principle in geometric discrepancy, combined with recent algorithmic innovations in combinatorial discrepancy that besides producing low-discrepancy colorings, also guarantee certain subgaussian properties. This allows us to bypass several limitations of previous works in bridging the gap between MC and QMC methods and go beyond the Hardy-Krause variation.
Submitted 12 August, 2024;
originally announced August 2024.
-
Equitable Congestion Pricing under the Markovian Traffic Model: An Application to Bogota
Authors:
Alfredo Torrico,
Natthawut Boonsiriphatthanajaroen,
Nikhil Garg,
Andrea Lodi,
Hugo Mainguy
Abstract:
Congestion pricing is used to raise revenues and reduce traffic and pollution. However, people have heterogeneous spatial demand patterns and willingness (or ability) to pay tolls, and so pricing may have substantial equity implications. We develop a data-driven approach to design congestion pricing given policymakers' equity and efficiency objectives. First, algorithmically, we extend the Markovian traffic equilibrium setting introduced by Baillon & Cominetti (2008) to model heterogeneous populations and incorporate prices and outside options such as public transit. Second, we empirically evaluate various pricing schemes using data collected by an industry partner in the city of Bogota, one of the most congested cities in the world. We find that pricing personalized to each economic stratum can be substantially more efficient and equitable than uniform pricing; however, non-personalized but area-based pricing can recover much of the gap.
Submitted 6 July, 2024;
originally announced July 2024.
-
On the Anatomy of Attention
Authors:
Nikhil Khatri,
Tuomas Laakkonen,
Jonathon Liu,
Vincent Wang-Maścianica
Abstract:
We introduce a category-theoretic diagrammatic formalism in order to systematically relate and reason about machine learning models. Our diagrams present architectures intuitively but without loss of essential detail, where natural relationships between models are captured by graphical transformations, and important differences and similarities can be identified at a glance. In this paper, we focus on attention mechanisms: translating folklore into mathematical derivations, and constructing a taxonomy of attention variants in the literature. As a first example of an empirical investigation underpinned by our formalism, we identify recurring anatomical components of attention, which we exhaustively recombine to explore a space of variations on the attention mechanism.
Submitted 7 July, 2024; v1 submitted 2 July, 2024;
originally announced July 2024.
-
A New Perspective on Shampoo's Preconditioner
Authors:
Depen Morwani,
Itai Shapira,
Nikhil Vyas,
Eran Malach,
Sham Kakade,
Lucas Janson
Abstract:
Shampoo, a second-order optimization algorithm which uses a Kronecker product preconditioner, has recently garnered increasing attention from the machine learning community. The preconditioner used by Shampoo can be viewed either as an approximation of the Gauss--Newton component of the Hessian or the covariance matrix of the gradients maintained by Adagrad. We provide an explicit and novel connection between the $\textit{optimal}$ Kronecker product approximation of these matrices and the approximation made by Shampoo. Our connection highlights a subtle but common misconception about Shampoo's approximation. In particular, the $\textit{square}$ of the approximation used by the Shampoo optimizer is equivalent to a single step of the power iteration algorithm for computing the aforementioned optimal Kronecker product approximation. Across a variety of datasets and architectures we empirically demonstrate that this is close to the optimal Kronecker product approximation. Additionally, for the Hessian approximation viewpoint, we empirically study the impact of various practical tricks to make Shampoo more computationally efficient (such as using the batch gradient and the empirical Fisher) on the quality of Hessian approximation.
Submitted 25 June, 2024;
originally announced June 2024.
-
Fibonometry and Beyond
Authors:
Nikhil Byrapuram,
Adam Ge,
Selena Ge,
Tanya Khovanova,
Sylvia Zia Lee,
Rajarshi Mandal,
Gordon Redwine,
Soham Samanta,
Daniel Wu,
Danyang Xu,
Ray Zhao
Abstract:
In 2013, Conway and Ryba wrote a fascinating paper called Fibonometry. The paper, as one might guess, is about the connection between Fibonacci numbers and trigonometry. We were fascinated by this paper and looked at how we could generalize it. We discovered that we weren't the first. In this paper, we describe our journey and summarize the results.
Submitted 19 May, 2024;
originally announced May 2024.
-
On Convergence of the Iteratively Preconditioned Gradient-Descent (IPG) Observer
Authors:
Kushal Chakrabarti,
Nikhil Chopra
Abstract:
This paper considers the observer design problem for discrete-time nonlinear dynamical systems with sampled measurement data. The recently proposed Iteratively Preconditioned Gradient-Descent (IPG) observer, a Newton-type observer, has been empirically shown to be more robust against measurement noise than the prominent nonlinear observers, a property that other Newton-type observers lack. However, no theoretical guarantees on the convergence of the IPG observer were provided. This paper presents a rigorous convergence analysis of the IPG observer for a class of nonlinear systems in deterministic settings, proving its local linear convergence to the actual trajectory. Our assumptions are standard in the existing literature of Newton-type observers, and the analysis further confirms the relation of the IPG observer with the Newton observer, which was only hypothesized earlier.
Submitted 15 May, 2024;
originally announced May 2024.
-
On the Ground State Energies of Discrete and Semiclassical Schrödinger Operators
Authors:
Isabel Detherage,
Nikhil Srivastava,
Zachary Stier
Abstract:
We study the infimum of the spectrum, or ground state energy (g.s.e.), of a discrete Schrödinger operator on $θ\mathbb{Z}^d$ parameterized by a potential $V:\mathbb{R}^d\rightarrow\mathbb{R}_{\ge 0}$ and a frequency parameter $θ\in (0,1)$. We relate this g.s.e. to that of a corresponding continuous semiclassical Schrödinger operator on $\mathbb{R}^d$ with parameter $θ$, arising from the same choice of potential. We show that: the discrete g.s.e. is at most the continuous one for continuous periodic $V$ and irrational $θ$; the opposite inequality holds up to a factor of $1-o(1)$ as $θ\rightarrow 0$ for sufficiently regular smooth periodic $V$; and the opposite inequality holds up to a constant factor for every bounded $V$ and $θ$ with the property that discrete and continuous averages of $V$ on fundamental domains of $θ\mathbb{Z}^d$ are comparable. Our proofs are elementary and rely on sampling and interpolation to map low-energy functions for the discrete operator on $θ\mathbb{Z}^d$ to low-energy functions for the continuous operator on $\mathbb{R}^d$, and vice versa.
Submitted 6 July, 2024; v1 submitted 9 May, 2024;
originally announced May 2024.
-
Designing a K-state P-bit Engine
Authors:
Mohammad Khairul Bashar,
Abir Hasan,
Nikhil Shukla
Abstract:
Probabilistic bit (p-bit)-based compute engines utilize the unique capability of a p-bit to probabilistically switch between two states to solve computationally challenging problems. However, when solving problems that require more than two states (e.g., problems such as Max-3-Cut, verifying if a graph is K-partite (K>2) etc.), additional pre-processing steps such as graph reduction are required to make the problem compatible with a two-state p-bit platform. Moreover, this not only increases the problem size by entailing the use of auxiliary variables but can also degrade the solution quality. In this work, we develop a unique framework for implementing a K-state (K>2) p-bit engine. Furthermore, from an implementation standpoint, we show that such a K-state p-bit engine can be implemented using N traditional (2-state) p-bits, and one multi-state p-bit -- a novel concept proposed here. Augmenting traditional p-bit platforms, our approach enables us to solve an archetypal combinatoric problem class requiring multiple states, namely Max-K-Cut (K=3, 4 shown here), without using any additional auxiliary variables. Thus, our work fundamentally advances the functional capability of p-bit engines, enabling them to solve a broader class of computationally challenging problems more efficiently.
Submitted 27 March, 2024; v1 submitted 11 March, 2024;
originally announced March 2024.
-
Reinforcement Learning Paycheck Optimization for Multivariate Financial Goals
Authors:
Melda Alaluf,
Giulia Crippa,
Sinong Geng,
Zijian Jing,
Nikhil Krishnan,
Sanjeev Kulkarni,
Wyatt Navarro,
Ronnie Sircar,
Jonathan Tang
Abstract:
We study paycheck optimization, which examines how to allocate income in order to achieve several competing financial goals. For paycheck optimization, a quantitative methodology is missing, due to a lack of a suitable problem formulation. To deal with this issue, we formulate the problem as a utility maximization problem. The proposed formulation is able to (i) unify different financial goals; (ii) incorporate user preferences regarding the goals; (iii) handle stochastic interest rates. The proposed formulation also facilitates an end-to-end reinforcement learning solution, which is implemented on a variety of problem settings.
Submitted 9 March, 2024;
originally announced March 2024.
-
Wisdom and Foolishness of Noisy Matching Markets
Authors:
Kenny Peng,
Nikhil Garg
Abstract:
We consider a many-to-one matching market where colleges share true preferences over students but make decisions using only independent noisy rankings. Each student has a true value $v$, but each college $c$ ranks the student according to an independently drawn estimated value $v + X_c$ for $X_c\sim \mathcal{D}.$ We ask a basic question about the resulting stable matching: How noisy is the set of matched students? Two striking effects can occur in large markets (i.e., with a continuum of students and a large number of colleges). When $\mathcal{D}$ is light-tailed, noise is fully attenuated: only the highest-value students are matched. When $\mathcal{D}$ is long-tailed, noise is fully amplified: students are matched uniformly at random. These results hold for any distribution of student preferences over colleges, and extend to when only subsets of colleges agree on true student valuations instead of the entire market. More broadly, our framework provides a tractable approach to analyze implications of imperfect preference formation in large markets.
Submitted 26 February, 2024;
originally announced February 2024.
-
Clustering Techniques for Stable Linear Dynamical Systems with applications to Hard Disk Drives
Authors:
Nikhil Potu Surya Prakash,
Joohwan Seo,
Jongeun Choi,
Roberto Horowitz
Abstract:
In Robust Control and Data Driven Robust Control design methodologies, multiple plant transfer functions or a family of transfer functions are considered and a common controller is designed such that all the plants that fall into this family are stabilized. Though the plants are stabilized, the controller might be sub-optimal for each of the plants when the variations in the plants are large. This paper presents a way of clustering stable linear dynamical systems for the design of robust controllers within each of the clusters such that the controllers are optimal for each of the clusters. First a k-medoids algorithm for hard clustering will be presented for stable Linear Time Invariant (LTI) systems and then a Gaussian Mixture Models (GMM) clustering for a special class of LTI systems, common for Hard Disk Drive plants, will be presented.
Submitted 16 November, 2023;
originally announced November 2023.
-
Geometric quantization results for semi-positive line bundles on a Riemann surface
Authors:
George Marinescu,
Nikhil Savale
Abstract:
In earlier work the authors proved the Bergman kernel expansion for semipositive line bundles over a Riemann surface whose curvature vanishes to at most finite order at each point. Here we explore the related results and consequences of the expansion in the semipositive case including: Tian's approximation theorem for induced Fubini-Study metrics, leading order asymptotics and composition for Toeplitz operators, asymptotics of zeroes for random sections and the asymptotics of holomorphic torsion.
Submitted 22 October, 2023;
originally announced October 2023.
-
Maximum Number of Quads
Authors:
Nikhil Byrapuram,
Hwiseo Choi,
Adam Ge,
Selena Ge,
Tanya Khovanova,
Sylvia Zia Lee,
Evin Liang,
Rajarshi Mandal,
Aika Oki,
Daniel Wu,
Michael Yang
Abstract:
We study the maximum number of quads among $\ell$ cards from an EvenQuads deck of size $2^n$. This corresponds to enumerating quadruples of integers in the range $[0,\ell-1]$ such that their bitwise XOR is zero. In this paper, we conjecture a formula that calculates the maximum number of quads among $\ell$ cards.
Submitted 14 October, 2023;
originally announced October 2023.
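The counting problem stated in the abstract above can be explored by brute force for small $\ell$. This is a minimal sketch under an assumption: a quad is taken to be an unordered set of four distinct integers in $[0,\ell-1]$ whose bitwise XOR is zero. The function name `count_quads` is illustrative, and the code only counts quads among the specific cards $0,\dots,\ell-1$; it does not reproduce the conjectured formula from the paper.

```python
from itertools import combinations


def count_quads(num_cards):
    """Count 4-element subsets of {0, ..., num_cards - 1} with XOR equal to zero."""
    return sum(
        1
        for a, b, c, d in combinations(range(num_cards), 4)
        if a ^ b ^ c ^ d == 0
    )


# Counts for the first few values of the number of cards.
for num_cards in range(4, 9):
    print(num_cards, count_quads(num_cards))
```

The enumeration is $O(\ell^4)$, so it is only practical for small card counts.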
-
A Note on Analyzing the Stability of Oscillator Ising Machines
Authors:
Mohammad Khairul Bashar,
Zongli Lin,
Nikhil Shukla
Abstract:
The rich non-linear dynamics of the coupled oscillators (under second harmonic injection) can be leveraged to solve computationally hard problems in combinatorial optimization such as finding the ground state of the Ising Hamiltonian. While prior work on the stability of the so-called Oscillator Ising Machines (OIMs) has used the linearization method, in this letter, we present a complementary method to analyze stability using the second order derivative test of the energy / cost function. We establish the equivalence between the two methods, thus augmenting the tool kit for the design and implementation of OIMs.
Submitted 13 October, 2023;
originally announced October 2023.
-
Polyak Minorant Method for Convex Optimization
Authors:
Nikhil Devanathan,
Stephen Boyd
Abstract:
In 1963 Boris Polyak suggested a particular step size for gradient descent methods, now known as the Polyak step size, that he later adapted to subgradient methods. The Polyak step size requires knowledge of the optimal value of the minimization problem, which is a strong assumption but one that holds for several important problems. In this paper we extend Polyak's method to handle constraints and, as a generalization of subgradients, general minorants, which are convex functions that tightly lower bound the objective and constraint functions. We refer to this algorithm as the Polyak Minorant Method (PMM). It is closely related to cutting-plane and bundle methods.
Submitted 3 April, 2024; v1 submitted 11 October, 2023;
originally announced October 2023.
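For orientation, the classical Polyak step size mentioned in the abstract above updates $x_{k+1} = x_k - \frac{f(x_k) - f^\star}{\|g_k\|^2}\, g_k$, where $g_k$ is a subgradient at $x_k$ and $f^\star$ is the known optimal value. The sketch below illustrates only this unconstrained subgradient variant, not the Polyak Minorant Method itself; the test function, the starting point, and all names are illustrative assumptions.

```python
def polyak_subgradient(f, subgrad, x0, f_star, iterations=100):
    """Subgradient method with the Polyak step size (requires the optimal value f_star)."""
    x = list(x0)
    for _ in range(iterations):
        gap = f(x) - f_star
        g = subgrad(x)
        g_norm_sq = sum(gi * gi for gi in g)
        if gap <= 0 or g_norm_sq == 0:
            break  # already optimal, or no descent information available
        step = gap / g_norm_sq
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x


def f(x):
    # Nonsmooth convex test function with optimal value 0 at the origin.
    return abs(x[0]) + 2 * abs(x[1])


def sign(v):
    return (v > 0) - (v < 0)


def subgrad(x):
    # One valid subgradient of f (taking 0 at the kinks).
    return [sign(x[0]), 2 * sign(x[1])]


x_final = polyak_subgradient(f, subgrad, [1.0, 1.0], f_star=0.0)
print(f(x_final))
```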
-
Using Elementary Techniques to Characterize the Relationship Between Wythoff's Game and the Golden Ratio
Authors:
Vincent Wang,
Nikhil Sampath,
Eric Yule,
Ethan Wang
Abstract:
Wythoff's game is a modification of the well-known game of "nim." Wythoff's game, which does not resemble the Fibonacci sequence, has a direct relation to the Golden ratio. We will explore the sequence behind this surprising relationship, and consider the implications of our elementary methods.
Submitted 9 October, 2023;
originally announced October 2023.
-
A Normal Criterion Concerning Sequence of Functions and their Differential Polynomials
Authors:
Nikhil Bharti
Abstract:
In this paper, a normality criterion concerning a sequence of meromorphic functions and their differential polynomials is obtained. Precisely, we have proved: Let $\left\{f_j\right\}$ be a sequence of meromorphic functions in the open unit disk $\mathbb{D}$ such that, for each $j,$ $f_j$ has poles of multiplicity at least $m,~m\in\mathbb{N}.$ Let $\left\{h_j\right\}$ be a sequence of holomorphic functions in $\mathbb{D}$ such that $h_j\rightarrow h$ locally uniformly in $\mathbb{D},$ where $h$ is holomorphic in $\mathbb{D}$ and $h\not\equiv 0.$ Let $Q[f_j]$ be a differential polynomial of $f_j$ having degree $\lambda_Q$ and weight $\mu_Q.$ If, for each $j,$ $f_j(z)\neq 0$ and $Q[f_j]-h_j$ has at most $\mu_Q + \lambda_Q(m-1)-1$ zeros, ignoring multiplicities, in $\mathbb{D},$ then $\left\{f_j\right\}$ is normal in $\mathbb{D}.$
Submitted 10 December, 2023; v1 submitted 10 October, 2023;
originally announced October 2023.
-
EvenQuads Game and Error-Correcting Codes
Authors:
Nikhil Byrapuram,
Hwiseo Choi,
Adam Ge,
Selena Ge,
Tanya Khovanova,
Sylvia Zia Lee,
Evin Liang,
Rajarshi Mandal,
Aika Oki,
Daniel Wu,
Michael Yang
Abstract:
EvenQuads is a new card game that is a generalization of the SET game, where each card is characterized by three attributes, each taking four possible values. Four cards form a quad when, for each attribute, the values are the same, all different, or half and half. Given $\ell$ cards from the deck of EvenQuads, we can build an error-correcting linear binary code of length $\ell$ and Hamming distance 4. The quads correspond to codewords of weight 4. Error-correcting codes help us calculate the possible number of quads when given up to 8 cards. We also estimate the number of cards that do not contain quads for decks of different sizes. In addition, we discuss properties of error-correcting codes built on semimagic, magic, and strongly magic quad squares.
Submitted 2 October, 2023;
originally announced October 2023.
-
Fixed confidence community mode estimation
Authors:
Meera Pai,
Nikhil Karamchandani,
Jayakrishnan Nair
Abstract:
Our aim is to estimate the largest community (a.k.a., mode) in a population composed of multiple disjoint communities. This estimation is performed in a fixed confidence setting via sequential sampling of individuals with replacement. We consider two sampling models: (i) an identityless model, wherein only the community of each sampled individual is revealed, and (ii) an identity-based model, wherein the learner is able to discern whether or not each sampled individual has been sampled before, in addition to the community of that individual. The former model corresponds to the classical problem of identifying the mode of a discrete distribution, whereas the latter seeks to capture the utility of identity information in mode estimation. For each of these models, we establish information theoretic lower bounds on the expected number of samples needed to meet the prescribed confidence level, and propose sound algorithms with a sample complexity that is provably asymptotically optimal. Our analysis highlights that identity information can indeed be utilized to improve the efficiency of community mode estimation.
Submitted 22 September, 2023;
originally announced September 2023.
-
Kähler-Einstein Bergman metrics on pseudoconvex domains of dimension two
Authors:
Nikhil Savale,
Ming Xiao
Abstract:
We prove that a two dimensional pseudoconvex domain of finite type with a Kähler-Einstein Bergman metric is biholomorphic to the unit ball. This answers an old question of Yau for such domains. The proof relies on asymptotics of derivatives of the Bergman kernel along critically tangent paths approaching the boundary, where the order of tangency equals the type of the boundary point being approached.
Submitted 23 March, 2025; v1 submitted 19 September, 2023;
originally announced September 2023.
-
Quad Squares
Authors:
Nikhil Byrapuram,
Hwiseo Choi,
Adam Ge,
Selena Ge,
Tanya Khovanova,
Sylvia Zia Lee,
Evin Liang,
Rajarshi Mandal,
Aika Oki,
Daniel Wu,
Michael Yang
Abstract:
We study 4-by-4 squares formed by cards from the EvenQuads deck. EvenQuads is a card game with 64 cards where cards have 3 attributes with 4 values in each attribute. A quad is four cards with all attributes the same, all different, or half and half. We define Latin quad squares as squares where the cards in each row and column have different values for each attribute. We define semimagic quad squares as squares where each row and column form a quad. For magic quad squares, we add a requirement that the diagonals have to form a quad. We also define strongly magic quad squares. We analyze types of semimagic and strongly magic quad squares. We also calculate the number of semimagic, magic, and strongly magic quad squares for quad decks of any size. These squares can be described in terms of integers. Four integers form a quad when their bitwise XOR is zero.
Submitted 14 August, 2023;
originally announced August 2023.
-
Iteratively Preconditioned Gradient-Descent Approach for Moving Horizon Estimation Problems
Authors:
Tianchen Liu,
Kushal Chakrabarti,
Nikhil Chopra
Abstract:
Moving horizon estimation (MHE) is a widely studied state estimation approach in several practical applications. In the MHE problem, the state estimates are obtained via the solution of an approximated nonlinear optimization problem. However, this optimization step is known to be computationally complex. Given this limitation, this paper investigates the idea of iteratively preconditioned gradient-descent (IPG) to solve the MHE problem, with the aim of improving on the performance of existing solution techniques. To our knowledge, the preconditioning technique is used for the first time in this paper to reduce the computational cost and accelerate the crucial optimization step for MHE. A convergence guarantee for the proposed iterative approach is presented for a class of MHE problems. Sufficient conditions for the MHE problem to be convex are also derived. Finally, the proposed method is implemented on a unicycle localization example. The simulation results demonstrate that the proposed approach can achieve better accuracy with reduced computational costs.
Submitted 22 June, 2023;
originally announced June 2023.
-
Card Games Unveiled: Exploring the Underlying Linear Algebra
Authors:
Nikhil Byrapuram,
Hwiseo Choi,
Adam Ge,
Selena Ge,
Tanya Khovanova,
Sylvia Zia Lee,
Evin Liang,
Rajarshi Mandal,
Aika Oki,
Daniel Wu,
Michael Yang
Abstract:
We discuss four famous card games that can help learn linear algebra. The games are: SET, Socks, Spot it!, and EvenQuads. We describe the game in the language of vector, affine, and projective spaces. We also show how these games are connected to each other. A separate section is devoted to playing Socks with the EvenQuads deck and vice versa.
Submitted 1 September, 2023; v1 submitted 15 June, 2023;
originally announced June 2023.
-
The Complexity of Diagonalization
Authors:
Nikhil Srivastava
Abstract:
We survey recent progress on efficient algorithms for approximately diagonalizing a square complex matrix in the models of rational (variable precision) and finite (floating point) arithmetic. This question has been studied across several research communities for decades, but many mysteries remain. We present several open problems which we hope will be of broad interest.
Submitted 17 May, 2023;
originally announced May 2023.