Showing 1–18 of 18 results for author: Rajaraman, N

  1. arXiv:2503.04202  [pdf, ps, other]

    cs.GT cs.LG

    Computational Intractability of Strategizing against Online Learners

    Authors: Angelos Assos, Yuval Dagan, Nived Rajaraman

    Abstract: Online learning algorithms are widely used in strategic multi-agent settings, including repeated auctions, contract design, and pricing competitions, where agents adapt their strategies over time. A key question in such environments is how an optimizing agent can best respond to a learning agent to improve its own long-term outcomes. While prior work has developed efficient algorithms for the opti…

    Submitted 6 March, 2025; originally announced March 2025.

    Comments: 32 pages

    MSC Class: 91A05 ACM Class: F.2.2

  2. arXiv:2502.12118  [pdf, other]

    cs.LG cs.CL

    Scaling Test-Time Compute Without Verification or RL is Suboptimal

    Authors: Amrith Setlur, Nived Rajaraman, Sergey Levine, Aviral Kumar

    Abstract: Despite substantial advances in scaling test-time compute, an ongoing debate in the community is how it should be scaled up to enable continued and efficient improvements with scaling. There are largely two approaches: first, distilling successful search or thinking traces; and second, using verification (e.g., 0/1 outcome rewards, reward models, or verifiers) to guide reinforcement learning (RL)…

    Submitted 18 February, 2025; v1 submitted 17 February, 2025; originally announced February 2025.

  3. arXiv:2502.10178  [pdf, other]

    cs.LG cs.AI cs.IT

    From Markov to Laplace: How Mamba In-Context Learns Markov Chains

    Authors: Marco Bondaschi, Nived Rajaraman, Xiuying Wei, Kannan Ramchandran, Razvan Pascanu, Caglar Gulcehre, Michael Gastpar, Ashok Vardhan Makkuva

    Abstract: While transformer-based language models have driven the AI revolution thus far, their computational complexity has spurred growing interest in viable alternatives, such as structured state space sequence models (SSMs) and Selective SSMs. Among these, Mamba (S6) and its variant Mamba-2 have shown remarkable inference speed ups over transformers while achieving comparable or superior performance on…

    Submitted 14 February, 2025; originally announced February 2025.

  4. arXiv:2407.17686  [pdf, other]

    cs.LG cs.CL cs.IT stat.ML

    Transformers on Markov Data: Constant Depth Suffices

    Authors: Nived Rajaraman, Marco Bondaschi, Kannan Ramchandran, Michael Gastpar, Ashok Vardhan Makkuva

    Abstract: Attention-based transformers have been remarkably successful at modeling generative processes across various domains and modalities. In this paper, we study the behavior of transformers on data drawn from $k$-th order Markov processes, where the conditional distribution of the next symbol in a sequence depends on the previous $k$ symbols observed. We observe a surprising phenomenon empirically which contr…

    Submitted 24 July, 2024; originally announced July 2024.

    Comments: 29 pages, 10 figures

  5. arXiv:2404.08335  [pdf, other]

    cs.CL cs.LG

    Toward a Theory of Tokenization in LLMs

    Authors: Nived Rajaraman, Jiantao Jiao, Kannan Ramchandran

    Abstract: While there has been a large body of research attempting to circumvent tokenization for language modeling (Clark et al., 2022; Xue et al., 2022), the current consensus is that it is a necessary initial step for designing state-of-the-art performant language models. In this paper, we investigate tokenization from a theoretical point of view by studying the behavior of transformers on simple data ge…

    Submitted 10 April, 2025; v1 submitted 12 April, 2024; originally announced April 2024.

    Comments: 60 pages, 11 figures. This work was published at NeurIPS 2024 with a different title, "An Analysis of Tokenization: Transformers under Markov data"

  6. arXiv:2303.11453  [pdf, other]

    cs.LG stat.ML

    Greedy Pruning with Group Lasso Provably Generalizes for Matrix Sensing

    Authors: Nived Rajaraman, Devvrit, Aryan Mokhtari, Kannan Ramchandran

    Abstract: Pruning schemes have been widely used in practice to reduce the complexity of trained models with a massive number of parameters. In fact, several practical studies have shown that if a pruned model is fine-tuned with some gradient-based updates it generalizes well to new samples. Although the above pipeline, which we refer to as pruning + fine-tuning, has been extremely successful in lowering the…

    Submitted 4 June, 2023; v1 submitted 20 March, 2023; originally announced March 2023.

    Comments: 49 pages, 2 figures

  7. arXiv:2302.06025  [pdf, ps, other]

    stat.ML cs.IT cs.LG math.ST

    Statistical Complexity and Optimal Algorithms for Non-linear Ridge Bandits

    Authors: Nived Rajaraman, Yanjun Han, Jiantao Jiao, Kannan Ramchandran

    Abstract: We consider the sequential decision-making problem where the mean outcome is a non-linear function of the chosen action. Compared with the linear model, two curious phenomena arise in non-linear models: first, in addition to the "learning phase" with a standard parametric rate for estimation or regret, there is a "burn-in period" with a fixed cost determined by the non-linear function; second, ac…

    Submitted 9 January, 2024; v1 submitted 12 February, 2023; originally announced February 2023.

    Comments: Revised Section 3 and added an upper bound agnostic to the link function $f$

  8. arXiv:2301.12579  [pdf, other]

    cs.LG cs.AI

    Sample Efficient Deep Reinforcement Learning via Local Planning

    Authors: Dong Yin, Sridhar Thiagarajan, Nevena Lazic, Nived Rajaraman, Botao Hao, Csaba Szepesvari

    Abstract: The focus of this work is sample-efficient deep reinforcement learning (RL) with a simulator. One useful property of simulators is that it is typically easy to reset the environment to a previously observed state. We propose an algorithmic framework, named uncertainty-first local planning (UFLP), that takes advantage of this property. Concretely, in each data collection iteration, with some probab…

    Submitted 3 July, 2023; v1 submitted 29 January, 2023; originally announced January 2023.

    Comments: 25 pages, 11 figures

  9. arXiv:2210.02604  [pdf, other]

    stat.ML cs.LG

    Spectral Regularization Allows Data-frugal Learning over Combinatorial Spaces

    Authors: Amirali Aghazadeh, Nived Rajaraman, Tony Tu, Kannan Ramchandran

    Abstract: Data-driven machine learning models are being increasingly employed in several important inference problems in biology, chemistry, and physics which require learning over combinatorial spaces. Recent empirical evidence (see, e.g., [1], [2], [3]) suggests that regularizing the spectral representation of such models improves their generalization power when labeled data is scarce. However, despite th…

    Submitted 5 October, 2022; originally announced October 2022.

  10. arXiv:2205.15397  [pdf, other]

    cs.LG stat.ML

    Minimax Optimal Online Imitation Learning via Replay Estimation

    Authors: Gokul Swamy, Nived Rajaraman, Matthew Peng, Sanjiban Choudhury, J. Andrew Bagnell, Zhiwei Steven Wu, Jiantao Jiao, Kannan Ramchandran

    Abstract: Online imitation learning is the problem of how best to mimic expert demonstrations, given access to the environment or an accurate simulator. Prior work has shown that in the infinite sample regime, exact moment matching achieves value equivalence to the expert policy. However, in the finite sample regime, even if one has no optimization error, empirical variance can lead to a performance gap tha…

    Submitted 14 January, 2023; v1 submitted 30 May, 2022; originally announced May 2022.

  11. arXiv:2106.06676  [pdf, other]

    cs.LG

    Semi-supervised Active Regression

    Authors: Fnu Devvrit, Nived Rajaraman, Pranjal Awasthi

    Abstract: Labelled data often comes at a high cost as it may require recruiting human labelers or running costly experiments. At the same time, in many practical scenarios, one already has access to a partially labelled, potentially biased dataset that can help with the learning task at hand. Motivated by such settings, we formally initiate a study of semi-supervised active learning through the frame…

    Submitted 11 June, 2021; originally announced June 2021.

  12. arXiv:2102.12948  [pdf, ps, other]

    cs.LG stat.ML

    Provably Breaking the Quadratic Error Compounding Barrier in Imitation Learning, Optimally

    Authors: Nived Rajaraman, Yanjun Han, Lin F. Yang, Kannan Ramchandran, Jiantao Jiao

    Abstract: We study the statistical limits of Imitation Learning (IL) in episodic Markov Decision Processes (MDPs) with a state space $\mathcal{S}$. We focus on the known-transition setting where the learner is provided a dataset of $N$ length-$H$ trajectories from a deterministic expert policy and knows the MDP transition. We establish an upper bound $O(|\mathcal{S}|H^{3/2}/N)$ for the suboptimality using t…

    Submitted 25 February, 2021; originally announced February 2021.

    Comments: 30 pages, 2 figures

  13. arXiv:2102.01938  [pdf, other]

    cs.IT math.ST stat.ML

    How good is Good-Turing for Markov samples?

    Authors: Prafulla Chandra, Andrew Thangaraj, Nived Rajaraman

    Abstract: The Good-Turing (GT) estimator for the missing mass (i.e., total probability of missing symbols) in $n$ samples is the number of symbols that appeared exactly once divided by $n$. For i.i.d. samples, the bias and squared-error risk of the GT estimator can be shown to fall as $1/n$ by bounding the expected error uniformly over all symbols. In this work, we study convergence of the GT estimator for…

    Submitted 27 May, 2023; v1 submitted 3 February, 2021; originally announced February 2021.

  14. arXiv:2009.11248  [pdf, other]

    cs.CR cs.IT cs.LG stat.ML

    FastSecAgg: Scalable Secure Aggregation for Privacy-Preserving Federated Learning

    Authors: Swanand Kadhe, Nived Rajaraman, O. Ozan Koyluoglu, Kannan Ramchandran

    Abstract: Recent attacks on federated learning demonstrate that keeping the training data on clients' devices does not provide sufficient privacy, as the model parameters shared by clients can leak information about their training data. A 'secure aggregation' protocol enables the server to aggregate clients' models in a privacy-preserving manner. However, existing secure aggregation protocols incur high com…

    Submitted 23 September, 2020; originally announced September 2020.

    Comments: Shorter version accepted in ICML Workshop on Federated Learning, July 2020, and CCS Workshop on Privacy-Preserving Machine Learning in Practice, November 2020

  15. arXiv:2009.05990  [pdf, ps, other]

    cs.LG cs.AI math.OC stat.ML

    Toward the Fundamental Limits of Imitation Learning

    Authors: Nived Rajaraman, Lin F. Yang, Jiantao Jiao, Kannan Ramachandran

    Abstract: Imitation learning (IL) aims to mimic the behavior of an expert policy in a sequential decision-making problem given only demonstrations. In this paper, we focus on understanding the minimax statistical limits of IL in episodic Markov Decision Processes (MDPs). We first consider the setting where the learner is provided a dataset of $N$ expert trajectories ahead of time, and cannot interact with t…

    Submitted 13 September, 2020; originally announced September 2020.

    Comments: 45 pages, 3 figures

  16. arXiv:1812.08617  [pdf, ps, other]

    cs.IT

    Not Just Age but Age and Quality of Information

    Authors: Nived Rajaraman, Rahul Vaze, Goonwanth Reddy

    Abstract: A versatile scheduling problem to model a three-way tradeoff between delay/age, distortion, and energy is considered. The considered problem, called the age and quality of information (AQI), is to select which packets to transmit at each time slot to minimize a linear combination of the distortion cost, the age/delay cost and the energy transmission cost in an online fashion. AQI generalizes multipl…

    Submitted 20 December, 2018; originally announced December 2018.

  17. arXiv:1810.12861  [pdf, other]

    cs.DS

    Submodular Maximization Under A Matroid Constraint: Asking more from an old friend, the Greedy Algorithm

    Authors: Nived Rajaraman, Rahul Vaze

    Abstract: The classical problem of maximizing a submodular function under a matroid constraint is considered. Defining a new measure for the increments made by the greedy algorithm at each step, called the discriminant, improved approximation ratio guarantees are derived for the greedy algorithm. At each step, the discriminant measures the multiplicative gap in the incremental valuation between the item chosen…

    Submitted 30 October, 2018; originally announced October 2018.

    Comments: 24 pages

  18. arXiv:1705.05006  [pdf, ps, other]

    cs.IT

    Minimax Risk for Missing Mass Estimation

    Authors: Nikhilesh Rajaraman, Andrew Thangaraj, Ananda Theertha Suresh

    Abstract: The problem of estimating the missing mass or total probability of unseen elements in a sequence of $n$ random samples is considered under the squared error loss function. The worst-case risk of the popular Good-Turing estimator is shown to be between $0.6080/n$ and $0.6179/n$. The minimax risk is shown to be lower bounded by $0.25/n$. This appears to be the first such published result on minimax…

    Submitted 14 May, 2017; originally announced May 2017.

    Comments: IEEE International Symposium on Information Theory 2017, Aachen, Germany