Showing 1–40 of 40 results for author: Mineiro, P

  1. arXiv:2410.22304

    cs.CL cs.LG

    Flow-DPO: Improving LLM Mathematical Reasoning through Online Multi-Agent Learning

    Authors: Yihe Deng, Paul Mineiro

    Abstract: Mathematical reasoning is a crucial capability for Large Language Models (LLMs), yet generating detailed and accurate reasoning traces remains a significant challenge. This paper introduces a novel approach to produce high-quality reasoning traces for LLM fine-tuning using online learning Flows. Our method employs an incremental output production Flow, where component LLMs collaboratively…

    Submitted 29 October, 2024; originally announced October 2024.

    Comments: 5 pages, 4 figures, 1 table

  2. arXiv:2406.10490

    stat.ML cs.LG

    Active, anytime-valid risk controlling prediction sets

    Authors: Ziyu Xu, Nikos Karampatziakis, Paul Mineiro

    Abstract: Rigorously establishing the safety of black-box machine learning models concerning critical risk measures is important for providing guarantees about model behavior. Recently, Bates et al. (JACM '24) introduced the notion of a risk controlling prediction set (RCPS) for producing prediction sets that are statistically guaranteed low risk from machine learning models. Our method extends this notion…

    Submitted 31 October, 2024; v1 submitted 15 June, 2024; originally announced June 2024.

    Comments: 22 pages, 3 figures. Accepted to NeurIPS 2024

  3. arXiv:2406.04516

    cs.LG

    Online Joint Fine-tuning of Multi-Agent Flows

    Authors: Paul Mineiro

    Abstract: A Flow is a collection of component models ("Agents") which constructs the solution to a complex problem via iterative communication. Flows have emerged as state of the art architectures for code generation, and are the raison d'etre for frameworks like Autogen. However, flows are currently constructed via a combination of manual prompt engineering and stagewise supervised learning techniques; the…

    Submitted 16 July, 2024; v1 submitted 6 June, 2024; originally announced June 2024.

  4. arXiv:2405.20677

    cs.LG stat.ML

    Provably Efficient Interactive-Grounded Learning with Personalized Reward

    Authors: Mengxiao Zhang, Yuheng Zhang, Haipeng Luo, Paul Mineiro

    Abstract: Interactive-Grounded Learning (IGL) [Xie et al., 2021] is a powerful framework in which a learner aims at maximizing unobservable rewards through interacting with an environment and observing reward-dependent feedback on the taken actions. To deal with personalized rewards that are ubiquitous in applications such as recommendation systems, Maghakian et al. [2022] study a version of IGL with contex…

    Submitted 31 May, 2024; originally announced May 2024.

  5. arXiv:2404.15269

    cs.CL cs.AI cs.IR cs.LG

    Aligning LLM Agents by Learning Latent Preference from User Edits

    Authors: Ge Gao, Alexey Taymanov, Eduardo Salinas, Paul Mineiro, Dipendra Misra

    Abstract: We study interactive learning of LLM-based language agents based on user edits made to the agent's output. In a typical setting such as writing assistants, the user interacts with a language agent to generate a response given a context, and may optionally edit the agent response to personalize it based on their latent preference, in addition to improving the correctness. The edit feedback is natur…

    Submitted 23 November, 2024; v1 submitted 23 April, 2024; originally announced April 2024.

  6. arXiv:2402.08127

    cs.LG

    Efficient Contextual Bandits with Uninformed Feedback Graphs

    Authors: Mengxiao Zhang, Yuheng Zhang, Haipeng Luo, Paul Mineiro

    Abstract: Bandits with feedback graphs are powerful online learning models that interpolate between the full information and classic bandit problems, capturing many real-life applications. A recent work by Zhang et al. (2023) studies the contextual version of this problem and proposes an efficient and optimal algorithm via a reduction to online regression. However, their algorithm crucially relies on seeing…

    Submitted 12 February, 2024; originally announced February 2024.

  7. arXiv:2302.14248

    stat.ML cs.LG

    Time-uniform confidence bands for the CDF under nonstationarity

    Authors: Paul Mineiro, Steven R. Howard

    Abstract: Estimation of the complete distribution of a random variable is a useful primitive for both manual and automated decision making. This problem has received extensive attention in the i.i.d. setting, but the arbitrary data dependent setting remains largely unaddressed. Consistent with known impossibility results, we present computationally felicitous time-uniform and value-uniform bounds on the CDF…

    Submitted 27 February, 2023; originally announced February 2023.

  8. arXiv:2302.08631

    cs.LG

    Practical Contextual Bandits with Feedback Graphs

    Authors: Mengxiao Zhang, Yuheng Zhang, Olga Vrousgou, Haipeng Luo, Paul Mineiro

    Abstract: While contextual bandit has a mature theory, effectively leveraging different feedback patterns to enhance the pace of learning remains unclear. Bandits with feedback graphs, which interpolates between the full information and bandit regimes, provides a promising framework to mitigate the statistical complexity of learning. In this paper, we propose and analyze an approach to contextual bandits wi…

    Submitted 26 October, 2023; v1 submitted 16 February, 2023; originally announced February 2023.

  9. arXiv:2302.08551

    cs.LG

    Infinite Action Contextual Bandits with Reusable Data Exhaust

    Authors: Mark Rucker, Yinglun Zhu, Paul Mineiro

    Abstract: For infinite action contextual bandits, smoothed regret and reduction to regression results in state-of-the-art online performance with computational cost independent of the action set: unfortunately, the resulting data exhaust does not have well-defined importance-weights. This frustrates the execution of downstream data science processes such as offline model selection. In this paper we describe…

    Submitted 7 June, 2023; v1 submitted 16 February, 2023; originally announced February 2023.

    Comments: Final version after responding to reviewers

  10. arXiv:2211.15823

    cs.LG cs.AI cs.IR

    Personalized Reward Learning with Interaction-Grounded Learning (IGL)

    Authors: Jessica Maghakian, Paul Mineiro, Kishan Panaganti, Mark Rucker, Akanksha Saran, Cheng Tan

    Abstract: In an era of countless content offerings, recommender systems alleviate information overload by providing users with personalized content suggestions. Due to the scarcity of explicit user feedback, modern recommender systems typically optimize for the same fixed combination of implicit feedback signals across all users. However, this approach disregards a growing body of work highlighting that (i)…

    Submitted 3 March, 2023; v1 submitted 28 November, 2022; originally announced November 2022.

    Comments: ICLR 2023

  11. arXiv:2211.07614

    cs.LG

    Towards Data-Driven Offline Simulations for Online Reinforcement Learning

    Authors: Shengpu Tang, Felipe Vieira Frujeri, Dipendra Misra, Alex Lamb, John Langford, Paul Mineiro, Sebastian Kochman

    Abstract: Modern decision-making systems, from robots to web recommendation engines, are expected to adapt: to user preferences, changing circumstances or even new tasks. Yet, it is still uncommon to deploy a dynamically learning agent (rather than a fixed policy) to a production system, as it's perceived as unsafe. Using historical data to reason about learning algorithms, similar to offline policy evaluat…

    Submitted 14 November, 2022; originally announced November 2022.

    Comments: Presented at the 3rd Offline Reinforcement Learning Workshop at NeurIPS 2022

  12. arXiv:2210.14077

    cs.LG

    Eigen Memory Trees

    Authors: Mark Rucker, Jordan T. Ash, John Langford, Paul Mineiro, Ida Momennejad

    Abstract: This work introduces the Eigen Memory Tree (EMT), a novel online memory model for sequential learning scenarios. EMTs store data at the leaves of a binary tree and route new samples through the structure using the principal components of previous experiences, facilitating efficient (logarithmic) access to relevant memories. We demonstrate that EMT outperforms existing online memory approaches, and…

    Submitted 31 October, 2022; v1 submitted 25 October, 2022; originally announced October 2022.

    Comments: corrected an author name; corrected title plurality

  13. Deploying a Steered Query Optimizer in Production at Microsoft

    Authors: Wangda Zhang, Matteo Interlandi, Paul Mineiro, Shi Qiao, Nasim Ghazanfari, Karlen Lie, Marc Friedman, Rafah Hosn, Hiren Patel, Alekh Jindal

    Abstract: Modern analytical workloads are highly heterogeneous and massively complex, making generic query optimizers untenable for many customers and scenarios. As a result, it is important to specialize these optimizers to instances of the workloads. In this paper, we continue a recent line of work in steering a query optimizer towards better plans for a given workload, and make major strides in pushing p…

    Submitted 24 October, 2022; originally announced October 2022.

    Journal ref: Proceedings of the 2022 International Conference on Management of Data 2022 Jun 10 (pp. 2299-2311)

  14. arXiv:2210.13573

    stat.ML cs.LG

    Conditionally Risk-Averse Contextual Bandits

    Authors: Mónika Farsang, Paul Mineiro, Wangda Zhang

    Abstract: Contextual bandits with average-case statistical guarantees are inadequate in risk-averse situations because they might trade off degraded worst-case behaviour for better average performance. Designing a risk-averse contextual bandit is challenging because exploration is necessary but risk-aversion is sensitive to the entire distribution of rewards; nonetheless we exhibit the first risk-averse con…

    Submitted 8 July, 2023; v1 submitted 24 October, 2022; originally announced October 2022.

  15. arXiv:2210.11133

    stat.ML cs.LG

    A lower confidence sequence for the changing mean of non-negative right heavy-tailed observations with bounded mean

    Authors: Paul Mineiro

    Abstract: A confidence sequence (CS) is an anytime-valid sequential inference primitive which produces an adapted sequence of sets for a predictable parameter sequence with a time-uniform coverage guarantee. This work constructs a non-parametric non-asymptotic lower CS for the running average conditional expectation whose slack converges to zero given non-negative right heavy-tailed observations with bounde…

    Submitted 20 October, 2022; originally announced October 2022.

    Comments: Reference implementation at https://github.com/microsoft/csrobust

  16. arXiv:2210.10768

    stat.ME cs.LG math.ST stat.ML

    Anytime-valid off-policy inference for contextual bandits

    Authors: Ian Waudby-Smith, Lili Wu, Aaditya Ramdas, Nikos Karampatziakis, Paul Mineiro

    Abstract: Contextual bandit algorithms are ubiquitous tools for active sequential experimentation in healthcare and the tech industry. They involve online learning algorithms that adaptively learn policies over time to map observed contexts $X_t$ to actions $A_t$ in an attempt to maximize stochastic rewards $R_t$. This adaptivity raises interesting but hard statistical inference questions, especially counte…

    Submitted 15 August, 2024; v1 submitted 19 October, 2022; originally announced October 2022.

    Comments: 43 pages, 6 figures

  17. arXiv:2207.05849

    cs.LG stat.ML

    Contextual Bandits with Smooth Regret: Efficient Learning in Continuous Action Spaces

    Authors: Yinglun Zhu, Paul Mineiro

    Abstract: Designing efficient general-purpose contextual bandit algorithms that work with large -- or even continuous -- action spaces would facilitate application to important scenarios such as information retrieval, recommendation systems, and continuous control. While obtaining standard regret guarantees can be hopeless, alternative regret notions have been proposed to tackle the large action setting. We…

    Submitted 12 July, 2022; originally announced July 2022.

    Comments: To appear at ICML 2022

  18. arXiv:2207.05836

    cs.LG stat.ML

    Contextual Bandits with Large Action Spaces: Made Practical

    Authors: Yinglun Zhu, Dylan J. Foster, John Langford, Paul Mineiro

    Abstract: A central problem in sequential decision making is to develop algorithms that are practical and computationally efficient, yet support the use of flexible, general-purpose models. Focusing on the contextual bandit problem, recent progress provides provably efficient algorithms with strong empirical performance when the number of possible alternatives ("actions") is small, but guarantees for decisi…

    Submitted 12 July, 2022; originally announced July 2022.

    Comments: To appear at ICML 2022

  19. arXiv:2206.08364

    cs.LG cs.AI cs.HC stat.ML

    Interaction-Grounded Learning with Action-inclusive Feedback

    Authors: Tengyang Xie, Akanksha Saran, Dylan J. Foster, Lekan Molu, Ida Momennejad, Nan Jiang, Paul Mineiro, John Langford

    Abstract: Consider the problem setting of Interaction-Grounded Learning (IGL), in which a learner's goal is to optimally interact with the environment with no explicit reward to ground its policies. The agent observes a context vector, takes an action, and receives a feedback vector, using this information to effectively optimize a policy with respect to a latent reward function. Prior analyzed approaches f…

    Submitted 12 October, 2022; v1 submitted 16 June, 2022; originally announced June 2022.

    Comments: Published in NeurIPS 2022

  20. arXiv:2106.06926

    cs.LG cs.AI stat.ML

    Bellman-consistent Pessimism for Offline Reinforcement Learning

    Authors: Tengyang Xie, Ching-An Cheng, Nan Jiang, Paul Mineiro, Alekh Agarwal

    Abstract: The use of pessimism, when reasoning about datasets lacking exhaustive exploration has recently gained prominence in offline reinforcement learning. Despite the robustness it adds to the algorithm, overly pessimistic reasoning can be equally damaging in precluding the discovery of good policies, which is an issue for the popular bonus-based pessimism. In this paper, we introduce the notion of Bell…

    Submitted 23 October, 2023; v1 submitted 13 June, 2021; originally announced June 2021.

    Comments: NeurIPS 2021 (Oral)

  21. arXiv:2106.04887

    cs.LG cs.AI stat.ML

    Interaction-Grounded Learning

    Authors: Tengyang Xie, John Langford, Paul Mineiro, Ida Momennejad

    Abstract: Consider a prosthetic arm, learning to adapt to its user's control signals. We propose Interaction-Grounded Learning for this novel setting, in which a learner's goal is to interact with the environment with no grounding or explicit reward to optimize its policies. Such a problem evades common RL solutions which require an explicit reward. The learning agent observes a multidimensional context vec…

    Submitted 13 July, 2021; v1 submitted 9 June, 2021; originally announced June 2021.

    Comments: Published in ICML 2021

  22. arXiv:2106.04815

    cs.LG

    ChaCha for Online AutoML

    Authors: Qingyun Wu, Chi Wang, John Langford, Paul Mineiro, Marco Rossi

    Abstract: We propose the ChaCha (Champion-Challengers) algorithm for making an online choice of hyperparameters in online learning settings. ChaCha handles the process of determining a champion and scheduling a set of `live' challengers over time based on sample complexity bounds. It is guaranteed to have sublinear regret after the optimal configuration is added into consideration by an application-dependen…

    Submitted 11 June, 2021; v1 submitted 9 June, 2021; originally announced June 2021.

    Comments: 16 pages (including supplementary appendix). Appearing at ICML 2021

    Journal ref: ICML 2021

  23. arXiv:2106.00589

    cs.LG

    Improving Long-Term Metrics in Recommendation Systems using Short-Horizon Reinforcement Learning

    Authors: Bogdan Mazoure, Paul Mineiro, Pavithra Srinath, Reza Sharifi Sedeh, Doina Precup, Adith Swaminathan

    Abstract: We study session-based recommendation scenarios where we want to recommend items to users during sequential interactions to improve their long-term utility. Optimizing a long-term metric is challenging because the learning signal (whether the recommendations achieved their desired goals) is delayed and confounded by other user interactions with the system. Targeting immediately measurable proxies…

    Submitted 14 September, 2021; v1 submitted 1 June, 2021; originally announced June 2021.

  24. arXiv:2102.09540

    cs.LG math.ST stat.ML

    Off-policy Confidence Sequences

    Authors: Nikos Karampatziakis, Paul Mineiro, Aaditya Ramdas

    Abstract: We develop confidence bounds that hold uniformly over time for off-policy evaluation in the contextual bandit setting. These confidence sequences are based on recent ideas from martingale analysis and are non-asymptotic, non-parametric, and valid at arbitrary stopping times. We provide algorithms for computing these confidence sequences that strike a good balance between computational and statisti…

    Submitted 18 February, 2021; originally announced February 2021.

  25. arXiv:1906.03323

    cs.LG stat.ML

    Empirical Likelihood for Contextual Bandits

    Authors: Nikos Karampatziakis, John Langford, Paul Mineiro

    Abstract: We propose an estimator and confidence interval for computing the value of a policy from off-policy data in the contextual bandit setting. To this end we apply empirical likelihood techniques to formulate our estimator and confidence interval as simple convex optimization problems. Using the lower bound of our confidence interval, we then propose an off-policy policy optimization algorithm that se…

    Submitted 17 October, 2020; v1 submitted 7 June, 2019; originally announced June 2019.

    Comments: Accepted at NeurIPS 2020

  26. arXiv:1905.02219

    cs.LG stat.ML

    Lessons from Contextual Bandit Learning in a Customer Support Bot

    Authors: Nikos Karampatziakis, Sebastian Kochman, Jade Huang, Paul Mineiro, Kathy Osborne, Weizhu Chen

    Abstract: In this work, we describe practical lessons we have learned from successfully using contextual bandits (CBs) to improve key business metrics of the Microsoft Virtual Agent for customer support. While our current use cases focus on single step reinforcement learning (RL) and mostly in the domain of natural language processing and information retrieval we believe many of our findings are generally ap…

    Submitted 18 June, 2019; v1 submitted 6 May, 2019; originally announced May 2019.

    Comments: Reinforcement Learning for Real Life Workshop

  27. arXiv:1807.06473

    cs.LG stat.ML

    Contextual Memory Trees

    Authors: Wen Sun, Alina Beygelzimer, Hal Daumé III, John Langford, Paul Mineiro

    Abstract: We design and study a Contextual Memory Tree (CMT), a learning memory controller that inserts new memories into an experience store of unbounded size. It is designed to efficiently query for memories from that store, supporting logarithmic time insertion and retrieval operations. Hence CMT can be integrated into existing statistical learning algorithms as an augmented memory unit without substanti…

    Submitted 2 June, 2019; v1 submitted 17 July, 2018; originally announced July 2018.

    Comments: ICML 2019

  28. arXiv:1606.04988

    stat.ML cs.LG

    Logarithmic Time One-Against-Some

    Authors: Hal Daumé III, Nikos Karampatziakis, John Langford, Paul Mineiro

    Abstract: We create a new online reduction of multiclass classification to binary classification for which training and prediction time scale logarithmically with the number of classes. Compared to previous approaches, we obtain substantially better statistical performance for two reasons: First, we prove a tighter and more complete boosting theorem, and second we translate the results more directly into an…

    Submitted 30 November, 2016; v1 submitted 15 June, 2016; originally announced June 2016.

  29. arXiv:1602.02181

    stat.ML cs.LG

    Active Information Acquisition

    Authors: He He, Paul Mineiro, Nikos Karampatziakis

    Abstract: We propose a general framework for sequential and dynamic acquisition of useful information in order to solve a particular task. While our goal could in principle be tackled by general reinforcement learning, our particular setting is constrained enough to allow more efficient algorithms. In this paper, we work under the Learning to Search framework and show how to formulate the goal of finding a…

    Submitted 5 February, 2016; originally announced February 2016.

  30. arXiv:1511.03260

    stat.ML cs.LG

    A Hierarchical Spectral Method for Extreme Classification

    Authors: Paul Mineiro, Nikos Karampatziakis

    Abstract: Extreme classification problems are multiclass and multilabel classification problems where the number of outputs is so large that straightforward strategies are neither statistically nor computationally viable. One strategy for dealing with the computational burden is via a tree decomposition of the output space. While this typically leads to training and inference that scales sublinearly with th…

    Submitted 3 February, 2016; v1 submitted 10 November, 2015; originally announced November 2015.

    Comments: Reference implementation available at https://github.com/pmineiro/xlst

  31. arXiv:1503.08873

    cs.LG

    Fast Label Embeddings for Extremely Large Output Spaces

    Authors: Paul Mineiro, Nikos Karampatziakis

    Abstract: Many modern multiclass and multilabel problems are characterized by increasingly large output spaces. For these problems, label embeddings have been shown to be a useful primitive that can improve computational and statistical efficiency. In this work we utilize a correspondence between rank constrained estimation and low dimensional label embeddings that uncovers a fast label embedding algorithm…

    Submitted 30 March, 2015; originally announced March 2015.

    Comments: Accepted as a workshop contribution at ICLR 2015

  32. arXiv:1502.02710

    cs.LG

    Scalable Multilabel Prediction via Randomized Methods

    Authors: Nikos Karampatziakis, Paul Mineiro

    Abstract: Modeling the dependence between outputs is a fundamental challenge in multilabel classification. In this work we show that a generic regularized nonlinearity mapping independent predictions to joint predictions is sufficient to achieve state-of-the-art performance on a variety of benchmark problems. Crucially, we compute the joint predictions without ever obtaining any independent predictions, whi…

    Submitted 20 April, 2015; v1 submitted 9 February, 2015; originally announced February 2015.

  33. arXiv:1502.02704

    cs.LG

    Learning Reductions that Really Work

    Authors: Alina Beygelzimer, Hal Daumé III, John Langford, Paul Mineiro

    Abstract: We provide a summary of the mathematical and computational techniques that have enabled learning reductions to effectively address a wide class of problems, and show that this approach to solving machine learning problems can be broadly useful.

    Submitted 9 February, 2015; originally announced February 2015.

  34. arXiv:1412.6547

    cs.LG

    Fast Label Embeddings via Randomized Linear Algebra

    Authors: Paul Mineiro, Nikos Karampatziakis

    Abstract: Many modern multiclass and multilabel problems are characterized by increasingly large output spaces. For these problems, label embeddings have been shown to be a useful primitive that can improve computational and statistical efficiency. In this work we utilize a correspondence between rank constrained estimation and low dimensional label embeddings that uncovers a fast label embedding algorithm…

    Submitted 5 July, 2015; v1 submitted 19 December, 2014; originally announced December 2014.

    Comments: To appear in the proceedings of the ECML/PKDD 2015 conference. Reference implementation available at https://github.com/pmineiro/randembed

  35. arXiv:1411.3409

    stat.ML cs.LG

    A Randomized Algorithm for CCA

    Authors: Paul Mineiro, Nikos Karampatziakis

    Abstract: We present RandomizedCCA, a randomized algorithm for computing canonical analysis, suitable for large datasets stored either out of core or on a distributed file system. Accurate results can be obtained in as few as two data passes, which is relevant for distributed processing frameworks in which iteration is expensive (e.g., Hadoop). The strategy also provides an excellent initializer for standar…

    Submitted 12 November, 2014; originally announced November 2014.

  36. arXiv:1408.2065

    cs.LG stat.ML

    Normalized Online Learning

    Authors: Stephane Ross, Paul Mineiro, John Langford

    Abstract: We introduce online learning algorithms which are independent of feature scales, proving regret bounds dependent on the ratio of scales existent in the data rather than the absolute scale. This has several useful effects: there is no need to pre-normalize data, the test-time and test-space complexity are reduced, and the algorithms are more robust.

    Submitted 9 August, 2014; originally announced August 2014.

    Comments: Appears in Proceedings of the Twenty-Ninth Conference on Uncertainty in Artificial Intelligence (UAI2013)

    Report number: UAI-P-2013-PG-537-545

  37. arXiv:1310.6304

    cs.LG

    Combining Structured and Unstructured Randomness in Large Scale PCA

    Authors: Nikos Karampatziakis, Paul Mineiro

    Abstract: Principal Component Analysis (PCA) is a ubiquitous tool with many applications in machine learning including feature construction, subspace embedding, and outlier detection. In this paper, we present an algorithm for computing the top principal components of a dataset with a large number of rows (examples) and columns (features). Our algorithm leverages both structured and unstructured random proj…

    Submitted 24 October, 2013; v1 submitted 23 October, 2013; originally announced October 2013.

  38. arXiv:1310.1934

    cs.LG stat.ML

    Discriminative Features via Generalized Eigenvectors

    Authors: Nikos Karampatziakis, Paul Mineiro

    Abstract: Representing examples in a way that is compatible with the underlying classifier can greatly enhance the performance of a learning system. In this paper we investigate scalable techniques for inducing discriminative features by taking advantage of simple second order structure in the data. We focus on multiclass classification and show that features extracted from the generalized eigenvectors of t…

    Submitted 7 October, 2013; originally announced October 2013.

  39. arXiv:1306.1840

    cs.LG stat.ML

    Loss-Proportional Subsampling for Subsequent ERM

    Authors: Paul Mineiro, Nikos Karampatziakis

    Abstract: We propose a sampling scheme suitable for reducing a data set prior to selecting a hypothesis with minimum empirical risk. The sampling only considers a subset of the ultimate (unknown) hypothesis set, but can nonetheless guarantee that the final excess risk will compare favorably with utilizing the entire original data set. We demonstrate the practical benefits of our approach on a large dataset…

    Submitted 23 June, 2013; v1 submitted 7 June, 2013; originally announced June 2013.

    Comments: Appears in the proceedings of the 30th International Conference on Machine Learning

  40. arXiv:1305.6646

    cs.LG stat.ML

    Normalized Online Learning

    Authors: Stephane Ross, Paul Mineiro, John Langford

    Abstract: We introduce online learning algorithms which are independent of feature scales, proving regret bounds dependent on the ratio of scales existent in the data rather than the absolute scale. This has several useful effects: there is no need to pre-normalize data, the test-time and test-space complexity are reduced, and the algorithms are more robust.

    Submitted 28 May, 2013; originally announced May 2013.

    Comments: Appears in Proceedings of the Twenty-Ninth Conference on Uncertainty in Artificial Intelligence (UAI2013)