Showing 1–34 of 34 results for author: Oreshkin, B

  1. arXiv:2506.14113  [pdf, ps, other]

    cs.LG cs.AI stat.ML

    SKOLR: Structured Koopman Operator Linear RNN for Time-Series Forecasting

    Authors: Yitian Zhang, Liheng Ma, Antonios Valkanas, Boris N. Oreshkin, Mark Coates

    Abstract: Koopman operator theory provides a framework for nonlinear dynamical system analysis and time-series forecasting by mapping dynamics to a space of real-valued measurement functions, enabling a linear operator representation. Despite the advantage of linearity, the operator is generally infinite-dimensional. Therefore, the objective is to learn measurement functions that yield a tractable finite-di…

    Submitted 16 June, 2025; originally announced June 2025.

  2. arXiv:2506.06657  [pdf, ps, other]

    cs.CL cs.AI

    Quantile Regression with Large Language Models for Price Prediction

    Authors: Nikhita Vedula, Dushyanta Dhyani, Laleh Jalali, Boris Oreshkin, Mohsen Bayati, Shervin Malmasi

    Abstract: Large Language Models (LLMs) have shown promise in structured prediction tasks, including regression, but existing approaches primarily focus on point estimates and lack systematic comparison across different methods. We investigate probabilistic regression using LLMs for unstructured inputs, addressing challenging text-to-distribution prediction tasks such as price estimation where both nuanced t…

    Submitted 7 June, 2025; originally announced June 2025.

    Comments: Accepted to Findings of ACL, 2025

  3. arXiv:2412.02722  [pdf, other]

    cs.LG cs.AI

    Enhanced N-BEATS for Mid-Term Electricity Demand Forecasting

    Authors: Mateusz Kasprzyk, Paweł Pełka, Boris N. Oreshkin, Grzegorz Dudek

    Abstract: This paper presents an enhanced N-BEATS model, N-BEATS*, for improved mid-term electricity load forecasting (MTLF). Building on the strengths of the original N-BEATS architecture, which excels in handling complex time series data without requiring preprocessing or domain-specific knowledge, N-BEATS* introduces two key modifications. (1) A novel loss function -- combining pinball loss based on MAPE…

    Submitted 2 December, 2024; originally announced December 2024.

  4. arXiv:2411.05852  [pdf, other]

    cs.LG stat.ML

    $\spadesuit$ SPADE $\spadesuit$ Split Peak Attention DEcomposition

    Authors: Malcolm Wolff, Kin G. Olivares, Boris Oreshkin, Sunny Ruan, Sitan Yang, Abhinav Katoch, Shankar Ramasubramanian, Youxin Zhang, Michael W. Mahoney, Dmitry Efimov, Vincent Quenneville-Bélair

    Abstract: Demand forecasting faces challenges induced by Peak Events (PEs) corresponding to special periods such as promotions and holidays. Peak events create significant spikes in demand followed by demand ramp down periods. Neural networks like MQCNN and MQT overreact to demand peaks by carrying over the elevated PE demand into subsequent Post-Peak-Event (PPE) periods, resulting in significantly over-bia…

    Submitted 21 January, 2025; v1 submitted 6 November, 2024; originally announced November 2024.

    Journal ref: 38th Conference on Neural Information Processing Systems (NeurIPS 2024), Time Series in the Age of Large Models Workshop

  5. arXiv:2410.20022  [pdf, other]

    cs.CL cs.LG

    Dynamic layer selection in decoder-only transformers

    Authors: Theodore Glavas, Joud Chataoui, Florence Regol, Wassim Jabbour, Antonios Valkanas, Boris N. Oreshkin, Mark Coates

    Abstract: The vast size of Large Language Models (LLMs) has prompted a search to optimize inference. One effective approach is dynamic inference, which adapts the architecture to the sample-at-hand to reduce the overall computational cost. We empirically examine two common dynamic inference methods for natural language generation (NLG): layer skipping and early exiting. We find that a pre-trained decoder-on…

    Submitted 25 October, 2024; originally announced October 2024.

  6. arXiv:2410.03919  [pdf, other]

    cs.LG stat.ML

    Online Posterior Sampling with a Diffusion Prior

    Authors: Branislav Kveton, Boris Oreshkin, Youngsuk Park, Aniket Deshmukh, Rui Song

    Abstract: Posterior sampling in contextual bandits with a Gaussian prior can be implemented exactly or approximately using the Laplace approximation. The Gaussian prior is computationally efficient but it cannot describe complex distributions. In this work, we propose approximate posterior sampling algorithms for contextual bandits with a diffusion model prior. The key idea is to sample from a chain of appr…

    Submitted 4 October, 2024; originally announced October 2024.

    Comments: Proceedings of the 38th Conference on Neural Information Processing Systems

  7. arXiv:2405.18281  [pdf, other]

    cs.LG cs.AI

    MODL: Multilearner Online Deep Learning

    Authors: Antonios Valkanas, Boris N. Oreshkin, Mark Coates

    Abstract: Online deep learning tackles the challenge of learning from data streams by balancing two competing goals: fast learning and deep learning. However, existing research primarily emphasizes deep learning solutions, which are more adept at handling the "deep" aspect than the "fast" aspect of online learning. In this work, we introduce an alternative paradigm through a hybrid multilearner approach…

    Submitted 20 March, 2025; v1 submitted 28 May, 2024; originally announced May 2024.

  8. arXiv:2404.17451  [pdf, other]

    cs.LG stat.ML

    Any-Quantile Probabilistic Forecasting of Short-Term Electricity Demand

    Authors: Slawek Smyl, Boris N. Oreshkin, Paweł Pełka, Grzegorz Dudek

    Abstract: Power systems operate under uncertainty originating from multiple factors that are impossible to account for deterministically. Distributional forecasting is used to control and mitigate risks associated with this uncertainty. Recent progress in deep learning has helped to significantly improve the accuracy of point forecasts, while accurate distributional forecasting still presents a significant…

    Submitted 4 October, 2024; v1 submitted 26 April, 2024; originally announced April 2024.

  9. arXiv:2302.04774  [pdf, ps, other]

    cs.CV

    3D Human Pose and Shape Estimation via HybrIK-Transformer

    Authors: Boris N. Oreshkin

    Abstract: HybrIK relies on a combination of analytical inverse kinematics and deep learning to produce more accurate 3D pose estimation from 2D monocular images. HybrIK has three major components: (1) pretrained convolution backbone, (2) deconvolution to lift 3D pose from 2D convolution features, (3) analytical inverse kinematics pass correcting deep learning prediction using learned distribution of plausib…

    Submitted 22 April, 2023; v1 submitted 9 February, 2023; originally announced February 2023.

  10. arXiv:2208.08274  [pdf, other]

    cs.GR cs.LG

    SMPL-IK: Learned Morphology-Aware Inverse Kinematics for AI Driven Artistic Workflows

    Authors: Vikram Voleti, Boris N. Oreshkin, Florent Bocquelet, Félix G. Harvey, Louis-Simon Ménard, Christopher Pal

    Abstract: Inverse Kinematics (IK) systems are often rigid with respect to their input character, thus requiring user intervention to be adapted to new skeletons. In this paper we aim at creating a flexible, learned IK solver applicable to a wide variety of human morphologies. We extend a state-of-the-art machine learning IK solver to operate on the well known Skinned Multi-Person Linear model (SMPL). We cal…

    Submitted 16 August, 2022; originally announced August 2022.

  11. arXiv:2201.12886  [pdf, other]

    cs.LG cs.AI

    N-HiTS: Neural Hierarchical Interpolation for Time Series Forecasting

    Authors: Cristian Challu, Kin G. Olivares, Boris N. Oreshkin, Federico Garza, Max Mergenthaler-Canseco, Artur Dubrawski

    Abstract: Recent progress in neural forecasting accelerated improvements in the performance of large-scale forecasting systems. Yet, long-horizon forecasting remains a very difficult task. Two common challenges afflicting the task are the volatility of the predictions and their computational complexity. We introduce N-HiTS, a model which addresses both challenges by incorporating novel hierarchical interpol…

    Submitted 29 November, 2022; v1 submitted 30 January, 2022; originally announced January 2022.

    Comments: Accepted at the Thirty-Seventh AAAI Conference on Artificial Intelligence (AAAI-23)

  12. arXiv:2201.06701  [pdf, other]

    cs.LG

    Motion Inbetweening via Deep $Δ$-Interpolator

    Authors: Boris N. Oreshkin, Antonios Valkanas, Félix G. Harvey, Louis-Simon Ménard, Florent Bocquelet, Mark J. Coates

    Abstract: We show that the task of synthesizing human motion conditioned on a set of key frames can be solved more accurately and effectively if a deep learning based interpolator operates in the delta mode using the spherical linear interpolator as a baseline. We empirically demonstrate the strength of our approach on publicly available datasets achieving state-of-the-art performance. We further generalize…

    Submitted 16 August, 2022; v1 submitted 17 January, 2022; originally announced January 2022.

  13. arXiv:2109.09705  [pdf, other]

    cs.LG cs.DC cs.NE stat.ML

    Neural forecasting at scale

    Authors: Philippe Chatigny, Shengrui Wang, Jean-Marc Patenaude, Boris N. Oreshkin

    Abstract: We study the problem of efficiently scaling ensemble-based deep neural networks for multi-step time series (TS) forecasting on a large set of time series. Current state-of-the-art deep ensemble models have high memory and computational requirements, hampering their use to forecast millions of TS in practical scenarios. We propose N-BEATS(P), a global parallel variant of the N-BEATS model designed…

    Submitted 28 January, 2022; v1 submitted 20 September, 2021; originally announced September 2021.

  14. arXiv:2106.01981  [pdf, other]

    cs.CV cs.GR cs.LG

    ProtoRes: Proto-Residual Network for Pose Authoring via Learned Inverse Kinematics

    Authors: Boris N. Oreshkin, Florent Bocquelet, Félix G. Harvey, Bay Raitt, Dominic Laflamme

    Abstract: Our work focuses on the development of a learnable neural representation of human pose for advanced AI assisted animation tooling. Specifically, we tackle the problem of constructing a full static human pose based on sparse and variable user inputs (e.g. locations and/or orientations of a subset of body joints). To solve this problem, we propose a novel neural architecture that combines residual c…

    Submitted 16 August, 2022; v1 submitted 3 June, 2021; originally announced June 2021.

  15. arXiv:2012.15440  [pdf, other]

    eess.SP cs.LG

    Adaptive filters for the moving target indicator system

    Authors: Boris N. Oreshkin

    Abstract: Adaptive algorithms belong to an important class of algorithms used in radar target detection to overcome prior uncertainty of interference covariance. The contamination of the empirical covariance matrix by the useful signal leads to significant degradation of performance of this class of adaptive algorithms. Regularization, also known in radar literature as sample covariance loading, can be used…

    Submitted 30 December, 2020; originally announced December 2020.

  16. Optimization of loading factor preventing target cancellation

    Authors: Boris N. Oreshkin, Peter A. Bakulev

    Abstract: Adaptive algorithms based on sample matrix inversion belong to an important class of algorithms used in radar target detection to overcome prior uncertainty of interference covariance. Sample matrix inversion problem is generally ill conditioned. Moreover, the contamination of the empirical covariance matrix by the useful signal leads to significant degradation of performance of this class of adap…

    Submitted 9 October, 2020; originally announced October 2020.

    Journal ref: 2008 IEEE International Conference on Acoustics, Speech and Signal Processing

  17. Optimization over Random and Gradient Probabilistic Pixel Sampling for Fast, Robust Multi-Resolution Image Registration

    Authors: Boris N. Oreshkin, Tal Arbel

    Abstract: This paper presents an approach to fast image registration through probabilistic pixel sampling. We propose a practical scheme to leverage the benefits of two state-of-the-art pixel sampling approaches: gradient magnitude based pixel sampling and uniformly random sampling. Our framework involves learning the optimal balance between the two sampling schemes off-line during training, based on a smal…

    Submitted 2 October, 2020; originally announced October 2020.

    Comments: arXiv admin note: substantial text overlap with arXiv:2010.00988

    Journal ref: WBIR 2012. Lecture Notes in Computer Science, vol 7359. Springer, Berlin, Heidelberg

  18. Uncertainty driven probabilistic voxel selection for image registration

    Authors: Boris N. Oreshkin, Tal Arbel

    Abstract: This paper presents a novel probabilistic voxel selection strategy for medical image registration in time-sensitive contexts, where the goal is aggressive voxel sampling (e.g. using less than 1% of the total number) while maintaining registration accuracy and low failure rate. We develop a Bayesian framework whereby, first, a voxel sampling probability field (VSPF) is built based on the uncertaint…

    Submitted 2 October, 2020; originally announced October 2020.

    Journal ref: in IEEE Transactions on Medical Imaging, vol. 32, no. 10, pp. 1777-1790, Oct. 2013

  19. arXiv:2009.11961  [pdf, ps, other]

    cs.LG eess.SP

    N-BEATS neural network for mid-term electricity load forecasting

    Authors: Boris N. Oreshkin, Grzegorz Dudek, Paweł Pełka, Ekaterina Turkina

    Abstract: This paper addresses the mid-term electricity load forecasting problem. Solving this problem is necessary for power system operation and planning as well as for negotiating forward contracts in deregulated energy markets. We show that our proposed deep neural network modeling approach based on the deep neural architecture is effective at solving the mid-term electricity load forecasting problem. P…

    Submitted 2 April, 2021; v1 submitted 24 September, 2020; originally announced September 2020.

  20. arXiv:2007.15531  [pdf, other]

    cs.LG stat.ML

    FC-GAGA: Fully Connected Gated Graph Architecture for Spatio-Temporal Traffic Forecasting

    Authors: Boris N. Oreshkin, Arezou Amini, Lucy Coyle, Mark J. Coates

    Abstract: Forecasting of multivariate time-series is an important problem that has applications in traffic management, cellular network configuration, and quantitative finance. A special case of the problem arises when there is a graph available that captures the relationships between the time-series. In this paper we propose a novel learning architecture that achieves performance competitive with or better…

    Submitted 14 December, 2020; v1 submitted 30 July, 2020; originally announced July 2020.

  21. arXiv:2002.02887  [pdf, other]

    cs.LG stat.ML

    Meta-learning framework with applications to zero-shot time-series forecasting

    Authors: Boris N. Oreshkin, Dmitri Carpov, Nicolas Chapados, Yoshua Bengio

    Abstract: Can meta-learning discover generic ways of processing time series (TS) from a diverse dataset so as to greatly improve generalization on new TS coming from different datasets? This work provides positive evidence to this using a broad meta-learning framework which we show subsumes many existing meta-learning algorithms. Our theoretical analysis suggests that residual connections act as a meta-lear…

    Submitted 14 December, 2020; v1 submitted 7 February, 2020; originally announced February 2020.

  22. arXiv:2001.09540  [pdf, other]

    cs.CV

    Weakly Supervised Few-shot Object Segmentation using Co-Attention with Visual and Semantic Embeddings

    Authors: Mennatullah Siam, Naren Doraiswamy, Boris N. Oreshkin, Hengshuai Yao, Martin Jagersand

    Abstract: Significant progress has been made recently in developing few-shot object segmentation methods. Learning is shown to be successful in few-shot segmentation settings, using pixel-level, scribbles and bounding box supervision. This paper takes another approach, i.e., only requiring image-level label for few-shot object segmentation. We propose a novel multi-modal interaction module for few-shot obje…

    Submitted 17 May, 2020; v1 submitted 26 January, 2020; originally announced January 2020.

    Comments: Accepted to IJCAI'20. The first three authors listed contributed equally

  23. arXiv:1912.08936  [pdf, other]

    cs.CV

    One-Shot Weakly Supervised Video Object Segmentation

    Authors: Mennatullah Siam, Naren Doraiswamy, Boris N. Oreshkin, Hengshuai Yao, Martin Jagersand

    Abstract: Conventional few-shot object segmentation methods learn object segmentation from a few labelled support images with strongly labelled segmentation masks. Recent work has shown to perform on par with weaker levels of supervision in terms of scribbles and bounding boxes. However, there has been limited attention given to the problem of few-shot object segmentation with image-level supervision. We pr…

    Submitted 18 December, 2019; originally announced December 2019.

  24. arXiv:1909.09859  [pdf, other]

    stat.ME cs.AI

    DECoVaC: Design of Experiments with Controlled Variability Components

    Authors: Thomas Boquet, Laure Delisle, Denis Kochetkov, Nathan Schucher, Parmida Atighehchian, Boris Oreshkin, Julien Cornebise

    Abstract: Reproducible research in Machine Learning has seen a salutary abundance of progress lately: workflows, transparency, and statistical analysis of validation and test performance. We build on these efforts and take them further. We offer a principled experimental design methodology, based on linear mixed models, to study and separate the effects of multiple factors of variation in machine learning e…

    Submitted 21 September, 2019; originally announced September 2019.

  25. arXiv:1906.11892  [pdf, other]

    cs.CV cs.LG stat.ML

    CLAREL: Classification via retrieval loss for zero-shot learning

    Authors: Boris N. Oreshkin, Negar Rostamzadeh, Pedro O. Pinheiro, Christopher Pal

    Abstract: We address the problem of learning fine-grained cross-modal representations. We propose an instance-based deep metric learning approach in joint visual and textual space. The key novelty of this paper is that it shows that using per-image semantic supervision leads to substantial improvement in zero-shot performance over using class-only supervision. On top of that, we provide a probabilistic just…

    Submitted 5 April, 2020; v1 submitted 31 May, 2019; originally announced June 2019.

  26. arXiv:1905.10437  [pdf, other]

    cs.LG stat.ML

    N-BEATS: Neural basis expansion analysis for interpretable time series forecasting

    Authors: Boris N. Oreshkin, Dmitri Carpov, Nicolas Chapados, Yoshua Bengio

    Abstract: We focus on solving the univariate time series point forecasting problem using deep learning. We propose a deep neural architecture based on backward and forward residual links and a very deep stack of fully-connected layers. The architecture has a number of desirable properties, being interpretable, applicable without modification to a wide array of target domains, and fast to train. We test the…

    Submitted 20 February, 2020; v1 submitted 24 May, 2019; originally announced May 2019.

  27. arXiv:1902.11123  [pdf, other]

    cs.CV cs.LG stat.ML

    Adaptive Masked Proxies for Few-Shot Segmentation

    Authors: Mennatullah Siam, Boris Oreshkin, Martin Jagersand

    Abstract: Deep learning has thrived by training on large-scale datasets. However, in robotics applications sample efficiency is critical. We propose a novel adaptive masked proxies method that constructs the final segmentation layer weights from few labelled samples. It utilizes multi-resolution average pooling on base embeddings masked with the label to act as a positive proxy for the new class, while fusi…

    Submitted 14 October, 2019; v1 submitted 19 February, 2019; originally announced February 2019.

    Comments: Accepted to ICCV'19

  28. arXiv:1902.07104  [pdf, other]

    cs.LG stat.ML

    Adaptive Cross-Modal Few-Shot Learning

    Authors: Chen Xing, Negar Rostamzadeh, Boris N. Oreshkin, Pedro O. Pinheiro

    Abstract: Metric-based meta-learning techniques have successfully been applied to few-shot classification problems. In this paper, we propose to leverage cross-modal information to enhance metric-based few-shot learning methods. Visual and semantic feature spaces have different structures by definition. For certain concepts, visual features might be richer and more discriminative than text ones. While for o…

    Submitted 17 February, 2020; v1 submitted 19 February, 2019; originally announced February 2019.

  29. arXiv:1806.07528  [pdf, other]

    stat.ML cs.LG

    Uncertainty in Multitask Transfer Learning

    Authors: Alexandre Lacoste, Boris Oreshkin, Wonchang Chung, Thomas Boquet, Negar Rostamzadeh, David Krueger

    Abstract: Using variational Bayes neural networks, we develop an algorithm capable of accumulating knowledge into a prior from multiple different tasks. The result is a rich and meaningful prior capable of few-shot learning on new tasks. The posterior can go beyond the mean field approximation and yields good uncertainty on the performed experiments. Analysis on toy tasks shows that it can learn from signif…

    Submitted 6 July, 2018; v1 submitted 19 June, 2018; originally announced June 2018.

  30. arXiv:1805.10123  [pdf, other]

    cs.LG cs.AI cs.CV stat.ML

    TADAM: Task dependent adaptive metric for improved few-shot learning

    Authors: Boris N. Oreshkin, Pau Rodriguez, Alexandre Lacoste

    Abstract: Few-shot learning has become essential for producing models that generalize from few examples. In this work, we identify that metric scaling and metric task conditioning are important to improve the performance of few-shot algorithms. Our analysis reveals that simple metric scaling completely changes the nature of few-shot algorithm parameter updates. Metric scaling provides improvements up to 14%…

    Submitted 25 January, 2019; v1 submitted 23 May, 2018; originally announced May 2018.

    Journal ref: Advances in Neural Information Processing Systems 31, 2018

  31. arXiv:1712.05016  [pdf, other]

    stat.ML cs.LG

    Deep Prior

    Authors: Alexandre Lacoste, Thomas Boquet, Negar Rostamzadeh, Boris Oreshkin, Wonchang Chung, David Krueger

    Abstract: The recent literature on deep learning offers new tools to learn a rich probability distribution over high dimensional data such as images or sounds. In this work we investigate the possibility of learning the prior distribution over neural network parameters using such tools. Our resulting variational Bayes algorithm generalizes well to new tasks, even when very few training examples are provided…

    Submitted 15 December, 2017; v1 submitted 13 December, 2017; originally announced December 2017.

    Comments: Workshop paper, Accepted at Bayesian Deep Learning workshop, NIPS 2017

  32. Efficient delay-tolerant particle filtering

    Authors: Boris N. Oreshkin, Xuan Liu, Mark J. Coates

    Abstract: This paper proposes a novel framework for delay-tolerant particle filtering that is computationally efficient and has limited memory requirements. Within this framework the informativeness of a delayed (out-of-sequence) measurement (OOSM) is estimated using a lightweight procedure and uninformative measurements are immediately discarded. The framework requires the identification of a threshold tha…

    Submitted 22 September, 2010; originally announced September 2010.

  33. Greedy Gossip with Eavesdropping

    Authors: Deniz Ustebay, Boris Oreshkin, Mark Coates, Michael Rabbat

    Abstract: This paper presents greedy gossip with eavesdropping (GGE), a novel randomized gossip algorithm for distributed computation of the average consensus problem. In gossip algorithms, nodes in the network randomly communicate with their neighbors and exchange information iteratively. The algorithms are simple and decentralized, making them attractive for wireless network applications. In general, go…

    Submitted 9 September, 2009; originally announced September 2009.

    Comments: 25 pages, 7 figures

  34. arXiv:0903.3537  [pdf, ps, other]

    cs.DC cs.IT cs.MA

    Optimization and Analysis of Distributed Averaging with Short Node Memory

    Authors: Boris N. Oreshkin, Mark J. Coates, Michael G. Rabbat

    Abstract: In this paper, we demonstrate, both theoretically and by numerical examples, that adding a local prediction component to the update rule can significantly improve the convergence rate of distributed averaging algorithms. We focus on the case where the local predictor is a linear combination of the node's two previous values (i.e., two memory taps), and our update rule computes a combination of t…

    Submitted 5 February, 2010; v1 submitted 20 March, 2009; originally announced March 2009.