Showing 1–16 of 16 results for author: Bourdev, L

  1. arXiv:2212.10674

    eess.IV

    PIM: Video Coding using Perceptual Importance Maps

    Authors: Evgenya Pergament, Pulkit Tandon, Oren Rippel, Lubomir Bourdev, Alexander G. Anderson, Bruno Olshausen, Tsachy Weissman, Sachin Katti, Kedar Tatwawadi

    Abstract: Human perception is at the core of lossy video compression, with numerous approaches developed for perceptual quality assessment and improvement over the past two decades. In the determination of perceptual quality, different spatio-temporal regions of the video differ in their relative importance to the human viewer. However, since it is challenging to infer or even collect such fine-grained info…

    Submitted 9 April, 2023; v1 submitted 20 December, 2022; originally announced December 2022.

  2. arXiv:2205.03969

    eess.IV

    An Interactive Annotation Tool for Perceptual Video Compression

    Authors: Evgenya Pergament, Pulkit Tandon, Kedar Tatwawadi, Oren Rippel, Lubomir Bourdev, Bruno Olshausen, Tsachy Weissman, Sachin Katti, Alexander G. Anderson

    Abstract: Human perception is at the core of lossy video compression and yet, it is challenging to collect data that is sufficiently dense to drive compression. In perceptual quality assessment, human feedback is typically collected as a single scalar quality score indicating preference of one distorted video over another. In reality, some videos may be better in some parts but not in others. We propose an…

    Submitted 8 May, 2022; originally announced May 2022.

  3. arXiv:2104.14335

    eess.IV cs.AI cs.CV cs.LG

    ELF-VC: Efficient Learned Flexible-Rate Video Coding

    Authors: Oren Rippel, Alexander G. Anderson, Kedar Tatwawadi, Sanjay Nair, Craig Lytle, Lubomir Bourdev

    Abstract: While learned video codecs have demonstrated great promise, they have yet to achieve sufficient efficiency for practical deployment. In this work, we propose several novel ideas for learned video compression which allow for improved performance for the low-latency mode (I- and P-frames only) along with a considerable increase in computational efficiency. In this setting, for natural videos our app…

    Submitted 29 April, 2021; originally announced April 2021.

    Journal ref: International Conference on Computer Vision, 2021

  4. arXiv:1811.06981

    eess.IV cs.CV cs.LG stat.ML

    Learned Video Compression

    Authors: Oren Rippel, Sanjay Nair, Carissa Lew, Steve Branson, Alexander G. Anderson, Lubomir Bourdev

    Abstract: We present a new algorithm for video coding, learned end-to-end for the low-latency mode. In this setting, our approach outperforms all existing video codecs across nearly the entire bitrate range. To our knowledge, this is the first ML-based method to do so. We evaluate our approach on standard video compression test sets of varying resolutions, and benchmark against all mainstream commercial c…

    Submitted 16 November, 2018; originally announced November 2018.

  5. arXiv:1705.05823

    stat.ML cs.CV cs.LG

    Real-Time Adaptive Image Compression

    Authors: Oren Rippel, Lubomir Bourdev

    Abstract: We present a machine learning-based approach to lossy image compression which outperforms all existing codecs, while running in real-time. Our algorithm typically produces files 2.5 times smaller than JPEG and JPEG 2000, 2 times smaller than WebP, and 1.7 times smaller than BPG on datasets of generic images across all quality levels. At the same time, our codec is designed to be lightweight and…

    Submitted 16 May, 2017; originally announced May 2017.

    Comments: Published at ICML 2017

  6. arXiv:1511.06681

    cs.CV

    Deep End2End Voxel2Voxel Prediction

    Authors: Du Tran, Lubomir Bourdev, Rob Fergus, Lorenzo Torresani, Manohar Paluri

    Abstract: Over the last few years deep learning methods have emerged as one of the most prominent approaches for video analysis. However, so far their most successful applications have been in the area of video classification and detection, i.e., problems involving the prediction of a single class label or a handful of output variables per video. Furthermore, while deep networks are commonly recognized as t…

    Submitted 20 November, 2015; originally announced November 2015.

  7. arXiv:1511.05939

    stat.ML cs.LG

    Metric Learning with Adaptive Density Discrimination

    Authors: Oren Rippel, Manohar Paluri, Piotr Dollar, Lubomir Bourdev

    Abstract: Distance metric learning (DML) approaches learn a transformation to a representation space where distance is in correspondence with a predefined notion of similarity. While such models offer a number of compelling benefits, it has been difficult for these to compete with modern classification algorithms in performance and even in feature extraction. In this work, we propose a novel approach expl…

    Submitted 1 March, 2016; v1 submitted 18 November, 2015; originally announced November 2015.

    Comments: ICLR 2016

  8. arXiv:1511.03776

    cs.CV

    ProNet: Learning to Propose Object-specific Boxes for Cascaded Neural Networks

    Authors: Chen Sun, Manohar Paluri, Ronan Collobert, Ram Nevatia, Lubomir Bourdev

    Abstract: This paper aims to classify and locate objects accurately and efficiently, without using bounding box annotations. It is challenging as objects in the wild could appear at arbitrary locations and in different scales. In this paper, we propose a novel classification architecture ProNet based on convolutional neural networks. It uses computationally efficient neural networks to propose image regions…

    Submitted 12 April, 2016; v1 submitted 12 November, 2015; originally announced November 2015.

    Comments: CVPR 2016 (fixed reference issue)

  9. arXiv:1505.03873

    cs.CV

    Improving Image Classification with Location Context

    Authors: Kevin Tang, Manohar Paluri, Li Fei-Fei, Rob Fergus, Lubomir Bourdev

    Abstract: With the widespread availability of cellphones and cameras that have GPS capabilities, it is common for images being uploaded to the Internet today to have GPS coordinates associated with them. In addition to research that tries to predict GPS coordinates from visual features, this also opens up the door to problems that are conditioned on the availability of GPS coordinates. In this work, we tack…

    Submitted 14 May, 2015; originally announced May 2015.

  10. arXiv:1501.05703

    cs.CV

    Beyond Frontal Faces: Improving Person Recognition Using Multiple Cues

    Authors: Ning Zhang, Manohar Paluri, Yaniv Taigman, Rob Fergus, Lubomir Bourdev

    Abstract: We explore the task of recognizing peoples' identities in photo albums in an unconstrained setting. To facilitate this, we introduce the new People In Photo Albums (PIPA) dataset, consisting of over 60000 instances of 2000 individuals collected from public Flickr photo albums. With only about half of the person images containing a frontal face, the recognition task is very challenging due to the l…

    Submitted 30 January, 2015; v1 submitted 22 January, 2015; originally announced January 2015.

  11. arXiv:1412.6115

    cs.CV cs.LG cs.NE

    Compressing Deep Convolutional Networks using Vector Quantization

    Authors: Yunchao Gong, Liu Liu, Ming Yang, Lubomir Bourdev

    Abstract: Deep convolutional neural networks (CNN) has become the most promising method for object recognition, repeatedly demonstrating record breaking results for image classification and object detection in recent years. However, a very deep CNN generally involves many layers with millions of parameters, making the storage of the network model to be extremely large. This prohibits the usage of deep CNNs…

    Submitted 18 December, 2014; originally announced December 2014.

  12. arXiv:1412.0767

    cs.CV

    Learning Spatiotemporal Features with 3D Convolutional Networks

    Authors: Du Tran, Lubomir Bourdev, Rob Fergus, Lorenzo Torresani, Manohar Paluri

    Abstract: We propose a simple, yet effective approach for spatiotemporal feature learning using deep 3-dimensional convolutional networks (3D ConvNets) trained on a large scale supervised video dataset. Our findings are three-fold: 1) 3D ConvNets are more suitable for spatiotemporal feature learning compared to 2D ConvNets; 2) A homogeneous architecture with small 3x3x3 convolution kernels in all layers is…

    Submitted 6 October, 2015; v1 submitted 1 December, 2014; originally announced December 2014.

  13. arXiv:1407.0717

    cs.CV

    Deep Poselets for Human Detection

    Authors: Lubomir Bourdev, Fei Yang, Rob Fergus

    Abstract: We address the problem of detecting people in natural scenes using a part approach based on poselets. We propose a bootstrapping method that allows us to collect millions of weakly labeled examples for each poselet type. We use these examples to train a Convolutional Neural Net to discriminate different poselet types and separate them from the background class. We then use the trained CNN as a way…

    Submitted 2 July, 2014; originally announced July 2014.

  14. arXiv:1406.2080

    cs.CV cs.LG cs.NE

    Training Convolutional Networks with Noisy Labels

    Authors: Sainbayar Sukhbaatar, Joan Bruna, Manohar Paluri, Lubomir Bourdev, Rob Fergus

    Abstract: The availability of large labeled datasets has allowed Convolutional Network models to achieve impressive recognition results. However, in many settings manual annotation of the data is impractical; instead our data has noisy labels, i.e. there is some freely available label for each image which may or may not be accurate. In this paper, we explore the performance of discriminatively-trained Convn…

    Submitted 10 April, 2015; v1 submitted 9 June, 2014; originally announced June 2014.

    Comments: Accepted as a workshop contribution at ICLR 2015

  15. arXiv:1405.0312

    cs.CV

    Microsoft COCO: Common Objects in Context

    Authors: Tsung-Yi Lin, Michael Maire, Serge Belongie, Lubomir Bourdev, Ross Girshick, James Hays, Pietro Perona, Deva Ramanan, C. Lawrence Zitnick, Piotr Dollár

    Abstract: We present a new dataset with the goal of advancing the state-of-the-art in object recognition by placing the question of object recognition in the context of the broader question of scene understanding. This is achieved by gathering images of complex everyday scenes containing common objects in their natural context. Objects are labeled using per-instance segmentations to aid in precise object lo…

    Submitted 20 February, 2015; v1 submitted 1 May, 2014; originally announced May 2014.

    Comments: 1) updated annotation pipeline description and figures; 2) added new section describing datasets splits; 3) updated author list

  16. arXiv:1311.5591

    cs.CV

    PANDA: Pose Aligned Networks for Deep Attribute Modeling

    Authors: Ning Zhang, Manohar Paluri, Marc'Aurelio Ranzato, Trevor Darrell, Lubomir Bourdev

    Abstract: We propose a method for inferring human attributes (such as gender, hair style, clothes style, expression, action) from images of people under large variation of viewpoint, pose, appearance, articulation and occlusion. Convolutional Neural Nets (CNN) have been shown to perform very well on large scale object recognition problems. In the context of attribute classification, however, the signal is o…

    Submitted 5 May, 2014; v1 submitted 21 November, 2013; originally announced November 2013.

    Comments: 8 pages