Showing 1–7 of 7 results for author: Rennie, S

  1. arXiv:1905.12794  [pdf, other]

    cs.CV

    Fashion IQ: A New Dataset Towards Retrieving Images by Natural Language Feedback

    Authors: Hui Wu, Yupeng Gao, Xiaoxiao Guo, Ziad Al-Halah, Steven Rennie, Kristen Grauman, Rogerio Feris

    Abstract: Conversational interfaces for the detail-oriented retail fashion domain are more natural, expressive, and user friendly than classical keyword-based search interfaces. In this paper, we introduce the Fashion IQ dataset to support and advance research on interactive fashion image retrieval. Fashion IQ is the first fashion dataset to provide human-generated captions that distinguish similar pairs of…

    Submitted 25 November, 2020; v1 submitted 29 May, 2019; originally announced May 2019.

  2. arXiv:1805.00145  [pdf, other]

    cs.CV cs.AI

    Dialog-based Interactive Image Retrieval

    Authors: Xiaoxiao Guo, Hui Wu, Yu Cheng, Steven Rennie, Gerald Tesauro, Rogerio Schmidt Feris

    Abstract: Existing methods for interactive image retrieval have demonstrated the merit of integrating user feedback, improving retrieval results. However, most current systems rely on restricted forms of user feedback, such as binary relevance responses, or feedback based on a fixed set of relative attributes, which limits their impact. In this paper, we introduce a new approach to interactive image search…

    Submitted 20 December, 2018; v1 submitted 30 April, 2018; originally announced May 2018.

    Comments: accepted at NeurIPS 2018

  3. arXiv:1711.08393  [pdf, other]

    cs.CV cs.LG

    BlockDrop: Dynamic Inference Paths in Residual Networks

    Authors: Zuxuan Wu, Tushar Nagarajan, Abhishek Kumar, Steven Rennie, Larry S. Davis, Kristen Grauman, Rogerio Feris

    Abstract: Very deep convolutional neural networks offer excellent recognition results, yet their computational expense limits their impact for many real-world applications. We introduce BlockDrop, an approach that learns to dynamically choose which layers of a deep network to execute during inference so as to best reduce total computation without degrading prediction accuracy. Exploiting the robustness of R…

    Submitted 28 January, 2019; v1 submitted 22 November, 2017; originally announced November 2017.

    Comments: CVPR 2018

  4. arXiv:1612.00563  [pdf, other]

    cs.LG cs.AI cs.CV

    Self-critical Sequence Training for Image Captioning

    Authors: Steven J. Rennie, Etienne Marcheret, Youssef Mroueh, Jarret Ross, Vaibhava Goel

    Abstract: Recently it has been shown that policy-gradient methods for reinforcement learning can be utilized to train deep end-to-end systems directly on non-differentiable metrics for the task at hand. In this paper we consider the problem of optimizing image captioning systems using reinforcement learning, and show that by carefully optimizing our systems using the test metrics of the MSCOCO task, signifi…

    Submitted 15 November, 2017; v1 submitted 1 December, 2016; originally announced December 2016.

    Comments: CVPR 2017 + additional analysis + fixed baseline results, 16 pages

  5. arXiv:1604.08242  [pdf, other]

    cs.CL

    The IBM 2016 English Conversational Telephone Speech Recognition System

    Authors: George Saon, Tom Sercu, Steven Rennie, Hong-Kwang J. Kuo

    Abstract: We describe a collection of acoustic and language modeling techniques that lowered the word error rate of our English conversational telephone LVCSR system to a record 6.6% on the Switchboard subset of the Hub5 2000 evaluation testset. On the acoustic side, we use a score fusion of three strong models: recurrent nets with maxout activations, very deep convolutional nets with 3x3 kernels, and bidir…

    Submitted 22 June, 2016; v1 submitted 27 April, 2016; originally announced April 2016.

    Comments: Submitted to Interspeech 2016

  6. arXiv:1506.03705  [pdf, other]

    cs.LG stat.ML

    Random Maxout Features

    Authors: Youssef Mroueh, Steven Rennie, Vaibhava Goel

    Abstract: In this paper, we propose and study random maxout features, which are constructed by first projecting the input data onto sets of randomly generated vectors with Gaussian elements, and then outputting the maximum projection value for each set. We show that the resulting random feature map, when used in conjunction with linear models, allows for the locally linear estimation of the function of inter…

    Submitted 12 June, 2015; v1 submitted 11 June, 2015; originally announced June 2015.

  7. arXiv:1505.05899  [pdf, other]

    cs.CL

    The IBM 2015 English Conversational Telephone Speech Recognition System

    Authors: George Saon, Hong-Kwang J. Kuo, Steven Rennie, Michael Picheny

    Abstract: We describe the latest improvements to the IBM English conversational telephone speech recognition system. Some of the techniques that were found beneficial are: maxout networks with annealed dropout rates; networks with a very large number of outputs trained on 2000 hours of data; joint modeling of partially unfolded recurrent neural networks and convolutional nets by combining the bottleneck and…

    Submitted 21 May, 2015; originally announced May 2015.

    Comments: Submitted to Interspeech 2015