
Showing 1–13 of 13 results for author: Polanía, L

Searching in archive cs.
  1. arXiv:2502.07001

    cs.CV cs.AI cs.LG

    From Image to Video: An Empirical Study of Diffusion Representations

    Authors: Pedro Vélez, Luisa F. Polanía, Yi Yang, Chuhan Zhang, Rishabh Kabra, Anurag Arnab, Mehdi S. M. Sajjadi

    Abstract: Diffusion models have revolutionized generative modeling, enabling unprecedented realism in image and video synthesis. This success has sparked interest in leveraging their representations for visual understanding tasks. While recent works have explored this potential for image generation, the visual understanding capabilities of video diffusion models remain largely uncharted. To address this gap…

    Submitted 19 March, 2025; v1 submitted 10 February, 2025; originally announced February 2025.

  2. arXiv:2412.15212

    cs.CV cs.AI cs.LG

    Scaling 4D Representations

    Authors: João Carreira, Dilara Gokay, Michael King, Chuhan Zhang, Ignacio Rocco, Aravindh Mahendran, Thomas Albert Keck, Joseph Heyward, Skanda Koppula, Etienne Pot, Goker Erdogan, Yana Hasson, Yi Yang, Klaus Greff, Guillaume Le Moing, Sjoerd van Steenkiste, Daniel Zoran, Drew A. Hudson, Pedro Vélez, Luisa Polanía, Luke Friedman, Chris Duvarney, Ross Goroshin, Kelsey Allen, Jacob Walker, et al. (10 additional authors not shown)

    Abstract: Scaling has not yet been convincingly demonstrated for pure self-supervised learning from video. However, prior work has focused evaluations on semantic-related tasks – action classification, ImageNet classification, etc. In this paper we focus on evaluating self-supervised learning on non-semantic vision tasks that are more spatial (3D) and temporal (+1D = 4D), such as camera pose…

    Submitted 9 July, 2025; v1 submitted 19 December, 2024; originally announced December 2024.

  3. arXiv:2306.00763

    cs.CV cs.AI

    Learning Disentangled Prompts for Compositional Image Synthesis

    Authors: Kihyuk Sohn, Albert Shaw, Yuan Hao, Han Zhang, Luisa Polania, Huiwen Chang, Lu Jiang, Irfan Essa

    Abstract: We study domain-adaptive image synthesis, the problem of teaching pretrained image generative models a new style or concept from as few as one image to synthesize novel images, to better understand the compositional image synthesis. We present a framework that leverages a pretrained class-conditional generation model and visual prompt tuning. Specifically, we propose a novel source class distilled…

    Submitted 1 June, 2023; originally announced June 2023.

    Comments: tech report

  4. arXiv:2210.00990

    cs.CV cs.AI

    Visual Prompt Tuning for Generative Transfer Learning

    Authors: Kihyuk Sohn, Yuan Hao, José Lezama, Luisa Polania, Huiwen Chang, Han Zhang, Irfan Essa, Lu Jiang

    Abstract: Transferring knowledge from an image synthesis model trained on a large dataset is a promising direction for learning generative image models from various domains efficiently. While previous works have studied GAN models, we present a recipe for learning vision transformers by generative knowledge transfer. We base our framework on state-of-the-art generative vision transformers that represent an…

    Submitted 3 October, 2022; originally announced October 2022.

    Comments: technical report

  5. arXiv:2004.07268

    cs.CV cs.LG

    Learning Furniture Compatibility with Graph Neural Networks

    Authors: Luisa F. Polania, Mauricio Flores, Yiran Li, Matthew Nokleby

    Abstract: We propose a graph neural network (GNN) approach to the problem of predicting the stylistic compatibility of a set of furniture items from images. While most existing results are based on siamese networks which evaluate pairwise compatibility between items, the proposed GNN architecture exploits relational information among groups of items. We present two GNN models, both of which comprise a deep…

    Submitted 15 April, 2020; originally announced April 2020.

    Comments: Accepted for publication at CVPR Workshops

  6. arXiv:2002.09023

    cs.CV

    Audio-video Emotion Recognition in the Wild using Deep Hybrid Networks

    Authors: Xin Guo, Luisa F. Polanía, Kenneth E. Barner

    Abstract: This paper presents an audiovisual-based emotion recognition hybrid network. While most of the previous work focuses either on using deep models or hand-engineered features extracted from images, we explore multiple deep models built on both images and audio signals. Specifically, in addition to convolutional neural networks (CNN) and recurrent neural networks (RNN) trained on facial images, the…

    Submitted 20 February, 2020; originally announced February 2020.

  7. arXiv:1912.05035

    cs.CV cs.LG eess.IV

    Deep Adaptive Wavelet Network

    Authors: Maria Ximena Bastidas Rodriguez, Adrien Gruson, Luisa F. Polania, Shin Fujieda, Flavio Prieto Ortiz, Kohei Takayama, Toshiya Hachisuka

    Abstract: Even though convolutional neural networks have become the method of choice in many fields of computer vision, they still lack interpretability and are usually designed manually in a cumbersome trial-and-error process. This paper aims at overcoming those limitations by proposing a deep neural network, which is designed in a systematic fashion and is interpretable, by integrating multiresolution ana…

    Submitted 10 December, 2019; originally announced December 2019.

  8. arXiv:1909.12911

    cs.CV

    Graph Neural Networks for Image Understanding Based on Multiple Cues: Group Emotion Recognition and Event Recognition as Use Cases

    Authors: Xin Guo, Luisa F. Polania, Bin Zhu, Charles Boncelet, Kenneth E. Barner

    Abstract: A graph neural network (GNN) for image understanding based on multiple cues is proposed in this paper. Compared to traditional feature and decision fusion approaches that neglect the fact that features can interact and exchange information, the proposed GNN is able to pass information among features extracted from different models. Two image understanding tasks, namely group-level emotion recognit…

    Submitted 28 February, 2020; v1 submitted 18 September, 2019; originally announced September 2019.

    Comments: Paper accepted for publication at the 2020 IEEE Winter Conference on Applications of Computer Vision (WACV)

  9. arXiv:1905.03703

    cs.CV

    Learning fashion compatibility across apparel categories for outfit recommendation

    Authors: Luisa F. Polania, Satyajit Gupte

    Abstract: This paper addresses the problem of generating recommendations for completing the outfit given that a user is interested in a particular apparel item. The proposed method is based on a siamese network used for feature extraction followed by a fully-connected network used for learning a fashion compatibility metric. The embeddings generated by the siamese network are augmented with color histogram…

    Submitted 1 May, 2019; originally announced May 2019.

    Comments: Accepted for publication at ICIP 2019

  10. arXiv:1811.03268

    cs.CV

    Ordinal Regression using Noisy Pairwise Comparisons for Body Mass Index Range Estimation

    Authors: Luisa Polania, Dongning Wang, Glenn Fung

    Abstract: Ordinal regression aims to classify instances into ordinal categories. In this paper, body mass index (BMI) category estimation from facial images is cast as an ordinal regression problem. In particular, noisy binary search algorithms based on pairwise comparisons are employed to exploit the ordinal relationship among BMI categories. Comparisons are performed with Siamese architectures, one of whi…

    Submitted 7 November, 2018; originally announced November 2018.

    Comments: Paper accepted for publication at the 2019 IEEE Winter Conference on Applications of Computer Vision (WACV 2019)

  11. arXiv:1802.02185

    cs.CV

    Smile detection in the wild based on transfer learning

    Authors: Xin Guo, Luisa F. Polanía, Kenneth E. Barner

    Abstract: Smile detection from unconstrained facial images is a specialized and challenging problem. As one of the most informative expressions, smiles convey basic underlying emotions, such as happiness and satisfaction, which lead to multiple applications, e.g., human behavior analysis and interactive controlling. Compared to the size of databases for face recognition, far less labeled data is available f…

    Submitted 17 January, 2018; originally announced February 2018.

  12. Exploiting Restricted Boltzmann Machines and Deep Belief Networks in Compressed Sensing

    Authors: Luisa F. Polania, Kenneth E. Barner

    Abstract: This paper proposes a CS scheme that exploits the representational power of restricted Boltzmann machines and deep learning architectures to model the prior distribution of the sparsity pattern of signals belonging to the same class. The determined probability distribution is then used in a maximum a posteriori (MAP) approach for the reconstruction. The parameters of the prior distribution are lea…

    Submitted 30 May, 2017; originally announced May 2017.

    Comments: Accepted for publication at IEEE Transactions on Signal Processing

  13. Exploiting Prior Knowledge in Compressed Sensing Wireless ECG Systems

    Authors: Luisa F. Polania, Rafael E. Carrillo, Manuel Blanco-Velasco, Kenneth E. Barner

    Abstract: Recent results in telecardiology show that compressed sensing (CS) is a promising tool to lower energy consumption in wireless body area networks for electrocardiogram (ECG) monitoring. However, the performance of current CS-based algorithms, in terms of compression rate and reconstruction quality of the ECG, still falls short of the performance attained by state-of-the-art wavelet based algorithm…

    Submitted 29 May, 2014; v1 submitted 16 May, 2014; originally announced May 2014.

    Comments: Accepted for publication at IEEE Journal of Biomedical and Health Informatics