Showing 1–17 of 17 results for author: Diba, A

  1. arXiv:2409.16581  [pdf, other]

    cs.CV

    SelectiveKD: A semi-supervised framework for cancer detection in DBT through Knowledge Distillation and Pseudo-labeling

    Authors: Laurent Dillard, Hyeonsoo Lee, Weonsuk Lee, Tae Soo Kim, Ali Diba, Thijs Kooi

    Abstract: When developing Computer Aided Detection (CAD) systems for Digital Breast Tomosynthesis (DBT), the complexity arising from the volumetric nature of the modality poses significant technical challenges for obtaining large-scale accurate annotations. Without access to large-scale annotations, the resulting model may not generalize to different domains. Given the costly nature of obtaining DBT annotat…

    Submitted 24 September, 2024; originally announced September 2024.

    Comments: 10 pages, 2 figures, 1 table

    MSC Class: 68T45; 92C55 ACM Class: I.4.9; I.5.4

  2. arXiv:2406.03430  [pdf, other]

    eess.IV cs.CV

    Computation-Efficient Era: A Comprehensive Survey of State Space Models in Medical Image Analysis

    Authors: Moein Heidari, Sina Ghorbani Kolahi, Sanaz Karimijafarbigloo, Bobby Azad, Afshin Bozorgpour, Soheila Hatami, Reza Azad, Ali Diba, Ulas Bagci, Dorit Merhof, Ilker Hacihaliloglu

    Abstract: Sequence modeling plays a vital role across various domains, with recurrent neural networks being historically the predominant method of performing these tasks. However, the emergence of transformers has altered this paradigm due to their superior performance. Built upon these advances, transformers have conjoined CNNs as two leading foundational models for learning visual representations. However…

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: This is the first version of our survey, and the paper is currently under review

  3. arXiv:2103.11264  [pdf, other]

    cs.CV cs.AI cs.LG

    Temporally-Weighted Hierarchical Clustering for Unsupervised Action Segmentation

    Authors: M. Saquib Sarfraz, Naila Murray, Vivek Sharma, Ali Diba, Luc Van Gool, Rainer Stiefelhagen

    Abstract: Action segmentation refers to inferring boundaries of semantically consistent visual concepts in videos and is an important requirement for many video understanding tasks. For this and other video understanding tasks, supervised approaches have achieved encouraging performance but require a high volume of detailed frame-level annotations. We present a fully automatic and unsupervised approach for…

    Submitted 27 March, 2021; v1 submitted 20 March, 2021; originally announced March 2021.

    Comments: CVPR 2021

  4. arXiv:2011.08652  [pdf, other]

    cs.CV

    3D CNNs with Adaptive Temporal Feature Resolutions

    Authors: Mohsen Fayyaz, Emad Bahrami, Ali Diba, Mehdi Noroozi, Ehsan Adeli, Luc Van Gool, Juergen Gall

    Abstract: While state-of-the-art 3D Convolutional Neural Networks (CNN) achieve very good results on action recognition datasets, they are computationally very expensive and require many GFLOPs. While the GFLOPs of a 3D CNN can be decreased by reducing the temporal feature resolution within the network, there is no setting that is optimal for all input clips. In this work, we therefore introduce a different…

    Submitted 11 August, 2021; v1 submitted 17 November, 2020; originally announced November 2020.

    Comments: CVPR 2021

  5. arXiv:2010.07258  [pdf, other]

    cs.CV

    Self-Supervised Ranking for Representation Learning

    Authors: Ali Varamesh, Ali Diba, Tinne Tuytelaars, Luc Van Gool

    Abstract: We present a new framework for self-supervised representation learning by formulating it as a ranking problem in an image retrieval context on a large number of random views (augmentations) obtained from images. Our work is based on two intuitions: first, a good representation of images must yield a high-quality image ranking in a retrieval task; second, we would expect random views of an image to…

    Submitted 20 November, 2020; v1 submitted 14 October, 2020; originally announced October 2020.

  6. arXiv:1904.11451  [pdf, other]

    cs.CV

    Large Scale Holistic Video Understanding

    Authors: Ali Diba, Mohsen Fayyaz, Vivek Sharma, Manohar Paluri, Jurgen Gall, Rainer Stiefelhagen, Luc Van Gool

    Abstract: Video recognition has been advanced in recent years by benchmarks with rich annotations. However, research is still mainly limited to human action or sports recognition - focusing on a highly specific video understanding task and thus leaving a significant gap towards describing the overall content of a video. We fill this gap by presenting a large-scale "Holistic Video Understanding Dataset" (HVU…

    Submitted 15 December, 2020; v1 submitted 25 April, 2019; originally announced April 2019.

    Comments: ECCV 2020

  7. arXiv:1904.11407  [pdf, other]

    cs.CV

    DynamoNet: Dynamic Action and Motion Network

    Authors: Ali Diba, Vivek Sharma, Luc Van Gool, Rainer Stiefelhagen

    Abstract: In this paper, we are interested in self-supervised learning the motion cues in videos using dynamic motion filters for a better motion representation to finally boost human action recognition in particular. Thus far, the vision community has focused on spatio-temporal approaches using standard filters, rather we here propose dynamic filters that adaptively learn the video-specific internal motion…

    Submitted 25 April, 2019; originally announced April 2019.

  8. arXiv:1806.07754  [pdf, other]

    cs.CV

    Spatio-Temporal Channel Correlation Networks for Action Classification

    Authors: Ali Diba, Mohsen Fayyaz, Vivek Sharma, M. Mahdi Arzani, Rahman Yousefzadeh, Juergen Gall, Luc Van Gool

    Abstract: The work in this paper is driven by the question if spatio-temporal correlations are enough for 3D convolutional neural networks (CNN)? Most of the traditional 3D networks use local spatio-temporal features. We introduce a new block that models correlations between channels of a 3D CNN with respect to temporal and spatial features. This new block can be added as a residual unit to different parts…

    Submitted 7 February, 2019; v1 submitted 19 June, 2018; originally announced June 2018.

    Comments: Accepted in ECCV 2018. arXiv admin note: substantial text overlap with arXiv:1711.08200

  9. arXiv:1711.08200  [pdf, other]

    cs.CV

    Temporal 3D ConvNets: New Architecture and Transfer Learning for Video Classification

    Authors: Ali Diba, Mohsen Fayyaz, Vivek Sharma, Amir Hossein Karami, Mohammad Mahdi Arzani, Rahman Yousefzadeh, Luc Van Gool

    Abstract: The work in this paper is driven by the question how to exploit the temporal cues available in videos for their accurate classification, and for human action recognition in particular? Thus far, the vision community has focused on spatio-temporal approaches with fixed temporal convolution kernel depths. We introduce a new temporal layer that models variable temporal convolution kernel depths. We e…

    Submitted 22 November, 2017; originally announced November 2017.

  10. arXiv:1711.08174  [pdf, other]

    cs.CV

    Weakly Supervised Object Discovery by Generative Adversarial & Ranking Networks

    Authors: Ali Diba, Vivek Sharma, Rainer Stiefelhagen, Luc Van Gool

    Abstract: The deep generative adversarial networks (GAN) recently have been shown to be promising for different computer vision applications, like image editing, synthesizing high resolution images, generating videos, etc. These networks and the corresponding learning scheme can handle various visual space mappings. We approach GANs with a novel training method and learning objective, to discover multip…

    Submitted 17 April, 2018; v1 submitted 22 November, 2017; originally announced November 2017.

  11. arXiv:1710.07558  [pdf, other]

    cs.CV cs.AI

    Classification Driven Dynamic Image Enhancement

    Authors: Vivek Sharma, Ali Diba, Davy Neven, Michael S. Brown, Luc Van Gool, Rainer Stiefelhagen

    Abstract: Convolutional neural networks rely on image texture and structure to serve as discriminative features to classify the image content. Image enhancement techniques can be used as preprocessing steps to help improve the overall image quality and in turn improve the overall effectiveness of a CNN. Existing image enhancement methods, however, are designed to improve the perceptual quality of an image f…

    Submitted 28 March, 2018; v1 submitted 20 October, 2017; originally announced October 2017.

  12. arXiv:1611.08258  [pdf, other]

    cs.CV

    Weakly Supervised Cascaded Convolutional Networks

    Authors: Ali Diba, Vivek Sharma, Ali Pazandeh, Hamed Pirsiavash, Luc Van Gool

    Abstract: Object detection is a challenging task in visual understanding domain, and even more so if the supervision is to be weak. Recently, few efforts to handle the task without expensive human annotations is established by promising deep neural network. A new architecture of cascaded networks is proposed to learn a convolutional neural network (CNN) under such conditions. We introduce two such architect…

    Submitted 24 November, 2016; originally announced November 2016.

  13. arXiv:1611.06678  [pdf, other]

    cs.CV

    Deep Temporal Linear Encoding Networks

    Authors: Ali Diba, Vivek Sharma, Luc Van Gool

    Abstract: The CNN-encoding of features from entire videos for the representation of human actions has rarely been addressed. Instead, CNN work has focused on approaches to fuse spatial and temporal networks, but these were typically limited to processing shorter sequences. We present a new video representation, called temporal linear encoding (TLE) and embedded inside of CNNs as a new layer, which captures…

    Submitted 21 November, 2016; originally announced November 2016.

    Comments: Ali Diba and Vivek Sharma contributed equally to this work and are listed in alphabetical order

  14. arXiv:1608.08851  [pdf, other]

    cs.CV

    Efficient Two-Stream Motion and Appearance 3D CNNs for Video Classification

    Authors: Ali Diba, Ali Mohammad Pazandeh, Luc Van Gool

    Abstract: The video and action classification have extremely evolved by deep neural networks specially with two stream CNN using RGB and optical flow as inputs and they present outstanding performance in terms of video analysis. One of the shortcoming of these methods is handling motion information extraction which is done out side of the CNNs and relatively time consuming also on GPUs. So proposing end-to-…

    Submitted 2 September, 2016; v1 submitted 31 August, 2016; originally announced August 2016.

  15. arXiv:1608.03217  [pdf, other]

    cs.CV

    DeepCAMP: Deep Convolutional Action & Attribute Mid-Level Patterns

    Authors: Ali Diba, Ali Mohammad Pazandeh, Hamed Pirsiavash, Luc Van Gool

    Abstract: The recognition of human actions and the determination of human attributes are two tasks that call for fine-grained classification. Indeed, often rather small and inconspicuous objects and features have to be detected to tell their classes apart. In order to deal with this challenge, we propose a novel convolutional neural network that mines mid-level image patches that are sufficiently dedicated…

    Submitted 10 August, 2016; originally announced August 2016.

    Comments: in CVPR 2016

  16. arXiv:1606.04702  [pdf, other]

    cs.CV

    DeepProposals: Hunting Objects and Actions by Cascading Deep Convolutional Layers

    Authors: Amir Ghodrati, Ali Diba, Marco Pedersoli, Tinne Tuytelaars, Luc Van Gool

    Abstract: In this paper, a new method for generating object and action proposals in images and videos is proposed. It builds on activations of different convolutional layers of a pretrained CNN, combining the localization accuracy of the early layers with the high informative-ness (and hence recall) of the later layers. To this end, we build an inverse cascade that, going backward from the later to the earl…

    Submitted 15 June, 2016; originally announced June 2016.

    Comments: 15 pages

  17. arXiv:1510.04445  [pdf, other]

    cs.CV

    DeepProposal: Hunting Objects by Cascading Deep Convolutional Layers

    Authors: Amir Ghodrati, Ali Diba, Marco Pedersoli, Tinne Tuytelaars, Luc Van Gool

    Abstract: In this paper we evaluate the quality of the activation layers of a convolutional neural network (CNN) for the generation of object proposals. We generate hypotheses in a sliding-window fashion over different activation layers and show that the final convolutional layers can find the object of interest with high recall but poor localization due to the coarseness of the feature maps. Instead, the…

    Submitted 15 October, 2015; originally announced October 2015.

    Comments: ICCV 2015