Showing 1–18 of 18 results for author: Gygli, M

  1. arXiv:2311.03402  [pdf, other]

    cs.CV

    CycleCL: Self-supervised Learning for Periodic Videos

    Authors: Matteo Destro, Michael Gygli

    Abstract: Analyzing periodic video sequences is a key topic in applications such as automatic production systems, remote sensing, medical applications, or physical training. An example is counting repetitions of a physical exercise. Due to the distinct characteristics of periodic data, self-supervised methods designed for standard image datasets do not capture changes relevant to the progression of the cycl…

    Submitted 13 November, 2023; v1 submitted 5 November, 2023; originally announced November 2023.

    Comments: Accepted at WACV 2024

  2. arXiv:2310.04632  [pdf, other]

    cs.CL cs.AI cs.LG

    Automatic Anonymization of Swiss Federal Supreme Court Rulings

    Authors: Joel Niklaus, Robin Mamié, Matthias Stürmer, Daniel Brunner, Marcel Gygli

    Abstract: Releasing court decisions to the public relies on proper anonymization to protect all involved parties, where necessary. The Swiss Federal Supreme Court relies on an existing system that combines different traditional computational methods with human experts. In this work, we enhance the existing anonymization software using a large dataset annotated with entities to be anonymized. We compared BER…

    Submitted 31 October, 2023; v1 submitted 6 October, 2023; originally announced October 2023.

    Comments: Accepted to NLLP @ EMNLP 2023

    MSC Class: 68T50 ACM Class: I.2

  3. arXiv:2103.13318  [pdf, other]

    cs.CV

    Factors of Influence for Transfer Learning across Diverse Appearance Domains and Task Types

    Authors: Thomas Mensink, Jasper Uijlings, Alina Kuznetsova, Michael Gygli, Vittorio Ferrari

    Abstract: Transfer learning enables to re-use knowledge learned on a source task to help learning a target task. A simple form of transfer learning is common in current state-of-the-art computer vision models, i.e. pre-training a model for image classification on the ILSVRC dataset, and then fine-tune on any target task. However, previous systematic studies of transfer learning have been limited and the cir…

    Submitted 20 November, 2021; v1 submitted 24 March, 2021; originally announced March 2021.

    Comments: Accepted for future publication in TPAMI

  4. arXiv:2004.03898  [pdf, other]

    cs.LG cs.CV stat.ML

    Towards Reusable Network Components by Learning Compatible Representations

    Authors: Michael Gygli, Jasper Uijlings, Vittorio Ferrari

    Abstract: This paper proposes to make a first step towards compatible and hence reusable network components. Rather than training networks for different tasks independently, we adapt the training process to produce network components that are compatible across tasks. In particular, we split a network into two components, a features extractor and a target task head, and propose various approaches to accompli…

    Submitted 16 December, 2020; v1 submitted 8 April, 2020; originally announced April 2020.

    Comments: Preprint; To be presented at AAAI 2021

  5. arXiv:2002.11000  [pdf, other]

    cs.CR

    Distributed Ledger for Provenance Tracking of Artificial Intelligence Assets

    Authors: Philipp Lüthi, Thibault Gagnaux, Marcel Gygli

    Abstract: High availability of data is responsible for the current trends in Artificial Intelligence (AI) and Machine Learning (ML). However, high-grade datasets are reluctantly shared between actors because of lacking trust and fear of losing control. Provenance tracing systems are a possible measure to build trust by improving transparency. Especially the tracing of AI assets along complete AI value chain…

    Submitted 25 February, 2020; originally announced February 2020.

  6. arXiv:1911.12709  [pdf, other]

    cs.CV

    Continuous Adaptation for Interactive Object Segmentation by Learning from Corrections

    Authors: Theodora Kontogianni, Michael Gygli, Jasper Uijlings, Vittorio Ferrari

    Abstract: In interactive object segmentation a user collaborates with a computer vision model to segment an object. Recent works employ convolutional neural networks for this task: Given an image and a set of corrections made by the user as input, they output a segmentation mask. These approaches achieve strong performance by training on large datasets but they keep the model parameters unchanged at test ti…

    Submitted 8 November, 2020; v1 submitted 28 November, 2019; originally announced November 2019.

    Comments: ECCV 2020 Camera Ready

  7. arXiv:1906.01542  [pdf, other]

    cs.CV

    Natural Vocabulary Emerges from Free-Form Annotations

    Authors: Jordi Pont-Tuset, Michael Gygli, Vittorio Ferrari

    Abstract: We propose an approach for annotating object classes using free-form text written by undirected and untrained annotators. Free-form labeling is natural for annotators, they intuitively provide very specific and exhaustive labels, and no training stage is necessary. We first collect 729 labels on 15k images using 124 different annotators. Then we automatically enrich the structure of these free-for…

    Submitted 4 June, 2019; originally announced June 2019.

  8. arXiv:1905.10576  [pdf, other]

    cs.CV cs.HC

    Efficient Object Annotation via Speaking and Pointing

    Authors: Michael Gygli, Vittorio Ferrari

    Abstract: Deep neural networks deliver state-of-the-art visual recognition, but they rely on large datasets, which are time-consuming to annotate. These datasets are typically annotated in two stages: (1) determining the presence of object classes at the image level and (2) marking the spatial extent for all objects of these classes. In this work we use speech, together with mouse inputs, to speed up this p…

    Submitted 19 December, 2019; v1 submitted 25 May, 2019; originally announced May 2019.

    Comments: This article is an extension of arXiv:1811.09461, which was published at CVPR 2019

  9. arXiv:1811.09461  [pdf, other]

    cs.CV cs.HC

    Fast Object Class Labelling via Speech

    Authors: Michael Gygli, Vittorio Ferrari

    Abstract: Object class labelling is the task of annotating images with labels on the presence or absence of objects from a given class vocabulary. Simply asking one yes/no question per class, however, has a cost that is linear in the vocabulary size and is thus inefficient for large vocabularies. Modern approaches rely on a hierarchical organization of the vocabulary to reduce annotation time, but remain ex…

    Submitted 11 April, 2019; v1 submitted 23 November, 2018; originally announced November 2018.

    Comments: To be published at CVPR 2019

  10. PHD-GIFs: Personalized Highlight Detection for Automatic GIF Creation

    Authors: Ana García del Molino, Michael Gygli

    Abstract: Highlight detection models are typically trained to identify cues that make visual content appealing or interesting for the general public, with the objective of reducing a video to such moments. However, the "interestingness" of a video segment or image is subjective. Thus, such highlight models provide results of limited relevance for the individual user. On the other hand, training one model pe…

    Submitted 7 August, 2018; v1 submitted 18 April, 2018; originally announced April 2018.

    Comments: Accepted for publication at the 2018 ACM Multimedia Conference (MM '18)

  11. arXiv:1801.00269  [pdf, other]

    cs.CV

    Interactive Video Object Segmentation in the Wild

    Authors: Arnaud Benard, Michael Gygli

    Abstract: In this paper we present our system for human-in-the-loop video object segmentation. The backbone of our system is a method for one-shot video object segmentation. While fast, this method requires an accurate pixel-level segmentation of one (or several) frames as input. As manually annotating such a segmentation is impractical, we propose a deep interactive image segmentation method, that can accu…

    Submitted 31 December, 2017; originally announced January 2018.

  12. arXiv:1705.08214  [pdf, other]

    cs.CV cs.MM

    Ridiculously Fast Shot Boundary Detection with Fully Convolutional Neural Networks

    Authors: Michael Gygli

    Abstract: Shot boundary detection (SBD) is an important component of many video analysis tasks, such as action recognition, video indexing, summarization and editing. Previous work typically used a combination of low-level features like color histograms, in conjunction with simple models such as SVMs. Instead, we propose to learn shot detection end-to-end, from pixels to final shot boundaries. For training…

    Submitted 23 May, 2017; originally announced May 2017.

  13. arXiv:1705.00581  [pdf, other]

    cs.CV cs.CL cs.MM

    Query-adaptive Video Summarization via Quality-aware Relevance Estimation

    Authors: Arun Balajee Vasudevan, Michael Gygli, Anna Volokitin, Luc Van Gool

    Abstract: Although the problem of automatic video summarization has recently received a lot of attention, the problem of creating a video summary that also highlights elements relevant to a search query has been less studied. We address this problem by posing query-relevant summarization as a video frame subset selection problem, which lets us optimise for summaries which are simultaneously diverse, represe…

    Submitted 28 September, 2017; v1 submitted 1 May, 2017; originally announced May 2017.

    Comments: ACM Multimedia 2017

  14. arXiv:1703.04363  [pdf, other]

    cs.LG cs.AI cs.CV

    Deep Value Networks Learn to Evaluate and Iteratively Refine Structured Outputs

    Authors: Michael Gygli, Mohammad Norouzi, Anelia Angelova

    Abstract: We approach structured output prediction by optimizing a deep value network (DVN) to precisely estimate the task loss on different output configurations for a given input. Once the model is trained, we perform inference by gradient descent on the continuous relaxations of the output variables to find outputs with promising scores from the value network. When applied to image segmentation, the valu…

    Submitted 8 August, 2017; v1 submitted 13 March, 2017; originally announced March 2017.

    Comments: Published at ICML 2017

  15. arXiv:1703.02437  [pdf, other]

    cs.CV cs.LG cs.MM

    PathTrack: Fast Trajectory Annotation with Path Supervision

    Authors: Santiago Manen, Michael Gygli, Dengxin Dai, Luc Van Gool

    Abstract: Progress in Multiple Object Tracking (MOT) has been historically limited by the size of the available datasets. We present an efficient framework to annotate trajectories and use it to produce a MOT dataset of unprecedented size. In our novel path supervision the annotator loosely follows the object with the cursor while watching the video, providing a path annotation for each object in the sequen…

    Submitted 22 March, 2017; v1 submitted 7 March, 2017; originally announced March 2017.

    Comments: 10 pages, ICCV submission

  16. arXiv:1701.00599  [pdf, other]

    cs.MM cs.CV cs.SD

    AENet: Learning Deep Audio Features for Video Analysis

    Authors: Naoya Takahashi, Michael Gygli, Luc Van Gool

    Abstract: We propose a new deep network for audio event recognition, called AENet. In contrast to speech, sounds coming from audio events may be produced by a wide variety of sources. Furthermore, distinguishing them often requires analyzing an extended time period due to the lack of clear sub-word units that are present in speech. In order to incorporate this long-time frequency structure of audio events,…

    Submitted 3 January, 2017; v1 submitted 3 January, 2017; originally announced January 2017.

    Comments: 12 pages, 9 figures. arXiv admin note: text overlap with arXiv:1604.07160

  17. arXiv:1605.04850  [pdf, other]

    cs.CV cs.MM

    Video2GIF: Automatic Generation of Animated GIFs from Video

    Authors: Michael Gygli, Yale Song, Liangliang Cao

    Abstract: We introduce the novel problem of automatically generating animated GIFs from video. GIFs are short looping video with no sound, and a perfect combination between image and video that really capture our attention. GIFs tell a story, express emotion, turn events into humorous moments, and are the new wave of photojournalism. We pose the question: Can we automate the entirely manual and elaborate pr…

    Submitted 16 May, 2016; originally announced May 2016.

    Comments: Accepted to CVPR 2016

  18. arXiv:1604.07160  [pdf, other]

    cs.SD cs.MM

    Deep Convolutional Neural Networks and Data Augmentation for Acoustic Event Detection

    Authors: Naoya Takahashi, Michael Gygli, Beat Pfister, Luc Van Gool

    Abstract: We propose a novel method for Acoustic Event Detection (AED). In contrast to speech, sounds coming from acoustic events may be produced by a wide variety of sources. Furthermore, distinguishing them often requires analyzing an extended time period due to the lack of a clear sub-word unit. In order to incorporate the long-time frequency structure for AED, we introduce a convolutional neural network…

    Submitted 7 December, 2016; v1 submitted 25 April, 2016; originally announced April 2016.

    Comments: Presented at INTERSPEECH 2016