Showing 1–5 of 5 results for author: Flanagan, K

Searching in archive cs.
  1. arXiv:2502.08544  [pdf, other]

    cs.CV

    Moment of Untruth: Dealing with Negative Queries in Video Moment Retrieval

    Authors: Kevin Flanagan, Dima Damen, Michael Wray

    Abstract: Video Moment Retrieval is a common task to evaluate the performance of visual-language models - it involves localising start and end times of moments in videos from query sentences. The current task formulation assumes that the queried moment is present in the video, resulting in false positive moment predictions when irrelevant query sentences are provided. In this paper we propose the task of Ne…

    Submitted 13 February, 2025; v1 submitted 12 February, 2025; originally announced February 2025.

    Comments: 16 pages, 9 figures. Accepted at WACV 2025. Paper webpage: https://keflanagan.github.io/Moment-of-Untruth

  2. arXiv:2502.04144  [pdf, other]

    cs.CV

    HD-EPIC: A Highly-Detailed Egocentric Video Dataset

    Authors: Toby Perrett, Ahmad Darkhalil, Saptarshi Sinha, Omar Emara, Sam Pollard, Kranti Parida, Kaiting Liu, Prajwal Gatti, Siddhant Bansal, Kevin Flanagan, Jacob Chalk, Zhifan Zhu, Rhodri Guerrier, Fahd Abdelazim, Bin Zhu, Davide Moltisanti, Michael Wray, Hazel Doughty, Dima Damen

    Abstract: We present a validation dataset of newly-collected kitchen-based egocentric videos, manually annotated with highly detailed and interconnected ground-truth labels covering: recipe steps, fine-grained actions, ingredients with nutritional values, moving objects, and audio annotations. Importantly, all annotations are grounded in 3D through digital twinning of the scene, fixtures, object locations,…

    Submitted 25 March, 2025; v1 submitted 6 February, 2025; originally announced February 2025.

    Comments: Accepted at CVPR 2025. Project Webpage and Dataset: http://hd-epic.github.io

  3. arXiv:2410.07238  [pdf, other]

    cs.HC

    vailá: Versatile Anarcho Integrated Liberation Ánalysis in Multimodal Toolbox

    Authors: Paulo Roberto Pereira Santiago, Abel Gonçalves Chinaglia, Kira Flanagan, Bruno L. S. Bedo, Ligia Yumi Mochida, Juan Aceros, Aline Bononi, Guilherme Manna Cesar

    Abstract: Human movement analysis is crucial in health and sports biomechanics for understanding physical performance, guiding rehabilitation, and preventing injuries. However, existing tools are often proprietary, expensive, and function as "black boxes", limiting user control and customization. This paper introduces vailá-Versatile Anarcho Integrated Liberation Ánalysis in Multimodal Toolbox-an open-sourc…

    Submitted 7 October, 2024; originally announced October 2024.

    Comments: 21 pages, 13 figures, submitted to arXiv under cs.SE (Software Engineering)

    MSC Class: 92C10; 68U10; 65D18; 65K10 ACM Class: I.4.8; J.3; H.5.2; I.2.10

  4. arXiv:2402.02335  [pdf, other]

    cs.CV cs.IR

    Video Editing for Video Retrieval

    Authors: Bin Zhu, Kevin Flanagan, Adriano Fragomeni, Michael Wray, Dima Damen

    Abstract: Though pre-training vision-language models have demonstrated significant benefits in boosting video-text retrieval performance from large-scale web videos, fine-tuning still plays a critical role with manually annotated clips with start and end times, which requires considerable human effort. To address this issue, we explore an alternative cheaper source of annotations, single timestamps, for vid…

    Submitted 7 September, 2024; v1 submitted 3 February, 2024; originally announced February 2024.

  5. arXiv:2310.17395  [pdf, other]

    cs.CV

    Learning Temporal Sentence Grounding From Narrated EgoVideos

    Authors: Kevin Flanagan, Dima Damen, Michael Wray

    Abstract: The onset of long-form egocentric datasets such as Ego4D and EPIC-Kitchens presents a new challenge for the task of Temporal Sentence Grounding (TSG). Compared to traditional benchmarks on which this task is evaluated, these datasets offer finer-grained sentences to ground in notably longer videos. In this paper, we develop an approach for learning to ground sentences in these datasets using only…

    Submitted 26 October, 2023; originally announced October 2023.

    Comments: Accepted at BMVC 2023