
Showing 1–44 of 44 results for author: Heikkila, J

  1. arXiv:2504.01620  [pdf, other]

    cs.CV

    A Conic Transformation Approach for Solving the Perspective-Three-Point Problem

    Authors: Haidong Wu, Snehal Bhayani, Janne Heikkilä

    Abstract: We propose a conic transformation method to solve the Perspective-Three-Point (P3P) problem. In contrast to the current state-of-the-art solvers, which formulate the P3P problem by intersecting two conics and constructing a degenerate conic to find the intersection, our approach builds upon a new formulation based on a transformation that maps the two conics to a new coordinate system, where one o…

    Submitted 2 April, 2025; originally announced April 2025.

  2. arXiv:2503.17862  [pdf, other]

    cs.CV cs.AI

    A Causal Adjustment Module for Debiasing Scene Graph Generation

    Authors: Li Liu, Shuzhou Sun, Shuaifeng Zhi, Fan Shi, Zhen Liu, Janne Heikkilä, Yongxiang Liu

    Abstract: While recent debiasing methods for Scene Graph Generation (SGG) have shown impressive performance, these efforts often attribute model bias solely to the long-tail distribution of relationships, overlooking the more profound causes stemming from skewed object and object pair distributions. In this paper, we employ causal inference techniques to model the causality among these observed skewed distr…

    Submitted 22 March, 2025; originally announced March 2025.

    Comments: 18 pages, 8 tables, 10 figures

  3. arXiv:2501.17805  [pdf]

    cs.CY cs.AI cs.LG

    International AI Safety Report

    Authors: Yoshua Bengio, Sören Mindermann, Daniel Privitera, Tamay Besiroglu, Rishi Bommasani, Stephen Casper, Yejin Choi, Philip Fox, Ben Garfinkel, Danielle Goldfarb, Hoda Heidari, Anson Ho, Sayash Kapoor, Leila Khalatbari, Shayne Longpre, Sam Manning, Vasilios Mavroudis, Mantas Mazeika, Julian Michael, Jessica Newman, Kwan Yee Ng, Chinasa T. Okolo, Deborah Raji, Girish Sastry, Elizabeth Seger, et al. (71 additional authors not shown)

    Abstract: The first International AI Safety Report comprehensively synthesizes the current evidence on the capabilities, risks, and safety of advanced AI systems. The report was mandated by the nations attending the AI Safety Summit in Bletchley, UK. Thirty nations, the UN, the OECD, and the EU each nominated a representative to the report's Expert Advisory Panel. A total of 100 AI experts contributed, repr…

    Submitted 29 January, 2025; originally announced January 2025.

  4. arXiv:2501.10453  [pdf, other]

    cs.LG cs.AI cs.CY

    Uncovering Bias in Foundation Models: Impact, Testing, Harm, and Mitigation

    Authors: Shuzhou Sun, Li Liu, Yongxiang Liu, Zhen Liu, Shuanghui Zhang, Janne Heikkilä, Xiang Li

    Abstract: Bias in Foundation Models (FMs) - trained on vast datasets spanning societal and historical knowledge - poses significant challenges for fairness and equity across fields such as healthcare, education, and finance. These biases, rooted in the overrepresentation of stereotypes and societal inequalities in training data, exacerbate real-world discrimination, reinforce harmful stereotypes, and erode…

    Submitted 14 January, 2025; originally announced January 2025.

    Comments: 60 pages, 5 figures

  5. arXiv:2412.19189  [pdf, other]

    cs.CV cs.LG

    An End-to-End Depth-Based Pipeline for Selfie Image Rectification

    Authors: Ahmed Alhawwary, Phong Nguyen-Ha, Janne Mustaniemi, Janne Heikkilä

    Abstract: Portraits or selfie images taken from a close distance typically suffer from perspective distortion. In this paper, we propose an end-to-end deep learning-based rectification pipeline to mitigate the effects of perspective distortion. We learn to predict the facial depth by training a deep CNN. The estimated depth is utilized to adjust the camera-to-subject distance by moving the camera farther, i…

    Submitted 26 December, 2024; originally announced December 2024.

  6. arXiv:2403.10683  [pdf, other]

    cs.CV

    GS-Pose: Generalizable Segmentation-based 6D Object Pose Estimation with 3D Gaussian Splatting

    Authors: Dingding Cai, Janne Heikkilä, Esa Rahtu

    Abstract: This paper introduces GS-Pose, a unified framework for localizing and estimating the 6D pose of novel objects. GS-Pose begins with a set of posed RGB images of a previously unseen object and builds three distinct representations stored in a database. At inference, GS-Pose operates sequentially by locating the object in the input image, estimating its initial 6D pose using a retrieval approach, and…

    Submitted 14 August, 2024; v1 submitted 15 March, 2024; originally announced March 2024.

    Comments: Project Page: https://dingdingcai.github.io/gs-pose

  7. arXiv:2310.18103  [pdf, other]

    cs.SC

    A Novel Application of Polynomial Solvers in mmWave Analog Radio Beamforming

    Authors: Snehal Bhayani, Praneeth Susarla, S. S. Krishna Chaitanya Bulusu, Olli Silven, Markku Juntti, Janne Heikkila

    Abstract: Beamforming is a signal processing technique where an array of antenna elements can be steered to transmit and receive radio signals in a specific direction. The usage of millimeter wave (mmWave) frequencies and multiple input multiple output (MIMO) beamforming are considered as the key innovations of 5th Generation (5G) and beyond communication systems. The technique initially performs a beam ali…

    Submitted 27 October, 2023; originally announced October 2023.

    Comments: Accepted for publication in the SIGSAM's ACM Communications in Computer Algebra, as an extended abstract

  8. arXiv:2307.05276  [pdf, other]

    cs.CV

    Unbiased Scene Graph Generation via Two-stage Causal Modeling

    Authors: Shuzhou Sun, Shuaifeng Zhi, Qing Liao, Janne Heikkilä, Li Liu

    Abstract: Despite the impressive performance of recent unbiased Scene Graph Generation (SGG) methods, the current debiasing literature mainly focuses on the long-tailed distribution problem, whereas it overlooks another source of bias, i.e., semantic confusion, which makes the SGG model prone to yield false predictions for similar relationships. In this paper, we explore a debiasing procedure for the SGG ta…

    Submitted 11 July, 2023; originally announced July 2023.

    Comments: 17 pages, 9 figures. Accepted by IEEE Transactions on Pattern Analysis and Machine Intelligence

  9. arXiv:2304.01816  [pdf, other]

    cs.CV

    Toward Verifiable and Reproducible Human Evaluation for Text-to-Image Generation

    Authors: Mayu Otani, Riku Togashi, Yu Sawai, Ryosuke Ishigami, Yuta Nakashima, Esa Rahtu, Janne Heikkilä, Shin'ichi Satoh

    Abstract: Human evaluation is critical for validating the performance of text-to-image generative models, as this highly cognitive process requires deep comprehension of text and images. However, our survey of 37 recent papers reveals that many works rely solely on automatic measures (e.g., FID) or perform poorly described human evaluations that are not reliable or repeatable. This paper proposes a standard…

    Submitted 4 April, 2023; originally announced April 2023.

    Comments: CVPR 2023

  10. arXiv:2302.07300  [pdf, other]

    cs.CV

    MSDA: Monocular Self-supervised Domain Adaptation for 6D Object Pose Estimation

    Authors: Dingding Cai, Janne Heikkilä, Esa Rahtu

    Abstract: Acquiring labeled 6D poses from real images is an expensive and time-consuming task. Though massive amounts of synthetic RGB images are easy to obtain, the models trained on them suffer from noticeable performance degradation due to the synthetic-to-real domain gap. To mitigate this degradation, we propose a practical self-supervised domain adaptation approach that takes advantage of real RGB(-D)…

    Submitted 14 February, 2023; originally announced February 2023.

    Comments: SCIA2023

  11. arXiv:2301.06443  [pdf, ps, other]

    cs.CV

    Sparse resultant based minimal solvers in computer vision and their connection with the action matrix

    Authors: Snehal Bhayani, Janne Heikkilä, Zuzana Kukelova

    Abstract: Many computer vision applications require robust and efficient estimation of camera geometry from a minimal number of input data measurements, i.e., solving minimal problems in a RANSAC framework. Minimal problems are usually formulated as complex systems of sparse polynomials. The systems usually are overdetermined and consist of polynomials with algebraically constrained coefficients. Most state…

    Submitted 1 September, 2023; v1 submitted 16 January, 2023; originally announced January 2023.

    Comments: arXiv admin note: text overlap with arXiv:1912.10268

  12. arXiv:2301.01057  [pdf, other]

    cs.CV

    BS3D: Building-scale 3D Reconstruction from RGB-D Images

    Authors: Janne Mustaniemi, Juho Kannala, Esa Rahtu, Li Liu, Janne Heikkilä

    Abstract: Various datasets have been proposed for simultaneous localization and mapping (SLAM) and related problems. Existing datasets often include small environments, have incomplete ground truth, or lack important sensor data, such as depth and infrared images. We propose an easy-to-use framework for acquiring building-scale 3D reconstruction using a consumer depth camera. Unlike complex and expensive ac…

    Submitted 3 January, 2023; originally announced January 2023.

  13. arXiv:2209.15072  [pdf, other]

    cs.CV

    Partially calibrated semi-generalized pose from hybrid point correspondences

    Authors: Snehal Bhayani, Viktor Larsson, Torsten Sattler, Janne Heikkila, Zuzana Kukelova

    Abstract: In this paper we study the problem of estimating the semi-generalized pose of a partially calibrated camera, i.e., the pose of a perspective camera with unknown focal length w.r.t. a generalized camera, from a hybrid set of 2D-2D and 2D-3D point correspondences. We study all possible camera configurations within the generalized camera system. To derive practical solvers to previously unsolved chal…

    Submitted 29 September, 2022; originally announced September 2022.

  14. arXiv:2208.04717  [pdf, other]

    cs.CV cs.GR

    Cascaded and Generalizable Neural Radiance Fields for Fast View Synthesis

    Authors: Phong Nguyen-Ha, Lam Huynh, Esa Rahtu, Jiri Matas, Janne Heikkila

    Abstract: We present CG-NeRF, a cascade and generalizable neural radiance fields method for view synthesis. Recent generalizing view synthesis methods can render high-quality novel views using a set of nearby input views. However, the rendering speed is still slow due to the nature of uniformly-point sampling of neural radiance fields. Existing scene-specific methods can train and render novel views efficie…

    Submitted 19 November, 2023; v1 submitted 9 August, 2022; originally announced August 2022.

    Comments: Accepted at IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)

  15. arXiv:2208.02129  [pdf, other]

    cs.CV

    SC6D: Symmetry-agnostic and Correspondence-free 6D Object Pose Estimation

    Authors: Dingding Cai, Janne Heikkilä, Esa Rahtu

    Abstract: This paper presents an efficient symmetry-agnostic and correspondence-free framework, referred to as SC6D, for 6D object pose estimation from a single monocular RGB image. SC6D requires neither the 3D CAD model of the object nor any prior knowledge of the symmetries. The pose estimation is decomposed into three sub-tasks: a) object 3D rotation representation learning and matching; b) estimation of…

    Submitted 18 September, 2022; v1 submitted 3 August, 2022; originally announced August 2022.

    Comments: 3DV 2022

  16. arXiv:2203.16062  [pdf, other]

    cs.CV cs.IR

    AxIoU: An Axiomatically Justified Measure for Video Moment Retrieval

    Authors: Riku Togashi, Mayu Otani, Yuta Nakashima, Esa Rahtu, Janne Heikkila, Tetsuya Sakai

    Abstract: Evaluation measures have a crucial impact on the direction of research. Therefore, it is of utmost importance to develop appropriate and reliable evaluation measures for new applications where conventional measures are not well suited. Video Moment Retrieval (VMR) is one such application, and the current practice is to use R@$K,θ$ for evaluating VMR systems. However, this measure has two disadvant…

    Submitted 30 March, 2022; originally announced March 2022.

    Comments: Accepted by CVPR2022

  17. arXiv:2203.14438  [pdf, other]

    cs.CV

    Optimal Correction Cost for Object Detection Evaluation

    Authors: Mayu Otani, Riku Togashi, Yuta Nakashima, Esa Rahtu, Janne Heikkilä, Shin'ichi Satoh

    Abstract: Mean Average Precision (mAP) is the primary evaluation measure for object detection. Although object detection has a broad range of applications, mAP evaluates detectors in terms of the performance of ranked instance retrieval. Such the assumption for the evaluation task does not suit some downstream tasks. To alleviate the gap between downstream tasks and the evaluation scenario, we propose Optim…

    Submitted 27 March, 2022; originally announced March 2022.

    Comments: CVPR 2022

  18. arXiv:2203.01994  [pdf, other]

    cs.CV

    Fast Neural Architecture Search for Lightweight Dense Prediction Networks

    Authors: Lam Huynh, Esa Rahtu, Jiri Matas, Janne Heikkila

    Abstract: We present LDP, a lightweight dense prediction neural architecture search (NAS) framework. Starting from a pre-defined generic backbone, LDP applies the novel Assisted Tabu Search for efficient architecture exploration. LDP is fast and suitable for various dense estimation problems, unlike previous NAS methods that are either computational demanding or deployed only for a single subtask. The perfo…

    Submitted 9 March, 2022; v1 submitted 3 March, 2022; originally announced March 2022.

    Comments: 15 pages, 11 figures, 8 tables. arXiv admin note: substantial text overlap with arXiv:2108.11105

  19. arXiv:2203.01072  [pdf, other]

    cs.CV

    OVE6D: Object Viewpoint Encoding for Depth-based 6D Object Pose Estimation

    Authors: Dingding Cai, Janne Heikkilä, Esa Rahtu

    Abstract: This paper proposes a universal framework, called OVE6D, for model-based 6D object pose estimation from a single depth image and a target object mask. Our model is trained using purely synthetic data rendered from ShapeNet, and, unlike most of the existing methods, it generalizes well on new real-world objects without any fine-tuning. We achieve this by decomposing the 6D pose into viewpoint, in-p…

    Submitted 7 April, 2022; v1 submitted 2 March, 2022; originally announced March 2022.

    Comments: CVPR 2022

  20. arXiv:2112.13889  [pdf, other]

    cs.CV cs.GR

    Free-Viewpoint RGB-D Human Performance Capture and Rendering

    Authors: Phong Nguyen-Ha, Nikolaos Sarafianos, Christoph Lassner, Janne Heikkila, Tony Tung

    Abstract: Capturing and faithfully rendering photo-realistic humans from novel views is a fundamental problem for AR/VR applications. While prior work has shown impressive performance capture results in laboratory settings, it is non-trivial to achieve casual free-viewpoint human capture and rendering for unseen identities with high fidelity, especially for facial expressions, hands, and clothes. To tackle…

    Submitted 2 August, 2022; v1 submitted 27 December, 2021; originally announced December 2021.

    Comments: Accepted at ECCV 2022, Project page: https://www.phongnhhn.info/HVS_Net/index.html

  21. arXiv:2108.11105  [pdf, other]

    cs.CV

    Lightweight Monocular Depth with a Novel Neural Architecture Search Method

    Authors: Lam Huynh, Phong Nguyen, Jiri Matas, Esa Rahtu, Janne Heikkila

    Abstract: This paper presents a novel neural architecture search method, called LiDNAS, for generating lightweight monocular depth estimation models. Unlike previous neural architecture search (NAS) approaches, where finding optimized networks are computationally highly demanding, the introduced novel Assisted Tabu Search leads to efficient architecture exploration. Moreover, we construct the search space o…

    Submitted 25 August, 2021; originally announced August 2021.

    Comments: 11 pages, 10 figures

  22. arXiv:2108.11098  [pdf, other]

    cs.CV

    Monocular Depth Estimation Primed by Salient Point Detection and Normalized Hessian Loss

    Authors: Lam Huynh, Matteo Pedone, Phong Nguyen, Jiri Matas, Esa Rahtu, Janne Heikkila

    Abstract: Deep neural networks have recently thrived on single image depth estimation. That being said, current developments on this topic highlight an apparent compromise between accuracy and network size. This work proposes an accurate and lightweight framework for monocular depth estimation based on a self-attention mechanism stemming from salient point detection. Specifically, we utilize a sparse set of…

    Submitted 25 August, 2021; originally announced August 2021.

    Comments: 11 pages, 7 figures

  23. arXiv:2103.06535  [pdf, other]

    cs.CV

    Calibrated and Partially Calibrated Semi-Generalized Homographies

    Authors: Snehal Bhayani, Torsten Sattler, Daniel Barath, Patrik Beliansky, Janne Heikkila, Zuzana Kukelova

    Abstract: In this paper, we propose the first minimal solutions for estimating the semi-generalized homography given a perspective and a generalized camera. The proposed solvers use five 2D-2D image point correspondences induced by a scene plane. One of them assumes the perspective camera to be fully calibrated, while the other solver estimates the unknown focal length together with the absolute pose parame…

    Submitted 11 October, 2021; v1 submitted 11 March, 2021; originally announced March 2021.

    Comments: Accepted to ICCV 2021 and to appear in the conference proceedings

  24. arXiv:2012.10296  [pdf, other]

    cs.CV

    Boosting Monocular Depth Estimation with Lightweight 3D Point Fusion

    Authors: Lam Huynh, Phong Nguyen-Ha, Jiri Matas, Esa Rahtu, Janne Heikkila

    Abstract: In this paper, we propose enhancing monocular depth estimation by adding 3D points as depth guidance. Unlike existing depth completion methods, our approach performs well on extremely sparse and unevenly distributed point clouds, which makes it agnostic to the source of the 3D points. We achieve this by introducing a novel multi-scale 3D point fusion network that is both lightweight and efficient.…

    Submitted 25 August, 2021; v1 submitted 18 December, 2020; originally announced December 2020.

    Comments: 10 pages, 9 figures

  25. arXiv:2011.14398  [pdf, other]

    cs.CV cs.GR

    RGBD-Net: Predicting color and depth images for novel views synthesis

    Authors: Phong Nguyen-Ha, Animesh Karnewar, Lam Huynh, Esa Rahtu, Jiri Matas, Janne Heikkila

    Abstract: We propose a new cascaded architecture for novel view synthesis, called RGBD-Net, which consists of two core components: a hierarchical depth regression network and a depth-aware generator network. The former one predicts depth maps of the target views by using adaptive depth scaling, while the latter one leverages the predicted depths and renders spatially and temporally consistent target images.…

    Submitted 9 July, 2021; v1 submitted 29 November, 2020; originally announced November 2020.

    Comments: 19 pages, 15 figures. Code will be available at: https://github.com/phongnhhn92/RGBDNet

  26. arXiv:2009.00325  [pdf, other]

    cs.CV

    Uncovering Hidden Challenges in Query-Based Video Moment Retrieval

    Authors: Mayu Otani, Yuta Nakashima, Esa Rahtu, Janne Heikkilä

    Abstract: The query-based moment retrieval is a problem of localising a specific clip from an untrimmed video according a query sentence. This is a challenging task that requires interpretation of both the natural language query and the video content. Like in many other areas in computer vision and machine learning, the progress in query-based moment retrieval is heavily driven by the benchmark datasets and…

    Submitted 7 October, 2020; v1 submitted 1 September, 2020; originally announced September 2020.

    Comments: British Machine Vision Conference (BMVC), 2020. (v2) added references

  27. arXiv:2007.10100  [pdf, other]

    cs.CV cs.SC

    Computing stable resultant-based minimal solvers by hiding a variable

    Authors: Snehal Bhayani, Zuzana Kukelova, Janne Heikkilä

    Abstract: Many computer vision applications require robust and efficient estimation of camera geometry. The robust estimation is usually based on solving camera geometry problems from a minimal number of input data measurements, i.e., solving minimal problems, in a RANSAC-style framework. Minimal problems often result in complex systems of polynomial equations. The existing state-of-the-art methods for solv…

    Submitted 17 July, 2020; originally announced July 2020.

    Comments: arXiv admin note: text overlap with arXiv:1912.10268

    ACM Class: I.4; I.1

  28. arXiv:2006.10841  [pdf, other]

    cs.CV

    Learning non-rigid surface reconstruction from spatio-temporal image patches

    Authors: Matteo Pedone, Abdelrahman Mostafa, Janne Heikkilä

    Abstract: We present a method to reconstruct a dense spatio-temporal depth map of a non-rigidly deformable object directly from a video sequence. The estimation of depth is performed locally on spatio-temporal patches of the video, and then the full depth video of the entire shape is recovered by combining them together. Since the geometric complexity of a local spatio-temporal patch of a deforming non-rigi…

    Submitted 18 June, 2020; originally announced June 2020.

  29. arXiv:2004.04548  [pdf, other]

    cs.CV cs.LG eess.IV

    Sequential View Synthesis with Transformer

    Authors: Phong Nguyen-Ha, Lam Huynh, Esa Rahtu, Janne Heikkila

    Abstract: This paper addresses the problem of novel view synthesis by means of neural rendering, where we are interested in predicting the novel view at an arbitrary camera pose based on a given set of input images from other viewpoints. Using the known query pose and input poses, we create an ordered set of observations that leads to the target view. Thus, the problem of single novel view synthesis is refo…

    Submitted 22 September, 2020; v1 submitted 9 April, 2020; originally announced April 2020.

    Comments: Code is available at: https://github.com/phongnhhn92/TransformerGQN; Supplementary material: https://bit.ly/3kEgnzU

  30. arXiv:2004.02760  [pdf, other]

    cs.CV

    Guiding Monocular Depth Estimation Using Depth-Attention Volume

    Authors: Lam Huynh, Phong Nguyen-Ha, Jiri Matas, Esa Rahtu, Janne Heikkila

    Abstract: Recovering the scene depth from a single image is an ill-posed problem that requires additional priors, often referred to as monocular depth cues, to disambiguate different 3D interpretations. In recent works, those priors have been learned in an end-to-end manner from large datasets by using deep neural networks. In this paper, we propose guiding depth estimation to favor planar structures that a…

    Submitted 16 August, 2020; v1 submitted 6 April, 2020; originally announced April 2020.

    Comments: 30 pages

  31. arXiv:1912.10268  [pdf, other]

    cs.CV cs.SC

    A sparse resultant based method for efficient minimal solvers

    Authors: Snehal Bhayani, Zuzana Kukelova, Janne Heikkilä

    Abstract: Many computer vision applications require robust and efficient estimation of camera geometry. The robust estimation is usually based on solving camera geometry problems from a minimal number of input data measurements, i.e. solving minimal problems in a RANSAC framework. Minimal problems often result in complex systems of polynomial equations. Many state-of-the-art efficient polynomial solvers to…

    Submitted 21 December, 2019; originally announced December 2019.

  32. arXiv:1912.05000  [pdf, other]

    cs.CV

    Improving land cover segmentation across satellites using domain adaptation

    Authors: Nadir Bengana, Janne Heikkilä

    Abstract: Land use and land cover mapping are essential to various fields of study, including forestry, agriculture, and urban management. Using earth observation satellites both facilitate and accelerate the task. Lately, deep learning methods have proven to be excellent at automating the mapping via semantic image segmentation. However, because deep neural networks require large amounts of labeled data, i…

    Submitted 1 April, 2020; v1 submitted 25 November, 2019; originally announced December 2019.

    Comments: 12 pages, Transaction

  33. arXiv:1904.05124  [pdf, other]

    cs.CV cs.AI cs.GR

    Predicting Novel Views Using Generative Adversarial Query Network

    Authors: Phong Nguyen-Ha, Lam Huynh, Esa Rahtu, Janne Heikkila

    Abstract: The problem of predicting a novel view of the scene using an arbitrary number of observations is a challenging problem for computers as well as for humans. This paper introduces the Generative Adversarial Query Network (GAQN), a general learning framework for novel view synthesis that combines Generative Query Network (GQN) and Generative Adversarial Networks (GANs). The conventional GQN encodes i…

    Submitted 10 April, 2019; originally announced April 2019.

    Comments: 12 pages, 4 figures, accepted for presentation at the Scandinavian Conference on Image Analysis 2019

  34. arXiv:1903.11328  [pdf, other]

    cs.CV

    Rethinking the Evaluation of Video Summaries

    Authors: Mayu Otani, Yuta Nakashima, Esa Rahtu, Janne Heikkilä

    Abstract: Video summarization is a technique to create a short skim of the original video while preserving the main stories/content. There exists a substantial interest in automatizing this process due to the rapid growth of the available material. The recent progress has been facilitated by public benchmark datasets, which enable easy and fair comparison of methods. Currently the established evaluation pro…

    Submitted 11 April, 2019; v1 submitted 27 March, 2019; originally announced March 2019.

    Comments: CVPR'19 poster

  35. An efficient solution for semantic segmentation: ShuffleNet V2 with atrous separable convolutions

    Authors: Sercan Türkmen, Janne Heikkilä

    Abstract: Assigning a label to each pixel in an image, namely semantic segmentation, has been an important task in computer vision, and has applications in autonomous driving, robotic navigation, localization, and scene understanding. Fully convolutional neural networks have proved to be a successful solution for the task over the years but most of the work being done focuses primarily on accuracy. In this…

    Submitted 3 April, 2019; v1 submitted 20 February, 2019; originally announced February 2019.

    Comments: 12 pages, 6 figures, 5 tables

  36. arXiv:1811.09485  [pdf, other]

    cs.CV

    LSD$_2$ -- Joint Denoising and Deblurring of Short and Long Exposure Images with CNNs

    Authors: Janne Mustaniemi, Juho Kannala, Jiri Matas, Simo Särkkä, Janne Heikkilä

    Abstract: The paper addresses the problem of acquiring high-quality photographs with handheld smartphone cameras in low-light imaging conditions. We propose an approach based on capturing pairs of short and long exposure images in rapid succession and fusing them into a single high-quality photograph. Unlike existing methods, we take advantage of both images simultaneously and perform a joint denoising and…

    Submitted 1 September, 2020; v1 submitted 23 November, 2018; originally announced November 2018.

  37. arXiv:1810.00986  [pdf, other]

    cs.CV

    Gyroscope-Aided Motion Deblurring with Deep Networks

    Authors: Janne Mustaniemi, Juho Kannala, Simo Särkkä, Jiri Matas, Janne Heikkilä

    Abstract: We propose a deblurring method that incorporates gyroscope measurements into a convolutional neural network (CNN). With the help of such measurements, it can handle extremely strong and spatially-variant motion blur. At the same time, the image data is used to overcome the limitations of gyro-based blur estimation. To train our network, we also introduce a novel way of generating realistic trainin…

    Submitted 23 November, 2018; v1 submitted 1 October, 2018; originally announced October 2018.

  38. arXiv:1807.11677  [pdf, other]

    cs.CV

    Leveraging Unlabeled Whole-Slide-Images for Mitosis Detection

    Authors: Saad Ullah Akram, Talha Qaiser, Simon Graham, Juho Kannala, Janne Heikkilä, Nasir Rajpoot

    Abstract: Mitosis count is an important biomarker for prognosis of various cancers. At present, pathologists typically perform manual counting on a few selected regions of interest in breast whole-slide-images (WSIs) of patient biopsies. This task is very time-consuming, tedious and subjective. Automated mitosis detection methods have made great advances in recent years. However, these methods require exhau…

    Submitted 31 July, 2018; originally announced July 2018.

    Comments: Accepted for MICCAI COMPAY 2018 Workshop

  39. arXiv:1805.08542  [pdf, other]

    cs.CV

    Fast Motion Deblurring for Feature Detection and Matching Using Inertial Measurements

    Authors: Janne Mustaniemi, Juho Kannala, Simo Särkkä, Jiri Matas, Janne Heikkilä

    Abstract: Many computer vision and image processing applications rely on local features. It is well-known that motion blur decreases the performance of traditional feature detectors and descriptors. We propose an inertial-based deblurring method for improving the robustness of existing feature detectors and descriptors against the motion blur. Unlike most deblurring algorithms, the method can handle spatial…

    Submitted 22 May, 2018; originally announced May 2018.

  40. arXiv:1804.08912  [pdf, other]

    cs.CV

    Accurate 3-D Reconstruction with RGB-D Cameras using Depth Map Fusion and Pose Refinement

    Authors: Markus Ylimäki, Juho Kannala, Janne Heikkilä

    Abstract: Depth map fusion is an essential part in both stereo and RGB-D based 3-D reconstruction pipelines. Whether produced with a passive stereo reconstruction or using an active depth sensor, such as Microsoft Kinect, the depth maps have noise and may have poor initial registration. In this paper, we introduce a method which is capable of handling outliers, and especially, even significant registration…

    Submitted 24 April, 2018; originally announced April 2018.

    Comments: Accepted to ICPR 2018

  41. arXiv:1705.03386  [pdf, other]

    cs.CV

    Cell Tracking via Proposal Generation and Selection

    Authors: Saad Ullah Akram, Juho Kannala, Lauri Eklund, Janne Heikkilä

    Abstract: Microscopy imaging plays a vital role in understanding many biological processes in development and disease. The recent advances in automation of microscopes and development of methods and markers for live cell imaging has led to rapid growth in the amount of image data being captured. To efficiently and reliably extract useful insights from these captured sequences, automated cell tracking is ess…

    Submitted 9 May, 2017; originally announced May 2017.

  42. arXiv:1611.09498  [pdf, other]

    cs.CV

    Inertial-Based Scale Estimation for Structure from Motion on Mobile Devices

    Authors: Janne Mustaniemi, Juho Kannala, Simo Särkkä, Jiri Matas, Janne Heikkilä

    Abstract: Structure from motion algorithms have an inherent limitation that the reconstruction can only be determined up to the unknown scale factor. Modern mobile devices are equipped with an inertial measurement unit (IMU), which can be used for estimating the scale of the reconstruction. We propose a method that recovers the metric scale given inertial measurements and camera poses. In the process, we al…

    Submitted 11 August, 2017; v1 submitted 29 November, 2016; originally announced November 2016.

  43. arXiv:1609.08758  [pdf, other]

    cs.CV

    Video Summarization using Deep Semantic Features

    Authors: Mayu Otani, Yuta Nakashima, Esa Rahtu, Janne Heikkilä, Naokazu Yokoya

    Abstract: This paper presents a video summarization technique for an Internet video to provide a quick way to overview its content. This is a challenging problem because finding important or informative parts of the original video requires to understand its content. Furthermore the content of Internet videos is very diverse, ranging from home videos to documentaries, which makes video summarization much mor…

    Submitted 27 September, 2016; originally announced September 2016.

    Comments: 16 pages, the 13th Asian Conference on Computer Vision (ACCV'16)

  44. arXiv:1608.02367  [pdf, other]

    cs.CV

    Learning Joint Representations of Videos and Sentences with Web Image Search

    Authors: Mayu Otani, Yuta Nakashima, Esa Rahtu, Janne Heikkilä, Naokazu Yokoya

    Abstract: Our objective is video retrieval based on natural language queries. In addition, we consider the analogous problem of retrieving sentences or generating descriptions given an input video. Recent work has addressed the problem by embedding visual and textual inputs into a common space where semantic similarities correlate to distances. We also adopt the embedding approach, and make the following co…

    Submitted 8 August, 2016; originally announced August 2016.

    Comments: 16 pages, 4th Workshop on Web-scale Vision and Social Media (VSM), ECCV 2016