Search | arXiv e-print repository

To what extent can current French mobile network support agricultural robots?

Authors: Pierre La Rocca, Gaël Guennebaud, Aurélie Bugeau

Abstract: The large-scale integration of robots in agriculture offers many promises for enhancing sustainability and increasing food production. The numerous applications of agricultural robots rely on the transmission of data via mobile network, with the amount of data depending on the services offered by the robots and the level of on-board technology. Nevertheless, infrastructure required to deploy these… ▽ More The large-scale integration of robots in agriculture offers many promises for enhancing sustainability and increasing food production. The numerous applications of agricultural robots rely on the transmission of data via mobile network, with the amount of data depending on the services offered by the robots and the level of on-board technology. Nevertheless, infrastructure required to deploy these robots, as well as the related energy and environmental consequences, appear overlooked in the digital agriculture literature. In this study, we propose a method for assessing the additional energy consumption and carbon footprint induced by a large-scale deployment of agricultural robots. Our method also estimates the share of agricultural area that can be managed by the deployed robots with respect to network infrastructure constraints. We have applied this method to metropolitan France mobile network and agricultural parcels for five different robotic scenarios. Our results show that increasing the robot's bitrate needs leads to significant additional impacts, which increase at a pace that is poorly captured by classical linear extrapolation methods. When constraining the network to the existing sites, increased bitrate needs also comes with a rapidly decreasing manageable agricultural area. △ Less

Submitted 26 June, 2025; v1 submitted 15 May, 2025; originally announced May 2025.

arXiv:2409.17617 [pdf, other]

Estimating The Carbon Footprint Of Digital Agriculture Deployment: A Parametric Bottom-Up Modelling Approach

Authors: Pierre La Rocca, Gaël Guennebaud, Aurélie Bugeau, Anne-Laure Ligozat

Abstract: Digitalization appears as a lever to enhance agriculture sustainability. However, existing works on digital agriculture's own sustainability remain scarce, disregarding the environmental effects of deploying digital devices on a large-scale. We propose a bottom-up method to estimate the carbon footprint of digital agriculture scenarios considering deployment of devices over a diversity of farm siz… ▽ More Digitalization appears as a lever to enhance agriculture sustainability. However, existing works on digital agriculture's own sustainability remain scarce, disregarding the environmental effects of deploying digital devices on a large-scale. We propose a bottom-up method to estimate the carbon footprint of digital agriculture scenarios considering deployment of devices over a diversity of farm sizes. It is applied to two use-cases and demonstrates that digital agriculture encompasses a diversity of devices with heterogeneous carbon footprints and that more complex devices yield higher footprints not always compensated by better performances or scaling gains. By emphasizing the necessity of considering the multiplicity of devices, and the territorial distribution of farm sizes when modelling digital agriculture deployments, this study highlights the need for further exploration of the first-order effects of digital technologies in agriculture. △ Less

Submitted 26 September, 2024; originally announced September 2024.

Comments: Journal of Industrial Ecology, In press, 10.1111/jiec.13568

arXiv:2312.15948 [pdf, other]

How digital will the future be? Analysis of prospective scenarios

Authors: Aurélie Bugeau, Anne-Laure Ligozat

Abstract: With the climate change context, many prospective studies, generally encompassing all areas of society, imagine possible futures to expand the range of options. The role of digital technologies within these possible futures is rarely specifically targeted. Which digital technologies and methodologies do these studies envision in a world that has mitigated and adapted to climate change? In this pap… ▽ More With the climate change context, many prospective studies, generally encompassing all areas of society, imagine possible futures to expand the range of options. The role of digital technologies within these possible futures is rarely specifically targeted. Which digital technologies and methodologies do these studies envision in a world that has mitigated and adapted to climate change? In this paper, we propose a typology for scenarios to survey digital technologies and their applications in 14 prospective studies and their corresponding 35 future scenarios. Our finding is that all the scenarios consider digital technology to be present in the future. We observe that only a few of them question our relationship with digital technology and all aspects related to its materiality, and none of the general studies envision breakthroughs concerning technologies used today. Our result demonstrates the lack of a systemic view of information and communication technologies. We therefore argue for new prospective studies to envision the future of ICT. △ Less

Submitted 11 January, 2024; v1 submitted 26 December, 2023; originally announced December 2023.

Journal ref: Workshop on Computing within Limits, 2024

arXiv:2306.08323 [pdf, other]

doi 10.1088/2515-7620/acf81b

How to estimate carbon footprint when training deep learning models? A guide and review

Authors: Lucia Bouza Heguerte, Aurélie Bugeau, Loïc Lannelongue

Abstract: Machine learning and deep learning models have become essential in the recent fast development of artificial intelligence in many sectors of the society. It is now widely acknowledge that the development of these models has an environmental cost that has been analyzed in many studies. Several online and software tools have been developed to track energy consumption while training machine learning… ▽ More Machine learning and deep learning models have become essential in the recent fast development of artificial intelligence in many sectors of the society. It is now widely acknowledge that the development of these models has an environmental cost that has been analyzed in many studies. Several online and software tools have been developed to track energy consumption while training machine learning models. In this paper, we propose a comprehensive introduction and comparison of these tools for AI practitioners wishing to start estimating the environmental impact of their work. We review the specific vocabulary, the technical requirements for each tool. We compare the energy consumption estimated by each tool on two deep neural networks for image processing and on different types of servers. From these experiments, we provide some advice for better choosing the right tool and infrastructure. △ Less

Submitted 25 September, 2023; v1 submitted 14 June, 2023; originally announced June 2023.

Comments: Environmental Research Communications, 2023

arXiv:2304.03151 [pdf, other]

Assessing VoD pressure on network power consumption

Authors: Gaël Guennebaud, Aurélie Bugeau, Antoine Dudouit

Abstract: Assessing the energy consumption or carbon footprint of data distribution of video streaming services is usually carried out through energy or carbon intensity figures (in Wh or gCO2e per GB). In this paper, we first review the reasons why such approaches are likely to lead to misunderstandings and potentially to erroneous conclusions. To overcome those shortcomings, we propose a new methodology w… ▽ More Assessing the energy consumption or carbon footprint of data distribution of video streaming services is usually carried out through energy or carbon intensity figures (in Wh or gCO2e per GB). In this paper, we first review the reasons why such approaches are likely to lead to misunderstandings and potentially to erroneous conclusions. To overcome those shortcomings, we propose a new methodology whose key idea is to consider a video streaming usage at the whole scale of a territory, and evaluate the impact of this usage on the network infrastructure. At the core of our methodology is a parametric model of a simplified network and Content Delivery Network (CDN) infrastructure, which is automatically scaled according to peak usage needs. This allows us to compare the power consumption of this infrastructure under different scenarios, ranging from a sober baseline to a generalized use of high bitrate videos. Our results show that classical efficiency indicators do not reflect the power consumption increase of more intensive Internet usage, and might even lead to misleading conclusions. △ Less

Submitted 6 April, 2023; originally announced April 2023.

Journal ref: International Conference on Information and Communications Technology for Sustainability (ICT4S), 2023, Rennes, France

arXiv:2209.06530 [pdf, other]

A patch-based architecture for multi-label classification from single label annotations

Authors: Warren Jouanneau, Aurélie Bugeau, Marc Palyart, Nicolas Papadakis, Laurent Vézard

Abstract: In this paper, we propose a patch-based architecture for multi-label classification problems where only a single positive label is observed in images of the dataset. Our contributions are twofold. First, we introduce a light patch architecture based on the attention mechanism. Next, leveraging on patch embedding self-similarities, we provide a novel strategy for estimating negative examples and de… ▽ More In this paper, we propose a patch-based architecture for multi-label classification problems where only a single positive label is observed in images of the dataset. Our contributions are twofold. First, we introduce a light patch architecture based on the attention mechanism. Next, leveraging on patch embedding self-similarities, we provide a novel strategy for estimating negative examples and deal with positive and unlabeled learning problems. Experiments demonstrate that our architecture can be trained from scratch, whereas pre-training on similar databases is required for related methods from the literature. △ Less

Submitted 14 September, 2022; originally announced September 2022.

arXiv:2205.02146 [pdf, other]

An Analysis of Generative Methods for Multiple Image Inpainting

Authors: Coloma Ballester, Aurelie Bugeau, Samuel Hurault, Simone Parisotto, Patricia Vitoria

Abstract: Image inpainting refers to the restoration of an image with missing regions in a way that is not detectable by the observer. The inpainting regions can be of any size and shape. This is an ill-posed inverse problem that does not have a unique solution. In this work, we focus on learning-based image completion methods for multiple and diverse inpainting which goal is to provide a set of distinct so… ▽ More Image inpainting refers to the restoration of an image with missing regions in a way that is not detectable by the observer. The inpainting regions can be of any size and shape. This is an ill-posed inverse problem that does not have a unique solution. In this work, we focus on learning-based image completion methods for multiple and diverse inpainting which goal is to provide a set of distinct solutions for a given damaged image. These methods capitalize on the probabilistic nature of certain generative models to sample various solutions that coherently restore the missing content. Along the chapter, we will analyze the underlying theory and analyze the recent proposals for multiple inpainting. To investigate the pros and cons of each method, we present quantitative and qualitative comparisons, on common datasets, regarding both the quality and the diversity of the set of inpainted solutions. Our analysis allows us to identify the most successful generative strategies in both inpainting quality and inpainting diversity. This task is closely related to the learning of an accurate probability distribution of images. Depending on the dataset in use, the challenges that entail the training of such a model will be discussed through the analysis. △ Less

Submitted 4 May, 2022; originally announced May 2022.

ACM Class: I.4.4; I.4.10

arXiv:2204.02980 [pdf, other]

Analysis of Different Losses for Deep Learning Image Colorization

Authors: Coloma Ballester, Aurélie Bugeau, Hernan Carrillo, Michaël Clément, Rémi Giraud, Lara Raad, Patricia Vitoria

Abstract: Image colorization aims to add color information to a grayscale image in a realistic way. Recent methods mostly rely on deep learning strategies. While learning to automatically colorize an image, one can define well-suited objective functions related to the desired color output. Some of them are based on a specific type of error between the predicted image and ground truth one, while other losses… ▽ More Image colorization aims to add color information to a grayscale image in a realistic way. Recent methods mostly rely on deep learning strategies. While learning to automatically colorize an image, one can define well-suited objective functions related to the desired color output. Some of them are based on a specific type of error between the predicted image and ground truth one, while other losses rely on the comparison of perceptual properties. But, is the choice of the objective function that crucial, i.e., does it play an important role in the results? In this chapter, we aim to answer this question by analyzing the impact of the loss function on the estimated colorization results. To that goal, we review the different losses and evaluation metrics that are used in the literature. We then train a baseline network with several of the reviewed objective functions: classic L1 and L2 losses, as well as more complex combinations such as Wasserstein GAN and VGG-based LPIPS loss. Quantitative results show that the models trained with VGG-based LPIPS provide overall slightly better results for most evaluation metrics. Qualitative results exhibit more vivid colors when with Wasserstein GAN plus the L2 loss or again with the VGG-based LPIPS. Finally, the convenience of quantitative user studies is also discussed to overcome the difficulty of properly assessing on colorized images, notably for the case of old archive photographs where no ground truth is available. △ Less

Submitted 28 May, 2022; v1 submitted 6 April, 2022; originally announced April 2022.

Comments: arXiv admin note: text overlap with arXiv:2204.02850

arXiv:2204.02850 [pdf, other]

Influence of Color Spaces for Deep Learning Image Colorization

Authors: Coloma Ballester, Aurélie Bugeau, Hernan Carrillo, Michaël Clément, Rémi Giraud, Lara Raad, Patricia Vitoria

Abstract: Colorization is a process that converts a grayscale image into a color one that looks as natural as possible. Over the years this task has received a lot of attention. Existing colorization methods rely on different color spaces: RGB, YUV, Lab, etc. In this chapter, we aim to study their influence on the results obtained by training a deep neural network, to answer the question: "Is it crucial to… ▽ More Colorization is a process that converts a grayscale image into a color one that looks as natural as possible. Over the years this task has received a lot of attention. Existing colorization methods rely on different color spaces: RGB, YUV, Lab, etc. In this chapter, we aim to study their influence on the results obtained by training a deep neural network, to answer the question: "Is it crucial to correctly choose the right color space in deep-learning based colorization?". First, we briefly summarize the literature and, in particular, deep learning-based methods. We then compare the results obtained with the same deep neural network architecture with RGB, YUV and Lab color spaces. Qualitative and quantitative analysis do not conclude similarly on which color space is better. We then show the importance of carefully designing the architecture and evaluation protocols depending on the types of images that are being processed and their specificities: strong/small contours, few/many objects, recent/archive images. △ Less

Submitted 6 April, 2022; originally announced April 2022.

arXiv:2110.11822 [pdf, other]

doi 10.3390/su14095172

Unraveling the Hidden Environmental Impacts of AI Solutions for Environment

Authors: Anne-Laure Ligozat, Julien Lefèvre, Aurélie Bugeau, Jacques Combaz

Abstract: In the past ten years, artificial intelligence has encountered such dramatic progress that it is now seen as a tool of choice to solve environmental issues and in the first place greenhouse gas emissions (GHG). At the same time the deep learning community began to realize that training models with more and more parameters requires a lot of energy and as a consequence GHG emissions. To our knowledg… ▽ More In the past ten years, artificial intelligence has encountered such dramatic progress that it is now seen as a tool of choice to solve environmental issues and in the first place greenhouse gas emissions (GHG). At the same time the deep learning community began to realize that training models with more and more parameters requires a lot of energy and as a consequence GHG emissions. To our knowledge, questioning the complete net environmental impacts of AI solutions for the environment (AI for Green), and not only GHG, has never been addressed directly. In this article, we propose to study the possible negative impacts of AI for Green. First, we review the different types of AI impacts, then we present the different methodologies used to assess those impacts, and show how to apply life cycle assessment to AI services. Finally, we discuss how to assess the environmental usefulness of a general AI service, and point out the limitations of existing work in AI for Green. △ Less

Submitted 21 April, 2022; v1 submitted 22 October, 2021; originally announced October 2021.

arXiv:2010.04075 [pdf, other]

3D Object Detection and Pose Estimation of Unseen Objects in Color Images with Local Surface Embeddings

Authors: Giorgia Pitteri, Aurélie Bugeau, Slobodan Ilic, Vincent Lepetit

Abstract: We present an approach for detecting and estimating the 3D poses of objects in images that requires only an untextured CAD model and no training phase for new objects. Our approach combines Deep Learning and 3D geometry: It relies on an embedding of local 3D geometry to match the CAD models to the input images. For points at the surface of objects, this embedding can be computed directly from the… ▽ More We present an approach for detecting and estimating the 3D poses of objects in images that requires only an untextured CAD model and no training phase for new objects. Our approach combines Deep Learning and 3D geometry: It relies on an embedding of local 3D geometry to match the CAD models to the input images. For points at the surface of objects, this embedding can be computed directly from the CAD model; for image locations, we learn to predict it from the image itself. This establishes correspondences between 3D points on the CAD model and 2D locations of the input images. However, many of these correspondences are ambiguous as many points may have similar local geometries. We show that we can use Mask-RCNN in a class-agnostic way to detect the new objects without retraining and thus drastically limit the number of possible correspondences. We can then robustly estimate a 3D pose from these discriminative correspondences using a RANSAC- like algorithm. We demonstrate the performance of this approach on the T-LESS dataset, by using a small number of objects to learn the embedding and testing it on the other objects. Our experiments show that our method is on par or better than previous methods. △ Less

Submitted 8 October, 2020; originally announced October 2020.

arXiv:2005.13053 [pdf, other]

doi 10.1109/TIP.2021.3062726

Multi-task deep learning for image segmentation using recursive approximation tasks

Authors: Rihuan Ke, Aurélie Bugeau, Nicolas Papadakis, Mark Kirkland, Peter Schuetz, Carola-Bibiane Schönlieb

Abstract: Fully supervised deep neural networks for segmentation usually require a massive amount of pixel-level labels which are manually expensive to create. In this work, we develop a multi-task learning method to relax this constraint. We regard the segmentation problem as a sequence of approximation subproblems that are recursively defined and in increasing levels of approximation accuracy. The subprob… ▽ More Fully supervised deep neural networks for segmentation usually require a massive amount of pixel-level labels which are manually expensive to create. In this work, we develop a multi-task learning method to relax this constraint. We regard the segmentation problem as a sequence of approximation subproblems that are recursively defined and in increasing levels of approximation accuracy. The subproblems are handled by a framework that consists of 1) a segmentation task that learns from pixel-level ground truth segmentation masks of a small fraction of the images, 2) a recursive approximation task that conducts partial object regions learning and data-driven mask evolution starting from partial masks of each object instance, and 3) other problem oriented auxiliary tasks that are trained with sparse annotations and promote the learning of dedicated features. Most training images are only labeled by (rough) partial masks, which do not contain exact object boundaries, rather than by their full segmentation masks. During the training phase, the approximation task learns the statistics of these partial masks, and the partial regions are recursively increased towards object boundaries aided by the learned information from the segmentation task in a fully data-driven fashion. The network is trained on an extremely small amount of precisely segmented images and a large set of coarse labels. Annotations can thus be obtained in a cheap way. We demonstrate the efficiency of our approach in three applications with microscopy images and ultrasound images. △ Less

Submitted 26 May, 2020; originally announced May 2020.

arXiv:1910.02012 [pdf, other]

Variational Osmosis for Non-linear Image Fusion

Authors: Simone Parisotto, Luca Calatroni, Aurélie Bugeau, Nicolas Papadakis, Carola-Bibiane Schönlieb

Abstract: We propose a new variational model for non-linear image fusion. Our approach is based on the use of an osmosis energy term related to the one studied in Vogel et al. (2013) and Weickert et al. (2013) The minimization of the proposed non-convex energy realizes visually plausible image data fusion, invariant to multiplicative brightness changes. On the practical side, it requires minimal supervision… ▽ More We propose a new variational model for non-linear image fusion. Our approach is based on the use of an osmosis energy term related to the one studied in Vogel et al. (2013) and Weickert et al. (2013) The minimization of the proposed non-convex energy realizes visually plausible image data fusion, invariant to multiplicative brightness changes. On the practical side, it requires minimal supervision and parameter tuning and can encode prior information on the structure of the images to be fused. For the numerical solution of the proposed model, we develop a primal-dual algorithm and we apply the resulting minimization scheme to solve multi-modal face fusion, color transfer and cultural heritage conservation problems. Visual and quantitative comparisons to state-of-the-art approaches prove the out-performance and the flexibility of our method. △ Less

Submitted 24 March, 2020; v1 submitted 4 October, 2019; originally announced October 2019.

Comments: 10 pages, 8 figures

MSC Class: 62H35; 94A08; 65K10; 35A15

arXiv:1908.11656 [pdf, other]

LU-Net: An Efficient Network for 3D LiDAR Point Cloud Semantic Segmentation Based on End-to-End-Learned 3D Features and U-Net

Authors: Pierre Biasutti, Vincent Lepetit, Jean-François Aujol, Mathieu Brédif, Aurélie Bugeau

Abstract: We propose LU-Net -- for LiDAR U-Net, a new method for the semantic segmentation of a 3D LiDAR point cloud. Instead of applying some global 3D segmentation method such as PointNet, we propose an end-to-end architecture for LiDAR point cloud semantic segmentation that efficiently solves the problem as an image processing problem. We first extract high-level 3D features for each point given its 3D n… ▽ More We propose LU-Net -- for LiDAR U-Net, a new method for the semantic segmentation of a 3D LiDAR point cloud. Instead of applying some global 3D segmentation method such as PointNet, we propose an end-to-end architecture for LiDAR point cloud semantic segmentation that efficiently solves the problem as an image processing problem. We first extract high-level 3D features for each point given its 3D neighbors. Then, these features are projected into a 2D multichannel range-image by considering the topology of the sensor. Thanks to these learned features and this projection, we can finally perform the segmentation using a simple U-Net segmentation network, which performs very well while being very efficient. In this way, we can exploit both the 3D nature of the data and the specificity of the LiDAR sensor. This approach outperforms the state-of-the-art by a large margin on the KITTI dataset, as our experiments show. Moreover, this approach operates at 24fps on a single GPU. This is above the acquisition rate of common LiDAR sensors which makes it suitable for real-time applications. △ Less

Submitted 30 August, 2019; originally announced August 2019.

Comments: 9 pages, 9 figures. arXiv admin note: substantial text overlap with arXiv:1905.08748

arXiv:1906.12177 [pdf, other]

Learning to segment microscopy images with lazy labels

Authors: Rihuan Ke, Aurélie Bugeau, Nicolas Papadakis, Peter Schuetz, Carola-Bibiane Schönlieb

Abstract: The need for labour intensive pixel-wise annotation is a major limitation of many fully supervised learning methods for segmenting bioimages that can contain numerous object instances with thin separations. In this paper, we introduce a deep convolutional neural network for microscopy image segmentation. Annotation issues are circumvented by letting the network being trainable on coarse labels com… ▽ More The need for labour intensive pixel-wise annotation is a major limitation of many fully supervised learning methods for segmenting bioimages that can contain numerous object instances with thin separations. In this paper, we introduce a deep convolutional neural network for microscopy image segmentation. Annotation issues are circumvented by letting the network being trainable on coarse labels combined with only a very small number of images with pixel-wise annotations. We call this new labelling strategy `lazy' labels. Image segmentation is stratified into three connected tasks: rough inner region detection, object separation and pixel-wise segmentation. These tasks are learned in an end-to-end multi-task learning framework. The method is demonstrated on two microscopy datasets, where we show that the model gives accurate segmentation results even if exact boundary labels are missing for a majority of annotated data. It brings more flexibility and efficiency for training deep neural networks that are data hungry and is applicable to biomedical images with poor contrast at the object boundaries or with diverse textures and repeated patterns. △ Less

Submitted 10 September, 2020; v1 submitted 20 June, 2019; originally announced June 2019.

arXiv:1905.08748 [pdf, other]

RIU-Net: Embarrassingly simple semantic segmentation of 3D LiDAR point cloud

Authors: Pierre Biasutti, Aurélie Bugeau, Jean-François Aujol, Mathieu Brédif

Abstract: This paper proposes RIU-Net (for Range-Image U-Net), the adaptation of a popular semantic segmentation network for the semantic segmentation of a 3D LiDAR point cloud. The point cloud is turned into a 2D range-image by exploiting the topology of the sensor. This image is then used as input to a U-net. This architecture has already proved its efficiency for the task of semantic segmentation of medi… ▽ More This paper proposes RIU-Net (for Range-Image U-Net), the adaptation of a popular semantic segmentation network for the semantic segmentation of a 3D LiDAR point cloud. The point cloud is turned into a 2D range-image by exploiting the topology of the sensor. This image is then used as input to a U-net. This architecture has already proved its efficiency for the task of semantic segmentation of medical images. We demonstrate how it can also be used for the accurate semantic segmentation of a 3D LiDAR point cloud and how it represents a valid bridge between image processing and 3D point cloud processing. Our model is trained on range-images built from KITTI 3D object detection dataset. Experiments show that RIU-Net, despite being very simple, offers results that are comparable to the state-of-the-art of range-image based methods. Finally, we demonstrate that this architecture is able to operate at 90fps on a single GPU, which enables deployment for real-time segmentation. △ Less

Submitted 17 June, 2019; v1 submitted 21 May, 2019; originally announced May 2019.

Comments: This version of the article contains revised scores to match the evaluation process presented in SqueezeSegV2

arXiv:1903.07169 [pdf, other]

SuperPatchMatch: an Algorithm for Robust Correspondences using Superpixel Patches

Authors: Rémi Giraud, Vinh-Thong Ta, Aurélie Bugeau, Pierrick Coupé, Nicolas Papadakis

Abstract: Superpixels have become very popular in many computer vision applications. Nevertheless, they remain underexploited since the superpixel decomposition may produce irregular and non stable segmentation results due to the dependency to the image content. In this paper, we first introduce a novel structure, a superpixel-based patch, called SuperPatch. The proposed structure, based on superpixel neigh… ▽ More Superpixels have become very popular in many computer vision applications. Nevertheless, they remain underexploited since the superpixel decomposition may produce irregular and non stable segmentation results due to the dependency to the image content. In this paper, we first introduce a novel structure, a superpixel-based patch, called SuperPatch. The proposed structure, based on superpixel neighborhood, leads to a robust descriptor since spatial information is naturally included. The generalization of the PatchMatch method to SuperPatches, named SuperPatchMatch, is introduced. Finally, we propose a framework to perform fast segmentation and labeling from an image database, and demonstrate the potential of our approach since we outperform, in terms of computational cost and accuracy, the results of state-of-the-art methods on both face labeling and medical image segmentation. △ Less

Submitted 17 March, 2019; originally announced March 2019.

Comments: IEEE Transactions on Image Processing (TIP), 2017 Selected for presentation at IEEE International Conference on Image Processing (ICIP) 2017

arXiv:1110.6895 [pdf]

doi 10.1007/978-3-642-27355-1_6

Multi-Layer Local Graph Words for Object Recognition

Authors: Svebor Karaman, Jenny Benois-Pineau, Rémi Mégret, Aurélie Bugeau

Abstract: In this paper, we propose a new multi-layer structural approach for the task of object based image retrieval. In our work we tackle the problem of structural organization of local features. The structural features we propose are nested multi-layered local graphs built upon sets of SURF feature points with Delaunay triangulation. A Bag-of-Visual-Words (BoVW) framework is applied on these graphs, gi… ▽ More In this paper, we propose a new multi-layer structural approach for the task of object based image retrieval. In our work we tackle the problem of structural organization of local features. The structural features we propose are nested multi-layered local graphs built upon sets of SURF feature points with Delaunay triangulation. A Bag-of-Visual-Words (BoVW) framework is applied on these graphs, giving birth to a Bag-of-Graph-Words representation. The multi-layer nature of the descriptors consists in scaling from trivial Delaunay graphs - isolated feature points - by increasing the number of nodes layer by layer up to graphs with maximal number of nodes. For each layer of graphs its own visual dictionary is built. The experiments conducted on the SIVAL and Caltech-101 data sets reveal that the graph features at different layers exhibit complementary performances on the same content and perform better than baseline BoVW approach. The combination of all existing layers, yields significant improvement of the object recognition performance compared to single level approaches. △ Less

Submitted 31 October, 2011; originally announced October 2011.

Comments: International Conference on MultiMedia Modeling, Klagenfurt : Autriche (2012)

arXiv:0709.1920 [pdf, ps, other]

Bandwidth selection for kernel estimation in mixed multi-dimensional spaces

Authors: Aurelie Bugeau, Patrick Pérez

Abstract: Kernel estimation techniques, such as mean shift, suffer from one major drawback: the kernel bandwidth selection. The bandwidth can be fixed for all the data set or can vary at each points. Automatic bandwidth selection becomes a real challenge in case of multidimensional heterogeneous features. This paper presents a solution to this problem. It is an extension of \cite{Comaniciu03a} which was b… ▽ More Kernel estimation techniques, such as mean shift, suffer from one major drawback: the kernel bandwidth selection. The bandwidth can be fixed for all the data set or can vary at each points. Automatic bandwidth selection becomes a real challenge in case of multidimensional heterogeneous features. This paper presents a solution to this problem. It is an extension of \cite{Comaniciu03a} which was based on the fundamental property of normal distributions regarding the bias of the normalized density gradient. The selection is done iteratively for each type of features, by looking for the stability of local bandwidth estimates across a predefined range of bandwidths. A pseudo balloon mean shift filtering and partitioning are introduced. The validity of the method is demonstrated in the context of color image segmentation based on a 5-dimensional space. △ Less

Submitted 14 September, 2007; v1 submitted 12 September, 2007; originally announced September 2007.

Showing 1–19 of 19 results for author: Bugeau, A