-
GHOST: Gaussian Hypothesis Open-Set Technique
Authors:
Ryan Rabinowitz,
Steve Cruz,
Manuel Günther,
Terrance E. Boult
Abstract:
Evaluations of large-scale recognition methods typically focus on overall performance. While this approach is common, it often fails to provide insights into performance across individual classes, which can lead to fairness issues and misrepresentation. Addressing these gaps is crucial for accurately assessing how well methods handle novel or unseen classes and ensuring a fair evaluation. To address fairness in Open-Set Recognition (OSR), we demonstrate that per-class performance can vary dramatically. We introduce Gaussian Hypothesis Open Set Technique (GHOST), a novel hyperparameter-free algorithm that models deep features using class-wise multivariate Gaussian distributions with diagonal covariance matrices. We apply Z-score normalization to logits to mitigate the impact of feature magnitudes that deviate from the model's expectations, thereby reducing the likelihood of the network assigning a high score to an unknown sample. We evaluate GHOST across multiple ImageNet-1K pre-trained deep networks and test it with four different unknown datasets. Using standard metrics such as AUOSCR, AUROC and FPR95, we achieve statistically significant improvements, advancing the state-of-the-art in large-scale OSR. Source code is provided online.
Submitted 10 February, 2025; v1 submitted 5 February, 2025;
originally announced February 2025.
-
Neural Radiance Fields for the Real World: A Survey
Authors:
Wenhui Xiao,
Remi Chierchia,
Rodrigo Santa Cruz,
Xuesong Li,
David Ahmedt-Aristizabal,
Olivier Salvado,
Clinton Fookes,
Leo Lebrat
Abstract:
Neural Radiance Fields (NeRFs) have remodeled 3D scene representation since release. NeRFs can effectively reconstruct complex 3D scenes from 2D images, advancing different fields and applications such as scene understanding, 3D content generation, and robotics. Despite significant research progress, a thorough review of recent innovations, applications, and challenges is lacking. This survey compiles key theoretical advancements and alternative representations and investigates emerging challenges. It further explores applications on reconstruction, highlights NeRFs' impact on computer vision and robotics, and reviews essential datasets and toolkits. By identifying gaps in the literature, this survey discusses open challenges and offers directions for future research.
Submitted 22 January, 2025;
originally announced January 2025.
-
An analysis of data variation and bias in image-based dermatological datasets for machine learning classification
Authors:
Francisco Filho,
Emanoel Santos,
Rodrigo Mota,
Kelvin Cunha,
Fabio Papais,
Amanda Arruda,
Mateus Baltazar,
Camila Vieira,
José Gabriel Tavares,
Rafael Barros,
Othon Souza,
Thales Bezerra,
Natalia Lopes,
Érico Moutinho,
Jéssica Guido,
Shirley Cruz,
Paulo Borba,
Tsang Ing Ren
Abstract:
AI algorithms have become valuable in aiding healthcare professionals. The increasing confidence obtained by these models is helpful in critical decision-making. In clinical dermatology, classification models can detect malignant lesions on patients' skin using only RGB images as input. However, most learning-based methods train on data from dermoscopic datasets, which are large and validated by a gold standard. Clinical models aim to classify images from users' smartphone cameras, which do not offer the resolution provided by dermoscopy. Clinical applications also bring new challenges: the data can contain captures from uncontrolled environments, skin tone variations, viewpoint changes, noise in data and labels, and unbalanced classes. A possible alternative would be to use transfer learning to deal with the clinical images. However, because the number of samples is low, this can degrade the model's performance, since the source distribution used in training differs from the test set. This work aims to evaluate the gap between dermoscopic and clinical samples and to understand how dataset variations impact training. It assesses the main differences between distributions that disturb the model's prediction. Finally, from experiments on different architectures, we discuss how to combine data from divergent distributions so as to decrease the impact on the model's final accuracy.
Submitted 11 February, 2025; v1 submitted 15 January, 2025;
originally announced January 2025.
-
Exploring Quantum Neural Networks for Demand Forecasting
Authors:
Gleydson Fernandes de Jesus,
Maria Heloísa Fraga da Silva,
Otto Menegasso Pires,
Lucas Cruz da Silva,
Clebson dos Santos Cruz,
Valéria Loureiro da Silva
Abstract:
Forecasting demand for assets and services can be addressed in various markets, providing a competitive advantage when the predictive models used demonstrate high accuracy. However, the training of machine learning models incurs high computational costs, which may limit the training of prediction models based on available computational capacity. In this context, this paper presents an approach for training demand prediction models using quantum neural networks. For this purpose, a quantum neural network was used to forecast demand for vehicle financing. A classical recurrent neural network was used to compare the results, and they show a similar predictive capacity between the classical and quantum models, with the advantage of using a lower number of training parameters and also converging in fewer steps. Utilizing quantum computing techniques offers a promising solution to overcome the limitations of traditional machine learning approaches in training predictive models for complex market dynamics.
Submitted 19 October, 2024;
originally announced October 2024.
-
Transformer based super-resolution downscaling for regional reanalysis: Full domain vs tiling approaches
Authors:
Antonio Pérez,
Mario Santa Cruz,
Daniel San Martín,
José Manuel Gutiérrez
Abstract:
Super-resolution (SR) is a promising cost-effective downscaling methodology for producing high-resolution climate information from coarser counterparts. A particular application is downscaling regional reanalysis outputs (predictand) from the driving global counterparts (predictor). This study conducts an intercomparison of various SR downscaling methods focusing on temperature and using the CERRA reanalysis (5.5 km resolution, produced with a regional atmospheric model driven by ERA5) as example. The method proposed in this work is the Swin transformer and two alternative methods are used as benchmark (fully convolutional U-Net and convolutional and dense DeepESD) as well as the simple bicubic interpolation. We compare two approaches, the standard one using the full domain as input and a more scalable tiling approach, dividing the full domain into tiles that are used as input. The methods are trained to downscale CERRA surface temperature, based on temperature information from the driving ERA5; in addition, the tiling approach includes static orographic information. We show that the tiling approach, which requires spatial transferability, comes at the cost of a lower performance (although it outperforms some full-domain benchmarks), but provides an efficient scalable solution that allows SR reduction on a pan-European scale and is valuable for real-time applications.
Submitted 16 October, 2024;
originally announced October 2024.
-
SALVE: A 3D Reconstruction Benchmark of Wounds from Consumer-grade Videos
Authors:
Remi Chierchia,
Leo Lebrat,
David Ahmedt-Aristizabal,
Olivier Salvado,
Clinton Fookes,
Rodrigo Santa Cruz
Abstract:
Managing chronic wounds is a global challenge that can be alleviated by the adoption of automatic systems for clinical wound assessment from consumer-grade videos. While 2D image analysis approaches are insufficient for handling the 3D features of wounds, existing approaches utilizing 3D reconstruction methods have not been thoroughly evaluated. To address this gap, this paper presents a comprehensive study on 3D wound reconstruction from consumer-grade videos. Specifically, we introduce the SALVE dataset, comprising video recordings of realistic wound phantoms captured with different cameras. Using this dataset, we assess the accuracy and precision of state-of-the-art methods for 3D reconstruction, ranging from traditional photogrammetry pipelines to advanced neural rendering approaches. In our experiments, we observe that photogrammetry approaches do not provide smooth surfaces suitable for precise clinical measurements of wounds. Neural rendering approaches show promise in addressing this issue, advancing the use of this technology in wound care practices. We encourage the readers to visit the project page: https://remichierchia.github.io/SALVE/.
Submitted 6 June, 2025; v1 submitted 28 July, 2024;
originally announced July 2024.
-
Please do not go: understanding turnover of software engineers from different perspectives
Authors:
Michelle Larissa Luciano Carvalho,
Paulo da Silva Cruz,
Eduardo Santana de Almeida,
Paulo Anselmo da Mota Silveira Neto,
Rafael Prikladnicki
Abstract:
Turnover is the movement of professional employees into and out of a company in a given period. This phenomenon significantly impacts the software industry since it causes knowledge loss, schedule delays, and increased costs in the final project. Despite the efforts made by researchers and professionals to minimize turnover, more studies are needed to understand the motivation that drives Software Engineers to leave their jobs and the main strategies CEOs adopt to retain these professionals in software development companies. In this paper, we contribute a mixed-methods study involving semi-structured interviews with Software Engineers and CEOs to obtain a wider view of these professionals' opinions about turnover, followed by a validation survey with additional software engineers to check and review the insights from the interviews. In studying these aspects, we identified 19 different reasons for software engineers' turnover and 18 more efficient strategies used in the software development industry to reduce it. Our findings provide several implications for industry and academia, which can drive future research.
Submitted 28 June, 2024;
originally announced July 2024.
-
NeRF Director: Revisiting View Selection in Neural Volume Rendering
Authors:
Wenhui Xiao,
Rodrigo Santa Cruz,
David Ahmedt-Aristizabal,
Olivier Salvado,
Clinton Fookes,
Leo Lebrat
Abstract:
Neural Rendering representations have significantly contributed to the field of 3D computer vision. Given their potential, considerable efforts have been invested to improve their performance. Nonetheless, the essential question of selecting training views is yet to be thoroughly investigated. This key aspect plays a vital role in achieving high-quality results and aligns with the well-known tenet of deep learning: "garbage in, garbage out". In this paper, we first illustrate the importance of view selection by demonstrating how a simple rotation of the test views within the most pervasive NeRF dataset can lead to consequential shifts in the performance rankings of state-of-the-art techniques. To address this challenge, we introduce a unified framework for view selection methods and devise a thorough benchmark to assess its impact. Significant improvements can be achieved without leveraging error or uncertainty estimation but focusing on uniform view coverage of the reconstructed object, resulting in a training-free approach. Using this technique, we show that high-quality renderings can be achieved faster by using fewer views. We conduct extensive experiments on both synthetic datasets and realistic data to demonstrate the effectiveness of our proposed method compared with random, conventional error-based, and uncertainty-guided view selection.
Submitted 13 June, 2024;
originally announced June 2024.
-
Divide and Conquer: Rethinking the Training Paradigm of Neural Radiance Fields
Authors:
Rongkai Ma,
Leo Lebrat,
Rodrigo Santa Cruz,
Gil Avraham,
Yan Zuo,
Clinton Fookes,
Olivier Salvado
Abstract:
Neural radiance fields (NeRFs) have exhibited potential in synthesizing high-fidelity views of 3D scenes, but the standard training paradigm of NeRF presupposes an equal importance for each image in the training set. This assumption poses a significant challenge for rendering specific views presenting intricate geometries, thereby resulting in suboptimal performance. In this paper, we take a closer look at the implications of the current training paradigm and redesign it for superior rendering quality by NeRFs. Dividing input views into multiple groups based on their visual similarities and training individual models on each of these groups enables each model to specialize on specific regions without sacrificing speed or efficiency. Subsequently, the knowledge of these specialized models is aggregated into a single entity via a teacher-student distillation paradigm, enabling spatial efficiency for online rendering. Empirically, we evaluate our novel training framework on two publicly available datasets, namely NeRF synthetic and Tanks&Temples. Our evaluation demonstrates that our DaC training pipeline enhances the rendering quality of a state-of-the-art baseline model while exhibiting convergence to a superior minimum.
Submitted 29 January, 2024;
originally announced January 2024.
-
Syn3DWound: A Synthetic Dataset for 3D Wound Bed Analysis
Authors:
Léo Lebrat,
Rodrigo Santa Cruz,
Remi Chierchia,
Yulia Arzhaeva,
Mohammad Ali Armin,
Joshua Goldsmith,
Jeremy Oorloff,
Prithvi Reddy,
Chuong Nguyen,
Lars Petersson,
Michelle Barakat-Johnson,
Georgina Luscombe,
Clinton Fookes,
Olivier Salvado,
David Ahmedt-Aristizabal
Abstract:
Wound management poses a significant challenge, particularly for bedridden patients and the elderly. Accurate diagnostic and healing monitoring can significantly benefit from modern image analysis, providing accurate and precise measurements of wounds. Despite several existing techniques, the shortage of expansive and diverse training datasets remains a significant obstacle to constructing machine learning-based frameworks. This paper introduces Syn3DWound, an open-source dataset of high-fidelity simulated wounds with 2D and 3D annotations. We propose baseline methods and a benchmarking framework for automated 3D morphometry analysis and 2D/3D wound segmentation.
Submitted 3 March, 2024; v1 submitted 27 November, 2023;
originally announced November 2023.
-
Level of Awareness of PSU Bayambang Campus Students towards E learning Technologies
Authors:
Matthew John F. Sino Cruz,
Kim Eric B. Nanlabi,
Michael Ryan C. Peoro
Abstract:
The study assesses the awareness of PSU Bayambang Campus students regarding e-learning technologies. A Quantitative Research Approach was used, gathering data through a demographic questionnaire and ICT Resources assessment. The survey measured students' familiarity and knowledge of existing e-learning technologies. Around 52.50% of respondents were familiar with e-learning concepts, but their exposure and utilization levels need consideration. Technology, Support, and Users were identified as key factors influencing student awareness. Implementation can be improved through policies and resource provision. The researchers recommend integrating e-learning policies, providing ICT Resources and Infrastructure, and offering training for students and teachers. This research serves as a guide for policy design, enhancing the University's learning process and facilitating better learning and interaction.
Submitted 6 August, 2023;
originally announced August 2023.
-
RobôCIn Small Size League Extended Team Description Paper for RoboCup 2023
Authors:
Aline Lima de Oliveira,
Cauê Addae da Silva Gomes,
Cecília Virginia Santos da Silva,
Charles Matheus de Sousa Alves,
Danilo Andrade Martins de Souza,
Driele Pires Ferreira Araújo Xavier,
Edgleyson Pereira da Silva,
Felipe Bezerra Martins,
Lucas Henrique Cavalcanti Santos,
Lucas Dias Maciel,
Matheus Paixão Gumercindo dos Santos,
Matheus Lafayette Vasconcelos,
Matheus Vinícius Teotonio do Nascimento Andrade,
João Guilherme Oliveira Carvalho de Melo,
João Pedro Souza Pereira de Moura,
José Ronald da Silva,
José Victor Silva Cruz,
Pedro Henrique Santana de Morais,
Pedro Paulo Salman de Oliveira,
Riei Joaquim Matos Rodrigues,
Roberto Costa Fernandes,
Ryan Vinicius Santos Morais,
Tamara Mayara Ramos Teobaldo,
Washington Igor dos Santos Silva,
Edna Natividade Silva Barros
Abstract:
RobôCIn has participated in RoboCup Small Size League since 2019, won its first world title in 2022 (Division B), and is currently a three-times Latin-American champion. This paper presents our improvements to defend the Small Size League (SSL) division B title in RoboCup 2023 in Bordeaux, France. This paper aims to share some of the academic research that our team developed over the past year. Our team has successfully published 2 articles related to SSL at two high-impact conferences: the 25th RoboCup International Symposium and the 19th IEEE Latin American Robotics Symposium (LARS 2022). Over the last year, we have been continuously migrating from our past codebase to Unification. We will describe the new architecture implemented and some points of software and AI refactoring. In addition, we discuss the process of integrating machined components into the mechanical system, our development for participating in the vision blackout challenge last year and what we are preparing for this year.
Submitted 19 July, 2023;
originally announced July 2023.
-
Augmented Reality's Potential for Identifying and Mitigating Home Privacy Leaks
Authors:
Stefany Cruz,
Logan Danek,
Shinan Liu,
Christopher Kraemer,
Zixin Wang,
Nick Feamster,
Danny Yuxing Huang,
Yaxing Yao,
Josiah Hester
Abstract:
Users face various privacy risks in smart homes, yet there are limited ways for them to learn about the details of such risks, such as the data practices of smart home devices and their data flow. In this paper, we present Privacy Plumber, a system that enables a user to inspect and explore the privacy "leaks" in their home using an augmented reality tool. Privacy Plumber allows the user to learn and understand the volume of data leaving the home and how that data may affect a user's privacy -- in the same physical context as the devices in question, because we visualize the privacy leaks with augmented reality. Privacy Plumber uses ARP spoofing to gather aggregate network traffic information and presents it through an overlay on top of the device in a smartphone app. The increased transparency aims to help the user make privacy decisions and mend potential privacy leaks, such as instructing Privacy Plumber on what devices to block, on what schedule (e.g., turn off Alexa when sleeping), etc. Our initial user study with six participants demonstrates participants' increased awareness of privacy leaks in smart devices, which further contributes to their privacy decisions (e.g., which devices to block).
Submitted 27 January, 2023;
originally announced January 2023.
-
Distributed Load Orchestration for Vision Computing in Multi-Access Edge Computing
Authors:
Ricardo N. Boing,
Hugo Vaz Sampaio,
Fernando Koch,
Rene N. S. Cruz,
Carlos B. Westphall
Abstract:
Multi-access Edge Computing (MEC) is a type of network architecture that provides cloud computing capabilities at the edge of the network. We consider the use case of video surveillance for a university campus running on a 5G-MEC environment. A key issue is the eventual overloading of computing resources on the MEC nodes during peak demand. We propose a new strategy for distributed orchestration in MEC environments based on how load balancing strategies organize the processing queue. We then develop a strategy for deadline-aware queueing prioritization that organizes requests based on pre-established thresholds. We introduce a simulation-based experimentation environment and conduct a number of tests demonstrating the benefit of our approach by reducing the number of referrals and improving the effectiveness in meeting deadlines.
Submitted 7 December, 2022;
originally announced December 2022.
-
CorticalFlow$^{++}$: Boosting Cortical Surface Reconstruction Accuracy, Regularity, and Interoperability
Authors:
Rodrigo Santa Cruz,
Léo Lebrat,
Darren Fu,
Pierrick Bourgeat,
Jurgen Fripp,
Clinton Fookes,
Olivier Salvado
Abstract:
The problem of Cortical Surface Reconstruction from magnetic resonance imaging has been traditionally addressed using lengthy pipelines of image processing techniques like FreeSurfer, CAT, or CIVET. These frameworks require very long runtimes deemed unfeasible for real-time applications and unpractical for large-scale studies. Recently, supervised deep learning approaches have been introduced to speed up this task cutting down the reconstruction time from hours to seconds. Using the state-of-the-art CorticalFlow model as a blueprint, this paper proposes three modifications to improve its accuracy and interoperability with existing surface analysis tools, while not sacrificing its fast inference time and low GPU memory consumption. First, we employ a more accurate ODE solver to reduce the diffeomorphic mapping approximation error. Second, we devise a routine to produce smoother template meshes avoiding mesh artifacts caused by sharp edges in CorticalFlow's convex-hull based template. Last, we recast pial surface prediction as the deformation of the predicted white surface leading to a one-to-one mapping between white and pial surface vertices. This mapping is essential to many existing surface analysis tools for cortical morphometry. We name the resulting method CorticalFlow$^{++}$. Using large-scale datasets, we demonstrate the proposed changes provide more geometric accuracy and surface regularity while keeping the reconstruction time and GPU memory requirements almost unchanged.
Submitted 14 June, 2022;
originally announced June 2022.
-
CorticalFlow: A Diffeomorphic Mesh Deformation Module for Cortical Surface Reconstruction
Authors:
Léo Lebrat,
Rodrigo Santa Cruz,
Frédéric de Gournay,
Darren Fu,
Pierrick Bourgeat,
Jurgen Fripp,
Clinton Fookes,
Olivier Salvado
Abstract:
In this paper we introduce CorticalFlow, a new geometric deep-learning model that, given a 3-dimensional image, learns to deform a reference template towards a targeted object. To conserve the template mesh's topological properties, we train our model over a set of diffeomorphic transformations. This new implementation of a flow Ordinary Differential Equation (ODE) framework benefits from a small GPU memory footprint, allowing the generation of surfaces with several hundred thousand vertices. To reduce topological errors introduced by its discrete resolution, we derive numeric conditions which improve the manifoldness of the predicted triangle mesh. To exhibit the utility of CorticalFlow, we demonstrate its performance for the challenging task of brain cortical surface reconstruction. In contrast to current state-of-the-art, CorticalFlow produces superior surfaces while reducing the computation time from nine and a half minutes to one second. More significantly, CorticalFlow enforces the generation of anatomically plausible surfaces; the absence of which has been a major impediment restricting the clinical relevance of such surface reconstruction methods.
Submitted 6 June, 2022;
originally announced June 2022.
-
The use of Data Augmentation as a technique for improving neural network accuracy in detecting fake news about COVID-19
Authors:
Wilton O. Júnior,
Mauricio S. da Cruz,
Andre Brasil Vieira Wyzykowski,
Arnaldo Bispo de Jesus
Abstract:
This paper aims to present how the application of Natural Language Processing (NLP) and data augmentation techniques can improve the performance of a neural network for better detection of fake news in the Portuguese language. Fake news has been one of the main controversies during the growth of the internet in the last decade. Verifying what is fact and what is false has proven to be a difficult task, while the dissemination of false news is much faster, which creates the need for automated tools that assist in verifying what is fact and what is false. To offer a solution, an experiment was developed with a neural network using news, real and fake, that had never been seen by the artificial intelligence (AI). There was a significant performance in the news classification after the application of the mentioned techniques.
Submitted 1 May, 2022;
originally announced May 2022.
-
Autoencoder for Synthetic to Real Generalization: From Simple to More Complex Scenes
Authors:
Steve Dias Da Cruz,
Bertram Taetz,
Thomas Stifter,
Didier Stricker
Abstract:
Learning on synthetic data and transferring the resulting properties to their real counterparts is an important challenge for reducing costs and increasing safety in machine learning. In this work, we focus on autoencoder architectures and aim at learning latent space representations that are invariant to inductive biases caused by the domain shift between simulated and real images showing the same scenario. We train on synthetic images only, present approaches to increase generalizability, and improve the preservation of the semantics to real datasets of increasing visual complexity. We show that pre-trained feature extractors (e.g. VGG) can be sufficient for generalization on images of lower complexity, but additional improvements are required for visually more complex scenes. To this end, we demonstrate that a new sampling technique, which matches semantically important parts of the image while randomizing the other parts, leads to salient feature extraction and the neglect of unimportant parts. This helps the generalization to real data, and we further show that our approach outperforms fine-tuned classification models.
Submitted 1 April, 2022;
originally announced April 2022.
-
Autoencoder Attractors for Uncertainty Estimation
Authors:
Steve Dias Da Cruz,
Bertram Taetz,
Thomas Stifter,
Didier Stricker
Abstract:
The reliability assessment of a machine learning model's prediction is an important quantity for the deployment in safety-critical applications. Not only can it be used to detect novel sceneries, either as out-of-distribution or anomaly samples, but it also helps to determine deficiencies in the training data distribution. A lot of promising research directions have either proposed traditional methods like Gaussian processes or extended deep learning based approaches, for example, by interpreting them from a Bayesian point of view. In this work we propose a novel approach for uncertainty estimation based on autoencoder models: The recursive application of a previously trained autoencoder model can be interpreted as a dynamical system storing training examples as attractors. While input images close to known samples will converge to the same or similar attractor, input samples containing unknown features are unstable and converge to different training samples by potentially removing or changing characteristic features. The use of dropout during training and inference leads to a family of similar dynamical systems, each one being robust on samples close to the training distribution but unstable on new features. Either the model reliably removes these features or the resulting instability can be exploited to detect problematic input samples. We evaluate our approach on several dataset combinations as well as on an industrial application for occupant classification in the vehicle interior for which we additionally release a new synthetic dataset.
Submitted 11 May, 2022; v1 submitted 1 April, 2022;
originally announced April 2022.
-
NNLander-VeriF: A Neural Network Formal Verification Framework for Vision-Based Autonomous Aircraft Landing
Authors:
Ulices Santa Cruz,
Yasser Shoukry
Abstract:
In this paper, we consider the problem of formally verifying a Neural Network (NN) based autonomous landing system. In such a system, a NN controller processes images from a camera to guide the aircraft while approaching the runway. A central challenge for the safety and liveness verification of vision-based closed-loop systems is the lack of mathematical models that capture the relation between the system states (e.g., position of the aircraft) and the images processed by the vision-based NN controller. Another challenge is the limited abilities of state-of-the-art NN model checkers. Such model checkers can reason only about simple input-output robustness properties of neural networks. This limitation creates a gap between the NN model checker abilities and the need to verify a closed-loop system while considering the aircraft dynamics, the perception components, and the NN controller. To this end, this paper presents NNLander-VeriF, a framework to verify vision-based NN controllers used for autonomous landing. NNLander-VeriF addresses the challenges above by exploiting geometric models of perspective cameras to obtain a mathematical model that captures the relation between the aircraft states and the inputs to the NN controller. By converting this model into a NN (with manually assigned weights) and composing it with the NN controller, one can capture the relation between aircraft states and control actions using one augmented NN. Such an augmented NN model leads to a natural encoding of the closed-loop verification into several NN robustness queries, which state-of-the-art NN model checkers can handle. Finally, we evaluate our framework to formally verify the properties of a trained NN and we show its efficiency.
Submitted 29 March, 2022;
originally announced March 2022.
-
Enhanced Performance of Pre-Trained Networks by Matched Augmentation Distributions
Authors:
Touqeer Ahmad,
Mohsen Jafarzadeh,
Akshay Raj Dhamija,
Ryan Rabinowitz,
Steve Cruz,
Chunchun Li,
Terrance E. Boult
Abstract:
There exists a distribution discrepancy between training and testing in the way images are fed to modern CNNs. Recent work tried to bridge this gap either by fine-tuning or re-training the network at different resolutions. However, re-training a network is rarely cheap and not always viable. To this end, we propose a simple solution to address the train-test distributional shift and enhance the performance of pre-trained models -- which commonly ship as a package with deep learning platforms, e.g., PyTorch. Specifically, we demonstrate that running inference on the center crop of an image is not always the best, as important discriminatory information may be cropped off. Instead, we propose to combine results for multiple random crops for a test image. This not only matches the train-time augmentation but also provides full coverage of the input image. We explore combining representations of random crops through averaging at different levels, i.e., deep feature level, logit level, and softmax level. We demonstrate that, for various families of modern deep networks, such averaging results in better validation accuracy compared to using a single central crop per image. The softmax averaging results in the best performance for various pre-trained networks without requiring any re-training or fine-tuning whatsoever. On modern GPUs with batch processing, the paper's approach to inference of pre-trained networks is essentially free, as all images in a batch can be processed at once.
Submitted 19 January, 2022;
originally announced January 2022.
-
Provably Safe Model-Based Meta Reinforcement Learning: An Abstraction-Based Approach
Authors:
Xiaowu Sun,
Wael Fatnassi,
Ulices Santa Cruz,
Yasser Shoukry
Abstract:
While conventional reinforcement learning focuses on designing agents that can perform one task, meta-learning aims, instead, to solve the problem of designing agents that can generalize to different tasks (e.g., environments, obstacles, and goals) that were not considered during the design or the training of these agents. In this spirit, in this paper, we consider the problem of training a provably safe Neural Network (NN) controller for uncertain nonlinear dynamical systems that can generalize to new tasks that were not present in the training data while preserving strong safety guarantees. Our approach is to learn a set of NN controllers during the training phase. When the task becomes available at runtime, our framework will carefully select a subset of these NN controllers and compose them to form the final NN controller. Critical to our approach is the ability to compute a finite-state abstraction of the nonlinear dynamical system. This abstract model captures the behavior of the closed-loop system under all possible NN weights, and is used to train the NNs and compose them when the task becomes available. We provide theoretical guarantees that govern the correctness of the resulting NN. We evaluated our approach on the problem of controlling a wheeled robot in cluttered environments that were not present in the training data.
Submitted 2 September, 2021;
originally announced September 2021.
-
Autoencoder Based Inter-Vehicle Generalization for In-Cabin Occupant Classification
Authors:
Steve Dias Da Cruz,
Bertram Taetz,
Oliver Wasenmüller,
Thomas Stifter,
Didier Stricker
Abstract:
Common domain shift problem formulations consider the integration of multiple source domains, or the target domain during training. Regarding the generalization of machine learning models between different car interiors, we formulate the criterion of training in a single vehicle: without access to the target distribution of the vehicle the model would be deployed to, and without access to multiple vehicles during training. We performed an investigation on the SVIRO dataset for occupant classification on the rear bench and propose an autoencoder based approach to improve the transferability. The autoencoder is on par with commonly used classification models when trained from scratch and sometimes outperforms models pre-trained on a large amount of data. Moreover, the autoencoder can transform images from unknown vehicles into the vehicle it was trained on. These results are corroborated by an evaluation on real infrared images from two vehicle interiors.
Submitted 7 May, 2021;
originally announced May 2021.
-
Autonomic Management of Power Consumption with IoT and Fog Computing
Authors:
Hugo Vaz Sampaio,
Fernando Koch,
Carlos Becker Westphall,
Ricardo do Nascimento Boing,
Rene Nolio Santa Cruz
Abstract:
We introduce a system for Autonomic Management of Power Consumption in setups that involve Internet of Things (IoT) and Fog Computing. The Central IoT (CIoT) is a Fog Computing based solution to provide advanced orchestration mechanisms to manage dynamic duty cycles for extra energy savings. The solution works by adjusting Home (H) and Away (A) cycles based on contextual information, like environmental conditions, user behavior, behavior variation, regulations on energy and network resources utilization, among others. Performance analysis through a proof of concept shows average energy savings of 58.4%, reaching up to 61.51% when augmenting with a scheduling system and variable long sleep cycles (LS). However, there is no linear relation between increasing LS time and greater savings. The significance of this research is to promote autonomic management as a solution to develop more energy efficient buildings and smarter cities, towards sustainable goals.
Submitted 12 May, 2021; v1 submitted 6 May, 2021;
originally announced May 2021.
-
MongeNet: Efficient Sampler for Geometric Deep Learning
Authors:
Léo Lebrat,
Rodrigo Santa Cruz,
Clinton Fookes,
Olivier Salvado
Abstract:
Recent advances in geometric deep learning introduce complex computational challenges for evaluating the distance between meshes. From a mesh model, point clouds are necessary along with a robust distance metric to assess surface quality or as part of the loss function for training models. Current methods often rely on a uniform random mesh discretization, which yields irregular sampling and noisy distance estimation. In this paper we introduce MongeNet, a fast, optimal-transport-based sampler that allows for an accurate discretization of a mesh with better approximation properties. We compare our method to the ubiquitous random uniform sampling and show that the approximation error is almost halved with a very small computational overhead.
Submitted 29 April, 2021;
originally announced April 2021.
-
Safe-by-Repair: A Convex Optimization Approach for Repairing Unsafe Two-Level Lattice Neural Network Controllers
Authors:
Ulices Santa Cruz,
James Ferlez,
Yasser Shoukry
Abstract:
In this paper, we consider the problem of repairing a data-trained Rectified Linear Unit (ReLU) Neural Network (NN) controller for a discrete-time, input-affine system. That is, we assume that such a NN controller is available, and we seek to repair unsafe closed-loop behavior at one known "counterexample" state while simultaneously preserving a notion of safe closed-loop behavior on a separate, verified set of states. To this end, we further assume that the NN controller has a Two-Level Lattice (TLL) architecture, and exhibit an algorithm that can systematically and efficiently repair such a network. Facilitated by this choice, our approach uses the unique semantics of the TLL architecture to divide the repair problem into two significantly decoupled sub-problems, one of which is concerned with repairing the unsafe counterexample -- and hence is essentially of local scope -- and the other of which ensures that the repairs are realized in the output of the network -- and hence is essentially of global scope. We then show that one set of sufficient conditions for solving each of these sub-problems can be cast as a convex feasibility problem, and this allows us to formulate the TLL repair problem as two separate, but significantly decoupled, convex optimization problems. Finally, we evaluate our algorithm on a TLL controller on a simple dynamical model of a four-wheel car.
Submitted 6 April, 2021;
originally announced April 2021.
-
A Pub-Sub Architecture to Promote Blockchain Interoperability
Authors:
Sara Ghaemi,
Sara Rouhani,
Rafael Belchior,
Rui S. Cruz,
Hamzeh Khazaei,
Petr Musilek
Abstract:
The maturing of blockchain technology leads to heterogeneity, where multiple solutions specialize in a particular use case. While the development of different blockchain networks shows great potential for blockchains, the isolated networks have led to data and asset silos, limiting the applications of this technology. Blockchain interoperability solutions are essential to enable distributed ledgers to reach their full potential. Such solutions allow blockchains to support asset and data transfer, resulting in the development of innovative applications.
This paper proposes a novel blockchain interoperability solution for permissioned blockchains based on the publish/subscribe architecture. We implemented a prototype of this platform to show the feasibility of our design. We evaluate our solution by implementing examples of the different publisher and subscriber networks, such as Hyperledger Besu, which is an Ethereum client, and two different versions of Hyperledger Fabric. We present a performance analysis of the whole network that indicates its limits and bottlenecks. Finally, we discuss the extensibility and scalability of the platform in different scenarios. Our evaluation shows that our system can handle a throughput on the order of hundreds of transactions per second.
Submitted 28 January, 2021;
originally announced January 2021.
-
A Unifying Framework for Formal Theories of Novelty: Framework, Examples and Discussion
Authors:
T. E. Boult,
P. A. Grabowicz,
D. S. Prijatelj,
R. Stern,
L. Holder,
J. Alspector,
M. Jafarzadeh,
T. Ahmad,
A. R. Dhamija,
C. Li,
S. Cruz,
A. Shrivastava,
C. Vondrick,
W. J. Scheirer
Abstract:
Managing inputs that are novel, unknown, or out-of-distribution is critical as an agent moves from the lab to the open world. Novelty-related problems include being tolerant to novel perturbations of the normal input, detecting when the input includes novel items, and adapting to novel inputs. While significant research has been undertaken in these areas, a noticeable gap exists in the lack of a formalized definition of novelty that transcends problem domains. As a team of researchers spanning multiple research groups and different domains, we have seen, first hand, the difficulties that arise from ill-specified novelty problems, as well as inconsistent definitions and terminology. Therefore, we present the first unified framework for formal theories of novelty and use the framework to formally define a family of novelty types. Our framework can be applied across a wide range of domains, from symbolic AI to reinforcement learning, and beyond to open world image recognition. Thus, it can be used to help kick-start new research efforts and accelerate ongoing work on these important novelty-related problems. This extended version of our AAAI 2021 paper includes more details and examples in multiple domains.
Submitted 8 December, 2020;
originally announced December 2020.
-
A Review of Open-World Learning and Steps Toward Open-World Learning Without Labels
Authors:
Mohsen Jafarzadeh,
Akshay Raj Dhamija,
Steve Cruz,
Chunchun Li,
Touqeer Ahmad,
Terrance E. Boult
Abstract:
In open-world learning, an agent starts with a set of known classes, detects and manages things that it does not know, and learns them over time from a non-stationary stream of data. Open-world learning is related to but also distinct from a multitude of other learning problems, and this paper briefly analyzes the key differences between a wide range of problems including incremental learning, generalized novelty discovery, and generalized zero-shot learning. This paper formalizes various open-world learning problems including open-world learning without labels. These open-world problems can be addressed with modifications to known elements. We present a new framework that enables agents to combine various modules for novelty-detection, novelty-characterization, incremental learning, and instance management to learn new classes from a stream of unlabeled data in an unsupervised manner, survey how to adapt a few state-of-the-art techniques to fit the framework, and use them to define seven baselines for performance on the open-world learning without labels problem. We then discuss open-world learning quality and analyze how that can improve instance management. We also discuss some of the general ambiguity issues that occur in open-world learning without labels.
Submitted 3 January, 2022; v1 submitted 25 November, 2020;
originally announced November 2020.
-
Automatic Open-World Reliability Assessment
Authors:
Mohsen Jafarzadeh,
Touqeer Ahmad,
Akshay Raj Dhamija,
Chunchun Li,
Steve Cruz,
Terrance E. Boult
Abstract:
Image classification in the open world must handle out-of-distribution (OOD) images. Systems should ideally reject OOD images, or they will map atop known classes and reduce reliability. Using open-set classifiers that can reject OOD inputs can help. However, the optimal accuracy of open-set classifiers depends on the frequency of OOD data. Thus, for either standard or open-set classifiers, it is important to be able to determine when the world changes and increasing OOD inputs will result in reduced system reliability. However, during operations, we cannot directly assess accuracy as there are no labels. Thus, the reliability assessment of these classifiers must be done by human operators, made more complex because networks are not 100% accurate, so some failures are to be expected. To automate this process, herein, we formalize the open-world recognition reliability problem and propose multiple automatic reliability assessment policies to address this new problem using only the distribution of reported scores/probability data. The distributional algorithms can be applied to both classic classifiers with SoftMax as well as the open-world Extreme Value Machine (EVM) to provide automated reliability assessment. We show that all of the new algorithms significantly outperform detection using the mean of SoftMax.
Submitted 13 December, 2020; v1 submitted 10 November, 2020;
originally announced November 2020.
-
Topological properties of basins of attraction and expressiveness of width bounded neural networks
Authors:
Hans-Peter Beise,
Steve Dias Da Cruz
Abstract:
In Radhakrishnan et al. [2020], the authors empirically show that autoencoders trained with usual SGD methods shape out basins of attraction around their training data. We consider network functions of width not exceeding the input dimension and prove that in this situation basins of attraction are bounded and their complement cannot have bounded components. Our conditions in these results are met in several experiments of the latter work and we thus address a question posed therein. We also show that under some more restrictive conditions the basins of attraction are path-connected. The tightness of the conditions in our results is demonstrated by means of several examples. Finally, the arguments used to prove the above results allow us to derive a root cause why scalar-valued neural network functions that fulfill our bounded width condition are not dense in spaces of continuous functions.
Submitted 1 December, 2023; v1 submitted 10 November, 2020;
originally announced November 2020.
-
Illumination Normalization by Partially Impossible Encoder-Decoder Cost Function
Authors:
Steve Dias Da Cruz,
Bertram Taetz,
Thomas Stifter,
Didier Stricker
Abstract:
Images recorded during the lifetime of computer vision based systems undergo a wide range of illumination and environmental conditions affecting the reliability of previously trained machine learning models. Image normalization is hence a valuable preprocessing component to enhance the models' robustness. To this end, we introduce a new strategy for the cost function formulation of encoder-decoder networks to average out all the unimportant information in the input images (e.g. environmental features and illumination changes) to focus on the reconstruction of the salient features (e.g. class instances). Our method exploits the availability of identical sceneries under different illumination and environmental conditions for which we formulate a partially impossible reconstruction target: the input image will not convey enough information to reconstruct the target in its entirety. Its applicability is assessed on three publicly available datasets. We combine the triplet loss as a regularizer in the latent space representation and a nearest neighbour search to improve the generalization to unseen illuminations and class instances. The importance of the aforementioned post-processing is highlighted on an automotive application. To this end, we release a synthetic dataset of sceneries from three different passenger compartments where each scenery is rendered under ten different illumination and environmental conditions: see https://sviro.kl.dfki.de
Submitted 9 November, 2020; v1 submitted 6 November, 2020;
originally announced November 2020.
-
DeepCSR: A 3D Deep Learning Approach for Cortical Surface Reconstruction
Authors:
Rodrigo Santa Cruz,
Leo Lebrat,
Pierrick Bourgeat,
Clinton Fookes,
Jurgen Fripp,
Olivier Salvado
Abstract:
The study of neurodegenerative diseases relies on the reconstruction and analysis of the brain cortex from magnetic resonance imaging (MRI). Traditional frameworks for this task like FreeSurfer demand lengthy runtimes, while its accelerated variant FastSurfer still relies on a voxel-wise segmentation which is limited by its resolution to capture narrow continuous objects such as cortical surfaces. Having these limitations in mind, we propose DeepCSR, a 3D deep learning framework for cortical surface reconstruction from MRI. Towards this end, we train a neural network model with hypercolumn features to predict implicit surface representations for points in a brain template space. After training, the cortical surface at a desired level of detail is obtained by evaluating surface representations at specific coordinates, and subsequently applying a topology correction algorithm and an isosurface extraction method. Thanks to the continuous nature of this approach and the efficacy of its hypercolumn features scheme, DeepCSR efficiently reconstructs cortical surfaces at high resolution, capturing fine details in the cortical folding. Moreover, DeepCSR is as accurate as, more precise than, and faster than the widely used FreeSurfer toolbox and its deep learning powered variant FastSurfer on reconstructing cortical surfaces from MRI, which should facilitate large-scale medical studies and new healthcare applications.
Submitted 21 October, 2020;
originally announced October 2020.
-
Going deeper with brain morphometry using neural networks
Authors:
Rodrigo Santa Cruz,
Léo Lebrat,
Pierrick Bourgeat,
Vincent Doré,
Jason Dowling,
Jurgen Fripp,
Clinton Fookes,
Olivier Salvado
Abstract:
Brain morphometry from magnetic resonance imaging (MRI) is a consolidated biomarker for many neurodegenerative diseases. Recent advances in this domain indicate that deep convolutional neural networks can infer morphometric measurements within a few seconds. Nevertheless, the accuracy of the devised model for insightful biomarkers (mean curvature and thickness) remains unsatisfactory. In this paper, we propose a more accurate and efficient neural network model for brain morphometry named HerstonNet. More specifically, we develop a 3D ResNet-based neural network to learn rich features directly from MRI, design a multi-scale regression scheme by predicting morphometric measures at feature maps of different resolutions, and leverage a robust optimization method to avoid poor quality minima and reduce the prediction variance. As a result, HerstonNet improves on the existing approach by 24.30% in terms of intraclass correlation coefficient (agreement measure) to FreeSurfer silver-standards while maintaining a competitive run-time.
Submitted 7 September, 2020;
originally announced September 2020.
-
On the Composition and Limitations of Publicly Available COVID-19 X-Ray Imaging Datasets
Authors:
Beatriz Garcia Santa Cruz,
Jan Sölter,
Matias Nicolas Bossa,
Andreas Dominik Husch
Abstract:
Machine learning based methods for diagnosis and progression prediction of COVID-19 from imaging data have gained significant attention in the last months, in particular by the use of deep learning models. In this context hundreds of models were proposed, with the majority of them trained on public datasets. Data scarcity, mismatch between training and target population, group imbalance, and lack of documentation are important sources of bias, hindering the applicability of these models to real-world clinical practice. Considering that datasets are an essential part of model building and evaluation, a deeper understanding of the current landscape is needed. This paper presents an overview of the currently publicly available COVID-19 chest X-ray datasets. Each dataset is briefly described, and potential strengths, limitations and interactions between datasets are identified. In particular, some key properties of current datasets that could be potential sources of bias, impairing models trained on them, are pointed out. These descriptions are useful for model building on those datasets, to choose the best dataset according to the model goal, to take into account the specific limitations to avoid reporting overconfident benchmark results, and to discuss their impact on the generalisation capabilities in a specific clinical setting.
Submitted 26 August, 2020;
originally announced August 2020.
-
Distributed Attribute-Based Access Control System Using a Permissioned Blockchain
Authors:
Sara Rouhani,
Rafael Belchior,
Rui S. Cruz,
Ralph Deters
Abstract:
Auditing provides an essential security control in computer systems, by keeping track of all access attempts, including both legitimate and illegal access attempts. This phase can be useful to the context of audits, where eventual misbehaving parties can be held accountable. Blockchain technology can provide trusted auditability required for access control systems. In this paper, we propose a distributed attribute-based access control (ABAC) system based on blockchain to provide trusted auditing of access attempts. Besides auditability, our system presents a level of transparency from which both access requestors and resource owners can benefit. We present a system architecture with an implementation based on Hyperledger Fabric, achieving high efficiency and low computational overhead. The proposed solution is validated through a use case of independent digital libraries. Detailed performance analysis of our implementation is presented, taking into account different consensus mechanisms and databases. The experimental evaluation shows that our presented system can process 5,000 access control requests with the send rate of 200 per second and a latency of 0.3 seconds.
Submitted 8 June, 2020;
originally announced June 2020.
-
Inferring Temporal Compositions of Actions Using Probabilistic Automata
Authors:
Rodrigo Santa Cruz,
Anoop Cherian,
Basura Fernando,
Dylan Campbell,
Stephen Gould
Abstract:
This paper presents a framework to recognize temporal compositions of atomic actions in videos. Specifically, we propose to express temporal compositions of actions as semantic regular expressions and derive an inference framework using probabilistic automata to recognize complex actions as satisfying these expressions on the input video features. Our approach is different from existing works that either predict long-range complex activities as unordered sets of atomic actions, or retrieve videos using natural language sentences. Instead, the proposed approach allows recognizing complex fine-grained activities using only pretrained action classifiers, without requiring any additional data, annotations or neural network training. To evaluate the potential of our approach, we provide experiments on synthetic datasets and challenging real action recognition datasets, such as MultiTHUMOS and Charades. We conclude that the proposed approach can extend state-of-the-art primitive action classifiers to vastly more complex activities without large performance degradation.
Submitted 27 April, 2020;
originally announced April 2020.
-
SVIRO: Synthetic Vehicle Interior Rear Seat Occupancy Dataset and Benchmark
Authors:
Steve Dias Da Cruz,
Oliver Wasenmüller,
Hans-Peter Beise,
Thomas Stifter,
Didier Stricker
Abstract:
We release SVIRO, a synthetic dataset for sceneries in the passenger compartment of ten different vehicles, in order to analyze machine learning-based approaches for their generalization capacities and reliability when trained on a limited number of variations (e.g. identical backgrounds and textures, few instances per class). This is in contrast to the intrinsically high variability of common benchmark datasets, which focus on improving the state-of-the-art of general tasks. Our dataset contains bounding boxes for object detection, instance segmentation masks, keypoints for pose estimation and depth images for each synthetic scenery as well as images for each individual seat for classification. The advantage of our use-case is twofold: the proximity to a realistic application to benchmark new approaches under novel circumstances while reducing the complexity to a more tractable environment, such that applications and theoretical questions can be tested on a more challenging dataset than toy problems. The data and evaluation server are available under https://sviro.kl.dfki.de.
Submitted 10 January, 2020;
originally announced January 2020.
-
A Smartphone-Based Skin Disease Classification Using MobileNet CNN
Authors:
Jessica Velasco,
Cherry Pascion,
Jean Wilmar Alberio,
Jonathan Apuang,
John Stephen Cruz,
Mark Angelo Gomez,
Benjamin Jr. Molina,
Lyndon Tuala,
August Thio-ac,
Romeo Jr. Jorda
Abstract:
The MobileNet model was used with transfer learning on 7 skin diseases to create a skin disease classification system in an Android application. The proponents gathered a total of 3,406 images, which form an imbalanced dataset because of the unequal number of images across its classes. Different sampling methods and preprocessing of input data were explored to further improve the accuracy of the MobileNet. Using the under-sampling method and the default preprocessing of input data achieved an 84.28% accuracy, while using the imbalanced dataset and default preprocessing of input data achieved a 93.6% accuracy. Then, the researchers explored oversampling the dataset, and the model attained a 91.8% accuracy. Lastly, using the oversampling technique and data augmentation when preprocessing the input data provided a 94.4% accuracy, and this model was deployed in the developed Android application.
Submitted 13 November, 2019;
originally announced November 2019.
-
To Beta or Not To Beta: Information Bottleneck for DigitaL Image Forensics
Authors:
Aurobrata Ghosh,
Zheng Zhong,
Steve Cruz,
Subbu Veeravasarapu,
Terrance E Boult,
Maneesh Singh
Abstract:
We consider an information theoretic approach to address the problem of identifying fake digital images. We propose an innovative method to formulate the issue of localizing manipulated regions in an image as a deep representation learning problem using the Information Bottleneck (IB), which has recently gained popularity as a framework for interpreting deep neural networks. Tampered images pose a serious predicament since digitized media is a ubiquitous part of our lives. Such tampering is facilitated by the easy availability of image editing software and aggravated by recent advances in deep generative models such as GANs. We propose InfoPrint, a computationally efficient solution to the IB formulation using approximate variational inference, and compare it to a numerical solution that is computationally expensive. Testing on a number of standard datasets, we demonstrate that InfoPrint outperforms the state-of-the-art and the numerical solution. Additionally, it has the ability to detect alterations made by inpainting GANs.
Submitted 11 August, 2019;
originally announced August 2019.
-
On decision regions of narrow deep neural networks
Authors:
Hans-Peter Beise,
Steve Dias Da Cruz,
Udo Schröder
Abstract:
We show that for neural network functions that have width less than or equal to the input dimension, all connected components of decision regions are unbounded. The result holds for continuous and strictly monotonic activation functions as well as for the ReLU activation function. This complements recent results on approximation capabilities by [Hanin 2017 Approximating] and connectivity of decision regions by [Nguyen 2018 Neural] for such narrow neural networks. Our results are illustrated by means of numerical experiments.
Submitted 3 March, 2021; v1 submitted 3 July, 2018;
originally announced July 2018.
-
Neural Algebra of Classifiers
Authors:
Rodrigo Santa Cruz,
Basura Fernando,
Anoop Cherian,
Stephen Gould
Abstract:
The world is fundamentally compositional, so it is natural to think of visual recognition as the recognition of basic visual primitives that are composed according to well-defined rules. This strategy allows us to recognize unseen complex concepts from simple visual primitives. However, the current trend in visual recognition follows a data-greedy approach where huge amounts of data are required to learn models for any desired visual concept. In this paper, we build on the compositionality principle and develop an "algebra" to compose classifiers for complex visual concepts. To this end, we learn neural network modules to perform boolean algebra operations on simple visual classifiers. Since these modules form a complete functional set, a classifier for any complex visual concept defined as a boolean expression of primitives can be obtained by recursively applying the learned modules, even if we do not have a single training sample. As our experiments show, using such a framework, we can compose classifiers for complex visual concepts outperforming standard baselines on two well-known visual recognition benchmarks. Finally, we present a qualitative analysis of our method and its properties.
Submitted 26 January, 2018;
originally announced January 2018.
-
Toward Open-Set Face Recognition
Authors:
Manuel Günther,
Steve Cruz,
Ethan M. Rudd,
Terrance E. Boult
Abstract:
Much research has been conducted on both face identification and face verification, with greater focus on the latter. Research on face identification has mostly focused on using closed-set protocols, which assume that all probe images used in evaluation contain identities of subjects that are enrolled in the gallery. Real systems, however, where only a fraction of probe sample identities are enrolled in the gallery, cannot make this closed-set assumption. Instead, they must assume an open set of probe samples and be able to reject/ignore those that correspond to unknown identities. In this paper, we address the widespread misconception that thresholding verification-like scores is a good way to solve the open-set face identification problem, by formulating an open-set face identification protocol and evaluating different strategies for assessing similarity. Our open-set identification protocol is based on the canonical Labeled Faces in the Wild (LFW) dataset. In addition to the known identities, we introduce the concepts of known unknowns (known, but uninteresting persons) and unknown unknowns (people never seen before) to the biometric community. We compare three algorithms for assessing similarity in a deep feature space under an open-set protocol: thresholded verification-like scores, linear discriminant analysis (LDA) scores, and extreme value machine (EVM) probabilities. Our findings suggest that thresholding EVM probabilities, which are open-set by design, outperforms thresholding verification-like scores.
Submitted 18 May, 2017; v1 submitted 3 May, 2017;
originally announced May 2017.
-
DeepPermNet: Visual Permutation Learning
Authors:
Rodrigo Santa Cruz,
Basura Fernando,
Anoop Cherian,
Stephen Gould
Abstract:
We present a principled approach to uncover the structure of visual data by solving a novel deep learning task coined visual permutation learning. The goal of this task is to find the permutation that recovers the structure of data from shuffled versions of it. In the case of natural images, this task boils down to recovering the original image from patches shuffled by an unknown permutation matrix. Unfortunately, permutation matrices are discrete, thereby posing difficulties for gradient-based methods. To this end, we resort to a continuous approximation of these matrices using doubly-stochastic matrices which we generate from standard CNN predictions using Sinkhorn iterations. Unrolling these iterations in a Sinkhorn network layer, we propose DeepPermNet, an end-to-end CNN model for this task. The utility of DeepPermNet is demonstrated on two challenging computer vision problems, namely, (i) relative attributes learning and (ii) self-supervised representation learning. Our results show state-of-the-art performance on the Public Figures and OSR benchmarks for (i) and on the classification and segmentation tasks on the PASCAL VOC dataset for (ii).
Submitted 10 April, 2017;
originally announced April 2017.
-
Open Set Intrusion Recognition for Fine-Grained Attack Categorization
Authors:
Steve Cruz,
Cora Coleman,
Ethan M. Rudd,
Terrance E. Boult
Abstract:
Confidently distinguishing a malicious intrusion over a network is an important challenge. Most intrusion detection system evaluations have been performed in a closed set protocol in which only classes seen during training are considered during classification. Thus far, there has been no realistic application in which novel types of behaviors unseen at training -- unknown classes as it were -- must be recognized for manual categorization. This paper comparatively evaluates malware classification using both closed set and open set protocols for intrusion recognition on the KDDCUP'99 dataset. In contrast to much of the previous work, we employ a fine-grained recognition protocol, in which the dataset is loosely open set -- i.e., recognizing individual intrusion types -- e.g., "sendmail", "snmp guess", ..., etc., rather than more general attack categories (e.g., "DoS","Probe","R2L","U2R","Normal"). We also employ two different classifier types -- Gaussian RBF kernel SVMs, which are not theoretically guaranteed to bound open space risk, and W-SVMs, which are theoretically guaranteed to bound open space risk. We find that the W-SVM offers superior performance under the open set regime, particularly as the cost of misclassifying unknown classes at query time (i.e., classes not present in the training set) increases. Results of performance tradeoff with respect to cost of unknown as well as discussion of the ramifications of these findings in an operational setting are presented.
Submitted 7 March, 2017;
originally announced March 2017.
-
On Differentiating Parameterized Argmin and Argmax Problems with Application to Bi-level Optimization
Authors:
Stephen Gould,
Basura Fernando,
Anoop Cherian,
Peter Anderson,
Rodrigo Santa Cruz,
Edison Guo
Abstract:
Some recent works in machine learning and computer vision involve the solution of a bi-level optimization problem. Here the solution of a parameterized lower-level problem binds variables that appear in the objective of an upper-level problem. The lower-level problem typically appears as an argmin or argmax optimization problem. Many techniques have been proposed to solve bi-level optimization problems, including gradient descent, which is popular with current end-to-end learning approaches. In this technical report we collect some results on differentiating argmin and argmax optimization problems with and without constraints and provide some insightful motivating examples.
Submitted 20 July, 2016; v1 submitted 19 July, 2016;
originally announced July 2016.