-
BVE + EKF: A viewpoint estimator for the estimation of the object's position in the 3D task space using Extended Kalman Filters
Authors:
Sandro Costa Magalhães,
António Paulo Moreira,
Filipe Neves dos Santos,
Jorge Dias
Abstract:
RGB-D sensors face multiple challenges when operating in open-field environments because of their sensitivity to external perturbations such as radiation or rain. Multiple works address the challenge of perceiving the 3D position of objects using monocular cameras. However, most of these works focus on deep learning-based solutions, which are complex, data-driven, and difficult to predict. We therefore approach the problem of predicting objects' 3D position using a Gaussian viewpoint estimator, named the best viewpoint estimator (BVE), powered by an extended Kalman filter (EKF). The algorithm proved efficient on the tasks and reached a maximum average Euclidean error of about 32 mm. The experiments were deployed and evaluated in MATLAB using artificial Gaussian noise. Future work aims to deploy the approach on a robotic system.
Submitted 3 October, 2024; v1 submitted 5 June, 2024;
originally announced June 2024.
-
MonoVisual3DFilter: 3D tomatoes' localisation with monocular cameras using histogram filters
Authors:
Sandro Costa Magalhães,
Filipe Neves dos Santos,
António Paulo Moreira,
Jorge Dias
Abstract:
Performing tasks in agriculture, such as fruit monitoring or harvesting, requires perceiving the objects' spatial position. RGB-D cameras are limited in open-field environments due to lighting interference. So, in this study, we set out to answer the research question: "How can we use and control monocular sensors to perceive objects' position in the 3D task space?" To this end, we applied histogram filters (Bayesian discrete filters) to estimate the position of tomatoes in the tomato plant through the algorithm MonoVisual3DFilter. Two kernel filters were studied: the square kernel and the Gaussian kernel. The implemented algorithm was tested in simulation, with and without Gaussian noise and random noise, and in a testbed under laboratory conditions. The algorithm reported a mean absolute error lower than 10 mm in simulation and 20 mm in the testbed under laboratory conditions, at an assessment distance of about 0.5 m. So, the results are viable for real environments and should be improved at closer distances.
Submitted 3 October, 2024; v1 submitted 9 October, 2023;
originally announced October 2023.
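The histogram-filter idea named in the MonoVisual3DFilter abstract above can be illustrated with a minimal sketch. This is a generic one-dimensional discrete Bayes filter, not the authors' algorithm: the grid of candidate distances, the kernel parameters (`sigma`, `half_width`, `floor`), and the sample measurements are all assumptions chosen for illustration. It shows how a square or Gaussian kernel can act as the measurement likelihood that reweights a belief over grid cells.

```python
import math

def gaussian_kernel(cells, center, sigma):
    """Likelihood of each cell given a measurement at `center` (Gaussian shape)."""
    return [math.exp(-0.5 * ((c - center) / sigma) ** 2) for c in cells]

def square_kernel(cells, center, half_width, floor=1e-3):
    """Likelihood that is flat inside a window and small (not zero) outside it."""
    return [1.0 if abs(c - center) <= half_width else floor for c in cells]

def bayes_update(belief, likelihood):
    """Discrete Bayes update: multiply prior by likelihood, then normalise."""
    posterior = [b * l for b, l in zip(belief, likelihood)]
    total = sum(posterior)
    return [p / total for p in posterior]

# Hypothetical setup: candidate distances from 0 m to 1 m in 10 mm cells.
cells = [i * 0.01 for i in range(101)]
belief = [1.0 / len(cells)] * len(cells)  # uniform prior

# Hypothetical noisy distance measurements (metres).
for z in (0.52, 0.49, 0.50):
    belief = bayes_update(belief, gaussian_kernel(cells, z, sigma=0.03))

# Report the most probable cell as the estimate.
estimate = cells[max(range(len(cells)), key=belief.__getitem__)]
print(f"estimated distance: {estimate:.2f} m")
```

Swapping `gaussian_kernel` for `square_kernel` in the loop gives the other kernel the abstract mentions; the small `floor` value is an assumption that keeps a single outlier from zeroing out the belief.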
-
Robust human position estimation in cooperative robotic cells
Authors:
António Amorim,
Diana Guimarães,
Tiago Mendonça,
Pedro Neto,
Paulo Costa,
António Paulo Moreira
Abstract:
Robots are increasingly present in our lives, sharing the workspace and tasks with human co-workers. However, existing interfaces for human-robot interaction / cooperation (HRI/C) are not very intuitive to use, and safety is a major concern when humans and robots share the same workspace. Often, this is due to the lack of a reliable estimation of the human pose in space, which is the primary input for calculating the human-robot minimum distance (required for safety and collision avoidance) and for HRI/C featuring machine learning algorithms that classify human behaviours / gestures. Each sensor type has its own characteristics, resulting in problems such as occlusions (vision) and drift (inertial) when used in isolation. In this paper, a combined system is proposed that merges the human tracking provided by a 3D vision sensor with the pose estimation provided by a set of inertial measurement units (IMUs) placed on the human's limbs. The IMUs compensate for the gaps in occluded areas to maintain tracking continuity. To mitigate the lingering effects of the IMU offset, we propose a continuous online calculation of the offset value. Experimental tests were designed to simulate human motion in a human-robot collaborative environment where the robot moves away to avoid unexpected collisions with the human. Results indicate that our approach is able to capture the human's position, for example the forearm, with a precision in the millimetre range and robustness to occlusions.
Submitted 17 April, 2023;
originally announced April 2023.
-
Benchmarking Edge Computing Devices for Grape Bunches and Trunks Detection using Accelerated Object Detection Single Shot MultiBox Deep Learning Models
Authors:
Sandro Costa Magalhães,
Filipe Neves Santos,
Pedro Machado,
António Paulo Moreira,
Jorge Dias
Abstract:
Purpose: Visual perception enables robots to perceive the environment. Visual data is processed using computer vision algorithms that are usually time-expensive and require powerful devices to run in real time, which is infeasible for open-field robots with limited energy. This work benchmarks the performance of different heterogeneous platforms for real-time object detection. This research benchmarks three architectures: embedded GPU -- Graphical Processing Units (such as NVIDIA Jetson Nano 2 GB and 4 GB, and NVIDIA Jetson TX2), TPU -- Tensor Processing Unit (such as Coral Dev Board TPU), and DPU -- Deep Learning Processor Unit (such as in AMD-Xilinx ZCU104 Development Board, and AMD-Xilinx Kria KV260 Starter Kit). Method: The authors used the RetinaNet ResNet-50 fine-tuned using the natural VineSet dataset. Afterwards, the trained model was converted and compiled for target-specific hardware formats to improve the execution efficiency. Conclusions and Results: The platforms were assessed in terms of performance on the evaluation metrics and efficiency (inference time). Graphical Processing Units (GPUs) were the slowest devices, running at 3 FPS to 5 FPS, and Field Programmable Gate Arrays (FPGAs) were the fastest devices, running at 14 FPS to 25 FPS. The efficiency of the Tensor Processing Unit (TPU) is unremarkable and similar to that of the NVIDIA Jetson TX2. TPU and GPU are the most power-efficient, consuming about 5 W. The differences in the evaluation metrics across devices are negligible, with an F1 of about 70 % and a mean Average Precision (mAP) of about 60 %.
Submitted 21 November, 2022;
originally announced November 2022.
-
Omnidirectional robot modeling and simulation
Authors:
Sandro Costa Magalhães,
António Paulo Moreira,
Paulo Costa
Abstract:
A robot simulation system is a basic need for any robotics application. With it, developers building teams of robots can test their algorithms and make initial calibrations without risk of damage to the real robots, assuring safety. However, building these simulation environments is usually time-consuming work, and when considering robot fleets, the simulation proves to be computationally expensive. An omnidirectional robot from the 5DPO robotics soccer team served to test this approach. The modeling problem was divided into two steps: modeling the motor's non-linear features and modeling the general behavior of the robot. A proper fitting of the robot was reached, considering the robot's velocity response.
Submitted 15 November, 2022;
originally announced November 2022.
-
Evaluating the Single-Shot MultiBox Detector and YOLO Deep Learning Models for the Detection of Tomatoes in a Greenhouse
Authors:
Sandro A. Magalhães,
Luís Castro,
Germano Moreira,
Filipe N. Santos,
Mário Cunha,
Jorge Dias,
António P. Moreira
Abstract:
The development of robotic solutions for agriculture requires advanced perception capabilities that can work reliably in any crop stage. For example, to automatise the tomato harvesting process in greenhouses, the visual perception system needs to detect the tomato in any life cycle stage (from flower to ripe tomato). The state of the art for visual tomato detection focuses mainly on ripe tomatoes, which have a distinctive colour from the background. This paper contributes an annotated visual dataset of green and reddish tomatoes. This kind of dataset is uncommon and not available for research purposes. This will enable further developments in edge artificial intelligence for in situ, real-time visual tomato detection required for the development of harvesting robots. Considering this dataset, five deep learning models were selected, trained and benchmarked to detect green and reddish tomatoes grown in greenhouses. Considering our robotic platform specifications, only the Single-Shot MultiBox Detector (SSD) and YOLO architectures were considered. The results proved that the system can detect green and reddish tomatoes, even those occluded by leaves. SSD MobileNet v2 had the best performance when compared against SSD Inception v2, SSD ResNet 50, SSD ResNet 101 and YOLOv4 Tiny, reaching an F1-score of 66.15%, an mAP of 51.46% and an inference time of 16.44 ms with the NVIDIA Turing Architecture platform, an NVIDIA Tesla T4, with 12 GB. YOLOv4 Tiny also had impressive results, mainly concerning inference times of about 5 ms.
Submitted 2 September, 2021;
originally announced September 2021.
-
3-D position estimation from inertial sensing: minimizing the error from the process of double integration of accelerations
Authors:
P. Neto,
J. N. Pires,
A. P. Moreira
Abstract:
This paper introduces a new approach to 3-D position estimation from acceleration data, i.e., a 3-D motion tracking system having a small size and low-cost magnetic and inertial measurement unit (MIMU) composed of both a digital compass and a gyroscope as interaction technology. A major challenge is to minimize the error caused by the process of double integration of accelerations due to motion (which have to be separated from the accelerations due to gravity). Owing to drift error, position estimation cannot be performed with adequate accuracy for periods longer than a few seconds. For this reason, we propose a method to detect motion stops and only integrate accelerations in moments of effective hand motion during the demonstration process. The proposed system is validated and evaluated with experiments reporting a common daily life pick-and-place task.
Submitted 18 November, 2013;
originally announced November 2013.
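The idea in the abstract above, integrating accelerations only during effective motion, can be sketched in one dimension. This is an illustrative sketch rather than the paper's method: the stop detector (acceleration magnitude below `threshold` for `min_still` consecutive samples), the parameter values, and the synthetic signal with a small residual bias are all assumptions. The input is assumed to be already gravity-compensated.

```python
def integrate_with_stops(accels, dt, threshold=0.05, min_still=5):
    """Double-integrate acceleration, pausing while a motion stop is detected.

    A stop is assumed (hypothetical rule) once |a| stays below `threshold`
    for `min_still` consecutive samples. During a stop the velocity is reset
    to zero and nothing is integrated, so sensor bias cannot accumulate.
    """
    velocity = 0.0
    position = 0.0
    still = 0  # number of consecutive low-acceleration samples
    for a in accels:
        still = still + 1 if abs(a) < threshold else 0
        if still >= min_still:
            velocity = 0.0  # motion stop: cancel accumulated velocity drift
            continue
        velocity += a * dt          # first integration: acceleration -> velocity
        position += velocity * dt   # second integration: velocity -> position
    return position

# Synthetic example: accelerate, decelerate, then rest with a small sensor bias.
dt = 0.1
accels = [1.0] * 10 + [-1.0] * 10 + [0.01] * 50

with_stops = integrate_with_stops(accels, dt)
# Setting min_still very high disables stop detection (naive double integration).
naive = integrate_with_stops(accels, dt, min_still=10**9)
print(f"with stop detection: {with_stops:.4f} m, naive: {naive:.4f} m")
```

In this synthetic signal the naive estimate keeps growing during the rest phase because the bias is integrated twice, while the stop-aware version freezes shortly after the motion ends. Requiring several consecutive quiet samples is a design choice meant to avoid mistaking a brief zero crossing of the acceleration for a stop.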
-
A low-cost laser scanning solution for flexible robotic cells: spray coating
Authors:
Marcos Ferreira,
António Paulo Moreira,
Pedro Neto
Abstract:
In this paper, an adaptive and low-cost robotic coating platform for small production series is presented. This new platform presents a flexible architecture that enables fast/automatic system adaptive behaviour without human intervention. The concept is based on contactless technology, using artificial vision and laser scanning to identify and characterize different workpieces travelling on a conveyor. Using laser triangulation, the workpieces are virtually reconstructed through a simplified cloud of three-dimensional (3D) points. From those reconstructed models, several algorithms are implemented to extract information about workpiece profile (pattern recognition), size, boundary and pose. Such information is then used to adjust the base robot programmes on-line. These robot programmes are generated off-line from a 3D computer-aided design model of each different workpiece profile. Finally, the robotic manipulator executes the coating process after its base programmes have been adjusted. This is a low-cost and fully autonomous system that allows adapting the robot's behaviour to different manufacturing situations. It means that the robot is ready to work over any piece at any time, and thus, small production series can be reduced to as little as a one-object series. Neither skilled workers nor long setup times are needed to operate it. Experimental results showed that this solution is efficient and can be applied not only for spray coating purposes but also for many other industrial processes (automatic manipulation, pick-and-place, inspection, etc.).
Submitted 9 September, 2013;
originally announced September 2013.
-
High-level robot programming based on CAD: dealing with unpredictable environments
Authors:
Pedro Neto,
Nuno Mendes,
Ricardo Araújo,
J. Norberto Pires,
A. Paulo Moreira
Abstract:
Purpose - The purpose of this paper is to present a CAD-based human-robot interface that allows non-expert users to teach a robot in a manner similar to that used by human beings to teach each other.
Design/methodology/approach - Intuitive robot programming is achieved by using CAD drawings to generate robot programs off-line. Sensory feedback allows minimization of the effects of uncertainty, providing information to adjust the robot paths during robot operation.
Findings - It was found that it is possible to generate a robot program from a common CAD drawing and run it without any major concerns about calibration or CAD model accuracy.
Research limitations/implications - A limitation of the proposed system has to do with the fact that it was designed to be used for particular technological applications.
Practical implications - Since most manufacturing companies have CAD packages in their facilities today, CAD-based robot programming may be a good option to program robots without the need for skilled robot programmers.
Originality/value - The paper proposes a new CAD-based robot programming system. Robot programs are directly generated from a CAD drawing running on a commonly available 3D CAD package (Autodesk Inventor) and not from commercial computer-aided robotics (CAR) software, making it a simple CAD-integrated solution. This is a low-cost and low-setup-time system that requires no advanced robot programming skills to operate. In summary, robot programs are generated with a high level of abstraction from the robot language.
Submitted 9 September, 2013;
originally announced September 2013.