-
Adiabatic training for Variational Quantum Algorithms
Authors:
Ernesto Acosta,
Carlos Cano Gutierrez,
Guillermo Botella,
Roberto Campos
Abstract:
This paper presents a new hybrid Quantum Machine Learning (QML) model composed of three elements: a classical computer in charge of the data preparation and interpretation; a Gate-based Quantum Computer running the Variational Quantum Algorithm (VQA) representing the Quantum Neural Network (QNN); and an adiabatic Quantum Computer where the optimization function is executed to find the best parameters for the VQA.
At the time of writing, the majority of QNNs are trained using gradient-based classical optimizers, which have to deal with the barren-plateau effect. Some gradient-free classical approaches such as Evolutionary Algorithms have also been proposed to overcome this effect. To the knowledge of the authors, adiabatic quantum models have not been used to train VQAs.
The paper compares the results of gradient-based classical algorithms against adiabatic optimizers showing the feasibility of integration for gate-based and adiabatic quantum computing models, opening the door to modern hybrid QML approaches for High Performance Computing.
Submitted 24 October, 2024;
originally announced October 2024.
-
Neuromorphic Circuit Simulation with Memristors: Design and Evaluation Using MemTorch for MNIST and CIFAR
Authors:
Julio Souto,
Guillermo Botella,
Daniel García,
Raúl Murillo,
Alberto del Barrio
Abstract:
Memristors offer significant advantages as in-memory computing devices due to their non-volatility, low power consumption, and history-dependent conductivity. These attributes are particularly valuable in the realm of neuromorphic circuits for neural networks, which currently face limitations imposed by the Von Neumann architecture and high energy demands. This study evaluates the feasibility of using memristors for in-memory processing by constructing and training three digital convolutional neural networks with the datasets MNIST, CIFAR10 and CIFAR100. These networks were subsequently converted into memristive systems using MemTorch. The simulations, conducted under ideal conditions, revealed minimal precision losses of nearly 1% during inference. Additionally, the study analyzed the impact of tile size and memristor-specific non-idealities on performance, highlighting the practical implications of integrating memristors in neuromorphic computing systems. This exploration into memristive neural network applications underscores the potential of MemTorch in advancing neuromorphic architectures.
Submitted 18 July, 2024;
originally announced July 2024.
-
An efficient method to automate tooth identification and 3D bounding box extraction from Cone Beam CT Images
Authors:
Ignacio Garrido Botella,
Ignacio Arranz Águeda,
Juan Carlos Armenteros Carmona,
Oleg Vorontsov,
Fernando Bayón Robledo,
Evgeny Solovykh,
Obrubov Aleksandr Andreevich,
Adrián Alonso Barriuso
Abstract:
Accurate identification, localization, and segregation of teeth from Cone Beam Computed Tomography (CBCT) images are essential for analyzing dental pathologies. Modeling an individual tooth can be challenging and intricate to accomplish, especially when fillings and other restorations introduce artifacts. This paper proposes a method for automatically detecting, identifying, and extracting teeth from CBCT images. Our approach involves dividing the three-dimensional images into axial slices for image detection. Teeth are pinpointed and labeled using a single-stage object detector. Subsequently, bounding boxes are delineated and identified to create three-dimensional representations of each tooth. The proposed solution has been successfully integrated into the dental analysis tool Dentomo.
Submitted 10 July, 2024; v1 submitted 8 July, 2024;
originally announced July 2024.
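The slice-to-volume step described in the abstract above can be illustrated with a minimal sketch. It assumes, hypothetically, that the 2D detector emits one tuple `(z, label, x0, y0, x1, y1)` per detected tooth per axial slice; the tuple layout, the function name `merge_slices`, and the tooth labels are illustrative assumptions and are not taken from the paper or from Dentomo.

```python
from collections import defaultdict


def merge_slices(detections):
    """Merge per-slice 2D boxes into one 3D box per tooth label.

    detections: iterable of (z, label, x0, y0, x1, y1) tuples (assumed format).
    Returns {label: (x0, y0, z0, x1, y1, z1)} covering every slice for that label.
    """
    grouped = defaultdict(list)
    for z, label, x0, y0, x1, y1 in detections:
        grouped[label].append((z, x0, y0, x1, y1))

    boxes = {}
    for label, items in grouped.items():
        # Transpose the list of tuples into one sequence per coordinate.
        zs, x0s, y0s, x1s, y1s = zip(*items)
        boxes[label] = (min(x0s), min(y0s), min(zs), max(x1s), max(y1s), max(zs))
    return boxes


# Hypothetical detections: tooth "11" appears on slices 10 and 11, tooth "21" on slice 10.
example = [
    (10, "11", 5, 5, 20, 20),
    (11, "11", 4, 6, 21, 19),
    (10, "21", 30, 5, 45, 20),
]
boxes = merge_slices(example)
```

Taking the union of the 2D boxes is the simplest possible merge; a real pipeline would likely also need to handle missed or mislabeled slices, which this sketch ignores.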
-
Acceleration and energy consumption optimization in cascading classifiers for face detection on low-cost ARM big.LITTLE asymmetric architectures
Authors:
Alberto Corpas,
Luis Costero,
Guillermo Botella,
Francisco D. Igual,
Carlos García,
Manuel Rodríguez
Abstract:
This paper proposes a mechanism to accelerate and optimize the energy consumption of face detection software based on Haar-like cascading classifiers, taking advantage of the features of low-cost Asymmetric Multicore Processors (AMPs) with a limited power budget. A modelling and task scheduling/allocation approach is proposed in order to efficiently make use of the existing features on big.LITTLE ARM processors, including: (I) source-code adaptation for parallel computing, which enables code acceleration by applying the OmpSs programming model, a task-based programming model that handles data-dependencies between tasks in a transparent fashion; (II) different OmpSs task allocation policies which take into account the processor asymmetry and can dynamically set processing resources in a more efficient way based on their particular features. The proposed mechanism can be efficiently applied to take advantage of the processing elements existing on low-cost and low-energy multi-core embedded devices executing object detection algorithms based on cascading classifiers. Although these classifiers yield the best results for detection algorithms in the field of computer vision, their high computational requirements prevent them from being used on these devices under real-time requirements. Finally, we compare the energy efficiency of a heterogeneous architecture based on asymmetric multicore processors with a suitable task scheduling, with that of a homogeneous symmetric architecture.
Submitted 6 February, 2024;
originally announced February 2024.
-
Differential Evolution VQE for Crypto-currency Arbitrage. Quantum Optimization with many local minima
Authors:
Gines Carrascal,
Beatriz Roman,
Guillermo Botella,
Alberto del Barrio
Abstract:
Crypto-currency markets are known to exhibit inefficiencies, which present opportunities for profitable cyclic transactions or arbitrage, where one currency is traded for another in a way that results in a net gain without incurring any risk. Quantum computing has shown promise in financial applications, particularly in solving optimization problems like arbitrage. In this paper, we introduce a differential evolution (DE) optimization algorithm for the Variational Quantum Eigensolver (VQE) using the Qiskit framework. We illustrate the application to crypto-currency arbitrage using different VQE optimizers. Our findings indicate that the proposed DE-based method effectively converges to the optimal solution in scenarios where other commonly used optimizers, such as COBYLA, struggle to find the global minimum. We further test this procedure's feasibility on IBM's real quantum machines up to 127 qubits. With a three-currency scenario, the algorithm converged in 417 steps over a 12-hour period on the "ibm_geneva" machine. These results suggest the potential for achieving a quantum advantage in solving increasingly complex problems.
Submitted 2 August, 2023;
originally announced August 2023.
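The notion of a profitable cyclic transaction in the abstract above can be made concrete with a small classical sketch: a cycle is an arbitrage opportunity when the product of the exchange rates along it exceeds 1. This is a brute-force illustration only, not the paper's VQE formulation; the currency names `A`, `B`, `C`, the rate values, and the function names are all hypothetical.

```python
from itertools import permutations
from math import prod


def cycle_gain(rates, cycle):
    """Multiply the exchange rates along a closed cycle of currencies.

    rates[src][dst] is the amount of dst received for one unit of src.
    A result above 1.0 means the round trip ends with more than it started.
    """
    hops = zip(cycle, cycle[1:] + cycle[:1])  # last hop returns to the start
    return prod(rates[src][dst] for src, dst in hops)


def best_cycle(rates):
    """Exhaustively search all cycles of length >= 2 (feasible only for tiny markets)."""
    currencies = sorted(rates)
    candidates = [
        p
        for n in range(2, len(currencies) + 1)
        for p in permutations(currencies, n)
    ]
    return max(candidates, key=lambda c: cycle_gain(rates, c))


# Hypothetical three-currency market with one profitable direction.
rates = {
    "A": {"B": 2.0, "C": 5.0},
    "B": {"A": 0.5, "C": 3.0},
    "C": {"A": 0.2, "B": 0.25},
}
winner = best_cycle(rates)
gain = cycle_gain(rates, winner)  # about 1.2 for the cycle A -> B -> C -> A
```

The number of candidate cycles grows factorially with the number of currencies, which is why the search is usually recast as an optimization problem rather than enumerated as done here.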
-
PERCIVAL: Open-Source Posit RISC-V Core with Quire Capability
Authors:
David Mallasén,
Raul Murillo,
Alberto A. Del Barrio,
Guillermo Botella,
Luis Piñuel,
Manuel Prieto
Abstract:
The posit representation for real numbers is an alternative to the ubiquitous IEEE 754 floating-point standard. In this work, we present PERCIVAL, an application-level posit-capable RISC-V core based on CVA6 that can execute all posit instructions, including the quire fused operations. This solves the obstacle encountered by previous works, which only included partial posit support or which had to emulate posits in software, thus limiting the scope or the scalability of their applications. In addition, Xposit, a RISC-V extension for posit instructions, is incorporated into LLVM. Therefore, PERCIVAL is the first work that integrates the complete posit instruction set in hardware. These elements allow for the native execution of posit instructions as well as the standard floating-point ones, further permitting the comparison of these representations. FPGA and ASIC synthesis show the hardware cost of implementing 32-bit posits and highlight the significant overhead of including a quire accumulator. However, results comparing posits and IEEE floats show that the quire enables a more accurate execution of dot products. In general matrix multiplications, the accuracy error is reduced by up to 4 orders of magnitude when compared with single-precision floats. Furthermore, performance comparisons show that these accuracy improvements do not hinder their execution, as posits run as fast as single-precision floats and exhibit better timing than double-precision floats, thus potentially providing an alternative representation.
Submitted 7 July, 2022; v1 submitted 30 November, 2021;
originally announced November 2021.
-
PLAM: a Posit Logarithm-Approximate Multiplier
Authors:
Raul Murillo,
Alberto A. Del Barrio,
Guillermo Botella,
Min Soo Kim,
HyunJin Kim,
Nader Bagherzadeh
Abstract:
The Posit Number System was introduced in 2017 as a replacement for floating-point numbers. Since then, the community has explored its application in Neural Network related tasks and produced some unit designs which are still far from being competitive with their floating-point counterparts. This paper proposes a Posit Logarithm-Approximate Multiplication (PLAM) scheme to significantly reduce the complexity of posit multipliers, the most power-hungry units within Deep Neural Network architectures. Compared with state-of-the-art posit multipliers, experiments show that the proposed technique reduces the area, power, and delay of hardware multipliers by up to 72.86%, 81.79%, and 17.01%, respectively, without accuracy degradation.
Submitted 7 September, 2021; v1 submitted 18 February, 2021;
originally announced February 2021.
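To give a feel for what a logarithm-approximate multiplier does, the sketch below applies the classic Mitchell approximation, log2(1 + f) ≈ f for 0 ≤ f < 1, so that a multiplication becomes an addition of exponents and fractions. It is an assumption that this matches the spirit of PLAM; the code works on ordinary Python floats rather than posit bit fields, and the function name `mitchell_multiply` is hypothetical.

```python
import math


def mitchell_multiply(x: float, y: float) -> float:
    """Approximate x * y by adding approximate base-2 logarithms (Mitchell's method)."""
    if x == 0.0 or y == 0.0:
        return 0.0
    sign = -1.0 if (x < 0) != (y < 0) else 1.0

    # frexp gives abs(x) = m * 2**e with 0.5 <= m < 1; rewrite as (1 + f) * 2**(e - 1).
    mx, ex = math.frexp(abs(x))
    my, ey = math.frexp(abs(y))
    fx, fy = 2.0 * mx - 1.0, 2.0 * my - 1.0

    # log2(x) + log2(y) ~= (ex - 1) + fx + (ey - 1) + fy
    exponent = (ex - 1) + (ey - 1)
    fraction = fx + fy
    if fraction >= 1.0:
        # Carry the overflowing fraction into the exponent.
        exponent += 1
        fraction -= 1.0

    # Approximate antilogarithm: 2**(exponent + fraction) ~= (1 + fraction) * 2**exponent
    return sign * math.ldexp(1.0 + fraction, exponent)
```

Powers of two multiply exactly under this scheme, and in general the result never exceeds the true product, with a worst-case relative error of about 11.1%. That per-operation error is a property of this float-level sketch and says nothing about the network-level accuracy reported in the abstract.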
-
Template-Based Posit Multiplication for Training and Inferring in Neural Networks
Authors:
Raúl Murillo Montero,
Alberto A. Del Barrio,
Guillermo Botella
Abstract:
The posit number system is arguably the most promising and discussed topic in Arithmetic nowadays. The recent breakthroughs claimed by the format proposed by John L. Gustafson have put posits in the spotlight. In this work, we first describe an algorithm for multiplying two posit numbers, even when the number of exponent bits is zero. This configuration, scarcely tackled in the literature, is particularly interesting because it allows the deployment of a fast sigmoid function. The proposed multiplication algorithm is then integrated as a template into the well-known FloPoCo framework. Synthesis results are also shown for comparison with the floating-point multiplication offered by FloPoCo. Second, the performance of posits is studied in the scenario of Neural Networks in both training and inference stages. To the best of our knowledge, this is the first time that training is done with the posit format, achieving promising results for a binary classification problem even with reduced posit configurations. In the inference stage, 8-bit posits are as good as floating point when dealing with the MNIST dataset, but lose some accuracy with CIFAR-10.
Submitted 9 July, 2019;
originally announced July 2019.
-
First Experiences Optimizing Smith-Waterman on Intel's Knights Landing Processor
Authors:
Enzo Rucci,
Carlos Garcia,
Guillermo Botella,
Armando De Giusti,
Marcelo Naiouf,
Manuel Prieto-Matias
Abstract:
The well-known Smith-Waterman (SW) algorithm is the most commonly used method for local sequence alignments. However, SW is very computationally demanding for large protein databases. There exist several implementations that take advantage of computing parallelization on many-cores, FPGAs or GPUs, in order to increase the alignment throughput. In this paper, we have explored SW acceleration on the Intel KNL processor. The novelty of this architecture requires the revision of previous programming and optimization techniques on many-core architectures. To the best of the authors' knowledge, this is the first KNL architecture assessment for the SW algorithm. Our evaluation, using the renowned Environmental NR database as benchmark, has shown that multi-threading and SIMD exploitation deliver competitive performance (351 GCUPS) in comparison with other implementations.
Submitted 23 February, 2017;
originally announced February 2017.
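For readers unfamiliar with the kernel being accelerated, here is a minimal scalar sketch of the Smith-Waterman recurrence together with the GCUPS metric (billions of dynamic-programming cell updates per second). The scoring values (match 2, mismatch -1, linear gap -2) and both function names are illustrative assumptions; the paper's multi-threaded SIMD implementation and its substitution matrices are not reproduced here.

```python
def smith_waterman(a: str, b: str, match: int = 2, mismatch: int = -1, gap: int = -2) -> int:
    """Return the best local alignment score of a and b with a linear gap penalty."""
    prev = [0] * (len(b) + 1)  # previous row of the DP matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Clamping at 0 lets an alignment restart anywhere (local alignment).
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def gcups(query_len: int, database_len: int, seconds: float) -> float:
    """Giga cell updates per second: one cell per (query residue, database residue) pair."""
    return query_len * database_len / (seconds * 1e9)
```

Every cell depends on its left, upper, and upper-left neighbours, which is what makes vectorizing this loop nest non-trivial and why the choice of parallelization scheme matters on wide-SIMD hardware.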
-
Dyna-H: a heuristic planning reinforcement learning algorithm applied to role-playing-game strategy decision systems
Authors:
Matilde Santos,
Jose Antonio Martin H.,
Victoria Lopez,
Guillermo Botella
Abstract:
In a Role-Playing Game, finding optimal trajectories is one of the most important tasks. In fact, the strategy decision system becomes a key component of a game engine. Determining the way in which decisions are taken (online, batch or simulated) and the resources consumed in decision making (e.g. execution time, memory) will influence, to a major degree, the game performance. When classical search algorithms such as A* can be used, they are the very first option. Nevertheless, such methods rely on precise and complete models of the search space, and there are many interesting scenarios where their application is not possible. In those cases, model-free methods for sequential decision making under uncertainty are the best choice. In this paper, we propose a heuristic planning strategy to incorporate the ability of heuristic search in path-finding into a Dyna agent. The proposed Dyna-H algorithm, as A* does, selects branches more likely to produce outcomes than other branches. Besides, it has the advantages of being a model-free online reinforcement learning algorithm. The proposal was evaluated against the one-step Q-Learning and Dyna-Q algorithms, obtaining excellent experimental results: Dyna-H significantly outperforms both methods in all experiments. We also suggest a functional analogy between the proposed sampling from worst trajectories heuristic and the role of dreams (e.g. nightmares) in human behavior.
Submitted 30 July, 2011; v1 submitted 20 January, 2011;
originally announced January 2011.