-
Logarithmic Depth Decomposition of Approximate Multi-Controlled Single-Qubit Gates Without Ancilla Qubits
Authors:
Jefferson D. S. Silva,
Adenilton J. da Silva
Abstract:
The synthesis of quantum operators involves decomposing general quantum gates into the gate set supported by a given quantum device. Multi-controlled gates are essential components in this process. In this work, we present improved decompositions of multi-controlled NOT gates with logarithmic depth using a single ancilla qubit, while also reducing the constant factors in the circuit depth compared to previous work. We optimize a previously proposed decomposition of multi-target, multi-controlled special unitary SU(2) gates by identifying the presence of a conditionally clean qubit. Additionally, we introduce the best-known decomposition of multi-controlled approximate unitary U(2) gates without using ancilla qubits. This approach significantly reduces the overall circuit depth and CNOT count while preserving an adjustable error parameter, yielding a more efficient and scalable solution for synthesizing large controlled-unitary gates. Our method is particularly suitable for both NISQ and fault-tolerant quantum architectures. All software developed in this project is freely available.
Submitted 30 June, 2025;
originally announced July 2025.
-
Multilingual Vision-Language Pre-training for the Remote Sensing Domain
Authors:
João Daniel Silva,
Joao Magalhaes,
Devis Tuia,
Bruno Martins
Abstract:
Methods based on Contrastive Language-Image Pre-training (CLIP) are nowadays extensively used in support of vision-and-language tasks involving remote sensing data, such as cross-modal retrieval. The adaptation of CLIP to this specific domain has relied on model fine-tuning with the standard contrastive objective, using existing human-labeled image-caption datasets, or using synthetic data corresponding to image-caption pairs derived from other annotations over remote sensing images (e.g., object classes). The use of different pre-training mechanisms has received less attention, and only a few exceptions have considered multilingual inputs. This work proposes a novel vision-and-language model for the remote sensing domain, exploring the fine-tuning of a multilingual CLIP model and testing the use of a self-supervised method based on aligning local and global representations from individual input images, together with the standard CLIP objective. Model training relied on assembling pre-existing datasets of remote sensing images paired with English captions, followed by the use of automated machine translation into nine additional languages. We show that translated data is indeed helpful, e.g., it also improves performance on English. Our resulting model, which we named Remote Sensing Multilingual CLIP (RS-M-CLIP), obtains state-of-the-art results in a variety of vision-and-language tasks, including cross-modal and multilingual image-text retrieval and zero-shot image classification.
Submitted 30 October, 2024;
originally announced October 2024.
-
Low-depth Quantum Circuit Decomposition of Multi-controlled Gates
Authors:
Thiago Melo D. Azevedo,
Jefferson D. S. Silva,
Adenilton J. da Silva
Abstract:
Multi-controlled gates are fundamental components in the design of quantum algorithms, where efficient decompositions of these operators can enhance algorithm performance. The best asymptotic decomposition of an n-controlled X gate with one borrowed ancilla into single qubit and CNOT gates produces circuits with degree 3 polylogarithmic depth and employs a divide-and-conquer strategy. In this paper, we reduce the number of recursive calls in the divide-and-conquer algorithm and decrease the depth of n-controlled X gate decomposition to a degree of 2.799 polylogarithmic depth. With this optimized decomposition, we also reduce the depth of n-controlled SU(2) gates and approximate n-controlled U(2) gates. Decompositions described in this work achieve the lowest asymptotic depth reported in the literature. We also perform an optimization in the base of the recursive approach. Starting at 52 control qubits, the proposed n-controlled X gate with one borrowed ancilla has the shortest circuit depth in the literature. One can reproduce all the results with the freely available open-source code provided in a public repository.
Submitted 6 July, 2024;
originally announced July 2024.
-
Large Language Models for Captioning and Retrieving Remote Sensing Images
Authors:
João Daniel Silva,
João Magalhães,
Devis Tuia,
Bruno Martins
Abstract:
Image captioning and cross-modal retrieval are examples of tasks that involve the joint analysis of visual and linguistic information. In connection to remote sensing imagery, these tasks can help non-expert users in extracting relevant Earth observation information for a variety of applications. Still, despite some previous efforts, the development and application of vision and language models to the remote sensing domain have been hindered by the relatively small size of the available datasets and models used in previous studies. In this work, we propose RS-CapRet, a Vision and Language method for remote sensing tasks, in particular image captioning and text-image retrieval. We specifically propose to use a highly capable large decoder language model together with image encoders adapted to remote sensing imagery through contrastive language-image pre-training. To bridge together the image encoder and language decoder, we propose training simple linear layers with examples from combining different remote sensing image captioning datasets, keeping the other parameters frozen. RS-CapRet can then generate descriptions for remote sensing images and retrieve images from textual descriptions, achieving SOTA or competitive performance with existing methods. Qualitative results illustrate that RS-CapRet can effectively leverage the pre-trained large language model to describe remote sensing images, retrieve them based on different types of queries, and also show the ability to process interleaved sequences of images and text in a dialogue manner.
Submitted 9 February, 2024;
originally announced February 2024.
-
Linear decomposition of approximate multi-controlled single qubit gates
Authors:
Jefferson D. S. Silva,
Thiago Melo D. Azevedo,
Israel F. Araujo,
Adenilton J. da Silva
Abstract:
We provide a method for compiling approximate multi-controlled single qubit gates into quantum circuits without ancilla qubits. The total number of elementary gates to decompose an n-qubit multi-controlled gate is proportional to 32n, and the previous best approximate approach without auxiliary qubits requires 32nk elementary operations, where k is a function that depends on the error threshold. The proposed decomposition depends on an optimization technique that minimizes the CNOT gate count for multi-target and multi-controlled CNOT and SU(2) gates. Computational experiments show the reduction in the number of CNOT gates to apply multi-controlled U(2) gates. As multi-controlled single-qubit gates serve as fundamental components of quantum algorithms, the proposed decomposition offers a comprehensive solution that can significantly decrease the count of elementary operations employed in quantum computing applications.
Submitted 23 October, 2023;
originally announced October 2023.
-
Bounds for a alpha-eigenvalues
Authors:
João Domingos G. da Silva Jr.,
Carla Silva Oliveira,
Liliana Manuela G. C. da Costa
Abstract:
Let G be a graph with adjacency matrix A(G) and degree diagonal matrix D(G). In 2017, Nikiforov [1] defined the matrix $A_α(G)$ as a convex combination of A(G) and D(G) in the following way: $A_α(G) = αD(G) + (1-α)A(G)$, where $α \in [0,1]$. In this paper, we present some new upper and lower bounds for the largest, second largest, and smallest eigenvalues of the $A_α$-matrix. Moreover, extremal graphs attaining some of these bounds are characterized.
Submitted 6 January, 2023;
originally announced January 2023.
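The construction of the $A_α$-matrix described in the abstract above can be illustrated with a minimal sketch. It assumes Nikiforov's convention $A_α(G) = αD(G) + (1-α)A(G)$; the function name `a_alpha_matrix` and the example graph (a path on three vertices) are hypothetical choices for illustration and do not come from the paper.

```python
def a_alpha_matrix(adjacency, alpha):
    """Return A_alpha(G) = alpha * D(G) + (1 - alpha) * A(G) as a list of rows.

    `adjacency` is a symmetric 0/1 matrix given as a list of lists.
    The convention used here is an assumption (Nikiforov's definition).
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    n = len(adjacency)
    # Degree of vertex i is the sum of row i of the adjacency matrix.
    degrees = [sum(row) for row in adjacency]
    return [
        [
            alpha * degrees[i] * (1 if i == j else 0)
            + (1 - alpha) * adjacency[i][j]
            for j in range(n)
        ]
        for i in range(n)
    ]


# Hypothetical example: the path graph on three vertices.
path3 = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]

for row in a_alpha_matrix(path3, 0.5):
    print(row)
```

With α = 0 the function returns the adjacency matrix and with α = 1 the degree matrix. For any α, each row of $A_α(G)$ sums to the degree of the corresponding vertex, which gives a quick sanity check on the construction.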
-
On the characteristic polynomial of the $A_α$-matrix for some operations of graphs
Authors:
João Domingos G. da Silva Jr.,
Carla Silva Oliveira,
Liliana Manuela G. C. da Costa
Abstract:
Let G be a graph of order $n$ with adjacency matrix $A(G)$ and diagonal matrix of degree $D(G)$. For every $α\in [0,1]$, Nikiforov \cite{VN17} defined the matrix $A_α(G) = αD(G) + (1-α)A(G)$. In this paper we present the $A_α(G)$-characteristic polynomial when $G$ is obtained by coalescing two graphs, and if $G$ is a semi-regular bipartite graph we obtain the $A_α$-characteristic polynomial of the line graph associated to $G$. Moreover, if $G$ is a regular graph we exhibit the $A_α$-characteristic polynomial for the graphs obtained from some operations.
Submitted 22 August, 2022;
originally announced August 2022.
-
A Comparison of Spatiotemporal Visualizations for 3D Urban Analytics
Authors:
Roberta Mota,
Nivan Ferreira,
Julio Daniel Silva,
Marius Horga,
Marcos Lage,
Luis Ceferino,
Usman Alim,
Ehud Sharlin,
Fabio Miranda
Abstract:
Recent technological innovations have led to an increase in the availability of 3D urban data, such as shadow, noise, solar potential, and earthquake simulations. These spatiotemporal datasets create opportunities for new visualizations to engage experts from different domains to study the dynamic behavior of urban spaces in this underexplored dimension. However, designing 3D spatiotemporal urban visualizations is challenging, as it requires visual strategies to support analysis of time-varying data with reference to the city geometry. Although different visual strategies have been used in 3D urban visual analytics, the question of how effective these visual designs are at supporting spatiotemporal analysis on building surfaces remains open. To investigate this, in this paper we first contribute a series of analytical tasks elicited after interviews with practitioners from three urban domains. We also contribute a quantitative user study comparing the effectiveness of four representative visual designs used to visualize 3D spatiotemporal urban data: spatial juxtaposition, temporal juxtaposition, linked view, and embedded view. Participants performed a series of tasks that required them to identify extreme values on building surfaces over time. Tasks varied in granularity for both space and time dimensions. Our results demonstrate that participants were more accurate using plot-based visualizations (linked view, embedded view) but faster using color-coded visualizations (spatial juxtaposition, temporal juxtaposition). Our results also show that, with increasing task complexity, plot-based visualizations perform better in preserving efficiency (time, accuracy) compared to color-coded visualizations. Based on our findings, we present a set of takeaways with design recommendations for 3D spatiotemporal urban visualizations for researchers and practitioners.
Submitted 10 August, 2022;
originally announced August 2022.
-
Open vs Closed-ended questions in attitudinal surveys -- comparing, combining, and interpreting using natural language processing
Authors:
Vishnu Baburajan,
João de Abreu e Silva,
Francisco Camara Pereira
Abstract:
To improve the traveling experience, researchers have been analyzing the role of attitudes in travel behavior modeling. Although most researchers use closed-ended surveys, the appropriate method to measure attitudes is debatable. Topic Modeling could significantly reduce the time to extract information from open-ended responses and eliminate subjective bias, thereby alleviating analyst concerns. Our research uses Topic Modeling to extract information from open-ended questions and compares its performance with closed-ended responses. Furthermore, some respondents might prefer answering questions using their preferred questionnaire type. So, we propose a modeling framework that allows respondents to use their preferred questionnaire type to answer the survey and enables analysts to use the modeling frameworks of their choice to predict behavior. We demonstrate this using a dataset collected from the USA that measures the intention to use Autonomous Vehicles for commute trips. Respondents were presented with alternative questionnaire versions (open- and closed-ended). Since our objective was also to compare the performance of alternative questionnaire versions, the survey was designed to eliminate influences resulting from statements, behavioral framework, and the choice experiment. Results indicate the suitability of using Topic Modeling to extract information from open-ended responses; however, the models estimated using the closed-ended questions perform better than those estimated from open-ended responses. Besides, the proposed model performs better than the models currently in use. Furthermore, our proposed framework will allow respondents to choose the questionnaire type to answer, which could be particularly beneficial to them when using voice-based surveys.
Submitted 3 May, 2022;
originally announced May 2022.
-
Semantic Segmentation with Labeling Uncertainty and Class Imbalance
Authors:
Patrik Olã Bressan,
José Marcato Junior,
José Augusto Correa Martins,
Diogo Nunes Gonçalves,
Daniel Matte Freitas,
Lucas Prado Osco,
Jonathan de Andrade Silva,
Zhipeng Luo,
Jonathan Li,
Raymundo Cordero Garcia,
Wesley Nunes Gonçalves
Abstract:
Recently, methods based on Convolutional Neural Networks (CNN) have achieved impressive success in semantic segmentation tasks. However, challenges such as class imbalance and uncertainty in the pixel-labeling process are not completely addressed. As such, we present a new approach that calculates a weight for each pixel considering its class and uncertainty during the labeling process. The pixel-wise weights are used during training to increase or decrease the importance of the pixels. Experimental results show that the proposed approach leads to significant improvements in three challenging segmentation tasks in comparison to baseline methods. It also proved to be more invariant to noise. The approach presented here may be used within a wide range of semantic segmentation methods to improve their robustness.
Submitted 8 February, 2021;
originally announced February 2021.
-
Counting and Locating High-Density Objects Using Convolutional Neural Network
Authors:
Mauro dos Santos de Arruda,
Lucas Prado Osco,
Plabiany Rodrigo Acosta,
Diogo Nunes Gonçalves,
José Marcato Junior,
Ana Paula Marques Ramos,
Edson Takashi Matsubara,
Zhipeng Luo,
Jonathan Li,
Jonathan de Andrade Silva,
Wesley Nunes Gonçalves
Abstract:
This paper presents a Convolutional Neural Network (CNN) approach for counting and locating objects in high-density imagery. To the best of our knowledge, this is the first object counting and locating method based on a feature map enhancement and a Multi-Stage Refinement of the confidence map. The proposed method was evaluated on two counting datasets: tree and car. For the tree dataset, our method returned a mean absolute error (MAE) of 2.05, a root-mean-squared error (RMSE) of 2.87 and a coefficient of determination (R$^2$) of 0.986. For the car dataset (CARPK and PUCPR+), our method was superior to state-of-the-art methods. On these datasets, our approach achieved an MAE of 4.45 and 3.16, an RMSE of 6.18 and 4.39, and an R$^2$ of 0.975 and 0.999, respectively. The proposed method is suitable for dealing with high object density, returning state-of-the-art performance for counting and locating objects.
Submitted 8 February, 2021;
originally announced February 2021.
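The three error measures quoted in the abstract above (MAE, RMSE, and R$^2$) can be sketched as follows. This is an illustration under stated assumptions, not the authors' evaluation code: the function names and the sample counts are hypothetical, and R$^2$ is computed here as 1 - SS_res / SS_tot, whereas some counting papers report the squared Pearson correlation instead.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error between true and predicted counts."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root-mean-squared error between true and predicted counts."""
    return math.sqrt(
        sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    )


def r_squared(y_true, y_pred):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical per-image object counts (not taken from the paper).
true_counts = [10, 20, 30]
predicted_counts = [12, 18, 33]

print(mae(true_counts, predicted_counts))
print(rmse(true_counts, predicted_counts))
print(r_squared(true_counts, predicted_counts))
```

MAE and RMSE are in the same unit as the counts themselves, and RMSE penalizes large per-image errors more heavily than MAE, which is why the two are commonly reported together.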
-
A Review on Deep Learning in UAV Remote Sensing
Authors:
Lucas Prado Osco,
José Marcato Junior,
Ana Paula Marques Ramos,
Lúcio André de Castro Jorge,
Sarah Narges Fatholahi,
Jonathan de Andrade Silva,
Edson Takashi Matsubara,
Hemerson Pistori,
Wesley Nunes Gonçalves,
Jonathan Li
Abstract:
Deep Neural Networks (DNNs) learn representations from data with impressive capability and have brought important breakthroughs in processing images, time series, natural language, audio, video, and many other data types. In the remote sensing field, surveys and literature reviews specifically involving applications of DNN algorithms have been conducted in an attempt to summarize the amount of information produced in its subfields. Recently, Unmanned Aerial Vehicle (UAV)-based applications have dominated aerial sensing research. However, a literature review that combines both the "deep learning" and "UAV remote sensing" themes has not yet been conducted. The motivation for our work was to present a comprehensive review of the fundamentals of Deep Learning (DL) applied to UAV-based imagery. We focused mainly on describing classification and regression techniques used in recent applications with UAV-acquired data. For that, a total of 232 papers published in international scientific journal databases were examined. We gathered the published material and evaluated its characteristics regarding application, sensor, and technique used. We relate how DL presents promising results and has the potential for processing tasks associated with UAV-based image data. Lastly, we project future perspectives, commenting on prominent DL paths to be explored in the UAV remote sensing field. Our review takes a friendly approach to introducing, commenting on, and summarizing the state of the art in UAV-based image applications with DNN algorithms in diverse subfields of remote sensing, grouping them into environmental, urban, and agricultural contexts.
Submitted 20 August, 2023; v1 submitted 22 January, 2021;
originally announced January 2021.
-
An analysis of Reinforcement Learning applied to Coach task in IEEE Very Small Size Soccer
Authors:
Carlos H. C. Pena,
Mateus G. Machado,
Mariana S. Barros,
José D. P. Silva,
Lucas D. Maciel,
Tsang Ing Ren,
Edna N. S. Barros,
Pedro H. M. Braga,
Hansenclever F. Bassani
Abstract:
The IEEE Very Small Size Soccer (VSSS) is a robot soccer competition in which two teams of three small robots play against each other. Traditionally, a deterministic coach agent will choose the most suitable strategy and formation for each adversary's strategy. Therefore, the role of a coach is of great importance to the game. In this sense, this paper proposes an end-to-end approach for the coaching task based on Reinforcement Learning (RL). The proposed system processes the information during the simulated matches to learn an optimal policy that chooses the current formation, depending on the opponent and game conditions. We trained two RL policies against three different teams (balanced, offensive, and heavily offensive) in a simulated environment. Our results were assessed against one of the top teams of the VSSS league, showing promising results after achieving a win/loss ratio of approximately 2.0.
Submitted 23 November, 2020;
originally announced November 2020.
-
New Algorithms for Computing a Single Component of the Discrete Fourier Transform
Authors:
G. Jerônimo da Silva Jr.,
R. M. Campello de Souza,
H. M. de Oliveira
Abstract:
This paper introduces the theory and hardware implementation of two new algorithms for computing a single component of the discrete Fourier transform. In terms of multiplicative complexity, both algorithms are more efficient, in general, than the well known Goertzel Algorithm.
Submitted 9 March, 2015;
originally announced March 2015.
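For context on the baseline named in the abstract above, the following is a minimal sketch of the classical Goertzel algorithm, which computes a single DFT component X[k] with a second-order recurrence. It does not implement the two new algorithms the paper proposes; the function names `goertzel` and `dft_component` and the sample signal are hypothetical and chosen only for illustration.

```python
import cmath
import math


def goertzel(samples, k):
    """Compute the k-th DFT component of `samples` with the Goertzel recurrence."""
    n = len(samples)
    omega = 2.0 * math.pi * k / n
    coeff = 2.0 * math.cos(omega)
    s_prev = 0.0   # s[m-1]
    s_prev2 = 0.0  # s[m-2]
    for x in samples:
        # Recurrence: s[m] = x[m] + 2*cos(omega)*s[m-1] - s[m-2]
        s_curr = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s_curr
    # Final step: X[k] = exp(j*omega) * s[N-1] - s[N-2]
    return cmath.exp(1j * omega) * s_prev - s_prev2


def dft_component(samples, k):
    """Direct evaluation of X[k] = sum_m x[m] * exp(-2j*pi*k*m/N), for comparison."""
    n = len(samples)
    return sum(
        x * cmath.exp(-2j * math.pi * k * m / n) for m, x in enumerate(samples)
    )


signal = [1.0, 2.0, 3.0, 4.0]
print(goertzel(signal, 1))
print(dft_component(signal, 1))
```

For a real input, the loop uses one real multiplication per sample (by `coeff`), plus a single complex multiplication at the end. This multiplicative cost is the point of comparison the abstract refers to.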