Search | arXiv e-print repository

When water phase matters: its effect on the stopping cross section for proton therapy and astrophysics

Authors: F. Matias, N. E. Koval, P. de Vera, R. Garcia-Molina, I. Abril, J. M. B. Shorto, H. Yoriyaz, J. J. N. Pereira, T. F. Silva, M. H. Tabacniks, M. Vos, P. L. Grande

Abstract: Accurately quantifying the energy loss rate of proton beams in liquid water is crucial for the precise application and improvement of proton therapy, whereas the slowing down of protons in water ices also plays an important role in astrophysics. However, precisely determining the electronic stopping power, particularly for the liquid phase, has been elusive so far. Experimental techniques are difficult to apply to volatile liquids, and the availability of sufficient reliable measurements has been limited to the solid and vapor phases. The accuracy of current models is typically limited to proton energies just above the energy-loss maximum, making it difficult to predict radiation effects at an energy range of special relevance. We elucidate the phase differences in proton energy loss in water in a wide energy range (0.001-10 MeV) by means of real-time time-dependent density functional theory combined with the Penn method. This non-perturbative model, more computationally-efficient than current approaches, describes the phase effects in water in excellent agreement with available experimental data, revealing clear deviations around the maximum of the stopping power curve and below. As an important outcome, our calculations reveal that proton stopping quantities of liquid water and amorphous ice are identical, in agreement with recent similar observations for low-energy electrons, pointing to this equivalence for all charged particles. This could help to overcome the limitation in obtaining reliable experimental information for the biologically-relevant liquid water target.

Submitted 29 May, 2025; originally announced May 2025.

arXiv:2505.08424

CMOS-Compatible, Wafer-Scale Processed Superconducting Qubits Exceeding Energy Relaxation Times of 200us

Authors: T. Mayer, J. Weber, E. Music, C. Moran Guizan, S. J. K. Lang, L. Schwarzenbach, C. Dhieb, B. Kiliclar, A. Maiwald, Z. Luo, W. Lerch, D. Zahn, I. Eisele, R. N. Pereira, C. Kutter

Abstract: We present the results of an industry-grade fabrication of superconducting qubits on 200 mm wafers utilizing CMOS-established processing methods. By automated waferprober resistance measurements at room temperature, we demonstrate a Josephson junction fabrication yield of 99.7% (shorts and opens) across more than 10000 junctions and a qubit frequency prediction accuracy of 1.6%. In cryogenic characterization, we provide statistical results regarding energy relaxation times of the qubits with a median T1 of up to 100 us and individual devices consistently approaching 200 us in long-term measurements. This represents the best performance reported so far for superconducting qubits fabricated by industry-grade, wafer-level subtractive processes.

Submitted 20 May, 2025; v1 submitted 13 May, 2025; originally announced May 2025.

Comments: T. Mayer and J. Weber contributed equally to this work and are listed in alphabetical order. 6 pages, 2 figures

arXiv:2505.04337

3D-Integrated Superconducting qubits: CMOS-Compatible, Wafer-Scale Processing for Flip-Chip Architectures

Authors: T. Mayer, H. Bender, S. J. K. Lang, Z. Luo, J. Weber, C. Moran Guizan, C. Dhieb, D. Zahn, L. Schwarzenbach, W. Hell, M. Andronic, A. Drost, K. Neumeier, W. Lerch, L. Nebrich, A. Hagelauer, I. Eisele, R. N. Pereira, C. Kutter

Abstract: In this article, we present a technology development of a superconducting qubit device 3D-integrated by flip-chip-bonding and processed following CMOS fabrication standards and contamination rules on 200 mm wafers. We present the utilized proof-of-concept chip designs for qubit- and carrier chip, as well as the respective front-end and back-end fabrication techniques. In characterization of the newly developed microbump technology based on metallized KOH-etched Si-islands, we observe a superconducting transition of the used metal stacks and radio frequency (RF) signal transfer through the bump connection with negligible attenuation. In time-domain spectroscopy of the qubits we find high yield qubit excitation with energy relaxation times of up to 15 us.

Submitted 23 May, 2025; v1 submitted 7 May, 2025; originally announced May 2025.

Comments: 8 pages, 5 figures

arXiv:2504.18173

Advancing Superconducting Qubits: CMOS-Compatible Processing and Room Temperature Characterization for Scalable Quantum Computing beyond 2D Architectures

Authors: S. J. K. Lang, T. Mayer, J. Weber, C. Dhieb, I. Eisele, W. Lerch, Z. Luo, C. Moran Guizan, E. Music, L. Sturm-Rogon, D. Zahn, R. N. Pereira, C. Kutter

Abstract: We report on an industry-grade CMOS-compatible qubit fabrication approach using a CMOS pilot line, enabling a yield of functional devices reaching 92.8%, with a resistance spread evaluated across the full 200 mm wafer diameter of 12.4% and relaxation times (T1) approaching 80 us. Furthermore, we conducted a comprehensive analysis of wafer-scale room temperature (RT) characteristics collected from multiple wafers and fabrication runs, focusing on RT measurements and their correlation to low temperature qubit parameters. From defined test structures, an across-wafer junction area variation of 10.1% and an oxide barrier variation of 7.2% were calculated. Additionally, we show a close correlation between qubit junction resistance and frequency in accordance with the Ambegaokar-Baratoff relation with a critical temperature Tc of about 0.71 K. This overarching relation sets the stage for pre-cooldown qubit evaluation and sorting. In particular, such early-on device characterization and validation are crucial for increasing the fabrication yield and qubit frequency targeting, which currently represent major scaling challenges. Furthermore, it enables the fabrication of large multichip quantum systems in the future. Our findings highlight the great potential of CMOS-compatible industry-style fabrication of superconducting qubits for scalable quantum computing in a foundry pilot line cleanroom.

Submitted 26 May, 2025; v1 submitted 25 April, 2025; originally announced April 2025.

Comments: S. J. K. Lang, T. Mayer and J. Weber contributed equally to this work and are listed in alphabetical order. 9 pages, 8 figures

arXiv:2504.16686

Wafer-Scale Characterization of Al/AlxOy/Al Josephson Junctions at Room Temperature

Authors: Simon J. K. Lang, Ignaz Eisele, Johannes Weber, Alexandra Schewski, Emir Music, Alwin Maiwald, Martin Heigl, Daniela Zahn, Zhen Luo, Lars Nebrich, Benedikt Schoof, Thomas Mayer, Leonhard Sturm-Rogon, Wilfried Lerch, Rui N. Pereira, Christoph Kutter

Abstract: Josephson junctions (JJs) are the key element of many devices operating at cryogenic temperatures. Development of time-efficient wafer-scale JJ characterization for process optimization and control of JJ fabrication is essential. Such statistical characterization has to rely on room temperature techniques since cryogenic measurements typically used for JJs are too time consuming and unsuitable for wafer-scale characterization. In this work, we show that from room temperature capacitance and current-voltage measurements, with proper data analysis, we can independently obtain useful parameters of the JJs on wafer-scale, like oxide thickness, tunnel coefficient, and interfacial defect densities. Moreover, based on detailed analysis of current vs voltage characteristics, different charge transport mechanisms across the junctions can be distinguished. As an example, we demonstrate the value of these methods by studying junctions fabricated on 200 mm wafers with an industrially scalable concept based on subtractive processing using only CMOS compatible tools. From these studies, we find that our subtractive fabrication approach yields junctions with quite homogeneous average oxide thickness across the full wafers, with a spread of less than 3%. The analysis also revealed a variation of the tunnel coefficient with oxide thickness, pointing to a stoichiometry gradient across the junctions' oxide width. Moreover, we estimated relatively low interfacial defect densities in the range of 70 - 5000 defects/cm$^2$ for our junctions and established that the density increased with decreasing oxide thickness, indicating that the wet etching process applied in the JJs fabrication for oxide thickness control leads to formation of interfacial trap state

Submitted 14 May, 2025; v1 submitted 23 April, 2025; originally announced April 2025.

Comments: 8 pages, 10 figures, 1 table

arXiv:2503.06428

Interference-Aware Edge Runtime Prediction with Conformal Matrix Completion

Authors: Tianshu Huang, Arjun Ramesh, Emily Ruppel, Nuno Pereira, Anthony Rowe, Carlee Joe-Wong

Abstract: Accurately estimating workload runtime is a longstanding goal in computer systems, and plays a key role in efficient resource provisioning, latency minimization, and various other system management tasks. Runtime prediction is particularly important for managing increasingly complex distributed systems in which more sophisticated processing is pushed to the edge in search of better latency. Previous approaches for runtime prediction in edge systems suffer from poor data efficiency or require intensive instrumentation; these challenges are compounded in heterogeneous edge computing environments, where historical runtime data may be sparsely available and instrumentation is often challenging. Moreover, edge computing environments often feature multi-tenancy due to limited resources at the network edge, potentially leading to interference between workloads and further complicating the runtime prediction problem. Drawing from insights across machine learning and computer systems, we design a matrix factorization-inspired method that generates accurate interference-aware predictions with tight provably-guaranteed uncertainty bounds. We validate our method on a novel WebAssembly runtime dataset collected from 24 unique devices, achieving a prediction error of 5.2% -- 2x better than a naive application of existing methods.

Submitted 8 March, 2025; originally announced March 2025.

Comments: To appear at MLSys 2025

arXiv:2410.06432

doi 10.1038/s41699-025-00551-7

Gating monolayer and bilayer graphene with a two-dimensional semiconductor

Authors: Randy Sterbentz, Bogyeom Kim, Anayeli Flores-Garibay, Kristine L. Haley, Nicholas T. Pereira, Kenji Watanabe, Takashi Taniguchi, Joshua O. Island

Abstract: Metals are commonly used as electrostatic gates in devices due to their abundant charge carrier densities that are necessary for efficient charging and discharging. A semiconducting gate can be beneficial for certain fabrication processes, in low light conditions, and for specific gating properties. We determine the effectiveness and limitations of a semiconducting gate in graphene and bilayer graphene devices. Using the semiconducting transition metal dichalcogenides molybdenum disulfide (MoS2), molybdenum diselenide (MoSe2), tungsten disulfide (WS2), and tungsten diselenide (WSe2), we show that two-dimensional semiconductors can be used to suitably gate the graphene devices under appropriate operating conditions. For single-gated devices, semiconducting gates are comparable to metallic gates below liquid helium temperatures but include resistivity features resulting from gate voltage clamping of the semiconductor. In dual-gated devices, we pin down the parameter range of effective operation and find that the semiconducting depletion regime results in clamping and hysteresis from defect-state charge trapping.

Submitted 11 April, 2025; v1 submitted 8 October, 2024; originally announced October 2024.

Comments: 33 pages, 14 figures

Journal ref: npj 2D Mater Appl 9, 29 (2025)

arXiv:2408.13249

doi 10.1021/acsanm.4c05338

Isolation and characterization of atomically thin mica phyllosilicates

Authors: Kristine L. Haley, Noah F. Lee, Vergil M. Schreiber, Nicholas T. Pereira, Randy M. Sterbentz, Timothy Y. Chung, Joshua O. Island

Abstract: One of the roadblocks to employing two-dimensional (2D) materials in next generation devices is the lack of high quality insulators. Insulating layered materials with inert and atomically flat surfaces are ideal for high performance transistors and this has been exemplified with commonly used boron nitride. While the list of insulating 2D materials is limited, the earth-abundant phyllosilicates are particularly attractive candidates. Here, we investigate the properties of atomically thin biotite and muscovite, the most common and commercially important micas from the rock-forming minerals. From a group of five natural bulk samples, energy dispersive X-ray spectroscopy is used to classify exfoliated flakes into three types of biotite, including the phlogopite endmember, and two muscovites. We provide a catalog of RGB contrast values for exfoliated flakes ranging from bilayer to approximately 175 nm. Additionally, we report the complex index of refraction for all investigated materials based on micro-reflectance measurements. Our findings suggest that earth-abundant phyllosilicates could serve as scalable insulators for logic devices employing 2D materials, potentially overcoming current limitations in the field.

Submitted 23 August, 2024; originally announced August 2024.

Comments: 18 pages, 4 figures

arXiv:2408.09289

doi 10.1002/adfm.202408110

Mono-exponential Current Attenuation with Distance across 16 nm Thick Bacteriorhodopsin Multilayers

Authors: Domenikos Chryssikos, Jerry A. Fereiro, Jonathan Rojas, Sudipta Bera, Defne Tüzün, Evanthia Kounoupioti, Rui N. Pereira, Christian Pfeiffer, Ali Khoshouei, Hendrik Dietz, Mordechai Sheves, David Cahen, Marc Tornow

Abstract: The remarkable ability of natural proteins to conduct electricity in the dry state over long distances remains largely inexplicable despite intensive research. In some cases, a (weakly) exponential length-attenuation, as in off-resonant tunneling transport, extends to thicknesses even beyond 10 nm. This report deals with such charge transport characteristics observed in self-assembled multilayers of the protein bacteriorhodopsin (bR). About 7.5 nm to 15.5 nm thick bR layers were prepared on conductive titanium nitride (TiN) substrates using aminohexylphosphonic acid and poly-diallyl-dimethylammonium electrostatic linkers. Using conical EGaIn top contacts, an intriguing, mono-exponential conductance attenuation as a function of the bR layer thickness with a small attenuation coefficient $\beta \approx 0.8\,{\rm nm}^{-1}$ is measured at zero bias. Variable-temperature measurements using evaporated Ti/Au top contacts yield effective energy barriers of about 100 meV from fitting the data to tunneling, hopping, and carrier cascade transport models. The observed temperature-dependence is assigned to the protein-electrode interfaces. The transport length and temperature dependence of the current densities are consistent with tunneling through the protein-protein and protein-electrode interfaces, respectively. Importantly, our results call for new theoretical approaches to find the microscopic mechanism behind the remarkably efficient, long-range electron transport within bR.

Submitted 17 August, 2024; originally announced August 2024.

Journal ref: Adv. Funct. Mater. 2024, 2408110

arXiv:2403.09483

doi 10.1140/epjc/s10052-024-12993-2

Tracking of charged particles with nanosecond lifetimes at LHCb

Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, J. A. Adams, A. A. Adefisoye, B. Adeva, M. Adinolfi, P. Adlarson, C. Agapopoulou, C. A. Aidala, Z. Ajaltouni, S. Akar, K. Akiba, P. Albicocco, J. Albrecht, F. Alessio, M. Alexander, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey, et al. (1060 additional authors not shown)

Abstract: A method is presented to reconstruct charged particles with lifetimes between 10 ps and 10 ns, which considers a combination of their decay products and the partial tracks created by the initial charged particle. Using the $\Xi^-$ baryon as a benchmark, the method is demonstrated with simulated events and proton-proton collision data at $\sqrt{s}=13$ TeV, corresponding to an integrated luminosity of 2.0 fb$^{-1}$ collected with the LHCb detector in 2018. Significant improvements in the angular resolution and the signal purity are obtained. The method is implemented as part of the LHCb Run 3 event trigger in a set of requirements to select detached hyperons. This is the first demonstration of the applicability of this approach at the LHC, and the first to show its scaling with instantaneous luminosity.

Submitted 18 September, 2024; v1 submitted 14 March, 2024; originally announced March 2024.

Comments: All figures and tables, along with any supplementary material and additional information, are available at https://cern.ch/lhcbproject/Publications/p/LHCb-DP-2023-004.html (LHCb public pages)

Report number: CERN-EP-2024-077, LHCb-DP-2023-004

Journal ref: EPJC 84 (2024) 761

arXiv:2401.02853

Modeling of Proton Interaction with Organic Polymers: Implications for Cancer Therapy and Beyond

Authors: F. Matias, T. F. Silva, N. E. Koval, J. J. N. Pereira, P. C. G. Antunes, P. T. D. Siqueira, M. H. Tabacniks, H. Yoriyaz, J. M. B. Shorto, P. L. Grande

Abstract: This comprehensive study delves into the intricate interplay between protons and organic polymers, offering insights into proton therapy in cancer treatment. Focusing on the influence of the spatial electron density distribution on stopping power estimates, we employed time-dependent density functional theory (TDDFT), coupled with the Penn method. Surprisingly, the assumption of electron density homogeneity in polymers is fundamentally flawed, resulting in an overestimation of stopping power values at energies below 2 MeV, approximately. Moreover, Bragg's rule application in specific compounds exhibited significant deviations from experimental data in the Bragg peak region, challenging established norms.

Submitted 24 January, 2024; v1 submitted 5 January, 2024; originally announced January 2024.

Comments: 8 pages, 7 figures, research article

arXiv:2305.07511

eXplainable Artificial Intelligence on Medical Images: A Survey

Authors: Matteus Vargas Simão da Silva, Rodrigo Reis Arrais, Jhessica Victoria Santos da Silva, Felipe Souza Tânios, Mateus Antonio Chinelatto, Natalia Backhaus Pereira, Renata De Paris, Lucas Cesar Ferreira Domingos, Rodrigo Dória Villaça, Vitor Lopes Fabris, Nayara Rossi Brito da Silva, Ana Claudia Akemi Matsuki de Faria, Jose Victor Nogueira Alves da Silva, Fabiana Cristina Queiroz de Oliveira Marucci, Francisco Alves de Souza Neto, Danilo Xavier Silva, Vitor Yukio Kondo, Claudio Filipi Gonçalves dos Santos

Abstract: Over the last few years, the number of works about deep learning applied to the medical field has increased enormously. A rigorous assessment of these models is required to explain their results to all people involved in medical exams. A recent field in the machine learning area is explainable artificial intelligence, also known as XAI, which aims to explain the results of such black box models to permit the desired assessment. This survey analyses several recent studies in the XAI field applied to medical diagnosis research, allowing some explainability of the machine learning results in several different diseases, such as cancers and COVID-19.

Submitted 12 May, 2023; originally announced May 2023.

arXiv:2205.04972

doi 10.1093/mnras/stac1370

On the stellar core physics of the 16 Cyg binary system: constraining the central hydrogen abundance using asteroseismology

Authors: Benard Nsamba, Margarida S. Cunha, Catarina I. S. A. Rocha, Cristiano J. G. N. Pereira, Mário J. P. F. G. Monteiro, Tiago L. Campante

Abstract: The unprecedented quality of the asteroseismic data of solar-type stars made available by space missions such as NASA's Kepler telescope is making it possible to explore stellar interior structures. This offers possibilities of constraining stellar core properties (such as core sizes, abundances, and physics) paving the way for improving the precision of the inferred stellar ages. We employ 16 Cyg A and B as our benchmark stars for an asteroseismic study in which we present a novel approach aimed at selecting from a sample of acceptable stellar models returned from Forward Modelling techniques, down to the ones that better represent the core of each star. This is accomplished by comparing specific properties of the observed frequency ratios for each star to the ones derived from the acceptable stellar models. We demonstrate that in this way we are able to constrain further the hydrogen mass fraction in the core, establishing the stars' precise evolutionary states and ages. The ranges of the derived core hydrogen mass fractions are [0.01 - 0.06] and [0.12 - 0.19] for 16 Cyg A and B, respectively, and, considering that the stars are coeval, the age and metal mass fraction parameters span the region [6.4 - 7.4] Gyr and [0.023 - 0.026], respectively. In addition, our findings show that using a single helium-to-heavy element enrichment ratio, ($\Delta Y/\Delta Z$), when forward modelling the 16 Cyg binary system, may result in a sample of acceptable models that do not simultaneously fit the observed frequency ratios, further highlighting that such an approach to the definition of the helium content of the star may not be adequate in studies of individual stars.

Submitted 10 May, 2022; originally announced May 2022.

Comments: 13 pages, 8 figures, and 5 tables. Accepted for publication in MNRAS

arXiv:2202.12159

An NLP Solution to Foster the Use of Information in Electronic Health Records for Efficiency in Decision-Making in Hospital Care

Authors: Adelino Leite-Moreira, Afonso Mendes, Afonso Pedrosa, Amândio Rocha-Sousa, Ana Azevedo, André Amaral-Gomes, Cláudia Pinto, Helena Figueira, Nuno Rocha Pereira, Pedro Mendes, Tiago Pimenta

Abstract: The project aimed to define the rules and develop a technological solution to automatically identify a set of attributes within free-text clinical records written in Portuguese. The first application developed and implemented on this basis was a structured summary of a patient's clinical history, including previous diagnoses and procedures, usual medication, and relevant characteristics or conditions for clinical decisions, such as allergies, being under anticoagulant therapy, etc. The project's goal was achieved by a multidisciplinary team that included clinicians, epidemiologists, computational linguists, machine learning researchers and software engineers, bringing together the expertise and perspectives of a public hospital, the university and the private sector. Relevant benefits to users and patients are related to facilitated access to the patient's history, which translates into exhaustiveness in apprehending the patient's clinical past and efficiency due to time saving.

Submitted 24 February, 2022; originally announced February 2022.

Comments: 11 pages, 2 figures

arXiv:2112.10938

doi 10.1016/j.infsof.2022.107089

CADV: A software visualization approach for code annotations distribution

Authors: Phyllipe Lima, Jorge Melegati, Everaldo Gomes, Nathalya Stefhany Pereira, Eduardo Guerra, Paulo Meirelles

Abstract: Code annotations are a widely used feature in Java systems to configure custom metadata on programming elements. Their increasing presence creates the need for approaches to assess and comprehend their usage and distribution. In this context, software visualization has been studied and researched to improve program comprehension in different aspects. This study aimed at designing a software visualization approach that graphically displays how code annotations are distributed and organized in a software system and developing a tool, as a reference implementation of the approach, to generate views and interact with users. We conducted an empirical evaluation through questionnaires and interviews to evaluate our visualization approach considering four aspects: effectiveness for program comprehension, perceived usefulness, perceived ease of use, and suitability for the intended audience. The resulting data was used to perform a qualitative and quantitative analysis. The tool identifies package responsibilities providing visual information about their annotations at different levels. Using the developed tool, the participants achieved a high correctness rate in the program comprehension tasks and performed very well in questions about the overview of the system under analysis. Finally, participants perceived that the tool outperforms existing approaches for code inspection when searching for information related to code annotations. The results show that the visualization approach using the developed tool is effective in program comprehension tasks related to code annotations, which can also be used to identify responsibilities in the application packages. Moreover, it was evaluated as suitable for newcomers to overview the usage of annotations in the system and for architects to perform a deep analysis that can potentially detect misplaced annotations and abnormal growths on their usage.

Submitted 27 June, 2022; v1 submitted 20 December, 2021; originally announced December 2021.

Comments: 53 pages

arXiv:2111.09378 [pdf, other]

MPF6D: Masked Pyramid Fusion 6D Pose Estimation

Authors: Nuno Pereira, Luís A. Alexandre

Abstract: Object pose estimation has multiple important applications, such as robotic grasping and augmented reality. We present a new method to estimate the 6D pose of objects that improves upon the accuracy of current proposals and can still be used in real-time. Our method uses RGB-D data as input to segment objects and estimate their pose. It uses a neural network with multiple heads to identify the objects in the scene, generate the appropriate masks and estimate the values of the translation vectors and the quaternion that represents the objects' rotation. These heads leverage a pyramid architecture used during feature extraction and feature fusion. We conduct an empirical evaluation using the two most common datasets in the area, and compare against state-of-the-art approaches, illustrating the capabilities of MPF6D. Our method can be used in real-time with its low inference time and high accuracy.

Submitted 4 February, 2022; v1 submitted 17 November, 2021; originally announced November 2021.

arXiv:2111.00174 [pdf, other]

Multi-User Augmented Reality with Infrastructure-free Collaborative Localization

Authors: John Miller, Elahe Soltanaghai, Raewyn Duvall, Jeff Chen, Vikram Bhat, Nuno Pereira, Anthony Rowe

Abstract: Multi-user augmented reality (AR) could someday empower first responders with the ability to see team members around corners and through walls. For this vision of people tracking in dynamic environments to be practical, we need a relative localization system that is nearly instantly available across wide-areas without any existing infrastructure or manual setup. In this paper, we present LocAR, an infrastructure-free 6-degrees-of-freedom (6DoF) localization system for AR applications that uses motion estimates and range measurements between users to establish an accurate relative coordinate system. We show that not only is it possible to perform collaborative localization without infrastructure or global coordinates, but that our approach provides nearly the same level of accuracy as fixed infrastructure approaches for AR teaming applications. LocAR uses visual-inertial odometry (VIO) in conjunction with ultra-wideband (UWB) ranging radios to estimate the relative position of each device in an ad-hoc manner. The system leverages a collaborative 6DoF particle filtering formulation that operates on sporadic messages exchanged between nearby users. Unlike map or landmark sharing approaches, this allows for collaborative AR sessions even if users do not overlap the same spaces. LocAR consists of an open-source UWB firmware and reference mobile phone application that can display the location of team members in real-time using mobile AR.
We evaluate LocAR across multiple buildings under a wide-variety of conditions including a contiguous 30,000 square foot region spanning multiple floors and find that it achieves median geometric error in 3D of less than 1 meter between five users freely walking across 3 floors.

Submitted 30 October, 2021; originally announced November 2021.

arXiv:2101.03835 [pdf, other]

doi 10.1051/0004-6361/202038419

DAWIS, a Detection Algorithm with Wavelets for Intracluster light Studies

Authors: A. Ellien, E. Slezak, N. Martinet, F. Durret, C. Adami, R. Gavazzi, C. R. Rabaça, C. Da Rocha, D. N. Epitácio Pereira

Abstract: Large amounts of deep optical images will be available in the near future, allowing statistically significant studies of low surface brightness structures such as intracluster light (ICL) in galaxy clusters. The detection of these structures requires efficient algorithms dedicated to this task, where traditional methods suffer difficulties. We present our new Detection Algorithm with Wavelets for Intracluster light Studies (DAWIS), developed and optimised for the detection of low surface brightness sources in images, in particular (but not limited to) ICL. DAWIS follows a multiresolution vision based on wavelet representation to detect sources, embedded in an iterative procedure called synthesis-by-analysis approach to restore the complete unmasked light distribution of these sources with very good quality. The algorithm is built so sources can be classified based on criteria depending on the analysis goal; we display in this work the case of ICL detection and the measurement of ICL fractions. We test the efficiency of DAWIS on 270 mock images of galaxy clusters with various ICL profiles and compare its efficiency to more traditional ICL detection methods such as the surface brightness threshold method. We also run DAWIS on a real galaxy cluster image, and compare the output to results obtained with previous multiscale analysis algorithms. We find in simulations that on average DAWIS is able to disentangle galaxy light from ICL more efficiently, and to detect a greater quantity of ICL flux due to the way it handles sky background noise.
We also show that the ICL fraction, a metric used on a regular basis to characterise ICL, is subject to several measurement biases both on galaxies and ICL fluxes. In the real galaxy cluster image, DAWIS detects a faint and extended source with an absolute magnitude two orders brighter than previous multiscale methods.

Submitted 11 January, 2021; originally announced January 2021.

Comments: 31 pages, 12 figures

Journal ref: A&A 649, A38 (2021)

arXiv:1911.07771 [pdf, other]

MaskedFusion: Mask-based 6D Object Pose Estimation

Authors: Nuno Pereira, Luís A. Alexandre

Abstract: MaskedFusion is a framework to estimate the 6D pose of objects using RGB-D data, with an architecture that leverages multiple sub-tasks in a pipeline to achieve accurate 6D poses. 6D pose estimation is an open challenge due to complex world objects and many possible problems when capturing data from the real world, e.g., occlusions, truncations, and noise in the data. Achieving accurate 6D poses will improve results in other open problems like robot grasping or positioning objects in augmented reality. MaskedFusion improves the state-of-the-art by using object masks to eliminate non-relevant data. With the inclusion of the masks on the neural network that estimates the 6D pose of an object we also have features that represent the object shape. MaskedFusion is a modular pipeline where each sub-task can have different methods that achieve the objective. MaskedFusion achieved 97.3% on average using the ADD metric on the LineMOD dataset and 93.3% using the ADD-S AUC metric on YCB-Video Dataset, which is an improvement, compared to the state-of-the-art methods. The code is available on GitHub (https://github.com/kroglice/MaskedFusion).

Submitted 18 March, 2020; v1 submitted 18 November, 2019; originally announced November 2019.

arXiv:1909.01759 [pdf, other]

Data Selection for Short Term load forecasting

Authors: Nestor Pereira, Miguel Angel Hombrados Herrera, Vanesssa Gómez-Verdejo, Andrea A. Mammoli, Manel Martínez-Ramón

Abstract: Power load forecast with Machine Learning is a fairly mature application of artificial intelligence and it is indispensable in operation, control and planning. Data selection techniques have been hardly used in this application. However, the use of such techniques could be beneficial provided the assumption that the data is identically distributed is clearly not true in load forecasting, but it is cyclostationary. In this work we present a fully automatic methodology to determine what are the most adequate data to train a predictor which is based on a full Bayesian probabilistic model. We assess the performance of the method with experiments based on real publicly available data recorded from several years in the United States of America.

Submitted 15 October, 2019; v1 submitted 2 September, 2019; originally announced September 2019.

arXiv:1907.08202 [pdf, other]

Mathematical description of a Frozen Wave beam after passing through a pair of convex lenses with different focal distance

Authors: Michel Zamboni-Rached, Grazielle de A. Lourenço-Vittorino, T. Viana de Sousa, Joel A. Varela Mendonça, Jessyca N. Pereira, Erasmo Recami

Abstract: In this paper, we shall provide an analytical solution describing a Frozen Wave beam after passing through a pair of convex lenses with different focal distances.

Submitted 24 September, 2019; v1 submitted 18 July, 2019; originally announced July 2019.

Comments: 5 pages and 1 figure; Replaced with minor improvements about the authors' names and Institutions

MSC Class: 78

arXiv:1903.00660 [pdf, other]

Controlling Robots using Artificial Intelligence and a Consortium Blockchain

Authors: Vasco Lopes, Luís A. Alexandre, Nuno Pereira

Abstract: Blockchain is a disruptive technology that is normally used within financial applications, however it can be very beneficial also in certain robotic contexts, such as when an immutable register of events is required. Among the several properties of Blockchain that can be useful within robotic environments, we find not just immutability but also decentralization of the data, irreversibility, accessibility and non-repudiation. In this paper, we propose an architecture that uses blockchain as a ledger and smart-contract technology for robotic control by using external parties, Oracles, to process data. We show how to register events in a secure way, how it is possible to use smart-contracts to control robots and how to interface with external Artificial Intelligence algorithms for image analysis. The proposed architecture is modular and can be used in multiple contexts such as in manufacturing, network control, robot control, and others, since it is easy to integrate, adapt, maintain and extend to new domains.

Submitted 2 March, 2019; originally announced March 2019.

arXiv:1807.04321 [pdf, ps, other]

Driving forces on dislocations due to strain gradients and higher order gradients

Authors: P. C. N. Pereira, S. W. S. Apolinario

Abstract: Dislocations are topological defects known to be crucial in the onset of plasticity and in many properties of crystals. Classical Elasticity still fails to fully explain their dynamics under extreme conditions of high strain gradients and small scales, which can nowadays be scrutinized. In such conditions, corrections to the Volterra dislocation fields and to the Peach-Koehler force, for example, become relevant. One way to go beyond the Volterra solution is to consider other terms in the total Laurent series solution. This is the so called core field. One of its consequences is to predict a driving force on the dislocation due to background strain/stress gradients, which has also been suggested by other core energy calculations. Here we confirm its existence by presenting a direct observation of strain gradients driving edge dislocations in 2D atomistic simulations. We show that, in systems with scale invariance, the results for such core force can be used to obtain the total core energy, allowing a standard value for this energy to be compared with other classical methods of obtaining it. The force measured in our system differs from the prediction obtained by a direct core field analysis. Moreover, we found that higher order gradients of strains can also act as relevant forces.

Submitted 4 January, 2021; v1 submitted 11 July, 2018; originally announced July 2018.

Comments: 8 pages

arXiv:1712.01118 [pdf, other]

doi 10.1103/PhysRevApplied.10.034023

Structured Light by linking together diffraction-resistant spatially shaped beams: "LEGO-BEAMS"

Authors: Michel Zamboni-Rached, Erasmo Recami, Tarcio A. Vieira, Marcos R. R. Gesualdi, J. N. Pereira

Abstract: In this paper we present a theoretical method, together with its experimental confirmation, to obtain structures of light by connecting diffraction-resistant cylindrical beams of finite lengths and different radii. The resulting "Lego-beams" can assume, on demand, various unprecedented spatial configurations. We also experimentally generate some of them on using a computational holographic technique and a spatial light modulator. Our new, interesting method of linking together various "pieces of light" can find applications in all fields where structured light beams are needed: in particular, such as optical tweezers, e.g. for biological manipulations, optical guiding of atoms, light orbital angular momentum control, holography, lithography, non-linear-optics, interaction of electromagnetic radiation with Bose-Einstein condensates, and so on, besides the field in general of Localized Waves (non-diffracting beams and pulses).

Submitted 4 April, 2018; v1 submitted 30 November, 2017; originally announced December 2017.

Comments: 20 pages with 6 Figures. Paper submitted for pub. In this Second Version more attention is paid to the applications, and more references are added. Suitable, related modifications are inserted (even in the Abstract)

Journal ref: Phys. Rev. Applied 10, 034023 (2018)

arXiv:1710.07032 [pdf, other]

SLING: A framework for frame semantic parsing

Authors: Michael Ringgaard, Rahul Gupta, Fernando C. N. Pereira

Abstract: We describe SLING, a framework for parsing natural language into semantic frames. SLING supports general transition-based, neural-network parsing with bidirectional LSTM input encoding and a Transition Based Recurrent Unit (TBRU) for output decoding. The parsing model is trained end-to-end using only the text tokens as input. The transition system has been designed to output frame graphs directly without any intervening symbolic representation. The SLING framework includes an efficient and scalable frame store implementation as well as a neural network JIT compiler for fast inference during parsing. SLING is implemented in C++ and it is available for download on GitHub.

Submitted 19 October, 2017; originally announced October 2017.

arXiv:1702.04438 [pdf, ps, other]

doi 10.1038/srep44900

A novel procedure for the identification of chaos in complex biological systems

Authors: D. Bazeia, M. B. P. N. Pereira, A. V. Brito, B. F. de Oliveira, J. G. G. S. Ramos

Abstract: We demonstrate the presence of chaos in stochastic simulations that are widely used to study biodiversity in nature. The investigation deals with a set of three distinct species that evolve according to the standard rules of mobility, reproduction and predation, with predation following the cyclic rules of the popular rock, paper and scissors game. The study uncovers the possibility to distinguish between time evolutions that start from slightly different initial states, guided by the Hamming distance which heuristically unveils the chaotic behavior. The finding opens up a quantitative approach that relates the correlation length to the average density of maxima of a typical species, and an ensemble of stochastic simulations is implemented to support the procedure. The main result of the work shows how a single and simple experimental realization that counts the density of maxima associated with the chaotic evolution of the species serves to infer its correlation length. We use the result to investigate other distinct complex systems, one dealing with a set of differential equations that can be used to model a diversity of natural and artificial chaotic systems, and another one, focusing on the ocean water level.

Submitted 26 March, 2017; v1 submitted 14 February, 2017; originally announced February 2017.

Comments: 11 pages, 8 figures, accepted for publication in Scientific Reports

Journal ref: Sci. Rep. 7, 44900 (2017)

arXiv:1210.4972 [pdf]

Power Challenges of Large Scale Research Infrastructures: the Square Kilometer Array and Solar Energy Integration; Towards a zero-carbon footprint next generation telescope

Authors: Domingos Barbosa, Gonzalo Lobo Márquez, Valeriano Ruiz, Manuel Silva, Lourdes Verdes-Montenegro, Juande Santander-Vela, Dalmiro Maia, Sonia Antón, Arnold van Ardenne, Matthias Vetter, Michael Kramer, Reinhard Keller, Nuno Pereira, Vitor Silva, The BIOSTIRLING Consortium

Abstract: The Square Kilometer Array (SKA) will be the largest Global science project of the next two decades. It will encompass a sensor network dedicated to radioastronomy, covering two continents. It will be constructed in remote areas of South Africa and Australia, spreading over 3000 km, in high solar irradiance latitudes. Solar Power supply is therefore an option to power supply the SKA and contribute to a zero carbon footprint next generation telescope. Here we outline the major characteristics of the SKA and some innovation approaches on thermal solar energy Integration with SKA prototypes.

Submitted 17 October, 2012; originally announced October 2012.

Comments: 4 pages; 3 figures. Paper accepted for presentation at Proceedings of the 2nd International Workshop on Integration on Solar Power into Power Systems; 12th-13th November, 2012, Lisbon, Portugal

arXiv:1109.5936 [pdf, ps, other]

doi 10.1088/0953-8984/24/16/165501

Boron Nitride Nanotubes as Templates for Half-Metal Nanowires

Authors: Ronaldo J. C. Batista, Alan B. de Oliveira, Natália R. Pereira, Rafael S. Paolini, Taíse M. Manhabosco

Abstract: We investigate by means of DFT/GGA+U calculations the electronic and structural properties of magnetic nanotubes composed of an iron oxide monolayer and (n,0) Boron Nitride (BN) nanotubes, with n ranging from 6 up to 14. The formation energy per FeO molecule of FeO covered tubes is smaller than the formation energy of small FeO nanoparticles, which suggests that the FeO molecules may cover the BN nanotubes rather than aggregate to form the FeO bulk. We propose a continuous model for the FeO covered BN nanotubes formation energy which predicts that BN tubes with diameter of roughly 13 Å are the most stable. Unlike carbon nanotubes, the band structure of FeO covered BN nanotubes can not be obtained by slicing the band structure of a FeO layer; the curvature and the interaction with the BN tube are determinant for the electronic behavior of FeO covered tubes. As a result the tubes are semiconductors, intrinsic half-metals or semi-half-metals that can become half-metals charged with either electrons or holes. Such a result may be important in the spintronics context.

Submitted 27 September, 2011; originally announced September 2011.

Comments: 6 pages, 6 figures

arXiv:1109.3967 [pdf, ps, other]

doi 10.1051/0004-6361/201117482

Intracluster light in clusters of galaxies at redshifts 0.4<z<0.8

Authors: L. Guennou, C. Adami, C. Da Rocha, F. Durret, M. P. Ulmer, S. Allam, S. Basa, C. Benoist, A. Biviano, D. Clowe, R. Gavazzi, C. Halliday, O. Ilbert, D. Johnston, D. Just, R. Kron, J. M. Kubo, V. LeBrun, P. Marshall, A. Mazure, K. J. Murphy, D. N. E. Pereira, C. R. Rabaca, F. Rostagni, G. Rudnick, et al. (5 additional authors not shown)

Abstract: The study of intracluster light can help us to understand the mechanisms taking place in galaxy clusters, and to place constraints on the cluster formation history and physical properties. However, owing to the intrinsic faintness of ICL emission, most searches and detailed studies of ICL have been limited to redshifts z<0.4. We search for ICL in a subsample of ten clusters detected by the ESO Distant Cluster Survey (EDisCS), at redshifts 0.4<z<0.8, that are also part of our DAFT/FADA Survey. We analyze the ICL by applying the OV WAV package, a wavelet-based technique, to deep HST ACS images in the F814W filter and to V-band VLT/FORS2 images of three clusters. Detection levels are assessed as a function of the diffuse light source surface brightness using simulations. In the F814W filter images, we detect diffuse light sources in all the clusters, with typical sizes of a few tens of kpc (assuming that they are at the cluster redshifts). The ICL detected by stacking the ten F814W images shows an 8 sigma detection in the source center extending over a ~50x50 kpc^2 area, with a total absolute magnitude of -21.6 in the F814W filter, equivalent to about two L* galaxies per cluster. We find a weak correlation between the total F814W absolute magnitude of the ICL and the cluster velocity dispersion and mass. There is no apparent correlation between the cluster mass-to-light ratio (M/L) and the amount of ICL, and no evidence for any preferential orientation in the ICL source distribution. We find no strong variation in the amount of ICL between z=0 and z=0.8.
In addition, we find wavelet-detected compact objects (WDCOs) in the three clusters for which data in two bands are available; these objects are probably very faint compact galaxies that in some cases are members of the respective clusters. We have shown that ICL is important in clusters at least up to z=0.8.

Submitted 19 September, 2011; originally announced September 2011.

Comments: Accepted in A&A. Six figures in jpg format. Paper still to be improved by A&A english corrector

arXiv:astro-ph/0210576 [pdf, ps, other]

doi 10.1007/10857603_31

Optical Diffuse Light in Nearby Compact Groups

Authors: C. Mendes de Oliveira, C. Da Rocha, C. R. Rabaça, D. N. E. Pereira, M. Bolte

Abstract: Analyses of B and R band observations of four compact groups reveal the presence of a considerable amount of diffuse, intergalactic light in two of them (HCG 79 and HCG 95). The diffuse component is presumably due to stellar material that has been tidally stripped from the galaxy group members. A new approach is used to measure this diffuse background light, using wavelet techniques for detecting low surface brightness signals. The diffuse light component has a mean colour of $(B-R)$ = 1.4 - 1.5$\pm$0.1 and it comprises the following fractions of the total group light in the B band: 18%, 12%, 3% and 0% for groups HCG 95, HCG 79, HCG 92 and HCG 88, respectively. The diffuse light content of a group may represent an efficient tool for the determination of how long groups have been together in a compact configuration.

Submitted 12 November, 2002; v1 submitted 25 October, 2002; originally announced October 2002.

Comments: 4 pages, 2 figures. To appear in the proceedings of the ESO workshop "Extragalactic Globular Cluster Systems", August 2002, Garching, ed. M. Kissler-Patig, Springer-Verlag

arXiv:cs/9809110 [pdf, ps, other]

Similarity-Based Models of Word Cooccurrence Probabilities

Authors: Ido Dagan, Lillian Lee, Fernando C. N. Pereira

Abstract: In many applications of natural language processing (NLP) it is necessary to determine the likelihood of a given word combination. For example, a speech recognizer may need to determine which of the two word combinations "eat a peach" and "eat a beach" is more likely. Statistical NLP methods determine the likelihood of a word combination from its frequency in a training corpus. However, the nature of language is such that many word combinations are infrequent and do not occur in any given corpus. In this work we propose a method for estimating the probability of such previously unseen word combinations using available information on "most similar" words. We describe probabilistic word association models based on distributional word similarity, and apply them to two tasks, language modeling and pseudo-word disambiguation. In the language modeling task, a similarity-based model is used to improve probability estimates for unseen bigrams in a back-off language model. The similarity-based method yields a 20% perplexity improvement in the prediction of unseen bigrams and statistically significant reductions in speech-recognition error. We also compare four similarity-based estimation methods against back-off and maximum-likelihood estimation methods on a pseudo-word sense disambiguation task in which we controlled for both unigram and bigram frequency to avoid giving too much weight to easy-to-disambiguate high-frequency configurations. The similarity-based methods perform up to 40% better on this particular task.

Submitted 27 September, 1998; originally announced September 1998.

Comments: 26 pages, 5 figures

ACM Class: I.2.7; I.2.6

Journal ref: Machine Learning, 34, 43-69 (1999)

arXiv:cmp-lg/9706007 [pdf, ps, other]

Aggregate and mixed-order Markov models for statistical language processing

Authors: Lawrence Saul, Fernando Pereira

Abstract: We consider the use of language models whose size and accuracy are intermediate between different order n-gram models. Two types of models are studied in particular. Aggregate Markov models are class-based bigram models in which the mapping from words to classes is probabilistic. Mixed-order Markov models combine bigram models whose predictions are conditioned on different words. Both types of models are trained by Expectation-Maximization (EM) algorithms for maximum likelihood estimation. We examine smoothing procedures in which these models are interposed between different order n-grams. This is found to significantly reduce the perplexity of unseen word combinations.

Submitted 9 June, 1997; originally announced June 1997.

Comments: 9 pages, 4 PostScript figures, uses psfig.sty and aclap.sty; to appear in the proceedings of EMNLP-2

arXiv:cmp-lg/9607016 [pdf, ps, other]

Beyond Word N-Grams

Authors: Fernando C. N. Pereira, Yoram Singer, Naftali Tishby

Abstract: We describe, analyze, and evaluate experimentally a new probabilistic model for word-sequence prediction in natural language based on prediction suffix trees (PSTs). By using efficient data structures, we extend the notion of PST to unbounded vocabularies. We also show how to use a Bayesian approach based on recursive priors over all possible PSTs to efficiently maintain tree mixtures. These mixtures have provably and practically better performance than almost any single model. We evaluate the model on several corpora. The low perplexity achieved by relatively small PST mixture models suggests that they may be an advantageous alternative, both theoretically and practically, to the widely used n-gram models.

Submitted 13 July, 1996; originally announced July 1996.

Comments: 15 pages, one PostScript figure, uses psfig.sty and fullname.sty. Revised version of a paper in the Proceedings of the Third Workshop on Very Large Corpora, MIT, 1995

arXiv:cmp-lg/9603002 [pdf, ps, other]

Finite-State Approximation of Phrase-Structure Grammars

Authors: Fernando C. N. Pereira, Rebecca N. Wright

Abstract: Phrase-structure grammars are effective models for important syntactic and semantic aspects of natural languages, but can be computationally too demanding for use as language models in real-time speech recognition. Therefore, finite-state models are used instead, even though they lack expressive power. To reconcile those two alternatives, we designed an algorithm to compute finite-state approximations of context-free grammars and context-free-equivalent augmented phrase-structure grammars. The approximation is exact for certain context-free grammars generating regular languages, including all left-linear and right-linear context-free grammars. The algorithm has been used to build finite-state language models for limited-domain speech recognition tasks.

Submitted 8 March, 1996; originally announced March 1996.

Comments: 24 pages, uses psfig.sty; revised and extended version of the 1991 ACL meeting paper with the same title

arXiv:cmp-lg/9603001 [pdf, ps, other]

Speech Recognition by Composition of Weighted Finite Automata

Authors: Fernando C. N. Pereira, Michael D. Riley

Abstract: We present a general framework based on weighted finite automata and weighted finite-state transducers for describing and implementing speech recognizers. The framework allows us to represent uniformly the information sources and data structures used in recognition, including context-dependent units, pronunciation dictionaries, language models and lattices. Furthermore, general but efficient algorithms can be used for combining information sources in actual recognizers and for optimizing their application. In particular, a single composition algorithm is used both to combine in advance information sources such as language models and dictionaries, and to combine acoustic observations and information sources dynamically during recognition.

Submitted 7 March, 1996; originally announced March 1996.

Comments: 24 pages, uses psfig.sty

arXiv:cmp-lg/9504029 [pdf, ps, other]

Quantifiers, Anaphora, and Intensionality

Authors: Mary Dalrymple, John Lamping, Fernando Pereira, Vijay Saraswat

Abstract: The relationship between Lexical-Functional Grammar (LFG) functional structures (f-structures) for sentences and their semantic interpretations can be expressed directly in a fragment of linear logic in a way that correctly explains the constrained interactions between quantifier scope ambiguity, bound anaphora and intensionality. This deductive approach to semantic interpretation obviates the need for additional mechanisms, such as Cooper storage, to represent the possible scopes of a quantified NP, and explains the interactions between quantified NPs, anaphora and intensional verbs such as 'seek'. A single specification in linear logic of the argument requirements of intensional verbs is sufficient to derive the correct reading predictions for intensional-verb clauses both with nonquantified and with quantified direct objects. In particular, both de dicto and de re readings are derived for quantified objects. The effects of type-raising or quantifying-in rules in other frameworks here just follow as linear-logic theorems. While our approach resembles current categorial approaches in important ways, it differs from them in allowing the greater type flexibility of categorial semantics while maintaining a precise connection to syntax. As a result, we are able to provide derivations for certain readings of sentences with intensional verbs and complex direct objects that are not derivable in current purely categorial accounts of the syntax-semantics interface.

Submitted 28 April, 1995; v1 submitted 28 April, 1995; originally announced April 1995.

Comments: 41 pages, uses lingmacros.sty, fullname.sty, lfgmacros.tex, tree-dvips.sty, tree-dvips.pro and attached macros. Extends and revises cmp-lg/9404009 and cmp-lg/9404010

arXiv:cmp-lg/9503008 [pdf, ps, other]

Ellipsis and Higher-Order Unification

Authors: Mary Dalrymple, Stuart M. Shieber, Fernando C. N. Pereira

Abstract: We present a new method for characterizing the interpretive possibilities generated by elliptical constructions in natural language. Unlike previous analyses, which postulate ambiguity of interpretation or derivation in the full clause source of the ellipsis, our analysis requires no such hidden ambiguity. Further, the analysis follows relatively directly from an abstract statement of the ellipsis interpretation problem. It predicts correctly a wide range of interactions between ellipsis and other semantic phenomena such as quantifier scope and bound anaphora. Finally, although the analysis itself is stated nonprocedurally, it admits of a direct computational method for generating interpretations.

Submitted 8 March, 1995; originally announced March 1995.

Comments: 54 pages

Report number: CSLI-19-91 and Xerox SSL-91-105

Journal ref: Linguistics and Philosophy 14(4):399-452

arXiv:cmp-lg/9408011 [pdf, ps, other]

Distributional Clustering of English Words

Authors: Fernando Pereira, Naftali Tishby, Lillian Lee

Abstract: We describe and experimentally evaluate a method for automatically clustering words according to their distribution in particular syntactic contexts. Deterministic annealing is used to find lowest distortion sets of clusters. As the annealing parameter increases, existing clusters become unstable and subdivide, yielding a hierarchical "soft" clustering of the data. Clusters are used as the basis for class models of word cooccurrence, and the models are evaluated with respect to held-out test data.

Submitted 22 August, 1994; originally announced August 1994.

Comments: 8 pages, appeared in the proceedings of ACL-93, Columbus, Ohio

arXiv:cmp-lg/9405001 [pdf, ps, other]

Similarity-Based Estimation of Word Cooccurrence Probabilities

Authors: Ido Dagan, Fernando Pereira, Lillian Lee

Abstract: In many applications of natural language processing it is necessary to determine the likelihood of a given word combination. For example, a speech recognizer may need to determine which of the two word combinations "eat a peach" and "eat a beach" is more likely. Statistical NLP methods determine the likelihood of a word combination according to its frequency in a training corpus. However, the nature of language is such that many word combinations are infrequent and do not occur in a given corpus. In this work we propose a method for estimating the probability of such previously unseen word combinations using available information on "most similar" words. We describe a probabilistic word association model based on distributional word similarity, and apply it to improving probability estimates for unseen word bigrams in a variant of Katz's back-off model. The similarity-based method yields a 20% perplexity improvement in the prediction of unseen bigrams and statistically significant reductions in speech-recognition error.

Submitted 2 May, 1994; originally announced May 1994.

Comments: 13 pages, to appear in proceedings of ACL-94

arXiv:cmp-lg/9404010 [pdf, ps, other]

Intensional Verbs Without Type-Raising or Lexical Ambiguity

Authors: Mary Dalrymple, John Lamping, Fernando Pereira, Vijay Saraswat

Abstract: We present an analysis of the semantic interpretation of intensional verbs such as seek that allows them to take direct objects of either individual or quantifier type, producing both de dicto and de re readings in the quantifier case, all without needing to stipulate type-raising or quantifying-in rules. This simple account follows directly from our use of logical deduction in linear logic to express the relationship between syntactic structures and meanings. While our analysis resembles current categorial approaches in important ways, it differs from them in allowing the greater type flexibility of categorial semantics while maintaining a precise connection to syntax. As a result, we are able to provide derivations for certain readings of sentences with intensional verbs and complex direct objects that are not derivable in current purely categorial accounts of the syntax-semantics interface. The analysis forms a part of our ongoing work on semantic interpretation within the framework of Lexical-Functional Grammar.

Submitted 22 August, 1994; v1 submitted 27 April, 1994; originally announced April 1994.

Comments: 16 pages, revised and extended, to appear in the proceedings of the Conference on Information-Oriented Approaches to Logic, Language and Computation

Report number: ISTL-NLTT-1994-02-01

arXiv:cmp-lg/9404009 [pdf, ps, other]

A Deductive Account of Quantification in LFG

Authors: Mary Dalrymple, John Lamping, Fernando Pereira, Vijay Saraswat

Abstract: The relationship between Lexical-Functional Grammar (LFG) functional structures (f-structures) for sentences and their semantic interpretations can be expressed directly in a fragment of linear logic in a way that explains correctly the constrained interactions between quantifier scope ambiguity and bound anaphora. The use of a deductive framework to account for the compositional properties of quantifying expressions in natural language obviates the need for additional mechanisms, such as Cooper storage, to represent the different scopes that a quantifier might take. Instead, the semantic contribution of a quantifier is recorded as an ordinary logical formula, one whose use in a proof will establish the scope of the quantifier. The properties of linear logic ensure that each quantifier is scoped exactly once. Our analysis of quantifier scope can be seen as a recasting of Pereira's analysis (Pereira, 1991), which was expressed in higher-order intuitionistic logic. But our use of LFG and linear logic provides a much more direct and computationally more flexible interpretation mechanism for at least the same range of phenomena. We have developed a preliminary Prolog implementation of the linear deductions described in this work.

Submitted 27 May, 1994; v1 submitted 27 April, 1994; originally announced April 1994.

Comments: 27 pages, extensively revised

Report number: ISTL-NLTT-1993-06-01

arXiv:cmp-lg/9404008 [pdf, ps, other]

Principles and Implementation of Deductive Parsing

Authors: Stuart M. Shieber, Yves Schabes, Fernando C. N. Pereira

Abstract: We present a system for generating parsers based directly on the metaphor of parsing as deduction. Parsing algorithms can be represented directly as deduction systems, and a single deduction engine can interpret such deduction systems so as to implement the corresponding parser. The method generalizes easily to parsers for augmented phrase structure formalisms, such as definite-clause grammars and other logic grammar formalisms, and has been used for rapid prototyping of parsing algorithms for a variety of formalisms including variants of tree-adjoining grammars, categorial grammars, and lexicalized context-free grammars.

Submitted 26 April, 1994; originally announced April 1994.

Comments: 69 pages, includes full Prolog code

Report number: CRCT TR-11-94 (Computer Science Department, Harvard University)

Showing 1–42 of 42 results for author: Pereira, N