Showing 1–50 of 95 results for author: Pino, J

  1. arXiv:2501.18882  [pdf, other]

    cond-mat.mes-hall physics.optics quant-ph

    Programmable Synthetic Magnetism and Chiral Edge States in Nano-Optomechanical Quantum Hall Networks

    Authors: Jesse J. Slim, Javier del Pino, Ewold Verhagen

    Abstract: Artificial magnetic fields break time-reversal symmetry in engineered materials--also known as metamaterials, enabling robust, topological transport of neutral excitations, much like electronic conduction edge channels in the integer quantum Hall effect. We experimentally demonstrate the emergence of quantum-Hall-like chiral edge states in optomechanical resonator networks. Synthetic magnetic fiel…

    Submitted 30 January, 2025; originally announced January 2025.

    Comments: Main text (6 pages, 4 figures), Appendices (7 pages, 3 figures)

  2. arXiv:2410.00215  [pdf, other]

    cs.LG

    Characterizing and Efficiently Accelerating Multimodal Generation Model Inference

    Authors: Yejin Lee, Anna Sun, Basil Hosmer, Bilge Acun, Can Balioglu, Changhan Wang, Charles David Hernandez, Christian Puhrsch, Daniel Haziza, Driss Guessous, Francisco Massa, Jacob Kahn, Jeffrey Wan, Jeremy Reizenstein, Jiaqi Zhai, Joe Isaacson, Joel Schlosser, Juan Pino, Kaushik Ram Sadagopan, Leonid Shamis, Linjian Ma, Min-Jae Hwang, Mingda Chen, Mostafa Elhoushi, Pedro Rodriguez , et al. (5 additional authors not shown)

    Abstract: Generative artificial intelligence (AI) technology is revolutionizing the computing industry. Not only have its applications broadened to various sectors, but it also poses new system design and optimization opportunities. The technology is capable of understanding and responding in multiple modalities. However, the advanced capability currently comes with significant system resource demands. To susta…

    Submitted 9 May, 2025; v1 submitted 30 September, 2024; originally announced October 2024.

    Comments: 13 pages including references. 8 Figures. Under review to HPCA 2025 Industry Track

  3. arXiv:2409.15138  [pdf, other]

    cond-mat.mes-hall physics.class-ph physics.optics

    Fluctuation instabilities via internal resonance in a multimode membrane as a mechanism for frequency combs

    Authors: Mengqi Fu, Orjan Ameye, Fan Yang, Jan Košata, Javier del Pino, Oded Zilberberg, Elke Scheer

    Abstract: We explore self-induced parametric coupling, also called internal resonances (IRs), in a membrane nanoelectromechanical system. Specifically, we focus on the formation of a limit cycle manifesting as a phononic frequency comb. Utilizing a pump-noisy-probe technique and theoretical modeling, we reveal the behavior of mechanical excitations revealing themselves as sidebands of the stationary IR resp…

    Submitted 23 September, 2024; originally announced September 2024.

  4. arXiv:2408.15794  [pdf, other]

    cond-mat.mes-hall

    Slow and fast topological dynamical phase transitions in a Duffing resonator driven by two detuned tones

    Authors: Letizia Catalini, Javier del Pino, Soumya S. Kumar, Vincent Dumont, Gabriel Margiani, Oded Zilberberg, Alexander Eichler

    Abstract: The combination of a strong pump and a weak probe has been widely applied to investigate both optical and nanomechanical devices. Such pump-probe measurements allow for the exploration of nonlinear dynamics, driven by the large pump tone, by measuring the system response to a probe tone. In contrast, here we report on the dynamics of a mechanical Duffing resonator driven with a combination of two…

    Submitted 17 December, 2024; v1 submitted 28 August, 2024; originally announced August 2024.

    Comments: 9 pages, 6 figures

  5. arXiv:2406.16591  [pdf, other]

    cond-mat.mes-hall

    Topological classification of driven-dissipative nonlinear systems

    Authors: Greta Villa, Javier del Pino, Vincent Dumont, Gianluca Rastelli, Mateusz Michałek, Alexander Eichler, Oded Zilberberg

    Abstract: In topology, one averages over local geometrical details to reveal robust global features. This approach proves crucial for understanding quantized bulk transport and exotic boundary effects of linear wave propagation in (meta-)materials. Moving beyond linear Hamiltonian systems, the study of topology in physics strives to characterize open (non-Hermitian) and interacting systems. Here, we establi…

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: 8 pages, 4 figures (G.V. and J.d.P. contributed equally to this work)

  6. The computational power of random quantum circuits in arbitrary geometries

    Authors: Matthew DeCross, Reza Haghshenas, Minzhao Liu, Enrico Rinaldi, Johnnie Gray, Yuri Alexeev, Charles H. Baldwin, John P. Bartolotta, Matthew Bohn, Eli Chertkov, Julia Cline, Jonhas Colina, Davide DelVento, Joan M. Dreiling, Cameron Foltz, John P. Gaebler, Thomas M. Gatterman, Christopher N. Gilbreth, Joshua Giles, Dan Gresh, Alex Hall, Aaron Hankin, Azure Hansen, Nathan Hewitt, Ian Hoffman , et al. (27 additional authors not shown)

    Abstract: Empirical evidence for a gap between the computational powers of classical and quantum computers has been provided by experiments that sample the output distributions of two-dimensional quantum circuits. Many attempts to close this gap have utilized classical simulations based on tensor network techniques, and their limitations shed light on the improvements to quantum hardware required to frustra…

    Submitted 21 June, 2024; v1 submitted 4 June, 2024; originally announced June 2024.

    Comments: Includes minor updates to the text and an updated author list to include researchers who made technical contributions in upgrading the machine to 56 qubits but were left off the original version by mistake

    Journal ref: Physical Review X 15, 021052 (2025)

  7. Residual-based Attention Physics-informed Neural Networks for Spatio-Temporal Ageing Assessment of Transformers Operated in Renewable Power Plants

    Authors: Ibai Ramirez, Joel Pino, David Pardo, Mikel Sanz, Luis del Rio, Alvaro Ortiz, Kateryna Morozovska, Jose I. Aizpurua

    Abstract: Transformers are crucial for reliable and efficient power system operations, particularly in supporting the integration of renewable energy. Effective monitoring of transformer health is critical to maintain grid stability and performance. Thermal insulation ageing is a key transformer failure mode, which is generally tracked by monitoring the hotspot temperature (HST). However, HST measurement is…

    Submitted 3 October, 2024; v1 submitted 10 May, 2024; originally announced May 2024.

    Comments: 23 pages, 18 figures

  8. arXiv:2404.16728  [pdf, other]

    quant-ph

    High-fidelity and Fault-tolerant Teleportation of a Logical Qubit using Transversal Gates and Lattice Surgery on a Trapped-ion Quantum Computer

    Authors: C. Ryan-Anderson, N. C. Brown, C. H. Baldwin, J. M. Dreiling, C. Foltz, J. P. Gaebler, T. M. Gatterman, N. Hewitt, C. Holliman, C. V. Horst, J. Johansen, D. Lucchetti, T. Mengle, M. Matheny, Y. Matsuoka, K. Mayer, M. Mills, S. A. Moses, B. Neyenhuis, J. Pino, P. Siegfried, R. P. Stutz, J. Walker, D. Hayes

    Abstract: Quantum state teleportation is commonly used in designs for large-scale fault-tolerant quantum computers. Using Quantinuum's H2 trapped-ion quantum processor, we implement the first demonstration of a fault-tolerant state teleportation circuit for a quantum error correction code - in particular, the planar topological [[7,1,3]] color code, or Steane code. The circuits use up to 30 trapped ions at…

    Submitted 25 April, 2024; originally announced April 2024.

  9. arXiv:2404.08616  [pdf, other]

    quant-ph

    Benchmarking logical three-qubit quantum Fourier transform encoded in the Steane code on a trapped-ion quantum computer

    Authors: Karl Mayer, Ciarán Ryan-Anderson, Natalie Brown, Elijah Durso-Sabina, Charles H. Baldwin, David Hayes, Joan M. Dreiling, Cameron Foltz, John P. Gaebler, Thomas M. Gatterman, Justin A. Gerber, Kevin Gilmore, Dan Gresh, Nathan Hewitt, Chandler V. Horst, Jacob Johansen, Tanner Mengle, Michael Mills, Steven A. Moses, Peter E. Siegfried, Brian Neyenhuis, Juan Pino, Russell Stutz

    Abstract: We implement logically encoded three-qubit circuits for the quantum Fourier transform (QFT), using the [[7,1,3]] Steane code, and benchmark the circuits on the Quantinuum H2-1 trapped-ion quantum computer. The circuits require multiple logical two-qubit gates, which are implemented transversally, as well as logical non-Clifford single-qubit rotations, which are performed by non-fault-tolerant stat…

    Submitted 12 April, 2024; originally announced April 2024.

  10. arXiv:2404.02280  [pdf, other]

    quant-ph

    Demonstration of logical qubits and repeated error correction with better-than-physical error rates

    Authors: A. Paetznick, M. P. da Silva, C. Ryan-Anderson, J. M. Bello-Rivas, J. P. Campora III, A. Chernoguzov, J. M. Dreiling, C. Foltz, F. Frachon, J. P. Gaebler, T. M. Gatterman, L. Grans-Samuelsson, D. Gresh, D. Hayes, N. Hewitt, C. Holliman, C. V. Horst, J. Johansen, D. Lucchetti, Y. Matsuoka, M. Mills, S. A. Moses, B. Neyenhuis, A. Paz, J. Pino , et al. (7 additional authors not shown)

    Abstract: The promise of quantum computers hinges on the ability to scale to large system sizes, e.g., to run quantum computations consisting of more than 100 million operations fault-tolerantly. This in turn requires suppressing errors to levels inversely proportional to the size of the computation. As a step towards this ambitious goal, we present experiments on a trapped-ion QCCD processor where, through…

    Submitted 17 November, 2024; v1 submitted 2 April, 2024; originally announced April 2024.

    Comments: (v1) 13 pages, 8 figures; (v2) Fixed typos, added authors; (v3) Added Carbon details (instead of separate article), improved decoder, got more data, added authors, fixed misinterpreted physical teleportation baseline, added a figure, and fixed typos

  11. arXiv:2403.14402  [pdf, other]

    cs.SD cs.CL eess.AS

    XLAVS-R: Cross-Lingual Audio-Visual Speech Representation Learning for Noise-Robust Speech Perception

    Authors: HyoJung Han, Mohamed Anwar, Juan Pino, Wei-Ning Hsu, Marine Carpuat, Bowen Shi, Changhan Wang

    Abstract: Speech recognition and translation systems perform poorly on noisy inputs, which are frequent in realistic environments. Augmenting these systems with visual signals has the potential to improve robustness to noise. However, audio-visual (AV) data is only available in limited amounts and for fewer languages than audio-only resources. To address this gap, we present XLAVS-R, a cross-lingual audio-v…

    Submitted 12 August, 2024; v1 submitted 21 March, 2024; originally announced March 2024.

    Comments: ACL2024

  12. arXiv:2402.05755  [pdf, other]

    cs.CL cs.SD eess.AS

    Spirit LM: Interleaved Spoken and Written Language Model

    Authors: Tu Anh Nguyen, Benjamin Muller, Bokai Yu, Marta R. Costa-jussa, Maha Elbayad, Sravya Popuri, Christophe Ropers, Paul-Ambroise Duquenne, Robin Algayres, Ruslan Mavlyutov, Itai Gat, Mary Williamson, Gabriel Synnaeve, Juan Pino, Benoit Sagot, Emmanuel Dupoux

    Abstract: We introduce Spirit LM, a foundation multimodal language model that freely mixes text and speech. Our model is based on a 7B pretrained text language model that we extend to the speech modality by continuously training it on text and speech units. Speech and text sequences are concatenated as a single stream of tokens, and trained with a word-level interleaving method using a small automatically-c…

    Submitted 18 October, 2024; v1 submitted 8 February, 2024; originally announced February 2024.

  13. arXiv:2312.05187  [pdf, other]

    cs.CL cs.SD eess.AS

    Seamless: Multilingual Expressive and Streaming Speech Translation

    Authors: Seamless Communication, Loïc Barrault, Yu-An Chung, Mariano Coria Meglioli, David Dale, Ning Dong, Mark Duppenthaler, Paul-Ambroise Duquenne, Brian Ellis, Hady Elsahar, Justin Haaheim, John Hoffman, Min-Jae Hwang, Hirofumi Inaguma, Christopher Klaiber, Ilia Kulikov, Pengwei Li, Daniel Licht, Jean Maillard, Ruslan Mavlyutov, Alice Rakotoarison, Kaushik Ram Sadagopan, Abinesh Ramakrishnan, Tuan Tran, Guillaume Wenzek , et al. (40 additional authors not shown)

    Abstract: Large-scale automatic speech translation systems today lack key features that help machine-mediated communication feel seamless when compared to human-to-human dialogue. In this work, we introduce a family of models that enable end-to-end expressive and multilingual translations in a streaming fashion. First, we contribute an improved version of the massively multilingual and multimodal SeamlessM4…

    Submitted 8 December, 2023; originally announced December 2023.

  14. arXiv:2311.16273  [pdf, other]

    cond-mat.mes-hall physics.ins-det

    Near-resonant nuclear spin detection with high-frequency mechanical resonators

    Authors: Diego A. Visani, Letizia Catalini, Christian L. Degen, Alexander Eichler, Javier del Pino

    Abstract: Mechanical resonators operating in the high-frequency regime have become a versatile platform for fundamental and applied quantum research. Their exceptional properties, such as low mass and high quality factor, make them also very appealing for force sensing experiments. In this Letter, we propose a method for detecting and ultimately controlling nuclear spins by directly coupling them to high-fr…

    Submitted 27 November, 2023; originally announced November 2023.

    Comments: Includes Supplemental Material

  15. Measuring the Loschmidt amplitude for finite-energy properties of the Fermi-Hubbard model on an ion-trap quantum computer

    Authors: Kévin Hémery, Khaldoon Ghanem, Eleanor Crane, Sara L. Campbell, Joan M. Dreiling, Caroline Figgatt, Cameron Foltz, John P. Gaebler, Jacob Johansen, Michael Mills, Steven A. Moses, Juan M. Pino, Anthony Ransford, Mary Rowe, Peter Siegfried, Russell P. Stutz, Henrik Dreyer, Alexander Schuckert, Ramil Nigmatullin

    Abstract: Calculating the equilibrium properties of condensed matter systems is one of the promising applications of near-term quantum computing. Recently, hybrid quantum-classical time-series algorithms have been proposed to efficiently extract these properties from a measurement of the Loschmidt amplitude $\langle ψ| e^{-i \hat H t}|ψ\rangle$ from initial states $|ψ\rangle$ and a time evolution under the…

    Submitted 22 September, 2023; v1 submitted 19 September, 2023; originally announced September 2023.

    Comments: 18 pages, 12 figures

    Journal ref: PRX Quantum 5, 030323 (2023)

  16. arXiv:2309.05825  [pdf, other]

    quant-ph cond-mat.mes-hall

    Optomechanical realization of the bosonic Kitaev-Majorana chain

    Authors: Jesse J. Slim, Clara C. Wanjura, Matteo Brunelli, Javier del Pino, Andreas Nunnenkamp, Ewold Verhagen

    Abstract: The fermionic Kitaev chain is a canonical model featuring topological Majorana zero modes. We report the experimental realization of its bosonic analogue in a nano-optomechanical network where parametric interactions induce two-mode squeezing and beamsplitter coupling among the nanomechanical modes, equivalent to hopping and superconductor pairing in the fermionic case, respectively. We observe se…

    Submitted 11 September, 2023; originally announced September 2023.

    Comments: 21 pages, 5 figures

    Journal ref: Nature 627, 767 (2024)

  17. arXiv:2308.11596  [pdf, other]

    cs.CL

    SeamlessM4T: Massively Multilingual & Multimodal Machine Translation

    Authors: Seamless Communication, Loïc Barrault, Yu-An Chung, Mariano Cora Meglioli, David Dale, Ning Dong, Paul-Ambroise Duquenne, Hady Elsahar, Hongyu Gong, Kevin Heffernan, John Hoffman, Christopher Klaiber, Pengwei Li, Daniel Licht, Jean Maillard, Alice Rakotoarison, Kaushik Ram Sadagopan, Guillaume Wenzek, Ethan Ye, Bapi Akula, Peng-Jen Chen, Naji El Hachem, Brian Ellis, Gabriel Mejia Gonzalez, Justin Haaheim , et al. (43 additional authors not shown)

    Abstract: What does it take to create the Babel Fish, a tool that can help individuals translate speech between any two languages? While recent breakthroughs in text-based models have pushed machine translation coverage beyond 200 languages, unified speech-to-speech translation models have yet to achieve similar strides. More specifically, conventional speech-to-speech translation systems rely on cascaded s…

    Submitted 24 October, 2023; v1 submitted 22 August, 2023; originally announced August 2023.

    ACM Class: I.2.7

  18. arXiv:2308.06092  [pdf, other]

    nlin.AO cond-mat.mes-hall cond-mat.quant-gas

    Limit cycles as stationary states of an extended Harmonic Balance ansatz

    Authors: Javier del Pino, Jan Košata, Oded Zilberberg

    Abstract: A limit cycle is a self-sustained periodic motion appearing in autonomous ordinary differential equations. As the period of the limit cycle is a-priori unknown, it is challenging to find them as stationary states of a rotating ansatz. Correspondingly, their study commonly relies on brute-force time-evolution or on circumstantial evidence such as instabilities of fixed points. Alas, such approaches…

    Submitted 11 August, 2023; originally announced August 2023.

    Comments: Includes Supplemental Material

  19. arXiv:2307.13676  [pdf, other]

    physics.class-ph physics.app-ph

    A biased Ising model using two coupled Kerr parametric oscillators with external force

    Authors: Pablo Álvarez, Davide Pittilini, Filippo Miserocchi, Sathyanarayanan Raamamurthy, Gabriel Margiani, Orjan Ameye, Javier del Pino, Oded Zilberberg, Alexander Eichler

    Abstract: Networks of coupled Kerr parametric oscillators (KPOs) are a leading physical platform for analog solving of complex optimization problems. These systems are colloquially known as ``Ising machines''. We experimentally and theoretically study such a network under the influence of an external force. The force breaks the collective phase-parity symmetry of the system and competes with the intrinsic c…

    Submitted 25 July, 2023; originally announced July 2023.

    Comments: 8 pages, 6 figures

  20. arXiv:2307.08655  [pdf, other]

    cs.CL cs.SD eess.AS

    Multilingual Speech-to-Speech Translation into Multiple Target Languages

    Authors: Hongyu Gong, Ning Dong, Sravya Popuri, Vedanuj Goswami, Ann Lee, Juan Pino

    Abstract: Speech-to-speech translation (S2ST) enables spoken communication between people talking in different languages. Despite a few studies on multilingual S2ST, their focus is the multilinguality on the source side, i.e., the translation from multiple source languages to one target language. We present the first work on multilingual S2ST supporting multiple target languages. Leveraging recent advance i…

    Submitted 17 July, 2023; originally announced July 2023.

  21. arXiv:2306.07897  [pdf, ps, other]

    math.AG cond-mat.soft math-ph nlin.PS physics.class-ph

    Khovanskii bases for semimixed systems of polynomial equations -- a case of approximating stationary nonlinear Newtonian dynamics

    Authors: Viktoriia Borovik, Paul Breiding, Javier del Pino, Mateusz Michałek, Oded Zilberberg

    Abstract: We provide an approach to counting roots of polynomial systems, where each polynomial is a general linear combination of prescribed, fixed polynomials. Our tools rely on the theory of Khovanskii bases, combined with toric geometry, the Bernstein-Khovanskii-Kushnirenko (BKK) Theorem, and fiber products. As a direct application of this theory, we solve the problem of counting the number of approxi…

    Submitted 13 June, 2023; originally announced June 2023.

  22. arXiv:2306.01084  [pdf, other]

    cs.SD eess.AS

    Exploration on HuBERT with Multiple Resolutions

    Authors: Jiatong Shi, Yun Tang, Hirofumi Inaguma, Hongyu Gong, Juan Pino, Shinji Watanabe

    Abstract: Hidden-unit BERT (HuBERT) is a widely-used self-supervised learning (SSL) model in speech processing. However, we argue that its fixed 20ms resolution for hidden representations would not be optimal for various speech-processing tasks since their attributes (e.g., speaker characteristics and semantics) are based on different time scales. To address this limitation, we propose utilizing HuBERT repr…

    Submitted 22 June, 2023; v1 submitted 1 June, 2023; originally announced June 2023.

    Comments: Accepted to Interspeech2023

  23. A Race Track Trapped-Ion Quantum Processor

    Authors: S. A. Moses, C. H. Baldwin, M. S. Allman, R. Ancona, L. Ascarrunz, C. Barnes, J. Bartolotta, B. Bjork, P. Blanchard, M. Bohn, J. G. Bohnet, N. C. Brown, N. Q. Burdick, W. C. Burton, S. L. Campbell, J. P. Campora III, C. Carron, J. Chambers, J. W. Chan, Y. H. Chen, A. Chernoguzov, E. Chertkov, J. Colina, J. P. Curtis, R. Daniel , et al. (71 additional authors not shown)

    Abstract: We describe and benchmark a new quantum charge-coupled device (QCCD) trapped-ion quantum computer based on a linear trap with periodic boundary conditions, which resembles a race track. The new system successfully incorporates several technologies crucial to future scalability, including electrode broadcasting, multi-layer RF routing, and magneto-optical trap (MOT) loading, while maintaining, and…

    Submitted 16 May, 2023; v1 submitted 5 May, 2023; originally announced May 2023.

    Comments: 24 pages, 24 figures. Made some minor edits and added several more authors

    Journal ref: Phys. Rev. X 13, 041052 (2023)

  24. arXiv:2305.03766  [pdf, other]

    quant-ph cond-mat.str-el

    Non-Abelian Topological Order and Anyons on a Trapped-Ion Processor

    Authors: Mohsin Iqbal, Nathanan Tantivasadakarn, Ruben Verresen, Sara L. Campbell, Joan M. Dreiling, Caroline Figgatt, John P. Gaebler, Jacob Johansen, Michael Mills, Steven A. Moses, Juan M. Pino, Anthony Ransford, Mary Rowe, Peter Siegfried, Russell P. Stutz, Michael Foss-Feig, Ashvin Vishwanath, Henrik Dreyer

    Abstract: Non-Abelian topological order (TO) is a coveted state of matter with remarkable properties, including quasiparticles that can remember the sequence in which they are exchanged. These anyonic excitations are promising building blocks of fault-tolerant quantum computers. However, despite extensive efforts, non-Abelian TO and its excitations have remained elusive, unlike the simpler quasiparticles or…

    Submitted 14 February, 2024; v1 submitted 5 May, 2023; originally announced May 2023.

    Comments: 6 + 20 pages, 6 + 5 figures, 3 tables v2: Changed title, added a reference

    Journal ref: Nature 626 (2024) 505-511

  25. arXiv:2305.03101  [pdf, other]

    cs.CL cs.SD eess.AS

    Hybrid Transducer and Attention based Encoder-Decoder Modeling for Speech-to-Text Tasks

    Authors: Yun Tang, Anna Y. Sun, Hirofumi Inaguma, Xinyue Chen, Ning Dong, Xutai Ma, Paden D. Tomasello, Juan Pino

    Abstract: Transducer and Attention based Encoder-Decoder (AED) are two widely used frameworks for speech-to-text tasks. They are designed for different purposes and each has its own benefits and drawbacks for speech-to-text tasks. In order to leverage strengths of both modeling methods, we propose a solution by combining Transducer and Attention based Encoder-Decoder (TAED) for speech-to-text tasks. The new…

    Submitted 4 May, 2023; originally announced May 2023.

    Comments: ACL 2023 main conference

  26. arXiv:2304.04618  [pdf, other]

    cs.SD cs.CL eess.AS

    Enhancing Speech-to-Speech Translation with Multiple TTS Targets

    Authors: Jiatong Shi, Yun Tang, Ann Lee, Hirofumi Inaguma, Changhan Wang, Juan Pino, Shinji Watanabe

    Abstract: It has been known that direct speech-to-speech translation (S2ST) models usually suffer from the data scarcity issue because of the limited existing parallel materials for both source and target speech. Therefore to train a direct S2ST system, previous works usually utilize text-to-speech (TTS) systems to generate samples in the target language by augmenting the data from speech-to-text translatio…

    Submitted 10 April, 2023; originally announced April 2023.

  27. arXiv:2304.04596  [pdf, other]

    cs.SD cs.CL eess.AS

    ESPnet-ST-v2: Multipurpose Spoken Language Translation Toolkit

    Authors: Brian Yan, Jiatong Shi, Yun Tang, Hirofumi Inaguma, Yifan Peng, Siddharth Dalmia, Peter Polák, Patrick Fernandes, Dan Berrebbi, Tomoki Hayashi, Xiaohui Zhang, Zhaoheng Ni, Moto Hira, Soumi Maiti, Juan Pino, Shinji Watanabe

    Abstract: ESPnet-ST-v2 is a revamp of the open-source ESPnet-ST toolkit necessitated by the broadening interests of the spoken language translation community. ESPnet-ST-v2 supports 1) offline speech-to-text translation (ST), 2) simultaneous speech-to-text translation (SST), and 3) offline speech-to-speech translation (S2ST) -- each task is supported with a wide variety of approaches, differentiating ESPnet-…

    Submitted 6 July, 2023; v1 submitted 10 April, 2023; originally announced April 2023.

    Comments: ACL 2023; System Demonstration

  28. arXiv:2303.00628  [pdf, ps, other]

    cs.CL eess.AS

    MuAViC: A Multilingual Audio-Visual Corpus for Robust Speech Recognition and Robust Speech-to-Text Translation

    Authors: Mohamed Anwar, Bowen Shi, Vedanuj Goswami, Wei-Ning Hsu, Juan Pino, Changhan Wang

    Abstract: We introduce MuAViC, a multilingual audio-visual corpus for robust speech recognition and robust speech-to-text translation providing 1200 hours of audio-visual speech in 9 languages. It is fully transcribed and covers 6 English-to-X translation as well as 6 X-to-English translation directions. To the best of our knowledge, this is the first open benchmark for audio-visual speech-to-text translati…

    Submitted 7 March, 2023; v1 submitted 1 March, 2023; originally announced March 2023.

  29. arXiv:2301.11716  [pdf, other]

    cs.CL cs.LG cs.SD eess.AS

    Pre-training for Speech Translation: CTC Meets Optimal Transport

    Authors: Phuong-Hang Le, Hongyu Gong, Changhan Wang, Juan Pino, Benjamin Lecouteux, Didier Schwab

    Abstract: The gap between speech and text modalities is a major challenge in speech-to-text translation (ST). Different methods have been proposed to reduce this gap, but most of them require architectural changes in ST training. In this work, we propose to mitigate this issue at the pre-training stage, requiring no change in the ST model. First, we show that the connectionist temporal classification (CTC)…

    Submitted 5 June, 2023; v1 submitted 27 January, 2023; originally announced January 2023.

    Comments: ICML 2023 (oral presentation). This version fixed URLs, updated affiliations & acknowledgements, and improved formatting

  30. arXiv:2212.08055  [pdf, other]

    cs.CL cs.SD eess.AS

    UnitY: Two-pass Direct Speech-to-speech Translation with Discrete Units

    Authors: Hirofumi Inaguma, Sravya Popuri, Ilia Kulikov, Peng-Jen Chen, Changhan Wang, Yu-An Chung, Yun Tang, Ann Lee, Shinji Watanabe, Juan Pino

    Abstract: Direct speech-to-speech translation (S2ST), in which all components can be optimized jointly, is advantageous over cascaded approaches to achieve fast inference with a simplified pipeline. We present a novel two-pass direct S2ST architecture, UnitY, which first generates textual representations and predicts discrete acoustic units subsequently. We enhance the model performance by subword predictio…

    Submitted 26 May, 2023; v1 submitted 15 December, 2022; originally announced December 2022.

    Comments: ACL 2023 (main conference)

  31. arXiv:2211.12119  [pdf, other]

    quant-ph cond-mat.mes-hall

    Dynamical gauge fields with bosonic codes

    Authors: Javier del Pino, Oded Zilberberg

    Abstract: The quantum simulation of dynamical gauge field theories offers the opportunity to study complex high-energy physics with controllable low-energy devices. For quantum computation, bosonic codes promise robust error correction that exploits multi-particle redundancy in bosons. Here, we demonstrate how bosonic codes can be used to simulate dynamical gauge fields. We encode both matter and dynamical…

    Submitted 7 February, 2023; v1 submitted 22 November, 2022; originally announced November 2022.

    Comments: Revised text, figures, and Supplemental Material (included)

  32. arXiv:2211.06474  [pdf, other]

    cs.CL cs.SD eess.AS

    Speech-to-Speech Translation For A Real-world Unwritten Language

    Authors: Peng-Jen Chen, Kevin Tran, Yilin Yang, Jingfei Du, Justine Kao, Yu-An Chung, Paden Tomasello, Paul-Ambroise Duquenne, Holger Schwenk, Hongyu Gong, Hirofumi Inaguma, Sravya Popuri, Changhan Wang, Juan Pino, Wei-Ning Hsu, Ann Lee

    Abstract: We study speech-to-speech translation (S2ST) that translates speech from one language into another language and focuses on building systems to support languages without standard text writing systems. We use English-Taiwanese Hokkien as a case study, and present an end-to-end solution from training data collection, modeling choices to benchmark dataset release. First, we present efforts on creating…

    Submitted 11 November, 2022; originally announced November 2022.

  33. arXiv:2211.04508  [pdf, other]

    cs.CL cs.SD eess.AS

    SpeechMatrix: A Large-Scale Mined Corpus of Multilingual Speech-to-Speech Translations

    Authors: Paul-Ambroise Duquenne, Hongyu Gong, Ning Dong, Jingfei Du, Ann Lee, Vedanuj Goswani, Changhan Wang, Juan Pino, Benoît Sagot, Holger Schwenk

    Abstract: We present SpeechMatrix, a large-scale multilingual corpus of speech-to-speech translations mined from real speech of European Parliament recordings. It contains speech alignments in 136 language pairs with a total of 418 thousand hours of speech. To evaluate the quality of this parallel speech, we train bilingual speech-to-speech translation models on mined data only and establish extensive basel…

    Submitted 8 November, 2022; originally announced November 2022.

    Comments: 18 pages

  34. arXiv:2210.14731  [pdf, other]

    cond-mat.mes-hall cond-mat.stat-mech

    Deterministic and stochastic sampling of two coupled Kerr parametric oscillators

    Authors: Gabriel Margiani, Javier del Pino, Toni L. Heugel, Nicholas E. Bousse, Sebastián Guerrero, Thomas W. Kenny, Oded Zilberberg, Deividas Sabonis, Alexander Eichler

    Abstract: The vision of building computational hardware for problem optimization has spurred large efforts in the physics community. In particular, networks of Kerr parametric oscillators (KPOs) are envisioned as simulators for finding the ground states of Ising Hamiltonians. It was shown, however, that KPO networks can feature large numbers of unexpected solutions that are difficult to sample with the exis…

    Submitted 3 March, 2023; v1 submitted 26 October, 2022; originally announced October 2022.

    Journal ref: Phys. Rev. Research 5, L012029 (2023)

  35. arXiv:2210.10191  [pdf, other]

    cs.CL cs.SD eess.AS

    Simple and Effective Unsupervised Speech Translation

    Authors: Changhan Wang, Hirofumi Inaguma, Peng-Jen Chen, Ilia Kulikov, Yun Tang, Wei-Ning Hsu, Michael Auli, Juan Pino

    Abstract: The amount of labeled data to train models for speech tasks is limited for most languages; however, the data scarcity is exacerbated for speech translation, which requires labeled data covering two different languages. To address this issue, we study a simple and effective approach to build speech translation systems without labeled data by leveraging recent advances in unsupervised speech recognit…

    Submitted 18 October, 2022; originally announced October 2022.

  36. arXiv:2207.08523  [pdf, other]

    cond-mat.mes-hall physics.optics quant-ph

    Quadrature nonreciprocity: unidirectional bosonic transmission without breaking time-reversal symmetry

    Authors: Clara C. Wanjura, Jesse J. Slim, Javier del Pino, Matteo Brunelli, Ewold Verhagen, Andreas Nunnenkamp

    Abstract: Nonreciprocity means that the transmission of a signal depends on its direction of propagation. Despite vastly different platforms and underlying working principles, the realisations of nonreciprocal transport in linear, time-independent systems rely on Aharonov-Bohm interference among several pathways and require breaking time-reversal symmetry. Here we extend the notion of nonreciprocity to unid…

    Submitted 17 April, 2023; v1 submitted 18 July, 2022; originally announced July 2022.

    Comments: Includes: Main Text (7 pages, 4 figures), Methods & References (5 pages, 1 figure), Supplementary Information (14 pages, 2 figures)

    Journal ref: Nature Physics 19, 1429 (2023)

  37. arXiv:2204.05409  [pdf, other]

    cs.CL

    Unified Speech-Text Pre-training for Speech Translation and Recognition

    Authors: Yun Tang, Hongyu Gong, Ning Dong, Changhan Wang, Wei-Ning Hsu, Jiatao Gu, Alexei Baevski, Xian Li, Abdelrahman Mohamed, Michael Auli, Juan Pino

    Abstract: We describe a method to jointly pre-train speech and text in an encoder-decoder modeling framework for speech translation and recognition. The proposed method incorporates four self-supervised and supervised subtasks for cross modality learning. A self-supervised speech subtask leverages unlabelled speech data, and a (self-)supervised text to text subtask makes use of abundant text training data.…

    Submitted 11 April, 2022; originally announced April 2022.

    Comments: ACL 2022 main conference

  38. arXiv:2204.02967  [pdf, other]

    cs.CL cs.SD eess.AS

    Enhanced Direct Speech-to-Speech Translation Using Self-supervised Pre-training and Data Augmentation

    Authors: Sravya Popuri, Peng-Jen Chen, Changhan Wang, Juan Pino, Yossi Adi, Jiatao Gu, Wei-Ning Hsu, Ann Lee

    Abstract: Direct speech-to-speech translation (S2ST) models suffer from data scarcity issues as there exists little parallel S2ST data, compared to the amount of data available for conventional cascaded systems that consist of automatic speech recognition (ASR), machine translation (MT), and text-to-speech (TTS) synthesis. In this work, we explore self-supervised pre-training with unlabeled speech data and…

    Submitted 13 September, 2022; v1 submitted 6 April, 2022; originally announced April 2022.

    Comments: Accepted to be published in the Proceedings of Interspeech 2022

  39. arXiv:2202.00571  [pdf, other]

    cond-mat.mes-hall nlin.AO nlin.CD

    HarmonicBalance.jl: A Julia suite for nonlinear dynamics using harmonic balance

    Authors: Jan Košata, Javier del Pino, Toni L. Heugel, Oded Zilberberg

    Abstract: HarmonicBalance.jl is a publicly available Julia package designed to simplify and solve systems of periodic time-dependent nonlinear ordinary differential equations. Time dependence of the system parameters is treated with the harmonic balance method, which approximates the system's behaviour as a set of harmonic terms with slowly-varying amplitudes. Under this approximation, the set of all possib…

    Submitted 17 May, 2022; v1 submitted 1 February, 2022; originally announced February 2022.

    Comments: Submission to SciPost/ Resubmission to SciPost Codebases

  40. arXiv:2112.08352  [pdf, other]

    cs.CL cs.AI cs.LG eess.AS

    Textless Speech-to-Speech Translation on Real Data

    Authors: Ann Lee, Hongyu Gong, Paul-Ambroise Duquenne, Holger Schwenk, Peng-Jen Chen, Changhan Wang, Sravya Popuri, Yossi Adi, Juan Pino, Jiatao Gu, Wei-Ning Hsu

    Abstract: We present a textless speech-to-speech translation (S2ST) system that can translate speech from one language into another language and can be built without the need of any text data. Different from existing work in the literature, we tackle the challenge in modeling multi-speaker target speech and train the systems with real-world S2ST data. The key to our approach is a self-supervised unit-based…

    Submitted 4 May, 2022; v1 submitted 15 December, 2021; originally announced December 2021.

    Comments: Accepted to NAACL 2022 (long paper)

  41. arXiv:2111.09296  [pdf, other]

    cs.CL cs.SD eess.AS

    XLS-R: Self-supervised Cross-lingual Speech Representation Learning at Scale

    Authors: Arun Babu, Changhan Wang, Andros Tjandra, Kushal Lakhotia, Qiantong Xu, Naman Goyal, Kritika Singh, Patrick von Platen, Yatharth Saraf, Juan Pino, Alexei Baevski, Alexis Conneau, Michael Auli

    Abstract: This paper presents XLS-R, a large-scale model for cross-lingual speech representation learning based on wav2vec 2.0. We train models with up to 2B parameters on nearly half a million hours of publicly available speech audio in 128 languages, an order of magnitude more public data than the largest known prior work. Our evaluation covers a wide range of tasks, domains, data regimes and languages, b…

    Submitted 16 December, 2021; v1 submitted 17 November, 2021; originally announced November 2021.

  42. arXiv:2110.14710  [pdf, other]

    cond-mat.mes-hall physics.optics quant-ph

    Non-Hermitian chiral phononics through optomechanically-induced squeezing

    Authors: Javier del Pino, Jesse J. Slim, Ewold Verhagen

    Abstract: Imposing chirality on a physical system engenders unconventional energy flow and responses, such as the Aharonov-Bohm effect and the topological quantum Hall phase for electrons in a symmetry-breaking magnetic field. Recently, great interest has arisen in combining that principle with broken Hermiticity to explore novel topological phases and applications. Here, we report unique phononic states fo…

    Submitted 27 October, 2021; originally announced October 2021.

    Comments: Included Main body and Methods (19 pages, 12 figures), in addition to the Supplementary Information document (13 pages, 5 figures)

  43. arXiv:2110.08250  [pdf, other]

    cs.CL cs.SD eess.AS

    Direct Simultaneous Speech-to-Speech Translation with Variational Monotonic Multihead Attention

    Authors: Xutai Ma, Hongyu Gong, Danni Liu, Ann Lee, Yun Tang, Peng-Jen Chen, Wei-Ning Hsu, Phillip Koehn, Juan Pino

    Abstract: We present a direct simultaneous speech-to-speech translation (Simul-S2ST) model. Furthermore, the generation of translation is independent from intermediate text representations. Our approach leverages recent progress on direct speech-to-speech translation with discrete units, in which a sequence of discrete representations, instead of continuous spectrogram features, learned in an unsupervised m…

    Submitted 12 January, 2022; v1 submitted 15 October, 2021; originally announced October 2021.

  44. arXiv:2110.08214  [pdf, other]

    cs.CL cs.SD eess.AS

    From Start to Finish: Latency Reduction Strategies for Incremental Speech Synthesis in Simultaneous Speech-to-Speech Translation

    Authors: Danni Liu, Changhan Wang, Hongyu Gong, Xutai Ma, Yun Tang, Juan Pino

    Abstract: Speech-to-speech translation (S2ST) converts input speech to speech in another language. A challenge of delivering S2ST in real time is the accumulated delay between the translation and speech synthesis modules. While recently incremental text-to-speech (iTTS) models have shown large quality improvements, they typically require additional future text inputs to reach optimal performance. In this wo…

    Submitted 15 July, 2022; v1 submitted 15 October, 2021; originally announced October 2021.

    Comments: Accepted by Interspeech 2022

  45. arXiv:2109.06912  [pdf, other]

    eess.AS cs.CL cs.SD

    fairseq S^2: A Scalable and Integrable Speech Synthesis Toolkit

    Authors: Changhan Wang, Wei-Ning Hsu, Yossi Adi, Adam Polyak, Ann Lee, Peng-Jen Chen, Jiatao Gu, Juan Pino

    Abstract: This paper presents fairseq S^2, a fairseq extension for speech synthesis. We implement a number of autoregressive (AR) and non-AR text-to-speech models, and their multi-speaker variants. To enable training speech synthesis models with less curated data, a number of preprocessing tools are built and their importance is shown empirically. To facilitate faster iteration of development and analysis,…

    Submitted 14 September, 2021; originally announced September 2021.

    Comments: Accepted to EMNLP 2021 Demo

  46. Suppression of mid-circuit measurement crosstalk errors with micromotion

    Authors: J. P. Gaebler, C. H. Baldwin, S. A. Moses, J. M. Dreiling, C. Figgatt, M. Foss-Feig, D. Hayes, J. M. Pino

    Abstract: Mid-circuit measurement and reset are crucial primitives in quantum computation, but such operations require strong interactions with selected qubits while maintaining isolation of neighboring qubits, which is a significant challenge in many systems. For trapped ion systems, measurement is performed with laser-induced fluorescence. Stray light from the detection beam and fluorescence from the meas…

    Submitted 3 January, 2022; v1 submitted 24 August, 2021; originally announced August 2021.

    Comments: 13 pages, 7 figures

    Journal ref: Phys. Rev. A 104, 062440 (2021)

  47. arXiv:2107.06959  [pdf, ps, other]

    cs.CL cs.SD eess.AS

    FST: the FAIR Speech Translation System for the IWSLT21 Multilingual Shared Task

    Authors: Yun Tang, Hongyu Gong, Xian Li, Changhan Wang, Juan Pino, Holger Schwenk, Naman Goyal

    Abstract: In this paper, we describe our end-to-end multilingual speech translation system submitted to the IWSLT 2021 evaluation campaign on the Multilingual Speech Translation shared task. Our system is built by leveraging transfer learning across modalities, tasks and languages. First, we leverage general-purpose multilingual modules pretrained with large amounts of unlabelled and labelled data. We furth…

    Submitted 14 August, 2021; v1 submitted 14 July, 2021; originally announced July 2021.

    Comments: Accepted by IWSLT 2021 as a system paper

  48. arXiv:2107.05782  [pdf, other]

    cs.CL cs.SD eess.AS

    Improving Speech Translation by Understanding and Learning from the Auxiliary Text Translation Task

    Authors: Yun Tang, Juan Pino, Xian Li, Changhan Wang, Dmitriy Genzel

    Abstract: Pretraining and multitask learning are widely used to improve the speech to text translation performance. In this study, we are interested in training a speech to text translation model along with an auxiliary text to text translation task. We conduct a detailed analysis to understand the impact of the auxiliary task on the primary task within the multitask learning framework. Our analysis confirm…

    Submitted 12 July, 2021; originally announced July 2021.

    Comments: Accepted by ACL 2021

  49. arXiv:2107.05604  [pdf, other]

    cs.CL cs.LG eess.AS

    Direct speech-to-speech translation with discrete units

    Authors: Ann Lee, Peng-Jen Chen, Changhan Wang, Jiatao Gu, Sravya Popuri, Xutai Ma, Adam Polyak, Yossi Adi, Qing He, Yun Tang, Juan Pino, Wei-Ning Hsu

    Abstract: We present a direct speech-to-speech translation (S2ST) model that translates speech from one language to speech in another language without relying on intermediate text generation. We tackle the problem by first applying a self-supervised discrete speech encoder on the target speech and then training a sequence-to-sequence speech-to-unit translation (S2UT) model to predict the discrete representa…

    Submitted 21 March, 2022; v1 submitted 12 July, 2021; originally announced July 2021.

    Comments: Accepted to ACL 2022 (long paper)

  50. arXiv:2106.10840  [pdf, other]

    cs.CL cs.AI

    Pay Better Attention to Attention: Head Selection in Multilingual and Multi-Domain Sequence Modeling

    Authors: Hongyu Gong, Yun Tang, Juan Pino, Xian Li

    Abstract: Multi-head attention has each of the attention heads collect salient information from different parts of an input sequence, making it a powerful mechanism for sequence modeling. Multilingual and multi-domain learning are common scenarios for sequence modeling, where the key challenge is to maximize positive transfer and mitigate negative transfer across languages and domains. In this paper, we fin…

    Submitted 21 June, 2021; originally announced June 2021.