Showing 1–13 of 13 results for author: Williams, A

Searching in archive eess.
  1. arXiv:2504.00742  [pdf, other]

    eess.AS

    Expanding and Analyzing ODAQ -- the Open Dataset of Audio Quality

    Authors: Sascha Dick, Christoph Thompson, Chih-Wei Wu, Matteo Torcoli, Pablo Delgado, Phillip A. Williams, Emanuel Habets

    Abstract: The Open Dataset of Audio Quality (ODAQ) was recently introduced to address the scarcity of openly available audio datasets with corresponding subjective quality scores. The dataset, released under permissive licenses, comprises audio material processed using six different signal processing methods operating at five quality levels, along with corresponding subjective test results. To expand the da…

    Submitted 1 April, 2025; originally announced April 2025.

    Comments: Accepted for presentation at the Audio Engineering Society (AES) 157th Convention, October 2024, New York, USA

  2. arXiv:2503.24063  [pdf, other]

    eess.IV econ.GN eess.SY

    A robot-assisted pipeline to rapidly scan 1.7 million historical aerial photographs

    Authors: Sheila Masson, Alan Potts, Allan Williams, Steve Berggreen, Kevin McLaren, Sam Martin, Eugenio Noda, Nicklas Nordfors, Nic Ruecroft, Hannah Druckenmiller, Solomon Hsiang, Andreas Madestam, Anna Tompsett

    Abstract: During the 20th Century, aerial surveys captured hundreds of millions of high-resolution photographs of the earth's surface. These images, the precursors to modern satellite imagery, represent an extraordinary visual record of the environmental and social upheavals of the 20th Century. However, most of these images currently languish in physical archives where retrieval is difficult and costly. Di…

    Submitted 8 April, 2025; v1 submitted 31 March, 2025; originally announced March 2025.

  3. arXiv:2503.14304  [pdf, other]

    eess.IV cs.CV

    RoMedFormer: A Rotary-Embedding Transformer Foundation Model for 3D Genito-Pelvic Structure Segmentation in MRI and CT

    Authors: Yuheng Li, Mingzhe Hu, Richard L. J. Qiu, Maria Thor, Andre Williams, Deborah Marshall, Xiaofeng Yang

    Abstract: Deep learning-based segmentation of genito-pelvic structures in MRI and CT is crucial for applications such as radiation therapy, surgical planning, and disease diagnosis. However, existing segmentation models often struggle with generalizability across imaging modalities and anatomical variations. In this work, we propose RoMedFormer, a rotary-embedding transformer-based foundation model designe…

    Submitted 18 March, 2025; originally announced March 2025.

  4. arXiv:2412.08983  [pdf, other]

    cs.RO eess.SY

    An Event-Triggered Framework for Trust-Mediated Human-Autonomy Interaction

    Authors: Daniel A. Williams, Airlie Chapman, Chris Manzie

    Abstract: Inspired by the increased cooperation between humans and autonomous systems, we present a new hybrid systems framework capturing the interconnected dynamics underlying these interactions. The framework accommodates models arising from both the autonomous systems and cognitive psychology literature in order to represent key elements such as human trust in the autonomous system. The intermittent nat…

    Submitted 12 December, 2024; originally announced December 2024.

  5. arXiv:2411.08135  [pdf, other]

    cs.CL cs.AI cs.LG cs.SD eess.AS

    On the Role of Speech Data in Reducing Toxicity Detection Bias

    Authors: Samuel J. Bell, Mariano Coria Meglioli, Megan Richards, Eduardo Sánchez, Christophe Ropers, Skyler Wang, Adina Williams, Levent Sagun, Marta R. Costa-jussà

    Abstract: Text toxicity detection systems exhibit significant biases, producing disproportionate rates of false positives on samples mentioning demographic groups. But what about toxicity detection in speech? To investigate the extent to which text-based biases are mitigated by speech-based systems, we produce a set of high-quality group annotations for the multilingual MuTox dataset, and then leverage thes…

    Submitted 16 May, 2025; v1 submitted 12 November, 2024; originally announced November 2024.

    Comments: Accepted at NAACL 2025

    Journal ref: In Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics (Volume 1), pages 1454-1468

  6. arXiv:2411.00697  [pdf]

    physics.optics eess.SP physics.app-ph

    All-Optical Excitable Spiking Laser Neuron in InP Generic Integration Technology

    Authors: Lukas Puts, Daan Lenstra, Kevin A. Williams, Weiming Yao

    Abstract: Brain-inspired, neuromorphic devices implemented in integrated photonic hardware have attracted significant interest recently as part of efforts towards novel non-von Neumann computing paradigms that make use of the low loss, high-speed and parallel operations in optics. An all-optical spiking laser neuron fabricated on the indium-phosphide generic integration technology platform may be a practica…

    Submitted 1 November, 2024; originally announced November 2024.

    Comments: 21 pages, 13 figures

  7. arXiv:2401.00197  [pdf, other]

    eess.AS

    ODAQ: Open Dataset of Audio Quality

    Authors: Matteo Torcoli, Chih-Wei Wu, Sascha Dick, Phillip A. Williams, Mhd Modar Halimeh, William Wolcott, Emanuel A. P. Habets

    Abstract: Research into the prediction and analysis of perceived audio quality is hampered by the scarcity of openly available datasets of audio signals accompanied by corresponding subjective quality scores. To address this problem, we present the Open Dataset of Audio Quality (ODAQ), a new dataset containing the results of a MUSHRA listening test conducted with expert listeners from 2 international labora…

    Submitted 30 December, 2023; originally announced January 2024.

    Comments: Accepted paper. IEEE International Conference on Acoustics Speech and Signal Processing (ICASSP), Seoul, Korea, April 2024

  8. arXiv:2312.14069  [pdf, other]

    cs.CL cs.SD eess.AS

    EmphAssess : a Prosodic Benchmark on Assessing Emphasis Transfer in Speech-to-Speech Models

    Authors: Maureen de Seyssel, Antony D'Avirro, Adina Williams, Emmanuel Dupoux

    Abstract: We introduce EmphAssess, a prosodic benchmark designed to evaluate the capability of speech-to-speech models to encode and reproduce prosodic emphasis. We apply this to two tasks: speech resynthesis and speech-to-speech translation. In both cases, the benchmark evaluates the ability of the model to encode emphasis in the speech input and accurately reproduce it in the output, potentially across a…

    Submitted 14 October, 2024; v1 submitted 21 December, 2023; originally announced December 2023.

    Comments: Accepted at EMNLP 2024 (Main)

  9. arXiv:2309.02539  [pdf, other]

    eess.AS cs.LG cs.SD eess.SP

    A Generalized Bandsplit Neural Network for Cinematic Audio Source Separation

    Authors: Karn N. Watcharasupat, Chih-Wei Wu, Yiwei Ding, Iroro Orife, Aaron J. Hipple, Phillip A. Williams, Scott Kramer, Alexander Lerch, William Wolcott

    Abstract: Cinematic audio source separation is a relatively new subtask of audio source separation, with the aim of extracting the dialogue, music, and effects stems from their mixture. In this work, we developed a model generalizing the Bandsplit RNN for any complete or overcomplete partitions of the frequency axis. Psychoacoustically motivated frequency scales were used to inform the band definitions whic…

    Submitted 1 December, 2023; v1 submitted 5 September, 2023; originally announced September 2023.

    Comments: Accepted to the IEEE Open Journal of Signal Processing (ICASSP 2024 Track)

    Journal ref: IEEE Open Journal of Signal Processing, vol. 5, pp. 73-81, 2024

  10. arXiv:2211.13846  [pdf, other]

    eess.SY

    Asynchronous Event-Triggered Control for Non-Linear Systems

    Authors: Daniel A. Williams, Airlie Chapman, Chris Manzie

    Abstract: With the increasing ubiquity of networked control systems, various strategies for sampling constituent subsystems' outputs have emerged. In contrast with periodic sampling, event-triggered control provides a way to efficiently sample a subsystem and conserve network resource usage, by triggering an update only when a state-dependent error threshold is satisfied. Herein we describe a novel scheme f…

    Submitted 5 April, 2023; v1 submitted 24 November, 2022; originally announced November 2022.

  11. arXiv:2203.14437  [pdf, other]

    eess.SY

    Individual and Team Trust Preferences for Robotic Swarm Behaviors

    Authors: Elena M Vella, Daniel A Williams, Airlie Chapman, Chris Manzie

    Abstract: Trust between humans and multi-agent robotic swarms may be analyzed using human preferences. These preferences are expressed by an individual as a sequence of ordered comparisons between pairs of swarm behaviors. An individual's preference graph can be formed from this sequence. In addition, swarm behaviors may be mapped to a feature vector space. We formulate a linear optimization problem to loca…

    Submitted 27 March, 2022; originally announced March 2022.

  12. Trajectory Planning with Deep Reinforcement Learning in High-Level Action Spaces

    Authors: Kyle R. Williams, Rachel Schlossman, Daniel Whitten, Joe Ingram, Srideep Musuvathy, Anirudh Patel, James Pagan, Kyle A. Williams, Sam Green, Anirban Mazumdar, Julie Parish

    Abstract: This paper presents a technique for trajectory planning based on continuously parameterized high-level actions (motion primitives) of variable duration. This technique leverages deep reinforcement learning (Deep RL) to formulate a policy which is suitable for real-time implementation. There is no separation of motion primitive generation and trajectory planning: each individual short-horizon motio…

    Submitted 12 August, 2022; v1 submitted 30 September, 2021; originally announced October 2021.

    Journal ref: IEEE Transactions on Aerospace and Electronic Systems, 59 (2023) 2513-2529

  13. arXiv:2001.07739  [pdf, ps, other]

    cs.CV cs.LG eess.IV

    EMOPAIN Challenge 2020: Multimodal Pain Evaluation from Facial and Bodily Expressions

    Authors: Joy O. Egede, Siyang Song, Temitayo A. Olugbade, Chongyang Wang, Amanda Williams, Hongying Meng, Min Aung, Nicholas D. Lane, Michel Valstar, Nadia Bianchi-Berthouze

    Abstract: The EmoPain 2020 Challenge is the first international competition aimed at creating a uniform platform for the comparison of machine learning and multimedia processing methods of automatic chronic pain assessment from human expressive behaviour, and also the identification of pain-related behaviours. The objective of the challenge is to promote research in the development of assistive technologies…

    Submitted 9 March, 2020; v1 submitted 21 January, 2020; originally announced January 2020.

    Comments: 8 pages