
Showing 1–19 of 19 results for author: Maa, A

Searching in archive eess.
  1. arXiv:2509.20741  [pdf, ps, other]

    eess.AS cs.ET cs.LG

    Real-Time System for Audio-Visual Target Speech Enhancement

    Authors: T. Aleksandra Ma, Sile Yin, Li-Chia Yang, Shuo Zhang

    Abstract: We present a live demonstration for RAVEN, a real-time audio-visual speech enhancement system designed to run entirely on a CPU. In single-channel, audio-only settings, speech enhancement is traditionally approached as the task of extracting clean speech from environmental noise. More recent work has explored the use of visual cues, such as lip movements, to improve robustness, particularly in the…

    Submitted 25 September, 2025; originally announced September 2025.

    Comments: Accepted into WASPAA 2025 demo session

  2. arXiv:2507.21448  [pdf, ps, other]

    eess.AS cs.ET cs.LG

    Real-Time Audio-Visual Speech Enhancement Using Pre-trained Visual Representations

    Authors: T. Aleksandra Ma, Sile Yin, Li-Chia Yang, Shuo Zhang

    Abstract: Speech enhancement in audio-only settings remains challenging, particularly in the presence of interfering speakers. This paper presents a simple yet effective real-time audio-visual speech enhancement (AVSE) system, RAVEN, which isolates and enhances the on-screen target speaker while suppressing interfering speakers and background noise. We investigate how visual embeddings learned from audio-vi…

    Submitted 4 August, 2025; v1 submitted 28 July, 2025; originally announced July 2025.

    Comments: Accepted into Interspeech 2025; corrected author name typo

  3. arXiv:2507.19531  [pdf, ps, other]

    eess.SY stat.ME

    A safety governor for learning explicit MPC controllers from data

    Authors: Anjie Mao, Zheming Wang, Hao Gu, Bo Chen, Li Yu

    Abstract: We tackle neural networks (NNs) to approximate model predictive control (MPC) laws. We propose a novel learning-based explicit MPC structure, which is reformulated into a dual-mode scheme over maximal constrained feasible set. The scheme ensuring the learning-based explicit MPC reduces to linear feedback control while entering the neighborhood of origin. We construct a safety governor to ensure th…

    Submitted 21 July, 2025; originally announced July 2025.

  4. arXiv:2501.11570  [pdf, other]

    cs.SD cs.IR cs.LG eess.AS

    Uncertainty Estimation in the Real World: A Study on Music Emotion Recognition

    Authors: Karn N. Watcharasupat, Yiwei Ding, T. Aleksandra Ma, Pavan Seshadri, Alexander Lerch

    Abstract: Any data annotation for subjective tasks shows potential variations between individuals. This is particularly true for annotations of emotional responses to musical stimuli. While older approaches to music emotion recognition systems frequently addressed this uncertainty problem through probabilistic modeling, modern systems based on neural networks tend to ignore the variability and focus only on…

    Submitted 20 January, 2025; originally announced January 2025.

    Comments: To be presented as a Findings paper at the 2025 European Conference on Information Retrieval (ECIR)

  5. arXiv:2412.19552  [pdf, ps, other]

    physics.med-ph eess.IV

    Contrast-Optimized Basis Functions for Self-Navigated Motion Correction in Quantitative MRI

    Authors: Elisa Marchetto, Sebastian Flassbeck, Andrew Mao, Jakob Assländer

    Abstract: Purpose: The long scan times of quantitative MRI techniques make motion artifacts more likely. For MR-Fingerprinting-like approaches, this problem can be addressed with self-navigated retrospective motion correction based on reconstructions in a singular value decomposition (SVD) subspace. However, the SVD promotes high signal intensity in all tissues, which limits the contrast between tissue type…

    Submitted 17 June, 2025; v1 submitted 27 December, 2024; originally announced December 2024.

  6. arXiv:2409.07730  [pdf, other]

    eess.AS cs.IR cs.LG cs.SD

    Music auto-tagging in the long tail: A few-shot approach

    Authors: T. Aleksandra Ma, Alexander Lerch

    Abstract: In the realm of digital music, using tags to efficiently organize and retrieve music from extensive databases is crucial for music catalog owners. Human tagging by experts is labor-intensive but mostly accurate, whereas automatic tagging through supervised learning has approached satisfying accuracy but is restricted to a predefined set of training tags. Few-shot learning offers a viable solution…

    Submitted 16 September, 2024; v1 submitted 11 September, 2024; originally announced September 2024.

    Comments: Published in Audio Engineering Society NY Show 2024 as a Peer Reviewed (Category 1) paper; typos corrected

    ACM Class: H.3.3

  7. arXiv:2405.07905  [pdf, other]

    eess.IV cs.CV

    PLUTO: Pathology-Universal Transformer

    Authors: Dinkar Juyal, Harshith Padigela, Chintan Shah, Daniel Shenker, Natalia Harguindeguy, Yi Liu, Blake Martin, Yibo Zhang, Michael Nercessian, Miles Markey, Isaac Finberg, Kelsey Luu, Daniel Borders, Syed Ashar Javed, Emma Krause, Raymond Biju, Aashish Sood, Allen Ma, Jackson Nyman, John Shamshoian, Guillaume Chhor, Darpan Sanghavi, Marc Thibault, Limin Yu, Fedaa Najdawi, et al. (8 additional authors not shown)

    Abstract: Pathology is the study of microscopic inspection of tissue, and a pathology diagnosis is often the medical gold standard to diagnose disease. Pathology images provide a unique challenge for computer-vision-based analysis: a single pathology Whole Slide Image (WSI) is gigapixel-sized and often contains hundreds of thousands to millions of objects of interest across multiple resolutions. In this wor…

    Submitted 13 May, 2024; originally announced May 2024.

  8. arXiv:2403.00892  [pdf, other]

    eess.SY cs.LG

    PowerFlowMultiNet: Multigraph Neural Networks for Unbalanced Three-Phase Distribution Systems

    Authors: Salah Ghamizi, Jun Cao, Aoxiang Ma, Pedro Rodriguez

    Abstract: Efficiently solving unbalanced three-phase power flow in distribution grids is pivotal for grid analysis and simulation. There is a pressing need for scalable algorithms capable of handling large-scale unbalanced power grids that can provide accurate and fast solutions. To address this, deep learning techniques, especially Graph Neural Networks (GNNs), have emerged. However, existing literature pr…

    Submitted 6 September, 2024; v1 submitted 1 March, 2024; originally announced March 2024.

  9. arXiv:2306.04730  [pdf, other]

    eess.SP cs.LG math.NA math.OC stat.ML

    Stochastic Natural Thresholding Algorithms

    Authors: Rachel Grotheer, Shuang Li, Anna Ma, Deanna Needell, Jing Qin

    Abstract: Sparse signal recovery is one of the most fundamental problems in various applications, including medical imaging and remote sensing. Many greedy algorithms based on the family of hard thresholding operators have been developed to solve the sparse signal recovery problem. More recently, Natural Thresholding (NT) has been proposed with improved computational efficiency. This paper proposes and disc…

    Submitted 7 June, 2023; originally announced June 2023.

  10. arXiv:2208.09096  [pdf, other]

    cs.SD cs.LG eess.AS

    Representation Learning for the Automatic Indexing of Sound Effects Libraries

    Authors: Alison B. Ma, Alexander Lerch

    Abstract: Labeling and maintaining a commercial sound effects library is a time-consuming task exacerbated by databases that continually grow in size and undergo taxonomy updates. Moreover, sound search and taxonomy creation are complicated by non-uniform metadata, an unrelenting problem even with the introduction of a new industry standard, the Universal Category System. To address these problems and overc…

    Submitted 18 August, 2022; originally announced August 2022.

    Comments: Accepted at the 23rd International Society for Music Information Retrieval Conference (ISMIR 2022), 10 pages, 7 figures

  11. arXiv:2207.08998  [pdf]

    eess.IV cs.CV cs.LG q-bio.QM

    Discovering novel systemic biomarkers in photos of the external eye

    Authors: Boris Babenko, Ilana Traynis, Christina Chen, Preeti Singh, Akib Uddin, Jorge Cuadros, Lauren P. Daskivich, April Y. Maa, Ramasamy Kim, Eugene Yu-Chuan Kang, Yossi Matias, Greg S. Corrado, Lily Peng, Dale R. Webster, Christopher Semturs, Jonathan Krause, Avinash V. Varadarajan, Naama Hammel, Yun Liu

    Abstract: External eye photos were recently shown to reveal signs of diabetic retinal disease and elevated HbA1c. In this paper, we evaluate if external eye photos contain information about additional systemic medical conditions. We developed a deep learning system (DLS) that takes external eye photos as input and predicts multiple systemic parameters, such as those related to the liver (albumin, AST); kidn…

    Submitted 18 July, 2022; originally announced July 2022.

  12. Orthogonal Rational Approximation of Transfer Functions for High-Frequency Circuits

    Authors: Andrew Ma, Arif Ege Engin

    Abstract: Rational function approximations find applications in many areas including macro-modeling of high-frequency circuits, model order reduction for controller design, interpolation and extrapolation of system responses, surrogate models for high-energy physics, and approximation of elementary mathematical functions. The unknown denominator polynomial of the model results in a non-linear problem, which…

    Submitted 17 June, 2022; originally announced June 2022.

    Journal ref: Int J Circ Theor Appl. 2022; 1-13

  13. arXiv:2110.13670  [pdf, other]

    eess.IV cs.CV

    W-Net: A Two-Stage Convolutional Network for Nucleus Detection in Histopathology Image

    Authors: Anyu Mao, Jialun Wu, Xinrui Bao, Zeyu Gao, Tieliang Gong, Chen Li

    Abstract: Pathological diagnosis is the gold standard for cancer diagnosis, but it is labor-intensive, in which tasks such as cell detection, classification, and counting are particularly prominent. A common solution for automating these tasks is using nucleus segmentation technology. However, it is hard to train a robust nucleus segmentation model, due to several challenging problems, the nucleus adhesion,…

    Submitted 26 October, 2021; originally announced October 2021.

    Comments: BIBM 2021 accepted, including 8 pages, 3 figures

  14. arXiv:2011.11732  [pdf]

    eess.IV cs.CV cs.LG

    Detecting hidden signs of diabetes in external eye photographs

    Authors: Boris Babenko, Akinori Mitani, Ilana Traynis, Naho Kitade, Preeti Singh, April Maa, Jorge Cuadros, Greg S. Corrado, Lily Peng, Dale R. Webster, Avinash Varadarajan, Naama Hammel, Yun Liu

    Abstract: Diabetes-related retinal conditions can be detected by examining the posterior of the eye. By contrast, examining the anterior of the eye can reveal conditions affecting the front of the eye, such as changes to the eyelids, cornea, or crystalline lens. In this work, we studied whether external photographs of the front of the eye can reveal insights into both diabetic retinal diseases and blood glu…

    Submitted 23 November, 2020; originally announced November 2020.

    Journal ref: Nature Biomedical Engineering 2022

  15. arXiv:2011.09766  [pdf, other]

    cs.CV cs.LG eess.IV

    Foreground-Aware Relation Network for Geospatial Object Segmentation in High Spatial Resolution Remote Sensing Imagery

    Authors: Zhuo Zheng, Yanfei Zhong, Junjue Wang, Ailong Ma

    Abstract: Geospatial object segmentation, as a particular semantic segmentation task, always faces with larger-scale variation, larger intra-class variance of background, and foreground-background imbalance in the high spatial resolution (HSR) remote sensing imagery. However, general semantic segmentation methods mainly focus on scale variation in the natural scene, with inadequate consideration of the othe…

    Submitted 19 November, 2020; originally announced November 2020.

    Comments: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 2020

  16. FPGA: Fast Patch-Free Global Learning Framework for Fully End-to-End Hyperspectral Image Classification

    Authors: Zhuo Zheng, Yanfei Zhong, Ailong Ma, Liangpei Zhang

    Abstract: Deep learning techniques have provided significant improvements in hyperspectral image (HSI) classification. The current deep learning based HSI classifiers follow a patch-based learning framework by dividing the image into overlapping patches. As such, these methods are local learning methods, which have a high computational cost. In this paper, a fast patch-free global learning (FPGA) framework…

    Submitted 11 November, 2020; originally announced November 2020.

    Comments: 16 pages, 15 figures, IEEE Transactions on Geoscience and Remote Sensing, 2020

  17. arXiv:2011.03247  [pdf, other]

    cs.CV eess.IV

    Hi-UCD: A Large-scale Dataset for Urban Semantic Change Detection in Remote Sensing Imagery

    Authors: Shiqi Tian, Ailong Ma, Zhuo Zheng, Yanfei Zhong

    Abstract: With the acceleration of the urban expansion, urban change detection (UCD), as a significant and effective approach, can provide the change information with respect to geospatial objects for dynamical urban analysis. However, existing datasets suffer from three bottlenecks: (1) lack of high spatial resolution images; (2) lack of semantic annotation; (3) lack of long-range multi-temporal images. In…

    Submitted 27 December, 2020; v1 submitted 6 November, 2020; originally announced November 2020.

    Comments: Presented at NeurIPS 2020 Workshop on Machine Learning for the Developing World

  18. arXiv:1801.10264  [pdf, other]

    cs.IT cs.DS eess.SP math.NA math.OC

    Compressed Anomaly Detection with Multiple Mixed Observations

    Authors: Natalie Durgin, Rachel Grotheer, Chenxi Huang, Shuang Li, Anna Ma, Deanna Needell, Jing Qin

    Abstract: We consider a collection of independent random variables that are identically distributed, except for a small subset which follows a different, anomalous distribution. We study the problem of detecting which random variables in the collection are governed by the anomalous distribution. Recent work proposes to solve this problem by conducting hypothesis tests based on mixed observations (e.g. linea…

    Submitted 19 June, 2018; v1 submitted 30 January, 2018; originally announced January 2018.

    Comments: 27 pages, 9 figures. Incorporates reviewer feedback, additional experiments, and additional figures

  19. arXiv:1711.02743  [pdf, other]

    eess.SP cs.DS math.NA

    Sparse Randomized Kaczmarz for Support Recovery of Jointly Sparse Corrupted Multiple Measurement Vectors

    Authors: Natalie Durgin, Rachel Grotheer, Chenxi Huang, Shuang Li, Anna Ma, Deanna Needell, Jing Qin

    Abstract: While single measurement vector (SMV) models have been widely studied in signal processing, there is a surging interest in addressing the multiple measurement vectors (MMV) problem. In the MMV setting, more than one measurement vector is available and the multiple signals to be recovered share some commonalities such as a common support. Applications in which MMV is a naturally occurring phenomeno…

    Submitted 14 June, 2018; v1 submitted 7 November, 2017; originally announced November 2017.

    Comments: 13 pages, 6 figures