Showing 1–50 of 71 results for author: Song, A

Searching in archive cs.
  1. arXiv:2506.14968 [pdf, ps, other]

    cs.RO cs.AI

    FEAST: A Flexible Mealtime-Assistance System Towards In-the-Wild Personalization

    Authors: Rajat Kumar Jenamani, Tom Silver, Ben Dodson, Shiqin Tong, Anthony Song, Yuting Yang, Ziang Liu, Benjamin Howe, Aimee Whitneck, Tapomayukh Bhattacharjee

    Abstract: Physical caregiving robots hold promise for improving the quality of life of millions worldwide who require assistance with feeding. However, in-home meal assistance remains challenging due to the diversity of activities (e.g., eating, drinking, mouth wiping), contexts (e.g., socializing, watching TV), food items, and user preferences that arise during deployment. In this work, we propose FEAST, a…

    Submitted 27 June, 2025; v1 submitted 17 June, 2025; originally announced June 2025.

    Comments: RSS 2025 - Best Paper Award

  2. arXiv:2506.09022 [pdf, ps, other]

    cs.CV

    Do Multiple Instance Learning Models Transfer?

    Authors: Daniel Shao, Richard J. Chen, Andrew H. Song, Joel Runevic, Ming Y. Lu, Tong Ding, Faisal Mahmood

    Abstract: Multiple Instance Learning (MIL) is a cornerstone approach in computational pathology (CPath) for generating clinically meaningful slide-level embeddings from gigapixel tissue images. However, MIL often struggles with small, weakly supervised clinical datasets. In contrast to fields such as NLP and conventional computer vision, where transfer learning is widely used to address data scarcity, the t…

    Submitted 11 June, 2025; v1 submitted 10 June, 2025; originally announced June 2025.

    Comments: ICML 2025 (Spotlight). 20 pages, 8 figures

  3. arXiv:2506.03373 [pdf, ps, other]

    cs.CV cs.AI

    A Foundation Model for Spatial Proteomics

    Authors: Muhammad Shaban, Yuzhou Chang, Huaying Qiu, Yao Yu Yeo, Andrew H. Song, Guillaume Jaume, Yuchen Wang, Luca L. Weishaupt, Tong Ding, Anurag Vaidya, Abdallah Lamane, Daniel Shao, Mohammed Zidane, Yunhao Bai, Paige McCallum, Shuli Luo, Wenrui Wu, Yang Wang, Precious Cramer, Chi Ngai Chan, Pierre Stephan, Johanna Schaffenrath, Jia Le Lee, Hendrik A. Michel, Caiwei Tian, et al. (35 additional authors not shown)

    Abstract: Foundation models have begun to transform image analysis by acting as pretrained generalist backbones that can be adapted to many tasks even when post-training data are limited, yet their impact on spatial proteomics, imaging that maps proteins at single-cell resolution, remains limited. Here, we introduce KRONOS, a foundation model built for spatial proteomics. KRONOS was trained in a self-superv…

    Submitted 3 June, 2025; originally announced June 2025.

  4. arXiv:2505.17108 [pdf, ps, other]

    cs.NE cs.AI

    REMS: a unified solution representation, problem modeling and metaheuristic algorithm design for general combinatorial optimization problems

    Authors: Aijuan Song, Guohua Wu

    Abstract: Combinatorial optimization problems (COPs) with discrete variables and finite search space are critical across numerous fields, and solving them in metaheuristic algorithms is popular. However, addressing a specific COP typically requires developing a tailored and handcrafted algorithm. Even minor adjustments, such as constraint changes, may necessitate algorithm redevelopment. Therefore, establis…

    Submitted 21 May, 2025; originally announced May 2025.

    Comments: 15 pages, 11 figures, regular research paper

  5. arXiv:2505.10492 [pdf, ps, other]

    eess.IV cs.CV physics.med-ph physics.optics

    Multi-contrast laser endoscopy for in vivo gastrointestinal imaging

    Authors: Taylor L. Bobrow, Mayank Golhar, Suchapa Arayakarnkul, Anthony A. Song, Saowanee Ngamruengphong, Nicholas J. Durr

    Abstract: White light endoscopy is the clinical gold standard for detecting diseases in the gastrointestinal tract. Most applications involve identifying visual abnormalities in tissue color, texture, and shape. Unfortunately, the contrast of these features is often subtle, causing many clinically relevant cases to go undetected. To overcome this challenge, we introduce Multi-contrast Laser Endoscopy (MLE):…

    Submitted 23 June, 2025; v1 submitted 15 May, 2025; originally announced May 2025.

  6. Enhancing Self-Supervised Fine-Grained Video Object Tracking with Dynamic Memory Prediction

    Authors: Zihan Zhou, Changrui Dai, Aibo Song, Xiaolin Fang

    Abstract: Successful video analysis relies on accurate recognition of pixels across frames, and frame reconstruction methods based on video correspondence learning are popular due to their efficiency. Existing frame reconstruction methods, while efficient, neglect the value of direct involvement of multiple reference frames for reconstruction and decision-making aspects, especially in complex situations suc…

    Submitted 30 April, 2025; originally announced April 2025.

  7. arXiv:2503.22727 [pdf, other]

    cs.CL cs.LG

    A Large-Scale Vision-Language Dataset Derived from Open Scientific Literature to Advance Biomedical Generalist AI

    Authors: Alejandro Lozano, Min Woo Sun, James Burgess, Jeffrey J. Nirschl, Christopher Polzak, Yuhui Zhang, Liangyu Chen, Jeffrey Gu, Ivan Lopez, Josiah Aklilu, Anita Rau, Austin Wolfgang Katzer, Collin Chiu, Orr Zohar, Xiaohan Wang, Alfred Seunghoon Song, Chiang Chia-Chun, Robert Tibshirani, Serena Yeung-Levy

    Abstract: Despite the excitement behind biomedical artificial intelligence (AI), access to high-quality, diverse, and large-scale data - the foundation for modern AI systems - is still a bottleneck to unlocking its full potential. To address this gap, we introduce Biomedica, an open-source dataset derived from the PubMed Central Open Access subset, containing over 6 million scientific articles and 24 millio…

    Submitted 1 April, 2025; v1 submitted 26 March, 2025; originally announced March 2025.

  8. arXiv:2503.12026 [pdf, other]

    cs.CV

    Leveraging Motion Information for Better Self-Supervised Video Correspondence Learning

    Authors: Zihan Zhou, Changrui Dai, Aibo Song, Xiaolin Fang

    Abstract: Self-supervised video correspondence learning depends on the ability to accurately associate pixels between video frames that correspond to the same visual object. However, achieving reliable pixel matching without supervision remains a major challenge. To address this issue, recent research has focused on feature learning techniques that aim to encode unique pixel representations for matching. De…

    Submitted 30 April, 2025; v1 submitted 15 March, 2025; originally announced March 2025.

  9. arXiv:2503.07125 [pdf, other]

    cs.CV

    Learning A Zero-shot Occupancy Network from Vision Foundation Models via Self-supervised Adaptation

    Authors: Sihao Lin, Daqi Liu, Ruochong Fu, Dongrui Liu, Andy Song, Hongwei Xie, Zhihui Li, Bing Wang, Xiaojun Chang

    Abstract: Estimating the 3D world from 2D monocular images is a fundamental yet challenging task due to the labour-intensive nature of 3D annotations. To simplify label acquisition, this work proposes a novel approach that bridges 2D vision foundation models (VFMs) with 3D tasks by decoupling 3D supervision into an ensemble of image-level primitives, e.g., semantic and geometric components. As a key motivat…

    Submitted 10 March, 2025; originally announced March 2025.

    Comments: preprint

  10. arXiv:2502.17761 [pdf, other]

    cs.CV stat.AP

    AI-driven 3D Spatial Transcriptomics

    Authors: Cristina Almagro-Pérez, Andrew H. Song, Luca Weishaupt, Ahrong Kim, Guillaume Jaume, Drew F. K. Williamson, Konstantin Hemker, Ming Y. Lu, Kritika Singh, Bowen Chen, Long Phi Le, Alexander S. Baras, Sizun Jiang, Ali Bashashati, Jonathan T. C. Liu, Faisal Mahmood

    Abstract: A comprehensive three-dimensional (3D) map of tissue architecture and gene expression is crucial for illuminating the complexity and heterogeneity of tissues across diverse biomedical applications. However, most spatial transcriptomics (ST) approaches remain limited to two-dimensional (2D) sections of tissue. Although current 3D ST methods hold promise, they typically require extensive tissue sect…

    Submitted 24 February, 2025; originally announced February 2025.

  11. MRUCT: Mixed Reality Assistance for Acupuncture Guided by Ultrasonic Computed Tomography

    Authors: Xinkai Wang, Yue Yang, Kehong Zhou, Xue Xie, Lifeng Zhu, Aiguo Song, Bruce Daniel

    Abstract: Chinese acupuncture practitioners primarily depend on muscle memory and tactile feedback to insert needles and accurately target acupuncture points, as the current workflow lacks imaging modalities and visual aids. Consequently, new practitioners often learn through trial and error, requiring years of experience to become proficient and earn the trust of patients. Medical students face similar cha…

    Submitted 3 April, 2025; v1 submitted 12 February, 2025; originally announced February 2025.

  12. arXiv:2502.04405 [pdf, other]

    cs.LG cs.AI cs.CL

    FAS: Fast ANN-SNN Conversion for Spiking Large Language Models

    Authors: Long Chen, Xiaotian Song, Andy Song, BaDong Chen, Jiancheng Lv, Yanan Sun

    Abstract: Spiking Large Language Models have been shown as a good alternative to LLMs in various scenarios. Existing methods for creating Spiking LLMs, i.e., direct training and ANN-SNN conversion, often suffer from performance degradation and relatively high computational costs. To address these issues, we propose a novel Fast ANN-SNN conversion strategy (FAS) that transforms LLMs into spiking LLMs in two…

    Submitted 14 May, 2025; v1 submitted 6 February, 2025; originally announced February 2025.

  13. arXiv:2502.02970 [pdf, ps, other]

    cs.LG

    Membership Inference Attack Should Move On to Distributional Statistics for Distilled Generative Models

    Authors: Muxing Li, Zesheng Ye, Yixuan Li, Andy Song, Guangquan Zhang, Feng Liu

    Abstract: To detect unauthorized data usage in training large-scale generative models (e.g., ChatGPT or Midjourney), membership inference attacks (MIA) have proven effective in distinguishing a single training instance (a member) from a single non-training instance (a non-member). This success is mainly credited to a memorization effect: models tend to perform better on a member than a non-member. However,…

    Submitted 19 June, 2025; v1 submitted 5 February, 2025; originally announced February 2025.

  14. arXiv:2501.18355 [pdf, other]

    eess.AS cs.SD eess.SP eess.SY

    Multilayered Intelligent Reflecting Surface for Long-Range Underwater Acoustic Communication

    Authors: Yu Luo, Lina Pu, Aijun Song

    Abstract: This article introduces a multilayered acoustic reconfigurable intelligent surface (ML-ARIS) architecture designed for the next generation of underwater communications. ML-ARIS incorporates multiple layers of piezoelectric material in each acoustic reflector, with the load impedance of each layer independently adjustable via a control circuit. This design increases the flexibility in generating re…

    Submitted 30 January, 2025; originally announced January 2025.

    Comments: 12 pages, 16 figures

  15. arXiv:2501.16652 [pdf, other]

    cs.CV cs.AI

    Molecular-driven Foundation Model for Oncologic Pathology

    Authors: Anurag Vaidya, Andrew Zhang, Guillaume Jaume, Andrew H. Song, Tong Ding, Sophia J. Wagner, Ming Y. Lu, Paul Doucet, Harry Robertson, Cristina Almagro-Perez, Richard J. Chen, Dina ElHarouni, Georges Ayoub, Connor Bossi, Keith L. Ligon, Georg Gerber, Long Phi Le, Faisal Mahmood

    Abstract: Foundation models are reshaping computational pathology by enabling transfer learning, where models pre-trained on vast datasets can be adapted for downstream diagnostic, prognostic, and therapeutic response tasks. Despite these advances, foundation models are still limited in their ability to encode the entire gigapixel whole-slide images without additional training and often lack complementary m…

    Submitted 27 January, 2025; originally announced January 2025.

  16. arXiv:2501.09802 [pdf]

    cs.CR

    W3ID: A Quantum Computing-Secure Digital Identity System Redefining Standards for Web3 and Digital Twins

    Authors: Joseph Yun, Eli Lifton, Eunseo Lee, Yohan Yun, Abigail Song, Joshua Lee, Cristian Jimenez-Bert, Benedict Song, Yejun Lee, Alex Seo, Sijung Yun

    Abstract: The rapid advancements in quantum computing present significant threats to existing encryption standards and internet security. Simultaneously, the advent of Web 3.0 marks a transformative era in internet history, emphasizing enhanced data security, decentralization, and user ownership. This white paper introduces the W3ID, an abbreviation of Web3 standard meeting universal digital ID, which is a…

    Submitted 16 January, 2025; originally announced January 2025.

  17. arXiv:2501.07171 [pdf, other]

    cs.CV cs.CL

    BIOMEDICA: An Open Biomedical Image-Caption Archive, Dataset, and Vision-Language Models Derived from Scientific Literature

    Authors: Alejandro Lozano, Min Woo Sun, James Burgess, Liangyu Chen, Jeffrey J Nirschl, Jeffrey Gu, Ivan Lopez, Josiah Aklilu, Austin Wolfgang Katzer, Collin Chiu, Anita Rau, Xiaohan Wang, Yuhui Zhang, Alfred Seunghoon Song, Robert Tibshirani, Serena Yeung-Levy

    Abstract: The development of vision-language models (VLMs) is driven by large-scale and diverse multimodal datasets. However, progress toward generalist biomedical VLMs is limited by the lack of annotated, publicly accessible datasets across biology and medicine. Existing efforts are restricted to narrow domains, missing the full diversity of biomedical knowledge encoded in scientific literature. To address…

    Submitted 1 April, 2025; v1 submitted 13 January, 2025; originally announced January 2025.

  18. arXiv:2412.12801 [pdf, other]

    cs.CV cs.LG

    Multi-View Incremental Learning with Structured Hebbian Plasticity for Enhanced Fusion Efficiency

    Authors: Yuhong Chen, Ailin Song, Huifeng Yin, Shuai Zhong, Fuhai Chen, Qi Xu, Shiping Wang, Mingkun Xu

    Abstract: The rapid evolution of multimedia technology has revolutionized human perception, paving the way for multi-view learning. However, traditional multi-view learning approaches are tailored for scenarios with fixed data views, falling short of emulating the intricate cognitive procedures of the human brain processing signals sequentially. Our cerebral architecture seamlessly integrates sequential dat…

    Submitted 17 December, 2024; originally announced December 2024.

    Comments: 11 pages

  19. arXiv:2411.19666 [pdf, other]

    eess.IV cs.AI cs.CV cs.LG stat.AP

    Multimodal Whole Slide Foundation Model for Pathology

    Authors: Tong Ding, Sophia J. Wagner, Andrew H. Song, Richard J. Chen, Ming Y. Lu, Andrew Zhang, Anurag J. Vaidya, Guillaume Jaume, Muhammad Shaban, Ahrong Kim, Drew F. K. Williamson, Bowen Chen, Cristina Almagro-Perez, Paul Doucet, Sharifa Sahai, Chengkuan Chen, Daisuke Komura, Akihiro Kawabe, Shumpei Ishikawa, Georg Gerber, Tingying Peng, Long Phi Le, Faisal Mahmood

    Abstract: The field of computational pathology has been transformed with recent advances in foundation models that encode histopathology region-of-interests (ROIs) into versatile and transferable feature representations via self-supervised learning (SSL). However, translating these advancements to address complex clinical challenges at the patient and slide level remains constrained by limited clinical data…

    Submitted 29 November, 2024; originally announced November 2024.

    Comments: The code is accessible at https://github.com/mahmoodlab/TITAN

  20. arXiv:2411.11192 [pdf]

    cs.RO cs.MA eess.SY

    Robot Metabolism: Towards machines that can grow by consuming other machines

    Authors: Philippe Martin Wyder, Riyaan Bakhda, Meiqi Zhao, Quinn A. Booth, Matthew E. Modi, Andrew Song, Simon Kang, Jiahao Wu, Priya Patel, Robert T. Kasumi, David Yi, Nihar Niraj Garg, Pranav Jhunjhunwala, Siddharth Bhutoria, Evan H. Tong, Yuhang Hu, Judah Goldfeder, Omer Mustel, Donghan Kim, Hod Lipson

    Abstract: Biological lifeforms can heal, grow, adapt, and reproduce -- abilities essential for sustained survival and development. In contrast, robots today are primarily monolithic machines with limited ability to self-repair, physically develop, or incorporate material from their environments. A key challenge to such physical adaptation has been that while robot minds are rapidly evolving new behaviors th…

    Submitted 17 November, 2024; originally announced November 2024.

    Comments: Manuscript combined with Supplementary Materials File for arXiv submission. Submitting to Journal and will update external DOI once available

    MSC Class: 70-01; 68-02

    ACM Class: I.6; H.4; H.m; I.m; B.m

  21. arXiv:2411.08641 [pdf, other]

    cs.HC cs.AI

    DipMe: Haptic Recognition of Granular Media for Tangible Interactive Applications

    Authors: Xinkai Wang, Shuo Zhang, Ziyi Zhao, Lifeng Zhu, Aiguo Song

    Abstract: While tangible user interface has shown its power in naturally interacting with rigid or soft objects, users cannot conveniently use different types of granular materials as the interaction media. We introduce DipMe as a smart device to recognize the types of granular media in real time, which can be used to connect the granular materials in the physical world with various virtual content. Other t…

    Submitted 13 November, 2024; originally announced November 2024.

    Comments: 17 pages, 10 figures

  22. Image-Based Visual Servoing for Enhanced Cooperation of Dual-Arm Manipulation

    Authors: Zizhe Zhang, Yuan Yang, Wenqiang Zuo, Guangming Song, Aiguo Song, Yang Shi

    Abstract: The cooperation of a pair of robot manipulators is required to manipulate a target object without any fixtures. The conventional control methods coordinate the end-effector pose of each manipulator with that of the other using their kinematics and joint coordinate measurements. Yet, the manipulators' inaccurate kinematics and joint coordinate measurements can cause significant pose synchronization…

    Submitted 15 February, 2025; v1 submitted 25 October, 2024; originally announced October 2024.

    Comments: 8 pages, 7 figures. Corresponding author: Yuan Yang {[email protected]}. For associated videos, see {https://zizhe.io/ral-ibvs-enhanced/}. This work has been accepted to the IEEE Robotics and Automation Letters in Feb 2025

  23. arXiv:2408.02859 [pdf, other]

    eess.IV cs.AI cs.CV

    Multistain Pretraining for Slide Representation Learning in Pathology

    Authors: Guillaume Jaume, Anurag Vaidya, Andrew Zhang, Andrew H. Song, Richard J. Chen, Sharifa Sahai, Dandan Mo, Emilio Madrigal, Long Phi Le, Faisal Mahmood

    Abstract: Developing self-supervised learning (SSL) models that can learn universal and transferable representations of H&E gigapixel whole-slide images (WSIs) is becoming increasingly valuable in computational pathology. These models hold the potential to advance critical tasks such as few-shot classification, slide retrieval, and patient stratification. Existing approaches for slide representation learnin…

    Submitted 5 August, 2024; originally announced August 2024.

    Comments: ECCV'24

  24. arXiv:2407.00224 [pdf, other]

    cs.CV stat.AP

    Multimodal Prototyping for cancer survival prediction

    Authors: Andrew H. Song, Richard J. Chen, Guillaume Jaume, Anurag J. Vaidya, Alexander S. Baras, Faisal Mahmood

    Abstract: Multimodal survival methods combining gigapixel histology whole-slide images (WSIs) and transcriptomic profiles are particularly promising for patient prognostication and stratification. Current approaches involve tokenizing the WSIs into smaller patches (>10,000 patches) and transcriptomics into gene groups, which are then integrated using a Transformer for predicting outcomes. However, this proc…

    Submitted 28 June, 2024; originally announced July 2024.

    Comments: ICML 2024

  25. arXiv:2406.16192 [pdf, other]

    cs.CV

    HEST-1k: A Dataset for Spatial Transcriptomics and Histology Image Analysis

    Authors: Guillaume Jaume, Paul Doucet, Andrew H. Song, Ming Y. Lu, Cristina Almagro-Pérez, Sophia J. Wagner, Anurag J. Vaidya, Richard J. Chen, Drew F. K. Williamson, Ahrong Kim, Faisal Mahmood

    Abstract: Spatial transcriptomics enables interrogating the molecular composition of tissue with ever-increasing resolution and sensitivity. However, costs, rapidly evolving technology, and lack of standards have constrained computational methods in ST to narrow tasks and small cohorts. In addition, the underlying tissue morphology, as reflected by H&E-stained whole slide images (WSIs), encodes rich informa…

    Submitted 2 November, 2024; v1 submitted 23 June, 2024; originally announced June 2024.

    Comments: NeurIPS'24 Spotlight

  26. arXiv:2406.07061 [pdf, other]

    eess.IV cs.CV

    Triage of 3D pathology data via 2.5D multiple-instance learning to guide pathologist assessments

    Authors: Gan Gao, Andrew H. Song, Fiona Wang, David Brenes, Rui Wang, Sarah S. L. Chow, Kevin W. Bishop, Lawrence D. True, Faisal Mahmood, Jonathan T. C. Liu

    Abstract: Accurate patient diagnoses based on human tissue biopsies are hindered by current clinical practice, where pathologists assess only a limited number of thin 2D tissue slices sectioned from 3D volumetric tissue. Recent advances in non-destructive 3D pathology, such as open-top light-sheet microscopy, enable comprehensive imaging of spatially heterogeneous tissue morphologies, offering the feasibili…

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: CVPR CVMI 2024

    Journal ref: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2024, pp. 6955-6965

  27. arXiv:2405.14116 [pdf, other]

    cs.RO cs.HC cs.LG

    Learning Multimodal Confidence for Intention Recognition in Human-Robot Interaction

    Authors: Xiyuan Zhao, Huijun Li, Tianyuan Miao, Xianyi Zhu, Zhikai Wei, Aiguo Song

    Abstract: The rapid development of collaborative robotics has provided a new possibility of helping the elderly who have difficulties in daily life, allowing robots to operate according to specific intentions. However, efficient human-robot cooperation requires natural, accurate and reliable intention recognition in shared environments. The current paramount challenge for this is reducing the uncertainty of…

    Submitted 22 May, 2024; originally announced May 2024.

  28. arXiv:2405.11643 [pdf, other]

    cs.CV cs.LG stat.AP

    Morphological Prototyping for Unsupervised Slide Representation Learning in Computational Pathology

    Authors: Andrew H. Song, Richard J. Chen, Tong Ding, Drew F. K. Williamson, Guillaume Jaume, Faisal Mahmood

    Abstract: Representation learning of pathology whole-slide images (WSIs) has primarily relied on weak supervision with Multiple Instance Learning (MIL). However, the slide representations resulting from this approach are highly tailored to specific clinical tasks, which limits their expressivity and generalization, particularly in scenarios with limited data. Instead, we hypothesize that morphologi…

    Submitted 19 May, 2024; originally announced May 2024.

    Comments: CVPR 2024

  29. arXiv:2405.11618 [pdf, other]

    cs.CV cs.AI

    Transcriptomics-guided Slide Representation Learning in Computational Pathology

    Authors: Guillaume Jaume, Lukas Oldenburg, Anurag Vaidya, Richard J. Chen, Drew F. K. Williamson, Thomas Peeters, Andrew H. Song, Faisal Mahmood

    Abstract: Self-supervised learning (SSL) has been successful in building patch embeddings of small histology images (e.g., 224x224 pixels), but scaling these models to learn slide embeddings from the entirety of giga-pixel whole-slide images (WSIs) remains challenging. Here, we leverage complementary information from gene expression profiles to guide slide representation learning using multimodal pre-traini…

    Submitted 19 May, 2024; originally announced May 2024.

    Comments: CVPR'24, Oral

  30. arXiv:2404.05657 [pdf, other]

    cs.CV

    MLP Can Be A Good Transformer Learner

    Authors: Sihao Lin, Pumeng Lyu, Dongrui Liu, Tao Tang, Xiaodan Liang, Andy Song, Xiaojun Chang

    Abstract: Self-attention mechanism is the key of the Transformer but often criticized for its computation demands. Previous token pruning works motivate their methods from the view of computation redundancy but still need to load the full network and require same memory costs. This paper introduces a novel strategy that simplifies vision transformers and reduces computational load through the selective remo…

    Submitted 8 April, 2024; originally announced April 2024.

    Comments: efficient transformer

  31. arXiv:2403.04161 [pdf, other]

    cs.LG cs.CV cs.NE

    SWAP-NAS: Sample-Wise Activation Patterns for Ultra-fast NAS

    Authors: Yameng Peng, Andy Song, Haytham M. Fayek, Vic Ciesielski, Xiaojun Chang

    Abstract: Training-free metrics (a.k.a. zero-cost proxies) are widely used to avoid resource-intensive neural network training, especially in Neural Architecture Search (NAS). Recent studies show that existing training-free metrics have several limitations, such as limited correlation and poor generalisation across different search spaces and tasks. Hence, we propose Sample-Wise Activation Patterns and its…

    Submitted 24 June, 2024; v1 submitted 6 March, 2024; originally announced March 2024.

    Comments: ICLR2024 Spotlight

  32. arXiv:2402.01988 [pdf, other]

    cs.ET physics.optics

    Low-power scalable multilayer optoelectronic neural networks enabled with incoherent light

    Authors: Alexander Song, Sai Nikhilesh Murty Kottapalli, Rahul Goyal, Bernhard Schölkopf, Peer Fischer

    Abstract: Optical approaches have made great strides towards the goal of high-speed, energy-efficient computing necessary for modern deep learning and AI applications. Read-in and read-out of data, however, limit the overall performance of existing approaches. This study introduces a multilayer optoelectronic computing framework that alternates between optical and optoelectronic layers to implement matrix-v…

    Submitted 2 February, 2024; originally announced February 2024.

  33. arXiv:2401.06148 [pdf, other]

    eess.IV cs.AI cs.CV q-bio.QM

    Artificial Intelligence for Digital and Computational Pathology

    Authors: Andrew H. Song, Guillaume Jaume, Drew F. K. Williamson, Ming Y. Lu, Anurag Vaidya, Tiffany R. Miller, Faisal Mahmood

    Abstract: Advances in digitizing tissue slides and the fast-paced progress in artificial intelligence, including deep learning, have boosted the field of computational pathology. This field holds tremendous potential to automate clinical diagnosis, predict patient prognosis and response to therapy, and discover new morphological biomarkers from tissue images. Some of these artificial intelligence-based syst…

    Submitted 12 December, 2023; originally announced January 2024.

    Journal ref: Nature Reviews Bioengineering 2023

  34. arXiv:2401.01721 [pdf, other]

    cs.IT eess.SP

    Limited Feedback on Measurements: Sharing a Codebook or a Generative Model?

    Authors: Nurettin Turan, Benedikt Fesl, Michael Joham, Zhengxiang Ma, Anthony C. K. Soong, Baoling Sheen, Weimin Xiao, Wolfgang Utschick

    Abstract: Discrete Fourier transform (DFT) codebook-based solutions are well-established for limited feedback schemes in frequency division duplex (FDD) systems. In recent years, data-aided solutions have been shown to achieve higher performance, enabled by the adaptivity of the feedback scheme to the propagation environment of the base station (BS) cell. In particular, a versatile limited feedback scheme u…

    Submitted 3 January, 2024; originally announced January 2024.

  35. arXiv:2310.14137 [pdf, other]

    cs.CR

    Finding Vulnerabilities in Mobile Application APIs: A Modular Programmatic Approach

    Authors: Nate Haris, Kendree Chen, Ann Song, Benjamin Pou

    Abstract: Currently, Application Programming Interfaces (APIs) are becoming increasingly popular to facilitate data transfer in a variety of mobile applications. These APIs often process sensitive user information through their endpoints, which are potentially exploitable due to developer misimplementation. In this paper, a custom, modular endpoint vulnerability detection tool was created and implemented to…

    Submitted 21 October, 2023; originally announced October 2023.

    ACM Class: K.6.5

  36. arXiv:2308.15474 [pdf, other]

    cs.CV cs.AI q-bio.TO

    A General-Purpose Self-Supervised Model for Computational Pathology

    Authors: Richard J. Chen, Tong Ding, Ming Y. Lu, Drew F. K. Williamson, Guillaume Jaume, Bowen Chen, Andrew Zhang, Daniel Shao, Andrew H. Song, Muhammad Shaban, Mane Williams, Anurag Vaidya, Sharifa Sahai, Lukas Oldenburg, Luca L. Weishaupt, Judy J. Wang, Walt Williams, Long Phi Le, Georg Gerber, Faisal Mahmood

    Abstract: Tissue phenotyping is a fundamental computational pathology (CPath) task in learning objective characterizations of histopathologic biomarkers in anatomic pathology. However, whole-slide imaging (WSI) poses a complex computer vision problem in which the large-scale image resolutions of WSIs and the enormous diversity of morphological phenotypes preclude large-scale data annotation. Current efforts…

    Submitted 29 August, 2023; originally announced August 2023.

  37. arXiv:2307.14907 [pdf, other]

    eess.IV cs.CV q-bio.QM

    Weakly Supervised AI for Efficient Analysis of 3D Pathology Samples

    Authors: Andrew H. Song, Mane Williams, Drew F. K. Williamson, Guillaume Jaume, Andrew Zhang, Bowen Chen, Robert Serafin, Jonathan T. C. Liu, Alex Baras, Anil V. Parwani, Faisal Mahmood

    Abstract: Human tissue and its constituent cells form a microenvironment that is fundamentally three-dimensional (3D). However, the standard-of-care in pathologic diagnosis involves selecting a few two-dimensional (2D) sections for microscopic evaluation, risking sampling bias and misdiagnosis. Diverse methods for capturing 3D tissue morphologies have been developed, but they have yet had little translation…

    Submitted 27 July, 2023; originally announced July 2023.

  38. A Survey on Explainable Artificial Intelligence for Cybersecurity

    Authors: Gaith Rjoub, Jamal Bentahar, Omar Abdel Wahab, Rabeb Mizouni, Alyssa Song, Robin Cohen, Hadi Otrok, Azzam Mourad

    Abstract: The black-box nature of artificial intelligence (AI) models has been the source of many concerns in their use for critical applications. Explainable Artificial Intelligence (XAI) is a rapidly growing research field that aims to create machine learning models that can provide clear and interpretable explanations for their decisions and actions. In the field of network cybersecurity, XAI has the pot…

    Submitted 11 June, 2023; v1 submitted 7 March, 2023; originally announced March 2023.

  39. arXiv:2212.09479 [pdf]

    cs.NE cs.AI

    Performance assessment and exhaustive listing of 500+ nature inspired metaheuristic algorithms

    Authors: Zhongqiang Ma, Guohua Wu, Ponnuthurai N. Suganthan, Aijuan Song, Qizhang Luo

    Abstract: Metaheuristics are popularly used in various fields, and they have attracted much attention in the scientific and industrial communities. In recent years, the number of new metaheuristic names has been continuously growing. Generally, the inventors attribute the novelties of these new algorithms to inspirations from either biology, human behaviors, physics, or other phenomena. In addition, these n…

    Submitted 19 December, 2022; originally announced December 2022.

    Report number: 45 pages

  40. arXiv:2209.02946  [pdf, other]

    stat.ML cs.LG

    On the Sparse DAG Structure Learning Based on Adaptive Lasso

    Authors: Danru Xu, Erdun Gao, Wei Huang, Menghan Wang, Andy Song, Mingming Gong

    Abstract: Learning the underlying Bayesian Networks (BNs), represented by directed acyclic graphs (DAGs), of the concerned events from purely-observational data is a crucial part of evidential reasoning. This task remains challenging due to the large and discrete search space. A recent flurry of developments following NOTEARS[1] recast this combinatorial problem into a continuous optimization problem by leve…

    Submitted 17 February, 2023; v1 submitted 7 September, 2022; originally announced September 2022.

    Comments: 11 pages, 8 figures

  41. arXiv:2206.08885  [pdf, other]

    eess.IV cs.CV cs.LG stat.ME

    Incorporating intratumoral heterogeneity into weakly-supervised deep learning models via variance pooling

    Authors: Iain Carmichael, Andrew H. Song, Richard J. Chen, Drew F. K. Williamson, Tiffany Y. Chen, Faisal Mahmood

    Abstract: Supervised learning tasks such as cancer survival prediction from gigapixel whole slide images (WSIs) are a critical challenge in computational pathology that requires modeling complex features of the tumor microenvironment. These learning tasks are often solved with deep multi-instance learning (MIL) models that do not explicitly capture intratumoral heterogeneity. We develop a novel variance poo…

    Submitted 19 November, 2022; v1 submitted 17 June, 2022; originally announced June 2022.

    Comments: MICCAI 2022

  42. PRE-NAS: Predictor-assisted Evolutionary Neural Architecture Search

    Authors: Yameng Peng, Andy Song, Vic Ciesielski, Haytham M. Fayek, Xiaojun Chang

    Abstract: Neural architecture search (NAS) aims to automate architecture engineering in neural networks. This often requires a high computational overhead to evaluate a number of candidate networks from the set of all possible networks in the search space during the search. Prediction of the networks' performance can alleviate this high computational overhead by mitigating the need for evaluating every cand…

    Submitted 27 April, 2022; originally announced April 2022.

    Comments: Accepted by GECCO 2022

    ACM Class: I.2; I.4

  43. arXiv:2204.03112  [pdf]

    cs.RO eess.SY

    An Instrumented Wheel-On-Limb System of Planetary Rovers for Wheel-Terrain Interactions: System Conception and Preliminary Design

    Authors: Lihang Feng, Xu Jiang, Aiguo Song

    Abstract: Understanding the wheel-terrain interaction is of great importance to improve the maneuverability and traversability of the rovers. A well-developed sensing device carried by the rover would greatly facilitate the complex risk-reducing operations on sandy terrains. In this paper, an instrumented wheel-on-limb (WOL) system of planetary rovers for wheel-terrain interaction characterization is presen…

    Submitted 6 April, 2022; originally announced April 2022.

    Comments: 2nd International Conference on Robotics and Control Engineering, ACM RobCE 2022, March 25, 2022, Nanjing, China

  44. arXiv:2202.12808  [pdf, other]

    eess.SP cs.LG stat.CO stat.ML

    High-Dimensional Sparse Bayesian Learning without Covariance Matrices

    Authors: Alexander Lin, Andrew H. Song, Berkin Bilgic, Demba Ba

    Abstract: Sparse Bayesian learning (SBL) is a powerful framework for tackling the sparse coding problem. However, the most popular inference algorithms for SBL become too expensive for high-dimensional settings, due to the need to store and compute a large covariance matrix. We introduce a new inference scheme that avoids explicit construction of the covariance matrix by solving multiple linear systems in p…

    Submitted 25 February, 2022; originally announced February 2022.

    Comments: 5 pages

    Journal ref: IEEE ICASSP 2022

  45. arXiv:2110.04683  [pdf, other]

    cs.LG eess.SP

    Mixture Model Auto-Encoders: Deep Clustering through Dictionary Learning

    Authors: Alexander Lin, Andrew H. Song, Demba Ba

    Abstract: State-of-the-art approaches for clustering high-dimensional data utilize deep auto-encoder architectures. Many of these networks require a large number of parameters and suffer from a lack of interpretability, due to the black-box nature of the auto-encoders. We introduce Mixture Model Auto-Encoders (MixMate), a novel architecture that clusters data by performing inference on a generative model. D…

    Submitted 25 February, 2022; v1 submitted 9 October, 2021; originally announced October 2021.

    Comments: 5 pages, 3 figures

    Journal ref: IEEE ICASSP 2022

  46. arXiv:2109.06057  [pdf, other]

    cs.CV

    Unsupervised Person Re-Identification: A Systematic Survey of Challenges and Solutions

    Authors: Xiangtan Lin, Pengzhen Ren, Chung-Hsing Yeh, Lina Yao, Andy Song, Xiaojun Chang

    Abstract: Person re-identification (Re-ID) has been a significant research topic in the past decade due to its real-world applications and research significance. While supervised person Re-ID methods achieve superior performance over unsupervised counterparts, they cannot scale to large unlabelled datasets and new domains due to the prohibitive labelling cost. Therefore, unsupervised person Re-ID has drawn…

    Submitted 1 October, 2021; v1 submitted 31 August, 2021; originally announced September 2021.

    Comments: 20 pages

  47. MoParkeR: Multi-objective Parking Recommendation

    Authors: Mohammad Saiedur Rahaman, Wei Shao, Flora D. Salim, Ayad Turky, Andy Song, Jeffrey Chan, Junliang Jiang, Doug Bradbrook

    Abstract: Existing parking recommendation solutions mainly focus on finding and suggesting parking spaces based on the unoccupied options only. However, there are other factors associated with parking spaces that can influence someone's choice of parking, such as fare, parking rule, walking distance to destination, travel time, and likelihood of being unoccupied at a given time. More importantly, these factors may…

    Submitted 10 June, 2021; originally announced June 2021.

    Comments: 6 pages, 5 figures

  48. arXiv:2105.10439  [pdf, other]

    eess.SP cs.LG stat.ML

    Covariance-Free Sparse Bayesian Learning

    Authors: Alexander Lin, Andrew H. Song, Berkin Bilgic, Demba Ba

    Abstract: Sparse Bayesian learning (SBL) is a powerful framework for tackling the sparse coding problem while also providing uncertainty quantification. The most popular inference algorithms for SBL exhibit prohibitively large computational costs for high-dimensional problems due to the need to maintain a large covariance matrix. To resolve this issue, we introduce a new method for accelerating SBL inferenc…

    Submitted 8 April, 2022; v1 submitted 21 May, 2021; originally announced May 2021.

    Comments: 13 pages

  49. arXiv:2104.00530  [pdf, other]

    cs.LG stat.AP stat.ML

    Gaussian Process Convolutional Dictionary Learning

    Authors: Andrew H. Song, Bahareh Tolooshams, Demba Ba

    Abstract: Convolutional dictionary learning (CDL), the problem of estimating shift-invariant templates from data, is typically conducted in the absence of a prior/structure on the templates. In data-scarce or low signal-to-noise ratio (SNR) regimes, learned templates overfit the data and lack smoothness, which can affect the predictive performance of downstream tasks. To address this limitation, we propose…

    Submitted 24 November, 2021; v1 submitted 28 March, 2021; originally announced April 2021.

    Comments: IEEE Signal Processing Letters (2021)

  50. A Bayesian Approach for Inferring Sea Ice Loads

    Authors: Matthew Parno, Taylor Hodgdon, Brendan West, Devin O'Connor, Arnold Song

    Abstract: The Earth's climate is rapidly changing and some of the most drastic changes can be seen in the Arctic, where sea ice extent has diminished considerably in recent years. As the Arctic climate continues to change, gathering in situ sea ice measurements is increasingly important for understanding the complex evolution of the Arctic ice pack. To date, observations of ice stresses in the Arctic have b…

    Submitted 16 February, 2021; originally announced February 2021.