
Showing 1–50 of 124 results for author: Mirza, M

  1. arXiv:2505.22486  [pdf, other]

    cs.LG cs.CV

    Understanding Adversarial Training with Energy-based Models

    Authors: Mujtaba Hussain Mirza, Maria Rosaria Briglia, Filippo Bartolucci, Senad Beadini, Giuseppe Lisanti, Iacopo Masi

    Abstract: We aim at using Energy-based Model (EBM) framework to better understand adversarial training (AT) in classifiers, and additionally to analyze the intrinsic generative capabilities of robust classifiers. By viewing standard classifiers through an energy lens, we begin by analyzing how the energies of adversarial examples, generated by various attacks, differ from those of the natural samples. The c…

    Submitted 28 May, 2025; originally announced May 2025.

    Comments: Under review for TPAMI

  2. arXiv:2505.21742  [pdf, ps, other]

    cs.CV cs.LG

    What is Adversarial Training for Diffusion Models?

    Authors: Briglia Maria Rosaria, Mujtaba Hussain Mirza, Giuseppe Lisanti, Iacopo Masi

    Abstract: We answer the question in the title, showing that adversarial training (AT) for diffusion models (DMs) fundamentally differs from classifiers: while AT in classifiers enforces output invariance, AT in DMs requires equivariance to keep the diffusion process aligned with the data distribution. AT is a way to enforce smoothness in the diffusion flow, improving robustness to outliers and corrupted dat…

    Submitted 27 May, 2025; originally announced May 2025.

    Comments: 40 pages

  3. arXiv:2505.18115  [pdf, ps, other]

    cs.CV

    Instructify: Demystifying Metadata to Visual Instruction Tuning Data Conversion

    Authors: Jacob Hansen, Wei Lin, Junmo Kang, Muhammad Jehanzeb Mirza, Hongyin Luo, Rogerio Feris, Alan Ritter, James Glass, Leonid Karlinsky

    Abstract: Visual Instruction Tuning (VisIT) data, commonly available as human-assistant conversations with images interleaved in the human turns, are currently the most widespread vehicle for aligning strong LLMs to understand visual inputs, converting them to strong LMMs. While many VisIT datasets are available, most are constructed using ad-hoc techniques developed independently by different groups. They…

    Submitted 23 May, 2025; originally announced May 2025.

  4. arXiv:2505.07793  [pdf, ps, other]

    cs.LG cs.AI

    Overflow Prevention Enhances Long-Context Recurrent LLMs

    Authors: Assaf Ben-Kish, Itamar Zimerman, M. Jehanzeb Mirza, James Glass, Leonid Karlinsky, Raja Giryes

    Abstract: A recent trend in LLMs is developing recurrent sub-quadratic models that improve long-context processing efficiency. We investigate leading large long-context models, focusing on how their fixed-size recurrent memory affects their performance. Our experiments reveal that, even when these models are trained for extended contexts, their use of long contexts remains underutilized. Specifically, we de…

    Submitted 12 May, 2025; originally announced May 2025.

  5. A Modularized Design Approach for GelSight Family of Vision-based Tactile Sensors

    Authors: Arpit Agarwal, Mohammad Amin Mirzaee, Xiping Sun, Wenzhen Yuan

    Abstract: GelSight family of vision-based tactile sensors has proven to be effective for multiple robot perception and manipulation tasks. These sensors are based on an internal optical system and an embedded camera to capture the deformation of the soft sensor surface, inferring the high-resolution geometry of the objects in contact. However, customizing the sensors for different robot hands requires a ted…

    Submitted 20 April, 2025; originally announced April 2025.

    Comments: The paper is accepted to International Journal of Robotics Research with DOI 10.1177/02783649251339680

  6. arXiv:2504.14231  [pdf, other]

    cs.CV

    Exploring Modality Guidance to Enhance VFM-based Feature Fusion for UDA in 3D Semantic Segmentation

    Authors: Johannes Spoecklberger, Wei Lin, Pedro Hermosilla, Sivan Doveh, Horst Possegger, M. Jehanzeb Mirza

    Abstract: Vision Foundation Models (VFMs) have become a de facto choice for many downstream vision tasks, like image classification, image segmentation, and object localization. However, they can also provide significant utility for downstream 3D tasks that can leverage the cross-modal information (e.g., from paired image data). In our work, we further explore the utility of VFMs for adapting from a labeled…

    Submitted 19 April, 2025; originally announced April 2025.

  7. arXiv:2504.00220  [pdf, other]

    cs.LG cs.AI cs.CV

    Can Diffusion Models Disentangle? A Theoretical Perspective

    Authors: Liming Wang, Muhammad Jehanzeb Mirza, Yishu Gong, Yuan Gong, Jiaqi Zhang, Brian H. Tracey, Katerina Placek, Marco Vilela, James R. Glass

    Abstract: This paper presents a novel theoretical framework for understanding how diffusion models can learn disentangled representations. Within this framework, we establish identifiability conditions for general disentangled latent variable models, analyze training dynamics, and derive sample complexity bounds for disentangled latent subspace models. To validate our theory, we conduct disentanglement expe…

    Submitted 31 March, 2025; originally announced April 2025.

  8. GelBelt: A Vision-based Tactile Sensor for Continuous Sensing of Large Surfaces

    Authors: Mohammad Amin Mirzaee, Hung-Jui Huang, Wenzhen Yuan

    Abstract: Scanning large-scale surfaces is widely demanded in surface reconstruction applications and detecting defects in industries' quality control and maintenance stages. Traditional vision-based tactile sensors have shown promising performance in high-resolution shape reconstruction while suffering limitations such as small sensing areas or susceptibility to damage when slid across surfaces, making the…

    Submitted 9 January, 2025; originally announced January 2025.

    Comments: Accepted to IEEE RA-L. 8 pages, 7 figures, webpage: https://aminmirz.github.io/GelBelt/

  9. arXiv:2411.13317  [pdf, other]

    cs.CV

    Teaching VLMs to Localize Specific Objects from In-context Examples

    Authors: Sivan Doveh, Nimrod Shabtay, Wei Lin, Eli Schwartz, Hilde Kuehne, Raja Giryes, Rogerio Feris, Leonid Karlinsky, James Glass, Assaf Arbelle, Shimon Ullman, M. Jehanzeb Mirza

    Abstract: Vision-Language Models (VLMs) have shown remarkable capabilities across diverse visual tasks, including image recognition, video understanding, and Visual Question Answering (VQA) when explicitly trained for these tasks. Despite these advances, we find that present-day VLMs (including the proprietary GPT-4o) lack a fundamental cognitive ability: learning to localize specific objects in a scene by…

    Submitted 12 March, 2025; v1 submitted 20 November, 2024; originally announced November 2024.

  10. arXiv:2410.10783  [pdf, other]

    cs.CV

    LiveXiv -- A Multi-Modal Live Benchmark Based on Arxiv Papers Content

    Authors: Nimrod Shabtay, Felipe Maia Polo, Sivan Doveh, Wei Lin, M. Jehanzeb Mirza, Leshem Chosen, Mikhail Yurochkin, Yuekai Sun, Assaf Arbelle, Leonid Karlinsky, Raja Giryes

    Abstract: The large-scale training of multi-modal models on data scraped from the web has shown outstanding utility in infusing these models with the required world knowledge to perform effectively on multiple downstream tasks. However, one downside of scraping data from the web can be the potential sacrifice of the benchmarks on which the abilities of these models are often evaluated. To safeguard against…

    Submitted 22 April, 2025; v1 submitted 14 October, 2024; originally announced October 2024.

  11. arXiv:2410.06154  [pdf, other]

    cs.CV

    GLOV: Guided Large Language Models as Implicit Optimizers for Vision Language Models

    Authors: M. Jehanzeb Mirza, Mengjie Zhao, Zhuoyuan Mao, Sivan Doveh, Wei Lin, Paul Gavrikov, Michael Dorkenwald, Shiqi Yang, Saurav Jha, Hiromi Wakaki, Yuki Mitsufuji, Horst Possegger, Rogerio Feris, Leonid Karlinsky, James Glass

    Abstract: In this work, we propose GLOV, which enables Large Language Models (LLMs) to act as implicit optimizers for Vision-Language Models (VLMs) to enhance downstream vision tasks. GLOV prompts an LLM with the downstream task description, querying it for suitable VLM prompts (e.g., for zero-shot classification with CLIP). These prompts are ranked according to their fitness for the downstream vision task.…

    Submitted 5 February, 2025; v1 submitted 8 October, 2024; originally announced October 2024.

    Comments: Code: https://github.com/jmiemirza/GLOV

  12. arXiv:2410.00700  [pdf, other]

    cs.CV cs.AI

    Mining Your Own Secrets: Diffusion Classifier Scores for Continual Personalization of Text-to-Image Diffusion Models

    Authors: Saurav Jha, Shiqi Yang, Masato Ishii, Mengjie Zhao, Christian Simon, Muhammad Jehanzeb Mirza, Dong Gong, Lina Yao, Shusuke Takahashi, Yuki Mitsufuji

    Abstract: Personalized text-to-image diffusion models have grown popular for their ability to efficiently acquire a new concept from user-defined text descriptions and a few images. However, in the real world, a user may wish to personalize a model on multiple concepts but one at a time, with no access to the data from previous concepts due to storage/privacy concerns. When faced with this continual learnin…

    Submitted 9 February, 2025; v1 submitted 1 October, 2024; originally announced October 2024.

    Comments: Accepted to ICLR 2025

  13. arXiv:2407.18309  [pdf]

    cs.RO eess.SY

    Adaptive Terminal Sliding Mode Control Using Deep Reinforcement Learning for Zero-Force Control of Exoskeleton Robot Systems

    Authors: Morteza Mirzaee, Reza Kazemi

    Abstract: This paper introduces a novel zero-force control method for upper-limb exoskeleton robots, which are used in a variety of applications including rehabilitation, assistance, and human physical capability enhancement. The proposed control method employs an Adaptive Integral Terminal Sliding Mode (AITSM) controller, combined with an exponential reaching law and Proximal Policy Optimization (PPO), a t…

    Submitted 25 July, 2024; originally announced July 2024.

  14. arXiv:2407.06315  [pdf, other]

    cs.CV cs.LG

    Shedding More Light on Robust Classifiers under the lens of Energy-based Models

    Authors: Mujtaba Hussain Mirza, Maria Rosaria Briglia, Senad Beadini, Iacopo Masi

    Abstract: By reinterpreting a robust discriminative classifier as Energy-based Model (EBM), we offer a new take on the dynamics of adversarial training (AT). Our analysis of the energy landscape during AT reveals that untargeted attacks generate adversarial images much more in-distribution (lower energy) than the original data from the point of view of the model. Conversely, we observe the opposite for targ…

    Submitted 10 September, 2024; v1 submitted 8 July, 2024; originally announced July 2024.

    Comments: Accepted at European Conference on Computer Vision (ECCV) 2024

  15. arXiv:2407.00346  [pdf, other]

    quant-ph

    Photon routing in disordered chiral waveguide QED ladders: Interplay between photonic localization and collective atomic effects

    Authors: Nishan Amgain, Imran M. Mirza

    Abstract: In recent years, photon routing has garnered considerable research activity due to its key applications in quantum networking and optical communications. This paper studies the single photon routing scheme in many-emitter disordered chiral waveguide quantum electrodynamics (wQED) ladders. The wQED ladder consists of two one-dimensional lossless waveguides simultaneously and chirally coupled with a…

    Submitted 29 June, 2024; originally announced July 2024.

    Comments: 10 pages, 6 figures

  16. arXiv:2406.09240  [pdf, other]

    cs.CV

    Comparison Visual Instruction Tuning

    Authors: Wei Lin, Muhammad Jehanzeb Mirza, Sivan Doveh, Rogerio Feris, Raja Giryes, Sepp Hochreiter, Leonid Karlinsky

    Abstract: Comparing two images in terms of Commonalities and Differences (CaD) is a fundamental human capability that forms the basis of advanced visual reasoning and interpretation. It is essential for the generation of detailed and contextually relevant descriptions, performing comparative analysis, novelty detection, and making informed decisions based on visual data. However, surprisingly, little attent…

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: Project page: https://wlin-at.github.io/cad_vi ; Huggingface dataset repo: https://huggingface.co/datasets/wlin21at/CaD-Inst

  17. arXiv:2406.08164  [pdf, other]

    cs.CV

    ConMe: Rethinking Evaluation of Compositional Reasoning for Modern VLMs

    Authors: Irene Huang, Wei Lin, M. Jehanzeb Mirza, Jacob A. Hansen, Sivan Doveh, Victor Ion Butoi, Roei Herzig, Assaf Arbelle, Hilde Kuehne, Trevor Darrell, Chuang Gan, Aude Oliva, Rogerio Feris, Leonid Karlinsky

    Abstract: Compositional Reasoning (CR) entails grasping the significance of attributes, relations, and word order. Recent Vision-Language Models (VLMs), comprising a visual encoder and a Large Language Model (LLM) decoder, have demonstrated remarkable proficiency in such reasoning tasks. This prompts a crucial question: have VLMs effectively tackled the CR challenge? We conjecture that existing CR benchmark…

    Submitted 12 November, 2024; v1 submitted 12 June, 2024; originally announced June 2024.

    Comments: NeurIPS 2024 Camera Ready

  18. arXiv:2406.06638  [pdf, other]

    hep-ph cs.LG

    Particle Multi-Axis Transformer for Jet Tagging

    Authors: Muhammad Usman, M Husnain Shahid, Maheen Ejaz, Ummay Hani, Nayab Fatima, Abdul Rehman Khan, Asifullah Khan, Nasir Majid Mirza

    Abstract: Jet tagging is an essential categorization problem in high energy physics. In recent times, Deep Learning has not only risen to the challenge of jet tagging but also significantly improved its performance. In this article, we proposed an idea of a new architecture, Particle Multi-Axis transformer (ParMAT) which is a modified version of Particle transformer (ParT). ParMAT contains local and global…

    Submitted 16 July, 2024; v1 submitted 9 June, 2024; originally announced June 2024.

  19. arXiv:2404.10534  [pdf, other]

    cs.CV cs.AI

    Into the Fog: Evaluating Robustness of Multiple Object Tracking

    Authors: Nadezda Kirillova, M. Jehanzeb Mirza, Horst Bischof, Horst Possegger

    Abstract: State-of-the-art Multiple Object Tracking (MOT) approaches have shown remarkable performance when trained and evaluated on current benchmarks. However, these benchmarks primarily consist of clear weather scenarios, overlooking adverse atmospheric conditions such as fog, haze, smoke and dust. As a result, the robustness of trackers against these challenging conditions remains underexplored. To addr…

    Submitted 13 November, 2024; v1 submitted 12 April, 2024; originally announced April 2024.

    Journal ref: BMVC 2024

  20. arXiv:2403.12736  [pdf, other]

    cs.CV

    Towards Multimodal In-Context Learning for Vision & Language Models

    Authors: Sivan Doveh, Shaked Perek, M. Jehanzeb Mirza, Wei Lin, Amit Alfassy, Assaf Arbelle, Shimon Ullman, Leonid Karlinsky

    Abstract: State-of-the-art Vision-Language Models (VLMs) ground the vision and the language modality primarily via projecting the vision tokens from the encoder to language-like tokens, which are directly fed to the Large Language Model (LLM) decoder. While these models have shown unprecedented performance in many downstream zero-shot tasks (e.g., image captioning, question answers, etc), still little emphasis…

    Submitted 17 July, 2024; v1 submitted 19 March, 2024; originally announced March 2024.

  21. arXiv:2403.11755  [pdf, other]

    cs.CV cs.AI cs.LG

    Meta-Prompting for Automating Zero-shot Visual Recognition with LLMs

    Authors: M. Jehanzeb Mirza, Leonid Karlinsky, Wei Lin, Sivan Doveh, Jakub Micorek, Mateusz Kozinski, Hilde Kuehne, Horst Possegger

    Abstract: Prompt ensembling of Large Language Model (LLM) generated category-specific prompts has emerged as an effective method to enhance zero-shot recognition ability of Vision-Language Models (VLMs). To obtain these category-specific prompts, the present methods rely on hand-crafting the prompts to the LLMs for generating VLM prompts for the downstream tasks. However, this requires manually composing th…

    Submitted 7 August, 2024; v1 submitted 18 March, 2024; originally announced March 2024.

    Comments: ECCV Camera Ready. Code & Data: https://jmiemirza.github.io/Meta-Prompting/

  22. arXiv:2403.11691  [pdf, other]

    cs.CV

    TTT-KD: Test-Time Training for 3D Semantic Segmentation through Knowledge Distillation from Foundation Models

    Authors: Lisa Weijler, Muhammad Jehanzeb Mirza, Leon Sick, Can Ekkazan, Pedro Hermosilla

    Abstract: Test-Time Training (TTT) proposes to adapt a pre-trained network to changing data distributions on-the-fly. In this work, we propose the first TTT method for 3D semantic segmentation, TTT-KD, which models Knowledge Distillation (KD) from foundation models (e.g. DINOv2) as a self-supervised objective for adaptation to distribution shifts at test-time. Given access to paired image-pointcloud (2D-3D)…

    Submitted 18 March, 2024; originally announced March 2024.

  23. arXiv:2403.09193  [pdf, other]

    cs.CV cs.AI cs.LG q-bio.NC

    Can We Talk Models Into Seeing the World Differently?

    Authors: Paul Gavrikov, Jovita Lukasik, Steffen Jung, Robert Geirhos, M. Jehanzeb Mirza, Margret Keuper, Janis Keuper

    Abstract: Unlike traditional vision-only models, vision language models (VLMs) offer an intuitive way to access visual content through language prompting by combining a large language model (LLM) with a vision encoder. However, both the LLM and the vision encoder come with their own set of biases, cue preferences, and shortcuts, which have been rigorously studied in uni-modal models. A timely question is ho…

    Submitted 5 March, 2025; v1 submitted 14 March, 2024; originally announced March 2024.

    Comments: Accepted at ICLR 2025

  24. Band Gap Engineering and Controlling Transport Properties of Single Photons in Periodic and Disordered Jaynes-Cummings Arrays

    Authors: Tiberius Berndsen, Nishan Amgain, Imran M. Mirza

    Abstract: We theoretically study the single photon transport properties in periodic and position-disordered Jaynes-Cummings (or JC) arrays of waveguide-coupled microtoroidal ring resonators, each interacting with a single two-level quantum emitter. Employing the real-space formalism of quantum optics, we focus on various parameter regimes of cavity quantum electrodynamics (cQED) to gain better control of si…

    Submitted 26 January, 2024; originally announced January 2024.

    Comments: 12 pages, 5 figures

    Journal ref: J. Opt. Soc. Amer. B; Vol. 41, Issue 8, pp. C9-C19 (2024)

  25. arXiv:2309.06809  [pdf, other]

    cs.CV

    TAP: Targeted Prompting for Task Adaptive Generation of Textual Training Instances for Visual Classification

    Authors: M. Jehanzeb Mirza, Leonid Karlinsky, Wei Lin, Horst Possegger, Rogerio Feris, Horst Bischof

    Abstract: Vision and Language Models (VLMs), such as CLIP, have enabled visual recognition of a potentially unlimited set of categories described by text prompts. However, for the best visual recognition performance, these models still require tuning to better fit the data distributions of the downstream tasks, in order to overcome the domain shift from the web-based pre-training data. Recently, it has been…

    Submitted 13 September, 2023; originally announced September 2023.

    Comments: Code is available at: https://github.com/jmiemirza/TAP

  26. Chirality-assisted enhancement of tripartite entanglement in waveguide QED

    Authors: Logan Patrick, Umar Arshad, Dingyu Guo, Imran M. Mirza

    Abstract: We study the generation and control of genuine tripartite entanglement among quantum emitters (QEs) that are side coupled to one-dimensional spin-momentum locked (or chiral) waveguides. By applying the machinery of Fock state master equations along with the recently proposed concurrence fill measure of tripartite entanglement [S. Xie and J. H. Eberly, Phys. Rev. Lett. 127, 040403 (2021)], we analy…

    Submitted 23 August, 2023; originally announced August 2023.

    Comments: 14 pages, 6 figures

    Journal ref: Sci Rep 14, 11175 (2024)

  27. arXiv:2308.01096  [pdf, other]

    eess.IV

    Learning Fourier-Constrained Diffusion Bridges for MRI Reconstruction

    Authors: Muhammad U. Mirza, Onat Dalmaz, Hasan A. Bedel, Gokberk Elmas, Yilmaz Korkmaz, Alper Gungor, Salman UH Dar, Tolga Çukur

    Abstract: Deep generative models have gained recent traction in accelerated MRI reconstruction. Diffusion priors are particularly promising given their representational fidelity. Instead of the target transformation from undersampled to fully-sampled data required for MRI reconstruction, common diffusion priors are trained to learn a task-agnostic transformation from an asymptotic start-point of Gaussian no…

    Submitted 16 December, 2023; v1 submitted 2 August, 2023; originally announced August 2023.

  28. Electromagnetically induced transparency in many-emitter waveguide quantum electrodynamics: linear versus nonlinear waveguide dispersions

    Authors: Tiberius Berndsen, Imran M. Mirza

    Abstract: We study single-photon induced electromagnetically induced transparency (EIT) in many-emitter waveguide quantum electrodynamics (wQED) with linear and nonlinear waveguide dispersion relations. In the single-emitter problem, in addition to the robustness of the EIT spectral features in the over-coupled regime of wQED, we find that the nonlinear dispersion results in the appearance of a side peak fo…

    Submitted 11 July, 2023; v1 submitted 7 July, 2023; originally announced July 2023.

    Comments: 7 pages, 11 figures

    Journal ref: Phys. Rev. A 108, 063702 (2023)

  29. arXiv:2305.18953  [pdf, other]

    cs.CV

    Sit Back and Relax: Learning to Drive Incrementally in All Weather Conditions

    Authors: Stefan Leitner, M. Jehanzeb Mirza, Wei Lin, Jakub Micorek, Marc Masana, Mateusz Kozinski, Horst Possegger, Horst Bischof

    Abstract: In autonomous driving scenarios, current object detection models show strong performance when tested in clear weather. However, their performance deteriorates significantly when tested in degrading weather conditions. In addition, even when adapted to perform robustly in a sequence of different weather conditions, they are often unable to perform well in all of them and suffer from catastrophic fo…

    Submitted 30 May, 2023; originally announced May 2023.

    Comments: Intelligent Vehicle Conference (oral presentation)

  30. arXiv:2305.18287  [pdf, other]

    cs.CV cs.CL

    LaFTer: Label-Free Tuning of Zero-shot Classifier using Language and Unlabeled Image Collections

    Authors: M. Jehanzeb Mirza, Leonid Karlinsky, Wei Lin, Mateusz Kozinski, Horst Possegger, Rogerio Feris, Horst Bischof

    Abstract: Recently, large-scale pre-trained Vision and Language (VL) models have set a new state-of-the-art (SOTA) in zero-shot visual classification enabling open-vocabulary recognition of potentially unlimited set of categories defined as simple language prompts. However, despite these great advances, the performance of these zero-shot classifiers still falls short of the results of dedicated (closed categ…

    Submitted 23 October, 2023; v1 submitted 29 May, 2023; originally announced May 2023.

    Comments: NeurIPS 2023 (Camera Ready) - Project Page: https://jmiemirza.github.io/LaFTer/

  31. Theoretical and Experimental Challenges in the Measurement of Neutrino Mass

    Authors: Jyotsna Singh, M. Ibrahim Mirza

    Abstract: Neutrino masses are yet unknown. We discuss the present state of effective electron anti-neutrino mass from $β$ decay experiments; effective Majorana neutrino mass from neutrinoless double-beta decay experiments; neutrino mass squared differences from neutrino oscillation: solar, atmospheric, reactor and accelerator based experiments; sum of neutrino masses from cosmological observations. Current…

    Submitted 28 September, 2023; v1 submitted 21 May, 2023; originally announced May 2023.

    Comments: 14 pages, 6 figures

    Journal ref: Advances in High Energy Physics, vol. 2023, Article ID 8897375, 14 pages, 2023

  32. arXiv:2303.14131  [pdf, ps, other]

    physics.gen-ph

    Phase Space Analysis of Fluorine-Oxygen-Nitrogen Network and Energy Generation in Flourine-Oxygen Reaction

    Authors: Babur M. Mirza

    Abstract: Reaction network of fluorine-18, oxygen-15 and nitrogen-15 is considered for its temperature dependent energy output. The main reactions for generation and annihilation of oxygen and fluorine are coupled in the reaction equations while nitrogen is produced as a decay product. We find that the governing set of equations for F18(p, alpha)O15 process in the phase diagram exhibit a predominance of the…

    Submitted 11 March, 2023; originally announced March 2023.

  33. arXiv:2302.13627  [pdf, other]

    quant-ph physics.optics

    Nonreciprocal slow or fast light in anti-$\mathcal{PT}$-symmetric optomechanics

    Authors: Meiyu Peng, Huilai Zhang, Qian Zhang, Tian-Xiang Lu, Imran M. Mirza, Hui Jing

    Abstract: Non-Hermitian systems with anti-parity-time ($\mathcal{APT}$) symmetry have revealed rich physics beyond conventional systems. Here, we study optomechanics in an $\mathcal{APT}$-symmetric spinning resonator and show that, by tuning the rotating speed to approach the exceptional point (EP) or the non-Hermitian spectral degeneracy, nonreciprocal light transmission with a high isolation ratio can be…

    Submitted 27 February, 2023; originally announced February 2023.

    Comments: 9 pages, 4 figures. It has been accepted for publication as a Regular Article in Physical Review A

  34. arXiv:2212.09729  [pdf]

    q-bio.NC

    Bistable perception, precision and neuromodulation

    Authors: Filip Novicky, Thomas Parr, Karl Friston, M. Berk Mirza, Noor Sajid

    Abstract: Bistable perception follows from observing a static, ambiguous, (visual) stimulus with two possible interpretations. Here, we present an active (Bayesian) inference account of bistable perception and posit that perceptual transitions between different interpretations (i.e., inferences) of the same stimulus ensue from specific eye movements that shift the focus to a different visual feature. Formal…

    Submitted 19 December, 2022; originally announced December 2022.

  35. arXiv:2211.15393  [pdf, other]

    cs.CV

    Video Test-Time Adaptation for Action Recognition

    Authors: Wei Lin, Muhammad Jehanzeb Mirza, Mateusz Kozinski, Horst Possegger, Hilde Kuehne, Horst Bischof

    Abstract: Although action recognition systems can achieve top performance when evaluated on in-distribution test points, they are vulnerable to unanticipated distribution shifts in test data. However, test-time adaptation of video action recognition models against common distribution shifts has so far not been demonstrated. We propose to address this problem with an approach tailored to spatio-temporal mode…

    Submitted 20 March, 2023; v1 submitted 24 November, 2022; originally announced November 2022.

    Comments: Accepted at CVPR 2023

  36. arXiv:2211.12870  [pdf, other]

    cs.CV

    ActMAD: Activation Matching to Align Distributions for Test-Time-Training

    Authors: Muhammad Jehanzeb Mirza, Pol Jané Soneira, Wei Lin, Mateusz Kozinski, Horst Possegger, Horst Bischof

    Abstract: Test-Time-Training (TTT) is an approach to cope with out-of-distribution (OOD) data by adapting a trained model to distribution shifts occurring at test-time. We propose to perform this adaptation via Activation Matching (ActMAD): We analyze activations of the model and align activation statistics of the OOD test data to those of the training data. In contrast to existing methods, which model the…

    Submitted 23 March, 2023; v1 submitted 23 November, 2022; originally announced November 2022.

    Comments: CVPR 2023 - Project Page: https://jmiemirza.github.io/ActMAD/

  37. arXiv:2211.11432  [pdf, other]

    cs.CV

    MATE: Masked Autoencoders are Online 3D Test-Time Learners

    Authors: M. Jehanzeb Mirza, Inkyu Shin, Wei Lin, Andreas Schriebl, Kunyang Sun, Jaesung Choe, Horst Possegger, Mateusz Kozinski, In So Kweon, Kun-Jin Yoon, Horst Bischof

    Abstract: Our MATE is the first Test-Time-Training (TTT) method designed for 3D data, which makes deep networks trained for point cloud classification robust to distribution shifts occurring in test data. Like existing TTT methods from the 2D image domain, MATE also leverages test data for adaptation. Its test-time objective is that of a Masked Autoencoder: a large portion of each test point cloud is remove…

    Submitted 20 March, 2023; v1 submitted 21 November, 2022; originally announced November 2022.

    Comments: Code is available at this repository: https://github.com/jmiemirza/MATE

  38. arXiv:2211.05854  [pdf, other]

    cs.LG cs.AI

    Test-time adversarial detection and robustness for localizing humans using ultra wide band channel impulse responses

    Authors: Abhiram Kolli, Muhammad Jehanzeb Mirza, Horst Possegger, Horst Bischof

    Abstract: Keyless entry systems in cars are adopting neural networks for localizing its operators. Using test-time adversarial defences equip such systems with the ability to defend against adversarial attacks without prior training on adversarial samples. We propose a test-time adversarial example detector which detects the input adversarial example through quantifying the localized intermediate responses…

    Submitted 10 November, 2022; originally announced November 2022.

    Comments: 5 pages, 4 figures, ICASSP Conference

  39. Engineering Optomechanically Induced Transparency by coupling a qubit to a spinning resonator

    Authors: Jessica Burns, Owen Root, Hui Jing, Imran M. Mirza

    Abstract: We theoretically study the spectral properties of a pump-probe driven hybrid spinning optomechanical ring resonator optically coupled with a two-level quantum emitter (QE or qubit). Recently we have shown [arXiv:1810.03709] that in the absence of the emitter the coupled cavity version of this setup is not only capable of nonreciprocal light propagation but can also exhibit slow & fast light propag…

    Submitted 13 October, 2022; originally announced October 2022.

    Comments: 9 pages, 6 figures

    Journal ref: J. Opt. Soc. Am. Vol. 40, Issue 5, pp. 958-965 (2023)

  40. arXiv:2209.07598  [pdf, other]

    physics.ins-det

    Mitigation Strategies for ${}^{42}$Ar/${}^{42}$K Background Reduction using Encapsulation with Ultra-Pure Plastic for the LEGEND Experiment

    Authors: M. Ibrahim Mirza

    Abstract: Neutrinoless double-beta (0$νββ$) decay is the most compelling approach to determine the Majorana nature of neutrino and measure effective Majorana neutrino mass. The LEGEND collaboration is aiming to look for 0$νββ$ decay of ${}^{76}$Ge with unprecedented sensitivity. If underground-sourced argon is not available, the cosmogenically-induced isotope ${}^{42}$Ar and its decay progeny ${}^{42}$K in…

    Submitted 6 October, 2022; v1 submitted 15 September, 2022; originally announced September 2022.

    Comments: 4 pages, 3 figures, conference

    Journal ref: AIP Conf. Proc. 2908, 100006 (2023)

  41. arXiv:2206.15340  [pdf, ps, other]

    cond-mat.stat-mech

    Work Extracting From Nonextensive Small System With Feedback and Second Law-Like Inequalities with Quantum Tsallis Entropy

    Authors: Saman Amiri, Mahdi Mirzaee, Mohammad Mazhari

    Abstract: Gibbs-Boltzmann entropy leads to systems that have a strong dependence on initial conditions. In reality, most materials behave quite independently of initial conditions. Nonextensive entropy or Tsallis entropy leads to nonextensive statistical mechanics. In this paper, we calculate the Tsallis form of Clausius inequality and then determine the upper bound for extracting work from the small system…

    Submitted 30 June, 2022; originally announced June 2022.

    Comments: 13 pages

  42. arXiv:2204.08817  [pdf, other]

    cs.CV

    An Efficient Domain-Incremental Learning Approach to Drive in All Weather Conditions

    Authors: M. Jehanzeb Mirza, Marc Masana, Horst Possegger, Horst Bischof

    Abstract: Although deep neural networks enable impressive visual perception performance for autonomous driving, their robustness to varying weather conditions still requires attention. When adapting these models for changed environments, such as different weather conditions, they are prone to forgetting previously learned information. This catastrophic forgetting is typically addressed via incremental learn…

    Submitted 21 April, 2022; v1 submitted 19 April, 2022; originally announced April 2022.

    Comments: Accepted to CVPR Workshops - Camera Ready Version

  43. Non-Markovianity in photosynthetic reaction centers: A noise-induced quantum coherence perspective

    Authors: Zibo Wang, Antonio V. Lim, Imran M. Mirza

    Abstract: The long-standing problem of nearly perfect photosynthetic yield in some types of bacteria and nearly all kinds of plants despite the interaction with a hot and noisy environment has witnessed quantum optical explanations in the last decade or so. Typically in these explanations, photosynthetic reaction centers are modeled as five-level quantum heat engines where the generation of Fano-type interf…

    Submitted 3 May, 2022; v1 submitted 12 April, 2022; originally announced April 2022.

    Comments: 12 pages, 5 figures

    Journal ref: Optics Continuum Vol. 1, Issue 8, pp. 1848-1858 (2022)

  44. arXiv:2202.08417  [pdf, other]

    cs.LG

    Retrieval-Augmented Reinforcement Learning

    Authors: Anirudh Goyal, Abram L. Friesen, Andrea Banino, Theophane Weber, Nan Rosemary Ke, Adria Puigdomenech Badia, Arthur Guez, Mehdi Mirza, Peter C. Humphreys, Ksenia Konyushkova, Laurent Sifre, Michal Valko, Simon Osindero, Timothy Lillicrap, Nicolas Heess, Charles Blundell

    Abstract: Most deep reinforcement learning (RL) algorithms distill experience into parametric behavior policies or value functions via gradient updates. While effective, this approach has several disadvantages: (1) it is computationally expensive, (2) it can take many updates to integrate experiences into the parametric model, (3) experiences that are not fully integrated do not appropriately influence the…

    Submitted 24 May, 2022; v1 submitted 16 February, 2022; originally announced February 2022.

  45. arXiv:2202.00053  [pdf]

    physics.plasm-ph

    Linear and nonlinear analysis of Ion-Temperature-Gradient (ITG) Driven mode in the asymmetric Pair-Ion Magnetoplasma

    Authors: Javaria Razzaq, Zahida Ehsan, Arshad M. Mirza

    Abstract: We have investigated linear and nonlinear dynamics of ion-temperature-gradient driven drift mode for Maxwellian and non Maxwellian pair-ion plasma embedded in an inhomogeneous magnetic field having gradients in ion's temperature and number density. Linear dispersion relations are derived and analyzed analytically as well as numerically for different cases. It has been found that growth rate of ins…

    Submitted 31 January, 2022; originally announced February 2022.

    Comments: 18

    MSC Class: na

  46. On the dissipative dynamics of entangled states in coupled-cavity quantum electrodynamics arrays

    Authors: Imran M. Mirza, Adriana S. Cruz

    Abstract: We examine the dissipative dynamics of N00N states with an arbitrary photon number N in two architectures of fiber-coupled optical ring resonators (RRs) interacting with two-level quantum emitters. One architecture consists of a two-way cascaded array of emitter-cavity systems, while in the other architecture we consider two fiber-coupled RRs each coupled to multiple dipole-dipole interacting (DDI…

    Submitted 14 December, 2021; originally announced December 2021.

    Comments: © XXXX [2022] Optica Publishing Group. One print or electronic copy may be made for personal use only. Systematic reproduction and distribution, duplication of any material in this paper for a fee or for commercial purposes, or modifications of the content of this paper are prohibited

    Journal ref: J. Opt. Soc. Am. B 39 (1), 177-187 (2022)

  47. arXiv:2112.00463  [pdf, other]

    cs.CV

    The Norm Must Go On: Dynamic Unsupervised Domain Adaptation by Normalization

    Authors: M. Jehanzeb Mirza, Jakub Micorek, Horst Possegger, Horst Bischof

    Abstract: Domain adaptation is crucial to adapt a learned model to new scenarios, such as domain shifts or changing data distributions. Current approaches usually require a large amount of labeled or unlabeled data from the shifted domain. This can be a hurdle in fields which require continuous dynamic adaptation or suffer from scarcity of data, e.g. autonomous driving in challenging weather conditions. To…

    Submitted 4 April, 2022; v1 submitted 1 December, 2021; originally announced December 2021.

    Comments: Accepted to CVPR 2022 - Camera Ready Version - Code: https://github.com/jmiemirza/DUA

  48. arXiv:2110.03363  [pdf, other]

    cs.RO cs.AI cs.LG

    Evaluating model-based planning and planner amortization for continuous control

    Authors: Arunkumar Byravan, Leonard Hasenclever, Piotr Trochim, Mehdi Mirza, Alessandro Davide Ialongo, Yuval Tassa, Jost Tobias Springenberg, Abbas Abdolmaleki, Nicolas Heess, Josh Merel, Martin Riedmiller

    Abstract: There is a widespread intuition that model-based control methods should be able to surpass the data efficiency of model-free approaches. In this paper we attempt to evaluate this intuition on various challenging locomotion tasks. We take a hybrid approach, combining model predictive control (MPC) with a learned model and model-free policy learning; the learned policy serves as a proposal for MPC.…

    Submitted 7 October, 2021; originally announced October 2021.

    Comments: 9 pages main text, 30 pages with references and appendix including several ablations and additional experiments. Submitted to ICLR 2022

  49. Coherent Perfect Absorption in Tavis-Cummings Models

    Authors: Zibo Wang, Pawan Khatiwada, Dan Wang, Imran M. Mirza

    Abstract: We theoretically study the conditions under which two laser fields can undergo Coherent Perfect Absorption (CPA) when shined on a single-mode bi-directional optical cavity coupled with two two-level quantum emitters (natural atoms, artificial atoms, quantum dots, qubits, etc.). In addition to being indirectly coupled through the cavity-mediated field, in our Tavis-Cummings model the two quantum e…

    Submitted 12 August, 2021; originally announced August 2021.

    Comments: 14 pages, 7 figures

  50. arXiv:2107.11462  [pdf, other]

    physics.ins-det nucl-ex

    LEGEND-1000 Preconceptual Design Report

    Authors: LEGEND Collaboration, N. Abgrall, I. Abt, M. Agostini, A. Alexander, C. Andreoiu, G. R. Araujo, F. T. Avignone III, W. Bae, A. Bakalyarov, M. Balata, M. Bantel, I. Barabanov, A. S. Barabash, P. S. Barbeau, C. J. Barton, P. J. Barton, L. Baudis, C. Bauer, E. Bernieri, L. Bezrukov, K. H. Bhimani, V. Biancacci, E. Blalock, A. Bolozdynya , et al. (239 additional authors not shown)

    Abstract: We propose the construction of LEGEND-1000, the ton-scale Large Enriched Germanium Experiment for Neutrinoless $ββ$ Decay. This international experiment is designed to answer one of the highest priority questions in fundamental physics. It consists of 1000 kg of Ge detectors enriched to more than 90% in the $^{76}$Ge isotope operated in a liquid argon active shield at a deep underground laboratory…

    Submitted 23 July, 2021; originally announced July 2021.