-
$\mathcal{I}$-Extremization for AdS$_4$ Black Holes: Master Volume, Free Energy, and Baryonic Charges
Authors:
Seyed Morteza Hosseini,
Alberto Zaffaroni
Abstract:
In a previous paper, we proposed an entropy function for AdS$_4$ BPS black holes in M-theory with general magnetic charges, resolving a long-standing puzzle about baryonic charges in three-dimensional holography and offering a prediction for the large-$N$ limit of several partition functions whose saddle points have yet to be found. The entropy function is constructed from the master volume of the internal manifold. In this paper, we prove that the entropy of a general class of black holes based on toric geometry can indeed be reformulated as an $\mathcal{I}$-extremization problem, and we provide a set of examples. As an aside, we also simplify existing proofs of the equivalence between $a$-, $c$-, and $F$-extremizations and their gravitational duals.
Submitted 15 May, 2025;
originally announced May 2025.
-
Type I anomaly cancellation revisited
Authors:
Saghar S. Hosseini,
Yuji Tachikawa,
Hao Y. Zhang
Abstract:
We revisit the issue of how the perturbative and global fermion anomaly of Type I string theory in ten dimensions is cancelled by the Green-Schwarz mechanism using the RR fields. This will be done by realising the RR fields as boundary modes of an eleven-dimensional bulk theory described in terms of a quadratic refinement of the differential KO-theory pairing.
We also discuss in a more general setting the procedures which need to be followed when we try to cancel fermion anomalies in terms of $p$-form fields based on differential K-theory classes. This we illustrate by performing an analysis of the mod-2 anomaly cancellation in nine dimensions arising from the $S^1$ compactification of the Type I theory.
Submitted 12 May, 2025;
originally announced May 2025.
-
Kinetic framework with consistent hydrodynamics for shallow water equations
Authors:
S. A. Hosseini,
I. V. Karlin
Abstract:
We present a novel discrete velocity kinetic framework to consistently recover the viscous shallow water equations. The proposed model has the following fundamental advantages and novelties: (a) A novel interpretation and general framework to introduce forces, (b) the possibility to consistently split pressure contributions between equilibrium and a force-like contribution, (c) consistent recovery of the viscous shallow water equations with no errors in the dissipation rates, (d) independent control over bulk viscosity, and (e) consistent second-order implementation of forces. As shown through a variety of different test cases, these features make for an accurate and stable solution method for the shallow-water equations.
Submitted 10 May, 2025;
originally announced May 2025.
-
Investigating Zero-Shot Diagnostic Pathology in Vision-Language Models with Efficient Prompt Design
Authors:
Vasudev Sharma,
Ahmed Alagha,
Abdelhakim Khellaf,
Vincent Quoc-Huy Trinh,
Mahdi S. Hosseini
Abstract:
Vision-language models (VLMs) have gained significant attention in computational pathology due to their multimodal learning capabilities that enhance big-data analytics of giga-pixel whole slide images (WSIs). However, their sensitivity to large-scale clinical data, task formulations, and prompt design remains an open question, particularly in terms of diagnostic accuracy. In this paper, we present a systematic investigation and analysis of three state-of-the-art VLMs for histopathology, namely Quilt-Net, Quilt-LLAVA, and CONCH, on an in-house digestive pathology dataset comprising 3,507 WSIs, each in giga-pixel form, across distinct tissue types. Through a structured ablative study on cancer invasiveness and dysplasia status, we develop a comprehensive prompt engineering framework that systematically varies domain specificity, anatomical precision, instructional framing, and output constraints. Our findings demonstrate that prompt engineering significantly impacts model performance, with the CONCH model achieving the highest accuracy when provided with precise anatomical references. Additionally, we identify the critical importance of anatomical context in histopathological image analysis, as performance consistently degraded when reducing anatomical precision. We also show that model complexity alone does not guarantee superior performance, as effective domain alignment and domain-specific training are critical. These results establish foundational guidelines for prompt engineering in computational pathology and highlight the potential of VLMs to enhance diagnostic accuracy when properly instructed with domain-appropriate prompts.
Submitted 30 April, 2025;
originally announced May 2025.
-
Segment Any Crack: Deep Semantic Segmentation Adaptation for Crack Detection
Authors:
Ghodsiyeh Rostami,
Po-Han Chen,
Mahdi S. Hosseini
Abstract:
Image-based crack detection algorithms are increasingly in demand in infrastructure monitoring, as early detection of cracks is of paramount importance for timely maintenance planning. While deep learning has significantly advanced crack detection algorithms, existing models often require extensive labeled datasets and high computational costs for fine-tuning, limiting their adaptability across diverse conditions. This study introduces an efficient selective fine-tuning strategy, focusing on tuning normalization components, to enhance the adaptability of segmentation models for crack detection. The proposed method is applied to the Segment Anything Model (SAM) and five well-established segmentation models. Experimental results demonstrate that selective fine-tuning of only normalization parameters outperforms full fine-tuning and other common fine-tuning techniques in both performance and computational efficiency, while improving generalization. The proposed approach yields a SAM-based model, Segment Any Crack (SAC), achieving a 61.22\% F1-score and 44.13\% IoU on the OmniCrack30k benchmark dataset, along with the highest performance across three zero-shot datasets and the lowest standard deviation. The results highlight the effectiveness of the adaptation approach in improving segmentation accuracy while significantly reducing computational overhead.
Submitted 18 April, 2025;
originally announced April 2025.
-
Probabilistic Assessment of West Nile Virus Spillover Risk Using a Compartmental Mechanistic Model
Authors:
Saman Hosseini,
Lee W. Cohnstaedt,
Matin Marjani,
Caterina Scoglio
Abstract:
This paper presents a novel probabilistic approach for assessing the risk of West Nile Disease (WND) spillover to the human population. The assessment has been conducted under two different scenarios: (1) assessment of the onset of spillover, and (2) assessment of the severity of the epidemic after the onset of the disease. A compartmental model of differential equations is developed to describe the disease transmission mechanism, and a probability density function for pathogen spillover to humans is derived based on the model for the assessment of the risk of the spillover onset and the severity of the epidemic. The prediction strategy involves making a long-term forecast and then updating it with a short-term forecast (lead time of two weeks or daily). The methodology is demonstrated using detailed outbreak data from high-case counties in California, including Orange County, Los Angeles County, and Kern County. The predicted results are compared with actual infection dates reported by the California Department of Public Health for 2022-2024 to assess prediction accuracy. The performance accuracy is evaluated using a logarithmic scoring system and compared with one of the most renowned predictive models to assess its effectiveness. In all prediction scenarios, the model demonstrated strong performance. Lastly, the method is applied to explore the impact of global warming on spillover risk, revealing an increasing trend in the number of high-risk days and a shift toward a greater proportion of these days over time for the onset of the disease.
Submitted 24 March, 2025;
originally announced March 2025.
-
Euclid Quick Data Release (Q1): From spectrograms to spectra: the SIR spectroscopic Processing Function
Authors:
Euclid Collaboration,
Y. Copin,
M. Fumana,
C. Mancini,
P. N. Appleton,
R. Chary,
S. Conseil,
A. L. Faisst,
S. Hemmati,
D. C. Masters,
C. Scarlata,
M. Scodeggio,
A. Alavi,
A. Carle,
P. Casenove,
T. Contini,
I. Das,
W. Gillard,
G. Herzog,
J. Jacobson,
V. Le Brun,
D. Maino,
G. Setnikar,
N. R. Stickley,
D. Tavagnacco
, et al. (326 additional authors not shown)
Abstract:
The Euclid space mission aims to investigate the nature of dark energy and dark matter by mapping the large-scale structure of the Universe. A key component of Euclid's observational strategy is slitless spectroscopy, conducted using the Near Infrared Spectrometer and Photometer (NISP). This technique enables the acquisition of large-scale spectroscopic data without the need for targeted apertures, allowing precise redshift measurements for millions of galaxies. These data are essential for Euclid's core science objectives, including the study of cosmic acceleration and the evolution of galaxy clustering, as well as enabling many non-cosmological investigations. This study presents the SIR processing function (PF), which is responsible for processing slitless spectroscopic data. The objective is to generate science-grade fully-calibrated one-dimensional spectra, ensuring high-quality spectroscopic data. The processing function relies on a source catalogue generated from photometric data, effectively corrects detector effects, subtracts cross-contaminations, minimizes self-contamination, calibrates wavelength and flux, and produces reliable spectra for later scientific use. The first Quick Data Release (Q1) of Euclid's spectroscopic data provides approximately three million validated spectra for sources observed in the red-grism mode from a selected portion of the Euclid Wide Survey. We find that wavelength accuracy and measured resolving power are within requirements, thanks to the excellent optical quality of the instrument. The SIR PF represents a significant step in processing slitless spectroscopic data for the Euclid mission. As the survey progresses, continued refinements and additional features will enhance its capabilities, supporting high-precision cosmological and astrophysical measurements.
Submitted 19 March, 2025;
originally announced March 2025.
-
T2I-FineEval: Fine-Grained Compositional Metric for Text-to-Image Evaluation
Authors:
Seyed Mohammad Hadi Hosseini,
Amir Mohammad Izadi,
Ali Abdollahi,
Armin Saghafian,
Mahdieh Soleymani Baghshah
Abstract:
Although recent text-to-image generative models have achieved impressive performance, they still often struggle with capturing the compositional complexities of prompts including attribute binding, and spatial relationships between different entities. This misalignment is not revealed by common evaluation metrics such as CLIPScore. Recent works have proposed evaluation metrics that utilize Visual Question Answering (VQA) by decomposing prompts into questions about the generated image for more robust compositional evaluation. Although these methods align better with human evaluations, they still fail to fully cover the compositionality within the image. To address this, we propose a novel metric that breaks down images into components, and texts into fine-grained questions about the generated image for evaluation. Our method outperforms previous state-of-the-art metrics, demonstrating its effectiveness in evaluating text-to-image generative models. Code is available at https://github.com/hadi-hosseini/T2I-FineEval.
Submitted 14 March, 2025;
originally announced March 2025.
-
Radar: Fast Long-Context Decoding for Any Transformer
Authors:
Yongchang Hao,
Mengyao Zhai,
Hossein Hajimirsadeghi,
Sepidehsadat Hosseini,
Frederick Tung
Abstract:
Transformer models have demonstrated exceptional performance across a wide range of applications. Though forming the foundation of Transformer models, the dot-product attention does not scale well to long-context data since its time requirement grows quadratically with context length. In this work, we propose Radar, a training-free approach that accelerates inference by dynamically searching for the most important context tokens. For any pre-trained Transformer, Radar can reduce the decoding time complexity without training or heuristically evicting tokens. Moreover, we provide theoretical justification for our approach, demonstrating that Radar can reliably identify the most important tokens with high probability. We conduct extensive comparisons with the previous methods on a wide range of tasks. The results demonstrate that Radar achieves the state-of-the-art performance across different architectures with reduced time complexity, offering a practical solution for efficient long-context processing of Transformers.
Submitted 13 March, 2025;
originally announced March 2025.
-
$\mathcal{I}$-extremization with baryonic charges
Authors:
Seyed Morteza Hosseini,
Alberto Zaffaroni
Abstract:
We propose an entropy function for AdS$_4$ BPS black holes in M-theory with general magnetic charges, resolving in particular a long-standing puzzle about baryonic charges. The entropy function is constructed from a gravitational block defined solely in terms of topological data of the internal manifold. We show that the entropy of twisted black holes can always be reformulated as an $\mathcal{I}$-extremization problem -- even in cases where existing large-$N$ field theory computations fail to provide an answer. Furthermore, we correctly reproduce the entropy for a class of known black holes with purely baryonic magnetic charges. Our results offer both a conjecture for the general gravitational block for AdS$_4$ black holes in M-theory and a prediction for the large-$N$ limit of several partition functions whose saddle points have yet to be found.
Submitted 28 May, 2025; v1 submitted 11 March, 2025;
originally announced March 2025.
-
Linear stability of lattice Boltzmann models with non-ideal equation of state
Authors:
S. A. Hosseini,
I. V. Karlin
Abstract:
Detailed study of spectral properties and of linear stability is presented for a class of lattice Boltzmann models with a non-ideal equation of state. Examples include the van der Waals and the shallow water models. Both analytical and numerical approaches demonstrate that linear stability requires boundedness of propagation speeds of normal eigen-modes. The study provides a basis for the construction of unconditionally stable lattice Boltzmann models.
Submitted 6 March, 2025;
originally announced March 2025.
-
Fine-Grained Alignment and Noise Refinement for Compositional Text-to-Image Generation
Authors:
Amir Mohammad Izadi,
Seyed Mohammad Hadi Hosseini,
Soroush Vafaie Tabar,
Ali Abdollahi,
Armin Saghafian,
Mahdieh Soleymani Baghshah
Abstract:
Text-to-image generative models have made significant advancements in recent years; however, accurately capturing intricate details in textual prompts, such as entity missing, attribute binding errors, and incorrect relationships remains a formidable challenge. In response, we present an innovative, training-free method that directly addresses these challenges by incorporating tailored objectives to account for textual constraints. Unlike layout-based approaches that enforce rigid structures and limit diversity, our proposed approach offers a more flexible arrangement of the scene by imposing just the extracted constraints from the text, without any unnecessary additions. These constraints are formulated as losses (entity missing, entity mixing, attribute binding, and spatial relationships) and integrated into a unified loss that is applied in the first generation stage. Furthermore, we introduce a feedback-driven system for fine-grained initial noise refinement. This system integrates a verifier that evaluates the generated image, identifies inconsistencies, and provides corrective feedback. Leveraging this feedback, our refinement method first targets the unmet constraints by refining the faulty attention maps caused by initial noise, through the optimization of selective losses associated with these constraints. Subsequently, our unified loss function is reapplied to proceed with the second generation phase. Experimental results demonstrate that our method, relying solely on our proposed objective functions, significantly enhances compositionality, achieving a 24% improvement in human evaluation and a 25% gain in spatial relationships. Furthermore, our fine-grained noise refinement proves effective, boosting performance by up to 5%. Code is available at https://github.com/hadi-hosseini/noise-refinement.
Submitted 9 March, 2025;
originally announced March 2025.
-
CER: Confidence Enhanced Reasoning in LLMs
Authors:
Ali Razghandi,
Seyed Mohammad Hadi Hosseini,
Mahdieh Soleymani Baghshah
Abstract:
Ensuring the reliability of Large Language Models (LLMs) in complex reasoning tasks remains a formidable challenge, particularly in scenarios that demand precise mathematical calculations and knowledge-intensive open-domain generation. In this work, we introduce an uncertainty-aware framework designed to enhance the accuracy of LLM responses by systematically incorporating model confidence at critical decision points. We propose an approach that encourages multi-step reasoning in LLMs and quantifies the confidence of intermediate answers such as numerical results in mathematical reasoning and proper nouns in open-domain generation. Then, the overall confidence of each reasoning chain is evaluated based on the confidence of these critical intermediate steps. Finally, we aggregate the answers of the generated response paths in a way that reflects the reliability of each generated response (as opposed to self-consistency, in which each generated chain contributes equally to majority voting). We conducted extensive experiments on five datasets, three mathematical datasets and two open-domain datasets, using four LLMs. The results consistently validate the effectiveness of our novel confidence aggregation method, leading to an accuracy improvement of up to 7.4% and 5.8% over baseline approaches in math and open-domain generation tasks, respectively. Code is publicly available at https://github.com/Aquasar11/CER.
Submitted 25 May, 2025; v1 submitted 20 February, 2025;
originally announced February 2025.
-
Frequency Domain Stability and Convergence Analysis for General Reset Control Systems Architecture
Authors:
S. Ali Hosseini,
S. Hassan HosseinNia
Abstract:
A key factor that generates significant interest in reset control systems, especially within industrial contexts, is their potential to be designed using a frequency-domain loop-shaping procedure. On the other hand, formulating and assessing stability analysis for these nonlinear elements often depends on access to parametric models and numerically solving linear matrix inequalities. These specific factors could present challenges to the successful implementation of reset control within industrial settings. Moreover, one of the most effective structures for implementing reset elements is to use them in parallel with a linear element. Therefore, this article presents the development of the frequency domain-based $H_β$ stability method from a series to a more general structure of reset control systems. Additionally, it investigates the behavior of different reset elements in terms of the feasibility of stability in the presence of time delay. To illustrate the research findings, two examples are provided, including one from an industrial application.
Submitted 13 February, 2025;
originally announced February 2025.
-
Ultrasound Image Generation using Latent Diffusion Models
Authors:
Benoit Freiche,
Anthony El-Khoury,
Ali Nasiri-Sarvi,
Mahdi S. Hosseini,
Damien Garcia,
Adrian Basarab,
Mathieu Boily,
Hassan Rivaz
Abstract:
Diffusion models for image generation have been a subject of increasing interest due to their ability to generate diverse, high-quality images. Image generation has immense potential in medical imaging because open-source medical images are difficult to obtain compared to natural images, especially for rare conditions. The generated images can be used later to train classification and segmentation models. In this paper, we propose simulating realistic ultrasound (US) images by successive fine-tuning of large diffusion models on different publicly available databases. To do so, we fine-tuned Stable Diffusion, a state-of-the-art latent diffusion model, on BUSI (Breast US Images), an ultrasound breast image dataset. We successfully generated high-quality US images of the breast using simple prompts that specify the organ and pathology, which appeared realistic to three experienced US scientists and a US radiologist. Additionally, we provided user control by conditioning the model with segmentations through ControlNet. We will release the source code at http://code.sonography.ai/ to allow fast US image generation to the scientific community.
Submitted 12 February, 2025;
originally announced February 2025.
-
A fully conservative discrete velocity Boltzmann solver with parallel adaptive mesh refinement for compressible flows
Authors:
Ruben M. Strässle,
S. A. Hosseini,
I. V. Karlin
Abstract:
This paper presents a parallel and fully conservative adaptive mesh refinement (AMR) implementation of a finite-volume-based kinetic solver for compressible flows. Time-dependent H-type refinement is combined with a two-population quasi-equilibrium Bhatnagar-Gross-Krook discrete velocity Boltzmann model. A validation has shown that conservation laws are strictly preserved through the application of refluxing operations at coarse-fine interfaces. Moreover, the targeted macroscopic moments of Euler and Navier-Stokes-Fourier level flows were accurately recovered with correct and Galilean invariant dispersion rates for a temperature range over three orders of magnitude and dissipation rates of all eigen-modes up to Mach of order 1.8. Results for one- and two-dimensional benchmarks up to Mach numbers of 3.2 and temperature ratios of 7, such as the Sod and Lax shock tubes, the Shu-Osher and several Riemann problems, as well as viscous shock-vortex interactions, have demonstrated that the solver precisely captures reference solutions. Excellent performance in obtaining sensitive quantities was proven, for example in the test case involving nonlinear acoustics, whilst, for the same accuracy and fidelity of the solution, the AMR methodology significantly reduced computational cost and memory footprints. Over all demonstrated two-dimensional problems, up to a 4- to 9-fold reduction was achieved and an upper limit of the AMR overhead of 30% was found in a case with very cost-intensive parameter choice. The proposed solver marks an accurate, efficient and scalable framework for kinetic simulations of compressible flows with moderate supersonic speeds and discontinuities, offering a valuable tool for studying complex problems in fluid dynamics.
Submitted 10 March, 2025; v1 submitted 7 February, 2025;
originally announced February 2025.
-
A generalizable 3D framework and model for self-supervised learning in medical imaging
Authors:
Tony Xu,
Sepehr Hosseini,
Chris Anderson,
Anthony Rinaldi,
Rahul G. Krishnan,
Anne L. Martel,
Maged Goubran
Abstract:
Current self-supervised learning methods for 3D medical imaging rely on simple pretext formulations and organ- or modality-specific datasets, limiting their generalizability and scalability. We present 3DINO, a cutting-edge SSL method adapted to 3D datasets, and use it to pretrain 3DINO-ViT: a general-purpose medical imaging model, on an exceptionally large, multimodal, and multi-organ dataset of ~100,000 3D medical imaging scans from over 10 organs. We validate 3DINO-ViT using extensive experiments on numerous medical imaging segmentation and classification tasks. Our results demonstrate that 3DINO-ViT generalizes across modalities and organs, including out-of-distribution tasks and datasets, outperforming state-of-the-art methods on the majority of evaluation metrics and labeled dataset sizes. Our 3DINO framework and 3DINO-ViT will be made available to enable research on 3D foundation models or further finetuning for a wide range of medical imaging applications.
Submitted 20 January, 2025;
originally announced January 2025.
-
Real-Time Bus Departure Prediction Using Neural Networks for Smart IoT Public Bus Transit
Authors:
Narges Rashvand,
Sanaz Sadat Hosseini,
Mona Azarbayjani,
Hamed Tabkhi
Abstract:
Bus transit plays a vital role in urban public transportation but often struggles to provide accurate and reliable departure times. This leads to delays, passenger dissatisfaction, and decreased ridership, particularly in transit-dependent areas. A major challenge lies in the discrepancy between actual and scheduled bus departure times, which disrupts timetables and impacts overall operational efficiency. To address these challenges, this paper presents a neural network-based approach for real-time bus departure time prediction tailored for smart IoT public transit applications. We leverage AI-driven models to enhance the accuracy of bus schedules by preprocessing data, engineering relevant features, and implementing a fully connected neural network that utilizes historical departure data to predict departure times at subsequent stops. In our case study analyzing bus data from Boston, we observed an average deviation of nearly 4 minutes from scheduled times. However, our model, evaluated across 151 bus routes, demonstrates a significant improvement, predicting departure time deviations with an accuracy of under 80 seconds. This advancement not only improves the reliability of bus transit schedules but also plays a crucial role in enabling smart bus systems and IoT applications within public transit networks. By providing more accurate real-time predictions, our approach can facilitate the integration of IoT devices, such as smart bus stops and passenger information systems, that rely on precise data for optimal performance.
Submitted 17 January, 2025;
originally announced January 2025.
-
Distributed Learning and Inference Systems: A Networking Perspective
Authors:
Hesham G. Moussa,
Arashmid Akhavain,
S. Maryam Hosseini,
Bill McCormick
Abstract:
Machine learning models have achieved, and in some cases surpassed, human-level performance in various tasks, mainly through centralized training of static models and the use of large models stored in centralized clouds for inference. However, this centralized approach has several drawbacks, including privacy concerns, high storage demands, a single point of failure, and significant computing requirements. These challenges have driven interest in developing alternative decentralized and distributed methods for AI training and inference. Distribution introduces additional complexity, as it requires managing multiple moving parts. To address these complexities and fill a gap in the development of distributed AI systems, this work proposes a novel framework, Data and Dynamics-Aware Inference and Training Networks (DA-ITN). The different components of DA-ITN and their functions are explored, and the associated challenges and research areas are highlighted.
Submitted 9 January, 2025;
originally announced January 2025.
-
Dilated Balanced Cross Entropy Loss for Medical Image Segmentation
Authors:
Seyed Mohsen Hosseini,
Mahdieh Soleymani Baghshah
Abstract:
A novel method for tackling the problem of imbalanced data in medical image segmentation is proposed in this work. In balanced cross entropy (CE) loss, which is a type of weighted CE loss, the weight assigned to each class is the inverse of the class frequency. These balancing weights are expected to equalize the effect of each class on the overall loss and prevent the model from being biased towards the majority class. However, as shown in previous studies, this method degrades the performance by a large margin. Therefore, balanced CE is not a popular loss in medical segmentation tasks, and usually a region-based loss, like the Dice loss, is used to address the class imbalance problem. In the proposed method, the weighting of cross entropy loss for each class is based on a dilated area of each class mask, and balancing weights are assigned to each class together with its surrounding pixels. The goal of this study is to show that the performance of balanced CE loss can be greatly improved by modifying its weighting strategy. Experiments on different datasets show that the proposed dilated balanced CE (DBCE) loss outperforms the balanced CE loss by a large margin and produces superior results compared to CE loss, and its performance is similar to the performance of the combination of Dice and CE loss. This means that a weighted cross entropy loss with the right weighting strategy can be as effective as a region-based loss in handling the problem of class imbalance in medical segmentation tasks.
Submitted 8 December, 2024;
originally announced December 2024.
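As a rough illustration of the plain balanced CE loss described in the abstract above (each class weighted by the inverse of its frequency), the following is a minimal sketch. It does not reproduce the dilated variant (DBCE), whose details the abstract does not give. The function names `inverse_frequency_weights` and `weighted_cross_entropy` and the toy labels are assumptions made for illustration, not the authors' code.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels):
    """Map each class to 1 / (relative frequency) over a flat list of pixel labels."""
    counts = Counter(labels)
    total = len(labels)
    return {cls: total / n for cls, n in counts.items()}


def weighted_cross_entropy(probs, labels, weights):
    """Mean over pixels of weight[true class] * -log p(true class).

    probs is a list with one dict per pixel, mapping class -> predicted probability.
    """
    losses = [-weights[y] * math.log(p[y]) for p, y in zip(probs, labels)]
    return sum(losses) / len(losses)


# Hypothetical toy image: 8 background pixels (class 0), 2 foreground pixels (class 1).
labels = [0] * 8 + [1] * 2
weights = inverse_frequency_weights(labels)

# A maximally uncertain prediction for every pixel.
probs = [{0: 0.5, 1: 0.5} for _ in labels]
loss = weighted_cross_entropy(probs, labels, weights)
```

With these weights the rare foreground class contributes as much to the total loss as the background class, which is the equalizing effect the abstract refers to.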
-
2DMamba: Efficient State Space Model for Image Representation with Applications on Giga-Pixel Whole Slide Image Classification
Authors:
Jingwei Zhang,
Anh Tien Nguyen,
Xi Han,
Vincent Quoc-Huy Trinh,
Hong Qin,
Dimitris Samaras,
Mahdi S. Hosseini
Abstract:
Efficiently modeling large 2D contexts is essential for various fields including Giga-Pixel Whole Slide Imaging (WSI) and remote sensing. Transformer-based models offer high parallelism but face challenges due to their quadratic complexity for handling long sequences. Recently, Mamba introduced a selective State Space Model (SSM) with linear complexity and high parallelism, enabling effective and efficient modeling of wide context in 1D sequences. However, extending Mamba to vision tasks, which inherently involve 2D structures, results in spatial discrepancies due to the limitations of 1D sequence processing. On the other hand, current 2D SSMs inherently model 2D structures but they suffer from prohibitively slow computation due to the lack of efficient parallel algorithms. In this work, we propose 2DMamba, a novel 2D selective SSM framework that incorporates the 2D spatial structure of images into Mamba, with a highly optimized hardware-aware operator, adopting both spatial continuity and computational efficiency. We validate the versatility of our approach on both WSIs and natural images. Extensive experiments on 10 public datasets for WSI classification and survival analysis show that 2DMamba improves up to 2.48% in AUC, 3.11% in F1 score, 2.47% in accuracy and 5.52% in C-index. Additionally, integrating our method with VMamba for natural imaging yields 0.5 to 0.7 improvements in mIoU on the ADE20k semantic segmentation dataset, and 0.2% accuracy improvement on ImageNet-1K classification dataset. Our code is available at https://github.com/AtlasAnalyticsLab/2DMamba.
Submitted 15 March, 2025; v1 submitted 1 December, 2024;
originally announced December 2024.
-
Comparative Analysis of Diffusion Generative Models in Computational Pathology
Authors:
Denisha Thakkar,
Vincent Quoc-Huy Trinh,
Sonal Varma,
Samira Ebrahimi Kahou,
Hassan Rivaz,
Mahdi S. Hosseini
Abstract:
Diffusion Generative Models (DGM) have rapidly surfaced as emerging topics in the field of computer vision, garnering significant interest across a wide array of deep learning applications. Despite their high computational demand, these models are extensively utilized for their superior sample quality and robust mode coverage. While research in diffusion generative models is advancing, exploration within the domain of computational pathology and its large-scale datasets has been comparatively gradual. Bridging the gap between the high-quality generation capabilities of Diffusion Generative Models and the intricate nature of pathology data, this paper presents an in-depth comparative analysis of diffusion methods applied to a pathology dataset. Our analysis extends to datasets with varying Fields of View (FOV), revealing that DGMs are highly effective in producing high-quality synthetic data. An ablative study is also conducted, followed by a detailed discussion on the impact of various methods on the synthesized histopathology images. One striking observation from our experiments is how the adjustment of image size during data generation can simulate varying fields of view. These findings underscore the potential of DGMs to enhance the quality and diversity of synthetic pathology data, especially when used with real data, ultimately increasing accuracy of deep learning models in histopathology. Code is available from https://github.com/AtlasAnalyticsLab/Diffusion4Path
Submitted 24 November, 2024;
originally announced November 2024.
-
Age of Information Minimization in UAV-Assisted Covert Communication: Trajectory and Beamforming Design
Authors:
Shima Salar Hosseini,
Paeiz Azmi,
Ali Nazari
Abstract:
Unmanned aerial vehicles (UAVs) have the potential for time-sensitive applications. Due to wireless channel variation, received data may have an expiration time, particularly in critical situations such as rescue operations, natural disasters, or the military. Age of Information (AoI) is a metric that measures the freshness of received packets to specify the validity period of information. In addition, it is necessary to guarantee the privacy of confidential information transmission through air-to-ground links against eavesdroppers. This paper investigates UAV-assisted covert communication to minimize AoI in the presence of an aerial eavesdropper for the first time. However, to ensure the eavesdropper's error detection rate, UAV-enabled beamforming employs the power-domain non-orthogonal multiple access (PD-NOMA) technique to cover the covert user by a public user. PD-NOMA technique significantly improves the user's AoI, too. The joint optimization problem contains non-convex constraints and coupled optimization variables, including UAV trajectory, beamforming design, and the user's AoI which is challenging to derive a direct solution. We have developed an efficient alternating optimization technique to address the formulated optimization problem. Numerical results demonstrate the impact of the main parameters on the performance of the proposed communication system.
Submitted 19 November, 2024;
originally announced November 2024.
-
The Gravitational Wave Bias Parameter from Angular Power Spectra: Bridging Between Galaxies and Binary Black Holes
Authors:
Amir Dehghani,
J. Leo Kim,
Dorsa Sadat Hosseini,
Alex Krolewski,
Suvodip Mukherjee,
Ghazal Geshnizjani
Abstract:
This study presents the modeling of the gravitational wave (GW) bias parameter by bridging a connection between simulated GW sources and galaxies in low redshift galaxy surveys 2MPZ and WISExSCOS (WISC). We study this connection by creating a mock GW catalog, populating galaxy surveys with binary black holes (BBHs) for different scenarios of the GW host-galaxy probability as a function of the galaxy stellar mass. We probe the observable consequences of this connection by exploring the spatial clustering of the GW sources in terms of the GW bias parameter. We consider a phenomenological broken power law model for the host-galaxy probability function, with a potential turnover $M_{K}$ at high stellar mass ($10^{11}$ $M_{\odot}$ in the fiducial model) where the star formation efficiency begins to drop. We vary the parameters of the GW host-galaxy probability function and find that generically the GW bias increases as $M_{K}$ increases (and gets suppressed as $M_{K}$ decreases). The change in the GW bias parameter shows a maximum change of about $30\%$ for different scenarios explored in this work in comparison to the galaxy bias. Future measurements of the GW bias can help constrain $M_{K}$ and the slopes of the host-galaxy probability function and thus offer insights into the underlying astrophysical processes.
Submitted 22 April, 2025; v1 submitted 18 November, 2024;
originally announced November 2024.
-
Discovery of a Dense Association of Stars in the Vicinity of the Supermassive Black Hole Sgr A*
Authors:
S. Elaheh Hosseini,
Andreas Eckart,
Michal Zajaček,
Silke Britzen,
Harshitha K. Bhat,
Vladimír Karas
Abstract:
We focus on a sample of 42 sources in the vicinity of the bow-shock source IRS 1W (N-sources), located at the distance of $6.05''$ north-east of the supermassive black hole (SMBH) Sagittarius A* (Sgr A*), within the radius of $1.35''$. We present the first proper motion measurements of N-sources and find that a larger subset of N-sources (28 sources) exhibit a north-westward flying angle. These sources can be bound by an intermediate mass black hole (IMBH) or the concentration that we observe is due to a disk-like distribution projection along the line of sight. We detect the N-sources in $H$, $K_s$, and $L$' bands. The north-westward flying sources could be a bound collection of stars. We discuss a tentative existence of an IMBH or an inclined disk distribution to explain a significant overdensity of stars. The first scenario of having an IMBH implies the lower limit of $\sim 10^4~M_\odot$ for the putative IMBH. Our measurements for the first time reveal that the dense association of stars containing IRS 1W is a co-moving group of massive, young stars. This stellar association might be the remnant core of a massive stellar cluster that is currently being tidally stripped as it inspirals towards Sgr A*. The second scenario suggests that the appearance of the N-sources might be influenced by the projection of a disk-like distribution of younger He-stars and/or dust-enshrouded stars.
Submitted 13 November, 2024;
originally announced November 2024.
-
Efficient Self-Supervised Barlow Twins from Limited Tissue Slide Cohorts for Colonic Pathology Diagnostics
Authors:
Cassandre Notton,
Vasudev Sharma,
Vincent Quoc-Huy Trinh,
Lina Chen,
Minqi Xu,
Sonal Varma,
Mahdi S. Hosseini
Abstract:
Colorectal cancer (CRC) is one of the few cancers that have an established dysplasia-carcinoma sequence that benefits from screening. Everyone over 50 years of age in Canada is eligible for CRC screening. About 20\% of those people will undergo a biopsy for a pre-neoplastic polyp and, in many cases, multiple polyps. As such, these polyp biopsies make up the bulk of a pathologist's workload. Developing an efficient computational model to help screen these polyp biopsies can improve the pathologist's workflow and help guide their attention to critical areas on the slide. DL models face significant challenges in computational pathology (CPath) because of the gigapixel image size of whole-slide images and the scarcity of detailed annotated datasets. It is, therefore, crucial to leverage self-supervised learning (SSL) methods to alleviate the burden and cost of data annotation. However, current research lacks methods to apply SSL frameworks to analyze pathology data effectively. This paper aims to propose an optimized Barlow Twins framework for colorectal polyps screening. We adapt its hyperparameters, augmentation strategy and encoder to the specificity of the pathology data to enhance performance. Additionally, we investigate the best Field of View (FoV) for colorectal polyps screening and propose a new benchmark dataset for CRC screening, made of four types of colorectal polyps and normal tissue, by performing downstream tasking on MHIST and NCT-CRC-7K datasets. Furthermore, we show that the SSL representations are more meaningful and qualitative than the supervised ones and that Barlow Twins benefits from the Swin Transformer when applied to pathology data. Code is available from https://github.com/AtlasAnalyticsLab/PathBT.
Submitted 8 November, 2024;
originally announced November 2024.
-
Transition time of a bouncing drop
Authors:
Yahua Liu,
Seyed Ali Hosseini,
Cong Liu,
Milo Feinberg,
Benedikt Dorschner,
Zuankai Wang,
Ilya Karlin
Abstract:
Contact time of bouncing drops is one of the most essential parameters to quantify the water-repellency of surfaces. Generally, the contact time on superhydrophobic surfaces is known to be Weber number-independent. Here, we probe an additional characteristic time, \emph{transition time} inherent in water drop impacting on superhydrophobic surfaces, marking a switch from a predominantly lateral to an axial motion. Systematic experiments and numerical simulations show that the transition time is also Weber number-independent and accounts for half the contact time. Additionally we identify a Weber-independent partition of volume at the maximum spreading state between the rim and lamella and show that the latter contains 1/4 of the total volume of the drop.
Submitted 28 October, 2024;
originally announced October 2024.
-
Bouncing Scenario in the $f(T)$ Modified Gravity Model with Dynamical System Analysis
Authors:
S. Davood Sadatian,
S. Mohamad Reza Hosseini
Abstract:
In $f(T)$ gravity, the theory modifies the gravitational action by introducing a function of the torsion scalar $T$. This approach allows for a different treatment of gravity than general relativity, particularly in cosmological contexts. Dynamical system analysis is a powerful tool for exploring the stability and behavior of cosmological solutions within this framework. The dynamical system analysis involves examining the phase space of the cosmological equations derived from the $f(T)$ model. This analysis helps identify fixed points, stability, and the evolution of the universe's scale factor. Therefore, in the following, we first review the main equations of the $f(T)$ gravity model. Then we study the dynamic analysis of the gravity model and obtain stability points. Finally, we consider the bouncing scenario in this model.
Submitted 5 October, 2024;
originally announced October 2024.
-
Extended Deep Submodular Functions
Authors:
Seyed Mohammad Hosseini,
Arash Jamshid,
Seyed Mahdi Noormousavi,
Mahdi Jafari Siavoshani,
Naeimeh Omidvar
Abstract:
We introduce a novel category of set functions called Extended Deep Submodular functions (EDSFs), which are neural network-representable. EDSFs serve as an extension of Deep Submodular Functions (DSFs), inheriting crucial properties from DSFs while addressing innate limitations. It is known that DSFs can represent a limiting subset of submodular functions. In contrast, through an analysis of polymatroid properties, we establish that EDSFs possess the capability to represent all monotone submodular functions, a notable enhancement compared to DSFs. Furthermore, our findings demonstrate that EDSFs can represent any monotone set function, indicating the family of EDSFs is equivalent to the family of all monotone set functions. Additionally, we prove that EDSFs maintain the concavity inherent in DSFs when the components of the input vector are non-negative real numbers-an essential feature in certain combinatorial optimization problems. Through extensive experiments, we illustrate that EDSFs exhibit significantly lower empirical generalization error than DSFs in the learning of coverage functions. This suggests that EDSFs present a promising advancement in the representation and learning of set functions with improved generalization capabilities.
Submitted 18 September, 2024;
originally announced September 2024.
-
Constructing an Interpretable Deep Denoiser by Unrolling Graph Laplacian Regularizer
Authors:
Seyed Alireza Hosseini,
Tam Thuc Do,
Gene Cheung,
Yuichi Tanaka
Abstract:
An image denoiser can be used for a wide range of restoration problems via the Plug-and-Play (PnP) architecture. In this paper, we propose a general framework to build an interpretable graph-based deep denoiser (GDD) by unrolling a solution to a maximum a posteriori (MAP) problem equipped with a graph Laplacian regularizer (GLR) as signal prior. Leveraging a recent theorem showing that any (pseudo-)linear denoiser $\boldsymbol Ψ$, under mild conditions, can be mapped to a solution of a MAP denoising problem regularized using GLR, we first initialize a graph Laplacian matrix $\mathbf L$ via truncated Taylor Series Expansion (TSE) of $\boldsymbol Ψ^{-1}$. Then, we compute the MAP linear system solution by unrolling iterations of the conjugate gradient (CG) algorithm into a sequence of neural layers as a feed-forward network -- one that is amenable to parameter tuning. The resulting GDD network is "graph-interpretable", low in parameter count, and easy to initialize thanks to $\mathbf L$ derived from a known well-performing denoiser $\boldsymbol Ψ$. Experimental results show that GDD achieves competitive image denoising performance compared to competitors, but employing far fewer parameters, and is more robust to covariate shift.
Submitted 10 September, 2024;
originally announced September 2024.
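For orientation, a MAP denoising problem with a graph Laplacian regularizer, as mentioned in the abstract above, is commonly written in the following form. This is a sketch of the standard formulation, not necessarily the exact objective used by the authors; the observation $\mathbf y$, the estimate $\mathbf x$, and the weight $\mu > 0$ are symbols assumed here for illustration.

```latex
% Standard GLR-regularized MAP denoising (assumed form; \mu, \mathbf{x}, \mathbf{y} are illustrative symbols)
\mathbf{x}^\ast = \arg\min_{\mathbf{x}} \; \|\mathbf{y} - \mathbf{x}\|_2^2 + \mu\, \mathbf{x}^\top \mathbf{L}\, \mathbf{x}
% Setting the gradient to zero yields a linear system:
\left(\mathbf{I} + \mu \mathbf{L}\right) \mathbf{x}^\ast = \mathbf{y}
```

When $\mathbf L$ is positive semidefinite, the matrix $\mathbf I + \mu \mathbf L$ is symmetric positive definite, so the system can be solved iteratively with conjugate gradient, which is the algorithm the abstract describes unrolling into neural layers.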
-
Separation of Body and Background in Radiological Images. A Practical Python Code
Authors:
Seyedeh Fahimeh Hosseini,
Faezeh Shalbafzadeh,
Behzad Amanpour-Gharaei
Abstract:
Radiological images, such as magnetic resonance imaging (MRI) and computed tomography (CT) images, typically consist of a body part and a dark background. For many analyses, it is necessary to separate the body part from the background. In this article, we present a Python code designed to separate body and background regions in 2D and 3D radiological images. We tested the algorithm on various MRI and CT images of different body parts, including the brain, neck, and abdominal regions. Additionally, we introduced a method for intensity normalization and outlier restriction, adjusted for data conversion into 8-bit unsigned integer (UINT8) format, and examined its effects on body-background separation. Our Python code is available for use with proper citation.
Submitted 9 September, 2024; v1 submitted 31 August, 2024;
originally announced September 2024.
-
Boosting Unconstrained Face Recognition with Targeted Style Adversary
Authors:
Mohammad Saeed Ebrahimi Saadabadi,
Sahar Rahimi Malakshan,
Seyed Rasoul Hosseini,
Nasser M. Nasrabadi
Abstract:
While deep face recognition models have demonstrated remarkable performance, they often struggle on inputs from domains beyond their training data. Recent attempts aim to expand the training set by relying on computationally expensive and inherently challenging image-space augmentation of image generation modules. In an orthogonal direction, we present a simple yet effective method to expand the training data by interpolating between instance-level feature statistics across labeled and unlabeled sets. Our method, dubbed Targeted Style Adversary (TSA), is motivated by two observations: (i) the input domain is reflected in feature statistics, and (ii) face recognition model performance is influenced by style information. Shifting towards an unlabeled style implicitly synthesizes challenging training instances. We devise a recognizability metric to constrain our framework to preserve the inherent identity-related information of labeled instances. The efficacy of our method is demonstrated through evaluations on unconstrained benchmarks, outperforming or being on par with its competitors while offering nearly a 70\% improvement in training speed and 40\% less memory consumption.
Submitted 14 August, 2024;
originally announced August 2024.
-
The Llama 3 Herd of Models
Authors:
Aaron Grattafiori,
Abhimanyu Dubey,
Abhinav Jauhri,
Abhinav Pandey,
Abhishek Kadian,
Ahmad Al-Dahle,
Aiesha Letman,
Akhil Mathur,
Alan Schelten,
Alex Vaughan,
Amy Yang,
Angela Fan,
Anirudh Goyal,
Anthony Hartshorn,
Aobo Yang,
Archi Mitra,
Archie Sravankumar,
Artem Korenev,
Arthur Hinsvark,
Arun Rao,
Aston Zhang,
Aurelien Rodriguez,
Austen Gregerson,
Ava Spataru,
Baptiste Roziere
, et al. (536 additional authors not shown)
Abstract:
Modern artificial intelligence (AI) systems are powered by foundation models. This paper presents a new set of foundation models, called Llama 3. It is a herd of language models that natively support multilinguality, coding, reasoning, and tool usage. Our largest model is a dense Transformer with 405B parameters and a context window of up to 128K tokens. This paper presents an extensive empirical evaluation of Llama 3. We find that Llama 3 delivers comparable quality to leading language models such as GPT-4 on a plethora of tasks. We publicly release Llama 3, including pre-trained and post-trained versions of the 405B parameter language model and our Llama Guard 3 model for input and output safety. The paper also presents the results of experiments in which we integrate image, video, and speech capabilities into Llama 3 via a compositional approach. We observe this approach performs competitively with the state-of-the-art on image, video, and speech recognition tasks. The resulting models are not yet being broadly released as they are still under development.
Submitted 23 November, 2024; v1 submitted 31 July, 2024;
originally announced July 2024.
-
Vision Mamba for Classification of Breast Ultrasound Images
Authors:
Ali Nasiri-Sarvi,
Mahdi S. Hosseini,
Hassan Rivaz
Abstract:
Mamba-based models, VMamba and Vim, are a recent family of vision encoders that offer promising performance improvements in many computer vision tasks. This paper compares Mamba-based models with traditional Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs) using the breast ultrasound BUSI dataset and Breast Ultrasound B dataset. Our evaluation, which includes multiple runs of experiments and statistical significance analysis, demonstrates that some of the Mamba-based architectures often outperform CNN and ViT models with statistically significant results. For example, in the B dataset, the best Mamba-based models have a 1.98\% average AUC and a 5.0\% average Accuracy improvement compared to the best non-Mamba-based model in this study. These Mamba-based models effectively capture long-range dependencies while maintaining some inductive biases, making them suitable for applications with limited data. The code is available at https://github.com/anasiri/BU-Mamba
Submitted 17 September, 2024; v1 submitted 3 July, 2024;
originally announced July 2024.
-
Interpretable Lightweight Transformer via Unrolling of Learned Graph Smoothness Priors
Authors:
Tam Thuc Do,
Parham Eftekhar,
Seyed Alireza Hosseini,
Gene Cheung,
Philip Chou
Abstract:
We build interpretable and lightweight transformer-like neural networks by unrolling iterative optimization algorithms that minimize graph smoothness priors -- the quadratic graph Laplacian regularizer (GLR) and the $\ell_1$-norm graph total variation (GTV) -- subject to an interpolation constraint. The crucial insight is that a normalized signal-dependent graph learning module amounts to a variant of the basic self-attention mechanism in conventional transformers. Unlike "black-box" transformers that require learning of large key, query and value matrices to compute scaled dot products as affinities and subsequent output embeddings, resulting in huge parameter sets, our unrolled networks employ shallow CNNs to learn low-dimensional features per node to establish pairwise Mahalanobis distances and construct sparse similarity graphs. At each layer, given a learned graph, the target interpolated signal is simply a low-pass filtered output derived from the minimization of an assumed graph smoothness prior, leading to a dramatic reduction in parameter count. Experiments for two image interpolation applications verify the restoration performance, parameter efficiency and robustness to covariate shift of our graph-based unrolled networks compared to conventional transformers.
Submitted 5 November, 2024; v1 submitted 6 June, 2024;
originally announced June 2024.
-
AdaFisher: Adaptive Second Order Optimization via Fisher Information
Authors:
Damien Martins Gomes,
Yanlei Zhang,
Eugene Belilovsky,
Guy Wolf,
Mahdi S. Hosseini
Abstract:
First-order optimization methods are currently the mainstream in training deep neural networks (DNNs). Optimizers like Adam incorporate limited curvature information by employing diagonal matrix preconditioning of the stochastic gradient during training. Despite the widespread use of first-order methods, second-order optimization algorithms exhibit superior convergence properties compared to their first-order counterparts, e.g. Adam and SGD. However, their practicality in training DNNs is still limited due to increased per-iteration computations compared to the first-order methods. We present AdaFisher, an adaptive second-order optimizer that leverages a diagonal block-Kronecker approximation of the Fisher information matrix for adaptive gradient preconditioning. AdaFisher aims to bridge the gap between enhanced convergence/generalization capabilities and computational efficiency in a second-order optimization framework for training DNNs. Despite the slow pace of second-order optimizers, we showcase that AdaFisher can be reliably adopted for image classification and language modeling, and stands out for its stability and robustness in hyper-parameter tuning. We demonstrate that AdaFisher outperforms the SOTA optimizers in terms of both accuracy and convergence speed. Code is available from https://github.com/AtlasAnalyticsLab/AdaFisher.
Submitted 10 March, 2025; v1 submitted 25 May, 2024;
originally announced May 2024.
-
Deep Reinforcement Learning with Enhanced PPO for Safe Mobile Robot Navigation
Authors:
Hamid Taheri,
Seyed Rasoul Hosseini,
Mohammad Ali Nekoui
Abstract:
Collision-free motion is essential for mobile robots. Most approaches to collision-free and efficient navigation with wheeled robots require parameter tuning by experts to obtain good navigation behavior. This study investigates the application of deep reinforcement learning to train a mobile robot for autonomous navigation in a complex environment. The robot utilizes LiDAR sensor data and a deep neural network to generate control signals guiding it toward a specified target while avoiding obstacles. We employ two reinforcement learning algorithms in the Gazebo simulation environment: Deep Deterministic Policy Gradient and Proximal Policy Optimization. The study introduces an enhanced neural network structure in the Proximal Policy Optimization algorithm to boost performance, accompanied by a well-designed reward function to improve algorithm efficacy. Experiments conducted in both obstacle and obstacle-free environments underscore the effectiveness of the proposed approach. This research significantly contributes to the advancement of autonomous robotics in complex environments through the application of deep reinforcement learning.
Submitted 6 August, 2024; v1 submitted 25 May, 2024;
originally announced May 2024.
-
Euclid. III. The NISP Instrument
Authors:
Euclid Collaboration,
K. Jahnke,
W. Gillard,
M. Schirmer,
A. Ealet,
T. Maciaszek,
E. Prieto,
R. Barbier,
C. Bonoli,
L. Corcione,
S. Dusini,
F. Grupp,
F. Hormuth,
S. Ligori,
L. Martin,
G. Morgante,
C. Padilla,
R. Toledo-Moreo,
M. Trifoglio,
L. Valenziano,
R. Bender,
F. J. Castander,
B. Garilli,
P. B. Lilje,
H. -W. Rix
, et al. (412 additional authors not shown)
Abstract:
The Near-Infrared Spectrometer and Photometer (NISP) on board the Euclid satellite provides multiband photometry and R>=450 slitless grism spectroscopy in the 950-2020nm wavelength range. In this reference article we illuminate the background of NISP's functional and calibration requirements, describe the instrument's integral components, and provide all its key properties. We also sketch the processes needed to understand how NISP operates and is calibrated, and its technical potentials and limitations. Links to articles providing more details and technical background are included. NISP's 16 HAWAII-2RG (H2RG) detectors with a plate scale of 0.3" pix^-1 deliver a field-of-view of 0.57deg^2. In photo mode, NISP reaches a limiting magnitude of ~24.5AB mag in three photometric exposures of about 100s exposure time, for point sources and with a signal-to-noise ratio (SNR) of 5. For spectroscopy, NISP's point-source sensitivity is a SNR = 3.5 detection of an emission line with flux ~2x10^-16erg/s/cm^2 integrated over two resolution elements of 13.4A, in 3x560s grism exposures at 1.6 mu (redshifted Ha). Our calibration includes on-ground and in-flight characterisation and monitoring of detector baseline, dark current, non-linearity, and sensitivity, to guarantee a relative photometric accuracy of better than 1.5%, and relative spectrophotometry to better than 0.7%. The wavelength calibration must be better than 5A. NISP is the state-of-the-art instrument in the NIR for all science beyond small areas available from HST and JWST - and an enormous advance due to its combination of field size and high throughput of telescope and instrument. During Euclid's 6-year survey covering 14000 deg^2 of extragalactic sky, NISP will be the backbone for determining distances of more than a billion galaxies. Its NIR data will become a rich reference imaging and spectroscopy data set for the coming decades.
Submitted 22 May, 2024;
originally announced May 2024.
-
Euclid. I. Overview of the Euclid mission
Authors:
Euclid Collaboration,
Y. Mellier,
Abdurro'uf,
J. A. Acevedo Barroso,
A. Achúcarro,
J. Adamek,
R. Adam,
G. E. Addison,
N. Aghanim,
M. Aguena,
V. Ajani,
Y. Akrami,
A. Al-Bahlawan,
A. Alavi,
I. S. Albuquerque,
G. Alestas,
G. Alguero,
A. Allaoui,
S. W. Allen,
V. Allevato,
A. V. Alonso-Tetilla,
B. Altieri,
A. Alvarez-Candal,
S. Alvi,
A. Amara
, et al. (1115 additional authors not shown)
Abstract:
The current standard model of cosmology successfully describes a variety of measurements, but the nature of its main ingredients, dark matter and dark energy, remains unknown. Euclid is a medium-class mission in the Cosmic Vision 2015-2025 programme of the European Space Agency (ESA) that will provide high-resolution optical imaging, as well as near-infrared imaging and spectroscopy, over about 14,000 deg^2 of extragalactic sky. In addition to accurate weak lensing and clustering measurements that probe structure formation over half of the age of the Universe, its primary probes for cosmology, these exquisite data will enable a wide range of science. This paper provides a high-level overview of the mission, summarising the survey characteristics, the various data-processing steps, and data products. We also highlight the main science objectives and expected performance.
Submitted 24 September, 2024; v1 submitted 22 May, 2024;
originally announced May 2024.
-
Probing double distribution function models in the lattice Boltzmann method for highly compressible flows
Authors:
S. A. Hosseini,
A. Bhadauria,
I. V. Karlin
Abstract:
The double distribution function approach is an efficient route towards extension of kinetic solvers to compressible flows. With a number of realizations available, an overview and comparative study in the context of high speed compressible flows is presented. We discuss the different variants of the energy partition, analyses of hydrodynamic limits and a numerical study of accuracy and performance with the particles on demand realization. Out of three considered energy partition strategies, it is shown that the non-translational energy split requires a higher-order quadrature for proper recovery of the Navier--Stokes--Fourier equations. The internal energy split on the other hand, while recovering the correct hydrodynamic limit with fourth-order quadrature, comes with a non-local --both in space and time-- source term which contributes to higher computational cost and memory overhead. Based on our analysis, the total energy split demonstrates the optimal overall performance.
Submitted 19 May, 2024;
originally announced May 2024.
-
Optimal Service Placement, Request Routing and CPU Sizing in Cooperative Mobile Edge Computing Networks for Delay-Sensitive Applications
Authors:
Naeimeh Omidvar,
Mahdieh Ahmadi,
Seyed Mohammad Hosseini
Abstract:
We study joint optimization of service placement, request routing, and CPU sizing in a cooperative MEC system. The problem is considered from the perspective of the service provider (SP), which delivers heterogeneous MEC-enabled delay-sensitive services, and needs to pay for the used resources to the mobile network operators and the cloud provider, while earning revenue from the served requests. We formulate the problem of maximizing the SP's total profit subject to the computation, storage, and communication constraints of each edge node and end-to-end delay requirements of the services as a mixed-integer non-convex optimization problem, and prove it to be NP-hard.
To tackle the challenges in solving the problem, we first introduce a design trade-off parameter for different delay requirements of each service, which maintains flexibility in prioritizing them, and transform the original optimization problem by the new delay constraints. Then, by exploiting a hidden convexity, we reformulate the delay constraints into an equivalent form. Next, to handle the challenge of the complicating (integer) variables, using primal decomposition, we decompose the problem into an equivalent form of master and inner sub-problems over the mixed and real variables, respectively. We then employ a cutting-plane approach for building up adequate representations of the extremal value of the inner problem as a function of the complicating variables and the set of values of the complicating variables for which the inner problem is feasible. Finally, we propose a solution strategy based on generalized Benders decomposition and prove its convergence to the optimal solution within a limited number of iterations. Extensive simulation results demonstrate that the proposed scheme significantly outperforms the existing mechanisms in terms of the SP's profit, cache hit ratio, running time, and end-to-end delay.
Submitted 17 May, 2024;
originally announced May 2024.
-
Differentially Private Machine Learning-powered Combinatorial Auction Design
Authors:
Arash Jamshidi,
Seyed Mohammad Hosseini,
Seyed Mahdi Noormousavi,
Mahdi Jafari Siavoshani
Abstract:
We present a new approach to machine learning-powered combinatorial auctions, which is based on the principles of Differential Privacy. Our methodology guarantees that the auction mechanism is truthful, meaning that rational bidders have the incentive to reveal their true valuation functions. We achieve this by inducing truthfulness in the auction dynamics, ensuring that bidders consistently provide accurate information about their valuation functions.
Our method not only ensures truthfulness but also preserves the efficiency of the original auction. This means that if the initial auction outputs an allocation with high social welfare, our modified truthful version of the auction will also achieve high social welfare. We use techniques from Differential Privacy, such as the Exponential Mechanism, to achieve these results. Additionally, we examine the application of differential privacy in auctions across both asymptotic and non-asymptotic regimes.
Submitted 17 May, 2024;
originally announced May 2024.
-
Optical transition parameters of the silicon T centre
Authors:
Chloe Clear,
Sara Hosseini,
Amirhossein AlizadehKhaledi,
Nicholas Brunelle,
Austin Woolverton,
Joshua Kanaganayagam,
Moein Kazemi,
Camille Chartrand,
Mehdi Keshavarz,
Yihuang Xiong,
Louis Alaerts,
Oney O. Soykal,
Geoffroy Hautier,
Valentin Karassiouk,
Mike Thewalt,
Daniel Higginbottom,
Stephanie Simmons
Abstract:
The silicon T centre's narrow, telecommunications-band optical emission, long spin coherence, and direct photonic integration have spurred interest in this emitter as a spin-photon interface for distributed quantum computing and networking. However, key parameters of the T centre's spin-selective optical transitions remain undetermined or ambiguous in literature. In this paper we present a Hamiltonian of the T centre TX state and determine key parameters of the optical transition from T$_0$ to TX$_0$ from a combined analysis of published results, density functional theory, and new spectroscopy. We resolve ambiguous values of the internal defect potential in the literature, and we present the first measurements of electrically tuned T centre emission. As a result, we provide a model of the T centre's optical and spin properties under strain, electric, and magnetic fields that can be utilized for realizing quantum technologies.
Submitted 8 November, 2024; v1 submitted 11 May, 2024;
originally announced May 2024.
-
Super-suppression of long wavelength phonons in constricted nanoporous geometries
Authors:
Alex Greaney,
S. Aria Hosseini,
Laura de Sousa Oliveira,
Alathea Davies,
Neophytos Neophytou
Abstract:
In a typical semiconductor material, the majority of heat is carried by long wavelength, long mean-free-path phonons. Nanostructuring strategies to reduce thermal conductivity, a promising direction in the field of thermoelectrics, place scattering centers of size and spatial separation comparable to the mean-free-paths of the dominant phonons to selectively scatter them. The resultant thermal conductivity is in most cases well predicted using Matthiessen's rule. In general, however, long wavelength phonons are not as effectively scattered as the rest of the phonon spectrum. In this work, using large-scale Molecular Dynamics simulations, Non-Equilibrium Green's Function simulations, and Monte Carlo simulations, we show that specific nanoporous geometries, which create narrow constrictions in the passage of phonons, lead to anticorrelated heat currents in the phonon spectrum. This results in super-suppression of long-wavelength phonons due to heat trapping, and reductions in the thermal conductivity well below what is predicted by Matthiessen's rule.
Submitted 7 May, 2024;
originally announced May 2024.
-
The Third Monocular Depth Estimation Challenge
Authors:
Jaime Spencer,
Fabio Tosi,
Matteo Poggi,
Ripudaman Singh Arora,
Chris Russell,
Simon Hadfield,
Richard Bowden,
GuangYuan Zhou,
ZhengXin Li,
Qiang Rao,
YiPing Bao,
Xiao Liu,
Dohyeong Kim,
Jinseong Kim,
Myunghyun Kim,
Mykola Lavreniuk,
Rui Li,
Qing Mao,
Jiang Wu,
Yu Zhu,
Jinqiu Sun,
Yanning Zhang,
Suraj Patni,
Aradhye Agarwal,
Chetan Arora
, et al. (16 additional authors not shown)
Abstract:
This paper discusses the results of the third edition of the Monocular Depth Estimation Challenge (MDEC). The challenge focuses on zero-shot generalization to the challenging SYNS-Patches dataset, featuring complex scenes in natural and indoor settings. As with the previous edition, methods can use any form of supervision, i.e. supervised or self-supervised. The challenge received a total of 19 submissions outperforming the baseline on the test set; 10 of them submitted a report describing their approach, highlighting the widespread use of foundational models such as Depth Anything at the core of their method. The challenge winners drastically improved 3D F-Score performance, from 17.51% to 23.72%.
Submitted 27 April, 2024; v1 submitted 25 April, 2024;
originally announced April 2024.
-
Some aspects of symmetry descent
Authors:
Iñaki García Etxebarria,
Saghar S. Hosseini
Abstract:
In many cases the symmetry structure of quantum field theories can be neatly encoded into their associated symmetry topological field theory (SymTFT), a topological field theory in one dimension higher. For geometrically engineered QFTs in string theory this SymTFT has been argued to arise from the background geometry, essentially by integration of the topological sector of string theory on the horizon of the geometry transverse to the QFT locus. In this paper we clarify some subtle aspects of this proposal. We take a higher dimensional approach, where the ten dimensional string theory fields to be integrated arise as edge modes of a topological field theory in eleven dimensions. The resulting construction provides a SymTFT generalisation of the descent procedure for anomalies.
Submitted 11 December, 2024; v1 submitted 24 April, 2024;
originally announced April 2024.
-
Geomechanics Contribution to CO2 Storage Containment and Trapping Mechanisms in Tight Sandstone Complexes: A Case Study on Mae Moh Basin
Authors:
Romal Ramadhan,
Khomchan Promneewat,
Vorasate Thanasaksukthawee,
Teerapat Tosuai,
Masoud Babaei,
Seyyed A. Hosseini,
Avirut Puttiwongrak,
Cheowchan Leelasukseree,
Suparit Tangparitkul
Abstract:
Recognized as an indispensable approach to mitigating the climate crisis, carbon dioxide capture and storage (CCS) has the potential to sequester as much as a gigaton of CO2 permanently and securely. Recent attention has been paid to storing highly concentrated point-source CO2 in saline formations, of which Thailand is considering one onshore case in the north, in Lampang: the Mae Moh coal-fired power plant, matched with its own coal mine in the Mae Moh Basin. The current study thus aims to examine the influence of reservoir geomechanics on CO2 storage containment and trapping mechanisms, with co-contributions from geochemistry and reservoir heterogeneity, using the reservoir simulator CMG-GEM. With the injection rate designed for 30-year injection, reservoir pressure build-up reached 77% of the fracture pressure but increased to 80% when geomechanics was excluded. Such pressure responses imply that storage security is associated with the geomechanics. Dominated by viscous force, the CO2 plume migrated more laterally, while geomechanics clearly contributed to lesser migration due to the reservoir rock strength constraint. Reservoir geomechanics contributed to less plume traveling into more constrained spaces while securing against leakage, highlighting a significant and neglected influence of the geomechanical factor. Spatiotemporal development of the CO2 plume also confirms the geomechanics-dominant storage containment. Reservoir geomechanics, through its associated reservoir fluid pressure, controls the development of trapping mechanisms, especially residual and solubility trapping. More secure storage containment after injection was found at higher pressure, while less development of solubility trapping was observed at lower pressure.
Submitted 16 April, 2024;
originally announced April 2024.
-
Vim4Path: Self-Supervised Vision Mamba for Histopathology Images
Authors:
Ali Nasiri-Sarvi,
Vincent Quoc-Huy Trinh,
Hassan Rivaz,
Mahdi S. Hosseini
Abstract:
Representation learning from Gigapixel Whole Slide Images (WSI) poses a significant challenge in computational pathology due to the complicated nature of tissue structures and the scarcity of labeled data. Multi-instance learning (MIL) methods have addressed this challenge, leveraging image patches to classify slides utilizing pretrained models using Self-Supervised Learning (SSL) approaches. The performance of both SSL and MIL methods relies on the architecture of the feature encoder. This paper proposes leveraging the Vision Mamba (Vim) architecture, inspired by state space models, within the DINO framework for representation learning in computational pathology. We evaluate the performance of Vim against Vision Transformers (ViT) on the Camelyon16 dataset for both patch-level and slide-level classification. Our findings highlight Vim's enhanced performance compared to ViT, particularly at smaller scales, where Vim achieves an 8.21 increase in ROC AUC for models of similar size. An explainability analysis further highlights Vim's capabilities, revealing that Vim, unlike ViT, uniquely emulates the pathologist workflow. This alignment with human expert analysis highlights Vim's potential in practical diagnostic settings and contributes significantly to developing effective representation-learning algorithms in computational pathology. We release the code and pretrained weights at https://github.com/AtlasAnalyticsLab/Vim4Path.
Submitted 25 May, 2024; v1 submitted 19 April, 2024;
originally announced April 2024.
-
Microstates of accelerating and supersymmetric AdS$_4$ black holes from the spindle index
Authors:
Edoardo Colombo,
Seyed Morteza Hosseini,
Dario Martelli,
Antonio Pittelli,
Alberto Zaffaroni
Abstract:
We provide a first principles derivation of the microscopic entropy of a very general class of supersymmetric, rotating and accelerating black holes in AdS$_4$. This is achieved by analysing the large-$N$ limit of the spindle index and completes the construction of the first example of a holographic duality involving supersymmetric field theories defined on orbifolds with conical singularities.
Submitted 21 July, 2024; v1 submitted 10 April, 2024;
originally announced April 2024.
-
COVID-19 Detection Based on Blood Test Parameters using Various Artificial Intelligence Methods
Authors:
Kavian Khanjani,
Seyed Rasoul Hosseini,
Hamid Taheri,
Shahrzad Shashaani,
Mohammad Teshnehlab
Abstract:
In 2019, the world faced a new challenge: the COVID-19 disease caused by the novel coronavirus, SARS-CoV-2. The virus rapidly spread across the globe, leading to a high rate of mortality, which prompted health organizations to take measures to control its transmission. Early disease detection is crucial in the treatment process, and computer-based automatic detection systems have been developed to aid in this effort. These systems often rely on artificial intelligence (AI) approaches such as machine learning, neural networks, fuzzy systems, and deep learning to classify diseases. This study aimed to differentiate COVID-19 patients from others using self-categorizing classifiers and employing various AI methods. This study used two datasets: blood test samples and radiography images. The best results for the blood test samples obtained from San Raphael Hospital, which include two classes of individuals, those with COVID-19 and those with non-COVID diseases, were achieved through the use of the Ensemble method (a combination of a neural network and two machine learning methods). The results showed that this approach for COVID-19 diagnosis is cost-effective and provides results in a shorter amount of time than other methods. The proposed model achieved an accuracy of 94.09% on the dataset used. Secondly, the radiographic images were divided into four classes: normal, viral pneumonia, ground glass opacity, and COVID-19 infection. These were used for segmentation and classification. The lung lobes were extracted from the images and then categorized into specific classes. We achieved an accuracy of 91.1% on the image dataset. Generally, this study highlights the potential of AI in detecting and managing COVID-19 and underscores the importance of continued research and development in this field.
Submitted 6 August, 2024; v1 submitted 2 April, 2024;
originally announced April 2024.