Search | arXiv e-print repository

Reliability of Capacitive Read in Arrays of Ferroelectric Capacitors

Authors: Luca Fehlings, Muhtasim Alam Chowdhury, Banafsheh Saber Latibari, Soheil Salehi, Erika Covi

Abstract: The non-destructive capacitance read-out of ferroelectric capacitors (FeCaps) based on doped HfO$_2$ metal-ferroelectric-metal (MFM) structures offers the potential for low-power and highly scalable crossbar arrays. This is due to a number of factors, including the selector-less design, the absence of sneak paths, the power-efficient charge-based read operation, and the reduced IR drop. Nevertheless, a reliable capacitive readout presents certain challenges, particularly in regard to device variability and the trade-off between read yield and read disturbances, which can ultimately result in bit-flips. This paper presents a digital read macro for HfO$_2$ FeCaps and provides design guidelines for capacitive readout of HfO$_2$ FeCaps, taking device-centric reliability and yield challenges into account. An experimentally calibrated physics-based compact model of HfO$_2$ FeCaps is employed to investigate the reliability of the read-out operation of the FeCap macro through Monte Carlo simulations. Based on this analysis, we identify limitations posed by the device variability and propose potential mitigation strategies through design-technology co-optimization (DTCO) of the FeCap device characteristics and the CMOS circuit design. Finally, we examine the potential applications of the FeCap macro in the context of secure hardware. We identify potential security threats and propose strategies to enhance the robustness of the system.

Submitted 11 June, 2025; originally announced June 2025.

Comments: 4 pages, 6 figures, submitted and presented at ISCAS 2025, London

arXiv:2505.22605

Transformers for Secure Hardware Systems: Applications, Challenges, and Outlook

Authors: Banafsheh Saber Latibari, Najmeh Nazari, Avesta Sasan, Houman Homayoun, Pratik Satam, Soheil Salehi, Hossein Sayadi

Abstract: The rise of hardware-level security threats, such as side-channel attacks, hardware Trojans, and firmware vulnerabilities, demands advanced detection mechanisms that are more intelligent and adaptive. Traditional methods often fall short in addressing the complexity and evasiveness of modern attacks, driving increased interest in machine learning-based solutions. Among these, Transformer models, widely recognized for their success in natural language processing and computer vision, have gained traction in the security domain due to their ability to model complex dependencies, offering enhanced capabilities in identifying vulnerabilities, detecting anomalies, and reinforcing system integrity. This survey provides a comprehensive review of recent advancements on the use of Transformers in hardware security, examining their application across key areas such as side-channel analysis, hardware Trojan detection, vulnerability classification, device fingerprinting, and firmware security. Furthermore, we discuss the practical challenges of applying Transformers to secure hardware systems, and highlight opportunities and future research directions that position them as a foundation for next-generation hardware-assisted security. These insights pave the way for deeper integration of AI-driven techniques into hardware security frameworks, enabling more resilient and intelligent defenses.

Submitted 28 May, 2025; originally announced May 2025.

arXiv:2504.09263

Machine Learning-Based AP Selection in User-Centric Cell-free Multiple-Antenna Networks

Authors: S. Salehi, S. Mashdour, O. Tamyigit, S. Seyedmasoumian, M. Moradikia, R. C. de Lamare, A. Schmeink

Abstract: User-centric cell-free (UCCF) massive multiple-input multiple-output (MIMO) systems are considered a viable solution to realize the advantages offered by cell-free (CF) networks, including reduced interference and consistent quality of service while maintaining manageable complexity. In this paper, we propose novel learning-based access point (AP) selection schemes tailored for UCCF massive MIMO systems. The learning model exploits the dataset generated from two distinct AP selection schemes, based on large-scale fading (LSF) coefficients and the sum-rate coefficients, respectively. The proposed learning-based AP selection schemes could be implemented in a centralized or distributed manner, with the aim of performing AP selection efficiently. We evaluate our model's performance against CF and two heuristic clustering schemes for UCCF networks. The results demonstrate that the learning-based approach achieves a comparable sum-rate performance to that of competing techniques for UCCF networks, while significantly reducing computational complexity.

Submitted 12 April, 2025; originally announced April 2025.

Comments: 7 pages, 5 figures

arXiv:2504.08854

Hardware Design and Security Needs Attention: From Survey to Path Forward

Authors: Sujan Ghimire, Muhtasim Alam Chowdhury, Banafsheh Saber Latibari, Muntasir Mamun, Jaeden Wolf Carpenter, Benjamin Tan, Hammond Pearce, Pratik Satam, Soheil Salehi

Abstract: Recent advances in attention-based artificial intelligence (AI) models have unlocked vast potential to automate digital hardware design while enhancing and strengthening security measures against various threats. This rapidly emerging field leverages Large Language Models (LLMs) to generate HDL code, identify vulnerabilities, and sometimes mitigate them. The state of the art in this design automation space utilizes optimized LLMs with HDL datasets, creating automated systems for register-transfer level (RTL) generation, verification, and debugging, and establishing LLM-driven design environments for streamlined logic designs. Additionally, attention-based models like graph attention have shown promise in chip design applications, including floorplanning. This survey investigates the integration of these models into hardware-related domains, emphasizing logic design and hardware security, with or without the use of IP libraries. This study explores the commercial and academic landscape, highlighting technical hurdles and future prospects for automating hardware design and security. Moreover, it provides new insights into the study of LLM-driven design systems, advances in hardware security mechanisms, and the impact of influential works on industry practices. Through the examination of 30 representative approaches and illustrative case studies, this paper underscores the transformative potential of attention-based models in revolutionizing hardware design while addressing the challenges that lie ahead in this interdisciplinary domain.

Submitted 10 April, 2025; originally announced April 2025.

arXiv:2504.07431

LLM-Enabled Data Transmission in End-to-End Semantic Communication

Authors: Shavbo Salehi, Melike Erol-Kantarci, Dusit Niyato

Abstract: Emerging services such as augmented reality (AR) and virtual reality (VR) have increased the volume of data transmitted in wireless communication systems, revealing the limitations of traditional Shannon theory. To address these limitations, semantic communication has been proposed as a solution that prioritizes the meaning of messages over the exact transmission of bits. This paper explores semantic communication for text data transmission in end-to-end (E2E) systems through a novel approach called KG-LLM semantic communication, which integrates knowledge graph (KG) extraction and large language model (LLM) coding. In this method, the transmitter first utilizes a KG to extract key entities and relationships from sentences. The extracted information is then encoded using an LLM to obtain the semantic meaning. On the receiver side, messages are decoded using another LLM, while a bidirectional encoder representations from transformers (i.e., BERT) model further refines the reconstructed sentences for improved semantic similarity. The KG-LLM semantic communication method reduces the transmitted text data volume by 30% through KG-based compression and achieves 84% semantic similarity between the original and received messages. This demonstrates the KG-LLM method's efficiency and robustness in semantic communication systems, outperforming the deep learning-based semantic communication model (DeepSC), which achieves only 63%.

Submitted 11 April, 2025; v1 submitted 10 April, 2025; originally announced April 2025.

arXiv:2502.14080

Personalized Education with Generative AI and Digital Twins: VR, RAG, and Zero-Shot Sentiment Analysis for Industry 4.0 Workforce Development

Authors: Yu-Zheng Lin, Karan Petal, Ahmed H Alhamadah, Sujan Ghimire, Matthew William Redondo, David Rafael Vidal Corona, Jesus Pacheco, Soheil Salehi, Pratik Satam

Abstract: The Fourth Industrial Revolution (4IR) technologies, such as cloud computing, machine learning, and AI, have improved productivity but introduced challenges in workforce training and reskilling. This is critical given existing workforce shortages, especially in marginalized communities like Underrepresented Minorities (URM), who often lack access to quality education. Addressing these challenges, this research presents gAI-PT4I4, a Generative AI-based Personalized Tutor for Industrial 4.0, designed to personalize 4IR experiential learning. gAI-PT4I4 employs sentiment analysis to assess student comprehension, leveraging generative AI and a finite automaton to tailor learning experiences. The framework integrates low-fidelity Digital Twins for VR-based training, featuring an Interactive Tutor, a generative AI assistant providing real-time guidance via audio and text. It uses zero-shot sentiment analysis with LLMs and prompt engineering, achieving 86% accuracy in classifying student-teacher interactions as positive or negative. Additionally, retrieval-augmented generation (RAG) enables personalized learning content grounded in domain-specific knowledge. To adapt training dynamically, a finite automaton structures exercises into states of increasing difficulty, requiring 80% task-performance accuracy for progression. Experimental evaluation with 22 volunteers showed improved accuracy exceeding 80%, reducing training time.
Finally, this paper introduces a Multi-Fidelity Digital Twin model, aligning Digital Twin complexity with Bloom's Taxonomy and Kirkpatrick's model, providing a scalable educational framework.

Submitted 19 February, 2025; originally announced February 2025.
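
Note: the abstract above describes a finite automaton whose states are exercises of increasing difficulty, with progression gated at 80% task-performance accuracy. A minimal sketch of that idea follows. The class name, the number of levels, and the choice of "greater than or equal to" for the threshold comparison are assumptions made for illustration and are not taken from the paper.

```python
from dataclasses import dataclass

PASS_THRESHOLD = 0.80  # progression threshold stated in the abstract


@dataclass
class DifficultyAutomaton:
    """Hypothetical automaton: states are difficulty levels 0..n_levels-1."""
    n_levels: int
    state: int = 0

    def step(self, accuracy: float) -> int:
        """Advance one level when accuracy meets the threshold, else stay."""
        # Assumption: the threshold is inclusive and the last level is terminal.
        if accuracy >= PASS_THRESHOLD and self.state < self.n_levels - 1:
            self.state += 1
        return self.state


def run(accuracies, n_levels=3):
    """Feed a sequence of per-exercise accuracies and record the visited states."""
    fsm = DifficultyAutomaton(n_levels)
    return [fsm.step(a) for a in accuracies]
```

For example, `run([0.5, 0.85, 0.79, 0.9, 0.95])` returns `[0, 1, 1, 2, 2]`: the learner only moves up after an exercise scored at or above the threshold, and stays at the top level once it is reached.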

arXiv:2502.09758

doi 10.1007/978-3-031-92366-1_3

Fast Inexact Bilevel Optimization for Analytical Deep Image Priors

Authors: Mohammad Sadegh Salehi, Tatiana A. Bubba, Yury Korolev

Abstract: The analytical deep image prior (ADP) introduced by Dittmer et al. (2020) establishes a link between deep image priors and classical regularization theory via bilevel optimization. While this is an elegant construction, it involves expensive computations if the lower-level problem is to be solved accurately. To overcome this issue, we propose to use adaptive inexact bilevel optimization to solve ADP problems. We discuss an extension of a recent inexact bilevel method called the method of adaptive inexact descent of Salehi et al. (2024) to an infinite-dimensional setting required by the ADP framework. In our numerical experiments we demonstrate that the computational speed-up achieved by adaptive inexact bilevel optimization allows one to use ADP on larger-scale problems than in the previous literature, e.g. in deblurring of 2D color images.

Submitted 11 March, 2025; v1 submitted 13 February, 2025; originally announced February 2025.

Comments: 12 pages, 7 figures. Accepted to the 10th International Conference on Scale Space and Variational Methods in Computer Vision (SSVM 2025)

arXiv:2502.05884

Study of Robust Multiuser Scheduling and Power Allocation in Cell-Free MIMO Networks

Authors: S. Mashdour, A. R. Flores, S. Salehi, R. C. de Lamare, Anke Schmeink

Abstract: This paper introduces a robust resource allocation framework for the downlink of cell-free massive multi-input multi-output (CF-mMIMO) networks to address the effects caused by imperfect channel state information (CSI). In particular, the proposed robust resource allocation framework includes a robust user scheduling algorithm to optimize the network's sum-rate and a robust power allocation technique aimed at minimizing the mean square error (MSE) for a network with a linear precoder. Unlike non-robust resource allocation techniques, the proposed robust strategies effectively counteract the effects of imperfect CSI, enhancing network efficiency and reliability. Simulation results show a significant improvement in network performance obtained by the proposed approaches, highlighting the impact of robust resource allocation in wireless networks.

Submitted 9 February, 2025; originally announced February 2025.

Comments: 2 figures, 7 pages

arXiv:2501.15734

Prioritized Value-Decomposition Network for Explainable AI-Enabled Network Slicing

Authors: Shavbo Salehi, Pedro Enrique Iturria-Rivera, Medhat Elsayed, Majid Bavand, Raimundas Gaigalas, Yigit Ozcan, Melike Erol-Kantarci

Abstract: Network slicing aims to enhance flexibility and efficiency in next-generation wireless networks by allocating the right resources to meet the diverse requirements of various applications. Managing these slices with machine learning (ML) algorithms has emerged as a promising approach; however, explainability has been a challenge. To this end, several Explainable Artificial Intelligence (XAI) frameworks have been proposed to address the opacity in decision-making in many ML methods. In this paper, we propose a Prioritized Value-Decomposition Network (PVDN) as an XAI-driven approach for resource allocation in a multi-agent network slicing system. The PVDN method decomposes the global value function into individual contributions and prioritizes slice outputs, providing an explanation of how resource allocation decisions impact system performance. By incorporating XAI, PVDN offers valuable insights into the decision-making process, enabling network operators to better understand, trust, and optimize slice management strategies. Through simulations, we demonstrate the effectiveness of the PVDN approach, which improves throughput by 67% and 16% while reducing latency by 35% and 22% compared to independent and VDN-based resource allocation methods, respectively.

Submitted 26 January, 2025; originally announced January 2025.
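
Note: the abstract above builds on value decomposition, where a global value is expressed through individual per-agent contributions. The sketch below shows only the plain additive form used by standard value-decomposition networks (the joint value is the sum of per-agent values, so each agent can act greedily on its own table). The prioritization step of PVDN is not described in the abstract and is not reproduced here; the function name and the toy Q-tables are hypothetical.

```python
def greedy_joint_action(q_tables):
    """Pick each agent's best action and report its contribution to the joint value.

    q_tables[i][a] is agent i's value estimate for action a (toy numbers,
    not from the paper). Under additive decomposition the joint value is
    the sum of the selected per-agent values.
    """
    # Each agent maximizes its own value independently.
    actions = [max(range(len(q)), key=q.__getitem__) for q in q_tables]
    # Per-agent contributions make the joint value attributable to agents.
    contributions = [q[a] for q, a in zip(q_tables, actions)]
    return actions, contributions, sum(contributions)
```

With `q_tables = [[1.0, 3.0], [2.0, 0.5], [0.0, 0.0, 4.0]]` the function returns `([1, 0, 2], [3.0, 2.0, 4.0], 9.0)`. The list of contributions is what makes the joint decision inspectable per agent.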

arXiv:2501.00051

DDD-GenDT: Dynamic Data-driven Generative Digital Twin Framework

Authors: Yu-Zheng Lin, Qinxuan Shi, Zhanglong Yang, Banafsheh Saber Latibari, Sicong Shao, Soheil Salehi, Pratik Satam

Abstract: Digital twin (DT) technology has emerged as a transformative approach to simulate, predict, and optimize the behavior of physical systems, with applications that span manufacturing, healthcare, climate science, and more. However, the development of DT models often faces challenges such as high data requirements, integration complexity, and limited adaptability to dynamic changes in physical systems. This paper presents a new method inspired by dynamic data-driven applications systems (DDDAS), called the dynamic data-driven generative of digital twins framework (DDD-GenDT), which combines the physical system with an LLM, allowing the LLM to act as a DT that interacts with the physical system's operating status and generates the corresponding physical behaviors. We apply DDD-GenDT to the computer numerical control (CNC) machining process, and we use the spindle current measurement data in the NASA milling wear data set as an example to enable LLMs to forecast the physical behavior from historical data and interact with current observations. Experimental results show that in the zero-shot prediction setting, the LLM-based DT can adapt to the change in the system, and the average RMSE of the GPT-4 prediction is 0.479 A, which is 4.79% of the maximum spindle motor current measurement of 10 A, with little training data and instructions required. Furthermore, we analyze the performance of DDD-GenDT in this specific application and its potential to construct digital twins. We also discuss the limitations and challenges that may arise in practical implementations.

Submitted 27 December, 2024; originally announced January 2025.
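
Note: the abstract above reports an average RMSE of 0.479 A and states it as 4.79% of the 10 A maximum spindle current. The arithmetic can be checked with a short sketch. The helper names are hypothetical, and the sample vectors in the usage note are illustrative values, not data from the paper.

```python
import math


def rmse(predicted, observed):
    """Root-mean-square error between two equal-length sequences."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


def relative_error_percent(error, full_scale):
    """Express an absolute error as a percentage of a full-scale value."""
    return 100.0 * error / full_scale
```

`relative_error_percent(0.479, 10.0)` evaluates to approximately 4.79, matching the percentage quoted in the abstract. As an unrelated toy check, `rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])` equals the square root of 4/3.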

arXiv:2412.21091

Comparative Analysis of 2D and 3D ResNet Architectures for IDH and MGMT Mutation Detection in Glioma Patients

Authors: Danial Elyassirad, Benyamin Gheiji, Mahsa Vatanparast, Amir Mahmoud Ahmadzadeh, Neda Kamandi, Amirmohammad Soleimanian, Sara Salehi, Shahriar Faghani

Abstract: Gliomas are the most common cause of mortality among primary brain tumors. Molecular markers, including Isocitrate Dehydrogenase (IDH) and O[6]-methylguanine-DNA methyltransferase (MGMT), influence treatment responses and prognosis. Deep learning (DL) models may provide a non-invasive method for predicting the status of these molecular markers. To achieve non-invasive determination of gene mutations in glioma patients, we compare 2D and 3D ResNet models to predict IDH and MGMT status, using T1, post-contrast T1, and FLAIR MRI sequences. The UCSF glioma dataset was used, which contains 495 patients with known IDH and 410 patients with known MGMT status. The dataset was divided into training (60%), tuning (20%), and test (20%) subsets at the patient level. The 2D models take axial, coronal, and sagittal tumor slices as three separate models. To ensemble the 2D predictions, the three different views were combined using logistic regression. Various ResNet architectures (ResNet10, 18, 34, 50, 101, 152) were trained. For the 3D approach, we incorporated the entire brain tumor volume in the ResNet10, 18, and 34 models. After optimizing each model, the models with the lowest tuning loss were selected for further evaluation on the separate test sets. The best-performing models in IDH prediction were the 2D ResNet50, achieving a test area under the receiver operating characteristic curve (AUROC) of 0.9096, and the 3D ResNet34, which reached a test AUROC of 0.8999.
For MGMT status prediction, the 2D ResNet152 achieved a test AUROC of 0.6168; however, all 3D models yielded AUROCs less than 0.5. Overall, the study indicated that both 2D and 3D models showed high predictive value for IDH prediction, with slightly better performance in 2D models.

Submitted 30 December, 2024; originally announced December 2024.

Comments: 11 pages, 2 figures, 3 tables
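
Note: the abstract above splits the data 60/20/20 into training, tuning, and test subsets at the patient level, which keeps all images of one patient in a single subset. A minimal sketch of such a split follows. The function name, the fixed seed, and the use of rounding to size the subsets are assumptions for illustration; the paper's actual procedure is not given in the abstract.

```python
import random


def split_patients(patient_ids, fractions=(0.6, 0.2, 0.2), seed=0):
    """Split unique patient IDs into disjoint train/tune/test sets.

    Splitting IDs (rather than individual slices or scans) ensures no
    patient contributes data to more than one subset.
    """
    unique = sorted(set(patient_ids))
    random.Random(seed).shuffle(unique)  # reproducible shuffle
    n_train = round(fractions[0] * len(unique))
    n_tune = round(fractions[1] * len(unique))
    train = set(unique[:n_train])
    tune = set(unique[n_train:n_train + n_tune])
    test = set(unique[n_train + n_tune:])  # remainder goes to the test set
    return train, tune, test
```

For 495 patient IDs, as in the IDH cohort, this yields subsets of 297, 99, and 99 patients. Slices or volumes are then assigned to a subset by looking up their patient ID.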

arXiv:2412.15348

Autonomous Vehicle Security: A Deep Dive into Threat Modeling

Authors: Amal Yousseef, Shalaka Satam, Banafsheh Saber Latibari, Jesus Pacheco, Soheil Salehi, Salim Hariri, Partik Satam

Abstract: Autonomous vehicles (AVs) are poised to revolutionize modern transportation, offering enhanced safety, efficiency, and convenience. However, the increasing complexity and connectivity of AV systems introduce significant cybersecurity challenges. This paper provides a comprehensive survey of AV security with a focus on threat modeling frameworks, including STRIDE, DREAD, and MITRE ATT&CK, to systematically identify and mitigate potential risks. The survey examines key components of AV architectures, such as sensors, communication modules, and electronic control units (ECUs), and explores common attack vectors like wireless communication exploits, sensor spoofing, and firmware vulnerabilities. Through case studies of real-world incidents, such as the Jeep Cherokee and Tesla Model S exploits, the paper highlights the critical need for robust security measures. Emerging technologies, including blockchain for secure Vehicle-to-Everything (V2X) communication, AI-driven threat detection, and secure Over-The-Air (OTA) updates, are discussed as potential solutions to mitigate evolving threats. The paper also addresses legal and ethical considerations, emphasizing data privacy, user safety, and regulatory compliance. By combining threat modeling frameworks, multi-layered security strategies, and proactive defenses, this survey offers insights and recommendations for enhancing the cybersecurity of autonomous vehicles.

Submitted 19 December, 2024; originally announced December 2024.

arXiv:2412.12049

doi 10.1007/978-3-031-92366-1_27

Bilevel Learning with Inexact Stochastic Gradients

Authors: Mohammad Sadegh Salehi, Subhadip Mukherjee, Lindon Roberts, Matthias J. Ehrhardt

Abstract: Bilevel learning has gained prominence in machine learning, inverse problems, and imaging applications, including hyperparameter optimization, learning data-adaptive regularizers, and optimizing forward operators. The large-scale nature of these problems has led to the development of inexact and computationally efficient methods. Existing adaptive methods predominantly rely on deterministic formulations, while stochastic approaches often adopt a doubly-stochastic framework with impractical variance assumptions, enforce a fixed number of lower-level iterations, and require extensive tuning. In this work, we focus on bilevel learning with strongly convex lower-level problems and a nonconvex sum-of-functions in the upper-level. Stochasticity arises from data sampling in the upper-level which leads to inexact stochastic hypergradients. We establish their connection to state-of-the-art stochastic optimization theory for nonconvex objectives. Furthermore, we prove the convergence of inexact stochastic bilevel optimization under mild assumptions. Our empirical results highlight significant speed-ups and improved generalization in imaging tasks such as image denoising and deblurring in comparison with adaptive deterministic bilevel methods.

Submitted 11 March, 2025; v1 submitted 16 December, 2024; originally announced December 2024.

Comments: Accepted to the 10th International Conference on Scale Space and Variational Methods in Computer Vision (SSVM 2025)

arXiv:2412.10683

Adaptive Nonparametric Perturbations of Parametric Bayesian Models

Authors: Bohan Wu, Eli N. Weinstein, Sohrab Salehi, Yixin Wang, David M. Blei

Abstract: Parametric Bayesian modeling offers a powerful and flexible toolbox for scientific data analysis. Yet the model, however detailed, may still be wrong, and this can make inferences untrustworthy. In this paper we study nonparametrically perturbed parametric (NPP) Bayesian models, in which a parametric Bayesian model is relaxed via a distortion of its likelihood. We analyze the properties of NPP models when the target of inference is the true data distribution or some functional of it, such as in causal inference. We show that NPP models can offer the robustness of nonparametric models while retaining the data efficiency of parametric models, achieving fast convergence when the parametric model is close to true. To efficiently analyze data with an NPP model, we develop a generalized Bayes procedure to approximate its posterior. We demonstrate our method by estimating causal effects of gene expression from single cell RNA sequencing data. NPP modeling offers an efficient approach to robust Bayesian inference and can be used to robustify any parametric Bayesian model.

Submitted 17 December, 2024; v1 submitted 14 December, 2024; originally announced December 2024.

arXiv:2412.06436

An Adaptively Inexact Method for Bilevel Learning Using Primal-Dual Style Differentiation

Authors: Lea Bogensperger, Matthias J. Ehrhardt, Thomas Pock, Mohammad Sadegh Salehi, Hok Shing Wong

Abstract: We consider a bilevel learning framework for learning linear operators. In this framework, the learnable parameters are optimized via a loss function that also depends on the minimizer of a convex optimization problem (denoted lower-level problem). We utilize an iterative algorithm called 'piggyback' to compute the gradient of the loss and minimizer of the lower-level problem. Given that the lower-level problem is solved numerically, the loss function and thus its gradient can only be computed inexactly. To estimate the accuracy of the computed hypergradient, we derive an a-posteriori error bound, which provides guidance for setting the tolerance for the lower-level problem, as well as the piggyback algorithm. To efficiently solve the upper-level optimization, we also propose an adaptive method for choosing a suitable step-size. To illustrate the proposed method, we consider a few learned regularizer problems, such as training an input-convex neural network.

Submitted 7 June, 2025; v1 submitted 9 December, 2024; originally announced December 2024.

arXiv:2412.02653

Scaffold or Crutch? Examining College Students' Use and Views of Generative AI Tools for STEM Education

Authors: Karen D. Wang, Zhangyang Wu, L'Nard Tufts II, Carl Wieman, Shima Salehi, Nick Haber

Abstract: Developing problem-solving competency is central to Science, Technology, Engineering, and Mathematics (STEM) education, yet translating this priority into effective approaches to problem-solving instruction and assessment remains a significant challenge. The recent proliferation of generative artificial intelligence (genAI) tools like ChatGPT in higher education introduces new considerations about how these tools can help or hinder students' development of STEM problem-solving competency. Our research examines these considerations by studying how and why college students use genAI tools in their STEM coursework, focusing on their problem-solving support. We surveyed 40 STEM college students from diverse U.S. institutions and 28 STEM faculty to understand instructor perspectives on effective genAI tool use and guidance in STEM courses. Our findings reveal high adoption rates and diverse applications of genAI tools among STEM students. The most common use cases include finding explanations, exploring related topics, summarizing readings, and helping with problem-set questions. The primary motivation for using genAI tools was to save time. Moreover, over half of student participants reported simply inputting problems for AI to generate solutions, potentially bypassing their own problem-solving processes. These findings indicate that despite high adoption rates, students' current approaches to utilizing genAI tools often fall short in enhancing their own STEM problem-solving competencies.
The study also explored students' and STEM instructors' perceptions of the benefits and risks associated with using genAI tools in STEM education. Our findings provide insights into how to guide students on appropriate genAI use in STEM courses and how to design genAI-based tools to foster students' problem-solving competency.

Submitted 3 December, 2024; originally announced December 2024.

arXiv:2411.14433 [pdf, other]

Transforming Engineering Education Using Generative AI and Digital Twin Technologies

Authors: Yu-Zheng Lin, Ahmed Hussain J Alhamadah, Matthew William Redondo, Karan Himanshu Patel, Sujan Ghimire, Banafsheh Saber Latibari, Soheil Salehi, Pratik Satam

Abstract: Digital twin technology, traditionally used in industry, is increasingly recognized for its potential to enhance educational experiences. This study investigates the application of industrial digital twins (DTs) in education, focusing on how DT models of varying fidelity can support different stages of Bloom's taxonomy in the cognitive domain. We align Bloom's six cognitive stages with educational levels: undergraduate studies for "Remember" and "Understand," master's level for "Apply" and "Analyze," and doctoral level for "Evaluate" and "Create." Low-fidelity DTs aid essential knowledge acquisition and skill training, providing a low-risk environment for grasping fundamental concepts. Medium-fidelity DTs offer more detailed and dynamic simulations, enhancing application skills and problem-solving. High-fidelity DTs support advanced learners by replicating physical phenomena, allowing for innovative design and complex experiments. Within this framework, large language models (LLMs) serve as mentors, assessing progress, filling knowledge gaps, and assisting with DT interactions, parameter setting, and debugging. We evaluate the educational impact using the Kirkpatrick Model, examining how each DT model's fidelity influences learning outcomes. This framework helps educators make informed decisions on integrating DTs and LLMs to meet specific learning objectives.

Submitted 2 November, 2024; originally announced November 2024.

Comments: 8 pages, 7 figures

arXiv:2410.05153 [pdf, other]

doi 10.1109/TMLCN.2024.3470760

Smart Jamming Attack and Mitigation on Deep Transfer Reinforcement Learning Enabled Resource Allocation for Network Slicing

Authors: Shavbo Salehi, Hao Zhou, Medhat Elsayed, Majid Bavand, Raimundas Gaigalas, Yigit Ozcan, Melike Erol-Kantarci

Abstract: Network slicing is a pivotal paradigm in wireless networks enabling customized services to users and applications. Yet, intelligent jamming attacks threaten the performance of network slicing. In this paper, we focus on the security aspect of network slicing over a deep transfer reinforcement learning (DTRL) enabled scenario. We first demonstrate how a deep reinforcement learning (DRL)-enabled jamming attack exposes potential risks. In particular, the attacker can intelligently jam resource blocks (RBs) reserved for slices by monitoring transmission signals and perturbing the assigned resources. Then, we propose a DRL-driven mitigation model to mitigate the intelligent attacker. Specifically, the defense mechanism generates interference on unallocated RBs where another antenna is used for transmitting powerful signals. This causes the jammer to consider these RBs as allocated RBs and generate interference for those instead of the allocated RBs. The analysis revealed that the intelligent DRL-enabled jamming attack caused a significant 50% degradation in network throughput and 60% increase in latency in comparison with the no-attack scenario. However, with the implemented mitigation measures, we observed 80% improvement in network throughput and 70% reduction in latency in comparison to the under-attack scenario.

Submitted 7 October, 2024; originally announced October 2024.

arXiv:2408.10376 [pdf, other]

Self-Play Ensemble Q-learning enabled Resource Allocation for Network Slicing

Authors: Shavbo Salehi, Pedro Enrique Iturria-Rivera, Medhat Elsayed, Majid Bavand, Raimundas Gaigalas, Yigit Ozcan, Melike Erol-Kantarci

Abstract: In 5G networks, network slicing has emerged as a pivotal paradigm to address diverse user demands and service requirements. To meet the requirements, reinforcement learning (RL) algorithms have been utilized widely, but this method has the problem of overestimation and exploration-exploitation trade-offs. To tackle these problems, this paper explores the application of self-play ensemble Q-learning, an extended version of the RL-based technique. Self-play ensemble Q-learning utilizes multiple Q-tables with various exploration-exploitation rates leading to different observations for choosing the most suitable action for each state. Moreover, through self-play, each model endeavors to enhance its performance compared to its previous iterations, boosting system efficiency, and decreasing the effect of overestimation. For performance evaluation, we consider three RL-based algorithms: self-play ensemble Q-learning, double Q-learning, and Q-learning, and compare their performance under different network traffic. Through simulations, we demonstrate the effectiveness of self-play ensemble Q-learning in meeting the diverse demands within 21.92% in latency, 24.22% in throughput, and 23.63% in packet drop rate in comparison with the baseline methods. Furthermore, we evaluate the robustness of self-play ensemble Q-learning and double Q-learning in situations where one of the Q-tables is affected by a malicious user. Our results show that the self-play ensemble Q-learning method is more robust against adversarial users and prevents a noticeable drop in system performance, mitigating the impact of users manipulating policies.

Submitted 19 August, 2024; originally announced August 2024.

arXiv:2407.18951 [pdf, other]

doi 10.1109/AICCSA63423.2024.10912549

Photogrammetry for Digital Twinning Industry 4.0 (I4) Systems

Authors: Ahmed Alhamadah, Muntasir Mamun, Henry Harms, Mathew Redondo, Yu-Zheng Lin, Jesus Pacheco, Soheil Salehi, Pratik Satam

Abstract: The onset of Industry 4.0 is rapidly transforming the manufacturing world through the integration of cloud computing, machine learning (ML), artificial intelligence (AI), and universal network connectivity, resulting in performance optimization and increased productivity. Digital Twins (DT) are one such transformational technology that leverages software systems to replicate physical process behavior, representing the physical process in a digital environment. This paper aims to explore the use of photogrammetry (which is the process of reconstructing physical objects into virtual 3D models using photographs) and 3D Scanning techniques to create an accurate visual representation of the 'Physical Process', to interact with the ML/AI based behavior models. To achieve this, we have used a readily available consumer device, the iPhone 15 Pro, which features stereo vision capabilities, to capture the depth of an Industry 4.0 system. By processing these images using 3D scanning tools, we created a raw 3D model for 3D modeling and rendering software for the creation of a DT model. The paper highlights the reliability of this method by measuring the error rate between the ground truth (measurements done manually using a tape measure) and the final 3D model created using this method. The overall mean error is 4.97% and the overall standard deviation error is 5.54% between the ground truth measurements and their photogrammetry counterparts. The results from this work indicate that photogrammetry using consumer-grade devices can be an efficient and cost-efficient approach to creating DTs for smart manufacturing, while the approach's flexibility allows for iterative improvements of the models over time.

Submitted 12 July, 2024; originally announced July 2024.

arXiv:2407.17365 [pdf, other]

ViPer: Visual Personalization of Generative Models via Individual Preference Learning

Authors: Sogand Salehi, Mahdi Shafiei, Teresa Yeo, Roman Bachmann, Amir Zamir

Abstract: Different users find different images generated for the same prompt desirable. This gives rise to personalized image generation which involves creating images aligned with an individual's visual preference. Current generative models are, however, unpersonalized, as they are tuned to produce outputs that appeal to a broad audience. Using them to generate images aligned with individual users relies on iterative manual prompt engineering by the user which is inefficient and undesirable. We propose to personalize the image generation process by first capturing the generic preferences of the user in a one-time process by inviting them to comment on a small selection of images, explaining why they like or dislike each. Based on these comments, we infer a user's structured liked and disliked visual attributes, i.e., their visual preference, using a large language model. These attributes are used to guide a text-to-image model toward producing images that are tuned towards the individual user's visual preference. Through a series of user studies and large language model guided evaluations, we demonstrate that the proposed method results in generations that are well aligned with individual users' visual preferences.

Submitted 24 July, 2024; originally announced July 2024.

Comments: Project page at https://viper.epfl.ch/

arXiv:2405.12197 [pdf]

Automated Hardware Logic Obfuscation Framework Using GPT

Authors: Banafsheh Saber Latibari, Sujan Ghimire, Muhtasim Alam Chowdhury, Najmeh Nazari, Kevin Immanuel Gubbi, Houman Homayoun, Avesta Sasan, Soheil Salehi

Abstract: Obfuscation stands as a promising solution for safeguarding hardware intellectual property (IP) against a spectrum of threats including reverse engineering, IP piracy, and tampering. In this paper, we introduce Obfus-chat, a novel framework leveraging Generative Pre-trained Transformer (GPT) models to automate the obfuscation process. The proposed framework accepts hardware design netlists and key sizes as inputs, and autonomously generates obfuscated code tailored to enhance security. To evaluate the effectiveness of our approach, we employ the Trust-Hub Obfuscation Benchmark for comparative analysis. We employed SAT attacks to assess the security of the design, along with functional verification procedures to ensure that the obfuscated design remains consistent with the original. Our results demonstrate the efficacy and efficiency of the proposed framework in fortifying hardware IP against potential threats, thus providing a valuable contribution to the field of hardware security.

Submitted 20 May, 2024; originally announced May 2024.

arXiv:2405.08359 [pdf, other]

GPS-IDS: An Anomaly-based GPS Spoofing Attack Detection Framework for Autonomous Vehicles

Authors: Murad Mehrab Abrar, Amal Youssef, Raian Islam, Shalaka Satam, Banafsheh Saber Latibari, Salim Hariri, Sicong Shao, Soheil Salehi, Pratik Satam

Abstract: Autonomous Vehicles (AVs) heavily rely on sensors and communication networks like Global Positioning System (GPS) to navigate autonomously. Prior research has indicated that networks like GPS are vulnerable to cyber-attacks such as spoofing and jamming, thus posing serious risks like navigation errors and system failures. These threats are expected to intensify with the widespread deployment of AVs, making it crucial to detect and mitigate such attacks. This paper proposes GPS Intrusion Detection System, or GPS-IDS, an Anomaly-based intrusion detection framework to detect GPS spoofing attacks on AVs. The framework uses a novel physics-based vehicle behavior model where a GPS navigation model is integrated into the conventional dynamic bicycle model for accurate AV behavior representation. Temporal features derived from this behavior model are analyzed using machine learning to detect normal and abnormal navigation behaviors. The performance of the GPS-IDS framework is evaluated on the AV-GPS-Dataset -- a GPS security dataset for AVs comprising real-world data collected using an AV testbed, and simulated data representing urban traffic environments. To the best of our knowledge, this dataset is the first of its kind and has been publicly released for the global research community to address such security challenges.

Submitted 17 December, 2024; v1 submitted 14 May, 2024; originally announced May 2024.

Comments: Article under review at IEEE Transactions on Dependable and Secure Computing. For associated AV-GPS-Dataset, see https://github.com/mehrab-abrar/AV-GPS-Dataset

arXiv:2404.18879 [pdf, other]

SCN as a Local Probe of Protein Structural Dynamics

Authors: Sena Aydin, Seyedeh Maryam Salehi, Kai Töpfer, Markus Meuwly

Abstract: The dynamics of lysozyme is probed by attaching -SCN to all alanine-residues. The 1-dimensional infrared spectra exhibit frequency shifts in the position of the maximum absorption by 4 cm$^{-1}$ which is consistent with experiments in different solvents and indicates moderately strong interactions of the vibrational probe with its environment. Isotopic substitution $^{12}$C $\rightarrow ^{13}$C leads to a red-shift by $-47$ cm$^{-1}$ which is consistent with experiments with results on CN-substituted copper complexes in solution. The low-frequency, far-infrared part of the protein spectra contains label-specific information in the difference spectra when compared with the wild type protein. Depending on the positioning of the labels, local structural changes are observed. For example, introducing the -SCN label at Ala129 leads to breaking of the $\alpha$-helical structure with concomitant change in the far-infrared spectrum. Finally, changes in the local hydration of SCN-labelled Alanine residues as a function of time can be related to angular reorientation of the label. It is concluded that -SCN is potentially useful for probing protein dynamics, both in the high-frequency (CN-stretch) and far-infrared part of the spectrum.

Submitted 29 April, 2024; originally announced April 2024.

arXiv:2403.05671 [pdf]

Investigating Changes of Water Quality in Reservoirs based on Flood and Inflow Fluctuations

Authors: Shabnam Salehi, Mojtaba Ardestani

Abstract: Water temperature and dissolved oxygen are essential indicators of water quality and ecosystem sustainability. Lately, heavy rainfalls are happening frequently and forcefully affecting the thermal structure and mixing layers in depth by sharply increasing the volume of inflow, known as a flash flood. It can occur by sudden intense precipitation and develop within minutes or hours. Because of heavy debris load and speedy water, this phenomenon has remarkable effects on water quality. A higher flow during floods may worsen water quality in lakes and reservoirs that are thermally stratified (with separate density layers) and decrease dissolved oxygen content. However, it is unclear how well these parameters represent the response of lakes to changes in volume discharge. To address this question, researchers simulate the thermal structure in two stratified reservoirs, considering the Rajae reservoir as a representative reservoir in the north of Iran and Minab reservoir in the south. In this study, the model realistically represented variations of dissolved oxygen and temperature in the dam lakes' response to flash floods. The model performance was evaluated using observed data from stations on the dam lakes. In this case, the inflow charge was considered in a 10-day flash flood from April 6th to April 16th during the yearly normal flow. The complete mixture in a part of the thermal structure has been proved in Rajaee reservoir. The nonpermanent impact of the massive inflow of storm runoff caused an increase in oxygen consumption, leading to a severe decrease in dissolved oxygen in the epilimnion and metalimnion. The situation in Minab reservoir was relatively different from Rajae reservoir. The inflow changes not only cause mixture but also help expand stratification.

Submitted 19 March, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

Comments: 18 pages, 14 figures

arXiv:2402.14824 [pdf]

Hard Rock Drilling for Super-hot Enhanced Geothermal System Development: Literature Review and Techno-Economic Analysis

Authors: Orkhan Khankishiyev, Saeed Salehi

Abstract: The increasing global demand for electricity and the imperative of achieving sustainable and net-zero energy solutions have underscored the importance of exploring alternative sources. Enhanced Geothermal Systems (EGS) have emerged as a promising avenue for renewable and sustainable energy production. However, the development of EGS faces a significant challenge in drilling through hard rock formations at high temperatures, necessitating specialized drilling equipment and techniques. This study aims to investigate the current state-of-the-art technology for drilling in hard rock formations under elevated temperatures, specifically in the context of super-hot EGS development. It involves a comprehensive review of previous projects and a meticulous analysis of existing drilling technologies and techniques. Furthermore, a techno-economic evaluation will be conducted to assess the feasibility of super-hot EGS development in hard igneous formations, considering key factors such as drilling performance, operational challenges, and material costs. The outcomes of this study will enhance the understanding of the technical challenges associated with super-hot EGS development and facilitate the design of efficient and cost-effective drilling technologies for the geothermal energy industry. By improving the drilling process in EGS development, the full potential of geothermal energy can be harnessed as a viable and sustainable energy source to meet the growing global demand for electricity.

Submitted 6 February, 2024; originally announced February 2024.

Journal ref: Geothermal Resources Council Transactions, Vol 47, 2023, Davis, California

arXiv:2402.14823 [pdf]

Geothermal Energy in Sedimentary Basins: Assessing Techno-economic Viability for Sustainable Development

Authors: Orkhan Khankishiyev, Saeed Salehi, Runar Nygaard, Danny Rehg

Abstract: Drilling deep geothermal wells has proven to be a challenging endeavor, primarily due to issues such as loss circulation events, material limitations under high temperatures, and the production of corrosive fluids. Furthermore, the substantial upfront costs, coupled with geological and technical obstacles associated with drilling super-hot EGS wells in igneous rocks, hinder the widespread implementation of geothermal systems. Alternatively, geothermal energy development in sedimentary basins presents an opportunity for clean energy production with relatively lower investment costs compared to the development of super-hot EGS in igneous rocks. Sedimentary basins exhibit attractive temperatures for geothermal applications, and their wide distribution enhances the potential for nationwide deployment. Decades of drilling and development experience in oil and gas wells have yielded a wealth of data, knowledge, and expertise. Leveraging this experience and data for geothermal drilling can significantly reduce costs associated with subsurface data gathering, well drilling, and completion. This paper explores the economic viability of geothermal energy production systems in sedimentary basins. The study encompasses an analysis of time-to-hit-temperature (THT) and cost-to-hit-temperature (CHT) parameters, as well as Favorability maps across the United States. These maps are based on factors such as well depth, total drilling time, well cost, and subsurface temperature data. By integrating sedimentary basin maps and underground temperature maps, the THT and CHT maps can facilitate the strategic placement of EGS wells and other geothermal system applications in the most favorable locations across the United States.

Submitted 5 February, 2024; originally announced February 2024.

Report number: ISSN: 0193-5933; ISBN: 934412-29-4

Journal ref: Geothermal Resources Council Transactions, 2023

arXiv:2312.13530 [pdf, other]

doi 10.1145/3737459

HW-V2W-Map: Hardware Vulnerability to Weakness Mapping Framework for Root Cause Analysis with GPT-assisted Mitigation Suggestion

Authors: Yu-Zheng Lin, Muntasir Mamun, Muhtasim Alam Chowdhury, Shuyu Cai, Mingyu Zhu, Banafsheh Saber Latibari, Kevin Immanuel Gubbi, Najmeh Nazari Bavarsad, Arjun Caputo, Avesta Sasan, Houman Homayoun, Setareh Rafatirad, Pratik Satam, Soheil Salehi

Abstract: The escalating complexity of modern computing frameworks has resulted in a surge in the cybersecurity vulnerabilities reported to the National Vulnerability Database (NVD) by practitioners. Although the NVD is one of the most significant databases for the latest insights into vulnerabilities, extracting meaningful trends from such a large amount of unstructured data is still challenging without the application of suitable technological methodologies. Previous efforts have mostly concentrated on software vulnerabilities; however, a holistic strategy that incorporates approaches for mitigating vulnerabilities, score prediction, and a knowledge-generating system that may extract relevant insights from the Common Weakness Enumeration (CWE) and Common Vulnerabilities and Exposures (CVE) databases is notably absent. As the number of hardware attacks on Internet of Things (IoT) devices continues to rapidly increase, we present the Hardware Vulnerability to Weakness Mapping (HW-V2W-Map) Framework, which is a Machine Learning (ML) framework focusing on hardware vulnerabilities and IoT security. The architecture that we have proposed incorporates an Ontology-driven Storytelling framework, which automates the process of updating the ontology in order to recognize patterns and evolution of vulnerabilities over time and provides approaches for mitigating the vulnerabilities. The repercussions of vulnerabilities can be mitigated as a result of this, and conversely, future exposures can be predicted and prevented. Furthermore, our proposed framework utilized Generative Pre-trained Transformer (GPT) Large Language Models (LLMs) to provide mitigation suggestions.

Submitted 20 December, 2023; originally announced December 2023.

Comments: 22 pages, 10 pages appendix, 10 figures, Submitted to ACM TODAES

arXiv:2311.06568 [pdf, ps, other]

On a fallacy concerning I-am-unprovable sentences: what to take home from Goedel's introduction

Authors: Kaave Lajevardi, Saeed Salehi

Abstract: We demonstrate that, in itself and in the absence of extra premises, the following argument scheme is fallacious: The sentence A says about itself that it has a certain property F, and A does in fact have the property F; therefore A is true. We then examine an argument of this form in the informal introduction of Goedel's classic (1931) and examine some auxiliary premises which might have been at work in that context. Philosophically significant as it may be, that particular informal argument plays no role in Goedel's technical results. Going deeper into the issue and investigating truth conditions of Goedelian sentences (i.e., those sentences which are provably equivalent to their own unprovability) will provide us with insights regarding the philosophical debate on the truth of Goedelian sentences of systems--a debate which is at least as old as Dummett (1963).

Submitted 11 November, 2023; originally announced November 2023.

Comments: 14 pages

MSC Class: 03F40

arXiv:2310.14807 [pdf, ps, other]

On Chaitin's Heuristic Principle and Halting Probability

Authors: Saeed Salehi

Abstract: It would be a heavenly reward if there were a method of weighing theories and sentences in such a way that a theory could never prove a heavier sentence (Chaitin's Heuristic Principle). Alas, no satisfactory measure has been found so far, and this dream seemed too good to ever come true. In the first part of this paper, we attempt to revive Chaitin's lost paradise of heuristic principle as much as logic allows. In the second part, which is a joint work with M. Jalilvand and B. Nikzad, we study Chaitin's well-known constant Omega, and show that this number is not a probability of halting the randomly chosen input-free programs under any infinite discrete measure. We suggest some methods for defining the halting probabilities by various measures.

Submitted 15 June, 2024; v1 submitted 23 October, 2023; originally announced October 2023.

Comments: 22 pages (two parts)

MSC Class: 03F40; 68Q30; 60A10; 28A05; 68Q04; 03D10

arXiv:2310.13422 [pdf]

Soundness does not come for free (if at all)

Authors: Kaave Lajevardi, Saeed Salehi

Abstract: We respond to some of the points made by Bennet and Blanck (2022) concerning a previous publication of ours (2021).

Submitted 20 October, 2023; originally announced October 2023.

Comments: 7 pages

arXiv:2310.09362 [pdf, other]

From Words and Exercises to Wellness: Farsi Chatbot for Self-Attachment Technique

Authors: Sina Elahimanesh, Shayan Salehi, Sara Zahedi Movahed, Lisa Alazraki, Ruoyu Hu, Abbas Edalat

Abstract: In the wake of the post-pandemic era, marked by social isolation and surging rates of depression and anxiety, conversational agents based on digital psychotherapy can play an influential role compared to traditional therapy sessions. In this work, we develop a voice-capable chatbot in Farsi to guide users through Self-Attachment (SAT), a novel, self-administered, holistic psychological technique based on attachment theory. Our chatbot uses a dynamic array of rule-based and classification-based modules to comprehend user input throughout the conversation and navigates a dialogue flowchart accordingly, recommending appropriate SAT exercises that depend on the user's emotional and mental state. In particular, we collect a dataset of over 6,000 utterances and develop a novel sentiment-analysis module that classifies user sentiment into 12 classes, with accuracy above 92%. To keep the conversation novel and engaging, the chatbot's responses are retrieved from a large dataset of utterances created with the aid of Farsi GPT-2 and a reinforcement learning approach, thus requiring minimal human annotation. Our chatbot also offers a question-answering module, called SAT Teacher, to answer users' questions about the principles of Self-Attachment. Finally, we design a cross-platform application as the bot's user interface. We evaluate our platform in a ten-day human study with N=52 volunteers from the non-clinical population, who have had over 2,000 dialogues in total with the chatbot. The results indicate that the platform was engaging to most users (75%), 72% felt better after the interactions, and 74% were satisfied with the SAT Teacher's performance.

Submitted 25 March, 2024; v1 submitted 13 October, 2023; originally announced October 2023.

arXiv:2310.08773 [pdf, other]

Examining the Potential and Pitfalls of ChatGPT in Science and Engineering Problem-Solving

Authors: Karen D. Wang, Eric Burkholder, Carl Wieman, Shima Salehi, Nick Haber

Abstract: The study explores the capabilities of OpenAI's ChatGPT in solving different types of physics problems. ChatGPT (with GPT-4) was queried to solve a total of 40 problems from a college-level engineering physics course. These problems ranged from well-specified problems, where all data required for solving the problem was provided, to under-specified, real-world problems where not all necessary data were given. Our findings show that ChatGPT could successfully solve 62.5% of the well-specified problems, but its accuracy drops to 8.3% for under-specified problems. Analysis of the model's incorrect solutions revealed three distinct failure modes: 1) failure to construct accurate models of the physical world, 2) failure to make reasonable assumptions about missing data, and 3) calculation errors. The study offers implications for how to leverage LLM-augmented instructional materials to enhance STEM education. The insights also contribute to the broader discourse on AI's strengths and limitations, serving both educators aiming to leverage the technology and researchers investigating human-AI collaboration frameworks for problem-solving and decision-making.

Submitted 27 October, 2023; v1 submitted 12 October, 2023; originally announced October 2023.

Comments: 12 pages, 2 figures

arXiv:2310.03971 [pdf, other]

Quantized Transformer Language Model Implementations on Edge Devices

Authors: Mohammad Wali Ur Rahman, Murad Mehrab Abrar, Hunter Gibbons Copening, Salim Hariri, Sicong Shao, Pratik Satam, Soheil Salehi

Abstract: Large-scale transformer-based models like the Bidirectional Encoder Representations from Transformers (BERT) are widely used for Natural Language Processing (NLP) applications, wherein these models are initially pre-trained with a large corpus with millions of parameters and then fine-tuned for a downstream NLP task. One of the major limitations of these large-scale models is that they cannot be deployed on resource-constrained devices due to their large model size and increased inference latency. In order to overcome these limitations, such large-scale models can be converted to an optimized FlatBuffer format, tailored for deployment on resource-constrained edge devices. Herein, we evaluate the performance of such FlatBuffer transformed MobileBERT models on three different edge devices, fine-tuned for Reputation analysis of English language tweets in the RepLab 2013 dataset. In addition, this study encompassed an evaluation of the deployed models, wherein their latency, performance, and resource efficiency were meticulously assessed. Our experiment results show that, compared to the original BERT large model, the converted and quantized MobileBERT models have 160$\times$ smaller footprints for a 4.1% drop in accuracy while analyzing at least one tweet per second on edge devices. Furthermore, our study highlights the privacy-preserving aspect of TinyML systems as all data is processed locally within a serverless environment.

Submitted 5 October, 2023; originally announced October 2023.

Comments: Accepted for publication on 22nd International Conference of Machine Learning and Applications, ICMLA 2023

arXiv:2308.10098 [pdf, other]

An adaptively inexact first-order method for bilevel optimization with application to hyperparameter learning

Authors: Mohammad Sadegh Salehi, Subhadip Mukherjee, Lindon Roberts, Matthias J. Ehrhardt

Abstract: Various tasks in data science are modeled utilizing the variational regularization approach, where manually selecting regularization parameters presents a challenge. The difficulty gets exacerbated when employing regularizers involving a large number of hyperparameters. To overcome this challenge, bilevel learning can be employed to learn such parameters from data. However, neither exact function values nor exact gradients with respect to the hyperparameters are attainable, necessitating methods that only rely on inexact evaluation of such quantities. State-of-the-art inexact gradient-based methods a priori select a sequence of the required accuracies and cannot identify an appropriate step size since the Lipschitz constant of the hypergradient is unknown. In this work, we propose an algorithm with backtracking line search that only relies on inexact function evaluations and hypergradients and show convergence to a stationary point. Furthermore, the proposed algorithm determines the required accuracy dynamically rather than manually selected before running it. Our numerical experiments demonstrate the efficiency and feasibility of our approach for hyperparameter estimation on a range of relevant problems in imaging and data science such as total variation and field of experts denoising and multinomial logistic regression. Particularly, the results show that the algorithm is robust to its own hyperparameters such as the initial accuracies and step size.

Submitted 8 April, 2025; v1 submitted 19 August, 2023; originally announced August 2023.

arXiv:2306.15041 [pdf]

A Comparison of Neuroelectrophysiology Databases

Authors: Priyanka Subash, Alex Gray, Misque Boswell, Samantha L. Cohen, Rachael Garner, Sana Salehi, Calvary Fisher, Samuel Hobel, Satrajit Ghosh, Yaroslav Halchenko, Benjamin Dichter, Russell A. Poldrack, Chris Markiewicz, Dora Hermes, Arnaud Delorme, Scott Makeig, Brendan Behan, Alana Sparks, Stephen R Arnott, Zhengjia Wang, John Magnotti, Michael S. Beauchamp, Nader Pouratian, Arthur W. Toga, Dominique Duncan

Abstract: As data sharing has become more prevalent, three pillars - archives, standards, and analysis tools - have emerged as critical components in facilitating effective data sharing and collaboration. This paper compares four freely available intracranial neuroelectrophysiology data repositories: Data Archive for the BRAIN Initiative (DABI), Distributed Archives for Neurophysiology Data Integration (DANDI), OpenNeuro, and Brain-CODE. The aim of this review is to describe archives that provide researchers with tools to store, share, and reanalyze both human and non-human neurophysiology data based on criteria that are of interest to the neuroscientific community. The Brain Imaging Data Structure (BIDS) and Neurodata Without Borders (NWB) are utilized by these archives to make data more accessible to researchers by implementing a common standard. As the necessity for integrating large-scale analysis into data repository platforms continues to grow within the neuroscientific community, this article will highlight the various analytical and customizable tools developed within the chosen archives that may advance the field of neuroinformatics.

Submitted 30 August, 2023; v1 submitted 26 June, 2023; originally announced June 2023.

Comments: 22 pages, 6 figures, 5 tables

arXiv:2304.07908 [pdf, other]

Traffic Characteristics of Extended Reality

Authors: Abdullah Alnajim, Seyedmohammad Salehi, Chien-Chung Shen, Malcolm Smith

Abstract: This tutorial paper analyzes the traffic characteristics of immersive experiences with extended reality (XR) technologies, including Augmented reality (AR), virtual reality (VR), and mixed reality (MR). The current trend in XR applications is to offload the computation and rendering to an external server and use wireless communications between the XR head-mounted display (HMD) and the access points. This paradigm becomes essential owing to (1) its high flexibility (in terms of user mobility) compared to remote rendering through a wired connection, and (2) the high computing power available on the server compared to local rendering (on HMD). The requirements to facilitate a pleasant XR experience are analyzed in three aspects: capacity (throughput), latency, and reliability. For capacity, two VR experiences are analyzed: a human eye-like experience and an experience with the Oculus Quest 2 HMD. For latency, the key components of the motion-to-photon (MTP) delay are discussed. For reliability, the maximum packet loss rate (or the minimum packet delivery rate) is studied for different XR scenarios. Specifically, the paper reviews optimization techniques that were proposed to reduce the latency, conserve the bandwidth, extend the scalability, and/or increase the reliability to satisfy the stringent requirements of the emerging XR applications.

Submitted 16 April, 2023; originally announced April 2023.

Comments: 23 pages, 17 figures, tutorial paper

arXiv:2302.09244 [pdf, other]

Dual-Domain Self-Supervised Learning for Accelerated Non-Cartesian MRI Reconstruction

Authors: Bo Zhou, Jo Schlemper, Neel Dey, Seyed Sadegh Mohseni Salehi, Kevin Sheth, Chi Liu, James S. Duncan, Michal Sofka

Abstract: While enabling accelerated acquisition and improved reconstruction accuracy, current deep MRI reconstruction networks are typically supervised, require fully sampled data, and are limited to Cartesian sampling patterns. These factors limit their practical adoption as fully-sampled MRI is prohibitively time-consuming to acquire clinically. Further, non-Cartesian sampling patterns are particularly desirable as they are more amenable to acceleration and show improved motion robustness. To this end, we present a fully self-supervised approach for accelerated non-Cartesian MRI reconstruction which leverages self-supervision in both k-space and image domains. In training, the undersampled data are split into disjoint k-space domain partitions. For the k-space self-supervision, we train a network to reconstruct the input undersampled data from both the disjoint partitions and from itself. For the image-level self-supervision, we enforce appearance consistency obtained from the original undersampled data and the two partitions. Experimental results on our simulated multi-coil non-Cartesian MRI dataset demonstrate that DDSS can generate high-quality reconstruction that approaches the accuracy of the fully supervised reconstruction, outperforming previous baseline methods. Finally, DDSS is shown to scale to highly challenging real-world clinical MRI reconstruction acquired on a portable low-field (0.064 T) MRI scanner with no data available for supervised training while demonstrating improved image quality as compared to traditional reconstruction, as determined by a radiologist study.

Submitted 18 February, 2023; originally announced February 2023.

Comments: 14 pages, 10 figures, published at Medical Image Analysis (MedIA)

arXiv:2302.07746 [pdf, other]

AGNI: In-Situ, Iso-Latency Stochastic-to-Binary Number Conversion for In-DRAM Deep Learning

Authors: Supreeth Mysore Shivanandamurthy, Sairam Sri Vatsavai, Ishan Thakkar, Sayed Ahmad Salehi

Abstract: Recent years have seen a rapid increase in research activity in the field of DRAM-based Processing-In-Memory (PIM) accelerators, where the analog computing capability of DRAM is employed by minimally changing the inherent structure of DRAM peripherals to accelerate various data-centric applications. Several DRAM-based PIM accelerators for Convolutional Neural Networks (CNNs) have also been reported. Among these, the accelerators leveraging in-DRAM stochastic arithmetic have shown manifold improvements in processing latency and throughput, due to the ability of stochastic arithmetic to convert multiplications into simple bit-wise logical AND operations. However, the use of in-DRAM stochastic arithmetic for CNN acceleration requires frequent stochastic to binary number conversions. For that, prior works employ full adder-based or serial counter based in-DRAM circuits. These circuits consume large area and incur long latency. Their in-DRAM implementations also require heavy modifications in DRAM peripherals, which significantly diminishes the benefits of using stochastic arithmetic in these accelerators. To address these shortcomings, this paper presents a new substrate for in-DRAM stochastic-to-binary number conversion called AGNI. AGNI makes minor modifications in DRAM peripherals using pass transistors, capacitors, encoders, and charge pumps, and re-purposes the sense amplifiers as voltage comparators, to enable in-situ binary conversion of input statistic operands of different sizes with iso latency.

Submitted 11 February, 2023; originally announced February 2023.

Comments: (Preprint) To Appear at ISQED 2023

arXiv:2212.04102 [pdf, other]

Water Dynamics around T0 vs. R4 of Hemoglobin from Local Hydrophobicity Analysis

Authors: Seyedeh Maryam Salehi, Marco Pezzella, Adam Willard, Markus Meuwly, Martin Karplus

Abstract: The local hydration around tetrameric Hb in its T$_0$ and R$_4$ conformational substates is analyzed based on molecular dynamics simulations. Analysis of the local hydrophobicity (LH) for all residues at the $α_1 β_2$ and $α_2 β_1$ interfaces, responsible for the quaternary T$\rightarrow$R transition, which is encoded in the MWC model, as well as comparison with earlier computations of the solvent accessible surface area (SASA), makes clear that the two quantities measure different aspects of hydration. Local hydrophobicity quantifies the presence and structure of water molecules at the interface whereas ``buried surface'' reports on the available space for solvent. For simulations with Hb frozen in its T$_0$ and R$_4$ states the correlation coefficient between LH and buried surface is 0.36 and 0.44, respectively, but it increases considerably if the 95 \% confidence interval is used. The LH with Hb frozen and flexible changes little for most residues at the interfaces but is significantly altered for a few select ones, which are Thr41$α$, Tyr42$α$, Tyr140$α$, Trp37$β$, Glu101$β$ (for T$_0$) and Thr38$α$, Tyr42$α$, Tyr140$α$ (for R$_4$). The number of water molecules at the interface is found to increase by $\sim 25$ \% for T$_0$$\rightarrow$R$_4$ which is consistent with earlier measurements. Since hydration is found to be essential to protein function, it is clear that hydration also plays an essential role in allostery.

Submitted 8 December, 2022; originally announced December 2022.

arXiv:2210.04502 [pdf, ps, other]

A Reunion of Godel, Tarski, Carnap, and Rosser

Authors: Saeed Salehi

Abstract: We unify Godel's First Incompleteness Theorem (1931), Tarski's Undefinability Theorem (1933), Godel-Carnap's Diagonal Lemma (1934), and Rosser's (strengthening of Godel's first) Incompleteness Theorem (1936), whose proofs resemble much and use almost the same technique.

Submitted 15 October, 2022; v1 submitted 10 October, 2022; originally announced October 2022.

Comments: 7 pages

MSC Class: 03F40

arXiv:2209.07122 [pdf, ps, other]

On Godel's "Much Weaker" Assumption

Authors: Saeed Salehi

Abstract: Godelian sentences of a sufficiently strong and recursively enumerable theory, constructed in Godel's 1931 groundbreaking paper on the incompleteness theorems, are unprovable if the theory is consistent; however, they could be refutable. These sentences are independent when the theory is so-called omega-consistent; a notion introduced by Godel, which is stronger than (simple) consistency, but ``much weaker'' than soundness. Godel goes to great lengths to show in detail that omega-consistency is stronger than consistency, but never shows, or seems to forget to say, why it is much weaker than soundness. In this paper, we study this proof-theoretic notion and compare some of its properties with those of consistency and (variants of) soundness.

Submitted 20 September, 2022; v1 submitted 15 September, 2022; originally announced September 2022.

Comments: 7 pages

MSC Class: 03F40; 03F30

arXiv:2207.13884 [pdf, other]

An Optimal Multi-UAV Deployment Model for UAV-assisted Smart Farming

Authors: Shavbo Salehi, Jahan Hassan, Ayub Bokani

Abstract: Next-generation wireless networks will deploy UAVs dynamically as aerial base stations (UAV-BSs) to boost the wireless network coverage in the out of reach areas. To provide an efficient service in stochastic environments, the optimal number of UAV-BSs, their locations, and trajectories must be specified appropriately for different scenarios. Such deployment requires an intelligent decision-making mechanism that can deal with various variables at different times. This paper proposes a multi UAV-BS deployment model for smart farming, formulated as a Multi-Criteria Decision Making (MCDM) method to find the optimal number of UAV-BSs to monitor animals' behavior. This model considers the effect of UAV-BSs' signal interference and path loss changes caused by users' mobility to maximize the system's efficiency. To avoid collision among UAV-BSs, we split the considered area into several clusters, each covered by a UAV-BS. Our simulation results suggest up to 11x higher deployment efficiency than the benchmark clustering algorithm.

Submitted 28 July, 2022; originally announced July 2022.

arXiv:2207.08529 [pdf, other]

doi 10.1021/jacs.2c04169

Structure, Organization and Heterogeneity of Water-Containing Deep Eutectic Solvents

Authors: Kai Töpfer, Andrea Pasti, Anuradha Das, Seyedeh Maryam Salehi, Luis Itza Vazquez-Salazar, David Rohrbach, Thomas Feurer, Peter Hamm, Markus Meuwly

Abstract: The spectroscopy and structural dynamics of a deep eutectic mixture (KSCN/acetamide) with varying water content is investigated from 2D IR (with the C-N stretch vibration of the SCN$^-$ anions as the reporter) and THz spectroscopy. Molecular dynamics simulations correctly describe the non-trivial dependence of both spectroscopic signatures depending on water content. For the 2D IR spectra, the MD simulations relate the steep increase in the cross relaxation rate at high water content to parallel alignment of packed SCN$^-$ anions. Conversely, the non-linear increase of the THz absorption with increasing water content is mainly attributed to the formation of larger water clusters. The results demonstrate that a combination of structure sensitive spectroscopies and molecular dynamics simulations provides molecular-level insights into emergence of heterogeneity of such mixtures by modulating their composition.

Submitted 18 July, 2022; originally announced July 2022.

arXiv:2206.13434 [pdf, other]

ContraReg: Contrastive Learning of Multi-modality Unsupervised Deformable Image Registration

Authors: Neel Dey, Jo Schlemper, Seyed Sadegh Mohseni Salehi, Bo Zhou, Guido Gerig, Michal Sofka

Abstract: Establishing voxelwise semantic correspondence across distinct imaging modalities is a foundational yet formidable computer vision task. Current multi-modality registration techniques maximize hand-crafted inter-domain similarity functions, are limited in modeling nonlinear intensity-relationships and deformations, and may require significant re-engineering or underperform on new tasks, datasets, and domain pairs. This work presents ContraReg, an unsupervised contrastive representation learning approach to multi-modality deformable registration. By projecting learned multi-scale local patch features onto a jointly learned inter-domain embedding space, ContraReg obtains representations useful for non-rigid multi-modality alignment. Experimentally, ContraReg achieves accurate and robust results with smooth and invertible deformations across a series of baselines and ablations on a neonatal T1-T2 brain MRI registration task with all methods validated over a wide range of deformation regularization strengths.

Submitted 27 June, 2022; originally announced June 2022.

Comments: Accepted by MICCAI 2022. 13 pages, 6 figures, and 1 table

arXiv:2206.10463 [pdf, other]

Hydration Dynamics and IR Spectroscopy of 4-Fluorophenol

Authors: Seyedeh Maryam Salehi, Silvan Käser, Kai Töpfer, Polydefkis Diamantis, Rolf Pfister, Peter Hamm, Ursula Röthlisberger, Markus Meuwly

Abstract: Halogenated groups are relevant in pharmaceutical applications and potentially useful spectroscopic probes for infrared spectroscopy. In this work, the structural dynamics and infrared spectroscopy of $para$-fluorophenol (F-PhOH) and phenol (PhOH) is investigated in the gas phase and in water using a combination of experiment and molecular dynamics (MD) simulations. The gas phase and solvent dynamics around F-PhOH and PhOH is characterized from atomistic simulations using empirical energy functions with point charges or multipoles for the electrostatics, Machine-Learning (ML) based parametrization and with full $\textit{ab initio}$ (QM) and mixed Quantum Mechanical/Molecular Mechanics (QM/MM) simulations with a particular focus on the CF- and OH-stretch region. The CF-stretch band is heavily mixed with other modes whereas the OH-stretch in solution displays a characteristic high-frequency peak around 3600 cm$^{-1}$ most likely associated with the -OH group of PhOH and F-PhOH together with a characteristic progression below 3000 cm$^{-1}$ due to coupling with water modes which is also reproduced by several of the simulations. Solvent and radial distribution functions indicate that the CF-site is largely hydrophobic except for simulations using point charges which renders them unsuited for correctly describing hydration and dynamics around fluorinated sites.

Submitted 21 June, 2022; originally announced June 2022.

Comments: Main Manuscript: 41 pages and 6 figures, SI: 10 pages and 10 figures

arXiv:2201.10776 [pdf, other]

DSFormer: A Dual-domain Self-supervised Transformer for Accelerated Multi-contrast MRI Reconstruction

Authors: Bo Zhou, Neel Dey, Jo Schlemper, Seyed Sadegh Mohseni Salehi, Chi Liu, James S. Duncan, Michal Sofka

Abstract: Multi-contrast MRI (MC-MRI) captures multiple complementary imaging modalities to aid in radiological decision-making. Given the need for lowering the time cost of multiple acquisitions, current deep accelerated MRI reconstruction networks focus on exploiting the redundancy between multiple contrasts. However, existing works are largely supervised with paired data and/or prohibitively expensive fully-sampled MRI sequences. Further, reconstruction networks typically rely on convolutional architectures which are limited in their capacity to model long-range interactions and may lead to suboptimal recovery of fine anatomical detail. To these ends, we present a dual-domain self-supervised transformer (DSFormer) for accelerated MC-MRI reconstruction. DSFormer develops a deep conditional cascade transformer (DCCT) consisting of several cascaded Swin transformer reconstruction networks (SwinRN) trained under two deep conditioning strategies to enable MC-MRI information sharing. We further present a dual-domain (image and k-space) self-supervised learning strategy for DCCT to alleviate the costs of acquiring fully sampled training data. DSFormer generates high-fidelity reconstructions which experimentally outperform current fully-supervised baselines. Moreover, we find that DSFormer achieves nearly the same performance when trained either with full supervision or with our proposed dual-domain self-supervision.

Submitted 16 August, 2022; v1 submitted 26 January, 2022; originally announced January 2022.

Comments: Accepted at WACV 2023

arXiv:2112.00606 [pdf, other]

Cross-Correlated Motions in Azidolysozyme

Authors: Seyedeh Maryam Salehi, Markus Meuwly

Abstract: The changes in the local and global dynamics of azide-labelled Lysozyme compared with that of the wild type protein are quantitatively assessed for all alanine residues along the polypeptide chain. Although attaching -N$_3$ to alanine residues has been considered to be a minimally invasive change in the protein it is found that depending on the location of the Alanine residue the local and global changes in the dynamics differ. For Ala92 the change in the cross correlated motions are minimal whereas attaching -N$_3$ to Ala90 leads to pronounced differences in the local and global correlations as quantified by cross correlation coefficients of the C$_α$ atoms. It is also demonstrated that the spectral region of the asymmetric azide stretch distinguishes between alanine attachment sites whereas changes in the low frequency, far-infrared region are less characteristic.

Submitted 1 December, 2021; originally announced December 2021.

arXiv:2111.12504 [pdf]

An equitable and effective approach to introductory mechanics

Authors: Eric Burkholder, Shima Salehi, Sarah Sackeyfio, Nicel Mohamed-Hinds, Carl Wieman

Abstract: Introductory mechanics ("physics 1") is a critical gateway course for students desiring to pursue a STEM career. A major challenge with this course is that there is a large spread in the students' incoming physics preparation, and this level of preparation is strongly predictive of a student's performance. The level of incoming preparation is also largely determined by a student's educational privilege, and so this course can amplify inequities in K-12 education and provide a barrier to a STEM career for students from marginalized groups. Here, we present a novel introductory course design to address such equity challenges in physics 1. We designed the course based on the concept of deliberate practice to give students targeted, scaffolded, and repeated opportunities to engage in research-identified practices and decisions required for effective problem-solving. We used real-world problems, as they carry less resemblance to physics high school problems, and so even the students with the best high school physics instruction have little experience or skill in solving them. The students learned the physics content knowledge they needed in future courses, particularly in engineering, and their problem-solving skills improved substantially. Furthermore, the success in the course was not correlated with incoming physics preparation, in stark contrast to the outcomes from traditional physics 1 courses. These findings suggest that we made physics 1 more equitable by employing a deliberate practice approach in the context of real-world problem-solving.

Submitted 24 November, 2021; originally announced November 2021.

arXiv:2109.09356 [pdf, other]

doi 10.1063/5.0077361

Site-Selective Dynamics of Ligand-Free and Ligand-Bound Azidolysozyme

Authors: Seyedeh Maryam Salehi, Markus Meuwly

Abstract: Azido-modified alanine residues (AlaN$_3$) are environment-sensitive, minimally invasive infrared probes for the site-specific investigation of protein structure and dynamics. Here, the capability of the label is investigated to query whether or not a ligand is bound to the active site of Lysozyme and how the spectroscopy and dynamics change upon ligand binding. The results demonstrate specific differences for center frequencies of the asymmetric azide stretch vibration, the long time decay and the static offset of the frequency fluctuation correlation function - all of which are experimental observables - between the ligand-free and the ligand-bound, N$_3$-labelled protein. Changes in dynamics can also be mapped onto changes in the local and through-space coupling between residues by virtue of dynamical cross-correlation maps. This makes the azide label a versatile and structurally sensitive probe to report on the dynamics of proteins in a variety of environments and for a range of different applications.

Submitted 20 September, 2021; originally announced September 2021.

Comments: Main manuscript: 21 pages and 6 figures, SI: 13 pages and 17 figures

Showing 1–50 of 104 results for author: Salehi, S