-
Competitive Distillation: A Simple Learning Strategy for Improving Visual Classification
Authors:
Daqian Shi,
Xiaolei Diao,
Xu Chen,
Cédric M. John
Abstract:
Deep Neural Networks (DNNs) have significantly advanced the field of computer vision. To improve the DNN training process, knowledge distillation methods demonstrate their effectiveness in accelerating network training by introducing a fixed learning direction from the teacher network to student networks. In this context, several distillation-based optimization strategies are proposed, e.g., deep mutual learning and self-distillation, as an attempt to achieve generic training performance enhancement through the cooperative training of multiple networks. However, such strategies achieve limited improvements due to the poor understanding of the impact of learning directions among networks across different iterations. In this paper, we propose a novel competitive distillation strategy that allows each network in a group to potentially act as a teacher based on its performance, enhancing the overall learning performance. Competitive distillation organizes a group of networks to perform a shared task and engage in competition, where competitive optimization is proposed to improve the parameter updating process. We further introduce stochastic perturbation in competitive distillation, aiming to motivate networks to induce mutations to achieve better visual representations and a global optimum. The experimental results show that competitive distillation achieves promising performance in diverse tasks and datasets.
Submitted 29 June, 2025;
originally announced June 2025.
-
Ancient Script Image Recognition and Processing: A Review
Authors:
Xiaolei Diao,
Rite Bo,
Yanling Xiao,
Lida Shi,
Zhihan Zhou,
Hao Xu,
Chuntao Li,
Xiongfeng Tang,
Massimo Poesio,
Cédric M. John,
Daqian Shi
Abstract:
Ancient scripts, e.g., Egyptian hieroglyphs, Oracle Bone Inscriptions, and Ancient Greek inscriptions, serve as vital carriers of human civilization, embedding invaluable historical and cultural information. Automating ancient script image recognition has gained importance, enabling large-scale interpretation and advancing research in archaeology and digital humanities. With the rise of deep learning, this field has progressed rapidly, with numerous script-specific datasets and models proposed. While these scripts vary widely, spanning phonographic systems with limited glyphs to logographic systems with thousands of complex symbols, they share common challenges and methodological overlaps. Moreover, ancient scripts face unique challenges, including imbalanced data distribution and image degradation, which have driven the development of various dedicated methods. This survey provides a comprehensive review of ancient script image recognition methods. We begin by categorizing existing studies based on script types and analyzing respective recognition methods, highlighting both their differences and shared strategies. We then focus on challenges unique to ancient scripts, systematically examining their impact and reviewing recent solutions, including few-shot learning and noise-robust techniques. Finally, we summarize current limitations and outline promising future directions. Our goal is to offer a structured, forward-looking perspective to support ongoing advancements in the recognition, interpretation, and decipherment of ancient scripts.
Submitted 23 June, 2025;
originally announced June 2025.
-
Ethical Challenges of Using Artificial Intelligence in Judiciary
Authors:
Angel Mary John,
Aiswarya M. U.,
Jerrin Thomas Panachakel
Abstract:
Artificial intelligence (AI) has emerged as a ubiquitous concept in numerous domains, including the legal system. AI has the potential to revolutionize the functioning of the judiciary and the dispensation of justice. Incorporating AI into the legal system offers the prospect of enhancing decision-making for judges, lawyers, and legal professionals, while concurrently providing the public with more streamlined, efficient, and cost-effective services. The integration of AI into the legal landscape offers manifold benefits, encompassing tasks such as document review, legal research, contract analysis, case prediction, and decision-making. By automating laborious and error-prone procedures, AI has the capacity to alleviate the burden associated with these arduous tasks. Consequently, courts around the world have begun embracing AI technology as a means to enhance the administration of justice. However, alongside its potential advantages, the use of AI in the judiciary poses a range of ethical challenges. These ethical quandaries must be duly addressed to ensure the responsible and equitable deployment of AI systems. This article delineates the principal ethical challenges entailed in employing AI within the judiciary and provides recommendations to effectively address these issues.
Submitted 27 April, 2025;
originally announced April 2025.
-
Navigating AI Policy Landscapes: Insights into Human Rights Considerations Across IEEE Regions
Authors:
Angel Mary John,
Jerrin Thomas Panachakel,
Anusha S. P
Abstract:
This paper explores the integration of human rights considerations into AI regulatory frameworks across different IEEE regions, specifically the United States (Regions 1-6), Europe (Region 8), China (part of Region 10), and Singapore (part of Region 10). While all acknowledge the transformative potential of AI and the necessity of ethical guidelines, their regulatory approaches significantly differ. Europe exhibits a rigorous framework with stringent protections for individual rights, while the U.S. promotes innovation with less restrictive regulations. China emphasizes state control and societal order in its AI strategies. In contrast, Singapore's advisory framework encourages self-regulation and aligns closely with international norms. This comparative analysis underlines the need for ongoing global dialogue to harmonize AI regulations that safeguard human rights while promoting technological advancement, reflecting the diverse perspectives and priorities of each region.
Submitted 27 April, 2025;
originally announced April 2025.
-
Training LLMs on HPC Systems: Best Practices from the OpenGPT-X Project
Authors:
Carolin Penke,
Chelsea Maria John,
Jan Ebert,
Stefan Kesselheim,
Andreas Herten
Abstract:
The training of large language models (LLMs) requires substantial computational resources, complex software stacks, and carefully designed workflows to achieve scalability and efficiency. This report presents best practices and insights gained from the OpenGPT-X project, a German initiative focused on developing open, multilingual LLMs optimized for European languages. We detail the use of high-performance computing (HPC) systems, primarily JUWELS Booster at JSC, for training Teuken-7B, a 7-billion-parameter transformer model. The report covers system architecture, training infrastructure, software choices, profiling and benchmarking tools, as well as engineering and operational challenges.
Submitted 14 April, 2025;
originally announced April 2025.
-
AI-Enabled Operations at Fermi Complex: Multivariate Time Series Prediction for Outage Prediction and Diagnosis
Authors:
Milan Jain,
Burcu O. Mutlu,
Caleb Stam,
Jan Strube,
Brian A. Schupbach,
Jason M. St. John,
William A. Pellico
Abstract:
The Main Control Room of the Fermilab accelerator complex continuously gathers extensive time-series data from thousands of sensors monitoring the beam. However, unplanned events such as trips or voltage fluctuations often result in beam outages, causing operational downtime. This downtime not only consumes operator effort in diagnosing and addressing the issue but also leads to unnecessary energy consumption by idle machines awaiting beam restoration. The current threshold-based alarm system is reactive and faces challenges including frequent false alarms and inconsistent outage-cause labeling. To address these limitations, we propose an AI-enabled framework that leverages predictive analytics and automated labeling. Using data from 2,703 Linac devices and 80 operator-labeled outages, we evaluate state-of-the-art deep learning architectures, including recurrent, attention-based, and linear models, for beam outage prediction. Additionally, we assess a Random Forest-based labeling system for providing consistent, confidence-scored outage annotations. Our findings highlight the strengths and weaknesses of these architectures for beam outage prediction and identify critical gaps that must be addressed to fully harness AI for transitioning downtime handling from reactive to predictive, ultimately reducing downtime and improving decision-making in accelerator management.
Submitted 2 January, 2025;
originally announced January 2025.
-
Performance and Power: Systematic Evaluation of AI Workloads on Accelerators with CARAML
Authors:
Chelsea Maria John,
Stepan Nassyr,
Carolin Penke,
Andreas Herten
Abstract:
The rapid advancement of machine learning (ML) technologies has driven the development of specialized hardware accelerators designed to facilitate more efficient model training. This paper introduces the CARAML benchmark suite, which is employed to assess performance and energy consumption during the training of transformer-based large language models and computer vision models on a range of hardware accelerators, including systems from NVIDIA, AMD, and Graphcore. CARAML provides a compact, automated, extensible, and reproducible framework for assessing the performance and energy of ML workloads across various novel hardware architectures. The design and implementation of CARAML, along with a custom power measurement tool called jpwr, are discussed in detail.
Submitted 29 October, 2024; v1 submitted 19 September, 2024;
originally announced September 2024.
-
Application-Driven Exascale: The JUPITER Benchmark Suite
Authors:
Andreas Herten,
Sebastian Achilles,
Damian Alvarez,
Jayesh Badwaik,
Eric Behle,
Mathis Bode,
Thomas Breuer,
Daniel Caviedes-Voullième,
Mehdi Cherti,
Adel Dabah,
Salem El Sayed,
Wolfgang Frings,
Ana Gonzalez-Nicolas,
Eric B. Gregory,
Kaveh Haghighi Mood,
Thorsten Hater,
Jenia Jitsev,
Chelsea Maria John,
Jan H. Meinke,
Catrin I. Meyer,
Pavel Mezentsev,
Jan-Oliver Mirus,
Stepan Nassyr,
Carolin Penke,
Manoel Römmer,
et al. (6 additional authors not shown)
Abstract:
Benchmarks are essential in the design of modern HPC installations, as they define key aspects of system components. Beyond synthetic workloads, it is crucial to include real applications that represent user requirements into benchmark suites, to guarantee high usability and widespread adoption of a new system. Given the significant investments in leadership-class supercomputers of the exascale era, this is even more important and necessitates alignment with a vision of Open Science and reproducibility. In this work, we present the JUPITER Benchmark Suite, which incorporates 16 applications from various domains. It was designed for and used in the procurement of JUPITER, the first European exascale supercomputer. We identify requirements and challenges and outline the project and software infrastructure setup. We provide descriptions and scalability studies of selected applications and a set of key takeaways. The JUPITER Benchmark Suite is released as open source software with this work at https://github.com/FZJ-JSC/jubench.
Submitted 30 August, 2024;
originally announced August 2024.
-
Noise2Noise Denoising of CRISM Hyperspectral Data
Authors:
Robert Platt,
Rossella Arcucci,
Cédric M. John
Abstract:
Hyperspectral data acquired by the Compact Reconnaissance Imaging Spectrometer for Mars (CRISM) have allowed for unparalleled mapping of the surface mineralogy of Mars. Due to sensor degradation over time, a significant portion of the recently acquired data is considered unusable. Here a new data-driven model architecture, Noise2Noise4Mars (N2N4M), is introduced to remove noise from CRISM images. Our model is self-supervised and does not require zero-noise target data, making it well suited for use in Planetary Science applications where high quality labelled data is scarce. We demonstrate its strong performance on synthetic-noise data and CRISM images, and its impact on downstream classification performance, outperforming benchmark methods on most metrics. This allows for detailed analysis for critical sites of interest on the Martian surface, including proposed lander sites.
Submitted 26 March, 2024;
originally announced March 2024.
-
An Exploration of Clustering Algorithms for Customer Segmentation in the UK Retail Market
Authors:
Jeen Mary John,
Olamilekan Shobayo,
Bayode Ogunleye
Abstract:
Recently, people's awareness of online purchases has significantly risen. This has given rise to online retail platforms and the need for a better understanding of customer purchasing behaviour. Retail companies are pressed with the need to deal with a high volume of customer purchases, which requires sophisticated approaches to perform more accurate and efficient customer segmentation. Customer segmentation is a marketing analytical tool that aids customer-centric service and thus enhances profitability. In this paper, we aim to develop a customer segmentation model to improve decision-making processes in the retail market industry. To achieve this, we employed a UK-based online retail dataset obtained from the UCI machine learning repository. The retail dataset consists of 541,909 customer records and eight features. Our study adopted the RFM (recency, frequency, and monetary) framework to quantify customer values. Thereafter, we compared several state-of-the-art (SOTA) clustering algorithms, namely, K-means clustering, the Gaussian mixture model (GMM), density-based spatial clustering of applications with noise (DBSCAN), agglomerative clustering, and balanced iterative reducing and clustering using hierarchies (BIRCH). The results showed the GMM outperformed other approaches, with a Silhouette Score of 0.80.
Submitted 6 February, 2024;
originally announced February 2024.
-
Beyond PID Controllers: PPO with Neuralized PID Policy for Proton Beam Intensity Control in Mu2e
Authors:
Chenwei Xu,
Jerry Yao-Chieh Hu,
Aakaash Narayanan,
Mattson Thieme,
Vladimir Nagaslaev,
Mark Austin,
Jeremy Arnold,
Jose Berlioz,
Pierrick Hanlet,
Aisha Ibrahim,
Dennis Nicklaus,
Jovan Mitrevski,
Jason Michael St. John,
Gauri Pradhan,
Andrea Saewert,
Kiyomi Seiya,
Brian Schupbach,
Randy Thurman-Keup,
Nhan Tran,
Rui Shi,
Seda Ogrenci,
Alexis Maya-Isabelle Shuping,
Kyle Hazelwood,
Han Liu
Abstract:
We introduce a novel Proximal Policy Optimization (PPO) algorithm aimed at addressing the challenge of maintaining a uniform proton beam intensity delivery in the Muon to Electron Conversion Experiment (Mu2e) at Fermi National Accelerator Laboratory (Fermilab). Our primary objective is to regulate the spill process to ensure a consistent intensity profile, with the ultimate goal of creating an automated controller capable of providing real-time feedback and calibration of the Spill Regulation System (SRS) parameters on a millisecond timescale. We treat the Mu2e accelerator system as a Markov Decision Process suitable for Reinforcement Learning (RL), utilizing PPO to reduce bias and enhance training stability. A key innovation in our approach is the integration of a neuralized Proportional-Integral-Derivative (PID) controller into the policy function, resulting in a significant improvement in the Spill Duty Factor (SDF) by 13.6%, surpassing the performance of the current PID controller baseline by an additional 1.6%. This paper presents the preliminary offline results based on a differentiable simulator of the Mu2e accelerator. It lays the groundwork for real-time implementations and applications, representing a crucial step towards automated proton beam intensity control for the Mu2e experiment.
Submitted 28 December, 2023;
originally announced December 2023.
-
Local monotone operator learning using non-monotone operators: MnM-MOL
Authors:
Maneesh John,
Jyothi Rikhab Chand,
Mathews Jacob
Abstract:
The recovery of magnetic resonance (MR) images from undersampled measurements is a key problem that has seen extensive research in recent years. Unrolled approaches, which rely on end-to-end training of convolutional neural network (CNN) blocks within iterative reconstruction algorithms, offer state-of-the-art performance. These algorithms require a large amount of memory during training, making them difficult to employ in high-dimensional applications. Deep equilibrium (DEQ) models and the recent monotone operator learning (MOL) approach were introduced to eliminate the need for unrolling, thus reducing the memory demand during training. Both approaches require a Lipschitz constraint on the network to ensure that the forward and backpropagation iterations converge. Unfortunately, the constraint often results in reduced performance compared to unrolled methods. The main focus of this work is to relax the constraint on the CNN block in two different ways. Inspired by convex-non-convex regularization strategies, we now impose the monotone constraint on the sum of the gradient of the data term and the CNN block, rather than constrain the CNN itself to be a monotone operator. This approach enables the CNN to learn possibly non-monotone score functions, which can translate to improved performance. In addition, we only restrict the operator to be monotone in a local neighborhood around the image manifold. Our theoretical results show that the proposed algorithm is guaranteed to converge to the fixed point and that the solution is robust to input perturbations, provided that it is initialized close to the true solution. Our empirical results show that the relaxed constraints translate to improved performance and that the approach enjoys robustness to input perturbations similar to MOL.
Submitted 1 December, 2023;
originally announced December 2023.
-
A Simple Illustration of Interleaved Learning using Kalman Filter for Linear Least Squares
Authors:
Majnu John,
Yihren Wu
Abstract:
Interleaved learning in machine learning algorithms is a biologically inspired training method with promising results. In this short note, we illustrate the interleaving mechanism via a simple statistical and optimization framework based on Kalman Filter for Linear Least Squares.
Submitted 21 September, 2023;
originally announced October 2023.
-
e-G2C: A 0.14-to-8.31 μJ/Inference NN-based Processor with Continuous On-chip Adaptation for Anomaly Detection and ECG Conversion from EGM
Authors:
Yang Zhao,
Yongan Zhang,
Yonggan Fu,
Xu Ouyang,
Cheng Wan,
Shang Wu,
Anton Banta,
Mathews M. John,
Allison Post,
Mehdi Razavi,
Joseph Cavallaro,
Behnaam Aazhang,
Yingyan Lin
Abstract:
This work presents the first silicon-validated dedicated EGM-to-ECG (G2C) processor, dubbed e-G2C, featuring continuous lightweight anomaly detection, event-driven coarse/precise conversion, and on-chip adaptation. e-G2C utilizes neural network (NN) based G2C conversion and integrates 1) an architecture supporting anomaly detection and coarse/precise conversion via time multiplexing to balance effectiveness and power, 2) an algorithm-hardware co-designed vector-wise sparsity resulting in a 1.6-1.7× speedup, 3) hybrid dataflows for achieving near-100% utilization for normal/depth-wise (DW)/point-wise (PW) convolutions (Convs), and 4) an on-chip detection threshold adaptation engine for continuous effectiveness. The achieved 0.14-8.31 μJ/inference energy efficiency outperforms prior art under similar complexity, promising real-time detection/conversion and possibly life-critical interventions.
Submitted 23 July, 2022;
originally announced September 2022.
-
On Connections between Opacity and Security in Linear Systems
Authors:
Varkey M. John,
Vaibhav Katewa
Abstract:
Opacity and attack detectability are important properties for any system as they allow the states to remain private and malicious attacks to be detected, respectively. In this paper, we show that a fundamental trade-off exists between these properties for a linear dynamical system, in the sense that if an opaque system is subjected to attacks, all attacks cannot be detected. We first characterize the opacity conditions for the system in terms of its weakly unobservable subspace (WUS) and show that the number of opaque states is proportional to the size of the WUS. Further, we establish conditions under which increasing the opaque sets also increases the set of undetectable attacks. This highlights a fundamental trade-off between security and privacy. We demonstrate application of our results on a remotely controlled automotive system.
Submitted 13 June, 2022;
originally announced June 2022.
-
Deep Image Prior using Stein's Unbiased Risk Estimator: SURE-DIP
Authors:
Maneesh John,
Hemant Kumar Aggarwal,
Qing Zou,
Mathews Jacob
Abstract:
Deep learning algorithms that rely on extensive training data are revolutionizing image recovery from ill-posed measurements. Training data is scarce in many imaging applications, including ultra-high-resolution imaging. The deep image prior (DIP) algorithm was introduced for single-shot image recovery, completely eliminating the need for training data. A challenge with this scheme is the need for early stopping to minimize the overfitting of the CNN parameters to the noise in the measurements. We introduce a generalized Stein's unbiased risk estimate (GSURE) loss metric to minimize the overfitting. Our experiments show that the SURE-DIP approach minimizes the overfitting issues, thus offering significantly improved performance over classical DIP schemes. We also use the SURE-DIP approach with model-based unrolling architectures, which offers improved performance over direct inversion schemes.
Submitted 21 November, 2021;
originally announced November 2021.
-
RT-RCG: Neural Network and Accelerator Search Towards Effective and Real-time ECG Reconstruction from Intracardiac Electrograms
Authors:
Yongan Zhang,
Anton Banta,
Yonggan Fu,
Mathews M. John,
Allison Post,
Mehdi Razavi,
Joseph Cavallaro,
Behnaam Aazhang,
Yingyan Lin
Abstract:
There exists a gap in terms of the signals provided by pacemakers (i.e., intracardiac electrogram (EGM)) and the signals doctors use (i.e., 12-lead electrocardiogram (ECG)) to diagnose abnormal rhythms. Therefore, the former, even if remotely transmitted, are not sufficient for doctors to provide a precise diagnosis, let alone make a timely intervention. To close this gap and make a heuristic step towards real-time critical intervention in instant response to irregular and infrequent ventricular rhythms, we propose a new framework dubbed RT-RCG to automatically search for (1) efficient Deep Neural Network (DNN) structures and then (2) corresponding accelerators, to enable Real-Time and high-quality Reconstruction of ECG signals from EGM signals. Specifically, RT-RCG proposes a new DNN search space tailored for ECG reconstruction from EGM signals, and incorporates a differentiable acceleration search (DAS) engine to efficiently navigate over the large and discrete accelerator design space to generate optimized accelerators. Extensive experiments and ablation studies under various settings consistently validate the effectiveness of our RT-RCG. To the best of our knowledge, RT-RCG is the first to leverage neural architecture search (NAS) to simultaneously tackle both reconstruction efficacy and efficiency.
Submitted 3 November, 2021;
originally announced November 2021.
-
ENSURE: A General Approach for Unsupervised Training of Deep Image Reconstruction Algorithms
Authors:
Hemant Kumar Aggarwal,
Aniket Pramanik,
Maneesh John,
Mathews Jacob
Abstract:
Image reconstruction using deep learning algorithms offers improved reconstruction quality and lower reconstruction time than classical compressed sensing and model-based algorithms. Unfortunately, clean and fully sampled ground-truth data to train the deep networks is often unavailable in several applications, restricting the applicability of the above methods. We introduce a novel metric termed the ENsemble Stein's Unbiased Risk Estimate (ENSURE) framework, which can be used to train deep image reconstruction algorithms without fully sampled and noise-free images. The proposed framework is the generalization of the classical SURE and GSURE formulation to the setting where the images are sampled by different measurement operators, chosen randomly from a set. We evaluate the expectation of the GSURE loss functions over the sampling patterns to obtain the ENSURE loss function. We show that this loss is an unbiased estimate for the true mean-square error, which offers a better alternative to GSURE, which only offers an unbiased estimate for the projected error. Our experiments show that the networks trained with this loss function can offer reconstructions comparable to the supervised setting. While we demonstrate this framework in the context of MR image recovery, the ENSURE framework is generally applicable to arbitrary inverse problems.
Submitted 2 December, 2022; v1 submitted 20 October, 2020;
originally announced October 2020.
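The abstract above builds on classical SURE. As a minimal sketch of that starting point only (not the ENSURE loss itself, whose form the abstract does not give), the block below evaluates SURE for the denoising model y = x + n with n ~ N(0, sigma^2 I) and a linear shrinkage estimator f(y) = a*y, whose divergence is a*N. The function name and the choice of estimator are illustrative assumptions.

```python
def sure_linear_shrinkage(y, a, sigma):
    """Per-sample classical SURE for the estimator f(y) = a * y.

    Assumes y = x + n with i.i.d. Gaussian noise of standard deviation sigma.
    SURE/N = ||f(y) - y||^2 / N - sigma^2 + (2 * sigma^2 / N) * div f(y),
    and for f(y) = a * y the divergence is a * N.
    """
    n = len(y)
    residual = sum((a * v - v) ** 2 for v in y) / n
    divergence_term = 2.0 * sigma ** 2 * a  # (2 sigma^2 / N) * (a * N)
    return residual - sigma ** 2 + divergence_term


# Hypothetical noisy measurements; no clean reference x is needed.
estimate = sure_linear_shrinkage([1.0, 2.0, 3.0], a=0.5, sigma=1.0)
```

The point of the sketch is that the risk estimate depends only on the noisy data y, the noise level, and the estimator, which is what makes training without clean ground truth possible in the SURE family of losses.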
-
Regularized deep learning with nonconvex penalties
Authors:
Sujit Vettam,
Majnu John
Abstract:
Regularization methods are often employed in deep learning neural networks (DNNs) to prevent overfitting. For penalty-based DNN regularization methods, convex penalties are typically considered because of their optimization guarantees. Recent theoretical work has shown that nonconvex penalties that satisfy certain regularity conditions are also guaranteed to perform well with standard optimization algorithms. In this paper, we examine new and currently existing nonconvex penalties for DNN regularization. We provide theoretical justifications for the new penalties and also assess the performance of all penalties with DNN analyses of seven datasets.
Submitted 19 November, 2020; v1 submitted 11 September, 2019;
originally announced September 2019.
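The abstract above does not name the penalties it studies. As a hedged illustration of what a nonconvex penalty looks like next to the convex L1 penalty, the sketch below uses the minimax concave penalty (MCP), a well-known example chosen here as an assumption; the function names and the way the penalty is added to a data loss are likewise illustrative.

```python
def l1_penalty(w, lam):
    """Convex L1 penalty on a single weight, for comparison."""
    return lam * abs(w)


def mcp_penalty(w, lam, gamma):
    """Minimax concave penalty on a single weight (lam > 0, gamma > 1).

    Grows like the L1 penalty near zero, then flattens to the constant
    gamma * lam^2 / 2 once |w| exceeds gamma * lam, so large weights are
    not shrunk further.
    """
    t = abs(w)
    if t <= gamma * lam:
        return lam * t - t * t / (2.0 * gamma)
    return 0.5 * gamma * lam * lam


def penalized_loss(data_loss, weights, lam, gamma):
    """Data loss plus the summed MCP penalty over all weights."""
    return data_loss + sum(mcp_penalty(w, lam, gamma) for w in weights)


# Hypothetical weights and data loss, only to show the call pattern.
total = penalized_loss(0.25, [0.0, 1.0, 10.0], lam=1.0, gamma=3.0)
```

Unlike the L1 penalty, which keeps growing with the weight magnitude, the MCP value is capped, which is the kind of behavior that motivates nonconvex alternatives.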
-
The efficacy of various machine learning models for multi-class classification of RNA-seq expression data
Authors:
Sterling Ramroach,
Melford John,
Ajay Joshi
Abstract:
Late diagnosis and high costs are key factors that negatively impact the care of cancer patients worldwide. Although the availability of biological markers for the diagnosis of cancer type is increasing, costs and reliability of tests currently present a barrier to the adoption of their routine use. There is a pressing need for accurate methods that enable early diagnosis and cover a broad range of cancers. The use of machine learning and RNA-seq expression analysis has shown promise in the classification of cancer type. However, research is inconclusive about which types of machine learning models are optimal. The suitability of five algorithms was assessed for the classification of 17 different cancer types. Each algorithm was fine-tuned and trained on the full array of 18,015 genes per sample, for 4,221 samples (75% of the dataset). They were then tested with 1,408 samples (25% of the dataset) for which cancer types were withheld to determine the accuracy of prediction. The results show that ensemble algorithms achieve 100% accuracy in the classification of 14 out of 17 types of cancer. The clustering and classification models, while faster than the ensembles, performed poorly due to the high level of noise in the dataset. When the features were reduced to a list of 20 genes, the ensemble algorithms maintained an accuracy above 95%, as opposed to the clustering and classification models.
Submitted 19 August, 2019;
originally announced August 2019.
-
Connecting the Dots: Privacy Leakage via Write-Access Patterns to the Main Memory
Authors:
Tara Merin John,
Syed Kamran Haider,
Hamza Omar,
Marten van Dijk
Abstract:
Data-dependent access patterns of an application to an untrusted storage system are notorious for leaking sensitive information about the user's data. Previous research has shown how an adversary capable of monitoring both read and write requests issued to the memory can correlate them with the application to learn its sensitive data. However, information leakage through only the write access patterns is less obvious and not well studied in the current literature. In this work, we demonstrate an actual attack on the power-side-channel-resistant, Montgomery's ladder based modular exponentiation algorithm commonly used in public key cryptography. We infer the complete 512-bit secret exponent in $\sim3.5$ minutes by virtue of just the write access patterns of the algorithm to the main memory. In order to learn the victim algorithm's write access patterns under realistic settings, we exploit a compromised DMA device to take frequent snapshots of the application's address space, and then run a simple differential analysis on these snapshots to find the write access sequence. The attack has been shown on an Intel Core(TM) i7-4790 3.60GHz processor-based system. We further discuss a possible attack on the McEliece public-key cryptosystem that also exploits the write-access patterns to learn the secret key.
Submitted 17 June, 2017; v1 submitted 13 February, 2017;
originally announced February 2017.
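To make the target of the attack above concrete, here is a minimal sketch of one common formulation of Montgomery's ladder for modular exponentiation. It is not the victim implementation from the paper; the function name and the `write_order` log are illustrative assumptions, added only to show that the order in which the two registers are written can depend on each exponent bit.

```python
def montgomery_ladder_pow(base, exponent, modulus):
    """Compute base**exponent % modulus with a Montgomery ladder.

    Every iteration performs one multiplication and one squaring regardless
    of the bit value, but which register receives the product and which
    receives the square depends on the bit. write_order records that order.
    """
    r0, r1 = 1 % modulus, base % modulus
    write_order = []
    for i in range(exponent.bit_length() - 1, -1, -1):
        bit = (exponent >> i) & 1
        if bit == 0:
            r1 = (r0 * r1) % modulus  # product written to R1 first
            r0 = (r0 * r0) % modulus  # then square written to R0
            write_order.append("R1,R0")
        else:
            r0 = (r0 * r1) % modulus  # product written to R0 first
            r1 = (r1 * r1) % modulus  # then square written to R1
            write_order.append("R0,R1")
    return r0, write_order


# Hypothetical small inputs; a real key would be hundreds of bits long.
result, writes = montgomery_ladder_pow(7, 5, 13)
```

In this sketch the sequence in `writes` mirrors the bits of the exponent from most to least significant, which illustrates why an observer of write addresses alone might recover the key even though the operation count per bit is constant.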
-
Bibliometrics and Information Retrieval: Creating Knowledge through Research Synergies
Authors:
Judit Bar-Ilan,
Rob Koopman,
Shenghui Wang,
Andrea Scharnhorst,
Marcus John,
Philipp Mayr,
Dietmar Wolfram
Abstract:
This panel brings together experts in bibliometrics and information retrieval to discuss how each of these two important areas of information science can help to inform the research of the other. There is a growing body of literature that capitalizes on the synergies created by combining methodological approaches of each to solve research problems and practical issues related to how information is created, stored, organized, retrieved and used. The session will begin with an overview of the common threads that exist between IR and metrics, followed by a summary of findings from the BIR workshops and examples of research projects that combine aspects of each area to benefit IR or metrics research areas, including search results ranking, semantic indexing and visualization. The panel will conclude with an engaging discussion with the audience to identify future areas of research and collaboration.
Submitted 29 August, 2016;
originally announced August 2016.
-
ZigBee Based Wireless Data Acquisition Using LabVIEW for Implementing Smart Driving Skill Evaluation System
Authors:
Mohit John,
Arun JosephPalai
Abstract:
The Smart Driving Skill Evaluation (SDSE) System presented in this paper expedites the testing of candidates aspiring for a driving license in a more efficient and transparent manner, as compared to the present manual testing procedure existing in most parts of the Asia and Pacific region. The manual test procedure is also subject to multiple limitations: it is time-consuming, costly, and heavily controlled by the examiner's experience in conducting the test. This technological solution is developed by customizing an 8051 controller-based embedded system and a LabVIEW-based virtual instrument. The controller module senses the motion of the test vehicle on the test track, referred to as zero rpm measurement, and the LabVIEW-based virtual instrument provides a Graphical User Interface for remote-end monitoring of the sensors embedded in the test track. The proposed technological solution for the automation of the existing manual test process enables the elimination of human intervention and improves driving test accuracy while going paperless with the Driving Skill Evaluation System. As a contribution to society, this technological solution can reduce the number of road accidents, because most accidents result from a lack of planning, anticipation, and control, which are highly dependent on driving skill.
Submitted 16 August, 2013;
originally announced August 2013.