-
Primer C-VAE: An interpretable deep learning primer design method to detect emerging virus variants
Authors:
Hanyu Wang,
Emmanuel K. Tsinda,
Anthony J. Dunn,
Francis Chikweto,
Alain B. Zemkoho
Abstract:
Motivation: PCR is more economical and quicker than Next Generation Sequencing for detecting target organisms, with primer design being a critical step. In epidemiology with rapidly mutating viruses, designing effective primers is challenging. Traditional methods require substantial manual intervention and struggle to ensure effective primer design across different strains. For organisms with large, similar genomes like Escherichia coli and Shigella flexneri, differentiating between species is also difficult but crucial.
Results: We developed Primer C-VAE, a model based on a Variational Auto-Encoder framework with Convolutional Neural Networks to identify variants and generate specific primers. Using SARS-CoV-2, our model classified variants (alpha, beta, gamma, delta, omicron) with 98% accuracy and generated variant-specific primers. These primers appeared with >95% frequency in target variants and <5% in others, showing good performance in in-silico PCR tests. For Alpha, Delta, and Omicron, our primer pairs produced fragments <200 bp, suitable for qPCR detection. The model also generated effective primers for organisms with longer gene sequences like E. coli and S. flexneri.
Conclusion: Primer C-VAE is an interpretable deep learning approach for developing specific primer pairs for target organisms. This flexible, semi-automated and reliable tool works regardless of sequence completeness and length, supports qPCR applications, and can be applied to organisms with large and highly similar genomes.
Submitted 3 March, 2025;
originally announced March 2025.
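The specificity criterion reported in this abstract (a primer appearing in more than 95% of target-variant sequences and fewer than 5% of the others) can be illustrated with a minimal sketch. The function names, the toy sequences, and the use of exact substring matching are assumptions made for illustration; they are not taken from the paper, which may count appearances differently.

```python
from typing import Iterable


def appearance_frequency(primer: str, sequences: Iterable[str]) -> float:
    """Fraction of sequences that contain the primer as an exact substring."""
    seqs = list(sequences)
    if not seqs:
        return 0.0
    hits = sum(1 for s in seqs if primer in s)
    return hits / len(seqs)


def is_variant_specific(
    primer: str,
    target_seqs: Iterable[str],
    other_seqs: Iterable[str],
    min_target: float = 0.95,  # threshold quoted in the abstract
    max_other: float = 0.05,   # threshold quoted in the abstract
) -> bool:
    """True if the primer is frequent in the target set and rare elsewhere."""
    return (
        appearance_frequency(primer, target_seqs) > min_target
        and appearance_frequency(primer, other_seqs) < max_other
    )


# Toy data (hypothetical, far shorter than real genomes).
target = ["AAACGTACGTTT"] * 20
others = ["AAATTTGGGCCC"] * 20
candidate = "CGTACG"

specific = is_variant_specific(candidate, target, others)
```

Exact matching is the simplest possible choice; a realistic check would probably tolerate a small number of mismatches.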
-
Deep learning forward and reverse primer design to detect SARS-CoV-2 emerging variants
Authors:
Hanyu Wang,
Emmanuel K. Tsinda,
Anthony J. Dunn,
Francis Chikweto,
Nusreen Ahmed,
Emanuela Pelosi,
Alain B. Zemkoho
Abstract:
Surges that have been observed at different periods in the number of COVID-19 cases are associated with the emergence of multiple SARS-CoV-2 (Severe Acute Respiratory Syndrome Coronavirus 2) variants. The design of methods to support laboratory detection is crucial in the monitoring of these variants. Hence, in this paper, we develop a semi-automated method to design both forward and reverse primer sets to detect SARS-CoV-2 variants. To proceed, we train deep Convolutional Neural Networks (CNNs) to classify labelled SARS-CoV-2 variants and identify partial genomic features needed for the forward and reverse Polymerase Chain Reaction (PCR) primer design. Our proposed approach supplements existing ones while promoting the emerging concept of neural network assisted primer design for PCR. Our CNN model was trained using a database of SARS-CoV-2 full-length genomes from GISAID and tested on a separate dataset from NCBI, with 98% accuracy for the classification of variants. This result is based on the development of three different methods of feature extraction, and the selected primer sequences for each SARS-CoV-2 variant detection (except Omicron) were present in more than 95% of sequences in an independent set of 5000 same-variant sequences, and below 5% in other independent datasets with 5000 sequences of each variant. In total, we obtain 22 forward and reverse primer pairs with flexible lengths (18-25 base pairs) and an expected amplicon length ranging between 42 and 3322 nucleotides. Besides the feature appearance, in-silico primer checks confirmed that the identified primer pairs are suitable for accurate SARS-CoV-2 variant detection by means of PCR tests.
Submitted 25 September, 2022;
originally announced September 2022.
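The expected amplicon length mentioned in this abstract follows from where a forward and a reverse primer bind on a template. The sketch below shows that arithmetic under simplifying assumptions: exact matching, a single binding site per primer, and a hypothetical toy template. The names `reverse_complement` and `amplicon_length` are illustrative and do not come from the paper.

```python
from typing import Optional

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA string over the alphabet A, C, G, T."""
    return seq.translate(_COMPLEMENT)[::-1]


def amplicon_length(template: str, forward: str, reverse: str) -> Optional[int]:
    """Length of the product spanned by a primer pair, or None if either fails to bind.

    The forward primer matches the template strand directly. The reverse
    primer anneals to the opposite strand, so its reverse complement is
    what appears on the template, downstream of the forward primer.
    """
    start = template.find(forward)
    if start < 0:
        return None
    site = reverse_complement(reverse)
    end = template.find(site, start + len(forward))
    if end < 0:
        return None
    return end + len(site) - start


# Hypothetical toy template: 3 nt flank, 8 nt forward site, 5 nt spacer,
# 8 nt reverse-primer site, 3 nt flank.
template = "GGG" + "ATGCATGC" + "TTTTT" + "CCGGAATT" + "AAA"
length = amplicon_length(template, "ATGCATGC", "AATTCCGG")
```

Here the product covers both primer sites and the spacer between them, so the computed length is 8 + 5 + 8 = 21 nucleotides.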
-
Deep learning methods for screening patients' S-ICD implantation eligibility
Authors:
Anthony J. Dunn,
Mohamed H. ElRefai,
Paul R. Roberts,
Stefano Coniglio,
Benedict M. Wiles,
Alain B. Zemkoho
Abstract:
Subcutaneous Implantable Cardioverter-Defibrillators (S-ICDs) are used for prevention of sudden cardiac death triggered by ventricular arrhythmias. T Wave Over Sensing (TWOS) is an inherent risk with S-ICDs which can lead to inappropriate shocks. A major predictor of TWOS is a high T:R ratio (the ratio between the amplitudes of the T and R waves). Currently patients' Electrocardiograms (ECGs) are screened over 10 seconds to measure the T:R ratio, determining the patients' eligibility for S-ICD implantation. Due to temporal variations in the T:R ratio, 10 seconds is not long enough to reliably determine the normal values of a patient's T:R ratio. In this paper, we develop a convolutional neural network (CNN) based model utilising phase space reconstruction matrices to predict T:R ratios from 10-second ECG segments without explicitly locating the R or T waves, thus avoiding the issue of TWOS. This tool can be used to automatically screen patients over a much longer period and provide an in-depth description of the behaviour of the T:R ratio over that period. The tool can also enable much more reliable and descriptive screenings to better assess patients' eligibility for S-ICD implantation.
Submitted 10 March, 2021;
originally announced March 2021.
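A phase space reconstruction matrix of the kind this abstract mentions can be illustrated with a time-delay embedding that is binned into a square grid of counts. This is only a sketch of one common construction: the delay, the number of bins, the min-max scaling, and the synthetic sine signal are all assumptions, and the paper's exact construction may differ.

```python
import math
from typing import List, Sequence


def phase_space_matrix(signal: Sequence[float], delay: int, bins: int) -> List[List[int]]:
    """Bin the delay-embedded points (x[t], x[t + delay]) into a bins-by-bins count matrix."""
    lo, hi = min(signal), max(signal)
    span = (hi - lo) or 1.0  # avoid division by zero for a constant signal
    matrix = [[0] * bins for _ in range(bins)]
    for t in range(len(signal) - delay):
        # Scale each coordinate to [0, bins) and clamp the maximum into the last bin.
        i = min(int((signal[t] - lo) / span * bins), bins - 1)
        j = min(int((signal[t + delay] - lo) / span * bins), bins - 1)
        matrix[i][j] += 1
    return matrix


# Synthetic stand-in for an ECG segment (hypothetical, not real patient data).
signal = [math.sin(2 * math.pi * k / 50) for k in range(500)]
matrix = phase_space_matrix(signal, delay=5, bins=16)
```

The resulting matrix can be treated as a small image, which is the kind of input a CNN can consume without first locating individual waves in the signal.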
-
Last Round Convergence and No-Instant Regret in Repeated Games with Asymmetric Information
Authors:
Le Cong Dinh,
Long Tran-Thanh,
Tri-Dung Nguyen,
Alain B. Zemkoho
Abstract:
This paper considers repeated games in which one player has more information about the game than the other players. In particular, we investigate repeated two-player zero-sum games where only the column player knows the payoff matrix A of the game. Suppose that while repeatedly playing this game, the row player chooses her strategy at each round by using a no-regret algorithm to minimize her (pseudo) regret. We develop a no-instant-regret algorithm for the column player to exhibit last round convergence to a minimax equilibrium. We show that our algorithm is efficient against a large set of popular no-regret algorithms of the row player, including the multiplicative weight update algorithm, the online mirror descent method/follow-the-regularized-leader, the linear multiplicative weight update algorithm, and the optimistic multiplicative weight update.
Submitted 15 February, 2023; v1 submitted 25 March, 2020;
originally announced March 2020.
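Among the row player's no-regret algorithms named in this abstract, the multiplicative weight update is the simplest to sketch. The example below is not the paper's column-player algorithm; it only shows the standard exponential-weights rule for a row player who is assumed to minimise the expected loss given by a payoff matrix A. The matrix, the learning rate `eta`, and the fixed column strategy are hypothetical.

```python
import math
from typing import List


def expected_losses(A: List[List[float]], y: List[float]) -> List[float]:
    """Loss of each pure row strategy against the column player's mixed strategy y."""
    return [sum(a * q for a, q in zip(row, y)) for row in A]


def mwu_step(strategy: List[float], losses: List[float], eta: float) -> List[float]:
    """One multiplicative weight update: downweight each action by exp(-eta * loss)."""
    weights = [p * math.exp(-eta * loss) for p, loss in zip(strategy, losses)]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical 2x2 game in which the column player always plays column 0.
A = [[1.0, 0.0],
     [0.0, 1.0]]
x = [0.5, 0.5]   # row player's initial mixed strategy
y = [1.0, 0.0]   # fixed column strategy (an assumption for this demo)

for _ in range(50):
    x = mwu_step(x, expected_losses(A, y), eta=0.5)
```

Against this fixed opponent the row player's strategy concentrates on row 1, the action with zero loss. The paper's question is how a better-informed column player should respond so that play converges in the last round, which this sketch does not attempt.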
-
Infrequent adverse event prediction in low carbon energy production using machine learning
Authors:
Stefano Coniglio,
Anthony J. Dunn,
Alain B. Zemkoho
Abstract:
We address the problem of predicting the occurrence of infrequent adverse events in the context of predictive maintenance. We cast the corresponding machine learning task as an imbalanced classification problem and propose a framework for solving it that is capable of leveraging different classifiers in order to predict the occurrence of an adverse event before it takes place. In particular, we focus on two applications arising in low-carbon energy production: foam formation in anaerobic digestion and condenser tube leakage in the steam turbines of a nuclear power station. The results of an extensive set of computational experiments show the effectiveness of the techniques that we propose.
Submitted 27 January, 2021; v1 submitted 19 January, 2020;
originally announced January 2020.
-
Robust toll pricing: A novel approach
Authors:
Trivikram Dokka,
Alain B. Zemkoho,
Sonali Sen Gupta,
Fabrice T. Nobibon
Abstract:
We study a robust toll pricing problem where toll setters and users have different levels of information when making their decisions. Toll setters do not have full information on the costs of the network and rely on historical information when determining toll rates, whereas users decide on the path to use from origin to destination knowing toll rates and having, in addition, more accurate traffic data. Toll setters often also face constraints on price experimentation, which means less opportunity for price revision. Motivated by this, we propose a novel robust pricing methodology for fixing prices in which we take a non-adversarial view of nature, unlike existing robust approaches. We show that our non-adversarial robustness results in less conservative pricing decisions than the traditional adversarial-nature setting. We first consider a single origin-destination parallel network in this new robust setting and formulate the robust toll pricing problem as a distributionally robust optimization problem, for which we develop an exact algorithm based on a mixed-integer programming formulation and a heuristic based on a two-point support distribution. We then extend our formulations to more general networks and show how our algorithms can be adapted to them. Finally, we illustrate the usefulness of our approach by means of numerical experiments both on randomly generated networks and on data recorded on the road network of the city of Chicago.
Submitted 5 December, 2017;
originally announced December 2017.