-
Wafer-Level Prototyping Tools for CMOS Bioelectronic Sensors
Authors:
Advait Madhavan,
Ruohong Shi,
Alokik Kanwal,
Glenn Holland,
Jacob M. Majikes,
Paul N. Patrone,
Anthony J. Kearsley,
Arvind Balijepalli
Abstract:
Integrating biology with complementary metal-oxide-semiconductor (CMOS) sensors can enable highly parallel measurements with minimal parasitic effects, significantly enhancing sensitivity. However, realizing this potential often requires overcoming substantial barriers related to design, fabrication, and heterogeneous integration. In this context, we present a comprehensive suite of tools and methods designed for wafer-scale biosensor prototyping that is sensitive, highly parallelizable, and manufacturable. A central component of our approach is a new initiative that allows for open-source multi-project wafers (MPW), giving all participants access to the designs submitted by others. We demonstrate that this strategy not only promotes design reuse but also facilitates advanced back-end-of-line (BEOL) fabrication techniques, improving the manufacturability and process yield of CMOS biosensors. Developing CMOS-based biosensors also involves the challenge of heterogeneous integration, which includes external electrical, mechanical, and fluid layers. We demonstrate simple modular designs that enable such integration for sample delivery and signal readout. Finally, we showcase the effectiveness of our approach in measuring the hybridization of DNA molecules by focusing on data acquisition and machine learning (ML) methods that leverage the parallelism of the sensors to enable robust classification of desirable analyte interactions.
Submitted 28 September, 2025;
originally announced September 2025.
-
Probabilistic Modeling of Antibody Kinetics Post Infection and Vaccination: A Markov Chain Approach
Authors:
Rayanne A. Luke,
Prajakta Bedekar,
Lyndsey M. Muehling,
Glenda Canderan,
Yesun Lee,
Wesley A. Cheng,
Judith A. Woodfolk,
Jeffrey M. Wilson,
Pia S. Pannaraj,
Anthony J. Kearsley
Abstract:
Understanding the dynamics of antibody levels is crucial for characterizing the time-dependent response to immune events: either infections or vaccinations. The sequence and timing of these events significantly influence antibody level changes. Despite extensive interest in the topic in recent years and many experimental studies, the effect of immune event sequences on antibody levels is not well understood. Moreover, disease and vaccination prevalence in the population are time-dependent. This, alongside the complexities of personal antibody kinetics, makes it difficult to analyze a sample immune measurement from a population. As a solution, we design a rigorous mathematical characterization in terms of a time-inhomogeneous Markov chain model for event-to-event transitions coupled with a probabilistic framework for the post-event antibody kinetics of multiple immune events. We demonstrate that this is an ideal model for immune event sequences, referred to as personal trajectories. This novel modeling framework surpasses the susceptible-infected-recovered (SIR) characterizations by rigorously tracking the probability distribution of population antibody response across time. To illustrate our ideas, we apply our mathematical framework to longitudinal severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) data from individuals with multiple documented infection and vaccination events. Our work is an important step towards a comprehensive understanding of antibody kinetics that could lead to an effective way to analyze the protective power of natural immunity or vaccination, predict missed immune events at an individual level, and inform booster timing recommendations.
Submitted 4 August, 2025; v1 submitted 14 July, 2025;
originally announced July 2025.
-
Uncertainty Quantification of Antibody Measurements: Physical Principles and Implications for Standardization
Authors:
Paul N. Patrone,
Lili Wang,
Sheng Lin-Gibson,
Anthony J. Kearsley
Abstract:
Harmonizing serology measurements is critical for identifying reference materials that permit standardization and comparison of results across different diagnostic platforms. However, the theoretical foundations of such tasks have yet to be fully explored in the context of antibody thermodynamics and uncertainty quantification (UQ). This has restricted the usefulness of standards currently deployed and limited the scope of materials considered as viable reference material. To address these problems, we develop rigorous theories of antibody normalization and harmonization, as well as formulate a probabilistic framework for defining correlates of protection. We begin by proposing a mathematical definition of harmonization equipped with structure needed to quantify uncertainty associated with the choice of standard, assay, etc. We then show how a thermodynamic description of serology measurements (i) relates this structure to the Gibbs free-energy of antibody binding, and thereby (ii) induces a regression analysis that directly harmonizes measurements. We supplement this with a novel, optimization-based normalization (not harmonization!) method that checks for consistency between reference and sample dilution curves. Last, we relate these analyses to uncertainty propagation techniques to estimate correlates of protection. A key result of these analyses is that under physically reasonable conditions, the choice of reference material does not increase uncertainty associated with harmonization or correlates of protection. We provide examples and validate main ideas in the context of an interlab study that lays the foundation for using monoclonal antibodies as a reference for SARS-CoV-2 serology measurements.
Submitted 30 August, 2024;
originally announced September 2024.
-
Prevalence estimation methods for time-dependent antibody kinetics of infected and vaccinated individuals: a graph-theoretic approach
Authors:
Prajakta Bedekar,
Rayanne A. Luke,
Anthony J. Kearsley
Abstract:
Immune events such as infection, vaccination, and a combination of the two result in distinct time-dependent antibody responses in affected individuals. These responses and event prevalences combine non-trivially to govern antibody levels sampled from a population. Time-dependence and disease prevalence pose considerable modeling challenges that need to be addressed to provide a rigorous mathematical underpinning of the underlying biology. We propose a time-inhomogeneous Markov chain model for event-to-event transitions coupled with a probabilistic framework for antibody kinetics and demonstrate its use in a setting in which individuals can be infected or vaccinated but not both. We prove the equivalence of this approach to the framework developed in our previous work. Synthetic data are used to demonstrate the modeling process and conduct prevalence estimation via transition probability matrices. This approach is ideal for modeling sequences of infections and vaccinations, or personal trajectories in a population, making it an important first step towards a mathematical characterization of reinfection, vaccination boosting, and cross-events of infection after vaccination or vice versa.
Submitted 13 April, 2024;
originally announced April 2024.
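The event-to-event transition model described in the abstract above can be illustrated with a minimal Python sketch. It assumes three hypothetical states (naive, infected, vaccinated), treats the infected and vaccinated states as absorbing to mirror the "infected or vaccinated but not both" setting, and uses made-up per-step rates. The function names, state layout, and numbers are illustrative assumptions, not taken from the paper.

```python
# Sketch of a time-inhomogeneous Markov chain over three hypothetical states:
# index 0 = naive, 1 = infected, 2 = vaccinated.
# The rate functions below are illustrative assumptions, not fitted values.

def transition_matrix(t, infect_rate, vaccinate_rate):
    """Return the one-step transition matrix at time step t."""
    q_i = infect_rate(t)
    q_v = vaccinate_rate(t)
    if q_i < 0.0 or q_v < 0.0 or q_i + q_v > 1.0:
        raise ValueError("per-step rates must be non-negative and sum to at most 1")
    return [
        [1.0 - q_i - q_v, q_i, q_v],  # naive stays naive or has one event
        [0.0, 1.0, 0.0],              # infected is absorbing in this setting
        [0.0, 0.0, 1.0],              # vaccinated is absorbing in this setting
    ]


def propagate(p0, steps, infect_rate, vaccinate_rate):
    """Propagate the state distribution: p_{t+1} = p_t P(t)."""
    p = list(p0)
    history = [p]
    for t in range(steps):
        P = transition_matrix(t, infect_rate, vaccinate_rate)
        p = [sum(p[i] * P[i][j] for i in range(3)) for j in range(3)]
        history.append(p)
    return history


# Usage: infection pressure grows slowly; vaccination starts at step 5.
history = propagate(
    [1.0, 0.0, 0.0],
    20,
    infect_rate=lambda t: 0.02 + 0.001 * t,
    vaccinate_rate=lambda t: 0.0 if t < 5 else 0.05,
)
```

Each entry of `history` is the fraction of the population in each state at that step, so the time-dependent prevalences can be read off directly from the propagated distribution.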
-
Optimal classification and generalized prevalence estimates for diagnostic settings with more than two classes
Authors:
Rayanne A. Luke,
Anthony J. Kearsley,
Paul N. Patrone
Abstract:
An accurate multiclass classification strategy is crucial to interpreting antibody tests. However, traditional methods based on confidence intervals or receiver operating characteristics lack clear extensions to settings with more than two classes. We address this problem by developing a multiclass classification based on probabilistic modeling and optimal decision theory that minimizes the convex combination of false classification rates. The classification process is challenging when the relative fraction of the population in each class, or generalized prevalence, is unknown. Thus, we also develop a method for estimating the generalized prevalence of test data that is independent of classification. We validate our approach on serological data with severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) naïve, previously infected, and vaccinated classes. Synthetic data are used to demonstrate that (i) prevalence estimates are unbiased and converge to true values and (ii) our procedure applies to arbitrary measurement dimensions. In contrast to the binary problem, the multiclass setting offers wide-reaching utility as the most general framework and provides new insight into prevalence estimation best practices.
Submitted 5 October, 2022;
originally announced October 2022.
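As a rough illustration of the classification idea in the abstract above, the following Python sketch assigns a measurement to the class with the largest prevalence-weighted conditional density, which is the standard Bayes decision rule for minimizing a prevalence-weighted sum of misclassification rates. The one-dimensional Gaussian class models, their parameters, and the function names are hypothetical choices made for this example only; the paper's own models may differ.

```python
import math

# Hypothetical 1D Gaussian models (mean, standard deviation) for each class.
CLASS_MODELS = {
    "naive": (0.0, 1.0),
    "infected": (3.0, 1.0),
    "vaccinated": (6.0, 1.0),
}


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution evaluated at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def classify(x, class_models, prevalences):
    """Assign x to the class maximizing prevalence * conditional density."""
    scores = {
        name: prevalences[name] * gaussian_pdf(x, mean, std)
        for name, (mean, std) in class_models.items()
    }
    return max(scores, key=scores.get)


# Usage: the same measurement can change class when the prevalences change.
equal = {"naive": 1 / 3, "infected": 1 / 3, "vaccinated": 1 / 3}
skewed = {"naive": 0.98, "infected": 0.01, "vaccinated": 0.01}
label_equal = classify(1.6, CLASS_MODELS, equal)
label_skewed = classify(1.6, CLASS_MODELS, skewed)
```

The dependence of the label on the prevalences is why an estimate of the generalized prevalence is needed before classification can be made optimal.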
-
Modeling in higher dimensions to improve diagnostic testing accuracy: theory and examples for multiplex saliva-based SARS-CoV-2 antibody assays
Authors:
Rayanne A. Luke,
Anthony J. Kearsley,
Nora Pisanic,
Yukari C. Manabe,
David L. Thomas,
Christopher D. Heaney,
Paul N. Patrone
Abstract:
The severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) pandemic has emphasized the importance and challenges of correctly interpreting antibody test results. Identification of positive and negative samples requires a classification strategy with low error rates, which is hard to achieve when the corresponding measurement values overlap. Additional uncertainty arises when classification schemes fail to account for complicated structure in data. We address these problems through a mathematical framework that combines high dimensional data modeling and optimal decision theory. Specifically, we show that appropriately increasing the dimension of data better separates positive and negative populations and reveals nuanced structure that can be described in terms of mathematical models. We combine these models with optimal decision theory to yield a classification scheme that better separates positive and negative samples relative to traditional methods such as confidence intervals (CIs) and receiver operating characteristics. We validate the usefulness of this approach in the context of a multiplex salivary SARS-CoV-2 immunoglobulin G assay dataset. This example illustrates how our analysis: (i) improves the assay accuracy (e.g. lowers classification errors by up to 42 % compared to CI methods); (ii) reduces the number of indeterminate samples when an inconclusive class is permissible (e.g. by 40 % compared to the original analysis of the example multiplex dataset); and (iii) decreases the number of antigens needed to classify samples. Our work showcases the power of mathematical modeling in diagnostic classification and highlights a method that can be adopted broadly in public health and clinical settings.
Submitted 9 November, 2022; v1 submitted 28 June, 2022;
originally announced June 2022.
-
Reproducibility in Cytometry: Signals Analysis and its Connection to Uncertainty Quantification
Authors:
Paul N. Patrone,
Matthew DiSalvo,
Anthony J. Kearsley,
Geoffrey B. McFadden,
Gregory A. Cooksey
Abstract:
Signals analysis for cytometry remains a challenging task that has a significant impact on uncertainty. Conventional cytometers assume that individual measurements are well characterized by simple properties such as the signal area, width, and height. However, these approaches have difficulty distinguishing inherent biological variability from instrument artifacts and operating conditions. As a result, it is challenging to quantify uncertainty in the properties of individual cells and perform tasks such as doublet deconvolution. We address these problems via signals analysis techniques that use scale transformations to: (I) separate variation in biomarker expression from effects due to flow conditions and particle size; (II) quantify reproducibility associated with a given laser interrogation region; (III) estimate uncertainty in measurement values on a per-event basis; and (IV) extract the singlets that make up a multiplet. The key idea behind this approach is to model how variable operating conditions deform the signal shape and then use constrained optimization to "undo" these deformations for measured signals; residuals to this process characterize reproducibility. Using a recently developed microfluidic cytometer, we demonstrate that these techniques can account for instrument and measurand induced variability with a residual uncertainty of less than 2.5% in the signal shape and less than 1% in integrated area.
Submitted 4 June, 2022;
originally announced June 2022.
-
Optimal Decision Theory for Diagnostic Testing: Minimizing Indeterminate Classes with Applications to Saliva-Based SARS-CoV-2 Antibody Assays
Authors:
Paul N. Patrone,
Prajakta Bedekar,
Nora Pisanic,
Yukari C. Manabe,
David L. Thomas,
Christopher D. Heaney,
Anthony J. Kearsley
Abstract:
In diagnostic testing, establishing an indeterminate class is an effective way to identify samples that cannot be accurately classified. However, such approaches also make testing less efficient and must be balanced against overall assay performance. We address this problem by reformulating data classification in terms of a constrained optimization problem that (i) minimizes the probability of labeling samples as indeterminate while (ii) ensuring that the remaining ones are classified with an average target accuracy X. We show that the solution to this problem is expressed in terms of a bathtub principle that holds out those samples with the lowest local accuracy up to an X-dependent threshold. To illustrate the usefulness of this analysis, we apply it to a multiplex, saliva-based SARS-CoV-2 antibody assay and demonstrate up to a 30 % reduction in the number of indeterminate samples relative to more traditional approaches.
Submitted 31 January, 2022;
originally announced February 2022.
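The holdout rule described in the abstract above can be sketched for a finite sample as follows. This Python example assumes each sample already has an estimated local accuracy (a hypothetical input), keeps the largest set of samples whose mean local accuracy is at least the target X, and labels the rest indeterminate. It is a discrete, equal-weight analogue written for illustration, not the authors' implementation.

```python
def hold_out_indeterminate(local_accuracy, target):
    """Split sample indices into (classified, indeterminate).

    Samples are kept in order of decreasing local accuracy for as long as
    the mean accuracy of the kept set stays at or above `target`. Because
    the running mean of a descending sequence never increases, stopping at
    the first violation yields the largest feasible kept set.
    """
    order = sorted(
        range(len(local_accuracy)),
        key=lambda i: local_accuracy[i],
        reverse=True,
    )
    kept = []
    total = 0.0
    for i in order:
        if (total + local_accuracy[i]) / (len(kept) + 1) < target:
            break
        total += local_accuracy[i]
        kept.append(i)
    kept_set = set(kept)
    held = [i for i in range(len(local_accuracy)) if i not in kept_set]
    return sorted(kept), held


# Usage with made-up local accuracies and a target average accuracy of 0.9.
classified, indeterminate = hold_out_indeterminate(
    [0.99, 0.60, 0.95, 0.55, 0.90], target=0.9
)
```

Here the two least reliable samples are held out, and the remaining three have a mean local accuracy above the target.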
-
Predicting Kovats Retention Indices Using Graph Neural Networks
Authors:
Chen Qu,
Barry I. Schneider,
Anthony J. Kearsley,
Walid Keyrouz,
Thomas C. Allison
Abstract:
The Kovats retention index is a dimensionless quantity that characterizes the rate at which a compound is processed through a gas chromatography column. This quantity is independent of many experimental variables and, as such, is considered a near-universal descriptor of retention time on a chromatography column. The Kovats retention indices of a large number of molecules have been determined experimentally. The "NIST 20: GC Method/Retention Index Library" database has collected and, more importantly, curated retention indices of a subset of these compounds, resulting in a highly valued reference database. The experimental data in the library form an ideal data set for training machine learning models for the prediction of retention indices of unknown compounds. In this article, we describe the training of a graph neural network model to predict the Kovats retention index for compounds in the NIST library and compare this approach with previous work \cite{2019Matyushin}. We predict the Kovats retention index with a mean unsigned error of 28 index units as compared to 44, the putative best result using a convolutional neural network \cite{2019Matyushin}. The NIST library also incorporates an estimation scheme based on a group contribution approach that achieves a mean unsigned error of 114 compared to the experimental data. Our method uses the same input data source as the group contribution approach, making it straightforward and convenient to apply to existing libraries. Our results convincingly demonstrate the predictive power of systematic, data-driven approaches that apply deep learning methodologies to chemical data and, for the data in the NIST 20 library, outperform previous models.
Submitted 29 December, 2020;
originally announced December 2020.
-
Improving Baseline Subtraction for Increased Sensitivity of Quantitative PCR Measurements
Authors:
Paul N. Patrone,
Anthony J. Kearsley,
Erica L. Romsos,
Peter M. Vallone
Abstract:
Motivated by the current COVID-19 health crisis, we examine the task of baseline subtraction for quantitative polymerase chain reaction (qPCR) measurements. In particular, we present an algorithm that leverages information obtained from non-template and/or DNA extraction-control experiments to remove systematic bias from amplification curves. We recast this problem in terms of mathematical optimization, i.e. by finding the amount of control signal that, when subtracted from an amplification curve, minimizes background noise. We demonstrate that this approach can yield a decade (order-of-magnitude) improvement in sensitivity relative to standard approaches, especially for data exhibiting late-cycle amplification. Critically, this increased sensitivity and accuracy promises more effective screening of viral DNA and a reduction in the rate of false negatives in diagnostic settings.
Submitted 11 April, 2020;
originally announced April 2020.
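The optimization idea in the abstract above, finding the amount of control signal to subtract so that background noise is minimized, can be sketched in Python. This example assumes a simple least-squares objective over a user-chosen set of early (pre-amplification) cycles, which gives a closed-form scale factor. The objective, the function names, and the `baseline_cycles` parameter are assumptions made for illustration; the paper's actual formulation may differ.

```python
def control_scale(signal, control, baseline_cycles):
    """Least-squares scale k minimizing sum((signal - k * control)^2)
    over the first `baseline_cycles` cycles (assumed to be background)."""
    s = signal[:baseline_cycles]
    c = control[:baseline_cycles]
    denom = sum(ci * ci for ci in c)
    if denom == 0.0:
        raise ValueError("control signal is identically zero in the baseline window")
    return sum(si * ci for si, ci in zip(s, c)) / denom


def subtract_baseline(signal, control, baseline_cycles):
    """Return the amplification curve with the scaled control removed."""
    k = control_scale(signal, control, baseline_cycles)
    return [si - k * ci for si, ci in zip(signal, control)]


# Usage with made-up fluorescence values: the first three cycles are
# pure background, and amplification appears only in the last cycle.
control = [1.0, 2.0, 3.0, 4.0]
signal = [2.0, 4.0, 6.0, 18.0]
corrected = subtract_baseline(signal, control, baseline_cycles=3)
```

In this toy case the fitted scale is 2, so the corrected curve is zero over the baseline window and retains only the late-cycle rise.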
-
The Role of Data Analysis in Uncertainty Quantification: Case Studies for Materials Modeling
Authors:
Paul N. Patrone,
Anthony J. Kearsley,
Andrew M. Dienstfrey
Abstract:
In computational materials science, mechanical properties are typically extracted from simulations by means of analysis routines that seek to mimic their experimental counterparts. However, simulated data often exhibit uncertainties that can propagate into final predictions in unexpected ways. Thus, modelers require data analysis tools that (i) address the problems posed by simulated data, and (ii) facilitate uncertainty quantification. In this manuscript, we discuss three case studies in materials modeling where careful data analysis can be leveraged to address specific instances of these issues. As a unifying theme, we highlight the idea that attention to physical and mathematical constraints surrounding the generation of computational data can significantly enhance its analysis.
Submitted 5 December, 2017;
originally announced December 2017.
-
Diffusion-limited Reactions in Nanoscale Electronics
Authors:
Ryan M. Evans,
Arvind Balijepalli,
Anthony J. Kearsley
Abstract:
A partial differential equation (PDE) was developed to describe time-dependent ligand-receptor interactions for applications in biosensing using field effect transistors (FET). The model describes biochemical interactions at the sensor surface (or biochemical gate) located at the bottom of a solution-well, which result in a time-dependent change in the FET conductance. It was shown that one can exploit the disparate length scales of the solution-well and biochemical gate to reduce the coupled PDE model to a single nonlinear integrodifferential equation (IDE) that describes the concentration of reacting species. Although this equation has a convolution integral with a singular kernel, a numerical approximation was constructed by applying the method of lines. The need for specialized quadrature techniques was obviated and numerical evidence strongly suggests that this method achieves first-order accuracy. Results reveal a depletion region on the biochemical gate, which non-uniformly alters the surface potential of the semiconductor.
Submitted 19 October, 2017;
originally announced October 2017.