-
Contemporary AI foundation models increase biological weapons risk
Authors:
Roger Brent,
T. Greg McKelvey Jr.
Abstract:
The rapid advancement of artificial intelligence has raised concerns about its potential to facilitate biological weapons development. We argue existing safety assessments of contemporary foundation AI models underestimate this risk, largely due to flawed assumptions and inadequate evaluation methods. First, assessments mistakenly assume biological weapons development requires tacit knowledge, or skills gained through hands-on experience that cannot be easily verbalized. Second, they rely on imperfect benchmarks that overlook how AI can uplift both nonexperts and already-skilled individuals. To challenge the tacit knowledge assumption, we examine cases where individuals without formal expertise, including a 2011 Norwegian ultranationalist who synthesized explosives, successfully carried out complex technical tasks. We also review efforts to document pathogen construction processes, highlighting how such tasks can be conveyed in text. We identify "elements of success" for biological weapons development that large language models can describe in words, including steps such as acquiring materials and performing technical procedures. Applying this framework, we find that advanced AI models Llama 3.1 405B, ChatGPT-4o, and Claude 3.5 Sonnet can accurately guide users through the recovery of live poliovirus from commercially obtained synthetic DNA, challenging recent claims that current models pose minimal biosecurity risk. We advocate for improved benchmarks, while acknowledging the window for meaningful implementation may have already closed.
Submitted 12 June, 2025;
originally announced June 2025.
-
Analysis of Interpolating Regression Models and the Double Descent Phenomenon
Authors:
Tomas McKelvey
Abstract:
A regression model with more parameters than data points in the training data is overparametrized and has the capability to interpolate the training data. Based on the classical bias-variance tradeoff expressions, it is commonly assumed that models which interpolate noisy training data generalize poorly. In some cases, this is not true: the best models obtained are overparametrized, and the testing error exhibits double descent behavior as the model order increases. In this contribution, we provide some analysis to explain the double descent phenomenon, first reported in the machine learning literature. We focus on interpolating models derived from the minimum norm solution to the classical least-squares problem and also briefly discuss model fitting using ridge regression. We derive a result based on the behavior of the smallest singular value of the regression matrix that explains the peak location and the double descent shape of the testing error as a function of model order.
Submitted 17 April, 2023;
originally announced April 2023.
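The abstract above centers on interpolating models obtained from the minimum norm solution to a least-squares problem. The following is a minimal sketch of that one ingredient only, not the paper's analysis: for a regression matrix X with more columns than rows and full row rank, the minimum norm interpolator is w = X^T (X X^T)^{-1} y. The function names, the toy data, and the small Gaussian-elimination helper are all assumptions made for illustration.

```python
def solve_linear(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are not modified.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular to working precision")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x


def min_norm_interpolator(X, y):
    """Return w = X^T (X X^T)^{-1} y, assuming X has full row rank and more columns than rows."""
    n, p = len(X), len(X[0])
    # Gram matrix X X^T (n x n).
    gram = [[sum(a * b for a, b in zip(X[i], X[j])) for j in range(n)] for i in range(n)]
    alpha = solve_linear(gram, y)
    return [sum(alpha[i] * X[i][k] for i in range(n)) for k in range(p)]


# Hypothetical toy problem: 2 data points, 3 parameters (overparametrized).
X = [[1.0, 0.0, 1.0],
     [0.0, 1.0, 1.0]]
y = [1.0, 2.0]
w = min_norm_interpolator(X, y)
predictions = [sum(xi * wi for xi, wi in zip(row, w)) for row in X]
```

In this toy problem the weights reproduce the training targets exactly. Other exact fits exist, for example [1, 2, 0], but they have a larger Euclidean norm.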
-
Analysis of the first Genetic Engineering Attribution Challenge
Authors:
Oliver M. Crook,
Kelsey Lane Warmbrod,
Greg Lipstein,
Christine Chung,
Christopher W. Bakerlee,
T. Greg McKelvey Jr.,
Shelly R. Holland,
Jacob L. Swett,
Kevin M. Esvelt,
Ethan C. Alley,
William J. Bradshaw
Abstract:
The ability to identify the designer of engineered biological sequences -- termed genetic engineering attribution (GEA) -- would help ensure due credit for biotechnological innovation, while holding designers accountable to the communities they affect. Here, we present the results of the first Genetic Engineering Attribution Challenge, a public data-science competition to advance GEA. Top-scoring teams dramatically outperformed previous models at identifying the true lab-of-origin of engineered sequences, including an increase in top-1 and top-10 accuracy of 10 percentage points. A simple ensemble of prizewinning models further increased performance. New metrics, designed to assess a model's ability to confidently exclude candidate labs, also showed major improvements, especially for the ensemble. Most winning teams adopted CNN-based machine-learning approaches; however, one team achieved very high accuracy with an extremely fast neural-network-free approach. Future work, including future competitions, should further explore a wide diversity of approaches for bringing GEA technology into practical use.
Submitted 14 October, 2021;
originally announced October 2021.
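The abstract above reports results as top-1 and top-10 accuracy for lab-of-origin prediction. As a rough illustration of how such a metric is usually computed (not the challenge's actual scoring code), the sketch below counts a prediction as correct when the true lab is among the k highest-scoring candidates. The function name, scores, and labels are hypothetical.

```python
def top_k_accuracy(scores, true_labels, k):
    """Fraction of samples whose true label is among the k highest-scoring labels."""
    hits = 0
    for row, truth in zip(scores, true_labels):
        # Label indices ordered from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda label: row[label], reverse=True)
        if truth in ranked[:k]:
            hits += 1
    return hits / len(true_labels)


# Hypothetical scores for 3 sequences over 3 candidate labs.
scores = [
    [0.7, 0.2, 0.1],
    [0.1, 0.3, 0.6],
    [0.4, 0.5, 0.1],
]
true_labels = [0, 1, 0]

top1 = top_k_accuracy(scores, true_labels, k=1)
top2 = top_k_accuracy(scores, true_labels, k=2)
```

Here only the first sequence is ranked correctly at k=1, while all three true labs fall within the top two candidates.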
-
CLIP: A Dataset for Extracting Action Items for Physicians from Hospital Discharge Notes
Authors:
James Mullenbach,
Yada Pruksachatkun,
Sean Adler,
Jennifer Seale,
Jordan Swartz,
T. Greg McKelvey,
Hui Dai,
Yi Yang,
David Sontag
Abstract:
Continuity of care is crucial to ensuring positive health outcomes for patients discharged from an inpatient hospital setting, and improved information sharing can help. To share information, caregivers write discharge notes containing action items to share with patients and their future caregivers, but these action items are easily lost due to the lengthiness of the documents. In this work, we describe our creation of a dataset of clinical action items annotated over MIMIC-III, the largest publicly available dataset of real clinical notes. This dataset, which we call CLIP, is annotated by physicians and covers 718 documents representing 100K sentences. We describe the task of extracting the action items from these documents as multi-aspect extractive summarization, with each aspect representing a type of action to be taken. We evaluate several machine learning models on this task, and show that the best models exploit in-domain language model pre-training on 59K unannotated documents, and incorporate context from neighboring sentences. We also propose an approach to pre-training data selection that allows us to explore the trade-off between size and domain-specificity of pre-training datasets for this task.
Submitted 4 June, 2021;
originally announced June 2021.
-
Knowledge Base Completion for Constructing Problem-Oriented Medical Records
Authors:
James Mullenbach,
Jordan Swartz,
T. Greg McKelvey,
Hui Dai,
David Sontag
Abstract:
Both electronic health records and personal health records are typically organized by data type, with medical problems, medications, procedures, and laboratory results chronologically sorted in separate areas of the chart. As a result, it can be difficult to find all of the relevant information for answering a clinical question about a given medical problem. A promising alternative is to instead organize by problems, with related medications, procedures, and other pertinent information all grouped together. A recent effort by Buchanan (2017) manually defined, through expert consensus, 11 medical problems and the relevant labs and medications for each. We show how to use machine learning on electronic health records to instead automatically construct these problem-based groupings of relevant medications, procedures, and laboratory tests. We formulate the learning task as one of knowledge base completion, and annotate a dataset that expands the set of problems from 11 to 32. We develop a model architecture that exploits both pre-trained concept embeddings and usage data relating the concepts contained in a longitudinal dataset from a large health system. We evaluate our algorithms' ability to suggest relevant medications, procedures, and lab tests, and find that the approach provides feasible suggestions even for problems that are hidden during training. The dataset, along with code to reproduce our results, is available at https://github.com/asappresearch/kbc-pomr.
Submitted 7 August, 2020; v1 submitted 27 April, 2020;
originally announced April 2020.
-
Building Efficient CNNs Using Depthwise Convolutional Eigen-Filters (DeCEF)
Authors:
Yinan Yu,
Samuel Scheidegger,
Tomas McKelvey
Abstract:
Deep Convolutional Neural Networks (CNNs) have been widely used in various domains due to their impressive capabilities. These models are typically composed of a large number of 2D convolutional (Conv2D) layers with numerous trainable parameters. To reduce the complexity of a network, compression techniques can be applied. These methods typically rely on the analysis of trained deep learning models. However, in some applications, for reasons such as particular data or system specifications and licensing restrictions, a pre-trained network may not be available. This would require the user to train a CNN from scratch. In this paper, we aim to find an alternative parameterization to Conv2D filters without relying on a pre-trained convolutional network. During the analysis, we observe that the effective rank of the vectorized Conv2D filters decreases with increasing depth in the network, which leads to the implementation of the Depthwise Convolutional Eigen-Filter (DeCEF) layer. Essentially, a DeCEF layer is a low-rank version of the Conv2D layer with significantly fewer trainable parameters and floating point operations (FLOPs). Our definition of the effective rank differs from that of previous work, and it is easy to implement in any deep learning framework. To evaluate the effectiveness of DeCEF, experiments are conducted on the benchmark datasets CIFAR-10 and ImageNet using various network architectures. The results show similar or higher accuracy and robustness using about 2/3 of the original parameters and about 2/3 of the FLOPs of the base network; these results are compared with state-of-the-art techniques.
Submitted 31 January, 2022; v1 submitted 21 October, 2019;
originally announced October 2019.
-
Learning Hierarchical Feature Space Using CLAss-specific Subspace Multiple Kernel -- Metric Learning for Classification
Authors:
Yinan Yu,
Tomas McKelvey
Abstract:
Metric learning for classification has been intensively studied over the last decade. The idea is to learn a metric space induced from a normed vector space on which data from different classes are well separated. Different measures of the separation thus lead to various designs of the objective function in the metric learning model. One classical metric is the Mahalanobis distance, where a linear transformation matrix is designed and applied to the original dataset to obtain a new subspace equipped with the Euclidean norm. A kernelized version has also been developed, followed by Multiple-Kernel learning models. In this paper, we consider metric learning to be the identification of the best kernel function with respect to a high class separability in the corresponding metric space. The contribution is twofold: 1) No pairwise computations are required, unlike in most metric learning techniques; 2) Better flexibility and lower computational complexity are achieved using the CLAss-Specific (Multiple) Kernel - Metric Learning (CLAS(M)K-ML). The proposed techniques can be considered as a preprocessing step to any kernel method or kernel approximation technique. An extension to a hierarchical learning structure is also proposed to further improve the classification performance, where on each layer, the CLASMK is computed based on a selected "marginal" subset and feature vectors are constructed by concatenating the features from all previous layers.
Submitted 21 October, 2019;
originally announced October 2019.
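The abstract above describes the classical Mahalanobis-style metric as a linear transformation applied to the data, followed by the Euclidean norm in the transformed space. The sketch below illustrates only that classical baseline, not the CLAS(M)K-ML method proposed in the paper. The transformation matrix here is fixed by hand for illustration, whereas in metric learning it would be learned from labeled data.

```python
import math


def transform(matrix, vector):
    """Apply a linear transformation (list of rows) to a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def learned_metric_distance(matrix, x, y):
    """Euclidean distance between x and y after mapping both through `matrix`.

    This equals sqrt((x - y)^T M (x - y)) with M = matrix^T matrix.
    """
    diff = [a - b for a, b in zip(x, y)]
    mapped = transform(matrix, diff)
    return math.sqrt(sum(v * v for v in mapped))


# Hypothetical transformation that stretches the first coordinate by 2.
L = [[2.0, 0.0],
     [0.0, 1.0]]
x = [1.0, 1.0]
y = [0.0, 1.0]

plain = learned_metric_distance([[1.0, 0.0], [0.0, 1.0]], x, y)  # identity: ordinary Euclidean distance
stretched = learned_metric_distance(L, x, y)
```

With the identity matrix the distance is the ordinary Euclidean one. The stretched transformation doubles the separation along the first coordinate, which is the kind of effect a learned transformation uses to pull classes apart.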
-
Transmission Strategies for Remote Estimation with an Energy Harvesting Sensor
Authors:
Ayca Ozcelikkale,
Tomas McKelvey,
Mats Viberg
Abstract:
We consider the remote estimation of a time-correlated signal using an energy harvesting (EH) sensor. The sensor observes the unknown signal and communicates its observations to a remote fusion center using an amplify-and-forward strategy. We consider the design of optimal power allocation strategies in order to minimize the mean-square error at the fusion center. Contrary to the traditional approaches, the degree of correlation between the signal values constitutes an important aspect of our formulation. We provide the optimal power allocation strategies for a number of illustrative scenarios. We show that the most majorized power allocation strategy, i.e., the power allocation that is as balanced as possible, is optimal for the cases of circularly wide-sense stationary (c.w.s.s.) signals with a static correlation coefficient, and sampled low-pass c.w.s.s. signals for a static channel. We show that the optimal strategy can be characterized as a water-filling type solution for sampled low-pass c.w.s.s. signals for a fading channel. Motivated by the high complexity of the numerical solution of the optimization problem, we propose low-complexity policies for the general scenario. Numerical evaluations illustrate that the performance of these low-complexity policies is close to that of the optimal policies, and demonstrate the effect of the EH constraints and the degree of freedom of the signal.
Submitted 9 October, 2016;
originally announced October 2016.
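The abstract above characterizes one optimal strategy as a "water-filling type solution". For readers unfamiliar with the term, the sketch below shows the classical textbook water-filling allocation over parallel channels, in which each channel receives power max(0, level - noise) and the common water level is chosen so that the total budget is spent. This is a swapped-in illustration of the general idea, not the paper's allocation, which minimizes mean-square error under energy harvesting constraints. The function name and numbers are assumptions.

```python
def water_filling(noise_levels, total_power):
    """Classical water-filling: power_i = max(0, level - noise_i), summing to total_power.

    Assumes a non-empty list of noise levels and total_power >= 0.
    """
    ordered = sorted(noise_levels)
    level = ordered[0]
    # Try using the k least noisy channels, from all of them down to one.
    for k in range(len(ordered), 0, -1):
        level = (total_power + sum(ordered[:k])) / k
        # Valid when the water level rises above the noisiest channel in use.
        if level > ordered[k - 1]:
            break
    return [max(0.0, level - noise) for noise in noise_levels]


# Hypothetical examples.
balanced = water_filling([1.0, 2.0], total_power=3.0)        # both channels active
selective = water_filling([1.0, 2.0, 10.0], total_power=1.0)  # only the best channel active
```

In the first example the water level is 3, so the channels receive 2 and 1 units of power. In the second, the budget is too small to reach the noisier channels, so all power goes to the least noisy one.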
-
Performance Bounds for Remote Estimation under Energy Harvesting Constraints
Authors:
Ayca Ozcelikkale,
Tomas McKelvey,
Mats Viberg
Abstract:
Remote estimation with an energy harvesting sensor with a limited data and energy buffer is considered. The sensor node observes an unknown Gaussian field and communicates its observations to a remote fusion center using the energy it harvested. The fusion center employs minimum mean-square error (MMSE) estimation to reconstruct the unknown field. The distortion minimization problem under the online scheme, where the sensor has access to only the statistical information for the future energy packets, is considered. We provide performance bounds on the achievable distortion under a slotted block transmission scheme, where at each transmission time slot, the data and the energy buffer are completely emptied. Our bounds provide insights into the trade-offs between the buffer sizes, the statistical properties of the energy harvesting process, and the achievable distortion. In particular, these trade-offs illustrate the insensitivity of the performance to the buffer sizes for signals with a low degree of freedom and suggest performance improvements with increasing buffer size for signals with a relatively higher degree of freedom. Depending only on the mean, variance, and finite support of the energy arrival process, these results provide practical insights for the battery and buffer sizes for deployment in future energy harvesting wireless sensing systems.
Submitted 8 October, 2016;
originally announced October 2016.