-
DNA Origami Nanostructures Observed in Transmission Electron Microscopy Images can be Characterized through Convolutional Neural Networks
Authors:
Xingfei Wei,
Qiankun Mo,
Chi Chen,
Mark Bathe,
Rigoberto Hernandez
Abstract:
Artificial intelligence (AI) models remain an emerging strategy to accelerate materials design and development. We demonstrate that convolutional neural network (CNN) models can characterize DNA origami nanostructures employed in programmable self-assembly, which is important in many applications such as biomedicine. Specifically, we benchmark the performance of 9 CNN models -- viz. AlexNet, GoogLeNet, VGG16, VGG19, ResNet18, ResNet34, ResNet50, ResNet101, and ResNet152 -- in characterizing the ligation number of DNA origami nanostructures in transmission electron microscopy (TEM) images. We first pre-train the CNN models using a large image dataset of 720 images from our coarse-grained (CG) molecular dynamics (MD) simulations. Then, we fine-tune the pre-trained CNN models using a small experimental dataset of 146 TEM images. All CNN models were found to have similar computational time requirements, while their model sizes and performances differ. We use 20 test MD images to demonstrate that, among all of the pre-trained CNN models, ResNet50 and VGG16 have the highest and second-highest accuracies, respectively. Among the fine-tuned models, VGG16 was found to have the highest agreement on the test TEM images. Thus, we conclude that fine-tuned VGG16 models can quickly characterize the ligation number of nanostructures in large TEM images.
Submitted 13 March, 2025;
originally announced March 2025.
-
Isometric Hamming embeddings of weighted graphs
Authors:
Joseph Berleant,
Kristin Sheridan,
Anne Condon,
Virginia Vassilevska Williams,
Mark Bathe
Abstract:
A mapping $\alpha: V(G) \to V(H)$ from the vertex set of one graph $G$ to another graph $H$ is an isometric embedding if the shortest path distance between any two vertices in $G$ equals the distance between their images in $H$. Here, we consider isometric embeddings of a weighted graph $G$ into unweighted Hamming graphs, called Hamming embeddings, when $G$ satisfies the property that every edge is a shortest path between its endpoints. Using a Cartesian product decomposition of $G$ called its pseudofactorization, we show that every Hamming embedding of $G$ may be partitioned into Hamming embeddings for each irreducible pseudofactor graph of $G$, which we call its canonical partition. This implies that $G$ permits a Hamming embedding if and only if each of its irreducible pseudofactors is Hamming embeddable. This result extends prior work on unweighted graphs that showed that an unweighted graph permits a Hamming embedding if and only if each irreducible pseudofactor is a complete graph. When a graph $G$ has nontrivial pseudofactors, determining whether $G$ has a Hamming embedding can be simplified to checking embeddability of two or more smaller graphs.
Submitted 20 December, 2021; v1 submitted 13 December, 2021;
originally announced December 2021.
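The definition of a Hamming embedding in the abstract above can be checked directly on small examples. The following is a minimal sketch, not the authors' method: it assumes each vertex has already been assigned a tuple label (the `labels` argument and all function names here are hypothetical), computes shortest-path distances with a plain Floyd-Warshall pass, and verifies that every pairwise distance equals the Hamming distance between the labels.

```python
from itertools import combinations


def all_pairs_shortest_paths(vertices, weighted_edges):
    """Floyd-Warshall on an undirected graph given as (u, v, weight) triples."""
    inf = float("inf")
    dist = {u: {v: (0 if u == v else inf) for v in vertices} for u in vertices}
    for u, v, w in weighted_edges:
        dist[u][v] = min(dist[u][v], w)
        dist[v][u] = min(dist[v][u], w)
    for k in vertices:
        for i in vertices:
            for j in vertices:
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist


def hamming_distance(a, b):
    """Number of coordinates in which two equal-length tuples differ."""
    if len(a) != len(b):
        raise ValueError("labels must have the same length")
    return sum(x != y for x, y in zip(a, b))


def is_hamming_embedding(vertices, weighted_edges, labels):
    """True if the labels preserve every shortest-path distance exactly."""
    dist = all_pairs_shortest_paths(vertices, weighted_edges)
    return all(
        dist[u][v] == hamming_distance(labels[u], labels[v])
        for u, v in combinations(vertices, 2)
    )


# A triangle whose edges all have weight 2: labels differing in both
# coordinates give Hamming distance 2 for every pair.
triangle = [("a", "b", 2), ("b", "c", 2), ("a", "c", 2)]
triangle_labels = {"a": (0, 0), "b": (1, 1), "c": (2, 2)}
print(is_hamming_embedding(["a", "b", "c"], triangle, triangle_labels))
```

This only verifies a candidate labelling; deciding whether any such labelling exists is the harder question the paper addresses through pseudofactorization.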
-
Factorization and pseudofactorization of weighted graphs
Authors:
Kristin Sheridan,
Joseph Berleant,
Mark Bathe,
Anne Condon,
Virginia Vassilevska Williams
Abstract:
For unweighted graphs, finding isometric embeddings of a graph $G$ is closely related to decompositions of $G$ into Cartesian products of smaller graphs. When $G$ is isomorphic to a Cartesian graph product, we call the factors of this product a factorization of $G$. When $G$ is isomorphic to an isometric subgraph of a Cartesian graph product, we call those factors a pseudofactorization of $G$. Prior work has shown that an unweighted graph's pseudofactorization can be used to generate a canonical isometric embedding into a product of the smallest possible pseudofactors. However, for arbitrary weighted graphs, which represent a richer variety of metric spaces, methods for finding isometric embeddings or determining their existence remain elusive, and indeed pseudofactorization and factorization have not previously been extended to this context. In this work, we address the problem of finding the factorization and pseudofactorization of a weighted graph $G$, where $G$ satisfies the property that every edge constitutes a shortest path between its endpoints. We term such graphs minimal graphs, noting that every graph can be made minimal by removing edges not affecting its path metric. We generalize pseudofactorization and factorization to minimal graphs and develop new proof techniques that extend the previously proposed algorithms due to Graham and Winkler [Graham and Winkler, '85] and Feder [Feder, '92] for pseudofactorization and factorization of unweighted graphs. We show that any $m$-edge, $n$-vertex graph with positive integer edge weights can be factored in $O(m^2)$ time, plus the time to find all-pairs shortest path (APSP) distances in a weighted graph, resulting in an overall running time of $O(m^2+n^2\log\log n)$. We also show that a pseudofactorization for such a graph can be computed in $O(mn)$ time, plus the time to solve APSP, resulting in an $O(mn+n^2\log\log n)$ running time.
Submitted 13 December, 2021;
originally announced December 2021.
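The remark in the abstract above that every graph can be made minimal by removing edges that do not affect its path metric can be illustrated with a short sketch. This is an assumption-laden illustration rather than the paper's algorithm: it uses a simple $O(n^3)$ Floyd-Warshall pass instead of the faster APSP routine the abstract refers to, assumes positive edge weights and no parallel edges, and the function names are hypothetical.

```python
def all_pairs_shortest_paths(vertices, weighted_edges):
    """Floyd-Warshall on an undirected graph given as (u, v, weight) triples."""
    inf = float("inf")
    dist = {u: {v: (0 if u == v else inf) for v in vertices} for u in vertices}
    for u, v, w in weighted_edges:
        dist[u][v] = min(dist[u][v], w)
        dist[v][u] = min(dist[v][u], w)
    for k in vertices:
        for i in vertices:
            for j in vertices:
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist


def make_minimal(vertices, weighted_edges):
    """Keep only edges whose weight equals the distance between their endpoints.

    An edge heavier than the shortest path between its endpoints never lies
    on any shortest path, so dropping it leaves the path metric unchanged.
    """
    dist = all_pairs_shortest_paths(vertices, weighted_edges)
    return [(u, v, w) for u, v, w in weighted_edges if w == dist[u][v]]


# The edge a-c (weight 3) is longer than the path a-b-c (weight 2),
# so it is removed; the other two edges are kept.
edges = [("a", "b", 1), ("b", "c", 1), ("a", "c", 3)]
print(make_minimal(["a", "b", "c"], edges))
```

An edge whose weight ties with an alternative path is kept, since it still constitutes a shortest path between its endpoints.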