Ghostbuster: a phase retrieval diffraction tomography algorithm for cryo-EM

Authors: Joel Yeo, Benedikt J. Daurer, Dari Kimanius, Deepan Balakrishnan, Tristan Bepler, Yong Zi Tan, N. Duane Loh

Abstract: Ewald sphere curvature correction, which extends beyond the projection approximation, stretches the shallow depth of field in cryo-EM reconstructions of thick particles. Here we show that even for previously assumed thin particles, reconstruction artifacts which we refer to as ghosts can appear. By retrieving the lost phases of the electron exitwaves and accounting for the first Born approximation scattering within the particle, we show that these ghosts can be effectively eliminated. Our simulations demonstrate how such ghostbusting can improve reconstructions as compared to existing state-of-the-art software. Like ptychographic cryo-EM, our Ghostbuster algorithm uses phase retrieval to improve reconstructions, but unlike the former, we do not need to modify the existing data acquisition pipelines.

Submitted 3 January, 2024; v1 submitted 14 December, 2023; originally announced December 2023.

Comments: 20 pages, 11 figures. Submitted to IUCrJ

arXiv:2305.16628 [pdf]

Nanoscale cuticle mass density variations influenced by pigmentation in butterfly wing scales

Authors: Deepan Balakrishnan, Anupama Prakash, Benedikt J. Daurer, Cédric Finet, Ying Chen Lim, Zhou Shen, Pierre Thibault, Antónia Monteiro, N. Duane Loh

Abstract: How pigment distribution influences the cuticle density within a microscopic butterfly wing scale, and how both impact final reflected color remains unknown. We used ptychographic X-ray computed tomography to quantitatively determine, at nanoscale resolutions, the three-dimensional mass density of scales with pigmentation differences. By comparing cuticle densities between pairs of scales with pigmentation differences, we determine that the density of the lower lamina is inversely correlated with pigmentation. In the upper lamina structure, low pigment levels also correlate with sheet-like chitin structures as opposed to rod-like structures. Within each scale, we determine that the lower lamina in all scales has the highest density and distinct layers within the lower lamina help explain reflected color. We hypothesize that pigments, in addition to absorbing specific wavelengths, can affect cuticle polymerization, density, and refractive index, thereby impacting reflected wavelengths that produce colors.

Submitted 27 December, 2024; v1 submitted 26 May, 2023; originally announced May 2023.

arXiv:2209.07930 [pdf]

Single-shot, coherent, pop-out 3D metrology

Authors: Deepan Balakrishnan, See Wee Chee, Zhaslan Baraissov, Michel Bosman, Utkur Mirsaidov, N. Duane Loh

Abstract: Three-dimensional (3D) imaging of thin, extended specimens at nanometer resolution is critical for applications in biology, materials science, advanced synthesis, and manufacturing. One route to 3D imaging is tomography, which requires a tilt series of a local region. Here we describe a coherent imaging alternative that recovers the 3D volume of a thin, homogeneously amorphous specimen with only a single, energy-filtered, bright-field image. We demonstrated this technique with a transmission electron microscope to fill a glaring gap for rapid, accessible, non-destructive 3D nanometrology. This technique is applicable, in general, to any coherent bright field imaging with electrons, photons, or any other wavelike particles.

Submitted 20 October, 2023; v1 submitted 7 September, 2022; originally announced September 2022.

Comments: 28 pages, 12 figures

arXiv:2208.13054 [pdf, other]

CrackSeg9k: A Collection and Benchmark for Crack Segmentation Datasets and Frameworks

Authors: Shreyas Kulkarni, Shreyas Singh, Dhananjay Balakrishnan, Siddharth Sharma, Saipraneeth Devunuri, Sai Chowdeswara Rao Korlapati

Abstract: The detection of cracks is a crucial task in monitoring structural health and ensuring structural safety. The manual process of crack detection is time-consuming and subjective to the inspectors. Several researchers have tried tackling this problem using traditional Image Processing or learning-based techniques. However, their scope of work is limited to detecting cracks on a single type of surface (walls, pavements, glass, etc.). The metrics used to evaluate these methods are also varied across the literature, making it challenging to compare techniques. This paper addresses these problems by combining previously available datasets and unifying the annotations by tackling the inherent problems within each dataset, such as noise and distortions. We also present a pipeline that combines Image Processing and Deep Learning models. Finally, we benchmark the results of proposed models on these metrics on our new dataset and compare them with state-of-the-art models in the literature.

Submitted 27 August, 2022; originally announced August 2022.

arXiv:2106.09775 [pdf, other]

An Information Retrieval Approach to Building Datasets for Hate Speech Detection

Authors: Md Mustafizur Rahman, Dinesh Balakrishnan, Dhiraj Murthy, Mucahid Kutlu, Matthew Lease

Abstract: Building a benchmark dataset for hate speech detection presents various challenges. Firstly, because hate speech is relatively rare, random sampling of tweets to annotate is very inefficient in finding hate speech. To address this, prior datasets often include only tweets matching known "hate words". However, restricting data to a pre-defined vocabulary may exclude portions of the real-world phenomenon we seek to model. A second challenge is that definitions of hate speech tend to be highly varying and subjective. Annotators having diverse prior notions of hate speech may not only disagree with one another but also struggle to conform to specified labeling guidelines. Our key insight is that the rarity and subjectivity of hate speech are akin to that of relevance in information retrieval (IR). This connection suggests that well-established methodologies for creating IR test collections can be usefully applied to create better benchmark datasets for hate speech. To intelligently and efficiently select which tweets to annotate, we apply standard IR techniques of pooling and active learning. To improve both consistency and value of annotations, we apply task decomposition and annotator rationale techniques. We share a new benchmark dataset for hate speech detection on Twitter that provides broader coverage of hate than prior datasets. We also show a dramatic drop in accuracy of existing detection models when tested on these broader forms of hate. Annotator rationales we collect not only justify labeling decisions but also enable future work opportunities for dual-supervision and/or explanation generation in modeling. Further details of our approach can be found in the supplementary materials.

Submitted 9 November, 2021; v1 submitted 17 June, 2021; originally announced June 2021.

Comments: Accepted as a full paper at 35th Conference on Neural Information Processing Systems (NeurIPS 2021) Track on Datasets and Benchmarks. (https://openreview.net/group?id=NeurIPS.cc/2021/Track/Datasets_and_Benchmarks/Round2)

arXiv:2104.01241 [pdf, other]

TreeToaster: Towards an IVM-Optimized Compiler

Authors: Darshana Balakrishnan, Carl Nuessle, Oliver Kennedy, Lukasz Ziarek

Abstract: A compiler's optimizer operates over abstract syntax trees (ASTs), continuously applying rewrite rules to replace subtrees of the AST with more efficient ones. Especially on large source repositories, even simply finding opportunities for a rewrite can be expensive, as the optimizer traverses the AST naively. In this paper, we leverage the need to repeatedly find rewrites, and explore options for making the search faster through indexing and incremental view maintenance (IVM). Concretely, we consider bolt-on approaches that make use of embedded IVM systems like DBToaster, as well as two new approaches: Label-indexing and TreeToaster, an AST-specialized form of IVM. We integrate these approaches into an existing just-in-time data structure compiler and show experimentally that TreeToaster can significantly improve performance with minimal memory overheads.

Submitted 2 April, 2021; originally announced April 2021.

Comments: 23 pages, 17 figures

arXiv:1911.06013 [pdf]

ReCoDe: A Data Reduction and Compression Description for High Throughput Time-Resolved Electron Microscopy

Authors: Abhik Datta, Kian Fong Ng, Deepan Balakrishnan, Melissa Ding, Yvonne Ban, See Wee Chee, Jian Shi, N. Duane Loh

Abstract: Fast, direct electron detectors have significantly improved the spatio-temporal resolution of electron microscopy movies. Preserving both spatial and temporal resolution in extended observations, however, requires storing prohibitively large amounts of data. Here, we describe an efficient and flexible data reduction and compression scheme (ReCoDe) that retains both spatial and temporal resolution by preserving individual electron events. Running ReCoDe on a workstation we demonstrate on-the-fly reduction and compression of raw data streaming off a detector at 3 GB/s, for hours of uninterrupted data collection. The output was 100-fold smaller than the raw data and saved directly onto network-attached storage drives over a 10 GbE connection. We discuss calibration techniques that support electron detection and counting (e.g. estimate electron backscattering rates, false positive rates, and data compressibility), and novel data analysis methods enabled by ReCoDe (e.g. recalibration of data post acquisition, and accurate estimation of coincidence loss).

Submitted 27 September, 2020; v1 submitted 14 November, 2019; originally announced November 2019.

Comments: 53 pages, 20 figures

arXiv:1908.02465 [pdf]

Miniaturised control of acidity in multiplexed microreactors

Authors: Divya Balakrishnan, Wouter Olthuis, César Pascual-García

Abstract: The control of acidity influences the structural assembly of biopolymers that are essential for a wide range of applications. Its miniaturization can increase the speed and the possibilities of combinatorial throughput for their manipulation, similarly to the way that the miniaturization of transistors allows the high throughput of logical operations in microelectronics. Here we present a device containing multiplexed micro-reactors, each one enabling independent electrochemical control of the acidity in ~2.5 nL volumes, with a large acidity range in aqueous solutions from pH 3 to 7 and an accuracy of at least 0.4 pH units. The attained pH within each microreactor (with footprints of ~0.3 mm² for each spot) was kept constant for long retention times (~10 minutes) and over repeated cycles (>100). The acidity is driven by redox proton exchange reactions, which can be driven at different rates that influence the efficiency of the device in order to achieve more charge exchange (larger acidity range) or better reversibility. With this performance in acidity control, the miniaturisation and the possibility to multiplex pave the way for the control of combinatorial chemistry through pH- and acidity-controlled reactions.

Submitted 7 August, 2019; originally announced August 2019.

arXiv:1901.07627 [pdf, other]

Just-in-Time Index Compilation

Authors: Darshana Balakrishnan, Lukasz Ziarek, Oliver Kennedy

Abstract: Creating or modifying a primary index is a time-consuming process, as the index typically needs to be rebuilt from scratch. In this paper, we explore a more graceful "just-in-time" approach to index reorganization, where small changes are dynamically applied in the background. To enable this type of reorganization, we formalize a composable organizational grammar, expressive enough to capture instances of not only existing index structures, but arbitrary hybrids as well. We introduce an algebra of rewrite rules for such structures, and a framework for defining and optimizing policies for just-in-time rewriting. Our experimental analysis shows that the resulting index structure is flexible enough to adapt to a variety of performance goals, while also remaining competitive with existing structures like the C++ standard template library map.

Submitted 22 January, 2019; originally announced January 2019.

Comments: Work Supported by NSF Award #IIS-1617586

arXiv:1011.0628 [pdf]

doi 10.5121/ijaia.2010.1409

Significance of Classification Techniques in Prediction of Learning Disabilities

Authors: Julie M. David, Kannan Balakrishnan

Abstract: The aim of this study is to show the importance of two classification techniques, viz. decision tree and clustering, in prediction of learning disabilities (LD) of school-age children. LDs affect about 10 percent of all children enrolled in schools. The problems of children with specific learning disabilities have been a cause of concern to parents and teachers for some time. Decision trees and clustering are powerful and popular tools used for classification and prediction in data mining. Different rules extracted from the decision tree are used for prediction of learning disabilities. Clustering is the assignment of a set of observations into subsets, called clusters, which are useful in finding the different signs and symptoms (attributes) present in the LD-affected child. In this paper, the J48 algorithm is used for constructing the decision tree and the K-means algorithm is used for creating the clusters. By applying these classification techniques, LD in any child can be identified.

Submitted 2 November, 2010; originally announced November 2010.

Comments: 10 pages, 3 tables and 2 figures

Journal ref: International Journal of Artificial Intelligence & Applications, Vol. 1, No. 4, Oct. 2010, pp. 111-120

Showing 1–10 of 10 results for author: Balakrishnan, D