-
Learning Molecular Representation in a Cell
Authors:
Gang Liu,
Srijit Seal,
John Arevalo,
Zhenwen Liang,
Anne E. Carpenter,
Meng Jiang,
Shantanu Singh
Abstract:
Predicting drug efficacy and safety in vivo requires information on biological responses (e.g., cell morphology and gene expression) to small molecule perturbations. However, current molecular representation learning methods do not provide a comprehensive view of cell states under these perturbations and struggle to remove noise, hindering model generalization. We introduce the Information Alignment (InfoAlign) approach to learn molecular representations through the information bottleneck method in cells. We integrate molecules and cellular response data as nodes into a context graph, connecting them with weighted edges based on chemical, biological, and computational criteria. For each molecule in a training batch, InfoAlign optimizes the encoder's latent representation with a minimality objective to discard redundant structural information. A sufficiency objective decodes the representation to align with different feature spaces from the molecule's neighborhood in the context graph. We demonstrate that the proposed sufficiency objective for alignment is tighter than existing encoder-based contrastive methods. Empirically, we validate representations from InfoAlign in two downstream applications: molecular property prediction against up to 27 baseline methods across four datasets, plus zero-shot molecule-morphology matching.
Submitted 2 October, 2024; v1 submitted 17 June, 2024;
originally announced June 2024.
-
Cell Painting Gallery: an open resource for image-based profiling
Authors:
Erin Weisbart,
Ankur Kumar,
John Arevalo,
Anne E. Carpenter,
Beth A. Cimini,
Shantanu Singh
Abstract:
Image-based or morphological profiling is a rapidly expanding field wherein cells are "profiled" by extracting hundreds to thousands of unbiased, quantitative features from images of cells subjected to genetic or chemical perturbations. The Cell Painting assay is the most popular image-based profiling assay, wherein six small-molecule dyes label eight cellular compartments and thousands of measurements are made, describing quantitative traits such as size, shape, intensity, and texture within the nucleus, cytoplasm, and whole cell (Cimini et al., 2023). We have created the Cell Painting Gallery, a publicly available collection of Cell Painting datasets, with granular dataset descriptions and access instructions. It is hosted by AWS on the Registry of Open Data (RODA). As of January 2024, the Cell Painting Gallery holds 656 terabytes (TB) of image and associated numerical data. It includes the largest publicly available Cell Painting dataset in terms of perturbations tested (Joint Undertaking for Morphological Profiling, or JUMP (Chandrasekaran et al., 2023)), along with many other canonical Cell Painting datasets and close derivatives of Cell Painting (such as LipocyteProfiler (Laber et al., 2023) and Pooled Cell Painting (Ramezani et al., 2023)).
Submitted 3 February, 2024;
originally announced February 2024.
-
Reproducible image-based profiling with Pycytominer
Authors:
Erik Serrano,
Srinivas Niranj Chandrasekaran,
Dave Bunten,
Kenneth I. Brewer,
Jenna Tomkinson,
Roshan Kern,
Michael Bornholdt,
Stephen Fleming,
Ruifan Pei,
John Arevalo,
Hillary Tsang,
Vincent Rubinetti,
Callum Tromans-Coia,
Tim Becker,
Erin Weisbart,
Charlotte Bunne,
Alexandr A. Kalinin,
Rebecca Senft,
Stephen J. Taylor,
Nasim Jamali,
Adeniyi Adeboye,
Hamdah Shafqat Abbasi,
Allen Goodman,
Juan C. Caicedo,
Anne E. Carpenter
, et al. (3 additional authors not shown)
Abstract:
Advances in high-throughput microscopy have enabled the rapid acquisition of large numbers of high-content microscopy images. Whether by deep learning or classical algorithms, image analysis pipelines then produce single-cell features. To process these single-cell features for downstream applications, we present Pycytominer, a user-friendly, open-source Python package that implements the bioinformatics steps known as image-based profiling. We demonstrate Pycytominer's usefulness in a machine learning project to predict nuisance compounds that cause undesirable cell injuries.
Submitted 2 July, 2024; v1 submitted 22 November, 2023;
originally announced November 2023.
-
Local migration quantification method for scratch assays
Authors:
Ana Victoria Ponce Bobadilla,
Jazmine Arévalo,
Eduard Sarró,
Helen Byrne,
Philip K Maini,
Thomas Carraro,
Simone Balocco,
Anna Meseguer,
Tomás Alarcón
Abstract:
Motivation: The scratch assay is a standard experimental protocol used to characterize cell migration. It can be used to identify genes that regulate migration and evaluate the efficacy of potential drugs that inhibit cancer invasion. In these experiments, a scratch is made on a cell monolayer and recolonisation of the scratched region is imaged to quantify cell migration rates. A drawback of this methodology is its lack of reproducibility, which results in irregular cell-free areas with crooked leading edges. Existing quantification methods deal poorly with the resulting irregularities in the data. Results: We introduce a new quantification method that can analyse low-quality experimental data. By considering in silico and in vitro data, we show that the method provides a more accurate statistical classification of the migration rates than two established quantification methods. The application of this method will enable the quantification of migration rates of scratch assay data previously unsuitable for analysis. Availability and Implementation: The source code and a GUI implementation of the algorithm, along with an example dataset and user instructions, are available at https://bitbucket.org/anavictoria-ponce/local_migration_quantification_scratch_assays/src/master/. The datasets are available at https://ganymed.math.uni-heidelberg.de/~victoria/publications.shtml.
Submitted 24 June, 2018;
originally announced June 2018.