-
MAMMAL -- Molecular Aligned Multi-Modal Architecture and Language
Authors:
Yoel Shoshan,
Moshiko Raboh,
Michal Ozery-Flato,
Vadim Ratner,
Alex Golts,
Jeffrey K. Weber,
Ella Barkan,
Simona Rabinovici-Cohen,
Sagi Polaczek,
Ido Amos,
Ben Shapira,
Liam Hazan,
Matan Ninio,
Sivan Ravid,
Michael M. Danziger,
Yosi Shamay,
Sharon Kurant,
Joseph A. Morrone,
Parthasarathy Suryanarayanan,
Michal Rosen-Zvi,
Efrat Hexter
Abstract:
Large language models applied to vast biological datasets have the potential to transform biology by uncovering disease mechanisms and accelerating drug development. However, current models are often siloed, trained separately on small-molecules, proteins, or transcriptomic data, limiting their ability to capture complex, multi-modal interactions. Effective drug discovery requires computational to…
▽ More
Large language models applied to vast biological datasets have the potential to transform biology by uncovering disease mechanisms and accelerating drug development. However, current models are often siloed, trained separately on small-molecules, proteins, or transcriptomic data, limiting their ability to capture complex, multi-modal interactions. Effective drug discovery requires computational tools that integrate multiple biological entities while supporting prediction and generation, a challenge existing models struggle to address. For this purpose, we present MAMMAL - Molecular Aligned Multi-Modal Architecture and Language - a versatile method applied to create a multi-task foundation model that learns from large-scale biological datasets across diverse modalities, including proteins, small-molecules, and omics. MAMMAL's structured prompt syntax supports classification, regression, and generation tasks while handling token and scalar inputs and outputs. Evaluated on eleven diverse downstream tasks, it reaches a new state of the art (SOTA) in nine tasks and is comparable to SOTA in two tasks, all within a unified architecture, unlike prior task-specific models. Additionally, we explored Alphafold 3 binding prediction capabilities on antibody-antigen and nanobody-antigen complexes showing significantly better classification performance of MAMMAL in 3 out of 4 targets. The model code and pretrained weights are publicly available at https://github.com/BiomedSciAI/biomed-multi-alignment and https://huggingface.co/ibm/biomed.omics.bl.sm.ma-ted-458m
△ Less
Submitted 6 May, 2025; v1 submitted 28 October, 2024;
originally announced October 2024.
-
A large dataset curation and benchmark for drug target interaction
Authors:
Alex Golts,
Vadim Ratner,
Yoel Shoshan,
Moshe Raboh,
Sagi Polaczek,
Michal Ozery-Flato,
Daniel Shats,
Liam Hazan,
Sivan Ravid,
Efrat Hexter
Abstract:
Bioactivity data plays a key role in drug discovery and repurposing. The resource-demanding nature of \textit{in vitro} and \textit{in vivo} experiments, as well as the recent advances in data-driven computational biochemistry research, highlight the importance of \textit{in silico} drug target interaction (DTI) prediction approaches. While numerous large public bioactivity data sources exist, res…
▽ More
Bioactivity data plays a key role in drug discovery and repurposing. The resource-demanding nature of \textit{in vitro} and \textit{in vivo} experiments, as well as the recent advances in data-driven computational biochemistry research, highlight the importance of \textit{in silico} drug target interaction (DTI) prediction approaches. While numerous large public bioactivity data sources exist, research in the field could benefit from better standardization of existing data resources. At present, different research works that share similar goals are often difficult to compare properly because of different choices of data sources and train/validation/test split strategies. Additionally, many works are based on small data subsets, leading to results and insights of possible limited validity. In this paper we propose a way to standardize and represent efficiently a very large dataset curated from multiple public sources, split the data into train, validation and test sets based on different meaningful strategies, and provide a concrete evaluation protocol to accomplish a benchmark. We analyze the proposed data curation, prove its usefulness and validate the proposed benchmark through experimental studies based on an existing neural network model.
△ Less
Submitted 30 January, 2024;
originally announced January 2024.
-
Regularized adversarial examples for model interpretability
Authors:
Yoel Shoshan,
Vadim Ratner
Abstract:
As machine learning algorithms continue to improve, there is an increasing need for explaining why a model produces a certain prediction for a certain input. In recent years, several methods for model interpretability have been developed, aiming to provide explanation of which subset regions of the model input is the main reason for the model prediction. In parallel, a significant research communi…
▽ More
As machine learning algorithms continue to improve, there is an increasing need for explaining why a model produces a certain prediction for a certain input. In recent years, several methods for model interpretability have been developed, aiming to provide explanation of which subset regions of the model input is the main reason for the model prediction. In parallel, a significant research community effort is occurring in recent years for developing adversarial example generation methods for fooling models, while not altering the true label of the input,as it would have been classified by a human annotator. In this paper, we bridge the gap between adversarial example generation and model interpretability, and introduce a modification to the adversarial example generation process which encourages better interpretability. We analyze the proposed method on a public medical imaging dataset, both quantitatively and qualitatively, and show that it significantly outperforms the leading known alternative method. Our suggested method is simple to implement, and can be easily plugged into most common adversarial example generation frameworks. Additionally, we propose an explanation quality metric - $APE$ - "Adversarial Perturbative Explanation", which measures how well an explanation describes model decisions.
△ Less
Submitted 21 November, 2018; v1 submitted 18 November, 2018;
originally announced November 2018.
-
Learning multiple non-mutually-exclusive tasks for improved classification of inherently ordered labels
Authors:
Vadim Ratner,
Yoel Shoshan,
Tal Kachman
Abstract:
Medical image classification involves thresholding of labels that represent malignancy risk levels. Usually, a task defines a single threshold, and when developing computer-aided diagnosis tools, a single network is trained per such threshold, e.g. as screening out healthy (very low risk) patients to leave possibly sick ones for further analysis (low threshold), or trying to find malignant cases a…
▽ More
Medical image classification involves thresholding of labels that represent malignancy risk levels. Usually, a task defines a single threshold, and when developing computer-aided diagnosis tools, a single network is trained per such threshold, e.g. as screening out healthy (very low risk) patients to leave possibly sick ones for further analysis (low threshold), or trying to find malignant cases among those marked as non-risk by the radiologist ("second reading", high threshold). We propose a way to rephrase the classification problem in a manner that yields several problems (corresponding to different thresholds) to be solved simultaneously. This allows the use of Multiple Task Learning (MTL) methods, significantly improving the performance of the original classifier, by facilitating effective extraction of information from existing data.
△ Less
Submitted 21 November, 2018; v1 submitted 30 May, 2018;
originally announced May 2018.
-
AdapterNet - learning input transformation for domain adaptation
Authors:
Alon Hazan,
Yoel Shoshan,
Daniel Khapun,
Roy Aladjem,
Vadim Ratner
Abstract:
Deep neural networks have demonstrated impressive performance in various machine learning tasks. However, they are notoriously sensitive to changes in data distribution. Often, even a slight change in the distribution can lead to drastic performance reduction. Artificially augmenting the data may help to some extent, but in most cases, fails to achieve model invariance to the data distribution. So…
▽ More
Deep neural networks have demonstrated impressive performance in various machine learning tasks. However, they are notoriously sensitive to changes in data distribution. Often, even a slight change in the distribution can lead to drastic performance reduction. Artificially augmenting the data may help to some extent, but in most cases, fails to achieve model invariance to the data distribution. Some examples where this sub-class of domain adaptation can be valuable are various imaging modalities such as thermal imaging, X-ray, ultrasound, and MRI, where changes in acquisition parameters or acquisition device manufacturer will result in a different representation of the same input. Our work shows that standard fine-tuning fails to adapt the model in certain important cases. We propose a novel method of adapting to a new data source, and demonstrate near perfect adaptation on a customized ImageNet benchmark. Moreover, our method does not require any samples from the original data set, it is completely explainable and can be tailored to the task.
△ Less
Submitted 15 November, 2018; v1 submitted 29 May, 2018;
originally announced May 2018.