Showing 1–11 of 11 results for author: Mollaysa, A

  1. arXiv:2506.21028  [pdf, ps, other]

    cs.LG

    TRIDENT: Tri-Modal Molecular Representation Learning with Taxonomic Annotations and Local Correspondence

    Authors: Feng Jiang, Mangal Prakash, Hehuan Ma, Jianyuan Deng, Yuzhi Guo, Amina Mollaysa, Tommaso Mansi, Rui Liao, Junzhou Huang

    Abstract: Molecular property prediction aims to learn representations that map chemical structures to functional properties. While multimodal learning has emerged as a powerful paradigm to learn molecular representations, prior works have largely overlooked textual and taxonomic information of molecules for representation learning. We introduce TRIDENT, a novel framework that integrates molecular SMILES, te…

    Submitted 26 June, 2025; originally announced June 2025.

  2. arXiv:2506.08936  [pdf, ps, other]

    cs.LG

    BioLangFusion: Multimodal Fusion of DNA, mRNA, and Protein Language Models

    Authors: Amina Mollaysa, Artem Moskale, Pushpak Pati, Tommaso Mansi, Mangal Prakash, Rui Liao

    Abstract: We present BioLangFusion, a simple approach for integrating pre-trained DNA, mRNA, and protein language models into unified molecular representations. Motivated by the central dogma of molecular biology (information flow from gene to transcript to protein), we align per-modality embeddings at the biologically meaningful codon level (three nucleotides encoding one amino acid) to ensure direct cross…

    Submitted 10 June, 2025; originally announced June 2025.

    Comments: Proceedings of ICML 2025 Workshop on Multi-modal Foundation Models and Large Language Models for Life Sciences

  3. arXiv:2409.00046  [pdf, other]

    q-bio.BM cs.LG

    Rethinking Molecular Design: Integrating Latent Variable and Auto-Regressive Models for Goal Directed Generation

    Authors: Heath Arthur-Loui, Amina Mollaysa, Michael Krauthammer

    Abstract: De novo molecule design has become a highly active research area, advanced significantly through the use of state-of-the-art generative models. Despite these advances, several fundamental questions remain unanswered as the field increasingly focuses on more complex generative models and sophisticated molecular representations as an answer to the challenges of drug design. In this paper, we return…

    Submitted 6 September, 2024; v1 submitted 19 August, 2024; originally announced September 2024.

    Journal ref: Proceedings of the ICML 2024 Workshop on Accessible and Efficient Foundation Models for Biological Discovery

  4. arXiv:2311.07744  [pdf, other]

    cs.LG cs.CY

    Two-Stage Aggregation with Dynamic Local Attention for Irregular Time Series

    Authors: Xingyu Chen, Xiaochen Zheng, Amina Mollaysa, Manuel Schürch, Ahmed Allam, Michael Krauthammer

    Abstract: Irregular multivariate time series data is characterized by varying time intervals between consecutive observations of measured variables/signals (i.e., features) and varying sampling rates (i.e., recordings/measurement) across these features. Modeling time series while taking into account these irregularities is still a challenging task for machine learning methods. Here, we introduce TADA, a Two…

    Submitted 25 April, 2024; v1 submitted 13 November, 2023; originally announced November 2023.

    Comments: A short version of this paper has been accepted for presentation at the Findings of Machine Learning for Health (ML4H) 2023 conference

  5. arXiv:2311.07636  [pdf, other]

    q-bio.GN cs.LG

    Attention-based Multi-task Learning for Base Editor Outcome Prediction

    Authors: Amina Mollaysa, Ahmed Allam, Michael Krauthammer

    Abstract: Human genetic diseases often arise from point mutations, emphasizing the critical need for precise genome editing techniques. Among these, base editing stands out as it allows targeted alterations at the single nucleotide level. However, its clinical application is hindered by low editing efficiency and unintended mutations, necessitating extensive trial-and-error experimentation in the laboratory…

    Submitted 15 November, 2023; v1 submitted 13 November, 2023; originally announced November 2023.

    Comments: Extended Abstract presented at Machine Learning for Health (ML4H) symposium 2023, December 10th, 2023, New Orleans, United States, 15 pages. arXiv admin note: substantial text overlap with arXiv:2310.02919

  6. arXiv:2310.02919  [pdf, other]

    cs.LG

    Attention-based Multi-task Learning for Base Editor Outcome Prediction

    Authors: Amina Mollaysa, Ahmed Allam, Michael Krauthammer

    Abstract: Human genetic diseases often arise from point mutations, emphasizing the critical need for precise genome editing techniques. Among these, base editing stands out as it allows targeted alterations at the single nucleotide level. However, its clinical application is hindered by low editing efficiency and unintended mutations, necessitating extensive trial-and-error experimentation in the laboratory…

    Submitted 10 November, 2023; v1 submitted 4 October, 2023; originally announced October 2023.

  7. arXiv:2309.16521  [pdf, other]

    stat.ML cs.LG

    Generating Personalized Insulin Treatments Strategies with Deep Conditional Generative Time Series Models

    Authors: Manuel Schürch, Xiang Li, Ahmed Allam, Giulia Rathmes, Amina Mollaysa, Claudia Cavelti-Weder, Michael Krauthammer

    Abstract: We propose a novel framework that combines deep generative time series models with decision theory for generating personalized treatment strategies. It leverages historical patient trajectory data to jointly learn the generation of realistic personalized treatment and future outcome trajectories through deep generative time series models. In particular, our framework enables the generation of nove…

    Submitted 13 November, 2023; v1 submitted 28 September, 2023; originally announced September 2023.

    Comments: Extended Abstract presented at Machine Learning for Health (ML4H) symposium 2023, December 10th, 2023, New Orleans, United States, 17 pages

    Journal ref: Machine Learning for Health (ML4H) 2023

  8. arXiv:2303.18205  [pdf, other]

    cs.LG

    Simple Contrastive Representation Learning for Time Series Forecasting

    Authors: Xiaochen Zheng, Xingyu Chen, Manuel Schürch, Amina Mollaysa, Ahmed Allam, Michael Krauthammer

    Abstract: Contrastive learning methods have shown an impressive ability to learn meaningful representations for image or time series classification. However, these methods are less effective for time series forecasting, as optimization of instance discrimination is not directly applicable to predicting the future state from the historical context. To address these limitations, we propose SimTS, a simple rep…

    Submitted 11 November, 2024; v1 submitted 31 March, 2023; originally announced March 2023.

    Comments: Extended version. A shortened version was accepted by the 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), see https://ieeexplore.ieee.org/document/10446875

  9. arXiv:2210.00802  [pdf, other]

    q-bio.QM cs.LG q-bio.BM

    DDoS: A Graph Neural Network based Drug Synergy Prediction Algorithm

    Authors: Kyriakos Schwarz, Alicia Pliego-Mendieta, Amina Mollaysa, Lara Planas-Paz, Chantal Pauli, Ahmed Allam, Michael Krauthammer

    Abstract: Drug synergy arises when the combined impact of two drugs exceeds the sum of their individual effects. While single-drug effects on cell lines are well-documented, the scarcity of data on drug synergy, considering the vast array of potential drug combinations, prompts a growing interest in computational approaches for predicting synergies in untested drug pairs. We introduce a Graph Neural Network…

    Submitted 26 April, 2024; v1 submitted 3 October, 2022; originally announced October 2022.

  10. arXiv:2010.02311  [pdf, other]

    cs.LG stat.ML

    Goal-directed Generation of Discrete Structures with Conditional Generative Models

    Authors: Amina Mollaysa, Brooks Paige, Alexandros Kalousis

    Abstract: Despite recent advances, goal-directed generation of structured discrete data remains challenging. For problems such as program synthesis (generating source code) and materials design (generating molecules), finding examples which satisfy desired constraints or exhibit desired properties is difficult. In practice, expensive heuristic search or reinforcement learning algorithms are often employed.…

    Submitted 23 October, 2020; v1 submitted 5 October, 2020; originally announced October 2020.

  11. arXiv:1703.02570  [pdf, other]

    cs.LG stat.ML

    Regularising Non-linear Models Using Feature Side-information

    Authors: Amina Mollaysa, Pablo Strasser, Alexandros Kalousis

    Abstract: Very often features come with their own vectorial descriptions which provide detailed information about their properties. We refer to these vectorial descriptions as feature side-information. In the standard learning scenario, input is represented as a vector of features and the feature side-information is most often ignored or used only for feature selection prior to model fitting. We believe tha…

    Submitted 7 March, 2017; originally announced March 2017.

    Comments: 11 pages with appendix