Showing 1–12 of 12 results for author: Tyers, M

Searching in archive cs.
  1. arXiv:2506.19865  [pdf, ps, other]

    q-bio.BM cs.AI cs.LG

    Scalable and Cost-Efficient de Novo Template-Based Molecular Generation

    Authors: Piotr Gaiński, Oussama Boussif, Andrei Rekesh, Dmytro Shevchuk, Ali Parviz, Mike Tyers, Robert A. Batey, Michał Koziarski

    Abstract: Template-based molecular generation offers a promising avenue for drug design by ensuring generated compounds are synthetically accessible through predefined reaction templates and building blocks. In this work, we tackle three core challenges in template-based GFlowNets: (1) minimizing synthesis cost, (2) scaling to large building block libraries, and (3) effectively utilizing small fragment sets…

    Submitted 10 June, 2025; originally announced June 2025.

  2. arXiv:2406.08506  [pdf, other]

    physics.chem-ph cs.LG q-bio.BM

    RGFN: Synthesizable Molecular Generation Using GFlowNets

    Authors: Michał Koziarski, Andrei Rekesh, Dmytro Shevchuk, Almer van der Sloot, Piotr Gaiński, Yoshua Bengio, Cheng-Hao Liu, Mike Tyers, Robert A. Batey

    Abstract: Generative models hold great promise for small molecule discovery, significantly increasing the size of search space compared to traditional in silico screening libraries. However, most existing machine learning methods for small molecule generation suffer from poor synthesizability of candidate compounds, making experimental validation difficult. In this paper we propose Reaction-GFlowNet (RGFN),…

    Submitted 6 November, 2024; v1 submitted 1 June, 2024; originally announced June 2024.

  3. arXiv:2405.01616  [pdf, other]

    q-bio.BM cs.AI cs.LG

    Generative Active Learning for the Search of Small-molecule Protein Binders

    Authors: Maksym Korablyov, Cheng-Hao Liu, Moksh Jain, Almer M. van der Sloot, Eric Jolicoeur, Edward Ruediger, Andrei Cristian Nica, Emmanuel Bengio, Kostiantyn Lapchevskyi, Daniel St-Cyr, Doris Alexandra Schuetz, Victor Ion Butoi, Jarrid Rector-Brooks, Simon Blackburn, Leo Feng, Hadi Nekoei, SaiKrishna Gottipati, Priyesh Vijayan, Prateek Gupta, Ladislav Rampášek, Sasikanth Avancha, Pierre-Luc Bacon, William L. Hamilton, Brooks Paige, Sanchit Misra, et al. (9 additional authors not shown)

    Abstract: Despite substantial progress in machine learning for scientific discovery in recent years, truly de novo design of small molecules which exhibit a property of interest remains a significant challenge. We introduce LambdaZero, a generative active learning approach to search for synthesizable molecules. Powered by deep reinforcement learning, LambdaZero learns to search over the vast space of molecu…

    Submitted 2 May, 2024; originally announced May 2024.

  4. arXiv:2310.06764  [pdf, other]

    cs.CL

    OmniLingo: Listening- and speaking-based language learning

    Authors: Francis M. Tyers, Nicholas Howell

    Abstract: In this demo paper we present OmniLingo, an architecture for distributing data for listening- and speaking-based language learning applications and a demonstration client built using the architecture. The architecture is based on the Interplanetary Filesystem (IPFS) and puts at the forefront user sovereignty over data.

    Submitted 10 October, 2023; originally announced October 2023.

  5. arXiv:2209.13518  [pdf]

    q-bio.BM cs.AI cs.LG

    Graph-Based Active Machine Learning Method for Diverse and Novel Antimicrobial Peptides Generation and Selection

    Authors: Bonaventure F. P. Dossou, Dianbo Liu, Xu Ji, Moksh Jain, Almer M. van der Sloot, Roger Palou, Michael Tyers, Yoshua Bengio

    Abstract: As antibiotic-resistant bacterial strains are rapidly spreading worldwide, infections caused by these strains are emerging as a global crisis causing the death of millions of people every year. Antimicrobial Peptides (AMPs) are one of the candidates to tackle this problem because of their potential diversity, and ability to favorably modulate the host immune response. However, large-scale screenin…

    Submitted 18 September, 2022; originally announced September 2022.

    Comments: Under review at Science Advances

  6. arXiv:2209.09742  [pdf, other]

    cs.CL

    Yet Another Format of Universal Dependencies for Korean

    Authors: Yige Chen, Eunkyul Leah Jo, Yundong Yao, KyungTae Lim, Miikka Silfverberg, Francis M. Tyers, Jungyeul Park

    Abstract: In this study, we propose a morpheme-based scheme for Korean dependency parsing and adopt the proposed scheme to Universal Dependencies. We present the linguistic rationale that illustrates the motivation and the necessity of adopting the morpheme-based format, and develop scripts that convert between the original format used by Universal Dependencies and the proposed morpheme-based format automat…

    Submitted 20 September, 2022; originally announced September 2022.

    Comments: COLING2022, Poster

  7. arXiv:2205.03608  [pdf, other]

    cs.CL

    UniMorph 4.0: Universal Morphology

    Authors: Khuyagbaatar Batsuren, Omer Goldman, Salam Khalifa, Nizar Habash, Witold Kieraś, Gábor Bella, Brian Leonard, Garrett Nicolai, Kyle Gorman, Yustinus Ghanggo Ate, Maria Ryskina, Sabrina J. Mielke, Elena Budianskaya, Charbel El-Khaissi, Tiago Pimentel, Michael Gasser, William Lane, Mohit Raj, Matt Coler, Jaime Rafael Montoya Samame, Delio Siticonatzi Camaiteri, Benoît Sagot, Esaú Zumaeta Rojas, Didier López Francis, Arturo Oncevay, et al. (71 additional authors not shown)

    Abstract: The Universal Morphology (UniMorph) project is a collaborative effort providing broad-coverage instantiated normalized morphological inflection tables for hundreds of diverse world languages. The project comprises two major thrusts: a language-independent feature schema for rich morphological annotation and a type-level resource of annotated data in diverse languages realizing that schema. This pa…

    Submitted 19 June, 2022; v1 submitted 7 May, 2022; originally announced May 2022.

    Comments: LREC 2022; The first two authors made equal contributions

  8. arXiv:2202.04202  [pdf, other]

    q-bio.QM cs.LG

    RECOVER: sequential model optimization platform for combination drug repurposing identifies novel synergistic compounds in vitro

    Authors: Paul Bertin, Jarrid Rector-Brooks, Deepak Sharma, Thomas Gaudelet, Andrew Anighoro, Torsten Gross, Francisco Martinez-Pena, Eileen L. Tang, Suraj M S, Cristian Regep, Jeremy Hayter, Maksym Korablyov, Nicholas Valiante, Almer van der Sloot, Mike Tyers, Charles Roberts, Michael M. Bronstein, Luke L. Lairson, Jake P. Taylor-King, Yoshua Bengio

    Abstract: For large libraries of small molecules, exhaustive combinatorial chemical screens become infeasible to perform when considering a range of disease models, assay conditions, and dose ranges. Deep learning models have achieved state of the art results in silico for the prediction of synergy scores. However, databases of drug combinations are biased towards synergistic agents and these results do not…

    Submitted 2 March, 2023; v1 submitted 6 February, 2022; originally announced February 2022.

  9. arXiv:2105.04674  [pdf]

    cs.CL cs.LG cs.SD eess.AS

    What shall we do with an hour of data? Speech recognition for the un- and under-served languages of Common Voice

    Authors: Francis M. Tyers, Josh Meyer

    Abstract: This technical report describes the methods and results of a three-week sprint to produce deployable speech recognition models for 31 under-served languages of the Common Voice project. We outline the preprocessing steps, hyperparameter selection, and resulting accuracy on official testing sets. In addition to this we evaluate the models on multiple tasks: closed-vocabulary speech recognition, pre…

    Submitted 10 May, 2021; originally announced May 2021.

  10. arXiv:2102.03662  [pdf, other]

    cs.CL cs.SD eess.AS

    A bandit approach to curriculum generation for automatic speech recognition

    Authors: Anastasia Kuznetsova, Anurag Kumar, Francis M. Tyers

    Abstract: The Automated Speech Recognition (ASR) task has been a challenging domain especially for low data scenarios with few audio examples. This is the main problem in training ASR systems on the data from low-resource or marginalized languages. In this paper we present an approach to mitigate the lack of training data by employing Automated Curriculum Learning in combination with an adversarial bandit a…

    Submitted 6 February, 2021; originally announced February 2021.

  11. arXiv:1912.06670  [pdf, other]

    cs.CL cs.LG

    Common Voice: A Massively-Multilingual Speech Corpus

    Authors: Rosana Ardila, Megan Branson, Kelly Davis, Michael Henretty, Michael Kohler, Josh Meyer, Reuben Morais, Lindsay Saunders, Francis M. Tyers, Gregor Weber

    Abstract: The Common Voice corpus is a massively-multilingual collection of transcribed speech intended for speech technology research and development. Common Voice is designed for Automatic Speech Recognition purposes but can be useful in other domains (e.g. language identification). To achieve scale and sustainability, the Common Voice project employs crowdsourcing for both data collection and data valida…

    Submitted 5 March, 2020; v1 submitted 13 December, 2019; originally announced December 2019.

    Comments: Accepted to LREC 2020

  12. arXiv:1809.04022  [pdf, ps, other]

    cs.CL

    Can LSTM Learn to Capture Agreement? The Case of Basque

    Authors: Shauli Ravfogel, Francis M. Tyers, Yoav Goldberg

    Abstract: Sequential neural networks models are powerful tools in a variety of Natural Language Processing (NLP) tasks. The sequential nature of these models raises the questions: to what extent can these models implicitly learn hierarchical structures typical to human language, and what kind of grammatical phenomena can they acquire? We focus on the task of agreement prediction in Basque, as a case study…

    Submitted 26 November, 2018; v1 submitted 11 September, 2018; originally announced September 2018.

    Comments: Accepted to "Analyzing and interpreting neural networks for NLP" workshop at EMNLP 2018