Showing 1–19 of 19 results for author: Ambite, J L

  1. arXiv:2406.17235  [pdf, other]

    cs.CV cs.AI cs.DC

    Task-Agnostic Federated Learning

    Authors: Zhengtao Yao, Hong Nguyen, Ajitesh Srivastava, Jose Luis Ambite

    Abstract: In the realm of medical imaging, leveraging large-scale datasets from various institutions is crucial for developing precise deep learning models, yet privacy concerns frequently impede data sharing. Federated learning (FL) emerges as a prominent solution for preserving privacy while facilitating collaborative learning. However, its application in real-world scenarios faces several obstacles, such…

    Submitted 6 January, 2025; v1 submitted 24 June, 2024; originally announced June 2024.

    Comments: arXiv admin note: text overlap with arXiv:2205.08576 by other authors

  2. arXiv:2311.00334  [pdf, other]

    cs.LG cs.AI cs.DC

    MetisFL: An Embarrassingly Parallelized Controller for Scalable & Efficient Federated Learning Workflows

    Authors: Dimitris Stripelis, Chrysovalantis Anastasiou, Patrick Toral, Armaghan Asghar, Jose Luis Ambite

    Abstract: A Federated Learning (FL) system typically consists of two core processing entities: the federation controller and the learners. The controller is responsible for managing the execution of FL workflows across learners and the learners for training and evaluating federated models over their private datasets. While executing an FL workflow, the FL system has no control over the computational resourc…

    Submitted 13 November, 2023; v1 submitted 1 November, 2023; originally announced November 2023.

    Comments: 15 pages, 11 figures, Accepted at DistributedML '23

  3. arXiv:2305.08985  [pdf, other]

    cs.LG cs.AI cs.DC

    Federated Learning over Harmonized Data Silos

    Authors: Dimitris Stripelis, Jose Luis Ambite

    Abstract: Federated Learning is a distributed machine learning approach that enables geographically distributed data silos to collaboratively learn a joint machine learning model without sharing data. Most of the existing work operates on unstructured data, such as images or text, or on structured data assumed to be consistent across the different sites. However, sites often have different schemata, data fo…

    Submitted 15 May, 2023; originally announced May 2023.

    Comments: Presented at the 7th International Workshop on Health Intelligence 2023 (W3PHIAI-23), 6 pages, 4 figures

    MSC Class: 68T07; 68M14; ACM Class: I.2; H.4

  4. arXiv:2208.11669  [pdf, other]

    cs.LG cs.CR eess.IV q-bio.QM

    Towards Sparsified Federated Neuroimaging Models via Weight Pruning

    Authors: Dimitris Stripelis, Umang Gupta, Nikhil Dhinagar, Greg Ver Steeg, Paul Thompson, José Luis Ambite

    Abstract: Federated training of large deep neural networks can often be restrictive due to the increasing costs of communicating the updates with increasing model sizes. Various model pruning techniques have been designed in centralized settings to reduce inference times. Combining centralized pruning techniques with federated training seems intuitive for reducing communication costs -- by pruning the model…

    Submitted 24 August, 2022; originally announced August 2022.

    Comments: Accepted to 3rd MICCAI Workshop on Distributed, Collaborative and Federated Learning (DeCaF, 2022)

  5. arXiv:2205.05249  [pdf, other]

    cs.LG cs.CR cs.CV cs.DC

    Secure & Private Federated Neuroimaging

    Authors: Dimitris Stripelis, Umang Gupta, Hamza Saleem, Nikhil Dhinagar, Tanmay Ghai, Rafael Chrysovalantis Anastasiou, Armaghan Asghar, Greg Ver Steeg, Srivatsan Ravi, Muhammad Naveed, Paul M. Thompson, Jose Luis Ambite

    Abstract: The amount of biomedical data continues to grow rapidly. However, collecting data from multiple sites for joint analysis remains challenging due to security, privacy, and regulatory concerns. To overcome this challenge, we use Federated Learning, which enables distributed training of neural network models over multiple data sources without sharing data. Each site trains the neural network over its…

    Submitted 28 August, 2023; v1 submitted 10 May, 2022; originally announced May 2022.

    Comments: 18 pages, 13 figures, 2 tables

    ACM Class: I.2; I.5.1; J.3

  6. arXiv:2205.01184  [pdf, other]

    cs.LG cs.CR

    Performance Weighting for Robust Federated Learning Against Corrupted Sources

    Authors: Dimitris Stripelis, Marcin Abram, Jose Luis Ambite

    Abstract: Federated Learning has emerged as a dominant computational paradigm for distributed machine learning. Its unique data privacy properties allow us to collaboratively train models while offering participating clients certain privacy-preserving guarantees. However, in real-world applications, a federated environment may consist of a mixture of benevolent and malicious clients, with the latter aiming…

    Submitted 2 May, 2022; originally announced May 2022.

    Comments: 27 pages, 40 figures

  7. arXiv:2204.12430  [pdf, other]

    cs.LG

    Federated Progressive Sparsification (Purge, Merge, Tune)+

    Authors: Dimitris Stripelis, Umang Gupta, Greg Ver Steeg, Jose Luis Ambite

    Abstract: To improve federated training of neural networks, we develop FedSparsify, a sparsification strategy based on progressive weight magnitude pruning. Our method has several benefits. First, since the size of the network becomes increasingly smaller, computation and communication costs during training are reduced. Second, the models are incrementally constrained to a smaller set of parameters, which f…

    Submitted 15 May, 2023; v1 submitted 26 April, 2022; originally announced April 2022.

    Comments: Accepted at the Workshop on Federated Learning: Recent Advances and New Challenges, in Conjunction with NeurIPS 2022 (FL-NeurIPS'22) 23 pages, 12 figures, 1 algorithm, 2 Tables

    MSC Class: 68T07 ACM Class: I.2.m

  8. arXiv:2203.15101  [pdf, other]

    cs.CL cs.AI

    Federated Named Entity Recognition

    Authors: Joel Mathew, Dimitris Stripelis, José Luis Ambite

    Abstract: We present an analysis of the performance of Federated Learning in a paradigmatic natural-language processing task: Named-Entity Recognition (NER). For our evaluation, we use the language-independent CoNLL-2003 dataset as our benchmark dataset and a Bi-LSTM-CRF model as our benchmark NER model. We show that federated training reaches almost the same performance as the centralized model, though wit…

    Submitted 28 March, 2022; originally announced March 2022.

  9. arXiv:2112.05313  [pdf, other]

    cs.LG

    Building Autocorrelation-Aware Representations for Fine-Scale Spatiotemporal Prediction

    Authors: Yijun Lin, Yao-Yi Chiang, Meredith Franklin, Sandrah P. Eckel, José Luis Ambite

    Abstract: Many scientific prediction problems have spatiotemporal data- and modeling-related challenges in handling complex variations in space and time using only sparse and unevenly distributed observations. This paper presents a novel deep learning architecture, Deep learning predictions for LocATion-dependent Time-sEries data (DeepLATTE), that explicitly incorporates theories of spatial statistics into…

    Submitted 9 December, 2021; originally announced December 2021.

    Comments: Published in ICDM2020

  10. arXiv:2108.03437  [pdf, other]

    cs.CR cs.LG

    Secure Neuroimaging Analysis using Federated Learning with Homomorphic Encryption

    Authors: Dimitris Stripelis, Hamza Saleem, Tanmay Ghai, Nikhil Dhinagar, Umang Gupta, Chrysovalantis Anastasiou, Greg Ver Steeg, Srivatsan Ravi, Muhammad Naveed, Paul M. Thompson, Jose Luis Ambite

    Abstract: Federated learning (FL) enables distributed computation of machine learning models over various disparate, remote data sources, without requiring to transfer any individual data to a centralized location. This results in an improved generalizability of models and efficient scaling of computation as more sources and larger datasets are added to the federation. Nevertheless, recent membership attack…

    Submitted 9 November, 2021; v1 submitted 7 August, 2021; originally announced August 2021.

    Comments: 9 pages, 3 figures, 1 algorithm

  11. arXiv:2105.02866  [pdf, other]

    q-bio.QM cs.CR cs.LG eess.IV

    Membership Inference Attacks on Deep Regression Models for Neuroimaging

    Authors: Umang Gupta, Dimitris Stripelis, Pradeep K. Lam, Paul M. Thompson, José Luis Ambite, Greg Ver Steeg

    Abstract: Ensuring the privacy of research participants is vital, even more so in healthcare environments. Deep learning approaches to neuroimaging require large datasets, and this often necessitates sharing data between multiple sites, which is antithetical to the privacy objectives. Federated learning is a commonly proposed solution to this problem. It circumvents the need for data sharing by sharing para…

    Submitted 3 June, 2021; v1 submitted 6 May, 2021; originally announced May 2021.

    Comments: To appear at Medical Imaging with Deep Learning 2021 (MIDL 2021)

  12. arXiv:2102.08440  [pdf, other]

    cs.LG cs.DC

    Scaling Neuroscience Research using Federated Learning

    Authors: Dimitris Stripelis, Jose Luis Ambite, Pradeep Lam, Paul Thompson

    Abstract: The amount of biomedical data continues to grow rapidly. However, the ability to analyze these data is limited due to privacy and regulatory concerns. Machine learning approaches that require data to be copied to a single location are hampered by the challenges of data sharing. Federated Learning is a promising approach to learn a joint model over data silos. This architecture does not share any s…

    Submitted 16 February, 2021; originally announced February 2021.

    Comments: To appear at IEEE International Symposium on Biomedical Imaging 2021 (ISBI 2021)

    MSC Class: 68T07 ACM Class: I.5.4

  13. Semi-Synchronous Federated Learning for Energy-Efficient Training and Accelerated Convergence in Cross-Silo Settings

    Authors: Dimitris Stripelis, Jose Luis Ambite

    Abstract: There are situations where data relevant to machine learning problems are distributed across multiple locations that cannot share the data due to regulatory, competitiveness, or privacy reasons. Machine learning approaches that require data to be copied to a single location are hampered by the challenges of data sharing. Federated Learning (FL) is a promising approach to learn a joint model over a…

    Submitted 25 June, 2022; v1 submitted 4 February, 2021; originally announced February 2021.

    Comments: 30 pages, 12 figures

    MSC Class: 68T07; 68T09; 68M14; 68W15 ACM Class: I.2.6; I.5.1; K.6.4

  14. arXiv:2008.11281  [pdf, other]

    cs.LG cs.AI cs.DC stat.ML

    Accelerating Federated Learning in Heterogeneous Data and Computational Environments

    Authors: Dimitris Stripelis, Jose Luis Ambite

    Abstract: There are situations where data relevant to a machine learning problem are distributed among multiple locations that cannot share the data due to regulatory, competitiveness, or privacy reasons. For example, data present in users' cellphones, manufacturing data of companies in a given industrial sector, or medical records located at different hospitals. Moreover, participating sites often have dif…

    Submitted 25 August, 2020; originally announced August 2020.

    MSC Class: 68T07; 68T09; 68M14; 68W15 ACM Class: I.2.6; I.5.1; K.6.4

  15. arXiv:1906.00282  [pdf, other]

    cs.LG cs.CL stat.ML

    Biomedical Named Entity Recognition via Reference-Set Augmented Bootstrapping

    Authors: Joel Mathew, Shobeir Fakhraei, José Luis Ambite

    Abstract: We present a weakly-supervised data augmentation approach to improve Named Entity Recognition (NER) in a challenging domain: extracting biomedical entities (e.g., proteins) from the scientific literature. First, we train a neural NER (NNER) model over a small seed of fully-labeled examples. Second, we use a reference set of entity names (e.g., proteins in UniProt) to identify entity mentions with…

    Submitted 1 June, 2019; originally announced June 2019.

    Comments: 5 pages, 1 Figure, 2 Table, ICML 2019 Workshop on Computational Biology

  16. arXiv:1811.07514  [pdf, other]

    cs.IR cs.CL cs.DB cs.LG cs.NE

    NSEEN: Neural Semantic Embedding for Entity Normalization

    Authors: Shobeir Fakhraei, Joel Mathew, Jose Luis Ambite

    Abstract: Much of human knowledge is encoded in text, available in scientific publications, books, and the web. Given the rapid growth of these resources, we need automated methods to extract such knowledge into machine-processable structures, such as knowledge graphs. An important task in this process is entity normalization, which consists of mapping noisy entity mentions in text to canonical entities in…

    Submitted 29 June, 2019; v1 submitted 19 November, 2018; originally announced November 2018.

    Comments: Accepted for publication at ECML-PKDD 2019

  17. Learning the Semantics of Structured Data Sources

    Authors: Mohsen Taheriyan, Craig A. Knoblock, Pedro Szekely, Jose Luis Ambite

    Abstract: Information sources such as relational databases, spreadsheets, XML, JSON, and Web APIs contain a tremendous amount of structured data that can be leveraged to build and augment knowledge graphs. However, they rarely provide a semantic model to describe their contents. Semantic models of data sources represent the implicit meaning of the data by specifying the concepts and the relationships within…

    Submitted 15 January, 2016; originally announced January 2016.

    Comments: Web Semantics: Science, Services and Agents on the World Wide Web, 2016

  18. Planning by Rewriting

    Authors: J. L. Ambite, C. A. Knoblock

    Abstract: Domain-independent planning is a hard combinatorial problem. Taking into account plan quality makes the task even more difficult. This article introduces Planning by Rewriting (PbR), a new paradigm for efficient high-quality domain-independent planning. PbR exploits declarative plan-rewriting rules and efficient local search techniques to transform an easy-to-generate, but possibly…

    Submitted 1 June, 2011; originally announced June 2011.

    Journal ref: Journal Of Artificial Intelligence Research, Volume 15, pages 207-261, 2001

  19. arXiv:0909.1769  [pdf]

    cs.DB cs.AI

    Interactive Data Integration through Smart Copy & Paste

    Authors: Zachary Ives, Craig Knoblock, Steve Minton, Marie Jacob, Partha Talukdar, Rattapoom Tuchinda, Jose Luis Ambite, Maria Muslea, Cenk Gazen

    Abstract: In many scenarios, such as emergency response or ad hoc collaboration, it is critical to reduce the overhead in integrating data. Ideally, one could perform the entire process interactively under one unified interface: defining extractors and wrappers for sources, creating a mediated schema, and adding schema mappings, while seeing how these impact the integrated view of the data, and refining…

    Submitted 9 September, 2009; originally announced September 2009.

    Comments: CIDR 2009