Showing 1–9 of 9 results for author: Abelló, A

Searching in archive cs.
  1. arXiv:2311.18746  [pdf, other]

    cs.LG

    A data-science pipeline to enable the Interpretability of Many-Objective Feature Selection

    Authors: Uchechukwu F. Njoku, Alberto Abelló, Besim Bilalli, Gianluca Bontempi

    Abstract: Many-Objective Feature Selection (MOFS) approaches use four or more objectives to determine the relevance of a subset of features in a supervised learning task. As a consequence, MOFS typically returns a large set of non-dominated solutions, which have to be assessed by the data scientist in order to proceed with the final choice. Given the multi-variate nature of the assessment, which may include…

    Submitted 30 November, 2023; originally announced November 2023.

    Comments: 8 pages, 5 figures, 6 tables

  2. Federated Learning Enables Big Data for Rare Cancer Boundary Detection

    Authors: Sarthak Pati, Ujjwal Baid, Brandon Edwards, Micah Sheller, Shih-Han Wang, G Anthony Reina, Patrick Foley, Alexey Gruzdev, Deepthi Karkada, Christos Davatzikos, Chiharu Sako, Satyam Ghodasara, Michel Bilello, Suyash Mohan, Philipp Vollmuth, Gianluca Brugnara, Chandrakanth J Preetha, Felix Sahm, Klaus Maier-Hein, Maximilian Zenk, Martin Bendszus, Wolfgang Wick, Evan Calabrese, Jeffrey Rudie, Javier Villanueva-Meyer, et al. (254 additional authors not shown)

    Abstract: Although machine learning (ML) has shown promise in numerous domains, there are concerns about generalizability to out-of-sample data. This is currently addressed by centrally sharing ample, and importantly diverse, data from multiple sites. However, such centralization is challenging to scale (or even not feasible) due to various limitations. Federated ML (FL) provides an alternative to train acc…

    Submitted 25 April, 2022; v1 submitted 22 April, 2022; originally announced April 2022.

    Comments: federated learning, deep learning, convolutional neural network, segmentation, brain tumor, glioma, glioblastoma, FeTS, BraTS

  3. Eris: Measuring discord among multidimensional data sources

    Authors: Alberto Abello, James Cheney

    Abstract: Data integration is a classical problem in databases, typically decomposed into schema matching, entity matching and data fusion. To solve the latter, it is mostly assumed that ground truth can be determined. However, in general, the data gathering processes in the different sources are imperfect and cannot provide an accurate merging of values. Thus, in the absence of ways to determine ground tru…

    Submitted 17 August, 2023; v1 submitted 31 January, 2022; originally announced January 2022.

    Comments: 33 pages, 15 figures

    MSC Class: 68P15; ACM Class: H.2.4

  4. arXiv:2103.10811  [pdf, other]

    cs.SE

    Improving Web API Usage Logging

    Authors: Rediana Koçi, Xavier Franch, Petar Jovanovic, Alberto Abelló

    Abstract: A Web API (WAPI) is a type of API whose interaction with its consumers is done through the Internet. While being accessed through the Internet can be challenging, mostly when WAPIs evolve, it gives providers the possibility to monitor their usage, and understand and analyze consumers' behavior. Currently, WAPI usage is mostly logged for traffic monitoring and troubleshooting. Even though they cont…

    Submitted 19 March, 2021; originally announced March 2021.

  5. arXiv:2102.13125  [pdf, other]

    cs.DC

    MEDAL: An AI-driven Data Fabric Concept for Elastic Cloud-to-Edge Intelligence

    Authors: Vasileios Theodorou, Ilias Gerostathopoulos, Iyad Alshabani, Alberto Abello, David Breitgand

    Abstract: Current Cloud solutions for Edge Computing are inefficient for data-centric applications, as they focus on the IaaS/PaaS level and they miss the data modeling and operations perspective. Consequently, Edge Computing opportunities are lost due to cumbersome and data assets-agnostic processes for end-to-end deployment over the Cloud-to-Edge continuum. In this paper, we introduce MEDAL, an intelligen…

    Submitted 25 February, 2021; originally announced February 2021.

  6. arXiv:1806.03901  [pdf, other]

    cs.DC

    A Cost-based Storage Format Selector for Materialization in Big Data Frameworks

    Authors: Rana Faisal Munir, Alberto Abelló, Oscar Romero, Maik Thiele, Wolfgang Lehner

    Abstract: Modern big data frameworks (such as Hadoop and Spark) allow multiple users to do large-scale analysis simultaneously. Typically, users deploy Data-Intensive Workflows (DIWs) for their analytical tasks. These DIWs of different users share many common parts (i.e., 50-80%), which can be materialized to reuse them in future executions. The materialization improves the overall processing time of DIWs an…

    Submitted 11 June, 2018; originally announced June 2018.

  7. arXiv:1805.02301  [pdf, other]

    cs.CE

    Day-ahead Trading of Aggregated Energy Flexibility - Full Version

    Authors: Emmanouil Valsomatzis, Torben Bach Pedersen, Alberto Abello

    Abstract: Flexibility of small loads, in particular from Electric Vehicles (EVs), has recently attracted a lot of interest due to their possibility of participating in the energy market and the new commercial potentials. Different from existing work, the aggregation techniques proposed in this paper produce flexible aggregated loads from EVs taking into account technical market requirements. They can be fur…

    Submitted 24 May, 2018; v1 submitted 6 May, 2018; originally announced May 2018.

    Comments: 9 pages, 7 figures, note paper of the full version to appear at ACM e-Energy 2018

  8. arXiv:1803.01024  [pdf, other]

    cs.LG cs.DB

    PRESISTANT: Learning based assistant for data pre-processing

    Authors: Besim Bilalli, Alberto Abelló, Tomàs Aluja-Banet, Robert Wrembel

    Abstract: Data pre-processing is one of the most time consuming and relevant steps in a data analysis process (e.g., classification task). A given data pre-processing operator (e.g., transformation) can have positive, negative or zero impact on the final result of the analysis. Expert users have the required knowledge to find the right pre-processing operators. However, when it comes to non-experts, they ar…

    Submitted 2 March, 2018; originally announced March 2018.

  9. An Integration-Oriented Ontology to Govern Evolution in Big Data Ecosystems

    Authors: Sergi Nadal, Oscar Romero, Alberto Abelló, Panos Vassiliadis, Stijn Vansummeren

    Abstract: Big Data architectures allow to flexibly store and process heterogeneous data, from multiple sources, in their original format. The structure of those data, commonly supplied by means of REST APIs, is continuously evolving. Thus data analysts need to adapt their analytical processes after each API release. This gets more challenging when performing an integrated or historical analysis. To cope wit…

    Submitted 16 January, 2018; originally announced January 2018.

    Comments: Preprint submitted to Information Systems. 35 pages