Showing 1–11 of 11 results for author: Amorim, R

Searching in archive cs.
  1. arXiv:2503.00379

    cs.LG stat.ML

    Improving clustering quality evaluation in noisy Gaussian mixtures

    Authors: Renato Cordeiro de Amorim, Vladimir Makarenkov

    Abstract: Clustering is a well-established technique in machine learning and data analysis, widely used across various domains. Cluster validity indices, such as the Average Silhouette Width, Calinski-Harabasz, and Davies-Bouldin indices, play a crucial role in assessing clustering quality when external ground truth labels are unavailable. However, these measures can be affected by the feature relevance iss…

    Submitted 27 March, 2025; v1 submitted 1 March, 2025; originally announced March 2025.

  2. arXiv:2411.19733

    cs.CL

    A Deep Learning Approach to Language-independent Gender Prediction on Twitter

    Authors: Reyhaneh Hashempour, Barbara Plank, Aline Villavicencio, Renato Cordeiro de Amorim

    Abstract: This work presents a set of experiments conducted to predict the gender of Twitter users based on language-independent features extracted from the text of the users' tweets. The experiments were performed on a version of the TwiSty dataset including tweets written by users of six different languages: Portuguese, French, Dutch, English, German, and Italian. Logistic regression (LR), and feed-forwar…

    Submitted 29 November, 2024; originally announced November 2024.

    Journal ref: Proceedings of the 2019 Workshop on Widening NLP, pp. 92-94, Florence, Italy

  3. arXiv:2409.18865

    stat.ML cs.AI cs.CV cs.LG cs.SI

    Positional Encoder Graph Quantile Neural Networks for Geographic Data

    Authors: William E. R. de Amorim, Scott A. Sisson, T. Rodrigues, David J. Nott, Guilherme S. Rodrigues

    Abstract: Positional Encoder Graph Neural Networks (PE-GNNs) are among the most effective models for learning from continuous spatial data. However, their predictive distributions are often poorly calibrated, limiting their utility in applications that require reliable uncertainty quantification. We propose the Positional Encoder Graph Quantile Neural Network (PE-GQNN), a novel framework that combines PE-GN…

    Submitted 15 May, 2025; v1 submitted 27 September, 2024; originally announced September 2024.

    Comments: 12 main text pages, 4 figures

  4. arXiv:2308.12152

    cs.GR cs.CG

    Geo-Sketcher: Rapid 3D Geological Modeling using Geological and Topographic Map Sketches

    Authors: Ronan Amorim, Emilio Vital Brazil, Faramarz Samavati, Mario Costa Sousa

    Abstract: The construction of 3D geological models is an essential task in oil/gas exploration, development and production. However, it is a cumbersome, time-consuming and error-prone task, mainly because of the models' geometric and topological complexity. Model construction is usually separated into interpretation and 3D modeling, performed by different highly specialized individuals, which leads to i…

    Submitted 21 August, 2023; originally announced August 2023.

    Comments: 21 pages, 30 Figures

  5. Power Allocation for Uplink Communications of Massive Cellular-Connected UAVs

    Authors: Xuesong Cai, István Z. Kovács, Jeroen Wigard, Rafhael Amorim, Fredrik Tufvesson, Preben E. Mogensen

    Abstract: Cellular-connected unmanned aerial vehicles (UAVs) have attracted a surge of research interest in both academia and industry. To support aerial user equipment (UEs) in the existing cellular networks, one promising approach is to assign a portion of the system bandwidth exclusively to the UAV-UEs. This is especially favorable for use cases where a large number of UAV-UEs are exploited, e.g., for packa…

    Submitted 26 February, 2023; v1 submitted 25 July, 2021; originally announced July 2021.

    Comments: The final version can be found in IEEE Transactions on Vehicular Technology

  6. Improving cluster recovery with feature rescaling factors

    Authors: Renato Cordeiro de Amorim, Vladimir Makarenkov

    Abstract: The data preprocessing stage is crucial in clustering. Features may describe entities using different scales. To rectify this, one usually applies feature normalisation aiming at rescaling features so that none of them overpowers the others in the objective function of the selected clustering algorithm. In this paper, we argue that the rescaling procedure should not treat all features identically.…

    Submitted 1 December, 2020; originally announced December 2020.

  7. arXiv:2008.01175

    cs.CR cs.LG stat.ML

    Identifying meaningful clusters in malware data

    Authors: Renato Cordeiro de Amorim, Carlos David Lopez Ruiz

    Abstract: Finding meaningful clusters in drive-by-download malware data is a particularly difficult task. Malware data tends to contain overlapping clusters with wide variations of cardinality. This happens because there can be considerable similarity between malware samples (some are even said to belong to the same family), and these tend to appear in bursts. Clustering algorithms are usually applied to no…

    Submitted 31 July, 2020; originally announced August 2020.

  8. arXiv:1811.07615

    cs.LG stat.ML

    An efficient density-based clustering algorithm using reverse nearest neighbour

    Authors: Stiphen Chowdhury, Renato Cordeiro de Amorim

    Abstract: Density-based clustering is the task of discovering high-density regions of entities (clusters) that are separated from each other by contiguous regions of low density. DBSCAN is, arguably, the most popular density-based clustering algorithm. However, its cluster recovery capabilities depend on the combination of its two parameters. In this paper we present a new density-based clustering algorithm…

    Submitted 19 November, 2018; originally announced November 2018.

    Comments: Accepted in: Computing Conference 2019 in London, UK. http://saiconference.com/Computing

  9. A-Ward_pβ: Effective hierarchical clustering using the Minkowski metric and a fast k-means initialisation

    Authors: Renato Cordeiro de Amorim, Vladimir Makarenkov, Boris Mirkin

    Abstract: In this paper we make two novel contributions to hierarchical clustering. First, we introduce an anomalous pattern initialisation method for hierarchical clustering algorithms, called A-Ward, capable of substantially reducing the time they take to converge. This method generates an initial partition with a sufficiently large number of clusters. This allows the cluster merging process to start from…

    Submitted 3 November, 2016; originally announced November 2016.

    Journal ref: Information Sciences, 370, 343-354 (2016)

  10. Recovering the number of clusters in data sets with noise features using feature rescaling factors

    Authors: Renato Cordeiro de Amorim, Christian Hennig

    Abstract: In this paper we introduce three methods for re-scaling data sets aiming at improving the likelihood of clustering validity indexes to return the true number of spherical Gaussian clusters with additional noise features. Our method obtains feature re-scaling factors taking into account the structure of a given data set and the intuitive idea that different features may have different degrees of re…

    Submitted 22 February, 2016; originally announced February 2016.

    Journal ref: Information Sciences 324 (2015), 126-145

  11. arXiv:1601.03483

    cs.LG

    A survey on feature weighting based K-Means algorithms

    Authors: Renato Cordeiro de Amorim

    Abstract: In a real-world data set there is always the possibility, rather high in our opinion, that different features may have different degrees of relevance. Most machine learning algorithms deal with this fact by either selecting or deselecting features in the data preprocessing phase. However, we maintain that even among relevant features there may be different degrees of relevance, and this should be…

    Submitted 22 September, 2015; originally announced January 2016.

    Comments: Journal of Classification (to appear)