Showing 1–33 of 33 results for author: Araujo, L

Searching in archive cs.
  1. Charting 5G Energy Efficiency: Flexible Energy Modeling for Sustainable Networks

    Authors: Anderson L de Araujo, Luc Deneire, Guillaume Urvoy-Keller, André L F de Almeida

    Abstract: Despite the rapid advancements in 5G technology, accurately assessing the energy consumption of its Radio Access Networks (RANs) remains a challenge due to the diverse range of applicable technologies and implementation solutions. Designing a versatile power model for estimating the 5G RAN-specific power consumption requires extensive data collection and experimental studies to capture the diverse…

    Submitted 12 March, 2025; originally announced March 2025.

    Journal ref: 20th International Conference on Wireless and Mobile Computing, Networking and Communications (WiMob 2024), Oct 2024, Paris, France. pp.721-726

  2. arXiv:2502.04478  [pdf]

    cs.CV cs.LG

    OneTrack-M: A multitask approach to transformer-based MOT models

    Authors: Luiz C. S. de Araujo, Carlos M. S. Figueiredo

    Abstract: Multi-Object Tracking (MOT) is a critical problem in computer vision, essential for understanding how objects move and interact in videos. This field faces significant challenges such as occlusions and complex environmental dynamics, impacting model accuracy and efficiency. While traditional approaches have relied on Convolutional Neural Networks (CNNs), introducing transformers has brought substa…

    Submitted 6 February, 2025; originally announced February 2025.

    Comments: 13 pages, 11 figures

    ACM Class: I.4.8

  3. arXiv:2501.08464  [pdf, other]

    cs.LG eess.SP

    Time series forecasting for multidimensional telemetry data using GAN and BiLSTM in a Digital Twin

    Authors: Joao Carmo de Almeida Neto, Claudio Miceli de Farias, Leandro Santiago de Araujo, Leopoldo Andre Dutra Lusquino Filho

    Abstract: The research related to digital twins has been increasing in recent years. Besides the mirroring of the physical world into the digital, there is the need of providing services related to the data collected and transferred to the virtual world. One of these services is the forecasting of physical part future behavior, that could lead to applications, like preventing harmful events or designing impr…

    Submitted 14 January, 2025; originally announced January 2025.

  4. arXiv:2501.03991  [pdf, other]

    cs.CL

    Influences on LLM Calibration: A Study of Response Agreement, Loss Functions, and Prompt Styles

    Authors: Yuxi Xia, Pedro Henrique Luz de Araujo, Klim Zaporojets, Benjamin Roth

    Abstract: Calibration, the alignment between model confidence and prediction accuracy, is critical for the reliable deployment of large language models (LLMs). Existing works neglect to measure the generalization of their methods to other prompt styles and different sizes of LLMs. To address this, we define a controlled experimental setting covering 12 LLMs and four prompt styles. We additionally investigat…

    Submitted 7 January, 2025; originally announced January 2025.

    Comments: 24 pages, 11 figures, 8 tables

  5. arXiv:2410.19549  [pdf, other]

    cs.SE cs.CL

    Mirror Matrix on the Wall: coding and vector notation as tools for introspection

    Authors: Leonardo Araújo

    Abstract: The vector notation adopted by GNU Octave plays a significant role as a tool for introspection, aligning itself with the vision of Kenneth E. Iverson. He believed that, just like mathematics, a programming language should be an effective thinking tool for representing and reasoning about problems we wish to address. This work aims to explore the use of vector notation in GNU Octave through the ana…

    Submitted 30 October, 2024; v1 submitted 25 October, 2024; originally announced October 2024.

    Comments: 22 pages, 1 figure (3 subfigures)

  6. arXiv:2407.07159  [pdf, other]

    cs.CY cs.SI

    Finding Fake News Websites in the Wild

    Authors: Leandro Araujo, Joao M. M. Couto, Luiz Felipe Nery, Isadora C. Rodrigues, Jussara M. Almeida, Julio C. S. Reis, Fabricio Benevenuto

    Abstract: The battle against the spread of misinformation on the Internet is a daunting task faced by modern society. Fake news content is primarily distributed through digital platforms, with websites dedicated to producing and disseminating such content playing a pivotal role in this complex ecosystem. Therefore, these websites are of great interest to misinformation researchers. However, obtaining a comp…

    Submitted 15 July, 2024; v1 submitted 9 July, 2024; originally announced July 2024.

    Comments: This is a preprint version of a manuscript submitted to the Brazilian Symposium on Multimedia and the Web (WebMedia)

  7. arXiv:2407.02099  [pdf, other]

    cs.CL

    Helpful assistant or fruitful facilitator? Investigating how personas affect language model behavior

    Authors: Pedro Henrique Luz de Araujo, Benjamin Roth

    Abstract: One way to personalize and steer generations from large language models (LLM) is to assign a persona: a role that describes how the user expects the LLM to behave (e.g., a helpful assistant, a teacher, a woman). This paper investigates how personas affect diverse aspects of model behavior. We assign to seven LLMs 162 personas from 12 categories spanning variables like gender, sexual orientation, a…

    Submitted 21 May, 2025; v1 submitted 2 July, 2024; originally announced July 2024.

    Comments: 20 pages, 12 figures. Accepted at PLOS One

  8. arXiv:2406.18589  [pdf, other]

    cs.CV cs.LG

    Text-Guided Alternative Image Clustering

    Authors: Andreas Stephan, Lukas Miklautz, Collin Leiber, Pedro Henrique Luz de Araujo, Dominik Répás, Claudia Plant, Benjamin Roth

    Abstract: Traditional image clustering techniques only find a single grouping within visual data. In particular, they do not provide a possibility to explicitly define multiple types of clustering. This work explores the potential of large vision-language models to facilitate alternative image clustering. We propose Text-Guided Alternative Image Consensus Clustering (TGAICC), a novel approach that leverages…

    Submitted 7 June, 2024; originally announced June 2024.

  9. arXiv:2405.03004  [pdf, other]

    cs.CL cs.LG

    Exploring prompts to elicit memorization in masked language model-based named entity recognition

    Authors: Yuxi Xia, Anastasiia Sedova, Pedro Henrique Luz de Araujo, Vasiliki Kougia, Lisa Nußbaumer, Benjamin Roth

    Abstract: Training data memorization in language models impacts model capability (generalization) and safety (privacy risk). This paper focuses on analyzing prompts' impact on detecting the memorization of 6 masked language model-based named entity recognition models. Specifically, we employ a diverse set of 400 automatically generated prompts, and a pairwise dataset where each pair consists of one person's…

    Submitted 5 May, 2024; originally announced May 2024.

  10. arXiv:2404.04809  [pdf, other]

    cs.CL

    Low-Resource Machine Translation through Retrieval-Augmented LLM Prompting: A Study on the Mambai Language

    Authors: Raphaël Merx, Aso Mahmudi, Katrina Langford, Leo Alberto de Araujo, Ekaterina Vylomova

    Abstract: This study explores the use of large language models (LLMs) for translating English into Mambai, a low-resource Austronesian language spoken in Timor-Leste, with approximately 200,000 native speakers. Leveraging a novel corpus derived from a Mambai language manual and additional sentences translated by a native speaker, we examine the efficacy of few-shot LLM prompting for machine translation (MT)…

    Submitted 7 April, 2024; originally announced April 2024.

    Report number: https://aclanthology.org/2024.eurali-1.1/

  11. Specification Overfitting in Artificial Intelligence

    Authors: Benjamin Roth, Pedro Henrique Luz de Araujo, Yuxi Xia, Saskia Kaltenbrunner, Christoph Korab

    Abstract: Machine learning (ML) and artificial intelligence (AI) approaches are often criticized for their inherent bias and for their lack of control, accountability, and transparency. Consequently, regulatory bodies struggle with containing this technology's potential negative side effects. High-level requirements such as fairness and robustness need to be formalized into concrete specification metrics, i…

    Submitted 2 January, 2025; v1 submitted 13 March, 2024; originally announced March 2024.

    Comments: 41 pages, 2 figures. This version of the article has been accepted for publication, after peer review but is not the Version of Record and does not reflect post-acceptance improvements, or any corrections. The Version of Record is available online at: https://doi.org/10.1007/s10462-024-11040-6

    Journal ref: Artificial Intelligence Review 58, 35 (2025)

  12. arXiv:2311.08481  [pdf, other]

    cs.CL

    Functionality learning through specification instructions

    Authors: Pedro Henrique Luz de Araujo, Benjamin Roth

    Abstract: Test suites assess natural language processing models' performance on specific functionalities: cases of interest involving model robustness, fairness, or particular linguistic capabilities. This paper introduces specification instructions: text descriptions specifying fine-grained task-specific behaviors. For each functionality in a suite, we generate an instruction that describes it. We combine…

    Submitted 9 October, 2024; v1 submitted 14 November, 2023; originally announced November 2023.

    Comments: 36 pages, 8 figures. Accepted at EMNLP 2024 Findings

    Journal ref: In Findings of the Association for Computational Linguistics: EMNLP 2024, pages 10955-10990, Miami, Florida, USA. Association for Computational Linguistics

  13. arXiv:2311.05452  [pdf, other]

    eess.IV cs.CV cs.LG

    Transformer-based Model for Oral Epithelial Dysplasia Segmentation

    Authors: Adam J Shephard, Hanya Mahmood, Shan E Ahmed Raza, Anna Luiza Damaceno Araujo, Alan Roger Santos-Silva, Marcio Ajudarte Lopes, Pablo Agustin Vargas, Kris McCombe, Stephanie Craig, Jacqueline James, Jill Brooks, Paul Nankivell, Hisham Mehanna, Syed Ali Khurram, Nasir M Rajpoot

    Abstract: Oral epithelial dysplasia (OED) is a premalignant histopathological diagnosis given to lesions of the oral cavity. OED grading is subject to large inter/intra-rater variability, resulting in the under/over-treatment of patients. We developed a new Transformer-based pipeline to improve detection and segmentation of OED in haematoxylin and eosin (H&E) stained whole slide images (WSIs). Our model was…

    Submitted 9 November, 2023; originally announced November 2023.

    Comments: 5 pages, 2 figures, 4 tables

  14. Cross-functional Analysis of Generalisation in Behavioural Learning

    Authors: Pedro Henrique Luz de Araujo, Benjamin Roth

    Abstract: In behavioural testing, system functionalities underrepresented in the standard evaluation setting (with a held-out test set) are validated through controlled input-output pairs. Optimising performance on the behavioural tests during training (behavioural learning) would improve coverage of phenomena not sufficiently represented in the i.i.d. data and could lead to seemingly more robust models. Ho…

    Submitted 22 May, 2023; originally announced May 2023.

    Comments: 16 pages, 1 figure. To be published in the Transactions of the Association for Computational Linguistics (TACL). This preprint is a pre-MIT Press publication version

    Journal ref: Transactions of the Association for Computational Linguistics 11, 2023, 1066-1081

  15. arXiv:2304.10618  [pdf, other]

    cs.AR eess.SP

    ULEEN: A Novel Architecture for Ultra Low-Energy Edge Neural Networks

    Authors: Zachary Susskind, Aman Arora, Igor D. S. Miranda, Alan T. L. Bacellar, Luis A. Q. Villon, Rafael F. Katopodis, Leandro S. de Araujo, Diego L. C. Dutra, Priscila M. V. Lima, Felipe M. G. Franca, Mauricio Breternitz Jr., Lizy K. John

    Abstract: The deployment of AI models on low-power, real-time edge devices requires accelerators for which energy, latency, and area are all first-order concerns. There are many approaches to enabling deep neural networks (DNNs) in this domain, including pruning, quantization, compression, and binary neural networks (BNNs), but with the emergence of the "extreme edge", there is now a demand for even more ef…

    Submitted 20 April, 2023; originally announced April 2023.

    Comments: 14 pages, 14 figures. Portions of this article draw heavily from arXiv:2203.01479, most notably sections 5E and 5F.2

  16. Sequence-aware multimodal page classification of Brazilian legal documents

    Authors: Pedro H. Luz de Araujo, Ana Paula G. S. de Almeida, Fabricio A. Braz, Nilton C. da Silva, Flavio de Barros Vidal, Teofilo E. de Campos

    Abstract: The Brazilian Supreme Court receives tens of thousands of cases each semester. Court employees spend thousands of hours to execute the initial analysis and classification of those cases -- which takes effort away from posterior, more complex stages of the case management workflow. In this paper, we explore multimodal classification of documents from Brazil's Supreme Court. We train and evaluate ou…

    Submitted 15 July, 2022; v1 submitted 2 July, 2022; originally announced July 2022.

    Comments: 11 pages, 6 figures. This preprint, which was originally written on 8 April 2021, has not undergone peer review or any post-submission improvements or corrections. The Version of Record of this article is published in the International Journal on Document Analysis and Recognition, and is available online at https://doi.org/10.1007/s10032-022-00406-7 and https://rdcu.be/cRvvV

    Journal ref: International Journal on Document Analysis and Recognition, 2022

  17. Checking HateCheck: a cross-functional analysis of behaviour-aware learning for hate speech detection

    Authors: Pedro Henrique Luz de Araujo, Benjamin Roth

    Abstract: Behavioural testing -- verifying system capabilities by validating human-designed input-output pairs -- is an alternative evaluation method of natural language processing systems proposed to address the shortcomings of the standard approach: computing metrics on held-out data. While behavioural tests capture human prior knowledge and insights, there has been little exploration on how to leverage t…

    Submitted 8 April, 2022; originally announced April 2022.

    Comments: 9 pages, 5 figures. Accepted at the First Workshop on Efficient Benchmarking in NLP (NLP Power!)

    Journal ref: In Proceedings of NLP Power! The First Workshop on Efficient Benchmarking in NLP, 2022, pages 75-83, Dublin, Ireland. Association for Computational Linguistics

  18. arXiv:2203.01479  [pdf, other]

    cs.AR cs.LG

    Weightless Neural Networks for Efficient Edge Inference

    Authors: Zachary Susskind, Aman Arora, Igor Dantas Dos Santos Miranda, Luis Armando Quintanilla Villon, Rafael Fontella Katopodis, Leandro Santiago de Araujo, Diego Leonel Cadette Dutra, Priscila Machado Vieira Lima, Felipe Maia Galvao Franca, Mauricio Breternitz Jr., Lizy K. John

    Abstract: Weightless Neural Networks (WNNs) are a class of machine learning model which use table lookups to perform inference. This is in contrast with Deep Neural Networks (DNNs), which use multiply-accumulate operations. State-of-the-art WNN architectures have a fraction of the implementation cost of DNNs, but still lag behind them on accuracy for common image recognition tasks. Additionally, many existi…

    Submitted 2 March, 2022; originally announced March 2022.

  19. arXiv:2201.08041  [pdf, other]

    cs.NI

    Multi-SIM support in 5G Evolution: Challenges and Opportunities

    Authors: O. Vikhrova, S. Pizzi, A. Terzani, L. Araujo, A. Orsino, G. Araniti

    Abstract: Devices with multiple Subscriber Identification Modules (SIM)s are expected to prevail over the conventional devices with only one SIM. Despite the growing demand for such devices, only proprietary solutions are available so far. To fill this gap, the Third Generation Partnership Project (3GPP) is aiming at the development of unified cross-platform solutions for multi-SIM device coordination. This…

    Submitted 20 January, 2022; originally announced January 2022.

    Comments: This paper has been accepted for publication in IEEE Communications Standards Magazine

  20. arXiv:2108.05136  [pdf, other]

    cs.AI cs.GT

    Snakes AI Competition 2020 and 2021 Report

    Authors: Joseph Alexander Brown, Luiz Jonata Pires de Araujo, Alexandr Grichshenko

    Abstract: The Snakes AI Competition was held by the Innopolis University and was part of the IEEE Conference on Games 2020 and 2021 editions. It aimed to create a sandbox for learning and implementing artificial intelligence algorithms in agents in a ludic manner. Competitors from several countries participated in both editions of the competition, which was streamed to create a synergy between organizers and th…

    Submitted 11 August, 2021; originally announced August 2021.

  21. Using Tabu Search Algorithm for Map Generation in the Terra Mystica Tabletop Game

    Authors: Alexandr Grichshenko, Luiz Jonata Pires de Araujo, Susanna Gimaeva, Joseph Alexander Brown

    Abstract: The Tabu Search (TS) metaheuristic improves simple local search algorithms (e.g. steepest-ascent hill-climbing) by enabling the algorithm to escape local optima points. It has been shown to be useful for addressing several combinatorial optimization problems. This paper investigates the performance of TS and considers the effects of the size of the Tabu list and the size of the neighbourhood for a procedur…

    Submitted 4 June, 2020; originally announced June 2020.

    Journal ref: ISMSI '20: Proceedings of the 2020 4th International Conference on Intelligent Systems, Metaheuristics & Swarm Intelligence

  22. Machine Learning and value generation in Software Development: a survey

    Authors: Barakat. J. Akinsanya, Luiz J. P. Araújo, Mariia Charikova, Susanna Gimaeva, Alexandr Grichshenko, Adil Khan, Manuel Mazzara, Ozioma Okonicha N, Daniil Shilintsev

    Abstract: Machine Learning (ML) has become a ubiquitous tool for predicting and classifying data and has found application in several problem domains, including Software Development (SD). This paper reviews the literature between 2000 and 2019 on the use of the learning models that have been employed for programming effort estimation, predicting risks and identifying and detecting defects. This work is meant t…

    Submitted 23 January, 2020; originally announced January 2020.

    Comments: To be published in the proceedings of the International Conference on Software Testing, Machine Learning and Complex Process Analysis (TMPA-2019)

  23. arXiv:1911.03746  [pdf, other]

    cs.CY cs.NI

    A Machine to Machine framework for the charging of Electric Autonomous Vehicles

    Authors: Ziyad Elbanna, Ilya Afanasyev, Luiz J. P. Araujo, Rasheed Hussain, Mansur Khazeev, Joseph Lamptey, Manuel Mazzara, Swati Megha, Diksha Moolchandani, Dragos Strugar

    Abstract: Electric Autonomous Vehicles (EAVs) have gained increasing attention of industry, governments and scientific communities concerned about issues related to classic transportation including accidents and casualties, gas emissions and air pollution, intensive traffic and city viability. One of the aspects, however, that prevent a broader adoption of this technology is the need for human interference…

    Submitted 9 November, 2019; originally announced November 2019.

  24. arXiv:1909.12682  [pdf, other]

    cs.SE

    Anomaly Detection in DevOps Toolchain

    Authors: Antonio Capizzi, Salvatore Distefano, Manuel Mazzara, Luiz J. P. Araùjo, Muhammad Ahmad, Evgeny Bobrov

    Abstract: The tools employed in the DevOps Toolchain generate a large quantity of data that is typically ignored or inspected only on particular occasions, at most. However, the analysis of such data could enable the extraction of useful information about the status and evolution of the project. For example, metrics like the "lines of code added since the last release" or "failures detected in the staging…

    Submitted 27 September, 2019; originally announced September 2019.

  25. arXiv:1811.00607  [pdf, ps, other]

    cs.DC

    Exploring the Equivalence between Dynamic Dataflow Model and Gamma - General Abstract Model for Multiset mAnipulation

    Authors: Rui R. Mello Junior, Leandro S. Araujo, Tiago A. O. Alves, Leandro A. J. Marzulo, Gabriel A. L. Paillard, Felipe M. G. França

    Abstract: With the increase of the search for computational models where the expression of parallelism occurs naturally, some paradigms arise as options for the next generation of computers. In this context, dynamic Dataflow and Gamma (General Abstract Model for Multiset mAnipulation) emerge as interesting computational model choices. In the dynamic Dataflow model, operations are performed as soon as th…

    Submitted 1 November, 2018; originally announced November 2018.

    Comments: Study submitted to the IPDPS 2019 - IEEE International Parallel and Distributed Processing Symposium

  26. arXiv:1807.03688  [pdf, other]

    cs.SI

    Inside the Right-Leaning Echo Chambers: Characterizing Gab, an Unmoderated Social System

    Authors: Lucas Lima, Julio C. S. Reis, Philipe Melo, Fabricio Murai, Leandro Araújo, Pantelis Vikatos, Fabrício Benevenuto

    Abstract: The moderation of content in many social media systems, such as Twitter and Facebook, motivated the emergence of a new social network system that promotes free speech, named Gab. Soon after that, Gab was removed from the Google Play Store for violating the company's hate speech policy, and it was rejected by Apple for similar reasons. In this paper we characterize Gab, aiming at understanding…

    Submitted 10 July, 2018; originally announced July 2018.

    Comments: This is a preprint of a paper that will appear on ASONAM'18

  27. arXiv:1803.03571  [pdf, other]

    cs.SI cs.CY cs.IR q-bio.QM stat.ML

    City-wide Analysis of Electronic Health Records Reveals Gender and Age Biases in the Administration of Known Drug-Drug Interactions

    Authors: Rion Brattig Correia, Luciana P. de Araújo, Mauro M. Mattos, Luis M. Rocha

    Abstract: The occurrence of drug-drug-interactions (DDI) from multiple drug dispensations is a serious problem, both for individuals and health-care systems, since patients with complications due to DDI are likely to reenter the system at a costlier level. We present a large-scale longitudinal study (18 months) of the DDI phenomenon at the primary- and secondary-care level using electronic health records (E…

    Submitted 2 January, 2020; v1 submitted 9 March, 2018; originally announced March 2018.

    MSC Class: J.3; G.3

    ACM Class: J.3; G.3

    Journal ref: npj Digit. Med. 2, 74 (2019)

  28. arXiv:1204.6089  [pdf, ps, other]

    cs.MA cs.CE cs.CY cs.SE

    Multi-model-based Access Control in Construction Projects

    Authors: Frank Hilbert, Raimar J. Scherer, Larissa Araujo

    Abstract: During the execution of large scale construction projects performed by Virtual Organizations (VO), relatively complex technical models have to be exchanged between the VO members. For linking the trade and transfer of these models, a so-called multi-model container format was developed. Considering the different skills and tasks of the involved partners, it is not necessary for them to know all th…

    Submitted 26 April, 2012; originally announced April 2012.

    Comments: In Proceedings FAVO 2011, arXiv:1204.5796

    ACM Class: H.5.3

    Journal ref: EPTCS 83, 2012, pp. 1-9

  29. arXiv:0806.2843  [pdf, ps, other]

    cs.NE cs.DC

    MultiKulti Algorithm: Migrating the Most Different Genotypes in an Island Model

    Authors: Lourdes Araujo, Juan J. Merelo Guervos, Carlos Cotta, Francisco Fernandez de Vega

    Abstract: Migration policies in distributed evolutionary algorithms have not been an active research area until recently. However, in the same way as operators have an impact on performance, the choice of migrants is due to have an impact too. In this paper we propose a new policy (named multikulti) for choosing the individuals that are going to be sent to other nodes, based on multiculturality: the indivi…

    Submitted 18 June, 2008; v1 submitted 17 June, 2008; originally announced June 2008.

    Comments: First description of the multikulti distributed evolutionary computation migration policy

  30. arXiv:0804.2057  [pdf, ps, other]

    cs.IR

    Comparing and Combining Methods for Automatic Query Expansion

    Authors: José R. Pérez-Agüera, Lourdes Araujo

    Abstract: Query expansion is a well known method to improve the performance of information retrieval systems. In this work we have tested different approaches to extract the candidate query terms from the top ranked documents returned by the first-pass retrieval. One of them is the cooccurrence approach, based on measures of cooccurrence of the candidate and the query terms in the retrieved documents. T…

    Submitted 13 April, 2008; originally announced April 2008.

    Comments: 12 pages

    Journal ref: Advances in Natural Language Processing and Applications. Research in Computing Science 33, 2008, pp. 177-188

  31. arXiv:0801.1210  [pdf, ps, other]

    cs.DC

    Increasing GP Computing Power via Volunteer Computing

    Authors: Daniel Lombrana Gonzalez, Francisco Fernandez de Vega, L. Trujillo, G. Olague, F. Chavez de la O, M. Cardenas, L. Araujo, P. Castillo, K. Sharman

    Abstract: This paper describes how it is possible to increase GP Computing Power via Volunteer Computing (VC) using the BOINC framework. Two experiments using well-known GP tools -Lil-gp & ECJ- are performed in order to demonstrate the benefit of using VC in terms of computing power and speed up. Finally we present an extension of the model where any GP tool or framework can be used inside BOINC regardles…

    Submitted 8 January, 2008; originally announced January 2008.

    Comments: First draft, preparing for PPSN 2008

  32. arXiv:cs/0610019  [pdf]

    cs.IR cs.HC

    NectaRSS, an RSS feed ranking system that implicitly learns user preferences

    Authors: Juan J. Samper, Pedro A. Castillo, Lourdes Araujo, J. J. Merelo

    Abstract: In this paper a new RSS feed ranking method called NectaRSS is introduced. The system recommends information to a user based on his/her past choices. User preferences are automatically acquired, avoiding explicit feedback, and ranking is based on those preferences distilled to a user profile. NectaRSS uses the well-known vector space model for user profiles and new documents, and compares them u…

    Submitted 4 October, 2006; originally announced October 2006.

    Comments: Submitted to First Monday. 16 pages

  33. arXiv:cs/0601047  [pdf, ps, other]

    cs.IR cs.NE

    Automatic Detection of Trends in Dynamical Text: An Evolutionary Approach

    Authors: Lourdes Araujo, Juan J. Merelo

    Abstract: This paper presents an evolutionary algorithm for modeling the arrival dates of document streams, which is any time-stamped collection of documents, such as newscasts, e-mails, IRC conversations, scientific journals archives and weblog postings. This algorithm assigns frequencies (number of document arrivals per time unit) to time intervals so that it produces an optimal fit to the data. The opt…

    Submitted 12 January, 2006; originally announced January 2006.

    Comments: 22 pages, submitted to Journal of Information Retrieval