Showing 1–11 of 11 results for author: Gayo-Avello, D

  1. arXiv:2210.15495  [pdf, other]

    cs.LG cs.AI cs.IR

    Leveraging Wikidata's edit history in knowledge graph refinement tasks

    Authors: Alejandro Gonzalez-Hevia, Daniel Gayo-Avello

    Abstract: Knowledge graphs have been adopted in many diverse fields for a variety of purposes. Most of those applications rely on valid and complete data to deliver their results, pressing the need to improve the quality of knowledge graphs. A number of solutions have been proposed to that end, ranging from rule-based approaches to the use of probabilistic methods, but there is an element that has not been…

    Submitted 27 October, 2022; originally announced October 2022.

    Comments: 18 pages, 7 figures. Submitted to the Journal of Web Semantics

    ACM Class: H.3; H.4; I.2

  2. arXiv:1611.08144  [pdf, other]

    cs.CY cs.DL cs.SI

    How I Stopped Worrying about the Twitter Archive at the Library of Congress and Learned to Build a Little One for Myself

    Authors: Daniel Gayo-Avello

    Abstract: Twitter is among the commonest sources of data employed in social media research mainly because of its convenient APIs to collect tweets. However, most researchers do not have access to the expensive Firehose and Twitter Historical Archive, and they must rely on data collected with free APIs whose representativeness has been questioned. In 2010 the Library of Congress announced an agreement with T…

    Submitted 24 November, 2016; originally announced November 2016.

    Comments: 22 pages, 13 figures

  3. arXiv:1510.00618  [pdf]

    cs.CL

    Automatic Taxonomy Extraction from Query Logs with no Additional Sources of Information

    Authors: Miguel Fernandez-Fernandez, Daniel Gayo-Avello

    Abstract: Search engine logs store detailed information on Web users' interactions. Thus, as more and more people use search engines on a daily basis, important trails of users' common knowledge are being recorded in those files. Previous research has shown that it is possible to extract concept taxonomies from full text documents, while other scholars have proposed methods to obtain similar queries from quer…

    Submitted 5 October, 2015; v1 submitted 2 October, 2015; originally announced October 2015.

    Comments: 21 pages, 4 figures, 5 tables. Old (2012) unpublished manuscript

  4. arXiv:1206.5851  [pdf]

    cs.SI cs.CL cs.CY physics.soc-ph

    A meta-analysis of state-of-the-art electoral prediction from Twitter data

    Authors: Daniel Gayo-Avello

    Abstract: Electoral prediction from Twitter data is an appealing research topic. It seems relatively straightforward and the prevailing view is overly optimistic. This is problematic because while simple approaches are assumed to be good enough, core problems are not addressed. Thus, this paper aims to (1) provide a balanced and critical review of the state of the art; (2) cast light on the presumed predicti…

    Submitted 25 June, 2012; originally announced June 2012.

    Comments: 19 pages, 3 tables

    ACM Class: H.2.8; H.3.5; H.4.3; I.2.7; I.5.4; J.4; K.4.1

    Journal ref: Social Science Computer Review, August 23, 2013, 0894439313493979

  5. arXiv:1204.6441  [pdf, ps, other]

    cs.CY cs.CL cs.SI physics.soc-ph

    "I Wanted to Predict Elections with Twitter and all I got was this Lousy Paper" -- A Balanced Survey on Election Prediction using Twitter Data

    Authors: Daniel Gayo-Avello

    Abstract: Predicting X from Twitter is a popular fad within the Twitter research subculture. It seems both appealing and relatively easy. Among such kind of studies, electoral prediction is maybe the most attractive, and at this moment there is a growing body of literature on such a topic. This is not only an interesting research problem but, above all, it is extremely difficult. However, most of the author…

    Submitted 28 April, 2012; originally announced April 2012.

    Comments: 13 pages, no figures. Annotated bibliography of 25 papers regarding electoral prediction from Twitter data

  6. arXiv:1012.5913  [pdf, ps, other]

    cs.SI cs.CY cs.DM

    All liaisons are dangerous when all your friends are known to us

    Authors: Daniel Gayo-Avello

    Abstract: Online Social Networks (OSNs) are used by millions of users worldwide. Academically speaking, there is little doubt about the usefulness of demographic studies conducted on OSNs and, hence, methods to label unknown users from small labeled samples are very useful. However, from the general public's point of view, this can be a serious privacy concern. Thus, both topics are tackled in this paper: Fir…

    Submitted 29 December, 2010; originally announced December 2010.

    Comments: 10 pages, 5 tables

    ACM Class: G.2.2; I.5.2; K.4.1

  7. arXiv:1012.2057  [pdf, ps, other]

    cs.SI physics.soc-ph

    De retibus socialibus et legibus momenti

    Authors: Daniel Gayo-Avello, David J. Brenes, Diego Fernández-Fernández, María E. Fernández-Menéndez, Rodrigo García-Suárez

    Abstract: Online Social Networks (OSNs) are a cutting-edge topic. Almost everybody --users, marketers, brands, companies, and researchers-- is approaching OSNs to better understand them and take advantage of their benefits. Maybe one of the key concepts underlying OSNs is that of influence which is highly related, although not entirely identical, to those of popularity and centrality. Influence is, accordin…

    Submitted 14 February, 2011; v1 submitted 9 December, 2010; originally announced December 2010.

    Comments: Changes made for third revision: Brief description of the dataset employed added to Introduction. Minor changes to the description of preparation of the bit.ly datasets. Minor changes to the captions of Tables 1 and 3. Brief addition in the Conclusions section (future line of work added). Added references 16 and 18. Some typos and grammar polished

    Journal ref: 2011 EPL 94 38001

  8. arXiv:1005.5516  [pdf, ps, other]

    cs.IR

    On the Fly Query Entity Decomposition Using Snippets

    Authors: David J. Brenes, Daniel Gayo-Avello, Rodrigo Garcia

    Abstract: One of the most important issues in Information Retrieval is inferring the intents underlying users' queries. Thus, any tool to enrich or to better contextualize queries can prove extremely valuable. Entity extraction, provided it is done fast, can be one such tool. Such techniques usually rely on a prior training phase involving large datasets. That training is costly, especially in environme…

    Submitted 6 June, 2010; v1 submitted 30 May, 2010; originally announced May 2010.

    Comments: Extended version of paper submitted to CERI 2010

  9. Nepotistic Relationships in Twitter and their Impact on Rank Prestige Algorithms

    Authors: Daniel Gayo-Avello

    Abstract: Micro-blogging services such as Twitter allow anyone to publish anything, anytime. Needless to say, many of the available contents can be diminished as babble or spam. However, given the number and diversity of users, some valuable pieces of information should arise from the stream of tweets. Thus, such services can develop into valuable sources of up-to-date information (the so-called real-time w…

    Submitted 18 October, 2012; v1 submitted 6 April, 2010; originally announced April 2010.

    Comments: 40 pages, 17 tables, 14 figures. Paper has been restructured, new section "3.2. The importance of reciprocal linking in Twitter spam" was added, experiments with verified accounts in addition to spammers have been conducted to show performance with relevant users and not only regarding spam demotion

    Journal ref: Information Processing & Management Volume 49, Issue 6, November 2013, Pages 1250-1280

  10. arXiv:0911.3979  [pdf]

    cs.IR cs.HC

    Making the road by searching - A search engine based on Swarm Information Foraging

    Authors: Daniel Gayo-Avello, David J. Brenes

    Abstract: Search engines are nowadays one of the most important entry points for Internet users and a central tool to solve most of their information needs. Still, there exists a substantial number of users' searches which obtain unsatisfactory results. Needless to say, several lines of research aim to increase the relevancy of the results users retrieve. In this paper the authors frame this problem within…

    Submitted 20 November, 2009; originally announced November 2009.

  11. arXiv:cs/0411074  [pdf]

    cs.CL cs.IR

    Building Chinese Lexicons from Scratch by Unsupervised Short Document Self-Segmentation

    Authors: Daniel Gayo-Avello

    Abstract: Chinese text segmentation is a well-known and difficult problem. On one side, there is no simple notion of "word" in the Chinese language, making it really hard to implement rule-based systems to segment written texts, thus lexicons and statistical information are usually employed to achieve such a task. On the other side, any piece of Chinese text usually includes segments present neither in the lex…

    Submitted 19 November, 2004; originally announced November 2004.

    Comments: 9 pages, 3 figures, 2 tables