
Manifesto from Dagstuhl Perspectives Workshop 24352 -- Conversational Agents: A Framework for Evaluation (CAFE)

Authors: Christine Bauer, Li Chen, Nicola Ferro, Norbert Fuhr, Avishek Anand, Timo Breuer, Guglielmo Faggioli, Ophir Frieder, Hideo Joho, Jussi Karlgren, Johannes Kiesel, Bart P. Knijnenburg, Aldo Lipani, Lien Michiels, Andrea Papenmeier, Maria Soledad Pera, Mark Sanderson, Scott Sanner, Benno Stein, Johanne R. Trippas, Karin Verspoor, Martijn C Willemsen

Abstract: During the workshop, we deeply discussed what CONversational Information ACcess (CONIAC) is and its unique features, proposing a world model abstracting it, and defined the Conversational Agents Framework for Evaluation (CAFE) for the evaluation of CONIAC systems, consisting of six major components: 1) goals of the system's stakeholders, 2) user tasks to be studied in the evaluation, 3) aspects of the users carrying out the tasks, 4) evaluation criteria to be considered, 5) evaluation methodology to be applied, and 6) measures for the quantitative criteria chosen.

Submitted 8 June, 2025; originally announced June 2025.

Comments: 43 pages; 10 figures; Dagstuhl manifesto

arXiv:2501.05170 [pdf, other]

De-centering the (Traditional) User: Multistakeholder Evaluation of Recommender Systems

Authors: Robin Burke, Gediminas Adomavicius, Toine Bogers, Tommaso Di Noia, Dominik Kowald, Julia Neidhardt, Özlem Özgöbek, Maria Soledad Pera, Nava Tintarev, Jürgen Ziegler

Abstract: Multistakeholder recommender systems are those that account for the impacts and preferences of multiple groups of individuals, not just the end users receiving recommendations. Due to their complexity, these systems cannot be evaluated strictly by the overall utility of a single stakeholder, as is often the case of more mainstream recommender system applications. In this article, we focus our discussion on the challenges of multistakeholder evaluation of recommender systems. We bring attention to the different aspects involved -- from the range of stakeholders involved (including but not limited to providers and consumers) to the values and specific goals of each relevant stakeholder. We discuss how to move from theoretical principles to practical implementation, providing specific use case examples. Finally, we outline open research directions for the RecSys community to explore. We aim to provide guidance to researchers and practitioners about incorporating these complex and domain-dependent issues of evaluation in the course of designing, developing, and researching applications with multistakeholder aspects.

Submitted 22 April, 2025; v1 submitted 9 January, 2025; originally announced January 2025.

Comments: Preprint in revision at Elsevier, "Re-centering the User in Recommender System Research" special issue of the International Journal of Human-Computer Studies (IJHCS)

arXiv:2405.02050 [pdf, other]

Ah, that's the great puzzle: On the Quest of a Holistic Understanding of the Harms of Recommender Systems on Children

Authors: Robin Ungruh, Maria Soledad Pera

Abstract: Children come across various media items online, many of which are selected by recommender systems (RS) primarily designed for adults. The specific nature of the content selected by RS to display on online platforms used by children - although not necessarily targeting them as a user base - remains largely unknown. This raises questions about whether such content is appropriate given children's vulnerable stages of development and the potential risks to their well-being. In this position paper, we reflect on the relationship between RS and children, emphasizing the possible adverse effects of the content this user group might be exposed to online. As a step towards fostering safer interactions for children in online environments, we advocate for researchers, practitioners, and policymakers to undertake a more comprehensive examination of the impact of RS on children - one focused on harms. This would result in a more holistic understanding that could inform the design and deployment of strategies that would better suit children's needs and preferences while actively mitigating the potential harm posed by RS; acknowledging that identifying and addressing these harms is complex and multifaceted.

Submitted 3 May, 2024; originally announced May 2024.

Comments: 7 pages, 2 figures, DCDW 2024

Journal ref: Designing for Children's Digital Well-being: A Research, Policy and Practice Agenda (DCDW '24), co-located with ACM IDC 2024

arXiv:2402.14990 [pdf, other]

Binary origin of blue straggler stars in Galactic star clusters

Authors: M. J. Rain, M. S. Pera, G. Perren, O. Benvenuto, J. Panei, A. de Vito, G. Carraro, S. Villanova

Abstract: Building on the recent release of a new \emph{Gaia}-based blue straggler star catalog in Galactic open star clusters (OCs), we explored the properties of these stars in a cluster sample spanning a wide range in fundamental parameters. We employed \emph{Gaia} EDR3 to assess the membership of any individual blue or yellow straggler to their parent cluster. We then made use of the \texttt{ASteCA} code to estimate the fundamental parameters of the selected clusters, in particular, the binary fraction. With all this at hand, we critically revisited the relation of the blue straggler population and the latter. For the first time, we found a correlation between the number of blue stragglers and the host cluster binary fraction and binaries. This supports the hypothesis that binary evolution is the most viable scenario of straggler formation in Galactic star clusters. The distribution of blue stragglers in the Gaia color-magnitude diagram was then compared with a suite of composite evolutionary sequences derived from binary evolutionary models that were run by exploring a range of binary parameters: age, mass ratio, period, and so forth. The excellent comparison between the bulk distribution of blue stragglers and the composite evolutionary sequences loci further supports the binary origin of most stragglers in OCs and paves the way for a detailed study of individual blue stragglers.

Submitted 22 February, 2024; originally announced February 2024.

Comments: 15 pages, in press in Astronomy and Astrophysics

arXiv:2308.15265 [pdf, other]

A Multi-Perspective Learning to Rank Approach to Support Children's Information Seeking in the Classroom

Authors: Garrett Allen, Katherine Landau Wright, Jerry Alan Fails, Casey Kennington, Maria Soledad Pera

Abstract: We introduce a novel re-ranking model that aims to augment the functionality of standard search engines to support classroom search activities for children (ages 6 to 11). This model extends the known listwise learning-to-rank framework by balancing risk and reward. Doing so enables the model to prioritize Web resources of high educational alignment, appropriateness, and adequate readability by analyzing the URLs, snippets, and page titles of Web resources retrieved by a given mainstream search engine. Experimental results, including an ablation study and comparisons with existing baselines, showcase the correctness of the proposed model. The outcomes of this work demonstrate the value of considering multiple perspectives inherent to the classroom setting, e.g., educational alignment, readability, and objectionability, when applied to the design of algorithms that can better support children's information discovery.

Submitted 29 August, 2023; originally announced August 2023.

Comments: Extended version of the manuscript to appear in proceedings of the 22nd IEEE/WIC International Conference on Web Intelligence and Intelligent Agent Technology

arXiv:2308.04546 [pdf, other]

The Unified Cluster Catalogue: towards a comprehensive and homogeneous database of stellar clusters

Authors: G. I. Perren, M. S. Pera, H. D. Navone, R. A. Vázquez

Abstract: We introduce the Unified Cluster Catalogue, the largest catalogue of stellar clusters currently listing nearly 14000 objects. In this initial release it exclusively contains Milky Way open clusters, with plans to include other objects in future updates. Each cluster is processed using a novel probability membership algorithm, which incorporates the coordinates, parallax, proper motions, and their associated uncertainties for each star into the probability assignment process. We employ Gaia DR3 data up to a G magnitude of 20, resulting in the identification of over a million probable members. The catalogue is accompanied by a publicly accessible website designed to simplify the search and data exploration of stellar clusters. The website can be accessed at https://ucc.ar.

Submitted 11 September, 2023; v1 submitted 8 August, 2023; originally announced August 2023.

arXiv:2302.12043 [pdf, ps, other]

Conversational Agents and Children: Let Children Learn

Authors: Casey Kennington, Jerry Alan Fails, Katherine Landau Wright, Maria Soledad Pera

Abstract: Using online information discovery as a case study, in this position paper we discuss the need to design, develop, and deploy (conversational) agents that can -- non-intrusively -- guide children in their quest for online resources rather than simply finding resources for them. We argue that agents should "let children learn" and should be built to take on a teacher-facilitator function, allowing children to develop their technical and critical thinking abilities as they interact with varied technology in a broad range of use cases.

Submitted 23 February, 2023; originally announced February 2023.

Comments: 6 pages

arXiv:2209.02662 [pdf, other]

Matching Consumer Fairness Objectives & Strategies for RecSys

Authors: Michael D. Ekstrand, Maria Soledad Pera

Abstract: The last several years have brought a growing body of work on ensuring that recommender systems are in some sense consumer-fair -- that is, they provide comparable quality of service, accuracy of representation, and other effects to their users. However, there are many different strategies to make systems more fair and a range of intervention points. In this position paper, we build on ongoing work to highlight the need for researchers and practitioners to attend to the details of their application, users, and the fairness objective they aim to achieve, and adopt interventions that are appropriate to the situation. We argue that consumer fairness should be a creative endeavor flowing from the particularities of the specific problem to be solved.

Submitted 7 September, 2022; v1 submitted 6 September, 2022; originally announced September 2022.

Comments: Paper presented at FAccTRec 2022

arXiv:2209.02338 [pdf, ps, other]

Let's Learn from Children: Scaffolding to Enable Search as Learning in the Educational Environment

Authors: Monica Landoni, Maria Soledad Pera, Emiliana Murgia, Theo Huibers

Abstract: In this manuscript, we argue for the need to further look at search as learning (SAL) with children as the primary stakeholders. Inspired by how children learn and considering the classroom (regardless of the teaching modality) as a natural educational ecosystem, we posit that scaffolding is the tie that can simultaneously allow for learning to search while searching for learning. The main contribution of this work is a list of open challenges focused on the primary school classroom for the IR community to consider when setting up to explore and make progress on SAL research with and for children and beyond.

Submitted 6 September, 2022; originally announced September 2022.

Comments: Presented at "3rd International Workshop on Investigating Learning During Web Search" (IWILDS 2022) https://iwilds2022.wordpress.com/

arXiv:2204.02153 [pdf, other]

doi 10.1051/0004-6361/202243288

An analysis of the most distant catalogued open clusters -- Re-assessing fundamental parameters with Gaia EDR3 and $\texttt{ASteCA}$

Authors: G. I. Perren, M. S. Pera, H. D. Navone, R. A. Vázquez

Abstract: Several studies have been presented in the last few years applying some kind of automatic processing of data to estimate the fundamental parameters of open clusters. These parameters are later on employed in larger scale analyses, for example the structure of the Galaxy's spiral arms. The distance is one of the more straightforward parameters to estimate, yet enormous differences can still be found among published data. This is particularly true for open clusters located more than a few kpc away. We cross-matched several published catalogues and selected the twenty-five most distant open clusters ($>$9000 pc). We then performed a detailed analysis of their fundamental parameters, with emphasis on their distances, to determine the agreement between catalogues and our estimates. Photometric and astrometric data from the Gaia EDR3 survey was employed. The data was processed with our own membership analysis code (pyUPMASK), and our package for automatic estimation of fundamental cluster parameters ($\texttt{ASteCA}$). We find differences in the estimated distances of up to several kpc between our results and those catalogued, even for the catalogues that show the best matches with $\texttt{ASteCA}$ values. Large differences are also found for the age estimates. As a by-product of the analysis we find that vd Bergh-Hagen 176 could be the open cluster with the largest heliocentric distance catalogued to date. Caution is thus strongly recommended when using catalogued parameters of open clusters to infer large-scale properties of the Galaxy, particularly for those located more than a few kpc away.

Submitted 5 April, 2022; originally announced April 2022.

Comments: Accepted for publication in A&A

Journal ref: A&A 663, A131 (2022)

arXiv:2112.00076 [pdf, ps, other]

Using Conversational Artificial Intelligence to Support Children's Search in the Classroom

Authors: Garrett Allen, Jie Yang, Maria Soledad Pera, Ujwal Gadiraju

Abstract: We present pathways of investigation regarding conversational user interfaces (CUIs) for children in the classroom. We highlight anticipated challenges to be addressed in order to advance knowledge on CUIs for children. Further, we discuss preliminary ideas on strategies for evaluation.

Submitted 30 November, 2021; originally announced December 2021.

Comments: Presented at CUI@CSCW 2021 -- https://www.conversationaluserinterfaces.org/workshops/CSCW2021/pdfs/2-Allen.pdf

ACM Class: H.5.2

arXiv:2109.06573 [pdf, other]

The Impact of User Demographics and Task Types on Cross-App Mobile Search

Authors: Mohammad Aliannejadi, Fabio Crestani, Theo Huibers, Monica Landoni, Emiliana Murgia, Maria Soledad Pera

Abstract: Recent developments in the mobile app industry have resulted in various types of mobile apps, each targeting a different need and a specific audience. Consequently, users access distinct apps to complete their information need tasks. This leads to the use of various apps not only separately, but also collaboratively in the same session to achieve a single goal. Recent work has argued the need for a unified mobile search system that would act as metasearch on users' mobile devices. The system would identify the target apps for the user's query, submit the query to the apps, and present the results to the user in a unified way. In this work, we aim to deepen our understanding of user behavior while accessing information on their mobile phones by conducting an extensive analysis of various aspects related to the search process. In particular, we study the effect of task type and user demographics on their behavior in interacting with mobile apps. Our findings reveal trends and patterns that can inform the design of a more effective mobile information access environment.

Submitted 14 September, 2021; originally announced September 2021.

Comments: FQAS Invited Paper

arXiv:2106.07813 [pdf, other]

To Infinity and Beyond! Accessibility is the Future for Kids' Search Engines

Authors: Ashlee Milton, Garrett Allen, Maria Soledad Pera

Abstract: Research in the area of search engines for children remains in its infancy. Seminal works have studied how children use mainstream search engines, as well as how to design and evaluate custom search engines explicitly for children. These works, however, tend to take a one-size-fits-all view, treating children as a unit. Nevertheless, even at the same age, children are known to possess and exhibit different capabilities. These differences affect how children access and use search engines. To better serve children, in this vision paper, we spotlight accessibility and discuss why current research on children and search engines does not, but should, focus on this significant matter.

Submitted 14 June, 2021; originally announced June 2021.

Comments: In the proceedings of IR for Children 2000-2020: Where Are We Now? (https://www.fab4.science/ir4c/) -- Workshop co-located with the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval

arXiv:2105.03708 [pdf, ps, other]

All Together Now: Teachers as Research Partners in the Design of Search Technology for the Classroom

Authors: Emiliana Murgia, Monica Landoni, Theo Huibers, Maria Soledad Pera

Abstract: In the classroom environment, search tools are the means for students to access Web resources. The perspectives of students, researchers, and industry practitioners lead the ongoing research debate in this area. In this article, we argue in favor of incorporating a new voice into this debate: teachers. We showcase the value of involving teachers in all aspects related to the design of search tools for the classroom; from the beginning till the end. Driven by our research experience designing, developing, and evaluating new tools to support children's information discovery in the classroom, we share insights on the role of the experts-in-the-loop, i.e., teachers who provide the connection between search tools and students. And yes, in our case, always involving a teacher as a research partner.

Submitted 8 May, 2021; originally announced May 2021.

Comments: In KidRec '21: 5th International and Interdisciplinary Perspectives on Children & Recommender and Information Retrieval Systems (KidRec) Search and Recommendation Technology through the Lens of a Teacher- Co-located with ACM IDC 2021; June 26, 2021; Online Event

arXiv:2105.03456 [pdf, other]

CASTing a Net: Supporting Teachers with Search Technology

Authors: Garrett Allen, Katherine Landau Wright, Jerry Alan Fails, Casey Kennington, Maria Soledad Pera

Abstract: Past and current research has typically focused on ensuring that search technology for the classroom serves children. In this paper, we argue for the need to broaden the research focus to include teachers and how search technology can aid them. In particular, we share how furnishing a behind-the-scenes portal for teachers can empower them by providing a window into the spelling, writing, and concept connection skills of their students.

Submitted 7 May, 2021; originally announced May 2021.

Comments: KidRec '21: 5th International and Interdisciplinary Perspectives on Children & Recommender and Information Retrieval Systems (KidRec) Search and Recommendation Technology through the Lens of a Teacher- Co-located with ACM IDC 2021

arXiv:2101.01660 [pdf, other]

doi 10.1051/0004-6361/202040252

pyUPMASK: an improved unsupervised clustering algorithm

Authors: M. S. Pera, G. I. Perren, A. Moitinho, H. D. Navone, R. A. Vázquez

Abstract: Aims. We present pyUPMASK, an unsupervised clustering method for stellar clusters that builds upon the original UPMASK package. Its general approach makes it plausible to be applied to analyses that deal with binary classes of any kind, as long as the fundamental hypotheses are met. The code is written entirely in Python and is made available through a public repository. Methods. The core of the algorithm follows the method developed in UPMASK but introduces several key enhancements. These enhancements not only make pyUPMASK more general, they also improve its performance considerably. Results. We thoroughly tested the performance of pyUPMASK on 600 synthetic clusters, affected by varying degrees of contamination by field stars. To assess the performance we employed six different statistical metrics that measure the accuracy of probabilistic classification. Conclusions. Our results show that pyUPMASK performs better than UPMASK for every statistical performance metric, while still managing to be many times faster.

Submitted 8 April, 2021; v1 submitted 5 January, 2021; originally announced January 2021.

Journal ref: A&A 650, A109 (2021)

arXiv:2012.07475 [pdf, other]

A Canine Census to Influence Public Policy

Authors: Matias Apa, Maria Cecilia Faini, Mohammad Aliannejadi, Maria Soledad Pera

Abstract: The potential threat that domestic animals pose to the health of human populations tends to be overlooked. We posit that positive steps forward can be made in this area, via suitable state-wide public policy. In this paper, we describe the data collection process that took place in Casilda (a city in Argentina), in the context of a canine census. We outline preliminary findings emerging from the data, based on a number of perspectives, along with implications of these findings in terms of informing public policy.

Submitted 14 December, 2020; originally announced December 2020.

Comments: Appeared in epiDAMIK Workshop in SIGKDD

arXiv:2005.12992 [pdf, ps, other]

Evaluating Information Retrieval Systems for Kids

Authors: Ashlee Milton, Maria Soledad Pera

Abstract: Evaluation of information retrieval systems (IRS) is a prominent topic among information retrieval researchers--mainly directed at a general population. Children require unique IRS and by extension different ways to evaluate these systems, but, despite being a large population that uses IRS, they have largely been ignored on the evaluation front. In this position paper, we explore many perspectives that must be considered when evaluating IRS; we specifically discuss problems faced by researchers who work with children's IRS, including lack of evaluation frameworks, limitations of data, and lack of user judgment understanding.

Submitted 21 May, 2020; originally announced May 2020.

Comments: Accepted at the 4th International and Interdisciplinary Perspectives on Children & Recommender and Information Retrieval Systems (KidRec '20), co-located with the 19th ACM International Conference on Interaction Design and Children (IDC '20), https://kidrec.github.io/

arXiv:2005.05507 [pdf, other]

A Framework for Hierarchical Multilingual Machine Translation

Authors: Ion Madrazo Azpiazu, Maria Soledad Pera

Abstract: Multilingual machine translation has recently been in vogue given its potential for improving machine translation performance for low-resource languages via transfer learning. Empirical examinations demonstrating the success of existing multilingual machine translation strategies, however, are limited to experiments in specific language groups. In this paper, we present a hierarchical framework for building multilingual machine translation strategies that takes advantage of a typological language family tree for enabling transfer among similar languages while avoiding the negative effects that result from incorporating languages that are too different from each other. Exhaustive experimentation on a dataset with 41 languages demonstrates the validity of the proposed framework, especially when it comes to improving the performance of low-resource languages via the use of typologically related families for which richer sets of resources are available.

Submitted 11 May, 2020; originally announced May 2020.

arXiv:2003.12138 [pdf, other]

doi 10.1051/0004-6361/201937141

Sixteen overlooked open clusters in the fourth Galactic quadrant. A combined analysis of UBVI photometry and Gaia DR2 with ASteCA

Authors: G. I. Perren, E. E. Giorgi, A. Moitinho, G. Carraro, M. S. Pera, R. A. Vázquez

Abstract: Aims: This paper has two main objectives: (1) To determine the intrinsic properties of 16 faint and mostly unstudied open clusters in the poorly known sector of the Galaxy at 270$^\circ-$300$^\circ$, to probe the Milky Way structure in future investigations. (2) To address previously reported systematics in Gaia DR2 parallaxes by comparing the cluster distances derived from photometry with those derived from parallaxes. Methods: Deep UBVI photometry of 16 open clusters was carried out. Observations were reduced and analyzed in an automatic way using the ASteCA package to get individual distances, reddening, masses, ages and metallicities. Photometric distances were compared to those obtained from a Bayesian analysis of Gaia DR2 parallaxes. Results: Ten out of the 16 clusters are true or highly probable open clusters. Two of them are quite young and follow the trace of the Carina Arm and the already detected warp. The rest of the clusters are placed in the interarm zone between the Perseus and Carina Arms as expected for older objects. We found that the cluster van den Bergh-Hagen 85 is 7.5$\times$10$^9$ yrs old, making it one of the oldest open clusters detected in our Galaxy so far. The relationship of these ten clusters with the Galaxy structure in the solar neighborhood is discussed. The comparison of distances from photometry and parallax data, in turn, reveals a variable level of disagreement. Conclusions: Various zero point corrections for Gaia DR2 parallax data recently reported were considered for a comparison between photometric and parallax based distances. The results tend to improve with some of these corrections. Photometric distance analysis suggests an average correction of $\sim$+0.026 mas (to be added to the parallaxes). The correction may have a more intricate distance dependency, but addressing that level of detail will require a larger cluster sample.

Submitted 26 March, 2020; originally announced March 2020.

Comments: 70 figures. Accepted for publication in A&A

Journal ref: A&A 637, A95 (2020)

arXiv:1808.08274 [pdf, other]

Can we leverage rating patterns from traditional users to enhance recommendations for children?

Authors: Ion Madrazo Azpiazu, Michael Green, Oghenemaro Anuyah, Maria Soledad Pera

Abstract: The performance of recommender algorithms is often associated with the availability of sufficient historical rating data. Unfortunately, when it comes to children, this data is seldom available. In this paper, we report on an initial analysis conducted to examine the degree to which data about traditional users, i.e., adults, can be leveraged to enhance the recommendation process for children.

Submitted 24 August, 2018; originally announced August 2018.

Comments: ACM RecSys 2018

arXiv:1808.07025 [pdf, other]

Who is Really Affected by Fraudulent Reviews? An analysis of shilling attacks on recommender systems in real-world scenarios

Authors: Anu Shrestha, Francesca Spezzano, Maria Soledad Pera

Abstract: We present the results of an initial analysis conducted in a real-life setting to quantify the effect of shilling attacks on recommender systems. We focus on both algorithm performance and the types of users who are most affected by these attacks.

Submitted 21 August, 2018; originally announced August 2018.

Comments: Proceedings of the Late-Breaking Results track part of the Twelfth ACM Conference on Recommender Systems (RecSys'18)

arXiv:1111.7224 [pdf, other]

Generating Exact- and Ranked Partially-Matched Answers to Questions in Advertisements

Authors: Rani Qumsiyeh, Maria S. Pera, Yiu-Kai Ng

Abstract: Taking advantage of the Web, many advertisement (ads for short) websites, which aspire to increase clients' transactions and thus profits, offer searching tools which allow users to (i) post keyword queries to capture their information needs or (ii) invoke form-based interfaces to create queries by selecting search options, such as a price range, filled-in entries, check boxes, or drop-down menus. These search mechanisms, however, are inadequate, since they cannot be used to specify a natural-language query with rich syntactic and semantic content, which can only be handled by a question answering (QA) system. Furthermore, existing ads websites are incapable of evaluating arbitrary Boolean queries or retrieving partially-matched answers that might be of interest to the user whenever a user's search yields only a few or no results at all. To solve these problems, we present a QA system for ads, called CQAds, which (i) allows users to post a natural-language question Q for retrieving relevant ads, if they exist, (ii) identifies ads as answers that partially match the requested information expressed in Q, if insufficient or no answers to Q can be retrieved, which are ordered using a similarity-ranking approach, and (iii) analyzes incomplete or ambiguous questions to perform the "best guess" in retrieving answers that "best match" the selection criteria specified in Q.
CQAds is also equipped with a Boolean model to evaluate Boolean operators that are either explicitly or implicitly specified in Q, i.e., with or without Boolean operators specified by the users, respectively. CQAds is easy to use, scalable to all ads domains, and more powerful than search tools provided by existing ads websites, since its query-processing strategy retrieves relevant ads of higher quality and quantity. We have verified the accuracy of CQAds in retrieving ads on eight ads domains and compared it...[truncated].

Submitted 30 November, 2011; originally announced November 2011.

Comments: VLDB2012

Journal ref: Proceedings of the VLDB Endowment (PVLDB), Vol. 5, No. 3, pp. 217-228 (2011)

Showing 1–23 of 23 results for author: Pera, M S