
Showing 1–30 of 30 results for author: Terzi, E

  1. arXiv:2503.23209  [pdf, other]

    cs.LG cs.DM cs.SI

    A QUBO Framework for Team Formation

    Authors: Karan Vombatkere, Evimaria Terzi, Theodoros Lappas

    Abstract: The team formation problem assumes a set of experts and a task, where each expert has a set of skills and the task requires some skills. The objective is to find a set of experts that maximizes coverage of the required skills while simultaneously minimizing the costs associated with the experts. Different definitions of cost have traditionally led to distinct problem formulations and algorithmic s…

    Submitted 29 March, 2025; originally announced March 2025.

  2. Forming Coordinated Teams that Balance Task Coverage and Expert Workload

    Authors: Karan Vombatkere, Evimaria Terzi, Aristides Gionis

    Abstract: We study a new formulation of the team-formation problem, where the goal is to form teams to work on a given set of tasks requiring different skills. Deviating from the classic problem setting where one is asking to cover all skills of each given task, we aim to cover as many skills as possible while also trying to minimize the maximum workload among the experts. We do this by combining penalizati…

    Submitted 7 March, 2025; originally announced March 2025.

    Journal ref: Data Mining and Knowledge Discovery (2025)

  3. arXiv:2410.22591  [pdf, other]

    cs.LG cs.AI stat.ME

    FGCE: Feasible Group Counterfactual Explanations for Auditing Fairness

    Authors: Christos Fragkathoulas, Vasiliki Papanikou, Evaggelia Pitoura, Evimaria Terzi

    Abstract: This paper introduces the first graph-based framework for generating group counterfactual explanations to audit model fairness, a crucial aspect of trustworthy machine learning. Counterfactual explanations are instrumental in understanding and mitigating unfairness by revealing how inputs should change to achieve a desired outcome. Our framework, named Feasible Group Counterfactual Explanations (F…

    Submitted 15 November, 2024; v1 submitted 29 October, 2024; originally announced October 2024.

  4. arXiv:2407.19262  [pdf, other]

    cs.CL cs.LG

    Understanding Memorisation in LLMs: Dynamics, Influencing Factors, and Implications

    Authors: Till Speicher, Mohammad Aflah Khan, Qinyuan Wu, Vedant Nanda, Soumi Das, Bishwamittra Ghosh, Krishna P. Gummadi, Evimaria Terzi

    Abstract: Understanding whether and to what extent large language models (LLMs) have memorised training data has important implications for the reliability of their output and the privacy of their training data. In order to cleanly measure and disentangle memorisation from other phenomena (e.g. in-context learning), we create an experimental framework that is based on repeatedly exposing LLMs to random stri…

    Submitted 27 July, 2024; originally announced July 2024.

  5. Towards Reliable Latent Knowledge Estimation in LLMs: Zero-Prompt Many-Shot Based Factual Knowledge Extraction

    Authors: Qinyuan Wu, Mohammad Aflah Khan, Soumi Das, Vedant Nanda, Bishwamittra Ghosh, Camila Kolling, Till Speicher, Laurent Bindschaedler, Krishna P. Gummadi, Evimaria Terzi

    Abstract: In this paper, we focus on the challenging task of reliably estimating factual knowledge that is embedded inside large language models (LLMs). To avoid reliability concerns with prior approaches, we propose to eliminate prompt engineering when probing LLMs for factual knowledge. Our approach, called Zero-Prompt Latent Knowledge Estimator (ZP-LKE), leverages the in-context learning ability of LLMs…

    Submitted 17 December, 2024; v1 submitted 19 April, 2024; originally announced April 2024.

  6. arXiv:2403.00859  [pdf, other]

    cs.AI cs.GT cs.SI

    Team Formation amidst Conflicts

    Authors: Iasonas Nikolaou, Evimaria Terzi

    Abstract: In this work, we formulate the problem of team formation amidst conflicts. The goal is to assign individuals to tasks, with given capacities, taking into account individuals' task preferences and the conflicts between them. Using dependent rounding schemes as our main toolbox, we provide efficient approximation algorithms. Our framework is extremely versatile and can model many different real-worl…

    Submitted 29 February, 2024; originally announced March 2024.

  7. arXiv:2402.10243  [pdf, other]

    physics.soc-ph cs.LG cs.SI

    Understanding team collapse via probabilistic graphical models

    Authors: Iasonas Nikolaou, Konstantinos Pelechrinis, Evimaria Terzi

    Abstract: In this work, we develop a graphical model to capture team dynamics. We analyze the model and show how to learn its parameters from data. Using our model we study the phenomenon of team collapse from a computational perspective. We use simulations and real-world experiments to find the main causes of team collapse. We also provide the principles of building resilient teams, i.e., teams that avoid…

    Submitted 14 February, 2024; originally announced February 2024.

  8. arXiv:2311.10005  [pdf, other]

    cs.DB

    Towards Flexibility and Robustness of LSM Trees

    Authors: Andy Huynh, Harshal A. Chaudhari, Evimaria Terzi, Manos Athanassoulis

    Abstract: Log-Structured Merge trees (LSM trees) are increasingly used as part of the storage engine behind several data systems, and are frequently deployed in the cloud. As the number of applications relying on LSM-based storage backends increases, the problem of performance tuning of LSM trees receives increasing attention. We consider both nominal tunings - where workload and execution environment are a…

    Submitted 16 November, 2023; originally announced November 2023.

    Comments: 25 pages, 19 figures, VLDB-J. arXiv admin note: substantial text overlap with arXiv:2110.13801

  9. arXiv:2309.04339  [pdf, other]

    cs.LG cs.AI math.OC

    Online Submodular Maximization via Online Convex Optimization

    Authors: Tareq Si Salem, Gözde Özcan, Iasonas Nikolaou, Evimaria Terzi, Stratis Ioannidis

    Abstract: We study monotone submodular maximization under general matroid constraints in the online setting. We prove that online optimization of a large class of submodular functions, namely, weighted threshold potential functions, reduces to online convex optimization (OCO). This is precisely because functions in this class admit a concave relaxation; as a result, OCO policies, coupled with an appropriate…

    Submitted 7 January, 2024; v1 submitted 8 September, 2023; originally announced September 2023.

    Comments: Accepted to AAAI Conference on Artificial Intelligence, 2024

  10. arXiv:2110.13801  [pdf, other]

    cs.DB

    Endure: A Robust Tuning Paradigm for LSM Trees Under Workload Uncertainty

    Authors: Andy Huynh, Harshal A. Chaudhari, Evimaria Terzi, Manos Athanassoulis

    Abstract: Log-Structured Merge trees (LSM trees) are increasingly used as the storage engines behind several data systems, frequently deployed in the cloud. Similar to other database architectures, LSM trees take into account information about the expected workload (e.g., reads vs. writes, point vs. range queries) to optimize their performance via tuning. Operating in shared infrastructure like the cloud, h…

    Submitted 2 November, 2021; v1 submitted 26 October, 2021; originally announced October 2021.

    Comments: 21 pages, 30 figures

  11. arXiv:2011.04428  [pdf, other]

    cs.AI

    Finding teams that balance expert load and task coverage

    Authors: Sofia Maria Nikolakaki, Mingxiang Cai, Evimaria Terzi

    Abstract: The rise of online labor markets (e.g., Freelancer, Guru and Upwork) has ignited a lot of research on team formation, where experts acquiring different skills form teams to complete tasks. The core idea in this line of work has been the strict requirement that the team of experts assigned to complete a given task should contain a superset of the skills required by the task. However, in many applic…

    Submitted 3 November, 2020; originally announced November 2020.

  12. arXiv:2011.01897   

    cs.SI

    A Multi-aspect Analysis of Gender Bias on Online Student Evaluations

    Authors: Sofia Maria Nikolakaki, Joseph Lai, Evimaria Terzi

    Abstract: Institutions widely use student evaluations to assess the faculty's teaching performance, but underlying trends and biases can influence their interpretation. Using data from Rate My Professors, we conduct the largest and most recent quantitative data analysis to study questions related to the evaluation criteria that students have when they review the performance of their male and female professo…

    Submitted 7 December, 2020; v1 submitted 3 November, 2020; originally announced November 2020.

    Comments: Withdrawal due to a technical issue that we do not know how and if we will be able to resolve

  13. arXiv:2006.10904  [pdf, other]

    cs.AI cs.CY

    Learn to Earn: Enabling Coordination within a Ride Hailing Fleet

    Authors: Harshal A. Chaudhari, John W. Byers, Evimaria Terzi

    Abstract: The problem of optimizing social welfare objectives on multi-sided ride-hailing platforms such as Uber, Lyft, etc., is challenging, due to misalignment of objectives between drivers, passengers, and the platform itself. An ideal solution aims to minimize the response time for each hyper-local passenger ride request, while simultaneously maintaining high demand satisfaction and supply utilization a…

    Submitted 16 July, 2020; v1 submitted 18 June, 2020; originally announced June 2020.

    Comments: 16 pages, 9 figures

    MSC Class: 68T05 ACM Class: I.2; K.4; J.6

  14. arXiv:2002.07782  [pdf, other]

    cs.DS

    An Efficient Framework for Balancing Submodularity and Cost

    Authors: Sofia Maria Nikolakaki, Alina Ene, Evimaria Terzi

    Abstract: In the classical selection problem, the input consists of a collection of elements and the goal is to pick a subset of elements from the collection such that some objective function $f$ is maximized. This problem has been studied extensively in the data-mining community and it has multiple applications including influence maximization in social networks, team formation and recommender systems. A p…

    Submitted 3 September, 2021; v1 submitted 18 February, 2020; originally announced February 2020.

    Comments: Extended version of KDD 2021 paper

  15. arXiv:2002.07618  [pdf, ps, other]

    math.OC cs.LG stat.ML

    Algorithms for Hiring and Outsourcing in the Online Labor Market

    Authors: Aris Anagnostopoulos, Carlos Castillo, Adriano Fazzone, Stefano Leonardi, Evimaria Terzi

    Abstract: Although freelancing work has grown substantially in recent years, in part facilitated by a number of online labor marketplaces (e.g., Guru, Freelancer, Amazon Mechanical Turk), traditional forms of "in-sourcing" work continue being the dominant form of employment. This means that, at least for the time being, freelancing and salaried employment will continue to co-exist. In this paper, we provid…

    Submitted 16 February, 2020; originally announced February 2020.

    Comments: Published at 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining 2018

  16. arXiv:1905.03037  [pdf, other]

    cs.SI cs.DS

    The Guided Team-Partitioning Problem: Definition, Complexity, and Algorithm

    Authors: Sanaz Bahargam, Theodoros Lappas, Evimaria Terzi

    Abstract: A long line of literature has focused on the problem of selecting a team of individuals from a large pool of candidates, such that certain constraints are respected, and a given objective function is maximized. Even though extant research has successfully considered diverse families of objective functions and constraints, one of the most common limitations is the focus on the single-team paradigm.…

    Submitted 30 April, 2019; originally announced May 2019.

  17. A Team-Formation Algorithm for Faultline Minimization

    Authors: Sanaz Bahargam, Behzad Golshan, Theodoros Lappas, Evimaria Terzi

    Abstract: In recent years, the proliferation of online resumes and the need to evaluate large populations of candidates for on-site and virtual teams have led to a growing interest in automated team-formation. Given a large pool of candidates, the general problem requires the selection of a team of experts to complete a given task. Surprisingly, while ongoing research has studied numerous variations with di…

    Submitted 12 November, 2018; originally announced November 2018.

  18. Markov Chain Monitoring

    Authors: Harshal A. Chaudhari, Michael Mathioudakis, Evimaria Terzi

    Abstract: In networking applications, one often wishes to obtain estimates about the number of objects at different parts of the network (e.g., the number of cars at an intersection of a road network or the number of packets expected to reach a node in a computer network) by monitoring the traffic in a small number of network nodes or edges. We formalize this task by defining the 'Markov Chain Monitoring' p…

    Submitted 23 January, 2018; originally announced January 2018.

    Comments: 13 pages, 10 figures, 1 table

  19. Matrix completion with queries

    Authors: Natali Ruchansky, Mark Crovella, Evimaria Terzi

    Abstract: In many applications, e.g., recommender systems and traffic monitoring, the data comes in the form of a matrix that is only partially observed and low rank. A fundamental data-analysis task for these datasets is matrix completion, where the goal is to accurately infer the entries missing from the matrix. Even when the data satisfies the low-rank assumption, classical matrix-completion methods may…

    Submitted 30 April, 2017; originally announced May 2017.

    Comments: Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining

  20. arXiv:1705.00375  [pdf, other]

    cs.LG stat.ML

    Targeted matrix completion

    Authors: Natali Ruchansky, Mark Crovella, Evimaria Terzi

    Abstract: Matrix completion is a problem that arises in many data-analysis settings where the input consists of a partially-observed matrix (e.g., recommender systems, traffic matrix analysis, etc.). Classical approaches to matrix completion assume that the input partially-observed matrix is low rank. The success of these methods depends on the number of observed entries and the rank of the matrix; the large…

    Submitted 30 April, 2017; originally announced May 2017.

    Comments: Proceedings of the 2017 SIAM International Conference on Data Mining (SDM)

  21. arXiv:1703.08762  [pdf, other]

    cs.AI

    Team Formation for Scheduling Educational Material in Massive Online Classes

    Authors: Sanaz Bahargam, Dóra Erdos, Azer Bestavros, Evimaria Terzi

    Abstract: Whether teaching in a classroom or a Massive Online Open Course, it is crucial to present the material in a way that benefits the audience as a whole. We identify two important tasks to solve towards this objective: (1) group students so that they can maximally benefit from peer interaction and (2) find an optimal schedule of the educational material for each group. Thus, in this paper, we solve the pr…

    Submitted 25 March, 2017; originally announced March 2017.

  22. arXiv:1701.07221  [pdf, other]

    cs.SI

    Community-aware network sparsification

    Authors: Aristides Gionis, Polina Rozenshtein, Nikolaj Tatti, Evimaria Terzi

    Abstract: Network sparsification aims to reduce the number of edges of a network while maintaining its structural properties; such properties include shortest paths, cuts, spectral measures, or network modularity. Sparsification has multiple applications, such as speeding up graph-mining algorithms, graph visualization, as well as identifying the important network edges. In this paper we consider a novel f…

    Submitted 25 January, 2017; originally announced January 2017.

  23. arXiv:1701.05352  [pdf, other]

    cs.SI physics.soc-ph

    Finding low-tension communities

    Authors: Esther Galbrun, Behzad Golshan, Aristides Gionis, Evimaria Terzi

    Abstract: Motivated by applications that arise in online social media and collaboration networks, there has been a lot of work on community-search and team-formation problems. In the former class of problems, the goal is to find a subgraph that satisfies a certain connectivity requirement and contains a given collection of seed nodes. In the latter class of problems, on the other hand, the goal is to find i…

    Submitted 19 January, 2017; originally announced January 2017.

    Comments: A short version of this paper appeared in the 2017 SIAM International Conference on Data Mining, SDM'17. In this extended version, we discuss the team-formation problem variant, beside the original community-search problem, and include additional experimental results

  24. arXiv:1612.05440  [pdf, other]

    cs.SI cs.DS physics.soc-ph

    Best Friends Forever (BFF): Finding Lasting Dense Subgraphs

    Authors: Konstantinos Semertzidis, Evaggelia Pitoura, Evimaria Terzi, Panayiotis Tsaparas

    Abstract: Graphs form a natural model for relationships and interactions between entities, for example, between people in social and cooperation networks, servers in computer networks, or tags and words in documents and tweets. But, which of these relationships or interactions are the most lasting ones? In this paper, we study the following problem: given a set of graph snapshots, which may correspond to th…

    Submitted 2 October, 2017; v1 submitted 16 December, 2016; originally announced December 2016.

    Comments: 15 pages, 10 figures, 8 tables

    Journal ref: Data Mining and Knowledge Discovery - Journal Track of ECML PKDD 2019

  25. arXiv:1610.05516  [pdf, other]

    cs.SI physics.soc-ph

    Active Network Alignment: A Matching-Based Approach

    Authors: Eric Malmi, Aristides Gionis, Evimaria Terzi

    Abstract: Network alignment is the problem of matching the nodes of two graphs, maximizing the similarity of the matched nodes and the edges between them. This problem is encountered in a wide array of applications, from biological networks to social networks to ontologies, where multiple networked data sources need to be integrated. Due to the difficulty of the task, an accurate alignment can rarely be found…

    Submitted 6 September, 2017; v1 submitted 18 October, 2016; originally announced October 2016.

    Comments: This is a pre-print of an article appearing at CIKM 2017

  26. arXiv:1406.4173  [pdf, other]

    cs.DS

    A Divide-and-Conquer Algorithm for Betweenness Centrality

    Authors: Dora Erdos, Vatche Ishakian, Azer Bestavros, Evimaria Terzi

    Abstract: The problem of efficiently computing the betweenness centrality of nodes has been researched extensively. To date, the best known exact and centralized algorithm for this task is an algorithm proposed in 2001 by Brandes. The contribution of our paper is Brandes++, an algorithm for exact efficient computation of betweenness centrality. The crux of our algorithm is that we create a sketch of the gra…

    Submitted 4 June, 2015; v1 submitted 16 June, 2014; originally announced June 2014.

    Comments: Shorter version of this paper appeared in SIAM Data Mining 2015

  27. arXiv:1301.7455  [pdf, other]

    cs.SI physics.soc-ph

    Opinion Maximization in Social Networks

    Authors: Aristides Gionis, Evimaria Terzi, Panayiotis Tsaparas

    Abstract: The process of opinion formation through synthesis and contrast of different viewpoints has been the subject of many studies in economics and social sciences. Today, this process manifests itself also in online social networks and social media. The key characteristic of successful promotion campaigns is that they take into consideration such opinion-formation dynamics in order to create an overall…

    Submitted 30 January, 2013; originally announced January 2013.

    Journal ref: SIAM International Conference on Data Mining (SDM), 2013

  28. arXiv:1201.6565  [pdf, other]

    cs.DB

    The Filter-Placement Problem and its Application to Minimizing Information Multiplicity

    Authors: Dóra Erdös, Vatche Ishakian, Andrei Lapets, Evimaria Terzi, Azer Bestavros

    Abstract: In many information networks, data items -- such as updates in social networks, news flowing through interconnected RSS feeds and blogs, measurements in sensor networks, route updates in ad-hoc networks -- propagate in an uncoordinated manner: nodes often relay information they receive to neighbors, independent of whether or not these neighbors received the same information from other sources. Thi…

    Submitted 31 January, 2012; originally announced January 2012.

    Comments: VLDB2012

    Journal ref: Proceedings of the VLDB Endowment (PVLDB), Vol. 5, No. 5, pp. 418-429 (2012)

  29. arXiv:0810.5578  [pdf, ps, other]

    cs.DB cs.DS

    Anonymizing Graphs

    Authors: Tomas Feder, Shubha U. Nabar, Evimaria Terzi

    Abstract: Motivated by recently discovered privacy attacks on social networks, we study the problem of anonymizing the underlying graph of interactions in a social network. We call a graph (k,l)-anonymous if for every node in the graph there exist at least k other nodes that share at least l of its neighbors. We consider two combinatorial problems arising from this notion of anonymity in graphs. More spec…

    Submitted 30 October, 2008; originally announced October 2008.

    Comments: 15 pages, 5 figures

  30. arXiv:0809.3027  [pdf, ps, other]

    cs.AI cs.DB physics.soc-ph

    Finding links and initiators: a graph reconstruction problem

    Authors: Heikki Mannila, Evimaria Terzi

    Abstract: Consider a 0-1 observation matrix M, where rows correspond to entities and columns correspond to signals; a value of 1 (or 0) in cell (i,j) of M indicates that signal j has been observed (or not observed) in entity i. Given such a matrix we study the problem of inferring the underlying directed links between entities (rows) and finding which entries in the matrix are initiators. We formally de…

    Submitted 17 September, 2008; originally announced September 2008.

    ACM Class: H.2.8