Showing 1–50 of 82 results for author: Gionis, A

  1. arXiv:2506.06456

    cs.DS

    Sample and Expand: Discovering Low-rank Submatrices With Quality Guarantees

    Authors: Martino Ciaperoni, Aristides Gionis, Heikki Mannila

    Abstract: The problem of approximating a matrix by a low-rank one has been extensively studied. This problem assumes, however, that the whole matrix has a low-rank structure. This assumption is often false for real-world matrices. We consider the problem of discovering submatrices from the given matrix with bounded deviations from their low-rank approximations. We introduce an effective two-phase method for…

    Submitted 6 June, 2025; originally announced June 2025.

  2. Forming Coordinated Teams that Balance Task Coverage and Expert Workload

    Authors: Karan Vombatkere, Evimaria Terzi, Aristides Gionis

    Abstract: We study a new formulation of the team-formation problem, where the goal is to form teams to work on a given set of tasks requiring different skills. Deviating from the classic problem setting where one is asking to cover all skills of each given task, we aim to cover as many skills as possible while also trying to minimize the maximum workload among the experts. We do this by combining penalizati…

    Submitted 7 March, 2025; originally announced March 2025.

    Journal ref: Data Mining and Knowledge Discovery (2025)

  3. arXiv:2502.14532

    cs.DS

    OptiRefine: Densest subgraphs and maximum cuts with $k$ refinements

    Authors: Sijing Tu, Aleksa Stankovic, Stefan Neumann, Aristides Gionis

    Abstract: Data-analysis tasks often involve an iterative process, which requires refining previous solutions. For instance, when analyzing dynamic social networks, we may be interested in monitoring the evolution of a community that was identified at an earlier snapshot. This task requires finding a community in the current snapshot of data that is "close" to the earlier-discovered community of interest.…

    Submitted 25 February, 2025; v1 submitted 20 February, 2025; originally announced February 2025.

    Comments: submitted under review. Add acknowledgement

  4. arXiv:2502.02115

    cs.DS

    Efficient and Practical Approximation Algorithms for Advertising in Content Feeds

    Authors: Guangyi Zhang, Ilie Sarpe, Aristides Gionis

    Abstract: Content feeds provided by platforms such as X (formerly Twitter) and TikTok are consumed by users on a daily basis. In this paper, we revisit the native advertising problem in content feeds, initiated by Ieong et al. Given a sequence of organic items (e.g., videos or posts) relevant to a user's interests or to an information search, the goal is to place ads within the organic content so as to maxi…

    Submitted 4 February, 2025; originally announced February 2025.

    Comments: Accepted manuscript to appear in TheWebConf 2025

  5. arXiv:2412.10944

    cs.DS

    Sequential Diversification with Provable Guarantees

    Authors: Honglian Wang, Sijing Tu, Aristides Gionis

    Abstract: Diversification is a useful tool for exploring large collections of information items. It has been used to reduce redundancy and cover multiple perspectives in information-search settings. Diversification finds applications in many different domains, including presenting search results of information-retrieval systems and selecting suggestions for recommender systems. Interestingly, existing mea…

    Submitted 17 February, 2025; v1 submitted 14 December, 2024; originally announced December 2024.

    Comments: WSDM 2025

    ACM Class: G.1.2

  6. arXiv:2410.12913

    cs.LG cs.AI cs.CY cs.DM

    Fair Clustering for Data Summarization: Improved Approximation Algorithms and Complexity Insights

    Authors: Ameet Gadekar, Aristides Gionis, Suhas Thejaswi

    Abstract: Data summarization tasks are often modeled as $k$-clustering problems, where the goal is to choose $k$ data points, called cluster centers, that best represent the dataset by minimizing a clustering objective. A popular objective is to minimize the maximum distance between any data point and its nearest center, which is formalized as the $k$-center problem. While in some applications all data poin…

    Submitted 16 October, 2024; originally announced October 2024.

  7. Polaris: Sampling from the Multigraph Configuration Model with Prescribed Color Assortativity

    Authors: Giulia Preti, Matteo Riondato, Aristides Gionis, Gianmarco De Francisci Morales

    Abstract: We introduce Polaris, a network null model for colored multi-graphs that preserves the Joint Color Matrix. Polaris is specifically designed for studying network polarization, where vertices belong to a side in a debate or a partisan group, represented by a vertex color, and relations have different strengths, represented by an integer-valued edge multiplicity. The key feature of Polaris is preserv…

    Submitted 18 December, 2024; v1 submitted 2 September, 2024; originally announced September 2024.

    Comments: Accepted for publication at WSDM2025

  8. Relevance meets Diversity: A User-Centric Framework for Knowledge Exploration through Recommendations

    Authors: Erica Coppolillo, Giuseppe Manco, Aristides Gionis

    Abstract: Providing recommendations that are both relevant and diverse is a key consideration of modern recommender systems. Optimizing both of these measures presents a fundamental trade-off, as higher diversity typically comes at the cost of relevance, resulting in lower user engagement. Existing recommendation algorithms try to resolve this trade-off by combining the two measures, relevance and diversity…

    Submitted 7 August, 2024; originally announced August 2024.

  9. Scalable Temporal Motif Densest Subnetwork Discovery

    Authors: Ilie Sarpe, Fabio Vandin, Aristides Gionis

    Abstract: Finding dense subnetworks, with density based on edges or more complex structures, such as subgraphs or $k$-cliques, is a fundamental algorithmic problem with many applications. While the problem has been studied extensively in static networks, much remains to be explored for temporal networks. In this work we introduce the novel problem of identifying the temporal motif densest subnetwork, i.e.…

    Submitted 25 June, 2024; v1 submitted 15 June, 2024; originally announced June 2024.

    Comments: Extended version of the accepted KDD'24 paper

  10. arXiv:2406.03059

    cs.LG

    Efficient Exploration of the Rashomon Set of Rule Set Models

    Authors: Martino Ciaperoni, Han Xiao, Aristides Gionis

    Abstract: Today, as increasingly complex predictive models are developed, simple rule sets remain a crucial tool to obtain interpretable predictions and drive high-stakes decision making. However, a single rule set provides a partial representation of a learning task. An emerging paradigm in interpretable machine learning aims at exploring the Rashomon set of all models exhibiting near-optimal performance.…

    Submitted 5 June, 2024; originally announced June 2024.

  11. arXiv:2402.10053

    cs.SI

    Modeling the Impact of Timeline Algorithms on Opinion Dynamics Using Low-rank Updates

    Authors: Tianyi Zhou, Stefan Neumann, Kiran Garimella, Aristides Gionis

    Abstract: Timeline algorithms are key parts of online social networks, but during recent years they have been blamed for increasing polarization and disagreement in our society. Opinion-dynamics models have been used to study a variety of phenomena in online social networks, but an open question remains on how these models can be augmented to take into account the fine-grained impact of user-level timeline…

    Submitted 15 February, 2024; originally announced February 2024.

    Comments: To appear at The WebConf 2024

  12. arXiv:2402.09124

    cs.SI

    Finding Densest Subgraphs with Edge-Color Constraints

    Authors: Lutz Oettershagen, Honglian Wang, Aristides Gionis

    Abstract: We consider a variant of the densest subgraph problem in networks with single or multiple edge attributes. For example, in a social network, the edge attributes may describe the type of relationship between users, such as friends, family, or acquaintances, or different types of communication. For conceptual simplicity, we view the attributes as edge colors. The new problem we address is to find a…

    Submitted 14 February, 2024; originally announced February 2024.

  13. arXiv:2401.05502

    cs.DS cs.AI cs.CC cs.LG

    Diversity-aware clustering: Computational Complexity and Approximation Algorithms

    Authors: Suhas Thejaswi, Ameet Gadekar, Bruno Ordozgoiti, Aristides Gionis

    Abstract: In this work, we study diversity-aware clustering problems where the data points are associated with multiple attributes resulting in intersecting groups. A clustering solution needs to ensure that the number of chosen cluster centers from each group is within the range defined by a lower and upper bound threshold for each group, while simultaneously minimizing the clustering objective, whi…

    Submitted 20 May, 2025; v1 submitted 10 January, 2024; originally announced January 2024.

    Comments: Algorithmic Fairness, Fair Clustering, Diversity-aware Clustering, Intersectionality, Subgroup fairness

  14. arXiv:2308.14486

    cs.SI cs.CY cs.LG

    Rebalancing Social Feed to Minimize Polarization and Disagreement

    Authors: Federico Cinus, Aristides Gionis, Francesco Bonchi

    Abstract: Social media have great potential for enabling public discourse on important societal issues. However, adverse effects, such as polarization and echo chambers, greatly impact the benefits of social media and call for algorithms that mitigate these effects. In this paper, we propose a novel problem formulation aimed at slightly nudging users' social feeds in order to strike a balance between releva…

    Submitted 28 August, 2023; originally announced August 2023.

    Comments: Accepted for publication at ACM CIKM 2023

  15. arXiv:2307.02946

    cs.DB cs.DS

    Finding Favourite Tuples on Data Streams with Provably Few Comparisons

    Authors: Guangyi Zhang, Nikolaj Tatti, Aristides Gionis

    Abstract: One of the most fundamental tasks in data science is to assist a user with unknown preferences in finding high-utility tuples within a large database. To accurately elicit the unknown user preferences, a widely-adopted way is by asking the user to compare pairs of tuples. In this paper, we study the problem of identifying one or more high-utility tuples by adaptively receiving user input on a mini…

    Submitted 6 July, 2023; originally announced July 2023.

    Comments: To appear in KDD 2023

  16. arXiv:2306.10313

    cs.SI cs.DS cs.LG

    Adversaries with Limited Information in the Friedkin--Johnsen Model

    Authors: Sijing Tu, Stefan Neumann, Aristides Gionis

    Abstract: In recent years, online social networks have been the target of adversaries who seek to introduce discord into societies, to undermine democracies and to destabilize communities. Often the goal is not to favor a certain side of a conflict but to increase disagreement and polarization. To get a mathematical understanding of such attacks, researchers use opinion-formation models from sociology, such…

    Submitted 12 September, 2023; v1 submitted 17 June, 2023; originally announced June 2023.

    Comments: KDD'23

  17. arXiv:2306.07930

    cs.SI cs.CY cs.DS

    Reducing Exposure to Harmful Content via Graph Rewiring

    Authors: Corinna Coupette, Stefan Neumann, Aristides Gionis

    Abstract: Most media content consumed today is provided by digital platforms that aggregate input from diverse sources, where access to information is mediated by recommendation algorithms. One principal challenge in this context is dealing with content that is considered harmful. Striking a balance between competing stakeholder interests, rather than block harmful content altogether, one approach is to min…

    Submitted 5 June, 2023; originally announced June 2023.

    Comments: 25 pages, 28 figures, accepted at KDD 2023

  18. arXiv:2306.03571

    cs.SI cs.DS

    Minimizing Hitting Time between Disparate Groups with Shortcut Edges

    Authors: Florian Adriaens, Honglian Wang, Aristides Gionis

    Abstract: Structural bias or segregation of networks refers to situations where two or more disparate groups are present in the network, so that the groups are highly connected internally, but loosely connected to each other. In many cases it is of interest to increase the connectivity of disparate groups so as to, e.g., minimize social friction, or expose individuals to diverse viewpoints. A commonly-used…

    Submitted 16 June, 2023; v1 submitted 6 June, 2023; originally announced June 2023.

    Comments: To appear in KDD 2023

  19. arXiv:2304.10328

    cs.NI cs.LG

    Learning Cellular Coverage from Real Network Configurations using GNNs

    Authors: Yifei Jin, Marios Daoutis, Sarunas Girdzijauskas, Aristides Gionis

    Abstract: Cellular coverage quality estimation has been a critical task for self-organized networks. In real-world scenarios, deep-learning-powered coverage quality estimation methods cannot scale up to large areas because little ground truth can be provided during network design and optimization. In addition, they fall short in producing expressive embeddings to adequately capture the variations of the cells' co…

    Submitted 20 April, 2023; originally announced April 2023.

    Comments: Accepted at 2023 IEEE VTC-Spring

  20. arXiv:2301.06787

    cs.DS

    Ranking with submodular functions on the fly

    Authors: Guangyi Zhang, Nikolaj Tatti, Aristides Gionis

    Abstract: Maximizing submodular functions has been studied extensively for a wide range of subset-selection problems. However, much less attention has been given to the role of submodularity in sequence-selection and ranking problems. A recently-introduced framework, named maximum submodular ranking (MSR), tackles a family of ranking problems that arise naturally when resources are shared among mult…

    Submitted 17 January, 2023; originally announced January 2023.

    Comments: 10 pages, to appear in SDM 2023

  21. arXiv:2210.01533

    cs.LG

    Concise and interpretable multi-label rule sets

    Authors: Martino Ciaperoni, Han Xiao, Aristides Gionis

    Abstract: Multi-label classification is becoming increasingly ubiquitous, but not much attention has been paid to interpretability. In this paper, we develop a multi-label classifier that can be represented as a concise set of simple "if-then" rules, and thus, it offers better interpretability compared to black-box models. Notably, our method is able to find a small set of relevant patterns that lead to acc…

    Submitted 7 November, 2022; v1 submitted 4 October, 2022; originally announced October 2022.

  22. Diameter Minimization by Shortcutting with Degree Constraints

    Authors: Florian Adriaens, Aristides Gionis

    Abstract: We consider the problem of adding a fixed number of new edges to an undirected graph in order to minimize the diameter of the augmented graph, and under the constraint that the number of edges added for each vertex is bounded by an integer. The problem is motivated by network-design applications, where we want to minimize the worst case communication in the network without excessively increasing t…

    Submitted 2 September, 2022; v1 submitted 1 September, 2022; originally announced September 2022.

    Comments: A shorter version of this work has been accepted at the IEEE ICDM 2022 conference

  23. Regularized impurity reduction: Accurate decision trees with complexity guarantees

    Authors: Guangyi Zhang, Aristides Gionis

    Abstract: Decision trees are popular classification models, providing high accuracy and intuitive explanations. However, as the tree size grows the model interpretability deteriorates. Traditional tree-induction algorithms, such as C4.5 and CART, rely on impurity-reduction functions that promote the discriminative power of each split. Thus, although these traditional methods are accurate in practice, there…

    Submitted 23 August, 2022; originally announced August 2022.

    Journal ref: Data Mining and Knowledge Discovery (2022)

  24. arXiv:2207.14643

    cs.NI cs.AI cs.LG

    Open World Learning Graph Convolution for Latency Estimation in Routing Networks

    Authors: Yifei Jin, Marios Daoutis, Sarunas Girdzijauskas, Aristides Gionis

    Abstract: Accurate routing network status estimation is a key component in Software Defined Networking. However, existing deep-learning-based methods for modeling network routing are not able to extrapolate towards unseen feature distributions. Nor are they able to handle scaled and drifted network attributes in test sets that include open-world inputs. To deal with these challenges, we propose a novel appr…

    Submitted 26 April, 2024; v1 submitted 8 July, 2022; originally announced July 2022.

    Comments: Accepted in IJCNN 2022

  25. arXiv:2206.08054

    cs.LG cs.DS

    Generalized Leverage Scores: Geometric Interpretation and Applications

    Authors: Bruno Ordozgoiti, Antonis Matakos, Aristides Gionis

    Abstract: In problems involving matrix computations, the concept of leverage has found a large number of applications. In particular, leverage scores, which relate the columns of a matrix to the subspaces spanned by its leading singular vectors, are helpful in revealing column subsets to approximately factorize a matrix with quality guarantees. As such, they provide a solid foundation for a variety of machi…

    Submitted 16 June, 2022; originally announced June 2022.

    Comments: ICML 2022

  26. Ranking with submodular functions on a budget

    Authors: Guangyi Zhang, Nikolaj Tatti, Aristides Gionis

    Abstract: Submodular maximization has been the backbone of many important machine-learning problems, and has applications to viral marketing, diversification, sensor placement, and more. However, the study of maximizing submodular functions has mainly been restricted to the context of selecting a set of items. On the other hand, many real-world applications require a solution that is a ranking over a set of…

    Submitted 8 April, 2022; originally announced April 2022.

    Journal ref: Data Mining and Knowledge Discovery (2022) 1-22

  27. arXiv:2203.01241

    cs.DS

    Coresets remembered and items forgotten: submodular maximization with deletions

    Authors: Guangyi Zhang, Nikolaj Tatti, Aristides Gionis

    Abstract: In recent years we have witnessed an increase in the development of methods for submodular optimization, which have been motivated by the wide applicability of submodular functions in real-world data-science problems. In this paper, we contribute to this line of work by considering the problem of robust submodular maximization against unexpected deletions, which may occur due to privacy issues or…

    Submitted 14 September, 2022; v1 submitted 2 March, 2022; originally announced March 2022.

  28. arXiv:2202.07992

    cs.LG math.NA

    Improved analysis of randomized SVD for top-eigenvector approximation

    Authors: Ruo-Chun Tzeng, Po-An Wang, Florian Adriaens, Aristides Gionis, Chi-Jen Lu

    Abstract: Computing the top eigenvectors of a matrix is a problem of fundamental interest to various fields. While the majority of the literature has focused on analyzing the reconstruction error of low-rank matrices associated with the retrieved eigenvectors, in many applications one is interested in finding one vector with high Rayleigh quotient. In this paper we study the problem of approximating the top…

    Submitted 16 February, 2022; originally announced February 2022.

    Comments: Accepted to International Conference on Artificial Intelligence and Statistics (AISTATS) 2022

    ACM Class: G.1.3

  29. arXiv:2110.03475

    cs.DB

    Workload-Aware Materialization of Junction Trees

    Authors: Martino Ciaperoni, Cigdem Aslay, Aristides Gionis, Michael Mathioudakis

    Abstract: Bayesian networks are popular probabilistic models that capture the conditional dependencies among a set of variables. Inference in Bayesian networks is a fundamental task for answering probabilistic queries over a subset of variables in the data. However, exact inference in Bayesian networks is NP-hard, which has prompted the development of many practical inference methods. In this paper, we f…

    Submitted 7 October, 2021; originally announced October 2021.

  30. arXiv:2106.11696

    cs.DS

    Diversity-aware $k$-median: Clustering with fair center representation

    Authors: Suhas Thejaswi, Bruno Ordozgoiti, Aristides Gionis

    Abstract: We introduce a novel problem for diversity-aware clustering. We assume that the potential cluster centers belong to a set of groups defined by protected attributes, such as ethnicity, gender, etc. We then ask to find a minimum-cost clustering of the data into $k$ clusters so that a specified minimum number of cluster centers are chosen from each group. We thus require that all groups are represent…

    Submitted 24 October, 2022; v1 submitted 22 June, 2021; originally announced June 2021.

    Comments: To appear in ECML-PKDD 2021

  31. arXiv:2103.00451

    cs.SI cs.DS

    Discovering Dense Correlated Subgraphs in Dynamic Networks

    Authors: Giulia Preti, Polina Rozenshtein, Aristides Gionis, Yannis Velegrakis

    Abstract: Given a dynamic network, where edges appear and disappear over time, we are interested in finding sets of edges that have similar temporal behavior and form a dense subgraph. Formally, we define the problem as the enumeration of the maximal subgraphs that satisfy specific density and similarity thresholds. To measure the similarity of the temporal behavior, we use the correlation between the binar…

    Submitted 28 February, 2021; originally announced March 2021.

    Comments: Full version of the paper included in the proceedings of the PAKDD 2021 conference

    Journal ref: PAKDD 2021

  32. arXiv:2010.08423

    cs.DS cs.DC

    Restless reachability problems in temporal graphs

    Authors: Suhas Thejaswi, Juho Lauri, Aristides Gionis

    Abstract: We study a family of reachability problems under waiting-time restrictions in temporal and vertex-colored temporal graphs. Given a temporal graph and a set of source vertices, we find the set of vertices that are reachable from a source via a time-respecting path, where the difference in timestamps between consecutive edges is at most a resting time. Given a vertex-colored temporal graph and a mul…

    Submitted 3 December, 2024; v1 submitted 16 October, 2020; originally announced October 2020.

    ACM Class: F.2.2; G.4; F.2.1

  33. arXiv:2007.03950

    cs.DS

    Mining Dense Subgraphs with Similar Edges

    Authors: Polina Rozenshtein, Giulia Preti, Aristides Gionis, Yannis Velegrakis

    Abstract: When searching for interesting structures in graphs, it is often important to take into account not only the graph connectivity, but also the metadata available, such as node and edge labels, or temporal information. In this paper we are interested in settings where such metadata is used to define a similarity between edges. We consider the problem of finding subgraphs that are dense and whose edg…

    Submitted 8 July, 2020; originally announced July 2020.

  34. Diverse Rule Sets

    Authors: Guangyi Zhang, Aristides Gionis

    Abstract: While machine-learning models are flourishing and transforming many aspects of everyday life, the inability of humans to understand complex models poses difficulties for these models to be fully trusted and embraced. Thus, interpretability of models has been recognized as an equally important quality as their predictive power. In particular, rule-based systems are experiencing a renaissance owing…

    Submitted 17 June, 2020; originally announced June 2020.

    Journal ref: Proceedings of the 26th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD '20), August 23–27, 2020, Virtual Event, CA, USA

  35. arXiv:2006.05176

    cs.SI q-bio.NC

    Explainable Classification of Brain Networks via Contrast Subgraphs

    Authors: Tommaso Lanciano, Francesco Bonchi, Aristides Gionis

    Abstract: Mining human-brain networks to discover patterns that can be used to discriminate between healthy individuals and patients affected by some neurological disorder is a fundamental task in neuroscience. Learning simple and interpretable models is as important as mere classification accuracy. In this paper we introduce a novel approach for classifying brain networks based on extracting contrast subg…

    Submitted 9 June, 2020; originally announced June 2020.

    Comments: To be published at KDD 2020

  36. arXiv:2002.00775

    cs.SI

    Finding large balanced subgraphs in signed networks

    Authors: Bruno Ordozgoiti, Antonis Matakos, Aristides Gionis

    Abstract: Signed networks are graphs whose edges are labelled with either a positive or a negative sign, and can be used to capture nuances in interactions that are missed by their unsigned counterparts. The concept of balance in signed graph theory determines whether a network can be partitioned into two perfectly opposing subsets, and is therefore useful for modelling phenomena such as the existence of po…

    Submitted 3 February, 2020; originally announced February 2020.

    Comments: 11 pages, 6 figures, The Web Conference 2020

  37. arXiv:2001.09453

    cs.DS cs.DM

    Improved mixing time for k-subgraph sampling

    Authors: Ryuta Matsuno, Aristides Gionis

    Abstract: Understanding the local structure of a graph provides valuable insights about the underlying phenomena from which the graph has originated. Sampling and examining k-subgraphs is a widely used approach to understand the local structure of a graph. In this paper, we study the problem of uniformly sampling k-subgraphs from a given graph. We analyze a few different Markov chain Monte Carlo (MCMC) appr…

    Submitted 26 January, 2020; originally announced January 2020.

    Comments: Detailed version of a paper to appear in SIAM Data Mining 2020 conference

  38. arXiv:2001.09410

    cs.SI cs.CY cs.IR

    Searching for polarization in signed graphs: a local spectral approach

    Authors: Han Xiao, Bruno Ordozgoiti, Aristides Gionis

    Abstract: Signed graphs have been used to model interactions in social networks, which can be either positive (friendly) or negative (antagonistic). The model has been used to study polarization and other related phenomena in social networks, which can be harmful to the process of democratic deliberation in our society. An interesting and challenging task in this application domain is to detect polarized c…

    Submitted 26 January, 2020; originally announced January 2020.

    Comments: 11 pages, 6 figures, accepted by WWW 2020, April 20-24, 2020, Taipei, Taiwan

  39. arXiv:2001.07158

    cs.DS cs.DB cs.DC cs.IR

    Finding path motifs in large temporal graphs using algebraic fingerprints

    Authors: Suhas Thejaswi, Aristides Gionis, Juho Lauri

    Abstract: We study a family of pattern-detection problems in vertex-colored temporal graphs. In particular, given a vertex-colored temporal graph and a multiset of colors as a query, we search for temporal paths in the graph that contain the colors specified in the query. These types of problems have several applications, for example in recommending tours for tourists or detecting abnormal behavior in a net…

    Submitted 27 July, 2020; v1 submitted 20 January, 2020; originally announced January 2020.

    Comments: version prior to peer review

    ACM Class: F.2.1; F.2.2; G.2.1; G.2.2; I.5.0

  40. arXiv:2001.03050

    cs.DS

    Maximizing diversity over clustered data

    Authors: Guangyi Zhang, Aristides Gionis

    Abstract: Maximum diversity aims at selecting a diverse set of high-quality objects from a collection, which is a fundamental problem and has a wide range of applications, e.g., in Web search. Diversity under a uniform or partition matroid constraint naturally describes useful cardinality or budget requirements, and admits simple approximation algorithms. When applied to clustered data, however, popular alg…

    Submitted 10 April, 2021; v1 submitted 9 January, 2020; originally announced January 2020.

    Comments: SDM 2020

  41. arXiv:1910.02438

    cs.DS cs.SI

    Discovering Polarized Communities in Signed Networks

    Authors: Francesco Bonchi, Edoardo Galimberti, Aristides Gionis, Bruno Ordozgoiti, Giancarlo Ruffo

    Abstract: Signed networks contain edge annotations to indicate whether each interaction is friendly (positive edge) or antagonistic (negative edge). The model is simple but powerful and it can capture novel and interesting structural properties of real-world phenomena. The analysis of signed networks has many applications from modeling discussions in social media, to mining user reviews, and to recommending…

    Submitted 6 October, 2019; originally announced October 2019.

    Journal ref: CIKM 2019, November 3-7, 2019, Beijing, China

  42. Discovering Interesting Cycles in Directed Graphs

    Authors: Florian Adriaens, Cigdem Aslay, Tijl De Bie, Aristides Gionis, Jefrey Lijffijt

    Abstract: Cycles in graphs often signify interesting processes. For example, cyclic trading patterns can indicate inefficiencies or economic dependencies in trade networks, cycles in food webs can identify fragile dependencies in ecosystems, and cycles in financial transaction networks can be an indication of money laundering. Identifying such interesting cycles, which can also be constrained to contain a g…

    Submitted 3 September, 2019; originally announced September 2019.

    Comments: Accepted for CIKM'19

  43. arXiv:1904.00079

    cs.DB

    Query the model: precomputations for efficient inference with Bayesian Networks

    Authors: Cigdem Aslay, Martino Ciaperoni, Aristides Gionis, Michael Mathioudakis

    Abstract: Variable Elimination is a fundamental algorithm for probabilistic inference over Bayesian networks. In this paper, we propose a novel materialization method for Variable Elimination, which can lead to significant efficiency gains when answering inference queries. We evaluate our technique using real-world Bayesian networks. Our results show that a modest amount of materialization can lead to signi…

    Submitted 27 January, 2021; v1 submitted 29 March, 2019; originally announced April 2019.

  44. Reconciliation k-median: Clustering with Non-Polarized Representatives

    Authors: Bruno Ordozgoiti, Aristides Gionis

    Abstract: We propose a new variant of the k-median problem, where the objective function models not only the cost of assigning data points to cluster representatives, but also a penalty term for disagreement among the representatives. We motivate this novel problem by applications where we are interested in clustering data while avoiding selecting representatives that are too far from each other. For exampl…

    Submitted 28 July, 2021; v1 submitted 27 February, 2019; originally announced February 2019.

    Comments: The Web Conference 2019

  45. Inferring the strength of social ties: a community-driven approach

    Authors: Polina Rozenshtein, Nikolaj Tatti, Aristides Gionis

    Abstract: Online social networks are growing and becoming denser. The social connections of a given person may have very high variability: from close friends and relatives to acquaintances to people they hardly know. Inferring the strength of social ties is an important ingredient for modeling the interaction of users in a network and understanding their behavior. Furthermore, the problem has applications in…

    Submitted 5 February, 2019; originally announced February 2019.

  46. Discovering Nested Communities

    Authors: Nikolaj Tatti, Aristides Gionis

    Abstract: Finding communities in graphs is one of the most well-studied problems in data mining and social-network analysis. In many real applications, the underlying graph does not have a clear community structure. In those cases, selecting a single community turns out to be a fairly ill-posed problem, as the optimization criterion has to make a difficult choice between selecting a tight but small communit…

    Submitted 4 February, 2019; originally announced February 2019.

  47. What is the dimension of your binary data?

    Authors: Nikolaj Tatti, Taneli Mielikainen, Aristides Gionis, Heikki Mannila

    Abstract: Many 0/1 datasets have a very large number of variables; on the other hand, they are sparse and the dependency structure of the variables is simpler than the number of variables would suggest. Defining the effective dimensionality of such a dataset is a nontrivial problem. We consider the problem of defining a robust measure of dimension for 0/1 datasets, and show that the basic idea of fractal di…

    Submitted 4 February, 2019; originally announced February 2019.

  48. arXiv:1811.10354  [pdf, other]

    cs.SI

    Tell me something my friends do not know: Diversity maximization in social networks

    Authors: Antonis Matakos, Aristides Gionis

    Abstract: Social media have a great potential to improve information dissemination in our society, yet, they have been held accountable for a number of undesirable effects, such as polarization and filter bubbles. It is thus important to understand these negative phenomena and develop methods to combat them. In this paper we propose a novel approach to address the problem of breaking filter bubbles in socia…

    Submitted 26 November, 2018; originally announced November 2018.

    Comments: ICDM 2018

  49. arXiv:1809.05812  [pdf, other]

    cs.SI

    Robust Cascade Reconstruction by Steiner Tree Sampling

    Authors: Han Xiao, Cigdem Aslay, Aristides Gionis

    Abstract: We consider a network where an infection cascade has taken place and a subset of infected nodes has been partially observed. Our goal is to reconstruct the underlying cascade that is likely to have generated these observations. We reduce this cascade-reconstruction problem to computing the marginal probability that a node is infected given the partial observations, which is a #P-hard problem. To c…

    Submitted 20 November, 2018; v1 submitted 16 September, 2018; originally announced September 2018.

    Comments: 11 pages, accepted at ICDM 2018 (regular paper)

  50. arXiv:1809.05183  [pdf, other]

    cs.LG stat.ML

    Explainable time series tweaking via irreversible and reversible temporal transformations

    Authors: Isak Karlsson, Jonathan Rebane, Panagiotis Papapetrou, Aristides Gionis

    Abstract: Time series classification has received great attention over the past decade with a wide range of methods focusing on predictive performance by exploiting various types of temporal features. Nonetheless, little emphasis has been placed on interpretability and explainability. In this paper, we formulate the novel problem of explainable time series tweaking, where, given a time series and an opaque…

    Submitted 13 September, 2018; originally announced September 2018.

    Comments: To appear in International Conference on Data Mining, 2018