-
Modeling Musical Genre Trajectories through Pathlet Learning
Authors:
Lilian Marey,
Charlotte Laclau,
Bruno Sguerra,
Tiphaine Viard,
Manuel Moussallam
Abstract:
The increasing availability of user data on music streaming platforms opens up new possibilities for analyzing music consumption. However, understanding the evolution of user preferences remains a complex challenge, particularly as their musical tastes change over time. This paper uses the dictionary learning paradigm to model user trajectories across different musical genres. We define a new framework that captures recurring patterns in genre trajectories, called pathlets, enabling the creation of comprehensible trajectory embeddings. We show that pathlet learning reveals relevant listening patterns that can be analyzed both qualitatively and quantitatively. This work improves our understanding of users' interactions with music and opens up avenues of research into user behavior and fostering diversity in recommender systems. A dataset of 2000 user histories tagged by genre over 17 months, supplied by Deezer (a leading music streaming company), is also released with the code.
Submitted 6 May, 2025;
originally announced May 2025.
-
"Two Means to an End Goal": Connecting Explainability and Contestability in the Regulation of Public Sector AI
Authors:
Timothée Schmude,
Mireia Yurrita,
Kars Alfrink,
Thomas Le Goff,
Tiphaine Viard
Abstract:
Explainability and its emerging counterpart contestability have become important normative and design principles for the trustworthy use of AI as they enable users and subjects to understand and challenge AI decisions. However, the regulation of AI systems spans technical, legal, and organizational dimensions, producing a multiplicity in meaning that complicates the implementation of explainability and contestability. Resolving this conceptual ambiguity requires specifying and comparing the meaning of both principles across regulation dimensions, disciplines, and actors. This process, here defined as translation, is essential to provide guidance on the principles' realization. We present the findings of a semi-structured interview study with 14 interdisciplinary AI regulation experts. We report on the experts' understanding of the intersection between explainability and contestability in public AI regulation, their advice for a decision subject and a public agency in a welfare allocation AI use case, and their perspectives on the connections and gaps within the research landscape. We provide differentiations between descriptive and normative explainability, judicial and non-judicial channels of contestation, and individual and collective contestation action. We further outline three translation processes in the alignment of top-down and bottom-up regulation, the assignment of responsibility for interpreting regulations, and the establishment of interdisciplinary collaboration. Our contributions include an empirically grounded conceptualization of the intersection between explainability and contestability and recommendations on implementing these principles in public institutions. We believe our contributions can inform policy-making and regulation of these core principles and enable more effective and equitable design, development, and deployment of trustworthy public AI systems.
Submitted 25 April, 2025;
originally announced April 2025.
-
Graph as a feature: improving node classification with non-neural graph-aware logistic regression
Authors:
Simon Delarue,
Thomas Bonald,
Tiphaine Viard
Abstract:
Graph Neural Networks (GNNs) and their message passing framework, which leverages both structural and feature information, have become a standard method for solving graph-based machine learning problems. However, these approaches still struggle to generalise well beyond datasets that exhibit strong homophily, where nodes of the same class tend to connect. This limitation has led to the development of complex neural architectures that pose challenges in terms of efficiency and scalability. In response to these limitations, we focus on simpler and more scalable approaches and introduce Graph-aware Logistic Regression (GLR), a non-neural model designed for node classification tasks. Unlike traditional graph algorithms that use only a fraction of the information accessible to GNNs, our proposed model simultaneously leverages both node features and the relationships between entities. However, instead of relying on message passing, our approach encodes each node's relationships as an additional feature vector, which is then combined with the node's own attributes. Extensive experimental results, conducted within a rigorous evaluation framework, show that our proposed GLR approach outperforms both foundational and sophisticated state-of-the-art GNN models in node classification tasks. Going beyond the traditional limited benchmarks, our experiments indicate that GLR increases generalisation ability while reaching performance gains in computation time up to two orders of magnitude compared to its best neural competitor.
Submitted 19 November, 2024;
originally announced November 2024.
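The feature construction described in this abstract (each node's relationships encoded as an extra feature vector and combined with the node's own attributes) might be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the relationship vector is simply the node's row of an undirected adjacency matrix, and the function name `graph_aware_features` is hypothetical.

```python
def graph_aware_features(num_nodes, edges, attributes):
    """Concatenate each node's adjacency row with its own attribute vector.

    Assumption (not from the abstract): relationships are encoded as rows
    of an undirected, unweighted adjacency matrix.
    """
    # Build a dense adjacency matrix, one row per node.
    adjacency = [[0.0] * num_nodes for _ in range(num_nodes)]
    for u, v in edges:
        adjacency[u][v] = 1.0
        adjacency[v][u] = 1.0
    # Row i becomes [who node i is linked to | node i's own attributes].
    return [adjacency[i] + list(attributes[i]) for i in range(num_nodes)]


# Toy path graph 0 - 1 - 2 with one attribute per node.
toy_edges = [(0, 1), (1, 2)]
toy_attributes = [[0.5], [1.0], [0.0]]
toy_features = graph_aware_features(3, toy_edges, toy_attributes)
```

The resulting rows could then be passed to any standard logistic regression classifier; no message passing between nodes is involved at any point.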
-
Evaluating graph-based explanations for AI-based recommender systems
Authors:
Simon Delarue,
Astrid Bertrand,
Tiphaine Viard
Abstract:
Recent years have witnessed a rapid growth of recommender systems, providing suggestions in numerous applications with potentially high social impact, such as health or justice. Meanwhile, in Europe, the upcoming AI Act mentions transparency as a requirement for critical AI systems in order to "mitigate the risks to fundamental rights". Post-hoc explanations seamlessly align with this goal, and extensive literature on the subject has produced several forms of such objects, graphs being one of them. Early studies in visualization demonstrated the graphs' ability to improve user understanding, positioning them as potentially ideal explanations. However, it remains unclear how graph-based explanations compare to other explanation designs. In this work, we aim to determine the effectiveness of graph-based explanations in improving users' perception of AI-based recommendations using a mixed-methods approach. We first conduct a qualitative study to collect users' requirements for graph explanations. We then run a larger quantitative study in which we evaluate the influence of various explanation designs, including enhanced graph-based ones, on aspects such as understanding, usability and curiosity toward the AI system. We find that users perceive graph-based explanations as more usable than designs involving feature importance. However, we also reveal that textual explanations lead to higher objective understanding than graph-based designs. Most importantly, we highlight the strong contrast between participants' expressed preferences for graph design and their actual ratings using it, which are lower compared to textual design. These findings imply that meeting stakeholders' expressed preferences might not alone guarantee "good" explanations. Therefore, crafting hybrid designs successfully balancing social expectations with downstream performance emerges as a significant challenge.
Submitted 17 July, 2024;
originally announced July 2024.
-
Exploring and mining attributed sequences of interactions
Authors:
Tiphaine Viard,
Henry Soldano,
Guillaume Santini
Abstract:
We are faced with data comprised of entities interacting over time: this can be individuals meeting, customers buying products, machines exchanging packets on the IP network, among others. Capturing the dynamics as well as the structure of these interactions is of crucial importance for analysis. These interactions can almost always be labeled with content: group belonging, reviews of products, abstracts, etc. We model these streams of interactions as stream graphs, a recent framework to model interactions over time. Formal Concept Analysis provides a framework for analyzing concepts evolving within a context. Considering graphs as the context, it has recently been applied to perform closed pattern mining on social graphs. In this paper, we are interested in pattern mining in sequences of interactions. After recalling and extending notions from formal concept analysis on graphs to stream graphs, we introduce algorithms to enumerate closed patterns on a labeled stream graph, and introduce a way to select relevant closed patterns. We run experiments on two real-world datasets of interactions among students and citations between authors, and show both the feasibility and the relevance of our method.
Submitted 28 July, 2021;
originally announced July 2021.
-
Classifying Wikipedia in a fine-grained hierarchy: what graphs can contribute
Authors:
Tiphaine Viard,
Thomas McLachlan,
Hamidreza Ghader,
Satoshi Sekine
Abstract:
Wikipedia is a huge opportunity for machine learning, being the largest semi-structured base of knowledge available. Because of this, many works examine its contents, and focus on structuring it in order to make it usable in learning tasks, for example by classifying it into an ontology. Beyond its textual contents, Wikipedia also displays a typical graph structure, where pages are linked together through citations. In this paper, we address the task of integrating graph (i.e. structure) information to classify Wikipedia into a fine-grained named entity (NE) ontology, the Extended Named Entity hierarchy. To address this task, we first assess the relevance of the graph structure for NE classification. We then explore two directions, one related to feature vectors using graph descriptors commonly used in large-scale network analysis, and one extending flat classification to a weighted model taking into account semantic similarity. We conduct at-scale practical experiments on a manually labeled subset of 22,000 pages extracted from the Japanese Wikipedia. Our results show that integrating graph information succeeds at reducing sparsity of the input feature space, and yields classification results that are comparable to or better than previous works.
Submitted 22 January, 2020; v1 submitted 21 January, 2020;
originally announced January 2020.
-
Introducing multilayer stream graphs and layer centralities
Authors:
Pimprenelle Parmentier,
Tiphaine Viard,
Benjamin Renoust,
Jean-François Baffier
Abstract:
Graphs are commonly used in mathematics to represent relationships between items. However, as simple objects, they sometimes fail to capture all relevant aspects of real-world data. To address this problem, we generalize them and model interactions over time with a multilayer structure. We build and test several centralities to assess the importance of layers of such structures. In order to showcase the relevance of this new model with centralities, we give examples on two large-scale datasets of interactions, involving individuals and flights, and show that we are able to explain subtle behaviour patterns in both cases.
Submitted 3 October, 2019;
originally announced October 2019.
-
Weighted, Bipartite, or Directed Stream Graphs for the Modeling of Temporal Networks
Authors:
Matthieu Latapy,
Clémence Magnien,
Tiphaine Viard
Abstract:
We recently introduced a formalism for the modeling of temporal networks, which we call stream graphs. It emphasizes the streaming nature of data and allows rigorous definitions of many important concepts generalizing classical graphs. This includes in particular size, density, clique, neighborhood, degree, clustering coefficient, and transitivity. In this contribution, we show that, like graphs, stream graphs may be extended to cope with bipartite structures, with node and link weights, or with link directions. We review the main bipartite, weighted or directed graph concepts proposed in the literature, we generalize them to the cases of bipartite, weighted, or directed stream graphs, and we show that the obtained concepts are consistent with graph and stream graph ones. This provides a formal ground for an accurate modeling of the many temporal networks that have one or several of these features.
Submitted 23 November, 2021; v1 submitted 11 June, 2019;
originally announced June 2019.
-
Degree-based Outlier Detection within IP Traffic Modelled as a Link Stream
Authors:
Audrey Wilmet,
Tiphaine Viard,
Matthieu Latapy,
Robin Lamarche-Perrin
Abstract:
This paper aims at precisely detecting and identifying anomalous events in IP traffic. To this end, we adopt the link stream formalism, which properly captures temporal and structural features of the data. Within this framework, we focus on finding anomalous behaviours with respect to the degree of IP addresses over time. Due to diversity in IP profiles, this feature is typically distributed heterogeneously, preventing us from directly finding anomalies. To deal with this challenge, we design a method to detect outliers as well as precisely identify their cause in a sequence of similar heterogeneous distributions. We apply it to several MAWI captures of IP traffic and we show that it succeeds in detecting relevant patterns in terms of anomalous network activity.
Submitted 6 June, 2019;
originally announced June 2019.
-
Movie rating prediction using content-based and link stream features
Authors:
Tiphaine Viard,
Raphaël Fournier-S'niehotta
Abstract:
While graph-based collaborative filtering recommender systems were introduced several years ago, there are still several shortcomings to deal with, the temporal information being one of the most important. The new link stream paradigm aims at extending graphs to correctly model the graph dynamics, without losing crucial information. We investigate the impact of link streams on recommender systems by designing link stream features that capture the intrinsic structure and dynamics of the data. We show that such features encode a fine-grained and subtle description of the underlying recommender system. Focusing on a traditional recommender system context, the rating prediction on the MovieLens20M dataset, we input these features along with some content-based ones into a gradient boosting machine (XGBoost) and show that it significantly outperforms a sole content-based solution. These encouraging results call for further exploration of this original modelling and its integration into complete state-of-the-art recommender system algorithms. Link streams and graphs, as natural visualizations of recommender systems, can offer more interpretability in a time when algorithm transparency is an increasingly important topic of discussion. We also hope to spark interesting discussions in the community about the links between link streams and tensor factorization methods: indeed, they are two sides of the same object.
Submitted 8 May, 2018;
originally announced May 2018.
-
Enumerating maximal cliques in link streams with durations
Authors:
Tiphaine Viard,
Clémence Magnien,
Matthieu Latapy
Abstract:
Link streams model interactions over time, and a clique in a link stream is defined as a set of nodes and a time interval such that all pairs of nodes in this set interact permanently during this time interval. This notion was introduced recently in the case where interactions are instantaneous. We generalize it to the case of interactions with durations and show that the instantaneous case is actually a particular case of the case with durations. We propose an algorithm to detect maximal cliques that improves our previous one for instantaneous link streams, and performs better than state-of-the-art algorithms in several cases of interest.
Submitted 13 February, 2018; v1 submitted 19 December, 2017;
originally announced December 2017.
-
Discovering Patterns of Interest in IP Traffic Using Cliques in Bipartite Link Streams
Authors:
Tiphaine Viard,
Raphaël Fournier-S'niehotta,
Clémence Magnien,
Matthieu Latapy
Abstract:
Studying IP traffic is crucial for many applications. We focus here on the detection of (structurally and temporally) dense sequences of interactions, which may indicate botnets or coordinated network scans. More precisely, we model a MAWI capture of IP traffic as a link stream, i.e. a sequence of interactions $(t_1, t_2, u, v)$ meaning that devices $u$ and $v$ exchanged packets from time $t_1$ to time $t_2$. This traffic is captured on a single router and so has a bipartite structure: links occur only between nodes in two disjoint sets. We design a method for finding interesting bipartite cliques in such link streams, i.e. two sets of nodes and a time interval such that all nodes in the first set are linked to all nodes in the second set throughout the time interval. We then explore the bipartite cliques present in the considered trace. Comparison with the MAWILab classification of anomalous IP addresses shows that the found cliques succeed in detecting anomalous network activity.
Submitted 16 December, 2017; v1 submitted 19 October, 2017;
originally announced October 2017.
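The bipartite clique condition stated in this abstract (every node of the first set linked to every node of the second set throughout the time interval) can be checked directly on a list of $(t_1, t_2, u, v)$ tuples. The sketch below only verifies a given candidate; it is not the enumeration method of the paper. The function name `is_bipartite_clique` is hypothetical, and links are assumed to be undirected with closed time intervals.

```python
def is_bipartite_clique(links, left, right, start, end):
    """Check that every (u, v) in left x right is linked throughout [start, end].

    links: iterable of (t1, t2, u, v) tuples, read as "u and v interact
    from t1 to t2". Assumptions: undirected links, closed intervals.
    """
    if end <= start:
        raise ValueError("expected a non-degenerate interval with start < end")
    for u in left:
        for v in right:
            # All interaction intervals of this pair, ordered by start time.
            intervals = sorted(
                (t1, t2) for (t1, t2, a, b) in links if {a, b} == {u, v}
            )
            covered_until = start
            for t1, t2 in intervals:
                if t1 > covered_until:
                    break  # a gap: the pair is not linked on (covered_until, t1)
                covered_until = max(covered_until, t2)
            if covered_until < end:
                return False
    return True


# Toy trace: router "r" talks to hosts "x" and "y".
toy_links = [(0, 10, "r", "x"), (0, 4, "r", "y"), (4, 10, "r", "y")]
toy_result = is_bipartite_clique(toy_links, {"r"}, {"x", "y"}, 1, 9)
```

Sweeping the sorted intervals of each pair keeps the check linear in the number of that pair's links after sorting, which is enough for validating candidates on small traces.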
-
Stream Graphs and Link Streams for the Modeling of Interactions over Time
Authors:
Matthieu Latapy,
Tiphaine Viard,
Clémence Magnien
Abstract:
Graph theory provides a language for studying the structure of relations, and it is often used to study interactions over time too. However, it poorly captures both the temporal and the structural nature of interactions, which calls for a dedicated formalism. In this paper, we generalize graph concepts in order to cope with both aspects in a consistent way. We start with elementary concepts like density, clusters, or paths, and derive from them more advanced concepts like cliques, degrees, clustering coefficients, or connected components. We obtain a language to directly deal with interactions over time, similar to the language provided by graphs to deal with relations. This formalism is self-consistent: usual relations between different concepts are preserved. It is also consistent with graph theory: graph concepts are special cases of the ones we introduce. This makes it easy to generalize higher-level objects such as quotient graphs, line graphs, k-cores, and centralities. This paper also considers discrete versus continuous time assumptions, instantaneous links, and extensions to more complex cases.
Submitted 11 October, 2017;
originally announced October 2017.
-
Analysis of the temporal and structural features of threads in a mailing-list
Authors:
Noé Gaumont,
Tiphaine Viard,
Raphaël Fournier-S'niehotta,
Qinna Wang,
Matthieu Latapy
Abstract:
A link stream is a collection of triplets $(t,u,v)$ indicating that an interaction occurred between $u$ and $v$ at time $t$. Link streams model many real-world situations like email exchanges between individuals, connections between devices, and others. Much work is currently devoted to the generalization of classical graph and network concepts to link streams. In this paper, we generalize the existing notions of intra-community density and inter-community density. We focus on email exchanges in the Debian mailing-list, and show that threads of emails, like communities in graphs, are dense subsets loosely connected from a link stream perspective.
Submitted 15 December, 2015;
originally announced December 2015.
-
Computing maximal cliques in link streams
Authors:
Tiphaine Viard,
Matthieu Latapy,
Clémence Magnien
Abstract:
A link stream is a collection of triplets $(t, u, v)$ indicating that an interaction occurred between $u$ and $v$ at time $t$. We generalize the classical notion of cliques in graphs to such link streams: for a given $Δ$, a $Δ$-clique is a set of nodes and a time interval such that all pairs of nodes in this set interact at least once during each sub-interval of duration $Δ$. We propose an algorithm to enumerate all maximal (in terms of nodes or time interval) cliques of a link stream, and illustrate its practical relevance on a real-world contact trace.
Submitted 4 July, 2016; v1 submitted 3 February, 2015;
originally announced February 2015.
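The $Δ$-clique definition in this abstract can be turned into a simple membership test for a given candidate. The sketch below is not the enumeration algorithm of the paper: it only checks one candidate. It assumes undirected links, closed sub-intervals, and a candidate interval at least $Δ$ long; the function name `is_delta_clique` is hypothetical.

```python
from itertools import combinations


def is_delta_clique(links, nodes, start, end, delta):
    """Check whether (nodes, [start, end]) is a delta-clique of the link stream.

    links: iterable of (t, u, v) triplets. A pair of nodes passes when its
    interaction times leave no stretch longer than delta without a link,
    including the stretches next to the interval bounds.
    """
    if end - start < delta:
        raise ValueError("the interval must be at least delta long (assumption)")
    for u, v in combinations(sorted(nodes), 2):
        times = sorted(
            t for (t, a, b) in links if {a, b} == {u, v} and start <= t <= end
        )
        if not times:
            return False
        # Consecutive checkpoints more than delta apart mean that some
        # sub-interval of duration delta contains no link for this pair.
        checkpoints = [start] + times + [end]
        for earlier, later in zip(checkpoints, checkpoints[1:]):
            if later - earlier > delta:
                return False
    return True


# Toy contact trace among three individuals.
toy_stream = [
    (1, "a", "b"), (2, "b", "c"), (2, "a", "c"),
    (3, "a", "b"), (4, "b", "c"), (4, "a", "c"), (5, "a", "b"),
]
toy_check = is_delta_clique(toy_stream, {"a", "b", "c"}, 1, 5, 2)
```

Checking gaps between consecutive interaction times is equivalent to testing every sub-interval of duration $Δ$, so the test needs only one sorted pass per pair of nodes.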
-
Identifying roles in an IP network with temporal and structural density
Authors:
Tiphaine Viard,
Matthieu Latapy
Abstract:
Captures of IP traffic contain much information on very different kinds of activities like file transfers, users interacting with remote systems, automatic backups, or distributed computations. Identifying such activities is crucial for an appropriate analysis, modeling and monitoring of the traffic. We propose here a notion of density that captures both temporal and structural features of interactions, and generalizes the classical notion of clustering coefficient. We use it to point out important differences between distinct parts of the traffic, and to identify interesting nodes and groups of nodes in terms of roles in the network.
Submitted 4 July, 2016; v1 submitted 19 June, 2014;
originally announced June 2014.