
Showing 1–24 of 24 results for author: Bressan, M

Searching in archive cs.

  1. arXiv:2505.06725

    cs.CC cs.DM

    On Finding Randomly Planted Cliques in Arbitrary Graphs

    Authors: Francesco Agrimonti, Marco Bressan, Tommaso d'Orsi

    Abstract: We study a planted clique model introduced by Feige where a complete graph of size $c\cdot n$ is planted uniformly at random in an arbitrary $n$-vertex graph. We give a simple deterministic algorithm that, in almost linear time, recovers a clique of size $(c/3)^{O(1/c)} \cdot n$ as long as the original graph has maximum degree at most $(1-p)n$ for some fixed $p>0$. The proof hinges on showing that…

    Submitted 10 May, 2025; originally announced May 2025.

    MSC Class: 68W25 ACM Class: F.2.2

  2. arXiv:2412.08012

    cs.LG stat.ML

    Of Dice and Games: A Theory of Generalized Boosting

    Authors: Marco Bressan, Nataly Brukhim, Nicolò Cesa-Bianchi, Emmanuel Esposito, Yishay Mansour, Shay Moran, Maximilian Thiessen

    Abstract: Cost-sensitive loss functions are crucial in many real-world prediction problems, where different types of errors are penalized differently; for example, in medical diagnosis, a false negative prediction can lead to worse consequences than a false positive prediction. However, traditional PAC learning theory has mostly focused on the symmetric 0-1 loss, leaving cost-sensitive losses largely unaddr…

    Submitted 10 December, 2024; originally announced December 2024.

  3. arXiv:2406.10529

    cs.LG cs.AI stat.ML

    A Theory of Interpretable Approximations

    Authors: Marco Bressan, Nicolò Cesa-Bianchi, Emmanuel Esposito, Yishay Mansour, Shay Moran, Maximilian Thiessen

    Abstract: Can a deep neural network be approximated by a small decision tree based on simple features? This question and its variants are behind the growing demand for machine learning models that are *interpretable* by humans. In this work we study such questions by introducing *interpretable approximations*, a notion that captures the idea of approximating a target concept $c$ by a small aggregation of co…

    Submitted 15 June, 2024; originally announced June 2024.

    Comments: To appear at COLT 2024

  4. arXiv:2405.00853

    cs.LG stat.ML

    Efficient Algorithms for Learning Monophonic Halfspaces in Graphs

    Authors: Marco Bressan, Emmanuel Esposito, Maximilian Thiessen

    Abstract: We study the problem of learning a binary classifier on the vertices of a graph. In particular, we consider classifiers given by monophonic halfspaces, partitions of the vertices that are convex in a certain abstract sense. Monophonic halfspaces, and related notions such as geodesic halfspaces, have recently attracted interest, and several connections have been drawn between their properties (e.g.,…

    Submitted 17 June, 2024; v1 submitted 1 May, 2024; originally announced May 2024.

    Comments: To appear at COLT 2024

  5. arXiv:2302.03994

    cs.DS cs.AI cs.LG

    Fully-Dynamic Approximate Decision Trees With Worst-Case Update Time Guarantees

    Authors: Marco Bressan, Mauro Sozio

    Abstract: We give the first algorithm that maintains an approximate decision tree over an arbitrary sequence of insertions and deletions of labeled examples, with strong guarantees on the worst-case running time per update request. For instance, we show how to maintain a decision tree where every vertex has Gini gain within an additive $α$ of the optimum by performing $O\Big(\frac{d\,(\log n)^4}{α^3}\Big)$…

    Submitted 10 February, 2023; v1 submitted 8 February, 2023; originally announced February 2023.

  6. arXiv:2212.00778

    cs.LG cs.AI cs.DS

    Fully-Dynamic Decision Trees

    Authors: Marco Bressan, Gabriel Damay, Mauro Sozio

    Abstract: We develop the first fully dynamic algorithm that maintains a decision tree over an arbitrary sequence of insertions and deletions of labeled examples. Given $ε > 0$, our algorithm guarantees that, at every point in time, every node of the decision tree uses a split with Gini gain within an additive $ε$ of the optimum. For real-valued features the algorithm has an amortized running time per insertio…

    Submitted 1 December, 2022; originally announced December 2022.

  7. arXiv:2211.01905

    cs.CC cs.DM

    The Complexity of Pattern Counting in Directed Graphs, Parameterised by the Outdegree

    Authors: Marco Bressan, Matthias Lanzinger, Marc Roth

    Abstract: We study the fixed-parameter tractability of the following fundamental problem: given two directed graphs $\vec H$ and $\vec G$, count the number of copies of $\vec H$ in $\vec G$. The standard setting, where the tractability is well understood, uses only $|\vec H|$ as a parameter. In this paper we take a step forward, and adopt as a parameter $|\vec H|+d(\vec G)$, where $d(\vec G)$ is the maximum…

    Submitted 3 November, 2022; originally announced November 2022.

    Comments: 47 pages, 1 figure, abstract shortened due to arXiv requirements

  8. arXiv:2209.03996

    cs.LG

    Active Learning of Classifiers with Label and Seed Queries

    Authors: Marco Bressan, Nicolò Cesa-Bianchi, Silvio Lattanzi, Andrea Paudice, Maximilian Thiessen

    Abstract: We study exact active learning of binary and multiclass classifiers with margin. Given an $n$-point set $X \subset \mathbb{R}^m$, we want to learn any unknown classifier on $X$ whose classes have finite strong convex hull margin, a new notion extending the SVM margin. In the standard active learning setting, where only label queries are allowed, learning a classifier with strong convex hull margin…

    Submitted 8 September, 2022; originally announced September 2022.

  9. arXiv:2209.03402

    cs.CC cs.DM

    Counting Subgraphs in Somewhere Dense Graphs

    Authors: Marco Bressan, Leslie Ann Goldberg, Kitty Meeks, Marc Roth

    Abstract: We study the problems of counting copies and induced copies of a small pattern graph $H$ in a large host graph $G$. Recent work fully classified the complexity of those problems according to structural restrictions on the patterns $H$. In this work, we address the more challenging task of analysing the complexity for restricted patterns and restricted hosts. Specifically we ask which families of a…

    Submitted 12 April, 2024; v1 submitted 7 September, 2022; originally announced September 2022.

    Comments: 35 pages, 3 figures, 4 tables, abstract shortened due to ArXiv requirements

  10. arXiv:2110.04654

    eess.AS cs.CV cs.LG cs.SD

    Complex Network-Based Approach for Feature Extraction and Classification of Musical Genres

    Authors: Matheus Henrique Pimenta-Zanon, Glaucia Maria Bressan, Fabrício Martins Lopes

    Abstract: Musical genre classification has been a relevant research topic. The association between music and genres is fundamental for the media industry, which manages musical recommendation systems, and for music streaming services, which may appear classified by genres. In this context, this work presents a feature extraction method for the automatic classification of musical genres, based on complex n…

    Submitted 9 October, 2021; originally announced October 2021.

  11. arXiv:2106.04913

    cs.LG stat.ML

    On Margin-Based Cluster Recovery with Oracle Queries

    Authors: Marco Bressan, Nicolò Cesa-Bianchi, Silvio Lattanzi, Andrea Paudice

    Abstract: We study an active cluster recovery problem where, given a set of $n$ points and an oracle answering queries like "are these two points in the same cluster?", the task is to recover exactly all clusters using as few queries as possible. We begin by introducing a simple but general notion of margin between clusters that captures, as special cases, the margins used in previous work, the classic SVM…

    Submitted 9 June, 2021; originally announced June 2021.

  12. arXiv:2103.05588

    cs.CC cs.DS

    Exact and Approximate Pattern Counting in Degenerate Graphs: New Algorithms, Hardness Results, and Complexity Dichotomies

    Authors: Marco Bressan, Marc Roth

    Abstract: We study the problems of counting the homomorphisms, counting the copies, and counting the induced copies of a $k$-vertex graph $H$ in a $d$-degenerate $n$-vertex graph $G$. Our main result establishes exhaustive and explicit complexity classifications for counting subgraphs and induced subgraphs. We show that the (not necessarily induced) copies of $H$ in $G$ can be counted in time…

    Submitted 1 June, 2021; v1 submitted 9 March, 2021; originally announced March 2021.

    Comments: 44 pages, 3 figures

  13. arXiv:2102.00504

    cs.LG stat.ML

    Exact Recovery of Clusters in Finite Metric Spaces Using Oracle Queries

    Authors: Marco Bressan, Nicolò Cesa-Bianchi, Silvio Lattanzi, Andrea Paudice

    Abstract: We investigate the problem of exact cluster recovery using oracle queries. Previous results show that clusters in Euclidean spaces that are convex and separated with a margin can be reconstructed exactly using only $O(\log n)$ same-cluster queries, where $n$ is the number of input points. In this work, we study this problem in the more challenging non-convex setting. We introduce a structural char…

    Submitted 13 July, 2021; v1 submitted 31 January, 2021; originally announced February 2021.

    Comments: Accepted for presentation at the Conference on Learning Theory (COLT) 2021

  14. arXiv:2009.03052

    cs.DS cs.DB

    Faster motif counting via succinct color coding and adaptive sampling

    Authors: Marco Bressan, Stefano Leucci, Alessandro Panconesi

    Abstract: We address the problem of computing the distribution of induced connected subgraphs, aka *graphlets* or *motifs*, in large graphs. The current state-of-the-art algorithms estimate the motif counts via uniform sampling, by leveraging the color coding technique by Alon, Yuster and Zwick. In this work we extend the applicability of this approach, by introducing a set of algorithmic optimiza…

    Submitted 17 July, 2021; v1 submitted 4 September, 2020; originally announced September 2020.

    Journal ref: ACM Trans. Knowl. Discov. Data 15, 6, Article 96 (June 2021)

  15. arXiv:2007.12102

    cs.DS cs.DM cs.SI

    Efficient and near-optimal algorithms for sampling small connected subgraphs

    Authors: Marco Bressan

    Abstract: We study the following problem: given an integer $k \ge 3$ and a simple graph $G$, sample a connected induced $k$-node subgraph of $G$ uniformly at random. This is a fundamental graph mining primitive with applications in social network analysis, bioinformatics, and more. Surprisingly, no efficient algorithm is known for uniform sampling; the only somewhat efficient algorithms available yield samp…

    Submitted 28 October, 2021; v1 submitted 23 July, 2020; originally announced July 2020.

    Comments: Full version of STOC'21 paper. 40 pages

  16. arXiv:2006.04675

    cs.LG stat.ML

    Exact Recovery of Mangled Clusters with Same-Cluster Queries

    Authors: Marco Bressan, Nicolò Cesa-Bianchi, Silvio Lattanzi, Andrea Paudice

    Abstract: We study the cluster recovery problem in the semi-supervised active clustering framework. Given a finite set of input points, and an oracle revealing whether any two points lie in the same cluster, our goal is to recover all clusters exactly using as few queries as possible. To this end, we relax the spherical $k$-means cluster assumption of Ashtiani et al. to allow for arbitrary ellipsoidal clus…

    Submitted 30 October, 2020; v1 submitted 8 June, 2020; originally announced June 2020.

    Comments: To appear at NeurIPS 2020 (oral)

  17. arXiv:1906.01599

    cs.DB cs.DM cs.IR

    Motivo: fast motif counting via succinct color coding and adaptive sampling

    Authors: Marco Bressan, Stefano Leucci, Alessandro Panconesi

    Abstract: The randomized technique of color coding is behind state-of-the-art algorithms for estimating graph motif counts. Those algorithms, however, are not yet capable of scaling well to very large graphs with billions of edges. In this paper we develop novel tools for the "motif counting via color coding" framework. As a result, our new algorithm, Motivo, is able to scale well to larger graphs while at…

    Submitted 4 June, 2019; originally announced June 2019.

    Comments: 13 pages

  18. arXiv:1905.11902

    cs.LG stat.ML

    Correlation Clustering with Adaptive Similarity Queries

    Authors: Marco Bressan, Nicolò Cesa-Bianchi, Andrea Paudice, Fabio Vitale

    Abstract: In correlation clustering, we are given $n$ objects together with a binary similarity score between each pair of them. The goal is to partition the objects into clusters so as to minimise the disagreements with the scores. In this work we investigate correlation clustering as an active learning problem: each similarity score can be learned by making a query, and the goal is to minimise both the disag…

    Submitted 14 January, 2020; v1 submitted 28 May, 2019; originally announced May 2019.

  19. arXiv:1805.02089

    cs.CC

    Faster algorithms for counting subgraphs in sparse graphs

    Authors: Marco Bressan

    Abstract: Given a $k$-node pattern graph $H$ and an $n$-node host graph $G$, the subgraph counting problem asks to compute the number of copies of $H$ in $G$. In this work we address the following question: can we count the copies of $H$ faster if $G$ is sparse? We answer in the affirmative by introducing a novel tree-like decomposition for directed acyclic graphs, inspired by the classic tree decomposition…

    Submitted 30 August, 2020; v1 submitted 5 May, 2018; originally announced May 2018.

    Comments: Extended version of a work appeared at IPEC 2019

  20. arXiv:1801.00196

    cs.DM

    On approximating the stationary distribution of time-reversible Markov chains

    Authors: Marco Bressan, Enoch Peserico, Luca Pretto

    Abstract: Approximating the stationary probability of a state in a Markov chain through Markov chain Monte Carlo techniques is, in general, inefficient. Standard random walk approaches require $\tilde{O}(τ/π(v))$ operations to approximate the probability $π(v)$ of a state $v$ in a chain with mixing time $τ$, and even the best available techniques still have complexity $\tilde{O}(τ^{1.5}/π(v)^{0.5})$, and si…

    Submitted 30 December, 2017; originally announced January 2018.

    Comments: Full version of a paper accepted at STACS 2018. 18 pages, 1 figure

  21. arXiv:1607.04263

    cs.SI physics.soc-ph

    The Limits of Popularity-Based Recommendations, and the Role of Social Ties

    Authors: Marco Bressan, Stefano Leucci, Alessandro Panconesi, Prabhakar Raghavan, Erisa Terolli

    Abstract: In this paper we introduce a mathematical model that captures some of the salient features of recommender systems that are based on popularity and that try to exploit social ties among the users. We show that, under very general conditions, the market always converges to a steady state, for which we are able to give an explicit form. Thanks to this we can tell rather precisely how much a market is…

    Submitted 14 July, 2016; originally announced July 2016.

    Comments: 10 pages, 9 figures, KDD 2016

  22. arXiv:1604.00202

    cs.DM cs.SI

    The Power of Local Information in PageRank

    Authors: Marco Bressan, Enoch Peserico, Luca Pretto

    Abstract: How large a fraction of a graph must one explore to rank a small set of nodes according to their PageRank scores? We show that the answer is quite nuanced, and depends crucially on the interplay between the correctness guarantees one requires and the way one can access the graph. On the one hand, assuming the graph can be accessed only via "natural" exploration queries that reveal small pieces of…

    Submitted 1 April, 2016; originally announced April 2016.

    Comments: 25 pages, 6 figures

  23. arXiv:1512.07901

    cs.DM

    Simple set cardinality estimation through random sampling

    Authors: Marco Bressan, Enoch Peserico, Luca Pretto

    Abstract: We present a simple algorithm that estimates the cardinality $n$ of a set $V$ when allowed to sample elements of $V$ uniformly and independently at random. Our algorithm with probability $(1-δ)$ returns a $(1 \pm ε)$-approximation of $n$ drawing $O\big(\sqrt{n} \cdot ε^{-1}\sqrt{\log(δ^{-1})}\big)$ samples (for $ε^{-1}\sqrt{\log(δ^{-1})} = O(\sqrt{n})$).

    Submitted 11 April, 2018; v1 submitted 24 December, 2015; originally announced December 2015.

    Comments: 3 pages

  24. arXiv:1404.1864

    cs.DS cs.IR cs.SI

    Sublinear algorithms for local graph centrality estimation

    Authors: Marco Bressan, Enoch Peserico, Luca Pretto

    Abstract: We study the complexity of local graph centrality estimation, with the goal of approximating the centrality score of a given target node while exploring only a sublinear number of nodes/arcs of the graph and performing a sublinear number of elementary operations. We develop a technique, that we apply to the PageRank and Heat Kernel centralities, for building a low-variance score estimator through…

    Submitted 4 August, 2018; v1 submitted 7 April, 2014; originally announced April 2014.

    Comments: 29 pages, 1 figure