Showing 1–43 of 43 results for author: von Luxburg, U

Searching in archive cs.
  1. arXiv:2506.15366 [pdf, ps, other]

    stat.ML cs.CY cs.LG

    Performative Validity of Recourse Explanations

    Authors: Gunnar König, Hidde Fokkema, Timo Freiesleben, Celestine Mendler-Dünner, Ulrike von Luxburg

    Abstract: When applicants get rejected by an algorithmic decision system, recourse explanations provide actionable suggestions for how to change their input features to get a positive evaluation. A crucial yet overlooked phenomenon is that recourse explanations are performative: When many applicants act according to their recommendations, their collective behavior may change statistical regularities in the…

    Submitted 18 June, 2025; originally announced June 2025.

    Comments: 34 pages, 3 figures, 1 table, Preprint

  2. arXiv:2505.22491 [pdf, other]

    cs.LG cs.AI stat.ML

    On the Surprising Effectiveness of Large Learning Rates under Standard Width Scaling

    Authors: Moritz Haas, Sebastian Bordt, Ulrike von Luxburg, Leena Chennuru Vankadara

    Abstract: The dominant paradigm for training large-scale vision and language models is He initialization and a single global learning rate (standard parameterization, SP). Despite its practical success, standard parameterization remains poorly understood from a theoretical perspective: Existing infinite-width theory would predict instability under large learning rates and vanishing feature learning…

    Submitted 28 May, 2025; originally announced May 2025.

  3. arXiv:2503.23111 [pdf, other]

    cs.LG cs.AI stat.ML

    How to safely discard features based on aggregate SHAP values

    Authors: Robi Bhattacharjee, Karolin Frohnapfel, Ulrike von Luxburg

    Abstract: SHAP is one of the most popular local feature-attribution methods. Given a function f and an input x, it quantifies each feature's contribution to f(x). Recently, SHAP has been increasingly used for global insights: practitioners average the absolute SHAP values over many data points to compute global feature importance scores, which are then used to discard unimportant features. In this work, we…

    Submitted 29 March, 2025; originally announced March 2025.

  4. arXiv:2410.23772 [pdf, other]

    cs.LG stat.ML

    Disentangling Interactions and Dependencies in Feature Attribution

    Authors: Gunnar König, Eric Günther, Ulrike von Luxburg

    Abstract: In explainable machine learning, global feature importance methods try to determine how much each individual feature contributes to predicting the target variable, resulting in one importance score for each feature. But often, predicting the target variable requires interactions between several features (such as in the XOR function), and features might have complex statistical dependencies that al…

    Submitted 31 October, 2024; originally announced October 2024.

    Comments: GK and EG contributed equally to this article

  5. arXiv:2410.03249 [pdf, ps, other]

    cs.LG cs.AI cs.CL

    How Much Can We Forget about Data Contamination?

    Authors: Sebastian Bordt, Suraj Srinivas, Valentyn Boreiko, Ulrike von Luxburg

    Abstract: The leakage of benchmark data into the training data has emerged as a significant challenge for evaluating the capabilities of large language models (LLMs). In this work, we challenge the common assumption that small-scale contamination renders benchmark evaluations invalid. First, we experimentally quantify the magnitude of benchmark overfitting based on scaling along three dimensions: The number…

    Submitted 16 June, 2025; v1 submitted 4 October, 2024; originally announced October 2024.

    Comments: ICML 2025 camera ready

  6. arXiv:2407.13281 [pdf, other]

    cs.LG

    Auditing Local Explanations is Hard

    Authors: Robi Bhattacharjee, Ulrike von Luxburg

    Abstract: In sensitive contexts, providers of machine learning algorithms are increasingly required to give explanations for their algorithms' decisions. However, explanation receivers might not trust the provider, who potentially could output misleading or manipulated explanations. In this work, we investigate an auditing framework in which a third-party auditor or a collective of users attempts to sanity-…

    Submitted 18 July, 2024; originally announced July 2024.

    Comments: 40 pages

  7. arXiv:2402.02870 [pdf, ps, other]

    cs.LG

    Rethinking Explainable Machine Learning as Applied Statistics

    Authors: Sebastian Bordt, Eric Raidl, Ulrike von Luxburg

    Abstract: In the rapidly growing literature on explanation algorithms, it often remains unclear what precisely these algorithms are for and how they should be used. In this position paper, we argue for a novel and pragmatic perspective: Explainable machine learning needs to recognize its parallels with applied statistics. Concretely, explanations are statistics of high-dimensional functions, and we should t…

    Submitted 16 June, 2025; v1 submitted 5 February, 2024; originally announced February 2024.

    Comments: ICML 2025 camera ready

  8. arXiv:2305.14077 [pdf, other]

    stat.ML cs.LG math.ST

    Mind the spikes: Benign overfitting of kernels and neural networks in fixed dimension

    Authors: Moritz Haas, David Holzmüller, Ulrike von Luxburg, Ingo Steinwart

    Abstract: The success of over-parameterized neural networks trained to near-zero training error has caused great interest in the phenomenon of benign overfitting, where estimators are statistically consistent even though they interpolate noisy training data. While benign overfitting in fixed dimension has been established for some learning methods, current literature suggests that for regression with typica…

    Submitted 6 November, 2024; v1 submitted 23 May, 2023; originally announced May 2023.

    Comments: Compared to the NeurIPS version (v2), this version strengthens Assumption (K) from d/2<s<=3d/4 to d/2<s<3d/4 and corrects Lemma B.2 by posing additional assumptions. This does not affect any other statements. We provide Python code to reproduce all of our experimental results at https://github.com/moritzhaas/mind-the-spikes

  9. arXiv:2303.09461 [pdf, other]

    cs.CL cs.CY

    ChatGPT Participates in a Computer Science Exam

    Authors: Sebastian Bordt, Ulrike von Luxburg

    Abstract: We asked ChatGPT to participate in an undergraduate computer science exam on "Algorithms and Data Structures". The program was evaluated on the entire exam as posed to the students. We hand-copied its answers onto an exam sheet, which was subsequently graded in a blind setup alongside those of 200 participating students. We find that ChatGPT narrowly passed the exam, obtaining 20.5 out of 40 poi…

    Submitted 22 March, 2023; v1 submitted 8 March, 2023; originally announced March 2023.

  10. arXiv:2303.04217 [pdf, other]

    cs.AI cs.CY

    AI for Science: An Emerging Agenda

    Authors: Philipp Berens, Kyle Cranmer, Neil D. Lawrence, Ulrike von Luxburg, Jessica Montgomery

    Abstract: This report documents the programme and the outcomes of Dagstuhl Seminar 22382 "Machine Learning for Science: Bridging Data-Driven and Mechanistic Modelling". Today's scientific challenges are characterised by complexity. Interconnected natural, technological, and human systems are influenced by forces acting across time- and spatial-scales, resulting in complex interactions and emergent behaviour…

    Submitted 7 March, 2023; originally announced March 2023.

  11. Pitfalls of Climate Network Construction: A Statistical Perspective

    Authors: Moritz Haas, Bedartha Goswami, Ulrike von Luxburg

    Abstract: Network-based analyses of dynamical systems have become increasingly popular in climate science. Here we address network construction from a statistical perspective and highlight the often ignored fact that the calculated correlation values are only empirical estimates. To measure spurious behaviour as deviation from a ground truth network, we simulate time-dependent isotropic random fields on the…

    Submitted 17 November, 2022; v1 submitted 5 November, 2022; originally announced November 2022.

  12. arXiv:2211.01903 [pdf, ps, other]

    stat.ML cs.LG

    A Consistent Estimator for Confounding Strength

    Authors: Luca Rendsburg, Leena Chennuru Vankadara, Debarghya Ghoshdastidar, Ulrike von Luxburg

    Abstract: Regression on observational data can fail to capture a causal relationship in the presence of unobserved confounding. Confounding strength measures this mismatch, but estimating it requires itself additional assumptions. A common assumption is the independence of causal mechanisms, which relies on concentration phenomena in high dimensions. While high dimensions enable the estimation of confoundin…

    Submitted 3 November, 2022; originally announced November 2022.

    Comments: 21 pages

  13. arXiv:2211.01858 [pdf, other]

    cs.LG

    Relating graph auto-encoders to linear models

    Authors: Solveig Klepper, Ulrike von Luxburg

    Abstract: Graph auto-encoders are widely used to construct graph representations in Euclidean vector spaces. However, it has already been pointed out empirically that linear models on many tasks can outperform graph auto-encoders. In our work, we prove that the solution space induced by graph auto-encoders is a subset of the solution space of a linear map. This demonstrates that linear embedding models have…

    Submitted 30 November, 2023; v1 submitted 3 November, 2022; originally announced November 2022.

    Comments: accepted to TMLR

  14. arXiv:2209.04012 [pdf, other]

    cs.LG

    From Shapley Values to Generalized Additive Models and back

    Authors: Sebastian Bordt, Ulrike von Luxburg

    Abstract: In explainable machine learning, local post-hoc explanation algorithms and inherently interpretable models are often seen as competing approaches. This work offers a partial reconciliation between the two by establishing a correspondence between Shapley Values and Generalized Additive Models (GAMs). We introduce $n$-Shapley Values, a parametric family of local post-hoc explanation algorithms that…

    Submitted 23 February, 2023; v1 submitted 8 September, 2022; originally announced September 2022.

    Comments: AISTATS 2023

  15. arXiv:2206.07387 [pdf, other]

    cs.LG cs.CV

    The Manifold Hypothesis for Gradient-Based Explanations

    Authors: Sebastian Bordt, Uddeshya Upadhyay, Zeynep Akata, Ulrike von Luxburg

    Abstract: When do gradient-based explanation algorithms provide perceptually-aligned explanations? We propose a criterion: the feature attributions need to be aligned with the tangent space of the data manifold. To provide evidence for this hypothesis, we introduce a framework based on variational autoencoders that allows to estimate and generate image manifolds. Through experiments across a range of differ…

    Submitted 15 July, 2024; v1 submitted 15 June, 2022; originally announced June 2022.

    Comments: Extended version of a CVPR Workshop paper, available at https://openaccess.thecvf.com/content/CVPR2023W/XAI4CV/papers/Bordt_The_Manifold_Hypothesis_for_Gradient-Based_Explanations_CVPRW_2023_paper.pdf

  16. arXiv:2203.03353 [pdf, other]

    stat.ML cs.LG

    Discovering Inductive Bias with Gibbs Priors: A Diagnostic Tool for Approximate Bayesian Inference

    Authors: Luca Rendsburg, Agustinus Kristiadi, Philipp Hennig, Ulrike von Luxburg

    Abstract: Full Bayesian posteriors are rarely analytically tractable, which is why real-world Bayesian inference heavily relies on approximate techniques. Approximations generally differ from the true posterior and require diagnostic tools to assess whether the inference can still be trusted. We investigate a new approach to diagnosing approximate inference: the approximation mismatch is attributed to a cha…

    Submitted 7 March, 2022; originally announced March 2022.

    Comments: 24 pages, 9 figures, to be published in AISTATS22

  17. arXiv:2202.09054 [pdf, other]

    stat.ML cs.LG

    Interpolation and Regularization for Causal Learning

    Authors: Leena Chennuru Vankadara, Luca Rendsburg, Ulrike von Luxburg, Debarghya Ghoshdastidar

    Abstract: We study the problem of learning causal models from observational data through the lens of interpolation and its counterpart -- regularization. A large volume of recent theoretical, as well as empirical work, suggests that, in highly complex model classes, interpolating estimators can have good statistical generalization properties and can even be optimal for statistical learning. Motivated by an…

    Submitted 18 February, 2022; originally announced February 2022.

  18. Post-Hoc Explanations Fail to Achieve their Purpose in Adversarial Contexts

    Authors: Sebastian Bordt, Michèle Finck, Eric Raidl, Ulrike von Luxburg

    Abstract: Existing and planned legislation stipulates various obligations to provide information about machine learning algorithms and their functioning, often interpreted as obligations to "explain". Many researchers suggest using post-hoc explanation algorithms for this purpose. In this paper, we combine legal, philosophical and technical arguments to show that post-hoc explanation algorithms are unsuitab…

    Submitted 10 May, 2022; v1 submitted 25 January, 2022; originally announced January 2022.

    Comments: FAccT 2022

  19. arXiv:2110.09476 [pdf, other]

    cs.LG stat.ML

    Recovery Guarantees for Kernel-based Clustering under Non-parametric Mixture Models

    Authors: Leena Chennuru Vankadara, Sebastian Bordt, Ulrike von Luxburg, Debarghya Ghoshdastidar

    Abstract: Despite the ubiquity of kernel-based clustering, surprisingly few statistical guarantees exist beyond settings that consider strong structural assumptions on the data generation process. In this work, we take a step towards bridging this gap by studying the statistical performance of kernel-based clustering algorithms under non-parametric mixture models. We provide necessary and sufficient separab…

    Submitted 18 October, 2021; originally announced October 2021.

  20. arXiv:2107.04381 [pdf, other]

    cs.LG stat.ML

    Specialists Outperform Generalists in Ensemble Classification

    Authors: Sascha Meyen, Frieder Göppert, Helen Alber, Ulrike von Luxburg, Volker H. Franz

    Abstract: Consider an ensemble of $k$ individual classifiers whose accuracies are known. Upon receiving a test point, each of the classifiers outputs a predicted label and a confidence in its prediction for this particular test point. In this paper, we address the question of whether we can determine the accuracy of the ensemble. Surprisingly, even when classifiers are combined in the statistically optimal…

    Submitted 9 July, 2021; originally announced July 2021.

  21. arXiv:2008.11092 [pdf, other]

    stat.ML cs.AI cs.LG

    Looking Deeper into Tabular LIME

    Authors: Damien Garreau, Ulrike von Luxburg

    Abstract: In this paper, we present a thorough theoretical analysis of the default implementation of LIME in the case of tabular data. We prove that in the large sample limit, the interpretable coefficients provided by Tabular LIME can be computed in an explicit way as a function of the algorithm parameters and some expectation computations related to the black-box model. When the function to explain has so…

    Submitted 18 July, 2022; v1 submitted 25 August, 2020; originally announced August 2020.

    Comments: 69 pages, 20 figures

  22. arXiv:2007.04800 [pdf, other]

    cs.LG stat.ML

    A Bandit Model for Human-Machine Decision Making with Private Information and Opacity

    Authors: Sebastian Bordt, Ulrike von Luxburg

    Abstract: Applications of machine learning inform human decision makers in a broad range of tasks. The resulting problem is usually formulated in terms of a single decision maker. We argue that it should rather be described as a two-player learning problem where one player is the machine and the other the human. While both players try to optimize the final decision, the setup is often characterized by (1) t…

    Submitted 3 May, 2022; v1 submitted 9 July, 2020; originally announced July 2020.

  23. arXiv:2006.14444 [pdf, other]

    cs.LG stat.ML

    Clustering with Tangles: Algorithmic Framework and Theoretical Guarantees

    Authors: Solveig Klepper, Christian Elbracht, Diego Fioravanti, Jay Lilian Kneip, Luca Rendsburg, Maximilian Teegen, Ulrike von Luxburg

    Abstract: Originally, tangles were invented as an abstract tool in mathematical graph theory to prove the famous graph minor theorem. In this paper, we showcase the practical potential of tangles in machine learning applications. Given a collection of cuts of any dataset, tangles aggregate these cuts to point in the direction of a dense structure. As a result, a cluster is softly characterized by a set of…

    Submitted 6 November, 2022; v1 submitted 25 June, 2020; originally announced June 2020.

    Comments: 39 pages

  24. arXiv:2001.03447 [pdf, other]

    cs.LG stat.ML

    Explaining the Explainer: A First Theoretical Analysis of LIME

    Authors: Damien Garreau, Ulrike von Luxburg

    Abstract: Machine learning is used more and more often for sensitive applications, sometimes replacing humans in critical decision-making processes. As such, interpretability of these algorithms is a pressing need. One popular algorithm to provide interpretability is LIME (Local Interpretable Model-Agnostic Explanation). In this paper, we provide the first theoretical analysis of LIME. We derive closed-form…

    Submitted 13 January, 2020; v1 submitted 10 January, 2020; originally announced January 2020.

    Comments: Accepted to AISTATS 2020

  25. arXiv:1912.01666 [pdf, other]

    cs.LG stat.ML

    Insights into Ordinal Embedding Algorithms: A Systematic Evaluation

    Authors: Leena Chennuru Vankadara, Siavash Haghiri, Michael Lohaus, Faiz Ul Wahab, Ulrike von Luxburg

    Abstract: The objective of ordinal embedding is to find a Euclidean representation of a set of abstract items, using only answers to triplet comparisons of the form "Is item $i$ closer to the item $j$ or item $k$?". In recent years, numerous algorithms have been proposed to solve this problem. However, there does not exist a fair and thorough assessment of these embedding methods and therefore several key q…

    Submitted 21 October, 2021; v1 submitted 3 December, 2019; originally announced December 2019.

  26. arXiv:1908.07962 [pdf, other]

    cs.LG cs.CV stat.ML

    Estimation of perceptual scales using ordinal embedding

    Authors: Siavash Haghiri, Felix Wichmann, Ulrike von Luxburg

    Abstract: In this paper, we address the problem of measuring and analysing sensation, the subjective magnitude of one's experience. We do this in the context of the method of triads: the sensation of the stimulus is evaluated via relative judgments of the form: "Is stimulus S_i more similar to stimulus S_j or to stimulus S_k?". We propose to use ordinal embedding methods from machine learning to estimate th…

    Submitted 21 August, 2019; originally announced August 2019.

  27. arXiv:1906.11655 [pdf, other]

    cs.LG stat.ML

    Uncertainty Estimates for Ordinal Embeddings

    Authors: Michael Lohaus, Philipp Hennig, Ulrike von Luxburg

    Abstract: To investigate objects without a describable notion of distance, one can gather ordinal information by asking triplet comparisons of the form "Is object $x$ closer to $y$ or is $x$ closer to $z$?" In order to learn from such data, the objects are typically embedded in a Euclidean space while satisfying as many triplet comparisons as possible. In this paper, we introduce empirical uncertainty estim…

    Submitted 27 June, 2019; originally announced June 2019.

    Comments: 16 pages

  28. arXiv:1905.07234 [pdf, other]

    cs.LG stat.ML

    Comparison-Based Framework for Psychophysics: Lab versus Crowdsourcing

    Authors: Siavash Haghiri, Patricia Rubisch, Robert Geirhos, Felix Wichmann, Ulrike von Luxburg

    Abstract: Traditionally, psychophysical experiments are conducted by repeated measurements on a few well-trained participants under well-controlled conditions, often resulting in, if done properly, high quality data. In recent years, however, crowdsourcing platforms are becoming increasingly popular means of data collection, measuring many participants at the potential cost of obtaining data of worse qualit…

    Submitted 26 July, 2019; v1 submitted 17 May, 2019; originally announced May 2019.

  29. arXiv:1811.12752 [pdf, ps, other]

    stat.ML cs.LG

    Practical methods for graph two-sample testing

    Authors: Debarghya Ghoshdastidar, Ulrike von Luxburg

    Abstract: Hypothesis testing for graphs has been an important tool in applied research fields for more than two decades, and still remains a challenging problem as one often needs to draw inference from few replicates of large graphs. Recent studies in statistics and learning theory have provided some theoretical insights about such high-dimensional graph testing problems, but the practicality of the develo…

    Submitted 30 November, 2018; originally announced November 2018.

    Comments: To appear in Neural Information Processing Systems 2018

  30. arXiv:1811.00928 [pdf, other]

    stat.ML cs.LG

    Foundations of Comparison-Based Hierarchical Clustering

    Authors: Debarghya Ghoshdastidar, Michaël Perrot, Ulrike von Luxburg

    Abstract: We address the classical problem of hierarchical clustering, but in a framework where one does not have access to a representation of the objects or their pairwise similarities. Instead, we assume that only a set of comparisons between objects is available, that is, statements of the form "objects $i$ and $j$ are more similar than objects $k$ and $l$." Such a scenario is commonly encountered in cr…

    Submitted 12 June, 2019; v1 submitted 2 November, 2018; originally announced November 2018.

    Comments: 26 pages

  31. arXiv:1810.13333 [pdf, other]

    stat.ML cs.LG

    Boosting for Comparison-Based Learning

    Authors: Michaël Perrot, Ulrike von Luxburg

    Abstract: We consider the problem of classification in a comparison-based setting: given a set of objects, we only have access to triplet comparisons of the form "object $x_i$ is closer to object $x_j$ than to object $x_k$." In this paper we introduce TripletBoost, a new method that can learn a classifier just from such triplet comparisons. The main idea is to aggregate the triplets information into weak cl…

    Submitted 29 May, 2019; v1 submitted 31 October, 2018; originally announced October 2018.

    Comments: This is the extended version (38 pages) of a paper accepted to the International Joint Conference on Artificial Intelligence (IJCAI) 2019

  32. arXiv:1806.06616 [pdf, other]

    stat.ML cs.LG

    Comparison-Based Random Forests

    Authors: Siavash Haghiri, Damien Garreau, Ulrike von Luxburg

    Abstract: Assume we are given a set of items from a general metric space, but we neither have access to the representation of the data nor to the distances between data points. Instead, suppose that we can actively choose a triplet of items (A,B,C) and ask an oracle whether item A is closer to item B or to item C. In this paper, we propose a novel random forest algorithm for regression and classification th…

    Submitted 18 June, 2018; originally announced June 2018.

    Comments: Accepted at ICML 2018, camera-ready version (32 pages, 14 figures)

  33. arXiv:1708.09794 [pdf, other]

    cs.DL cs.LG cs.SI stat.ML

    Design and Analysis of the NIPS 2016 Review Process

    Authors: Nihar B. Shah, Behzad Tabibian, Krikamol Muandet, Isabelle Guyon, Ulrike von Luxburg

    Abstract: Neural Information Processing Systems (NIPS) is a top-tier annual conference in machine learning. The 2016 edition of the conference comprised more than 2,400 paper submissions, 3,000 reviewers, and 8,000 attendees. This represents a growth of nearly 40% in terms of submissions, 96% in terms of reviewers, and over 100% in terms of attendees as compared to the previous year. The massive scale as we…

    Submitted 23 April, 2018; v1 submitted 31 August, 2017; originally announced August 2017.

  34. arXiv:1704.01460 [pdf, other]

    stat.ML cs.DS cs.LG

    Comparison Based Nearest Neighbor Search

    Authors: Siavash Haghiri, Debarghya Ghoshdastidar, Ulrike von Luxburg

    Abstract: We consider machine learning in a comparison-based setting where we are given a set of points in a metric space, but we have no access to the actual distances between the points. Instead, we can only ask an oracle whether the distance between two points $i$ and $j$ is smaller than the distance between the points $i$ and $k$. We are concerned with data structures and algorithms to find nearest neig…

    Submitted 5 April, 2017; originally announced April 2017.

    Comments: 16 Pages, 3 Figures

  35. arXiv:1607.08456 [pdf, other]

    stat.ML cs.DS cs.LG

    Kernel functions based on triplet comparisons

    Authors: Matthäus Kleindessner, Ulrike von Luxburg

    Abstract: Given only information in the form of similarity triplets "Object A is more similar to object B than to object C" about a data set, we propose two ways of defining a kernel function on the data set. While previous approaches construct a low-dimensional Euclidean embedding of the data set that reflects the given similarity triplets, we aim at defining kernel functions that correspond to high-dimens…

    Submitted 29 October, 2017; v1 submitted 28 July, 2016; originally announced July 2016.

  36. arXiv:1602.07194 [pdf, other]

    stat.ML cs.DS cs.LG

    Lens depth function and k-relative neighborhood graph: versatile tools for ordinal data analysis

    Authors: Matthäus Kleindessner, Ulrike von Luxburg

    Abstract: In recent years it has become popular to study machine learning problems in a setting of ordinal distance information rather than numerical distance measurements. By ordinal distance information we refer to binary answers to distance comparisons such as $d(A,B)<d(C,D)$. For many problems in machine learning and statistics it is unclear how to solve them in such a scenario. Up to now, the main appr…

    Submitted 24 July, 2017; v1 submitted 23 February, 2016; originally announced February 2016.

    Journal ref: Journal of Machine Learning Research 18(58):1-52, 2017

  37. arXiv:1506.00852 [pdf, other]

    cs.LG stat.ML

    Peer Grading in a Course on Algorithms and Data Structures: Machine Learning Algorithms do not Improve over Simple Baselines

    Authors: Mehdi S. M. Sajjadi, Morteza Alamgir, Ulrike von Luxburg

    Abstract: Peer grading is the process of students reviewing each others' work, such as homework submissions, and has lately become a popular mechanism used in massive open online courses (MOOCs). Intrigued by this idea, we used it in a course on algorithms and data structures at the University of Hamburg. Throughout the whole semester, students repeatedly handed in submissions to exercises, which were then…

    Submitted 10 February, 2016; v1 submitted 2 June, 2015; originally announced June 2015.

    Comments: Published at the Third Annual ACM Conference on Learning at Scale L@S

  38. arXiv:1206.6381 [pdf, other]

    cs.LG stat.ML

    Shortest path distance in random k-nearest neighbor graphs

    Authors: Morteza Alamgir, Ulrike von Luxburg

    Abstract: Consider a weighted or unweighted k-nearest neighbor graph that has been built on n data points drawn randomly according to some density p on R^d. We study the convergence of the shortest path distance in such graphs as the sample size tends to infinity. We prove that for unweighted kNN graphs, this distance converges to an unpleasant distance function on the underlying space whose properties are…

    Submitted 9 July, 2012; v1 submitted 27 June, 2012; originally announced June 2012.

    Comments: Appears in Proceedings of the 29th International Conference on Machine Learning (ICML 2012)

  39. arXiv:1105.0540 [pdf, ps, other]

    stat.ML cs.LG

    Pruning nearest neighbor cluster trees

    Authors: Samory Kpotufe, Ulrike von Luxburg

    Abstract: Nearest neighbor (k-NN) graphs are widely used in machine learning and data mining applications, and our aim is to better understand what they reveal about the cluster structure of the unknown underlying distribution of points. Moreover, is it possible to identify spurious structures that might arise due to sampling variability? Our first contribution is a statistical analysis that reveals how c…

    Submitted 5 May, 2011; v1 submitted 3 May, 2011; originally announced May 2011.

  40. arXiv:1102.2075 [pdf, other]

    stat.ML cs.DS

    How the result of graph clustering methods depends on the construction of the graph

    Authors: Markus Maier, Ulrike von Luxburg, Matthias Hein

    Abstract: We study the scenario of graph-based clustering algorithms such as spectral clustering. Given a set of data points, one first has to construct a graph on the data points and then apply a graph clustering algorithm to find a suitable partition of the graph. Our main question is if and how the construction of the graph (choice of the graph, choice of parameters, choice of weights) influences the out…

    Submitted 10 February, 2011; originally announced February 2011.

  41. arXiv:1003.1266 [pdf, other]

    cs.DS cs.LG math.PR

    Hitting and commute times in large graphs are often misleading

    Authors: Ulrike von Luxburg, Agnes Radl, Matthias Hein

    Abstract: Next to the shortest path distance, the second most popular distance function between vertices in a graph is the commute distance (resistance distance). For two vertices u and v, the hitting time H_{uv} is the expected time it takes a random walk to travel from u to v. The commute time is its symmetrized version C_{uv} = H_{uv} + H_{vu}. In our paper we study the behavior of hitting times and comm…

    Submitted 26 May, 2011; v1 submitted 5 March, 2010; originally announced March 2010.

  42. arXiv:0711.0189 [pdf, other]

    cs.DS cs.LG

    A Tutorial on Spectral Clustering

    Authors: Ulrike von Luxburg

    Abstract: In recent years, spectral clustering has become one of the most popular modern clustering algorithms. It is simple to implement, can be solved efficiently by standard linear algebra software, and very often outperforms traditional clustering algorithms such as the k-means algorithm. On the first glance spectral clustering appears slightly mysterious, and it is not obvious to see why it works at…

    Submitted 1 November, 2007; originally announced November 2007.

    Journal ref: Statistics and Computing 17(4), 2007

  43. arXiv:math/0608522 [pdf, ps, other]

    math.ST cs.LG

    Graph Laplacians and their convergence on random neighborhood graphs

    Authors: Matthias Hein, Jean-Yves Audibert, Ulrike von Luxburg

    Abstract: Given a sample from a probability measure with support on a submanifold in Euclidean space one can construct a neighborhood graph which can be seen as an approximation of the submanifold. The graph Laplacian of such a graph is used in several machine learning methods like semi-supervised learning, dimensionality reduction and clustering. In this paper we determine the pointwise limit of three di…

    Submitted 27 June, 2007; v1 submitted 21 August, 2006; originally announced August 2006.

    Comments: Improved presentation, typos corrected, to appear in JMLR

    MSC Class: 62H12 (Primary) 62H30; 62G99 (Secondary)