
Showing 1–17 of 17 results for author: Neill, D B

Searching in archive cs.
  1. arXiv:2505.17533

    cs.LG cs.AI cs.CY

    Learning Representational Disparities

    Authors: Pavan Ravishankar, Rushabh Shah, Daniel B. Neill

    Abstract: We propose a fair machine learning algorithm to model interpretable differences between observed and desired human decision-making, with the latter aimed at reducing disparity in a downstream outcome impacted by the human decision. Prior work learns fair representations without considering the outcome in the decision-making process. We model the outcome disparities as arising due to the different…

    Submitted 23 May, 2025; originally announced May 2025.

    Comments: 27 pages

  2. arXiv:2501.15634

    cs.CY cs.LG

    Be Intentional About Fairness!: Fairness, Size, and Multiplicity in the Rashomon Set

    Authors: Gordon Dai, Pavan Ravishankar, Rachel Yuan, Daniel B. Neill, Emily Black

    Abstract: When selecting a model from a set of equally performant models, how much unfairness can you really reduce? Is it important to be intentional about fairness when choosing among this set, or is arbitrarily choosing among the set of "good" models good enough? Recent work has highlighted that the phenomenon of model multiplicity, where multiple models with nearly identical predictive accuracy exist f…

    Submitted 26 January, 2025; originally announced January 2025.

    Comments: 34 pages

  3. arXiv:2306.13064

    cs.LG

    Auditing Predictive Models for Intersectional Biases

    Authors: Kate S. Boxer, Edward McFowland III, Daniel B. Neill

    Abstract: Predictive models that satisfy group fairness criteria in aggregate for members of a protected class, but do not guarantee subgroup fairness, could produce biased predictions for individuals at the intersection of two or more protected classes. To address this risk, we propose Conditional Bias Scan (CBS), a flexible auditing framework for detecting intersectional biases in classification models. C…

    Submitted 22 June, 2023; originally announced June 2023.

    Comments: 29 pages, 7 figures

  4. arXiv:2306.11181

    cs.LG cs.CY stat.ML

    Insufficiently Justified Disparate Impact: A New Criterion for Subgroup Fairness

    Authors: Neil Menghani, Edward McFowland III, Daniel B. Neill

    Abstract: In this paper, we develop a new criterion, "insufficiently justified disparate impact" (IJDI), for assessing whether recommendations (binarized predictions) made by an algorithmic decision support tool are fair. Our novel, utility-based IJDI criterion evaluates false positive and false negative error rate imbalances, identifying statistically significant disparities between groups which are presen…

    Submitted 19 June, 2023; originally announced June 2023.

    Comments: 31 pages, 9 figures

  5. arXiv:2302.06752

    cs.LG cs.CY

    Provable Detection of Propagating Sampling Bias in Prediction Models

    Authors: Pavan Ravishankar, Qingyu Mo, Edward McFowland III, Daniel B. Neill

    Abstract: With an increased focus on incorporating fairness in machine learning models, it becomes imperative not only to assess and mitigate bias at each stage of the machine learning pipeline but also to understand the downstream impacts of bias across stages. Here we consider a general, but realistic, scenario in which a predictive model is learned from (potentially biased) training data, and model predi…

    Submitted 13 February, 2023; originally announced February 2023.

    Comments: AAAI 2023 (13 pages, 7 figures)

  6. arXiv:2206.12786

    stat.ME cs.AI cs.SI

    Calibrated Nonparametric Scan Statistics for Anomalous Pattern Detection in Graphs

    Authors: Chunpai Wang, Daniel B. Neill, Feng Chen

    Abstract: We propose a new approach, the calibrated nonparametric scan statistic (CNSS), for more accurate detection of anomalous patterns in large-scale, real-world graphs. Scan statistics identify connected subgraphs that are interesting or unexpected through maximization of a likelihood ratio statistic; in particular, nonparametric scan statistics (NPSSs) identify subgraphs with a higher than expected pr…

    Submitted 26 June, 2022; originally announced June 2022.

  7. arXiv:2111.10144

    cs.LG cs.CV

    Positional Encoder Graph Neural Networks for Geographic Data

    Authors: Konstantin Klemmer, Nathan Safir, Daniel B. Neill

    Abstract: Graph neural networks (GNNs) provide a powerful and scalable solution for modeling continuous spatial data. However, they often rely on Euclidean distances to construct the input graphs. This assumption can be improbable in many real-world settings, where the spatial structure is more complex and explicitly non-Euclidean (e.g., road networks). Here, we propose PE-GNN, a new framework that incorpor…

    Submitted 15 February, 2023; v1 submitted 19 November, 2021; originally announced November 2021.

    Comments: AISTATS 2023

  8. arXiv:2109.15044

    cs.LG cs.CV

    SPATE-GAN: Improved Generative Modeling of Dynamic Spatio-Temporal Patterns with an Autoregressive Embedding Loss

    Authors: Konstantin Klemmer, Tianlin Xu, Beatrice Acciaio, Daniel B. Neill

    Abstract: From ecology to atmospheric sciences, many academic disciplines deal with data characterized by intricate spatio-temporal complexities, the modeling of which often requires specialized approaches. Generative models of these data are of particular interest, as they enable a range of impactful downstream applications like simulation or creating synthetic training data. Recent work has highlighted th…

    Submitted 30 September, 2021; originally announced September 2021.

  9. arXiv:2011.06019

    stat.AP cs.LG

    Policing Chronic and Temporary Hot Spots of Violent Crime: A Controlled Field Experiment

    Authors: Dylan J. Fitzpatrick, Wilpen L. Gorr, Daniel B. Neill

    Abstract: Hot-spot-based policing programs aim to deter crime through increased proactive patrols at high-crime locations. While most hot spot programs target easily identified chronic hot spots, we introduce models for predicting temporary hot spots to address effectiveness and equity objectives for crime prevention, and present findings from a crossover experiment evaluating application of hot spot predic…

    Submitted 11 November, 2020; originally announced November 2020.

    Comments: 15 pages, 3 figures

  10. arXiv:2006.10461

    cs.LG cs.CV stat.ML

    Auxiliary-task learning for geographic data with autoregressive embeddings

    Authors: Konstantin Klemmer, Daniel B. Neill

    Abstract: Machine learning is gaining popularity in a broad range of areas working with geographic data, such as ecology or atmospheric sciences. Here, data often exhibit spatial effects, which can be difficult to learn for neural networks. In this study, we propose SXL, a method for embedding information on the autoregressive nature of spatial data directly into the learning process using auxiliary tasks.…

    Submitted 19 August, 2021; v1 submitted 18 June, 2020; originally announced June 2020.

    Comments: SIGSPATIAL 2021

  11. arXiv:1811.03939

    cs.CY stat.AP

    Modeling Rape Reporting Delays Using Spatial, Temporal and Social Features

    Authors: Konstantin Klemmer, Daniel B. Neill, Stephen A. Jarvis

    Abstract: We present a novel approach to estimate the delay observed between the occurrence and reporting of rape crimes. We explore spatial, temporal and social effects in sparse aggregated (area-level) and high-dimensional disaggregated (event-level) data for New York and Los Angeles. Focusing on inference, we apply Gradient Boosting and Random Forests to assess predictor importance, as well as Gaussian P…

    Submitted 21 November, 2018; v1 submitted 9 November, 2018; originally announced November 2018.

    Comments: Workshop on Modeling and Decision-Making in the Spatiotemporal Domain, 32nd Conference on Neural Information Processing Systems (NeurIPS 2018), Montreal, Canada

  12. arXiv:1810.11861

    stat.ML cs.AI cs.LG stat.ME

    Change Surfaces for Expressive Multidimensional Changepoints and Counterfactual Prediction

    Authors: William Herlands, Daniel B. Neill, Hannes Nickisch, Andrew Gordon Wilson

    Abstract: Identifying changes in model parameters is fundamental in machine learning and statistics. However, standard changepoint models are limited in expressiveness, often addressing unidimensional problems and assuming instantaneous changes. We introduce change surfaces as a multidimensional and highly expressive generalization of changepoints. We provide a model-agnostic formalization of change surface…

    Submitted 30 October, 2018; v1 submitted 28 October, 2018; originally announced October 2018.

  13. arXiv:1804.01466

    stat.ML cs.LG

    Gaussian Process Subset Scanning for Anomalous Pattern Detection in Non-iid Data

    Authors: William Herlands, Edward McFowland III, Andrew Gordon Wilson, Daniel B. Neill

    Abstract: Identifying anomalous patterns in real-world data is essential for understanding where, when, and how systems deviate from their expected dynamics. Yet methods that separately consider the anomalousness of each individual data point have low detection power for subtle, emerging irregularities. Additionally, recent detection techniques based on subset scanning make strong independence assumptions a…

    Submitted 4 April, 2018; originally announced April 2018.

    Comments: Presented at AISTATS 2018. 11 pages. Supplement to main paper is included here as an appendix

  14. arXiv:1710.02458

    cs.CY cs.LG stat.ML

    Machine Learning for Drug Overdose Surveillance

    Authors: Daniel B. Neill, William Herlands

    Abstract: We describe two recently proposed machine learning approaches for discovering emerging trends in fatal accidental drug overdoses. The Gaussian Process Subset Scan enables early detection of emerging patterns in spatio-temporal data, accounting for both the non-iid nature of the data and the fact that detecting subtle patterns requires integration of information across multiple spatial areas and mu…

    Submitted 6 October, 2017; originally announced October 2017.

    Comments: Presented at the Data For Good Exchange 2017

  15. arXiv:1701.01470

    stat.ML cs.SI

    Graph Structure Learning from Unlabeled Data for Event Detection

    Authors: Sriram Somanchi, Daniel B. Neill

    Abstract: Processes such as disease propagation and information diffusion often spread over some latent network structure which must be learned from observation. Given a set of unlabeled training examples representing occurrences of an event type of interest (e.g., a disease outbreak), our goal is to learn a graph structure that can be used to accurately detect future events of that type. Motivated by new t…

    Submitted 5 January, 2017; originally announced January 2017.

  16. arXiv:1611.08292

    stat.ML cs.LG

    Identifying Significant Predictive Bias in Classifiers

    Authors: Zhe Zhang, Daniel B. Neill

    Abstract: We present a novel subset scan method to detect if a probabilistic binary classifier has statistically significant bias (over- or under-predicting the risk) for some subgroup, and identify the characteristics of this subgroup. This form of model checking and goodness-of-fit test provides a way to interpretably detect the presence of classifier bias or regions of poor classifier fit. This allows…

    Submitted 4 July, 2017; v1 submitted 24 November, 2016; originally announced November 2016.

    Comments: Presented as a poster at the 2017 Workshop on Fairness, Accountability, and Transparency in Machine Learning (FAT/ML 2017); earlier version presented at NIPS 2016 Workshop on Interpretable Machine Learning in Complex Systems

  17. arXiv:1602.04393

    cs.IR stat.ML

    Semantic Scan: Detecting Subtle, Spatially Localized Events in Text Streams

    Authors: Abhinav Maurya, Kenton Murray, Yandong Liu, Chris Dyer, William W. Cohen, Daniel B. Neill

    Abstract: Early detection and precise characterization of emerging topics in text streams can be highly useful in applications such as timely and targeted public health interventions and discovering evolving regional business trends. Many methods have been proposed for detecting emerging events in text streams using topic modeling. However, these methods have numerous shortcomings that make them unsuitable…

    Submitted 13 February, 2016; originally announced February 2016.

    Comments: 10 pages, 4 figures, KDD 2016 submission