Search | arXiv e-print repository

Doing More With Less: Mismatch-Based Risk-Limiting Audits

Authors: Alexander Ek, Michelle Blom, Philip B. Stark, Peter J. Stuckey, Vanessa J. Teague, Damjan Vukcevic

Abstract: One approach to risk-limiting audits (RLAs) compares randomly selected cast vote records (CVRs) to votes read by human auditors from the corresponding ballot cards. Historically, such methods reduce audit sample sizes by considering how each sampled CVR differs from the corresponding true vote, not merely whether they differ. Here we investigate the latter approach, auditing by testing whether the… ▽ More One approach to risk-limiting audits (RLAs) compares randomly selected cast vote records (CVRs) to votes read by human auditors from the corresponding ballot cards. Historically, such methods reduce audit sample sizes by considering how each sampled CVR differs from the corresponding true vote, not merely whether they differ. Here we investigate the latter approach, auditing by testing whether the total number of mismatches in the full set of CVRs exceeds the minimum number of CVR errors required for the reported outcome to be wrong (the "CVR margin"). This strategy makes it possible to audit more social choice functions and simplifies RLAs conceptually, which makes it easier to explain than some other RLA approaches. The cost is larger sample sizes. "Mismatch-based RLAs" only require a lower bound on the CVR margin, which for some social choice functions is easier to calculate than the effect of particular errors. When the population rate of mismatches is low and the lower bound on the CVR margin is close to the true CVR margin, the increase in sample size is small. However, the increase may be very large when errors include errors that, if corrected, would widen the CVR margin rather than narrow it; errors affect the margin between candidates other than the reported winner with the fewest votes and the reported loser with the most votes; or errors that affect different margins. △ Less

Submitted 20 March, 2025; originally announced March 2025.

Comments: 15 pages, 2 figures, accepted for Voting'25

arXiv:2407.16465 [pdf, other]

doi 10.1007/978-3-031-72244-8_3

Improving the Computational Efficiency of Adaptive Audits of IRV Elections

Authors: Alexander Ek, Michelle Blom, Philip B. Stark, Peter J. Stuckey, Damjan Vukcevic

Abstract: AWAIRE is one of two extant methods for conducting risk-limiting audits of instant-runoff voting (IRV) elections. In principle AWAIRE can audit IRV contests with any number of candidates, but the original implementation incurred memory and computation costs that grew superexponentially with the number of candidates. This paper improves the algorithmic implementation of AWAIRE in three ways that ma… ▽ More AWAIRE is one of two extant methods for conducting risk-limiting audits of instant-runoff voting (IRV) elections. In principle AWAIRE can audit IRV contests with any number of candidates, but the original implementation incurred memory and computation costs that grew superexponentially with the number of candidates. This paper improves the algorithmic implementation of AWAIRE in three ways that make it practical to audit IRV contests with 55 candidates, compared to the previous 6 candidates. First, rather than trying from the start to rule out all candidate elimination orders that produce a different winner, the algorithm starts by considering only the final round, testing statistically whether each candidate could have won that round. For those candidates who cannot be ruled out at that stage, it expands to consider earlier and earlier rounds until either it provides strong evidence that the reported winner really won or a full hand count is conducted, revealing who really won. Second, it tests a richer collection of conditions, some of which can rule out many elimination orders at once. Third, it exploits relationships among those conditions, allowing it to abandon testing those that are unlikely to help. We provide real-world examples with up to 36 candidates and synthetic examples with up to 55 candidates, showing how audit sample size depends on the margins and on the tuning parameters. An open-source Python implementation is publicly available. △ Less

Submitted 23 July, 2024; originally announced July 2024.

Comments: 16 pages, 4 figures, accepted for E-Vote-ID 2024

Journal ref: Electronic Voting, E-Vote-ID 2024, Lecture Notes in Computer Science 15014 (2024) 37-53

arXiv:2403.15400 [pdf, other]

doi 10.1007/978-3-031-69231-4_2

Efficient Weighting Schemes for Auditing Instant-Runoff Voting Elections

Authors: Alexander Ek, Philip B. Stark, Peter J. Stuckey, Damjan Vukcevic

Abstract: Various risk-limiting audit (RLA) methods have been developed for instant-runoff voting (IRV) elections. A recent method, AWAIRE, is the first efficient approach that can take advantage of but does not require cast vote records (CVRs). AWAIRE involves adaptively weighted averages of test statistics, essentially "learning" an effective set of hypotheses to test. However, the initial paper on AWAIRE… ▽ More Various risk-limiting audit (RLA) methods have been developed for instant-runoff voting (IRV) elections. A recent method, AWAIRE, is the first efficient approach that can take advantage of but does not require cast vote records (CVRs). AWAIRE involves adaptively weighted averages of test statistics, essentially "learning" an effective set of hypotheses to test. However, the initial paper on AWAIRE only examined a few weighting schemes and parameter settings. We explore schemes and settings more extensively, to identify and recommend efficient choices for practice. We focus on the case where CVRs are not available, assessing performance using simulations based on real election data. The most effective schemes are often those that place most or all of the weight on the apparent "best" hypotheses based on already seen data. Conversely, the optimal tuning parameters tended to vary based on the election margin. Nonetheless, we quantify the performance trade-offs for different choices across varying election margins, aiding in selecting the most desirable trade-off if a default option is needed. A limitation of the current AWAIRE implementation is its restriction to a small number of candidates -- up to six in previous implementations. One path to a more computationally efficient implementation would be to use lazy evaluation and avoid considering all possible hypotheses. Our findings suggest that such an approach could be done without substantially compromising statistical performance. △ Less

Submitted 6 May, 2024; v1 submitted 18 February, 2024; originally announced March 2024.

Comments: 15 pages, 4, figures, presented at Voting'24. The current version includes some improved wording and fixes a few errors

Journal ref: FC 2024 Workshops, Lecture Notes in Computer Science 14746 (2025) 18-32

arXiv:2307.10972 [pdf, other]

doi 10.1007/978-3-031-43756-4_3

Adaptively Weighted Audits of Instant-Runoff Voting Elections: AWAIRE

Authors: Alexander Ek, Philip B. Stark, Peter J. Stuckey, Damjan Vukcevic

Abstract: An election audit is risk-limiting if the audit limits (to a pre-specified threshold) the chance that an erroneous electoral outcome will be certified. Extant methods for auditing instant-runoff voting (IRV) elections are either not risk-limiting or require cast vote records (CVRs), the voting system's electronic record of the votes on each ballot. CVRs are not always available, for instance, in j… ▽ More An election audit is risk-limiting if the audit limits (to a pre-specified threshold) the chance that an erroneous electoral outcome will be certified. Extant methods for auditing instant-runoff voting (IRV) elections are either not risk-limiting or require cast vote records (CVRs), the voting system's electronic record of the votes on each ballot. CVRs are not always available, for instance, in jurisdictions that tabulate IRV contests manually. We develop an RLA method (AWAIRE) that uses adaptively weighted averages of test supermartingales to efficiently audit IRV elections when CVRs are not available. The adaptive weighting 'learns' an efficient set of hypotheses to test to confirm the election outcome. When accurate CVRs are available, AWAIRE can use them to increase the efficiency to match the performance of existing methods that require CVRs. We provide an open-source prototype implementation that can handle elections with up to six candidates. Simulations using data from real elections show that AWAIRE is likely to be efficient in practice. We discuss how to extend the computational approach to handle elections with more candidates. Adaptively weighted averages of test supermartingales are a general tool, useful beyond election audits to test collections of hypotheses sequentially while rigorously controlling the familywise error rate. △ Less

Submitted 5 October, 2023; v1 submitted 20 July, 2023; originally announced July 2023.

Comments: 16 pages, 3 figures. Presented at E-Vote-ID 2023. This version contains minor corrections to match the final published version

Journal ref: Electronic Voting, E-Vote-ID 2023, Lecture Notes in Computer Science 14230 (2023) 35-51

arXiv:2209.03881 [pdf, other]

doi 10.1007/978-3-031-25460-4_30

Ballot-Polling Audits of Instant-Runoff Voting Elections with a Dirichlet-Tree Model

Authors: Floyd Everest, Michelle Blom, Philip B. Stark, Peter J. Stuckey, Vanessa Teague, Damjan Vukcevic

Abstract: Instant-runoff voting (IRV) is used in several countries around the world. It requires voters to rank candidates in order of preference, and uses a counting algorithm that is more complex than systems such as first-past-the-post or scoring rules. An even more complex system, the single transferable vote (STV), is used when multiple candidates need to be elected. The complexity of these systems has… ▽ More Instant-runoff voting (IRV) is used in several countries around the world. It requires voters to rank candidates in order of preference, and uses a counting algorithm that is more complex than systems such as first-past-the-post or scoring rules. An even more complex system, the single transferable vote (STV), is used when multiple candidates need to be elected. The complexity of these systems has made it difficult to audit the election outcomes. There is currently no known risk-limiting audit (RLA) method for STV, other than a full manual count of the ballots. A new approach to auditing these systems was recently proposed, based on a Dirichlet-tree model. We present a detailed analysis of this approach for ballot-polling Bayesian audits of IRV elections. We compared several choices for the prior distribution, including some approaches using a Bayesian bootstrap (equivalent to an improper prior). Our findings include that the bootstrap-based approaches can be adapted to perform similarly to a full Bayesian model in practice, and that an overly informative prior can give counter-intuitive results. Via carefully chosen examples, we show why creating an RLA with this model is challenging, but we also suggest ways to overcome this. As well as providing a practical and computationally feasible implementation of a Bayesian IRV audit, our work is important in laying the foundation for an RLA for STV elections. △ Less

Submitted 23 February, 2023; v1 submitted 8 September, 2022; originally announced September 2022.

Comments: 17 pages, 6 figures. Presented at EIS 2022. This version contains minor corrections to match the final published version

Journal ref: ESORICS 2022 Workshops, EIS 2022, Lecture Notes in Computer Science 13785 (2023) 525-540

arXiv:2206.14605 [pdf, other]

doi 10.15157/diss/021

Auditing Ranked Voting Elections with Dirichlet-Tree Models: First Steps

Authors: Floyd Everest, Michelle Blom, Philip B. Stark, Peter J. Stuckey, Vanessa Teague, Damjan Vukcevic

Abstract: Ranked voting systems, such as instant-runoff voting (IRV) and single transferable vote (STV), are used in many places around the world. They are more complex than plurality and scoring rules, presenting a challenge for auditing their outcomes: there is no known risk-limiting audit (RLA) method for STV other than a full hand count. We present a new approach to auditing ranked systems that uses a… ▽ More Ranked voting systems, such as instant-runoff voting (IRV) and single transferable vote (STV), are used in many places around the world. They are more complex than plurality and scoring rules, presenting a challenge for auditing their outcomes: there is no known risk-limiting audit (RLA) method for STV other than a full hand count. We present a new approach to auditing ranked systems that uses a statistical model, a Dirichlet-tree, that can cope with high-dimensional parameters in a computationally efficient manner. We demonstrate this approach with a ballot-polling Bayesian audit for IRV elections. Although the technique is not known to be risk-limiting, we suggest some strategies that might allow it to be calibrated to limit risk. △ Less

Submitted 8 September, 2022; v1 submitted 29 June, 2022; originally announced June 2022.

Comments: 5 pages, 2 figures, accepted for E-Vote-ID 2022. This version has an updated URL for the software package

Journal ref: E-Vote-ID 2022, Conference Proceedings, UT Press, pages 76-80

arXiv:2205.14634 [pdf, other]

Assessing the accuracy of the Australian Senate count: Key steps for a rigorous and transparent audit

Authors: Michelle Blom, Philip B. Stark, Peter J. Stuckey, Vanessa Teague, Damjan Vukcevic

Abstract: This paper explains the main principles and some of the technical details for auditing the scanning and digitisation of the Australian Senate ballot papers. We give a short summary of the motivation for auditing paper ballots, explain the necessary supporting steps for a rigorous and transparent audit, and suggest some statistical methods that would be appropriate for the Australian Senate. 22 J… ▽ More This paper explains the main principles and some of the technical details for auditing the scanning and digitisation of the Australian Senate ballot papers. We give a short summary of the motivation for auditing paper ballots, explain the necessary supporting steps for a rigorous and transparent audit, and suggest some statistical methods that would be appropriate for the Australian Senate. 22 June 2022 Update: The update includes analysis of Senate preference data from the 2022 Australian election. △ Less

Submitted 22 June, 2022; v1 submitted 29 May, 2022; originally announced May 2022.

arXiv:2009.06921 [pdf, ps, other]

Optimal Decision Trees for Nonlinear Metrics

Authors: Emir Demirović, Peter J. Stuckey

Abstract: Nonlinear metrics, such as the F1-score, Matthews correlation coefficient, and Fowlkes-Mallows index, are often used to evaluate the performance of machine learning models, in particular, when facing imbalanced datasets that contain more samples of one class than the other. Recent optimal decision tree algorithms have shown remarkable progress in producing trees that are optimal with respect to li… ▽ More Nonlinear metrics, such as the F1-score, Matthews correlation coefficient, and Fowlkes-Mallows index, are often used to evaluate the performance of machine learning models, in particular, when facing imbalanced datasets that contain more samples of one class than the other. Recent optimal decision tree algorithms have shown remarkable progress in producing trees that are optimal with respect to linear criteria, such as accuracy, but unfortunately nonlinear metrics remain a challenge. To address this gap, we propose a novel algorithm based on bi-objective optimisation, which treats misclassifications of each binary class as a separate objective. We show that, for a large class of metrics, the optimal tree lies on the Pareto frontier. Consequently, we obtain the optimal tree by using our method to generate the set of all nondominated trees. To the best of our knowledge, this is the first method to compute provably optimal decision trees for nonlinear metrics. Our approach leads to a trade-off when compared to optimising linear metrics: the resulting trees may be more desirable according to the given nonlinear metric at the expense of higher runtimes. Nevertheless, the experiments illustrate that runtimes are reasonable for majority of the tested datasets. △ Less

Submitted 15 October, 2021; v1 submitted 15 September, 2020; originally announced September 2020.

Journal ref: AAAI 2021

arXiv:2007.12652 [pdf, other]

MurTree: Optimal Classification Trees via Dynamic Programming and Search

Authors: Emir Demirović, Anna Lukina, Emmanuel Hebrard, Jeffrey Chan, James Bailey, Christopher Leckie, Kotagiri Ramamohanarao, Peter J. Stuckey

Abstract: Decision tree learning is a widely used approach in machine learning, favoured in applications that require concise and interpretable models. Heuristic methods are traditionally used to quickly produce models with reasonably high accuracy. A commonly criticised point, however, is that the resulting trees may not necessarily be the best representation of the data in terms of accuracy and size. In r… ▽ More Decision tree learning is a widely used approach in machine learning, favoured in applications that require concise and interpretable models. Heuristic methods are traditionally used to quickly produce models with reasonably high accuracy. A commonly criticised point, however, is that the resulting trees may not necessarily be the best representation of the data in terms of accuracy and size. In recent years, this motivated the development of optimal classification tree algorithms that globally optimise the decision tree in contrast to heuristic methods that perform a sequence of locally optimal decisions. We follow this line of work and provide a novel algorithm for learning optimal classification trees based on dynamic programming and search. Our algorithm supports constraints on the depth of the tree and number of nodes. The success of our approach is attributed to a series of specialised techniques that exploit properties unique to classification trees. Whereas algorithms for optimal classification trees have traditionally been plagued by high runtimes and limited scalability, we show in a detailed experimental study that our approach uses only a fraction of the time required by the state-of-the-art and can handle datasets with tens of thousands of instances, providing several orders of magnitude improvements and notably contributing towards the practical realisation of optimal decision trees. △ Less

Submitted 28 June, 2022; v1 submitted 24 July, 2020; originally announced July 2020.

Journal ref: Journal of Machine Learning Research 2022

Showing 1–9 of 9 results for author: Stuckey, P J