Search | arXiv e-print repository

A Review of Pseudo-Labeling for Computer Vision

Authors: Patrick Kage, Jay C. Rothenberger, Pavlos Andreadis, Dimitrios I. Diochnos

Abstract: Deep neural models have achieved state of the art performance on a wide range of problems in computer science, especially in computer vision. However, deep neural networks often require large datasets of labeled samples to generalize effectively, and an important area of active research is semi-supervised learning, which attempts to instead utilize large quantities of (easily acquired) unlabeled s… ▽ More Deep neural models have achieved state of the art performance on a wide range of problems in computer science, especially in computer vision. However, deep neural networks often require large datasets of labeled samples to generalize effectively, and an important area of active research is semi-supervised learning, which attempts to instead utilize large quantities of (easily acquired) unlabeled samples. One family of methods in this space is pseudo-labeling, a class of algorithms that use model outputs to assign labels to unlabeled samples which are then used as labeled samples during training. Such assigned labels, called pseudo-labels, are most commonly associated with the field of semi-supervised learning. In this work we explore a broader interpretation of pseudo-labels within both self-supervised and unsupervised methods. By drawing the connection between these areas we identify new directions when advancements in one area would likely benefit others, such as curriculum learning and self-supervised regularization. △ Less

Submitted 23 May, 2025; v1 submitted 13 August, 2024; originally announced August 2024.

Comments: 40 pages, 4 figures, 2 tables

ACM Class: I.2.0; I.5.4; I.4.0

arXiv:2311.18083 [pdf, other]

Meta Co-Training: Two Views are Better than One

Authors: Jay C. Rothenberger, Dimitrios I. Diochnos

Abstract: In many critical computer vision scenarios unlabeled data is plentiful, but labels are scarce and difficult to obtain. As a result, semi-supervised learning which leverages unlabeled data to boost the performance of supervised classifiers have received significant attention in recent literature. One representative class of semi-supervised algorithms are co-training algorithms. Co-training algorith… ▽ More In many critical computer vision scenarios unlabeled data is plentiful, but labels are scarce and difficult to obtain. As a result, semi-supervised learning which leverages unlabeled data to boost the performance of supervised classifiers have received significant attention in recent literature. One representative class of semi-supervised algorithms are co-training algorithms. Co-training algorithms leverage two different models which have access to different independent and sufficient representations or "views" of the data to jointly make better predictions. Each of these models creates pseudo-labels on unlabeled points which are used to improve the other model. We show that in the common case where independent views are not available, we can construct such views inexpensively using pre-trained models. Co-training on the constructed views yields a performance improvement over any of the individual views we construct and performance comparable with recent approaches in semi-supervised learning. We present Meta Co-Training, a novel semi-supervised learning algorithm, which has two advantages over co-training: (i) learning is more robust when there is large discrepancy between the information content of the different views, and (ii) does not require retraining from scratch on each iteration. Our method achieves new state-of-the-art performance on ImageNet-10% achieving a ~4.7% reduction in error rate over prior work. Our method also outperforms prior semi-supervised work on several other fine-grained image classification datasets. △ Less

Submitted 27 May, 2025; v1 submitted 29 November, 2023; originally announced November 2023.

Comments: 16 pages, 16 figures, 11 tables, for implementation see https://github.com/JayRothenberger/Meta-Co-Training

ACM Class: I.2.6; I.4.10

arXiv:1906.05815 [pdf, ps, other]

Lower Bounds for Adversarially Robust PAC Learning

Authors: Dimitrios I. Diochnos, Saeed Mahloujifar, Mohammad Mahmoody

Abstract: In this work, we initiate a formal study of probably approximately correct (PAC) learning under evasion attacks, where the adversary's goal is to \emph{misclassify} the adversarially perturbed sample point $\widetilde{x}$, i.e., $h(\widetilde{x})\neq c(\widetilde{x})$, where $c$ is the ground truth concept and $h$ is the learned hypothesis. Previous work on PAC learning of adversarial examples hav… ▽ More In this work, we initiate a formal study of probably approximately correct (PAC) learning under evasion attacks, where the adversary's goal is to \emph{misclassify} the adversarially perturbed sample point $\widetilde{x}$, i.e., $h(\widetilde{x})\neq c(\widetilde{x})$, where $c$ is the ground truth concept and $h$ is the learned hypothesis. Previous work on PAC learning of adversarial examples have all modeled adversarial examples as corrupted inputs in which the goal of the adversary is to achieve $h(\widetilde{x}) \neq c(x)$, where $x$ is the original untampered instance. These two definitions of adversarial risk coincide for many natural distributions, such as images, but are incomparable in general. We first prove that for many theoretically natural input spaces of high dimension $n$ (e.g., isotropic Gaussian in dimension $n$ under $\ell_2$ perturbations), if the adversary is allowed to apply up to a sublinear $o(||x||)$ amount of perturbations on the test instances, PAC learning requires sample complexity that is exponential in $n$. This is in contrast with results proved using the corrupted-input framework, in which the sample complexity of robust learning is only polynomially more. We then formalize hybrid attacks in which the evasion attack is preceded by a poisoning attack. This is perhaps reminiscent of "trapdoor attacks" in which a poisoning phase is involved as well, but the evasion phase here uses the error-region definition of risk that aims at misclassifying the perturbed instances. In this case, we show PAC learning is sometimes impossible all together, even when it is possible without the attack (e.g., due to the bounded VC dimension). △ Less

Submitted 13 June, 2019; originally announced June 2019.

arXiv:1810.12272 [pdf, ps, other]

Adversarial Risk and Robustness: General Definitions and Implications for the Uniform Distribution

Authors: Dimitrios I. Diochnos, Saeed Mahloujifar, Mohammad Mahmoody

Abstract: We study adversarial perturbations when the instances are uniformly distributed over $\{0,1\}^n$. We study both "inherent" bounds that apply to any problem and any classifier for such a problem as well as bounds that apply to specific problems and specific hypothesis classes. As the current literature contains multiple definitions of adversarial risk and robustness, we start by giving a taxonomy… ▽ More We study adversarial perturbations when the instances are uniformly distributed over $\{0,1\}^n$. We study both "inherent" bounds that apply to any problem and any classifier for such a problem as well as bounds that apply to specific problems and specific hypothesis classes. As the current literature contains multiple definitions of adversarial risk and robustness, we start by giving a taxonomy for these definitions based on their goals, we identify one of them as the one guaranteeing misclassification by pushing the instances to the error region. We then study some classic algorithms for learning monotone conjunctions and compare their adversarial risk and robustness under different definitions by attacking the hypotheses using instances drawn from the uniform distribution. We observe that sometimes these definitions lead to significantly different bounds. Thus, this study advocates for the use of the error-region definition, even though other definitions, in other contexts, may coincide with the error-region definition. Using the error-region definition of adversarial perturbations, we then study inherent bounds on risk and robustness of any classifier for any classification problem whose instances are uniformly distributed over $\{0,1\}^n$. Using the isoperimetric inequality for the Boolean hypercube, we show that for initial error $0.01$, there always exists an adversarial perturbation that changes $O(\sqrt{n})$ bits of the instances to increase the risk to $0.5$, making classifier's decisions meaningless. Furthermore, by also using the central limit theorem we show that when $n\to \infty$, at most $c \cdot \sqrt{n}$ bits of perturbations, for a universal constant $c< 1.17$, suffice for increasing the risk to $0.5$, and the same $c \cdot \sqrt{n} $ bits of perturbations on average suffice to increase the risk to $1$, hence bounding the robustness by $c \cdot \sqrt{n}$. △ Less

Submitted 29 October, 2018; originally announced October 2018.

Comments: Full version of a work with the same title that will appear in NIPS 2018, 31 pages containing 5 figures, 1 table, 2 algorithms

arXiv:1809.03063 [pdf, ps, other]

The Curse of Concentration in Robust Learning: Evasion and Poisoning Attacks from Concentration of Measure

Authors: Saeed Mahloujifar, Dimitrios I. Diochnos, Mohammad Mahmoody

Abstract: Many modern machine learning classifiers are shown to be vulnerable to adversarial perturbations of the instances. Despite a massive amount of work focusing on making classifiers robust, the task seems quite challenging. In this work, through a theoretical study, we investigate the adversarial risk and robustness of classifiers and draw a connection to the well-known phenomenon of concentration of… ▽ More Many modern machine learning classifiers are shown to be vulnerable to adversarial perturbations of the instances. Despite a massive amount of work focusing on making classifiers robust, the task seems quite challenging. In this work, through a theoretical study, we investigate the adversarial risk and robustness of classifiers and draw a connection to the well-known phenomenon of concentration of measure in metric measure spaces. We show that if the metric probability space of the test instance is concentrated, any classifier with some initial constant error is inherently vulnerable to adversarial perturbations. One class of concentrated metric probability spaces are the so-called Levy families that include many natural distributions. In this special case, our attacks only need to perturb the test instance by at most $O(\sqrt n)$ to make it misclassified, where $n$ is the data dimension. Using our general result about Levy instance spaces, we first recover as special case some of the previously proved results about the existence of adversarial examples. However, many more Levy families are known (e.g., product distribution under the Hamming distance) for which we immediately obtain new attacks that find adversarial examples of distance $O(\sqrt n)$. Finally, we show that concentration of measure for product spaces implies the existence of forms of "poisoning" attacks in which the adversary tampers with the training data with the goal of degrading the classifier. In particular, we show that for any learning algorithm that uses $m$ training examples, there is an adversary who can increase the probability of any "bad property" (e.g., failing on a particular test instance) that initially happens with non-negligible probability to $\approx 1$ by substituting only $\tilde{O}(\sqrt m)$ of the examples with other (still correctly labeled) examples. △ Less

Submitted 5 November, 2018; v1 submitted 9 September, 2018; originally announced September 2018.

arXiv:1711.03707 [pdf, ps, other]

Learning under $p$-Tampering Attacks

Authors: Saeed Mahloujifar, Dimitrios I. Diochnos, Mohammad Mahmoody

Abstract: Recently, Mahloujifar and Mahmoody (TCC'17) studied attacks against learning algorithms using a special case of Valiant's malicious noise, called $p$-tampering, in which the adversary gets to change any training example with independent probability $p$ but is limited to only choose malicious examples with correct labels. They obtained $p$-tampering attacks that increase the error probability in th… ▽ More Recently, Mahloujifar and Mahmoody (TCC'17) studied attacks against learning algorithms using a special case of Valiant's malicious noise, called $p$-tampering, in which the adversary gets to change any training example with independent probability $p$ but is limited to only choose malicious examples with correct labels. They obtained $p$-tampering attacks that increase the error probability in the so called targeted poisoning model in which the adversary's goal is to increase the loss of the trained hypothesis over a particular test example. At the heart of their attack was an efficient algorithm to bias the expected value of any bounded real-output function through $p$-tampering. In this work, we present new biasing attacks for increasing the expected value of bounded real-valued functions. Our improved biasing attacks, directly imply improved $p$-tampering attacks against learners in the targeted poisoning model. As a bonus, our attacks come with considerably simpler analysis. We also study the possibility of PAC learning under $p$-tampering attacks in the non-targeted (aka indiscriminate) setting where the adversary's goal is to increase the risk of the generated hypothesis (for a random test example). We show that PAC learning is possible under $p$-tampering poisoning attacks essentially whenever it is possible in the realizable setting without the attacks. We further show that PAC learning under "correct-label" adversarial noise is not possible in general, if the adversary could choose the (still limited to only $p$ fraction of) tampered examples that she substitutes with adversarially chosen ones. Our formal model for such "bounded-budget" tampering attackers is inspired by the notions of (strong) adaptive corruption in secure multi-party computation. △ Less

Submitted 27 November, 2018; v1 submitted 10 November, 2017; originally announced November 2017.

arXiv:1304.5863 [pdf, ps, other]

Commonsense Reasoning and Large Network Analysis: A Computational Study of ConceptNet 4

Authors: Dimitrios I. Diochnos

Abstract: In this report a computational study of ConceptNet 4 is performed using tools from the field of network analysis. Part I describes the process of extracting the data from the SQL database that is available online, as well as how the closure of the input among the assertions in the English language is computed. This part also performs a validation of the input as well as checks for the consistency… ▽ More In this report a computational study of ConceptNet 4 is performed using tools from the field of network analysis. Part I describes the process of extracting the data from the SQL database that is available online, as well as how the closure of the input among the assertions in the English language is computed. This part also performs a validation of the input as well as checks for the consistency of the entire database. Part II investigates the structural properties of ConceptNet 4. Different graphs are induced from the knowledge base by fixing different parameters. The degrees and the degree distributions are examined, the number and sizes of connected components, the transitivity and clustering coefficient, the cores, information related to shortest paths in the graphs, and cliques. Part III investigates non-overlapping, as well as overlapping communities that are found in ConceptNet 4. Finally, Part IV describes an investigation on rules. △ Less

Submitted 23 June, 2013; v1 submitted 22 April, 2013; originally announced April 2013.

Comments: 152 pages, 99 tables, 23 figures (76 sub-figures)

arXiv:1203.1017 [pdf, ps, other]

doi 10.1016/j.jsc.2008.04.009

On the asymptotic and practical complexity of solving bivariate systems over the reals

Authors: Dimitrios I. Diochnos, Ioannis Z. Emiris, Elias P. Tsigaridas

Abstract: This paper is concerned with exact real solving of well-constrained, bivariate polynomial systems. The main problem is to isolate all common real roots in rational rectangles, and to determine their intersection multiplicities. We present three algorithms and analyze their asymptotic bit complexity, obtaining a bound of $\sOB(N^{14})$ for the purely projection-based method, and $\sOB(N^{12})$ for… ▽ More This paper is concerned with exact real solving of well-constrained, bivariate polynomial systems. The main problem is to isolate all common real roots in rational rectangles, and to determine their intersection multiplicities. We present three algorithms and analyze their asymptotic bit complexity, obtaining a bound of $\sOB(N^{14})$ for the purely projection-based method, and $\sOB(N^{12})$ for two subresultant-based methods: this notation ignores polylogarithmic factors, where $N$ bounds the degree and the bitsize of the polynomials. The previous record bound was $\sOB(N^{14})$. Our main tool is signed subresultant sequences. We exploit recent advances on the complexity of univariate root isolation, and extend them to sign evaluation of bivariate polynomials over two algebraic numbers, and real root counting for polynomials over an extension field. Our algorithms apply to the problem of simultaneous inequalities; they also compute the topology of real plane algebraic curves in $\sOB(N^{12})$, whereas the previous bound was $\sOB(N^{14})$. All algorithms have been implemented in MAPLE, in conjunction with numeric filtering. We compare them against FGB/RS, system solvers from SYNAPS, and MAPLE libraries INSULATE and TOP, which compute curve topology. Our software is among the most robust, and its runtimes are comparable, or within a small constant factor, with respect to the C/C++ libraries. Key words: real solving, polynomial systems, complexity, MAPLE software △ Less

Submitted 5 March, 2012; originally announced March 2012.

Comments: 17 pages, 4 algorithms, 1 table, and 1 figure with 2 sub-figures

Journal ref: J. Symb. Comput. 44(7): 818-835 (2009)

Showing 1–8 of 8 results for author: Diochnos, D I