-
Automatic Complementary Separation Pruning Toward Lightweight CNNs
Authors:
David Levin,
Gonen Singer
Abstract:
In this paper, we present Automatic Complementary Separation Pruning (ACSP), a novel and fully automated pruning method for convolutional neural networks. ACSP integrates the strengths of both structured pruning and activation-based pruning, enabling the efficient removal of entire components such as neurons and channels while leveraging activations to identify and retain the most relevant components. Our approach is designed specifically for supervised learning tasks, where we construct a graph space that encodes the separation capabilities of each component with respect to all class pairs. By employing complementary selection principles and utilizing a clustering algorithm, ACSP ensures that the selected components maintain diverse and complementary separation capabilities, reducing redundancy and maintaining high network performance. The method automatically determines the optimal subset of components in each layer, utilizing a knee-finding algorithm to select the minimal subset that preserves performance without requiring user-defined pruning volumes. Extensive experiments on multiple architectures, including VGG-16, ResNet-50, and MobileNet-V2, across datasets like CIFAR-10, CIFAR-100, and ImageNet-1K, demonstrate that ACSP achieves competitive accuracy compared to other methods while significantly reducing computational costs. This fully automated approach not only enhances scalability but also makes ACSP especially practical for real-world deployment by eliminating the need for manually defining the pruning volume.
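The abstract does not say which knee-finding algorithm ACSP uses. As an illustration only, the sketch below applies a common chord-distance heuristic to a hypothetical curve of validation scores, indexed by the number of retained components. The name `knee_index` and the example scores are assumptions, not part of the paper.

```python
def knee_index(scores):
    """Return the index of the point farthest from the chord joining the
    first and last points of the curve (a simple knee heuristic).

    `scores[i]` is assumed to be a quality measure obtained when keeping
    i + 1 components; this input format is hypothetical.
    """
    n = len(scores)
    if n < 3:
        return 0
    dx = n - 1
    dy = scores[-1] - scores[0]

    def chord_distance(i):
        # Perpendicular distance to the chord, up to a constant factor.
        return abs(dy * i - dx * (scores[i] - scores[0]))

    return max(range(n), key=chord_distance)


# Hypothetical accuracy curve: gains flatten after the third component.
curve = [0.2, 0.6, 0.8, 0.85, 0.88, 0.9]
print(knee_index(curve) + 1, "components retained")
```

Under this heuristic the user supplies no pruning volume: the subset size is read off the curve itself, which is the property the abstract emphasizes.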
Submitted 19 May, 2025;
originally announced May 2025.
-
Latency Guarantees for Caching with Delayed Hits
Authors:
Keerthana Gurushankar,
Noah G. Singer,
Bernardo Subercaseaux
Abstract:
In the classical caching problem, when a requested page is not present in the cache (i.e., a "miss"), it is assumed to travel from the backing store into the cache "before" the next request arrives. However, in many real-life applications, such as content delivery networks, this assumption is unrealistic.
The "delayed-hits" model for caching, introduced by Atre, Sherry, Wang, and Berger, accounts for the latency between a missed cache request and the corresponding arrival from the backing store. This theoretical model has two parameters: the "delay" $Z$, representing the ratio between the retrieval delay and the inter-request delay in an application, and the "cache size" $k$, as in classical caching. Classical caching corresponds to $Z=1$, whereas larger values of $Z$ model applications where retrieving missed requests is expensive. Despite the practical relevance of the delayed-hits model, its theoretical underpinnings are still poorly understood.
We present the first tight theoretical guarantee for optimizing delayed-hits caching: The "Least Recently Used" algorithm, a natural, deterministic, online algorithm widely used in practice, is $O(Zk)$-competitive, meaning it incurs at most $O(Zk)$ times more latency than the (offline) optimal schedule. Our result extends to any so-called "marking" algorithm.
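To make the latency accounting concrete, here is a small simulation sketch of LRU under a delayed-hits model. The timing conventions are assumptions made for illustration and may differ from the paper's formal definitions: one request per time step, a miss at time t arrives in the cache at time t + Z, and repeated requests for an in-flight page pay the remaining wait. The function name `lru_delayed_hits_latency` is hypothetical.

```python
from collections import OrderedDict


def lru_delayed_hits_latency(requests, k, Z):
    """Total latency of LRU with cache size k and fetch delay Z."""
    cache = OrderedDict()  # pages ordered from least to most recently used
    in_flight = {}         # page -> time at which its fetch completes
    total = 0
    for t, page in enumerate(requests):
        # Admit every fetch that has completed by time t, evicting the
        # least recently used page when the cache is full.
        for p in [p for p, arrival in in_flight.items() if arrival <= t]:
            del in_flight[p]
            if len(cache) >= k:
                cache.popitem(last=False)
            cache[p] = None
        if page in cache:
            cache.move_to_end(page)        # true hit: no latency
        elif page in in_flight:
            total += in_flight[page] - t   # delayed hit: remaining wait
        else:
            in_flight[page] = t + Z        # miss: start a fetch
            total += Z
    return total


print(lru_delayed_hits_latency(["a", "a", "a"], k=1, Z=3))
print(lru_delayed_hits_latency(["a", "a", "a"], k=1, Z=1))
```

With Z = 1 every fetch completes before the next request, so the total latency reduces to the classical miss count; with larger Z, requests that arrive during a fetch accumulate additional latency.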
Submitted 27 January, 2025;
originally announced January 2025.
-
Streaming Algorithms via Local Algorithms for Maximum Directed Cut
Authors:
Raghuvansh R. Saxena,
Noah G. Singer,
Madhu Sudan,
Santhoshini Velusamy
Abstract:
We explore the use of local algorithms in the design of streaming algorithms for the Maximum Directed Cut problem. Specifically, building on the local algorithm of Buchbinder et al. (FOCS'12) and Censor-Hillel et al. (ALGOSENSORS'17), we develop streaming algorithms for both adversarially and randomly ordered streams that approximate the value of maximum directed cut in bounded-degree graphs. In $n$-vertex graphs, for adversarially ordered streams, our algorithm uses $O(n^{1-Ω(1)})$ (sub-linear) space and for randomly ordered streams, our algorithm uses logarithmic space. Moreover, both algorithms require only one pass over the input stream. With a constant number of passes, we give a logarithmic-space algorithm which works even on graphs with unbounded degree on adversarially ordered streams. Our algorithms achieve any fixed constant approximation factor less than $\frac12$. In the single-pass setting, this is tight: known lower bounds show that obtaining any constant approximation factor greater than $\frac12$ is impossible without using linear space in adversarially ordered streams (Kapralov and Krachun, STOC'19) and $Ω(\sqrt{n})$ space in randomly ordered streams, even on bounded degree graphs (Kapralov, Khanna, and Sudan, SODA'15).
In terms of techniques, our algorithms partition the vertices into a small number of different types based on the structure of their local neighborhood, ensuring that each type carries enough information about the structure to approximately simulate the local algorithm on a vertex with that type. We then develop tools to accurately estimate the frequency of each type. This allows us to simulate an execution of the local algorithm on all vertices, and thereby approximate the value of the maximum directed cut.
Submitted 27 November, 2024;
originally announced November 2024.
-
Oblivious Algorithms for Maximum Directed Cut: New Upper and Lower Bounds
Authors:
Samuel Hwang,
Noah G. Singer,
Santhoshini Velusamy
Abstract:
In the maximum directed cut problem, the input is a directed graph $G=(V,E)$, and the goal is to pick a partition $V = S \cup (V \setminus S)$ of the vertices such that as many edges as possible go from $S$ to $V\setminus S$. Oblivious algorithms, introduced by Feige and Jozeph (Algorithmica'17), are a simple class of algorithms for this problem. These algorithms independently and randomly assign each vertex $v$ to either $S$ or $V \setminus S$, and the distribution of $v$'s assignment is determined using only extremely local information about $v$: its bias, i.e., the relative difference between its out- and in-degrees. These algorithms have natural implementations in certain graph streaming models, where they have important implications (Saxena, Singer, Sudan, and Velusamy, SODA'23, FOCS'23, Kallaugher, Parekh, and Voronova, STOC'24).
In this work, we narrow the gap between upper and lower bounds on the best approximation ratio achievable by oblivious algorithms for Max-Directed-Cut. We show that there exists an oblivious algorithm achieving an approximation ratio of at least $0.4853$, while every oblivious algorithm obeying a natural symmetry property achieves an approximation ratio of at most $0.4889$. The previously known bounds were $0.4844$ and $0.4899$, due to Singer (APPROX'23) and Feige and Jozeph, respectively. Our techniques involve designing principled parameterizations of the spaces of algorithms and lower bounds and then executing computer searches through these spaces.
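The structure of an oblivious algorithm can be sketched in a few lines. The code assumes the bias of a vertex is (out-degree minus in-degree) divided by total degree, and it uses a simple linear selection function chosen only for illustration. That function is not one of the optimized algorithms behind the bounds quoted above, and all function names are hypothetical.

```python
import random
from collections import defaultdict


def vertex_biases(edges):
    """Map each non-isolated vertex to (out - in) / (out + in)."""
    out_deg = defaultdict(int)
    in_deg = defaultdict(int)
    for u, v in edges:
        out_deg[u] += 1
        in_deg[v] += 1
    return {
        v: (out_deg[v] - in_deg[v]) / (out_deg[v] + in_deg[v])
        for v in set(out_deg) | set(in_deg)
    }


def oblivious_cut(edges, selection, rng):
    """Place each vertex in S independently with probability selection(bias)."""
    biases = vertex_biases(edges)
    return {v for v, b in biases.items() if rng.random() < selection(b)}


def cut_value(edges, S):
    """Number of edges going from S to its complement."""
    return sum(1 for u, v in edges if u in S and v not in S)


def linear_selection(bias):
    # Illustrative assumption: probability grows linearly with the bias.
    return (1 + bias) / 2


edges = [(0, 1), (0, 2), (1, 2)]
S = oblivious_cut(edges, linear_selection, random.Random(0))
print(sorted(S), cut_value(edges, S))
```

Each vertex's assignment depends only on its own bias, which is what makes such algorithms easy to simulate in streaming models: estimating how edges are distributed across bias classes is enough to estimate the expected cut value.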
Submitted 19 November, 2024;
originally announced November 2024.
-
Coboundary expansion inside Chevalley coset complex HDXs
Authors:
Ryan O'Donnell,
Noah G. Singer
Abstract:
Recent major results in property testing [BLM24, DDL24] and PCPs [BMV24] were unlocked by moving to high-dimensional expanders (HDXs) constructed from $\widetilde{C}_d$-type buildings, rather than the long-known $\widetilde{A}_d$-type ones. At the same time, these building quotient HDXs are not as easy to understand as the more elementary (and more symmetric/explicit) coset complex HDXs constructed by Kaufman–Oppenheim [KO18] (of $A_d$-type) and O'Donnell–Pratt [OP22] (of $B_d$-, $C_d$-, $D_d$-type). Motivated by these considerations, we study the $B_3$-type generalization of a recent work of Kaufman–Oppenheim [KO21], which showed that the $A_3$-type coset complex HDXs have good $1$-coboundary expansion in their links, and thus yield $2$-dimensional topological expanders.
The crux of Kaufman–Oppenheim's proof of $1$-coboundary expansion was: (1) identifying a group-theoretic result by Biss and Dasgupta [BD01] on small presentations for the $A_3$-unipotent group over $\mathbb{F}_q$; (2) "lifting" it to an analogous result for an $A_3$-unipotent group over polynomial extensions $\mathbb{F}_q[x]$.
For our $B_3$-type generalization, the analogue of (1) appears to not hold. We manage to circumvent this with a significantly more involved strategy: (1) getting a computer-assisted proof of vanishing $1$-cohomology of $B_3$-type unipotent groups over $\mathbb{F}_5$; (2) developing significant new "lifting" technology to deduce the required quantitative $1$-cohomology results in $B_3$-type unipotent groups over $\mathbb{F}_{5^k}[x]$.
Submitted 8 November, 2024;
originally announced November 2024.
-
Graph-Based Automatic Feature Selection for Multi-Class Classification via Mean Simplified Silhouette
Authors:
David Levin,
Gonen Singer
Abstract:
This paper introduces a novel graph-based filter method for automatic feature selection (abbreviated as GB-AFS) for multi-class classification tasks. The method determines the minimum combination of features required to sustain prediction performance while maintaining complementary discriminating abilities between different classes. It does not require any user-defined parameters such as the number of features to select. The methodology employs the Jeffries-Matusita (JM) distance in conjunction with t-distributed Stochastic Neighbor Embedding (t-SNE) to generate a low-dimensional space reflecting how effectively each feature can differentiate between each pair of classes. The minimum number of features is selected using our newly developed Mean Simplified Silhouette (abbreviated as MSS) index, designed to evaluate the clustering results for the feature selection task. Experimental results on public data sets demonstrate the superior performance of the proposed GB-AFS over other filter-based techniques and automatic feature selection approaches. Moreover, the proposed algorithm maintained the accuracy achieved when utilizing all features, while using only $7\%$ to $30\%$ of the features. Consequently, classification time was reduced by $15\%$ to $70\%$.
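As a rough illustration of the per-feature, per-class-pair separation scores the method starts from, the sketch below computes a JM distance under the assumption that each feature is univariate Gaussian within each class. It uses the common form JM = 2(1 - exp(-B)), where B is the Bhattacharyya distance. The paper's exact estimator may differ, and the function names are hypothetical.

```python
import math
from itertools import combinations
from statistics import mean, variance


def bhattacharyya_gaussian(mu1, var1, mu2, var2):
    """Bhattacharyya distance between two univariate Gaussians."""
    avg_var = (var1 + var2) / 2
    return ((mu1 - mu2) ** 2) / (8 * avg_var) + 0.5 * math.log(
        avg_var / math.sqrt(var1 * var2)
    )


def jm_distance(mu1, var1, mu2, var2):
    """Jeffries-Matusita distance in [0, 2]; 0 means no separation."""
    return 2 * (1 - math.exp(-bhattacharyya_gaussian(mu1, var1, mu2, var2)))


def feature_jm_profile(values, labels):
    """JM distance of a single feature for every pair of classes.

    Assumes at least two samples and non-zero variance per class.
    """
    by_class = {}
    for x, c in zip(values, labels):
        by_class.setdefault(c, []).append(x)
    stats = {c: (mean(xs), variance(xs)) for c, xs in by_class.items()}
    return {
        (a, b): jm_distance(*stats[a], *stats[b])
        for a, b in combinations(sorted(stats), 2)
    }


values = [0, 1, 0, 1, 10, 11, 10, 11]
labels = ["a", "a", "a", "a", "b", "b", "b", "b"]
print(feature_jm_profile(values, labels))
```

Stacking these profiles for all features gives one vector per feature over the class pairs, which is the kind of representation that an embedding such as t-SNE can then place in a low-dimensional space.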
Submitted 5 September, 2023;
originally announced September 2023.
-
NeuroComparatives: Neuro-Symbolic Distillation of Comparative Knowledge
Authors:
Phillip Howard,
Junlin Wang,
Vasudev Lal,
Gadi Singer,
Yejin Choi,
Swabha Swayamdipta
Abstract:
Comparative knowledge (e.g., steel is stronger and heavier than styrofoam) is an essential component of our world knowledge, yet understudied in prior literature. In this paper, we harvest the dramatic improvements in knowledge capabilities of language models into a large-scale comparative knowledge base. While the ease of acquisition of such comparative knowledge is much higher from extreme-scale models like GPT-4, compared to their considerably smaller and weaker counterparts such as GPT-2, not even the most powerful models are exempt from making errors. We thus ask: to what extent are models at different scales able to generate valid and diverse comparative knowledge?
We introduce NeuroComparatives, a novel framework for comparative knowledge distillation overgenerated from language models such as GPT-variants and LLaMA, followed by stringent filtering of the generated knowledge. Our framework acquires comparative knowledge between everyday objects, producing a corpus of up to 8.8M comparisons over 1.74M entity pairs, 10X larger and 30% more diverse than existing resources. Moreover, human evaluations show that NeuroComparatives outperform existing resources in terms of validity (up to 32% absolute improvement). Our acquired NeuroComparatives lead to performance improvements on five downstream tasks. We find that neuro-symbolic manipulation of smaller models offers complementary benefits to the currently dominant practice of prompting extreme-scale language models for knowledge distillation.
Submitted 5 April, 2024; v1 submitted 8 May, 2023;
originally announced May 2023.
-
Oblivious algorithms for the Max-$k$AND Problem
Authors:
Noah G. Singer
Abstract:
Motivated by recent works on streaming algorithms for constraint satisfaction problems (CSPs), we define and analyze oblivious algorithms for the Max-$k$AND problem. This generalizes the definition by Feige and Jozeph (Algorithmica '15) of oblivious algorithms for Max-DICUT, a special case of Max-$2$AND. Oblivious algorithms round each variable with probability depending only on a quantity called the variable's bias.
For each oblivious algorithm, we design a so-called "factor-revealing linear program" (LP) which captures its worst-case instance, generalizing one of Feige and Jozeph for Max-DICUT. Then, departing from their work, we perform a fully explicit analysis of these (infinitely many!) LPs. In particular, we show that for all $k$, oblivious algorithms for Max-$k$AND provably outperform a special subclass of algorithms we call "superoblivious" algorithms.
Our result has implications for streaming algorithms: Generalizing the result for Max-DICUT of Saxena, Singer, Sudan, and Velusamy (SODA'23), we prove that certain separation results hold between streaming models for infinitely many CSPs: for every $k$, $O(\log n)$-space sketching algorithms for Max-$k$AND known to be optimal in $o(\sqrt n)$-space can be beaten in (a) $O(\log n)$-space under a random-ordering assumption, and (b) $O(n^{1-1/k} D^{1/k})$ space under a maximum-degree-$D$ assumption. Even in the previously-known case of Max-DICUT, our analytic proof gives a fuller, computer-free picture of these separation results.
Submitted 7 May, 2023;
originally announced May 2023.
-
On streaming approximation algorithms for constraint satisfaction problems
Authors:
Noah G. Singer
Abstract:
In this thesis, we explore streaming algorithms for approximating constraint satisfaction problems (CSPs). The setup is roughly the following: A computer has limited memory space, sees a long "stream" of local constraints on a set of variables, and tries to estimate how many of the constraints may be simultaneously satisfied. The past ten years have seen a number of works in this area, and this thesis includes both expository material and novel contributions. Throughout, we emphasize connections to the broader theories of CSPs, approximability, and streaming models, and highlight interesting open problems.
The first part of our thesis is expository: We present aspects of previous works that completely characterize the approximability of specific CSPs like Max-Cut and Max-Dicut with $\sqrt{n}$-space streaming algorithms (on $n$-variable instances), while characterizing the approximability of all CSPs in $\sqrt n$ space in the special case of "composable" (i.e., sketching) algorithms, and of a particular subclass of CSPs with linear-space streaming algorithms.
In the second part of the thesis, we present two of our own joint works. We begin with a work with Madhu Sudan and Santhoshini Velusamy in which we prove linear-space streaming approximation-resistance for all ordering CSPs (OCSPs), which are "CSP-like" problems maximizing over sets of permutations. Next, we present joint work with Joanna Boyland, Michael Hwang, Tarun Prasad, and Santhoshini Velusamy in which we investigate the $\sqrt n$-space streaming approximability of symmetric Boolean CSPs with negations. We give explicit $\sqrt n$-space sketching approximability ratios for several families of CSPs, including Max-$k$AND; develop simpler optimal sketching approximation algorithms for threshold predicates; and show that previous lower bounds fail to characterize the $\sqrt n$-space streaming approximability of Max-$3$AND.
Submitted 13 April, 2023;
originally announced April 2023.
-
Thrill-K Architecture: Towards a Solution to the Problem of Knowledge Based Understanding
Authors:
Gadi Singer,
Joscha Bach,
Tetiana Grinberg,
Nagib Hakim,
Phillip Howard,
Vasudev Lal,
Zev Rivlin
Abstract:
While end-to-end learning systems are rapidly gaining capabilities and popularity, the increasing computational demands for deploying such systems, along with a lack of flexibility, adaptability, explainability, reasoning and verification capabilities, require new types of architectures. Here we introduce a classification of hybrid systems which, based on an analysis of human knowledge and intelligence, combines neural learning with various types of knowledge and knowledge sources. We present the Thrill-K architecture as a prototypical solution for integrating instantaneous knowledge, standby knowledge and external knowledge sources in a framework capable of inference, learning and intelligent control.
Submitted 28 February, 2023;
originally announced March 2023.
-
Graph-based Extreme Feature Selection for Multi-class Classification Tasks
Authors:
Shir Friedman,
Gonen Singer,
Neta Rabin
Abstract:
When processing high-dimensional datasets, a common pre-processing step is feature selection. Filter-based feature selection algorithms are not tailored to a specific classification method, but rather rank the relevance of each feature with respect to the target and the task. This work focuses on a graph-based, filter feature selection method that is suited for multi-class classification tasks. We aim to drastically reduce the number of selected features, in order to create a sketch of the original data that encodes valuable information for the classification task. The proposed graph-based algorithm is constructed by combining the Jeffries-Matusita distance with a non-linear dimension reduction method, diffusion maps. Feature elimination is performed based on the distribution of the features in the low-dimensional space. Then, a very small number of features that have complementary separation strengths are selected. Moreover, the low-dimensional embedding makes it possible to visualize the feature space. Experimental results are provided for public datasets and compared with known filter-based feature selection techniques.
Submitted 3 March, 2023;
originally announced March 2023.
-
Improved Streaming Algorithms for Maximum Directed Cut via Smoothed Snapshots
Authors:
Raghuvansh R. Saxena,
Noah G. Singer,
Madhu Sudan,
Santhoshini Velusamy
Abstract:
We give an $\widetilde{O}(\sqrt{n})$-space single-pass $0.483$-approximation streaming algorithm for estimating the maximum directed cut size (Max-DICUT) in a directed graph on $n$ vertices. This improves over an $O(\log n)$-space $4/9 < 0.45$ approximation algorithm due to Chou, Golovnev, and Velusamy (FOCS 2020), which was known to be optimal for $o(\sqrt{n})$-space algorithms. Max-DICUT is a special case of a constraint satisfaction problem (CSP). In this broader context, we give the first CSP for which algorithms with $\widetilde{O}(\sqrt{n})$ space can provably outperform $o(\sqrt{n})$-space algorithms.
The key technical contribution of our work is development of the notions of a first-order snapshot of a (directed) graph and of estimates of such snapshots. These snapshots can be used to simulate certain (non-streaming) Max-DICUT algorithms, including the "oblivious" algorithms introduced by Feige and Jozeph (Algorithmica, 2015), who showed that one such algorithm achieves a 0.483-approximation.
Previous work of the authors (SODA 2023) studied the restricted case of bounded-degree graphs, and observed that in this setting, it is straightforward to estimate the snapshot with $\ell_1$ errors and this suffices to simulate oblivious algorithms. But for unbounded-degree graphs, even defining an achievable and sufficient notion of estimation is subtle. We describe a new notion of snapshot estimation and prove its sufficiency using careful smoothing techniques, and then develop an algorithm which sketches such an estimate via a delicate process of intertwined vertex- and edge-subsampling.
Prior to our work, the only streaming algorithms for any CSP on general instances were based on generalizations of the $O(\log n)$-space algorithm for Max-DICUT, and thus our work opens the possibility of a new class of algorithms for approximating CSPs.
Submitted 9 May, 2023; v1 submitted 7 November, 2022;
originally announced November 2022.
-
NeuroCounterfactuals: Beyond Minimal-Edit Counterfactuals for Richer Data Augmentation
Authors:
Phillip Howard,
Gadi Singer,
Vasudev Lal,
Yejin Choi,
Swabha Swayamdipta
Abstract:
While counterfactual data augmentation offers a promising step towards robust generalization in natural language processing, producing a set of counterfactuals that offer valuable inductive bias for models remains a challenge. Most existing approaches for producing counterfactuals, manual or automated, rely on small perturbations via minimal edits, resulting in simplistic changes. We introduce NeuroCounterfactuals, designed as loose counterfactuals, allowing for larger edits which result in naturalistic generations containing linguistic diversity, while still bearing similarity to the original document. Our novel generative approach bridges the benefits of constrained decoding, with those of language model adaptation for sentiment steering. Training data augmentation with our generations results in both in-domain and out-of-domain improvements for sentiment classification, outperforming even manually curated counterfactuals, under select settings. We further present detailed analyses to show the advantages of NeuroCounterfactuals over approaches involving simple, minimal edits.
Submitted 22 October, 2022;
originally announced October 2022.
-
Cross-Domain Aspect Extraction using Transformers Augmented with Knowledge Graphs
Authors:
Phillip Howard,
Arden Ma,
Vasudev Lal,
Ana Paula Simoes,
Daniel Korat,
Oren Pereg,
Moshe Wasserblat,
Gadi Singer
Abstract:
The extraction of aspect terms is a critical step in fine-grained sentiment analysis of text. Existing approaches for this task have yielded impressive results when the training and testing data are from the same domain. However, these methods show a drastic decrease in performance when applied to cross-domain settings where the domain of the testing data differs from that of the training data. To address this lack of extensibility and robustness, we propose a novel approach for automatically constructing domain-specific knowledge graphs that contain information relevant to the identification of aspect terms. We introduce a methodology for injecting information from these knowledge graphs into Transformer models, including two alternative mechanisms for knowledge insertion: via query enrichment and via manipulation of attention patterns. We demonstrate state-of-the-art performance on benchmark datasets for cross-domain aspect term extraction using our approach and investigate how the amount of external knowledge available to the Transformer impacts model performance.
Submitted 18 October, 2022;
originally announced October 2022.
-
Adaptive Learning for the Resource-Constrained Classification Problem
Authors:
Danit Shifman Abukasis,
Izack Cohen,
Xiaochen Xian,
Kejun Huang,
Gonen Singer
Abstract:
Resource-constrained classification tasks are common in real-world applications such as allocating tests for disease diagnosis, hiring decisions when filling a limited number of positions, and defect detection in manufacturing settings under a limited inspection budget. Typical classification algorithms treat the learning process and the resource constraints as two separate and sequential tasks. Here we design an adaptive learning approach that considers resource constraints and learning jointly by iteratively fine-tuning misclassification costs. Via a structured experimental study using a publicly available data set, we evaluate a decision tree classifier that utilizes the proposed approach. The adaptive learning approach performs significantly better than alternative approaches, especially for difficult classification problems in which the performance of common approaches may be unsatisfactory. We envision the adaptive learning approach as an important addition to the repertoire of techniques for handling resource-constrained classification problems.
Submitted 19 July, 2022;
originally announced July 2022.
-
Adaptive Cost-Sensitive Learning in Neural Networks for Misclassification Cost Problems
Authors:
Ohad Volk,
Gonen Singer
Abstract:
We design a new adaptive learning algorithm for misclassification cost problems that attempts to reduce the cost of misclassified instances derived from the consequences of various errors. Our algorithm (adaptive cost sensitive learning - AdaCSL) adaptively adjusts the loss function such that the classifier bridges the difference in class distributions between subgroups of samples in the training and test data sets with similar predicted probabilities (i.e., local training-test class distribution mismatch). We provide some theoretical performance guarantees on the proposed algorithm and present empirical evidence that a deep neural network used with the proposed AdaCSL algorithm yields better cost results than alternative approaches on several binary classification data sets with class-imbalanced and class-balanced distributions.
Submitted 14 November, 2021;
originally announced November 2021.
-
Streaming approximation resistance of every ordering CSP
Authors:
Noah G. Singer,
Madhu Sudan,
Santhoshini Velusamy
Abstract:
An ordering constraint satisfaction problem (OCSP) is defined by a family $\mathcal{F}$ of predicates mapping permutations on $\{1,\ldots,k\}$ to $\{0,1\}$. An instance of Max-OCSP($\mathcal{F}$) on $n$ variables consists of a list of constraints, each consisting of a predicate from $\mathcal{F}$ applied on $k$ distinct variables. The goal is to find an ordering of the $n$ variables that maximizes the number of constraints for which the induced ordering on the $k$ variables satisfies the predicate. OCSPs capture well-studied problems including "maximum acyclic subgraph" (MAS) and "maximum betweenness".
In this work, we consider the task of approximating the maximum number of satisfiable constraints in the (single-pass) streaming setting, when an instance is presented as a stream of constraints. We show that for every $\mathcal{F}$, Max-OCSP($\mathcal{F}$) is approximation-resistant to $o(n)$-space streaming algorithms, i.e., algorithms using $o(n)$ space cannot distinguish streams where almost every constraint is satisfiable from streams where no ordering beats the random ordering by a noticeable amount. This space bound is tight up to polylogarithmic factors. In the case of MAS our result shows that for every $ε>0$, MAS is not $(1/2+ε)$-approximable in $o(n)$ space. The previous best inapproximability result, due to Guruswami and Tao (APPROX'19), only ruled out $3/4$-approximations in $o(\sqrt n)$ space.
Our results build on a recent work of Chou, Golovnev, Sudan, Velingker, and Velusamy (STOC'22), who provide a tight, linear-space inapproximability theorem for a broad class of "standard" (i.e., non-ordering) constraint satisfaction problems (CSPs) over arbitrary (finite) alphabets. We construct a family of appropriate standard CSPs from any given OCSP, apply their hardness result to this family of CSPs, and then convert back to our OCSP.
Submitted 1 August, 2024; v1 submitted 4 May, 2021;
originally announced May 2021.
-
The relationship between internet user type and user performance when carrying out simple vs. complex search tasks
Authors:
Georg Singer,
Pille Pruulmann-Vengerfeldt,
Ulrich Norbisrath,
Dirk Lewandowski
Abstract:
It is widely known that people become better at an activity if they perform it long and often. Yet the question of whether being active in related areas, such as communicating online, writing blog articles or commenting on community forums, has an impact on a person's ability to perform Web searches is still unanswered. Web searching has become a key task conducted online; in this paper we present our findings on whether the user type, which categorises a person's online activities, has an impact on her or his search capabilities. We show (1) the characteristics of different user types when carrying out simple search tasks; (2) their characteristics when carrying out complex search tasks; and (3) the significantly different user type characteristics between simple and complex search tasks. The results are based on an experiment with 56 ordinary Web users in a laboratory environment. The Search-Logger study framework was used to analyze and measure user behavior when carrying out a set of 12 predefined search tasks. Our findings include the fact that, depending on task type (simple or complex), significant differences can be observed between users of different types.
Submitted 18 November, 2015;
originally announced November 2015.
-
Ordinary Search Engine Users assessing Difficulty, Effort, and Outcome for Simple and Complex Search Tasks
Authors:
Georg Singer,
Ulrich Norbisrath,
Dirk Lewandowski
Abstract:
Search engines are the preferred tools for finding information on the Web. They are advancing to be the common helpers to answer any of our search needs. We use them to carry out simple look-up tasks and also to work on rather time-consuming and more complex search tasks. Yet we do not know very much about user performance while carrying out those tasks, especially not for ordinary users. The aim of this study was to get more insight into whether Web users manage to assess difficulty, time effort, query effort, and task outcome of search tasks, and whether their judging performance relates to task complexity. Our study was conducted with a systematically selected sample of 56 people with a wide demographic background. They carried out a set of 12 search tasks with commercial Web search engines in a laboratory environment. The results confirm that it is hard for normal Web users to judge the difficulty and effort to carry out complex search tasks. The judgments are more reliable for simple tasks than for complex ones. Task complexity is an indicator for judging performance.
Submitted 12 June, 2012;
originally announced June 2012.
-
Search Strategies of Library Search Experts
Authors:
Kristiina Singer,
Georg Singer,
Krista Lepik,
Ulrich Norbisrath,
Pille Pruulmann-Vengerfeldt
Abstract:
Search engines like Google, Yahoo or Bing are an excellent support for finding documents, but this strength also imposes a limitation. As they are optimized for document retrieval tasks, they perform less well when it comes to more complex search needs. Complex search tasks are usually described as open-ended, abstract and poorly defined information needs with a multifaceted character. In this paper we will present the results of an experiment carried out with information professionals from libraries and museums in the course of a search contest. The aim of the experiment was to analyze the search strategies of experienced information workers trying to tackle search tasks of varying complexity and get qualitative results on the impact of time pressure on such an experiment.
Submitted 26 June, 2012; v1 submitted 12 June, 2012;
originally announced June 2012.
-
Impact of Gender and Age on performing Search Tasks Online
Authors:
Georg Singer,
Ulrich Norbisrath,
Dirk Lewandowski
Abstract:
More and more people use the Internet to work on duties of their daily work routine. To find the right information online, Web search engines are the tools of their choice. Apart from finding facts, people also use Web search engines to execute rather complex and time-consuming search tasks. So far, search engines follow a one-for-all approach to serving their users, and little is known about the impact of gender and age on people's Web search behavior. In this article we present a study that examines (1) how female and male Web users carry out simple and complex search tasks and what the differences between the two user groups are, and (2) how the age of the users impacts their search performance. The laboratory study was done with 56 ordinary people, each carrying out 12 search tasks. Our findings confirm that age impacts behavior and search performance significantly, while gender influences were smaller than expected.
Submitted 7 June, 2012;
originally announced June 2012.
-
Ordinary Search Engine Users Carrying Out Complex Search Tasks
Authors:
Georg Singer,
Ulrich Norbisrath,
Dirk Lewandowski
Abstract:
Web search engines have become the dominant tools for finding information on the Internet. Due to their popularity, users apply them to a wide range of search needs, from simple look-ups to rather complex information tasks. This paper presents the results of a study to investigate the characteristics of these complex information needs in the context of Web search engines. The aim of the study is to find out more about (1) what makes complex search tasks distinct from simple tasks and if it is possible to find simple measures for describing their complexity, (2) if search success for a task can be predicted by means of unique measures, and (3) if successful searchers show a different behavior than unsuccessful ones. The study includes 60 people who carried out a set of 12 search tasks with current commercial search engines. Their behavior was logged with the Search-Logger tool. The results confirm that complex tasks show significantly different characteristics than simple tasks. Yet it seems to be difficult to distinguish successful from unsuccessful search behaviors. Good searchers can be differentiated from bad searchers by means of measurable parameters. The implications of these findings for search engine vendors are discussed.
Submitted 4 July, 2012; v1 submitted 7 June, 2012;
originally announced June 2012.