-
Neural Fingerprints for Adversarial Attack Detection
Authors:
Haim Fisher,
Moni Shahar,
Yehezkel S. Resheff
Abstract:
Deep learning models for image classification have become standard tools in recent years. A well-known vulnerability of these models is their susceptibility to adversarial examples. These are generated by slightly altering an image of a certain class in a way that is imperceptible to humans but causes the model to classify it wrongly as another class. Many algorithms have been proposed to address this problem, falling generally into one of two categories: (i) building robust classifiers, and (ii) directly detecting attacked images. Despite the good performance of these detectors, we argue that in a white-box setting, where the attacker knows the configuration and weights of the network and the detector, the attacker can overcome the detector by running many examples on a local copy, and sending only those that were not detected to the actual model. This problem is common in security applications where even a very good model is not sufficient to ensure safety. In this paper we propose to overcome this inherent limitation of any static defence with randomization. To do so, one must generate a very large family of detectors with consistent performance, and select one or more of them randomly for each input. For the individual detectors, we suggest the method of neural fingerprints. In the training phase, for each class we repeatedly sample a tiny random subset of neurons from certain layers of the network, and if their average is sufficiently different between clean and attacked images of the focal class they are considered a fingerprint and added to the detector bank. During test time, we sample fingerprints from the bank associated with the label predicted by the model, and detect attacks using a likelihood ratio test. We evaluate our detectors on ImageNet with different attack methods and model architectures, and show near-perfect detection with low rates of false detection.
Submitted 7 November, 2024;
originally announced November 2024.
-
Biased AI can Influence Political Decision-Making
Authors:
Jillian Fisher,
Shangbin Feng,
Robert Aron,
Thomas Richardson,
Yejin Choi,
Daniel W. Fisher,
Jennifer Pan,
Yulia Tsvetkov,
Katharina Reinecke
Abstract:
As modern large language models (LLMs) become integral to everyday tasks, concerns about their inherent biases and their potential impact on human decision-making have emerged. While bias in models is well documented, less is known about how these biases influence human decisions. This paper presents two interactive experiments investigating the effects of partisan bias in LLMs on political opinions and decision-making. Participants interacted freely with either a biased liberal, biased conservative, or unbiased control model while completing these tasks. We found that participants exposed to partisan biased models were significantly more likely to adopt opinions and make decisions which matched the LLM's bias. Even more surprising, this influence was seen when the model bias and personal political partisanship of the participant were opposite. However, we also discovered that prior knowledge of AI was weakly correlated with a reduction of the impact of the bias, highlighting the possible importance of AI education for robust mitigation of bias effects. Our findings not only highlight the critical effects of interacting with biased LLMs and their ability to impact public discourse and political conduct, but also highlight potential techniques for mitigating these risks in the future.
Submitted 5 June, 2025; v1 submitted 8 October, 2024;
originally announced October 2024.
-
Reasoning Beyond Bias: A Study on Counterfactual Prompting and Chain of Thought Reasoning
Authors:
Kyle Moore,
Jesse Roberts,
Thao Pham,
Douglas Fisher
Abstract:
Language models are known to absorb biases from their training data, leading to predictions driven by statistical regularities rather than semantic relevance. We investigate the impact of these biases on answer choice preferences in the Massive Multi-Task Language Understanding (MMLU) task. Our findings reveal that differences in learned regularities across answer options are predictive of model preferences and mirror human test-taking strategies. To address this issue, we introduce two novel methods: Counterfactual Prompting with Chain of Thought (CoT) and Counterfactual Prompting with Agnostically Primed CoT (APriCoT). We demonstrate that while Counterfactual Prompting with CoT alone is insufficient to mitigate bias, our novel Primed Counterfactual Prompting with CoT approach effectively reduces the influence of base-rate probabilities while improving overall accuracy. Our results suggest that mitigating bias requires a "System-2" like process and that CoT reasoning is susceptible to confirmation bias under some prompting methodologies. Our contributions offer practical solutions for developing more robust and fair language models.
Submitted 5 September, 2024; v1 submitted 16 August, 2024;
originally announced August 2024.
-
Large Language Model Recall Uncertainty is Modulated by the Fan Effect
Authors:
Jesse Roberts,
Kyle Moore,
Thao Pham,
Oseremhen Ewaleifoh,
Doug Fisher
Abstract:
This paper evaluates whether large language models (LLMs) exhibit cognitive fan effects, similar to those discovered by Anderson in humans, after being pre-trained on human textual data. We conduct two sets of in-context recall experiments designed to elicit fan effects. Consistent with human results, we find that LLM recall uncertainty, measured via token probability, is influenced by the fan effect. Our results show that removing uncertainty disrupts the observed effect. The experiments suggest the fan effect is consistent whether the fan value is induced in-context or in the pre-training data. Finally, these findings provide in-silico evidence that fan effects and typicality are expressions of the same phenomenon.
Submitted 29 September, 2024; v1 submitted 8 July, 2024;
originally announced July 2024.
-
The Base-Rate Effect on LLM Benchmark Performance: Disambiguating Test-Taking Strategies from Benchmark Performance
Authors:
Kyle Moore,
Jesse Roberts,
Thao Pham,
Oseremhen Ewaleifoh,
Doug Fisher
Abstract:
Cloze testing is a common method for measuring the behavior of large language models on a number of benchmark tasks. Using the MMLU dataset, we show that the base-rate probability (BRP) differences across answer tokens are significant and affect task performance (i.e., guess A if uncertain). We find that counterfactual prompting does sufficiently mitigate the BRP effect. The BRP effect is found to have a similar effect to test-taking strategies employed by humans, leading to the conflation of task performance and test-taking ability. We propose the Nvr-X-MMLU task, a variation of MMLU, which helps to disambiguate test-taking ability from task performance and reports the latter.
Submitted 30 September, 2024; v1 submitted 17 June, 2024;
originally announced June 2024.
-
Do Large Language Models Learn Human-Like Strategic Preferences?
Authors:
Jesse Roberts,
Kyle Moore,
Doug Fisher
Abstract:
In this paper, we evaluate whether LLMs learn to make human-like preference judgements in strategic scenarios as compared with known empirical results. Solar and Mistral are shown to exhibit stable value-based preference consistent with humans and exhibit human-like preference for cooperation in the prisoner's dilemma (including stake-size effect) and traveler's dilemma (including penalty-size effect). We establish a relationship between model size, value-based preference, and superficiality. Finally, results here show that models tending to be less brittle have relied on sliding window attention, suggesting a potential link. Additionally, we contribute a novel method for constructing preference relations from arbitrary LLMs and support for a hypothesis regarding human behavior in the traveler's dilemma.
Submitted 2 October, 2024; v1 submitted 11 April, 2024;
originally announced April 2024.
-
Using Artificial Populations to Study Psychological Phenomena in Neural Models
Authors:
Jesse Roberts,
Kyle Moore,
Drew Wilenzick,
Doug Fisher
Abstract:
The recent proliferation of research into transformer-based natural language processing has led to a number of studies which attempt to detect the presence of human-like cognitive behavior in the models. We contend that, as is true of human psychology, the investigation of cognitive behavior in language models must be conducted in an appropriate population of an appropriate size for the results to be meaningful. We leverage work in uncertainty estimation in a novel approach to efficiently construct experimental populations. The resultant tool, PopulationLM, has been made open source. We provide theoretical grounding in the uncertainty estimation literature and motivation from current cognitive work regarding language models. We discuss the methodological lessons from other scientific communities and attempt to demonstrate their application to two artificial population studies. Through population-based experimentation we find that language models exhibit behavior consistent with typicality effects among categories highly represented in training. However, we find that language models don't tend to exhibit structural priming effects. Generally, our results show that single models tend to overestimate the presence of cognitive behaviors in neural models.
Submitted 15 August, 2023;
originally announced August 2023.
-
Topic Modeling via Full Dependence Mixtures
Authors:
Dan Fisher,
Mark Kozdoba,
Shie Mannor
Abstract:
In this paper we introduce a new approach to topic modeling that scales to large datasets by using a compact representation of the data and by leveraging the GPU architecture. In this approach, topics are learned directly from the co-occurrence data of the corpus. In particular, we introduce a novel mixture model which we term the Full Dependence Mixture (FDM) model. FDMs model second moments under general generative assumptions on the data. While there is previous work on topic modeling using second moments, we develop a direct stochastic optimization procedure for fitting an FDM with a single Kullback-Leibler objective. Moment methods in general have the benefit that an iteration no longer needs to scale with the size of the corpus. Our approach allows us to leverage standard optimizers and GPUs for the problem of topic modeling. In particular, we evaluate the approach on two large datasets, NeurIPS papers and a Twitter corpus, with a large number of topics, and show that the approach performs comparably to or better than the standard benchmarks.
Submitted 1 March, 2020; v1 submitted 13 June, 2019;
originally announced June 2019.
-
Visualizing a Million Time Series with the Density Line Chart
Authors:
Dominik Moritz,
Danyel Fisher
Abstract:
Data analysts often need to work with multiple series of data (conventionally shown as line charts) at once. Few visual representations allow analysts to view many lines simultaneously without becoming overwhelming or cluttered. In this paper, we introduce the DenseLines technique to calculate a discrete density representation of time series. DenseLines normalizes time series by the arc length to compute accurate densities. The derived density visualization allows users both to see the aggregate trends of multiple series and to identify anomalous extrema.
Submitted 6 September, 2018; v1 submitted 17 August, 2018;
originally announced August 2018.
-
Deep Temporal Clustering: Fully Unsupervised Learning of Time-Domain Features
Authors:
Naveen Sai Madiraju,
Seid M. Sadat,
Dimitry Fisher,
Homa Karimabadi
Abstract:
Unsupervised learning of time series data, also known as temporal clustering, is a challenging problem in machine learning. Here we propose a novel algorithm, Deep Temporal Clustering (DTC), to naturally integrate dimensionality reduction and temporal clustering into a single end-to-end learning framework, fully unsupervised. The algorithm utilizes an autoencoder for temporal dimensionality reduction and a novel temporal clustering layer for cluster assignment. Then it jointly optimizes the clustering objective and the dimensionality reduction objective. Depending on the requirements and the application, the temporal clustering layer can be customized with any temporal similarity metric. Several similarity metrics and state-of-the-art algorithms are considered and compared. To gain insight into temporal features that the network has learned for its clustering, we apply a visualization method that generates a region of interest heatmap for the time series. The viability of the algorithm is demonstrated using time series data from diverse domains, ranging from earthquakes to spacecraft sensor data. In each case, we show that the proposed algorithm outperforms traditional methods. The superior performance is attributed to the fully integrated temporal dimensionality reduction and clustering criterion.
Submitted 3 February, 2018;
originally announced February 2018.
-
The perceived assortativity of social networks: Methodological problems and solutions
Authors:
David N Fisher,
Matthew J Silk,
Daniel W Franks
Abstract:
Networks describe a range of social, biological and technical phenomena. An important property of a network is its degree correlation or assortativity, describing how nodes in the network associate based on their number of connections. Social networks are typically thought to be distinct from other networks in being assortative (possessing positive degree correlations); well-connected individuals associate with other well-connected individuals, and poorly-connected individuals associate with each other. We review the evidence for this in the literature and find that, while social networks are more assortative than non-social networks, only when they are built using group-based methods do they tend to be positively assortative. Non-social networks tend to be disassortative. We go on to show that connecting individuals due to shared membership of a group, a commonly used method, biases towards assortativity unless a large enough number of censuses of the network are taken. We present a number of solutions to overcoming this bias by drawing on advances in sociological and biological fields. Adoption of these methods across all fields can greatly enhance our understanding of social networks and networks in general.
Submitted 30 January, 2017;
originally announced January 2017.
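As an illustration of the degree correlation discussed in the abstract above, the following is a minimal sketch that computes assortativity as the Pearson correlation between the degrees found at the two ends of each edge. It is not taken from the paper: the function name `degree_assortativity` and the edge-list input format are assumptions made for this example, and it covers only simple undirected, unweighted networks.

```python
def degree_assortativity(edges):
    """Pearson correlation of the degrees at the two ends of each edge.

    `edges` is a list of (u, v) pairs describing a simple undirected
    network. The name and input format are illustrative assumptions.
    """
    # Count how many edges touch each node.
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1

    # List every edge in both directions so the measure is symmetric.
    xs, ys = [], []
    for u, v in edges:
        xs += [degree[u], degree[v]]
        ys += [degree[v], degree[u]]

    n = len(xs)
    mean = sum(xs) / n  # xs and ys hold the same values, so they share a mean
    cov = sum((x - mean) * (y - mean) for x, y in zip(xs, ys)) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    if var == 0:
        # Every edge end has the same degree: the correlation is undefined.
        return float("nan")
    return cov / var


# A star: one hub linked to three leaves, so high degree meets low degree.
star = [(0, 1), (0, 2), (0, 3)]
# A triangle plus a separate pair: every node links to a node of equal degree.
matched = [(0, 1), (1, 2), (0, 2), (3, 4)]
print(degree_assortativity(star), degree_assortativity(matched))
```

The star comes out at -1 (fully disassortative) and the second network at +1 (fully assortative), which matches the abstract's reading of positive values as well-connected nodes associating with other well-connected nodes.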
-
miniAdapton: A Minimal Implementation of Incremental Computation in Scheme
Authors:
Dakota Fisher,
Matthew A. Hammer,
William Byrd,
Matthew Might
Abstract:
We describe a complete Scheme implementation of miniAdapton, which implements the core functionality of the Adapton system for incremental computation (also known as self-adjusting computation). Like Adapton, miniAdapton allows programmers to safely combine mutation and memoization. miniAdapton is built on top of an even simpler system, microAdapton. Both miniAdapton and microAdapton are designed to be easy to understand, extend, and port to host languages other than Scheme. We also present adapton variables, a new interface in Adapton for variables intended to represent expressions.
Submitted 17 September, 2016;
originally announced September 2016.
-
Fundamental principles of cortical computation: unsupervised learning with prediction, compression and feedback
Authors:
Micah Richert,
Dimitry Fisher,
Filip Piekniewski,
Eugene M. Izhikevich,
Todd L. Hylton
Abstract:
There has been great progress in understanding the anatomical and functional microcircuitry of the primate cortex. However, the fundamental principles of cortical computation - the principles that allow the visual cortex to bind retinal spikes into representations of objects, scenes and scenarios - have so far remained elusive. In an attempt to come closer to understanding the fundamental principles of cortical computation, here we present a functional, phenomenological model of the primate visual cortex. The core part of the model describes four hierarchical cortical areas with feedforward, lateral, and recurrent connections. The three main principles implemented in the model are information compression, unsupervised learning by prediction, and use of lateral and top-down context. We show that the model reproduces key aspects of the primate ventral stream of visual processing including Simple and Complex cells in V1, increasingly complicated feature encoding, and increased separability of object representations in higher cortical areas. The model learns representations of the visual environment that allow for accurate classification and state-of-the-art visual tracking performance on novel objects.
Submitted 19 August, 2016;
originally announced August 2016.
-
Unsupervised Learning from Continuous Video in a Scalable Predictive Recurrent Network
Authors:
Filip Piekniewski,
Patryk Laurent,
Csaba Petre,
Micah Richert,
Dimitry Fisher,
Todd Hylton
Abstract:
Understanding visual reality involves acquiring common-sense knowledge about countless regularities in the visual world, e.g., how illumination alters the appearance of objects in a scene, and how motion changes their apparent spatial relationship. These regularities are hard to label for training supervised machine learning algorithms; consequently, algorithms need to learn these regularities from the real world in an unsupervised way. We present a novel network meta-architecture that can learn world dynamics from raw, continuous video. The components of this network can be implemented using any algorithm that possesses three key capabilities: prediction of a signal over time, reduction of signal dimensionality (compression), and the ability to use supplementary contextual information to inform the prediction. The presented architecture is highly parallelized and scalable, and is implemented using localized connectivity, processing, and learning. We demonstrate an implementation of this architecture where the components are built from multi-layer perceptrons. We apply the implementation to create a system capable of stable and robust visual tracking of objects as seen by a moving camera. Results show performance on par with or exceeding state-of-the-art tracking algorithms. The tracker can be trained in either fully supervised or unsupervised-then-briefly-supervised regimes. Success of the briefly-supervised regime suggests that the unsupervised portion of the model extracts useful information about visual reality. The results suggest a new class of AI algorithms that uniquely combine prediction and scalability in a way that makes them suitable for learning from, and eventually acting within, the real world.
Submitted 30 September, 2016; v1 submitted 22 July, 2016;
originally announced July 2016.
-
Tuning Collision Warning Algorithms to Individual Drivers for Design of Active Safety Systems
Authors:
Ali Rakhshan,
Hossein Pishro-Nik,
Donald L. Fisher,
Mohammad Nekoui
Abstract:
Every year, many people are killed and injured in highway traffic accidents. In order to reduce such casualties, collision warning systems have been studied extensively. These systems are built by taking the driver reaction times into account. However, most of the existing literature focuses on characterizing how driver reaction times vary across an entire population. Therefore, many of the warnings that are given turn out to be false alarms. A false alarm occurs whenever a warning is sent, but it is not needed. This would negate any safety benefit of the system, and could even reduce the overall safety if warnings become a distraction. In this paper, we propose our solution to address the described problem. First, we briefly describe our method for estimating the distribution of brake response times for a particular driver using data from a Vehicular Ad-Hoc Network (VANET) system. Then, we investigate how brake response times of individual drivers can be used in collision warning algorithms to reduce false alarm rates while still maintaining a high level of safety. This will yield a system that is overall more reliable and trustworthy for drivers, which could lead to wider adoption and applicability for V2V/V2I communication systems. Moreover, we show how the false alarm rate varies with respect to the probability of accident. Our simulation results show that by individualizing collision warnings the number of false alarms can be reduced by more than $50\%$. We conclude that safety applications could potentially take full advantage of being customized to an individual's characteristics.
Submitted 1 March, 2019; v1 submitted 4 February, 2014;
originally announced February 2014.
-
Iterative Optimization and Simplification of Hierarchical Clusterings
Authors:
D. Fisher
Abstract:
Clustering is often used for discovering structure in data. Clustering systems differ in the objective function used to evaluate clustering quality and the control strategy used to search the space of clusterings. Ideally, the search strategy should consistently construct clusterings of high quality, but be computationally inexpensive as well. In general, we cannot have it both ways, but we can partition the search so that a system inexpensively constructs a `tentative' clustering for initial examination, followed by iterative optimization, which continues to search in background for improved clusterings. Given this motivation, we evaluate an inexpensive strategy for creating initial clusterings, coupled with several control strategies for iterative optimization, each of which repeatedly modifies an initial clustering in search of a better one. One of these methods appears novel as an iterative optimization strategy in clustering contexts. Once a clustering has been constructed it is judged by analysts -- often according to task-specific criteria. Several authors have abstracted these criteria and posited a generic performance task akin to pattern completion, where the error rate over completed patterns is used to `externally' judge clustering utility. Given this performance task, we adapt resampling-based pruning strategies used by supervised learning systems to the task of simplifying hierarchical clusterings, thus promising to ease post-clustering analysis. Finally, we propose a number of objective functions, based on attribute-selection measures for decision-tree induction, that might perform well on the error rate and simplicity dimensions.
Submitted 31 March, 1996;
originally announced April 1996.
-
CRYSTAL: Inducing a Conceptual Dictionary
Authors:
Stephen Soderland,
David Fisher,
Jonathan Aseltine,
Wendy Lehnert
Abstract:
One of the central knowledge sources of an information extraction system is a dictionary of linguistic patterns that can be used to identify the conceptual content of a text. This paper describes CRYSTAL, a system which automatically induces a dictionary of "concept-node definitions" sufficient to identify relevant information from a training corpus. Each of these concept-node definitions is generalized as far as possible without producing errors, so that a minimum number of dictionary entries cover the positive training instances. Because it tests the accuracy of each proposed definition, CRYSTAL can often surpass human intuitions in creating reliable extraction rules.
Submitted 9 May, 1995;
originally announced May 1995.