-
For GPT-4 as with Humans: Information Structure Predicts Acceptability of Long-Distance Dependencies
Authors:
Nicole Cuneo,
Eleanor Graves,
Supantho Rakshit,
Adele E. Goldberg
Abstract:
It remains debated how well any LM understands natural language or generates reliable metalinguistic judgments. Moreover, relatively little work has demonstrated that LMs can represent and respect subtle relationships between form and function proposed by linguists. We here focus on a particular such relationship established in recent work: English speakers' judgments about the information structu…
▽ More
It remains debated how well any LM understands natural language or generates reliable metalinguistic judgments. Moreover, relatively little work has demonstrated that LMs can represent and respect subtle relationships between form and function proposed by linguists. We here focus on a particular such relationship established in recent work: English speakers' judgments about the information structure of canonical sentences predicts independently collected acceptability ratings on corresponding 'long distance dependency' [LDD] constructions, across a wide array of base constructions and multiple types of LDDs. To determine whether any LM captures this relationship, we probe GPT-4 on the same tasks used with humans and new extensions.Results reveal reliable metalinguistic skill on the information structure and acceptability tasks, replicating a striking interaction between the two, despite the zero-shot, explicit nature of the tasks, and little to no chance of contamination [Studies 1a, 1b]. Study 2 manipulates the information structure of base sentences and confirms a causal relationship: increasing the prominence of a constituent in a context sentence increases the subsequent acceptability ratings on an LDD construction. The findings suggest a tight relationship between natural and GPT-4 generated English, and between information structure and syntax, which begs for further exploration.
△ Less
Submitted 13 May, 2025;
originally announced May 2025.
-
A suite of LMs comprehend puzzle statements as well as humans
Authors:
Adele E Goldberg,
Supantho Rakshit,
Jennifer Hu,
Kyle Mahowald
Abstract:
Recent claims suggest that large language models (LMs) underperform humans in comprehending minimally complex English statements (Dentella et al., 2024). Here, we revisit those findings and argue that human performance was overestimated, while LLM abilities were underestimated. Using the same stimuli, we report a preregistered study comparing human responses in two conditions: one allowed rereadin…
▽ More
Recent claims suggest that large language models (LMs) underperform humans in comprehending minimally complex English statements (Dentella et al., 2024). Here, we revisit those findings and argue that human performance was overestimated, while LLM abilities were underestimated. Using the same stimuli, we report a preregistered study comparing human responses in two conditions: one allowed rereading (replicating the original study), and one that restricted rereading (a more naturalistic comprehension test). Human accuracy dropped significantly when rereading was restricted (73%), falling below that of Falcon-180B-Chat (76%) and GPT-4 (81%). The newer GPT-o1 model achieves perfect accuracy. Results further show that both humans and models are disproportionately challenged by queries involving potentially reciprocal actions (e.g., kissing), suggesting shared pragmatic sensitivities rather than model-specific deficits. Additional analyses using Llama-2-70B log probabilities, a recoding of open-ended model responses, and grammaticality ratings of other sentences reveal systematic underestimation of model performance. We find that GPT-4o can align with either naive or expert grammaticality judgments, depending on prompt framing. These findings underscore the need for more careful experimental design and coding practices in LLM evaluation, and they challenge the assumption that current models are inherently weaker than humans at language comprehension.
△ Less
Submitted 13 May, 2025;
originally announced May 2025.
-
"It Felt Like I Was Left in the Dark": Exploring Information Needs and Design Opportunities for Family Caregivers of Older Adult Patients in Critical Care Settings
Authors:
Shihan Fu,
Bingsheng Yao,
Smit Desai,
Yuqi Hu,
Yuling Sun,
Samantha Stonbraker,
Yanjun Gao,
Elizabeth M. Goldberg,
Dakuo Wang
Abstract:
Older adult patients constitute a rapidly growing subgroup of Intensive Care Unit (ICU) patients. In these situations, their family caregivers are expected to represent the unconscious patients to access and interpret patients' medical information. However, caregivers currently have to rely on overloaded clinicians for information updates and typically lack the health literacy to understand comple…
▽ More
Older adult patients constitute a rapidly growing subgroup of Intensive Care Unit (ICU) patients. In these situations, their family caregivers are expected to represent the unconscious patients to access and interpret patients' medical information. However, caregivers currently have to rely on overloaded clinicians for information updates and typically lack the health literacy to understand complex medical information. Our project aims to explore the information needs of caregivers of ICU older adult patients, from which we can propose design opportunities to guide future AI systems. The project begins with formative interviews with 11 caregivers to identify their challenges in accessing and interpreting medical information; From these findings, we then synthesize design requirements and propose an AI system prototype to cope with caregivers' challenges. The system prototype has two key features: a timeline visualization to show the AI extracted and summarized older adult patients' key medical events; and an LLM-based chatbot to provide context-aware informational support. We conclude our paper by reporting on the follow-up user evaluation of the system and discussing future AI-based systems for ICU caregivers of older adults.
△ Less
Submitted 4 April, 2025; v1 submitted 7 February, 2025;
originally announced February 2025.
-
LED there be DoS: Exploiting variable bitrate IP cameras for network DoS
Authors:
Emmanuel Goldberg,
Oleg Brodt,
Aviad Elyashar,
Rami Puzis
Abstract:
Variable-bitrate video streaming is ubiquitous in video surveillance and CCTV, enabling high-quality video streaming while conserving network bandwidth. However, as the name suggests, variable-bitrate IP cameras can generate sharp traffic spikes depending on the dynamics of the visual input. In this paper, we show that the effectiveness of video compression can be reduced by up to 6X using a simpl…
▽ More
Variable-bitrate video streaming is ubiquitous in video surveillance and CCTV, enabling high-quality video streaming while conserving network bandwidth. However, as the name suggests, variable-bitrate IP cameras can generate sharp traffic spikes depending on the dynamics of the visual input. In this paper, we show that the effectiveness of video compression can be reduced by up to 6X using a simple laser LED pointing at a variable-bitrate IP camera, forcing the camera to generate excessive network traffic. Experiments with IP cameras connected to wired and wireless networks indicate that a laser attack on a single camera can cause significant packet loss in systems sharing the network with the camera and reduce the available bandwidth of a shared network link by 90%. This attack represents a new class of cyber-physical attacks that manipulate variable bitrate devices through changes in the physical environment without a digital presence on the device or the network. We also analyze the broader view of multidimensional cyberattacks that involve both the physical and digital realms and present a taxonomy that categorizes attacks based on their direction of influence (physical-to-digital or digital-to-physical) and their method of operation (environment-driven or device-driven), highlighting multiple areas for future research.
△ Less
Submitted 5 February, 2025;
originally announced February 2025.
-
Zero-shot Large Language Models for Long Clinical Text Summarization with Temporal Reasoning
Authors:
Maya Kruse,
Shiyue Hu,
Nicholas Derby,
Yifu Wu,
Samantha Stonbraker,
Bingsheng Yao,
Dakuo Wang,
Elizabeth Goldberg,
Yanjun Gao
Abstract:
Recent advancements in large language models (LLMs) have shown potential for transforming data processing in healthcare, particularly in understanding complex clinical narratives. This study evaluates the efficacy of zero-shot LLMs in summarizing long clinical texts that require temporal reasoning, a critical aspect for comprehensively capturing patient histories and treatment trajectories. We app…
▽ More
Recent advancements in large language models (LLMs) have shown potential for transforming data processing in healthcare, particularly in understanding complex clinical narratives. This study evaluates the efficacy of zero-shot LLMs in summarizing long clinical texts that require temporal reasoning, a critical aspect for comprehensively capturing patient histories and treatment trajectories. We applied a series of advanced zero-shot LLMs to extensive clinical documents, assessing their ability to integrate and accurately reflect temporal dynamics without prior task-specific training. While the models efficiently identified key temporal events, they struggled with chronological coherence over prolonged narratives. The evaluation, combining quantitative and qualitative methods, highlights the strengths and limitations of zero-shot LLMs in clinical text summarization. The results suggest that while promising, zero-shot LLMs require further refinement to effectively support clinical decision-making processes, underscoring the need for enhanced model training approaches that better capture the nuances of temporal information in long context medical documents.
△ Less
Submitted 24 February, 2025; v1 submitted 30 January, 2025;
originally announced January 2025.
-
On Efficient Algorithms For Partial Quantifier Elimination
Authors:
Eugene Goldberg
Abstract:
Earlier, we introduced Partial Quantifier Elimination (PQE). It is a $\mathit{generalization}$ of regular quantifier elimination where one can take a $\mathit{part}$ of the formula out of the scope of quantifiers. We apply PQE to CNF formulas of propositional logic with existential quantifiers. The appeal of PQE is that many problems like equivalence checking and model checking can be solved in te…
▽ More
Earlier, we introduced Partial Quantifier Elimination (PQE). It is a $\mathit{generalization}$ of regular quantifier elimination where one can take a $\mathit{part}$ of the formula out of the scope of quantifiers. We apply PQE to CNF formulas of propositional logic with existential quantifiers. The appeal of PQE is that many problems like equivalence checking and model checking can be solved in terms of PQE and the latter can be very efficient. The main flaw of current PQE solvers is that they do not $\mathit{reuse}$ learned information. The problem here is that these PQE solvers are based on the notion of clause redundancy and the latter is a $\mathit{structural}$ rather than $\mathit{semantic}$ property. In this paper, we provide two important theoretical results that enable reusing the information learned by a PQE solver. Such reusing can dramatically boost the efficiency of PQE like conflict clause learning boosts SAT solving.
△ Less
Submitted 3 June, 2024;
originally announced June 2024.
-
Structure-Aware Computing, Partial Quantifier Elimination And SAT
Authors:
Eugene Goldberg
Abstract:
Virtually all efficient algorithms of hardware verification are formula-specific i.e., take into account the structure of the formula at hand. So, those algorithms can be viewed as $\mathit{structure}$-$\mathit{aware\;computing}$ (SAC). We relate SAC and $\mathit{partial\;quantifier\;elimination}$ (PQE), a generalization of regular quantifier elimination. In PQE, one can take a $\mathit{part}$ of…
▽ More
Virtually all efficient algorithms of hardware verification are formula-specific i.e., take into account the structure of the formula at hand. So, those algorithms can be viewed as $\mathit{structure}$-$\mathit{aware\;computing}$ (SAC). We relate SAC and $\mathit{partial\;quantifier\;elimination}$ (PQE), a generalization of regular quantifier elimination. In PQE, one can take a $\mathit{part}$ of the formula out of the scope of quantifiers. Interpolation can be viewed as a special case of PQE. The objective of this paper is twofold. First, we want to show that new powerful methods of SAC can be formulated in terms of PQE. We use three hardware verification problems (testing by property generation, equivalence checking and model checking) to explain how SAC is performed by PQE. Second, we want to demonstrate that PQE solving itself can benefit from SAC. To this end, we describe a new SAT procedure based on SAC and then use it to introduce a structure-aware PQE algorithm.
△ Less
Submitted 1 March, 2025; v1 submitted 9 March, 2024;
originally announced March 2024.
-
Leveraging Transformers to Improve Breast Cancer Classification and Risk Assessment with Multi-modal and Longitudinal Data
Authors:
Yiqiu Shen,
Jungkyu Park,
Frank Yeung,
Eliana Goldberg,
Laura Heacock,
Farah Shamout,
Krzysztof J. Geras
Abstract:
Breast cancer screening, primarily conducted through mammography, is often supplemented with ultrasound for women with dense breast tissue. However, existing deep learning models analyze each modality independently, missing opportunities to integrate information across imaging modalities and time. In this study, we present Multi-modal Transformer (MMT), a neural network that utilizes mammography a…
▽ More
Breast cancer screening, primarily conducted through mammography, is often supplemented with ultrasound for women with dense breast tissue. However, existing deep learning models analyze each modality independently, missing opportunities to integrate information across imaging modalities and time. In this study, we present Multi-modal Transformer (MMT), a neural network that utilizes mammography and ultrasound synergistically, to identify patients who currently have cancer and estimate the risk of future cancer for patients who are currently cancer-free. MMT aggregates multi-modal data through self-attention and tracks temporal tissue changes by comparing current exams to prior imaging. Trained on 1.3 million exams, MMT achieves an AUROC of 0.943 in detecting existing cancers, surpassing strong uni-modal baselines. For 5-year risk prediction, MMT attains an AUROC of 0.826, outperforming prior mammography-based risk models. Our research highlights the value of multi-modal and longitudinal imaging in cancer diagnosis and risk stratification.
△ Less
Submitted 15 November, 2023; v1 submitted 6 November, 2023;
originally announced November 2023.
-
A Robust Measure on FDFAs Following Duo-Normalized Acceptance
Authors:
Dana Fisman,
Emmanuel Goldberg,
Oded Zimerman
Abstract:
Families of DFAs (FDFAs) are a computational model recognizing $ω$-regular languages. They were introduced in the quest of finding a Myhill-Nerode theorem for $ω$-regular languages, and obtaining learning algorithms. FDFAs have been shown to have good qualities in terms of the resources required for computing Boolean operations on them (complementation, union, and intersection) and answering decis…
▽ More
Families of DFAs (FDFAs) are a computational model recognizing $ω$-regular languages. They were introduced in the quest of finding a Myhill-Nerode theorem for $ω$-regular languages, and obtaining learning algorithms. FDFAs have been shown to have good qualities in terms of the resources required for computing Boolean operations on them (complementation, union, and intersection) and answering decision problems (emptiness and equivalence); all can be done in non-deterministic logspace. In this paper we study FDFAs with a new type of acceptance condition, duo-normalization, that generalizes the traditional normalization acceptance type. We show that duo-normalized FDFAs are advantageous to normalized FDFAs in terms of succinctness as they can be exponentially smaller. Fortunately this added succinctness doesn't come at the cost of increasing the complexity of Boolean operations and decision problems -- they can still be preformed in non-deterministic logspace.
An important measure of the complexity of an $ω$-regular language, is its position in the Wagner hierarchy. It is based on the inclusion measure of Muller automata and for the common $ω$-automata there exist algorithms computing their position. We develop a similarly robust measure for duo-normalized (and normalized) FDFAs, which we term the diameter measure. We show that the diameter measure corresponds one-to-one to the position on the Wagner hierarchy. We show that computing it for duo-normalized FDFAs is PSPACE-complete, while it can be done in non-deterministic logspace for traditional FDFAs.
△ Less
Submitted 21 February, 2024; v1 submitted 24 October, 2023;
originally announced October 2023.
-
Causal interventions expose implicit situation models for commonsense language understanding
Authors:
Takateru Yamakoshi,
James L. McClelland,
Adele E. Goldberg,
Robert D. Hawkins
Abstract:
Accounts of human language processing have long appealed to implicit ``situation models'' that enrich comprehension with relevant but unstated world knowledge. Here, we apply causal intervention techniques to recent transformer models to analyze performance on the Winograd Schema Challenge (WSC), where a single context cue shifts interpretation of an ambiguous pronoun. We identify a relatively sma…
▽ More
Accounts of human language processing have long appealed to implicit ``situation models'' that enrich comprehension with relevant but unstated world knowledge. Here, we apply causal intervention techniques to recent transformer models to analyze performance on the Winograd Schema Challenge (WSC), where a single context cue shifts interpretation of an ambiguous pronoun. We identify a relatively small circuit of attention heads that are responsible for propagating information from the context word that guides which of the candidate noun phrases the pronoun ultimately attends to. We then compare how this circuit behaves in a closely matched ``syntactic'' control where the situation model is not strictly necessary. These analyses suggest distinct pathways through which implicit situation models are constructed to guide pronoun resolution.
△ Less
Submitted 7 June, 2023; v1 submitted 6 June, 2023;
originally announced June 2023.
-
Verification Of Partial Quantifier Elimination
Authors:
Eugene Goldberg
Abstract:
Quantifier elimination (QE) is an important problem that has numerous applications. Unfortunately, QE is computationally very hard. Earlier we introduced a generalization of QE called $\mathit{partial}$ QE (or PQE for short). PQE allows to unquantify a $\mathit{part}$ of the formula. The appeal of PQE is twofold. First, many important problems can be solved in terms of PQE. Second, PQE can be dras…
▽ More
Quantifier elimination (QE) is an important problem that has numerous applications. Unfortunately, QE is computationally very hard. Earlier we introduced a generalization of QE called $\mathit{partial}$ QE (or PQE for short). PQE allows to unquantify a $\mathit{part}$ of the formula. The appeal of PQE is twofold. First, many important problems can be solved in terms of PQE. Second, PQE can be drastically faster than QE if only a small part of the formula gets unquantified. To make PQE practical, one needs an algorithm for verifying the solution produced by a PQE solver. In this paper, we describe a very simple SAT-based verifier called $\mathit{VerPQE}$ and provide some experimental results.
△ Less
Submitted 3 April, 2023; v1 submitted 27 March, 2023;
originally announced March 2023.
-
Partial Quantifier Elimination And Property Generation
Authors:
Eugene Goldberg
Abstract:
We study partial quantifier elimination (PQE) for propositional CNF formulas with existential quantifiers. PQE is a generalization of quantifier elimination where one can limit the set of clauses taken out of the scope of quantifiers to a small subset of clauses. The appeal of PQE is that many verification problems (e.g. equivalence checking and model checking) can be solved in terms of PQE and th…
▽ More
We study partial quantifier elimination (PQE) for propositional CNF formulas with existential quantifiers. PQE is a generalization of quantifier elimination where one can limit the set of clauses taken out of the scope of quantifiers to a small subset of clauses. The appeal of PQE is that many verification problems (e.g. equivalence checking and model checking) can be solved in terms of PQE and the latter can be dramatically simpler than full quantifier elimination. We show that PQE can be used for property generation that can be viewed as a generalization of testing. The objective here is to produce an $\mathit{unwanted}$ property of a design implementation thus exposing a bug. We introduce two PQE solvers called $\mathit{EG}$-$\mathit{PQE}$ and $\mathit{EG}$-$\mathit{PQE}^+$. $\mathit{EG}$-$\mathit{PQE}$ is a very simple SAT-based algorithm. $\mathit{EG}$-$\mathit{PQE}^+$ is more sophisticated and robust than $\mathit{EG}$-$\mathit{PQE}$. We use these PQE solvers to find an unwanted property (namely, an unwanted invariant) of a buggy FIFO buffer. We also apply them to invariant generation for sequential circuits from a HWMCC benchmark set. Finally, we use these solvers to generate properties of a combinational circuit that mimic symbolic simulation.
△ Less
Submitted 24 May, 2023; v1 submitted 24 March, 2023;
originally announced March 2023.
-
Deep learning in a bilateral brain with hemispheric specialization
Authors:
Chandramouli Rajagopalan,
David Rawlinson,
Elkhonon Goldberg,
Gideon Kowadlo
Abstract:
The brains of all bilaterally symmetric animals on Earth are divided into left and right hemispheres. The anatomy and functionality of the hemispheres have a large degree of overlap, but there are asymmetries, and they specialise in possesses different attributes. Other authors have used computational models to mimic hemispheric asymmetries with a focus on reproducing human data on semantic and vi…
▽ More
The brains of all bilaterally symmetric animals on Earth are divided into left and right hemispheres. The anatomy and functionality of the hemispheres have a large degree of overlap, but there are asymmetries, and they specialise in possesses different attributes. Other authors have used computational models to mimic hemispheric asymmetries with a focus on reproducing human data on semantic and visual processing tasks. We took a different approach and aimed to understand how dual hemispheres in a bilateral architecture interact to perform well in a given task. We propose a bilateral artificial neural network that imitates lateralisation observed in nature: that the left hemisphere specialises in local features and the right in global features. We used different training objectives to achieve the desired specialisation and tested it on an image classification task with two different CNN backbones: ResNet and VGG. Our analysis found that the hemispheres represent complementary features that are exploited by a network head that implements a type of weighted attention. The bilateral architecture outperformed a range of baselines of similar representational capacity that do not exploit differential specialisation, with the exception of a conventional ensemble of unilateral networks trained on dual training objectives for local and global features. The results demonstrate the efficacy of bilateralism, contribute to the discussion of bilateralism in biological brains, and the principle may serve as an inductive bias for new AI systems.
△ Less
Submitted 10 July, 2024; v1 submitted 8 September, 2022;
originally announced September 2022.
-
Singleton-type bounds for list-decoding and list-recovery, and related results
Authors:
Eitan Goldberg,
Chong Shangguan,
Itzhak Tamo
Abstract:
List-decoding and list-recovery are important generalizations of unique decoding that received considerable attention over the years. However, the optimal trade-off among list-decoding (resp. list-recovery) radius, list size, and the code rate are not fully understood in both problems. This paper takes a step towards this direction when the list size is a given constant and the alphabet size is la…
▽ More
List-decoding and list-recovery are important generalizations of unique decoding that received considerable attention over the years. However, the optimal trade-off among list-decoding (resp. list-recovery) radius, list size, and the code rate are not fully understood in both problems. This paper takes a step towards this direction when the list size is a given constant and the alphabet size is large (as a function of the code length). We prove a new Singleton-type upper bound for list-decodable codes, which improves upon the previously known bound by roughly a factor of $1/L$, where $L$ is the list size. We also prove a Singleton-type upper bound for list-recoverable codes, which is to the best of our knowledge, the first such bound for list-recovery. We apply these results to obtain new lower bounds that are optimal up to a multiplicative constant on the list size for list-decodable and list-recoverable codes with rates approaching capacity.
Moreover, we show that list-decodable \emph{nonlinear} codes can strictly outperform list-decodable linear codes. More precisely, we show that there is a gap for a wide range of parameters, which grows fast with the alphabet size, between the size of the largest list-decodable nonlinear code and the size of the largest list-decodable linear codes. This is achieved by a novel connection between list-decoding and the notion of sparse hypergraphs in extremal combinatorics. We remark that such a gap is not known to exist in the problem of unique decoding.
Lastly, we show that list-decodability or recoverability of codes implies in some sense good unique decodability.
△ Less
Submitted 10 December, 2021;
originally announced December 2021.
-
List-decoding and list-recovery of Reed-Solomon codes beyond the Johnson radius for any rate
Authors:
Eitan Goldberg,
Chong Shangguan,
Itzhak Tamo
Abstract:
Understanding the limits of list-decoding and list-recovery of Reed-Solomon (RS) codes is of prime interest in coding theory and has attracted a lot of attention in recent decades. However, the best possible parameters for these problems are still unknown, and in this paper, we take a step in this direction. We show the existence of RS codes that are list-decodable or list-recoverable beyond the J…
▽ More
Understanding the limits of list-decoding and list-recovery of Reed-Solomon (RS) codes is of prime interest in coding theory and has attracted a lot of attention in recent decades. However, the best possible parameters for these problems are still unknown, and in this paper, we take a step in this direction. We show the existence of RS codes that are list-decodable or list-recoverable beyond the Johnson radius for \emph{any} rate, with a polynomial field size in the block length. In particular, we show that for any $ε\in (0,1)$ there exist RS codes that are list-decodable from radius $1-ε$ and rate less than $\fracε{2-ε}$, with constant list size. We deduce our results by extending and strengthening a recent result of Ferber, Kwan, and Sauermann on puncturing codes with large minimum distance and by utilizing the underlying code's linearity.
△ Less
Submitted 31 May, 2021;
originally announced May 2021.
-
From partners to populations: A hierarchical Bayesian account of coordination and convention
Authors:
Robert D. Hawkins,
Michael Franke,
Michael C. Frank,
Adele E. Goldberg,
Kenny Smith,
Thomas L. Griffiths,
Noah D. Goodman
Abstract:
Languages are powerful solutions to coordination problems: they provide stable, shared expectations about how the words we say correspond to the beliefs and intentions in our heads. Yet language use in a variable and non-stationary social environment requires linguistic representations to be flexible: old words acquire new ad hoc or partner-specific meanings on the fly. In this paper, we introduce…
▽ More
Languages are powerful solutions to coordination problems: they provide stable, shared expectations about how the words we say correspond to the beliefs and intentions in our heads. Yet language use in a variable and non-stationary social environment requires linguistic representations to be flexible: old words acquire new ad hoc or partner-specific meanings on the fly. In this paper, we introduce CHAI (Continual Hierarchical Adaptation through Inference), a hierarchical Bayesian theory of coordination and convention formation that aims to reconcile the long-standing tension between these two basic observations. We argue that the central computational problem of communication is not simply transmission, as in classical formulations, but continual learning and adaptation over multiple timescales. Partner-specific common ground quickly emerges from social inferences within dyadic interactions, while community-wide social conventions are stable priors that have been abstracted away from interactions with multiple partners. We present new empirical data alongside simulations showing how our model provides a computational foundation for several phenomena that have posed a challenge for previous accounts: (1) the convergence to more efficient referring expressions across repeated interaction with the same partner, (2) the gradual transfer of partner-specific common ground to strangers, and (3) the influence of communicative context on which conventions eventually form.
△ Less
Submitted 2 December, 2021; v1 submitted 12 April, 2021;
originally announced April 2021.
-
Investigating representations of verb bias in neural language models
Authors:
Robert D. Hawkins,
Takateru Yamakoshi,
Thomas L. Griffiths,
Adele E. Goldberg
Abstract:
Languages typically provide more than one grammatical construction to express certain types of messages. A speaker's choice of construction is known to depend on multiple factors, including the choice of main verb -- a phenomenon known as \emph{verb bias}. Here we introduce DAIS, a large benchmark dataset containing 50K human judgments for 5K distinct sentence pairs in the English dative alternati…
▽ More
Languages typically provide more than one grammatical construction to express certain types of messages. A speaker's choice of construction is known to depend on multiple factors, including the choice of main verb -- a phenomenon known as \emph{verb bias}. Here we introduce DAIS, a large benchmark dataset containing 50K human judgments for 5K distinct sentence pairs in the English dative alternation. This dataset includes 200 unique verbs and systematically varies the definiteness and length of arguments. We use this dataset, as well as an existing corpus of naturally occurring data, to evaluate how well recent neural language models capture human preferences. Results show that larger models perform better than smaller models, and transformer architectures (e.g. GPT-2) tend to out-perform recurrent architectures (e.g. LSTMs) even under comparable parameter and training settings. Additional analyses of internal feature representations suggest that transformers may better integrate specific lexical information with grammatical constructions.
△ Less
Submitted 15 October, 2020; v1 submitted 5 October, 2020;
originally announced October 2020.
-
Advancing computerized cognitive training for early Alzheimer's disease in a Covid-19 pandemic and post-pandemic world
Authors:
Kaylee A. Bodner,
Terry E. Goldberg,
D. P. Devanand,
P. Murali Doraiswamy
Abstract:
The COVID-19 pandemic has transformed mobile health applications and telemedicine from nice to have tools into essential healthcare infrastructure. This need is particularly great for the elderly who, due to their greater risk for infection, may avoid medical facilities or be required to self-isolate. These are also the very groups at highest risk for cognitive decline. For example, during the COV…
▽ More
The COVID-19 pandemic has transformed mobile health applications and telemedicine from nice to have tools into essential healthcare infrastructure. This need is particularly great for the elderly who, due to their greater risk for infection, may avoid medical facilities or be required to self-isolate. These are also the very groups at highest risk for cognitive decline. For example, during the COVID-19 pandemic artificially intelligent conversational agents were employed by hospitals and government agencies (such as the CDC) to field queries from patients about symptoms and treatments. Digital health tools also proved invaluable to provide neuropsychiatric and psychological self-help to people isolated at home or in retirement centers and nursing homes.
△ Less
Submitted 15 May, 2020; v1 submitted 29 April, 2020;
originally announced April 2020.
-
On Verifying Designs With Incomplete Specification
Authors:
Eugene Goldberg
Abstract:
Incompleteness of a specification $\mathit{Spec}$ creates two problems. First, an implementation $\mathit{Impl}$ of $\mathit{Spec}$ may have some $\mathit{unwanted}$ properties that $\mathit{Spec}$ does not forbid. Second, $\mathit{Impl}$ may break some $\mathit{desired}$ properties that are not in $\mathit{Spec}$. In either case, $\mathit{Spec}$ fails to expose bugs of $\mathit{Impl}$. In an earl…
▽ More
Incompleteness of a specification $\mathit{Spec}$ creates two problems. First, an implementation $\mathit{Impl}$ of $\mathit{Spec}$ may have some $\mathit{unwanted}$ properties that $\mathit{Spec}$ does not forbid. Second, $\mathit{Impl}$ may break some $\mathit{desired}$ properties that are not in $\mathit{Spec}$. In either case, $\mathit{Spec}$ fails to expose bugs of $\mathit{Impl}$. In an earlier paper, we addressed the first problem above by a technique called Partial Quantifier Elimination (PQE). In contrast to complete QE, in PQE, one takes out of the scope of quantifiers only a small piece of the formula. We used PQE to generate properties of $\mathit{Impl}$ i.e. those $\mathit{consistent}$ with $\mathit{Impl}$. Generation of an unwanted property means that $\mathit{Impl}$ is buggy. In this paper, we address the second problem above by using PQE to generate false properties i.e those that are $\mathit{inconsistent}$ with $\mathit{Impl}$. Such properties are meant to imitate the missing properties of $\mathit{Spec}$ that are not satisfied by $\mathit{Impl}$ (if any). A false property is generated by modifying a piece of a quantified formula describing 'the truth table' of $\mathit{Impl}$ and taking this piece out of the scope of quantifiers. By modifying different pieces of this formula one can generate a "structurally complete" set of false properties. By generating tests detecting false properties of $\mathit{Impl}$ one produces a high quality test set. We apply our approach to verification of combinational and sequential circuits.
△ Less
Submitted 19 April, 2020;
originally announced April 2020.
-
Generation Of A Complete Set Of Properties
Authors:
Eugene Goldberg
Abstract:
One of the problems of formal verification is that it is not functionally complete due the incompleteness of specifications. An implementation meeting an incomplete specification may still have a lot of bugs. In testing, this issue is addressed by replacing functional completeness with $\mathit{structural}$ one. The latter is achieved by generating a set of tests probing every piece of a design im…
▽ More
One of the problems of formal verification is that it is not functionally complete due the incompleteness of specifications. An implementation meeting an incomplete specification may still have a lot of bugs. In testing, this issue is addressed by replacing functional completeness with $\mathit{structural}$ one. The latter is achieved by generating a set of tests probing every piece of a design implementation. We show that a similar approach can be used in formal verification. The idea here is to generate a property of the implementation at hand that is not implied by the specification. Finding such a property means that the specification is not complete. If this is an $\mathit{unwanted}$ property, the implementation is buggy. Otherwise, a new specification property needs to be added. Generation of implementation properties related to different parts of the design followed by adding new specification properties produces a $\mathit{structurally}$-$\mathit{complete\:specification}$. Implementation properties are built by $\mathit{partial\: quantifier\:elimination}$, a technique where only a part of the formula is taken out of the scope of quantifiers. An implementation property is generated by applying partial quantifier elimination to a formula defining the "truth table" of the implementation. We show how our approach works on specifications of combinational and sequential circuits.
△ Less
Submitted 12 October, 2020; v1 submitted 13 April, 2020;
originally announced April 2020.
-
Partial Quantifier Elimination By Certificate Clauses
Authors:
Eugene Goldberg
Abstract:
In this report, we study partial quantifier elimination (PQE) for propositional CNF formulas. PQE is a generalization of quantifier elimination where one can limit the set of clauses taken out of the scope of quantifiers to a small subset of target clauses. The appeal of PQE is twofold. First, PQE can be dramatically simpler than full quantifier elimination. Second, PQE provides a language for per…
▽ More
In this report, we study partial quantifier elimination (PQE) for propositional CNF formulas. PQE is a generalization of quantifier elimination where one can limit the set of clauses taken out of the scope of quantifiers to a small subset of target clauses. The appeal of PQE is twofold. First, PQE can be dramatically simpler than full quantifier elimination. Second, PQE provides a language for performing incremental computations. Many verification problems (e.g. equivalence checking and model checking) are inherently incremental and so can be solved in terms of PQE. Our approach is based on deriving clauses depending only on unquantified variables that make the target clauses $\mathit{redundant}$. Proving redundancy of a target clause is done by construction of a "certificate" clause implying the former. We describe a PQE algorithm called $\mathit{START}$ that employs the approach above. To evaluate $\mathit{START}$, we apply it to invariant generation for a sequential circuit $N$. The goal of invariant generation is to find an $\mathit{unwanted}$ invariant of $N$ proving unreachability of a state that is supposed to be reachable. If $N$ has an unwanted invariant, it is buggy. Our experiments with FIFO buffers and HWMCC-13 benchmarks suggest that $\mathit{START}$ can be used for detecting bugs that are hard to find by existing methods.
△ Less
Submitted 17 August, 2024; v1 submitted 21 March, 2020;
originally announced March 2020.
-
Generalizing meanings from partners to populations: Hierarchical inference supports convention formation on networks
Authors:
Robert D. Hawkins,
Noah D. Goodman,
Adele E. Goldberg,
Thomas L. Griffiths
Abstract:
A key property of linguistic conventions is that they hold over an entire community of speakers, allowing us to communicate efficiently even with people we have never met before. At the same time, much of our language use is partner-specific: we know that words may be understood differently by different people based on our shared history. This poses a challenge for accounts of convention formation…
▽ More
A key property of linguistic conventions is that they hold over an entire community of speakers, allowing us to communicate efficiently even with people we have never met before. At the same time, much of our language use is partner-specific: we know that words may be understood differently by different people based on our shared history. This poses a challenge for accounts of convention formation. Exactly how do agents make the inferential leap to community-wide expectations while maintaining partner-specific knowledge? We propose a hierarchical Bayesian model to explain how speakers and listeners solve this inductive problem. To evaluate our model's predictions, we conducted an experiment where participants played an extended natural-language communication game with different partners in a small community. We examine several measures of generalization and find key signatures of both partner-specificity and community convergence that distinguish our model from alternatives. These results suggest that partner-specificity is not only compatible with the formation of community-wide conventions, but may facilitate it when coupled with a powerful inductive mechanism.
△ Less
Submitted 30 May, 2020; v1 submitted 4 February, 2020;
originally announced February 2020.
-
Partial Quantifier Elimination With Learning
Authors:
Eugene Goldberg
Abstract:
We consider a modification of the Quantifier Elimination (QE) problem called Partial QE (PQE). In PQE, only a small part of the formula is taken out of the scope of quantifiers. The appeal of PQE is that many verification problems, e.g. equivalence checking and model checking, reduce to PQE and the latter is much easier than complete QE. Earlier, we introduced a PQE algorithm based on the machiner…
▽ More
We consider a modification of the Quantifier Elimination (QE) problem called Partial QE (PQE). In PQE, only a small part of the formula is taken out of the scope of quantifiers. The appeal of PQE is that many verification problems, e.g. equivalence checking and model checking, reduce to PQE and the latter is much easier than complete QE. Earlier, we introduced a PQE algorithm based on the machinery of D-sequents. A D-sequent is a record stating that a clause is redundant in a quantified CNF formula in a specified subspace. To make this algorithm efficient, it is important to reuse learned D-sequents. However, reusing D-sequents is not as easy as conflict clauses in SAT-solvers because redundancy is a structural rather than a semantic property. Earlier, we modified the definition of D-sequents to enable their safe reusing. In this paper, we present a PQE algorithm based on new D-sequents. It is different from its predecessor in two aspects. First, the new algorithm can learn and reuse D-sequents. Second, it proves clauses redundant one by one and thus backtracks as soon as the current target clause is proved redundant in the current subspace. This makes the new PQE algorithm similar to a SAT-solver that backtracks as soon as just one clause is falsified. We show experimentally that the new PQE algorithm outperforms its predecessor.
△ Less
Submitted 13 July, 2019; v1 submitted 25 June, 2019;
originally announced June 2019.
-
Quantifier Elimination With Structural Learning
Authors:
Eugene Goldberg
Abstract:
We consider the Quantifier Elimination (QE) problem for propositional CNF formulas with existential quantifiers. QE plays a key role in formal verification. Earlier, we presented an approach based on the following observation. To perform QE, one just needs to add a set of clauses depending on free variables that makes the quantified clauses (i.e. clauses with quantified variables) redundant. To im…
▽ More
We consider the Quantifier Elimination (QE) problem for propositional CNF formulas with existential quantifiers. QE plays a key role in formal verification. Earlier, we presented an approach based on the following observation. To perform QE, one just needs to add a set of clauses depending on free variables that makes the quantified clauses (i.e. clauses with quantified variables) redundant. To implement this approach, we introduced a branching algorithm making quantified clauses redundant in subspaces and merging the results of branches. To implement this algorithm we developed the machinery of D-sequents. A D-sequent is a record stating that a quantified clause is redundant in a specified subspace. Redundancy of a clause is a structural property (i.e. it holds only for a subset of logically equivalent formulas as opposed to a semantic property). So, re-using D-sequents is not as easy as re-using conflict clauses in SAT-solving. In this paper, we address this problem. We introduce a new definition of D-sequents that enables their re-usability. We develop a theory showing under what conditions a D-sequent can be safely re-used.
△ Less
Submitted 14 October, 2018; v1 submitted 29 September, 2018;
originally announced October 2018.
-
Improving Convergence Rate Of IC3
Authors:
Eugene Goldberg
Abstract:
IC3, a well-known model checker, proves a property of a transition system by building a sequence of formulas $F_0,\dots,F_k$. Formula $F_i$, $0 \leq i \leq k$ over-approximates the set of states reachable in at most $i$ transitions. The basic algorithm of IC3 cannot guarantee that the value of $k$ never exceeds the reachability diameter of the system. We describe an algorithm called IC4 that gives…
▽ More
IC3, a well-known model checker, proves a property of a transition system by building a sequence of formulas $F_0,\dots,F_k$. Formula $F_i$, $0 \leq i \leq k$ over-approximates the set of states reachable in at most $i$ transitions. The basic algorithm of IC3 cannot guarantee that the value of $k$ never exceeds the reachability diameter of the system. We describe an algorithm called IC4 that gives such a guarantee. (IC4 stands for 'IC3 + Improved Convergence'). One can argue that the average convergence rate of IC4 is better than for IC3 as well. Improving convergence can facilitate some other variations of the basic algorithm. As an example, we describe a version of IC4 employing property decomposition. The latter means replacing an original (strong) property with a conjunction of weaker properties to prove by IC4. We argue that addressing the convergence problem is important for making the property decomposition approach work.
△ Less
Submitted 17 October, 2018; v1 submitted 3 September, 2018;
originally announced September 2018.
-
Complete Test Sets And Their Approximations
Authors:
Eugene Goldberg
Abstract:
We use testing to check if a combinational circuit $N$ always evaluates to 0 (written as $N \equiv 0$). We call a set of tests proving $N \equiv 0$ a complete test set (CTS). The conventional point of view is that to prove $N \equiv 0$ one has to generate a trivial CTS. It consists of all $2^{|X|}$ input assignments where $X$ is the set of input variables of $N$. We use the notion of a Stable Set…
▽ More
We use testing to check if a combinational circuit $N$ always evaluates to 0 (written as $N \equiv 0$). We call a set of tests proving $N \equiv 0$ a complete test set (CTS). The conventional point of view is that to prove $N \equiv 0$ one has to generate a trivial CTS. It consists of all $2^{|X|}$ input assignments where $X$ is the set of input variables of $N$. We use the notion of a Stable Set of Assignments (SSA) to show that one can build a non-trivial CTS consisting of less than $2^{|X|}$ tests. Given an unsatisfiable CNF formula $H(W)$, an SSA of $H$ is a set of assignments to $W$ that proves unsatisfiability of $H$. A trivial SSA is the set of all $2^{|W|}$ assignments to $W$. Importantly, real-life formulas can have non-trivial SSAs that are much smaller than $2^{|W|}$. In general, construction of even non-trivial CTSs is inefficient. We describe a much more efficient approach where tests are extracted from an SSA built for a `projection' of $N$ on a subset of variables of $N$. These tests can be viewed as an approximation of a CTS for $N$. We give experimental results and describe potential applications of this approach.
△ Less
Submitted 16 August, 2018;
originally announced August 2018.
-
Generation of complete test sets
Authors:
Eugene Goldberg
Abstract:
We use testing to check if a combinational circuit N always evaluates to 0. The usual point of view is that to prove that N always evaluates to 0 one has to check the value of N for all 2^|X| input assignments where X is the set of input variables of N. We use the notion of a Stable Set of Assignments (SSA) to show that one can build a complete test set (i.e. a test set proving that N always evalu…
▽ More
We use testing to check if a combinational circuit N always evaluates to 0. The usual point of view is that to prove that N always evaluates to 0 one has to check the value of N for all 2^|X| input assignments where X is the set of input variables of N. We use the notion of a Stable Set of Assignments (SSA) to show that one can build a complete test set (i.e. a test set proving that N always evaluates to 0) that consists of less than 2^|X| tests. Given an unsatisfiable CNF formula H(W), an SSA of H is a set of assignments to W proving unsatisfiability of H. A trivial SSA is the set of all 2^|W| assignments to W. Importantly, real-life formulas can have SSAs that are much smaller than 2^|W|. Generating a complete test set for N using only the machinery of SSAs is inefficient. We describe a much faster algorithm that combines computation of SSAs with resolution derivation and produces a complete test set for a "projection" of N on a subset of variables of N. We give experimental results and describe potential applications of this algorithm.
△ Less
Submitted 30 March, 2018;
originally announced April 2018.
-
Efficient Verification of Multi-Property Designs (The Benefit of Wrong Assumptions) (Extended Version)
Authors:
Eugene Goldberg,
Matthias Gudemann,
Daniel Kroening,
Rajdeep Mukherjee
Abstract:
We consider the problem of efficiently checking a set of safety properties P1,....,Pk of one design. We introduce a new approach called JA-verification where JA stands for "Just-Assume" (as opposed to "assume-guarantee"). In this approach, when proving property Pi, one assumes that every property Pj for j!=i holds. The process of proving properties either results in showing that P1,....,Pk hold wi…
▽ More
We consider the problem of efficiently checking a set of safety properties P1,....,Pk of one design. We introduce a new approach called JA-verification where JA stands for "Just-Assume" (as opposed to "assume-guarantee"). In this approach, when proving property Pi, one assumes that every property Pj for j!=i holds. The process of proving properties either results in showing that P1,....,Pk hold without any assumptions or finding a "debugging set" of properties. The latter identifies a subset of failed properties that cause failure of other properties. The design behaviors that cause the properties in the debugging set to fail must be fixed first. Importantly, in our approach, there is no need to prove the assumptions used. We describe the theory behind our approach and report experimental results that demonstrate substantial gains in performance, especially in the cases where a small debugging set exists.
△ Less
Submitted 8 March, 2018; v1 submitted 15 November, 2017;
originally announced November 2017.
-
Property Checking Without Inductive Invariants
Authors:
Eugene Goldberg
Abstract:
We introduce a procedure for proving safety properties. This procedure is based on a technique called Partial Quantifier Elimination (PQE). In contrast to complete quantifier elimination, in PQE, only a part of the formula is taken out of the scope of quantifiers. So, PQE can be dramatically more efficient than complete quantifier elimination. The appeal of our procedure is twofold. First, it can…
▽ More
We introduce a procedure for proving safety properties. This procedure is based on a technique called Partial Quantifier Elimination (PQE). In contrast to complete quantifier elimination, in PQE, only a part of the formula is taken out of the scope of quantifiers. So, PQE can be dramatically more efficient than complete quantifier elimination. The appeal of our procedure is twofold. First, it can prove a property without generating an inductive invariant. Second, it employs depth-first search and so can be used to find deep bugs.
△ Less
Submitted 29 October, 2020; v1 submitted 18 February, 2016;
originally announced February 2016.
-
Property Checking By Logic Relaxation
Authors:
Eugene Goldberg
Abstract:
We introduce a new framework for Property Checking (PC) of sequential circuits. It is based on a method called Lo-gic Relaxation (LoR). Given a safety property, the LoR method relaxes the transition system at hand, which leads to expanding the set of reachable states. For j-th time frame, the LoR method computes a superset A_j of the set of bad states reachable in j transitions only by the relaxed…
▽ More
We introduce a new framework for Property Checking (PC) of sequential circuits. It is based on a method called Lo-gic Relaxation (LoR). Given a safety property, the LoR method relaxes the transition system at hand, which leads to expanding the set of reachable states. For j-th time frame, the LoR method computes a superset A_j of the set of bad states reachable in j transitions only by the relaxed system. Set A_j is constructed by a technique called partial quantifier elimination. If A_j does not contain a bad state and this state is reachable in j transitions in the relaxed system, it is also reachable in the original system. Hence the property in question does not hold.
The appeal of PC by LoR is as follows. An inductive invariant (or a counterexample) generated by LoR is a result of computing the states reachable only in the relaxed system. So, the complexity of PC can be drastically reduced by finding a "faulty" relaxation that is close to the original system. This is analogous to equivalence checking whose complexity strongly depends on how similar the designs to be compared are.
△ Less
Submitted 12 January, 2016;
originally announced January 2016.
-
Equivalence Checking By Logic Relaxation
Authors:
Eugene Goldberg
Abstract:
We introduce a new framework for Equivalence Checking (EC) of Boolean circuits based on a general technique called Logic Relaxation (LoR). The essence of LoR is to relax the formula to be solved and compute a superset S of the set of new behaviors. Namely, S contains all new satisfying assignments that appeared due to relaxation and does not contain assignments satisfying the original formula. Set…
▽ More
We introduce a new framework for Equivalence Checking (EC) of Boolean circuits based on a general technique called Logic Relaxation (LoR). The essence of LoR is to relax the formula to be solved and compute a superset S of the set of new behaviors. Namely, S contains all new satisfying assignments that appeared due to relaxation and does not contain assignments satisfying the original formula. Set S is generated by a procedure called partial quantifier elimination. If all possible bad behaviors are in S, the original formula cannot have them and so the property described by this formula holds. The appeal of EC by LoR is twofold. First, it facilitates generation of powerful inductive proofs. Second, proving inequivalence comes down to checking the presence of some bad behaviors in the relaxed formula i.e. in a simpler version of the original formula. We give some experimental evidence that supports our approach.
△ Less
Submitted 11 July, 2016; v1 submitted 4 November, 2015;
originally announced November 2015.
-
Equivalence Checking and Simulation By Computing Range Reduction
Authors:
Eugene Goldberg
Abstract:
We introduce new methods of equivalence checking and simulation based on Computing Range Reduction (CRR). Given a combinational circuit $N$, the CRR problem is to compute the set of outputs that disappear from the range of $N$ if a set of inputs of $N$ is excluded from consideration. Importantly, in many cases, range reduction can be efficiently found even if computing the entire range of $N$ is i…
▽ More
We introduce new methods of equivalence checking and simulation based on Computing Range Reduction (CRR). Given a combinational circuit $N$, the CRR problem is to compute the set of outputs that disappear from the range of $N$ if a set of inputs of $N$ is excluded from consideration. Importantly, in many cases, range reduction can be efficiently found even if computing the entire range of $N$ is infeasible.
Solving equivalence checking by CRR facilitates generation of proofs of equivalence that mimic a "cut propagation" approach. A limited version of such an approach has been successfully used by commercial tools. Functional verification of a circuit $N$ by simulation can be viewed as a way to reduce the complexity of computing the range of $N$. Instead of finding the entire range of $N$ and checking if it contains a bad output, such a range is computed only for one input. Simulation by CRR offers an alternative way of coping with the complexity of range computation. The idea is to exclude a subset of inputs of $N$ and compute the range reduction caused by such an exclusion. If the set of disappeared outputs contains a bad one, then $N$ is buggy.
△ Less
Submitted 11 August, 2015; v1 submitted 8 July, 2015;
originally announced July 2015.
-
Bug Hunting By Computing Range Reduction
Authors:
Eugene Goldberg,
Panagiotis Manolios
Abstract:
We describe a method of model checking called Computing Range Reduction (CRR). The CRR method is based on derivation of clauses that reduce the set of traces of reachable states in such a way that at least one counterexample remains (if any). These clauses are derived by a technique called Partial Quantifier Elimination (PQE). Given a number n, the CRR method finds a counterexample of length less…
▽ More
We describe a method of model checking called Computing Range Reduction (CRR). The CRR method is based on derivation of clauses that reduce the set of traces of reachable states in such a way that at least one counterexample remains (if any). These clauses are derived by a technique called Partial Quantifier Elimination (PQE). Given a number n, the CRR method finds a counterexample of length less or equal to n or proves that such a counterexample does not exist. We show experimentally that a PQE-solver we developed earlier can be efficiently applied to derivation of constraining clauses for transition relations of realistic benchmarks.
One of the most appealing features of the CRR method is that it can potentially find long counterexamples. This is the area where it can beat model checkers computing reachable states (or their approximations as in IC3) or SAT-based methods of bounded model checking. PQE cannot be efficiently simulated by a SAT-solver. This is important because the current research in model checking is dominated by SAT-based algorithms. The CRR method is a reminder that one should not put all eggs in one basket.
△ Less
Submitted 11 October, 2014; v1 submitted 29 August, 2014;
originally announced August 2014.
-
Partial Quantifier Elimination
Authors:
Eugene Goldberg,
Panagiotis Manolios
Abstract:
We consider the problem of Partial Quantifier Elimination (PQE). Given formula exists(X)[F(X,Y) & G(X,Y)], where F, G are in conjunctive normal form, the PQE problem is to find a formula F*(Y) such that F* & exists(X)[G] is logically equivalent to exists(X)[F & G]. We solve the PQE problem by generating and adding to F clauses over the free variables that make the clauses of F with quantified vari…
▽ More
We consider the problem of Partial Quantifier Elimination (PQE). Given formula exists(X)[F(X,Y) & G(X,Y)], where F, G are in conjunctive normal form, the PQE problem is to find a formula F*(Y) such that F* & exists(X)[G] is logically equivalent to exists(X)[F & G]. We solve the PQE problem by generating and adding to F clauses over the free variables that make the clauses of F with quantified variables redundant. The traditional Quantifier Elimination problem (QE) is a special case of PQE where G is empty so all clauses of the input formula with quantified variables need to be made redundant. The importance of PQE is twofold. First, many problems are more naturally formulated in terms of PQE rather than QE. Second, in many cases PQE can be solved more efficiently than QE. We describe a PQE algorithm based on the machinery of dependency sequents and give experimental results showing the promise of PQE.
△ Less
Submitted 1 April, 2017; v1 submitted 17 July, 2014;
originally announced July 2014.
-
Verification of Sequential Circuits by Tests-As-Proofs Paradigm
Authors:
Eugene Goldberg,
Mitesh Jain,
Panagiotis Manolios
Abstract:
We introduce an algorithm for detection of bugs in sequential circuits. This algorithm is incomplete i.e. its failure to find a bug breaking a property P does not imply that P holds. The appeal of incomplete algorithms is that they scale better than their complete counterparts. However, to make an incomplete algorithm effective one needs to guarantee that the probability of finding a bug is reason…
▽ More
We introduce an algorithm for detection of bugs in sequential circuits. This algorithm is incomplete i.e. its failure to find a bug breaking a property P does not imply that P holds. The appeal of incomplete algorithms is that they scale better than their complete counterparts. However, to make an incomplete algorithm effective one needs to guarantee that the probability of finding a bug is reasonably high. We try to achieve such effectiveness by employing the Test-As-Proofs (TAP) paradigm. In our TAP based approach, a counterexample is built as a sequence of states extracted from proofs that some local variations of property P hold. This increases the probability that a) a representative set of states is examined and that b) the considered states are relevant to property P.
We describe an algorithm of test generation based on the TAP paradigm and give preliminary experimental results.
△ Less
Submitted 25 September, 2013; v1 submitted 2 August, 2013;
originally announced August 2013.
-
Checking Satisfiability by Dependency Sequents
Authors:
Eugene Goldberg,
Panagiotis Manolios
Abstract:
We introduce a new algorithm for checking satisfiability based on a calculus of Dependency sequents (D-sequents). Given a CNF formula F(X), a D-sequent is a record stating that under a partial assignment a set of variables of X is redundant in formula \exists{X}[F]. The D-sequent calculus is based on operation join that forms a new D-sequent from two existing D-sequents. The new algorithm solves t…
▽ More
We introduce a new algorithm for checking satisfiability based on a calculus of Dependency sequents (D-sequents). Given a CNF formula F(X), a D-sequent is a record stating that under a partial assignment a set of variables of X is redundant in formula \exists{X}[F]. The D-sequent calculus is based on operation join that forms a new D-sequent from two existing D-sequents. The new algorithm solves the quantified version of SAT. That is, given a satisfiable formula F, it, in general, does not produce an assignment satisfying F.
The new algorithm is called DS-QSAT where DS stands for Dependency Sequent and Q for Quantified.
Importantly, a DPLL-like procedure is only a special case of DS-QSAT where a very restricted kind of D-sequents is used. We argue that this restriction a) adversely affects scalability of SAT-solvers and b) is caused by looking for an explicit satisfying assignment rather than just proving satisfiability. We give experimental results substantiating these claims.
△ Less
Submitted 20 July, 2012;
originally announced July 2012.
-
Removal of Quantifiers by Elimination of Boundary Points
Authors:
Eugene Goldberg,
Panagiotis Manolios
Abstract:
We consider the problem of elimination of existential quantifiers from a Boolean CNF formula. Our approach is based on the following observation. One can get rid of dependency on a set of variables of a quantified CNF formula F by adding resolvent clauses of F eliminating boundary points. This approach is similar to the method of quantifier elimination described in [9]. The difference of the metho…
▽ More
We consider the problem of elimination of existential quantifiers from a Boolean CNF formula. Our approach is based on the following observation. One can get rid of dependency on a set of variables of a quantified CNF formula F by adding resolvent clauses of F eliminating boundary points. This approach is similar to the method of quantifier elimination described in [9]. The difference of the method described in the present paper is twofold: {\bullet} branching is performed only on quantified variables, {\bullet} an explicit search for boundary points is performed by calls to a SAT-solver Although we published the paper [9] before this one, chrono- logically the method of the present report was developed first. Preliminary presentations of this method were made in [10], [11]. We postponed a publication of this method due to preparation of a patent application [8].
△ Less
Submitted 5 June, 2012; v1 submitted 8 April, 2012;
originally announced April 2012.
-
Quantifier Elimination by Dependency Sequents
Authors:
Eugene Goldberg,
Panagiotis Manolios
Abstract:
We consider the problem of existential quantifier elimination for Boolean formulas in Conjunctive Normal Form (CNF). We present a new method for solving this problem called Derivation of Dependency-Sequents (DDS). A Dependency-sequent (D-sequent) is used to record that a set of quantified variables is redundant under a partial assignment. We introduce a resolution-like operation called join that p…
▽ More
We consider the problem of existential quantifier elimination for Boolean formulas in Conjunctive Normal Form (CNF). We present a new method for solving this problem called Derivation of Dependency-Sequents (DDS). A Dependency-sequent (D-sequent) is used to record that a set of quantified variables is redundant under a partial assignment. We introduce a resolution-like operation called join that produces a new D-sequent from two existing D-sequents. We also show that DDS is compositional, i.e. if our input formula is a conjunction of independent formulas, DDS automatically recognizes and exploits this information. We introduce an algorithm based on DDS and present experimental results demonstrating its potential.
△ Less
Submitted 2 June, 2013; v1 submitted 26 January, 2012;
originally announced January 2012.
-
iBOA: The Incremental Bayesian Optimization Algorithm
Authors:
Martin Pelikan,
Kumara Sastry,
David E. Goldberg
Abstract:
This paper proposes the incremental Bayesian optimization algorithm (iBOA), which modifies standard BOA by removing the population of solutions and using incremental updates of the Bayesian network. iBOA is shown to be able to learn and exploit unrestricted Bayesian networks using incremental techniques for updating both the structure as well as the parameters of the probabilistic model. This re…
▽ More
This paper proposes the incremental Bayesian optimization algorithm (iBOA), which modifies standard BOA by removing the population of solutions and using incremental updates of the Bayesian network. iBOA is shown to be able to learn and exploit unrestricted Bayesian networks using incremental techniques for updating both the structure as well as the parameters of the probabilistic model. This represents an important step toward the design of competent incremental estimation of distribution algorithms that can solve difficult nearly decomposable problems scalably and reliably.
△ Less
Submitted 20 January, 2008;
originally announced January 2008.
-
Decomposable Problems, Niching, and Scalability of Multiobjective Estimation of Distribution Algorithms
Authors:
Kumara Sastry,
Martin Pelikan,
David E. Goldberg
Abstract:
The paper analyzes the scalability of multiobjective estimation of distribution algorithms (MOEDAs) on a class of boundedly-difficult additively-separable multiobjective optimization problems. The paper illustrates that even if the linkage is correctly identified, massive multimodality of the search problems can easily overwhelm the nicher and lead to exponential scale-up. Facetwise models are s…
▽ More
The paper analyzes the scalability of multiobjective estimation of distribution algorithms (MOEDAs) on a class of boundedly-difficult additively-separable multiobjective optimization problems. The paper illustrates that even if the linkage is correctly identified, massive multimodality of the search problems can easily overwhelm the nicher and lead to exponential scale-up. Facetwise models are subsequently used to propose a growth rate of the number of differing substructures between the two objectives to avoid the niching method from being overwhelmed and lead to polynomial scalability of MOEDAs.
△ Less
Submitted 12 February, 2005;
originally announced February 2005.
-
Multiobjective hBOA, Clustering, and Scalability
Authors:
Martin Pelikan,
Kumara Sastry,
David E. Goldberg
Abstract:
This paper describes a scalable algorithm for solving multiobjective decomposable problems by combining the hierarchical Bayesian optimization algorithm (hBOA) with the nondominated sorting genetic algorithm (NSGA-II) and clustering in the objective space. It is first argued that for good scalability, clustering or some other form of niching in the objective space is necessary and the size of ea…
▽ More
This paper describes a scalable algorithm for solving multiobjective decomposable problems by combining the hierarchical Bayesian optimization algorithm (hBOA) with the nondominated sorting genetic algorithm (NSGA-II) and clustering in the objective space. It is first argued that for good scalability, clustering or some other form of niching in the objective space is necessary and the size of each niche should be approximately equal. Multiobjective hBOA (mohBOA) is then described that combines hBOA, NSGA-II and clustering in the objective space. The algorithm mohBOA differs from the multiobjective variants of BOA and hBOA proposed in the past by including clustering in the objective space and allocating an approximately equally sized portion of the population to each cluster. The algorithm mohBOA is shown to scale up well on a number of problems on which standard multiobjective evolutionary algorithms perform poorly.
△ Less
Submitted 7 February, 2005;
originally announced February 2005.
-
Sub-structural Niching in Estimation of Distribution Algorithms
Authors:
K. Sastry,
H. A. Abbass,
D. E. Goldberg,
D. D. Johnson
Abstract:
We propose a sub-structural niching method that fully exploits the problem decomposition capability of linkage-learning methods such as the estimation of distribution algorithms and concentrate on maintaining diversity at the sub-structural level. The proposed method consists of three key components: (1) Problem decomposition and sub-structure identification, (2) sub-structure fitness estimation…
▽ More
We propose a sub-structural niching method that fully exploits the problem decomposition capability of linkage-learning methods such as the estimation of distribution algorithms and concentrate on maintaining diversity at the sub-structural level. The proposed method consists of three key components: (1) Problem decomposition and sub-structure identification, (2) sub-structure fitness estimation, and (3) sub-structural niche preservation. The sub-structural niching method is compared to restricted tournament selection (RTS)--a niching method used in hierarchical Bayesian optimization algorithm--with special emphasis on sustained preservation of multiple global solutions of a class of boundedly-difficult, additively-separable multimodal problems. The results show that sub-structural niching successfully maintains multiple global optima over large number of generations and does so with significantly less population than RTS. Additionally, the market share of each of the niche is much closer to the expected level in sub-structural niching when compared to RTS.
△ Less
Submitted 3 February, 2005;
originally announced February 2005.
-
Sub-Structural Niching in Non-Stationary Environments
Authors:
K. Sastry,
H. A. Abbass,
D. E. Goldberg
Abstract:
Niching enables a genetic algorithm (GA) to maintain diversity in a population. It is particularly useful when the problem has multiple optima where the aim is to find all or as many as possible of these optima. When the fitness landscape of a problem changes overtime, the problem is called non--stationary, dynamic or time--variant problem. In these problems, niching can maintain useful solution…
▽ More
Niching enables a genetic algorithm (GA) to maintain diversity in a population. It is particularly useful when the problem has multiple optima where the aim is to find all or as many as possible of these optima. When the fitness landscape of a problem changes overtime, the problem is called non--stationary, dynamic or time--variant problem. In these problems, niching can maintain useful solutions to respond quickly, reliably and accurately to a change in the environment. In this paper, we present a niching method that works on the problem substructures rather than the whole solution, therefore it has less space complexity than previously known niching mechanisms. We show that the method is responding accurately when environmental changes occur.
△ Less
Submitted 3 February, 2005;
originally announced February 2005.
-
Oiling the Wheels of Change: The Role of Adaptive Automatic Problem Decomposition in Non--Stationary Environments
Authors:
H. A. Abbass,
K. Sastry,
D. E. Goldberg
Abstract:
Genetic algorithms (GAs) that solve hard problems quickly, reliably and accurately are called competent GAs. When the fitness landscape of a problem changes overtime, the problem is called non--stationary, dynamic or time--variant problem. This paper investigates the use of competent GAs for optimizing non--stationary optimization problems. More specifically, we use an information theoretic appr…
▽ More
Genetic algorithms (GAs) that solve hard problems quickly, reliably and accurately are called competent GAs. When the fitness landscape of a problem changes overtime, the problem is called non--stationary, dynamic or time--variant problem. This paper investigates the use of competent GAs for optimizing non--stationary optimization problems. More specifically, we use an information theoretic approach based on the minimum description length principle to adaptively identify regularities and substructures that can be exploited to respond quickly to changes in the environment. We also develop a special type of problems with bounded difficulties to test non--stationary optimization problems. The results provide new insights into non-stationary optimization problems and show that a search algorithm which automatically identifies and exploits possible decompositions is more robust and responds quickly to changes than a simple genetic algorithm.
△ Less
Submitted 3 February, 2005;
originally announced February 2005.
-
Population Sizing for Genetic Programming Based Upon Decision Making
Authors:
K. Sastry,
U. -M. O'Reilly,
D. E. Goldberg
Abstract:
This paper derives a population sizing relationship for genetic programming (GP). Following the population-sizing derivation for genetic algorithms in Goldberg, Deb, and Clark (1992), it considers building block decision making as a key facet. The analysis yields a GP-unique relationship because it has to account for bloat and for the fact that GP solutions often use subsolution multiple times.…
▽ More
This paper derives a population sizing relationship for genetic programming (GP). Following the population-sizing derivation for genetic algorithms in Goldberg, Deb, and Clark (1992), it considers building block decision making as a key facet. The analysis yields a GP-unique relationship because it has to account for bloat and for the fact that GP solutions often use subsolution multiple times. The population-sizing relationship depends upon tree size, solution complexity, problem difficulty and building block expression probability. The relationship is used to analyze and empirically investigate population sizing for three model GP problems named ORDER, ON-OFF and LOUD. These problems exhibit bloat to differing extents and differ in whether their solutions require the use of a building block multiple times.
△ Less
Submitted 3 February, 2005;
originally announced February 2005.
-
Efficiency Enhancement of Genetic Algorithms via Building-Block-Wise Fitness Estimation
Authors:
Kumara Sastry,
Martin Pelikan,
David E. Goldberg
Abstract:
This paper studies fitness inheritance as an efficiency enhancement technique for a class of competent genetic algorithms called estimation distribution algorithms. Probabilistic models of important sub-solutions are developed to estimate the fitness of a proportion of individuals in the population, thereby avoiding computationally expensive function evaluations. The effect of fitness inheritanc…
▽ More
This paper studies fitness inheritance as an efficiency enhancement technique for a class of competent genetic algorithms called estimation distribution algorithms. Probabilistic models of important sub-solutions are developed to estimate the fitness of a proportion of individuals in the population, thereby avoiding computationally expensive function evaluations. The effect of fitness inheritance on the convergence time and population sizing are modeled and the speed-up obtained through inheritance is predicted. The results show that a fitness-inheritance mechanism which utilizes information on building-block fitnesses provides significant efficiency enhancement. For additively separable problems, fitness inheritance reduces the number of function evaluations to about half and yields a speed-up of about 1.75--2.25.
△ Less
Submitted 18 May, 2004;
originally announced May 2004.
-
Designing Competent Mutation Operators via Probabilistic Model Building of Neighborhoods
Authors:
Kumara Sastry,
David E. Goldberg
Abstract:
This paper presents a competent selectomutative genetic algorithm (GA), that adapts linkage and solves hard problems quickly, reliably, and accurately. A probabilistic model building process is used to automatically identify key building blocks (BBs) of the search problem. The mutation operator uses the probabilistic model of linkage groups to find the best among competing building blocks. The c…
▽ More
This paper presents a competent selectomutative genetic algorithm (GA), that adapts linkage and solves hard problems quickly, reliably, and accurately. A probabilistic model building process is used to automatically identify key building blocks (BBs) of the search problem. The mutation operator uses the probabilistic model of linkage groups to find the best among competing building blocks. The competent selectomutative GA successfully solves additively separable problems of bounded difficulty, requiring only subquadratic number of function evaluations. The results show that for additively separable problems the probabilistic model building BB-wise mutation scales as O(2^km^{1.5}), and requires O(k^{0.5}logm) less function evaluations than its selectorecombinative counterpart, confirming theoretical results reported elsewhere (Sastry & Goldberg, 2004).
△ Less
Submitted 18 May, 2004;
originally announced May 2004.
-
Let's Get Ready to Rumble: Crossover Versus Mutation Head to Head
Authors:
Kumara Sastry,
David E. Goldberg
Abstract:
This paper analyzes the relative advantages between crossover and mutation on a class of deterministic and stochastic additively separable problems. This study assumes that the recombination and mutation operators have the knowledge of the building blocks (BBs) and effectively exchange or search among competing BBs. Facetwise models of convergence time and population sizing have been used to det…
▽ More
This paper analyzes the relative advantages between crossover and mutation on a class of deterministic and stochastic additively separable problems. This study assumes that the recombination and mutation operators have the knowledge of the building blocks (BBs) and effectively exchange or search among competing BBs. Facetwise models of convergence time and population sizing have been used to determine the scalability of each algorithm. The analysis shows that for additively separable deterministic problems, the BB-wise mutation is more efficient than crossover, while the crossover outperforms the mutation on additively separable problems perturbed with additive Gaussian noise. The results show that the speed-up of using BB-wise mutation on deterministic problems is O(k^{0.5}logm), where k is the BB size, and m is the number of BBs. Likewise, the speed-up of using crossover on stochastic problems with fixed noise variance is O(mk^{0.5}log m).
△ Less
Submitted 18 May, 2004;
originally announced May 2004.
-
Efficiency Enhancement of Probabilistic Model Building Genetic Algorithms
Authors:
Kumara Sastry,
David E. Goldberg,
Martin Pelikan
Abstract:
This paper presents two different efficiency-enhancement techniques for probabilistic model building genetic algorithms. The first technique proposes the use of a mutation operator which performs local search in the sub-solution neighborhood identified through the probabilistic model. The second technique proposes building and using an internal probabilistic model of the fitness along with the p…
▽ More
This paper presents two different efficiency-enhancement techniques for probabilistic model building genetic algorithms. The first technique proposes the use of a mutation operator which performs local search in the sub-solution neighborhood identified through the probabilistic model. The second technique proposes building and using an internal probabilistic model of the fitness along with the probabilistic model of variable interactions. The fitness values of some offspring are estimated using the probabilistic model, thereby avoiding computationally expensive function evaluations. The scalability of the aforementioned techniques are analyzed using facetwise models for convergence time and population sizing. The speed-up obtained by each of the methods is predicted and verified with empirical results. The results show that for additively separable problems the competent mutation operator requires O(k 0.5 logm)--where k is the building-block size, and m is the number of building blocks--less function evaluations than its selectorecombinative counterpart. The results also show that the use of an internal probabilistic fitness model reduces the required number of function evaluations to as low as 1-10% and yields a speed-up of 2--50.
△ Less
Submitted 18 May, 2004;
originally announced May 2004.