Search | arXiv e-print repository

Analyzing the Effectiveness of Large Language Models on Text-to-SQL Synthesis

Authors: Richard Roberson, Gowtham Kaki, Ashutosh Trivedi

Abstract: This study investigates various approaches to using Large Language Models (LLMs) for Text-to-SQL program synthesis, focusing on the outcomes and insights derived. Employing the popular Text-to-SQL dataset, spider, the goal was to input a natural language question along with the database schema and output the correct SQL SELECT query. The initial approach was to fine-tune a local and open-source mo… ▽ More This study investigates various approaches to using Large Language Models (LLMs) for Text-to-SQL program synthesis, focusing on the outcomes and insights derived. Employing the popular Text-to-SQL dataset, spider, the goal was to input a natural language question along with the database schema and output the correct SQL SELECT query. The initial approach was to fine-tune a local and open-source model to generate the SELECT query. After QLoRa fine-tuning WizardLM's WizardCoder-15B model on the spider dataset, the execution accuracy for generated queries rose to a high of 61%. With the second approach, using the fine-tuned gpt-3.5-turbo-16k (Few-shot) + gpt-4-turbo (Zero-shot error correction), the execution accuracy reached a high of 82.1%. Of all the incorrect queries, most can be categorized into a seven different categories of what went wrong: selecting the wrong columns or wrong order of columns, grouping by the wrong column, predicting the wrong values in conditionals, using different aggregates than the ground truth, extra or too few JOIN clauses, inconsistencies in the Spider dataset, and lastly completely incorrect query structure. Most if not all of the queries fall into these categories and it is insightful to understanding where the faults still lie with LLM program synthesis and where they can be improved. △ Less

Submitted 22 January, 2024; originally announced January 2024.

arXiv:2309.04464 [pdf, ps, other]

doi 10.1145/3622865

Historia: Refuting Callback Reachability with Message-History Logics (Extended Version)

Authors: Shawn Meier, Sergio Mover, Gowtham Kaki, Bor-Yuh Evan Chang

Abstract: This paper determines if a callback can be called by an event-driven framework in an unexpected state.Event-driven programming frameworks are pervasive for creating user-interactive apps on just about every modern platform.Control flow between callbacks is determined by the framework and largely opaque to the programmer.This opacity of the callback control flow not only causes difficulty for the p… ▽ More This paper determines if a callback can be called by an event-driven framework in an unexpected state.Event-driven programming frameworks are pervasive for creating user-interactive apps on just about every modern platform.Control flow between callbacks is determined by the framework and largely opaque to the programmer.This opacity of the callback control flow not only causes difficulty for the programmer but is also difficult for those developing static analysis.Previous static analysis techniques address this opacity either by assuming an arbitrary framework implementation or attempting to eagerly specify all possible callback control flow, but this is either too coarse or too burdensome and tricky to get right.Instead, we present a middle way where the callback control flow can be gradually refined in a targeted manner to prove assertions of interest.The key insight to get this middle way is by reasoning about the history of method invocations at the boundary between app and framework code - enabling a decoupling of the specification of callback control flow from the analysis of app code.We call the sequence of such boundary-method invocations message histories and develop message-history logics to do this reasoning.In particular, we define the notion of an application-only transition system with boundary transitions, a message-history program logic for programs with such transitions, and a temporal specification logic for capturing callback control flow in a targeted and compositional manner.Then to utilize the logics in a goal-directed verifier, we define a way to combine after-the-fact an assertion about message histories with a specification of callback control flow.We implemented a prototype message history-based verifier called Historia that enables proving the absence of multi-callback bug patterns in real-world open-source Android apps. △ Less

Submitted 11 September, 2023; v1 submitted 8 September, 2023; originally announced September 2023.

Comments: 40 pages, 8 figures, Accepted to OOPSLA 2023

MSC Class: 68Q60 ACM Class: D.3.3

arXiv:1710.09844 [pdf, other]

doi 10.1145/3158115

Alone Together: Compositional Reasoning and Inference for Weak Isolation

Authors: Gowtham Kaki, Kartik Nagar, Mahsa Nazafzadeh, Suresh Jagannathan

Abstract: Serializability is a well-understood correctness criterion that simplifies reasoning about the behavior of concurrent transactions by ensuring they are isolated from each other while they execute. However, enforcing serializable isolation comes at a steep cost in performance and hence database systems in practice support, and often encourage, developers to implement transactions using weaker alter… ▽ More Serializability is a well-understood correctness criterion that simplifies reasoning about the behavior of concurrent transactions by ensuring they are isolated from each other while they execute. However, enforcing serializable isolation comes at a steep cost in performance and hence database systems in practice support, and often encourage, developers to implement transactions using weaker alternatives. Unfortunately, the semantics of weak isolation is poorly understood, and usually explained only informally in terms of low-level implementation artifacts. Consequently, verifying high-level correctness properties in such environments remains a challenging problem. To address this issue, we present a novel program logic that enables compositional reasoning about the behavior of concurrently executing weakly-isolated transactions. Recognizing that the proof burden necessary to use this logic may dissuade application developers, we also describe an inference procedure based on this foundation that ascertains the weakest isolation level that still guarantees the safety of high-level consistency invariants associated with such transactions. The key to effective inference is the observation that weakly-isolated transactions can be viewed as functional (monadic) computations over an abstract database state, allowing us to treat their operations as state transformers over the database. This interpretation enables automated verification using off-the-shelf SMT solvers. Case studies and experiments of real-world applications (written in an embedded DSL in OCaml) demonstrate the utility of our approach. △ Less

Submitted 9 November, 2017; v1 submitted 26 October, 2017; originally announced October 2017.

Comments: 46 pages, 12 figures

Showing 1–3 of 3 results for author: Kaki, G