-
When Autonomy Breaks: The Hidden Existential Risk of AI
Authors:
Joshua Krook
Abstract:
AI risks are typically framed around physical threats to humanity, a loss of control or an accidental error causing humanity's extinction. However, I argue in line with the gradual disempowerment thesis, that there is an underappreciated risk in the slow and irrevocable decline of human autonomy. As AI starts to outcompete humans in various areas of life, a tipping point will be reached where it n…
▽ More
AI risks are typically framed around physical threats to humanity, a loss of control or an accidental error causing humanity's extinction. However, I argue in line with the gradual disempowerment thesis, that there is an underappreciated risk in the slow and irrevocable decline of human autonomy. As AI starts to outcompete humans in various areas of life, a tipping point will be reached where it no longer makes sense to rely on human decision-making, creativity, social care or even leadership.
What may follow is a process of gradual de-skilling, where we lose skills that we currently take for granted. Traditionally, it is argued that AI will gain human skills over time, and that these skills are innate and immutable in humans. By contrast, I argue that humans may lose such skills as critical thinking, decision-making and even social care in an AGI world. The biggest threat to humanity is therefore not that machines will become more like humans, but that humans will become more like machines.
△ Less
Submitted 28 March, 2025;
originally announced March 2025.
-
Manipulation and the AI Act: Large Language Model Chatbots and the Danger of Mirrors
Authors:
Joshua Krook
Abstract:
Large Language Model chatbots are increasingly taking the form and visage of human beings, adapting human faces, names, voices, personalities, and quirks, including those of celebrities and well-known political figures. Personifying AI chatbots could foreseeably increase their trust with users. However, it could also make them more capable of manipulation, by creating the illusion of a close and i…
▽ More
Large Language Model chatbots are increasingly taking the form and visage of human beings, adapting human faces, names, voices, personalities, and quirks, including those of celebrities and well-known political figures. Personifying AI chatbots could foreseeably increase their trust with users. However, it could also make them more capable of manipulation, by creating the illusion of a close and intimate relationship with an artificial entity. The European Commission has finalized the AI Act, with the EU Parliament making amendments banning manipulative and deceptive AI systems that cause significant harm to users. Although the AI Act covers harms that accumulate over time, it is unlikely to prevent harms associated with prolonged discussions with AI chatbots. Specifically, a chatbot could reinforce a person's negative emotional state over weeks, months, or years through negative feedback loops, prolonged conversations, or harmful recommendations, contributing to a user's deteriorating mental health.
△ Less
Submitted 24 March, 2025;
originally announced March 2025.
-
Assessing confidence in frontier AI safety cases
Authors:
Stephen Barrett,
Philip Fox,
Joshua Krook,
Tuneer Mondal,
Simon Mylius,
Alejandro Tlaie
Abstract:
Powerful new frontier AI technologies are bringing many benefits to society but at the same time bring new risks. AI developers and regulators are therefore seeking ways to assure the safety of such systems, and one promising method under consideration is the use of safety cases. A safety case presents a structured argument in support of a top-level claim about a safety property of the system. Suc…
▽ More
Powerful new frontier AI technologies are bringing many benefits to society but at the same time bring new risks. AI developers and regulators are therefore seeking ways to assure the safety of such systems, and one promising method under consideration is the use of safety cases. A safety case presents a structured argument in support of a top-level claim about a safety property of the system. Such top-level claims are often presented as a binary statement, for example "Deploying the AI system does not pose unacceptable risk". However, in practice, it is often not possible to make such statements unequivocally. This raises the question of what level of confidence should be associated with a top-level claim. We adopt the Assurance 2.0 safety assurance methodology, and we ground our work by specific application of this methodology to a frontier AI inability argument that addresses the harm of cyber misuse. We find that numerical quantification of confidence is challenging, though the processes associated with generating such estimates can lead to improvements in the safety case. We introduce a method for better enabling reproducibility and transparency in probabilistic assessment of confidence in argument leaf nodes through a purely LLM-implemented Delphi method. We propose a method by which AI developers can prioritise, and thereby make their investigation of argument defeaters more efficient. Proposals are also made on how best to communicate confidence information to executive decision-makers.
△ Less
Submitted 9 February, 2025;
originally announced February 2025.
-
Geometric shape matching for recovering protein conformations from single-particle Cryo-EM data
Authors:
Erik Jansson,
Jonathan Krook,
Klas Modin,
Ozan Öktem
Abstract:
We address recovery of the three-dimensional backbone structure of single polypeptide proteins from single-particle cryo-electron microscopy (Cryo-SPA) data. Cryo-SPA produces noisy tomographic projections of electrostatic potentials of macromolecules. From these projections, we use methods from shape analysis to recover the three-dimensional backbone structure. Thus, we view the reconstruction pr…
▽ More
We address recovery of the three-dimensional backbone structure of single polypeptide proteins from single-particle cryo-electron microscopy (Cryo-SPA) data. Cryo-SPA produces noisy tomographic projections of electrostatic potentials of macromolecules. From these projections, we use methods from shape analysis to recover the three-dimensional backbone structure. Thus, we view the reconstruction problem as an indirect matching problem, where a point cloud representation of the protein backbone is deformed to match 2D tomography data. The deformations are obtained via the action of a matrix Lie group. By selecting a deformation energy, the optimality conditions are obtained, which lead to computational algorithms for optimal deformations. We showcase our approach on synthetic data, for which we recover the three-dimensional structure of the backbone.
△ Less
Submitted 1 October, 2024;
originally announced October 2024.
-
Objection Overruled! Lay People can Distinguish Large Language Models from Lawyers, but still Favour Advice from an LLM
Authors:
Eike Schneiders,
Tina Seabrooke,
Joshua Krook,
Richard Hyde,
Natalie Leesakul,
Jeremie Clos,
Joel Fischer
Abstract:
Large Language Models (LLMs) are seemingly infiltrating every domain, and the legal context is no exception. In this paper, we present the results of three experiments (total N = 288) that investigated lay people's willingness to act upon, and their ability to discriminate between, LLM- and lawyer-generated legal advice. In Experiment 1, participants judged their willingness to act on legal advice…
▽ More
Large Language Models (LLMs) are seemingly infiltrating every domain, and the legal context is no exception. In this paper, we present the results of three experiments (total N = 288) that investigated lay people's willingness to act upon, and their ability to discriminate between, LLM- and lawyer-generated legal advice. In Experiment 1, participants judged their willingness to act on legal advice when the source of the advice was either known or unknown. When the advice source was unknown, participants indicated that they were significantly more willing to act on the LLM-generated advice. The result of the source unknown condition was replicated in Experiment 2. Intriguingly, despite participants indicating higher willingness to act on LLM-generated advice in Experiments 1 and 2, participants discriminated between the LLM- and lawyer-generated texts significantly above chance-level in Experiment 3. Lastly, we discuss potential explanations and risks of our findings, limitations and future work.
△ Less
Submitted 21 March, 2025; v1 submitted 12 September, 2024;
originally announced September 2024.
-
A Formal-Methods Approach to Provide Evidence in Automated-Driving Safety Cases
Authors:
Jonas Krook,
Yuvaraj Selvaraj,
Wolfgang Ahrendt,
Martin Fabian
Abstract:
The safety of automated driving systems must be justified by convincing arguments and supported by compelling evidence to persuade certification agencies, regulatory entities, and the general public to allow the systems on public roads. This persuasion is typically facilitated by compiling the arguments and the compelling evidence into a safety case. Reviews and testing, two common approaches to e…
▽ More
The safety of automated driving systems must be justified by convincing arguments and supported by compelling evidence to persuade certification agencies, regulatory entities, and the general public to allow the systems on public roads. This persuasion is typically facilitated by compiling the arguments and the compelling evidence into a safety case. Reviews and testing, two common approaches to ensure correctness of automotive systems cannot explore the typically infinite set of possible behaviours. In contrast, formal methods are exhaustive methods that can provide mathematical proofs of correctness of models, and they can be used to prove that formalizations of functional safety requirements are fulfilled by formal models of system components. This paper shows how formal methods can provide evidence for the correct break-down of the functional safety requirements onto the components that are part of feedback loops, and how this evidence fits into the argument of the safety case. If a proof is obtained, the formal models are used as requirements on the components. This structure of the safety argumentation can be used to alleviate the need for reviews and tests to ensure that the break-down is correct, thereby saving effort both in data collection and verification time.
△ Less
Submitted 13 October, 2022;
originally announced October 2022.
-
On How to Not Prove Faulty Controllers Safe in Differential Dynamic Logic
Authors:
Yuvaraj Selvaraj,
Jonas Krook,
Wolfgang Ahrendt,
Martin Fabian
Abstract:
Cyber-physical systems are often safety-critical and their correctness is crucial, as in the case of automated driving. Using formal mathematical methods is one way to guarantee correctness. Though these methods have shown their usefulness, care must be taken as modeling errors might result in proving a faulty controller safe, which is potentially catastrophic in practice. This paper deals with tw…
▽ More
Cyber-physical systems are often safety-critical and their correctness is crucial, as in the case of automated driving. Using formal mathematical methods is one way to guarantee correctness. Though these methods have shown their usefulness, care must be taken as modeling errors might result in proving a faulty controller safe, which is potentially catastrophic in practice. This paper deals with two such modeling errors in differential dynamic logic. Differential dynamic logic is a formal specification and verification language for hybrid systems, which are mathematical models of cyber-physical systems. The main contribution is to prove conditions that when fulfilled, these two modeling errors cannot cause a faulty controller to be proven safe. The problems are illustrated with a real world example of a safety controller for automated driving, and it is shown that the formulated conditions have the intended effect both for a faulty and a correct controller. It is also shown how the formulated conditions aid in finding a loop invariant candidate to prove properties of hybrid systems with feedback loops. The results are proven using the interactive theorem prover KeYmaera X.
△ Less
Submitted 12 July, 2022;
originally announced July 2022.
-
Robust Stutter Bisimulation for Abstraction and Controller Synthesis with Disturbance: Proofs
Authors:
Jonas Krook,
Robi Malik,
Sahar Mohajerani,
Martin Fabian
Abstract:
This paper proposes a method to synthesise controllers for cyber-physical systems such that the controlled systems satisfy specifications given as linear temporal logic formulas. The focus is on systems with disturbance, where future states cannot be predicted exactly due to uncertainty in the environment. The approach used to solve this problem is to first construct a finite-state abstraction of…
▽ More
This paper proposes a method to synthesise controllers for cyber-physical systems such that the controlled systems satisfy specifications given as linear temporal logic formulas. The focus is on systems with disturbance, where future states cannot be predicted exactly due to uncertainty in the environment. The approach used to solve this problem is to first construct a finite-state abstraction of the original system and then synthesise a controller for the abstract system. For this approach, the robust stutter bisimulation relation is introduced, which preserves the existence of controllers for any given linear temporal logic formula. States are related by the robust stutter bisimulation relation if the same target sets can be guaranteed to be reached or avoided under control of some controllers, thereby ensuring that disturbances have similar effect on paths that start in related states. This paper presents an algorithm to construct the corresponding robust stutter bisimulation quotient to solve the abstraction problem, and it is shown, by explicit construction, that there exists a controller enforcing a linear temporal logic formula for the original system if and only if a corresponding controller exists for the quotient system. Lastly, the result of the algorithm and the controller construction are demonstrated by application to an example of robot navigation.
△ Less
Submitted 27 May, 2022;
originally announced May 2022.