Search | arXiv e-print repository

Honey, I shrunk the hypothesis space (through logical preprocessing)

Authors: Andrew Cropper, Filipe Gouveia, David M. Cerna

Abstract: Inductive logic programming (ILP) is a form of logical machine learning. The goal is to search a hypothesis space for a hypothesis that generalises training examples and background knowledge. We introduce an approach that 'shrinks' the hypothesis space before an ILP system searches it. Our approach uses background knowledge to find rules that cannot be in an optimal hypothesis regardless of the training examples. For instance, our approach discovers relationships such as "even numbers cannot be odd" and "prime numbers greater than 2 are odd". It then removes violating rules from the hypothesis space. We implement our approach using answer set programming and use it to shrink the hypothesis space of a constraint-based ILP system. Our experiments on multiple domains, including visual reasoning and game playing, show that our approach can substantially reduce learning times whilst maintaining predictive accuracies. For instance, given just 10 seconds of preprocessing time, our approach can reduce learning times from over 10 hours to only 2 seconds.

Submitted 7 June, 2025; originally announced June 2025.

Comments: Submitted to JAIR
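
The preprocessing idea in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' implementation, which uses answer set programming. The background knowledge format (a dict from unary predicate names to the constants they hold for), the helper names `exclusive_pairs` and `shrink`, and the representation of a rule body as a set of predicate names are all assumptions made for illustration. The sketch only finds pairs of predicates that never hold together in the given facts, then drops rule bodies containing such a pair.

```python
from itertools import combinations

def exclusive_pairs(bk):
    """Return pairs of unary predicates that never hold for the same constant in bk.

    bk is a hypothetical format: predicate name -> set of constants.
    """
    return [
        (p, q)
        for p, q in combinations(sorted(bk), 2)
        if not (bk[p] & bk[q])
    ]

def shrink(rule_bodies, pairs):
    """Keep only rule bodies that do not contain a mutually exclusive pair."""
    return [
        body for body in rule_bodies
        if not any(p in body and q in body for p, q in pairs)
    ]

# Toy background knowledge over the numbers 0..9.
bk = {
    "even": {0, 2, 4, 6, 8},
    "odd": {1, 3, 5, 7, 9},
    "prime": {2, 3, 5, 7},
}
pairs = exclusive_pairs(bk)
bodies = [{"even", "odd"}, {"even", "prime"}, {"odd"}]
kept = shrink(bodies, pairs)
```

Here the body containing both `even` and `odd` is removed before any search takes place, because no constant in the background knowledge satisfies both predicates. The check is only as strong as the facts supplied.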

arXiv:2503.07554 [pdf, other]

An Empirical Comparison of Cost Functions in Inductive Logic Programming

Authors: Céline Hocquette, Andrew Cropper

Abstract: Recent inductive logic programming (ILP) approaches learn optimal hypotheses. An optimal hypothesis minimises a given cost function on the training data. There are many cost functions, such as minimising training error, textual complexity, or the description length of hypotheses. However, selecting an appropriate cost function remains a key question. To address this gap, we extend a constraint-based ILP system to learn optimal hypotheses for seven standard cost functions. We then empirically compare the generalisation error of optimal hypotheses induced under these standard cost functions. Our results on over 20 domains and 1000 tasks, including game playing, program synthesis, and image reasoning, show that, while no cost function consistently outperforms the others, minimising training error or description length has the best overall performance. Notably, our results indicate that minimising the size of hypotheses does not always reduce generalisation error.

Submitted 10 March, 2025; originally announced March 2025.
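
To make the notion of an optimal hypothesis under a cost function concrete, here is a small Python sketch. The abstract does not define the seven cost functions, so the three costs below (training error as false positives plus false negatives, size as a literal count, and description length as size plus training error) are assumptions chosen for illustration, as are the `Scored` record and the candidate values.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Scored:
    """A candidate hypothesis summarised by its size and training mistakes (hypothetical record)."""
    name: str
    size: int
    false_pos: int
    false_neg: int

def training_error(h):
    # Assumed definition: number of misclassified training examples.
    return h.false_pos + h.false_neg

def size_cost(h):
    # Assumed definition: number of literals in the hypothesis.
    return h.size

def description_length(h):
    # Assumed definition: hypothesis size plus the number of training mistakes.
    return h.size + h.false_pos + h.false_neg

def optimal(hypotheses, cost):
    """Return a hypothesis that minimises the given cost function."""
    return min(hypotheses, key=cost)

candidates = [
    Scored("tiny", size=1, false_pos=3, false_neg=3),
    Scored("small", size=2, false_pos=0, false_neg=2),
    Scored("medium", size=4, false_pos=1, false_neg=0),
    Scored("big", size=10, false_pos=0, false_neg=0),
]
```

On these made-up candidates, each cost function selects a different hypothesis, which is why the choice of cost function matters.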

arXiv:2502.01232 [pdf, other]

Efficient rule induction by ignoring pointless rules

Authors: Andrew Cropper, David M. Cerna

Abstract: The goal of inductive logic programming (ILP) is to find a set of logical rules that generalises training examples and background knowledge. We introduce an ILP approach that identifies pointless rules. A rule is pointless if it contains a redundant literal or cannot discriminate against negative examples. We show that ignoring pointless rules allows an ILP system to soundly prune the hypothesis space. Our experiments on multiple domains, including visual reasoning and game playing, show that our approach can reduce learning times by 99% whilst maintaining predictive accuracies.

Submitted 3 February, 2025; originally announced February 2025.

Comments: Under review for a conference

arXiv:2408.12212 [pdf]

Relational decomposition for program synthesis

Authors: Céline Hocquette, Andrew Cropper

Abstract: We introduce a relational approach to program synthesis. The key idea is to decompose synthesis tasks into simpler relational synthesis subtasks. Specifically, our representation decomposes a training input-output example into sets of input and output facts respectively. We then learn relations between the input and output facts. We demonstrate our approach using an off-the-shelf inductive logic programming (ILP) system on four challenging synthesis datasets. Our results show that (i) our representation can outperform a standard one, and (ii) an off-the-shelf ILP system with our representation can outperform domain-specific approaches.

Submitted 10 June, 2025; v1 submitted 22 August, 2024; originally announced August 2024.
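
The decomposition described in the abstract above can be sketched in Python as follows. The fact format (`("in", example, position, value)` and `("out", example, position, value)`) and both function names are assumptions for illustration; the abstract does not specify the paper's actual representation. The second function checks one hand-picked candidate relation rather than learning it.

```python
def decompose(example_id, inputs, outputs):
    """Split one input-output example into a set of input facts and a set of output facts."""
    in_facts = {("in", example_id, i, v) for i, v in enumerate(inputs)}
    out_facts = {("out", example_id, i, v) for i, v in enumerate(outputs)}
    return in_facts, out_facts

def holds_successor(in_facts, out_facts):
    """Check a candidate relation: each output value equals the input value at the same position plus one."""
    in_at = {(ex, pos): val for _, ex, pos, val in in_facts}
    return all(
        (ex, pos) in in_at and val == in_at[(ex, pos)] + 1
        for _, ex, pos, val in out_facts
    )

in_facts, out_facts = decompose("e1", [1, 2, 3], [2, 3, 4])
```

Once an example is a set of facts, a relation between input and output facts can be tested fact by fact instead of on the whole input-output pair.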

arXiv:2408.11530 [pdf, ps, other]

doi 10.1609/aaai.v39i14.33650

Scalable Knowledge Refactoring using Constrained Optimisation

Authors: Minghao Liu, David M. Cerna, Filipe Gouveia, Andrew Cropper

Abstract: Knowledge refactoring compresses a logic program by introducing new rules. Current approaches struggle to scale to large programs. To overcome this limitation, we introduce a constrained optimisation refactoring approach. Our first key idea is to encode the problem with decision variables based on literals rather than rules. Our second key idea is to focus on linear invented rules. Our empirical results on multiple domains show that our approach can refactor programs quicker and with more compression than the previous state-of-the-art approach, sometimes by 60%.

Submitted 21 August, 2024; originally announced August 2024.

arXiv:2404.19397 [pdf, other]

Can humans teach machines to code?

Authors: Céline Hocquette, Johannes Langer, Andrew Cropper, Ute Schmid

Abstract: The goal of inductive program synthesis is for a machine to automatically generate a program from user-supplied examples. A key underlying assumption is that humans can provide sufficient examples to teach a concept to a machine. To evaluate the validity of this assumption, we conduct a study where human participants provide examples for six programming concepts, such as finding the maximum element of a list. We evaluate the generalisation performance of five program synthesis systems trained on input-output examples (i) from non-expert humans, (ii) from a human expert, and (iii) randomly sampled. Our results suggest that non-experts typically do not provide sufficient examples for a program synthesis system to learn an accurate program.

Submitted 17 February, 2025; v1 submitted 30 April, 2024; originally announced April 2024.

arXiv:2401.16383 [pdf, ps, other]

Learning logic programs by finding minimal unsatisfiable subprograms

Authors: Andrew Cropper, Céline Hocquette

Abstract: The goal of inductive logic programming (ILP) is to search for a logic program that generalises training examples and background knowledge. We introduce an ILP approach that identifies minimal unsatisfiable subprograms (MUSPs). We show that finding MUSPs allows us to efficiently and soundly prune the search space. Our experiments on multiple domains, including program synthesis and game playing, show that our approach can reduce learning times by 99%.

Submitted 29 January, 2024; originally announced January 2024.

arXiv:2401.16215 [pdf, other]

Learning big logical rules by joining small rules

Authors: Céline Hocquette, Andreas Niskanen, Rolf Morel, Matti Järvisalo, Andrew Cropper

Abstract: A major challenge in inductive logic programming is learning big rules. To address this challenge, we introduce an approach where we join small rules to learn big rules. We implement our approach in a constraint-driven system and use constraint solvers to efficiently join rules. Our experiments on many domains, including game playing and drug design, show that our approach can (i) learn rules with more than 100 literals, and (ii) drastically outperform existing approaches in terms of predictive accuracies.

Submitted 29 January, 2024; originally announced January 2024.

arXiv:2308.09393 [pdf, ps, other]

Learning MDL logic programs from noisy data

Authors: Céline Hocquette, Andreas Niskanen, Matti Järvisalo, Andrew Cropper

Abstract: Many inductive logic programming approaches struggle to learn programs from noisy data. To overcome this limitation, we introduce an approach that learns minimal description length programs from noisy data, including recursive programs. Our experiments on several domains, including drug design, game playing, and program synthesis, show that our approach can outperform existing approaches in terms of predictive accuracies and scale to moderate amounts of noise.

Submitted 18 August, 2023; originally announced August 2023.

Comments: arXiv admin note: text overlap with arXiv:2206.01614

arXiv:2308.08334 [pdf, ps, other]

Learning logic programs by discovering higher-order abstractions

Authors: Céline Hocquette, Sebastijan Dumančić, Andrew Cropper

Abstract: We introduce the higher-order refactoring problem, where the goal is to compress a logic program by discovering higher-order abstractions, such as map, filter, and fold. We implement our approach in Stevie, which formulates the refactoring problem as a constraint optimisation problem. Our experiments on multiple domains, including program synthesis and visual reasoning, show that refactoring can improve the learning performance of an inductive logic programming system, specifically improving predictive accuracies by 27% and reducing learning times by 47%. We also show that Stevie can discover abstractions that transfer to multiple domains.

Submitted 29 January, 2024; v1 submitted 16 August, 2023; originally announced August 2023.

arXiv:2301.07629 [pdf, other]

doi 10.1609/aaai.v38i9.28915

Generalisation Through Negation and Predicate Invention

Authors: David M. Cerna, Andrew Cropper

Abstract: The ability to generalise from a small number of examples is a fundamental challenge in machine learning. To tackle this challenge, we introduce an inductive logic programming (ILP) approach that combines negation and predicate invention. Combining these two features allows an ILP system to generalise better by learning rules with universally quantified body-only variables. We implement our idea in NOPI, which can learn normal logic programs with predicate invention, including Datalog programs with stratified negation. Our experimental results on multiple domains show that our approach can improve predictive accuracies and learning times.

Submitted 27 December, 2023; v1 submitted 18 January, 2023; originally announced January 2023.

Comments: Accepted at AAAI-24

arXiv:2210.00764 [pdf, other]

Relational program synthesis with numerical reasoning

Authors: Céline Hocquette, Andrew Cropper

Abstract: Program synthesis approaches struggle to learn programs with numerical values. An especially difficult problem is learning continuous values over multiple examples, such as intervals. To overcome this limitation, we introduce an inductive logic programming approach which combines relational learning with numerical reasoning. Our approach, which we call NUMSYNTH, uses satisfiability modulo theories solvers to efficiently learn programs with numerical values. Our approach can identify numerical values in linear arithmetic fragments, such as real difference logic, and from infinite domains, such as real numbers or integers. Our experiments on four diverse domains, including game playing and program synthesis, show that our approach can (i) learn programs with numerical values from linear arithmetical reasoning, and (ii) outperform existing approaches in terms of predictive accuracies and learning times.

Submitted 4 October, 2022; v1 submitted 3 October, 2022; originally announced October 2022.

arXiv:2208.11656 [pdf, other]

Constraint-driven multi-task learning

Authors: Bogdan Cretu, Andrew Cropper

Abstract: Inductive logic programming is a form of machine learning based on mathematical logic that generates logic programs from given examples and background knowledge. In this project, we extend the Popper ILP system to make use of multi-task learning. We implement the state-of-the-art approach and several new strategies to improve search performance. Furthermore, we introduce constraint preservation, a technique that improves overall performance for all approaches. Constraint preservation allows the system to transfer knowledge between updates on the background knowledge set. Consequently, we reduce the amount of repeated work performed by the system. Additionally, constraint preservation allows us to transition from the current state-of-the-art iterative deepening search approach to a more efficient breadth first search approach. Finally, we experiment with curriculum learning techniques and show their potential benefit to the field.

Submitted 24 August, 2022; originally announced August 2022.

Comments: 4th year undergraduate project at the University of Oxford

arXiv:2208.03238 [pdf, ps, other]

Learning programs with magic values

Authors: Céline Hocquette, Andrew Cropper

Abstract: A magic value in a program is a constant symbol that is essential for the execution of the program but has no clear explanation for its choice. Learning programs with magic values is difficult for existing program synthesis approaches. To overcome this limitation, we introduce an inductive logic programming approach to efficiently learn programs with magic values. Our experiments on diverse domains, including program synthesis, drug design, and game playing, show that our approach can (i) outperform existing approaches in terms of predictive accuracies and learning times, (ii) learn magic values from infinite domains, such as the value of pi, and (iii) scale to domains with millions of constant symbols.

Submitted 1 October, 2022; v1 submitted 5 August, 2022; originally announced August 2022.

arXiv:2206.01614 [pdf, ps, other]

Learning logic programs by combining programs

Authors: Andrew Cropper, Céline Hocquette

Abstract: The goal of inductive logic programming is to induce a logic program (a set of logical rules) that generalises training examples. Inducing programs with many rules and literals is a major challenge. To tackle this challenge, we introduce an approach where we learn small non-separable programs and combine them. We implement our approach in a constraint-driven ILP system. Our approach can learn optimal and recursive programs and perform predicate invention. Our experiments on multiple domains, including game playing and program synthesis, show that our approach can drastically outperform existing approaches in terms of predictive accuracies and learning times, sometimes reducing learning times from over an hour to a few seconds.

Submitted 17 August, 2023; v1 submitted 1 June, 2022; originally announced June 2022.

arXiv:2202.09806 [pdf, ps, other]

Learning logic programs by discovering where not to search

Authors: Andrew Cropper, Céline Hocquette

Abstract: The goal of inductive logic programming (ILP) is to search for a hypothesis that generalises training examples and background knowledge (BK). To improve performance, we introduce an approach that, before searching for a hypothesis, first discovers where not to search. We use given BK to discover constraints on hypotheses, such as that a number cannot be both even and odd. We use the constraints to bootstrap a constraint-driven ILP system. Our experiments on multiple domains (including program synthesis and game playing) show that our approach can (i) substantially reduce learning times by up to 97%, and (ii) scale to domains with millions of facts.

Submitted 5 December, 2022; v1 submitted 20 February, 2022; originally announced February 2022.

Comments: Preprint for AAAI23

arXiv:2109.07818 [pdf, ps, other]

Learning logic programs through divide, constrain, and conquer

Authors: Andrew Cropper

Abstract: We introduce an inductive logic programming approach that combines classical divide-and-conquer search with modern constraint-driven search. Our anytime approach can learn optimal, recursive, and large programs and supports predicate invention. Our experiments on three domains (classification, inductive general game playing, and program synthesis) show that our approach can increase predictive accuracies and reduce learning times.

Submitted 7 December, 2021; v1 submitted 16 September, 2021; originally announced September 2021.

Comments: Accepted for AAAI2022

arXiv:2109.07132 [pdf, ps, other]

Parallel Constraint-Driven Inductive Logic Programming

Authors: Andrew Cropper, Oghenejokpeme Orhobor, Cristian Dinu, Rolf Morel

Abstract: Multi-core machines are ubiquitous. However, most inductive logic programming (ILP) approaches use only a single core, which severely limits their scalability. To address this limitation, we introduce parallel techniques based on constraint-driven ILP where the goal is to accumulate constraints to restrict the hypothesis space. Our experiments on two domains (program synthesis and inductive general game playing) show that (i) parallelisation can substantially reduce learning times, and (ii) worker communication (i.e. sharing constraints) is important for good performance.

Submitted 15 September, 2021; originally announced September 2021.

Comments: Paper under review

arXiv:2104.14426 [pdf, ps, other]

Predicate Invention by Learning From Failures

Authors: Andrew Cropper, Rolf Morel

Abstract: Discovering novel high-level concepts is one of the most important steps needed for human-level AI. In inductive logic programming (ILP), discovering novel high-level concepts is known as predicate invention (PI). Although seen as crucial since the founding of ILP, PI is notoriously difficult and most ILP systems do not support it. In this paper, we introduce POPPI, an ILP system that formulates the PI problem as an answer set programming problem. Our experiments show that (i) PI can drastically improve learning performance when useful, (ii) PI is not too costly when unnecessary, and (iii) POPPI can substantially outperform existing ILP systems.

Submitted 29 April, 2021; originally announced April 2021.

Comments: Rejected manuscript for IJCAI21

arXiv:2102.12551 [pdf, other]

Learning logic programs by explaining their failures

Authors: Rolf Morel, Andrew Cropper

Abstract: Scientists form hypotheses and experimentally test them. If a hypothesis fails (is refuted), scientists try to explain the failure to eliminate other hypotheses. The more precise the failure analysis the more hypotheses can be eliminated. Thus inspired, we introduce failure explanation techniques for inductive logic programming. Given a hypothesis represented as a logic program, we test it on examples. If a hypothesis fails, we explain the failure in terms of failing sub-programs. In case a positive example fails, we identify failing sub-programs at the granularity of literals. We introduce a failure explanation algorithm based on analysing branches of SLD-trees. We integrate a meta-interpreter based implementation of this algorithm with the test-stage of the Popper ILP system. We show that fine-grained failure analysis allows for learning fine-grained constraints on the hypothesis space. Our experimental results show that explaining failures can drastically reduce hypothesis space exploration and learning times.

Submitted 24 May, 2023; v1 submitted 18 February, 2021; originally announced February 2021.

Comments: 26 pages; under review at the Machine Learning journal since February 2022

arXiv:2102.10556 [pdf, other]

Inductive logic programming at 30

Authors: Andrew Cropper, Sebastijan Dumančić, Richard Evans, Stephen H. Muggleton

Abstract: Inductive logic programming (ILP) is a form of logic-based machine learning. The goal is to induce a hypothesis (a logic program) that generalises given training examples. As ILP turns 30, we review the last decade of research. We focus on (i) new meta-level search methods, (ii) techniques for learning recursive programs, (iii) new approaches for predicate invention, and (iv) the use of different technologies. We conclude by discussing current limitations of ILP and directions for future research.

Submitted 22 September, 2021; v1 submitted 21 February, 2021; originally announced February 2021.

Comments: Extension of IJCAI20 survey paper. Accepted for the MLJ. arXiv admin note: substantial text overlap with arXiv:2002.11002, arXiv:2008.07912

arXiv:2008.07912 [pdf, ps, other]

Inductive logic programming at 30: a new introduction

Authors: Andrew Cropper, Sebastijan Dumančić

Abstract: Inductive logic programming (ILP) is a form of machine learning. The goal of ILP is to induce a hypothesis (a set of logical rules) that generalises training examples. As ILP turns 30, we provide a new introduction to the field. We introduce the necessary logical notation and the main learning settings; describe the building blocks of an ILP system; compare several systems on several dimensions; describe four systems (Aleph, TILDE, ASPAL, and Metagol); highlight key application areas; and, finally, summarise current limitations and directions for future research.

Submitted 22 March, 2022; v1 submitted 18 August, 2020; originally announced August 2020.

Comments: Preprint of a paper accepted for JAIR

arXiv:2005.02259 [pdf, ps, other]

Learning programs by learning from failures

Authors: Andrew Cropper, Rolf Morel

Abstract: We describe an inductive logic programming (ILP) approach called learning from failures. In this approach, an ILP system (the learner) decomposes the learning problem into three separate stages: generate, test, and constrain. In the generate stage, the learner generates a hypothesis (a logic program) that satisfies a set of hypothesis constraints (constraints on the syntactic form of hypotheses). In the test stage, the learner tests the hypothesis against training examples. A hypothesis fails when it does not entail all the positive examples or entails a negative example. If a hypothesis fails, then, in the constrain stage, the learner learns constraints from the failed hypothesis to prune the hypothesis space, i.e. to constrain subsequent hypothesis generation. For instance, if a hypothesis is too general (entails a negative example), the constraints prune generalisations of the hypothesis. If a hypothesis is too specific (does not entail all the positive examples), the constraints prune specialisations of the hypothesis. This loop repeats until either (i) the learner finds a hypothesis that entails all the positive and none of the negative examples, or (ii) there are no more hypotheses to test. We introduce Popper, an ILP system that implements this approach by combining answer set programming and Prolog. Popper supports infinite problem domains, reasoning about lists and numbers, learning textually minimal programs, and learning recursive programs. Our experimental results on three domains (toy game problems, robot strategies, and list transformations) show that (i) constraints drastically improve learning performance, and (ii) Popper can outperform existing ILP systems, both in terms of predictive accuracies and learning times.

Submitted 25 November, 2020; v1 submitted 5 May, 2020; originally announced May 2020.

Comments: Accepted for the Machine Learning journal

arXiv:2004.09931 [pdf, other]

Knowledge Refactoring for Inductive Program Synthesis

Authors: Sebastijan Dumancic, Tias Guns, Andrew Cropper

Abstract: Humans constantly restructure knowledge to use it more efficiently. Our goal is to give a machine learning system similar abilities so that it can learn more efficiently. We introduce the knowledge refactoring problem, where the goal is to restructure a learner's knowledge base to reduce its size and to minimise redundancy in it. We focus on inductive logic programming, where the knowledge base is a logic program. We introduce Knorf, a system which solves the refactoring problem using constraint optimisation. We evaluate our approach on two program induction domains: real-world string transformations and building Lego structures. Our experiments show that learning from refactored knowledge can improve predictive accuracies fourfold and reduce learning times by half.

Submitted 24 November, 2020; v1 submitted 21 April, 2020; originally announced April 2020.

Comments: 7 pages, 6 figures

arXiv:2004.09855 [pdf, ps, other]

Learning large logic programs by going beyond entailment

Authors: Andrew Cropper, Sebastijan Dumančić

Abstract: A major challenge in inductive logic programming (ILP) is learning large programs. We argue that a key limitation of existing systems is that they use entailment to guide the hypothesis search. This approach is limited because entailment is a binary decision: a hypothesis either entails an example or does not, and there is no intermediate position. To address this limitation, we go beyond entailment and use example-dependent loss functions to guide the search, where a hypothesis can partially cover an example. We implement our idea in Brute, a new ILP system which uses best-first search, guided by an example-dependent loss function, to incrementally build programs. Our experiments on three diverse program synthesis domains (robot planning, string transformations, and ASCII art) show that Brute can substantially outperform existing ILP systems, both in terms of predictive accuracies and learning times, and can learn programs 20 times larger than state-of-the-art systems.

Submitted 22 April, 2020; v1 submitted 21 April, 2020; originally announced April 2020.

Comments: IJCAI2020 paper

arXiv:2002.11002 [pdf, ps, other]

Turning 30: New Ideas in Inductive Logic Programming

Authors: Andrew Cropper, Sebastijan Dumančić, Stephen H. Muggleton

Abstract: Common criticisms of state-of-the-art machine learning include poor generalisation, a lack of interpretability, and a need for large amounts of training data. We survey recent work in inductive logic programming (ILP), a form of machine learning that induces logic programs from data, which has shown promise at addressing these limitations. We focus on new methods for learning recursive programs that generalise from few examples, a shift from using hand-crafted background knowledge to learning background knowledge, and the use of different technologies, notably answer set programming and neural networks. As ILP approaches 30, we also discuss directions for future research.

Submitted 22 April, 2020; v1 submitted 25 February, 2020; originally announced February 2020.

Comments: IJCAI2020 survey paper

arXiv:1911.06643 [pdf, ps, other]

Forgetting to learn logic programs

Authors: Andrew Cropper

Abstract: Most program induction approaches require predefined, often hand-engineered, background knowledge (BK). To overcome this limitation, we explore methods to automatically acquire BK through multi-task learning. In this approach, a learner adds learned programs to its BK so that they can be reused to help learn other programs. To improve learning performance, we explore the idea of forgetting, where a learner can additionally remove programs from its BK. We consider forgetting in an inductive logic programming (ILP) setting. We show that forgetting can significantly reduce both the size of the hypothesis space and the sample complexity of an ILP learner. We introduce Forgetgol, a multi-task ILP learner which supports forgetting. We experimentally compare Forgetgol against approaches that either remember or forget everything. Our experimental results show that Forgetgol outperforms the alternative approaches when learning from over 10,000 tasks.

Submitted 15 November, 2019; originally announced November 2019.

Comments: AAAI20

arXiv:1907.10953 [pdf, other]

Learning higher-order logic programs

Authors: Andrew Cropper, Rolf Morel, Stephen H. Muggleton

Abstract: A key feature of inductive logic programming (ILP) is its ability to learn first-order programs, which are intrinsically more expressive than propositional programs. In this paper, we introduce techniques to learn higher-order programs. Specifically, we extend meta-interpretive learning (MIL) to support learning higher-order programs by allowing for higher-order definitions to be used as background knowledge. Our theoretical results show that learning higher-order programs, rather than first-order programs, can reduce the textual complexity required to express programs which in turn reduces the size of the hypothesis space and sample complexity. We implement our idea in two new MIL systems: the Prolog system \namea{} and the ASP system \nameb{}. Both systems support learning higher-order programs and higher-order predicate invention, such as inventing functions for map/3 and conditions for filter/3. We conduct experiments on four domains (robot strategies, chess playing, list transformations, and string decryption) that compare learning first-order and higher-order programs. Our experimental results support our theoretical claims and show that, compared to learning first-order programs, learning higher-order programs can significantly improve predictive accuracies and reduce learning times.

Submitted 25 July, 2019; originally announced July 2019.

Comments: Submitted to the MLJ

arXiv:1907.10952 [pdf, ps, other]

Logical reduction of metarules

Authors: Andrew Cropper, Sophie Tourret

Abstract: Many forms of inductive logic programming (ILP) use metarules, second-order Horn clauses, to define the structure of learnable programs and thus the hypothesis space. Deciding which metarules to use for a given learning task is a major open problem and is a trade-off between efficiency and expressivity: the hypothesis space grows given more metarules, so we wish to use fewer metarules, but if we use too few metarules then we lose expressivity. In this paper, we study whether fragments of metarules can be logically reduced to minimal finite subsets. We consider two traditional forms of logical reduction: subsumption and entailment. We also consider a new reduction technique called derivation reduction, which is based on SLD-resolution. We compute reduced sets of metarules for fragments relevant to ILP and theoretically show whether these reduced sets are reductions for more general infinite fragments. We experimentally compare learning with reduced sets of metarules on three domains: Michalski trains, string transformations, and game rules. In general, derivation reduced sets of metarules outperform subsumption and entailment reduced sets, both in terms of predictive accuracies and learning times.

Submitted 25 July, 2019; originally announced July 2019.

Comments: MLJ submission

arXiv:1906.09627 [pdf, other]

Inductive general game playing

Authors: Andrew Cropper, Richard Evans, Mark Law

Abstract: General game playing (GGP) is a framework for evaluating an agent's general intelligence across a wide range of tasks. In the GGP competition, an agent is given the rules of a game (described as a logic program) that it has never seen before. The task is for the agent to play the game, thus generating game traces. The winner of the GGP competition is the agent that gets the best total score over all the games. In this paper, we invert this task: a learner is given game traces and the task is to learn the rules that could produce the traces. This problem is central to inductive general game playing (IGGP). We introduce a technique that automatically generates IGGP tasks from GGP games. We introduce an IGGP dataset which contains traces from 50 diverse games, such as Sudoku, Sokoban, and Checkers. We claim that IGGP is difficult for existing inductive logic programming (ILP) approaches. To support this claim, we evaluate existing ILP systems on our dataset. Our empirical results show that most of the games cannot be correctly learned by existing systems. The best performing system solves only 40% of the tasks perfectly. Our results suggest that IGGP poses many challenges to existing approaches. Furthermore, because we can automatically generate IGGP tasks from GGP games, our dataset will continue to grow with the GGP competition, as new games are added every year. We therefore think that the IGGP problem and dataset will be valuable for motivating and evaluating future research.

Submitted 23 June, 2019; originally announced June 2019.

Comments: Accepted for the Machine Learning journal

arXiv:1904.08993 [pdf, ps, other]

Playgol: learning programs through play

Authors: Andrew Cropper

Abstract: Children learn through play. We introduce the analogous idea of learning programs through play. In this approach, a program induction system (the learner) is given a set of tasks and initial background knowledge. Before solving the tasks, the learner enters an unsupervised playing stage where it creates its own tasks to solve, tries to solve them, and saves any solutions (programs) to the background knowledge. After the playing stage is finished, the learner enters the supervised building stage where it tries to solve the user-supplied tasks and can reuse solutions learnt whilst playing. The idea is that playing allows the learner to discover reusable general programs on its own which can then help solve the user-supplied tasks. We claim that playing can improve learning performance. We show that playing can reduce the textual complexity of target concepts which in turn reduces the sample complexity of a learner. We implement our idea in Playgol, a new inductive logic programming system. We experimentally test our claim on two domains: robot planning and real-world string transformations. Our experimental results suggest that playing can substantially improve learning performance. We think that the idea of playing (or, more verbosely, unsupervised bootstrapping for supervised program induction) is an important contribution to the problem of developing program induction approaches that self-discover BK.

Submitted 20 May, 2019; v1 submitted 18 April, 2019; originally announced April 2019.

Comments: IJCAI 2019

arXiv:1902.09900 [pdf, other]

SLD-Resolution Reduction of Second-Order Horn Fragments -- technical report --

Authors: Sophie Tourret, Andrew Cropper

Abstract: We present the derivation reduction problem for SLD-resolution, the undecidable problem of finding a finite subset of a set of clauses from which the whole set can be derived using SLD-resolution. We study the reducibility of various fragments of second-order Horn logic with particular applications in Inductive Logic Programming. We also discuss how these results extend to standard resolution.

Submitted 26 February, 2019; originally announced February 2019.

Comments: technical report, extends a conference paper accepted at JELIA 2019 with detailed proofs

Showing 1–32 of 32 results for author: Cropper, A