E 2.4pre

Stephan Schulz
DHBW Stuttgart, Germany

Architecture

E 2.4pre [Sch2002, Sch2013, SCV2019] is a purely equational theorem prover for many-sorted first-order logic with equality. It consists of an (optional) clausifier for pre-processing full first-order formulae into clausal form, and a saturation algorithm implementing an instance of the superposition calculus with negative literal selection and a number of redundancy elimination techniques. E is based on the DISCOUNT-loop variant of the given-clause algorithm, i.e., a strict separation of active and passive facts. No special rules for non-equational literals have been implemented. Resolution is effectively simulated by paramodulation and equality resolution. As of E 2.1, PicoSAT [Bie2008] can be used to periodically check the (on-the-fly grounded) proof state for propositional unsatisfiability.

For the LTB divisions, a control program uses a SInE-like analysis to extract reduced axiomatizations that are handed to several instances of E. E will not use on-the-fly learning this year.

Strategies

Proof search in E is primarily controlled by a literal selection strategy, a clause selection heuristic, and a simplification ordering. The prover supports a large number of pre-programmed literal selection strategies. Clause selection heuristics can be constructed on the fly by combining various parameterized primitive evaluation functions, or can be selected from a set of predefined heuristics. Clause evaluation heuristics are based on symbol-counting, but also take other clause properties into account. In particular, the search can prefer clauses from the set of support, or containing many symbols also present in the goal. Supported term orderings are several parameterized instances of Knuth-Bendix-Ordering (KBO) and Lexicographic Path Ordering (LPO), which can be lifted in different ways to literal orderings.

For CASC-27, E implements a strategy-scheduling automatic mode. The total CPU time available is broken into several (unequal) time slices. For each time slice, the problem is classified into one of several classes, based on a number of simple features (number of clauses, maximal symbol arity, presence of equality, presence of non-unit and non-Horn clauses,...). For each class, a schedule of strategies is greedily constructed from experimental data as follows: The first strategy assigned to a schedule is the the one that solves the most problems from this class in the first time slice. Each subsequent strategy is selected based on the number of solutions on problems not already solved by a preceding strategy.

About 230 different strategies have been thoroughly evaluated on all untyped first-order problems from TPTP 7.2.0. In addition, we have explored some parts of the heuristic parameter space with a short time limit of 5 seconds. This allowed us to test about 650 strategies on all TPTP problems, and an extra 7000 strategies on all 1193 UEQ problems from TPTP 7.2.0. About 100 of these strategies are used in the automatic mode, and about 450 are used in at least one schedule.

Implementation

E is build around perfectly shared terms, i.e. each distinct term is only represented once in a term bank. The whole set of terms thus consists of a number of interconnected directed acyclic graphs. Term memory is managed by a simple mark-and-sweep garbage collector. Unconditional (forward) rewriting using unit clauses is implemented using perfect discrimination trees with size and age constraints. Whenever a possible simplification is detected, it is added as a rewrite link in the term bank. As a result, not only terms, but also rewrite steps are shared. Subsumption and contextual literal cutting (also known as subsumption resolution) is supported using feature vector indexing [Sch2013a]. Superposition and backward rewriting use fingerprint indexing [Sch2012], a new technique combining ideas from feature vector indexing and path indexing. Finally, LPO and KBO are implemented using the elegant and efficient algorithms developed by Bernd Löchner in [Loe2006, Loe2006a]. The prover and additional information are available at
    https://www.eprover.org

Expected Competition Performance

The inference core of E 2.4pre has been slightly simplified since last years pre-release. We have also been able to evaluate more different search strategies, in particular those incorporating PicoSAT. As a result, we expect performance to be somewhat better than in the last years. The system is expected to perform well in most proof classes, but will at best complement top systems in the disproof classes.

References

SCV2019
Schulz S., Cruanes, S., Vukmirovic, P., (2019), Faster, Higher, Stronger: E 2.3, Proc. of the 27th CADE, Natal, LNAI 11716, Springer (to appear)
Sch2013
Schulz S. (2013), System Description: E 1.8, Proc. of the 19th LPAR, Stellenbosch, LNCS 8312, pp.735-743, Springer
Sch2002
Schulz S. (2002), E: A Brainiac Theorem Prover, Journal of AI Communications 15(2/3), pp.111-126, IOS Press
Sch2013a
Schulz S. (2013), Simple and Efficient Clause Subsumption with Feature Vector Indexing, Automated Reasoning and Mathematics: Essays in Memory of William W. McCune, LNAI 7788, pp. 45-67, Springer
Sch2012
Schulz S. (2012), Fingerprint Indexing for Paramodulation and Rewriting, Proceedings of the 6th IJCAR (Manchester, UK), LNAI 7364, pp.477-483, Springer
Loe2006
Löchner B. (2004), Things to Know when Implementing LPO, International Journal on Artificial Intelligence Tools, 15(1), pp.53–80, 2006.
Loe2006a
Löchner B. (2006), Things to Know when Implementing KBO, Journal of Automated Reasoning 36(4), pp.289-310.
Bie2008
Biere A. (2008), PicoSAT essentials, Journal on Satisfiability, Boolean Modeling and Computation 36, pp.75-97, 2008