For the LTB divisions, a control program uses a SInE-like analysis to
extract reduced axiomatizations that are handed to several instances
of E. E will not use on-the-fly learning this year.
For CASC-27, E implements a strategy-scheduling automatic mode. The
total CPU time available is broken into several (unequal) time
slices. For each time slice, the problem is classified into one of
several classes, based on a number of simple features (number of
clauses, maximal symbol arity, presence of equality, presence of
non-unit and non-Horn clauses,...). For each class, a schedule of
strategies is greedily constructed from experimental data as follows:
The first strategy assigned to a schedule is the the one that solves
the most problems from this class in the first time slice. Each
subsequent strategy is selected based on the number of solutions on
problems not already solved by a preceding strategy.
About 230 different strategies have been thoroughly evaluated on all
untyped first-order problems from TPTP 7.2.0. In addition, we have
explored some parts of the heuristic parameter space with a short time
limit of 5 seconds. This allowed us to test about 650 strategies on
all TPTP problems, and an extra 7000 strategies on all 1193 UEQ
problems from TPTP 7.2.0.
About 100 of these strategies are used in the automatic mode, and
about 450 are used in at least one schedule.
E 2.4pre
Stephan Schulz
DHBW Stuttgart, Germany
Architecture
E 2.4pre [Sch2002, Sch2013, SCV2019] is a purely equational theorem
prover for many-sorted first-order logic with equality. It consists
of an (optional) clausifier for pre-processing full first-order
formulae into clausal form, and a saturation algorithm implementing an
instance of the superposition calculus with negative literal selection
and a number of redundancy elimination techniques. E is based on the
DISCOUNT-loop variant of the given-clause algorithm, i.e., a
strict separation of active and passive facts. No special rules for
non-equational literals have been implemented. Resolution is
effectively simulated by paramodulation and equality resolution. As of
E 2.1, PicoSAT [Bie2008] can be used to periodically check the
(on-the-fly grounded) proof state for propositional unsatisfiability.
Strategies
Proof search in E is primarily controlled by a literal selection
strategy, a clause selection heuristic, and a simplification
ordering. The prover supports a large number of pre-programmed literal
selection strategies. Clause selection heuristics can be constructed
on the fly by combining various parameterized primitive evaluation
functions, or can be selected from a set of predefined
heuristics. Clause evaluation heuristics are based on symbol-counting,
but also take other clause properties into account. In particular, the
search can prefer clauses from the set of support, or containing many
symbols also present in the goal. Supported term orderings are several
parameterized instances of Knuth-Bendix-Ordering (KBO) and
Lexicographic Path Ordering (LPO), which can be lifted in different
ways to literal orderings.
Implementation
E is build around perfectly shared terms, i.e. each distinct term is
only represented once in a term bank. The whole set of terms thus
consists of a number of interconnected directed acyclic graphs. Term
memory is managed by a simple mark-and-sweep garbage collector.
Unconditional (forward) rewriting using unit clauses is implemented
using perfect discrimination trees with size and age constraints.
Whenever a possible simplification is detected, it is added as a
rewrite link in the term bank. As a result, not only terms, but also
rewrite steps are shared. Subsumption and contextual literal cutting
(also known as subsumption resolution) is supported using feature
vector indexing [Sch2013a]. Superposition and backward rewriting use
fingerprint indexing [Sch2012], a new technique combining ideas from
feature vector indexing and path indexing. Finally, LPO and KBO are
implemented using the elegant and efficient algorithms developed by
Bernd Löchner in [Loe2006, Loe2006a]. The prover and additional
information are available at
https://www.eprover.org
Expected Competition Performance
The inference core of E 2.4pre has been slightly simplified since last
years pre-release. We have also been able to evaluate more different
search strategies, in particular those incorporating PicoSAT. As a
result, we expect performance to be somewhat better than in the last
years. The system is expected to perform well in most proof classes,
but will at best complement top systems in the disproof classes.