Princess Benchmarks

TPTP Performance (June 2012)

Princess entered the CASC J6 competition, in the divisions TFA (arithmetic problems), FOF (first-order), and FOF@Turing, and won in the TFA division. The results altogether were:

AUFLIA Benchmarks (November 2011)

We have evaluated the performance of Princess on benchmarks containing arrays, uninterpreted functions, integer arithmetic, and quantifiers (AUFLIA) from the SMT competition 2011, in the categories AUFLIA+p and AUFLIA-p. The difference between AUFLIA+p and AUFLIA-p is that the problems of the former kind contain user-specified triggers (as part of quantified sub-formulae), while such triggers have been removed from benchmarks of the latter category. We used the actual scrambled benchmarks from the competition.

Results of the experiments (as an Excel spreadsheet), which were done on an Intel Core i5 2-core machine with 3.2GHz, heap-space limited to 4GB, running Linux, with a timeout of 1200s; the conditions are similar as for the actual SMT competition, so that the results should be roughly comparable. Princess would have ended up on the second place in the competition, before CVC3 but after Z3.

Interpolation Benchmarks (May 2011)

We have compared the performance of the interpolation procedure integrated into Princess with the following tools:

The benchmarks for our experiments are drawn from different families of the SMT-LIB category QF-LIA; conversion to the Princess format is done using the converter linked further down this page. Because SMT-LIB benchmarks are usually conjunctions at the outermost level, we partitioned them into A ∧ B by choosing the first 10 · n of the benchmark conjuncts as A, the rest as B (where n is the total number of conjuncts, and k ∈ {1, ... , 9}). This yields 9 interpolation problems for each SMT-LIB benchmark.

Results of the experiments (as an Excel spreadsheet), which were done on an Intel Xeon X5667 4-core machine with 3.07GHz, heap-space limited to 12GB, running Linux, with a timeout of 900s.

This partitioning is done by the script princessBench that can be used in combination with the binary distribution of Princess. The script can be used in the following ways:

./princessBench <inputfile.pri>                     # Partition the given problem, solve it and compute interpolants
./princessBench -mode=qebased <inputfile.pri> # Partition the given problem and compute interpolants using
# quantifier elimination
./princessBench -mode=smtdump <inputfile.pri> # Partition the given problem and output interpolation problems
# in the SMT-LIB 1 format (targetting OpenSMT and SMTInterpol)
./princessBench -mode=csisatdump <inputfile.pri> # Partition the given problem and output interpolation problems
# in the CSIsat format

Older Benchmarks (October 2008)

Some results of running Princess on benchmarks in the categories QF_LIA and QF_IDL (quantifier-free formulae over linear integer arithmetic) of SMT-LIB. Although Princess is not really designed for the problems in those categories, it performs reasonably well on most of the benchmarks. (the problems in some of the directories, for instance in nec-smt, seem to too big or require more efficient propositional reasoning than Princess currently provides, however).

Directory solved/total Directory solved/total Directory solved/total Directory solved/total
Averest 10/19 nec-smt/small 17/35 Averest 195/252 planning 5/45
CIRC 33/51 nec-smt/med 3/364 cellar 0/14 qlock 0/72
RTCL 2/2 nec-smt/large ??? check 3/3 queens_bench 53/297
check 5/5 rings 53/293 diamonds 18/36 RTCL 30/33
mathsat 55/121 wisa 4/5 DTP 0/60 sal 27/50

mathsat 96/146 sep 17/17

parity 34/248

Run on Dual Core AMD Opteron 270, 2 GHz
Heapspace limited to 1.5GB
Timeout: 1000s
SMT-LIB converter (based on SMT-LIB parser 2.0
by Mike Decoster and Michael Schidlowsky)