We have evaluated the performance of Princess on benchmarks containing
arrays, uninterpreted functions, integer arithmetic, and quantifiers (AUFLIA) from the SMT
competition 2011, in the categories AUFLIA+p
and AUFLIA-p.
The
difference
between AUFLIA+p and AUFLIA-p is that the problems of
the former kind contain user-specified triggers (as part of quantified
sub-formulae), while such triggers have been removed from benchmarks of
the latter category. We used the actual scrambled benchmarks from the
competition.
Results of the
experiments
(as an Excel spreadsheet), which were done on an Intel Core i5 2-core
machine with 3.2GHz, heap-space limited to 4GB, running Linux, with a
timeout of 1200s; the conditions are similar as for the actual SMT
competition, so that the results should be roughly comparable. Princess
would have ended up on the second place in the competition, before CVC3
but after Z3.
We have compared the performance of the interpolation procedure
integrated into Princess with
the following tools:
Results of the experiments (as an Excel spreadsheet), which were done on an Intel Xeon X5667 4-core machine with 3.07GHz, heap-space limited to 12GB, running Linux, with a timeout of 900s.
This partitioning is done by the script princessBench
that can
be used in combination with the binary
distribution of Princess. The script can be used in the
following ways:
./princessBench <inputfile.pri> # Partition the given problem, solve it and compute interpolants
./princessBench -mode=qebased <inputfile.pri> # Partition the given problem and compute interpolants using
# quantifier elimination
./princessBench -mode=smtdump <inputfile.pri> # Partition the given problem and output interpolation problems
# in the SMT-LIB 1 format (targetting OpenSMT and SMTInterpol)
./princessBench -mode=csisatdump <inputfile.pri> # Partition the given problem and output interpolation problems
# in the CSIsat format
Some results of running Princess on benchmarks in
the categories QF_LIA
and QF_IDL (quantifier-free formulae over linear integer arithmetic)
of SMT-LIB. Although
Princess is not
really designed for the problems in those categories, it performs
reasonably well on most
of the benchmarks. (the problems in some of the directories, for
instance in
nec-smt
, seem to too big or
require more efficient propositional reasoning than Princess currently
provides, however).
QF_LIA | QF_IDL | ||||||
---|---|---|---|---|---|---|---|
Directory | solved/total | Directory | solved/total | Directory | solved/total | Directory | solved/total |
Averest | 10/19 | nec-smt/small | 17/35 | Averest | 195/252 | planning | 5/45 |
CIRC | 33/51 | nec-smt/med | 3/364 | cellar | 0/14 | qlock | 0/72 |
RTCL | 2/2 | nec-smt/large | ??? | check | 3/3 | queens_bench | 53/297 |
check | 5/5 | rings | 53/293 | diamonds | 18/36 | RTCL | 30/33 |
mathsat | 55/121 | wisa | 4/5 | DTP | 0/60 | sal | 27/50 |
mathsat | 96/146 | sep | 17/17 | ||||
parity | 34/248 |
Run on Dual Core AMD Opteron 270, 2 GHz
Heapspace limited to 1.5GB
Timeout: 1000s
SMT-LIB converter
(based on SMT-LIB parser 2.0
by Mike Decoster and Michael Schidlowsky)