Princess Benchmarks

TPTP Performance (June 2012)

CASC J6

TFA: Winner, place 1/3

FOF: place 12/16

AUFLIA Benchmarks (November 2011)

We have evaluated the performance of Princess on benchmarks containing arrays, uninterpreted functions, integer arithmetic, and quantifiers (AUFLIA) from the SMT competition 2011, in the categories AUFLIA+p and AUFLIA-p. The difference between AUFLIA+p and AUFLIA-p is that the problems of the former kind contain user-specified triggers (as part of quantified sub-formulae), while such triggers have been removed from benchmarks of the latter category. We used the actual scrambled benchmarks from the competition.

Results of the experiments (as an Excel spreadsheet), which were done on an Intel Core i5 2-core machine with 3.2GHz, heap-space limited to 4GB, running Linux, with a timeout of 1200s; the conditions are similar as for the actual SMT competition, so that the results should be roughly comparable. Princess would have ended up on the second place in the competition, before CVC3 but after Z3.

Interpolation Benchmarks (May 2011)

We have compared the performance of the interpolation procedure integrated into Princess with the following tools:

the SMT-solver SMTInterpol, a recently released interpolating decision procedure for linear integer arithmetic,

an interpolating version of the SMT-solver OpenSMT that we developed in the scope of our LPAR paper,

the interpolation procedure CSIsat for linear rational arithmetic and uninterpreted functions,

quantifier elimination (QE) procedures, which can be used to generate interpolants by existentially quantifying the variables of the left partition; for our experiments, we use the implementation of the Omega test available in Princess itself.

SMT-LIB

Results of the experiments (as an Excel spreadsheet), which were done on an Intel Xeon X5667 4-core machine with 3.07GHz, heap-space limited to 12GB, running Linux, with a timeout of 900s.

This partitioning is done by the script princessBench that can be used in combination with the binary distribution of Princess. The script can be used in the following ways:

./princessBench <inputfile.pri> # Partition the given problem, solve it and compute interpolants
./princessBench -mode=qebased <inputfile.pri> # Partition the given problem and compute interpolants using
# quantifier elimination
./princessBench -mode=smtdump <inputfile.pri> # Partition the given problem and output interpolation problems
# in the SMT-LIB 1 format (targetting OpenSMT and SMTInterpol)
./princessBench -mode=csisatdump <inputfile.pri> # Partition the given problem and output interpolation problems
# in the CSIsat format

Older Benchmarks (October 2008)

Some results of running Princess on benchmarks in the categories QF_LIA and QF_IDL (quantifier-free formulae over linear integer arithmetic) of SMT-LIB. Although Princess is not really designed for the problems in those categories, it performs reasonably well on most of the benchmarks. (the problems in some of the directories, for instance in nec-smt, seem to too big or require more efficient propositional reasoning than Princess currently provides, however).

QF_LIA	QF_IDL
Directory	solved/total	Directory	solved/total	Directory	solved/total	Directory	solved/total
Averest	10/19	nec-smt/small	17/35	Averest	195/252	planning	5/45
CIRC	33/51	nec-smt/med	3/364	cellar	0/14	qlock	0/72
RTCL	2/2	nec-smt/large	???	check	3/3	queens_bench	53/297
check	5/5	rings	53/293	diamonds	18/36	RTCL	30/33
mathsat	55/121	wisa	4/5	DTP	0/60	sal	27/50
				mathsat	96/146	sep	17/17
				parity	34/248

QF_LIA

QF_IDL

Directory

solved/total

Directory

solved/total

Directory

solved/total

Directory

solved/total

???

Run on Dual Core AMD Opteron 270, 2 GHz
Heapspace limited to 1.5GB
Timeout: 1000s
SMT-LIB converter (based on SMT-LIB parser 2.0
by Mike Decoster and Michael Schidlowsky)