Performance components
–Hardware platform
–Data structures
–Algebraic optimizer
–SQL parser
–Application code
Questions asked at these layers:
–What is the total cost of execution?
–How many tasks can be performed per minute?
–How good is the optimizer?
–What is the overhead of the data structures?

Performance benchmarks: suites of tasks used to quantify the performance of software systems. Important in comparing database systems, especially as systems become more standards compliant.
Commonly used performance measures:
–Throughput (transactions per second, or tps)
–Response time (delay from submission of a transaction to return of the result)
–Availability or mean time to failure

Benchmark design
Benchmark suite structures:
–Simple, one-shot experiments
 e.g. time to set up a connection with a db server
 e.g. selection processing for multiple selectivity factors
–Complex, multi-target experiments
 geared at supporting a particular domain
 designed to study the behavior of the DBMS software

Case study: Wisconsin benchmark
Designed in 1981 to study the query performance of database algorithms on a single-user system.
Used extensively over the last 20 years to assess the maturity of a kernel.
The published results caused legal problems for the authors.

Case study: Wisconsin Benchmark
The speed at which a database system can process a selection operation depends on a number of factors, including:
1) The storage organization of the relation.
2) The selectivity factor of the predicate.
3) The hardware speed and the quality of the software.
4) The output mode of the query.

Case study: Wisconsin Benchmark
The selection queries in the Wisconsin benchmark explore the effect of each of these factors and the impact of three different storage organizations:
1) Sequential (heap) organization.
2) Primary clustered index on the unique2 attribute (the relation is sorted on unique2).
3) Secondary, dense, non-clustered indices on the unique1 and onePercent attributes.
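As a rough illustration of how such a relation can be generated and probed, here is a Python sketch; the attribute derivations (e.g. onePercent as unique1 mod 100) are simplifying assumptions for illustration, not the exact Wisconsin specification:

```python
import random

# Build a scaled-down Wisconsin-style relation of n tuples.
# unique1: random permutation of 0..n-1; unique2: sequential (the "sort key");
# onePercent: assumed here to be derived as unique1 mod 100.
n = 10_000
unique1 = list(range(n))
random.shuffle(unique1)
rel = [{"unique1": u1, "unique2": i, "onePercent": u1 % 100}
       for i, u1 in enumerate(unique1)]   # stored in unique2 order

# A 1% selectivity selection via the onePercent attribute:
hits = [t for t in rel if t["onePercent"] == 42]
print(len(hits))   # 100 tuples = 1% of the relation
```

Running the same predicate against the heap, the clustered index on unique2, and the non-clustered index on onePercent is what lets the benchmark separate the cost of the storage organization from the cost of the predicate itself.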

Case study: Wisconsin Benchmark
The join queries in the benchmark were designed to study the effect of three different factors:
1) The impact of the complexity of a query on the relative performance of the different database systems.
2) The performance of the join algorithms used by the different systems.
3) The effectiveness of the query optimizers on complex queries.

Case study: Wisconsin Benchmark
–JoinABprime: a simple join of relations A and Bprime, where the cardinality of the Bprime relation is 10% that of the A relation.
–JoinASelB: composed of one join and one selection. A and B have the same number of tuples; the selection on B has a 10% selectivity factor, reducing B to the size of the Bprime relation in the JoinABprime query. The result relation for this query has the same number of tuples as the corresponding JoinABprime query.
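The cardinality relationship between the two queries can be checked with a small Python sketch (relation sizes and the join attribute are scaled-down assumptions for illustration):

```python
# Scaled-down relations A and B with a shared join attribute unique1.
n = 1000
A = [{"unique1": i} for i in range(n)]
B = [{"unique1": i} for i in range(n)]
Bprime = B[: n // 10]                      # |Bprime| = 10% of |A|

# JoinABprime: A join Bprime on unique1.
join_ab_prime = [(a, b) for a in A for b in Bprime
                 if a["unique1"] == b["unique1"]]

# JoinASelB: a 10% selection on B first, then the join with A.
sel_b = [b for b in B if b["unique1"] < n // 10]
join_a_sel_b = [(a, b) for a in A for b in sel_b
                if a["unique1"] == b["unique1"]]

# Both queries produce results of the same cardinality.
print(len(join_ab_prime), len(join_a_sel_b))   # 100 100
```

The point of the pairing is that a good optimizer should push the selection below the join, making JoinASelB cost about the same as JoinABprime.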

Case study: Wisconsin benchmark
In the general case, the projection operation is implemented in two phases. First, a pass is made through the source relation to discard unwanted attributes. A second phase is then necessary to eliminate any duplicate tuples introduced as a side effect of the first phase (i.e. by the elimination of an attribute that is the key or part of the key).
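The two phases can be sketched in a few lines of Python (the function name and sample data are illustrative):

```python
def project(relation, keep):
    """Two-phase projection: strip attributes, then eliminate duplicates."""
    # Phase 1: discard unwanted attributes from every tuple.
    stripped = [tuple(row[a] for a in keep) for row in relation]
    # Phase 2: eliminate duplicates introduced by dropping key attributes,
    # preserving first-seen order.
    seen, result = set(), []
    for t in stripped:
        if t not in seen:
            seen.add(t)
            result.append(t)
    return result

rows = [{"id": 1, "dept": "sales"},
        {"id": 2, "dept": "sales"},
        {"id": 3, "dept": "hr"}]
print(project(rows, ["dept"]))   # [('sales',), ('hr',)]
```

Dropping the key attribute `id` is exactly the case where phase 2 has work to do: two `sales` tuples collapse into one.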

AS3AP Benchmark
The ANSI SQL Standard Scalable and Portable (AS3AP) benchmark for relational database systems is designed to:
–provide a comprehensive but tractable set of tests for database processing power;
–have built-in scalability and portability, so that it can be used to test a broad range of systems;
–minimize the human effort in implementing and running the benchmark tests;
–provide a uniform metric, the equivalent database ratio, for a straightforward and unambiguous interpretation of the benchmark results.

AS3AP Benchmark
The AS3AP benchmark determines an equivalent database size: the maximum size of the AS3AP database for which the system can complete the designated AS3AP set of single-user and multi-user tests in under 12 hours.

Conclusions WISC & AS3AP
Observation: database system performance differs widely.
A benchmark suite is a collection of database tasks which:
–should have a precisely articulated goal
–should be minimal
–should be scalable
–should have an associated metric

Performance Benchmarks
Commonly used performance measures:
–Response time (delay from submission of a transaction to return of the result)
–Throughput (transactions per second, or tps)
–Availability or mean time to failure
–Speedup (linear speedup: twice the resources halve the execution time)
–Scaleup (linear scaleup: response time remains constant as load and resources grow proportionally)
–Sizeup (doubling the database size does not double the required resources)
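Speedup and scaleup can be expressed as simple ratios of measured times; a minimal sketch, with function names and sample timings as illustrative assumptions:

```python
def speedup(t_before, t_after):
    """Same problem on N times the resources; linear speedup means ratio == N."""
    return t_before / t_after

def scaleup(t_small, t_large):
    """N times the problem on N times the resources; linear scaleup means ratio == 1."""
    return t_small / t_large

# Hypothetical measurements (seconds) after doubling the resources:
print(speedup(100.0, 50.0))    # 2.0 -> linear speedup
print(scaleup(100.0, 100.0))   # 1.0 -> perfect scaleup
```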

Issues in Metric Selection
Response time metric:
–Do you measure arrival of the first or the last result?
–How do you deal with outliers?
–How do you avoid tampering with results?
–How do you capture complex behaviour in a single number?

Performance Benchmarks (Cont.)
Beware when computing the average throughput of different transaction types:
–E.g., suppose a system runs transaction type A at 99 tps and transaction type B at 1 tps.
–Given an equal mixture of types A and B, throughput is not (99+1)/2 = 50 tps.
–Running one transaction of each type takes about 1.01 seconds (1/99 + 1/1), giving a throughput of 1.98 tps (= 2/1.01).
–Interference (e.g. lock contention) makes even this incorrect if different transaction types run concurrently.
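The arithmetic above is the harmonic mean of the per-type throughputs, which a few lines of Python make explicit:

```python
# Per-type throughputs (transactions per second).
tps_a, tps_b = 99.0, 1.0

# Naive arithmetic mean: misleading for an equal mixture.
naive_avg = (tps_a + tps_b) / 2          # 50 tps

# Correct mixed throughput: time one transaction of each type, serially.
time_for_pair = 1 / tps_a + 1 / tps_b    # ~1.0101 seconds
mixed_tps = 2 / time_for_pair            # harmonic mean of the two rates

print(round(mixed_tps, 2))   # 1.98 -- nowhere near 50
```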

Benchmark Suites
The Transaction Processing Performance Council (www.tpc.org) benchmark suites are widely used:
–TPC-A and TPC-B: simple OLTP applications modeling a bank teller application, with and without communication; no longer used.
–TPC-C: complex OLTP application modeling an inventory system; the current standard for OLTP benchmarking.

Benchmark Suites (Cont.)
TPC benchmarks (cont.):
–TPC-D: complex decision-support application; superseded by TPC-H and TPC-R.
–TPC-H (H for ad hoc): models ad hoc queries that are not known beforehand.
 A total of 22 queries with an emphasis on aggregation; prohibits materialized views; permits indices only on primary and foreign keys.
–TPC-R (R for reporting): same as TPC-H, but without any restrictions on materialized views and indices.
–TPC-W (W for Web): an end-to-end Web service benchmark modeling a Web bookstore, with a combination of static and dynamically generated pages.

TPC Performance Measures
TPC performance measures:
–Transactions per second, with specified constraints on response time.
–Transactions per second per dollar, accounting for the cost of owning the system.
TPC benchmarks require database sizes to be scaled up with increasing transactions per second, reflecting real-world applications where more customers mean both a larger database and more transactions per second.
An external audit of TPC performance numbers is mandatory, so TPC performance claims can be trusted.