3
Problem definition
What are the questions that you really want answered?
Refine
  – What specific information is needed to influence the decision?
  – What level of confidence is needed to influence others to adopt your recommendations?
When must the decision be made?
What are the time and money constraints for this study?

5
It takes two models to tango
Workload
  – A benchmark is a model of a workload
  – The most accurate model of the workload is the workload itself
      – Working with the actual workload is often impractical. Reasons why?
  – Workload characterization helps abstract the important workload characteristics
      – Benchmarks are sometimes used to model the workload
System
  – The most accurate model of a system is the system itself
      – Working with the real system is often impractical. Reasons why?
  – Models abstract the real system
      – Analytic
      – Simulation
      – Hybrid

6
Level of Detail
Risk of “going Rainman”, Jay Veazey
  – Do you really need to model every detail?
  – Avoid model parameters that cannot be accurately measured
We need to find the right level of abstraction
Identify the key characteristics of the:
  – Workload
      – For OLTP it might include IO rate, instructions executed per transaction, lock contention rate, …
  – System
      – For OLTP it might be service rates, latencies, and the ability to process lock contention
The rest is just a distraction

9
Workload characterization
Measure real systems to collect:
1. Workload parameters for your model
  – Critical aspects of the workload for making the decision
  – Examples:
      – Transaction types and rates
      – Number of users
      – IO rate
      – IO block size
      – Instructions executed per transaction, …
  – Remember we may need to scale the workload up or down for specific model scenarios
  – Measurement of operational variables is preferred
      – Variables that can be established beyond doubt by measurement are called operationally testable
      – GIGO (garbage in, garbage out)
2. Data to help validate your model
  – Throughput
  – Response times
  – Utilizations
  – Queue lengths, …
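
As a rough illustration of operational variables, the sketch below derives throughput, utilization, and mean service time from three directly measurable quantities (observation time, completions, busy time) and checks the Utilization Law, U = X × S. The class name, the disk figures, and the scaling helper are hypothetical and not taken from the slides. The scaling step assumes that service time per completion stays the same as load changes, which should itself be validated.

```python
from dataclasses import dataclass


@dataclass
class DeviceMeasurement:
    """Directly measured (operationally testable) quantities for one device."""
    observation_s: float  # T: length of the measurement interval, seconds
    completions: int      # C: requests completed during the interval
    busy_s: float         # B: time the device was busy, seconds

    @property
    def throughput(self) -> float:
        """X = C / T, completions per second."""
        return self.completions / self.observation_s

    @property
    def utilization(self) -> float:
        """U = B / T, fraction of the interval the device was busy."""
        return self.busy_s / self.observation_s

    @property
    def service_time(self) -> float:
        """S = B / C, mean busy time per completion, seconds."""
        return self.busy_s / self.completions


def scaled_utilization(m: DeviceMeasurement, scale: float) -> float:
    """Predict utilization if throughput is multiplied by `scale`.

    Assumption (hypothetical): service time per completion does not
    change with load, so the Utilization Law U = X * S still applies.
    """
    return m.throughput * scale * m.service_time


# Hypothetical measurement: a disk observed for 10 minutes.
disk = DeviceMeasurement(observation_s=600.0, completions=12000, busy_s=420.0)
print(f"X = {disk.throughput:.1f}/s, U = {disk.utilization:.2f}, "
      f"S = {disk.service_time * 1000:.1f} ms")
print(f"U at 1.2x workload = {scaled_utilization(disk, 1.2):.2f}")
```

A predicted utilization at or above 1.0 means the scaled workload would saturate the device, so that scenario needs a different configuration rather than a response-time estimate.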

10
Validation
Don’t just look at the predicted performance metric
Compare known (validation) cases for:
  – proper queue lengths,
  – number of visits and
  – utilizations on as many components as you can.
Understand deviations.
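
One way to make this comparison routine is a small report that checks every measured quantity against the model's prediction, not only the headline metric. The sketch below is illustrative only: the function name, the metric names, the numbers, and the 10% tolerance are all hypothetical choices, not values from the slides.

```python
def validation_report(measured, predicted, tolerance=0.10):
    """Return {metric: relative_error} for metrics that deviate too much.

    `measured` and `predicted` map metric names to values for one known
    (validation) case. A metric missing from `predicted` is reported with
    an error of None so that it is not silently skipped.
    """
    deviations = {}
    for name, observed in measured.items():
        if name not in predicted:
            deviations[name] = None
            continue
        error = abs(predicted[name] - observed)
        # Use relative error unless the measured value is zero.
        if observed != 0:
            error /= abs(observed)
        if error > tolerance:
            deviations[name] = error
    return deviations


# Hypothetical validation case: throughput matches well, but the
# predicted disk queue length does not.
measured = {
    "throughput_tps": 20.0,
    "cpu_utilization": 0.70,
    "disk_utilization": 0.45,
    "disk_queue_length": 0.80,
}
predicted = {
    "throughput_tps": 20.4,
    "cpu_utilization": 0.72,
    "disk_utilization": 0.44,
    "disk_queue_length": 1.30,
}

flagged = validation_report(measured, predicted)
for name, error in flagged.items():
    print(f"{name}: relative error {error:.1%}")
```

In this made-up case the throughput prediction alone would look fine, while the queue length deviation is the kind of result that should be understood before the model is trusted.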

11
Validation
Never trust model results until validated
  – Are the results reasonable?
Sources of error
  – Wrong workload
      – Poor workload characterization
          – Missed a key aspect of the workload
          – Measurement error
      – Improperly scaled the workload for the new situation
  – Benchmarking
      – Instrumentation can perturb the system (measurement artifact)
      – Not really the system we want to measure!
  – Analytical model
      – (Symptom) Improper queue lengths on validation cases
      – Not enough detail, or there are software blockages
  – Simulation
      – Programming errors?
      – Too much detail
          – A detailed model requires more workload assumptions, which are subject to error
      – Are the random numbers really random?
      – Untested corner cases?
High value decisions may merit cross checking between more than one approach

17
Simulation models
Simulation
  – Types of simulations for use in capacity planning
      – Transaction oriented: modeled from the point of view of the transaction visiting services
      – Process oriented: modeled from the point of view of either transactions or servers or both
  – Workload source
      – Trace-driven: perhaps traces of real system activity
      – Stochastic: use of random number generators
Statistical tools can be used to:
  – Reduce the simulation time
      – Confidence intervals
  – Determine whether a change made to a system has a statistically significant impact on performance
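
The two statistical uses above can be sketched with a very small stochastic, transaction-oriented simulation. Everything specific here is an assumption made for illustration: a single server with exponential interarrival and service times, an arrival rate of 8 per second, service rates of 10 and 12 per second, and 10 independent replications. The same seeds are reused for both configurations so that each replication pair sees the same random stream, and the change is judged by a confidence interval on the paired differences.

```python
import math
import random
import statistics

# Two-sided 95% Student t critical value for 9 degrees of freedom,
# i.e. valid only for exactly 10 replications.
T_CRIT_95_DF9 = 2.262


def simulate_mean_response(arrival_rate, service_rate, n_transactions, seed):
    """Mean response time of a single-server FIFO queue (hypothetical model).

    Follows each transaction in turn: its response time is its queueing
    delay plus its service time, and the next transaction's delay follows
    from Lindley's recursion.
    """
    rng = random.Random(seed)
    wait = 0.0   # queueing delay of the current transaction
    total = 0.0  # sum of response times
    for _ in range(n_transactions):
        service = rng.expovariate(service_rate)
        total += wait + service
        interarrival = rng.expovariate(arrival_rate)
        wait = max(0.0, wait + service - interarrival)
    return total / n_transactions


def confidence_interval(samples, t_crit=T_CRIT_95_DF9):
    """Return (low, high) for the mean of `samples` (default: 10 samples)."""
    mean = statistics.mean(samples)
    half_width = t_crit * statistics.stdev(samples) / math.sqrt(len(samples))
    return mean - half_width, mean + half_width


seeds = range(10)  # one seed per replication
baseline = [simulate_mean_response(8.0, 10.0, 20000, s) for s in seeds]
faster = [simulate_mean_response(8.0, 12.0, 20000, s) for s in seeds]
differences = [f - b for f, b in zip(faster, baseline)]

base_low, base_high = confidence_interval(baseline)
diff_low, diff_high = confidence_interval(differences)
print(f"baseline mean response: [{base_low:.3f}, {base_high:.3f}] s")
print(f"change (faster - baseline): [{diff_low:.3f}, {diff_high:.3f}] s")
print("significant at 95%:", diff_high < 0 or diff_low > 0)
```

If the interval on the differences excludes zero, the change has a statistically significant effect at that confidence level. If the baseline interval is already narrow enough for the decision at hand, longer or additional runs are unnecessary, which is how confidence intervals help limit simulation time.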