Introduction

This report compares the performance of Rational Quality Manager version 5.0 to the previous 4.0.6 release. The test objective is achieved in three steps:

1. Run version 5.0 with the standard 1-hour test using 1,000 concurrent users.

2. Run version 4.0.6 with the standard 1-hour test using 1,000 concurrent users.

3. Compare the results of the two versions.

The test is run three times for each version, and the resulting six tests are compared with each other. Three tests per version are used to get a more accurate picture, since some variation is expected between runs.
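As an illustration of the comparison step, the hypothetical sketch below aggregates per-page response times from three runs per version into a single median per page and reports the change. The page names and timings are invented for illustration; they are not measured data.

```python
# Hypothetical sketch: aggregate per-page response times (ms) from three
# runs per version into a median per page, then compare the two versions.
from statistics import median

runs_406 = [  # three runs of 4.0.6: {page name: response time in ms}
    {"List Test Plans": 850, "Open Test Case": 420},
    {"List Test Plans": 910, "Open Test Case": 450},
    {"List Test Plans": 880, "Open Test Case": 430},
]
runs_50 = [  # three runs of 5.0
    {"List Test Plans": 1040, "Open Test Case": 415},
    {"List Test Plans": 990, "Open Test Case": 440},
    {"List Test Plans": 1010, "Open Test Case": 425},
]

def per_page_median(runs):
    """Median response time per page across one version's three runs."""
    return {page: median(r[page] for r in runs) for page in runs[0]}

base, new = per_page_median(runs_406), per_page_median(runs_50)
for page in base:
    delta = (new[page] - base[page]) / base[page] * 100
    print(f"{page}: {base[page]:.0f} ms -> {new[page]:.0f} ms ({delta:+.1f}%)")
```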

Disclaimer

The information in this document is distributed AS IS. The use of this information or the implementation of any of these techniques is a customer responsibility and depends on the customer’s ability to evaluate and integrate them into the customer’s operational environment. While each item may have been reviewed by IBM for accuracy in a specific situation, there is no guarantee that the same or similar results will be obtained elsewhere. Customers attempting to adapt these techniques to their own environments do so at their own risk. Any pointers in this publication to external Web sites are provided for convenience only and do not in any manner serve as an endorsement of these Web sites. Any performance data contained in this document was determined in a controlled environment, and therefore, the results that may be obtained in other operating environments may vary significantly. Users of this document should verify the applicable data for their specific environment.

Performance is based on measurements and projections using standard IBM benchmarks in a controlled environment. The actual throughput or performance that any user will experience will vary depending upon many factors, including considerations such as the amount of multi-programming in the user’s job stream, the I/O configuration, the storage configuration, and the workload processed. Therefore, no assurance can be given that an individual user will achieve results similar to those stated here.

This testing was done as a way to compare and characterize the differences in performance between different versions of the product. The results shown here should thus be looked at as a comparison of the contrasting performance between different versions, and not as an absolute benchmark of performance.

What our tests measure

We predominantly use automated tooling such as Rational Performance Tester (RPT) to simulate a workload that would normally be generated by client software such as the Eclipse client or a web browser. All response times listed are those measured by the automated tooling, not by a real client.
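To make concrete what such tooling does, here is a minimal, hypothetical sketch of a load driver: worker threads repeatedly issue HTTP requests and record server response times. This is only an illustration of the concept, not how RPT works internally; the target URL and user counts are placeholders.

```python
# Conceptual sketch of a load generator: N worker threads each issue a
# series of HTTP requests and record the server response time for each.
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor
from statistics import median

TARGET = "http://qm-server.example.com:9443/qm"  # placeholder URL
USERS = 10           # the real tests used 1,000 concurrent users
REQUESTS_PER_USER = 5

def simulated_user(user_id):
    times = []
    for _ in range(REQUESTS_PER_USER):
        start = time.perf_counter()
        try:
            urllib.request.urlopen(TARGET, timeout=30).read()
        except OSError:
            continue  # count only successful responses
        times.append((time.perf_counter() - start) * 1000)  # ms
    return times

with ThreadPoolExecutor(max_workers=USERS) as pool:
    results = list(pool.map(simulated_user, range(USERS)))

all_times = [t for user_times in results for t in user_times]
if all_times:
    print(f"{len(all_times)} responses, median {median(all_times):.0f} ms")
```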

The diagram below describes, at a very high level, which aspects of the entire end-to-end experience (human end user to server and back again) our performance tests simulate. The tests described in this article simulate a large part of the end-to-end transaction, as indicated. The performance tests include some simulation of browser rendering and of network latency between the simulated browser client and the application server stack.

Findings

Performance goals

Verify that there are no performance regressions between the current release and the prior release with 1,000 concurrent users, using the workload described below.

Findings

According to the test results, the response times of some pages in 5.0 are degraded compared to 4.0.6.

Topology

The specifications of the machines under test are listed in the table below. Server tuning details are listed in Appendix A.

| Function | Number of Machines | Machine Type | CPU / Machine | Total # of CPU vCores / Machine | Memory / Machine | Disk | Disk Capacity | Network Interface | OS and Version |
|---|---|---|---|---|---|---|---|---|---|
| Proxy Server (IBM HTTP Server and WebSphere Plugin) | 1 | IBM System x3250 M4 | 1 x Intel Xeon E3-1240 3.4GHz (quad-core) | 8 | 16GB | RAID 1 -- SAS Disk x 2 | 299GB | Gigabit Ethernet | Red Hat Enterprise Linux Server release 6.5 |
| JTS Server | 1 | IBM System x3550 M4 | 2 x Intel Xeon E5-2640 2.5GHz (six-core) | 24 | 32GB | RAID 5 -- SAS Disk x 2 | 897GB | Gigabit Ethernet | Red Hat Enterprise Linux Server release 6.5 |
| QM Server | 1 | IBM System x3550 M4 | 2 x Intel Xeon E5-2640 2.5GHz (six-core) | 24 | 32GB | RAID 5 -- SAS Disk x 2 | 897GB | Gigabit Ethernet | Red Hat Enterprise Linux Server release 6.5 |
| Database Server | 1 | IBM System x3650 M4 | 2 x Intel Xeon E5-2640 2.5GHz (six-core) | 24 | 64GB | RAID 5 -- SAS Disk x 2 | 2.4TB | Gigabit Ethernet | Red Hat Enterprise Linux Server release 6.1 |
| RPT Workbench | 1 | IBM System x3550 M4 | 2 x Intel Xeon E5-2640 2.5GHz (six-core) | 24 | 32GB | RAID 5 -- SAS Disk x 2 | 897GB | Gigabit Ethernet | Red Hat Enterprise Linux Server release 6.4 |
| RPT Agents | 6 | VM image | 4 x Intel Xeon X5650 CPU (1-core 2.67GHz) | 1 | 2GB | N/A | 30GB | Gigabit Ethernet | Red Hat Enterprise Linux Server release 6.5 |
| Network Switches | N/A | Cisco 2960G-24TC-L | N/A | N/A | N/A | N/A | N/A | Gigabit Ethernet | 24 Ethernet 10/100/1000 ports |

N/A: Not applicable.
vCores = Cores with hyperthreading

Network connectivity

All server machines and test clients are located on the same subnet. The LAN has a maximum bandwidth of 1000 Mbps and a ping latency of less than 0.3 ms.
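As an aside, a latency figure like this can be sanity-checked from any Linux host on the subnet with a script along the following lines; the target address is a placeholder.

```python
# Print the round-trip-time summary from a 10-packet ping, assuming a
# Linux host where `ping -c` is available. The host is a placeholder.
import subprocess

host = "192.168.1.10"  # placeholder address of a server under test
out = subprocess.run(
    ["ping", "-c", "10", host], capture_output=True, text=True
).stdout
for line in out.splitlines():
    if "rtt" in line:
        print(line)  # e.g. rtt min/avg/max/mdev = 0.21/0.27/0.33/0.04 ms
```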

Data volume and shape

The artifacts were contained in one large project for a total of 579,142 artifacts.

The repository contained the following data:

50 test plans

30,000 test scripts

30,000 test cases

120,000 test case execution records

360,000 test case results

3,000 test suites

5,000 work items (defects)

200 test environments

600 test phases

30 build definitions

6,262 execution sequences

3,000 test suite execution records

15,000 test suite execution results

6,000 build records

Database size = 15 GB

QM index size = 1.3 GB

Methodology

Rational Performance Tester (RPT) was used to simulate the workload created using the web client. Each simulated user completed a random use case from a set of available use cases. An RPT script was created for each use case. The scripts are organized by pages, and each page represents a user action.

The workload is role-based: each of the areas defined under the sequence of actions is separated into an individual user group within an RPT schedule. The settings of the RPT schedule are shown below:

| Use case | Description | Number of users |
|---|---|---|
| Edit Test Environment | The user lists all test environments, then selects one of the environments and modifies it. | 15 |
| Edit Test Plan | List all test plans; from the query result, open a test plan for editing, add a test case to the test plan, edit a few other sections of the test plan, and then save the test plan. | |
| Edit Test Case | The user searches for a test case by name; the test case is then opened in the editor, and a test script is added to the test case (the user clicks Next a few times, exercising the server-side paging feature, before selecting the test script). The test case is then saved. | |

Response time comparison

The median response time provided more consistent results than the average response time. Because of the high variance between tests, where some tasks occasionally take much longer to run (such as when the server is under heavy load), the average response time is less representative. The following tables and charts therefore mainly report median values.
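To see why, consider a toy example: a single slow outlier (for instance, a request served while the server is under heavy load) pulls the mean up sharply but barely moves the median. The numbers below are illustrative, not measured data.

```python
# Demonstrate the median's robustness to outliers relative to the mean.
from statistics import mean, median

samples = [410, 425, 430, 445, 460]     # typical page times in ms
with_outlier = samples + [6200]         # one request hit a busy server

print(f"mean   : {mean(samples):7.1f} -> {mean(with_outlier):7.1f} ms")
print(f"median : {median(samples):7.1f} -> {median(with_outlier):7.1f} ms")
```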

Summary

A total of 91 pages were measured in the performance runs. The table below summarizes all of the degraded pages, that is, pages whose response times in 5.0 are longer than in 4.0.6:

| Degradation percent range | Count |
|---|---|
| Total | 83 |
| >=100% | 13 |
| >=50% and < 100% | 32 |
| >=20% and < 50% | 24 |
| >=10% and < 20% | 8 |
| >=0% and < 10% | 6 |

If we exclude the pages whose response times are under 1 second or whose degradation is under 20%, we are left with the following:

| Degradation percent range | Count |
|---|---|
| >=20% | 10 |

Some page performance degradations could therefore be noticeable to users.
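For reference, the buckets above can be derived from per-page medians along the following lines. The page names and timings in this sketch are invented, and the exact filter used in the report (which version's response time is compared against the 1-second threshold) is an assumption.

```python
# Bucket per-page degradations and apply the report's filter.
# `pages` maps page name -> (4.0.6 median ms, 5.0 median ms); made-up data.
pages = {
    "List Test Plans": (850, 1900),
    "Open Test Case": (420, 460),
    "Save Test Script": (300, 510),
}

def degradation(old, new):
    return (new - old) / old * 100

buckets = {">=100%": 0, ">=50%": 0, ">=20%": 0, ">=10%": 0, ">=0%": 0}
noticeable = []
for page, (old, new) in pages.items():
    d = degradation(old, new)
    if d < 0:
        continue  # page got faster; not a degradation
    if d >= 100:
        buckets[">=100%"] += 1
    elif d >= 50:
        buckets[">=50%"] += 1
    elif d >= 20:
        buckets[">=20%"] += 1
    elif d >= 10:
        buckets[">=10%"] += 1
    else:
        buckets[">=0%"] += 1
    # report filter (assumed): keep pages at >= 1 s and >= 20% degradation
    if new >= 1000 and d >= 20:
        noticeable.append(page)

print(buckets)
print("noticeable to users:", noticeable)
```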

The numbers in the following charts include all of the pages for all of the scripts that ran.

Results

Resource utilization

Resource utilization for 5.0

[Charts: CPU, disk, and memory utilization for the QM server and database server]

Garbage collection

Verbose garbage collection is enabled to create the GC logs. The GC logs show very little variation between runs, and there is no discernible difference between versions. Below is one example of the output from the GC log for each application.

[Charts: verbose GC log output for QM and JTS]
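As a rough illustration, GC pause times can be summarized from a verbose GC log with a script like the one below. It assumes the IBM J9 XML-style verbose GC format, in which collection operations carry a timems attribute; the pattern would need adjusting for other JVMs, and the log path is a placeholder.

```python
# Summarize GC pause times from a verbose GC log, assuming the IBM J9
# XML-style format where operations record timems="...". Placeholder path.
import re
from statistics import mean, median

PAUSE = re.compile(r'timems="([\d.]+)"')

pauses = []
with open("verbosegc.log") as log:  # placeholder log file path
    for line in log:
        m = PAUSE.search(line)
        if m:
            pauses.append(float(m.group(1)))

if pauses:
    print(f"{len(pauses)} GC operations, "
          f"median {median(pauses):.1f} ms, mean {mean(pauses):.1f} ms, "
          f"max {max(pauses):.1f} ms")
```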

NOTE

For all use case comparison charts, the unit is milliseconds, and smaller values are better.