Introducing the Wait Times Report

The Wait Times Report was the first report I got to work on. The report measures how long jobs wait in the queue before starting. More specifically, it measures the difference between the timestamp of the change that generated a job and the timestamp at which that job is assigned to a free slave.

The report is generated per pool: build pool, try build pool and test pool. For more details on the report contents, jump ahead to Report Contents.

It also allows the specification of a timeframe for the jobs (starttime and endtime as UNIX timestamps). If these parameters are not specified, the defaults are used: endtime will be the server’s current timestamp and starttime 24 hours before (i.e. the last 24 hours).
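As a minimal sketch of how these defaults could be resolved (the function name resolve_timeframe and the now parameter are my own illustration, not part of buildapi; I also assume that a missing starttime is taken relative to endtime):

```python
import time

DAY_SECONDS = 24 * 60 * 60


def resolve_timeframe(starttime=None, endtime=None, now=None):
    """Return (starttime, endtime) as UNIX timestamps, filling in defaults.

    Hypothetical helper: `now` only exists so the current time can be
    injected when testing.
    """
    if endtime is None:
        # Default endtime: the server's current timestamp.
        endtime = now if now is not None else time.time()
    if starttime is None:
        # Default starttime: 24 hours before endtime (assumption).
        starttime = endtime - DAY_SECONDS
    return starttime, endtime
```

Calling resolve_timeframe() with no arguments gives the last 24 hours; passing both values returns them unchanged.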

To see exactly how jobs are selected from the Scheduler Database, and what restrictions are applied on them, see Wait Times Query.

Wait Time E-Mails

The Wait Time e-mails are sent by fetching and parsing the JSON format of these reports (found at <report_url>?<report_params>&format=json).
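A rough sketch of the fetching side, using only the Python standard library. The helper names and the example URL are placeholders of mine, and I make no assumption here about the layout of the returned JSON:

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_json_url(report_url, report_params):
    """Build <report_url>?<report_params>&format=json."""
    params = dict(report_params)
    params["format"] = "json"  # ask for the JSON format of the report
    return "{}?{}".format(report_url, urlencode(params))


def fetch_report(report_url, report_params):
    """Fetch the report and parse it into Python objects."""
    with urlopen(build_json_url(report_url, report_params)) as response:
        return json.load(response)
```

For example, build_json_url("http://example.invalid/waittimes", {"starttime": 1, "endtime": 2}) produces a URL ending in ?starttime=1&endtime=2&format=json.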

Report Contents

The report measures how long jobs wait in the queue before starting, considering all jobs in one build pool, submitted in a specified timeframe (several other filters are applied too).

The report groups jobs’ wait times in blocks of mpb minutes, for example: 0-15, 15-30, 30-45,… are the first 3 blocks, where a block has 15 minutes (mpb=15). For each of these blocks, the report counts how many jobs had their wait time in that interval.

Let’s say we have:

0-15     44    88%
15-30     5    10%
30-45     1     2%

In the report above, we have 50 jobs, of which 44 (88% of all jobs registered) waited between 0 and 15 minutes, 5 (10%) waited between 15 and 30 minutes, and only 1 (2%) waited between 30 and 45 minutes.
For a real, more detailed example, scroll down to Wait Times Example.
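The grouping above can be sketched in a few lines of Python. This is only an illustration under my own assumptions (wait times given in seconds, and a wait of exactly 15 minutes falling into the 15-30 block); the function names are hypothetical and not taken from buildapi:

```python
from collections import Counter


def block_start(wait_seconds, mpb=15):
    """Start (in minutes) of the block that a wait time falls into."""
    return int(wait_seconds // 60 // mpb) * mpb


def summarize(wait_times_seconds, mpb=15):
    """Return (block label, job count, percentage) rows, ordered by block."""
    counts = Counter(block_start(w, mpb) for w in wait_times_seconds)
    total = sum(counts.values())
    return [
        ("{}-{}".format(start, start + mpb), n, round(100.0 * n / total, 2))
        for start, n in sorted(counts.items())
    ]


# The small example from the text: 44 + 5 + 1 = 50 jobs.
waits = [5 * 60] * 44 + [20 * 60] * 5 + [40 * 60]
rows = summarize(waits)
```

Here rows holds the same three lines as the small table: 44 jobs (88%), 5 jobs (10%) and 1 job (2%).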

The same stats are also computed broken down by platform (linux, linux64, fedora, snowleopard, xp, …; for the complete list see buildapi.model.util.PLATFORMS_BUILDERNAME).

Report Python Class
The Wait Times Report Python class can be found at buildapi.model.waittimes.WaitTimesReport.

Constructing the Report
The report is computed by calling buildapi.model.waittimes.GetWaitTimes. This function calls buildapi.model.waittimes.WaitTimesQuery, which handles the logic of selecting only the jobs of interest. See Wait Times Query post for further details.

Each job is added to the report one by one, and the report stats are updated at the same time.
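To illustrate the idea of updating the stats as each job is added, here is a simplified sketch. It is not the real buildapi.model.waittimes.WaitTimesReport class; the class name, the add signature and the attribute names are all hypothetical:

```python
from collections import defaultdict


class WaitTimesSketch:
    """Toy report that keeps overall and per-platform block counts."""

    def __init__(self, mpb=15):
        self.mpb = mpb
        self.total = 0
        self.blocks = defaultdict(int)  # block start -> job count
        self.platform_blocks = defaultdict(lambda: defaultdict(int))

    def add(self, platform, change_time, start_time):
        """Add one job; both timestamps are UNIX timestamps in seconds."""
        wait_minutes = (start_time - change_time) // 60
        block = int(wait_minutes // self.mpb) * self.mpb
        # All stats are updated in the same step the job is added.
        self.total += 1
        self.blocks[block] += 1
        self.platform_blocks[platform][block] += 1

    def percentage(self, block):
        """Share of all jobs whose wait time fell into the given block."""
        if not self.total:
            return 0.0
        return 100.0 * self.blocks.get(block, 0) / self.total
```

Because the counters are updated inside add, the totals and percentages are available at any point without a second pass over the jobs.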

Other Report Info:

unknownbuilders – excluded builders, like l10n

otherplatforms – platforms not found in known platforms, and not excluded

pending – jobs that have not started yet (still waiting)

has_no_changes – jobs that have no change, like nightly builds

Example

Wait Times for August 6th, 2010, for try build pool. The report online looks like this:

We can see the wait times were bad that day: only 58.84% (752) of the jobs waited between 0 and 15 minutes, 5.24% (64) waited between 15 and 30 minutes, and over 28% (362) waited more than 60 minutes (blue table on the left)! The green tables on the right show the numbers broken down by platform.

The overall wait times (blue table on the left) are also displayed as charts, broken down by time intervals (int_size = 2 hours):

Chart 1 - Percentage Stacked Chart

Chart 1 displays the percentage of each wait time block per time interval. For example, in the 2:00-4:00 interval, around 50% of the jobs waited less than 15 minutes (blue), around 30% waited 15 to 30 minutes (red), 20% waited 30 to 45 minutes (orange), and no jobs waited more than 45 minutes. You can see that from 2PM (14:00) wait times started getting really bad, and from 6PM to 8PM the majority of jobs waited more than 60 minutes (purple block)!
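The numbers behind a chart like this can be sketched as follows. This is my own illustration, not the report's charting code: I assume each job is placed in a time interval by its submission timestamp, and the function name and input format are hypothetical:

```python
from collections import defaultdict


def stacked_percentages(jobs, int_size=2 * 3600, mpb=15):
    """Percentage of each wait time block per time interval.

    jobs: iterable of (submit_timestamp, wait_seconds) pairs.
    Returns {interval start: {block start in minutes: percentage}}.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for submit_ts, wait_seconds in jobs:
        interval = int(submit_ts // int_size) * int_size
        block = int(wait_seconds // 60 // mpb) * mpb
        counts[interval][block] += 1

    result = {}
    for interval, blocks in counts.items():
        total = sum(blocks.values())
        # Within one interval the percentages add up to 100.
        result[interval] = {b: 100.0 * n / total for b, n in blocks.items()}
    return result
```

Each inner dictionary corresponds to one stacked bar of the chart.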