Month: May 2007

So I came across Google Patents the other day and thought that it was a pretty cool idea! Searching for whether a patent already exists had been a pretty hard job in the past, so having a readily available resource such as Google is a fantastic step forwards.

Especially in light of the recent Microsoft ‘FUD’.

Here’s a patent I came across whilst browsing around the various database related patents that exist out there:

Abstract
The present invention provides a merchant system for online shopping and merchandising. The merchant system architecture provides great flexibility for a merchant to adapt the merchant system to their existing business practices, promotions and databases. The merchant system includes a dynamic page generator, a configurable order processing module and a database module capable of retrieving data from the database without regard to its schema. The present invention enables merchants to create electronic orders which are easily adaptable for different sales situations. The order processing module includes multiple configurable stages to process a merchant’s electronic orders. The merchant system is capable of generating pages dynamically using templates having embedded directives. The database module and the dynamic page generator allow merchants to modify their databases and page displays without having to reengineer the merchant system.

Wow, it looks to me like Microsoft patented the online shopping cart! There are a lot of interesting patents that Microsoft seems to have taken out. Google makes searching for these relatively easy.

When I first joined MySQL, one of the things that was evident was that the Support Engineers spent quite some time on customer issues that were focused on performance tuning. Performance tuning issues generally start with an engineer requesting a bunch of information from the customer, such as:

SHOW GLOBAL VARIABLES

SHOW GLOBAL STATUS (a number of times, to give us some rate information)

SHOW FULL PROCESSLIST

SHOW INNODB STATUS (if InnoDB is widely being used)

vmstat output for a number of short periods

iostat -dx output for a number of short periods

A lot of this output is pretty easy to go through when you know what you are looking for. However, where we spend a lot of our time is looking through the SHOW GLOBAL STATUS output, trying to piece together the rates of change etc. so that we can get more insight into what is hurting the database.
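To give an idea of what "piecing together the rates of change" means in practice, here is a minimal Python sketch. It is not the StatPack code, just an illustration: the function name status_rates is made up for this example, the numbers are invented, and it assumes each snapshot has already been loaded into a dict of counter name to integer value.

```python
def status_rates(before, after, elapsed_seconds):
    """Return per-second rates for counters present in both snapshots.

    Hypothetical helper for illustration only; `before` and `after` are
    dicts of status variable name -> integer value.
    """
    rates = {}
    for name, new_value in after.items():
        old_value = before.get(name)
        if old_value is None:
            continue  # variable missing from the earlier snapshot
        delta = new_value - old_value
        if delta < 0:
            # A counter going backwards suggests a restart or FLUSH STATUS
            # between the two snapshots, so no sensible rate exists.
            continue
        rates[name] = delta / elapsed_seconds
    return rates


# Two invented snapshots taken 60 seconds apart (per the Uptime counter).
before = {"Questions": 1000, "Com_select": 600, "Uptime": 100}
after = {"Questions": 1600, "Com_select": 900, "Uptime": 160}

elapsed = after["Uptime"] - before["Uptime"]
rates = status_rates(before, after, elapsed)
print(rates["Questions"], rates["Com_select"])  # 10.0 5.0
```

Note that this only makes sense for cumulative counters. Values such as Threads_connected are point-in-time gauges, and a rate of change for those would need to be read differently.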

I also had never used Python – and wanted to give it a try – and said some time early last year “One of these days I’m going to create a Python script that aggregates all of this stuff for us“. Well, towards the end of last year (and with some extra stuff added early this year), I actually went ahead and took the dive and created such a script. Now, I know there are some scripts out there that already do some of this from other people (such as the pretty good one put together over on hackmysql.com) – however these are scripts that all connect to a running database, and get stats to aggregate in that way. This really didn’t help us in the Support Group – as we needed customers to send us stats in files generally (although sometimes we do connect and troubleshoot that way). It also didn’t serve the purpose of my wanting to try Python 🙂

I’ve reached a point with this script now where I’m ready to shove it out to the world to use as they wish, and request feedback on it. The current version I am releasing here is actually a little behind – I’m working on an ‘interactive’ version as well (more like mysqlreport et al), but more on that later.

Input files can contain many snapshots of SHOW STATUS within the same file, and you can also pass in any number of files in chronological order (or a mixture of the two). StatPack also understands status files with both “batch” and “non-batch” outputs. That is to say it currently can parse files of both of the following formats:

There will be one report for each period reported within the “Number of Snapshots” section (less one, as initially we need 2 snapshots to start aggregation). That is, if it reports 4 snapshot periods, you will get 3 reports based on the differences between each pair of consecutive snapshots.
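As a rough sketch of the snapshot handling described above (again, not the actual StatPack implementation), the following Python splits a file's contents into snapshots and pairs up consecutive ones. It rests on two assumptions: that "batch" output is tab-separated and "non-batch" output is the boxed table drawn by the mysql client, and that every snapshot begins with a Variable_name / Value header row. The function name parse_snapshots and the sample data are invented for the example.

```python
def parse_snapshots(text):
    """Split raw SHOW STATUS output into a list of {name: int} snapshots.

    Hypothetical helper: a new snapshot is assumed to start at each
    'Variable_name / Value' header row, in either output format.
    """
    snapshots = []
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("+"):
            continue  # blank line or table border
        if line.startswith("|"):
            # Table ("non-batch") row: | name | value |
            fields = [f.strip() for f in line.strip("|").split("|")]
        else:
            # Tab-separated ("batch") row: name<TAB>value
            fields = line.split("\t")
        if len(fields) != 2:
            continue
        name, value = fields
        if name == "Variable_name":
            current = {}
            snapshots.append(current)
            continue
        if current is None:
            continue  # data before any header row is ignored
        try:
            current[name] = int(value)
        except ValueError:
            pass  # skip non-numeric values such as ON/OFF
    return snapshots


# Three invented snapshots: two in batch format, one in table format.
sample = "\n".join([
    "Variable_name\tValue",
    "Questions\t100",
    "Variable_name\tValue",
    "Questions\t250",
    "+---------------+-------+",
    "| Variable_name | Value |",
    "+---------------+-------+",
    "| Questions     | 450   |",
    "+---------------+-------+",
])

snapshots = parse_snapshots(sample)
# N snapshots give N - 1 consecutive pairs, hence N - 1 reports.
deltas = [b["Questions"] - a["Questions"]
          for a, b in zip(snapshots, snapshots[1:])]
print(len(snapshots), deltas)  # 3 [150, 200]
```

Pairing with zip(snapshots, snapshots[1:]) is what produces "one less" report than there are snapshots: each report covers the gap between two neighbours.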

Now, as I’ve learned a little more Python I’ve realized I’ve done some things that could… be done better… 🙂 I’m working on ‘Version 2’, which changes a fair bit of the way things are done, and also adds connecting to a running server to generate a report from as well. Currently this is what it supports:

-h --host Host for MySQL server to connect to in interactive mode
-P --port Port the MySQL server is running on
-S --socket UNIX domain socket to use (instead of -h and -P)
-u --user User to connect to the MySQL server with
-p --password Password for the user
-d --defaults-file Defaults file to read options from (reads [client] group by default)
-i --interval Length of each collection interval
-c --report-count Count of intervals to collect and aggregate
-r --report-file File to output report data to
-s --status-file File to output raw SHOW STATUS data to
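
For anyone curious how an option list like this might be wired up in Python, here is a sketch using the standard argparse module. This is an illustration, not the Version 2 code: the program name "statpack", the integer types for port, interval and report count, and the separate --help flag are all assumptions. One wrinkle worth showing is that -h is normally reserved for help, so the built-in help option has to be switched off before -h can mean --host.

```python
import argparse


def build_parser():
    # add_help=False frees up -h so it can be used for --host instead.
    parser = argparse.ArgumentParser(prog="statpack", add_help=False)
    parser.add_argument("--help", action="help",
                        help="Show this help message and exit")
    parser.add_argument("-h", "--host",
                        help="Host for MySQL server to connect to")
    parser.add_argument("-P", "--port", type=int,
                        help="Port the MySQL server is running on")
    parser.add_argument("-S", "--socket",
                        help="UNIX domain socket to use")
    parser.add_argument("-u", "--user",
                        help="User to connect to the MySQL server with")
    parser.add_argument("-p", "--password",
                        help="Password for the user")
    parser.add_argument("-d", "--defaults-file",
                        help="Defaults file to read options from")
    # Integer values are an assumption; the option list gives no units.
    parser.add_argument("-i", "--interval", type=int,
                        help="Length of each collection interval")
    parser.add_argument("-c", "--report-count", type=int,
                        help="Count of intervals to collect and aggregate")
    parser.add_argument("-r", "--report-file",
                        help="File to output report data to")
    parser.add_argument("-s", "--status-file",
                        help="File to output raw SHOW STATUS data to")
    return parser


# Example invocation with made-up values.
args = build_parser().parse_args(
    ["-h", "db1.example.com", "-P", "3306", "-i", "30", "-c", "4"]
)
print(args.host, args.port, args.interval, args.report_count)
```

Short options are case-sensitive here, so -P/-p and -S/-s stay distinct, matching the list above.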

I’m still ironing out a few things, but wanted to get this last version out for any feedback that anybody would like to provide before I release the next version! You can download the ‘current’ version from http://markleith.co.uk/dl/statpack-v1.tar.gz.