I had to write some reports for some legacy software today and I was unpleasantly surprised with the results of my SQL queries. I was selecting dollar values and summing them to for the monthly spending of certain individuals. Easy enough right? I wrote a query something like this :

SELECT SUM(t.money_spent) as sum_of_spent,
c.customer_name
from transactions t
join customers c on t.customer_id=c.customer_id
group by customer_name order by c.customer_name asc

I ended up getting numerical values that were 10 decimal places long with seemingly random numbers. After checking to make sure the database didn’t have any odd entries I stumbled on this bug report.
The ‘money_spent’ column had a data type of float, which is a waste, but I still don’t think that it should sum up incorrectly. When I select individual values I get proper two decimal results.
Apparently floats and doubles use floating point math, which deals with approximate values for numbers and can thus result in confusion like this. It seems that it isn’t really possible to store 0.1 in a column of type float. You can only store 0.00999999977648258. This behavior is a little silly but easily fixed by using the ROUND() function :

SELECT SUM(ROUND(t.money_spent)) as sum_of_money_spent,
c.customer_name from transactions t
join customers c on t.customer_id=c.customer_id
group by customer_name order by c.customer_name asc

If you have root access to a linux server and you don’t have the root mysql password, but need it, then you can easily reset the root mysql password in just a few commands. These commands probably differ depending on what linux distro you use. I was using Ubuntu 9.04 (Jaunty Jackalope) when I wrote this.

Now we restart the mysql server with the ‘skip-grant-tables’ option which basically allows anyone to do whatever they like. It’s usually preferable to include the ‘skip-networking’ option so that only localhost (you) have access to the naked database.

My webhost recently moved all my sites to a ‘stabilization’ server because my sites were using far to much CPU time and Memory. After reviewing the logs it looked like some bot from India decided to repeatedly scrape one of my sites in it’s entirety without any delays between requests. So the support team over there either requires me to correct the problem or upgrade to a dedicated server plan at ridiculous costs.

Since I didn’t really think that there was a problem I emailed back about the single IP address that was causing all the issues and took steps to prevent requests from that IP address from accessing the site. The support team replied saying that my usage was still high and that I still needed to correct the problem. A little frustrated, I did some research on how to improve my site’s load time and hopefully reduce CPU and memory usage.

Most of my sites use wordpress so I found a large number of articles geared specifically to optimizing wordpress blogs. Before I tried anything I backed up my entire public_html directory and did a dump of all my mySQL databases (took almost 20 minutes for the dump).

Dealing with Plugins
The first thing I did was upgrade all my plugins. Most wordpress plugins allow you to upgrade automatically so all you really have to do is click a button and all the work is done for you. I also deactivated and deleted a surprising number of plugins that I haven’t really had any use for recently. Apparently a lot of free plugins can cause large amounts of unneccesary load on your server due to the authors not really knowing or caring how well their software performs.

Dealing with spam bots
I have been using the Akismet plugin for awhile and it has been reporting large amounts of spam comments and pingbacks. It’s not really something that most people worry about because the spam is automatically deleted after a period of time. It does however increase server load, especially if it’s in the thousands of messages a day. I found this little mod_rewrite snippet to deny any blatent spammers that don’t have a proper referer :

Cache and Compress
Since most of my pages rarely change it’s silly to generate every page for every request dynamically. After some reading I decided to use WP Super Cache to help optimize my WordPress sites. Of course just enabling Super Cache in the WP Super Cache plugin didn’t really improve load times for the end user but it should reduce server load immensely. What did improve load times drastically was the Super Cache Compression. This was a little more involved to get going but if you’re comfortable with copying and pasting code into a .htaccess file then it shouldn’t be difficult as long as your host supports mod_mime, mod_rewrite, and mod_deflate.

After going through all that, my sites now average at about half the load time they used to. Hopefully my web host feels that I’ve done enough to get off the ‘stabilization’ server so I don’t have to transfer all my stuff to another company.

When I was creating a table in postgres earlier today I needed a unique identifier to use as a foreign key in other tables. There were no unique fields in the postgres table so I wanted to create an auto increment column type. I was surprised after browsing the field types in pgAdmin III that there is no ‘auto increment’ data type in postgres. I was expecting one because I’m so used to using mysql for the actual creation of databases. It seems that when using postgres we are expected to use either sequences for the very specifics or the serial data type for the basic +1 auto increment integer.

It is a lot easier to just let the postgres database handle the creation and automation by specifying the field as a SERIAL type. Sequences are mostly useful when you want to increment the integer by a certain value like if you’re using master-master replication and don’t want to worry about race conditions.

With either a sequence or a serial field type it is much easier to insert data than with manually updated integer because you don’t need to include anything about the field in your insert statements.

Gary Pendergast writes “Monty Widenius, the ‘father’ of MySQL, has created the the Open Database Alliance, with the aim of becoming the industry hub for the MySQL open source database. He wants to unify all MySQL-related development and services, providing a potential solution to the fragmentation and uncertainty facing the communities, businesses and technical experts involved with MySQL, following the news of the Oracle acquisition of Sun.” Related to this, an anonymous reader writes that “MySQL has announced a project to refactor MySQL to be a more Drizzle-like database.”

PL/SQL Guy writes “The Kindle has a number of “remote kill” flags built in to the hardware that, among other things, allow the text-to-speech function to be disabled at any time on a book-by-book basis. ‘Beginning yesterday, Random House Publishers began to disable text-to-speech remotely. The TTS function has apparently been remotely disabled in over 40 works so far.’ But what no one at Amazon will discuss is what other flags are lurking in the Kindle format: is there a “read only once” flag? A “no turning the pages backwards” flag?”

PL/SQL Guy writes “A Wisconsin appeals court ruled Thursday that police can attach GPS trackers to cars to secretly track anybody’s movements without obtaining search warrants. As the law currently stands, the court said police can mount GPS on cars to track people without violating their constitutional rights — even if the drivers aren’t suspects. Officers do not need to get warrants beforehand because GPS tracking does not involve a search or a seizure, wrote Madison Judge Paul Lundsten.”