My web host recently moved all my sites to a ‘stabilization’ server because they were using far too much CPU time and memory. After reviewing the logs, it looked like a bot from India had decided to repeatedly scrape one of my sites in its entirety without any delay between requests. The support team there is now requiring me either to correct the problem or to upgrade to a dedicated server plan at ridiculous cost.

Since I didn’t really think there was a problem, I emailed back about the single IP address that was causing all the issues and took steps to block requests from that address. The support team replied that my usage was still high and that I still needed to correct the problem. A little frustrated, I did some research on how to improve my sites’ load time and, hopefully, reduce CPU and memory usage.

Most of my sites run WordPress, so I found a large number of articles geared specifically toward optimizing WordPress blogs. Before I tried anything, I backed up my entire public_html directory and did a dump of all my MySQL databases (the dump took almost 20 minutes).

Dealing with Plugins
The first thing I did was upgrade all my plugins. Most WordPress plugins can be upgraded automatically, so all you really have to do is click a button and the work is done for you. I also deactivated and deleted a surprising number of plugins that I hadn’t had any use for recently. Apparently a lot of free plugins can put large amounts of unnecessary load on your server because their authors don’t really know, or care, how well their software performs.

Dealing with spam bots
I have been using the Akismet plugin for a while, and it has been reporting large numbers of spam comments and pingbacks. It’s not something most people worry about, because the spam is automatically deleted after a period of time. It does, however, increase server load, especially when it runs to thousands of messages a day. I found this little mod_rewrite snippet to deny blatant spammers that don’t send a proper referer:
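(The exact snippet I used isn’t reproduced here, but a common version of the technique looks like the following — it rejects comment POSTs that arrive with a blank or foreign referer, a telltale sign of an automated spam bot. Replace example\.com with your own domain.)

```apache
RewriteEngine On
# Only inspect comment form submissions.
RewriteCond %{REQUEST_METHOD} POST
RewriteCond %{REQUEST_URI} wp-comments-post\.php
# Block requests whose referer isn't this site, or that have no user agent.
RewriteCond %{HTTP_REFERER} !example\.com [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^$
# Bounce the bot back to itself instead of serving the request.
RewriteRule .* http://%{REMOTE_ADDR}/ [R=301,L]
```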

Cache and Compress
Since most of my pages rarely change, it’s silly to generate every page dynamically on every request. After some reading I decided to use WP Super Cache to help optimize my WordPress sites. Simply enabling caching in the WP Super Cache plugin didn’t noticeably improve load times for the end user, but it should reduce server load immensely. What did improve load times drastically was Super Cache compression. This was a little more involved to set up, but if you’re comfortable copying and pasting code into a .htaccess file it shouldn’t be difficult, as long as your host supports mod_mime, mod_rewrite, and mod_deflate.
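(The rules WP Super Cache asks you to paste vary by version, so consult the plugin’s own instructions; but as a minimal illustration of the mod_deflate side, a fragment like this in .htaccess compresses text responses on the fly for browsers that accept gzip:)

```apache
<IfModule mod_deflate.c>
  # Gzip HTML, CSS, and JavaScript before sending them to the browser.
  AddOutputFilterByType DEFLATE text/html text/css text/javascript application/javascript
</IfModule>
```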

After going through all that, my sites now load in about half the time they used to. Hopefully my web host feels that I’ve done enough to get off the ‘stabilization’ server so I don’t have to transfer all my stuff to another company.

When I was creating a table in Postgres earlier today, I needed a unique identifier to use as a foreign key in other tables. There were no unique fields in the table, so I wanted to create an auto-increment column. I was surprised, after browsing the field types in pgAdmin III, to find that there is no ‘auto increment’ data type in Postgres. I was expecting one because I’m so used to using MySQL for the actual creation of databases. It seems that in Postgres you are expected to use either a sequence, when you need fine-grained control, or the SERIAL data type for a basic +1 auto-incrementing integer.

It is a lot easier to just let the database handle the creation and automation by declaring the field as a SERIAL type. Sequences are mostly useful when you want to increment the integer by a specific step, for example in master-master replication, where you don’t want two servers racing to generate the same value.

With either a sequence or a SERIAL field it is much easier to insert data than with a manually updated integer column, because you don’t need to mention the field in your INSERT statements at all.
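To sketch both approaches (table and column names here are made up for illustration):

```sql
-- SERIAL creates and wires up a backing sequence automatically.
CREATE TABLE customers (
    id   SERIAL PRIMARY KEY,  -- basic +1 auto-incrementing integer
    name TEXT NOT NULL
);

-- No need to mention "id" at all; Postgres fills it in.
INSERT INTO customers (name) VALUES ('Alice');

-- A standalone sequence gives finer control, e.g. stepping by 2 so
-- two replicated masters never hand out the same value.
CREATE SEQUENCE orders_id_seq START 1 INCREMENT BY 2;
CREATE TABLE orders (
    id     INTEGER PRIMARY KEY DEFAULT nextval('orders_id_seq'),
    amount NUMERIC
);
```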

nandemoari writes “Hackers have reportedly infiltrated restricted computer databases at the University of California Berkeley, putting the private data of 160,000 students, alumni, and others at risk. According to UC Berkeley, computer administrators determined on April 9, 2009 that electronic databases in University Health Services had been breached by overseas criminals. The break-ins began in October 2008. Information contained on the breached databases included Social Security numbers, health insurance information, and non-treatment medical information such as records of immunization and names of treating physicians.”

stoolpigeon writes “Princess Ruruna, of the Kingdom of Kod, has a problem. Her parents, the King and Queen, have left to travel abroad. Ruruna has been left to manage the nation’s fruit business. Much is at stake; Kod is known as “The Country of Fruit.” Ruruna is not happy though, as she is swamped by paperwork and information overload. A mysterious book, sent by her father, contains Tico the fairy. Tico and the supernatural book are going to help Princess Ruruna solve her problems with the power of the database. This is the setting for all that takes place in The Manga Guide to Databases. If you are like me and learned things like normalization and set operations from a rather dry textbook, you may be quite entertained by the contents of this book. If you would like to teach others about creating and using relational databases and you want it to be fun, this book may be exactly what you need.” Read below for the rest of JR’s review.

CurtMonash writes “Web analytics databases are getting even larger. eBay now has a 6 1/2 petabyte warehouse running on Greenplum — user data — to go with its more established 2 1/2 petabyte Teradata system. Between the two databases, the metrics are enormous — 17 trillion rows, 150 billion new rows per day, millions of queries per day, and so on. Meanwhile, Facebook has 2 1/2 petabytes managed by Hadoop, not running on a conventional DBMS at all, Yahoo has over a petabyte (on a homegrown system), and Fox/MySpace has two different multi-hundred terabyte systems (Greenplum and Aster Data nCluster). eBay and Fox are the two Greenplum customers I wrote about last August, when they both seemed to be headed to the petabyte range in a hurry. These are basically all web log/clickstream databases, except that network event data is even more voluminous than the pure clickstream stuff.”

Mike writes “Starting this month, the Federal Bureau of Investigation will join 15 states that collect DNA samples from those awaiting trial and will also collect DNA from detained immigrants. For example, this year California began taking DNA upon arrest and expects to nearly double the growth rate of its database, to 390,000 profiles a year, up from 200,000. Until now, the federal government genetically tracked only convicts; now, however, law enforcement officials are expanding their collection of DNA to include millions of people who have only been arrested or detained, but not yet convicted. The move, intended to ‘help solve more crimes,’ is raising concerns about the privacy of petty offenders and people who are presumed innocent.”

snydeq writes “Fatal Exception’s Neil McAllister believes Oracle is next in line to make a play for Sun now that IBM has withdrawn its offer. Dismissing server market arguments in favor of Cisco or Dell as suitors, McAllister suggests that MySQL, ZFS, DTrace, and Java make Sun an even better asset to Oracle than to IBM. MySQL as a complement to Oracle’s existing database business would make sense, given Oracle’s 2005 purchase of Innobase, and with ‘the long history of Oracle databases on Solaris servers, it might actually see owning Solaris as an asset,’ McAllister writes. But the ‘crown jewel’ of the deal would be Java. ‘It’s almost impossible to overestimate the importance of Java to Oracle. Java has become the backbone of Oracle’s middleware strategy,’ McAllister contends.”

A quarter of all government databases are illegal and should be scrapped or redesigned, a report has claimed. The Joseph Rowntree Reform Trust says storing information leads to vulnerable people, such as young black men, single parents and children, being victimised.

There are several ways to back up MySQL data. In this article we’ll look at how to back up your databases using different methods, and we’ll also see how to set up an automatic backup solution to make the process easier.
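As a starting point, a simple nightly backup can be built around mysqldump. This is only a sketch — the database name, user, and paths below are placeholders you would replace with your own, and it assumes the MySQL client tools are installed:

```shell
#!/bin/sh
# Nightly MySQL backup sketch. DB_NAME, DB_USER, DB_PASSWORD, and
# BACKUP_DIR are placeholders -- substitute your own values.
DB_NAME="mydatabase"
DB_USER="backupuser"
BACKUP_DIR="$HOME/backups"

mkdir -p "$BACKUP_DIR"

# Produce a timestamped, compressed dump,
# e.g. mydatabase-2009-04-15.sql.gz
STAMP=$(date +%F)
mysqldump -u "$DB_USER" -p"$DB_PASSWORD" "$DB_NAME" \
  | gzip > "$BACKUP_DIR/$DB_NAME-$STAMP.sql.gz"
```

To automate it, a crontab entry such as `0 3 * * * /home/user/mysql-backup.sh` would run the script every night at 3 a.m.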