Google

Almost everyone uses Google, and most of us use it a lot. Google’s main product, the search engine, is probably also their most demanding.

In a typical day I probably do about 50 to 100 Google searches. That sounds like a lot, but half of them would probably be for a single topic that is difficult to find. I don’t think that I do that many Google searches overall, because I generally know what I’m doing and when I find what I need I spend a lot of time reading it. I’m sure that many people do a lot more.

Each Google search takes a few seconds to complete (or maybe more if it’s an image search and I’m on a slow link), but I think it’s safe to assume that more than a few seconds of CPU time are involved. How much work would each Google search take if performed on a single system? Presumably Google uses the RAM of many systems as cache, which gives a result more like a NUMA system than one regular server working for a longer time, so there is no way of asking how long a Google search would take on a single server. But Google must have some ratio of servers to the rate of incoming requests; it’s almost certainly a closely guarded secret, but we can make some guesses. If the core Google user base comprises people who average 100 searches per day then we can estimate the amount of server use per search from the number of servers Google would run. It’s safe to assume that Google doesn’t plan to buy one server for every person on the planet and that they want users to significantly outnumber servers. So even core users should only take a fraction of the resources that one server adds to the pool.

So each of those 100 searches probably takes more than 1 second of server use. But they almost certainly take a lot less than 864 seconds each (the server use if Google had one server for every 100 daily requests, which would imply one server for each of the heavier users). Maybe it takes 10 seconds of server use (CPU, disk, or network – whichever is the bottleneck) to complete one search request. That would mean that if the Google network averaged 50% utilisation they would have 86400*0.5/10/100 == 43 users per server for the core user base who average 100 daily requests. If there are 80M core users that would be about 2M servers, and then maybe something like another 4M servers for the rest of the world.
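To make the guesswork explicit, here’s the same back-of-envelope calculation as a few lines of Python; the 10 second cost per search, the 50% utilisation, and the 100 searches per day are the guesses from above, not real Google figures:

    # Back-of-envelope estimate using the guessed figures above (not real Google numbers).
    SECONDS_PER_DAY = 86400
    utilisation = 0.5            # assume the server fleet averages 50% busy
    seconds_per_search = 10      # guessed server time (CPU/disk/network) per search
    searches_per_day = 100       # a heavy "core" user

    users_per_server = SECONDS_PER_DAY * utilisation / (seconds_per_search * searches_per_day)
    print(users_per_server)                  # 43.2 core users per server

    core_users = 80e6
    print(core_users / users_per_server)     # ~1.85 million servers for the core users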

So I could be using 1000 seconds of server time per day on Google searches. I also have a Gmail account which probably uses a few seconds for storing email and giving it to Fetchmail, and I have a bunch of Android devices which use Google calendars, play store, etc. The total Google server use on my behalf for everything other than search is probably a rounding error.

But I could be out by an order of magnitude: if it only took 1 second of server use per Google search then I would be at 100 server seconds per day and Google would only need one server for every 430 users like me.

Google also serves lots of adverts on web sites that I visit. I presume that serving the adverts doesn’t take many resources by Google standards, but accounting for it, paying the people who host content, and detecting fraud probably takes some significant resources.

Other Big Services

There are many people who spend hours per day using services such as Facebook. No matter how I try to estimate the server requirements it’s probably going to be fairly wrong. But I’ll make a guess at a minute of server time per hour. So someone who averages 3 hours of social networking per day (which probably isn’t that uncommon) would be using 180 seconds of server time.

Personal Servers

The server that hosts my blog is reasonably powerful and has two other people as core users, so it could count as 33% of a fairly powerful server in my name. But if we are counting server use per USER then most of the resources of my blog server should be divided among the readers. My blog has about 10,000 people casually reading it through Planet syndication, which could mean that each person who casually reads my blog has 1/30,000 of a server allocated to them for that. Another way of considering it is that 10% of a server (8640 seconds per day) is covered by me maintaining my blog and writing posts, 20% is for users who visit my blog directly, and 3% is for the users who just see a Planet feed. That would mean that a Planet reader gets 1/330,000 of a server (250ms per day) and someone who reads directly gets 1/50,000 of a server (1.72s per day), as I have about 10,000 people visiting my blog directly in a month.
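The per-reader numbers fall out of a couple of divisions; here is a small sketch of the arithmetic using the 10,000 reader figures guessed above:

    # Share of one blog server per reader, using the guessed percentages above.
    SECONDS_PER_DAY = 86400

    planet_share = 0.03 / 10000      # 3% of a server split between ~10,000 Planet readers
    print(planet_share)              # 3e-06, about 1/330,000 of a server
    print(SECONDS_PER_DAY * planet_share)    # ~0.26 seconds per day (the ~250ms above)

    direct_share = 0.20 / 10000      # 20% of a server split between ~10,000 direct readers
    print(direct_share)              # 2e-05, 1/50,000 of a server
    print(SECONDS_PER_DAY * direct_share)    # ~1.73 seconds per day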

My mail server is also shared by a dozen or so people, so maybe that counts as 5% of a server for me, or 4320 seconds per day. Then there’s the server I use for SE Linux development (including my Play Machine) and a server I use as a DNS secondary and as a shell server for various testing and proxying.

Other People’s Servers

If every reader of a Planet instance like Planet Debian and Planet Linux Australia counts as 1/330,000 of a server for their usage of my blog, then how should I count my own use of blogs? I tend to read blogs written by the type of people who like to run things themselves, so there would be a lot of fairly under-utilised servers running blogs. Through Planet Debian and Planet Linux Australia I could be reading 100 or more blogs which are run in the same manner as mine, and in a typical day I probably directly visit a dozen blogs that are run in such a manner. This could give me 50 seconds of server time per day for blog reading.

Home Servers

I have a file server at home which is also a desktop system for my wife. In terms of buying and running systems that doesn’t count as an extra server as she needs to have a desktop system anyway and using bigger disks doesn’t make much difference to the power use (7W is the difference between a RAID-1 server and a single disk desktop system). I also have a PC running as an Internet gateway and firewall.

Running servers at home isn’t making that much of an impact on my computer power use as there is only one dedicated 24*7 server and that is reasonably low power. But having two desktop systems on 24*7 is a significant factor.

Where Power is Used/Wasted

No matter how things are counted or what numbers we make up it seems clear that having a desktop system running 24*7 is the biggest use of power that will be assigned to one person. Making PCs more energy efficient through better hardware design and better OS support for suspending would be the best way of saving energy. Nothing that can be done at the server side can compare.

Running a server that is only really used by three people is a significant waste by the standards of the NYT article. Of course the thing is that Hetzner is really cheap (and I’m not contributing any money) so there isn’t a great incentive to be more efficient in this regard. Even if I allocate some portion of the server use to blog readers then there’s still a significant portion that has to be assigned to me for my choice to not use a managed service. Running a mail server for a small number of users and running a DNS server and a SE Linux development server are all ways of wasting more power. But the vast majority of the population don’t have the skills to run their own server directly, so this sort of use doesn’t affect the average power use for the population.

Nothing else really matters. No matter what Google does in terms of power use it just doesn’t matter when compared to all the desktop systems running 24*7. Small companies may be less efficient, but that will be due to issues of how to share servers among more people and the fact that below a certain limit you can’t save money by using less resources – particularly if you pay people to develop software.

Conclusion

I blame Intel for most of the power waste. Android phones and tablets can do some amazing things, which is hardly surprising as by almost every measure they are more powerful than the desktop systems we were all using 10 years ago and by many measures they beat desktop systems from 5 years ago. The same technology should be available in affordable desktop systems.

I’d like to have a desktop system running Debian based on a multi-core ARM CPU that can drive a monitor at better than FullHD resolution and which uses so little power that it is passively cooled almost all the time. A 64bit ARM system with 8G of RAM, a GPU that can decode video (with full Linux driver support), and a fast SSD should compete well enough with typical desktop systems on performance while being quiet, reliable, and energy efficient.

Finally please note that most of this post relies on just making stuff up. I don’t think that this is wrong given the NYT article that started this. I also think that my estimates are good enough to draw some sensible conclusions.

Vodafone doesn’t offer free calls to other Vodafone customers unless you are on a $30 per month plan, but that plan only gives 500MB of data measured in 12KB increments so that’s going to be expensive. Also Vodafone have had some quality problems recently so I’m not going to link to them.

Telstra are really expensive, their web site is poorly designed, and they tell me to use Windows or a Mac. Anyone who spends most of their time in urban areas shouldn’t consider them; the only reason for using Telstra is their coverage of rural areas.

Internode have a new mobile phone service based on the Optus network which offers good value for money [4]. They start with a $10 per month plan that includes $165 of calls and SMS. The call cost is $0.90 per minute plus $0.35 flagfall and the SMS cost is $0.25. It also includes 100MB of data charged in 1KB increments. The $20 per month plan from Internode includes $450 of calls, $1000 of free calls to other Internode mobile phones, and 1.5G of data transfer. Internode has a $15 charge for sale and delivery of the SIM. Internode also offer 150GB of free “social networking” traffic; I wonder whether it would be viable to tunnel some other protocol over Twitter or Facebook…

Conclusion

My usage pattern includes a reasonably large number of calls to my wife and more than 500MB of data use every month. For this pattern the Internode plan is the cheapest for me and for my wife. It seems that a large portion of the phone using population who use the Internet a lot would find this to be an ideal plan.

TPG is another good option, particularly for people who use TPG ADSL as they get a discount on the call rates and free calls to their land-line.

It seems to me that anyone who uses a mobile phone enough that a pre-paid option isn’t cheaper and who doesn’t need the coverage that only Telstra can provide will be best served by Internode or TPG.

I plan to transition to Internode some time after my current Virgin contract ends. I will probably delay the transition until the contracts for some of my wife’s relatives expire. If we all migrate at the same time then we keep getting free calls to each other – my relatives don’t use mobile phones much so there’s no money to save on calling them for free.

18 months ago when I signed up with Virgin Mobile [1] the data transfer quotas were 200MB on the $29 per month plan and 1.5G on the $39 per month plan. About 4 months ago when I checked the prices the amounts of data had gone up on the same plans (2.25G for $39 per month from memory). Now $39 per month gets only 500MB! It seems that recently Virgin has significantly reduced their value for money.

Virgin does have an option to pay an extra $10 per month for 2GB of data if you sign up for 24 months. That is reasonably good value; when I first signed up with Virgin I paid $39 per month to get extra data transfer, but now I could use the $29 plan for phone access and spend $10 per month on data with a Wifi gateway device.

On top of this the phone plans aren’t nearly as good value as they used to be. When I signed up with Virgin the Sony Ericsson Xperia X10 was “free” on a $29 plan, and at the time that was a hell of a phone. I believe that the Samsung Galaxy S3 currently occupies a similar market position to the one that the Xperia X10 did 18 months ago, so it shouldn’t be much more expensive. But Virgin are offering the Galaxy S3 for $21 extra per month over 24 months on the $29 plan – a total cost of ($29+$21)*24==$1200 – while offering the same amount of calls and data transfer for $19 per month ($19*24==$456) when you don’t get a phone. This makes the effective price of a Galaxy S3 $1200-$456==$744 while Kogan [2] sells the same phone for $519 + postage!

The cheapest phone that Virgin is offering is a Galaxy S2 for $5 per month on a $29 plan, which when compared to $19 per month for the same plan without a phone makes the phone cost ($5+$10)*24==$360 (the $10 being the difference between the $29 and $19 plans). Kogan sells the Galaxy S2 for $399, so there’s a possibility of a Virgin plan saving some money over buying a phone from Kogan. But given the choice of a $360 Galaxy S2 from Virgin and the Kogan prices of $839 for a Galaxy Note 2, $349 for a Galaxy Nexus, $469 for a Galaxy Note, $529 for a Galaxy S3, and $219 for a HTC One V, I find it difficult to imagine that anyone would think that the $360 Galaxy S2 is the best option.
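The effective handset prices above are just the plan-with-phone total minus the same plan without a phone; a quick check in Python, using the Virgin prices listed above:

    # Effective handset price = (plan with phone) - (same plan without a phone) over 24 months.
    months = 24
    no_phone_plan = 19 * months            # $456 for the $19 plan with the same calls and data

    s3_plan = (29 + 21) * months           # $1200 for the $29 plan plus $21/month for a Galaxy S3
    print(s3_plan - no_phone_plan)         # $744 effective price (Kogan sells it for $519)

    s2_plan = (29 + 5) * months            # $816 for the $29 plan plus $5/month for a Galaxy S2
    print(s2_plan - no_phone_plan)         # $360 effective price (Kogan sells it for $399)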

I’ve previously investigated dual-sim phones for cheap calls and data [3] but they didn’t seem like good value at the time because the “free” phones offered by the telcos used to be a good deal. Now it seems that none of the telcos are offering good deals on phones so with my needs the way to go would be to buy a Samsung Galaxy S3 or Samsung Galaxy Note 2 from Kogan, and get the $19 plan from Virgin – probably with a $10 per month extra fee to get an extra 2GB of data. For my wife the best option would be to keep using the Xperia X10 on a $19 per month plan as she doesn’t seem to have any problem with the Xperia X10 that justifies spending hundreds of dollars.

I idly considered getting a portable Wifi-3G device to use a cheap pre-paid 3G data option ($10 per month) and a cheap phone plan without data (maybe $10 per month), but decided that it’s not worth the effort. The Virgin $19 plan gives me free calls to my wife and more calls to other numbers than I can use, and an extra $10 gives me all the data transfer I need. Using a Wifi-3G device would involve buying the device plus the hassle of carrying and using it, which wouldn’t save money for at least a year and would be annoying.

The sudden decrease in data quotas is a real concern though. It’s an indication that the telco cartel in Australia is pushing prices up, that’s not a good sign. LTE is nice, but 3G with better quotas would be more generally useful to me.

I just had a conversation with someone who thinks that their office should have no servers.

The office in question has four servers: an Internet gateway/firewall system, the old file server (which is also a Xen server), the new file server, and the VOIP server.

The Internet gateway system could possibly be replaced by a suitably smart ADSL modem type device, but that would reduce the control over the network and wouldn’t provide much of a benefit.

The VOIP server has to be a separate system for low latency IMHO. In theory you could use a Xen DomU for running Asterisk or you could run Asterisk on the Dom0 of the file/Xen server, but that just makes things difficult. A VOIP server needs to be reliable and is something that you typically don’t want to touch once it’s working; in this case the Asterisk server has gone a few more years without upgrades than the Xen server. An Asterisk system could be replaced by a dedicated telephony device, which some people might consider to be removing a server, but really a dedicated VOIP appliance is just as much of a server as a P4 running Asterisk, only more expensive. A major advantage of a P4 running Asterisk is that you can easily replace the system at no cost if there is a hardware problem.

Having two file servers is excessive for a relatively small office. But running two servers is the common practice when one server is being replaced. The alternative is to just immediately cut things over which has the potential for a lot of people to arrive at work on Monday and find multiple things not working as desired. Having two file servers is a temporary problem.

File Servers

The first real problem when trying to remove servers from an office is the file server.

ADSL links with Annex M can theoretically upload data at 3Mb/s, which means almost 400KB/s. So if you have an office with a theoretically perfect ADSL2+ Annex M installation then you could save a 4MB file to a file server on the Internet in a little over 10 seconds, if no-one else is using the Internet connection. Note that 4MB isn’t THAT big by today’s standards; the organisation in question has many files which are considerably bigger than that. Large files include TIFF and RAW files used for high quality image processing, MS-Office documents, and data files for most accounting programs. Saving a 65MB QuickBooks file in 3 minutes (assuming that your Annex M connection is perfect and no-one else is using the Internet) would have to suck.
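For anyone who wants to check the transfer times, here’s the arithmetic; the 3Mb/s figure is the theoretical Annex M maximum mentioned above:

    # Upload times over a theoretically perfect ADSL2+ Annex M link.
    MB = 1024 ** 2
    upstream = 3e6 / 8                 # 3Mb/s upstream = 375,000 bytes/s ("almost 400KB/s")

    print(4 * MB / upstream)           # 4MB file: ~11 seconds
    print(65 * MB / upstream)          # 65MB accounting file: ~182 seconds, about 3 minutes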

Then there’s the issue of reading files. Video files (which are often used for training and promotion) are generally larger than 100MB, which would be more than 30 seconds of download time at ADSL2+ speed – but if someone sends an email to everyone in the office saying “please watch this video” then the average time to load it would be a lot more. Quickly examining my collection of Youtube downloads I found a video which averaged 590KB/s; if an office using a theoretically perfect ADSL2+ connection giving 24Mb/s (3MB/s) download speed had such a file on a remote file server then a maximum of five people could view it at one time, and only if no-one else in the office was using the Internet.

Now when the NBN is connected (which won’t happen in areas like the Melbourne CBD for at least another 3 years) it will be possible to get speeds like 100Mb/s download and 25Mb/s upload. That would allow up to 20 people to view such videos at once, and a 65MB QuickBooks file could be saved in a mere 22 seconds if everyone else was idle. Of course that relies on the size of data files remaining the same for another 3 years, which seems unlikely; currently no Youtube videos use resolutions higher than 1920*1080 (so they don’t take full advantage of a $400 Dell monitor) and there’s always potential for storing more financial data. I expect that by the time we all have 100Mb/25Mb speeds on the NBN it will be as useful to us as 24Mb/3Mb ADSL2+ Annex M speeds are today (great for home use but limited for an office full of people).
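The concurrent-viewer and NBN numbers come from the same sort of division; a sketch using the 590KB/s sample video and the link speeds mentioned above:

    # How many 590KB/s video streams fit down a shared office link.
    stream = 590                       # KB/s for the sample Youtube download
    adsl_down = 24e6 / 8 / 1000        # 24Mb/s ADSL2+ downstream = 3000 KB/s
    nbn_down = 100e6 / 8 / 1000        # 100Mb/s NBN downstream = 12500 KB/s
    print(adsl_down // stream)         # 5 concurrent viewers on ADSL2+
    print(nbn_down // stream)          # 21 concurrent viewers on the NBN (the "up to 20" above)

    nbn_up = 25e6 / 8                  # 25Mb/s NBN upstream in bytes/s
    print(65 * 1024**2 / nbn_up)       # 65MB file uploaded in ~22 seconds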

There are of course various ways of caching data, but all of them involve something which would be considered to be a “server” and I expect that all of them are more difficult to install and manage than just having a local file server.

Of course instead of crunching the numbers for ADSL speeds etc you could just think for a moment about the way that 100baseT networking to the desktop has been replaced by Gigabit networking. When people expect each workstation to have 1000Mb/s send and receive speed it seems quite obvious that one ADSL connection shared by an entire office isn’t going to work well if all the work that is done depends on it.

Management could dictate that there is to be no server in the office, but if that was to happen then the users would create file shares on their workstations so you would end up with ad-hoc servers which aren’t correctly managed or backed up. That wouldn’t be an improvement and technically wouldn’t achieve the goal of not having servers.

Home Networking Without Servers

It is becoming increasingly common to have various servers in a home network. Due to a lack of space and power and the low requirements a home file server will usually be a workstation with some big disks, but there are cheap NAS devices which some people are installing at home. I don’t recommend the cheap NAS devices, I’m merely noting that they are being used.

Home entertainment is also something that can benefit from a server. A MythTV system for recording TV and playing music has more features than a dedicated PVR box. But even the most basic PVR ($169 for a 1TB device in Aldi now) is still a fairly complex computer which would probably conflict with any aim to have a house free of servers.

The home network design of having a workstation run as a file and print server can work reasonably well as long as the desktop tasks aren’t particularly demanding (IE no games) and the system doesn’t change much (IE don’t track Debian/Testing or otherwise have new versions of software). But this is really something that only works if you only have a few workstations.

Running an office without servers seems rather silly when none of my friends even manage to have a home without a server.

Running Internet Services

Hypothetically speaking, if one was to run an office without servers then that would require running all the servers in question somewhere on the Internet. For some things this can work better than a local server; for example most of my clients who insist on running a mail server in their office would probably get a better result if they had a mail server running on Linode or Hetzner – or one of the “Hosted Exchange” offerings if they want a Windows mail server. But for a file server, even if you got around the bandwidth needed to access the files in normal use there’s still the issue of managing a remote server (which is going to take more effort and expense than for a server on the LAN).

Then there’s the issue of backups. In my previous post about Hard Drives for Backup [1] I considered some of the issues related to backing data up over the Internet. The big problem however is a complete restore, if you have even a few dozen gigs of data that you want to transfer to a remote server in a hurry it can be a difficult problem. If you have hundreds of gigs then it becomes a very difficult problem. I’m sure that I could find a Melbourne based Data Center (DC) that gives the option of bringing a USB attached SATA disk for a restore – but even that case would give a significant delay when compared to backing things up on a LAN. If a server on the office LAN breaks in the afternoon my client can make arrangements to let me work in their office in the evening to fix it, but sometimes DCs don’t allow 24*7 access and sometimes when they do allow access there are organisational problems that make it impossible when you want it (EG the people at the client company who are authorised become unavailable).

The Growth of Servers

Generally it’s a really bad idea to build a server that has exactly the hardware you need. The smart thing to do is to install more of every resource (disk, RAM, CPU, etc) than is needed and to allow expansion when possible (EG have some RAM slots and drive bays free). No matter how well you know your environment and its users you can be surprised by the way that requirements change. Buying a slightly bigger server at the start costs hardly any money but upgrading a server later will cost a lot.

Once you have a server that’s somewhat over-specced you will always find other things to run on it. Many things could be run elsewhere at some cost, but if you have unused hardware then you may as well use it. Xen and other virtualisation systems are really good in this regard as they allow you to add more services without making upgrades difficult. This means that it’s quite common to have a server that is purchased for one task but which ends up being used for many tasks.

Anyone who would aspire to an office without servers would probably regard adding extra features in such a manner to be a problem. But really if you want to allow the workers to do their jobs then it’s best to be able to add new services as needed without going through a budget approval process for each one.

Conclusion

There probably are some offices where no-one does any serious file access and everyone’s work is based around a web browser or some client software that is suited to storing data on the Internet. But for an office where the workers use traditional “Office” software such as MS-Office or Libre-Office a file server is necessary.

Some sort of telephony server is necessary no matter how you do things. If you have a traditional telephone system then you might try not to call the PABX a “server”, but really that’s what it is. Then when the traditional phone service becomes too expensive you have to consider whether to use Asterisk or a proprietary system, in either case it’s really a server.

In almost every case the issue isn’t whether to have a server in the office, but how many servers to have and how to manage them.

Hardware Reliability

Some time ago a friend told me that he bought a Sony phone in preference to a Samsung phone because he didn’t think that Samsung phones were reliable enough. I assured him that Samsung phones would be fine if you used a gel-case, but now I’m not so sure. My mother in law has a Samsung Galaxy S which now has a single crack across the face, it doesn’t appear that her phone was dropped, maybe it just bent a bit – it’s a fairly thin phone. My Galaxy S started crashing over the last few months and now many applications will crash any time I use 3G networking. Currently my Galaxy S is working well as a small Wifi tablet and hasn’t crashed since I replaced the SIM with one that has expired.

I wish that phone designers would make more solid products with bigger batteries. The fact that the Xperia X10 weighs maybe 20g more than the Galaxy S (according to Wikipedia) isn’t a problem for me. Even with the Mugen Power 1800mAh battery [4] replacing the original 1500mAh battery it’s still nowhere near the limit of the phone mass that I’m prepared to carry.

Sony Upgrades

Some time ago Sony released an Android 2.3.3 image for the Xperia X10. There is no Cyanogenmod image for the Xperia X10 because it has been locked down which greatly limits what can be done. Also Sony has a proprietary backup program on their Android 2.1 image which isn’t supported on Android 2.3.3 – this inspired my post about 5 principles of backup software [3]. Due to this pain I didn’t even try to upgrade the Xperia X10 phones for me and my wife until recently.

Before upgrading the Xperia X10 phones I was unable to use my wife’s phone. The phone didn’t seem to like recognising my touch, so long touch actions (such as unlocking the phone) were almost impossible for me. I think that this is due to the fact that I have fairly dry skin, which presumably reduces the capacitive coupling with the screen. After the upgrade both phones are usable for me, so presumably either Sony or Google upgraded the algorithms for recognising touch to work better with varying screen quality.

Comparing the Galaxy S and the Xperia X10

When I first started running Cyanogenmod on the Galaxy S I noticed that it was a lot faster than the Xperia X10 but I didn’t know why. It was documented that there had been performance improvements in Android 2.2. Now that I’m running Android 2.3.3 on the Xperia X10 I know that the performance difference is not due to the Android version. It could be due to Cyanogenmod optimisations or Sony stupidity, but it’s most likely due to hardware differences.

The Galaxy S has more RAM and storage which allows installing and running more applications. Now that I’m using the Xperia X10 for the bare minimum applications (phone calls, SMS, camera, email, ssh, and web browsing) it works quite well. I still play games on the Galaxy S and use it for more serious web browsing via Wifi. I think that the value I’m getting from the Galaxy S as a tiny wifi tablet is greater than the money I might get from selling a partially broken phone that’s been obsoleted by two significantly better models.

Conclusion

The camera on the Xperia X10 is significantly better than the one on the Galaxy S, so going back to a phone that has a great camera is a real benefit. But being slow and locked down is a real drag. I was tempted to buy a Samsung Galaxy Note or Galaxy S3, but it seemed like a bad idea to buy a phone given that my contract comes up for renewal in about 6 months which means I’ll be offered a “free” phone which while not really free is still going to be cheaper than buying a phone outright.

Also in future given the low opinion I’m now getting of smart phone reliability I’ll try and keep a small stock of spare Android phones to cover the case of broken phones.

The above is a picture of the chocolate display at Woolworths, an Australian supermarket that was formerly known as Safeway – it had the same logo as the US Safeway so there’s probably a connection. This is actually a 24.81% discount (from $3.99 down to $3.00). It’s possible that some people might consider it a legal issue to advertise something as a 25% discount when it’s 1 cent short of that (even though we haven’t had a coin smaller than 5 cents in Australia since 1991). But if they wanted to advertise a discount percentage that’s a multiple of 5% they could have made the discount price $2.99; presumably whatever factors made them set the original price at $3.99 instead of $4.00 would also apply when choosing a discount price.
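For the record, here is the discount arithmetic, using the $3.99 original price and the $3.00 sale price:

    # The actual discount on the chocolate.
    original, sale = 3.99, 3.00
    print((original - sale) / original * 100)   # ~24.81% off

    # The sale price needed for a true 25% discount.
    print(original * 0.75)                      # $2.9925, i.e. $2.99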

So the question is, do Woolworths have a strict policy of rounding down discount rates to the nearest 5% or do they just employ people who failed maths in high school?

Sometimes when discussing education people ask rhetorical questions such as “when would someone use calculus in real life”; I think that the best answer is “people who have studied calculus probably won’t write such stupid signs”. Sure, the claimed discount is technically correct (they don’t say “no more than 20% off”) and not misleading in a legal sense (it’s OK to claim less than you provide), but it’s annoyingly wrong. Well educated people don’t do that sort of thing.

As an aside, the chocolate in question is Green and Black, that’s a premium chocolate line that is Fair Trade, Organic, and very tasty. If you are in Australia then I recommend buying some because $3.00 is a good price.

SSDs have been dropping in price recently so I just bought four Intel 120G devices for $115 each. I installed the first one for my mother in law who had been complaining about system performance. Her system boot time went from 90 seconds to 20 seconds and a KDE login went from about 35 seconds to about 10 seconds. The real problem that she had reported was occasional excessive application delay, while it wasn’t possible to diagnose that properly I think it was a combination of her MUA doing synchronous writes while other programs such as Chromium were doing things. To avoid the possibility of a CPU performance problem I replaced her 1.8GHz E4300 system with a 2.66GHz E7300 that I got from a junk pile (it’s amazing what’s discarded nowadays).

I also installed a SSD in my own workstation (a 2.4GHz E4600). The boot time went down from 45s on Ext4 without an encrypted root to 27s with root on BTRFS, including the time taken to enter the encryption password (maybe about 23s excluding my typing time). The improvement wasn’t as great, but that’s because my workstation does some things on bootup that aren’t dependent on disk IO, such as enabling a bridge with STP (making every workstation a bridge is quieter than using switches). KDE login went from about 27s to about 12s and the time taken to start Chromium and have it be usable (rather than blocking on disk IO) went from 30 seconds to an almost instant response (maybe a few seconds)! Tests on another system indicate that Chromium startup could be improved a lot by purging history, but I don’t want to do that. It’s unfortunate that Chromium only supports deleting recent history (to remove incriminating entries) but doesn’t support deleting ancient history that just isn’t useful.

I didn’t try to seriously benchmark the SSD (changing from Ext4 to BTRFS on my system would significantly reduce the accuracy of the results), I have plans for doing that on more important workloads in the near future. For the moment the most casual tests have shown a significant performance benefit so it’s clear that an SSD is the correct storage option for any new workstation which doesn’t need more than 120G of storage space. $115 for SSD vs $35 for HDD is a fairly easy choice for a new system. For larger storage the price of hard drives increases more slowly than that of SSD.

In spite of the performance benefits I doubt that I will gain a real benefit from this in the next year. The time taken to install the SSD equates to the time saved over dozens of boot cycles, and given a typical workstation uptime in excess of a month it will take a long time to get through that many boots. One minor benefit is that deleting messages in Kmail is now an instant operation, which saves a little annoyance, and there will be other occasional benefits.

One significant extra benefit is that an SSD is quiet and dissipates less heat which might allow the system cooling fans to run more slowly. As noisy computers annoy me an SSD is a luxury feature. Also it’s good to test new technologies that my clients may need.

The next thing on my todo list is to do some tests of ZFS with SSD for L2ARC and ZIL.

Reflection from a glossy screen generally isn’t a problem in an office as you can usually adjust the angle of the monitor and the background lighting to avoid the worst problems. It’s also not a serious problem for a hand-held device as it’s usually easy to hold it at an angle such that nothing particularly bright is reflected.

But my experience of laptop use includes using them anywhere at any time. I’ve done a lot of coding on all forms of public transport in all weather conditions. Doing that with a Thinkpad, which has a matte surface on its screen, is often difficult but almost always possible. Doing that on a system with a mirrored display really isn’t possible. The above photo of a 15″ Macbook Pro model MD103X/A was taken at a Myer store display which was specifically designed to make the computers look their best. The overall lighting wasn’t particularly bright so that the background didn’t reflect too much, and the individual lights were diffuse to avoid dazzling point reflections. But even so the lights can be clearly seen. Note that the photo was taken with a Samsung Galaxy S, far from the best possible camera.

If I was buying a laptop that would only ever be used in the more northern parts of Europe or if I was buying a laptop to use only at home and at the office then I might consider a mirror display. But as I mostly use my laptop in mainland Australia including trips to tropical parts of Australia and I use it in all manner of locations a mirror display isn’t going to work for me.

This isn’t necessarily a bad decision by Apple designers. My observation of Macbook use includes lots of people using them only in offices and homes. Of the serious geeks who describe their laptop as My Precious hardly anyone has a Macbook, while Thinkpads seem quite popular in that market segment. I don’t think that it’s just the matte screen that attracts serious geeks to the Thinkpad, but it does seem like part of a series of design decisions (which include the past tradition of supporting hard drive removal without tools and the option of a second hard drive for RAID-1) that make Thinkpads more suitable for geeks than Macbooks. The new tradition in Apple design of gluing things together so they can never be repaired, recycled, or even have their battery replaced seems to be part of a pattern that goes against geek use. Even when Apple products are technically superior in some ways, their catering to less technical buyers makes them unsuitable for people like me.

Maybe the ability to use a Macbook as a shaving mirror could be handy, but I’d rather grow a beard and use a Thinkpad.

The general trend seems to be that cheap hard drives are increasing in capacity faster than much of the data that is commonly stored. Back in 1998 I had a 3G disk in my laptop and about 800M was used for my home directory. Now I have 6.2G used for my home directory (and another 2G in ~/src) out of the 100G capacity in my laptop. So my space usage for my home directory has increased by a factor of about 8 while my available space has increased by a factor of about 30. When I had 800M for my home directory I saved space by cropping pictures for my web site and deleting the originals (thus losing some data I would rather have today), but now I just keep everything and it still doesn’t take up much of my hard drive. Similar trends apply to most systems that I use and that I run for my clients.

Due to the availability of storage people are gratuitously using a lot of disk space. A relative recently took 10G of pictures on a holiday; her phone has 12G of internal storage so there was nothing stopping her. She might have decided that half the pictures weren’t that great if she had needed to save space, but that space is essentially free (she couldn’t buy a cheaper phone with less storage) so there’s no reason to delete any pictures.

When considering backup methods one important factor is the ability to store all of one type of data on one backup device. Having a single backup span multiple disks, tapes, etc has a dramatic impact on the ease of recovery and the potential for data loss. Currently 3TB SATA disks are really cheap and 4TB disks are available but rather expensive. Currently only one of my clients has more than 4TB of data used for one purpose (IE a single filesystem) so apart from that client a single SATA disk can backup anything that I run.

Benefits of Hard Drive Backup

When using a hard drive there is an option to make it a bootable disk in the same format as the live disk. I haven’t done this, but if you want the option of a quick recovery from a hardware failure then having a bootable disk with all the data on it is a good option. For example a server with software RAID-1 could have a backup disk that is configured as a degraded RAID-1 array.

The biggest benefit is the ability to read a disk anywhere. I’ve read many reports of tape drives being discovered to be defective at the least convenient time. With a SATA disk you can install it in any PC or put it in a USB bay if you have USB 3.0 or the performance penalty of USB 2.0 is bearable – a USB 2.0 bay is great if you want to recover a single file, but if you want terabytes in a hurry then it won’t do.

A backup on a hard drive will typically use a common filesystem. For backing up Linux servers I generally use Ext3, at some future time I will move to BTRFS as having checksums on all data is a good feature for a backup. Using a regular filesystem means that I can access the data anywhere without needing any special software, I can run programs like diff on the backup, and I can export the backup via NFS or Samba if necessary. You never know how you will need to access your backup so it’s best to keep your options open.

Hard drive backups are the best solution for files that are accidentally deleted. You can have the first line of backups on a local server (or through a filesystem like BTRFS or ZFS that supports snapshots) and files can be recovered quickly. Even a SATA disk in a USB bay is very fast for recovering a single file.

LTO tapes have a maximum capacity of 1.5TB at the moment and tape size has been increasing more slowly than disk size. Also LTO tapes have an expected lifetime of only 200 reads/writes of the entire tape. It seems to me that tapes don’t provide a great benefit unless you are backing up enough data to need a tape robot.

Problems with a Hard Drive Backup

Hard drives tend not to survive being dropped so posting a hard drive for remote storage probably isn’t a good option. This can be solved by transferring data over the Internet if the data isn’t particularly big or doesn’t change too much (I have a 400G data set backed up via rsync to another country because most of the data doesn’t change over the course of a year). Also if the data is particularly small then solid state storage (which costs about $1 per GB) is a viable option, I run more than a few servers which could be entirely backed up to a 200G SSD. $200 for a single backup of 200G of data is a bit expensive, but the potential for saving time and money on the restore means that it can be financially viable.

Some people claim that tape storage will better survive a Carrington Event than hard drives. I’m fairly dubious about the benefits of this, if a hard drive in a Faraday Cage (such as a regular safe that is earthed) is going to be destroyed then you will probably worry about security of the food supply instead of your data. Maybe I should just add a disclaimer “this backup system won’t survive a zombie apocalypse”. ;)

It’s widely regarded that tape storage lasts longer than hard drives. I doubt that this provides a real benefit as some of my personal servers are running on 20G hard drives from back when 20G was big. The fact that drives tend to last for more than 10 years combined with the fact that newer bigger drives are always being released means that important backups can be moved to bigger drives. As a general rule you should assume that anything which isn’t regularly tested doesn’t work. So whatever your backup method you should test it regularly and have multiple copies of the data to deal with the case when one copy becomes corrupt. The process of testing a backup can involve moving it to newer media.

I’ve seen it claimed that a benefit of tape storage is that part of the data can be recovered from a damaged tape. One problem with this is that part of a database often isn’t particularly useful. Another issue is that in my experience hard drives usually don’t fail entirely unless you drop them, drives usually fail a few sectors at a time.

How to Implement Hard Drive Backup

The most common need for backups is when someone deletes the wrong file. It’s usually a small restore and you want it to be an easy process. The best solution to this is to have a filesystem with snapshots such as BTRFS or ZFS. In theory it shouldn’t be too difficult to have a cron job manage snapshots, but as I’ve only just started putting BTRFS and ZFS on servers I haven’t got around to changing my backups. Snapshots won’t cover more serious problems such as hardware, software, or user errors that wipe all the disks in a server. For example the only time I lost a significant amount of data from a hosted server was when the data center staff wiped it, so obviously good off-site backups are needed.
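As a sketch of the sort of cron job I have in mind (the paths, snapshot directory, and 14 day retention are made-up examples for a BTRFS filesystem, not something I’m actually running yet):

    #!/usr/bin/python
    # Sketch of a daily cron job that rotates read-only BTRFS snapshots.
    # The paths and retention period are examples only.
    import subprocess, datetime, os

    SUBVOL = "/data"                   # subvolume to snapshot
    SNAPDIR = "/data/.snapshots"       # where dated snapshots live
    KEEP = 14                          # number of daily snapshots to keep

    today = datetime.date.today().isoformat()
    subprocess.check_call(["btrfs", "subvolume", "snapshot", "-r",
                           SUBVOL, os.path.join(SNAPDIR, today)])

    # Remove snapshots that have aged out of the retention window.
    for old in sorted(os.listdir(SNAPDIR))[:-KEEP]:
        subprocess.check_call(["btrfs", "subvolume", "delete",
                               os.path.join(SNAPDIR, old)])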

The easiest way to deal with problems that wipe a server is to have data copied to another system. For remote backups you can rsync to a local system and then use “cp -rl” or your favorite snapshot system to make a hard linked copy of the tree. A really neat feature is the ZFS ability to “send” a filesystem snapshot (or the diff between two snapshots) to a remote system [1]. Once you have regular backups on local storage you can then copy them to removable disks as often as you wish, I think I’ll have to install ZFS on some of my servers for the sole purpose of getting the “send” feature! There are NAS devices that provide similar functionality to the ZFS send/receive (maybe implemented with ZFS), but I’m not a fan of cheap NAS devices [2].
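A minimal sketch of that rsync plus “cp -rl” approach, with made-up host names and paths (and no error handling):

    #!/usr/bin/python
    # Sketch of a pull-style backup: rsync a remote tree to a local mirror,
    # then keep a dated hard-linked copy so unchanged files cost no extra space.
    # The host name and paths are examples only.
    import subprocess, datetime, os

    SRC = "root@server.example.com:/srv/data/"   # remote tree to back up
    MIRROR = "/backup/current"                   # always the latest copy
    ARCHIVE = "/backup/archive"                  # dated hard-linked snapshots (must exist)

    # 1. Bring the local mirror up to date.
    subprocess.check_call(["rsync", "-aH", "--delete", SRC, MIRROR])

    # 2. Make a hard-linked copy of the tree, "cp -rl" style.
    snap = os.path.join(ARCHIVE, datetime.date.today().isoformat())
    subprocess.check_call(["cp", "-rl", MIRROR, snap])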

It seems that the best way to address the first two needs of backup (fast local restore and resilience in the face of site failure) is to use ZFS snapshots on the server and ZFS send/receive to copy the data to another site. The next issue is that the backup server probably won’t be big enough for all the archives and you want to be able to recover from a failure on the backup server. This requires some removable storage.

The simplest removable backup is to use a SATA drive bay with eSATA and USB connectors. You use a regular filesystem like Ext3 and just copy the files on. It’s easy, cheap, and requires no special skill or software. Requiring no special skill is important, you never know who will be called on to recover from backups.

When a server is backing up another server by rsync (whether it’s in the same rack or another country) you want the backup server to be reliable. However there is no requirement for a single reliable server and sometimes having multiple backup servers will be cheaper. At current purchase prices you can buy two cheap tower systems with 4*3TB disks for less money than a single server that has redundant PSUs and other high end server features. Having two cheap servers die at once seems quite unlikely so getting two backup servers would be the better choice.

For filesystems that are bigger than 4TB a disk based backup would require backup software that handles multi part archives. One would hope that any software that is designed for tape backup would work well for this (consider a hard drive as a tape with a very fast seek), but often things don’t work as desired. If anyone knows of a good Linux backup program that supports multiple 4TB SATA disks in eSATA or USB bays then please let me know.

Conclusion

BTRFS or ZFS snapshots are the best way of recovering from simple mistakes.

ZFS send/receive seems to be the best way of synchronising updates to filesystems to other systems or sites.

ZFS should be used for all servers. Even if you don’t currently need send/receive you never know what the future requirements may be. Apart from needing huge amounts of RAM (one of my servers had OOM failures when it had a mere 4G of RAM) there doesn’t seem to be any down-side to ZFS.

I’m unsure of whether to use BTRFS for removable backup disks. The immediate up-sides are checksums on all data and meta-data and the possibility of using built-in RAID-1 so that a random bad sector is unlikely to lose data. There is also the possibility of using snapshots on a removable backup disk (if the disk contains separate files instead of an archive). The down-sides are lack of support on older systems and the fact that BTRFS is fairly new.