Sunday, January 10. 2010

Let me first say that I still stand by my previous comments that PostgreSQL is in much less need of a native 64-bit version on Windows than other databases - and less than PostgreSQL on other platforms. (Both this and the previous post talk about server-side support, having support client-side in libpq and ODBC is obviously important)

That said, I am now happy to say that with the effort of several people, it looks like we now have basic 64-bit support for PostgreSQL on Windows. During the past couple of weeks I have cleaned up and extended the patch from Tsutomu Yamada, which in itself builds on the work of several others, and applied it to cvs. As of today, you can build and run PostgreSQL on 64-bit Windows from the official repository.

This is obviously just the beginning. Unfortunately, Windows in general seems to be lagging behind quite severely on the 64-bit front, and this appears to be particularly true of the open source libraries. I've so far been unable to find 64-bit builds of many of the libraries we rely on for full functionality (gettext, zlib, libxml, libxslt, libiconv, tcl, uuid). While in theory support for these things should show up as soon as they are available, that's obviously not tested. If you know where I can download complete 64-bit binaries (including .lib and .h files for development) of any of these projects, please let me know.

Also, this is not very well tested yet. It passes all the PostgreSQL regression tests. That's pretty much all the tests I've been running on it. This is obviously a point where a lot of people can help - so please do. I have made a set of pre-alpha binaries available here. This is a bare-bones build of cvs head earlier today - things like OpenSSL that do have 64-bit binaries available is also disabled. It comes with no installer - just unzip into a directory, and use the initdb and pg_ctl commands directly. So if you have a 64-bit environment, please download and test this against your application!

Many people who develop patches for PostgreSQL don't have access to Windows machines to test their patches on. Particularly not with complete build environments for the MSVC build on them. The net result of this is that a fair amount of patches are never tested on Windows until after they are committed. For most patches this doesn't actually matter, since it's changes that don't deal with anything platform specific other than that which is already taken care of by our build system. But Windows is not Posix, so the platform differences are generally larger than between the different Unix platforms PostgreSQL builds on, and in MSVC the build system is completely different. In a non-trivial number of cases it ends up with breaking the buildfarm until somebody with access to a Windows build environment can fix it. Lucky, we have a number of machines running on the buildfarm with Windows on them, so we do catch these things long before release.

There are a couple of reasons why it's not easy for developers to have a Windows machine ready for testing, even a virtual one. For one, it requires a Windows license. In this case the same problem with availability for testing exists for other proprietary platforms such as for example Mac OSX, but it's different from all the free Linux/Unix platforms available. Second, setting up the build environment is quite complex - not at all as easy as on the most common Linux platforms for example. This second point is particularly difficult for those not used to Windows.

A third reason I noticed myself was that running the builds, and regression tests, is very very slow at least on my laptop using VirtualBox. It works, but it takes ages. For this reason, a while back I started investigating using Amazon EC2 to do my Windows builds on, for my own usage. Turns out this was a very good solution to my problem - the time for a complete rebuild on a typical EC2 instance is around 7 minutes, whereas it can easily take over 45 minutes on my laptop.

Now, EC2 provides a pretty nice way to create what's called an AMI (Amazon Machine Image) that can be shared. Using these facilities, I have created an AMI that contains Windows plus a complete PostgreSQL build environment. Since this AMI has been made public, anybody who wants to can boot up an instance of it to run tests. Each of these instances are completely independent of each other - the AMI only provides a common starting point.

I usually run these on a medium size Amazon instance. The cost for such an instance is, currently, $0.30 per hour that the instance is running. The big advantage here is that this includes the Windows license. That makes it a very cost-effective way to do quick builds and tests on Windows.

Read on for a full step-by-step instruction on how to get started with this AMI (screenshot overload warning).

Tuesday, February 19. 2008

Several completely independent people have asked me this in email already, so I figured it's about time for a blog post on the topic of PostgreSQL on 64-bit versions of Windows.

First for the simple answer: No, there is no 64-bit version of PostgreSQL for Windows. PostgreSQL has supported 64-bit environments on Unix for many years (long before we had x64 to make it available for wintel machines), but there is no Win64 port. Yet. And given the way that PostgreSQL is developed, there is no firm date for when this will be available.

Until then, though, the 32-bit version of PostgreSQL runs perfectly fine on 64-bit Windows. And doing so may actually be a better idea in some cases than a native 64-bit version. But most importantly, the architecture of PostgreSQL makes the requirement for a native 64-bit version much less than for many other competing databases. Particularly:

PostgreSQL relies on the operating system to do much of the caching of data. Since the main limitation of running a 32-bit process is the total amount of memory being addressable, a system that is based on the database engine doing all the caching cannot address all the memory on a system with say 16Gb memory. For PostgreSQL, we leave much of that caching to the operating system which can address it, as long as the OS is 64-bit.

PostgreSQL uses a multi-process architecture, not multi-thread. I'm not going to get into the discussion about which is better, just summarize by saying there are good and bad points to both of them. In this case it's a good thing. In a multi-threaded database server, all client connections share the memory space, and are again limited by the total addressable memory region. With PostgreSQL, you could easily have more than 1Gb per backend (if necessary) without running out of memory, which significantly pushes away the point when you need to go to 64-bit.

So why would it ever be better to run a 32-bit version than a 64-bit version? Simply because a 64-bit version is more memory-hungry. Every pointer used in the system takes twice as much space. Every int takes twice as much space. And if you're not actually going be able to cash in on the benefits (by addressing loads of memory, for example), it's a very real cost.

I'm well aware that this is a fairly simplistic view on the 32- vs 64-bit situation in general, but it should be enough to explain why 32-bit PostgreSQL on 64-bit Windows isn't such a bad idea after all. And of course, this only applies if you are using an x64 version of Windows - with Itanium it's a whole different story.