NEC Japan employee KaiGai Kohei has been working on a security-enhanced version of Postgres
(SEPostgreSQL) for two years using the
Security-Enhanced Linux framework. The community recently had a long
discussion (summary) about
the challenges and usefulness of adding this feature. While concerns about adding the feature are legitimate, some of those concerns have
been recently addressed by:

A year ago I wrote a tongue-in-cheek blog entry about the MySQL "soap opera".
I never anticipated the soap opera would take on the international importance it has today, with the
European Union (EU)
questioning the purchase of Sun
Microsystems by Oracle.

There have already been two excellent Postgres blog posts about this issue
(1, 2), so I just want to make three observations:

I never thought Oracle cared enough about MySQL to delay the merger, e.g. MySQL was not mentioned in the merger announcement,
and MySQL makes up a small portion of Sun's revenue. I am guessing either MySQL is more important than Oracle revealed, or Oracle is
resisting the EU objections out of principle or stubbornness. (Perhaps there is some advantage to Oracle in delaying the merger.)

Oracle users rarely migrate to MySQL, so I don't understand the anti-competitive objection to the merger. Even then-CEO Marten Mickos
said in 2003 that MySQL complements and does not compete against Oracle,
so it is hard to understand why the EU is objecting to the purchase on monopolistic grounds. As much as MySQL tried to position itself by
adding enterprise features, the effort was incomplete, and based on the limited number of people who port applications from Oracle to
MySQL, probably not very successful.

Of course, Postgres works well for both MySQL and Oracle workloads based on the number of people who port applications every day, and
Postgres will remain a viable open source database alternative no matter what happens to MySQL.

There is an argument that dual-licensing is required to create successful open source software companies. Of course
Red Hat, other GPL-only software companies, and Postgres companies are doing just fine, so it is hard to see
how this argument makes sense. It might be a requirement if you expect to pay all the MySQL developers, which is the way MySQL has
always operated in the past. Postgres and most open source projects rely much more on volunteers and on multiple companies supporting
developers who work in a cooperative fashion; MySQL was an aberration in this area.

Interestingly, some MySQL users are
suggesting
a compromise of changing MySQL to use the Apache/BSD license, like Postgres's, which certainly is easier for companies, but not a
requirement. There are a few thoughtful articles (1, 2, 3) about the licensing issue.

I think the big concern Postgres people have is that many of the things being said about this merger are either wrong or MySQL-specific
and portray open source, and open source databases specifically, in an inaccurate way.

I just posted two emails about open source community management: the
first covers the challenges of adding new patch committers to
process the increased community patch volume, and the second
explores the potential problem of companies hiring away some of our most experienced developers.

Interestingly, two levels of recursive queries were used: one recursive query's output was fed into the next recursive query,
which was then fed into the main query. For example, to create the Mandelbrot image, the first recursive query generated 100 X values,
which were fed into the second recursive query that generated (x, y) pairs with appropriate symbol numbers, which were in turn fed to the
main query and converted to ASCII characters for display.
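
The three-stage flow described above can be mimicked outside SQL. What follows is only a rough Python analogue, not the original queries: the function names, the grid bounds, the 30-row height, the iteration cap, and the character palette are all assumptions made for illustration. Each generator plays the role of one query feeding the next.

```python
# Hypothetical Python analogue of the chained recursive queries; not the original SQL.

WIDTH, HEIGHT, MAX_ITER = 100, 30, 30   # grid size and iteration cap are assumptions
SYMBOLS = " .:-=+*#%@"                  # assumed palette, indexed by symbol number


def x_values(count=WIDTH, lo=-2.0, hi=1.0):
    """Stage 1: like the first recursive query, emit `count` X values."""
    step = (hi - lo) / (count - 1)
    for i in range(count):
        yield lo + i * step


def points(xs, rows=HEIGHT, lo=-1.0, hi=1.0):
    """Stage 2: consume stage 1 and emit (x, y, symbol_number) rows."""
    xs = list(xs)
    step = (hi - lo) / (rows - 1)
    for j in range(rows):
        y = lo + j * step
        for x in xs:
            z, n = 0j, 0
            c = complex(x, y)
            # Iterate z = z*z + c until it escapes or the cap is reached.
            while abs(z) <= 2.0 and n < MAX_ITER:
                z = z * z + c
                n += 1
            # Scale the iteration count to an index into SYMBOLS.
            yield x, y, n * (len(SYMBOLS) - 1) // MAX_ITER


def render(rows_of_points, width=WIDTH):
    """Main query: convert symbol numbers to ASCII characters, one line per Y."""
    chars = [SYMBOLS[s] for _, _, s in rows_of_points]
    return ["".join(chars[i:i + width]) for i in range(0, len(chars), width)]


if __name__ == "__main__":
    print("\n".join(render(points(x_values()))))
```

In SQL the same chaining would be expressed by having each query select from the one defined before it; the generators here simply make that hand-off explicit.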

Today at PG West I saw a great
presentation by Josh Berkus about the many variants of Postgres — it was a
trip down memory lane. I was also surprised to see how many offshoots there are of Postgres; I had heard many of the names before but it
was surprising to see them all listed together.

The last release of pg_migrator was over two months ago. Since then, I have not received
any pg_migrator bug reports, and have received several successful migration reports from users. For these reasons, I assume pg_migrator
development is done for 8.4 and will not need to be resumed until we near the release of Postgres 8.5.

Also, EnterpriseDB has produced an introductory video
of me talking about pg_migrator.

When developing the Postgres backend, we are always mindful of keeping the code clean, efficient, and reliable. For some people, our
style is too conservative, but end-users appreciate our current approach. One thing we often avoid is complex coding — the pitfalls of
which are well presented in a blog post by Joel Spolsky
(Joel on Software). Talking about programmers grappling with complex designs, Joel wrote:

You see, everybody else is too afraid of looking stupid because they just can't keep enough facts in their head at once to make
multiple inheritance, or templates, or COM, or multithreading, or any of that stuff work. So they sheepishly go along with whatever
faddish programming craziness has come down from the architecture astronauts who speak at conferences and write books and articles and are
so much smarter than us that they don't realize that the stuff that they're promoting is too hard for us.

No, I am not advocating Duct Tape programming (as mentioned in the blog), but the Postgres project is always trying to steer between Duct
Tape programming and overly-complex design programming.

I have relicensed all my presentations under the Creative Commons Attribution
License, which most closely matches the BSD license used by Postgres. Previously there was no license on the presentations, meaning, I
think, all rights were reserved. (My book cannot be relicensed because the copyright is owned by the publisher, Addison-Wesley.)

I completed teaching my university database course at the end of August. The
class went more smoothly than I thought. It was similar to teaching at a conference, except it was twice a week for ten weeks. I had a
teaching assistant who graded the homework, and an experienced professor helped with exams and grading, so I focused mostly on lectures.

All the students had heard of Postgres and knew of its reputation. They all used Postgres for homework assignments, including one that
required writing an application that connected to Postgres. No one had major problems, which is a good indication that our one-click
installers are easy for new users.

Though the course is over, I am still on the Drexel faculty and will probably be involved in database and open source activities there in
the future.

Update: Many of my students were online, and this
article talks about the trend of
increased online learning.

If you have interacted with the Postgres community at all, you have experienced how civil and helpful the group is; this has helped us to
retain volunteers for many years and sustains our consistent growth. Of course, not all online communities are as civil, and illustrator
Mike Reed has created a humorous web site showing 89 of the not-so-nice inhabitants of
online forums, patterned after Dungeons and Dragons characters. Of course, none of the characters matches anyone in the Postgres forums,
but I can see hints of a few of us. What amazes me is how many of these characters we
don't have.

I just presented a talk at JBoss World with Jim Mlodgenski of EnterpriseDB. We showed the changes
necessary to allow Hibernate to work well with Postgres. I wonder if we should be doing more to
encourage middleware users to use Postgres, perhaps by creating resources so they can use Postgres more efficiently.

Update: This document contained inaccurate information and so the link to the document has been removed.

I have heard good things discussed for the past few years about using ccache to speed compiles of Postgres.
I finally got time to test it and saw my Postgres compile times drop from three minutes to one minute. Supposedly ccache always produces
the same result as a normal compile (including any compiler warnings), so I plan to use it from now on.

I realize it is a month late, but I want to say how happy I am that the Japan PostgreSQL Users Group
(JPUG) is celebrating ten years of activity. I have been
privileged to have been involved in more JPUG activities than most non-Japanese. The size and accomplishments of JPUG far exceed those of
any other Postgres user group, and the JPUG has enabled much of the advanced research and widespread adoption of Postgres in Japan today.

The big JPUG event this year is their tenth anniversary
conference. I plan on attending and I know many international Postgres developers are planning to attend as well.

You might have noticed that patch processing has changed in recent Postgres development cycles. For 8.4 development, we used wiki-based
commit fests with the goal of getting more people involved in patch
approval, and speeding patch application. For 8.5, Robert Haas has written a patch management
application that automates much of the manual adjustment
required when using the wiki; many thanks to him.

I used to manage most of the independent patches myself, but patch volume now far exceeds my ability to process them. Patch application
is now managed by a capable team, and my involvement is less critical. This will allow me to get involved in other projects, such as
pg_migrator, which I just
finished.

Last Tuesday night, I received an award as
Database Jedi Master during the Google O'Reilly Open Source Awards ceremony
(image,
image, video -
start time 3:30). Of course, the award belongs to the entire Postgres community (as I mentioned in my acceptance speech), and the glass
statue spent most of the week in the OSCON Postgres booth. The glass statue design is unique, with a *1* appearing from one angle in the
glass, and a *0* from another angle (picture), and an infinity symbol visible from the
top. The statue generated quite a bit of booth traffic from people who wanted to pick it up to study it. (If there is interest, I can
bring it with me to Postgres events.)

I was certainly surprised by the award — I always felt databases, because they are software infrastructure, didn't get enough
visibility at these events, but I had gotten used to it. This year two database people received awards, Brian Aker of
MySQL/Drizzle and me, which is a good sign. Postgres seems to be on fire this year, and this is another
indication of that.

The Postgres booth was well staffed, and we had more people stopping by to say they use Postgres than I have ever experienced before.

Interestingly, Matthew and I came home with four free phones: two Symbian phones and two
Android phones. While we don't have use for that many new cell phones,
their wifi capability makes them useful hand-held computers. The Android
G1 phones are particularly useful as hand-held ssh
terminals because they have a physical QWERTY keyboard.

This past week two bugs were found in pg_migrator: one related to the handling of
large objects, and a second related to migrating
sequences. I have removed pg_migrator 8.4 from the web site, and
posted an 8.4.1 Alpha 1 release with a clear mention of the two known bugs that exist. I hope to have these bugs fixed in the next few
weeks.

After a great amount of testing, pg_migrator 8.4 final was released last week. I have
received some successful usage reports, and no bug reports so far, so it seems all the testing paid off. Thanks to the many testers that
made pg_migrator 8.4 possible.

Two months ago I mentioned that I was going to finish proofreading the
Postgres manuals. Well, I was only able to complete proofreading the first two parts (Tutorial and the SQL Language, 350 pages) for 8.4.
I will continue working on this for 8.5. Fortunately I have not found any major mistakes so my proofreading is more of a cosmetic issue.

I have released pg_migrator release candidate 1 today. I was planning on doing the final
release this week, but the backend API that pg_migrator uses was improved in Postgres 8.4 RC1 so pg_migrator now requires an 8.4 RC1 or
later server and no longer works for Postgres 8.4 beta versions. Also, Tom Lane pointed out that pg_migrator shouldn't go final until
PostgreSQL goes final, so I will continue in RC status until PostgreSQL 8.4 final is released.

You might remember the PostgreSQL East conference that took place this past April at
Drexel University in Philadelphia. As part of the event, I and other conference leaders had dinner with the
Drexel Computer Science Department Head, Jeremy Johnson. One item we discussed was our
frustration at the limited academic adoption of Postgres, even though Postgres is better for education than many other databases used in academia.
Jeremy seemed to understand our plight, and I gave him my business card in case I could help Drexel, perhaps by doing a guest lecture
about Postgres.

Well, a month later, I got an email asking if I could teach a database course this summer; that was much more involvement than I
anticipated. I thought about it for a week, asked a few people for advice, and decided to accept. I filled out the paperwork a few days
ago so now I am officially a Drexel adjunct professor, at least for the
summer. I have already created a web site for the class. (Much of that material was provided by
the previous instructor.)

The class is a combined graduate/undergraduate class, with both traditional and online students. Classes start June 22 and last for ten
weeks. Students will use Postgres exclusively for learning SQL, homework, and projects. I hope this class not only encourages more
Postgres use at Drexel and among Drexel graduates, but also spurs other educational institutions to explore the benefits of incorporating
Postgres into their curriculum.

I just put out pg_migrator beta2, with only small changes to the install file since
beta1. The
install
file is critical to the success of pg_migrator. Unlike the Postgres backend or libpq, pg_migrator relies on many external tools and
administrator actions, so detailed, accurate install instructions are essential for reliable migration. pg_migrator is really not usable
without consulting the ten-step installation instructions.

I have not read much email since the beginning of May — that's when I started working on
pg_migrator, then I attended PGCon, and then I did a
week of training for EnterpriseDB. Only on May 30 was I able to devote full-time effort to reading
Postgres email; by that time my unread email had grown to 3,000 messages. So for a week I only read email and have now caught up. You
can actually see this in my graph of unread email.

The good news is that the Postgres community did an excellent job of addressing almost every open item reported during the month, so
there were only perhaps twenty emails I had to process. That makes my job very easy, and isn't that what it's all about
(see poster).

At PGCon a few new Postgres users mentioned how surprised they were at the professionalism of the
Postgres community. (I am not sure I want to know what they expected us to be like.) I assume they didn't mean professionalism as in
proper office attire or use of business jargon, but rather our seriousness,
dedication, and attention to detail. I think what really surprises people is that we are usually more professional in these areas than
paid programmers.

Not only do our world-wide conferences help motivate our developers, they also help people associate faces with the project and
feel confident about trusting us and Postgres with their data.

This year's PGCon was a well-oiled machine. Many of the inconveniences of the logistics and venue were
gone. Gone also was the self-consciousness of our having our own Postgres conference — it all seems natural now.

I had three people approach me asking how they could get more involved in Postgres development — certainly a good sign.

My talks about pg_migrator and
getting your patch accepted are now online. The PGCon conference team did
an amazing job getting the conference videos online in record time,
often within an hour of the presentation. The web video software they are using is also ideal for presentations because the video and
slides are presented simultaneously and the slide numbers are indexed in the video.

During the past week pg_migrator has had significant improvements. Stefan Kaltenbrunner
did a migration of a large database, and found a few bugs along the way that have been fixed, and Hiroshi Saito has ported pg_migrator to
Windows, where it has been successfully tested; he is now a pg_migrator committer.

I received a few emails thanking me for getting pg_migrator ready for 8.4. I think working on pg_migrator is similar to porting Postgres
to Windows — many people want the capability, but few want to do the tedious work to make it happen. My guess is that I have
years of pg_migrator work ahead as I adjust it for every major Postgres release. I have come to the conclusion that we can't ignore the
dump/restore major upgrade inconvenience any longer, and pg_migrator is the best solution I can foresee.

One clever facility (added when EnterpriseDB wrote the original code) is the ability for pg_migrator to call server-side functions,
specifically backend functions to help in the migration — major data format changes that couldn't have been handled by previous
binary upgrade scripts are now possible. We are already using this facility to create TOAST files in a more robust way. Another new
facility is data page conversion routines, which aren't needed for 8.4 but might be useful in the future.

I have just released an alpha-6 version of pg_migrator, and hope to release the first
beta in a week or two.

I mentioned my work on pg_migrator a week ago; today I released an
alpha version of pg_migrator. This is an overhaul of the pg_migrator code designed for
Postgres 8.2 (recently BSD-licensed by EnterpriseDB), which itself was based on the method used by the
pg_upgrade script I wrote in 1998.

I have tested it by migrating the Postgres regression database from 8.3 to 8.4. pg_migrator currently supports only 8.3 to 8.4 migration,
and does not support Microsoft Windows.

I was surprised how many old-school practices we follow that are listed in this
article (in order of appearance):

Custom sorting code, linked lists, hashes

Structured programming

Multi-threading with setjmp/longjmp

Custom memory management

Non-WYSIWYG editing platforms

Pointer math

Date conversions

Null-terminating C strings

I guess databases are one of the technologies that make regular programmers' jobs easier (or less old-school). It certainly makes sense:
why implement sorting or date arithmetic when the database can do it, but someone eventually has to do it, and it seems that is us.
I think this explains why many of our old-school practices remain.

I mentioned I was going to be enhancing pg_migrator to allow binary upgrades
from Postgres 8.3 to 8.4 (no dump/restore). Well, today is the first day I successfully migrated an 8.3 database to 8.4. I
still have much testing to do and want to add more validation checks, but if you want to see what I have so far and perhaps give it a try,
a tarball is available. I am still perhaps two weeks away from releasing a beta on
the pg_migrator web site.

There are a few Postgres items I would like to complete in the months before Postgres 8.4 Final:

Proofreading the Postgres manual: I mentioned this a year ago but
made little progress; I am now determined to finish proofreading and merge my minor wording improvements into the 8.4 final documentation.

Complete testing pg_migrator: I have resolved all open issues related to using
pg_migrator to do a binary upgrade from Postgres 8.3 to 8.4; now only testing is left.

Last week the Postgres developer community released PostgreSQL 8.4 Beta1. The Beta1
release notes contain a lot of information about upcoming 8.4 features as well as a link to the 8.4
release notes. There is a
web page with even more detailed information. There is a detailed
news article about beta1, and I am doing a
webcast about 8.4 features using a presentation I
mentioned previously.

There have been no major problems with Beta1 and few open issues, so
hopefully we can release PostgreSQL 8.4 Final in a few months.

In surprising news today, Oracle has announced they will buy Sun Microsystems.
While everyone is still trying to process the effects of such a merger
(article), the big question is how this will affect
MySQL usage and hence Postgres usage. I can't imagine MySQL being controlled by a more hated company, so I suppose this is good news for
Postgres adoption.

I anticipate this will drive many MySQL users to Postgres and Solaris users to Linux. The effect of the merger on Java and Open Office
is a cause for concern.

The price Sun paid/overpaid for MySQL is 13% of the purchase price for Sun.

Update: If Sun's cash assets are considered, the price of MySQL vs. Sun is more like 18%, but if you then consider the MySQL price to
be actually 800 million (an additional 200 million was pledged), the percentage is more like 14%.
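
The three percentages can be checked with a few lines of arithmetic. The dollar amounts below are not stated in the post; they are the widely reported figures for the two deals and should be treated as assumptions, as should the variable names.

```python
# Figures in billions of US dollars. These are widely reported amounts,
# not numbers stated in the post, so treat them as assumptions.
SUN_PRICE = 7.4          # announced Oracle purchase price for Sun
SUN_PRICE_NET = 5.6      # the same deal net of Sun's cash and debt
MYSQL_PRICE = 1.0        # Sun's announced price for MySQL
MYSQL_CASH = 0.8         # cash portion of the MySQL price


def percent(part, whole):
    """Return `part` as a percentage of `whole`."""
    return 100.0 * part / whole


headline = percent(MYSQL_PRICE, SUN_PRICE)          # MySQL vs. Sun, headline prices
net_of_cash = percent(MYSQL_PRICE, SUN_PRICE_NET)   # after removing Sun's cash
cash_only = percent(MYSQL_CASH, SUN_PRICE_NET)      # counting MySQL at 800 million

print(f"{headline:.1f}% {net_of_cash:.1f}% {cash_only:.1f}%")
```

Under those assumed figures the results land close to the 13%, 18%, and 14% quoted above.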

Although there has already been a blog entry
about the video of Robert Treat winning a netbook (time 2:30), I don't think the true drama
of the event was sufficiently highlighted. Treat's rendition of Steve Ballmer's dance
video should truly not be missed. Kudos to Robert for providing unforgettable
entertainment during PG East.

The Postgres East conference starts tomorrow, and I have offered to host a walking
tour of Philadelphia on Monday for whoever is interested. I have created a
map
of the typical highlights during my Philadelphia tours.

I am attending a 2-day Emerging Technologies Conference in Philadelphia today and
tomorrow. The best talk today was by the creator and lead developer of the
jQuery JavaScript library. The subject of his
talk was not about jQuery technology but about how he worked to grow
jQuery adoption and its open source community. I think there are lessons in his talk for the Postgres community. We have been at this for
13 years and sometimes forget the things necessary to grow; jQuery is only three years old so their experiences are fresher.

The conference also had some interesting talks about how to develop applications on the iPhone and Android, and
one about
agile development, specifically how to develop applications that can be
easily modified and improved over time, something we do well as a development community.

I have completed the first draft of the Postgres 8.4 release notes and they are now
online. It still needs additional cleanup, markup, and reordering,
but it is good enough for beta and will be continuously updated until 8.4 is released.

There was a lot of interest recently about how I create the release notes, so I have kept my intermediate versions for review:

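| No. | Stage                         | Lines | Duration |
|-----|-------------------------------|-------|----------|
| 1   | raw cvs logs since 2008-02-13 | 576k  | instant  |
| 2   | src/tools/pgcvslog -d         | 16.5k | instant  |
| 3   | remove insignificant items    | 2.7k  | 1 day    |
| 4   | research and reword items     | 1k    | 5 days   |
| 5   | group into sections           | 1k    | 4 hours  |
| 6   | add SGML markup               | 2.3k  | 1 hour   |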

As you can see, the majority of the time is spent researching and rewording every item so the final version of the release notes is easily
understood.

I have started working on the Postgres 8.4 release notes. I started with a cvs log of all commits since February 13, 2008 (576k text
lines), and ran that through 'src/tools/pgcvslog -d' which merges commit messages, removes back-patched items, and removes unnecessary
lines (result, 16k lines). I will now delete unneeded commit messages (result, 2.7k lines), reword existing ones (result: 1k lines),
collect them into categories (result: 1k lines), add SGML markup, and commit the result into our CVS SGML documentation; I should finish
by the end of the week.

I think Postgres 8.4 beta is a few weeks away. In related news, SE-Linux
integration will not be included in 8.4.

In preparation for Postgres 8.4 beta, I have written a presentation highlighting
some 8.4 features. I will also soon start compiling an interim version of the release notes. FYI, I just updated the presentation to
reflect that the hot standby capability (read-only queries on continuous archiving slaves) will not be in Postgres 8.4.

Last week I was in Andover, Massachusetts talking to EnterpriseDB customers. This week I am in Munich
teaching a 5-day source code internals course to an EnterpriseDB customer. I have not left the United States East Coast since
October of 2008, and my wife says she can tell it is time for
me to travel again. Odd that staying near home for four months seems strange when I used to not travel for years without it feeling
unusual.

The community is considering a security enhancement for Postgres 8.4. Like the port of Postgres to Windows, this is a new area for the
community, so I have solicited help from the security community
to determine if this is a useful enhancement to Postgres. I have created a wiki
to guide the discussion, but I ask that no one post to hackers about this until Wednesday, 2009-01-28 1200 GMT, when security experts will
hopefully be subscribed to hackers to aid our discussion.

Surprisingly, the patch author, KaiGai Kohei, and SE PostgreSQL are already listed on the U.S. National Security Agency (NSA)
website.

My webcast schedule for EnterpriseDB has been
solidified. I am covering three complex topics: administration, data processing, and performance. Each topic will be covered in three
one-hour webcasts that extend to the end of March. My first one in the series starts tomorrow (Tuesday) and about 300 people are
registered. Some older webcasts are now online.

While not directly related to Postgres, this
article about Mark Shuttleworth captures
well many of the goals that motivate our project. The article mentions the financial aspect of open source as
quixotic, which I first heard used about open source in an article about my first Postgres
company, Great Bridge. I remember
humorously debating with the marketing manager about whether "quixotic" was a positive or negative reference in the article. (We decided
it was both.)

I have expressed my confusion in the past about conflicting
reports of Postgres performance compared to commercial databases, specifically Oracle. On the one hand, people seem to be porting
applications from Oracle to Postgres almost every day, and the reports we get are that Postgres is within 10-15% of Oracle's performance,
sometimes faster, sometimes slower. On the other hand, some people who live to tune Oracle report that Oracle can be made to run certain
queries much faster than Postgres, like 2x, 10x, or 100x faster.

Obviously, these reports seem contradictory. I think the resolution involves the number and complexity of tuning knobs associated with
each database. Postgres takes the approach of trying to get the best performance for the most users — that means automatically
tuning many parameters and allowing users to tune only the most significant parameters that can't be automatically set, in the hope that
the easier it is to tune Postgres, the more people will do it. Oracle takes a different approach, exposing many more performance settings
for adjustment. And, as expected, because Oracle has so many settings, many people don't try to tune it, and in fact there is
disagreement among Oracle experts about how to tune them effectively; this problem exists for Postgres as well, but to a much smaller
extent.

So, I think what is happening in the contradictory reports is that the majority of people porting from Oracle to Postgres are not using
optimally tuned Oracle installations. In many ways they are not at fault because Oracle is so complicated to tune, and perhaps their
Postgres installations are well tuned because Postgres is so much simpler to tune and administer.

Another factor is that the Oracle experts who report Oracle as 2x, 10x, or 100x faster are reporting this on special-case queries where
Oracle's tuning excels, rather than for a general workload, and the tuning they are doing is far beyond that which would be done for a
typical Oracle installation.

This exposes a clear tradeoff. Adding tuning knobs helps in some queries, but the number of knobs also discourages tuning overall. For
example, suppose T is the number of tuning knobs; we can calculate I as the percentage improvement possible with those tuning knobs.

$$I = \frac{\log_{10}(T)}{100}$$

This says that each tenfold increase in tuning knobs adds the same fixed amount of improvement: going from 10 knobs to 100 only doubles the gain.

And, we can estimate P as the percentage of people who choose to tune their installations:

$$P = \frac{100}{\sqrt{T}}$$

That is, the number of users attempting tuning decreases as the knob count increases.

Therefore, the gross advantage of tuning for all installations is the percentage of people who tune their installations (P) times the
percentage benefit they get from tuning (I):

$$\frac{\log_{10}(T)}{100} \times \frac{100}{\sqrt{T}}$$

which simplifies to:

$$\frac{\log_{10}(T)}{\sqrt{T}}$$

I plotted this function and I think you will be as surprised by the graph as I was. It
initially shows increased gross improvement as the number of knobs increases, then it changes to a gradual decrease as the number of knobs
increases.
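
The rise and fall of the curve can be reproduced numerically. The sketch below evaluates only the simplified expression, log10(T) divided by sqrt(T); the function name and the upper limit of 1,000 knobs are arbitrary choices for illustration, not part of the original calculation.

```python
import math


def gross_benefit(knobs):
    """Return log10(T) / sqrt(T), the simplified product of I and P for T knobs."""
    return math.log10(knobs) / math.sqrt(knobs)


# Evaluate a range of knob counts; the upper bound of 1000 is an arbitrary choice.
values = {t: gross_benefit(t) for t in range(1, 1001)}
best = max(values, key=values.get)

for t in (1, 2, 5, best, 10, 100, 1000):
    print(f"T={t:5d}  gross={values[t]:.3f}")
print("integer knob count with the highest gross benefit:", best)
```

Setting the derivative of the expression to zero gives ln(T) = 2, so under this crude model the peak sits near T = e^2, or roughly 7.4 knobs, and the value declines slowly beyond that point.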

Now, these are all very crude, back-of-the-envelope calculations, but I think it shows something the community has known by instinct for a
while — that there is an optimal number of tuning knobs to expose, and at a certain point increasing the number of tuning knobs
decreases gross performance. Now, there will always be someone who does use all the knobs, and gets better performance because of them,
but that user is often the minority, and a shrinking minority as the number of performance knobs increases.

I mentioned
recently that we are approaching 8.4 beta. Well, the hard work has
begun. I have worked almost continually for the past week trying to get
a handle on all the open issues that need to be closed before starting
beta. It is tempting to push many of these issues to 8.5 but that leads
to sloppy software in the long term and has never been our style.

I began with 2,000 emails that I have saved over the past few months: 500 were related to
recovery, replication, and hot standby.
Another 700 were already closed. That leaves about 800 emails that are either on the commit-fest or need some kind of attention. I have
gotten great assistance from everyone during the past week as I work to get them all completed or closed.

My employer, EnterpriseDB, has scheduled me to
do webcasts of my most popular talks during the next few months; they also have several non-technical webcasts. I already did a webcast
about replication in December, and the content is online now (free
registration required).