Our web site lists many Postgres user groups, but perhaps only a dozen are active, having monthly
meetings. Why is that? What makes a Postgres user group successful? How can you help create or revitalize a user group?

The answers aren't a great mystery, but success does require hard work. I think many people who start Postgres user groups don't realize
the work necessary to make them successful. They probably think, "Oh, I will announce a user group meeting, and people will show up, and it
will be fun." Well, the "fun" part might be right, but growing a user group long-term requires lots of work — let me explain.

The first meeting is usually deceptively easy — you just announce it and curious Postgres users attend. But what next? You
can't just keep calling meetings hoping people will show up. You need to give people a tangible reason to attend future meetings. That's
where the hard work comes in:

Communication: Clear and consistent communication is a requirement for any successful user group. Many USA-based user groups use
Meetup, because it provides an easy, single point of contact for the group. The New York City Postgres User
Group's (NYPUG) web site is a good example.

Consistency: When you set the expectation of monthly user group meetings, people then plan their schedules around those meetings. If
leaders get busy and can't host expected meetings, people start to wonder if the user group is useful for them, and attendance lags. Most
successful groups need at least two leaders to avoid this problem, so one can take over if the other is busy.

Speakers: Providing meaningful content is an important part of growing attendance. In the first few meetings, it is often necessary
to bring in well-known, outside speakers to generate excitement. After the group grows, the members themselves can often be effective
meeting speakers.

Food: Let's face it — no one wants to grab a warmed-over meal from a convenience store and eat it in the car on the way to
the meeting. Provide food and advertise it so people know they can come right from work and have a nice time socializing and eating
before the meeting starts.

Determination: As the group is getting started, there will be meetings that are poorly attended, which can lead to discouragement.
Leaders should avoid scaling back and just continue going — it takes time to build momentum.

The bottom line is this — if leaders consider the user group important, then others will as well. If leaders behave in a way
that indicates it isn't a priority, the rest of the group concludes the same, and the group fades away.

Postgres uses a process-per-session architecture, which provides several benefits:

Efficient scheduling of multiple sessions across multiple CPUs by the operating system

Improved reliability

Unfortunately, this process architecture makes it difficult for a single session to utilize multiple CPUs and I/O channels. (Multiple
sessions use multiple CPUs just fine.) This is becoming more of a limitation because of recent computer architectural improvements:

More CPU cores rather than faster CPUs

Increased memory capacity

Solid-state drives with reduced I/O latency

With the traditional database bottlenecks of I/O and memory receding
(presentation), a single CPU will become more of a limiting factor
in the coming years, and is already a limitation for some workloads.

Postgres has already made some parallelism inroads. Our pg_restore
utility can use multiple parallel restore processes. We also support building the source code in parallel with
gmake.

While these are useful first steps, there are many more possible areas of parallelism:

That last item needs some explanation. Databases are inherently stage-based, moving queries through parsing, rewriting,
planning/optimizing, and execution. There isn't a whole lot of parallelism possible in those early stages, and it is difficult to pass partial results from
one stage to the next, so it is hard to implement parallelism there. It is possible, but it requires rearchitecting major parts of the
system, and the overhead of the new infrastructure often hurts reliability and diminishes the hoped-for performance benefits.

I think back to changes made in Informix 6 in 1994.
Rearchitected to be massively parallel, Informix 6 was widely rejected and many users rolled back to process-based Informix 5. Informix
7, also designed for parallelism, took years to stabilize, and by that point the company had lost much of its momentum and credibility.
Of course, we don't want that to happen to Postgres.

The Postgres approach has always been conservative, looking for ways to increase performance without sacrificing reliability or
portability. Users often arrive with invasive performance patches, but our developers are able to test patches to determine which parts
are contributing the majority of the performance improvements, and restructure the patches to minimize code disruption. There is a good
example of this from an email posted today.

We will soon reach a point where more parallelism will be a requirement, particularly to speed up single large queries — we
might be there already. Parallelism will not be added in a big-bang manner, but rather incrementally, where it can produce the
greatest impact. Our first step will probably be to create a session-helper architecture, with multiple processes waiting for requests for
parallel operations. With that in place, we can start to look at where parallelism is needed.

For a really interesting use of database parallelism, look at Tim Child's
PostgreSQL OpenCL Procedural Language.
It shows a server-side language designed to allow parallel operations on graphics processing units (GPUs). That is just one possible example of where
the multi-year effort of parallelism might take us.

Having attended 138 Postgres-related events in the past twelve years (chart), I
have done quite a bit of travel (map). Of course, many business travelers travel a lot more than I
do, but the variety, distance, and duration of my trips are somewhat unusual.

I am often asked at conferences about how I manage my heavy travel schedule. There are a few tips that keep me sane and motivated to
continue traveling, so I thought I would share them here. I especially hope this is useful for the many volunteers who give of their
time to travel as speakers to distant Postgres conferences:

Control: For me, most travel is voluntary, meaning I can decide to accept or reject a conference invitation. Psychologically, that
is an important aspect. If I have booked too much travel, I have only myself to blame.

Pace: A related issue is pacing. You might remember I recently
declined some events to recover from a heavy travel schedule.
Pushing travel past the comfort point is an easy way to start hating it and therefore burn out.

Variety: It is hard to get excited about going to the same distant city repeatedly. Variety is a great way to keep travel exciting.

Vacation Day: I always try to book an extra day in the travel city for sight-seeing. I often don't remember many details of the
conference, but I do remember those sight-seeing days, and that keeps me going and anticipating the excitement of travel. Also, my
travels usually allow me to experience the culture as a native, rather than as a tourist, which is also enlightening.

Sleep: Sleep is an often overlooked aspect of travel. A long flight where you can't sleep is a frustrating flight. A flight where
you are asleep by wheels-up and are awoken by landing instructions is a breeze. (A window seat, blow-up
travel pillow, ear plugs, eye shades, and travel socks can make
all the difference.) Also, sleeping early when on location is important, especially the night before your presentation. You need to be
sharp; you spent hours traveling — don't blow it by getting insufficient sleep. Sleep also helps prevent you
from getting sick, which is more likely and more inconvenient when traveling.

Working From Home: Because I work from home when not traveling, I have lots of travel energy. If I had to go to an office every day,
I would have a much harder time. Heck, when I have not traveled for a few months, I emit a slight whimper when driving past the airport.
(Weird, huh?) I remember waiting for a short flight and hearing a Madrid boarding announcement and thinking, "Hey, I could go over there
and be in Madrid by morning". (Yeah, odd, I know. I guess I was made for this.)

Maximize: Travel is a big time-waster, so make the most of it. I don't worry about community emails and minor things while I am away
but focus on making the most of the event by spending time with folks and being available for unscheduled things, like speaking at local
companies that use Postgres. (This usually happens after the conference is over when people realize I am around for an extra day.)
Combining trips to visit several cities is also always a win — who wants to fly back the next month?

Family: My work is not in a vacuum, and my family is certainly affected by my travel. I try to schedule things ahead of time so they
have few surprises, and I often take one of my children with me on trips. That's a win in any family's book, and it cuts down on travel
resentment. In fact, this is the first year I took all four of my children on at least one trip. (It is easier because they are
home-schooled.)

Anyway, these are my tips. I didn't start out knowing most of this, so I thought I would help people starting out with travel. In fact,
just yesterday I was advising someone who is considering travel to Japan to speak at a Postgres conference.

Done right, Postgres travel can be a meaningful part of your life and you can continue it indefinitely. Done wrong, you will soon hate it
and look for any excuse to avoid it. I hope that doesn't happen to anyone.

You might have heard the term "GUC" before, especially on the Postgres email lists. It is
documented as an acronym meaning "Grand Unified Configuration", but the
section it references doesn't mention the term "GUC" at all. Pretty
obscure, huh?

Our Grand Unified Configuration (GUC) system was created in
2000 as a way to control Postgres at various levels. You are probably most familiar with controlling Postgres via the postgresql.conf
file or using the SQL SET command, but it is much more powerful than that.

In fact, people often get into the habit of changing variables at certain levels, and never realize that these variables can be set at
more global and more local levels. Of course, some settings, like
shared_buffers can only be set
in postgresql.conf. (These entries are usually marked with the file comment "(change requires restart)".) But other settings, like
work_mem and search_path, can be set at many levels. This email
gives a very concise description:

The closest thing to global variables are GUC settings. These can be
set globally in postgresql.conf, per user and/or per-database via
ALTER ROLE/DATABASE ... [IN DATABASE ...], per session with SET, per
function via CREATE FUNCTION ... SET and finally per subtransaction
with SET LOCAL.

As I have said many times before, our GUC system seems limited at first, but once you realize that you can combine settings and set them
at various levels, it becomes a powerful tool that can meet almost any need.
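The levels described in that email can be sketched with work_mem as the example setting (the database, role, and function names below are hypothetical, and the function body is elided):

```sql
-- Cluster-wide default, set in postgresql.conf:
--   work_mem = '4MB'

-- Per-database override:
ALTER DATABASE sales SET work_mem = '32MB';

-- Per-role override:
ALTER ROLE reporting SET work_mem = '64MB';

-- Per-role-in-database override:
ALTER ROLE reporting IN DATABASE sales SET work_mem = '128MB';

-- Per-session override:
SET work_mem = '256MB';

-- Per-function override, applied only while the function runs:
-- CREATE FUNCTION big_report() RETURNS void AS $$ ... $$
--     LANGUAGE sql SET work_mem = '512MB';

-- Per-transaction override, reverted at commit or rollback:
BEGIN;
SET LOCAL work_mem = '1GB';
-- ... run the large query here ...
COMMIT;
```

More specific levels override more global ones, which is what makes combining them so flexible.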

Having given many talks about Postgres administration over the years, I have recently felt convicted that I have not given sufficient
coverage to external monitoring tools. I have always discussed how to get information out of Postgres, but not how to efficiently process
that information. I have updated my administration presentation, but I
would like to go into more detail here to atone for my previous lack of coverage.

Alerting: For alerting, it is tough to beat check_postgres and
tail_n_mail. Check_postgres runs checks on many aspects of Postgres, reporting the results via
text output, Nagios, or MRTG. Tail_n_mail scans for important messages in the server logs and emails them.

Analysis: There are a lot of analysis tools out there. Nagios is great for distributing alerts, while
Munin and Zabbix are good for graphing. In fact, some people use Nagios
for alerting and Munin for graphing. Cacti and MRTG are also popular. This
blog entry from 2009 compares several popular analysis tools.

Queries: pgFouine ("fouine" is French for weasel) is an often-overlooked tool for
analyzing SQL query traffic to find the slowest queries, the most frequent queries, and the queries with the greatest cumulative execution time
(sample report). It analyzes the queries by reading the server
logs.

Recent Postgres major releases have been packed with features (9.1,
9.0), but you might have noticed a lack of major performance improvements
in those releases. I was starting to think that we had basically addressed all the performance improvements we could.

Well, I was sure wrong. Postgres 9.2 is going to have a boat-load of performance enhancements. The inclusion in Postgres 9.2 of
index-only scans and
count(*) optimizations clearly is going to help a lot of
workloads.
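As a rough sketch of why index-only scans help such workloads, consider a hypothetical table t: once VACUUM has set its visibility map, Postgres 9.2 can often answer a count from the index alone, without touching the heap (whether the planner actually chooses an index-only scan depends on the table's statistics and visibility):

```sql
CREATE TABLE t (id integer, payload text);
INSERT INTO t SELECT i, 'row ' || i FROM generate_series(1, 100000) AS i;
CREATE INDEX t_id_idx ON t (id);
VACUUM t;  -- sets the visibility map, which index-only scans rely on

-- In 9.2, this can appear as an "Index Only Scan" in the plan,
-- skipping the heap entirely:
EXPLAIN SELECT count(*) FROM t WHERE id < 1000;
```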

Large server workloads (32+ cores) are another area that is being addressed. This has been an area where we lagged behind commercial
databases, and I was starting to think that to catch up with them, we were going to need to do a lot of very ugly, platform-specific
hacks. Well, that turns out to be false also. Changing the layout of the per-session shared memory storage has
yielded a 40% performance improvement on 32-core tests. Lock
scalability for a large number of clients was also
improved five-fold!

And there is more work being done, but not yet committed to Postgres 9.2. Inlining some of the sort comparisons has
led to a 60% sorting improvement. Improvement in MVCC snapshot
overhead has led to a 20% improvement for large workloads.

These are not the 1-2% wins we normally chase down, but major benefits. Of course, they don't apply to all workloads, but the benefits
are unmistakable. It is hard to guess everything that will be in Postgres 9.2, but it is clear that those looking for a faster Postgres
will not be disappointed.

Update: I have updated the text above to include the big lock scalability item.

After attending Postgres events in Chicago, New York City, Baltimore, Amsterdam, and Brazil during the past two months, I returned home
this week to recover. Even though I travel frequently, this pace exceeded even my ability to adjust. I will be skipping upcoming
Postgres events in Boston and
Los Angeles (for my own sanity) and am staying closer to home during the next few months.

I will be teaching a one-day training class in Philadelphia next week, and hope
to do other classes in New York City, Boston, and Washington, DC in the coming months. (That shouldn't be too hard.) I also plan on
making some modest code contributions for Postgres 9.2.

With PG Open starting next week, I begin an eight-week period that has me traveling for half that
time, visiting Chicago, New York City, Baltimore, Amsterdam, and São Paulo
(schedule). My sons are traveling with me on two of these trips, which makes
the travel burden easier on my family.

Four months ago, Greg Smith had an excellent blog entry
describing his admiration for the Intel
Solid-State Drive 320 Series because of their performance, low cost per gigabyte, and capacitors that guarantee safe writes.

Unfortunately, a significant number of drive users
reported problems with the SSDs.
(Amazon reviews)
After power failure, the drive would sometimes show a data capacity of only 8MB.

Fortunately, Intel has released a fix, so the
drives can be safely used again.

And, to conclude, here is a portrayal of the Postgres community at its best:

This newsgroup I always brag about to my other geek friends. 'Those people on
the postgres news group are the steadiest, most helpful, most logical people I
know on the Internet. They don't jump on fancy new trends nor hide technology in
market speak. Their work revolves around the best tested, theory based
practices. They just make the tech world a better place with some of the most
reliable, best supported OSS out there.' And you guys and gals ARE that
description. (email)

Part 5 of my Email Snippets series: Postgres is a community-managed
open source project, and as part of that management, humor often pokes through:

Honesty

> >> There is some amount of user code (I'm not sure precisely how
> >> much) that runs after shared memory is mapped and before the
> >> deadman switch is engaged.
> >
> > Er ... what would you define as "user code"?
>
> Our code, as opposed to the failure-inducing boatload of crap
> injected by the operating system.
Don't hold back. Tell us how you really feel ;)
(email)

Search

Saying, in effect, "search the archives with our not-super-great
search technology using keywords you didn't think of," comes off as
pretty dismissive if not downright hostile.
(email)

Wishing

That doesn't preclude also providing some more-invasive tools
that people can use when they do get into that situation; but let's
not let I-want-a-magic-pony syndrome prevent us from doing anything
at all.
(email)

We need to revisit that for 9.2 and un-reject it. It's nice
that PostgreSQL can run on my thermostat, but it isn't nice that
that's the only place where it delivers the expected level of
performance.
(email)

Marketing

... there's an interesting angle to compete with NoSQL...wait, I'm feeling some more ad copy:

PostgreSQL 9.1 lets you pick exactly the level of commit guarantee your data requires. Whether you want unlogged tables optimized only for
speed, or you need durable synchronous replication to multiple servers, you're covered — all in one database.

Need to take a shower to wash the stench of marketing off now.
(email)

Email formatting

A: Maybe because some people are too annoyed by top-posting.
Q: Why do I not get an answer to my question(s)?
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing? (email)

Part 4 of my Email Snippets series: Postgres has a well-deserved
reputation for reliability, as shown in our email postings:

60+ Ships

> Of course, none of that helps with your current issue; I would just
> hate to see you pop up here in a few months with bigger problems
> than this if it can be avoided.
We had bigger problems than this and still survived, don't worry ;)
PostgreSQL (be it 7.4.2, 7.4.19 or 8.3.3) in our 60+ vessels, is the last thing we have worried about
over the last 10 years. Generally PostgreSQL lasts even after hardware/OS have given up.
(email)

500 Machines

I had exactly the same experience 3 years ago. Complete power failure
(the stand-by generator took fire) in one small datacenter (around 500
machines). We had Oracle, SQL Server, DB2, MySQL, Progress, and of
course PostgreSQL. The only database engine that restarted with no
operation required was PostgreSQL. There were very minimal problems with
Oracle (typing recover on some instances), but we had quite a few
problems with the other engines.
(email)

Thousands

I manage thousands of databases (PostgreSQL, SQL Server, and MySQL),
and this past weekend we had a massive power surge that knocked out
two APC cabinets. Quite a few machines rebooted (and management is
taking a new look at the request for newer power cabinets heh).
Talking theory is one thing, predicting results is another...and yet
the only thing that counts is "what happens when 'worst-case-scenario'
becomes reality?"

Long story short, every single PostgreSQL machine survived the failure
with zero data corruption. I had a few issues with SQL Server
machines, and virtually every MySQL machine has required data cleanup
and table scans and tweaks to get it back to "production" status.

Part 3 of my Email Snippets series: The email lists are occasionally
introspective, which is reflected in the following philosophical email postings:

Adventure

> If you guys care about money and time why would you spend the best years of
> your life basically copying commercial products for free? Because for a
> person with higher than average IQ far less than one percent of any program
> is creative and needs some thinking and the bulk of it is just a million
> stupid details.
> ...
The work on PostgreSQL is adventure, and very good experience, very
good school for me. It's job only for people who like programming, who
like hacking, it isn't job for people, who go to office on 8 hours.
Next I use PostgreSQL for my job - and hacking on PostgreSQL put me a
perfect knowledge, perfect contacts to developers, and I can work
together with best programmers on planet. and I can create some good
things. Probably if I work on commercial projects I can have a better
money - but life is only one, and money is important, but not on top
for me - life have to be adventure!
(email)

Compensation

Here you're assuming that open source code development on large
projects like PostgreSQL is done in people's spare time. In
reality, 80-95% of such development is done by people who are paid
by their workplace to do so. In the case of PostgreSQL
developers, this pay is at least comfortable, so your assumption
that this is done uncompensated, in terms of money, is simply
wrong.

For those who do development and are not directly compensated by
their employer for doing so, there are other monetary rewards,
such as being able to put such projects on résumés/CVs, which in
turn results in better job prospects, consulting fees for
specialized knowledge, etc., etc.

"and time why would you spend the best years of your life"

That time's compensated, in many different ways, as illustrated
above. Perhaps your life is in such desperate straits that you
can devote time to nothing but acquiring money. If this is true,
I feel very sorry for you. I feel even sorrier for you if you are
not in such desperate straits, but you are nevertheless devoting
every waking hour to the pursuit of money. It's a sad and lonely
way to waste your precious days of life.
(email)

Dark Side

>> In Oracle, deferrable primary keys are enforced by non-unique indexes.
>> That seems logical,
>
> ... maybe to an Oracle guy ...
>
I humbly admit being one. I am getting used to the life without the dark
side of the force, however. I saw the light, I am saved. When the
rapture comes, I will not be left behind. However, I still have to
maintain a rather big 4-way Oracle RAC configuration and some satellite
Oracle databases.
(email)

Workflow

This is one of the reasons I love open source in general, and PostgreSQL
in particular: Tom has the bandwidth to notice these kinds of
workarounds being discussed on support lists, and turn them immediately
into improvements in the planner. Partly because (I assume, based on
the commit message) Andres's parenthetical comment red-flagged it for
him, since he knew he could trust Andres's opinion that there was
probably a planner improvement hiding here. Amazing!
(email)

EnterpriseDB is looking for beta testers for its new
Postgres Enterprise Manager. The tool
allows for monitoring, tuning, and capacity planning of multiple Postgres clusters from a single console.
Download of the beta is free, but requires email registration.

Part 2 of my Email Snippets series: In helping Postgres users, we see
many SQL queries. Some queries are well crafted, and others just need improved formatting to help clarify their intent or the cause of
the problem. However, a small subset of queries elicit strong reactions, and I have referenced some of those reactions below:

Ill

I would hate to see the rest of the design. I am ill from this tiny fragment.
(email)

Second reaction to the same idea

[ blanch... ] That's not the worst SQL code I've ever seen, but it might be in the top ten.
(email)

Spinning

For the record, the table we're having trouble inserting into is ~100 rows
with ~50 indexes on it. E.F Codd is spinning in his grave.
(email)

Read my mind

> Is there anyway to load this data without specifying a column list?
Nope, sorry, COPY lacks the "READ MY MIND" option.
(email)

Torture

And yes we do data analysis that tortures SQL, but SQL allows us to do so many things quickly and less painfully. Better to torture the
machines [than] torture ourselves.
(email)

Tables?

> Ultimately we need to think of a reporting mechanism that's a bit
> smarter than "rewrite the whole file for any update" ...
Well, we have these things called "tables". Any chance of using those?
(email)

Explosion

> Good luck finding out how to interpret the dots, but it's specified
> somewhere.
My head just exploded.
(email)

The new version of Mac OS X Server (code-named Lion) has replaced the default MySQL
install with Postgres (article):

If you're looking for the configuration for MySQL, you won't find it, either in the GUI or in the command line. That's because Apple has
replaced it with PostgreSQL, another open source database. On one hand, this is an improvement, because PostgreSQL is considered to be
more powerful than MySQL. But whereas Snow Leopard's Server Admin tool had GUI settings for MySQL, PostgreSQL is command line only in Lion
Server.

Reading over a hundred community emails a day, I see various interesting jokes, phrases, and observations whose value extends beyond the
original emails in which they appeared. Fortunately, I have saved some of these. With the northern hemisphere summer upon us, I thought
it would be a good time to post some of these to my blog in the next several weeks. I am not going to show the entire email messages nor
the author, but will link to the original email so you can see the email author and message context.

To start, I will mention some self-deprecating comments:

Programming

I just wrote my first patch, and I need to know whether I missed something
or not. I haven't used C for a really long time, so sickbags on standby,
and if you notice something really stupid don't hesitate to call me an
asshole (according to Simon Phipps that proves we are a healthy open
community).
(email)

Reference to Homer's Odyssey

Ooops, no DATE_IS_NOBEGIN/DATE_IS_NOEND in 8.3 or 8.2 ...
I heard the siren call of git cherry-pick, but should have lashed myself
to the mast.
(email)

Testing?

> I kinda doubt that a function failing in that way would pass any
> testing.
What is this "testing" thing of which you speak?
(email)

I created a 2-minute welcome video for a Postgres conference that just finished in
China
(image). (They wisely added Chinese subtitles.) If
your conference would like a welcome video, I would be glad to make one for you, and I am sure other community members would be willing to
do the same.

If you are looking for the Postgres booth at the O'Reilly Open Source Convention (OSCon) next week
— don't bother — there isn't one. Postgres has had a booth at OSCon for years, but this year we were offered a booth
too late for our community to properly staff it. We will not have a booth at the simultaneous
OSCon Data conference either, again for the same reason. Kind of odd to have an
open source database convention without a Postgres booth, but that's what's happening.

We will have a PG Day event the day before the convention, and will have a few talks
and a tutorial during the convention.

Yesterday I presented a new talk, Explaining the Postgres Query Optimizer, to the
Boston PostgreSQL Users Group, so I have now placed the
slides online. I am working on three other new talks that I will
post online once I present them at an event.

Two days ago I gave my PostgreSQL Adoption and Trends presentation at
SERPRO in Porto Alegre, Brazil. They have fortunately placed a
video of the 2.5-hour presentation online. The first half of the video is me giving the
presentation with consecutive translation into Portuguese. The second half is a question and answer session, mostly in English, which
covers many questions about the future growth of Postgres and its features.

There is a perpetual debate among database designers between surrogate keys (generated by
the database) and natural keys (supplied by the user). There was a discussion recently on
the Postgres general email list
(April thread,
May thread) on this topic. Having attended many conference
presentations on this topic, I thought I knew everything about it. However, this thread taught me a few things. What first got my
attention was this comment by
Jeff Davis:

Once you really try to define "natural" and "surrogate" keys, I think a
lot of the arguments disappear....

In particular, I think you are falsely assuming that a natural key must
be generated from an outside source (or some source outside of your
control), and is therefore not reliably unique.

You can generate your own keys, and if you hand them out to customers
and include them on paperwork, they are now a part of the reality that
your database models -- and therefore become natural keys.

This email thread doesn't solve the surrogate vs. natural key debate, but it does add some nuance to the debate that I never realized.
What struck me was the realization that the surrogate vs. natural key debate essentially hinges on how a database design communicates
with the outside world, and how it adjusts to changes in the outside world.

Another interesting insight was the requirement that databases using surrogate keys must use a uniqueness constraint to guarantee that
data linked by surrogate keys can be uniquely matched with items in the outside world, or the schema risks being meaningless. So,
basically, an effective schema must map to unique items in the real world, and must efficiently adjust to changes in the outside world
too. No wonder this is a perpetual debate.
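That insight can be shown in a small sketch (the table and column names here are hypothetical): the surrogate key links rows internally, while a UNIQUE constraint on the externally meaningful column keeps the schema anchored to the real world:

```sql
CREATE TABLE customer (
    id           serial PRIMARY KEY,    -- surrogate key, generated by the database
    customer_no  text NOT NULL UNIQUE,  -- natural key handed out to customers
    name         text NOT NULL
);

CREATE TABLE invoice (
    id           serial PRIMARY KEY,
    customer_id  integer NOT NULL REFERENCES customer (id)
);

-- Without the UNIQUE constraint on customer_no, two customer rows could
-- represent the same real-world customer, and invoices linked by the
-- surrogate key could no longer be matched unambiguously to the outside world.
```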

Pg_Upgrade often gets slighted by major Postgres developers as a "hack".
In fact, during the last PGCon, pg_upgrade earned a derogatory nine bottles of beer during the
Drinkers guide to PostgreSQL lightning talk. However, this
is actually not surprising because the philosophy behind pg_upgrade is slightly different from that of most other Postgres utilities. Most
utilities aim for high reliability, clean code, and a uniform interface to the underlying libraries and tools they use. Pg_Upgrade also
aims for high reliability and clean code, but its interface requirements are different. Rather than aiming for a uniform interface, it
aims for a stable interface, one that is unlikely to change between major Postgres versions. This often requires pg_upgrade to do
unusual operations. These are often referred to as "hacks" because the operations use tools and libraries in ways that were never
intended.

The proper solution would be to better integrate pg_upgrade into Postgres by creating interfaces for pg_upgrade at the proper abstraction
level. However, this is problematic because tighter integration between pg_upgrade and the Postgres internals makes pg_upgrade more
sensitive to changes in major Postgres versions. Also, many of these interface changes could not be back-patched (for reliability
reasons), meaning pg_upgrade couldn't rely on the proper interface existing in older versions of Postgres. For these reasons, pg_upgrade
relies almost exclusively on the public interface of Postgres, which rarely changes between major releases, and exists in all versions of
Postgres. This means pg_upgrade will always appear as a "hack" — and that is a good thing.

Having watched someone yesterday on IRC successfully use
pg_upgrade to upgrade a multi-terabyte database, I thought it would be a
good time to update everyone on the changes to pg_upgrade in Postgres 9.1 (now in beta):

Performance has been improved.

The source code has been cleaned up.

The user interface is unchanged.

Such minor changes are a good thing — it means that though Postgres 9.1 has many new features, pg_upgrade required no
adjustments for those features, which has always been the goal.

The Postgres community is in beta testing for release 9.1 — the final 9.1 release is still a few months away. This process is
familiar to many long-time Postgres observers. I have written about
our predictable release process before, and there is even a
chart showing our release history. Our release process has been done
this way for so long that many people forget why it is important.

This blog entry by Andy Lester about Perl 6
cements in my mind why our release process is so critical to our continued success. It was written 3.5 years ago, when Perl 6 was already
7.5 years in development. It makes a powerful argument that the delayed release of Perl 6
("by Christmas") has hurt
Perl adoption in several significant ways. Frankly, if Perl weren't so strong, the delay could have killed the project; it has certainly
dampened Perl adoption and fueled the momentum of competing scripting languages. Andy's blog entry states the damage even more clearly:

I don't question the desirability of Perl 6 either. I can see how, when it's finally finished, it will be an improvement
over any language available.

However, from a Communications standpoint, it's obvious that there are significant problems in communicating about perl to the world at
large. Perl 6 has been a Public Relations disaster, one that has made it harder to attract developers, other contributors, users and
companies.

Again, from a Communications/PR standpoint, our goal should be to stop shooting ourselves. And that means taking the public focus off
Perl 6 as much as possible.

So, why is the Postgres release process so important? Why are timely, reliable, and feature-rich releases so important? Because what we
are doing is radical ("An open source enterprise-class database?") and our release professionalism helps to encourage confidence in those
moving to Postgres — and we do a fine job of it.

This Wednesday, June 8, I am doing a
webcast of my
MVCC Unmasked presentation (registration required). I have two more
webcasts coming up and will announce them once they appear on the EnterpriseDB web site.

At the developer meeting at
PGCon today, a Salvation Army group was
booked in the adjacent meeting room. They started their meeting by singing songs, and that brought up the question of whether Postgres
should have a song. Postgres already has logos, conferences, web sites, and t-shirts, so maybe it is time for a Postgres song. On the
other hand, most companies don't have songs, though countries do — it would be nice to sing at conferences.

Postgres, as a community-driven open source project, lacks much of the hierarchical structure usually associated with successful
organizations. The production of high-quality Postgres software year after year certainly proves that hierarchical structure isn't
required, but how does the lack of a hierarchical structure affect long-term planning?

Well, surprisingly, we seem to be successful on that front too. A survey commissioned by Sun Microsystems in 2007 indicated five
limitations to Postgres adoption (email):

1. Easy Install

2. Simple, low-overhead replication

3. Upgrade-in-place

4. Administration & monitoring

5. Driver quality/maintenance

(This list was officially recorded at a 2009
Postgres developer meeting.)

And how is Postgres today? Josh Berkus stated it well in the above email, "It's really nice the number of the above issues we've knocked
out since 2007", and I replied, "Wow, that is amazing, like we
actually have a plan."

The email thread this spawned is quite interesting. The first
part shows confusion over where the list is recorded (we
probably could have used more organization there) and complaints about our poor wiki search capabilities (again, I guess we need help).
The second part critically examines how well we have removed
these limitations, and wonders about the next five limitations.

So, those five items seem to have been largely addressed by mostly independent contributors who worked to solve problems that both
Postgres and their organizations and customers shared. Sounds like a win to me, even if it is surprising.

(I was also able to add data about some of the earlier releases.) You can see the few cases where the code size decreased, and the two
large increases for 8.0 and 8.4. I assume the increase in 8.0 was due to point-in-time recovery, savepoints, and the Windows port. I
guess the 8.4 increase was due to the addition of the WITH clause and window functions, which I used for this analysis.
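The release-over-release comparison above can be sketched in a few lines of Python. The line counts below are hypothetical placeholders, not the real measurements (those come from the codelines script); the point is simply how the per-release deltas are derived:

```python
# Hypothetical line counts per major release -- NOT the real codelines numbers.
line_counts = {
    "7.4": 500_000,
    "8.0": 600_000,  # large jump: PITR, savepoints, Windows port
    "8.1": 610_000,
    "8.4": 750_000,  # large jump: WITH clause, window functions
}

# Compute the change in code size from each release to the next.
versions = list(line_counts)  # dicts preserve insertion order in Python 3.7+
deltas = {
    versions[i]: line_counts[versions[i]] - line_counts[versions[i - 1]]
    for i in range(1, len(versions))
}
print(deltas)
```

With these placeholder values, the two large increases (8.0 and 8.4) stand out immediately in the computed deltas.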

I just realized we will exceed one-million lines of C source code in Postgres 9.1 (currently 1,069,547; computed using
codelines).
It has been a while since I reported on our C source code line count, so here is an updated list:

In the past few weeks I have received several compliments at conferences about the clarity of our source code. If you have never looked
at it, you might want to take a look
(sample).

There are two reasons for this clarity: first, the community realized early that source code clarity was one of the best ways to
encourage new developers to get involved, and to avoid mistakes. Second, the Postgres source code is automatically reformatted by
pgindent
before every major release to guarantee source code consistency.

It's radical to reformat source code on a regular basis, but it has helped to reinforce the high standards we have for everything that is
done by the Postgres community.

I reported a pg_upgrade bug two weeks ago, and yesterday
Postgres 9.0.4 was released which fixes this bug. As outlined in the
wiki, this release only corrects new uses of pg_upgrade; users who ran
pg_upgrade prior to Postgres 9.0.4 must run the fix script listed in the wiki.

I want to thank Andrew Gierth for the initial research on this, and thank the Postgres community for pushing out a release with this fix
so quickly. The good news is that only three users reported experiencing this problem, and all their databases were properly corrected.
I am hopeful that we caught this bug early and that pg_upgrade can continue to be a relied-upon method for rapid major Postgres upgrades.

A video of my keynote address yesterday at O'Reilly's MySQL Conference is now
online. This twelve-minute presentation was to a primarily MySQL audience, so I
covered some of the history of our project, ways in which we are unique, and features of the Postgres 9.0 and upcoming 9.1 releases.

Update: A common criticism of my keynote address was that I was too apologetic and/or humble, but that was by design. When speaking to
an audience that might not be interested in my topic, I often low-ball the presentation because it is an effective way of getting
listeners to think about my ideas, and to lower the natural barrier people have to hearing information that conflicts with their long-held
beliefs. A humble presentation makes people realize I am not selling anything, and that my information is quite genuine and merits
serious consideration. 2011-04-15

I am attending O'Reilly's MySQL Conference this week. I had to be convinced by Postgres community
members to attend this conference, and I had flashbacks to my teenage years:

Mom: Greg (a family friend) is having a party on Friday. I think you should go.
Me: I don't want to go.
Mom: You will have a good time.
Me: Mom, I won't know anyone there.
Mom: Billy will be there. You like talking to Billy.
Me: Mom, Billy is kind of weird.
Mom: Still, I think you should go.

Seems I wasn't very outgoing as a child, and I guess I'm still not. Well, PG East had MongoDB
talks, so why not Postgres talks at MySQLCon? I will get to study MySQL developers in their native environment, or at least all bunched
together.

I am speaking three times tomorrow (Tuesday), and then don't have anything scheduled until Friday when I am doing
EnterpriseDB training, because, hey, I'm here, so why not. EnterpriseDB and the Postgres community
both have booths at this conference, and they both want me to help with booth duty, plus I want to attend the Postgres talks, so it
doesn't seem I am going to have much downtime at this conference.

So, I guess I will have people to talk to, and I will have a good time. Thanks, Mom.

Today we released a fix for a bug that affects all systems upgraded by current versions of
pg_upgrade and
pg_migrator. This wiki
page contains the announcement and steps necessary to avoid query errors caused by the bug.

I hope we have caught this bug early — there are only three known cases of query errors caused by this bug, and I believe those
three users have been able to correct their systems thanks to help from the IRC channel. If
you have used pg_upgrade or pg_migrator in the past, you should review the wiki and take the steps outlined to correct your systems. I
apologize for the problems this has caused and I hope this is a rare mistake that will not be repeated.

I am attending or presenting at events every month for the next eight months, through November, in Santa Clara (CA), Philadelphia, Ottawa,
Mexico, Oregon, Colombia, Baltimore, and Brazil. June is empty and two unannounced conferences are planned for October. If you are near
any of these places, you should consider attending; details are on my
conferences page.

I have completed drafting the Postgres 9.1 release notes. They
will be updated and improved in the coming months as we approach the final release. Release note generation was easier this time (4.5
days) because git was quicker at returning commit details.

The final 9.1 commit-fest was closed last week so the development team
is now marching toward the final Postgres 9.1 release. There is still months of cleanup and testing ahead but all our energies are now
focused in that direction. I have started on the release notes and others are poring over features,
fixes, and documentation to get ready for our first beta, which will
probably be in April. You can get a sneak peek by checking the
Alpha 4 release notes.

I am excited about next week's PgEast conference at the Hotel Pennsylvania, but probably not for the reason you think. Thirty-five years
ago, as a teenager, I attended a Star Trek convention at the hotel (then
called the Statler Hilton). I remember clearly riding in a hotel elevator with James
Doohan (Scotty), and when someone in the back mumbled "Bridge", we all cracked up.
I am excited to be returning to the hotel for another convention. For reasons I will not mention
(link,
link), most attendees will be sleeping at a different hotel,
The New Yorker Hotel.

I am attending the SCALE conference this week in Los Angeles. The conference has been an
eye-opener regarding the increased buzz about Postgres. Our dedicated PGDay on Friday was a huge success, with many talks standing-room
only and overflowing into the hallway. Our biggest problem was the need for a larger room, which we will get next year. That's a good
problem to have!

Joe Conway was in our booth all day Saturday and he saw a big difference — last year there was much concern about Oracle buying
Sun/MySQL — this year he engaged people walking by and they said they were using Postgres, loved it, and it was running fine so
they had no questions — again, a good problem to have.

I remember the years we knew we were producing a superior open source solution but were often overlooked. That era is over and we
are reaping the rewards for many years of hard work.

Tomorrow (Monday) I am teaching an EnterpriseDB class in the same hotel as the conference. I figured as long as I am here I might as well
do some training. Looking ahead, I have big
conferences for the next three months (New York City, Santa Clara, Ottawa)
— June is wide open right now. Wow, another good problem to have!

I am excited to be giving presentations about Postgres every month for the next five months, with New York City starting it off this
Monday (event list). Interestingly, all the events are in North America: New
York City, Los Angeles, Santa Clara, Ottawa, and Philadelphia (attending only).