Interview with Mark Stosberg (From Oracle/tcl to PostgreSQL/Perl)

From PostgreSQL wiki

After his excited post to the PostgreSQL Advocacy mailing list describing the recent relaunch of the 1-800-Save-A-Pet web site, replacing it's aging Oracle based solution with a more up to date PostgreSQL solution, we knew we had to talk with Mark Stosberg of Summersault, LLC.

techdocs: Hi Mark, I think the first question we wondered about was just how you got involved in the project, and where PostgreSQL started fitting in to the picture?

Mark Stosberg: I'm the Principal Developer for Summersault, LLC. We are a website firm focusing on database-driven websites. In the fall of 2003, 1-800-Save-a-Pet.com contracted with us to redevelop and extend their existing website.

At the time, It was written in the tcl language, served with AOLserver and Oracle. The project leveraged an existing open source code base, called "Arfdigita", which had the basic features they wanted. Arfdigita was based on a broader toolkit which was thriving at the time, the Arsdigita Community System. Considering the available code and circumstances, it was a good match at the time.

At Summersault, our standard toolkit was much different. From the beginning, we've used Perl, PostgreSQL and FreeBSD as our foundation. We proposed that the whole site be rewritten in Perl. I think the client had some understanding that tcl had not become nearly as popular as Perl, and it was harder to find programmers that focused on that paradigm.

I must say, since we had this opportunity to shift paradigms, we saw it as an opportunity to replace Oracle as well. First of all, it's not open source, which was an important factor for us. I couldn't think of any features that we would need that Oracle offered and PostgreSQL didn't.

I also found Oracle much more cumbersome to deal with it. Here's one example: I simply wanted a dump of the legacy database schema so I could review it. In PostgreSQL this is a breeze: just use pg_dump --schema_only. When my research on how to do something like this in Oracle didn't turn up results, I asked a local Oracle consultant. He gave me a little script to use that produced something like this-- he didn't know of a built-in method for this either!

techdocs: One would have to think MySQL would come into the equation somewhere, did you spend any time looking into thier products?

Mark Stosberg: We did consider using MySQL for the database. Although we preferred PostgreSQL, this website was projected to handle more traffic than anything we had built to date and MySQL still had a reputation for being fast.

While we did not find an ideal "PostgreSQL vs. MySQL" comparison, we did hear a general consensus among the resources we found: If there was a speed gap between PostgreSQL and MySQL it was minimal, and in some cases PostgreSQL can be faster.

With some confidence that PostgreSQL would be fast enough for this demanding application, we chose it. We already knew it had a strong feature set, was very stable, well documented, and easy to use and administer.

techdocs: You mention that the client understood your desire to switch from tcl to Perl, did they have any reservations about switching from Oracle to another database solution? How did you sell them on PostgreSQL?

Mark Stosberg: Another staff member and myself spent several hours researching relative benefits of PostgreSQL, MySQL, Oracle and MS-Access. We eventually produced this document:

It is still a little rough: For example, the specific versions we reviewed aren't documented. It is probably already become a little inaccurate from the passage of time.

Our direct contacts were not technical people themselves. However, they did have some other IT consultants to help make key decisions like this. These were the people we really had to convince.

With our summary report in hand, PostgreSQL was not a hard sell. Since then, our database choice has been a non-issue. As developers we've continued to be happy with it, and the client is satisfied that it meets their performance expectations.

techdocs: Once the decisions on which new technologies to use were made, it sounds like you spent most of your time rewriting the application code rather than porting it; would this be a fair assesment?

Mark Stosberg: Yes. When the project began, we thought the first phase would be to reproduce the current functionality in the current paradigm. As we consulted more with the client, we discovered what they really wanted was fairly different-- there were a lot of additions, subtractions and changes from the current system. In the end, I doubt any time was lost with the choice to rewrite.

I just used the program 'sloccount' to compare the size of the old and new code bases. The old one contained about 135,000 lines of code. Our rewrite was under 37,000. That's almost 100,000 less lines and a quarter of the size! I think the new solution will be easier to maintain. :)

This size reduction can't be simply attributed to converting from tcl to Perl, though. Before, the site was based on a large toolkit of which they were using only a fraction of the functionality. Still, some of that code was inter-related, contributing to the overall complexity.

Also, we are using many Perl modules from CPAN which would not be included in that count.

techdocs: Did you find anything perticularly difficult to reproduce with the Perl/PostgreSQL combo that would have been easier to handle in Oracle/tcl ?

Mark Stosberg: No. With Perl, we were able to take advantage of pre-existing excellent modules from CPAN to help us with our work. There were no Oracle concepts that were difficult to reproduce in PostgreSQL.

techdocs: Speaking of pre-existing and modules, have you folks needed to use any of the "contrib" and/or external PostgreSQL modules such as PostGIS, Tsearch2, or Slony?

Mark Stosberg: The "earthdistance" package from "contrib" is central to our search for "pets near you". It's easy for me to forget about it, because we've been using it with little effort and high satisfaction since the days of PostgreSQL 7.0. :)

Before I had the opportunity to implement a "dealer locator" application in MySQL that could have made use of this sort of geo-distance function. It was much more cumbersome to build without it, and I imagine would not scale as well as having such a function compiled into the database.

Replication support was one of the database requirements we researched. We found that PostgreSQL had multiple options here. However, we have not needed to implement replication to support the load yet. We have one beefy dual-processor, RAIDed server that is working fine for now.

I appreciate knowing that replications options for PostgreSQL are there when I need them.

techdocs: I've always felt that one of the corner stones of Open Source software was the vast number of options it gives you both from a development and from a support stand point, allowing you to receive as much or as little as you need. What has your experience been in terms of getting support for PostgreSQL?

Mark Stosberg: We have really never needed formal support for PostgreSQL. We've been using it since 6.5 and have found it be very reliable, with excellent documentation.

When we have needed help, we've had great results and fast responses simply be asking on one of the mailing lists. Once after asking one of these global lists I got a response back from Nick Fankhauser, just a block down the street. His business runs on PostgreSQL as well. (Nick maintains the PostgreSQL and JDBC FAQ: http://www.fankhausers.com/PostgreSQL/jdbc/ ).

It's comforting to know we get local support. I believe our city and county governments are using PostgreSQL in some capacity, as well as the local hospital.

If we ever need formal external support, I know that the PostgreSQL website keeps a list of a number of other PostgreSQL consultants we could contract with if we needed to.

techdocs: Well, it has certainly been interesting to learn more about your use of PostgreSQL and how it helped the people at 1-800-Save-A-Pet.com. Any last words of wisdom for other folks who might be considering the switch from Oracle to PostgreSQL?

Mark Stosberg: This is technical, but it may be helpful: We had to figure out how to get the data from Oracle on the remote machine to PostgreSQL on our local machine. We knew the process would somewhat interactive-- with initial failures in the data transformation, refinements and restarts.

We also didn't want to install PostgreSQL on the remote machine, which was still being used in production for the legacy site. Our solution was put the data in SQLite on the remove machine. SQLite dump files can be imported directly into PostgreSQL.

This allowed us to work more quickly because we didn't have to talk over the network to two different databases while we were refining things.