let's get small (with MySQL and WebLogic Server)

by Rod Chavez

a couple of weeks ago, a friend of mine here at BEA from the NetUI team
(Eddie O'Neil) sent me email saying he'd
been playing around with MySQL and WLS, and that he thought it'd be interesting
to post that work so that people could have an easy path to follow when trying
to get the two working together. i agreed of course, so i told him to send me
what he had and i'd take it, wrap some words around it, and put it online. if
life were only that simple <g>

that was over 3 weeks ago. he sent me something that worked, i spent 5
minutes reducing it to a smoking mess and the rest of the 3 weeks getting
everything going again. what took so long? some of it is just me, in that i've
got a particular way i want things to work so that what seems simple winds up
being quite hard (and i can be a tiny bit stubborn, so when i hit a rough patch
i tend to keep going forward, when all sense would dictate that i go back and
try something else), and some of it is that there are a number of subtleties
that make it hard all by itself. we'll wind up covering both here, and see
which is which

the first thing i always want to do when trying to show how something works is
to use it to do something. somehow i just find i don't understand something,
and hence can't explain it, until i've done something "real" with it. but what
to do, what to do?

i don't know about you, but i've been seeing people using tinyURLs just about
everywhere lately. if you've never heard of them, they perform a very simple
but useful task; they turn big URLs into small ones. tiny ones, in fact <g>.
the big problem this solves is the line-breaking-in-email problem. i'm sure
you've all experienced a situation where you've gotten a piece of email
containing a "long" URL (like a Yahoo map or a Google query) where somewhere
along the way the URL got split onto multiple lines, and part of the link gets
"broken off" so when you click on the link, it doesn't take you where it
should. you wind up having to paste twice or some such manual hackery. and if
you sent the ling to someone who's a bit clueless, you'll get a "your link
didn't work" reply back

TinyURL takes care of this problem by allowing you to submit your "long" URL
and they return a "tiny" one. this can be used pretty much anywhere the long
one could have been, and since it's really short, it pretty much never gets
split or broken. it has the added benefit that you can use it places that a
long one wouldn't fit, like in your IM status

what do TinyURLs have in common with MySQL and WLS? nothing in particular, but
i thought it would make a cool, simple example of a data based web-app that did
something useful, all by itself, in not too many lines of code, so that's what
i decided to build

how does TinyURL work?

before i could write a line of code, i had to figure out how TinyURL worked.
so the first thing i did was submit the following URL,http://www.oreillynet.com/pub/au/1303,
and they returnedhttp://tinyurl.com/sr66. not a huge
reduction in size, but enough for you to see what they do. and when i chase the
URL they returned, i wind up at the page i submited (which happens to be my
bio on ORN)

so the next thing i did was use wget to see what was actually going back
and forth on the wire to make this dance work. here's what i found

so looking closely at what happened, wget chasedhttp://tinyurl.com/sr66, only to have the server return a 302 and
another URL, http://www.oreillynet.com/pub/au/1303. a302
response code means, in essence, that the server understands what you're
asking for, but wants you to know it's temporarily someplace else, and
that place is at this URL i'm handing you. so the user-agent (read: browser, or
more properly, http-client) should now rerequest the document from this new
location. which of course is what wget does, returning the final target
document. nice

now that we know how TinyURL works from an HTTP protocol point-of-view, we need
to start thinking about what kind of storage needs we're gonna have. first,
we're going to need to accept a URL (let's call this the userUrl) and
return a URL (let's call this the tinyUrl). sounds pretty simple. we
create a table with two columns (tinyUrl, userUrl), uniquely indexed on
tinyUrl. ok, that seems simple, just create a table like that, both columns as
TEXT, and then... wait a minute. er, exactly how did that tinyUrl get created?
i mean, how do we create a unique tinyUrl for one userUrl, and a different
tinyUrl for a different userUrl? and also make sure that it works properly in
the face of simultaneous requests, which compounds the problem of trying to
create something unique by introducing synchronization issues into things. i
mean, what good would it be to make sure we didn't duplicate any "already
created" tinyUrls, if we didn't also make sure that we didn't accidentlly
create duplicates because the tinyUrl generation code was run twice at the same
time?

fortunately for us, databases turn out to be great at helping to solve these
kinds of problems. first off, they're built to handle simultaneous requests and
enforce all sorts of interesting rules about how to deal with concurrency. but
they also understand how to play a role in creating uniqueness, like
customer-id used as the primary-key in a database. i'm not going to try to
cover these concepts in a general way, as there are (quite large) books
dedicated to these subjects <g>. but i'll drill into the parts we need to make
our service work

MySQL has this cool feature where you can declare an integer column as
AUTO_INCREMENT. when you insert a new row into a table with a column like this,
MySQL determines what the current MAX value in the column is, adds 1 to it, and
uses that as the value for that column in the new row. the great thing about
using this feature is that MySQL is taking care of both problems we were
dealing with, namely how to come up with the "next" tinyUrl to use, and also
how to do it in such a way that simultaneous requests don't result in the same
tinyUrl being created. MySQL, like almost all database products in existence,
handle concurrency as a matter of course

of course, this changes the code we're going to need to write somewhat. instead
of storing tinyUrl as a string it'll be an integer in one of these
AUTO_INCREMENTing columns. so where does the URL come from? it's interesting to
notice that most of what's returned to the user to use as a tinyUrl is always
the same. the only part that's really unique is that little bit at the end. in
the above example where we used the real TinyURL, that was just sr66

given our new desire to store this little unique bit as an integer, it would
make things nice and simple if we just turned it into a string, so that the
integer 1 would turn into the string "1". and so our URLs would look like.../1, with the ellipsis (...) being the "static" part of the URL, the
part that never changes. hmmmmm, that is simple, why doesn't TinyURL do that?

i have no idea how TinyURL is actually written, although some friends and i
have speculated on this a few times. but i think it's safe to assume they are
doing something similar to what i'm describing. but as we already know, the
unique part of their URL doesn't look like an integer. or does it?

most of the developers reading this will be familiar with hexadecimal
notation, or HEX for short. HEX is a base-16 number format, which means
there are 16 different values possible in any given "place" in a number. as
opposed to the base-10 system we (or must of us) grew up with where the values
are 0 through 9 (0, 1, 2, 3, 4, 5, 6, 7, 8, 9), in HEX the values
are 0 through f (0, 1, 2, 3, 4, 5, 6, 7, 8, 9, a, b, c, d, e, f).
but things don't really have to stop there...

sexatrigesimal

welcome to sexatrigesimal <g>. that's the correct name (i think) for
a base-36 number format. googling around, i found a page on the
"Names of bases for
number systems". and while the page does not provide a name for base-36
directly, i think i understand the rules being used and base-36 would be named
sexatrigesimal, or what i'm going to call STG for short

in STG the values are 0 through z. in other words, instead of
stopping at f the way HEX does, it keeps going until z. of course
the next question would be, why bother? why not just use decimal or HEX?

the answer would be, to keep things tiny! taking the case above where TinyURL
returned sr66, and assuming i'm doing the math correctly, and also
assuming that i'm correct about how this string is encoded, i'd say that that
value represents 1386134. or a shortening of almost 50% over the
decimal notation (from 7 characters to 4). and it should only get better over
time, as the number of pages just keeps getting larger

one other question some of you may be asking is, "ok, but why not keep going
with this? why not split the alphabet into upper and lowercase letters to
yield another 26 values to our number format (that'd get us base-62), and there
are lots of other characters that could be used in the path of a URL, like
('.', '/', '&', etc), why not use them too? we could get
up to base-70 or maybe 90? the only answer i can think about that makes sense
is that in addition to making URLs a lot smaller, this is also supposed
to make things easier (which is a reason near and dear to my heart). and
it'd be a real pain to type or copy (or speak aloud?)
"http://tinyurl.com/s.S&", don't you think? and just to double-check and
see what TinyURL actually does, i hit them with the URL i'd gotten back from
them earlier, but with the sr66 up cased to SR66, and they still
bounce me to the same page. so it at least appears like they're using
something like STG as their format

whew, what a rathole, eh? <g> ok, now let's see where we are. we know how
TinyURL does their tricks with the browser, how they encode their URLs (or at
least how we're going to do it), how we're going to interact with MySQL to help
us generate unique URLs and deal with concurrency, is there anything else, or
can we just start doing stuff yet? i think we're ready to go now, let's
build it

MySQL install and startup

first, we need to get MySQL installed and ready to run. since i'm doing this on
RH9 it's drop-dead-simple, since MySQL comes as part of the RH distribution. so
as root, run the following commands to get the RPMs installed:

the other nice thing about using the MySQL that comes with RH9 is that it's
been setup to run as a service in the *nix style. so once installed, again as
root, all you need to do to get it running is to run the following command:

$ sudo /sbin/service mysqld start

you can stop it the same way (but replace the word start with stop). and if you
want MySQL to start at boot time, then the following command will make sure
that happens too:

$ sudo /sbin/chkconfig mysqld on

now that MySQL is installed and running, we can install and run the 'tiny'
service i've written

download and install the 'tiny' app

if you want to install the service yourself, or just use the scripts and code
for your own purposes, it's all packaged into a zip that you candownload. i'm going to
walk through the steps to install it, and after that walk through in detail how
it works

shutdown WebLogic Server. if you're using the same configuration that i
describe in myfirst blog post, all you
need to do is the following

$ sudo /sbin/service wls stop

note that the rest of these instructions are all relative to that install.
if you've got WLS someplace else, translate accordingly

now we need to add the MySQL JDBC driver to the WLS classpath. to do this we
need to download it and put it where WLS can find it. MySQL maintains the
following page,
http://www.mysql.com/downloads/api-jdbc-stable.html, where you can download
a tar or zip of the latest production JDBC bits. go there and click on the
"Pick a mirror" link for getting the zip file. from there you can pick
whatever download site is closest to you. once you have the zip pulled down,
run the following command to unzip it and put it where we're going to have
WLS find it

then we need to get the startWebLogic.sh script to refer to the MySQL JDBC
driver we just unzipped. this could be done by editing the startWebLogic.sh
script directly, but that's a poor choice. the reason has to do with the way
the WLS config-wizard works. one of the things it does is to overwrite the
startWebLogic.sh every time it runs to completion. believe me when i say this
can get very annoying. fortunately, there's a much better choice if you
want to add things to startWebLogic.sh without getting stepped on by the
config-wizard

there's a file, startscript.xml in the directory/bea/user_projects/domains/wls/_cfgwiz_donotdelete. the config-wizard
will honor and merge any changes it needs to make with any changes you need to
make. so edit this file and add the bold line you see below to the same
spot in the file as it appears here (we've put this bit in a separate window because of the long code lines).

unzip tinyapp.zip and move into the directory it creates

$ unzip tinyapp.zip
$ cd tiny

next, run the 'tinySetup' script provided in the zip file. this will create
all the resources needed so that the everything will work. i'll walk through
what's going on there after we get everything running

$ sudo ./tinySetup

finally, restart WLS

$ sudo /sbin/service wls start

that's it! at this point, you should be able to hit thehttp://localhost/tiny and see your very own "tiny" URL service. or, if
you didn't complete the install, you can play with the one we've made available
up on the net. it's right here. you're
free to use it just like you'd use TinyURL. when you chase this link, you'll
see a form that allows you to enter your "long" URL, and when you submit it the
service will return a "tiny" URL for you to then use just like you'd use a
"real" TinyURL. neat huh?

it's time now to dive into how all the pieces fit together. here we go...

tinySetup

let's take a look at the source to tinySetup and see what it does

#!/bin/bash

# make weblogic/weblogic a (super)user in the mysql system
mysql --user=root < userSetup.sql

# add the database and tables for the tiny service
mysql --user=weblogic --password=weblogic < dataSetup.sql

# add the connection-pool and data-source to the server for the tiny service
/bea/weblogic81/common/bin/config.sh -mode=silent -silent_script=$PWD/silentCTiny.txt

# copy the tiny web-app over so the server will deploy it on startup
cp -r tiny /bea/user_projects/domains/wls/applications/tiny

first, it runs mysql and sets up the "user" account that the app will use to
access the database. then it runs mysql again, logging in as the newly created
user, and creates the database and table that the app will use. next it runs
the WLS config-wizard, creating both the connection-pool and the data-source.
it finishes by coping the "tiny" webapp into the "wls" domain so it will be
automatically deployed when the server is restarted. let's take a quick look at
each of these steps. after that, we'll check out the code to see how the story
ends <g>

userSetup.sql - MySQL user creation

here are all the parts of the "userSetup.sql" file, used at the start of the
tinySetup script

# make the MYSQL database the default for the following statements
USE mysql;

here we see the USE
command used to set the default database for all subsequent commands to operate
on

# create a superuser that can connect from on this machine from the command line
GRANT ALL PRIVILEGES ON *.* TO weblogic@localhost
IDENTIFIED BY 'weblogic' WITH GRANT OPTION;

now we're creating the user "weblogic@localhost" using theGRANT command. this names
the user, defines his rights and sets the password. in this case, the user
can access anything. one thing to notice is the way the user name is seen by
MySQL. it's user-name@location. in this case, since the database and app
are running on the same machine, it's the user "weblogic" on this machine (aka
"localhost").

# create a superuser that can connect from on this machine from WLS
GRANT ALL PRIVILEGES ON *.* TO weblogic@localhost.localdomain
IDENTIFIED BY 'weblogic' WITH GRANT OPTION;

now this is weird. this shouldn't actually be needed, but it turns out
there's a RH9/MySQL interaction problem that causes "local" user names being
presented to MySQL as "user-name@localhost.localdomain". according to
the MySQL README that comes with the JDBC driver it's a RH9 bug. and i know
that you don't need to do this on RH8. in any case, this GRANT is only there to
work around this issue. and this sure was a fun one to figure out <g>

one final note: if you're going to deploy this service other then in a demo
manner, make sure you change the password created in userSetup.sql and used in
tinySetup and TinyServlet.java

dataSetup.sql - MySQL database and table creation

the "dataSetup.sql" file has all the commands to create a database and a table
for use by our app

# create the database TINY to be used by the tiny service
DROP DATABASE IF EXISTS tiny;
CREATE DATABASE tiny;

the DROP DATABASE
and CREATE
DATABASE commands are used so that we wind up with
a database named "tiny". this is the database where our apps data will live.
btw, the reason it leads off with a DROP DATABASE command is so that you can
run the dataSetup.sql script multiple times if needed. i had to when writing
this, and it's a useful habit to get into. the "IF EXISTS" clause is there for
the same reason

# make the TINY database the default for the following statements
USE tiny;

of course we now make the "tiny" database the default for the next statements

and we wrap up our MySQL work by usingDROP TABLE to remove
any old "redirect" tables we might have had, and then usingCREATE TABLE to add
a table named "redirect". look carefully at the column named "tinyVal". this is
the column will hold the integer we're going to lookup as the unique part of
the tinyUrl. that column has "NOT NULL AUTO_INCREMENT PRIMARY KEY" as modifiers
on it. the "NOT NULL" is not particularly useful, as the database will always
be setting it, but the "AUTO_INCREMENT" is critical for the reasons discussed
above and the "PRIMARY KEY" was added so that MySQL would know that this would
be the main access mechanism for queries on this table, and it could build
indices appropriately to speed things up

i'm not going to go through the entire silent-configuration script
line-by-line. you can read about how all the options workonline.
but i am going to point out a few choice parts that caused me problems so that
you'll be successful if you customize these scripts yourself

// open domain
read domain from "/bea/user_projects/domains/wls/";

when writing a silent-config script, the first thing you have to do is figure
out if you're creating a new domain, or modifying an existing one. here i'm
modifying an existing one, so i've got to tell the config-wizard to read it as
it currently exists

here a connection-pool is being created, along with the settings it needs in
order to run. here you need to be careful of theTestConnectionsOnReserve and TestTableName settings. this is
quite confusing (IMO) if you're going through the WebLogic console to create a
connection-pool. here's what's going on

at a high level, a WLS connection-pool is a cache of connections. whenever an
app running on that server needs one, it can get one from the pool, use it as
needed, and then return the connection to the pool where it'll sit until some
app needs it again. the reason that modern app-servers have connection-pools is
because creating a connection, with the associated user-authentication that
goes along with it, can be quite expensive. connection-pools make this a
(mostly) one-time expense. this can be a huge performance win for functionality
just like that in this app, where a request is going to result in exactly one
query (the lookup from the tinyUrl to the userUrl). on a lightly loaded server
you might not notice it, but on a even moderately loaded system, things would
get bogged down pretty quick if each external request had to pay the connection
creation and teardown cost. so life is good

er, life is good as long as the connections in the pool are valid. but what can
happen (quite easily it turns out) is that the connection sitting in the pool
waiting to be reused can go bad. network problems, timeouts, there's a number
of things that can render a connection useless. all of which wouldn't be so bad
but unfortunately there doesn't seem to be anything WLS can do to detect this
situation. so the app asks for a connection out of the pool, tries to use it,
and as Steve Matchett would say, "ka-BLAMMO"

the solution to this is setting both the TestConnectionsOnReserve andTestTableName fields. what you can have WLS do for you is to test each
connection before it's pulled out of the pool. the first tells WLS you want
this done, and the second provides a query to use as the test. it'd be great if
WLS could just know what query to run based on the database type, but this
really isn't possible because in real production environments application
connections aren't privileged to access system tables and such that are a
built-in part of the database. so you have to provide a query that you know can
run successfully (assuming the connection is valid). then WLS will try to run
it before handing the connection out and if it fails they'll throw it away and
create another one

finally we update the domain and close it. one thing to watch out for is that
even though you're simply modifying an already existing domain, you still need
to refer to a "template" to make it work. so i refer to one that ships with
WLS. if you don't do this the script will fail at the end, which is annoying
but harmless as until the "update domain" succeeds, nothing has really happened

TinyServlet.java

most of this class is just plain vanilla servlet code, which i won't go into.
but there are a few interesting points to cover. if you look at the code, the
real work is done in the two methods, getTinyVal() andgetUserUrl(). we'll look at each one in turn

so the first thing we do is perform a JNDI lookup on the data-source,tinyDS. then we get a connection out of the connection-pool from the
data-source. next we prepare the query. what this means is the database,
depending on the vendor, does work (like parsing the SQL statement) up front
and remembers what that work is, so that when you actually execute the
statement, it can do the minimal work necessary. now normally, this "prep work"
would get tossed out when the connection was closed, since we're using WLS
connection-pooling, the connection is just returned to the pool on close, and
this becomes another case where connection-pooling lowers per-request latency.

let's keep going. after preparing the statement, we've got to provide the
userUrl as the value to be used in theINSERT statement. that's
what PreparedStatment.setString(...) does. that call is saying to useuserUrl as the value for the first (1) parameter in the query.
look closely at the query and you'll see a '?'. that represents a value
to be replaced at a later time. since we prepare the statement and the prep
work is cached, we've got to supply the actually value to use each time we want
to execute. which is what happens next

at this point, the table has been changed and there's a new row in the table.
but we've still got one thing to do, which is to figure out what value MySQL
decided to use for the AUTO_INCREMENT column tinyVal. since this functionality
is fairly common, JDBC supports it natively. the pattern is to get a ResultSet
from PreparedStatement.getGeneratedKeys() and then get the value, as an
integer, of the first row/column. that's the new value. and it turns out there
are other options too, depending on what version of everything you're using.
you can read more about it in thisarticle
up on the MySQL site

we wrap this method up by taking the integer and turning it into a string. for
this we call Integer.toString(...). here's where we're actually using
sexatrigesimal as the number format, because in addition to handing in the new
value the MySQL produced, we're handing in 36 for the radix (fancy word
for base). so when the format takes place, it's none as described above (clearly
we're not the first people to have needs like this)

one last thing i should also mention about getTinyVal is that it fails
to duplicate one part of the real service, which is that TinyURL will actually
return the same tinyURL for a given userUrl. to do this they must be running
a query first, before they run the insert, and they return the same tinyURL
they returned last time. but i wanted to keep things simple, so i didn't do
that part. it should be very easy for you to do, based on the code already
here

this method does the heavy-lifting when a request comes in where a user is
actually trying to chase a "tiny" URL, and needs to be redirected. the first
thing we need to do is translate the tinyVal string into an integer, base-36

the rest is pretty plain-vanilla JDBC and allows us to run aSELECT query against theredirect table in order to get the userUrl back. that's it

that's the whole thing. a "tiny" URL service, built on WLS and MySQL. i hope
this was as useful and interesting to you as it was to me. and if you have any
problems running or using this, please let me know

how do you use TinyURLs? and what can WebLogic Server and MySQL do to make problems like this easier?

6 Comments

hondo77
2003-11-06 10:33:10

Grammar?
Um, maybe I'm late to the party here but what's the deal with not capitalizing the beginnings of sentences? You don't have a problem typing "URL" so why not extend this to the first letter of the first word of sentences? Really, it's cool AND considerate of your readers.

rodc
2003-11-06 11:58:14

Grammar?
i'm sorry if my writing style seems inconsiderate. that was not my intent, and i appologize if that's the way it seemed

i adopted this style several years ago and use it consistently in IM, email and blogs due to what is, IMO, an "informal" forum of communication

out of respect, i capitalize all proper-names and acronyms, although i'm sure i make mistakes sometimes

i should probably point out for completeness that i break several other rules too. there's only a single space between a period and the start of the next sentance, and i don't use a period at all to end sentances that end a paragraph

and don't get me started on the whole "ending a sentance with quoted text, the period goes inside the quotes". see there, i just did it wrong. the period should have gone inside the quotes

i'm not sure that really answered your question, but it's really the best i have to offer

thanks for the feedback!

phoradan
2003-11-07 16:39:12

Grammar?
To me judging a coder on grammar is like judging a beer on its label. I lose 50% of my respect for a tech person each time s/he reaches for the spell check button.

I personally not only don't mind the bad grammar, i applaud it.

anonymous2
2003-11-20 16:26:47

about informal text conventions
Judith Marten better known as Miss Manners could say this better, but she often reminds us that seemingly silly / outdated manners were invented to help people to communicate, and share other experiences in many ways. Spelling, capitals, punctuation, white space, all help people to understand what you are telling them, making the article more useful.

That said, it was useful, thanks for writing it.

Not so anonymous, Morris Hirsch

phoradan
2003-12-03 17:43:39

about informal text conventions
I like the dichotomy in your mail, which comments on the need for proper punctuation but then tramples the very punctuation-oriented ideals you claim to cherish.

For example, in your first sentence, there:
- should be a comma after "Marten"
- should be a comma after "Manners"
- should not be a comma after "communicate"
And you abused the virgule to boot. Sentences 2, 3, and 4 (actually, some are just fragments) aren't much better.

The tension between the content and the syntax of your note is a brilliant commentary on the duality of man.

georgemaryjane
2006-03-17 05:17:28

capitalization
just for the record,
back a few years ago,
nobody used capitalization in emails.
i still dont, for many reasons,
and so it goes
thanks,
george

Sign up today to receive special discounts, product alerts, and news from O'Reilly.