RE: [Webware-discuss] Clustering?

At 10:23 AM 11/29/01 -0500, Jeff Johnson wrote:
>We actually have a hardware load balancer that is collecting dust
>because I had already written oodles of cold fusion using sessions and
>getting primary keys from MS SQL Server by "select max(primaryKey) + 1".
>
>The load balancer doesn't have the option to use cookies to keep users
>on the same server. That option cost $4,000 :). Cold Fusion does
>support database storage of session variables, called client variables.
>That would solve the session problem.
What do you think of the option we sort of outlined yesterday:
- Software or hardware load balancer distributes incoming requests randomly
to a pool of machines running Apache (call them APACHE1, APACHE2, etc).
- APACHE1, APACHE2, etc have a modified version of the WebKit adapter
running on them. The modification is: if no session ID is found in the
request, it sends the request to a random machine, otherwise it parses the
session ID to find which particular machine to send the request to, from a
pool of available WebKit machines (WEBKIT1, WEBKIT2, etc). The pool of
machines would be configurable in the adapter's config file.
- WEBKIT1, WEBKIT2, etc have a modified version of Session.py that encodes
the machine name into the session key. This would be configured in
Application.config. The session key would be something like
"WEBKIT1_237480927".
The idea being to keep requests for the same session on the same
machine. Would this work for you? I'm interested in helping to write the
WebKit modifications needed.
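
A rough sketch of the routing rule described above, in Python. The pool contents, addresses, port, and the names make_session_id and choose_backend are all made up for illustration; the real adapter and Session.py would need their own versions of this logic, and the pool would come from the adapter's config file.

```python
import random

# Hypothetical pool of WebKit machines (addresses and port are placeholders).
WEBKIT_POOL = {
    "WEBKIT1": ("10.0.0.11", 8086),
    "WEBKIT2": ("10.0.0.12", 8086),
}


def make_session_id(machine_name, rng=random):
    """Build a session key that carries the machine name, e.g. 'WEBKIT1_237480927'."""
    return "%s_%d" % (machine_name, rng.randrange(10 ** 9))


def choose_backend(session_id, pool=WEBKIT_POOL, rng=random):
    """Return (machine_name, address) for a request.

    With a session ID, route to the machine encoded in it.
    Without one (or if that machine is not in the pool), pick at random.
    """
    if session_id:
        machine_name = session_id.rsplit("_", 1)[0]
        if machine_name in pool:
            return machine_name, pool[machine_name]
    machine_name = rng.choice(sorted(pool))
    return machine_name, pool[machine_name]
```

Falling back to a random machine when the encoded name is unknown is only one possible choice; it keeps the site reachable but silently drops the old session.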
The other option is, as you said, to write a SessionSQLStore.py to store
sessions in a database. You have to be a little bit careful about locking
here, since multiple requests for the same session may be coming from
different appservers at the same time. I'm not sure how to avoid wiping
out one appserver's changes with another appserver's changes in that case.
--
- Geoff Talvola
gtalvola@...


On Wed, 2001-11-28 at 18:10, Jack Moffitt wrote:
> Was there a reason this was a direct reply :)
Nope, just hit the wrong key :) And then I still forget to put the list
back on...
> > > It's simple, but not complete. Bookmarks would be for specific servers,
> > > not the main machine (unless you allowed them to just bookmark the main
> > > page).
> >
> > Well, the bookmarks are pretty harmless -- but it would be a problem
> > when there was a link to a specific machine from some
> > high-traffic-generating site.
> >
> > One way you could deal with this was capturing the instance when the
> > HTTP_REFERER wasn't local, and then randomly redirecting them. You'd
> > still have a lot of incoming traffic to a single server, but later
> > connections would be balanced. (Build this into the adapter and it
> > shouldn't be a problem at all)
>
> One of the major reasons for splitting over several boxes is to allow
> one or several boxes to fail without losing services. Typically in a
> round-robin dns fashion (or a slightly smarter one like you described),
> you won't get this benefit. A bookmark to a dead server is useless. A
> bookmark to the main page would still work.
>
> In my opinion there should be an easy way to do both.
Well, if you are already doing fancy DNS stuff, you could have the DNS
handle failures by redirecting everything from the failed server to a
working one.
All of these systems generally have a critical server to do the
redirection, so there's always that point of failure.
> > > Also, if you left and came back, you might get to a different machine,
> > > which means for some applications the number of times you have to login
> > > would increase.
> >
> > This doesn't seem like it should happen. The only way would be if you
> > left the site and came back through the front page. Most people
> > wouldn't be surprised if they had to relogin in this case -- it might
> > even be best that they do, since their session really has ended, even if
> > they haven't closed their browser.
>
> It depends. I suppose you could also solve this at the application
> level by using cookies and restarting a session when you come back.
> Many sites do this, like Amazon, etc. I hate logging into sourceforge
> every time, so being able to persist state like that is important, but
> you may have a point that the 'session' level might not be appropriate.
>
> > > There is an easy way (i think) to complete this easy solution, and that
> > > is to make the webware adaptor smart. Enable it to be the front end to
> > > a pool of servers, and use a cookie or something to store which server
> > > that user should be using, and direct requests to it.
> >
> > That would work... but it depends on the memory and CPU usage of
> > the adapter. Anyone know what kind of performance mod_webkit has in
> > that way? It's fast, but I don't know how much memory it needs for a
> > connection.
>
> I don't think it takes very much. Memory isn't really even an issue in
> these days of 1GB of ram costing about $120. Of course, the more memory
> at the adapter level you want to sacrifice, the more complex the job it
> can do I guess, like caching :)
>
> The JSP servers tend to have solved or at least are attempting to solve
> these problems in a number of ways, and I think it would be a good idea
> to find out what they are doing :)
I remember reading a paper from someone involved in etoys.com, and they
used a lot of open source stuff to do both load balancing and caching.
Most of it was with various Apache modules, and seemed fairly easy to
translate to any other dynamic source (I think they were using PHP).
They were talking about storing sessions on a shared server accessed
over NFS, but that seemed kind of silly to me -- now you have this
beefy, critical file server ready to be a bottleneck. Avoiding that was
why multiple domain names just seems a lot better. Then all you need is
the beefy, critical database server.
Ian

We actually have a hardware load balancer that is collecting dust
because I had already written oodles of cold fusion using sessions and
getting primary keys from MS SQL Server by "select max(primaryKey) + 1".
The load balancer doesn't have the option to use cookies to keep users
on the same server. That option cost $4,000 :). Cold Fusion does
support database storage of session variables, called client variables.
That would solve the session problem.
The other problem is that SQL Server doesn't have Sequences. Hal Helms,
a CF guru, writes that to overcome this, just use GetUUID() to create
primary keys that are almost guaranteed to be unique across the entire
world. The downside is that those primary keys are 35 bytes long.
Yuck. I'm curious how others simulate sequences on MS SQL Server and
similar databases. Server side locking of tables during inserts?
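
For comparison, here is how the UUID-as-primary-key idea looks in Python. This uses Python's standard uuid module as a stand-in, not the ColdFusion function, so the string format and length differ from the 35 bytes mentioned above; new_primary_key is a made-up helper name.

```python
import uuid


def new_primary_key():
    """Return a random (version 4) UUID in its 36-character string form."""
    return str(uuid.uuid4())


key = new_primary_key()
print(key, len(key))              # hyphenated string form
print(len(uuid.UUID(key).bytes))  # the same value packed as raw bytes
```

Storing the packed bytes instead of the string form is one way to shrink the key, at the cost of keys that are harder to read in ad hoc queries.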
I'm still reading that etoys clustering article. Good stuff!
-Jeff

At 10:23 AM 11/29/01 -0500, Jeff Johnson wrote:
>The other problem is that SQL Server doesn't have Sequences. Hal Helms,
>a CF guru, writes that to overcome this, just use GetUUID() to create
>primary keys that are almost guaranteed to be unique across the entire
>world. The downside is that those primary keys are 35 bytes long.
>Yuck. I'm curious how others simulate sequences on MS SQL Server and
>similar databases. Server side locking of tables during inserts?
I'm not sure what Sequences are, but SQL Server most definitely does
support unique primary keys -- just make sure Identity is turned on for the
primary key column. Then you can insert a new row and retrieve its
auto-generated unique primary key with something like this:
SET NOCOUNT ON
INSERT INTO MyTable (MyValue1, MyValue2) VALUES ('foo', 'bar')
SELECT @@IDENTITY AS NewPrimaryKey
SET NOCOUNT OFF
In other words, @@IDENTITY is the last Identity value that was
auto-generated on the given connection.
The business with "SET NOCOUNT ON" and "SET NOCOUNT OFF" is needed due to a
bug that I read about in the Microsoft Knowledge Base.
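
The same insert-then-read-back pattern can be sketched in Python. This assumes SQLite (via the standard sqlite3 module) as a stand-in for SQL Server, with an AUTOINCREMENT column playing the role of Identity and cursor.lastrowid playing the role of @@IDENTITY; the Id column and the insert_row helper are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE MyTable ("
    " Id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " MyValue1 TEXT, MyValue2 TEXT)"
)


def insert_row(conn, value1, value2):
    """Insert a row and return the key the database generated for it."""
    cur = conn.execute(
        "INSERT INTO MyTable (MyValue1, MyValue2) VALUES (?, ?)",
        (value1, value2),
    )
    conn.commit()
    # lastrowid belongs to this cursor, so another connection's inserts
    # cannot change what we read back here.
    return cur.lastrowid


first = insert_row(conn, "foo", "bar")
second = insert_row(conn, "baz", "qux")
```

Unlike "select max(primaryKey) + 1", the key is assigned by the database as part of the insert, so two clients inserting at the same time cannot compute the same value.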
--
- Geoff Talvola
gtalvola@...

> What do you think of the option we sort of outlined yesterday:
>
> - Software or hardware load balancer distributes incoming requests randomly
> to a pool of machines running Apache (call them APACHE1, APACHE2, etc).
> - APACHE1, APACHE2, etc have a modified version of the WebKit adapter
> running on them. The modification is: if no session ID is found in the
> request, it sends the request to a random machine, otherwise it parses the
> session ID to find which particular machine to send the request to, from a
> pool of available WebKit machines (WEBKIT1, WEBKIT2, etc). The pool of
> machines would be configurable in the adapter's config file.
> - WEBKIT1, WEBKIT2, etc have a modified version of Session.py that encodes
> the machine name into the session key. This would be configured in
> Application.config. The session key would be something like
> "WEBKIT1_237480927".
>
> The idea being to keep requests for the same session on the same
> machine. Would this work for you? I'm interested in helping to write the
> WebKit modifications needed.
I'm a fan of cookies (or too lazy to do it the other way) but some
people have said they can't count on them. Would this support those
folks too?
One of the suggestions for multi-domains bothers me, not only would
bookmarks be server specific but SSL certs would have to exist for each
domain.
>
> The other option is, as you said, to write a SessionSQLStore.py to store
> sessions in a database. You have to be a little bit careful about locking
> here, since multiple requests for the same session may be coming from
> different appservers at the same time. I'm not sure how to avoid wiping
> out one appserver's changes with another appserver's changes in that case.
I started thinking about those issues and no longer think it's trivial
:). All reads and writes would probably have to go through a method
that locked the SQL table. Luckily I've never had to lock tables. I
try to avoid db vendor specific code but I think you'd have to do it for
this.
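
One alternative to locking the whole table, sketched here only as an illustration and not something anyone in the thread has proposed, is to keep a version number on each session row and refuse a write if the row changed since it was read. The sketch uses SQLite through Python's sqlite3 module; the sessions table and the create_store, load, and save names are all hypothetical.

```python
import json
import sqlite3


def create_store(conn):
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sessions ("
        " sid TEXT PRIMARY KEY,"
        " version INTEGER NOT NULL,"
        " data TEXT NOT NULL)"
    )


def load(conn, sid):
    """Return (version, values); version 0 means the session is not stored yet."""
    row = conn.execute(
        "SELECT version, data FROM sessions WHERE sid = ?", (sid,)
    ).fetchone()
    if row is None:
        return 0, {}
    return row[0], json.loads(row[1])


def save(conn, sid, version, values):
    """Write values only if the row is still at the version we loaded.

    Returns False when another writer got there first; the caller can
    then reload, reapply its changes, and try again.
    """
    data = json.dumps(values)
    if version == 0:
        try:
            conn.execute(
                "INSERT INTO sessions (sid, version, data) VALUES (?, 1, ?)",
                (sid, data),
            )
        except sqlite3.IntegrityError:
            conn.rollback()
            return False
        conn.commit()
        return True
    cur = conn.execute(
        "UPDATE sessions SET version = version + 1, data = ?"
        " WHERE sid = ? AND version = ?",
        (data, sid, version),
    )
    conn.commit()
    return cur.rowcount == 1
```

With this, two appservers that both loaded version 1 cannot both save: the second UPDATE matches no row and save returns False, so its stale copy does not wipe out the first one's changes. It uses only standard SQL, which sidesteps the vendor-specific locking worry, but it pushes retry logic into the session store.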

At 12:07 PM 11/29/01 -0500, Jeff Johnson wrote:
> > What do you think of the option we sort of outlined yesterday:
> >
> > - Software or hardware load balancer distributes incoming requests randomly
> > to a pool of machines running Apache (call them APACHE1, APACHE2, etc).
> > - APACHE1, APACHE2, etc have a modified version of the WebKit adapter
> > running on them. The modification is: if no session ID is found in the
> > request, it sends the request to a random machine, otherwise it parses the
> > session ID to find which particular machine to send the request to, from a
> > pool of available WebKit machines (WEBKIT1, WEBKIT2, etc). The pool of
> > machines would be configurable in the adapter's config file.
> > - WEBKIT1, WEBKIT2, etc have a modified version of Session.py that encodes
> > the machine name into the session key. This would be configured in
> > Application.config. The session key would be something like
> > "WEBKIT1_237480927".
> >
> > The idea being to keep requests for the same session on the same
> > machine. Would this work for you? I'm interested in helping to write the
> > WebKit modifications needed.
>
>I'm a fan of cookies (or too lazy to do it the other way) but some
>people have said they can't count on them. Would this support those
>folks too?
It could be made to support them -- you just have to make sure the
logic in the adapter knows how to find the session ID, whatever method is
being used to encode it.
>One of the suggestions for multi-domains bothers me, not only would
>bookmarks be server specific but SSL certs would have to exist for each
>domain.
But wouldn't the hardware load balancer make it look to the user like
there's only a single box, even though it's actually distributing the
requests to multiple back-end machines running Apache? I don't know about
the multiple SSL certs; could you just use a single cert and place it onto
all of your boxes?
> >
> > The other option is, as you said, to write a SessionSQLStore.py to store
> > sessions in a database. You have to be a little bit careful about locking
> > here, since multiple requests for the same session may be coming from
> > different appservers at the same time. I'm not sure how to avoid wiping
> > out one appserver's changes with another appserver's changes in that case.
>
>I started thinking about those issues and no longer think it's trivial
>:). All reads and writes would probably have to go through a method
>that locked the SQL table. Luckily I've never had to lock tables. I
>try to avoid db vendor specific code but I think you'd have to do it for
>this.
--
- Geoff Talvola
gtalvola@...