
Thursday, October 10, 2013

Comparing mod_proxy and mod_jk

Introduction

Apache 2.2 ships with an advanced set of mod_proxy modules that provide some of
mod_jk's capabilities, namely AJP protocol support and an integrated load
balancer.

AJP (Apache JServ Protocol), currently at version 1.3, is a persistent binary
protocol. Persistent means that the connection between the web server and the
application server is presumed to stay open, once established, for the lifetime
of the system.

In essence, this makes the web server and the application server a single
system from the user's point of view.

Since there is always a chance that one of the points in the system can fail
for various reasons, mod_proxy and mod_jk, as well as the application server,
must have some sort of error detection for the connection and transport
channel, and act accordingly.

One of the main technological advances in mod_jk during the past few years was
made in that area, and it now offers various techniques for communication
channel error detection and recovery. At present, mod_jk is far more advanced
in that area than mod_proxy.

Protocol

The foundation of mod_jk is the AJP protocol, which is sometimes described as a
binary HTTP protocol. The point of the custom binary protocol is that the
client request data is already decoded inside the web server, so there is no
need to repeat that procedure in the application server. In addition, the
standard request and response headers are passed as two-byte codes instead of
strings, which reduces the network traffic between the web and application
servers.
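To make the two-byte header idea concrete, here is a minimal Python sketch of how a header name might be encoded. The function and table names are hypothetical, and the code values are a subset as commonly listed for AJP 1.3; check them against the protocol reference before relying on them.

```python
import struct

# Subset of AJP 1.3 request header codes (assumed from the protocol
# description; verify against the reference before use).
AJP13_REQUEST_HEADER_CODES = {
    "accept": 0xA001,
    "accept-charset": 0xA002,
    "accept-encoding": 0xA003,
    "accept-language": 0xA004,
    "authorization": 0xA005,
    "connection": 0xA006,
    "content-type": 0xA007,
    "content-length": 0xA008,
    "cookie": 0xA009,
    "host": 0xA00B,
    "user-agent": 0xA00E,
}


def encode_ajp_string(text: str) -> bytes:
    """Encode a string as two-byte big-endian length, the bytes, then a NUL."""
    raw = text.encode("latin-1")
    return struct.pack(">H", len(raw)) + raw + b"\x00"


def encode_header_name(name: str) -> bytes:
    """Return the two-byte code for a well-known header, else the full string."""
    code = AJP13_REQUEST_HEADER_CODES.get(name.lower())
    if code is not None:
        return struct.pack(">H", code)
    return encode_ajp_string(name)
```

With this sketch a well-known name such as Host takes two bytes on the wire, while a custom header name is sent as a length-prefixed string.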

However, there is one major limitation of the AJP protocol, and that is its
maximum packet size. The packet size is limited to a little less than 8K. With
the latest mod_jk and Tomcat versions the packet size can be enlarged to 64K,
but it is still limited. Mod_proxy still has no such capability, so its maximum
packet size is 8K. This can be a problem with large client requests, especially
with some custom SSO modules that store huge session data inside cookies or
custom headers. If huge client requests must be supported, the only solution is
to use HTTP instead of the AJP protocol.
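As a rough illustration of why large cookies hit this limit, the Python sketch below estimates whether a set of request headers fits into one AJP packet. The names are hypothetical, the 4-byte packet overhead is an assumption, and the estimate ignores the other fields of a forwarded request, so treat it as a back-of-the-envelope check only.

```python
from typing import Mapping

AJP_DEFAULT_MAX_PACKET = 8192    # the 8K default discussed above
AJP_LARGEST_MAX_PACKET = 65536   # the 64K upper bound discussed above
AJP_PACKET_OVERHEAD = 4          # assumption: 2 magic bytes + 2 length bytes


def estimated_header_bytes(headers: Mapping[str, str]) -> int:
    """Estimate the encoded size, treating every name and value as a
    length-prefixed, NUL-terminated string (3 bytes of framing each)."""
    return sum(
        len(name.encode("latin-1")) + 3 + len(value.encode("latin-1")) + 3
        for name, value in headers.items()
    )


def headers_fit(headers: Mapping[str, str],
                max_packet: int = AJP_DEFAULT_MAX_PACKET) -> bool:
    """Return True if the estimated header size fits into one packet."""
    return estimated_header_bytes(headers) <= max_packet - AJP_PACKET_OVERHEAD
```

For example, a request carrying a 9000-byte SSO cookie would not fit under the 8K default but would fit once the limit is raised to 64K.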

Encryption and SSL support

The AJP protocol is not encrypted, so it should not be used over public network
infrastructure. If the data transfer between the web and application servers
must be secured because the transport medium could be sniffed by the outside
world, then some sort of SSL tunnel must be used. The other option is to use
the HTTPS protocol with mod_proxy. However, using HTTPS makes things a little
more complex, because one must write a custom Filter in the application server
so that client certificates get passed transparently to the application server.
The AJP protocol, on the other hand, handles this automatically, but with the
consequence of passing decrypted data between the web and application servers.
In essence, for SSL, the AJP protocol behaves like a caching SSL accelerator.

This offers much higher performance because the data is only decrypted once.
Securing the network between the web and application servers by using a
separate network card and a set of firewalls and routers is the most secure
solution.

One other option is to put the web and application servers on the same physical
box, in which case in-memory communication will be used, thus increasing the
security of the entire system.

Load balancing

Recent versions of mod_jk have a much more advanced load balancer than
mod_proxy_balancer. Mod_jk has an additional 'by busyness' method that balances
the load according to how busy each application server actually is, that is,
how many requests it is currently serving.
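One way to picture a 'by busyness' policy is to send each new request to the node with the fewest requests in flight, scaled by a weight. The Python sketch below illustrates that idea only; the Node class, its fields, and the weighting are assumptions made for illustration and are not taken from mod_jk's source.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Node:
    name: str
    in_flight: int = 0   # requests currently being served by this node
    lbfactor: int = 1    # hypothetical weight, must be greater than zero


def pick_least_busy(nodes: Sequence[Node]) -> Node:
    """Pick the node with the lowest weighted number of in-flight requests.

    Ties go to the node that appears first in the sequence.
    """
    return min(nodes, key=lambda node: node.in_flight / node.lbfactor)
```

A node with a higher weight is allowed to carry proportionally more concurrent requests before it is considered as busy as its peers.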

Mod_jk also has so-called load balancer maintenance, which lets it handle burst
load more effectively or decay the load on a node that was down for
maintenance.

Where a large number of application servers and session replication are needed,
mod_jk offers so-called 'Domain Model Clustering', which supports the new JBoss
Cache buddy replication. In essence, it reduces session replication traffic by
grouping the nodes into clusters and replicating only between the members of
each cluster.
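The grouping idea can be sketched in a few lines of Python: each node belongs to a named domain, and when a node fails, only live nodes from the same domain are considered, since only they hold the replicated sessions. The function names and the node-to-domain mapping are hypothetical and only illustrate the concept.

```python
from collections import defaultdict
from typing import Dict, List, Mapping, Set


def group_by_domain(node_domains: Mapping[str, str]) -> Dict[str, List[str]]:
    """Group node names by the domain (cluster) they belong to."""
    domains: Dict[str, List[str]] = defaultdict(list)
    for node, domain in node_domains.items():
        domains[domain].append(node)
    return dict(domains)


def failover_candidates(failed_node: str,
                        node_domains: Mapping[str, str],
                        alive: Set[str]) -> List[str]:
    """Return live nodes that share a domain with the failed node."""
    domain = node_domains[failed_node]
    return [
        node
        for node, node_domain in node_domains.items()
        if node_domain == domain and node != failed_node and node in alive
    ]
```

With four nodes split into two domains, a failure of one node leaves a single failover candidate instead of three, and session data only needs to travel within that pair.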

Apache Httpd Versions

The new mod_proxy is present only in Apache httpd versions 2.2 and up. This
means that a web server upgrade is required if the new mod_proxy is needed.

Use the worker MPM with Apache httpd. With the worker MPM, both mod_proxy and
mod_jk have the option to tune the size of the connection pool between the web
and application servers. This is needed in deployments where Apache httpd is
used to deliver other content besides fronting the application server, for
example static content. In those cases the actual number of requests to the
application server can be lower than the total number of client connections
that the web server needs to handle. The worker MPM then allows the connection
pool size to be lower than ThreadsPerChild.

The Windows and NetWare versions of Apache httpd are completely threaded, so
their MPM and connection pool sizes can be tuned over a much wider range.

mod_proxy vs. mod_jk

So what should you use, and when? It depends on your topology. If you already
have or need Apache 2.2 functionality, you can choose between mod_proxy and
mod_jk. Mod_jk works very well on Apache 2.2, so it all depends on the
functionality needed:

mod_proxy

Pros:

- No need to compile and maintain a separate module. mod_proxy,
  mod_proxy_http, mod_proxy_ajp and mod_proxy_balancer come as part of the
  standard Apache 2.2+ distribution.

- Ability to use the HTTP, HTTPS or AJP protocols, even within the same
  balancer.

Cons:

- mod_proxy_ajp does not support packet sizes larger than 8K.

- Basic load balancer.

- No support for Domain Model Clustering.

mod_jk

Pros:

- Advanced load balancer.

- Advanced node failure detection.

- Support for large AJP packet sizes.

Cons:

- Need to build and maintain a separate module.

Conclusion

My personal suggestion is to use mod_jk if you can, and if you have the staff
to maintain the module binary versions. Mod_proxy is still under active
development and lacks some of mod_jk's features. However, if you need HTTPS or
only a simple load balancing scenario, use mod_proxy.