How to load balance in .PAC file

One thing to note when writing proxy.pac files is that you should avoid calling functions or doing DNS lookups more than once. Always store the result in a variable and reference that variable wherever you need it. The browser never caches the results of functions or DNS lookups (fortunately or unfortunately, depending on circumstances). I have seen PAC files that took 30 seconds to determine a result (each time).

So, in Shawn's example (a very, very common scenario)

you would call

var myip = myIpAddress();

and then reference myip when calling isInNet:

if (isInNet(myip)...
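To make the "look it up once, reuse it" point concrete, here is a minimal sketch. It is not Shawn's actual file: the two subnets are hypothetical and only there for illustration, and the proxy addresses are borrowed from the example further down. myIpAddress() and isInNet() are the standard PAC helpers the browser provides.

```javascript
// Sketch only: the subnets below are hypothetical, made up for illustration.
function FindProxyForURL(url, host) {
  // Look up the client address once and reuse it for every check.
  var myip = myIpAddress();

  if (isInNet(myip, "10.1.0.0", "255.255.0.0")) {
    return "PROXY 172.18.0.159:9090; PROXY 172.18.0.160:9090; DIRECT";
  }
  if (isInNet(myip, "10.2.0.0", "255.255.0.0")) {
    return "PROXY 172.18.0.160:9090; PROXY 172.18.0.159:9090; DIRECT";
  }
  // Anything outside the listed subnets goes direct in this sketch.
  return "DIRECT";
}
```

However many subnets you add, myIpAddress() is still only called once per evaluation.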

While Shawn's example is the most common, a few other methods are also used.

This will load-balance/fail over without having to know the subnets in advance, by splitting on the last octet of the client's IP address:

// Find the 4th octet
var myip = myIpAddress();
var ipbits = myip.split(".");
var myseg = parseInt(ipbits[3], 10);

// Check whether the 4th octet is even or odd
if (myseg % 2 === 0) {
    // Even
    proxy = "PROXY 172.18.0.160:9090; PROXY 172.18.0.159:9090; DIRECT";
}
else {
    // Odd
    proxy = "PROXY 172.18.0.159:9090; PROXY 172.18.0.160:9090; DIRECT";
}

The other common option is to simply define a virtual name in DNS and configure the DNS server to round robin the responses.

How to load balance in .PAC file

Ha! I enjoyed that, and esoteric discussions of JavaScript random functions aside, I agree that it would work under specific circumstances.

However, how would you know which proxy the user went through from request to request? Keep in mind that every time the browser makes a request (if you disable proxy result caching in IE, as you should, or use FF/Chrome/Safari/Opera/etc.), or at minimum once per domain, it will walk through the PAC file logic and possibly pick a new proxy to use. That could get confusing or problematic quickly.

You will have extra authentications as the user switches back and forth between the proxies on succeeding requests. I would even expect popups, as the NTLM session token supplied by the browser could be invalid for that particular proxy (although it is probably supposed to renegotiate on a new connection).

You will have no idea which proxy to look at for logs/traces.

In general, whether you are using WCCP, proxy.pac load balancing or, better yet, a physical load balancer, if you are doing authentication or troubleshooting, you want the clients to be relatively sticky.
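To illustrate the difference between a random pick and a sticky one, here is a rough sketch. It is not the PAC file being discussed in this thread; randomProxy and stickyProxy are hypothetical helper names, and the proxy addresses are reused from the even/odd example earlier.

```javascript
var PROXIES = ["PROXY 172.18.0.159:9090", "PROXY 172.18.0.160:9090"];

// Build a failover list that starts with PROXIES[first].
function orderedList(first) {
  return PROXIES[first] + "; " + PROXIES[1 - first] + "; DIRECT";
}

// Random pick: the preferred proxy can change on every evaluation.
function randomProxy() {
  return orderedList(Math.floor(Math.random() * PROXIES.length));
}

// Sticky pick: the same client address always yields the same order
// (even last octet prefers .160, odd prefers .159).
function stickyProxy(ip) {
  var lastOctet = parseInt(ip.split(".")[3], 10);
  return orderedList(lastOctet % 2 === 0 ? 1 : 0);
}

// Inside a PAC file you would return one of these from FindProxyForURL,
// for example: return stickyProxy(myIpAddress());
```

With stickyProxy a given client keeps hitting the same proxy for as long as its address stays the same, which is what you want for authentication and for finding the right log.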

Still, cool. I didn't think of that.

@E-Squared

I did mention DNS RR (last), and it works fine, but it can take time to fail over/through. Each failing proxy takes time to get through, and I'm an impatient person. But having both the virtual name and the actuals is a very good idea, as it gives automatic failover.

Also, every time I've seen DNS round robin, we ended up putting the proxy name in the block page just to figure out which proxy you're using when troubleshooting.
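For what it's worth, combining the virtual name with the actual proxies might look something like the sketch below. The name proxy.example.com is a placeholder for whatever round-robin name you define in DNS; the port and the two addresses are taken from the earlier example.

```javascript
function FindProxyForURL(url, host) {
  // "proxy.example.com" is a hypothetical round-robin DNS name.
  // The real proxy addresses follow it as explicit fallbacks.
  return "PROXY proxy.example.com:9090; " +
         "PROXY 172.18.0.159:9090; " +
         "PROXY 172.18.0.160:9090; " +
         "DIRECT";
}
```

The browser works through the list in order, so the actuals only come into play when the virtual name fails to respond.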

How to load balance in .PAC file

I am testing the .pac file with the Math.random function. Seems to work great. I see myself hitting both proxies very evenly.

I understand your concerns, but this is how our WCCP traffic is load balanced anyway. It hits both proxies at the same time, sending some traffic to one and some to the other for the same user. I have no control over what gets sent where. You are right, it is a pain with the log files, but I got used to checking both logs. And when they get sent to our Web Reporter they are combined and act as one.

Our proxies share the exact same configuration. They act as one device because they are connected in a central management configuration.

Thanks for your help and useful tips. Here is my simple .pac file. Trying to keep it as easy as possible.

How to load balance in .PAC file

Hmm, well, that's not best practice in authenticated environments. Normally MFE advises that you distribute load with WCCP based on just Source IP (not [Source IP + Source port] or Destination IP) if you are doing authentication. The only reason I would ever use anything other than Source IP is in cases where multiple users are coming from one IP (Citrix, Terminal Services, NATs, etc.).

It's great that it works, and it is certainly up to you, jont717, but you are more than doubling your authentications (due to NTLM session identifiers), and if you ever open a ticket with support regarding connection or authentication issues you will probably need to ensure that clients are sticky.

I can tell you for a fact that we have seen issues with authentication when the clients were not sticky. The last time I saw it was with a physical load balancer but the same theory applies to WCCP and proxy.pac equally.

How to load balance in .PAC file

Every time a new request comes through, the router looks in its hash table to determine which cache to send the traffic to. At the moment, for you, that cache determination is done using both Source IP and Destination IP, which is almost the worst of all worlds: you can't guarantee which cache a specific user will use, and you can't guarantee which cache all users will use for a specific destination.

User A + Site A may go to cache 1

User A + Site B may go to cache 2

User B + Site A may go to cache 2

User B + Site B may go to cache 1

Not ideal. It's not quite the worst option, as at least you can keep track of the User/Destination pairs if you wanted to, but it is still a pain.

If you weren't authenticating and you were attempting to save bandwidth (not generally a concern in this day and age), you would probably want to hash on Destination IP only. That way, regardless of the user, requests for a given destination would use the same cache, which may have already cached that content. This used to be useful but is of limited use with web 2.0 stuff/dynamic sites, and it certainly looks odd if you are doing auth or troubleshooting.

Source IP + Source port is what you want to use if everyone is coming from the same IP address, as each new connection would use some arbitrary high port. In this case it will be completely random as to which cache a user will end up using from request to request.

Source IP will just split the requests coming in across the caches based on the client's IP. That way, assuming clients don't change addresses and the cache pool is static, each client keeps using the same proxy.
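If it helps to picture it, here is a toy model of the three hashing choices. To be clear, this is not the hash WCCP routers actually use; toyHash and the helper names are made up for illustration, and only the choice of hash key matters.

```javascript
var CACHES = ["cache 1", "cache 2"];

// Toy string hash, for illustration only (not the real WCCP hash).
function toyHash(key) {
  var h = 0;
  for (var i = 0; i < key.length; i++) {
    h = (h * 31 + key.charCodeAt(i)) >>> 0;
  }
  return h;
}

function pickCache(key) {
  return CACHES[toyHash(key) % CACHES.length];
}

// Source IP only: a user stays on one cache for every destination.
function bySourceIp(src, dst) { return pickCache(src); }

// Destination IP only: a site stays on one cache for every user.
function byDestIp(src, dst) { return pickCache(dst); }

// Source IP + Destination IP: the cache depends on the pair,
// so neither the user nor the site is pinned to one cache.
function bySourceAndDest(src, dst) { return pickCache(src + "|" + dst); }
```

Whatever the real hash looks like, the key is what decides stickiness: keying on the source alone is the only one of the three that pins a user to a single cache.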

It's a good thing for authentication, it's a good thing for troubleshooting, and you should consider it.

On a side note, you should be able to switch it with almost no interruption of service, basically by changing the options for WCCP on the MWGs. However, if you can, I would probably disable WCCP on all MWGs, change the settings and then rejoin with the router(s).