Run some tests against a standard workflow in your application and figure out how many sessions a server can handle well. Count in the delay people typically leave between requests, and monitor whatever matters to you: CPU usage staying below 100%, remaining memory, good response times, etc. The number of sessions that still meets those limits is your capacity per server.

No, capacity for CPU does not affect the calculation in any way. The value is your average load divided by the number of cores available to the JVM. For instance, if you have 4 cores, your maximum "average load" value might be 400. That figure divided by 4 is the Load returned. So, if you set your quad core on fire, it will return a Load of ~99 or so.
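
To make the arithmetic concrete, here is a minimal sketch of that division. The function name and the percentage scale are my own assumptions for illustration; this is not the actual implementation.

```python
def per_core_load(total_load_percent: float, cores: int) -> float:
    """Divide a system-wide load figure by the number of cores.

    Hypothetical helper: total_load_percent is assumed to be on a scale
    where each fully busy core contributes 100.
    """
    if cores <= 0:
        raise ValueError("cores must be positive")
    return total_load_percent / cores


# A quad core running almost flat out: 396 / 4
print(per_core_load(396.0, 4))  # 99.0
```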

If you set the capacity of your CPU metric to 2, the load number reported to the load balancer for the current node will be half of what it normally would be. The node therefore under-reports its load, and the load balancer sends it more requests than it normally would. If your goal instead is to over-report the load of a node, give the capacity a fractional value, e.g. capacity="0.5".
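
A simplified model of that scaling, assuming the reported value is just the raw load divided by the capacity. The function name is hypothetical, and a real load balancer setup may also apply weighting or history on top of this.

```python
def reported_load(raw_load: float, capacity: float) -> float:
    """Scale a raw load figure by the metric's capacity.

    Hypothetical helper: assumes reported = raw / capacity and nothing else.
    """
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    return raw_load / capacity


# capacity=2 halves the reported load, so the node looks less busy
print(reported_load(0.8, 2.0))  # 0.4

# capacity=0.5 doubles the reported load, so the node looks busier
print(reported_load(0.4, 0.5))  # 0.8
```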