Chapter 5 Deployment Design

During the deployment design phase of the solution life cycle, you design
a high-level deployment architecture and a low-level implementation specification,
and prepare a series of plans and specifications necessary to implement the
solution. Project approval occurs in the deployment design phase.

About Deployment Design

Deployment design begins with the deployment scenario created during
the logical design and technical requirements phases of the solution life
cycle. The deployment
scenario contains a logical architecture and the quality of service (QoS) requirements for the solution. You map
the components identified in the logical architecture across physical servers
and other network devices to create a deployment architecture. The QoS requirements
provide guidance on hardware configurations for performance, availability,
scalability, and other related QoS specifications.

Designing the deployment architecture is an iterative process. You typically
revisit the QoS requirements and reexamine your preliminary designs. You take
into account the interrelationship of the QoS requirements, balancing the
trade-offs and cost of ownership issues to arrive at an optimal solution that
ultimately satisfies the business goals of the project.

Project Approval

Project approval occurs during the deployment design phase, generally
after you have created the deployment architecture. Using the deployment architecture
and possibly also implementation specifications described below, the actual
cost of the deployment is estimated and submitted to the stakeholders for
approval. Once the project is approved, contracts for completion of the deployment
are signed and resources to implement the project are acquired and allocated.

Deployment Design Outputs

During the deployment design phase, you might prepare any of the following
specifications and plans:

Deployment architecture. A
high-level architecture that depicts the mapping of a logical architecture
to a physical environment. The physical environment includes the computing
nodes in an intranet or Internet environment, processors, memory, storage
devices, and other hardware and network devices.

Implementation specifications. Detailed
specifications used as a blueprint for building the deployment. These specifications
provide specifics on the computer and network hardware to acquire and describe
the network layout for the deployment. Implementation specifications also
include specifications for directory services, including details on a directory
information tree (DIT) and the groups and roles defined for directory access.

Implementation plans. A
group of plans that cover various aspects of implementing an enterprise software
solution. Implementation plans include the following:

Migration plan. Describes the
strategies and processes for migrating enterprise data and upgrading enterprise
software. The migrated data must conform to the formats and standards of the
newly installed enterprise applications. All enterprise software must be at
correct release version levels to interoperate.

Installation plan. Derived
from the deployment architecture, this plan specifies hardware server names,
installation directories, the installation sequence, the type of installation
for each node, and the configuration information necessary to install and
configure a distributed deployment.

User management plan. Includes
migration strategies for data in existing directories and databases, directory
design specifications that take into account the replication design specified
in the deployment architecture, and procedures for provisioning directories
with new content.

Test plan. Describes the procedures
for testing the deployed software, including specific plans for developing
prototype and pilot implementations, stress tests that determine the ability
to handle projected loads, and functional tests that determine if planned
functionality operates as expected.

Roll-out plan. Describes the
procedures and schedule for moving the implementation from a planning and
test environment to a production environment. Moving an implementation into
production usually occurs in phases. For example, the first phase
might deploy the software for a limited group of users, with the user base
increasing in each subsequent phase until the entire deployment is complete.
Phased implementation can also include the scheduled rollout of specific
software packages.

Disaster recovery plan. Describes
procedures on how to restore the system from unexpected system-wide failures.
The recovery plan includes procedures for both large scale and small scale
failures.

Training plan. Contains processes
and procedures for training operators, administrators, and end users on the
newly installed enterprise software.

Factors Affecting Deployment Design

Several factors influence the decisions you make during deployment design.
Consider the following key factors:

Logical architecture. The
logical architecture details the functional services in a proposed solution
and the interrelationships of the components providing those services. Use
the logical architecture as a key to determining the best way to distribute
services. A deployment scenario contains the logical architecture paired with
quality of service requirements (described below).

Quality of service requirements. The quality of service (QoS) requirements specify
various aspects of a solution’s operation. Use the QoS requirements
to help develop strategies to achieve performance, availability, scalability,
serviceability, and other quality of service goals. A deployment scenario contains
the logical architecture (described previously) paired with quality of service
requirements.

Usage analysis. Usage analysis, developed during the technical requirements phase
of the solution life cycle, provides information on usage patterns that can
help estimate load and stress on a deployed system. Use the usage analysis
to help isolate performance bottlenecks and develop strategies to satisfy
QoS requirements.

Use cases. Use cases, developed during the technical requirements phase
of the solution life cycle, list the distinct user interactions identified for
a deployment, often identifying the most common interactions. Although the use
cases are embodied in the usage analysis, when assessing a deployment design
you should refer to the use cases to make sure that they are properly addressed.

Service level agreements. A service level agreement (SLA) specifies minimum
performance requirements, and when those requirements are not met, the level
and extent of customer support that must be provided. A deployment design
should easily meet the performance requirements specified in a service level
agreement.

Total cost of ownership. During deployment design you analyze potential solutions that
address the QoS requirements for availability, performance, scalability, and
others. However, for each solution you consider, you must also consider the
cost of that solution and how that cost impacts the total cost of ownership.
Make sure that you consider the trade-offs embodied by your decisions and
that you have optimized your resources to achieve business requirements within
business constraints.

Business goals. Business goals are stated during the business analysis phase
of the solution life cycle and include the business requirements and business
constraints to meet those goals. Deployment design is ultimately judged by
its ability to satisfy the business goals.

Deployment Design Methodology

As with other aspects of deployment planning, deployment design is as
much an art as it is a science and cannot be detailed with specific procedures
and processes. Factors that contribute to successful deployment design are
past design experience, knowledge of systems architecture, domain knowledge,
and applied creative thinking.

Deployment design typically revolves around achieving performance requirements
while meeting other QoS requirements. The strategies you use must balance
the trade-offs of your design decisions to optimize the solution. The methodology
you use typically involves the following tasks:

Estimating processor requirements. Deployment
design often begins with estimating the number of CPUs needed for each component
in the logical architecture. Start with the use cases representing the heaviest
load and continue through each use case. Consider the load on all components
providing support to the use cases, and modify your estimates accordingly.
Also consider any previous experience you have with designing enterprise systems.

Replicating services for availability
and scalability. Once you are satisfied with the processor estimates, make modifications
to the design to account for QoS requirements for availability and scalability.
Consider load balancing solutions that address availability and failover considerations.

During your analysis, consider the trade-offs of your design decisions.
For example, what effect does the availability and scalability strategy have
on serviceability (maintenance) of the system? What are the other costs of
the strategies?

Identifying bottlenecks. As you continue with your analysis, examine the deployment design
to identify any bottlenecks that cause the transmission of data to fall below
requirements, and make adjustments.

Managing risks. Revisit your business and technical analyses with respect to
your design, making modifications to account for events or situations that
might not have been foreseen in the earlier planning.

Estimating Processor Requirements

This section discusses a process for estimating the number of CPU processors
and corresponding memory that are necessary to support the services in a deployment
design. The section includes a walkthrough of an estimation process for an
example communications deployment scenario.

The estimation of CPU computing power is an iterative process that considers
the following:

Logical components and their interactions (as indicated by
component dependencies in the logical architecture)

Usage analysis for the identified use cases

Quality of service requirements

Past experience with deployment design and with Java Enterprise System

Consultation with Sun Professional Services staff who
have experience with designing and implementing various types of deployment
scenarios

The estimation process includes the following steps. The ordering of
these steps is not critical, but provides one way to consider the factors
that affect the final result.

Determine a baseline CPU estimate for components identified
as user entry points to the system.

One design decision is whether
to fully load or partially load CPUs. Fully loaded CPUs maximize the capacity
of a system. To increase the capacity, you incur the maintenance cost and
possible downtime of adding additional CPUs. In some cases, you can choose
to add additional machines to meet growing performance requirements.

Partially loaded CPUs allow room to handle excess performance requirements
without immediately incurring maintenance costs. However, there is an additional
up-front expense for the under-utilized system.

Make adjustments to the CPU estimates to account for interactions
between components.

Study the interactions among components in
the logical architecture to determine the extra load required because of dependent
components.

Study the usage analysis for specific use cases to determine
peak loads for the system, and then make adjustments to components that handle
the peak loads.

Start with the most heavily weighted use cases
(those requiring the most load), and continue with each use case to make sure
you account for all projected usage scenarios.

Make adjustments to the CPU estimates to reflect security,
availability, and scalability requirements.

This estimation process provides starting points for determining the
actual processing power you need. Typically, you create prototype deployments
based on these estimates and then perform rigorous testing against expected
use cases. Only after iterative testing can you determine the actual processing
requirements for a deployment design.
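The estimation steps above can be sketched as a simple accumulation of per-component adjustments. The following is a hypothetical illustration only; the component names and CPU figures are assumptions for this example, not sizing recommendations.

```python
# Sketch of the CPU estimation process: start from a baseline per-component
# estimate and apply successive adjustment passes (component dependencies,
# peak load, security, availability). Figures are illustrative only.

def estimate_cpus(baseline, *adjustments):
    """Return a per-component CPU estimate after applying each adjustment."""
    estimate = dict(baseline)
    for adjustment in adjustments:
        for component, extra in adjustment.items():
            estimate[component] = estimate.get(component, 0) + extra
    return estimate

baseline = {"Messaging Server MTA (inbound)": 1, "Directory Server": 2}
peak_load = {"Messaging Server MTA (inbound)": 1, "Directory Server": 1}

print(estimate_cpus(baseline, peak_load))
# {'Messaging Server MTA (inbound)': 2, 'Directory Server': 3}
```

Each adjustment pass corresponds to one step of the process: you revisit the estimate as you work through dependent components, peak loads, and QoS requirements.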

Example: Estimating Processor Requirements

This section illustrates one methodology to estimate processing power
required for an example deployment. The example deployment is based on the
logical architecture for the identity-based communications solution for a medium-sized enterprise
of about 1,000 to 5,000 employees, as described in the section Identity-Based Communications Example.

The CPU and memory figures used in the example are arbitrary estimates
for illustration only, derived from the arbitrary data on which this theoretical
example is based. An exhaustive analysis of various factors
is necessary to estimate processor requirements. This analysis would include,
but not be limited to, the following information:

Detailed use cases and usage analysis based on an exhaustive
business analysis

Quality of service requirements determined by analysis of
business requirements

Specific costs and specifications of processing and networking
hardware

Past experience implementing similar deployments

Caution –

The information presented in these examples does not represent
specific implementation advice; it only illustrates a process you
might use when designing a system.

Determine Baseline CPU Estimate for User Entry Points

Begin by estimating the number of CPUs required to handle the expected
load on each component that is a user entry point. The following figure shows
the logical architecture for an identity-based communications scenario described
previously in Identity-Based Communications Example.

The following table lists the components in the presentation tier of
the logical architecture that interface directly with end users of the deployment.
The table includes baseline CPU estimates derived from analysis of technical
requirements, use cases, specific usage analysis, and past experience with
this type of deployment.

Include CPU Estimates for Service Dependencies

The components providing user entry points require support from other Java Enterprise System components.
As you continue to specify performance requirements, add the performance estimates
required for supporting components. The type of interactions among components
should be detailed when designing the logical architecture, as described in
the logical architecture examples in the section Example Logical Architectures.

Study Use Cases for Peak Load Usage

Return to the use cases and usage analysis to identify areas of peak
load usage and make adjustments to your CPU estimates.

For example, suppose you identify the following peak load conditions
for this deployment:

Initial ramp up of users as they log on simultaneously

Email exchanges during specified time frames

To account for this peak load usage, make adjustments to the components
providing these services. The following table outlines adjustments you might
make.

Table 5–3 CPU Estimate Adjustments for Peak Load

Component                               CPUs (Adjusted)   Description
Messaging Server MTA (inbound)          2                 Add 1 CPU for peak incoming email
Messaging Server MTA (outbound)         2                 Add 1 CPU for peak outgoing email
Messaging Server MMP                    2                 Add 1 CPU for additional load
Messaging Server STR (Message Store)    2                 Add 1 CPU for additional load
Directory Server                        3                 Add 1 CPU for additional LDAP lookups

Modify Estimates for Other Load Conditions

Continue with your CPU estimates to take into account other quality
of service requirements that can impact load:

Security. From the technical
requirements phase, determine how secure transport of data might affect the
load requirements and make corresponding modifications to your estimates.
The following section, Estimating Processor Requirements for Secure Transactions, describes a process for making these adjustments.

Latent capacity and scalability. Modify
CPU estimates as necessary to allow latent capacity for unexpected large loads
on the deployment. Look at the anticipated milestones for scaling and projected
load increase over time to make sure you can reach any projected milestones
to scale the system, either horizontally or vertically.

Update the CPU Estimates

Typically, you round up CPUs to an even number. Rounding up to an even
number allows you to evenly split the CPU estimates between two physical servers
and also adds a small factor for latent capacity. However, round up according
to your specific needs for replication of services.

As a general rule, allow 2 gigabytes of memory for each CPU. The actual
memory required depends on your specific usage and can be determined in testing.
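The rounding and memory rules of thumb above can be expressed directly. This is a sketch of the guideline only; as the text notes, actual memory requirements must be determined in testing.

```python
import math

def finalize_estimate(cpus, gb_per_cpu=2):
    """Round a CPU estimate up to the next even number (so the load can be
    split across two physical servers, with a small factor of latent
    capacity) and allocate memory at the rule-of-thumb rate of 2 GB per CPU."""
    rounded = 2 * math.ceil(cpus / 2)
    return rounded, rounded * gb_per_cpu

print(finalize_estimate(3))    # Directory Server: 3 CPUs -> (4, 8)
print(finalize_estimate(5.2))  # 5.2 CPUs -> (6, 12)
```

Adjust the rounding rule to your specific needs for replication of services, as described above.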

The following table lists the final estimates for the identity-based
communications example. These estimates do not include any additional computing
power for security and availability; those totals are added in the following
sections.

Table 5–4 CPU Estimate Adjustments for Supporting Components

Component                           CPUs   Memory
Portal Server                       4      8 GB
Communications Express              2      4 GB
Messaging Server (MTA, inbound)     2      4 GB
Messaging Server (MTA, outbound)    2      4 GB
Messaging Server (MMP)              2      4 GB
Messaging Server (Message Store)    2      4 GB
Access Manager                      2      4 GB
Calendar Server                     2      4 GB
Directory Server                    4      8 GB (rounded up from 3 CPUs/6 GB memory)
Web Server                          0      0

Estimating Processor Requirements for Secure Transactions

Secure transport of data involves handling transactions over a secure
transport protocol such as Secure Sockets Layer (SSL) or Transport Layer Security
(TLS). Transactions handled over a secure transport typically require additional
computing power: first to establish a secure session (known as the handshake),
and then to encrypt and decrypt transported data. Depending on the encryption
algorithm used (for example, a 40-bit or 128-bit encryption algorithm), the
additional computing power can be substantial.

For secure transactions to perform at the same level as nonsecure transactions,
you must plan for additional computing power. Depending on the nature of the
transaction and the Sun Java™ Enterprise System services that handle it, secure transactions
might require up to four times more computing power than nonsecure transactions.

When estimating the processing power to handle secure transactions,
analyze use cases to determine the percentage of transactions that require
secure transport. If the performance requirements for secure transactions
are the same as for non-secure transactions, modify the CPU estimates to account
for the additional computing power needed for the secure transactions.

In some usage scenarios, secure transport might be required only for
authentication. Once a user is authenticated to the system, no additional
security measures for the transport of data are required. In other scenarios,
secure transport might be required for all transactions.

For example, when a customer browses a product catalog on an online e-commerce
site, all transactions can be nonsecure until the customer has finished making
selections and is ready to “check out” to make a purchase. However,
some usage scenarios, such as deployments for banks or brokerage houses, require
most, or all, transactions to be secure and apply the same performance standard
to both secure and nonsecure transactions.

CPU Estimates for Secure Transactions

This section continues the example deployment to illustrate how to calculate
CPU requirements for a theoretical use case that includes both secure and
nonsecure transactions.

To estimate the CPU requirements for secure transactions, make the following
calculations:

1. Assume that the performance requirement for secure transactions is the
same as the performance requirement for nonsecure transactions. To account
for the extra computing power needed to handle secure transactions, the number
of CPUs for those transactions will be increased by a factor of four. As with
other CPU figures in this example, this factor is arbitrary and for
illustration purposes only.

2. Calculate the additional CPU estimate for secure transactions. Ten percent
of the baseline estimate requires secure transport:

0.10 x 4 CPUs = 0.4 CPUs

Increase the CPU power for secure transactions by a factor of four:

4 x 0.4 CPUs = 1.6 CPUs

3. Calculate the reduced CPU estimate for nonsecure transactions. Ninety
percent of the baseline estimate is nonsecure:

0.90 x 4 CPUs = 3.6 CPUs

4. Calculate the adjusted total CPU estimate for secure and nonsecure
transactions:

1.6 CPUs + 3.6 CPUs = 5.2 CPUs

5. Round up to an even number:

5.2 CPUs ==> 6 CPUs
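The calculation above reduces to a small function. The following sketch uses the same arbitrary figures as the example (a 4-CPU baseline, 10% secure traffic, and a 4x secure-transaction factor):

```python
import math

def secure_adjusted_cpus(baseline, secure_fraction, secure_factor=4):
    """Split a baseline CPU estimate into secure and nonsecure portions,
    scale the secure portion by the secure-transaction factor, and round
    the total up to an even number of CPUs."""
    secure = baseline * secure_fraction * secure_factor
    nonsecure = baseline * (1 - secure_fraction)
    total = secure + nonsecure          # 1.6 + 3.6 = 5.2 for the example
    return 2 * math.ceil(total / 2)     # round up to an even number

print(secure_adjusted_cpus(4, 0.10))    # 6
```

The secure-transaction factor is the arbitrary illustration figure from the text; in practice you would determine it from testing against your own workload.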

From the calculations for secure transactions in this example, you would
modify the total CPU estimates in Table 5–4 by adding two CPUs and four
gigabytes of memory to arrive at the following total for Portal Server.

Component       CPUs   Memory
Portal Server   6      12 GB

Specialized Hardware to Handle SSL Transactions

Specialized hardware devices, such as SSL accelerator cards and other
appliances, are available to provide computing power to handle establishment
of secure sessions and the encryption and decryption of data. When using specialized
hardware for SSL operations, computational power is dedicated to some part
of the SSL computations, typically the “handshake” operation that
establishes a secure session.

This hardware might be of benefit to your final deployment architecture.
However, because of the specialized nature of the hardware, estimate secure
transaction performance requirements first in terms of CPU power, and then
consider the benefits of using specialized hardware to handle the additional
load.

Some factors to consider when using specialized hardware are whether
the use cases support using the hardware (for example, use cases that require
a large number of SSL handshake operations) and the added layer of complexity
this type of hardware brings to the design. This complexity includes the installation,
configuration, testing, and administration of these devices.

Determining Availability Strategies

When developing a strategy for availability requirements, study the
component interactions and usage analysis to determine which availability
solutions to consider. Do your analysis on a component-by-component basis,
determining a best-fit solution for availability and failover requirements.

The following items are examples of the type of information you gather
to help determine availability strategies:

How many nines of availability are specified?

What are the performance specifications with respect to failover
situations (for example, at least 50% of performance during failover)?

Does the usage analysis identify times of peak and non-peak
usage?

What are the geographical considerations?

The availability strategy you choose must also take into consideration
serviceability requirements, as discussed in Designing for Optimum Resource Usage. Avoid complex solutions that require considerable
administration and maintenance.
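A "nines" figure translates directly into an annual downtime budget, which is often the easiest way to compare availability targets. This sketch assumes a non-leap 365-day year:

```python
def allowed_downtime_hours(availability_percent, hours_per_year=24 * 365):
    """Annual downtime budget implied by an availability percentage."""
    return hours_per_year * (1 - availability_percent / 100)

print(round(allowed_downtime_hours(99.99), 2))   # four nines: ~0.88 hours/year
print(round(allowed_downtime_hours(99.999), 3))  # five nines: ~0.088 hours/year
```

Whether scheduled maintenance counts against this budget depends on how the requirement is stated, as in the 99.99% example later in this chapter, which excludes scheduled downtime.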

Availability Strategies

Availability strategies for Java Enterprise System deployments include the following:

Load
balancing. Uses redundant hardware and software components to share
a processing load. A load balancer directs any requests for a service to one
of multiple symmetric instances of the service. If any one instance should
fail, other instances are available to assume a heavier load.

Failover. Involves
managing redundant hardware and software to provide continuous access to services
and security for critical data if any component fails.

Sun Cluster software
provides a failover solution for critical data managed by back-end components,
such as the message store for Messaging Server and calendar data for Calendar Server.

The following sections provide some examples of availability solutions
that provide various levels of load balancing, failover, and replication of
services.

Single Server System

Place all computing resources for a service on a single server. If the
server fails, the entire service fails.

Figure 5–2 Single Server System

Sun provides high-end servers that provide the following benefits:

Replacement and reconfiguration of hardware components while
the system is running

Ability to run multiple applications in fault-isolated domains
on the server

Ability to upgrade capacity, performance speed, and I/O configuration
without rebooting the system

A high-end server typically costs more than a comparable multi-server
system. However, a single server provides savings on administration, monitoring,
and hosting costs for servers in a data center. Load balancing, failover,
and removal of single points of failure are more flexible with multi-server
systems.

Horizontally Redundant Systems

There are several ways to increase availability with parallel redundant
servers that provide both load balancing and failover. The following figure
illustrates two replicated servers providing an N+1 failover system. An N+1
system has one additional server to provide 100% capacity should one server
fail.

Figure 5–3 N+1 Failover System With Two Servers

The computing power of each server in the figure above is identical. One server alone handles the
performance requirements. The other server provides 100% of the performance
when called into service as a backup.

The advantage of an N+1 failover design is 100% performance during a
failover situation. Disadvantages include increased hardware costs with no
corresponding gain in overall performance (because one server is a standby
for use in failover situations only).

The following figure illustrates a system that implements
load balancing plus failover that distributes the performance between two
servers.

Figure 5–4 Load Balancing Plus Failover Between Two Servers

In the system depicted in the preceding figure, if one server fails, all services remain available,
although at a percentage of the full capacity. The remaining server provides 6 CPUs
of computing power, which is 60% of the 10-CPU requirement.

An advantage of this design is the additional 2 CPUs of latent capacity
when both servers are available.

The following figure illustrates a distribution between a number of
servers for performance and load balancing.

Figure 5–5 Distribution of Load Between n Servers

Because there are five servers in the design depicted in the preceding figure, if one server fails
the remaining servers provide a total of 8 CPUs of computing power, which
is 80% of the 10-CPU performance requirement. If you add an additional server
with a 2-CPU capacity to the design, you effectively have an N+1 design. If
one server fails, 100% of the performance requirement is met by the remaining
servers.

This design includes the following advantages:

Added performance if a single server fails

Availability even when more than one server is down

Servers can be rotated out of service for maintenance and
upgrades

Multiple low-end servers typically cost less than a single
high-end server

However, administration and maintenance costs can increase significantly
with additional servers. You also have to consider costs for hosting the servers
in a data center. At some point you run into diminishing returns by adding
additional servers.
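The capacity arithmetic in the preceding examples can be captured in one function. The server counts and CPU figures below are the illustrative ones from the figures:

```python
def capacity_after_failure(servers, cpus_per_server, requirement_cpus, failures=1):
    """Fraction of the CPU performance requirement still met after the
    given number of server failures."""
    remaining = (servers - failures) * cpus_per_server
    return remaining / requirement_cpus

print(capacity_after_failure(2, 6, 10))  # two 6-CPU servers: 0.6 on failover
print(capacity_after_failure(5, 2, 10))  # five 2-CPU servers: 0.8 on failover
print(capacity_after_failure(6, 2, 10))  # N+1 with six 2-CPU servers: 1.0
```

Comparing these fractions against the failover performance specification (for example, at least 50% of performance during failover) shows how many servers, and of what size, a design needs.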

Sun Cluster Software

For situations that require a high degree of availability (such as four
or five nines), you might consider Sun Cluster software as part of your
availability design. A cluster system is the coupling of redundant servers
with storage and other network resources. The servers in a cluster continually
communicate with each other. If one of the servers goes offline, the remainder
of the devices in the cluster isolate the server and fail over any application
or data from the failing node to another node. This failover process is achieved
relatively quickly with little interruption of service to the users of the
system.

Availability Design Examples

This section contains two examples of availability strategies based
on the identity-based communications solution for a medium-sized enterprise
of about 1,000 to 5,000 employees, as described previously in Identity-Based Communications Example. The first
availability strategy illustrates load balancing for Messaging Server.
The second illustrates a failover solution that uses Sun Cluster software.

Load Balancing Example for Messaging Server

The following table lists the CPU estimates for each Messaging Server component
in the logical architecture. This table repeats the final estimates calculated
in the section Update the CPU Estimates.

Table 5–6 CPU Estimate Adjustments for Supporting Components

Component                           CPUs   Memory
Messaging Server (MTA, inbound)     2      4 GB
Messaging Server (MTA, outbound)    2      4 GB
Messaging Server (MMP)              2      4 GB
Messaging Server (Message Store)    2      4 GB

For this example, assume that during the technical requirements phase, the
following quality of service requirements were specified:

Availability. Overall system
availability should be 99.99% (does not include scheduled downtime). Failure
of an individual computer system should not result in service failure.

Scalability. No server
should be more than 80% utilized under daily peak load and the system must
accommodate long-term growth of 10% per year.

To fulfill the availability requirement, provide two instances of each Messaging Server component,
with each instance on a separate hardware server. If the server hosting one
instance fails, the other instance provides the service. The following figure
illustrates the network diagram for this availability strategy.

In the preceding figure the number of CPUs has doubled from the original
estimate. The CPUs are doubled for the following reasons:

In the event one server fails, the remaining server provides
the CPU power to handle the load.

For the scalability requirement that no single server is more
than 80% utilized under peak load, the added CPU power provides this safety
margin.

For the scalability requirement to accommodate 10% increased
load per year, the added CPU power adds latent capacity that can handle increasing
loads until additional scaling would be needed.
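The reasoning above, that doubled CPUs satisfy both the 80% peak-utilization rule and several years of 10% annual growth, can be checked with a small sketch. The load figures are the illustrative ones from this example:

```python
def meets_utilization(peak_load_cpus, provisioned_cpus, max_utilization=0.80):
    """Check the scalability rule that no server exceeds the maximum
    utilization under daily peak load."""
    return peak_load_cpus / provisioned_cpus <= max_utilization

# Doubling a 2-CPU component to 4 CPUs keeps peak utilization at 50%:
print(meets_utilization(2, 4))            # True
# With 10% growth per year, check the headroom after three years:
print(meets_utilization(2 * 1.10**3, 4))  # True (about 67% utilization)
```

When the check starts failing for projected future loads, that milestone marks when the system must be scaled horizontally or vertically.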

Failover Example Using Sun Cluster Software

The following figure shows an example of failover strategy for Calendar Server back-end
and Messaging Server messaging store. The Calendar Server back-end and
messaging store are replicated on separate hardware servers and configured
for failover with Sun Cluster software. The number of CPUs and corresponding
memory are replicated on each server in the Sun Cluster.

Figure 5–6 Failover Design Using Sun Cluster Software

Replication of Directory Services Example

Directory services can be replicated to distribute transactions across
different servers, providing high availability. Directory Server provides
various strategies for replication of services, including the following:

Multiple databases. Stores
different portions of a directory tree in separate databases.

Chaining and referrals. Links
distributed data into a single directory tree.

Single master replication. Provides
a central source for the master database, which is then distributed to consumer
replicas.

Multi-master replication. Distributes
the master database among several servers. Each of these masters then distributes
its database among consumer replicas.

Single Master Replication

The following figure shows a single master replication strategy that
illustrates basic replication concepts.

Figure 5–7 Single Master Replication Example

In single master replication, one instance of Directory Server manages
the master directory database, logging all changes. The master database is
replicated to any number of consumer databases. The consumer instances of Directory Server are
optimized for read and search operations. Any write operation received by
a consumer is referred back to the master. The master periodically updates
the consumer databases.
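
The write-referral flow described above can be sketched in Python. The classes
and methods here are invented for illustration and are not Directory Server
interfaces:

```python
class Master:
    """The single writable instance; logs all changes and periodically
    pushes them to consumers."""
    def __init__(self):
        self.db = {}
        self.changelog = []   # log of all changes made at the master
        self.consumers = []

    def write(self, key, value):
        self.db[key] = value
        self.changelog.append((key, value))

    def replicate(self):
        # Periodic update of the consumer databases
        for key, value in self.changelog:
            for consumer in self.consumers:
                consumer.db[key] = value
        self.changelog.clear()

class Consumer:
    """A read-optimized replica; any write is referred back to the master."""
    def __init__(self, master):
        self.master = master
        self.db = {}

    def search(self, key):
        return self.db.get(key)        # reads and searches served locally

    def write(self, key, value):
        self.master.write(key, value)  # referral to the single master

master = Master()
replica = Consumer(master)
master.consumers.append(replica)

replica.write("uid=jdoe", "Jane Doe")  # referred to the master
master.replicate()                     # periodic consumer update
print(replica.search("uid=jdoe"))      # -> Jane Doe
```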

Advantages of single master replication include:

Single instance of Directory Server optimized for database
read and write operations

Any number of consumer instances of Directory Server optimized
for read and search operations

Horizontal scalability for consumer instances of Directory Server

Multi-Master Replication

The following figure shows a multi-master replication strategy that
might be used to distribute directory access globally.

In multi-master replication, one or more instances of Directory Server manage
the master directory database. Each master has a replication agreement that
specifies procedures for synchronizing the master databases. Each master replicates
to any number of consumer databases. As with single master replication, the
consumer instances of Directory Server are optimized for read and search
access. Any write operation received by a consumer is referred back to the
master. The master periodically updates the consumer databases.

Figure 5–8 Multi-master Replication Example

The multi-master replication strategy provides all the advantages of single
master replication, plus an availability strategy that can provide load balancing
for updates to the masters. You can also implement an availability strategy
that provides local control of directory operations, which is an important
consideration for enterprises with globally distributed data centers.
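
A similarly simplified sketch can model multi-master synchronization. The
last-writer-wins merge below is just one conceivable policy, chosen for
illustration; the actual synchronization procedure is defined by the
replication agreement:

```python
import itertools

_clock = itertools.count()   # stand-in for replication change timestamps

class MultiMaster:
    """One of several writable masters. The replication agreement is
    modeled as a peer list plus a merge policy (last writer wins here,
    purely for illustration)."""
    def __init__(self):
        self.db = {}     # key -> (timestamp, value)
        self.peers = []

    def write(self, key, value):
        self.db[key] = (next(_clock), value)

    def synchronize(self):
        # Exchange entries with each peer; the newest write wins
        for peer in self.peers:
            for key in set(self.db) | set(peer.db):
                newest = max(self.db.get(key, (-1, None)),
                             peer.db.get(key, (-1, None)))
                self.db[key] = peer.db[key] = newest

a, b = MultiMaster(), MultiMaster()
a.peers, b.peers = [b], [a]
a.write("ou=sales", "San Francisco")   # update accepted at one master
b.write("ou=eng", "Dublin")            # update accepted at another
a.synchronize()
print(a.db["ou=eng"][1], b.db["ou=sales"][1])  # -> Dublin San Francisco
```

After synchronization, either master can accept updates, which is what allows
load balancing of writes and local control of directory operations.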

Determining Strategies for Scalability

Scalability is the ability to add capacity to your system, usually by
the addition of system resources, but without changes to the deployment architecture.
During requirements analysis, you typically make projections of expected growth
to a system based on the business requirements and subsequent usage analysis.
These projections of the number of users of a system and the capacity of the
system to meet their needs are often estimates that can vary significantly
from the actual numbers for the deployed system. Your design should be flexible
enough to allow for variance in your projections.

A design that is scalable includes sufficient latent capacity to handle
increased loads until a system can be upgraded with additional resources.
Scalable designs can be readily scaled to handle increasing loads without
redesign of the system.

Latent Capacity

Latent capacity is an aspect of scalability in which you build additional
performance and availability resources into your system so that the system can
easily handle unusual peak loads. You can also monitor how latent capacity
is used in a deployed system to help determine when to scale the system by
adding resources. Latent capacity is one way to build safety into your design.

Analysis of use cases can help identify the scenarios that can create
unusual peak loads. Combine this analysis of unusual peak loads with a factor
for unexpected growth to design latent capacity that builds safety into
your system.
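
The peak-load-plus-growth-factor approach can be expressed as simple
arithmetic. Both factors and the transaction rate below are hypothetical
placeholders:

```python
def latent_capacity(nominal_peak, unusual_peak_factor=1.5,
                    unexpected_growth_factor=1.2):
    """Size total capacity from the analyzed unusual-peak load plus a
    factor for unexpected growth; the difference is the latent capacity."""
    required = nominal_peak * unusual_peak_factor * unexpected_growth_factor
    return required, required - nominal_peak

required, latent = latent_capacity(1000)  # e.g. 1000 transactions/second
print(round(required), round(latent))     # -> 1800 800
```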

Your system design should be able to handle projected capacity for a
reasonable time, generally the first 6 to 12 months of operation. Maintenance
cycles can be used to add resources or increase capacity as needed. Ideally,
you should be able to schedule upgrades to the system on a regular basis,
but predicting needed increases in capacity is often difficult. Rely on careful
monitoring of your resources as well as business projections to determine
when to upgrade a system.

If you plan to implement your solution in incremental phases, you might
schedule increasing the capacity of the system to coincide with other improvements
scheduled for each incremental phase.

Scalability Example

The example in this section illustrates horizontal and vertical scaling
for a solution that implements Messaging Server. For vertical scaling,
you add CPUs to a server to handle increasing loads. For horizontal
scaling, you add servers to distribute the load.

The baseline for the example assumes a 50,000 user base supported by
two message store instances that are distributed for load balancing. Each
server has two CPUs for a total of four CPUs. The following figure shows how
this system can be scaled to handle increasing loads for 250,000 users and
2,000,000 users.
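
Assuming load scales roughly linearly with the number of users (a
simplification; factors such as load balancing, failover, and changing usage
patterns are ignored), the baseline ratio can be extrapolated as a rough
sketch. The resulting counts are extrapolations, not values taken from the
figure:

```python
import math

BASELINE_USERS = 50_000
BASELINE_CPUS = 4            # two message store servers, two CPUs each

def total_cpus(users):
    """Extrapolate total CPUs from the baseline users-per-CPU ratio."""
    return math.ceil(users / (BASELINE_USERS / BASELINE_CPUS))

def servers_needed(users, cpus_per_server=2):
    """Horizontal scaling: distribute the CPUs across two-CPU servers."""
    return math.ceil(total_cpus(users) / cpus_per_server)

for users in (250_000, 2_000_000):
    print(users, "users:", total_cpus(users), "CPUs, or",
          servers_needed(users), "two-CPU servers")
```

Vertical scaling concentrates the extrapolated CPUs on fewer, larger servers;
horizontal scaling spreads them across more small servers.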

Note –

Figure 5–9 shows the differences between vertical scaling and horizontal
scaling. The figure does not show other factors to consider when scaling, such
as load balancing, failover, and changes in usage patterns.

Figure 5–9 Horizontal and Vertical Scaling Examples

Identifying Performance Bottlenecks

One of the keys to successful deployment design is identifying potential
performance bottlenecks and developing a strategy to avoid them. A performance
bottleneck occurs when the rate at which data is accessed cannot meet specified
system requirements.

Bottlenecks can be categorized according to various classes of hardware,
as listed in the following table of data access points within a system. This
table also suggests potential remedies for bottlenecks in each hardware class.

Disk. Remedies include the following:

Dedicate disk access to specific functions, such as read only or write
only

Cache frequently accessed data in system memory

Network interface. Access speed varies depending on the bandwidth and
access speed of nodes on the network. Remedies include the following:

Increase bandwidth

Add accelerator hardware when transporting secure data

Improve performance on nodes within the network so that data is more
readily available

Note –

The preceding table lists hardware classes in order of relative access
speed, implying that slow access points, such as disks, are more likely to be
sources of bottlenecks. However, processors that are underpowered for large
loads are also likely sources of bottlenecks.
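
The remedy of caching frequently accessed data in system memory can be
sketched with a simple in-memory memoization layer; the lookup function below
is a hypothetical stand-in for a disk-bound read:

```python
from functools import lru_cache

DISK_READS = 0   # counts how often we fall through to (slow) disk access

@lru_cache(maxsize=1024)
def lookup(key):
    """Hypothetical disk-bound read; cached results never touch disk."""
    global DISK_READS
    DISK_READS += 1
    return f"record-for-{key}"   # stand-in for data read from disk

# Repeated lookups of a hot key hit memory, not disk:
for _ in range(100):
    lookup("uid=jdoe")
print(DISK_READS)   # -> 1: only the first lookup reached the disk
```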

You typically begin deployment design with baseline processing power
estimates for each component in the deployment and their dependencies. You
then determine how to avoid bottlenecks related to system memory and disk
access. Finally, you examine the network interface to determine potential
bottlenecks and focus on strategies to overcome them.

Optimizing Disk Access

A critical component of deployment design is the speed of disk access
to frequently accessed datasets, such as LDAP directories. Disk access provides
the slowest access to data and is a likely source of a performance bottleneck.

One way to optimize disk access is to separate write operations from
read operations. Not only are write operations more expensive than read
operations, but read operations (lookup operations for LDAP directories) also
typically occur considerably more frequently than write operations (updates to
data in LDAP directories).

Another way to optimize disk access is by dedicating disks to different
types of I/O operations. For example, provide separate disk access for Directory Server logging
operations, such as transaction logs and event logs, and LDAP read and write
operations.

Also, consider implementing one or more instances of Directory Server dedicated
to read and write operations, and using replicated instances distributed to
local servers for read and search access. Chaining and linking options are
also available to optimize access to directory services.
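
Dedicating disks to different types of I/O can be sketched as a simple mapping
from I/O category to device. The device paths and category names below are
hypothetical, chosen only to illustrate the layout:

```python
# Hypothetical mapping of Directory Server I/O categories to dedicated
# disks; the device paths and category names are illustrative only.
DISK_LAYOUT = {
    "transaction-log": "/dev/dsk/c0t1d0",   # logging on its own disk
    "event-log":       "/dev/dsk/c0t1d0",
    "ldap-read":       "/dev/dsk/c0t2d0",   # reads separated from writes
    "ldap-write":      "/dev/dsk/c0t3d0",
}

def device_for(io_type):
    """Route each I/O category to its dedicated disk."""
    return DISK_LAYOUT[io_type]

print(device_for("ldap-read"))   # -> /dev/dsk/c0t2d0
```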

Designing for Optimum Resource Usage

Deployment design is not just a matter of estimating the resources required
to meet the QoS requirements. During deployment design, you also analyze all
available options and select the solution that minimizes cost while still
fulfilling the QoS requirements. You must analyze the trade-offs of each design
decision to make sure that a benefit in one area is not offset by a cost in
another.

For example, horizontal scaling for availability might increase overall
availability, but at the cost of increased maintenance and service. Vertical
scaling for performance might increase computing power inexpensively, but
the additional power might be used inefficiently by some services.

Before completing your design strategy, examine your decisions to make
sure that you have balanced the use of resources with the overall benefit
to the proposed solution. This analysis typically involves examining how system
qualities in one area affect other system qualities. The following table lists
some system qualities and corresponding considerations for resource management.

Table 5–8 Resource Management
Considerations

System Quality

Description

Performance

For performance solutions that concentrate CPUs on individual servers,
will the services be able to efficiently use the computing power? (For example,
some services have a ceiling on the number of CPUs that can be efficiently
used.)

Latent capacity

Does your strategy handle loads that exceed performance estimates?

Are excessive loads handled with vertical scaling on servers, load balancing
to other servers, or both?

Is the latent capacity sufficient to handle unusual peak loads until
you reach the next milestone for scaling the deployment?

Security

Have you sufficiently accounted for the performance overhead required
to handle secure transactions?

Availability

Have you accounted for the scheduled downtime necessary to maintain
the system?

Have you balanced the costs between high-end servers and low-end servers?

Scalability

Have you estimated milestones for scaling the deployment?

Do you have a strategy to provide enough latent capacity to handle projected
increases in load until you reach the milestones for scaling the deployment?

Serviceability

Have you taken administration, monitoring, and maintenance costs into
account in your availability design?

Have you considered delegated administration solutions (allowing end-users
to perform some administration tasks) to reduce administration costs?

Managing Risks

Much of the information on which deployment design is based, such as
quality of service requirements and usage analysis, is not empirical data
but estimates and projections ultimately derived from business analyses.
These projections can be inaccurate for many reasons, including unforeseen
changes in the business climate, faulty methods of gathering data, or simple
human error. Before completing a deployment design, revisit the analyses
upon which your design is based and make sure that your design accounts
for any reasonable deviations from the estimates or projections.

For example, if the usage analysis underestimates the actual usage of
the system, you run the risk of building a system that cannot cope with the
amount of traffic it encounters. A design that underperforms will surely
be considered a failure.

On the other hand, if you build a system that is several orders of magnitude
more powerful than required, you divert resources that could be used elsewhere.
The key is to include a margin of safety above the requirements, but to avoid
extravagant use of resources.

Extravagant use of resources results in a failure of the design because
underutilized resources could have been applied to other areas. Additionally,
extravagant solutions might be perceived by stakeholders as not fulfilling
contracts in good faith.

Example Deployment Architecture

The following figure represents a completed deployment architecture
for the example deployment introduced earlier in this white paper. This figure
provides an idea of how to present a deployment architecture.

Caution –

The deployment architecture in the following figure is for
illustration purposes only. It does not represent a deployment that has been
actually designed, built, or tested and should not be considered as deployment
planning advice.