If you are using an RDBMS, decide which columns you want to index and estimate the memory those indexes require.
If the indexes on a single machine take more than roughly 10-100 GB, you will likely need to shard the DB.
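
The index estimate above can be sketched as a back-of-the-envelope calculation. This is only an illustration: the row count, key sizes, the 16-byte per-entry overhead, and the 100 GB per-machine budget are all assumed numbers, and real index sizes depend on the database engine.

```python
import math

GB = 10**9


def index_memory_bytes(rows: int, key_bytes: int, overhead_bytes: int = 16) -> int:
    """Rough index size: one entry per row, key plus an assumed per-entry overhead."""
    return rows * (key_bytes + overhead_bytes)


def needs_sharding(total_index_bytes: int, per_machine_budget_bytes: int) -> bool:
    """True when the indexes do not fit in one machine's memory budget."""
    return total_index_bytes > per_machine_budget_bytes


def shards_needed(total_index_bytes: int, per_machine_budget_bytes: int) -> int:
    """Minimum number of machines so that each stays within its budget."""
    return math.ceil(total_index_bytes / per_machine_budget_bytes)


# Hypothetical table: 2 billion rows, indexed on an 8-byte id and a 32-byte email.
rows = 2_000_000_000
id_index = index_memory_bytes(rows, key_bytes=8)
email_index = index_memory_bytes(rows, key_bytes=32)
total = id_index + email_index

print(f"total index memory: {total / GB:.0f} GB")
print("shard?", needs_sharding(total, 100 * GB))
print("shards:", shards_needed(total, 100 * GB))
```

With these assumed numbers the indexes need about 144 GB, which exceeds a 100 GB budget, so the data would be split across at least two shards.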

If read traffic is much higher than write traffic, it may be a good idea to add slave replicas.
Slave replicas respond only to read traffic and keep themselves up to date by replicating data from the master server.
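
A minimal sketch of how an application might split traffic between the master and its replicas. The class name and the server names are hypothetical, and replication itself is assumed to be handled by the database.

```python
import itertools
from typing import Sequence


class ReplicatedDatabaseRouter:
    """Sends writes to the master and spreads reads across the replicas."""

    def __init__(self, master: str, replicas: Sequence[str]):
        self.master = master
        # Cycle through replicas in order; None means there are no replicas.
        self._replica_cycle = itertools.cycle(replicas) if replicas else None

    def route(self, is_write: bool) -> str:
        # Writes always go to the master. Reads fall back to the master
        # only when no replica is configured.
        if is_write or self._replica_cycle is None:
            return self.master
        return next(self._replica_cycle)


router = ReplicatedDatabaseRouter("master", ["replica-1", "replica-2"])
print(router.route(is_write=True))
print(router.route(is_write=False))
print(router.route(is_write=False))
```

Adding another replica to the list increases read capacity without changing how writes are handled.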

Approach to system design

Find requests per day.

Break it into read and write traffic.

Then get requests per second to see if a single machine can support it or not.

Estimate the amount of data in a request and response to see the network traffic requirements.
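
The steps above can be worked through with a small calculation. Every number here is an assumption chosen for easy arithmetic (daily request counts, payload sizes, and the 2,000 requests per second a single machine is assumed to handle); substitute your own measurements.

```python
import math

SECONDS_PER_DAY = 86_400


def per_second(requests_per_day: int) -> float:
    """Average requests per second for a given daily volume."""
    return requests_per_day / SECONDS_PER_DAY


# Step 1 and 2: requests per day, broken into reads and writes (assumed).
reads_per_day = 777_600_000
writes_per_day = 86_400_000

# Step 3: requests per second, compared with one machine's assumed capacity.
read_rps = per_second(reads_per_day)
write_rps = per_second(writes_per_day)
total_rps = read_rps + write_rps
machine_capacity_rps = 2_000
machines_needed = math.ceil(total_rps / machine_capacity_rps)

# Step 4: network traffic from assumed average request and response sizes.
request_bytes = 1_000
response_bytes = 10_000
inbound_bytes_per_sec = total_rps * request_bytes
outbound_bytes_per_sec = total_rps * response_bytes

print(f"reads/s: {read_rps:.0f}, writes/s: {write_rps:.0f}")
print(f"machines needed: {machines_needed}")
print(f"inbound: {inbound_bytes_per_sec / 1e6:.0f} MB/s, "
      f"outbound: {outbound_bytes_per_sec / 1e6:.0f} MB/s")
```

These are averages; peak traffic is usually higher, so a real estimate would leave headroom above these figures.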

Statelessness for scalability

Make sure your servers are stateless.
That way, a load balancer can send requests to any server (for example, in a round-robin fashion).
To do this, all the state information must be kept in the database.
This includes session information as well.
However, since session information must be available extremely fast, it is best to keep
this information in a distributed cache.
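
A minimal sketch of stateless servers behind a round-robin load balancer. The `SessionStore` class is a hypothetical in-memory stand-in for a distributed cache; in a real deployment it would be a separate networked service shared by all servers.

```python
import itertools


class SessionStore:
    """Stand-in for a distributed cache shared by every server."""

    def __init__(self):
        self._data = {}

    def get(self, session_id):
        return self._data.get(session_id, {})

    def put(self, session_id, session):
        self._data[session_id] = session


class StatelessServer:
    """Keeps no session data of its own; everything lives in the store."""

    def __init__(self, name, store):
        self.name = name
        self.store = store

    def handle(self, session_id):
        session = dict(self.store.get(session_id))
        session["hits"] = session.get("hits", 0) + 1
        self.store.put(session_id, session)
        return f"{self.name} served hit {session['hits']}"


class RoundRobinBalancer:
    """Hands each request to the next server in turn."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def handle(self, session_id):
        return next(self._cycle).handle(session_id)


store = SessionStore()
servers = [StatelessServer(f"server-{i}", store) for i in range(3)]
balancer = RoundRobinBalancer(servers)
responses = [balancer.handle("alice") for _ in range(4)]
for line in responses:
    print(line)
```

Each request lands on a different server, yet the hit counter keeps increasing because the session lives in the shared store rather than on any one server.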

If servers cannot be made stateless, then the load-balancer must be made smarter.
It should route stateful requests to the proper server.
This is done by making the load balancer inspect the session ID of each request and
match it to the appropriate server.
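
One way to sketch this session-aware routing is a lookup table from session ID to server. The class name and the assign-on-first-request policy are assumptions for illustration; real load balancers often use cookies or hashing instead of an explicit table.

```python
class StickyBalancer:
    """Routes every request of a session to the server that first handled it."""

    def __init__(self, servers):
        self._servers = list(servers)
        self._assignments = {}  # session ID -> server
        self._next = 0          # round-robin position for new sessions

    def route(self, session_id):
        if session_id not in self._assignments:
            # New session: pick the next server in round-robin order
            # and remember the choice.
            server = self._servers[self._next % len(self._servers)]
            self._assignments[session_id] = server
            self._next += 1
        return self._assignments[session_id]


balancer = StickyBalancer(["server-a", "server-b"])
for session_id in ["s1", "s2", "s1", "s3", "s2"]:
    print(session_id, "->", balancer.route(session_id))
```

The cost of this design is that the table is itself state: if the chosen server fails, its sessions are lost unless they are stored elsewhere.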

Load balancer (Also called "Reverse Proxy")

Reverse proxy (RP) systems hide the internal topology from outside clients, and in doing so they can provide several features:

Caching static content

Load balancing

Firewalling

Compression

Logging

A load balancer is useful not only for load balancing; it also brings:

Redundancy and tolerance to machine failures

Elastic load balancers can shut down some of the servers during off-peak hours.
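
The redundancy point can be illustrated with a balancer that skips servers marked as down. This is a simplified sketch: the class and method names are hypothetical, and health is set by hand here, whereas a real load balancer would detect failures with periodic health checks.

```python
class HealthAwareBalancer:
    """Round-robin over servers, skipping any that are marked as down."""

    def __init__(self, servers):
        self._servers = list(servers)
        self._healthy = set(self._servers)
        self._index = 0

    def mark_down(self, server):
        self._healthy.discard(server)

    def mark_up(self, server):
        if server in self._servers:
            self._healthy.add(server)

    def route(self):
        # Try each server at most once per call, starting where we left off.
        for _ in range(len(self._servers)):
            server = self._servers[self._index % len(self._servers)]
            self._index += 1
            if server in self._healthy:
                return server
        raise RuntimeError("no healthy servers available")


balancer = HealthAwareBalancer(["server-a", "server-b", "server-c"])
balancer.mark_down("server-b")
print([balancer.route() for _ in range(4)])
```

Clients keep getting responses while `server-b` is down, and calling `mark_up` returns it to the rotation. The same mechanism could take servers out of rotation deliberately during off-peak hours.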

vmtouch

vmtouch is a tool for seeing which files are currently held in the system's file cache.
It's a very simple C program in a single file, and it can show what is cached, evict files from the cache, or load files into it.