Introduction

SolrCloud

SolrCloud is the name of a set of distributed capabilities in Solr. Enabling them through startup parameters lets you set up a highly available, fault-tolerant cluster of Solr servers. Use SolrCloud when you need large-scale, fault-tolerant, distributed indexing and search.

Getting Started

If you haven't yet, go through the simple Solr Tutorial to familiarize yourself with Solr. Note: reset all configuration and remove the tutorial documents before trying the cloud features. Copying example directories that contain pre-existing Solr indexes will cause document counts to be off.

Simple two-shard cluster

In this example we will set up a two-shard cluster using two running instances of JBoss. Both JBoss instances will run on the same server and IP address, but will listen on different ports.

Installing and configuring JBoss

Extract the downloaded JBoss archive into two different directories. For this example, we will use /opt/jboss_1 and /opt/jboss_2.

By default, JBoss serves applications on port 8080. We will keep that for jboss_1. For jboss_2 we will add an offset of 100 to the socket bindings, so that it starts on port 8180. To do this, modify the socket-binding-group section in /opt/jboss_2/standalone/configuration/standalone-full.xml so that its port-offset default is 100 (notice the port-offset:100).
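The port-offset change can be illustrated with a short sketch. It assumes the stock file declares the offset as ${jboss.socket.binding.port-offset:0}, which is how JBoss AS 7 ships it; if your version writes the attribute differently, adjust the pattern. The sample line below is a stand-in, so the sketch does not touch any real file.

```shell
# Sample socket-binding-group line, as assumed to appear in a stock
# JBoss AS 7 standalone-full.xml (stand-in text, not read from disk)
line='<socket-binding-group name="standard-sockets" default-interface="public" port-offset="${jboss.socket.binding.port-offset:0}">'

# Change the default offset from 0 to 100; the rest of the line is untouched
updated=$(printf '%s\n' "$line" | sed 's/port-offset:0}/port-offset:100}/')

printf '%s\n' "$updated"
```

To apply the same substitution to the real file while keeping a backup copy, something like sed -i.bak 's/port-offset:0}/port-offset:100}/' /opt/jboss_2/standalone/configuration/standalone-full.xml should work, though editing the file by hand is just as valid.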

Preparing Solr

For this example we will set up two instances of Solr, placed in /opt/solr1 and /opt/solr2.

Next, copy the example/solr directory from the downloaded Solr tar file to /opt/solr1 and /opt/solr2. The -r flag is needed because example/solr is a directory; the targets should not exist beforehand, so that collection1 ends up directly under /opt/solr1 and /opt/solr2.

cp -r example/solr /opt/solr1
cp -r example/solr /opt/solr2

Next, modify /opt/solr1/collection1/conf/schema.xml and /opt/solr2/collection1/conf/schema.xml as needed.

Installing and starting ZooKeeper

Untar the Apache ZooKeeper archive to a directory. For this example we will use /opt/zookeeper.

Create a data directory for ZooKeeper. For this example we will use /opt/zookeeper_data.

Copy /opt/zookeeper/conf/zoo_sample.cfg to /opt/zookeeper/conf/zoo.cfg:

cp /opt/zookeeper/conf/zoo_sample.cfg /opt/zookeeper/conf/zoo.cfg

Edit /opt/zookeeper/conf/zoo.cfg and set the ZooKeeper dataDir to the data directory created above:

# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
dataDir=/opt/zookeeper_data
# the port at which the clients will connect
clientPort=2181
#
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1
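The directory creation, copy, and dataDir edit above can also be scripted. The following is a minimal sketch; configure_zk is a hypothetical helper (not part of ZooKeeper), and it assumes the sample file contains a single line starting with dataDir=, as the listing above does.

```shell
# configure_zk ZK_HOME DATA_DIR
# Hypothetical helper: creates DATA_DIR and writes conf/zoo.cfg from
# conf/zoo_sample.cfg with only the dataDir line rewritten.
configure_zk() {
    zk_home=$1
    data_dir=$2

    # Create the data directory if it does not exist yet
    mkdir -p "$data_dir" || return 1

    # Copy the sample config, replacing the dataDir line on the way.
    # "|" is used as the sed delimiter because paths contain "/".
    sed "s|^dataDir=.*|dataDir=$data_dir|" \
        "$zk_home/conf/zoo_sample.cfg" > "$zk_home/conf/zoo.cfg"
}
```

With the paths used in this example, the call would be configure_zk /opt/zookeeper /opt/zookeeper_data.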

Start ZooKeeper using zkServer.sh:

/opt/zookeeper/bin/zkServer.sh start

This will start ZooKeeper on port 2181, the clientPort set in zoo.cfg.

Upload the Solr configuration to ZooKeeper

Use the zkcli.sh script bundled with the Solr distribution for this. It is available in example/cloud-scripts/ as of Solr 4.3.1.
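Before moving on, it can help to confirm that the server is actually answering. The sketch below sends ZooKeeper's "ruok" four-letter command and expects "imok" back. zk_ok is a hypothetical helper name, and the sketch assumes nc (netcat) is installed; newer ZooKeeper releases (3.5 and later) only answer four-letter commands that are whitelisted in the configuration, so a missing reply there does not necessarily mean the server is down.

```shell
# zk_ok HOST PORT
# Hypothetical helper: succeeds only if the server replies "imok" to "ruok".
zk_ok() {
    # -w 2 makes nc give up after two seconds instead of hanging
    reply=$(printf 'ruok' | nc -w 2 "$1" "$2" 2>/dev/null)
    [ "$reply" = "imok" ]
}

if zk_ok localhost 2181; then
    echo "ZooKeeper is answering on port 2181"
else
    echo "No reply from ZooKeeper on port 2181"
fi
```

Running /opt/zookeeper/bin/zkServer.sh status is an alternative that does not depend on nc.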