PYTHON, SEARCH, AND OPEN SOURCE.

Exploring LucidWorks Enterprise with SolrCloud

In this blog post I will show how to get a distributed search cluster up and running using LucidWorks Enterprise 2.1 (LWE) and SolrCloud. LucidWorks Enterprise 2.1 is the most recent release as of this post.

Definitions

I want to start with a few definitions of the terms I will be using in this article. The documentation for SolrCloud is confusing and seems to have multiple definitions for the same term. For example, the SolrCloud wiki page defines a shard as a partition of the index and the NewSolrCloudDesign wiki page seems to refer to it as a replica.

For the purpose of this article we will use the following definitions:

collection: a search index composed of the total set of documents being searched.

shard: a partition of a collection. A shard contains a subset of the documents being searched, a collection is composed of one or more shards.

replica: a copy of a collection. If a collection is composed of N shards, a single replica means each of those shards will have one copy.

node: a single instance of LucidWorks Enterprise or Solr. A single node can contain multiple collections where each collection has a different data source.

Requirements

The basic requirements for this test setup are:

Start with a single node

Create an index with two shards and one replica

Index some documents that should be split between the two shards

Bring a new node online and move one of the shards and replicas to this new node

After verifying you have a working installation, stop LWE. To do this, browse to your installation directory and run:

$LWE/app/bin/stop.sh

Step 3: Bootstrap Zookeeper

Running LucidWorks Enterprise and Solr in a distributed mode requires the use of Apache Zookeeper. Lucid Imagination’s documentation recommends running a separate Zookeeper ensemble for production deployments. That is outside the scope of this article, so we will use Solr’s embedded version of Zookeeper that is intended for development purposes only.

Since this is the first time we are running Zookeeper, we need to “bootstrap” it with LWE’s configuration files. To do this start LWE with the bootstrap flags:
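The exact invocation may differ between LWE builds; on my setup, combining the standard SolrCloud bootstrap switch with zkRun looked like the following (treat -Dbootstrap_conf=true as an assumption based on Solr's SolrCloud wiki, not something confirmed by Lucid's docs):

```shell
# Start LWE with embedded Zookeeper (-DzkRun) and upload the local
# config directories to Zookeeper (-Dbootstrap_conf=true).
$LWE/app/bin/start.sh -lwe_core_java_opts "-DzkRun -Dbootstrap_conf=true"
```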

This bootstrap process only needs to be done once. In the future, you can start LWE with the zkRun flag:

$LWE/app/bin/start.sh -lwe_core_java_opts "-DzkRun"

Once bootstrapped, you can head over to the Solr Cloud admin page (http://localhost:8888/solr/#/cloud) and check that the default LWE configs were uploaded to Zookeeper. Verify that you have configs called collection1 and LucidWorksLogs.

Step 4: Create A Test Collection

To keep things simple, we are going to use LucidWorks Enterprise's "collection1" configuration. This is the out-of-the-box schema and solrconfig that ship with LucidWorks Enterprise. In most situations you will need to create a schema specifically for the content you are indexing, but this default configuration is fine for our test collection.

Update 05/05/12: Mark Miller pointed out that this is a lapse in the LucidWorks Enterprise documentation. You can specify the numShards parameter via the LucidWorks REST API, or, if using the UI, it will honor the numShards system property. This is nice, but it does not simplify the steps in this post; see the comments below.

I wish I could say this is as easy as executing an API call specifying that we would like to create a new collection with two shards and one replica, but I can't. In its current form, everything related to creating shards and replicas needs to be done manually.

The SolrCloud documentation mentions the numShards parameter, which I assumed would be used to automatically split new collections. In my testing this was not the case: all it does is create a new Zookeeper entry for a second shard, but you still need to manually create a Solr core for that shard using the Core Admin API.

So, now that we know we need to do everything manually, execute the four Core Admin API calls to create a single collection. The four API calls are:
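For reference, the calls I executed looked roughly like this. The core names match the ones we inspect later in this post; the exact parameter set is an assumption based on the Solr 4.x Core Admin API (CREATE with collection and shard parameters):

```shell
# Two cores for shard1 and two for shard2; each CREATE registers its
# core under the "testcollection" collection in Zookeeper.
curl "http://localhost:8888/solr/admin/cores?action=CREATE&name=testcollection_shard1_replica1&collection=testcollection&shard=shard1"
curl "http://localhost:8888/solr/admin/cores?action=CREATE&name=testcollection_shard1_replica2&collection=testcollection&shard=shard1"
curl "http://localhost:8888/solr/admin/cores?action=CREATE&name=testcollection_shard2_replica1&collection=testcollection&shard=shard2"
curl "http://localhost:8888/solr/admin/cores?action=CREATE&name=testcollection_shard2_replica2&collection=testcollection&shard=shard2"
```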

Alright, now that we have the collection created, it's time to check that everything was successful. Head back to the Solr Cloud interface (http://localhost:8888/solr/#/cloud) and view the clusterstate.json entry.

In this JSON output, you should see the new "testcollection" collection and that it is composed of two shards, "shard1" and "shard2". Expanding those shards will show our replicas, "replica1" and "replica2", for each shard.

You may be wondering why we are looking at the clusterstate.json file instead of the nice LWE Admin interface. Well, that is because when you create collections manually via Solr's Core Admin API, they do not show up in LWE. This is a bug that I hope is addressed in a future version of LucidWorks Enterprise.

Update 05/05/12: If you start LWE with the numShards parameter and use the GUI/Rest API to create the initial collection, it will show up in the UI.

Step 5: Index Data

I had intended to use the example crawler that ships with LucidWorks Enterprise, but Lucid Imagination states that data sources do not work when running LWE in SolrCloud mode. On top of that, I can't see my collection in the LWE admin interface, so I have no way to assign a data source to my test collection.

So, for quick testing purposes I will resort to using the sample documents that ship with a standard download of Apache Solr. Once you download Solr, browse to the $SOLR_HOME/example/exampledocs directory.

Edit the file post.sh to point to our test collection update handler.

URL=http://localhost:8888/solr/testcollection/update

Save the file and run:

./post.sh *.xml

Now that we have data fed, let's check that it was distributed between the two shards we created and that our replicas contain the same data. Head back to the Solr admin page at http://localhost:8888/solr/#/.

click on the first shard, "testcollection_shard1_replica1"; you should have 10 documents in this shard

click on the second shard, "testcollection_shard2_replica1"; you should have 11 documents in this shard

check the replicas for each shard; they should have the same counts

At this point, we can start issuing some queries against the collection:
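For example, assuming the default select handler (distrib=false is the standard Solr parameter for restricting a query to a single core, so you can compare per-shard counts):

```shell
# Distributed query: SolrCloud fans this out across both shards.
curl "http://localhost:8888/solr/testcollection/select?q=*:*&rows=0&wt=json"

# Non-distributed query against a single core, to see only that shard's documents.
curl "http://localhost:8888/solr/testcollection_shard1_replica1/select?q=*:*&rows=0&distrib=false&wt=json"
```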

Step 6: Add A New Node

Now time for the fun part, adding a new node. We want to create a new node and have the shards and replicas be split between the two nodes. This is going to be yet another manual process because SolrCloud and LucidWorks Enterprise do not automatically rebalance the shards as new nodes come and go.

To keep things simple, we are going to run multiple instances of LWE on the same machine. So, run the LWE installer again, but this time do not use the defaults. Select a new installation directory (I will refer to this as $LWE2 below), use port 7777 for Solr and 7878 for the LWE UI, and uncheck the box that starts LWE automatically.

Now we need to start our new instance of LucidWorks Enterprise and connect it to our existing Zookeeper instance. To do this, set the zkHost parameter to the host and port of your existing Zookeeper instance. Unfortunately, Lucid's documentation does not specify which port Zookeeper is running on. However, the SolrCloud wiki page notes that embedded Zookeeper starts on the Solr port + 1000, so in our case Zookeeper should be running on port 9888. Run the following command to start the new instance of LWE:

$LWE2/app/bin/start.sh -lwe_core_java_opts "-DzkHost=localhost:9888"

Now execute the two Solr Core Admin API calls to create our shard and replica on this new node, since they are not automatically migrated from the first server.
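The calls mirror the ones from step 4, pointed at the new instance's Solr port; the replica names here are my own choice for illustration, not anything mandated by Solr:

```shell
# On the second instance (Solr on port 7777), create one core per shard.
# SolrCloud replicates the existing documents to the new cores automatically.
curl "http://localhost:7777/solr/admin/cores?action=CREATE&name=testcollection_shard1_replica3&collection=testcollection&shard=shard1"
curl "http://localhost:7777/solr/admin/cores?action=CREATE&name=testcollection_shard2_replica3&collection=testcollection&shard=shard2"
```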

At this point, take a look at the cluster state like we did at the end of step 4 above. You should still see our two shards, but each shard should now have three replicas: two on the first node and one on the new node.

Also take a look at the new node's admin interface at http://localhost:7777/solr/#/. If you look at the core status for our new shards, you should see that our documents were automatically sent over from the first node. Finally, something I did not need to do manually!

Issuing the same queries from step 5 above against the new node should yield the same results.
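To get from three replicas per shard down to the target layout, I removed the redundant cores from the first node. This step is a sketch assuming the Core Admin UNLOAD action and the core names from step 4:

```shell
# Unload the extra replicas on the first node so each shard is left with
# one copy per node; UNLOAD deregisters the core from Zookeeper.
curl "http://localhost:8888/solr/admin/cores?action=UNLOAD&core=testcollection_shard1_replica2"
curl "http://localhost:8888/solr/admin/cores?action=UNLOAD&core=testcollection_shard2_replica2"
```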

Take a look at the cluster state again and observe that we have finally achieved our desired outcome, a single collection with two shards and a replica.

Conclusion:

As you can see, it is possible to get LucidWorks Enterprise up and running with SolrCloud, but it is not a trivial process. Hopefully future versions of LWE will make this easier and address some of the bugs I mentioned above. At this point SolrCloud feels half-baked, and its integration into LucidWorks Enterprise even more so. Considering all the LWE features that do not work when running in SolrCloud mode, you would probably be better off running a nightly build of Solr 4.0, which will have the latest SolrCloud patches.

6 Comments.

Hey Matt – a couple things I’d point out – first, the numShards param is the key to controlling the number of replicas. Second, with LWE, you do not need to drop down to Solr and make all these calls at all. I’m trying to find where in the doc it says you cannot use the admin or REST API for this, but I don’t see it. In fact you can! Simply use the LWE collections API to create a new collection, or create a new collection in the admin UI. One of the params you can pass when using the REST API is numShards. That is not available from the UI yet (it uses whatever numShards was passed as a sys prop on startup).

The integration for 2.1 is certainly early – but you don’t have to jump through all these hoops at all.

That would be great, but I was unable to find any documentation supporting this, hence the need to drop down to the Solr Core Admin API.

“When creating a new collection (with either the Admin UI or the API), you cannot yet specify the number of shards to break it up into. If this is something you need to do, please contact Lucid Imagination for assistance.”

Also, you say numShards is for replicas. What I want to do is create a single collection that is composed of two partitions (shards) and a single replica. Is there a system property that specifies the number of partitions I want? The definitions of terms used in SolrCloud is confusing.

“When creating a new collection (with either the Admin UI or the API), you cannot yet specify the number of shards to break it up into. If this is something you need to do, please contact Lucid Imagination for assistance.”

Yeah, I checked on this and this was pointed out to me. It’s a lapse in the doc – something else was meant. It will be corrected. The default UI and API calls for collection management will work.

I can ask that we make sure to explicitly document that collection management still works when using SolrCloud functionality.

“What I want to do is create a single collection that is composed of two partitions (shards) and a single replica.”

If you want 2 shards, simply use a numShards of 2. Once you start the first instance, it will be part of shard1. Start another and it will be part of shard2. Start another and it will be a shard1 replica. Start a 4th instance and it will be a shard2 replica.

You just give the number of “partitions” and then instances evenly become replicas for each shard.

We plan on adding further ways to control this, but that is the initial knob.

Hey Mark, thanks for pointing out that it is possible to start LWE with the numShards parameter and that the LWE UI/REST create collection commands will honor that setting. I did some testing and found that this does work; however, it does not simplify any of the steps I describe in the blog post. The exception being that I no longer need to specify the shard names when executing the Core Admin API commands.

The numShards parameter basically only manages creating the shards in zookeeper. It does not automatically create the additional shards/replicas within LWE itself, I still need to do that manually via Core Admin API or the LWE UI/Rest API. If I start LWE with numShards=2 and don’t bring up another instance of LWE and create the collection manually, I get errors while feeding.

Also, to do everything using the LucidWorks Enterprise GUI/REST API, I would need to bring up 4 separate instances of LWE vs. having Shard 1/Shard 2 Replica in a single instance and Shard 2/Shard 1 Replica in another. This means I still need to drop down to Solr’s Core Admin API to set this up manually. Basically, I still need to jump through all the hoops I talk about above to create a multi-shard collection with replicas.

Hey Matt, thanks for your post. I have a couple of questions about running SolrCloud and LWE. 1) You mentioned creating a custom schema for your collection. If you’ve created a collection and want to update the schema, do you know where you would do that? It looks like the config files are stored in Zookeeper somewhere, but I haven’t found where in my file system they’re actually stored. If I update the schema in the actual collection, the changes don’t show up in Zookeeper. Updates through the LWE UI work, but that’s a pain for large scale changes.

2) Have you tried creating a Zookeeper ensemble? I’ve tried as described on the SolrCloud wiki, but when I start a server with more than one Zookeeper host, I get a “Port out of range: -1” error. Here’s the start command I’m using:

1. You need to send the updated schema to ZooKeeper (essentially what we did in step 3 above while bootstrapping). Once the changed schema is in ZooKeeper you need to restart all your nodes. LWE is not smart enough to detect changes and automatically apply the new schema. Your schema should be located somewhere in $LWE_HOME/conf/solr/cores folder.
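If you have a stock Apache Solr download handy, its zkcli script can push a config directory up to ZooKeeper; the paths below are assumptions for a default Solr 4.x layout, so adjust for your install:

```shell
# Upload the edited config set to the embedded Zookeeper, then restart the nodes.
$SOLR_HOME/example/cloud-scripts/zkcli.sh -zkhost localhost:9888 \
  -cmd upconfig \
  -confdir $LWE_HOME/conf/solr/cores/collection1/conf \
  -confname collection1
```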

2. I have not, but I know it is not recommended to run the embedded version of ZooKeeper for an ensemble. Set up a standalone ZooKeeper ensemble and see if that helps.

At any rate, SolrCloud is not very good right now, and its integration into LWE is even worse. I recommend taking a look at ElasticSearch, as it is superior to SolrCloud and has commercial support just like LWE.