
Converting CherryPy To A Forking Only Web Server


Hi,

I am in an unusual situation: I have a 24-core box with 64 GB of RAM as a Splunk search head. Given the nature of how Python's threading works[1], [2], has anyone converted the default CherryPy configuration to use processes instead of threads? In the past, with other Python web frameworks, I have used mod_wsgi with Apache's prefork MPM to accomplish this.
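For context, a minimal sketch of why threads don't help here: because of the GIL, CPU-bound work in a single Python process is serialized across threads, while a process pool can actually use multiple cores. The `burn` function and pool sizes below are illustrative, not anything from Splunk or CherryPy.

```python
import time
from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor

def burn(n):
    """CPU-bound work: the GIL serializes this across threads in one process."""
    total = 0
    for i in range(n):
        total += i * i
    return total

def timed(executor_cls, n=2_000_000, workers=4):
    """Run the same CPU-bound job on `workers` workers and time it."""
    with executor_cls(max_workers=workers) as ex:
        start = time.perf_counter()
        results = list(ex.map(burn, [n] * workers))
        return time.perf_counter() - start, results

if __name__ == "__main__":
    # On a multi-core box, the process pool typically finishes much faster,
    # because each worker process has its own interpreter and GIL.
    print("threads:  %.2fs" % timed(ThreadPoolExecutor)[0])
    print("processes: %.2fs" % timed(ProcessPoolExecutor)[0])
```

For purely I/O-bound requests the gap narrows, since threads release the GIL while blocked on I/O, which is part of why measuring the actual workload matters.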

Looking through the CherryPy documentation, you can't disable threading entirely, but it appears reasonable to fork each request, or to use the multiprocessing module to create a process pool that requests are handed to. Another way to verify what is actually going on is to use WSGI monitoring middleware to time the whole request/response cycle. Having read through Python core's bug reports on threading, I am somewhat skeptical that threading behavior is ever obvious, even with purely I/O-bound requests.
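The timing middleware mentioned above can be sketched in a few lines against the standard WSGI interface (PEP 3333). This is a generic illustration, not Splunk's or CherryPy's code; note it times only until the app returns its response iterable, not the consumption of a streaming body.

```python
import time

class TimingMiddleware:
    """Wraps any WSGI app and logs wall-clock time per request/response cycle."""

    def __init__(self, app, log=print):
        self.app = app
        self.log = log

    def __call__(self, environ, start_response):
        start = time.perf_counter()
        try:
            return self.app(environ, start_response)
        finally:
            # Caveat: a generator body streamed after this point is not counted.
            elapsed = time.perf_counter() - start
            self.log("%s %s took %.3fs" % (
                environ.get("REQUEST_METHOD", "-"),
                environ.get("PATH_INFO", "-"),
                elapsed,
            ))
```

Wrapping the server's WSGI application with `TimingMiddleware(app)` would then log every request, giving real numbers to compare a threaded configuration against a forking one.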


2 Answers

I'd say this is mostly a waste of time. The Splunk Web interface (SplunkWeb/CherryPy) consumes an insignificant amount of resources compared to the splunkd process(es), and any real load on the machine will use up resources running searches (via multiple forked splunkd processes, one per search) well before the web interface becomes a bottleneck.

Each search runs as a separate process and consumes one core. If you are unable to saturate all CPUs with multiple searches running in parallel, then your bottleneck is probably disk I/O, which won't be improved by running more processes.

For this reason, we generally recommend horizontal scaling using 8-core servers, each with an independent disk I/O subsystem.

I see. In this case only the search head has 24 cores, and from reading your Splunk presentation I assume the indexer is doing most of the work anyway. So this box is really waiting on network I/O: it forks a few splunkd instances, which then make REST calls to the indexer.

Given Splunk's horizontal scaling architecture, it is not obvious exactly how to speed things up. In our case, what we really want to improve is the number of events per second piped into a timechart. I will create another question about that.