lucene-solr-user mailing list archives

Does the number of searchers affect CPU usage?
I'm not sure about this, but I think some versions of Tomcat did not
scale well beyond 4 CPUs (or 4 cores).
C.
wojtekpia wrote:
> Yes, I am seeing evictions. I've tried setting my filterCache higher, but
> then I start getting Out Of Memory exceptions. My filterCache hit ratio is
> above 0.99. It looks like I've hit a RAM bound here.
>
> I ran a test without faceting. The response times / throughput were both
> significantly higher, there were no evictions from the filter cache, but I
> still wasn't getting more than 50% CPU utilization. Any thoughts on what physical
> bound I've hit in this case?
>
>
>
> Erik Hatcher wrote:
>
>> One quick question.... are you seeing any evictions from your
>> filterCache? If so, it isn't set large enough to handle the faceting
>> you're doing.
>>
>> Erik
>>
>>
>> On Nov 4, 2008, at 8:01 PM, wojtekpia wrote:
>>
>>
>>> I've been running load tests over the past week or two, and I can't figure
>>> out my system's bottleneck that prevents me from increasing throughput.
>>> First I'll describe my Solr setup, then what I've tried to optimize the
>>> system.
>>>
>>> I have 10 million records and 59 fields (all are indexed, 37 are stored, 17
>>> have termVectors, 33 are multi-valued) which takes about 15GB of disk space.
>>> Most field values are very short (single word or number), and usually about
>>> half the fields have any data at all. I'm running on an 8-core, 64-bit, 32GB
>>> RAM Redhat box. I allocate about 24GB of memory to the java process, and my
>>> filterCache size is 700,000. I'm using a version of Solr between 1.3 and the
>>> current trunk (including the latest SOLR-667 (FastLRUCache) patch), and
>>> Tomcat 6.0.
>>>
>>> I'm running a ramp-test, increasing the number of users every few minutes. I
>>> measure the maximum number of requests that Solr can handle per second with
>>> a fixed response time, and call that my throughput. I'd like to see a single
>>> physical resource be maxed out at some point during my test so I know it is
>>> my bottleneck. I generated random queries for my dataset representing a
>>> more or less realistic scenario. The queries include faceting by up to 6
>>> fields, and querying by up to 8 fields.
>>>
>>> I ran a baseline on the un-optimized setup, and saw peak CPU usage of about
>>> 50%, IO usage around 5%, and negligible network traffic. Interestingly, the
>>> CPU peaked when I had 8 concurrent users, and actually dropped down to about
>>> 40% when I increased the users beyond 8. Is that because I have 8 cores?
>>>
>>> I changed a few settings and observed the effect on throughput:
>>>
>>> 1. Increased filterCache size, and throughput increased by about 50%, but it
>>> seems to peak.
>>> 2. Put the entire index on a RAM disk, and significantly reduced the average
>>> response time, but my throughput didn't change (i.e. even though my response
>>> time was 10X faster, the maximum number of requests I could make per second
>>> didn't increase). This makes no sense to me, unless there is another
>>> bottleneck somewhere.
>>> 3. Reduced the number of records in my index. The throughput increased, but
>>> the shape of all my graphs stayed the same, and my CPU usage was identical.
>>>
>>> I have a few questions:
>>> 1. Can I get more than 50% CPU utilization?
>>> 2. Why does CPU utilization fall when I make more than 8 concurrent requests?
>>> 3. Is there an obvious bottleneck that I'm missing?
>>> 4. Does Tomcat have any settings that affect Solr performance?
>>>
>>> Any input is greatly appreciated.
>>>
>>> --
>>> View this message in context:
>>> http://www.nabble.com/Throughput-Optimization-tp20335132p20335132.html
>>> Sent from the Solr - User mailing list archive at Nabble.com.
>>>
>>
>>
>
>
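
A rough way to see why raising the filterCache size leads to Out Of Memory
exceptions on this setup is to estimate how much memory one cached filter can
take. The sketch below assumes each cache entry is held as a bitset with one
bit per document in the index. That is an upper bound: Solr can store small
result sets in a more compact form, so real usage will be lower. The function
and constant names are hypothetical and exist only for this illustration; the
figures (10 million records, a filterCache size of 700,000, a 24GB heap) come
from the thread.

```python
def bitset_bytes(max_doc: int) -> int:
    """Bytes needed for one bit per document, rounded up to 64-bit words."""
    words = (max_doc + 63) // 64
    return words * 8


def worst_case_cache_bytes(max_doc: int, entries: int) -> int:
    """Upper bound on cache memory if every entry were a full bitset."""
    return bitset_bytes(max_doc) * entries


def entries_that_fit(max_doc: int, heap_bytes: int) -> int:
    """How many full bitsets would fit if the whole heap held nothing else."""
    return heap_bytes // bitset_bytes(max_doc)


# Figures quoted in the thread.
MAX_DOC = 10_000_000          # records in the index
CACHE_SIZE = 700_000          # configured filterCache size (entries)
HEAP_BYTES = 24 * 1024 ** 3   # about 24GB given to the Java process

if __name__ == "__main__":
    per_entry = bitset_bytes(MAX_DOC)
    print("bytes per full bitset:", per_entry)
    print("worst case for full cache (bytes):",
          worst_case_cache_bytes(MAX_DOC, CACHE_SIZE))
    print("full bitsets that fit in heap:",
          entries_that_fit(MAX_DOC, HEAP_BYTES))
```

Under these assumptions a single full bitset is about 1.25 MB, so around
20,600 of them would fill the entire heap, far fewer than the 700,000
configured entries. That is consistent with hitting a RAM bound well before
the cache is full.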