From: Paul Hirst
To: "user@couchdb.apache.org"
Date: Wed, 7 Dec 2011 10:59:53 +0000
Subject: RE: Couch on SSD drives
> Firstly, RAID5 is working against you, it's one of the slowest RAID
> modes (every logical write, for example, is two physical reads and two
> physical writes).
I'm surprised you say this. I thought an append-only file arrangement (like
couch) combined with a battery-backed RAID controller would be the perfect
setup for RAID5. I have 6 disks, so that should be 5 data blocks followed
by a parity block. If the 5 data blocks are written out in succession then
there should be no need to do any reads to calculate the parity. Running
with a battery-backed unit means the system should be able to lazily
calculate parity like this, since there is no requirement to get the data
and associated parity written to disk immediately.
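A minimal sketch of the parity arithmetic behind this argument, assuming plain XOR parity over a hypothetical stripe of 5 data blocks. It only illustrates why a full-stripe write needs no reads while a single-block update needs the old data and old parity; it says nothing about what any particular controller actually does. The function names and block sizes are made up for illustration.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def full_stripe_parity(data_blocks):
    """Parity for a whole stripe, computed from the new data alone (no reads)."""
    return xor_blocks(data_blocks)

def small_write_parity(old_data, old_parity, new_data):
    """Read-modify-write: old data and old parity must be read back first."""
    return xor_blocks([old_parity, old_data, new_data])

# Hypothetical 6-disk stripe: 5 data blocks plus 1 parity block.
stripe = [bytes([i]) * 8 for i in range(1, 6)]
parity = full_stripe_parity(stripe)

# Updating only one block yields the same parity as recomputing the stripe,
# but it costs two reads (old data, old parity) and two writes.
new_block = bytes([0xAB]) * 8
updated = stripe[:2] + [new_block] + stripe[3:]
assert small_write_parity(stripe[2], parity, new_block) == full_stripe_parity(updated)
```

Whether appended writes actually arrive as full stripes depends on the stripe size, the write sizes, and how the controller cache coalesces them, none of which is known here.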
But I'll admit I have tried to research this many times and have found it
nigh on impossible to find a decent answer. I have come across many people
claiming RAID5 is seriously slow, and a few other people saying that while
that was true many years ago, RAID controllers are now much more optimised
than they used to be and are able to do the sort of thing I'm suggesting
above.
Anyway, my blocks read per second are at least 10 times my blocks written
per second, so I'm not overly concerned about write performance.
> Suggest RAID-10 instead. Secondly, have you compacted
> the db lately? This will reduce its total size and also organize it
> better on disk.
I do need to do that. It compacts to about 500G. One of the problems I'm
having is that the compaction process takes a few weeks to run. Again I'm
looking for ways to speed this up and I was wondering if SSDs might be the
answer.
> Finally, it's not the number of 'active documents' that
> matters (b-tree performance is typically not predicated on the high
> probability that the leaf nodes are cached), it's the inner b-tree
> nodes that matter (the more that are in the disk cache, the better);
> this is a factor of document count. The cost of a cache miss is
> dependent on the speed of your disk array.
Is there any way I can estimate the number of blocks being used by the
inner b-tree? My database is currently 60 million documents. Since my
documents are relatively large I would imagine that all the inner nodes
should fit comfortably into available RAM and therefore should mostly be in
the disk cache, since most are likely to be read far more often than the
leaf nodes.
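One rough way to put a number on this is a back-of-envelope count of inner nodes for a b-tree with a uniform fan-out. The sketch below assumes a fan-out of 100 entries per node and 4 KiB per inner node; both figures are placeholders for illustration, not CouchDB's actual values, and only the 60 million document count comes from the message above. It also counts a single tree and ignores stale nodes left in an uncompacted append-only file.

```python
def inner_node_estimate(leaf_entries, fanout, node_bytes):
    """Count inner nodes level by level for a tree with a uniform fan-out."""
    nodes_below = -(-leaf_entries // fanout)  # leaf nodes (ceiling division)
    inner = 0
    while nodes_below > 1:
        nodes_below = -(-nodes_below // fanout)  # nodes on the next level up
        inner += nodes_below
    return inner, inner * node_bytes

docs = 60_000_000    # document count from the message above
fanout = 100         # ASSUMPTION: entries per node, not a CouchDB figure
node_bytes = 4096    # ASSUMPTION: on-disk size of one inner node

count, size = inner_node_estimate(docs, fanout, node_bytes)
print(f"{count} inner nodes, roughly {size / 2**20:.0f} MiB")
```

Under these assumptions the inner nodes are tiny compared with 48G of RAM; the open question is how scattered they are on disk before compaction, which this estimate does not capture.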
>
> SSD drives will be considerably better because there is no seek cost,
> but the cost of the drives is still prohibitive for many cases.
I think I shall try to benchmark it myself. I'm thinking of getting a
machine configured with a sizeable uncompacted couch database and probably
a 1:20 ratio of RAM to couch database size. Then try a compaction on an SSD
vs a single spinning disk. That will hopefully give me something to go on
even though I'm ultimately considering 6 SSDs to replace 6 spinning disks.
I may also try RAID5 vs RAID10 if I find the time.
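A minimal sketch of how the compaction timing could be scripted, assuming CouchDB's `POST /{db}/_compact` endpoint to start compaction and the `compact_running` field of `GET /{db}` to detect when it finishes. The server URL, database name, and polling interval are hypothetical, and a server with admins configured would also need credentials, which are omitted here.

```python
import json
import time
import urllib.request

def start_compaction(base_url, db):
    """Ask CouchDB to start compacting the database."""
    req = urllib.request.Request(
        f"{base_url}/{db}/_compact",
        data=b"",
        method="POST",
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

def db_info(base_url, db):
    """Fetch the database info document, which includes compact_running."""
    with urllib.request.urlopen(f"{base_url}/{db}") as resp:
        return json.load(resp)

def time_compaction(start, info, poll_seconds=60, sleep=time.sleep):
    """Start compaction, poll until compact_running is false, return seconds."""
    began = time.monotonic()
    start()
    while info().get("compact_running"):
        sleep(poll_seconds)
    return time.monotonic() - began

# Example wiring (hypothetical URL and database name):
#   base, db = "http://localhost:5984", "mydb"
#   secs = time_compaction(lambda: start_compaction(base, db),
#                          lambda: db_info(base, db))
```

Running the same script against the same uncompacted file on each device, with the page cache dropped in between, would keep the SSD and spinning-disk runs comparable.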
Thanks for the feedback. If I do come up with any numbers I'll share them.
>
> B.
>
> On 6 Dec 2011, at 04:05, Paul Hirst wrote:
>
> > Has anyone done any performance testing of couch on SSD drives?
> >
> > I have a strong suspicion that my disks are constantly seeking in
> > order to satisfy read requests and therefore the performance is
> > rubbish. The system is a RAID5 with 6 10k SAS drives. I'm wondering if
> > upgrading to SSD drives might give a significant performance boost.
> > It's either that or spreading the load across multiple boxes using
> > something like BigCouch.
> >
> > To give a bit more background....
> >
> > I have a ~1.1TB database at the moment running on a single box with
> > ~48G of RAM. I strongly suspect that the number of active documents
> > (ones which are seeing updates) is a larger set than will fit into RAM
> > and therefore I assume most document requests are hitting the disk. My
> > disk is ~100% utilized all the time and I'm not keeping up with the
> > number of reads and writes I need to make.
> >
> > The average wait time for disk IO is around 5ms however the CPU load
> > is minimal.
> >
> > Finally I did a test on the box to compare the disk throughput when
> > reading a large sequential file. Even without stopping couch, reading a
> > sequential file managed to drag about 7 times more data off the disk
> > than the system was normally achieving.
> >
> > So even though I might eventually switch to BigCouch or similar I'd
> > really like to balance out the CPU power and the disk power in my box
> > since at the moment the system seems totally over specced CPU wise and
> > totally under specced disk wise. Could SSD drives be the answer?
> >
> > Thanks.
> >
> > ________________________________
> > Sophos Limited, The Pentagon, Abingdon Science Park, Abingdon, OX14
> > 3YP, United Kingdom.
> > Company Reg No 2096520. VAT Reg No GB 991 2418 08.