Date: Thu, 7 Mar 2013 08:19:22 +0100
Subject: CouchDB compaction not catching up.
From: Nicolas Peeters
To: user@couchdb.apache.org
Cc: npeeters@infohubble.com
Content-Type: multipart/related; boundary=e0cb4efe33dea15dc304d75086e2
--e0cb4efe33dea15dc304d75086e2
Content-Type: multipart/alternative; boundary=e0cb4efe33dea15dbd04d75086e1
--e0cb4efe33dea15dbd04d75086e1
Content-Type: text/plain; charset=ISO-8859-1
Hi CouchDB Users,
*Disclaimer: I'm very aware that the use case is definitely not the best
for CouchDB, but for now, we have to deal with it.*
*Scenario:*
We have a fairly large (~750 GB) CouchDB (1.2.0) database that is used
for transactional logs (very write-heavy; a bad idea/design, I know, but
that's beside the point of this question, and we're looking at alternative
designs). Once in a while, we delete some of the records in large batches,
and we have scheduled auto-compaction, checking every 2 hours.
This is the compaction config:
[image: Inline image 1]
From what I can see, the DB is being hammered significantly every 12 hours,
and the compaction sometimes takes 24 hours (with 100 GB of log data;
sometimes much more, up to 500 GB).
We run on EC2: Large instances with EBS, no striping (yet), and no
provisioned IOPS. We tried fatter machines, but the improvement was really
minimal.
*The problem:*
The problem is that compaction takes a very long time (e.g. 12h+) and
reduces the performance of the entire stack. The main issue seems to be
that the compaction process struggles to "keep up" with the insertions,
which is why it takes so long. Also, compacting the view takes a long
time (sometimes the view is 100 GB). During the re-compaction of the view,
clients don't get a response, which blocks the client processes.
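
To make the "keep up" concern concrete, here is a minimal back-of-the-envelope
sketch. It assumes a deliberately simplified model that is not taken from
CouchDB itself: compaction copies documents at a constant rate while inserts
arrive at a constant rate. The function name and parameters are hypothetical,
and the rates have to be measured on the actual system.

```python
import math


def catch_up_hours(backlog_docs, compact_rate, insert_rate):
    """Estimate hours until compaction has copied everything.

    Rates are in documents per hour. Returns math.inf when inserts
    arrive at least as fast as compaction can copy them, which is the
    "never catches up" situation.
    """
    if backlog_docs <= 0:
        return 0.0
    if compact_rate <= insert_rate:
        return math.inf
    # The backlog only shrinks by the difference between the two rates.
    return backlog_docs / (compact_rate - insert_rate)
```

Under this model, lowering the insert rate even slightly below the compaction
rate turns an unbounded run into a finite one, although the run can still be
very long when the two rates are close.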
[image: Inline image 2]
The view compaction takes approx. 8 hours, so indexing of the view is
slower. While the view is indexing, another 300k insertions arrive, so it
never catches up. The only way we could solve the problem was to throttle
the number of inserts from the app itself, after which the view compaction
eventually finished. If we had continued to insert at the same rate, it
would not have finished (and ultimately, we would have run out of disk
space).
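
For illustration, app-side throttling of the kind described above could be
sketched with a token bucket. This is an assumption about one possible
approach, not the code actually used here: the class name, the parameters, and
the injectable clock are all hypothetical, and the CouchDB write itself is
left out.

```python
import time


class TokenBucket:
    """Allow about `rate` operations per second, with bursts up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = float(rate)
        self.capacity = float(capacity)
        self.clock = clock  # injectable so the bucket can be tested without sleeping
        self.tokens = float(capacity)
        self.last = clock()

    def try_acquire(self, n=1):
        """Take n tokens if available; return True on success, False otherwise."""
        now = self.clock()
        elapsed = now - self.last
        self.last = now
        # Refill in proportion to elapsed time, never beyond capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= n:
            self.tokens -= n
            return True
        return False
```

A writer would call `try_acquire()` before each insert (or once per batch with
`n` set to the batch size) and back off briefly whenever it returns `False`.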
Any recommendations for setting this up on EC2 are welcome. Configuration
settings for the compaction would also be helpful.
Thanks.
Nicolas
PS: We are happily using CouchDB for other (more traditional) use cases,
where it works very well.