Volker Mische
added a comment - 01/Oct/12 9:50 AM Filipe explained it very well, hence I'll just quote him:
"This is irrelevant.
On OS X, the maximum allowed number of simultaneously open file
descriptors is very limited. Here it was a single-node case with 1024
vbuckets, each with its own index file open. This is exactly the same
architectural/design issue we had with regular indexes before the
whole b-superstar tree thing - geo indexes will be similar in the
future, but not for 2.0, where they are an experimental feature.
So either use Linux (and ensure high enough soft/hard nofile limits in
/etc/security/limits.conf, etc.), or reduce the number of vbuckets (easy in
a development environment, but not sure if it's possible for package
installations)."
I'll close this as "Won't Fix" for now, as I'm aware of it and will take care of it in the coming versions of Couchbase.

Farshid Ghods (Inactive)
added a comment - 02/Oct/12 7:51 AM Regular views work fine on the OS X installation. Once 6783 is fixed we will rerun the test to see if spatial indexes still run into the file descriptor issue.

Abhinav Dangeti
added a comment - 04/Oct/12 7:33 PM Also noticed, with file descriptor limit set at 10240, when trying to query a spatial view, system_limit is being reached although the maximum number of file descriptors that beam.smp uses is at around 1040.

Farshid Ghods (Inactive)
added a comment - 05/Oct/12 12:19 AM Jens,
based on this information, even though the submitted change raises the maximum number of file descriptors to 10k, it still fails after reaching 1024 open file descriptors.
Alk has mentioned that there is a FreeBSD issue preventing us from setting this to a higher number.

Jens Alfke
added a comment - 05/Oct/12 4:53 PM I've searched the web for info about Darwin-specific setrlimit and RLIMIT_NOFILE issues, and the only thing that comes up is that the call will fail if you set a value above OPEN_MAX (which is 10240). But we don't do that, and we aren't getting errors.
I'm guessing that the actual limit being hit is something other than the max number of file descriptors. Maybe RLIMIT_NPROC, the maximum number of running processes per userid? According to "ulimit -a" it defaults to 709.
Maybe someone familiar with the geo code could look into exactly what system call is failing and with what errno.

Volker Mische
added a comment - 06/Oct/12 11:08 AM It could be another limit as well. In order to figure out which limit it may reach, here's what the geo index does. It works like on Apache CouchDB. As every vBucket is a database, this means that for every design document there's a view created for every vBucket. I thought we hit the file descriptor limit easily, but of course it could be something else. I don't know if Erlang does something like spawning a huge number of processes when it works on so many files.

Filipe Manana (Inactive)
added a comment - 08/Oct/12 7:10 AM I remember having a similar issue in the past on OS X Snow Leopard. I tried several ways to increase the maximum allowed number of open files, but whatever value was set, in practice it didn't allow more than a few thousand (even if it reported allowing 10k or more).
If I recall correctly, Dustin knew a lot of details about this. I think it's similar to what Farshid said above.

Dustin Sallings (Inactive)
added a comment - 08/Oct/12 11:45 AM There are a few different limits we're talking about here:
1. rlimits.
2. erlang limits
3. limitations due to erlang using select() as an IO multiplexer.
I don't think this is a file descriptor limit as that gives {error,emfile}. system_limit generally refers to limits from #2 such as overrunning the maximum number of ports. This limit is also 1024 by default, and can be raised by setting ERL_MAX_PORTS: http://www.erlang.org/doc/efficiency_guide/advanced.html#ports
That one seems likely.
#3 above (select()) is a big issue on OSX. It seems like it should be trivial to fix (and may have been in some version by now), but I've seen that one recently as well. It's an easy limit to hit, and you can't do much about it other than make erlang use kqueue (which it does, I believe, on FreeBSD, so I don't know why it wouldn't on OS X).
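
As a rough illustration of the distinction drawn in this comment, the error reported by Erlang can be mapped to the kind of limit it most likely points at. The sketch below only restates the comment; the table and the function name `likelyLimit` are hypothetical and not part of any Erlang or Couchbase API.

```javascript
// Hypothetical lookup: which limit an Erlang error atom (given as a string)
// most likely indicates, following the categories listed in the comment.
var LIMIT_HINTS = {
  emfile: "rlimit: per-process open file descriptor limit",
  system_limit: "Erlang VM limit, e.g. maximum number of ports (ERL_MAX_PORTS)"
};

function likelyLimit(errorAtom) {
  if (Object.prototype.hasOwnProperty.call(LIMIT_HINTS, errorAtom)) {
    return LIMIT_HINTS[errorAtom];
  }
  // Anything else needs the failing system call and errno to diagnose.
  return "unknown: check the failing system call and errno";
}

console.log(likelyLimit("system_limit"));
```

The select() limitation (#3) is not in the table because, as far as this thread shows, it does not surface as a distinct error atom.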

I'm having similar issues on a 32bit Ubuntu box and the 2.0.0beta .deb installation. I don't get the system_limit messages, but all couchbase processes die when requesting a spatial view. Some output I found in the log file:

10T21:50:45.178,ns_1@127.0.0.1:couch_spatial:couch_log:debug:36]Spawning new group server for spatial group _design/dev_products in database default/242.
[couchdb:debug,2012-10-10T21:50:45.179,ns_1@127.0.0.1:<0.22359.1>:couch_log:debug:36]request_group

These are unrelated. Other issues are about avoiding opening 2 file descriptors for empty spatial or dev views (per design document), and avoiding database file handle leaks when spatial views are used and bucket compaction happens.

Volker Mische
added a comment - 16/Oct/12 3:02 AM Patrick, can you always reproduce that? If yes, can you please post the exact steps to see the crash? I've an Ubuntu 32-bit desktop machine at home, so I might be able to reproduce it as well.

Volker Mische
added a comment - 16/Oct/12 3:04 AM Farshid, as Filipe says, it's not related to MB-6860. I'm assigning it back to you, so that you can assign it to someone to try out setting ERL_MAX_PORTS to some higher value.

Volker, the problem is easily reproducible by creating a new standard spatial view and having at least one document with a spatial point: location[41.386944,2.170025];

function (doc) {
  if (doc.location) {
    emit(doc.location, null);
  }
}

The server crashes when trying to show results / show the view on 8092 (and being a bit impatient about the loading time, clicking multiple times). With a bit more patience and one click at a time the server seems to be stable, but I don't get any results in the view (there should be 10). The Linode I/O graph and CPU graph get high spikes and I got a warning email about excessive I/O from Linode (> 18k blocks / sec I/O). I tried dev and production views with multiple reloads.

Filipe, this is a snap from /opt/couchbase/var/lib/couchbase/logs/debug.11 right after the crash. I can send you the full logs, if you like.

Patrick
added a comment - 16/Oct/12 3:53 PM Volker, I tested with one document in the database, which only has location [41.386944,2.170025] set. In the other bucket there are 10 documents, all with an array of two values: location [lat,lon] .

Patrick
added a comment - 16/Oct/12 4:36 PM Oh crap. Sorry about that. I lost all my views with the 2.0 upgrade and didn't remember how I wrote them. Maybe a good idea to change the default spatial function to that emit in future releases?

Volker Mische
added a comment - 16/Oct/12 4:46 PM Patrick, the solution is better error messages instead of crashing. It's on my TODO list.
For all others, Patrick's crashes were a different problem, though the original issue is still there.
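
For readers hitting the same crash, here is a minimal sketch of a spatial function for documents shaped like Patrick's (`location` holding `[lat, lon]`). It assumes the crash came from emitting the bare array as the key, and that the spatial function expects a GeoJSON geometry, as in the beer-sample function posted in this thread. The `emit` stub and the name `spatialMap` exist only so the sketch can run outside the server; on the server the function would be anonymous and `emit` is provided by the view engine.

```javascript
// Stand-in for the emit() provided by the view engine; it collects rows so
// the function can be exercised outside the server.
var rows = [];
function emit(key, value) {
  rows.push({ key: key, value: value });
}

// Spatial function for documents shaped like { location: [lat, lon] }.
function spatialMap(doc, meta) {
  var loc = doc.location;
  var valid = Array.isArray(loc) && loc.length === 2 &&
    typeof loc[0] === "number" && typeof loc[1] === "number" &&
    isFinite(loc[0]) && isFinite(loc[1]);
  if (!valid) {
    return; // skip documents without a usable location
  }
  // GeoJSON orders coordinates as [longitude, latitude].
  emit({ type: "Point", coordinates: [loc[1], loc[0]] }, meta.id);
}

spatialMap({ location: [41.386944, 2.170025] }, { id: "doc1" });
spatialMap({ location: "nowhere" }, { id: "doc2" });
console.log(JSON.stringify(rows));
```

Only the first document produces a row; the second is skipped rather than passed to the indexer.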

Filipe Manana (Inactive)
added a comment - 17/Oct/12 5:54 AM I've tried out on my macbook, running OS X Lion (10.7).
Followed the instructions to raise max open files limit from http://docs.basho.com/riak/latest/cookbooks/Open-Files-Limit/#Mac OS X and set and exported ERL_MAX_PORTS to 10000.
Not sure why, but I get EMFILE errors (too many open files), even though ulimit -n reports 10000.
This is all like mapreduce indexes were over a year ago: one small index per vbucket database, then results from all are merged at query time. For this case (single node) it means 1024 files open for the vbucket databases (1 to 1 mapping) plus 2 file descriptors per vbucket spatial index file. In other words, you need at least 3K files open to query a spatial view on a single node (+1 for the client-to-server TCP connection, etc.).
With DP4 and past versions, because the default number of vbuckets was 256 (and not 1024 like in beta and builds post DP4), this was probably doable in OS X or with default settings of most Linux distributions. If I recall correctly, on Ubuntu the default maximum for open files for a user is 1K or 2K.
See:
http://askubuntu.com/questions/162229/how-do-i-increase-the-open-files-limit-for-a-non-root-user
about how to increase it on Linux/Ubuntu.
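
The arithmetic in this comment can be sketched as a small function. The figures (one database file per vbucket, two descriptors per vbucket spatial index file) come from the comment; scaling by the number of spatial indexes is an assumption, and `estimateDescriptors` is a hypothetical name.

```javascript
// Rough estimate of the descriptors needed on a single node to query spatial
// views: one database file per vbucket, plus two descriptors per vbucket for
// each spatial index (the per-index scaling is an assumption).
function estimateDescriptors(numVBuckets, numSpatialIndexes) {
  var databaseFiles = numVBuckets;
  var indexDescriptors = 2 * numVBuckets * numSpatialIndexes;
  return databaseFiles + indexDescriptors;
}

console.log(estimateDescriptors(1024, 1)); // 3072, the "at least 3K" above
console.log(estimateDescriptors(256, 1));  // 768, the DP4-era default
console.log(estimateDescriptors(64, 1));   // 192
```

Sockets and other descriptors held by the server come on top of these numbers.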

Farshid Ghods (Inactive)
added a comment - 17/Oct/12 8:20 PM We can change the number of vbuckets for the Mac release, but that has a bigger impact, for example on SDKs, and we would need to make sure all scripts and external customers are aware of it.

Volker Mische
added a comment - 18/Oct/12 2:35 AM I like the idea of changing the default number of vBuckets, as I think it's way too high if you run a single instance (also rebalancing is really slow with 1024 vBuckets).
Though I really see the point that changing it is too big a step. Hence I would rather spend time on making it easier to change the number of vBuckets and put up good documentation about it. So people who really want to use the spatial index can just decrease the number. That would then help on all platforms, as I think you could also easily hit the limit on Ubuntu if you have the default settings.

Filipe Manana (Inactive)
added a comment - 18/Oct/12 5:30 AM @Dipti, perhaps my comments before were not very explicit.
It means that with the default settings of OS X, Ubuntu, etc., you will never be able to query spatial views, at least for the single-node case, and likely for a 2-node cluster and maybe a 3-node cluster.
In Linux it's easy to change, so it gets documented. On OS X (and Windows almost for sure), someone more skilled in those OSes might know how far we can configure them to allow for more open file descriptors.
This shouldn't be a surprise for QE, as the exact same issue happened on regular map views over a year ago, and it affected DP1 and DP2 (and the demo for the first CouchConf SFO).

Steve Yen
added a comment - 18/Oct/12 1:07 PM This looks like it will not make 2.0. Let's talk in next "daily" 2.0 bug scrub w/ PM mtg. .next?
For regular map-reduce indexes, iirc, Filipe made fixes to limit the # of file descriptors so they don't run into this problem.
However, spatial indexes don't have that fix, it seems.

Dipti Borkar
added a comment - 29/Oct/12 2:59 PM Reducing the number of vBuckets to 64 might be the best way around for MacOS. Dev and QE team should consider this and provide feedback if this is possible including testing for 2.0

Farshid Ghods (Inactive)
added a comment - 01/Nov/12 4:11 PM in order to change the number of vbuckets to 64 we need to change this file
https://github.com/couchbase/couchdbx-app/blob/master/Couchbase%20Server/start-couchbase.sh
and the COUCHBASE_NUM_VBUCKETS

Steve Yen
added a comment - 01/Nov/12 4:18 PM options...
1. ship with 64 vbuckets as default. And, need to add checks if user attempts rebalance/XDCR between clusters of different # of vbuckets.
2. ship with 1024 vbuckets as default and have instructions on how to either change vbucket #'s to lower (64?) or instructions on how to change system limits. And, document warnings about rebalance/XDCR between clusters of mismatched # vbuckets. (Damien favors this.)
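
The kind of check mentioned for the 64-vbucket option could look roughly like the sketch below. It is purely illustrative: the cluster objects, the `numVBuckets` field and `checkVBucketCounts` are hypothetical names, not an existing Couchbase API.

```javascript
// Hypothetical pre-flight check before rebalance or XDCR: refuse to proceed
// when the two sides were configured with different vbucket counts.
function checkVBucketCounts(source, target) {
  if (source.numVBuckets !== target.numVBuckets) {
    return {
      ok: false,
      reason: "vbucket count mismatch: " + source.numVBuckets +
        " vs " + target.numVBuckets
    };
  }
  return { ok: true, reason: "" };
}

var mac = { name: "mac-dev", numVBuckets: 64 };
var linux = { name: "linux-prod", numVBuckets: 1024 };
console.log(checkVBucketCounts(mac, linux).reason);
```

Returning a reason string rather than throwing leaves it to the caller to decide whether to block the operation or only warn.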

Sriram Melkote
added a comment - 01/Nov/12 4:30 PM On OS X, I bumped up kern.maxfiles and kern.maxfilesperproc in sysctl, and then increased launchd maxfiles. Then my erlang install (not from the couchbase installer) ran into the FD_SETSIZE limit, at which point I gave up and switched to 64 vbuckets.

Volker Mische
added a comment - 02/Nov/12 7:08 PM John Zablocki just reported that GeoCouch also crashes on Windows. We might want to decrease the number of vBuckets there as well.
http://www.couchbase.com/issues/browse/GC-4

Dipti Borkar
added a comment - 03/Nov/12 12:11 PM This needs to be documented. Jens please assign the bug to MC once you have merged the change.
MC, we will need to add big "warning" signs on the MacOS install doc pages, saying that it is not compatible with other platforms. Given that it is a developer-only platform, the number of vBuckets is set to 64 by default. So mixed clusters with other platforms will not work. Replicating data using XDCR to / from a cluster with 1024 vBuckets (linux, windows) from / to a cluster with 64 vBuckets will NOT work.

Jens Alfke
added a comment - 03/Nov/12 2:41 PM FYI, I have asked a question on an Apple mailing list about why we can't seem to set the file-descriptor limit high enough. Hopefully someone will have a good answer that will let me work around the problem.

Volker Mische
added a comment - 05/Nov/12 8:42 AM I've tried to add a spatial view on Windows using the beer sample data set. It looks like:
function (doc, meta) {
  if (doc.geo) {
    emit({"type": "Point", "coordinates": [doc.geo.lng, doc.geo.lat]}, meta.id);
  }
}
It worked without a problem. I tried it with another one (with some other emit value). It was still working. So I'd say Windows is good to go. Perhaps we should try how many spatial views you can have before Windows freaks out. It would be cool if it weren't me doing it, as using Windows via remote desktop is really painfully slow.

Farshid Ghods (Inactive)
added a comment - 05/Nov/12 11:42 AM Assigning this to Iryna pending results from testing on Windows.
Iryna, if you are unable to access Windows on ec2 or mview please let Deep know and he will run the tests.

I'm still seeing this problem on Windows and an Ubuntu cluster. All I'm doing is running the query referenced in GEO-4 through the admin console and I get the errors below and eventually the node crashes.

Farshid Ghods (Inactive)
added a comment - 05/Nov/12 1:32 PM Per discussion in bug scrubbing, we need to create a separate bug for use cases that involve more than one spatial index and different platforms.

Volker Mische
added a comment - 06/Nov/12 11:28 AM This sounds good enough for me. I would just document that on Windows the Spatial Views are limited to 4 design docs with 1 view each. This should be enough to play with it.
For Linux 10+ is also good enough. There the documentation could mention that it's an experimental feature that might fall apart when too many Spatial Views are defined.

J Chris Anderson
added a comment - 10/Nov/12 8:12 AM Once I saw a line of code in the Erlang VM file handler, having to do with handling on Mac, that essentially hardcodes it at 1024. So unless we wanna fix that in the Erlang VM we made the right choice here by lowering vbucket count on Mac.

Filipe Manana (Inactive)
added a comment - 10/Nov/12 8:24 AM Yes, that's a well known issue that was debated several times in erlang-questions list:
http://erlang.2086793.n4.nabble.com/clipping-of-max-of-file-descriptors-by-erts-when-kernel-poll-is-enabled-td2108706.html
Doesn't seem it will be addressed at any time soon.

kzeller
added a comment - 12/Nov/12 3:32 PM Added to RN: For Mac OSX, we limit the number of vBuckets to 64 from 1024
due to limitations on Mac OSX file descriptors. In the past, this
resulted in crashes for OSX.