[ns_server:error,2012-11-20T5:02:35.847,ns_1@10.110.206.19:ns_doctor<0.2076.0>:ns_doctor:update_status:205]The following buckets became not ready on node 'ns_1@10.111.66.215': ["default"], those of them are active []
[ns_server:debug,2012-11-20T5:02:36.003,ns_1@10.110.206.19:capi_set_view_manager-default<0.2157.0>:capi_set_view_manager:handle_info:349]doing replicate_newnodes_docs
[ns_server:debug,2012-11-20T5:02:36.034,ns_1@10.110.206.19:xdc_rdoc_replication_srv<0.2167.0>:xdc_rdoc_replication_srv:handle_info:132]doing replicate_newnodes_docs
[ns_server:debug,2012-11-20T5:02:36.034,ns_1@10.110.206.19:xdc_rdoc_replication_srv<0.2167.0>:xdc_rdoc_replication_srv:replicate_change_to_node:160]Sending _design/_replicator_info to ns_1@10.111.66.215
[ns_server:error,2012-11-20T5:02:38.749,ns_1@10.110.206.19:<0.20550.6>:misc:sync_shutdown_many_i_am_trapping_exits:1408]Shutdown of the following failed: [

Jin Lim
added a comment - 20/Nov/12 5:02 PM Assigning this to the NS_SERVER team to triage first from their end. It appears to me that the bucket shutdown got stuck (or failed) somewhere. Please assign it back to me afterwards. Thanks!

[ns_server:error,2012-12-06T14:04:58.426,ns_1@10.3.121.93:<0.9035.81>:misc:inner_wait_shutdown:1426]Expected exit signal from <0.9040.81> but could not get it in 5 seconds. This is a bug, but process we're waiting for is dead (noproc), so trying to ignore...
[ns_server:debug,2012-12-06T14:04:58.475,ns_1@10.3.121.93:<0.9035.81>:misc:inner_wait_shutdown:1427]Here's messages:
{messages,[

Jin Lim
added a comment - 07/Dec/12 3:37 PM Andrei, was node 93, the one that crashed, being removed or failed over at the time? Knowing this will give us a clearer understanding of what the node was doing prior to the crash.
Thanks much!
Jin

Jin Lim
added a comment - 07/Dec/12 4:58 PM Just talked to Chiyoung for his input. Given that these nodes are all Windows, can you please also verify whether all nodes have been upgraded to Service Pack 1? We believe this is a memory corruption issue, and we want to make sure that all nodes have the proper Windows environment. Thanks.

Jin Lim
added a comment - 07/Dec/12 7:13 PM The EP Engine team will try to reproduce the crash (segfault) with more debugging. In the meantime, QE team, please see if the manual test steps described above can be automated or incorporated into a new test case. Thanks.