Description

When bootstrapping a node today using the exact same process that succeeded yesterday, the run failed with a segmentation fault at various places in chef/provider/package/yum.rb. The crash was reported at line 743, line 412, and elsewhere.

Reviewing the reported lines did not reveal anything that looks like a root cause.

The Environment field contains details including the command run, the bootstrap file used, etc. Attached is the output of a chef run that failed.

Activity

Bryan McLellan [Chef]
added a comment - 28/Mar/12 9:21 PM Where is your Ruby coming from? This is likely a Ruby bug. I would recommend first using the 'chef-full' template to use the Omnibus Client chef installation, and second to review CHEF-2413 for possible alternative Ruby rpms.
https://github.com/opscode/chef/blob/master/chef/lib/chef/knife/bootstrap/chef-full.erb

Jordan Dea-Mattson
added a comment - 28/Mar/12 10:12 PM Hi Bryan -
My Ruby is being installed using the bootstrap file which I displayed in the Environment field above. After reviewing that, does your advice still hold?
Jordan

Brandon Adams
added a comment - 28/Mar/12 10:30 PM This just started happening to me too. I noticed that it's only happening to newly deployed nodes; existing nodes are fine.
I thought it might have been a Chef version issue, since some of my currently running nodes are still on 0.10.2. I updated them to 0.10.8 and could not repro the problem.
This is on Amazon Linux, ami-1b814f72. Both newly deployed and existing instances are running Ruby 1.8.7p357.

Brandon Adams
added a comment - 28/Mar/12 10:54 PM I think I've identified the root cause. Amazon just released a new version of their Amazon Linux distribution, and they've designed the update process such that older versions of the distro will by default install updated packages for the new version. e.g., 2011.09 will use packages for 2012.03.
It appears that there must be some dependency that does not get updated when updating the ruby package. I experience the segfault if I choose only to install ruby-devel on a 2011.09 image. However, if I install ruby-devel and execute a yum update for all packages, which effectively upgrades a 2011.09 to a 2012.03, the segfault behavior disappears.
Amazon has documentation here on how to change the behavior of yum so that it does not pull packages for later releases: http://aws.amazon.com/amazon-linux-ami/faqs/#lock

Brandon Adams
added a comment - 29/Mar/12 8:25 PM It looks like I was wrong about a yum update for all packages solving this. It doesn't – I must have been confused about which terminal session I was looking at.
A fresh 2012.03 instance will exhibit this problem, as well as a 2011.09 that has installed the 2012.03 Ruby.
The only work-around right now is using cloud-config to restrict the instance from installing 2012.03 packages.
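For reference, a minimal sketch of what such a cloud-config might look like, passed as instance user data at launch. This assumes the `repo_releasever` key described in the Amazon Linux AMI FAQ linked above; the key name and exact behavior are not confirmed in this ticket, so check them against Amazon's documentation. The value 2011.09 is the older release mentioned in this thread.

```yaml
#cloud-config
# Assumption: Amazon Linux's cloud-init honors repo_releasever and pins
# the yum repositories to this release instead of tracking the latest one.
repo_releasever: 2011.09
```

With the release pinned this way, a freshly launched 2011.09 instance should not pull the 2012.03 Ruby packages when ruby-devel is installed during bootstrap.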

Brandon Adams
added a comment - 30/Mar/12 4:46 PM Confirmed, I was able to install Chef and successfully use a Package resource to install a package on ami-f565ba9c, with both Ruby 1.8.7 and 1.9.3.
Not sure if you want to test all AMIs (32/64, EBS/instance store, HPC/non-HPC) before closing.

This will install Ruby 1.9.3 and then configure it to be the default Ruby on the system. I used this approach rather than the Omnibus installer because I need to be able to do sidewise installs, and the Omnibus installer does not yet have a clean way to do this.

If you have extreme paranoia, you may want to change the 'gem install' to use 'gem1.9 install'. If you have no need for Ruby 1.8.7, you might also want to consider performing a 'yum remove ruby ruby-devel'.

We install the bigdecimal gem because aws-sdk depends on it, but that dependency is not called out in the gemspec. I believe this is due to how bigdecimal was handled in Ruby 1.8.7.