You Paid for Support?! Bwah-ha-ha

We’re using Open Source Software extensively in our Big Enterprise. It really irritates me that we pay millions of dollars for “Support” from our vendors and we get endless circles of “try this,” “that should work” and “oh, that’s an upstream bug, we’ll file a bug report.” Seriously? For 10% of what we’re paying these guys, I’ll do it myself.

I currently have 3 bugs open with 3 vendors; 2 of those are open source. Let’s talk about them.

1) OS won’t PXE boot across an LACP bond. The documentation says it should. Everything “looks right,” but after 3 business days of the vendor telling me to try things I’d already tried, I finally solved this myself. I can boot my DataNode image to my servers, but I wanted to install an OS on some of the control nodes. As soon as the install agent starts up, it loses network connectivity. I told it how to configure the bond on the kernel boot line, but it fails to see it and use it. Trying to use a single interface doesn’t work because the switch is expecting to distribute the packets (per the LACP 802.3ad spec) across 4 NICs. It turns out that I can tell the kernel to use eth0 and NOT probe other network devices, which solves 99% of my problem. It’s not perfect, but it’s a hell of a lot better than trying to hand install. Here’s a hint if you have this problem: nonet.

2) Proprietary software vendor can’t pull the Avro schema from HDFS. This seems to be squarely in their court for resolution; however, they claim it’s a bug in Hive and opened a bug report. Come on, kids: if you’re finding hdfs:// and expecting file://, something is wrong on your side.

3) Open source Hadoop vendor opened a bug report because Pig doesn’t correctly support Avro in our version. We supplied a bug report and a fix from Apache, but they made us chase our tails for 10 days before they agreed and opened a new bug report.
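
To make the scheme mismatch in item 2 concrete, here is a minimal Python sketch of the kind of check a schema loader could run before it tries to open anything. It is not the vendor's code. The function name, the namenode host and the schema path are all made up for illustration.

```python
from urllib.parse import urlparse


def require_scheme(uri, expected):
    """Return uri unchanged if its scheme matches expected, else raise ValueError.

    Hypothetical helper for illustration only. A bare path with no scheme
    is treated as a local file, which is an assumption of this sketch.
    """
    scheme = urlparse(uri).scheme or "file"
    if scheme != expected:
        raise ValueError(
            f"expected a {expected}:// location but got {scheme}:// in {uri!r}"
        )
    return uri


# Hypothetical schema location; the host and path are invented.
schema_uri = "hdfs://namenode:8020/schemas/event.avsc"

require_scheme(schema_uri, "hdfs")  # passes: the loader is told to expect HDFS

try:
    require_scheme(schema_uri, "file")  # a loader that only knows local files
except ValueError as err:
    print(err)
```

A loader that fails the second call has a problem on the reading side: the location it was given is a perfectly valid HDFS URI.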

After losing some 600 blocks of data in our Dev cluster, we found out there is a “fix” for under-replicated blocks coming in HDFS 0.20, but 0.1x doesn’t have this “feature.” Support DID help us find that issue, but ONLY after they ran us through hoops looking for non-existent configuration problems.
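
For anyone unsure what “under-replicated” means here: a block is under-replicated when it has fewer live copies than the target replication factor, and it is gone once the last copy disappears. The Python sketch below illustrates only that definition, using made-up block IDs and replica counts. It does not talk to HDFS and is not how the NameNode tracks blocks.

```python
def classify_blocks(live_replicas, target=3):
    """Split blocks into under-replicated and missing.

    live_replicas maps a block ID to its number of live copies.
    target is the desired replication factor (3 is the usual HDFS default).
    """
    under = sorted(b for b, n in live_replicas.items() if 0 < n < target)
    missing = sorted(b for b, n in live_replicas.items() if n == 0)
    return under, missing


# Invented sample data for illustration.
sample = {"blk_1": 3, "blk_2": 1, "blk_3": 0, "blk_4": 2}
under, missing = classify_blocks(sample)
print("under-replicated:", under)
print("missing:", missing)
```

Under-replicated blocks are the warning sign; if nothing re-replicates them before the remaining copies fail, they end up in the missing list.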

My advice: Eschew paid support and dig into the details on your own. You’ll learn more, be more valuable, and solve your own problems faster.