Monday, November 29, 2010

A review of the VMware troubleshooting course. This is a 4 day course, the very notion of which I find interesting. Invariably the “troubleshooting” module on almost every course I’ve ever done is on the Friday afternoon, the last or last but one module, when everyone wants to get away.

Yet troubleshooting is arguably one of the most important tasks of an IT pro. It’s when the spotlight falls on you (along with blame). But it’s hidden away on courses. As though nothing ever goes wrong. I know it’s partly a time constraint, but still, it’s just so important and yet it’s invariably glossed over.

So here we have 4 days dedicated to it. It’s also a recommended course for the VCAP-DCA (another reason for taking it, as I aim to have a stab at this early 2011). So, onwards, into the breach (or something) …

This course took place at Global Knowledge in Wakefield, Nov 23-26 2010.

Day 1
Intros, and understanding people’s experience and expectations. We get 3 manuals: course notes, labs and a troubleshooting reference guide which outlines procedures for problems such as “vCenter Server system cannot migrate a virtual machine with vMotion”.

As ever with courses, it’s not just the content that will determine the success or not of the course. It’s also the instructor and the fellow delegates. Fortunately the instructor is Scott, who was the instructor on my fast track course for 3.5 nearly 2 1/2 years ago, and I know he knows his stuff and it will be good.

The majority of the first day modules are in essence setting you up for being able to troubleshoot. We spend time understanding and configuring vMA (the vSphere Management Assistant) and log files. vMA basically lets you do the work against ESXi that you used to do in the service console on ESX (given the absence of a service console in ESXi). It can also be used with ESX. At this stage it looks like the future for this type of work, so it’s good to spend time on it and get an understanding of it. Plus of course for ESXi, it can be used for logging and for running resxtop.

Most of the labs today are standard procedural labs - i.e. you follow the instructions, and everything should be working at the end of it all.

Day 2
Networking today. First it’s a review of the things VMware expect you to know but may be rusty on (again, this is where the instructor’s experience of having taught the course previously comes in). So we run through this. It’s a good refresher for me in parts, and highlights that I REALLY need to do more work with distributed virtual switches (especially to prepare for the DCA). A straw poll showed that 3 of the 7 of us on the course were using dVS in production!

More procedural labs, and then break-fix. This is basically the core of the course: the instructor has scripts at his disposal which will “do things” to your environment. You’re given a user report such as “vmotion doesn’t work”, and you fix it. I think it’s likely the instructor has various degrees of difficulty at his disposal, so the course will kind of shape and evolve to fit the needs of the delegates. This may mean skipping some labs, and it may mean fiendishly difficult faults or relatively simple ones.

Day 3
Finish the networking module - setting up a packet sniffer, setting switches/port groups to promiscuous mode to allow it to work etc. And then more break/fix on networking.

The afternoon is management and then storage, following a similar pattern: relatively brief notes, which really just go over things people already know, then some more break/fix.

Day 4
Finish storage module, and a procedural set of labs configuring different iSCSI LUNs - CHAP, digest, then adding multipathing and using claimrules.

Then it’s into the final stretch with modules on vMotion, Storage vMotion, HA/DRS, FT, DPM and general VM troubleshooting. I stay on at the end for a couple more labs as I’ve frankly been dreadful at them, and I need more.

Overall Impressions
So, lessons learnt? Well, networking really is key, and I need to do much much more with dVS - a lot of the course kind of hinges on these.

My troubleshooting was dreadful. It was kind of a mix of embarrassing and humbling really. But in a way, that was probably the BEST part of the course for me. I’ve come out of it with a good idea of the areas I really need to get focussed on, and it’s nothing I can’t overcome. I guess it’s like this: my work environment is sufficiently small and reliable that I’ve never had to truly troubleshoot the VMware setup. That could be seen as a testament to the quality of the software and also the hardware in use. Maybe a small element can be attributed to what I’ve done in the setup and maintenance. But when you don’t do something frequently (fixing things in this instance) you can get rusty.

On the other hand, if you do troubleshoot every day, the course may not be as eye opening - there was clearly one guy who excelled - often getting the problem within a few minutes.

I guess my main complaint is that we’re paired up on labs. Personally I prefer to work alone as I can work at my own pace. I tend to work quite quickly, and so find myself waiting for my lab partner, which disrupts my flow (though of course working fast is no guarantee of working smart). Plus in the context of this course, I don’t want my mistakes to disrupt my lab partner. It’s not fair on them. And on these types of courses, I like to try stuff (it’s an opportunity to do so without breaking production kit AND with the safety net of an experienced VMware vExpert to help you out when you basically have a brainfart). But that’s how the labs are designed, so, so be it. Oh, and of course I’d love to have access to the scripts for my testlab, but that’s not to be. Still, there are enough suggestions within the manuals that I’m sure I can at least work back from those and create various scenarios.
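
For what it’s worth, here is a minimal sketch of how I might organise those home-made scenarios for a testlab. It is not the instructor’s scripts (I never saw them) and it doesn’t touch a host at all; it just keeps a list of “user reports” and picks one at random, up to a chosen difficulty. The names Scenario, SCENARIOS and pick_drill are all made up for this example, the difficulty scale is my own, and only the first two reports are taken from this post - the other two are invented placeholders.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    report: str      # what the "user" tells you is wrong
    area: str        # course module the fault belongs to
    difficulty: int  # my own scale: 1 = simple, 3 = fiendish


# Hypothetical drill list. The first two reports appear in this post;
# the last two are invented placeholders for illustration only.
SCENARIOS = [
    Scenario("vmotion doesn't work", "networking", 2),
    Scenario("vCenter Server system cannot migrate a virtual machine "
             "with vMotion", "vmotion", 3),
    Scenario("VM has no network connectivity", "networking", 1),
    Scenario("iSCSI LUN is not visible on the host", "storage", 2),
]


def pick_drill(scenarios, max_difficulty, rng=None):
    """Return one random scenario at or below max_difficulty."""
    rng = rng or random.Random()
    eligible = [s for s in scenarios if s.difficulty <= max_difficulty]
    if not eligible:
        raise ValueError("no scenario at or below that difficulty")
    return rng.choice(eligible)


if __name__ == "__main__":
    drill = pick_drill(SCENARIOS, max_difficulty=2)
    print(f"[{drill.area}] user report: {drill.report}")
```

The idea would be to get somebody else (or a script I’ve written and then forgotten about) to break the lab to match the report, so I only ever see the symptom, as on the course.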

So, recommend the course? Yep, I think so. As I said above, I was embarrassed and humbled by my performance, but you need to learn from these things, and set aspirations and goals accordingly. As preparation for the VCAP-DCA, well, I’ll let you know when I’ve tried it. My suspicion is it will be valuable, if for no other reason than that it emphasises once again that you have to get hands on - build, break, learn, repeat.