The constraint that almost everyone hits first is RAM, not CPU. Some admins cannot expand the amount of RAM in their servers, because the bigger DIMMs cost too much and there are not enough slots left in the server. This leaves them with servers that are nearing memory capacity, but nowhere close to utilizing the CPU power of the server.

Many people are purchasing dual-socket servers for redundancy, or out of fear that the server will not perform well enough.

From my own environment I can say that my hosts are utilizing around 30% of their CPU, with two quad-core CPUs. And from the answers I got tonight on my question above - the results are pretty much the same.

Now perhaps a sacrilegious thought. What would happen if we only used one physical processor in a server?

Today we are talking about six- or eight-core processors, and this number is rising. A single such socket provides more or less the same number of cores as what the majority of people are using today: 8 cores from 2 x quad-core processors.

Now you might ask: but here I lose the redundancy. This could be true, but how many of you have actually lost a CPU in a server due to malfunction? I personally have not. Ever. I would also suppose that if a physical CPU barfs on you during a production workload, it will not be pretty. The VMs that were running on that processor will obviously keel over and die, but I suppose the rest of the host will not be happy either. From my experience with faulty memory, you are more likely to crash the whole host with a PSOD than to have the host carry on with one DIMM less. I guess that with a CPU it will probably be the same. So having redundant CPUs does not really cover it. I could be wrong here, and if so I would appreciate your feedback with more information.

Now I am sure there are other implications here, regarding the spread of memory and load over the channels of both processors, and there are surely internal ESX performance implications as well. So it is not a simple matter.

How will this change the game though? Well, it will cut costs in two ways.

Licensing. ESX licenses are counted per processor, not per set of two. Removing one processor will cut ESX host licensing costs in half.

Server hardware. With one processor less, you cut the cost of each server.

So are we destined to run only one-socket ESX hosts? I would be interested in hearing your thoughts and insights on this one.

2010-10-23

10 days ago I completed my VCDX defense in front of a panel of some of the top professionals and technical people in VMware. This was the culmination of a journey, a path and quest to achieve the VCDX certification.

But let us roll back a bit - approximately a year back.

VMware announced that there would be a new certification path that would be a level above the current VCP certification.

There was a great amount of buzz around this when it launched, and people were all gung-ho to go and get the new cert.

Until... they saw what was involved. And then a large number of people dropped it.

VMware Certified Design Expert - that is what the abbreviation stands for. Four letters, but oh, it is so much more.

So let's start with why I decided to pursue this certification.

Let's face it. There are approximately 56,000 VCP certification holders. That number becomes smaller for those who hold both VCP3 and VCP4, and smaller still for those who hold VCP2, VCP3 and VCP4. I am not a certification chaser, not by a long shot; usually I only update my certs if I have to or find additional benefit in doing so. It was the same with MCSE 2000 and MCSE 2003. I have not pursued the 2008 certifications, partly because I did not need to, but more so because I do not see the benefit of doing so at this time. Being one of 56,000 individuals in a population of almost 7 billion is an amazing achievement.

But I aspired to something more. Doing the regular day-to-day admin work was not enough. So you start to look for more interests, you go into new subjects, you continuously learn new things.

Now this certification was something that caught my eye. This was not your average admin work. The most interesting thing that I saw was that you could not study for the exams from a book. The exam blueprint was based on experience and in-depth knowledge of the VI3 products and the surrounding infrastructure. There were no braindumps, no cheat sheets. This is either something that you know or you do not, based primarily on your own experience and knowledge.

I made a decision that I would try to pass the Enterprise Admin exam, and see further from there. I passed.

And then I decided that I would try to pass the Design exam thereafter. And I passed.

I was now at a crossroads. The last two parts were submitting a design, and defending it in front of a panel of architects. Now I kid you not: no matter how confident you think you are (I still have some work to do on this), no matter how clever you think you are, no matter how good you think you are, this is not an experience to take lightly. I knew that I would have to travel overseas for this part, and that would entail securing the funds and the time to complete it. I managed to secure both.

The application is one of the hardest things I have ever done. In my life.

When designing an infrastructure, it is more than just consolidation ratios. It is more than how much RAM each ESX host should have. It is more than how many IOPS you will need to support your environment. It is more than how many ESX clusters you will have, what level of DRS you will use, what the failover settings will be. I kid you not, each and every one of these is extremely important and should be a fundamental piece of your design. But the whole idea of being an architect (which is what VMware is looking for - IMHO) is all of the above, but also to understand how your business works, what its needs are, what its requirements are, and how you plan to fulfill those needs and requirements. And how every decision you make can and will affect some other part of the environment.

You need to be able to see the big picture. You need to cater to the needs of the customer. And even more so, you need to be able to justify, mainly to yourself and in turn to the customer, why this option is better than the other. What will happen if you choose one over the other?

The application is brutal - seriously. It is not a clear document saying:

Explain in 10 words or more what your design is.

Present a diagram of your virtual machine layout.

It is open to interpretation. There are a decent number of deliverables you have to provide - but there are no instructions on exactly what has to be in those deliverables. I personally find it extremely difficult to provide an answer to something without knowing what is expected of me. But in essence, here is the good part of that.

Let me give an example.

You are asked to build a tower out of cards that will be seven levels high. The end result is that it will have to stand on its own, without support, for 10 minutes.

So you might ask: what am I allowed to use? How many cards? Any geometrical shape? Will there be wind that could blow the cards down? Am I allowed to use anything else besides cards? All of these are legitimate questions which will help you in the planning and design of your tower. Some of the answers you receive and some you do not. Now of course there is more than one way to accomplish this. But in the end you will have to make sure that your tower stands, and you will have to explain why you chose what you did, and what the implications of choosing A over B were.

So, for example, you might start to research which is the most stable geometrical form, create a schematic of how to build it, and measure to the millimeter which card has to go where. Then you build, document, test, etc.

On the other hand, you might just start and see what works for you: stick the cards to one another with cellophane tape, staple the bottom level to a piece of wood, and build. You then document the process and reproduce it.

Which solution was correct, you may ask? I think both, each in its own way. Both provided the end result. Both were documented, both are reproducible.

Sometimes, giving a detailed spec of exactly what is required suppresses creativity. We all work with both sides of our brain, the right for creativity and the left for analytical thinking. Some use one side more than the other. There is usually more than one way to solve a problem.

For the VCDX - I think that the journey is the most important part. How did you come to decide that you will have 24 VMs per LUN? Why did you use VMFS and not NFS? And if you were to change that decision to something else - what would the implications be? I have learned a lot from this journey; it has helped me grow intellectually, technically, professionally and personally. And for that I am extremely grateful.

Today I received my answer from the panel, and unfortunately the answer was no. There were a few points that were noted that needed improvement.

I would like to thank the Panel members for their time and effort put into each and every defense. I gather I will be seeing you all when I re-submit for a defense.

I do not remember where I once heard this but it is so true. "If you do not have the courage to fail, then you will not have the courage to succeed."

I see this as a setback, and as a learning experience. Perhaps I took too much on my plate. Too much going on at the same time. And not enough time to concentrate on each thing properly. Or then again, maybe not.

When I finished the defense I felt immense relief. I felt that I had given my best in the process. I felt the interaction was good between the panel and myself.

I think that I am the first (or one of the first) public figures in the virtualization world who has not passed the defense. Nothing at all to be ashamed of. The success rate is not high, not at all.

I have decided not to re-submit for a defense for the VCDX 3 track. The time left to submit a design is not something that I can complete (at least to my satisfaction) before November 22nd.

I am going to give myself a rest for a few months and start out anew on the VCDX 4 track. I need to complete some other projects that I am currently busy with. Yes, I know that it will mean I need to complete the exams again, but that too I see as part of the learning experience.

So back to the blogging, back to the technical troubleshooting, back to answering questions on the forums.

2010-10-21

A week has passed since the last day of VMworld in Copenhagen. And I owe myself (and you as well) a roundup and summary of the event.

Quite a few roundups have been published over the past week. I will not repeat what has been said in the other posts, but I would like to add my following comments.

Which kid does not like going to Disneyworld? So I guess you can say the same thing: which virtualization enthusiast does not like going to VMworld?

There are very few events or places where you can go and find a load of people who are interested in the same thing you are, who understand what you are talking about, who will crack up laughing at the funny slogans on the Splunk t-shirts. So thanks, VMworld.

There are many opportunities we all have to play with new technologies. But we hardly ever have the privilege of having everything prepared for us, the same way we provide for our end users and customers. That is why I think the labs were such a hit. Of course the experience, the diversity, and the performance of the labs were amazing, but for me to go and have a vCloud environment all ready for me at the click of a mouse, without having to find hardware, storage, licenses and time to set it up, was a treat. So thanks, VMworld.

The social networking. It was great finally meeting so many of the people I have chatted with, tweeted with, DM'ed and whose blog posts I have read over the past year or two. It was an honor to meet you all. The conversations were great. The live troubleshooting sessions were great. The ad-hoc meetings were great. So thanks, VMworld.

Meeting and talking with the VMware product managers and architects. Meeting and talking with the vendors, the directors, VPs, architects and big-wigs. I have perhaps mentioned it before, but I am continually amazed at how accessible you all are. I sometimes think that maybe it is me who is well connected, but no, it is not! It is all of you who are willing to listen, to help with a problem, to make the change when necessary. So thank you all, and thanks, VMworld.

I heard and read many a blog post about what the important part of VMworld is. Is it the sessions? The Solutions Exchange? The labs? Maybe the social networking? I think it is all of the above.

I tried to divide my time between all of the above, with less of an emphasis on the sessions and the Solutions Exchange. I picked and chose my sessions carefully, and only participated in 3 in total. I chose them because I found the topics appealing; the rest of the sessions that I want to see, I will download in the near future. I tried to take some labs with products that I will hopefully be evaluating in the very near future, to get a quick feel for each product. The few products I spent time with at the Solutions Exchange were ones that I felt I would benefit from in the upcoming months, mainly for insight into the performance and analysis of my environment.

It was a blast; I think it was one of the highlights of my entire year. It makes all the time spent blogging, on Twitter, writing and on email so worthwhile.

So thank you VMware for having VMworld and thank you all for making it the amazing experience it was.

1-2. Set the start variable to the date I need (30 days ago) and finish to today's date.
3. This is the metric I am looking for.
5-6. Here we get each host. I divided it up by datacenter/cluster to show that all hosts are (or should be) balanced and using the same amount of resources, and we perform the action on each host.
7-8. First clear the variable from any previous runs, then set the stats variable with the information I want.
9. Here I used the .NET number format ("{0:N2}" -f), otherwise I received a number with 10 digits after the decimal point (which is pointless .. ;) )
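The script these notes walk through did not survive in this post, so here is a rough reconstruction based purely on the notes above - a sketch, not the original script. It assumes an existing PowerCLI session (Connect-VIServer already done) and that the stat level on the vCenter retains a month of `mem.usage.average` samples:

```powershell
# 1-2. Start is 30 days ago, Finish is today
$start  = (Get-Date).AddDays(-30)
$finish = Get-Date

# 3. The metric we are looking for
$metric = "mem.usage.average"

# 5-6. Walk each host, grouped by datacenter (hosts in a cluster should be balanced)
foreach ($dc in Get-Datacenter) {
    Write-Host $dc.Name
    foreach ($esx in ($dc | Get-VMHost)) {
        # 7-8. Clear the variable from any previous run, then collect the stats
        $stats = $null
        $stats = Get-Stat -Entity $esx -Stat $metric -Start $start -Finish $finish

        # 9. Average the samples and trim to two decimal places with the .NET format
        $avg = "{0:N2}" -f ($stats | Measure-Object -Property Value -Average).Average
        Write-Host "$($esx.Name) average Memory Usage over the last month is: $avg%"
    }
}
```

The exact grouping (datacenter vs. cluster) and variable names are guesses; the cmdlets (`Get-Stat`, `Get-VMHost`, `Measure-Object`) match what the numbered notes describe.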

And I got this output

Dev
esx2.maishsk.local average Memory Usage over the last month is: 82.46%
esx3.maishsk.local average Memory Usage over the last month is: 79.20%
esx5.maishsk.local average Memory Usage over the last month is: 79.04%
esx6.maishsk.local average Memory Usage over the last month is: 72.57%
esx7.maishsk.local average Memory Usage over the last month is: 77.97%

2010-10-14

I headed over to the Labs early to get a lab done. The labs were already busy at 08:10. The Wyse thin client at each and every station was what connected you to the labs, and from there further into the cloud to the remote datacenter.

The lab I completed was LAB18: VMware vCloud Director - Networking. This was the third in the series of cloud-related labs. I think I have said this before: the hardest part of understanding vCD is going to be the networking. I think the rest of the concept is really easy - especially if you have used Lab Manager before. The different kinds of networks, and the segmentation of these networks, are much more complicated than what the VI admin has been used to until now - dealing with the vSwitch and the dvSwitch. It seems that in addition to having to administer the hardware, the virtualization layer, the operating systems, and, as of late, storage (which is becoming a more integral part of a VI admin's day-to-day job), we will now also start to design and manage the network in our cloud environment - or at least have to understand how it works before selling it to our customers or implementing it in our own environment. The lab had some quirks and did not run smoothly. At one point I had Duncan Epping and two other proctors (I apologize, I did not get your names) around me trying to solve the issue, which they did, which allowed me to complete the lab.

I finally got Duncan to autograph both of his books (one day they will be worth millions $$$$ on eBay):

vSphere 4.0 Quick Start Guide and

Foundation for Cloud Computing with VMware vSphere 4

After the lab I moved over to the Blogger Lounge, where at one time there was such a concentration of talent, knowledge and good moods - it was great. Scott Lowe was kind enough to give me a Spousetivities shirt to take back for the missus. On the whole, besides one or two sessions, I think the largest benefit I gained from this conference was interacting with people. But more on that in an upcoming blog post.

I wanted to go back in to do another lab, but the lines - man, the lines!! A 20-minute wait just to get in!

I came back to the lounge and had a great discussion with John Arrasjid, about politics, about cloud, about religion, about home labs, about the VCDX process. It was great.

On my way to a VDI performance assessment session on the other side of the Center, I bumped into Scott Herold, and I went past the session (TA8133 Best Practices to Increase Availability and Throughput for VMware) with Vaughn Stewart and Chad Sakac. I decided to sit in on this one instead. The room was packed! Completely!! I had never seen either of them present before (besides hearing recordings). Both of them are amazing presenters - really, they both know their stuff and have a great amount of knowledge. A great mix of detail, technology and entertainment.

It was a really good session on storage performance best practices. Nothing that I had not heard or seen in their joint blog posts, but hey - you gotta go and see the "Frenemies" in action.

After the session, I went up to say hi to both of them, and had a lengthy discussion with Vaughn about NetApp, storage, vision and what is to come. It was a great discussion.

I am constantly amazed at how accessible the top people in this industry are. I mean, if I were to compare it to the CTO of my current employer, or of other companies for that matter, I am not sure how many of them are available to speak with an end user. You obviously have to take and make the time to allow this to happen. And I am not only talking about the two gentlemen above; this is something I noticed constantly over the last few days, at all levels - I spoke to VPs, CTOs and principal architects.

And really, who am I? An admin. So my hat is off to you all. You are what makes this industry exciting, enjoyable and a better place for us all.

Ok enough philosophy.

I went to sit in one more lab: vCenter Configuration Manager. This is a whopper of a product - but I must admit, the number of customers that will actually get to implement it is not large. The product has a great amount of functionality which ensures your environment is compliant with a baseline that you have defined. It can also revert changes back to the compliant profile if a change has been made - not only for Windows machines, but for Unix, Linux and ESX hosts as well. Which means you can define a profile for a host/VM. If someone (an ambitious power user who has the correct rights) decides to bump up the amount of RAM on a VM for better performance - but is not allowed to - then when the item goes out of the compliant profile you can be alerted, and if you would like, you can revert that change back automatically. I have not really looked at the product's resource requirements, or its cost, but it should be something to keep your eye on.

One last walk around the Solutions Exchange - one more t-shirt from Splunk. I finally got to see a real live UCS, and I must say that I am impressed; the simplicity of the product makes it all worthwhile. Now just to convince upper management.

A round of goodbyes to all the people I have met that I previously only knew by their Twitter handle and I started my journey back home.

And that was day 4 - the last day of VMworld Europe.

One more blog post to come to summarize my experience. And no - I - did - not - win - an - iPad!

Solutions Exchange was still buzzing - it was quite amazing how much you do not notice the noise when you are down on the floor.

I then joined ALT2004 Building the VMworld Lab Cloud Infrastructure, presented by Dan Anderson. Absolutely amazing!!

I so enjoy these sessions given by extremely technical people who do the same everyday job that I do. It is great to hear how they did it, what issues they came across along the way, and how they dealt with them.

A quick summary.

A large part of the lab infrastructure is running on Nested ESXi

This was the Infrastructure in San Francisco

And this is what was set up in Copenhagen

There was another full datacenter set up in the Bella Center itself, but it was not used (redundancy).

Storage used was iSCSI, NFS, and FC

Cloud Lab is an internal product developed by two VMware employees, as Dan put it, "on their second day job".

They were running some 64-bit VMs under the nested ESXi hosts - something which is not possible today. (And of course he did not tell us how it was done.)

Dan's Manager "volunteered" him for the job of creating a Whitepaper on the whole process.

After the session, the Q&A continued further into what exactly the limitation is that does not allow a 64-bit VM to run under a nested ESX.

If you do download the sessions, this one is a must.

After this session, I returned to the blogger lounge - to be asked to join a short one-minute piece on the VMworldTV channel with John Troyer. I will post the link to the video when it becomes available.

I spent some time at the VMware party; music, drinks and food were a-flowing.

This, I think, is one of the best-hidden secrets and most underused features bundled with vCenter. In essence it could enable you to do almost anything you want in your virtual infrastructure. But the issue is that the documentation for the product was, IMHO, severely lacking until the current release, and besides a few how-tos on setting it up, the knowledge out there is practically non-existent.

I would like to point you all to the VCO Team site, which has been releasing amazing content on how to make better use of the product.

The first part was the installation process of Orchestrator, which has been simplified even more than in previous versions.

The lab continued with how to create workflows to deploy VMs and move VMs from one location to another.

I would have liked to see a more advanced level, giving more insight into how to configure and create workflows and how to present them to the end user through the web interface - maybe at VMworld 2011.

I was extremely impressed by the product. The lab gave a good overview, precise explanations and enough hands-on to see examples of how to load balance or NAT a VM, amongst other things.

I started to walk around the Solutions Exchange floor. SWAG, SWAG, SWAG!! Mike has started a worthy cause: The SwagBag Competition.

I had two very interesting demos. One with the VP of Xangati, who have what seems to be a very promising product. The second was a demo of Splunk - they actually had the funniest t-shirts. Seriously! This is something I would like to investigate further after the show.

I then spent some time in the blogger lounge, where it was great to put a real face to the Twitter handles that we all use. Great discussions about all sorts of things were happening the whole time. @jtroyer (so much taller than I expected) was busy with the VMworld LIVE video feed and interviews throughout the day.

A good part of the day was spent in the blogger lounge, where @h0bbel was kind enough to be the blogger lounge photographer.

I then went to BC7803 Planning and Designing an HA Cluster that Maximizes Virtual Machine Uptime, presented by Duncan Epping - which was packed solid.

2010-10-12

So day one started with @esloof sneaking into the Bella Center at the crack of dawn and grabbing some pictures of the Solutions Exchange. In the first 20 seconds you can see how quiet (AND EARLY) it was.

I arrived at the Bella Center at around 9:00 and completed registration hardly having to wait in line. There are at least 12 stations, with two computers each, for registering. A very simple process: put in your name, show your ID, and that's it.

Wi-Fi connectivity sucked - BIG TIME - until (I gather) someone decided to increase the DHCP scope for the access points, and that was only after 14:00.

I snuck in some pictures of the Exhibition Floor myself

The labs were opened early today, and it was good to see some familiar faces. I sat down and completed LAB13: VMware vCloud Director - Install & Config. One of many, I hope.

The interface was as good as they said. Two monitors per seat, and response time was good. The content of the lab was helpful, and as I said to @DuncanYB, we all like to play with the stuff (yes, even us bloggers), and it is not every day that you get the opportunity to have someone prepare a lab environment for you instead of having to do it yourself.

The Technology Exchange day was quite productive. I was especially interested in the session PPC-04 vSphere APIs for Performance Monitoring, with Ravi Soundararajan (Sr. Staff Engineer) and Balaji Parimi (Staff Engineer). Both of them really know their way around the API, and they gave a good session.

For those using the API for PowerCLI work, there are several takeaways that I took from this session:

Getting Performance metrics from the API is extremely easy (when you know what you are looking for)

You can do it easily with PowerCLI

When your environment grows, you will have to change how you do it with PowerCLI; otherwise this will get messy, very, very fast

I gave two short Video Interviews with VMware, one for Technology Exchange Day, and the second for PowerCLI.

It was great to finally put a proper face to all the Twitter names. There are too many to mention.

I do have to say that even though the inconvenience of the security checks at Ben-Gurion Airport is frowned upon by some, I am really shocked at how the rest of the world could have done without such checks before 9/11.

Tomorrow a relatively early start. I will most probably go into the Tech Exchange Day, and I have a meeting or two in the Bella Center.

2010-10-04

I have been waiting for confirmation from the VMware Certification Team regarding this and it came in today.

For those of you who remember, when the track was released there were a number of people already in the process who wanted to upgrade to the new track. The assurance back then was that people already in the track would receive a discount on the new exams.

After exchanging emails back and forth with Jon Hall and the Certification team, asking about this discount, I received this in my inbox today.

If you have passed the VCE310 exam (but have not yet passed the VCD310 exam) and are a VCP4:

Register for the VCAP4-DCA exam at a discounted rate. Become one of the first VCAP4-DCA certified professionals, and fulfill one of the pre-requisites for the VCDX4 certification.

Stuxnet - for those of you who have been unaware of what has been happening with one of the most talked-about worms since Conficker - is an industrial rootkit targeted at Siemens software.

Now conspiracy theories aside:

Who was this targeted at?

Who created it?

I started to think: how can, or could, this reflect on a virtual infrastructure?

What would happen if you had a worm or virus targeted at virtual machines, or even worse, at the ESX hosts themselves?

Let me give the following scenario.

You have 30 Windows VMs, each using 1 vCPU, on your ESX host. The host has two quad-core processors - a pretty sensible VM:core ratio (3.75). All of a sudden every single VM on the host jumps to 100% CPU utilization - which in turn brings your host to a stall.

This is a perfect DoS attack, but in this case it is directed at the ESX host by generating the load inside the VMs.

This was an issue before virtualization as well, but with all of the VMs spiking at the same time, the issue is extremely amplified.

The thing is, though, it does not have to be a virus with malicious code that causes the load. You could do it just as easily with calc.exe (which is on EVERY SINGLE WINDOWS MACHINE):

Run calc.exe and switch to Scientific mode

Type a large number (e.g. 12345678901234567890) and press the 'n!' button

Calc will ask you to confirm, after warning that this will take a very long time

100% CPU utilization will now occur (essentially forever)

For a Linux machine (the kernel build assumes sources are present in /usr/src/linux, and the flood ping requires root):

cd /usr/src/linux

make -j8 &

ping -l 100000 -q -s 10 -f localhost &

100% CPU utilization will now occur (essentially forever)

I am sure that this can be scripted in some way.
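To illustrate the point, here is one way such a load generator could be scripted - a minimal, deliberately short-running sketch (in Python, chosen only for brevity; a real attacker would use anything at hand and would simply never stop):

```python
import multiprocessing
import time

def _burn(stop_at):
    # Busy-loop until the deadline - simulates a runaway calculation like the n! trick
    while time.time() < stop_at:
        pass

def load_all_cores(seconds):
    """Pin every core at ~100% CPU for `seconds` seconds."""
    stop_at = time.time() + seconds
    procs = [multiprocessing.Process(target=_burn, args=(stop_at,))
             for _ in range(multiprocessing.cpu_count())]
    for p in procs:
        p.start()
    for p in procs:
        p.join()

if __name__ == "__main__":
    load_all_cores(2)  # short demo run only
```

Run this inside every VM on a host at the same moment, and you have exactly the scenario above: each guest is doing "legitimate" work, yet the host stalls.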

So there you have it: a DoS attack not only on the VM, but also on the host itself.

I am not sure that the AV vendors would recognize this as malicious activity, because you are running a legitimate Windows application.

Security experts are wary of the dreaded Blue Pill which compromises the hypervisor. How ready are the Vendors against such a kind of attack? How ready are the Hypervisors for such an attack?

You will get alerts almost instantaneously regarding the abnormal CPU usage, but it will require manual intervention to solve the issue.

I would hope that in the future there will be an option to perform automatic remediation for an attack like this. It would require some additional intelligence built into the product.