If you’re writing GemFire cache.xml files you’ll probably want to refer to the GemFire DTD files for valid syntax. These are hosted on gemstone.com but don’t seem to appear in Google searches so I’m putting this blog in place to facilitate finding them.

If you’ve touched a server in the last 5 years or so you probably know that automation matters a lot. There are a lot of things that might prevent full automation when you deploy software into production, and the situation tends to be especially bad for commercial software. Maybe it’s defective installers, maybe it’s manual EULA acceptance or clickthrough agreements, account logins that you might need to reset, or maybe it’s software that’s not designed to work out-of-the-box as soon as you install it. Thankfully most of this stuff is slowly and steadily dying out.

With vFabric one of our goals was to enable fully automated deploys in a way that fits into the broader context of tools you’re either already using or seriously considering.

For vFabric 5 we’ve focused our attention on enabling automation on RHEL. For now Windows users are a bit out of luck, partly because the automation capabilities on Windows are not as developed as they are on Linux. Look for this to improve in the future though.

Using yum to install vFabric software on RHEL.

Step 1 is to attach your RHEL system to our public repo. The easiest way to do this is to install a special RPM we have created that sets up the appropriate files:
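That step looks roughly like this; note the RPM name and URL below are placeholders, since the real ones come from the vFabric download instructions:

```shell
# Hypothetical example (run as root): install the repo-definition RPM.
# The real URL comes from the vFabric download page; this one is a placeholder.
rpm -Uvh http://repo.example.com/pub/rpm/vfabric-repo-latest.noarch.rpm

# Afterwards a new repo file appears under /etc/yum.repos.d/ and
# the vFabric packages show up in yum:
yum repolist
```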

All the various vFabric software is there and ready to be installed. Let’s install vfabric-gemfire and see what happens.

I get a summary of what will be installed, including vfabric-gemfire and vfabric-eula. Let’s answer y to this question (equivalently, we could have passed yum -y) and see what’s next.

This is the standard vFabric EULA you would click through on the website or when running the GemFire JAR installer. One option is to scroll to the bottom of the EULA, accept it, and be on our way. One other thing to note: the EULA only needs to be accepted once per OS, so if we install more vFabric software we don’t have to go through this again.

Obviously, though, we can’t automate this. How can we do a true unattended install?

Doing a true unattended install on RHEL with yum.

We put a mechanism in place that allows the EULA to be “pre-accepted” through a specially worded file. Here’s how it works.

If you put the text string I_ACCEPT_EULA_LOCATED_AT=http://www.vmware.com/download/eula/vfabric_app-platform_eula.html into a file called /etc/vmware/vfabric/accept-vfabric-eula.txt, you will not be prompted to accept the EULA and vFabric software will install unattended. (You really should read the EULA; we worked very hard on it.)

One of many ways to do that is as follows:
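One minimal sketch, run as root, using the path and string given above:

```shell
# Create the marker file that pre-accepts the EULA (run as root)
mkdir -p /etc/vmware/vfabric
echo 'I_ACCEPT_EULA_LOCATED_AT=http://www.vmware.com/download/eula/vfabric_app-platform_eula.html' \
    > /etc/vmware/vfabric/accept-vfabric-eula.txt

# Subsequent installs now run fully unattended
yum -y install vfabric-gemfire
```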

Here’s what a GemFire install looks like after you’ve done that:

No prompting. It’s pretty easy to see how you could integrate the entire process into a Chef recipe or a Puppet script, or even into more traditional software management systems. The repo is also set up so you can host it internally using Spacewalk, for example.

Other reasons to do yum installs.

There are a few other reasons to do yum installs of vFabric software, particularly when going into production. For instance, the necessary user accounts and init.d scripts will be created for you, and so forth. Take GemFire as the example again: if you installed it via its JAR installer, all of this work would be left to the end user to figure out. At this point in the evolution of IT operations, the way init scripts should work and the way user accounts should be set up is pretty well baked. It’s not an area users should be trying to innovate in, and it’s better to have all this stuff “just work” out-of-the-box when you install.

If you’re using GemFire in your software, there’s a good chance you’d like to build it using Maven. GemFire is hosted in a Maven repository but the details don’t seem to be documented, so this post is a bit of a “Missing Manual” entry.

It’s pretty easy: you add the GemFire repository to your pom.xml and then declare GemFire as a dependency.
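As a sketch, the relevant pom.xml sections look something like this; the repository URL and version below are the ones I believe were current at the time, so double-check them against the repo root page:

```xml
<!-- Repository hosting GemFire artifacts (verify the URL on the repo root page) -->
<repositories>
  <repository>
    <id>gemstone</id>
    <url>http://dist.gemstone.com/maven/release</url>
  </repository>
</repositories>

<!-- The GemFire dependency itself; pick whichever version you need -->
<dependencies>
  <dependency>
    <groupId>com.gemstone.gemfire</groupId>
    <artifactId>gemfire</artifactId>
    <version>6.6.1</version>
  </dependency>
</dependencies>
```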

If you want to see all the available versions of GemFire, you can browse the repo root page.

I used this in the course of putting together a Continuous Query client for GemFire, which lets you run SQL-like queries against GemFire that return matches in real time as data enters GemFire. So you can run something like “SELECT * FROM /myregion WHERE totalsale > 100” and any time an object with a totalsale value greater than 100 is inserted into GemFire, you’re immediately notified of it. It’s pretty interesting, and the client is a nice way to interactively discover the system’s capabilities.

One other thing that might not be totally obvious is that GemFire is free for development use: there is a non-expiring license to use it with up to 3 connections (clients or members in the distributed system). So feel free to try it out, play around, and give feedback.

Disclaimer: These thoughts are my personal opinions. Since I work on vFabric these days and not vSphere I’m not very familiar with the specific reasons behind any choices made with respect to the new licensing scheme.

I missed the big launch yesterday because I’m hanging out in India with the GemFire and SQLFire teams, but I got up this morning and saw the discussion is completely dominated by talk of the licensing changes. That’s too bad because vSphere 5 has some pretty cool new features that aren’t getting the attention they deserve, for instance Storage DRS and SRM failback.

The new scheme is quite different from anything we’ve seen, and any change is naturally met with skepticism. I think people who are worried about the new scheme should consider what would have happened if VMware had maintained the status quo. vSphere 4 was not licensed per host; rather, it was “per CPU socket up to X cores,” where X depends on the license level: 6 for Enterprise and 12 for Enterprise Plus, if I recall correctly.

The big push in hardware is to increase core counts as fast as possible. Predictions are that core counts will follow a Moore’s-law-like trajectory, doubling about every 18 months. I know there are 10-core CPUs on the market today, and much denser CPUs are in the pipeline. So consider that in a few years you’re going to have computers with 32 or 48 or maybe even 64 cores per CPU. If you’ve got a computer with 4 CPU sockets, each with 64 cores, you’d need 44 licenses of Enterprise or 24 licenses of Enterprise Plus to have the host in compliance. (The formula is ceiling(# cores / max cores at license level) * number of sockets. Get that??)
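To make that arithmetic concrete, here’s the formula applied to the hypothetical 4-socket, 64-cores-per-socket host, using shell integer math to get the ceiling:

```shell
# licenses = ceiling(cores_per_socket / max_cores_per_license) * sockets
sockets=4
cores=64

# Integer ceiling trick: ceil(a / b) == (a + b - 1) / b in integer division
ent=$((     (cores + 6  - 1) / 6  * sockets ))   # Enterprise caps at 6 cores/license
entplus=$(( (cores + 12 - 1) / 12 * sockets ))   # Enterprise Plus caps at 12
echo "Enterprise: $ent, Enterprise Plus: $entplus"   # prints "Enterprise: 44, Enterprise Plus: 24"
```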

Now consider that each host in your inventory is going to have different hardware profiles and require the same kinds of calculations. What a mess!

The fundamental thing is that to deal with this mess VMware had to change to some sort of pooling mechanism. Other vendors are going to have to do similar things. These sorts of host-by-host calculations on discrete boundaries are just unsustainable as IT environments get more and more complex. Pooling is really the only answer. The only other tenable option is per-host licensing, but you won’t see that from anyone because it can’t monetize big hosts and small hosts differently.

VMware could have chosen core-based pooling or memory-based pooling. As it happens they chose memory-based; I don’t know the specifics of why, since both hardware assets seem to be on the same growth trajectory. But my understanding is that the memory amounts they chose (24/32/48 GB, depending on license level) were picked so that most common current hardware configurations would not be affected.

So if you’re worried about vSphere’s new licensing scheme, consider the mess you would be in without this change, and also bear in mind that you were going to need to buy a lot more vSphere 4 licenses to enable the next wave of hardware anyway, because of exploding core densities. With pooling the operational aspects get a whole lot easier.

Update: In case you missed it, the vRAM licensing scheme was changed with higher ratios and some other tweaks. Read about it on the official blog.

A couple of products I work on (SQLFire is one of them) have WAN features for linking different datacenters together with asynchronous replication. Whenever I tell people about it, the first thing they ask is what happens if the same data is updated in both datacenters at the same time. I don’t really know all that well, so I wanted a tool that would help me learn more about it: I needed a WAN emulator.

I looked at a couple of WAN simulators, specifically WANem and Dummy Cloud. WANem gave me a pretty bad impression: its text UI is extremely buggy, and I didn’t even notice at first that it’s supposed to be driven through its Web UI. Maybe the rest of the product is better, but I’m looking for something as simple as possible. Dummy Cloud looks pretty good and is loaded with features. Too many, actually, for what I want, and the free version doesn’t let you do really high latencies. So I decided to roll my own. I packaged the result as the WANatronic 10001 virtual appliance, which you can run on VMware Workstation, and probably ESX and Fusion as well.

Goals:

Can work with nothing but a laptop (no actual network required).

As small and lightweight as possible.

Simple, as close to zero configuration as possible.

Really cool name.

I think things turned out fairly well. Namely:

Traffic sent directly to WANatronic 10001 is sent right back to whatever host it originated from. If you ping WANatronic 10001 you’re actually pinging yourself. Same with SSH, etc.

Packaged as a VM needing 32MB of memory, with a download size of about 170MB, which is small as far as virtual appliances go, though still bigger than I’d like.

There are 4 things you can configure. If you’re lazy and don’t configure anything, WANatronic 10001 will still work.

That’s pronounced WAN-a-tron-ic ten-thousand-one in case you were wondering. Just try to forget that name. See? You can’t do it.

Using WANatronic 10001

Possibly the most interesting thing, and the thing I think I’ll use it most for, is simulating slow links on a single host. In this case there is nothing you need to do on your computer. In the screenshot above my WANatronic is at 10.24.0.13. Any traffic I send to that IP is sent right back to me. So if I have two processes on my host that I want talking slowly, let’s say one is on port 8000 and one is on port 9000, all I do is tell the first application that its peer is on 10.24.0.13:9000 and tell the second one its peer is on 10.24.0.13:8000. Done.
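As a concrete illustration of that port 8000/9000 setup, here’s a hypothetical version using netcat as both processes (10.24.0.13 is my appliance’s address, not yours, and netcat flag syntax varies between variants):

```shell
# Hypothetical illustration; requires a running WANatronic at 10.24.0.13.
# Process A: listen locally on port 8000 (some netcat variants need "-l -p 8000")
nc -l 8000 &

# Process B: instead of talking to localhost:8000 directly, talk to the
# appliance, which bounces the traffic back to this host with the
# configured latency and loss applied
echo hello | nc 10.24.0.13 8000
```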

WANatronic can also function as a router-on-a-stick; all you need to do is route traffic through it. For example, Google’s free DNS is located at 8.8.8.8. If you want to simulate being on a slow network between here and there, run:

ip route add 8.8.8.8 via 10.24.0.13

Your IP address will be different from mine; check the console to see what it is. You may also need to adjust that command based on your OS. With that set up, ping 8.8.8.8 to see the effect of latency and packet loss.

Last but not least, configuration is via the console since there is no way to remotely connect to WANatronic. Changes are automatically saved and applied.

Technical Mumbo Jumbo

WANatronic uses Ubuntu 8.04 JEOS as a base. I used this older distro to cut down on the size of the virtual appliance, which unfortunately is still huge. For the most part WANatronic is a very basic use of the iptables and tc utilities. The WANatronic source code is hosted on GitHub.
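For the curious, the delay-and-loss part of tc is the netem qdisc. A minimal standalone example (not WANatronic’s actual rules; the device name and numbers here are arbitrary, and the commands need root):

```shell
# Add 200ms of delay and 1% packet loss to everything leaving eth0 (run as root)
tc qdisc add dev eth0 root netem delay 200ms loss 1%

# Inspect the active qdisc
tc qdisc show dev eth0

# Remove it again
tc qdisc del dev eth0 root
```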

There is one exception that required a little bit of science. I really wanted any traffic sent to WANatronic to be NATted directly back to me. Without this I would need to use multiple network interfaces which I wanted to avoid. As far as I can tell there’s no way to do that with iptables, so I made a custom version of the iptables NETMAP target that would replace the packet’s destination address with its source address in PREROUTING. After that the packet is NATted so it appears to originate from WANatronic. This is the magic that lets you talk to yourself on what appears to be a very crappy link. Perfect for a demo that needs to reside completely on a single laptop.

Stuff It Doesn’t Do

Plenty of things, really, but the ones that come to mind include:

Static IPs (seems useful)

Routing stuff like OSPF (not going to happen)

Remote management (not sure it’s worth it)

IPv6 (does anybody actually use that?)

Really fancy stuff like packet reordering (not likely)

If you’ve got comments or would like to see something let me know.

In case you scrolled to the bottom for a download link here it is again: WANatronic 10001.

I tried not to get a swelled head when Stu Radnidge conferred divine status upon me some years ago. After all Stu grew up in the Australian wilderness and I suppose riding a kangaroo to school every day could lead to a warped sense of reality.

So it was refreshingly deflating to find that nobody even noticed my VMworld session (session 2917; vote for me). In fact, permit me to helpfully reproduce what you would have seen if you had voted for me, as you should have.

Elastic Memory for Java is a project I’ve been helping out with for the past year or so that aims to let you run Java apps on ESX without having to give them all their memory all the time. Java apps run great on ESX, so long as they have all the memory they want. If they don’t, things can go south really quick.

This is a big problem because Java likes to be lazy about cleaning up after itself, and prefers to hold on to every piece of trash and dead memory in sight until it nearly collapses under its own weight. It’s sort of like the software world’s answer to the Collyer brothers.

This behavior also explains why traditional ballooning in ESX doesn’t work well with Java. EM4J is different because it’s designed to work within Java and has an intimate understanding of what’s happening in the Java app. If you wanna find out more, you gotta come to the session.

Why is VMware doing this you ask?

Efficiency is in.

We’re on the precipice of an app explosion. We’ve already seen it in the consumer space: more special-purpose apps that individually do less than the monolithic apps of the past. This is infiltrating the enterprise, and when it does we’ll need an order of magnitude or two more consolidation to cope with it.

Successful Succession?

The future of ESX’s COS is still a contentious subject of speculation. And with its future in doubt, so is the future of the ESX administrator’s most beloved friend and constant companion, esxtop. Esxtop’s heir apparent, resxtop, remains plagued by platform restrictions and unclear compatibility. Can it really be a viable replacement?

These both appear to use some form of secret, undocumented API. It’s not clear why such a vitally important data source needs to be shrouded in mystery. Thankfully, the following cable was intercepted by a brave and intrepid soul whose name I am sworn to keep secret. It sheds substantial light on the nature of this hidden API, and will hopefully prove an invaluable reference.

The aforementioned PowerCLI team provided some rudimentary discovery mechanisms, but they may not dig deep enough. For example, LucD notes on his blog that he found only 392 counters on his system, whereas the cable we obtained identifies 425 counters. Is there some dark conspiracy at work here? Worse, many of the counters appear to give misleading information, seeming to need adjustment before they can be used. Hopefully the information captured in this cable will resolve these questions once and for all.

What We Found

Our intercepted cable revealed a treasure trove of information. Not only did it give us a full readout of all statistics, but their names, the way they correspond to the names shown in esxtop, and possible clues for how to adjust misbehaving counters. We rely on all of you to expose this data to the light and help us unravel this mystery.

vFabric GemFire is a Java-oriented in-memory distributed caching platform. To translate that into English: it’s a way of sharing and processing data that’s extremely fast, because the data is kept live in memory, and extremely scalable, because GemFire applications can automatically discover each other and share data between peers over the network.

GemFire can run peer-to-peer or standalone. When you use it peer-to-peer, you embed GemFire in your application; when the application launches, it discovers all its peers automatically over the network.

You run GemFire standalone through something called cacheserver. In the standalone case, you launch one or a few cacheservers, and a large number of clients connect to them. This mode resembles what you’d see with something like memcached or redis, and it’s good if you need to centralize data, write data to disk, or perform continuous queries.
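A minimal sketch of launching one, assuming GemFire’s bin directory is on your PATH (the working directory here is an arbitrary example; see the product docs for the full option list):

```shell
# Start a standalone cache server in its own working directory
mkdir -p /tmp/cs1
cacheserver start -dir=/tmp/cs1

# ...clients connect and do work...

# Stop it again
cacheserver stop -dir=/tmp/cs1
```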

GemFire includes an extensive example set that gives you a good sense of what’s possible. Unfortunately, I found the getting-started guide a bit hard to follow. Part of it was just the usual Java BS: any time you’re asked to set environment variables to get software working, you know you’re in for a rough ride. But beyond that, there are various pieces you have to collect and configure before you can really get started, and how to run the examples themselves is not very clear; the directory structure of the sample files seems inconsistent with the way the instructions tell you to run them. Maybe that was my error, but I bet a lot of people would make the same one.

To try to help things a bit, I wrote a script that tries to completely automate the demo experience. The script downloads all the necessary code and then presents a menu where you can select any demo you want to run. Some of the examples require a cacheserver, and in those cases the demo script will automatically start and stop one for you. There’s nothing to think about: just run the script and it should work (leave me a note if it doesn’t!).

Download the script or just copy and paste it from the box below. After you download, just run it and you’re on your way (see below for a note on system requirements).

You need a system with Python, Java, and unzip all in your path (no environment variables, please!). Other than that it should work anywhere. Leave me a comment if you try it and it doesn’t work. Otherwise, have fun checking out what GemFire has to offer, and don’t forget to check out the continuous query and function execution examples!

Why do I have to pass a CAPTCHA test after I log in to my account with my exceptionally strong password? This is bullshit. I don’t feel like squinting to figure this crap out. Nobody else on planet earth is insane enough to do this nonsense, and you guys need to cut it out. Not even my bank, you know, the thing that controls, like, all my money and stuff, makes me do stupid shit like this. You guys are a 2nd-rate social networking platform. Do you really think the junk in my LinkedIn account is more important than my finances??? Reality-check time, people.

Not to mention that I use KeePass and never use the same password on two different sites. The risk is extremely low without your unwanted roadblocks. In fact, if you can figure out my password, go ahead and knock yourself out. Update my profile to say I’m a professor of scatology at UC Berkeley and that my hobbies include tormenting kittens. I don’t care, because there is nothing of value in my LinkedIn account. Keep going the way you’re going and there will be nothing of value in any LinkedIn account.