Month: April 2009

Today I received a rather alarmed email from one of my customers who’s on the faculty at a large research university here in the US. Apparently, email originating from the server I’m maintaining for this customer is being bounced by the mail servers at the educational institutions that are users of our software.

Examining the bounce messages, I find that they’re originating from anti-spam appliances sold by Barracuda Networks, Inc. Each bounce message contains a URL pointing you to an explanatory web page, which indicates that the messages are being bounced because the outgoing email servers for the Engineering department at this large university have been listed in Barracuda’s “bad reputation” blacklist. There is a laundry list of reasons cited as to why these mail servers may have been listed, but no clear indication of the actual offense that caused these specific servers to be listed.

However, there is this little highlighted tidbit on the web page:

One way to get your email through spam filters even if you are listed on the BRBL is to register your domain and IPs at EmailReg.org. Email administrators can configure their systems to use EmailReg.org to apply policy to inbound email. Emails from domain names and IP addresses that are properly registered on EmailReg.org can be automatically exempted from spam filtering defense layers on Barracuda Spam Firewalls, preventing your email from being accidentally blocked.

Surfing on over to EmailReg.org I discover that getting your server address “properly registered” requires a $20 “administrative charge”– apparently per server. Furthermore, it seems that EmailReg.org is at least receiving hosting equipment from Barracuda Networks. There is little other information to be found regarding who exactly is behind EmailReg.org.

But let me tell you what it smells like to me– it smells like a “protection racket” being run by Barracuda Networks. They can add arbitrary senders to their “bad reputation” blacklist and then prominently advertise the services of EmailReg.org as a mechanism for being removed from the blacklist. Judging by the number of bounce messages my client is receiving, being blacklisted by Barracuda devices cuts you off from sending email to a significant number of organizations. Many companies, even legitimate senders, will likely pay the $20 just to avoid the hassle. If, as I suspect, Barracuda Networks is receiving some commercial gain from EmailReg.org, then this is conduct of the lowest order.

I have filed a complaint with the US Federal Trade Commission, asking them to investigate this matter. I urge everybody who has had similar experiences to file similar complaints with the appropriate organization for your jurisdiction.
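If you want to check whether your own outgoing servers are in a similar spot, DNS-based blacklists are generally queried by reversing the octets of the IP address and looking the result up under the list’s DNS zone. Here is a minimal Python sketch of that convention. The zone name is my assumption about the public BRBL lookup zone, the function names are made up for illustration, and the list operator may restrict who is allowed to query it.

```python
import ipaddress
import socket

# Assumption: this is the public lookup zone for Barracuda's list.
# Confirm it against Barracuda's own documentation before relying on it.
ASSUMED_ZONE = "b.barracudacentral.org"


def dnsbl_query_name(ip: str, zone: str = ASSUMED_ZONE) -> str:
    """Build the DNS name used to look up an IPv4 address in a DNSBL."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip: str, zone: str = ASSUMED_ZONE) -> bool:
    """Return True if the lookup name resolves, which by convention means 'listed'."""
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        # No answer. This also covers resolver failures, so "False" only
        # means "no listing was returned", not "definitely clean".
        return False
```

Calling `is_listed()` with the address of each outgoing mail server gives a quick yes/no, but a negative answer is weak evidence because a blocked or failed query looks the same as a clean result.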

I was recently asked to make a guest appearance on a podcast related to information security in “the cloud”. One of the participants brought up an interesting anecdote from one of his clients. Apparently the IT group at this company had been approached by a member of their marketing team who was looking for some compute resources to tackle a big data crunching exercise. The IT group responded that they were already overloaded and it would be months before they could get around to providing the necessary infrastructure. Rebuffed but undeterred, the marketing person used their credit card to purchase sufficient resources from Amazon’s EC2 to process the data set and got the work done literally overnight for a capital cost of approximately $1800.

There ensued the predictable horrified gasping from us InfoSec types on the podcast. Nothing is more terrifying than skunkworks IT, especially on infrastructure not under our direct control. “Didn’t they realize how insecure it was to do that?” “What will happen when all of our users realize how easily and conveniently they can do this?” “How can an organization control this type of risky behavior?” We went to bed immersed in our own paranoid but comfortable world-view.

Since then, however, I’ve had the chance to talk with other people about this situation. In particular, my friend John Sechrest delivered an intellectual “boot to the head” that’s caused me to consider the situation in a new light. Apparently getting the data processed in a timely fashion was so critical to the marketing department that they figured out their own self-service plan for obtaining the IT resources they needed. If the project was that critical, John asked, was it reasonable from a business perspective for the IT group to effectively refuse to help their marketing department crunch this data?

Maybe the IT group really was overloaded– most of them are these days. However, the business of the company still needs to move forward, and the clever problem-solving monkeys in various parts of the organization will figure out ways to get their jobs done even without IT support. “Didn’t they realize how insecure it was to do that?” No, and they didn’t care. They needed to accomplish a goal, and they did.

“What will happen when all of our users realize how easily and conveniently they can do this?” My guess is they’re going to start doing it a lot more. Maybe that’s a good thing. If the IT group is really overloaded, then perhaps it should think about actually empowering its users to do these kinds of “one off” or prototype projects on their own without draining the resources of the core IT group. Remember that if you let a thousand IT projects bloom, 999 of them are going to wither and die shortly thereafter. Perhaps IT doesn’t need to waste time managing the death of the 999.

“How can an organization control this type of risky behavior?” You probably can’t. So perhaps your IT group should provide a secure offering that’s so compelling that your users will want to use your version rather than the commodity offerings that are so readily available. This solution will have to be tailored to each company, but I think it starts with things like:

Pre-configured images with known baseline configurations and relevant tools so that groups can get up and running quickly without having to build and upload their own images.

Easy toolkits for migrating data into and out of these images in a secure fashion, with some sort of DLP solution baked in.

Secure back-end storage to protect the data at rest in these images with no extra work on the part of the users.

Integration with the organization’s existing identity management and/or AAA framework so that users don’t have to re-implement their own solutions.

Integration with the organization’s auditing and logging infrastructures so you know what’s going on.

Putting together the kind of framework described above is a major IT project, and will require input and participation from your user community. But once accomplished, it could provide massive leverage to overtaxed IT organizations. Rather than IT having to engineer everything themselves, they provide secure self-service building blocks to their customers and let them have at it.

Providing architecture support and guidance in the early stages of each project is probably prudent. After all, the one hardy little flower that blooms and refuses to die may become a critical resource that eventually needs to be moved back “in house”. It will help that the building blocks used to create the service are already well-integrated with the organization’s centralized IT infrastructure, but a reasonable architectural design from the start will also be a huge help when it comes time to migrate and continue scaling the service.

Am I advocating skunkworks IT? No, I like to think I’m advocating self-service IT on a grand scale. You’ll see what skunkworks IT looks like if you ignore this issue and just let your users develop their own solutions because you’re too busy to help them.

One of my personal and professional mantras is, “It doesn’t take that much longer to do it right.” Sure, you can always do a half-assed job on some project just to shove it out the door, but there are inevitably down-stream costs. Whether it’s time lost having to go back and fix your broken junk or customer frustration and negative perception, your short-cut decision usually ends up costing you more in the long run.

Of course you know I have a story to illustrate this principle. During the peak of the dot-com boom I was helping a small start-up company move out of their sub-lease arrangement into their new permanent home. The weekend of the move, we had contracted with a moving company to do the physical move of all the equipment and personal items and I was helping the IT team for the company do the setup, server room build out, telephony configuration, etc. The building we were moving into was a two-story affair, and the plan for the equipment that was moving onto the second floor was to arrange it on pallets, wrap the pallet securely, and forklift the pallets through a second-story window. Pretty standard practice, actually.

The moving company, however, had negotiated a fixed-price bid. So it was in their best interests to get the work done as quickly as possible. Without our knowledge, they decided to throw caution to the wind and not wrap the palleted equipment. Sure enough, the first pallet went up on the forklift and as the pallet was moving forward through the window, one of the computers toppled off the edge and fell a full story onto concrete. The punchline is that the computer belonged to the office manager for the company and was the computer that was going to cut the check to the moving company. We were actually able to recover the hard drive from the twisted chassis and boot it in another PC, but the moving company had to pay a significant penalty that more than erased any savings that they might have achieved from not wrapping the pallets. Also, after that incident, we made them stop and wrap all the pallets anyway, which cost them even more time.

Once the equipment was loaded in, the IT team and I could get started with our part of the project. And it was a big project: from our Friday evening start to late Sunday night, I think we got maybe 12 hours of downtime to sleep. But there we were Sunday night and everything appeared to be ready for business to resume Monday morning.

Exhausted, but feeling pretty good about ourselves, we were standing admiring the new server room and somebody pointed out that we forgot to put the covers back on the cable raceways. There was a collective groan. We were “done”– surely replacing the covers could wait until the following week when we were all less tired? But when we all looked each other in the eye, we knew that the upcoming week was going to be so chaotic that if we put off doing it now we’d never get around to it in a timely fashion. So, without a word being spoken, we all grabbed some covers and spent another half hour or so fitting them into place.

When the execs toured the facility the following morning, we actually did get some compliments about how “neat” and “professional” the server room looked, but I don’t think that’s really why a crew of exhausted geeks spent an extra half hour re-assembling cable raceways. Nor was it because we’re anal retentive or budding masochists. I think it was because (a) we wanted to establish a culture of “doing things right” in the new server room so that entropy wouldn’t set in so quickly, and (b) we knew that going back and fixing the problem later would take longer than doing it right then.

So when you find yourself looking for excuses not to “finish” a project and just “get ‘er done”, take a moment and think about whether expediency is really the best policy. Remember that bugs are always cheaper to fix in development than after the product ships. It doesn’t take that much longer to do it right.

At a recent tech event, I ended up having a conversation with Wendy Kincade and Colleen Dick about how we sometimes let critical projects slide. Wendy said a very wise thing, which is that when people are avoiding a project, it’s most often because they don’t think they’ll reach a successful outcome. In some sense it becomes a vicious cycle: we know we really should be making progress on that big project, yet we somehow always find other things to be working on that allow us to avoid it. Of course, the longer we avoid it, the less time we have to complete the project and it only becomes bigger and scarier as a result of the shrinking time window.

I’ve certainly come to recognize this behavior in myself. It’s much more comfortable to spend your days doing small, tactical tasks that have short completion times and satisfying outcomes. It’s a huge leap of faith to set out to tackle an enormous project when you’re not sure if you have all the skills necessary to accomplish the task, where the end of the project is not clearly in sight, and where the cost of failure may be high. It gives me an uncomfortable feeling in the pit of my stomach, not unlike the feeling you get when contemplating stepping off from a great height.

When I recognize this feeling in myself, I immediately take steps to start tackling the project, because I know from previous experience that if I let it linger the situation is only going to get worse. Here are some tactics that I’ve developed for getting over the hump:

1. Break it up: Vast monolithic projects are daunting, so break the project up into a set of deliverables, milestones, and dependencies. Then outline the steps necessary to reach each component of the project. You don’t have to create a formal project plan– in fact, I’ve seen people spend all their time grooming a plan in MS Project, just to avoid getting started on the actual work. A simple outline format is fine.

2. Pick an easy one: Once you’ve got a notion of the individual tasks you need to accomplish to finish your project, pick one of the tasks that you think you can complete quickly and get it done. During our conversation, Colleen commented, “I know that if I can just knock one thing down, that gives me energy to push further into the project.”

3. Make it fun: What motivates you? I really enjoy figuring out and mastering new technology. So if there’s a component of the project that requires me to do a bunch of research to figure something out, I’ll tend to do that first plus give myself leeway to spend extra time really getting mastery of that subject. While I need to be careful to prevent turning the research into an avoidance exercise in and of itself, I also know that any mastery I acquire will be useful at some point in the future, even if it’s not directly relevant to the project at hand. Some people reward themselves after completing a particular part of the project– take a break to play your favorite video game, hang out with friends, go for a hike, whatever.

4. Consider past success: Reflect on the fact that you’ve accomplished difficult tasks in the past. Remember the satisfaction you felt when you finally shipped those projects. Use these feelings to reinforce your belief that you’ll be successful at the project you’re currently embarking on.

While I’ve come to know my own avoidance behaviors and learned to take steps to work around them, I don’t think I’ll ever be entirely free of them. I think it’s just a natural human risk aversion response. However, I also recognize that one has to take risks to accomplish great things. I have a quote from the explorer Magellan on the wall in my office and I look at it often:

Unlike the mediocre, the intrepid spirits seek victory over those things that seem impossible… They embark on the most daring of all endeavors… to meet the shadowy future without fear and conquer the unknown.

I would argue, however, that this level of change management is only appropriate once you reach a certain size company. If the company is more than 100 people, you need to have these policies in place and you must have enforcement, or the cost for running the IT team in a manner that benefits the company is impossible.

In startups, where I have spent much of the last decade, the change management systems you have defined above would be overly prohibitive and remove the flexibility that is critical for success.

John’s not alone in expressing this view. I’ve heard similar sorts of comments from companies of all different sizes, some of which were substantially larger than John’s suggested 100 person threshold. But I think that change management is important regardless of what size you’re at, and it doesn’t have to remove any “flexibility” or “agility” from the organization. Quite the contrary, appropriate change management should enable the organization to move more rapidly because it reduces the failed changes and unplanned work that suck up resources that could otherwise be more productively channeled.

The key word there is “appropriate”. Of course the change management process in a 3-10 person start-up looks completely different from the process in a company with hundreds of employees. In an early-stage start-up you’ve typically got a team of people working very closely together with laser focus on a single line of business. You don’t tend to have the kind of “process flow control” issues that larger companies do, where you need change review meetings to balance competing priorities and competing schedule issues.

But even three-person start-ups need to make production changes thoughtfully and with rigor. It’s easy to think “we know what we’re doing” and get yourself into a lot of trouble and cause a significant outage. It doesn’t take that much longer to sit down and write a detailed implementation plan, have one of your co-workers review it, and then execute it (Hickstein’s “Think, think, think, type, type, type, ‘beer’!”). And the bonus is that the history of implementation plans helps you when you need to grow your infrastructure, because now you have the documented list of configuration changes necessary to produce replicas of your existing systems.

Do I think a three-person start-up needs formal change control meetings? Heck no! If you have regular Engineering meetings, set aside a little time to mention scheduled production updates (if any) and solicit feedback. If you don’t have regular meetings, set up an email alias where notices of production changes can be posted. That way, at least everybody will be aware of the current state of affairs on the production systems (or can refer back to the archives as appropriate), which is critical information for them to know as they’re developing code for those platforms.
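To make the email alias idea concrete, here is a minimal Python sketch of what a production change notice might look like. The addresses, the subject tag, and the field names are all hypothetical; the point is only that a notice carrying the change window, the reviewer, and the planned steps is cheap to produce and easy to search in the archives later.

```python
from email.message import EmailMessage


def build_change_notice(sender, alias, summary, window, steps, reviewer):
    """Build (but do not send) a plain-text notice describing a production change."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = alias
    # A fixed subject tag (hypothetical) makes notices easy to filter and search.
    msg["Subject"] = f"[prod-change] {summary}"
    body_lines = [f"Window: {window}", f"Reviewed by: {reviewer}", "", "Steps:"]
    body_lines += [f"  {n}. {step}" for n, step in enumerate(steps, start=1)]
    msg.set_content("\n".join(body_lines))
    return msg


notice = build_change_notice(
    sender="ops@example.com",           # hypothetical address
    alias="prod-changes@example.com",   # hypothetical alias
    summary="Upgrade web tier",
    window="Sat 02:00-03:00",
    steps=["Drain node", "Install package", "Re-enable node"],
    reviewer="a co-worker",
)
print(notice)
```

Actually delivering the message (for example with `smtplib`) depends on your mail setup, so I’ve left that part out.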

I would, however, recommend that you implement some sort of configuration control process on your production systems. It could be as simple as implementing an Open Source utility like AIDE or Samhain, just to keep an eye on what’s happening on the system. Aside from alerting you to cockpit error on the part of your own people, these kinds of tools can also alert you to more nefarious activity and are part of a good baseline security posture.
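The core idea behind this kind of configuration monitoring is simple: record a checksum for every file you care about, then periodically compare the current state against that baseline. The Python sketch below illustrates that idea only. It is not how AIDE or Samhain are implemented, the function names are mine, and real tools also track permissions, ownership, and other attributes, and protect the baseline itself from tampering.

```python
import hashlib
from pathlib import Path


def snapshot(root: str) -> dict[str, str]:
    """Map each file under root (by relative path) to the SHA-256 of its contents."""
    base = Path(root)
    digests = {}
    for path in sorted(base.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            digests[str(path.relative_to(base))] = digest
    return digests


def compare(baseline: dict[str, str], current: dict[str, str]) -> dict[str, list[str]]:
    """Report which files were added, removed, or modified since the baseline."""
    return {
        "added": sorted(set(current) - set(baseline)),
        "removed": sorted(set(baseline) - set(current)),
        "modified": sorted(
            name
            for name in baseline.keys() & current.keys()
            if baseline[name] != current[name]
        ),
    }
```

In use, you would take a `snapshot()` of a configuration directory right after a planned change, store it somewhere safe, and have a scheduled job `compare()` a fresh snapshot against it. Anything in the report that doesn’t match a posted change notice deserves a closer look.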

At some point in the growth cycle of the company, you’re going to start getting feedback from developers that they “don’t care” about the production update notices. Congratulations! You’ve just reached a major milestone in your company’s maturation process– the beginning of separation of duties. This is probably also around the time you’ll be hiring your first full-time IT person, so start soliciting resumes.

Your change management processes will also start adjusting to your new realities. Your new IT person is going to become the keeper of the implementation plans and other change documentation. They’ll probably also start pushing you for more formal outage windows, just so they can have some predictability in the environment. And they’re also going to start pushing back on the developers to keep them from making direct changes on the production systems. Let these things happen.

The next thing you know, you’re going to look up and realize you’ve got several IT folks and they’ve got their own manager. Furthermore, you’ve got several products now being developed concurrently. This is the stage where John suggests that your company needs to start embracing a formal change management process like the one described in Visible Ops, and I agree. Hopefully you figure out you’ve reached this stage before you have a production outage caused by multiple, badly coordinated updates.

Just like the wrong time to fix bugs in your product is after the product has shipped, it’s wrong to try and build a culture of change management from scratch in an established company. It is very hard to change a “cowboy culture” once it’s been allowed to establish itself. Visible Ops has a quote from Dr. Bob Doppelt, who was actually speaking of public health matters when he uttered it, but it is nonetheless appropriate: “The righter we do the wrong things, the wronger we become.” The problem is that inattention to change management can appear to work for a period of time, mostly because nobody’s bothering to track the amount of time lost to firefighting and unplanned work. But suddenly an organization wakes up and realizes that they’ve become utterly crushed by the tyranny of unplanned work. Digging out of this hole is painful.

So resist the notion that change management is “only for big companies”. Don’t you hope to be a big company some day? Well you’re not going to receive an angelic visitation complete with fully-functioning change management process on the magic day you somehow cross the “big company” threshold. Better instead to be a small company that believes strongly in change management and grows naturally into a formal change management process.