Scheduling

In my training sessions, I often use sports metaphors for project schedule modeling – this, despite my utter lack of any sort of athletic ability and pathological indifference to most sporting events. The thing is, sports events are great examples of predictive models.

Take the World Cup for instance. If I were to ask you who would have taken the Cup way back at the beginning of this thing, you would have selected from 32 teams. At a basic level, we could say that each team has an equal chance of winning. Of course, we can make the model more complex. We can look at individual player performance, historical records, playing style, locations in which matches will be played….. When we throw all of those together, we can create a model of the top 5-10 favored teams.

What is this doing? It’s adding information to the model to predict the future – and providing a range of potential outcomes. Like any schedule, as I learn additional information, I include that in my model and refactor the results. From where I sit at the time of writing, we have now gotten down to the quarterfinals. Of the 32 teams that started, 8 are left. Clearly, I can remove 24 of the original candidates from the list of potential winners.

Simply by removing those 24 from the list, I have greatly increased my chances of being correct about who will win the championship. Can I tell exactly who will win? No, but I can apply my data modeling structure to the remaining teams and come up with a much better prediction than when the games started.
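The narrowing effect is easy to sketch. Under a deliberately naive uniform model (my assumption here, not a real tournament model), the probability of a correct pick is just one over the number of remaining candidates:

```python
def pick_winner_probability(candidates: int) -> float:
    """Probability of a correct pick under a uniform, no-information model."""
    if candidates < 1:
        raise ValueError("need at least one candidate")
    return 1.0 / candidates

# As teams are eliminated, the same naive model gets sharper.
for stage, teams in [("group stage", 32), ("quarterfinals", 8), ("final", 2)]:
    print(f"{stage}: {pick_winner_probability(teams):.4f}")
```

A real model would weight each remaining team by player performance, historical records, playing style and venue, as described above – but even the naive version shows how eliminating candidates improves the prediction.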

Extrapolate that forward, and as each game is played, the potential outcomes for the competition narrow even further. Eventually, we’re left with two teams – which greatly simplifies the modeling. Finally, at the end, we know who won.

Apply this model to a schedule. We forecast a range of potential outcomes, some more probable and some less probable. As the project plays out, the range of potential outcomes diminishes. The most probable path becomes more and more apparent.

Another metaphor I commonly use is a toothpaste tube. Assess the amount of risk, or uncertainty, associated with the schedule: as events transform from the potential future into the recent past, the amount of uncertainty goes down. It’s the same as squeezing toothpaste from a tube. At the end of the day, there’s no uncertainty, as the project is complete. All of the risk has been squeezed out of the tube.

Figured this out a while back for the forums, but then promptly forgot and shot myself in the foot.

Consider the following scenario:

You modify the default site template to have more than one task list – for instance, if you plan to have workflow deployed on the site – which requires a task list.

You publish a project with that site template.

You create a schedule for that project.

The schedule does not appear on the site. You don’t get that cool Project Summary view.

It turns out the schedule is sync’d to the first task list it finds on the site alphabetically. If you have a task list named something like “My Workflow Tasks,” the schedule gets sync’d to that list.

The solution is to always make sure the task list that you want to sync with the project comes first alphabetically. Alternatively, you can modify the Project Summary webpart so it points to the correct list – although I’d probably check out option #1 first.

‘Ware the resource hoarders in your organization. You’ll know them because they’re the ones who refuse to provide detailed task estimates for their projects. When pressed for details on their resource plans, they’re like as not to produce an Excel chart that shows the resource allocated evenly for 32.75 hours across the length of the project.

If you see a spike in a given week, you might ask something like “What are they doing in March?”…only to be told that they’ll be working on “Release stuff.”

“Great,” you say, “Where’s the schedule that shows when the release will occur so we know if that spike in resource demand will move out if the release moves?”

And the hoarder responds with, “Well, we’re not getting that detailed with our planning.”

The issue here is that the hoarder has been given resources by the organization. “You will have X resources for 6 months,” the organization declares. The response, logically, is to ensure that, on paper at least, we will use those resources.

The problem here is severalfold:

Resource hoarding becomes a self-fulfilling prophecy. If I commit to a resource for 40 hours a week, and I don’t plan what that resource will work on…..I end up filling that resource’s time with make work and inanities. Mind you, I’m not saying that committing to an FTE is bad….I’m just saying that it needs to be justified. (And yes, I would consider a formal Agile methodology to be justification enough, as, when implemented properly, that methodology puts in place the controls required to ensure the resource is working on real work.)

The hoarders are never pressured to link their resource plans to the actual schedule – meaning that changes to the schedule don’t connect to the resource plan. As a result, resources may not be available when the project needs them – or worse, they have extra capacity that could be otherwise used while they wait for a delayed deliverable to appear in the queue.

The thing is, in organizations that tolerate and/or encourage resource hoarding, you’ll have all sorts of latent resource capacity sitting under the radar. That capacity represents a tremendous opportunity cost in the work that is not getting done – because the resource is dedicated to a project that may not be really using them. Start prying into resource visibility, and you’ll be shocked at how much excess capacity scurries out from under a rock.

Teach scheduling to any class of IT project managers, and the number one request is invariably “How do I make my schedules more predictable? How do I accurately predict important events such as application releases?” There are really two answers to that question – and they’re not mutually exclusive:

1. Only use dedicated resources on your projects. It’s the interplay between break-fix work and new builds that dooms project schedules in your “normal” IT department.

2. Report probabilistically. Instead of a single forecast date, report a target date along with the probability of hitting it.

Needless to say, option #1 is a nonstarter in most organizations. The challenge with the latter is that it requires an organizational shift. It requires different communication techniques and different reporting. Instead of reporting on a forecast release date, I report on a target release date and the probability that I might actually hit it. That’s the cultural shift required to move towards more predictable scheduling.

Nowhere in IT is this cultural shift more evident than in the relationship between the project manager and the release manager….

What Do You Mean I Can’t Release Tomorrow?

Release managers are incredibly annoying with their change processes, their release reviews and their insufferable insistence on having actual documentation. They represent the worst of organizational process and present an obstacle to getting my application in production and letting me finally wipe my hands of my current project and move on to something new and interesting.

Not only that, but I think they only care about increasing the costs of any project I’m on – whether by requiring an inordinate number of hoops to jump through to actually get an approved release window, or by forcing my team to spin cycles performing extra validation and testing on our application. I mean, we know it will work. It worked in Dev, right?

These are natural sentiments for a project manager. Over the months and years of a project, the project manager learns to look out for his project. He becomes focused on his project. And as he knows, his projects are always far more important, far more critical to the organization at large than any other project within the portfolio.

Look to Big Oil for the Answers

A couple of years ago, I had the opportunity to dip my toes into the water of oilfield maintenance scheduling. It was actually my first engagement upon moving to the Houston oil patch. For those of you not familiar with maintenance scheduling, it looks something like this:

Whenever a new facility or piece of equipment is commissioned, a series of maintenance routines are loaded into a Computerized Maintenance Management System (CMMS). These routines might call for quarterly inspections, monthly cleaning procedures, filter changes, belt changes, lubrication, etc. Think of your local Starbucks, and all of the equipment they must maintain – and then multiply that by all of the Starbucks in your market, and you’ll get the picture.

The CMMS spits out tickets whenever they’re ready to be worked. This feeds into the scheduled maintenance backlog. Each time a ticket is created, a timestamp is recorded, and the age of the tickets begins to be tracked. I don’t know what a “typical” backlog is, and assume that varies by industry, but figure a backlog of anywhere from 6 months to several years is reasonable – depending on the priority of the ticket and the industry.

Those tickets go to the various work crews that do the work. As each crew has a set of specialized skills and equipment, the same work crew would be responsible for supporting multiple facilities throughout the oilfield. Going back to our Starbucks example, this is the tier 2 support team responsible for the hard-core equipment maintenance (assuming such a team exists – I know next to nothing about coffee house logistics). It’s up to the work crew to optimize their schedule so that they can meet their performance metrics.

Enter the facility operator role. The facility operator is in charge of the safe and productive operation of his facility. His job is to ensure the facility meets its production goals and reduces or eliminates any reportable health, safety or environmental incidents. The facility operator has the authority to turn away a work crew that arrives on any given day if the work they are to perform is considered too dangerous….i.e. a simultaneous operations (SimOps) issue, where one crew is performing work in close physical proximity to another crew. For example, one crew might be welding while another crew is venting flammable gasses right next door. One crew might be slinging steel overhead while another crew is swapping out equipment belts below. For obvious reasons, SimOps issues are to be closely monitored and avoided at all costs. Turnaway metrics are also monitored, as they entail cost: the work group geared up, schlepped out to the facility, and then got turned away – which could potentially kill an entire morning or a full day of productivity.

Hence a lot of organizations have moved to a system where the CMMS tickets are pushed into a scheduling optimization system. Trained schedulers then review all of the tickets and assign target dates based on priority, geographic proximity, risk and other factors. They “bundle” maintenance by the same crew in the same area to ensure increased productivity. This system optimizes maintenance ticket throughput, reduces turnaways (inefficiencies), and mitigates risk both of individual harm and the systemic risk of a high maintenance backlog.
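A minimal sketch of that bundling step might look like the following – the field names and sort keys here are my own assumptions for illustration, not any particular CMMS schema:

```python
from collections import defaultdict
from typing import NamedTuple

class Ticket(NamedTuple):
    ticket_id: str
    crew: str      # specialized crew required to work the ticket
    area: str      # geographic area of the facility
    priority: int  # lower number = more urgent
    age_days: int  # days since the CMMS generated the ticket

def bundle_tickets(tickets):
    """Group tickets by (crew, area) so one trip to the field clears
    several tickets, then order each bundle by priority, then age."""
    bundles = defaultdict(list)
    for t in tickets:
        bundles[(t.crew, t.area)].append(t)
    for key in bundles:
        bundles[key].sort(key=lambda t: (t.priority, -t.age_days))
    return dict(bundles)
```

A real scheduling optimization system layers risk, SimOps deconfliction and travel time on top of this, but the core idea is the same: cluster the work so each crew visit is as productive as possible.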

In essence, what we have here is a number of systems being optimized for their own goals:

The CMMS is identifying the tickets that must be performed and tracking the aging metrics.

The work teams are focusing on their own daily marching orders.

The facility operators are focusing on the safe operations of their facility.

The schedulers’ role is to balance the needs and wants of those groups with the overall priorities of the organization – which in an oilfield is typically mitigating risk of both reportable incidents and production stoppage.

And IT Folks Should Care Because….

Release management is essentially the same series of systems, writ small. The project manager is focused on the local optima, getting that release out the door as soon as possible. The release manager is looking at all of the releases coming down the pipeline and optimizing across the entire infrastructure. Everyone is just performing their natural role in the systems, and the release managers represent the organizational check of a project manager’s native optimism.

Generally, the way we see this playing out is that the project manager, through the development of the schedule model, creates a prediction of when the application will be ready for release. That date is then negotiated with the stakeholders and the release manager to pin down an approved release window.

That release window is then inserted back into the schedule as a target or constraint on the actual release event. Mechanically, within Microsoft Project, that looks something like this.

See how I’ve inserted the release window as a Release Target – which is basically a repurposed Deadline field? I then add a Start No Earlier Than (SNET) constraint on the successor tasks. In effect, this adds buffer to my schedule, as I can now track the Total Slack on the Release Activity. The lower the Total Slack, the greater the chance that I’ll miss my release window.

I’d point out that adding buffer is nothing specific to IT project management. I’ve recommended a similar approach to predicting pipeline completion dates in a drilling scenario. That helps us avoid the dreaded WOPL, where the well is completed, but Waiting on Pipeline.
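The slack calculation behind that buffer is trivial to sketch; the dates below are invented for illustration:

```python
from datetime import date

def total_slack_days(forecast_finish: date, release_target: date) -> int:
    """Days of buffer between the forecast finish and the release window.
    Negative slack means the current model predicts a missed window."""
    return (release_target - forecast_finish).days

# Hypothetical dates: development forecast to finish July 1,
# negotiated release window opens July 15.
buffer_days = total_slack_days(date(2014, 7, 1), date(2014, 7, 15))
```

As the forecast finish slips to the right, the slack number shrinks toward zero – which is exactly the early-warning signal the Release Target technique gives you inside Microsoft Project.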

…Bringing it Full Circle…

So, what we have in the interplay between project and release management is the same interplay we see in oilfield maintenance. We’ve got two different systems that interact but are focused on different optima. The project system is focused on getting the release out as soon as possible. The release system is focused on ensuring the enterprise infrastructure doesn’t break. The goal then is to incorporate into our project scheduling the feedback to and from the release management process.

In yesterday’s post, I proposed a model for assessing work and assigning an appropriate lifecycle. That’s great in theory, and for those folks who deliver framework workshops, as it allows us to spend a couple more hours diagramming stuff on the white board and adding lots of arrows and boxes and circles. But what does that mean at a more tactical level? How do multiple lifecycle models allow us to get closer to answering the two questions that prompted this discussion:

Do you also think that all schedules can or should follow the *same* (arcane or standardized, old or new, blue or red, agile or fatcat) schedule model?

…and….

How far ahead are you able to do almost perfect, good and less great predictions in your schedule? How does your scheduling model affect your prediction capabilities?

I’d contend that yesterday’s post addressed question #1, i.e. not all work is created equal, and the lifecycle should be tailored to the work. That leaves the second question, which essentially boils down to asking how to solve the conundrum of being able to meet organizational estimating and control requirements while still maintaining a flexible lifecycle model. This post is intended to address that question, i.e. to identify how we can have our cake and eat it too, by using iterative models and still meeting the organizational estimating requirements.

Hence, this post is more tactical in nature. My goal here is to talk about how to actually structure a system to work with the models proposed yesterday.

Work Authorization Systems

Typically, as the project progresses through the business case development process, the risks are assessed and the appropriate model assigned. We would need to incorporate an assessment of which lifecycle model would be appropriate into our work authorization systems.

How would that impact the organization? Realistically, we might not wish to authorize a project that maps to a model we’re not familiar with. Perhaps we need to focus on hiring resources that can actually manage projects like this. Or….we may wish to restructure the project to mitigate risk, i.e. to shrink the project size by chartering each iteration separately and reducing the overall scope of the effort….or extend the business case development process to include the initial prototyping.

At the end of the day, process is inherently an exercise in risk mitigation, and the process applied to the work should be commensurate with the level of risk that has been identified in the work. Flagging the project appropriately from the beginning allows us to route it through the appropriate estimating process.

‘Ware the Bean Counters

Which brings us to the bean counters. You know who they are. They’re the folks that want to know how many resources will be used and when they’ll be required. They’re the bogeyman that project managers use to justify not developing detailed estimates, but simply peanut-buttering resource requirements across the lifetime of a project. “Bean counters,” it is argued, “do not understand iterative planning. We must give them standard CPM schedules, even if we know they’re wrong.”

This is a fallacy. The reality is that the bean counters need to know what resources are required and approximately when. They’re not the ones looking for near term resource contention. That’s the functional managers. We need to differentiate between the two main goals of our estimate consumers:

Short term resource contention – looking for potential resource conflicts in the next three months.

Long term resource availability – looking for availability of specific roles over the next 3-12 months.

These are inherently two different goals, and we must review our estimating behavior to ensure that we can meet both of them. The implication, however, is that when building my detailed schedule, I need to focus on the resource assignments I can define in the near term, but I can be a bit more vague about the assignments I need beyond that immediate planning horizon. That methodology will still meet the needs of my stakeholders.
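One way to enforce that split in an estimating tool – entirely a sketch, with a horizon I picked arbitrarily at three months:

```python
from datetime import date, timedelta
from typing import Optional

NEAR_TERM_HORIZON = timedelta(days=90)  # assumed three-month contention window

def classify_assignment(start: date, today: date,
                        role: str, resource: Optional[str] = None):
    """Near-term work must name a specific resource (for contention
    checking); beyond the horizon, a generic role is enough (for
    long-term availability forecasting)."""
    if start - today <= NEAR_TERM_HORIZON:
        if resource is None:
            raise ValueError(f"near-term {role} work needs a named resource")
        return ("named", resource)
    return ("role", role)
```

The point of the sketch: the functional managers get named-resource detail inside the horizon, and the bean counters get role-level demand beyond it – both consumers served from one schedule.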

Untangling Scope and Schedule Control

But organizations like estimates. Organizations like some sort of static prediction of the future – regardless of whether or not it’s correct. Hence, we need to make certain assumptions about the future. In IT, this typically comes down to how well our scope must be defined before we can be confident about our estimates.

There’s no absolute answer to this question, but I would say that the scope must be defined to the point that I can confidently point to the upstream budget and either validate or invalidate it based on my refined definition of the project scope. In a waterfall project, I typically depict that process as follows, with the two lines representing the various levels of control over the scope of the project. Once the blue and red lines intersect, the scope could be said to be “under control”: the project manager has defined scope well enough that anything which would put us outside of the agreed-upon parameters of the project could be identified as a change.

The goal of the planning and estimating process is to move that intersection from the left to the right until the project manager has defined enough scope to determine whether the budget is adequate and whether a newly identified requirement would constitute a change. If your schedule estimate can do that, can identify near-term resource contention, and can meet the long-term role-based predictive needs of the organization, you’ve successfully estimated your project.

Critical Path, Shmitical Shmath

So can a critical path schedule be developed for a long term project with only a partially defined scope? Probably not, although I can define a near term critical path, and couple that with some longer term tasks for the more undefined future. Remember, the critical path is the beginning of the scheduling process, not the end. The critical path is the starting foundation to perform risk analysis and then load uncertainty into the schedule through the use of buffers and probabilistic analysis.

But what about iterative projects? Can I define the critical path on a project using an agile methodology? Should I? I prefer to use a different method for estimating agile projects. The trick to estimating these kinds of highly iterative projects is to define the path to get the project kicked off, plus some of the high-level design. After that, I continuously perform an assessment: do the costs of my dedicated development resources exceed the value and opportunity cost of the remaining development effort? If the answer at any point is yes, that project should be terminated. If the answer is no, the project is generating more value than cost and it should be continued.

Hence, with iterative development we want to plan out the project just far enough until that inflection point is visible. If we assume a dedicated team, then we can effectively predict our costs over the lifetime of the project. If we don’t have a dedicated team, then good luck, make your best guess, and modify it on a weekly basis. Understand that inaccurate estimates are almost always the result of multitasking team members.
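That continuation test reduces to a one-line comparison. A sketch, assuming a dedicated team with a knowable weekly burn rate (the numbers are invented):

```python
def should_continue(remaining_value: float,
                    weekly_team_cost: float,
                    weeks_remaining: int) -> bool:
    """Keep iterating only while the value left to deliver exceeds the
    cost of carrying the dedicated team through the remaining work."""
    return remaining_value > weekly_team_cost * weeks_remaining

# Hypothetical: $180k of backlog value left, $20k/week team, 8 weeks to go.
keep_going = should_continue(180_000, 20_000, 8)
```

Re-run the check at the end of every iteration; the moment the inequality flips, you’ve reached the inflection point and the project should stop.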

Reporting Concerns

Flagging the lifecycle applied to the project is also critical from a reporting standpoint. Reports typically roll up process compliance indicators to the PMO or executive staff. These reports would have to be amended to call out the various compliance points for projects adhering to different lifecycles, i.e. the milestone report that would work for waterfall projects may not work for iterative projects. Reports need to be tailored by execution model.

From the resource management perspective, resources may also be allocated or estimated differently for each of the lifecycle models. For example, a waterfall methodology may incorporate task based estimating with or without rolling wave for long term future tasks. A more iterative approach with dedicated resources would essentially allocate capacity in an operational method, allocating resources until we reach the inflection point between cost and value. Some sophistication would be required to roll up the data effectively across the various execution models.

Going back to the comment that kicked off this entire diatribe…

How far ahead are you able to do almost perfect, good and less great predictions in your schedule? How does your scheduling model affect your prediction capabilities?

…the answer would be as far ahead as I need to meet the organizational requirements – knowing that I can modify my estimating methodology to match the work lifecycle just like pairing wine and cheese.

Do you also think that all schedules can or should follow the *same* (arcane or standardized, old or new, blue or red, agile or fatcat) schedule model?

…and….

How far ahead are you able to do almost perfect, good and less great predictions in your schedule? How does your scheduling model affect your prediction capabilities?

…which leads me to conclude that Magnus is hanging out with the project managers I was working with last week who asked me the same questions – or simply made the same assumptions about what I’m going to say before the workshops even started.

Let me state this unequivocally at the beginning of this conversation:

Not All Work Is Created Equal

No lifecycle model fits all projects. One of the leading causes of failures for most PMOs, if not the leading cause of failure, is the ignorance of that fact and the attempt to shoehorn all work into the same model. Lifecycles should be paired with projects like wine and cheese (or in Texas, artisanal beer and brisket).

There, I’ve said it. Let’s just put it out there as an opening premise and build from that. What’s interesting is that I was about to write that “No process fits all projects,” but on reflection decided not to. Because we can have a single high level process that captures work, categorizes work, assesses risk, and then assigns a lifecycle model according to predetermined triggers. In that case, arguably there is a single process, but it’s at a high level…which brings us to the federal model of process design that stipulates that the process is defined at the level appropriate to the process. (See here and here for a couple of 2011 discussions of this model – which I note in retrospect also earned a comment from Magnus.)

So the trick is to pair the model with the work to be performed, which generally, but not universally, implies that we’re talking about work being performed in the IT domain. Why IT? Because IT is one of those domains that typically manages work that could map to different lifecycle models. As I’ve spent the last few years bouncing back and forth between building pipe and building code, I usually contrast this to construction, which typically starts with a better-defined scope than most IT projects.

A Triage Model

The best model I’ve yet found to break down lifecycle models in a manner that’s easily explainable is one I came across years ago in Terry Williams’ Modeling Complex Projects, which references a 1993 model by J.R. Turner and R.A. Cochrane. Reproduced in the form of a graph here, it provides a useful framework for assessing the risks inherent in a project.

In this framework, a project may be assessed by how well the scope of work is known vs. how well the method is known by which we will achieve this scope. In the top left, both elements are well-defined, i.e. we’re using proven technology to deliver a project scope which all of the team members and stakeholders are familiar with. I’d place pipeline construction in this quadrant. In the IT world, this may typically map to an infrastructure project, where we’re deploying X number of servers or virtualizing hardware that’s gone off lease.

In the top right quadrant, we might have something where the scope is undetermined or hazy, but the technology is pretty well known. For instance, we’re using an existing database platform to deliver an application to support a brand new business process. The process is probably still being defined, but we have experienced staff familiar with the technology.

In the lower left quadrant, we have established scope but a method that is new to the organization or the project team. For example, we might be porting an existing application from an older technology to a new technology – while keeping the functionality the same.

And last but not least, if we’re looking for the trifecta (Bifecta? Difecta?) of managing systemic risk, we might go for the bottom right quadrant, developing with a new technology to meet an uncertain scope.

Pairing Scope Management Models with Lifecycles

Now, the discerning reader at this point would ask if we’re looking at scope management models or risk management models. Essentially at this point, it’s the same thing. Much of the risk is introduced from the scope, and step 1 in any risk assessment exercise would be to ensure that we’re implementing the project management structures in a way that they will effectively manage risk – which if you’re following a “one methodology fits all” practice probably does not apply.

So what lifecycle maps to each quadrant of the model? Far be it from me to claim expertise in the nuts and bolts of specific lifecycles, but I would submit the following as a starting point:

Scope: Known / Method: Known
A traditional waterfall lifecycle is appropriate to this scenario. “We’ve done similar work before and can estimate it without too much difficulty.”

Scope: Known / Method: Unknown
Shorten the design cycle, but extend the development cycle. Plan multiple quality checkpoints into the project design to ensure the team is meeting its goals. Ensure plenty of testing is included in the project. Consider an iterative development cycle with multiple internal prototypes reviewed before showing them to the stakeholders.

Scope: Unknown / Method: Known
Plan for multiple prototyping sessions to ensure the requirements are being met. Release early and often to allow the organization time to acclimate to the changes and to refine its own processes accordingly per the magic stone model.

Scope: Unknown / Method: Unknown
See the previous scenario, but with more stringent quality control and even more system design and release iterations.
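The table boils down to a simple triage function. The quadrant descriptions below are my own paraphrase, not anything taken directly from Turner and Cochrane:

```python
def pick_lifecycle(scope_known: bool, method_known: bool) -> str:
    """Map the goals-and-methods quadrants to a candidate lifecycle."""
    if scope_known and method_known:
        return "traditional waterfall"
    if scope_known and not method_known:
        return "iterative development with internal prototypes and extra QC"
    if not scope_known and method_known:
        return "prototyping sessions with early, frequent releases"
    return "highly iterative: design/release cycles plus stringent quality control"
```

Wiring a function like this into the work authorization system is what lets the intake process flag the lifecycle (and hence the estimating process) from day one.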

Now is Agile the solution to all of this? I know some people that would claim yes. I’ve always taken a more moderate approach, which is to say that a properly managed Agile methodology is appropriate to some of these lifecycle models – provided that you’re using an appropriate form of project management rigor and have dedicated technical resources. I certainly wouldn’t just label the project as “agile” because a simple waterfall methodology is inappropriate and then jettison all rigor and process. That would be wrong.

Folks like me often get a lot of push back from project managers as we work with their PMO to ramp up the quality of their schedules. Most often, I see complaints not about the schedule itself, but about the seemingly arbitrary list of arcane rules required to ensure that the schedule prediction is underpinned by established modeling best practices. Examples of such rules might be that every task should have at least one predecessor and successor, tasks should not exceed a specific duration, or that the update methodology must be strictly adhered to. You know, simple DCMA 14 Point Assessment stuff.
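Rules like those are mechanical enough to check automatically. Here is a toy auditor – the task schema and the duration threshold are assumptions for illustration, not the actual DCMA thresholds:

```python
MAX_DURATION_DAYS = 10  # assumed cap; use whatever your PMO standard dictates

def audit_schedule(tasks):
    """Return (task_name, issue) pairs for basic logic-rule violations:
    missing predecessors/successors and overlong durations."""
    issues = []
    for t in tasks:
        if not t.get("predecessors") and not t.get("is_start"):
            issues.append((t["name"], "no predecessor"))
        if not t.get("successors") and not t.get("is_finish"):
            issues.append((t["name"], "no successor"))
        if t.get("duration_days", 0) > MAX_DURATION_DAYS:
            issues.append((t["name"], "duration exceeds threshold"))
    return issues
```

The point isn’t the tooling – it’s that these checks only work when the logic lives in the schedule model rather than in the project manager’s head.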

A schedule developed without following these rules may still be accurate, insofar as the dates may actually be realized as reported, and the data all supports the organizational reporting requirements. The model however, the underlying logic, is invisible. It’s all happening in the PM’s head and then being reported through the schedule mechanism.

What we have at this point is a reporting schedule. It’s a schedule created to meet the minimal organizational requirements and to document the key dates of the project. It does not, however, capture the underpinning logic of the schedule. It is missing the schedule model.

A couple of years ago, PMI introduced these two concepts, the concept of the “schedule model,” or the logical predictive model of a schedule, and the concept of the schedule itself, which is a static snapshot of the schedule model at a specific point in time. For example, I create the schedule model in my favorite scheduling application, with all of the dependencies and sparing use of constraints, etc. Then, every week, after updating my model, I generate my prediction of what the future will look like. That prediction is my schedule. The schedule is refreshed each week with the output of my updated schedule model.

Are these predictions correct? I don’t know that anyone can ever say a prediction of the future is correct. The more accurate question is whether these predictions are valid. Are they an accurate reflection of everything we know about the work to date? In fact, that’s my litmus test for validity. Can I look at your schedule, and ask you, point blank, “Is this the most accurate prediction of the future based on what you know today?” If the answer is anything other than yes, I would consider the schedule to be invalid.

Let’s take that and apply it to a typical audit scenario. In this scenario, you, the project manager, are telling me that the dates are all correct and valid per your latest understanding of the project. That statement is something I cannot challenge – with the possible exception of calling out tasks completed in the future or incomplete work still scheduled in the past. What I can do is ask how the model was developed, to which the response is invariably, “It’s all up here,” with a finger tapping the temple.

In essence, what you’re telling me is that your schedule model is all in your head. It may be valid and it may be invalid, but I, as an external observer have no way of identifying that. Your schedule model is hidden to me, and therefore, unless I trust you implicitly, I can’t trust your model.

This is why we have schedule audits. This is why we have DCMA checkpoints. Because while it would be nice if we all just had a little more trust of each other, given the high cost of projects in the world today, that’s a luxury that many organizations simply cannot afford. And all of those audits and quality assurance processes come to nothing when the schedule model is hidden and we’re only allowed to see the schedule.

So in the end, while you can show me a schedule that’s resource loaded and has all of the key organizational milestones attached to it, you can’t show me your schedule model. You see, it’s all in your head, it’s all mental. And that’s why I can’t make a judgment about whether or not you have a valid schedule.