Interpreting the data we capture for process improvement and coaching teams is hard. Sometimes you think you understand what you see, only to have that taken away from you. Sometimes, if it walks like a duck and quacks like a duck, it could still be a rabbit. Perspective and context matter.

Data isn’t evil, people are evil. Sure, sometimes data can highlight something we would rather not be true, but this doesn’t make it wrong to capture and analyze data in proper ways.

No matter how strong you are, being embarrassed is painful. When data is perceived as evil, it’s most commonly because it has been used as a blunt tool for coercing some behavior. Never do this, and never make data the tool to embarrass people. Not just because it’s wrong to do to that person, but because it drives the data you need to make better decisions underground. If you embarrass someone using data without context, you will see only a partial picture in the future, and your decisions will be sub-par.

If that’s not enough, Deming puts the people aspect of process impact around the 10-15% range versus 85-90% for the system. A metric managing the system response is far more likely to return benefits.

A good way to assess how data is being used is to ask a simple question: is that data being used to make a difference? Or is it just being used to make a point, to say “I’m better than you”?

To make a difference, data should tell a story. The readers should be able to follow the argument and train of thinking, and see something they weren’t expecting. A story doesn’t just jump to the last page to see whether Little Red Riding Hood lives happily ever after; there is a buildup of the characters. Your data stories should do the same. They should take the reader on a journey and leave them with the moral of the story.

Let’s see a couple of examples of taking boring data and turning it into a story. Obligatory cat photo to make sure this deck is tweeted.

Boring tabular data. The domain of Excel. Also the format in which NOAA makes weather data available to the public. Pretty boring stuff, even for weirdo data geeks like me. But the clever Ivo, an avid kite surfer, turned this data into this [click]

Beautiful. But it gets better, the wind speed and direction and temperature are animated [click]

You can feel the weather pattern in my hometown of Seattle. Color helps me quickly interpret areas of higher wind speed. I can change altitude and see how the jet stream will impede or help my flight to Atlanta. I can see what the wind and weather are like here in Atlanta. I can see clouds and rainfall, temperature or barometric pressure. I can see how this trended in the past or is expected to trend in the future. I can see the story of the atmosphere and weather better than by reading it from an Excel text table.

Impressive, but hard for someone of my skill to do. Let’s see one that we all could have built using Excel. The Wall Street Journal did a feature piece on vaccination. They took boring health data from the University of Pittsburgh and turned it into this [click]

Even without the annotation, was the polio vaccine effective? It was introduced in 1955, and you can see the story of how every state in the US saw a decline in infections for every 10,000 people. The picture tells the story of how, within 5 years, the disease infected fewer people as the number of immunized people increased, pushing a debilitating disease into the rear-view mirror.

Edward Tufte – author and researcher on visual communications. Every number we show or hear should be put into the context “Compared to What?” A single number is meaningless. Single numbers make us susceptible to drawing incorrect conclusions. We need to know if the value we just got is better, worse or comparable to something else to see a story.

In our prior example, the authors managed to pack four “Compared To’s” into one graphic: state versus state, year versus year, before and after an event, and occurrence rates versus others in color.

And to see it wasn’t just a fluke, here is the same chart for measles. Gone in under 5 years after the vaccine was introduced. Source: Battling Infectious Diseases in the 20th Century: The Impact of Vaccines, by Tynan DeBold and Dov Friedman. Published Feb. 11, 2015 at 3:45 p.m. ET. http://graphics.wsj.com/infectious-diseases-and-vaccines/ http://www.tycho.pitt.edu

Our last example shows census data. The age of the US population in a recent census. Steve Wexler took this boring data and put a clever spin on it to help create a personal story for the viewer. [click]

He let you put YOUR age in and see where you fell. Are you older than more than 50% of the population? Are you over the hill? I am. I really wish he had said YOUNGER than 37.3% of the male population. His point is that as long as it’s quiet and nobody else is around, you want to see how you compare to others like you. If for no other reason than to be “less over the hill” than your spouse or sibling. I’m both!

So that’s the theory. Let’s look at how LeanKit took an existing report built into the application and applied these principles. Steve Wexler and I worked with the talented internal Visualization team to prototype and build the following visualization to replace an existing report as part of a training course.

Here is the current report. Pretty standard scatter plot of cycle times at the bottom, and a running average of cycle times for each type of work at the top. Design-wise, these reports have served customers well. Being 46, the dots are a little small to hover over with my arthritic hands, and it’s pretty bland. It also doesn’t answer a lot of the questions I have around the time-based aspects of software development.

So, I crowdsourced a set. I used Twitter to get feedback on which questions most resonated with my followers. Crude, but it quickly gave some weight to the possible questions we needed to answer. What we found was that more comparison of similar work was key, as was the ability to zoom in and out on date groupings: day, week, or month, for example. LeanKit didn’t do either of these very well in its current form.

So, taking input from the market, the following four questions were key. And we had three major “Compared to What” vectors: Card Type, Time Periods and My Cards. Now that we had an understanding of the story we wanted to tell, it was time to do some research.

W. Edwards Deming published many books about his work in managing manufacturing processes. In his book “Out of the Crisis” he goes to some lengths to explain that when looking at a process, your first job is to do no harm. If the process is “in control”, leave it alone! Touching it is likely to make things worse. A graphic caught our eye. It is a run chart showing spring manufacturing data. Each dot represents a tested spring, and a downward trend can be seen even though the variation shown by the marginal histogram looks perfectly symmetrical. Deming’s point is that you can’t assess a process unless you can see both the change over time and the distribution of variation. The scatter plot alone isn’t enough. The histogram of result frequency isn’t enough. You need both.

So we turned our attention to a paper prototype. And we wondered: what if we put marginal histograms, like Deming’s, on the right side and the top of a scatter plot? Would we tell a fuller story about the process and how “stable” it is? And that’s what the team did. [click]

Although this looks more aesthetically pleasing by not yelling in ALL CAPS using harsh colors, this visualization packs in a lot of information. Let’s look at the layout.

Across the top are the time period buckets. Users can choose day, week or month; we default it to week. And they can choose to show all types of work or just selected ones; we default it to all types. [click] Under that is a bar chart showing how many items were completed in that period. This is throughput. We can see by glancing at the bar height how the completion rate is changing over time. [click] Then our scatter plot. To avoid all of the dots overlapping and being inaccessible by cursor, these dots are randomly jittered in horizontal location. Hovering over them gives popup information about that specific piece of work. This is technically a jitter plot. [click] The key story, though, is how the average reference lines in each period trend over time. We try to help the user follow the path from left to right, giving them the average value of cycle times for items completed in that time period. [click] On the right-hand side is the marginal histogram showing the cycle-time distribution. This is an area of my research, how the cycle time histogram relates to Agile process factors, but most people will totally ignore it! [click]
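
The layout described above can be sketched in a few lines. This is a rough illustration only, not LeanKit’s implementation: the sample dates, the Monday-based week buckets, the definition of cycle time as completed date minus start date, and the function names are all assumptions made for the example.

```python
import random
from collections import defaultdict
from datetime import date, timedelta
from statistics import mean

# Hypothetical sample items: (start date, completed date).
items = [
    (date(2016, 1, 4), date(2016, 1, 5)),
    (date(2016, 1, 5), date(2016, 1, 8)),
    (date(2016, 1, 11), date(2016, 1, 12)),
    (date(2016, 1, 6), date(2016, 1, 26)),
]

def week_start(day):
    """Monday of the week containing `day`; used as the bucket key."""
    return day - timedelta(days=day.weekday())

def weekly_summary(work_items):
    """Map each completion week to (throughput, average cycle time in days)."""
    buckets = defaultdict(list)
    for started, completed in work_items:
        buckets[week_start(completed)].append((completed - started).days)
    return {week: (len(days), mean(days)) for week, days in sorted(buckets.items())}

def jittered_x(bucket_index, rng, width=0.4):
    """Horizontal dot position: the bucket centre plus a random offset."""
    return bucket_index + rng.uniform(-width, width)

summary = weekly_summary(items)
rng = random.Random(42)  # fixed seed so the dot layout is repeatable
dots = [jittered_x(0, rng) for _ in range(5)]
```

The bar heights come from the first number in each pair, the reference lines from the second, and `jittered_x` only spreads the dots sideways so they can be hovered over; it never changes the cycle time value on the vertical axis.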

And here is how I interpret the story of this team for this period. Demand is decreasing, and this lower demand is helping stabilize the cycle time of the items to a little less than 2 days on average. There are a few outliers to understand, in case we can solve their root cause. The cycle time distribution is following an expected shape. Of course, I’m the only one who knows that a dev-ops kanban team should expect an Exponential distribution, which this would have been if not for that clump of late January items all closing around 20 days.

And this is another direction I think we might be heading. Help the user feel the flow of work. This is a jump plot invented and mocked up to show software process flow. The green jumps across the top are work flowing forward; underneath, work flowing back. The height of each arc represents average cycle time, and thickness represents count. In one plot, you get to see the story of cycle time, throughput and status steps. Amazing piece of storytelling for understanding a process and its flow.

Source: https://public.tableau.com/views/JumpLineExamples/SDLCDetailedJP?%3Aembed=y&%3Adisplay_count=yes&%3AshowTabs=y&%3AshowVizHome=no Amazing work by Tom VanBuskirk and Chris DeMartini. See JumpPlot.com

If you think it’s too hard or you don’t have enough data, you’re wrong.

I set out to prove this by setting myself a challenge to see what I could do with just completed and start dates. You can get the result for yourself by downloading the spreadsheet from bit.ly/Throughput with a capital T. Again, all these links will be tweeted at @t_magennis

Here is the input page. Completed date is mandatory. Start date is optional but well advised. And the type of work: whether an item is planned or unplanned is needed to see how these different types stack up. This is optional. Most people don’t use it; I’ll leave that up to you.

The spreadsheet from just these three inputs creates 17 different charts so far. [click] Throughput rates and histograms. Cycle times values and histograms. Work in process rates, arrival and departure rates, cumulative flow diagrams. And some rather complex mathematical analysis of the cycle time histogram that fits the data to the Weibull probability distribution thanks to John Cook, someone who everyone in this industry should have heard of. It does all of this without macros. Everything is straight formula based. And all from 2 dates and one type column that’s optional.

Here are some examples. Here is the throughput chart. Intentionally kept light weight in design. Intentionally designed to draw your eye along the journey not focus on the values. This is the throughput trend week over week.

Here is a chart I use a lot to understand coaching opportunities. How much unplanned work is a team encountering? It helps to understand what external pressure is fighting for a team’s attention, and to help come up with a balance of managing planned versus unplanned work types. I can see this team is slowly driving down its unplanned percentage. Hopefully by using better triage practices and helping external parties get their work into the next sprint rather than interrupting this one.
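
A minimal sketch of the calculation behind a chart like this, assuming each completed item carries a week label and a type of either "planned" or "unplanned". The labels, sample data, and function name are illustrative and are not taken from the spreadsheet itself.

```python
from collections import Counter

def unplanned_share(items):
    """items: iterable of (week label, work type).
    Returns {week: percentage of completed items that were unplanned}."""
    totals, unplanned = Counter(), Counter()
    for week, work_type in items:
        totals[week] += 1
        if work_type == "unplanned":
            unplanned[week] += 1
    return {week: 100.0 * unplanned[week] / totals[week] for week in sorted(totals)}

# Hypothetical data: two weeks of completed work.
sample = [
    ("W1", "planned"), ("W1", "unplanned"), ("W1", "unplanned"), ("W1", "planned"),
    ("W2", "planned"), ("W2", "planned"), ("W2", "planned"), ("W2", "unplanned"),
]
shares = unplanned_share(sample)
```

In this made-up data the unplanned share falls from half of the work in W1 to a quarter in W2, which is the kind of downward trend the chart is meant to surface.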

The obligatory cycle time chart. Now that I’ve shown you what LeanKit has planned, this is a letdown. Still, not bad from a couple of dates. The percentile marker helps the team communicate an expected service level agreement. You don’t like 95%? Change it to 85%. It’s just a value.
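
One way to compute a percentile marker like this is the nearest-rank method, sketched below. The spreadsheet may well use a different percentile definition (Excel’s built-in functions interpolate, for example), and the cycle times here are invented for illustration.

```python
import math

def percentile_nearest_rank(values, pct):
    """Smallest value such that at least pct% of the samples are <= it."""
    if not values:
        raise ValueError("need at least one cycle time sample")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]

# Hypothetical cycle times in days for 20 completed items.
cycle_times = [1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 4, 4, 5, 5, 6, 7, 8, 10, 14, 21]

sla_95 = percentile_nearest_rank(cycle_times, 95)
sla_85 = percentile_nearest_rank(cycle_times, 85)
```

With this sample, moving the marker from 95% to 85% shortens the promise from 14 days to 8 days, at the cost of missing it more often.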

And this is a new experiment. It tries to visualize the story of supply versus demand. Above zero, the team is completing more than it’s starting; below the line, it is starting more than it’s completing, and WIP is growing. This is intended to tell the flow story, and help the team strive to balance starting and completion by staying close to zero in the center. I’m hoping this becomes more of a go-to chart than Cumulative Flow for teams who want to focus on consistent flow of value.
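
The supply versus demand idea reduces to one subtraction per period. The sketch below assumes items have already been assigned a week number for when they started and when they completed; the week numbers and the function name are hypothetical.

```python
from collections import Counter

def net_flow(started_weeks, completed_weeks):
    """Completed minus started for each week.
    Positive: WIP is shrinking. Negative: WIP is growing."""
    started = Counter(started_weeks)
    completed = Counter(completed_weeks)
    weeks = sorted(set(started) | set(completed))
    return {week: completed[week] - started[week] for week in weeks}

# Hypothetical data: the week in which each item started and completed.
started_weeks = [1, 1, 1, 2, 2, 3]
completed_weeks = [1, 2, 2, 2, 3, 3]

flow = net_flow(started_weeks, completed_weeks)
```

Here week 1 sits below the line because the team started three items and finished one, and the following two weeks pay that WIP back down.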

Moving on to talking about metrics and teams. We often think it’s about having the best and brightest superstars. To counter that thinking, what percentage of the time do you think the league MVPs for baseball and basketball belong to a championship-winning team? If it was just about having the BEST player (whatever that means), it should be pretty high, above 50% at least.

Well, in baseball, it’s 23% in the 1930 to 2012/13 timeframe. And not recently: the last times were 1984 and 1988, depending on the league.

Basketball does better. 37%. Smaller team sizes? Who knows the cause, but it is still under 50%, roughly 1 in 3 or 4.

It takes a team to win championships, so let’s focus some effort on metrics that help form that team.

I have a simple tool to help visualize skill capabilities. It’s – you guessed it – a spreadsheet. You enter a list of required skills, and it helps produce a simple paper survey sheet. The team members assess their ability to teach others a skill, to perform that skill, or their willingness to learn that skill. It takes this data and aggregates it into a heatmap. [click]

Each skillset falls along a continuum from green, which is “safe”, to red, which is “at risk”, based on how many teachers, do-ers, and novices you have available. Being red isn’t bad. It just means you better hope demand doesn’t rise for that skillset. If it does, train up a novice or two.

[click]

The spreadsheet gives coaching advice based on what it sees, helping you as a coach or manager triage risky skill gaps and single points of failure. By proactively telling the story of skill capability of your teams, you position them to be resilient with changing demand.

One goal of mine when coaching teams using metrics is helping them make smart trades between competing forces. Maximizing performance in all facets of a process is stupid and likely impossible without gaming. I’m far more impressed by teams who trade something they are super good at for an incremental improvement in another area they are struggling. As a coach, we can help teams assess and make these trades.

Larry Maccherone spearheaded some research whilst working at Rally, in collaboration with CMU/SEI: the Software Development Performance Index. The SDPI framework includes a balanced set of outcome measures. These fall along the dimensions of Responsiveness, Quality, Productivity and Predictability.

Each of these is an opposing force. It’s unlikely any team will excel at all of them. Increasing productivity beyond team capability will likely cause a decrease in quality. Responding faster, likewise, could mean corners are cut on quality with fewer tests or less testing. Tracking data trends in each area helps the team keep creeping net positive.

Here is one dashboard produced using the SDPI principles. I’ll talk about the quality dimension in a second, but the others are:

Productivity, in this case throughput, dark green for story work, pale green for defect work. You can see the team traded some story work to rapidly burn down some defect debt nearing the end of a major release. I’d have been happier as a coach if they made that balance earlier.

Responsiveness is defect cycle time resolution. I want the team to have a clear picture of how long it may take to burn-down defect debt. In this case, the team managed to move from 10 days down to 3 days once they moved through the most difficult ones. A great indicator as a coach that the remaining defects are small in nature and effort.

Predictability is a slightly more complex measure. If you just use the standard deviation of the productivity number, bigger teams would be disadvantaged. The quickest way to control for team size is to divide the standard deviation by the mean, a measure called the coefficient of variation. Think of it as the standard deviation expressed as a fraction of the mean.
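
A small sketch of why dividing by the mean controls for team size. The throughput numbers are made up, and the choice of the sample standard deviation (rather than the population one) is an assumption.

```python
from statistics import mean, stdev

def coefficient_of_variation(throughput):
    """Standard deviation divided by the mean of weekly throughput."""
    average = mean(throughput)
    if average == 0:
        raise ValueError("mean throughput is zero; the ratio is undefined")
    return stdev(throughput) / average

# Hypothetical weekly throughput for a small team and a team ten times larger.
small_team = [4, 5, 6]
large_team = [40, 50, 60]

cv_small = coefficient_of_variation(small_team)
cv_large = coefficient_of_variation(large_team)
```

The larger team’s raw standard deviation is ten times that of the small team, yet both come out at the same coefficient of variation, so neither is penalized for its size.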

Credit for this Visualization goes to Isaac Obezo.

Quality is always difficult. I like to consider some measure of ongoing delivery ability. Often, though, it comes down to escaped defect counts. To make this axis useful to the team in a coaching context, I forecast how long it would take to reach zero defects given THIS team’s cycle time average. During planning the team can quickly see “if we did nothing but defects we would need x days as a team.” A great discussion to help focus on technical debt removal before taking on more story work.
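
The forecast follows the formula on the Quality slide: open bugs multiplied by the average of recent cycle time samples, divided by the number of developers on the team. The sketch below uses invented numbers, and the function name is hypothetical.

```python
from statistics import mean

def days_to_zero_defects(open_bugs, recent_cycle_times, developers):
    """Days of whole-team effort to clear the open bugs at the team's own recent pace."""
    if developers <= 0:
        raise ValueError("team needs at least one developer")
    return open_bugs * mean(recent_cycle_times) / developers

# Hypothetical team: 12 open bugs, recent bug cycle times of 2, 3 and 4 days, 4 developers.
forecast_days = days_to_zero_defects(12, [2, 3, 4], 4)
```

For this made-up team the answer is 9 days, which fits inside the “within 10 days of releasable” goal from the slide.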

Key here: Make it personal about the team. THEIR cycle time, THEIR defect count. THEIR team size.

Team-to-team comparison is a dangerous pursuit. But let’s try to do it safely anyway! This dashboard helps a team compare its trend with similar teams in a company. By looking for patterns in the trends across the 4 different SDPI categories, contextual coaching advice can be displayed. But the data is noisy, so we quickly found removing the line cleared up the mess. [click]

Now it’s clear to see how “MY” team’s trend in orange compares to the rest of the company shown in grey. No axis values, just trend. Every value is normalized as well as possible to fairly compare apples with apples. Teams should focus on the steepest adversely trending category. They should trade something from the steepest favorably trending category. Smart trades, based on comparison against teams in the same context.
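
As a sketch of how the steepest adverse and steepest favorable trends might be picked out, the code below fits a least-squares slope to each category and flips the sign for measures where lower is better. The dashboard’s actual normalization and trend method are not described here, so the sample values, the direction flags, and the function names are all assumptions.

```python
def slope(values):
    """Least-squares slope of values against their index 0, 1, 2, ..."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two points to measure a trend")
    x_mean = (n - 1) / 2
    y_mean = sum(values) / n
    numerator = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(values))
    denominator = sum((i - x_mean) ** 2 for i in range(n))
    return numerator / denominator

def suggest_trade(trends):
    """trends: {category: (normalized values, higher_is_better)}.
    Returns (category to focus on, category to trade from)."""
    scores = {}
    for name, (values, higher_is_better) in trends.items():
        trend = slope(values)
        scores[name] = trend if higher_is_better else -trend
    return min(scores, key=scores.get), max(scores, key=scores.get)

# Hypothetical normalized values for one team over four periods.
team_trends = {
    "productivity": ([0.5, 0.6, 0.7, 0.8], True),     # throughput: higher is better
    "responsiveness": ([0.4, 0.4, 0.5, 0.5], False),  # cycle time: lower is better
    "quality": ([0.3, 0.5, 0.7, 0.9], False),         # open defects: lower is better
    "predictability": ([0.5, 0.5, 0.5, 0.5], False),  # variation: lower is better
}
focus_on, trade_from = suggest_trade(team_trends)
```

For this invented team, defects are climbing fastest while throughput is improving, so the suggested trade is to give up some story throughput in exchange for quality.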

Sure, it could be misused, but because of the way the categories work against each other, it’s not possible for any one team to have a stable favorable trend in all categories. Or if they can, I haven’t seen it.

My name is Troy Magennis, I’ve been in software for 25 years now, from QA through to VP Architecture and Development for companies like Travelocity and Lastminute.com. Most recently I formed my own company building tools and running training on software development forecasting and risk management solutions. Feel free to take notes, but the slides and examples are available to you online. And as a special benefit for joining us today, you can download the software used throughout this session for free. Bit.ly/agilesim will take you to the right site. I wrote a book about these topics, “Forecasting and Simulating Software Development Projects” and I’d like to make sure you all got a free PDF copy of this book also. Just download it from the same location.

1930 to 2012 = 82 years; 19 National League, 19 American League

Last time 1988, 1984

15.
Time and Pace related questions
1. Is it taking us longer to do the same type of work?
2. What is a good commitment cycle time to others? (SLA)
3. What is and how stable is our completed work rate?
4. Where should we focus improvement efforts?
• Compared to what?
• Compared to the same type of work versus all work
• Compared to the same time period last week/month/year
• My work compares to others (only seen by me so I can improve)

16.
“If anyone adjusts a stable process, the
output that follows will be worse than
if (s)he had left the process alone”
Attributed to William J Latzko.
Source: Out of the Crisis. Deming.
Q. Is the process stable? First, do no harm.

36.
Quality
• Goal is to keep the TEAMS
within 10 days of releasable
• Forecast has to be personal for the team
• Days = (Open Bugs x Avg(recent cycle time samples)) / Number of Devs on team
“If OUR entire TEAM did
nothing else but fix bugs
this sprint, at OUR
historical rate, we would
have x days of work”

65.
Responsiveness
• Average or median of the number of days between
two dates for items closed within a period
• Cycle time or Lead time of ???
• If reliable first touch date, use that
• If just created date, then use P1 and P2 bug
“If something urgent comes
along, how fast can we turn
that around”
@t_magennis | Bit.Ly/SimResources 81

66.
Completion Rate
• Team goal is to maximize number of COMPLETED
items, not started items
• Count of items completed each period
• Don’t celebrate bug throughput (as much)
“What is holding us back on
completing more. Lets discuss
dependencies and blockers in
the retrospective”

67.
Predictability
• How much variation there is each week in throughput,
normalized by “team size” in a rough way
• Coefficient of Variation = SD/Mean
“How consistently do we
deliver value?”