Measuring What Matters to Innovation

Competing on Innovation

The rate of technological innovation and adoption has accelerated to a level that would have been impossible to imagine centuries ago. As a result, an amazing number of startups have been able to succeed from a strategic position based on niche-market differentiation and a culture of passionate innovators.

Established – and well-funded – large enterprises have taken note. Adoption, disruption, and even disappointment and abandonment of products are forcing every company to tackle what cloud computing, mobile, and IoT mean for their categories.

Every company is trying to become innovative, because we are all competing based on technological innovation.

Unfortunately, building a sustainable competitive advantage around a culture of innovation is far more complex than previous strategic positioning was. Nash equilibriums a few decades ago could be easily resolved with a few long-term commitments that ensured indirect competition and margins for all major players. Innovation could actually be stifled – purposefully – by industry oligarchs to minimize the risk of new entrants. The rate of innovation today has made this control impossible, to the benefit of the consumer.

However, building a culture of innovation requires an entirely new way to structure the organization and reinforce its behaviors. This is made even more challenging by the abundance of data available that was inaccessible before. Choosing what to measure formalizes which decisions are worthy of notice. Thus, to understand what to measure, a leader must know not only how to measure a metric but also why the metric matters to the achievement of long-term goals.

To explore this, I’ve been discussing the connection between individual motivation and the company’s goals as a Minimum Viable Superorganism:

‘Selfish individuals pursuing shared goals (arising from shared underlying incentives), held together by a Prestige Economy which consists of two activities: (1) seeking status by attempting to advance the superorganism’s goals, and (2) celebrating (i.e., sucking up to) those who deserve it.’

The executive and visionary of the company must be the purposeful “mind” leading the activity of the superorganism. The relationship between employee motivation and company outcomes relies on this Prestige Economy, so leading a company means guiding this economy.

Naturally, it is easiest for me to discuss this in terms of scaling agile or scrum, or technological innovation in start-ups versus large established enterprises. Discussing each, if you are hoping for very specific metrics for your company, will take the rest of my life, so I will leave those conversations to a per-company basis. However, the fundamental issue confusing the connection between innovative teams and enterprise-level accounting metrics is the virtually insane forgetting of what has truly changed in our new and rapidly accelerating tech-adoptive society.

Market Risk vs Technical Risk

If you recall the example of the terrible metric (for innovation-based companies) “allocable resource utilization” we saw the well-established, consistent distribution – i.e. strategic trade-off – between utilization and responsiveness.

If you imagine one very capable engineer you’ll find:

Demands arrive to the employee at a variable rate.

Work is accomplished at a variable rate.

There is one worker.

The possible queue of demands is potentially infinite.

This type of queue is an M/M/1/∞ queue.

Whether the “1” here is a server, an interstate, a mobile developer, or a 1990’s movie hacker’s CPU, increased utilization of total potential results in rapidly-accelerating diminished returns as utilization reaches 100%. Likewise, you can occasionally “overclock” with appropriate support and recovery, but constant over-use results in chronically poor responsiveness to demands, a pile up of unmet requests, and finally, a CRASH.
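To put rough numbers on that trade-off, here is a minimal sketch using the standard steady-state results for an M/M/1 queue: mean time in system is 1 / (service rate - arrival rate), and mean requests in system is utilization / (1 - utilization). The function names and the "one request per day" service rate are my own illustrative assumptions, not measurements from any real team.

```python
def mm1_mean_time_in_system(arrival_rate: float, service_rate: float) -> float:
    """Mean time a request spends waiting plus being served in an M/M/1 queue."""
    if arrival_rate >= service_rate:
        raise ValueError("unstable queue: arrival_rate must be below service_rate")
    return 1.0 / (service_rate - arrival_rate)


def mm1_mean_requests_in_system(arrival_rate: float, service_rate: float) -> float:
    """Mean number of requests queued or in service: rho / (1 - rho)."""
    if arrival_rate >= service_rate:
        raise ValueError("unstable queue: arrival_rate must be below service_rate")
    rho = arrival_rate / service_rate  # utilization
    return rho / (1.0 - rho)


if __name__ == "__main__":
    service_rate = 1.0  # hypothetical: the engineer completes one request per day on average
    for utilization in (0.50, 0.80, 0.90, 0.95, 0.99):
        arrival_rate = utilization * service_rate
        days = mm1_mean_time_in_system(arrival_rate, service_rate)
        backlog = mm1_mean_requests_in_system(arrival_rate, service_rate)
        print(f"utilization {utilization:.0%}: {days:6.1f} days per request, "
              f"{backlog:6.1f} requests in system")
```

Under these assumptions a request takes about 2 days at 50% utilization, 10 days at 90%, and 100 days at 99%. The last few points of utilization are bought with an enormous loss of responsiveness.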

HOWEVER, competing on innovation requires additional variables for our M/M/1/∞ queue, because the model treats “requests” as discrete abstract entities. It does account for the possibility of requests not being the same size or difficulty – technical risk – because service times are random (exponentially distributed, just as arrivals follow a Poisson process); that is what it means to say work is accomplished at a variable rate.

What the M/M/1/∞ queue fails to account for is whether or not the request was the correct request. In the case of a server handling API requests, we could assume the probability of an incorrect initial request is zero if we believe retrieving unwanted data is the failure of an unintuitive front end or simply user error (the API and the server are not the guilty parties). If we are imagining a highway over an extended period of time, drivers who intended to take a different route and got on the highway by mistake are virtually outliers, and an even less noticeable percentage behave in a way that would impact the flow of traffic.

In technological innovation and the development of any given software product, the risk that the request was not the correct request at the time it was made, or is no longer the correct request by the time it has been fulfilled, is EXTREMELY HIGH. The variable that prioritizes responsiveness over utilization in technological innovation is Market Risk.
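One rough way to see why Market Risk tips the balance toward responsiveness is to add a relevance term to the queue. The sketch below is my own illustrative assumption, not part of the standard M/M/1 model: it supposes the chance that a request is still the right request halves every `relevance_half_life` days it spends in the system, and it uses the mean delay rather than the full delay distribution. The function names and the 5-day half-life are hypothetical.

```python
def mean_time_in_system(utilization: float, service_rate: float) -> float:
    """M/M/1 mean time in system, written in terms of utilization."""
    if not 0.0 <= utilization < 1.0:
        raise ValueError("utilization must be in [0, 1)")
    return 1.0 / (service_rate * (1.0 - utilization))


def relevant_throughput(utilization: float, service_rate: float,
                        relevance_half_life: float) -> float:
    """Requests delivered per day that are still wanted on delivery.

    Assumption (hypothetical): P(still relevant) = 0.5 ** (delay / half_life).
    """
    delivered_per_day = utilization * service_rate
    delay = mean_time_in_system(utilization, service_rate)
    return delivered_per_day * 0.5 ** (delay / relevance_half_life)


if __name__ == "__main__":
    service_rate = 1.0         # hypothetical: one request per day
    relevance_half_life = 5.0  # hypothetical: relevance halves every 5 days
    candidates = [i / 100 for i in range(1, 100)]
    best = max(candidates,
               key=lambda u: relevant_throughput(u, service_rate, relevance_half_life))
    print(f"utilization that maximizes relevant output: {best:.0%}")
    for u in (0.50, 0.70, 0.90, 0.99):
        value = relevant_throughput(u, service_rate, relevance_half_life)
        print(f"utilization {u:.0%}: {value:.3f} relevant requests per day")
```

With these made-up numbers the most relevant output comes from running well below full utilization, and near 100% almost nothing delivered is still wanted. The exact optimum depends entirely on the assumed half-life, so treat the shape of the curve, not the number, as the takeaway.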

After all, when we say “competing on innovation” what we really mean is “responding the fastest to disruptive market shifts while also creating market disruption or new demand and adapting quickly enough to capture value profitably”.

We don’t say the latter, of course, because it isn’t as sexy.

The reality is that a culture of innovation requires a few things that run counter to the leadership methods of old school consulting or manufacturing organizations. Don’t go on a quest for quantifiable metrics on these, but an innovation culture requires things like:

Excess allocable brilliance

Willingness and aptitude for adaptation

Vigilant feedback-seeking

Permanent restlessness

The greatest risk of any innovation-based company is not the technical risk of learning to implement what was promised or the project risk of time-to-market or delayed advertising campaigns. The primary risk when competing on innovation is the market risk around whether, for any new product:

The market knew what to demand

Supply correctly met that demand

The product still met a need by the time it went to market

In the App Store alone there are thousands of new apps per day. The market risk for a software product in this one market is unprecedentedly high, and it shifts immense market risk onto every feature added and every update released to your company’s increasingly less loyal consumer base. That is why Scrum is sprint-based, with increments that should always require less than a single sprint (2 weeks, ideally) of development team effort – not to mitigate project or technical risk, but to mitigate the market risk around whether the end users or stakeholders knew what they actually wanted, knew the impact on the overall product, properly communicated it, and still want it by the time it is delivered.

Fail Faster to Succeed Sooner

We can see that when competing on a culture of innovation, receptiveness and relevance are the necessary complements to responsiveness. This is the most important way in which agile delivers higher-quality software. Code quality, user testing, and market fit are all checked as often as possible. A great Scrum team fixes cosmetic, logic, and intuitive experience problems as they go, looks for feedback immediately about the demand for the feature, then enhances each tiny increment of the product prior to each release. In agile we call this “failing fast” so that we can ensure we succeed sooner. The tight feedback loop means creating the right thing very well, based on the newest information available.

Time-to-irrelevance is the greatest risk to every innovation-based project. Not only the market risk of irrelevance, but also the loss of relevant context – both code and product vision – when ensuring the quality of the software and resolving defects or maximizing the return on a feature by improving it before moving on to the next feature.

Metrics that Matter

If we are driving a superorganism comprised of teams that are focused on product innovation – not only in software – we can see there are plenty of metrics that will reinforce a Prestige Economy built for succeeding in innovation. I’ve described a handful below. One last note of caution here: while these metrics are powerful and valuable, it will still be essential to clearly express who is accountable for each metric and empower that person or team so that they are in control of that metric. These are also defined rather philosophically. If you have a documented Work In Process flow to share, I can give you specifics.

Measurement Goal #1 – Receptiveness

Feedback-to-Answer cycle time – the total process time from the market making a demand to the market receiving an indication of response. In classic “core” Scrum, this may simply be the time from a customer making a request to the Product Owner telling that customer a valid expected release date. In a large-scale environment, this may be the time from a Tweet received by Marketing to the time Marketing announces the planned features in a new update that contains the feature requested on Twitter. To the extent your large enterprise is attempting to compete with small startups, this is the crux of your challenge. An entrepreneur leading a small team only needs to manage her or his reputation for accuracy of promises and find a reliable way to ensure that single-mind heroic vision for the product becomes a reality. The cycle time from neuron to neuron is infinitesimally smaller than any scaled cycle time that includes multiple business units, functional teams, vendors, and a PMO.

The Feedback-to-Answer cycle can also be measured at the Work-In-Process level – when a card in a later step is kicked back to an earlier step, how long does it take for that feedback to receive an answer? If it takes a long time and there is very little work-in-progress, this is a sign that receptiveness is poor. Maybe the Scrum board isn’t visible enough or the daily stand up is not as effective as it should be. On the other hand, if there is a huge amount of work-in-progress, capacity is over-utilized and responsiveness is suffering – setting WIP limits may be necessary (even if only for a short experimental period).
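If your board or ticketing tool can export timestamps, the Work-In-Process version of this metric is simple to compute. The sketch below assumes you can get a list of (asked, answered) timestamps for kicked-back cards; the function names, the 48-hour "slow" threshold, the WIP limit of 6, and the sample dates are all hypothetical placeholders to tune for your own team.

```python
from datetime import datetime
from statistics import median


def feedback_to_answer_hours(events):
    """Return hours from each piece of feedback to its first answer.

    `events` is a list of (asked_at, answered_at) datetime pairs.
    Items still waiting for an answer (answered_at is None) are skipped.
    """
    return [
        (answered_at - asked_at).total_seconds() / 3600.0
        for asked_at, answered_at in events
        if answered_at is not None
    ]


def diagnose(median_hours, work_in_progress, slow_hours=48.0, wip_limit=6):
    """Apply the two rules of thumb from the text; thresholds are hypothetical."""
    if median_hours <= slow_hours:
        return "ok"
    if work_in_progress > wip_limit:
        return "over-utilized: consider WIP limits"
    return "poor receptiveness: check board visibility and stand-up"


if __name__ == "__main__":
    kicked_back_cards = [  # made-up sample data
        (datetime(2024, 3, 4, 9, 0), datetime(2024, 3, 4, 15, 0)),   # 6 hours
        (datetime(2024, 3, 5, 10, 0), datetime(2024, 3, 8, 10, 0)),  # 72 hours
        (datetime(2024, 3, 6, 9, 0), datetime(2024, 3, 11, 9, 0)),   # 120 hours
        (datetime(2024, 3, 7, 9, 0), None),                          # still unanswered
    ]
    hours = feedback_to_answer_hours(kicked_back_cards)
    print("median hours to answer:", median(hours))
    print(diagnose(median(hours), work_in_progress=3))
```

Note that skipping unanswered cards flatters the number, so it is worth tracking the count of open items alongside the median.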

Feedback-to-Answer Quality – This is likely to be a qualitative measurement if used ongoing, and is likely a tertiary metric looked at only occasionally. The most relevant use for this metric is electronically documented support tickets that receive a rating by the requestor after they are closed. The problem with qualitative responses, of course, is the possibility that only the most positive or most negative reviewers will surface. This makes it a poor metric for individual or team performance, but it should be summarized more broadly for an indication of the process. Don’t set a target, just learn from the insights.

Supply-to-Demand Receptiveness – From a buzzword standpoint, this is your “Social Listening” as an organization, both internally and externally. From the time you meet a demand, how long does it take to discover that you met the right demand? How long does it take to know if you met the right demand correctly? For software, don’t leave this purely in the hands of social network listening – build into your software trailing indicators like usage analytics, product-wide ratings, and (sometimes) per-feature ratings and feedback soliciting.

In a large-scale product environment, pay attention to listening from all sources. A few new “ceremonies” are going to be needed to encourage collaboration from Epic Owners, program/division-level alignment of a shared backlog across products, and Stakeholder Gathering to solicit additional feedback.

Receptiveness is the precursor to responsiveness. If you aren’t “listening” to your market, you will never respond to demands correctly.

Measurement Goal #2 – Responsiveness

Demand-to-Supply cycle time – This is the traditional definition of cycle time and the best metric that carries over from Lean Manufacturing to Lean Startup principles. From the moment a market demand is made, assuming receptiveness is held constant, what is the total process time until supply can meet that demand? Anecdotally, the highest performance with this metric in a Scrum team I led as Product Owner was on a large enterprise tool. We released to stakeholders twice a week and released to production weekly. A feedback feature was created that gave the users direct input into our product backlog. We were able to respond to improvement requests made on Monday in a fully-tested Production Release that Wednesday. Statistically, these wonderfully short feedback cycles were outliers and relied on circumstances more than team performance. I share that anecdote as a challenge to whatever complacence you may have. That said, if average Demand-to-Supply cycle time is greater than 90 days I would challenge you to consider if you are really “listening”.

Demand-to-Supply lead time – This is the traditional definition of lead time and is the time from initiation to completion of a production process. In classic Scrum, that’s the time from commitment at Sprint Planning to the time it is called Potentially Shippable by the Product Owner. This is a once-in-a-while metric that should be checked as an indication of whether teams are sizing stories and committing properly. Whether a team is new or old, they will need extra reinforcement from a manager when average lead time is consistently greater than the sprint length. This is a sign of over-commitment and sprint carry-over. Too much WIP, over-utilization, and poor story sizing will leave a team hamstrung. Quality will suffer, context-switching will breed deceleration of velocity, and burnout will occur. This is at the heart of Little’s Law. When lead time is consistently greater than sprint length, this isn’t a performance metric to track – it’s a trailing indication that management needs to set clear expectations around the trade-off between velocity and quality (including market fit). Set WIP limits and enforce true swarming activities. Because WIP is a leading indicator for Lead Time, reducing WIP should lower lead time back below sprint length.
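Little’s Law makes the WIP-to-lead-time relationship concrete: over the long run in a stable process, average lead time equals average WIP divided by average throughput. The sketch below is a back-of-the-envelope calculator, not a forecasting tool; the function names and the sample team numbers (1.5 stories finished per working day, a 10-working-day sprint) are hypothetical.

```python
def average_lead_time(avg_wip: float, throughput_per_day: float) -> float:
    """Little's Law: average lead time (days) = average WIP / average throughput."""
    if throughput_per_day <= 0:
        raise ValueError("throughput_per_day must be positive")
    return avg_wip / throughput_per_day


def wip_limit_for_target(target_lead_time_days: float, throughput_per_day: float) -> float:
    """Largest average WIP that keeps average lead time at or below the target."""
    if throughput_per_day <= 0:
        raise ValueError("throughput_per_day must be positive")
    return target_lead_time_days * throughput_per_day


if __name__ == "__main__":
    sprint_length_days = 10   # hypothetical: two-week sprint, working days only
    throughput_per_day = 1.5  # hypothetical: stories finished per working day
    avg_wip = 21              # hypothetical: stories in progress on average

    lead_time = average_lead_time(avg_wip, throughput_per_day)
    print(f"average lead time: {lead_time:.1f} working days")
    if lead_time > sprint_length_days:
        limit = wip_limit_for_target(sprint_length_days, throughput_per_day)
        print(f"lead time exceeds the sprint; try a WIP limit near {limit:.0f} stories")
```

With these sample numbers, 21 stories in progress implies a 14-day average lead time, and a WIP limit of about 15 brings it back within a 10-day sprint. This assumes throughput holds steady as WIP drops; in practice it often improves as context-switching falls.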

There is an important call-out here. While experienced Scrum teams know all too well the relationship between team WIP and Lead Time – this only covers the process states for that team. In a scaled implementation, where the stakeholders have an internal proxy voice (imagine a product division large enough to include a Social Listening Analyst on the marketing team), WIP limits at the Epic, Feature, Theme/Campaign, and Product levels may be necessary. Putting too many items on the Work In Process flow of the Product Marketing Managers who must A) Add Epics to the Product Backlogs and B) Give feedback that the right thing was built and properly fits market demand will create terrible inefficiency in the overall innovative delivery process. The same is true when new Business Development or long-term relationship-owning Account Managers are part of the input and output. No amount of efficiency gained by the teams building the products will EVER MATTER without tightening every other feedback cycle. Restrict stakeholder WIP to ensure they pay attention, provide meaningful feedback, and properly communicate new features and gather end user and customer feedback of their own. If average lead time per story is 7 business days but it takes an entire quarter for a minor enhancement worth millions in revenue to “circle back” through the organization to the development team – YOU WILL LOSE. Pack your bags, a startup is about to become a category killer.

Measurement Goal #3 – Relevance

Cycle-Time Feedback conversion – when an end user or stakeholder makes a request, the WIP cycle ought to have a traceable “funnel” for requests making it through to market. This is not a performance metric but a continuous improvement metric. If a product is in its infancy and market share growth is on the rise, but product innovation is stagnant, ask for more requests to go through. If a product is mature and market share accumulation has plateaued, but the conversion rate is extremely high – the process may be building for the sake of staying busy and a new product should be innovated.
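A traceable funnel can be as simple as counting how many requests reach each step of the flow over a period. The sketch below computes step-to-step and overall conversion from such counts; the stage names and numbers are hypothetical, so substitute the steps of your own Work In Process flow.

```python
def funnel_conversion(stage_counts):
    """Return (step_rates, overall_rate) for an ordered list of (stage, count).

    step_rates maps "earlier -> later" to the share of requests that moved on.
    overall_rate is the share of initial requests that reached the last stage.
    """
    if len(stage_counts) < 2:
        raise ValueError("need at least two stages")
    step_rates = {}
    for (name_a, count_a), (name_b, count_b) in zip(stage_counts, stage_counts[1:]):
        step_rates[f"{name_a} -> {name_b}"] = count_b / count_a if count_a else 0.0
    first_count = stage_counts[0][1]
    last_count = stage_counts[-1][1]
    overall_rate = last_count / first_count if first_count else 0.0
    return step_rates, overall_rate


if __name__ == "__main__":
    quarter = [  # made-up counts for one quarter
        ("requested", 200),
        ("accepted into backlog", 80),
        ("built", 40),
        ("released", 30),
    ]
    step_rates, overall_rate = funnel_conversion(quarter)
    for step, rate in step_rates.items():
        print(f"{step}: {rate:.0%}")
    print(f"overall request-to-market conversion: {overall_rate:.0%}")
```

In keeping with the point above, read the result against the product’s life stage rather than against a target: the same conversion rate can be too low for a growing product and too high for a mature one.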

Lead-time Feedback Quality – This is another qualitative metric that may be useful in a 360 review process. From the time a team starts working on a product increment to the time it is delivered, each time an opportunity for feedback occurs, what is the relevance and value of the feedback that is given? Putting metrics around this can be very valuable for a short period of time if approval processes and quality assurance are failing. If it is right for your scaled environment and will not cause unnecessary inefficiencies, this can even be formalized and automatically enforced (e.g. make a Resolution Type and Comment Field mandatory when the Product Owner moves a Story card to Closed, with the expectation that the quality of the feedback will be a topic of review by a manager for purposes of mentoring and career development). At scale, think very hard about the inefficiency this may create and its fairness across the organization prior to roll out.

Conclusion

These are a handful of metrics that actually matter for innovation and success. For specifics that apply to your organization, feel free to reach out to me for free advice anytime at andrewthomaskeenermba@gmail.com or Tweet me @keenerstrategy