Risk is a Metric: The Math of Risk - They Wrestle With Tests

Part 1 of Risk is a Metric Series

Background

This series of posts will explore my perspective on risk as a metric and a measure. I’m starting with the Math of Risk because I promised a breakdown to Melissa Fisher and Angela Riggs, and I wanted to get it down before it falls into the ADHD black hole where many other half-completed blog posts have ended up.

My Perspective

There’s a common misconception that risk cannot be used, measured, or understood, and that if it can be measured, the measurement lacks credibility or clear definition. I fundamentally disagree with this assumption, because I feel risk is foundational to testing and to software development in general. Every decision we make is rooted in the notion that we want to mitigate the potential impact and likelihood of failure. In other words, every decision we make is rooted in identifying what the risk is and how we mitigate it.

Turning risk into a metric requires that we accept that this is a somewhat subjective or qualitative measure. Despite often demanding that metrics be “hard numbers” rooted in quantitative calculations, we’ve come to accept this kind of measure in other metrics, such as pointing. Risk measurement has much in common with pointing, by the way: both are relative size measurements based on the intuition of the experts or, put another way, the team.

Numbers Have Meaning

There is an endless list of potential scales that can be used for measuring risk. You can use emojis, colors, pictures of otters showing increasing levels of panic, and on and on, but the only way you can truly turn risk into a metric is to use numbers. I have said in the past, and I’m sure I will continue to say in the future, that while there are lots of scales that can be used, there’s only one that truly has context for anyone who looks at it.

Numbers have a weight and definition. There are no assumptions that need to be made, and no ramp-up on team personality and culture is required to understand them. Numbers represent magnitude, so they can be used to compare risks and prioritize risk mitigation. The other benefit we get from metrics, besides measurement, is the ability to track change over time, and while I think otter pictures are great, I can only measure trends through numbers.

So, there you have it: if we want to use risk as a metric, our scale must be numbers. While the notion of using numbers is not new, the way I use them is slightly different. I need to be upfront: the way of calculating risk I advocate for is not my own creation. I learned it while working at Progressive Insurance, where this measurement is the standard way of doing business. I bet you thought insurance couldn’t be innovative, right? And while I didn’t create this formula or scale, I did help to usher in some changes to the way risk is managed and measured over time. Keep an eye out for a future post where I dig into that further.

The Math of Risk

ISTQB and others advocate for two components that are used together to measure a specific, identified risk: likelihood (or probability) and impact.

Likelihood or Probability

This may go without saying, but the likelihood or probability of a risk is based on how often there has been a defect or failure related to this particular risk, on our estimate of how likely such a defect or failure is, or on both.

Impact

The impact of a particular risk is how severe we judge the consequences of a related defect or failure to be. Impact is something we often don’t dig deep enough into. Sure, we need to know the user impact if the software fails, and we should consider any technology-related impacts (loss of data, etc.) if there’s a defect or failure, but we really should think deeper and broader when we consider impact.

For many of us, there can be financial, regulatory, or legal impacts to a failure. We should all consider the potential impact of a failure on the business or the brand identity. Is this a module of our software that, if it fails, is likely to send people to Twitter to complain? Are we potentially going to find ourselves in the crosshairs of the next New York Times or Wall Street Journal exposé if this feature fails? If so, we need to consider this in our rating for impact.

Traditional Method of Calculating Risk

The traditional method of risk calculation uses a 1-3 scale for likelihood/probability and a 1-3 scale for impact, with 3 being the highest and 1 the lowest. These two components are then multiplied, and there you go: your risk score for that particular risk is ready for you to weigh against others.

For example, a likelihood of 3 and an impact of 2 would give a risk score of 6 out of a possible 9.
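As a minimal sketch of the traditional calculation, here is the multiplication in Python. The function name `traditional_risk_score` and the input validation are my own illustrative assumptions, not something ISTQB prescribes.

```python
def traditional_risk_score(likelihood: int, impact: int) -> int:
    """Multiply two 1-3 ratings to get a risk score between 1 and 9."""
    # Reject ratings outside the 1-3 scale (an assumption for this sketch).
    for name, value in (("likelihood", likelihood), ("impact", impact)):
        if value not in (1, 2, 3):
            raise ValueError(f"{name} must be 1, 2, or 3, got {value}")
    return likelihood * impact


print(traditional_risk_score(likelihood=3, impact=2))  # 6, out of a possible 9
```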

A Better Way to Calculate Risk

I advocate for adding an additional component and broadening the scale for risk.

Complexity

I find that likelihood/probability and impact are too limiting when considering risk. When we’re building something new, there’s a good chance we don’t fully know what the probability of a failure is yet. And it’s possible that we’ve seen a high level of stability in a feature, but despite it being so far reliable and stable, it’s still complicated and potentially difficult to maintain. Therefore, I advocate for adding in the additional component of complexity. Complexity allows us to consider the whole of a specific risk, instead of just the factors of impact and likelihood.

Broadening the Scale

When using the 1-3 scale ISTQB and others advocate for, I find I don’t have enough room on the scale to fully measure the relative size of individual risks. On this scale, the possible risk scores are 9, 6, 4, 3, 2, and 1. While there are lots of possible scores on the low end, there are very few on the high end. This can leave us without enough context to make meaningful decisions based on the risk scores we’re seeing. Put another way, I can’t rely on it the way I want to when deciding where to focus my test efforts.
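To see where those six scores come from, a short Python sketch can enumerate every likelihood and impact pairing on the three-point scale. The variable names here are hypothetical and only for illustration.

```python
from collections import Counter

ratings = range(1, 4)  # the three-point scale: 1, 2, 3

# Count how many (likelihood, impact) pairs produce each score.
score_counts = Counter(
    likelihood * impact for likelihood in ratings for impact in ratings
)

distinct_scores = sorted(score_counts, reverse=True)
print(distinct_scores)  # [9, 6, 4, 3, 2, 1]
```

Nine possible pairings collapse into only six distinct scores, which is the cramped scale described above.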

Instead, I suggest using a 1-5 scale for all three components: likelihood, complexity, and impact. But I don’t want my scale to become so broad that I lose context because there’s too much of a gap between risk scores. So, I use a simple evaluation and calculation to keep my scale manageable.

The Formula

Let’s consider a risk called shopping cart for this example. We’ve rated the components of our shopping cart below.

Likelihood: 2
Complexity: 3
Impact: 5

To keep our scale manageable, we want to hold to a 1-25 scale. To do this, we’re going to multiply impact by likelihood or complexity, whichever of the two is higher.

Since complexity is a 3 and likelihood is a 2, we’ll multiply complexity by impact: 3 × 5 gives us a risk score of 15.
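The formula can be sketched in a few lines of Python. The function name `risk_score` and the range check are assumptions made for this example. The calculation itself is the one described above: impact times the higher of likelihood and complexity.

```python
def risk_score(likelihood: int, complexity: int, impact: int) -> int:
    """Impact times the higher of likelihood and complexity, each rated 1-5."""
    components = {
        "likelihood": likelihood,
        "complexity": complexity,
        "impact": impact,
    }
    # Reject ratings outside the 1-5 scale (an assumption for this sketch).
    for name, value in components.items():
        if value not in range(1, 6):
            raise ValueError(f"{name} must be between 1 and 5, got {value}")
    return impact * max(likelihood, complexity)


shopping_cart = risk_score(likelihood=2, complexity=3, impact=5)
print(shopping_cart)  # 15
```

Taking the maximum rather than multiplying all three components is what keeps the top of the scale at 25 instead of 125.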

The benefit of this method and larger scale is that my possible scores are now 25, 20, 16, 15, 12, and so on down to 1. This larger scale enables me to make more informed decisions about things like test coverage based on risk.
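For the full list of scores this method can produce, a small Python sketch can walk every combination of the three 1-5 ratings. The variable names are hypothetical.

```python
ratings = range(1, 6)  # the five-point scale: 1 through 5

# Apply the formula to every combination and keep the distinct results.
scores = {
    impact * max(likelihood, complexity)
    for likelihood in ratings
    for complexity in ratings
    for impact in ratings
}

print(sorted(scores, reverse=True))
# [25, 20, 16, 15, 12, 10, 9, 8, 6, 5, 4, 3, 2, 1]
```

That is 14 distinct scores, compared with 6 on the three-point scale.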

What’s Next

If you’re saying to yourself “great, I know the formula, what do I do now?”, don’t worry. I’m not leaving you in the dark, and there’s more to come! Over the rest of this series, I plan to break down the nuts and bolts of each step of risk analysis, with my perspective on each. If there’s something specific you’d like to read about related to risk, let me know, and I’ll try to include it in the series.

Further Risk Based Goodness

Risk or Fear: What Drives Your Testing
Risk Based Testing: Creating Language Around Risk
Reverse Engineer Your Way to Adopting a Risk-based Testing Approach