
A Closer Look at Risk Burndown

I like the idea of the risk burndown chart. Burndown is an effective and satisfying visual indicator of progress and it’s relatively easy to calculate to boot. But does looking at a project’s risks through the lens of a burndown chart make sense?

I see several problems with thinking about risk in this way.

Numbers can be Misleading

The first key to effective risk management is to value accuracy over precision. This means that it’s better to be roughly right in your predictions than to be precisely wrong. Remember, risk is about assessing your likelihood of project success. It doesn’t matter if you miss your threshold of success by a little or a lot; either way you still fail the project!

Pop quiz. Say there are two risks in your project. There’s a 25% probability that Risk A will become a problem while Risk B only has a 20% probability. For now, assume the impact is the same for both risks. Which risk is a greater threat to the project?

That one’s easy. Risk A is a greater threat because, impacts aside, Risk A is 5 percentage points more likely to turn into a problem. Ok. What if I told you that I made up those probabilities based on my gut feelings so I could easily rank risks? Now which risk is a greater threat to the project?

The real question I’m asking you is this. Are you willing to bet the success of your project on those numbers? Because if my best guess, gut feeling probabilities are off by more than 5 percentage points, the project could be in serious trouble depending on the risks’ impacts.

I know, I know. That was a trick question. Nobody on your team would make up numbers on one of your software projects. In all fairness, nobody goes out of their way to fabricate false values. Use your logic. If you were any good at guessing the probability of future events occurring, you would not be reading this post right now. You would be a multi-millionaire, off enjoying your gambling winnings from the ponies. Too much precision gives folks too much confidence in the correctness of your assessment when the reality is that probability and impact are based on best guesses and gut feelings. Probability and impact numbers just make it easier to calculate exposure so risks can be ranked automatically. And burndown is a fairly precise metric built on top of those imprecise numbers.
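To make the pop quiz concrete, here is a rough Python sketch. It assumes each gut-feel probability is only good to within some error margin (the `margin` parameter and the function name are my own inventions for illustration) and checks whether the two guesses can still be told apart.

```python
def probability_ranges_overlap(p_a, p_b, margin):
    """Return True when two gut-feel probabilities cannot be ranked reliably.

    margin is an assumed error band, in probability units, around each guess.
    """
    low_a, high_a = max(0.0, p_a - margin), min(1.0, p_a + margin)
    low_b, high_b = max(0.0, p_b - margin), min(1.0, p_b + margin)
    # Two intervals overlap when each one starts before the other ends.
    return low_a <= high_b and low_b <= high_a


risk_a = 0.25  # Risk A from the pop quiz
risk_b = 0.20  # Risk B from the pop quiz

# With a 5 point error band the guesses overlap, so the ranking is a coin toss.
ambiguous = probability_ranges_overlap(risk_a, risk_b, margin=0.05)

# Only with a (hypothetical) 1 point error band is Risk A clearly the bigger threat.
clear_cut = not probability_ranges_overlap(risk_a, risk_b, margin=0.01)

print(ambiguous, clear_cut)
```

Nobody's gut is accurate to within one percentage point, which is the whole point: the ranking looks precise, but it isn't trustworthy.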

Not all Risks are Created Equal

If you are monitoring project risk with a risk burndown chart, how do you know whether the right risks are being reduced? Let’s take a look at an example.  Which of these sets of risks should be addressed?

Set 1 with a total exposure of 7 days made up of the following risks:

  • Risk A has a probability of 20% and an impact of 15 days for an exposure of 3 days.
  • Risk B has a probability of 25% and an impact of 10 days for an exposure of 2.5 days.
  • Risk C has a probability of 30% and an impact of 5 days for an exposure of 1.5 days.

Or Set 2 with a total exposure of 7 days (6.65 rounded up) made up of the following risk:

  • Risk D has a probability of 95% and an impact of 7 days for an exposure of 6.65 days.

In the first set, I can mitigate 3 risks, each with very low probability of becoming problems. In the second set I mitigate only 1 risk that is almost certainly going to become a problem. Reducing the imminent risk seems to make the most sense but this choice is not reflected in a risk burndown chart. Simply reducing risk over time is not enough. You have to reduce the right risks.
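The arithmetic behind the two sets can be sketched in a few lines of Python. The `Risk` class and the helper names are mine, made up for illustration; the probabilities and impacts are the ones listed above.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    name: str
    probability: float  # 0.0 to 1.0
    impact_days: float

    @property
    def exposure(self) -> float:
        # Exposure = probability x impact
        return self.probability * self.impact_days


def total_exposure(risks):
    """Sum of exposures: the single number a burndown chart would plot."""
    return sum(r.exposure for r in risks)


def most_likely(risks):
    """The risk with the highest probability of becoming a problem."""
    return max(risks, key=lambda r: r.probability)


set_1 = [Risk("A", 0.20, 15), Risk("B", 0.25, 10), Risk("C", 0.30, 5)]
set_2 = [Risk("D", 0.95, 7)]

# The totals are nearly identical, so the chart treats both sets alike.
print(round(total_exposure(set_1), 2), round(total_exposure(set_2), 2))

# The probabilities tell a very different story.
print(most_likely(set_1 + set_2).name)
```

Burning down either set moves the chart by about the same amount. Only a look at the individual probabilities shows that Risk D is the one about to bite.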

Impact Isn’t Really About Money or Effort

The only way for a visual chart such as risk burndown to work is if we’re able to quantify risks. This is generally done with exposure. Exposure = probability × impact. Impact is a funny thing. Impact is an assessment of how much the consequence of a risk will affect the project if the risk becomes a problem. Traditionalists like to think about this from a money perspective (which makes sense since software engineers stole most of our risk management practices from the finance world, originally anyway). For small teams, effort is a better measure: the number of person-days it will cost to fix a risk that becomes a problem. This is a quantifiable loss.

There’s a problem with thinking about impact in terms of days of loss. Since not all risks are created equal, not all loss is truly equal either. Some kinds of loss can’t be measured in terms of effort. It really all depends on your project’s threshold of success. Some example risks (which don’t rely on ye olde life-critical system standby) from which you might never recover if they became problems include:

  • We don’t have a reliable backup solution; might lose all of our project data. (Lost yer data? You’re up a creek, son!)
  • We don’t have backup power for our data center; data centers might go offline for more than a few hours. (How many days will it take you to get those customers back?)
  • The demo has bugs and our contract renewal is based exclusively on how much the client likes our demo; a bug might occur during the demo. (HA! HA! You don’t have a job!)

In all of these cases you would reduce the risk by working on attributes other than impact (e.g. reduce probability, eliminate the condition, extend the time frame). Enough said. When it comes to calculating exposure, each of these risks has a catastrophic impact. That’s catastrophic, short for epic failure. No number of days can really capture the essence of complete catastrophe. Impact works best when considered in terms of success, not days or dollars lost.
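One way to act on this is to stop multiplying and simply flag the risks you could never recover from. The Python sketch below is only one possible approach: the `catastrophic` flag, the `triage` function, and every probability in it are hypothetical, chosen to illustrate the idea.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    description: str
    probability: float   # gut-feel guess, 0.0 to 1.0 (made up here)
    catastrophic: bool   # True if the project could never recover


def triage(risks):
    """Order risks: catastrophic ones first, then by descending probability."""
    return sorted(risks, key=lambda r: (not r.catastrophic, -r.probability))


risks = [
    Risk("Minor UI rework needed", 0.6, False),
    Risk("No reliable backup; might lose all project data", 0.1, True),
    Risk("A bug might occur during the contract renewal demo", 0.3, True),
]

for risk in triage(risks):
    label = "CATASTROPHIC" if risk.catastrophic else "recoverable"
    print(f"{label:12} {risk.probability:.0%}  {risk.description}")
```

An exposure calculation in days would likely rank the UI rework near the top because of its high probability. Sorting on the flag first keeps the project killers in front of the team.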

Forget Risk Burndown

I want risk burndown to make sense, but given the problems I can’t help but think of it as a meaningless metric. Sure, some risks will be reduced and some will go away by converting into problems or being overcome by events. And a chart showing this would be really neat. But you’ll also uncover new risks as the project goes on. And some risks are just not worth caring about while others deserve a lot of attention. Risk management is about identifying the things that are most likely to kill your project so you can deal with them before it becomes too expensive (or impossible).  A burndown chart doesn’t reflect any of these things directly.

Burndown masks project risks too much and gives teams a false sense of confidence. To put it another way, there’s a risk with using risk burndown:

Our new risk management strategy assumes our estimation precision is better than it is; we may not mitigate the right risks.

Exposure is a ruse. And risk burndown is a metric for showing a reduction in exposure over time. To wax poetic, perception is reality and risk burndown provides a false perception.

That said, any risk management is better than none at all. If a risk burndown chart helps to get your team thinking about risk, then so be it. But there are other ways to manage risk, perhaps not as fancy, which are easier and more effective.

Process Affordances: Ignore at Your own Peril

The Amsterdam airport was able to reduce the amount of urine “spillage” that hit the men’s room floor by 80% simply by etching a life-like image of a fly near the urinals’ drains. The fly was specifically engineered into the urinals to alter gentlemen’s behavior without their having to think about it. The concept is called nudging and it’s been used in domains other than restroom sanitation to encourage desired behavior. Other examples include the use of uncomfortable chairs in fast food restaurants to encourage people not to linger and real-time gas mileage displays in cars to encourage more economical driving. If you’ve read Donald Norman’s The Design of Everyday Things then you’ll know this as an affordance – a hint given to the user prompting them to take a specific action at a specific time.

Obviously the idea of affordances is directly applicable to devices as well as software usability but it wasn’t until I read about the urinal flies that I realized affordances don’t always have to have a physical representation. For example, a well designed software process should gently nudge a team to do the right thing. Since there is no one-size-fits-all process that works for all teams it is essential that the process complements the team and that the process’s affordances nudge team members to do what’s best for the project and the team.

Using a process that lacks the right affordances could have one of two possible outcomes. In the best case, the team abandons the process because they realize subconsciously that it is telling them to do the wrong things at the wrong times. This is bad because it sacrifices repeatability; you’ve regressed back to an ad hoc, “make it up as we go” state. In the worst case, the team sticks with the process and it leads them astray. This introduces risks into the project and could lead to complete project failure.

Software is already difficult enough to build successfully and processes are supposed to make software development easier. Unfortunately, knowing when something isn’t working is not an exact science, but with a dash of experience and a little team reflection (for example from regular postmortems) it is possible to figure out when you are working for your process instead of your process working for you. To demonstrate this I am going to tell you a story.

Our Process

My studio team in the Carnegie Mellon software engineering program is charged with building a web-based requirements elicitation tool that helps users follow the SQUARE process out of the SEI. About halfway through the Elaboration Phase of the project (sometime in the spring semester) the project was going downhill. The warning signs were fairly apparent: we were missing milestones, tasking priorities were confusing, and a lot of work was stalling out at different levels of partial completion. Though we knew there was something wrong, we weren’t really sure what was causing it or what we were doing wrong in our planning and tracking process.

The planning process we were using was fairly simple. At the beginning of the phase we looked at all the activities and artifacts that needed to be completed by the end of that phase. For each identified milestone we enumerated specific entry criteria, general tasking, validation procedures, and exit criteria. This is a technique known as ETVX (entry, tasking, validation, and exit). Next we used planning poker to estimate how long we thought each milestone would take to complete. Finally, with this information we created a phase timeline which included known due dates and dependencies between milestones.

Since we’re using an iterative approach to complete work in a phase, iterations follow largely the same planning process on a smaller scale. As a team we identify the milestones on which we will work during the iteration. Each milestone is assigned an owner whose job it is to ensure the milestone is completed by either delegating tasks or working on it themselves. The planning poker estimate is used to determine the approximate workload allocation on the team. This estimate is validated with bottom-up estimates that team members create based on their individual tasking.

There are several good things about this process. First, it’s written down and the team follows it. This is good because it means we can produce repeatable results over time. Second, this process makes use of several practices that are generally considered “good” by software experts. ETVX is a great way to clearly identify project milestones. Planning poker is similar to the Wideband Delphi estimation technique. Third, we’re using two forms of estimation to validate the plan as more information becomes known. Finally, the engineers responsible for the work determine the specific tasking and create the bottom-up estimates.

You’re Good, but not That Good

In spite of all the good things we were doing, something still wasn’t connecting. The big aha! moment occurred about two weeks into the second iteration. Up to that point I had been working on my tasks that had carried over from the first iteration. The team leader noticed that almost no work had been started on the milestones I owned. [An aside: this, to me, says that at least our tracking process works somewhat well.] During the discussion that followed I became extremely defensive when the team leader asked me to shift priorities for the rest of the iteration. What should have been a simple request turned into a heated debate over tasking. I felt compelled to complete the past due work and here was this jerk trying to stop me. “Sure,” I thought, “I’ll do what you ask, buddy, but when this whole project comes crashing down it’s on your head, not mine.”

Later, as I looked back at the incident, I wondered to myself, “Why was I so defensive in light of such a simple request?” The reality was that the project wouldn’t come crashing down if I shifted priorities and I knew that. So why defend these older tasks when it was obvious that there were more immediate needs?

It turns out that the affordances built into the planning process were encouraging my behavior. There were a few simple things at play that, when combined, decreased our ability to plan effectively.

First, our process encouraged us to plan more work than time allowed. This was due to there being a missing connection between day-to-day progress and the “big picture,” the overall plan. Second, though the new team leader may have believed there was consensus, the team in fact did not wholly agree with the priorities for iterations. This behavior was not specifically discouraged by our planning process and so was allowed to persist. Third, leftover work was not addressed during planning. Some tasks might simply expire while others may change priority, becoming more or less important with a new iteration. Since this wasn’t addressed it created a sense of urgency for individuals carrying over work from iteration to iteration. Finally, assigning milestone owners had unanticipated side effects. The goal was to ensure that someone was taking responsibility for coordinating and monitoring milestone work. This worked so effectively that milestone owners exhausted themselves attempting to finish milestones and resisted changes to the plan that prevented them from finishing what was promised.

When it came time to make a necessary modification to the plan, our process encouraged us to fight against the best course of action for the team. We didn’t have the level of flexibility needed due to our process’s affordances nudging us to do the wrong things. Milestones were slipping and people wanted to finish what they started. Project priorities were shifting as the project matured but team members were wearing blinders, ignoring the changing facts around us. To stand a chance at success we had to change the affordances in the planning process. We had to nudge the team in a new direction.

Our Solution

To try to solve this problem we decided to incorporate some of the planning principles from Scrum, specifically the product backlog, sprint backlog, and sprint planning meeting, into our planning process. Scrum takes a more task-oriented approach when planning iterations and correlates the sprint backlog with the product backlog. This better encourages the team to not plan more work than there is time to complete while connecting day-to-day work with the overall plan. Scrum also requires that the team reprioritize work when planning iterations and that we agree on the resulting priorities. This will hopefully eliminate the prioritization conflicts we experienced during iterations. With Scrum, leftover work from iterations is saved in the product backlog. This change decreases the anxiety team members feel when work is left undone (because the work is not forgotten) while simultaneously giving the team more flexibility to change direction as the project progresses. Finally, the team, rather than individuals, takes ownership over the milestones held in the product backlog. With each commitment made during iteration planning, the whole team buys in, effectively shifting the passion and dedication individuals held for owned milestones to the commitments we agreed on as a team.
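The capacity check at the heart of that change is simple enough to sketch in Python. This is not our actual tooling, just an illustration: the `plan_sprint` function, the hour-based estimates, and the backlog items are all hypothetical.

```python
def plan_sprint(product_backlog, capacity_hours):
    """Split a priority-ordered backlog into (sprint_backlog, remaining).

    product_backlog is a list of (name, estimated_hours) pairs, highest
    priority first. Items that do not fit stay in the product backlog, so
    nothing is forgotten. Note that a smaller, lower-priority item may still
    be pulled in after a larger one is skipped; that is a design choice of
    this sketch, not a Scrum rule.
    """
    sprint, remaining = [], []
    hours_left = capacity_hours
    for name, estimate in product_backlog:
        if estimate <= hours_left:
            sprint.append((name, estimate))
            hours_left -= estimate
        else:
            remaining.append((name, estimate))
    return sprint, remaining


# Hypothetical backlog, reprioritized by the whole team at sprint planning.
backlog = [
    ("Architecture document", 20),
    ("Prototype login", 15),
    ("Usability study", 30),
    ("Fix carried-over defects", 10),
]

sprint_backlog, product_backlog = plan_sprint(backlog, capacity_hours=40)
print("This sprint:", sprint_backlog)
print("Still in the product backlog:", product_backlog)
```

The nudge is in the return value. Work that doesn't fit goes back into the product backlog rather than onto someone's conscience, where it can be reprioritized next time.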

I’m not really sure how Scrum is going to turn out for us. I think the most important thing is that we recognized that something was not working and took action to correct it. I personally would rather see the team fail in a new and spectacular way than repeat the same mistakes again and again.

Add This to Your Silver Toolbox

Unfortunately, I don’t think there is a trick for detecting these sorts of process failures. Data and metrics can help but only if the process is repeatable and the team has the knowledge and discipline to collect the data in the first place. Team postmortems can help but if individuals are afraid to raise concerns, you’ll find yourself on a trip to Abilene before you realize it. In many cases, if you think something isn’t going well, others are probably thinking the same thing. Once I spoke up I found out that others also thought something wasn’t working. I was just the first person who was able to articulate it.

Affordances are powerful but subtle mechanisms. In well designed things, we aren’t supposed to be consciously aware of them. But that doesn’t mean they always nudge us to do the right thing.