Posts tagged ‘studio’

Lightweight Experiments for Process Improvement

[This post is a recap on the second talk I gave at XP2010. This was the big one, the experience report talk, one of 15 experience reports published at XP2010. You can download the slides (pdf) or the full paper (pdf) from this website or from XP2010.org.]

Process improvement is important for nearly all teams but it can sometimes be difficult for a team to know what is working, what isn’t working, and what techniques or methods to try when attempting to improve. Performing a scientific experiment is one way help overcome these problems but as academic research has shown us, while experimentation can yield interesting results, running an experiment is time consuming, expensive, and requires some serious thinking and control to pull off. From a practitioner’s standpoint this means that experimentation is a non-starter.

Of course, that’s only if you run experiments like an academic.

Banner from the XP2010 conference in front to the hotel.

Back Story

Just over a year ago, my MSE studio team at Carnegie Mellon had a problem. We had decided we would use Extreme Programming for the construction phase of our project but some team members had doubts concerning pair programming. We had decided that we would use some kind of peer review, having already seen the many benefits of inspection when reviewing other artifacts. The dispute arose over whether pair programming would give similar enough results. Also, not all team members had experience with pair programming but everyone on the team knew and enjoyed solo programming.

The number one concern was whether pair programming would allow us to meet our very strict deadline. We had just over three months to complete the construction phase of the project. According to our threshold of success this meant implementing all “must have” requirements with a minimum level of quality. Did we really have time to waste by having two people working on the same code at the same time? Wouldn’t working independently and inspecting code on an as needed basis allow us to get more work done faster?

At the time it just so happened that I was taking a reading class with Mary Shaw and in that class we discussed some research findings that might help settle this debate. Research from Laurie Williams, Ward Cunningham, Barry Boehm, and many others showed that pair programming requires more effort (although never double the effort) but is faster than programming alone (pdf). Also pair programming creates code of about the same quality as coding alone with inspection (pdf). Of course, the research may not apply to us since Square Root is closer to a professional team working on a large project with a real client, not undergrads working on short term toy projects.

After an iteration where some teammates used pair programming and others refused, we decided to try an experiment to see which practice actually worked better. The original idea was that we might be able to validate some of the research but decided instead that it was more important just to resolve our own internal conflicts and figure out which processes worked better.

Conducting a Lightweight Experiment

With the scientific method as our guide we planned and executed a lightweight experiment which pitted programming alone against pair programming. The results were amazing (and you can find the raw data in our project archive). In conducting the experiment we used a set of novel techniques which I think can be useful in conducting other lightweight experiments. There’s more background in the experience report so I’m only putting the meaty stuff in this post.

Focus narrowly on a single question – The essential key to keeping an experiment light is to only tackle one thing at a time. In this case we focused on comparing and contrasting a single technique, pair programming, rather than multiple techniques or an entire process (such XP vs. TSP).

Divide work, not teams – If I were comparing pair programming to programming alone in an academic setting, I would put together two teams of about the same experience and have them each build their own version of the same software, one team using pair programming, the other programming alone. In a business setting this is a complete waste and few companies can afford to have two teams duplicating effort. By dividing work instead of teams you may lose some control over variables in the experiment but in most cases isolating more variables doesn’t add any further clarity to helping answer the narrowly focused question. To divide work successfully you need to have some way of estimating work units for division. We used use case points as shown in the figure depicting our modified planning game.

Steps in modified planning game for dividing work into experiment groups

Continue making releases – Since we still needed to make a comparison, rather than dividing into teams and duplicating effort we divided the features that were released each iteration. In this way we built about half the features released during an iteration using each technique. Working on about half the features using pair programming meant that at least some features were being built by individuals. At the time this was a risk reduction decision to make sure that if pair programming completely failed we’d still have something to ship at the end of the iteration. Explicitly managing risks is the only way to know if the lightweight experiment may cause problems for making releases. Also, we had a strictly defined cut-off for stopping the experiment if it ever stopped us from shipping to our client.

Use the data you have – In almost all cases we were able to get the data we needed to evaluate our hypothesis from our current process. When we couldn’t, we only had to make minor modifications to our data collection practices, for example adding a check box to our SharePoint server for indicating whether a task was paired or individual.

One of the more interesting things we did was to create a “tally sheet” for collecting pair programming issue detection statistics in real time, as the issues were discovered. Given the near instantaneous code-inspect-fix cycle when programming in pairs, this was the only way to collect similar data for comparing pair programming to inspection.

Example of a real time tally sheet used for tracking issues discovered while pair programming.

Statistical significance is overrated – The whole point of running a lightweight experiment is to collect just enough data to help you make a better decision or validate your gut feeling. This technique is not meant for uncovering universal truths or proving something to the rest of the world. In exchange for keeping the experiment light, the results will only apply to your team. Over the course of an iteration or two, 4-6 weeks, you’ll only get enough data to start to see trends. In our case the results were not statistically significant using individual T-tests but that didn’t matter. The most important thing is that we had data that could be used for comparison, data that everyone felt good about and that helped us gain clarity into what we did and how well it worked.

Retrospectives get immediate value – The whole reason the experiment is light is to reduce cost and decrease the lag time to providing value to the team. Just to give you a little perspective, it took us 6 weeks to run the experiment and had enough data and casual observations to make a decision during the retrospective when the analyzed data was shared. That event occurred in early August of 2009. This experience report required almost nine full months of gestation from the paper proposal to the talk I gave at the conference. The gestation period on “universal truth” research can be even longer. We, as practitioners, don’t have to wait for those universal truths to be born to get value from research. By running your own quick and dirty, lightweight experiments, you can get results in a timely fashion that you know will apply to your team because your team was the subject of the experiment. It’s all about closing the gaps between research and practice and taking the information you need now instead of waiting for academic research to catch up.

Overall Conclusions

For the Square Root team it turned out that pair programming was faster, cheaper, and produced code that had more predictable albeit slightly worse quality. The more important lesson is that we discovered a technique, lightweight experimentation, for learning other interesting things about our team and about software engineering in general.

My paper and this blog post were all about trying to describe the technique, using our experiment as an example. I think it would be awesome if teams around the world conducted lightweight experiments on a variety of topics. If enough folks share what they learn, we might start to see trends emerge across teams that could lead to universal truths, validate research, or at least discover some great rules of thumb.

What else might make for a great experiment? Anything you’ve got a question about on your team!

  • What is the clearer way to write requirements, user stories or use cases?
  • Which estimation technique is more accurate of X and Y?
  • Can we skip unit testing if we use inspection (looking at quality, knowledge sharing)?
  • Is UML a better design notation than the one we made up as a team?
  • What else…?

If you do a lightweight experiment, let me know! Share what you learn as a blog post or whitepaper. Let others know what you’ve learned! Even if the specific results only apply to your team and the way you’ve executed your project, your experiences help form a baseline, a sort of shared understanding for how software development works, how some of these practices work. And there’s so much about software engineering that we have yet to learn.

Acknowledgements

This paper was my first experience report and it was an awesome journey. Naturally a lot of folks helped me along the way and I would like to take a moment to make sure they know that I appreciate their influences and support. The Square Root team: Marco Len, Yi-Ru Liao, Abin Shahab, and especially my fellow experiment co-champion Sneader Sequeira for having the guts to go along with this idea in the first place. Some of the faculty at Carnegie Mellon: Dave Root and John Robert (my studio mentors) for bringing up the idea of writing a paper, and Jonathan Aldrich for helping review my proposal. Artem Marchenko was my XP2010 paper shepherd after the proposal was accepted, and the quality of each draft only improved because of his inputs. A group of my fellow employees at Net Health Systems sat through an early draft of the presentation I gave and shared valuable feedback for improving it. And finally I thank, Marie, my wife, who was with me from start to finish and read more drafts and sat through more practice talks than anyone else. She’s probably as much an expert on this subject by now as I.

A Final Aside

I wrote the initial draft of this paper as my final reflection paper for my Master of Software Engineering degree (pdf). That draft has a very different tone, approach, conclusion, and direction than what I eventually published for XP2010. This is half due to there not being a hard page limit but also I had a lot more time to think about what was really important when writing for XP2010. There’s some interesting information, mostly in the lessons learned, that might prove interesting to those who are interested. You should check out my Square Root teammates’ reflection papers as well since they are all interesting and well written.

The Reality of Risk Exposure

Over the past few weeks I’ve been thinking a lot about risk exposure in the context of managing projects. Exposure is a technique used almost universally when managing risks, yet as I’ve already discussed, exposure can cause major problems because it’s a precise number based on mostly made-up information. At the same time, exposure is used widely and successfully – otherwise there wouldn’t be as much literature throughout the web telling you to calculate risk exposure.

This begs the question: is risk exposure really as meaningless as I’ve made it out to be? I’ve collected some data that helps answer this question.

Data Collection and Context

Risk management is one of the basic subjects covered in the Managing Software Development course, one of the five core courses students of the Carnegie Mellon Master of Software Engineering program take in completing their degree. Students learn about the continuous risk management paradigm from the Software Engineering Institute. Two of the cornerstones of this technique are threshold of success and condition-consequence based risk statements.

Having ready access to risk management experts at the SEI, nearly every team conducts a facilitated small team risk evaluation workshop in which risks are collected with the help of a taxonomy-based questionnaire (pdf), analyzed, and prioritized using group multi-voting. The basic workshop has been conducted the same way for close to a decade and many teams have put their risk data collected during the workshop in the MSE’s project archive.

I’ve gathered data from these small team risk evaluation workshops for 9 MSE Studio teams, a total of 164 identified, analyzed, prioritized risks.

What’s in the Data?

During a risk evaluation workshop, teams identify risks using their threshold of success as a guide. Once identified, risks are briefly analyzed and assigned an impact, probability, and time frame value based on a rough average from the team members’ initial gut feeling on the risk. These values are assigned simply so when a manager asks to see the probability, for example, there is a value to give him. Each of impact, probability, and time frame can only be one of 3-4 values. The idea is that by decreasing the precision we can increase the accuracy. Values are assigned based on a rubric. For the purposes of calculating a risk exposure I assigned each of the analysis categories a number. Time frame is not used in calculating exposure.

Impact

  • Catastrophic – The team will be unable to meet threshold of success. (numeric value 4)
  • Critical – The team can only meet the threshold of success with significant additional effort and stress. (numeric value 3)
  • Marginal – The team can meet the threshold of success with minimal extra effort. (numeric value 2)
  • Negligible – There is no real impact on achieve the threshold of success or little increase in effort. (numeric value 1)

Probability

  • High – Chance of becoming a problem is above about 80%. (numeric value .8)
  • Medium – Chance of becoming a problem is about 50/50. (numeric value .5)
  • Low – Chance of becoming a problem is below about 20%. (numeric value .2)

Time Frame

  • Short – May occur in about a month or less.
  • Medium – May occur in 1 to 3 months.
  • Long – May occur in more than 3 months.

Instead of relying on the results from the analysis, teams perform 3 to 4 rounds of multi-voting. The final multi-voting rank is shown. Not all teams ranked all risks since teams generally only deal with the top few risks, usually less than 10. This idea is captured in the priority. A risk is either a high priority, meaning the team is actively addressing it, or a low priority meaning the team is aware of it but it was not ranked high enough to deal with yet. Teams might choose different strategies for determining priority. The two most popular are to only examine the top X or to rely on consensus derived from how the risks clustered as a result of multi-voting. Usually there is strong team consensus for the top 4 to 5 risks and weak consensus after this.

Analysis and Discussion

My hypothesis is that teams’ rankings will generally match exposure, meaning that risks that are ranked highly will also have a high exposure. As the data shows, this is generally the case. On average nearly every team’s high priority risks were also the ones with the highest exposure.

Graph showing Teams' Average Risk Exposure by Priority.

Examining the risks rank and exposure tells a similar story but not convincingly. There is a relatively weak negative correlation (correlation coefficient of -0.22) between exposure and team assigned rank. Basically the best that can be said is that there is a general downward trend in exposure as the rank increases but there is enough variation that I can’t really say anything for certain.

Graph showing risk data for all teams.

I have two possible explanations for this. First, traditional risk exposure does not take into account time frame while teams evaluating risks in this data set do. So, all things equal from an exposure perspective, a long term risk might be ranked very low while a short term risk will be ranked much higher. If this were the case, we’d see more short-term risks assigned high ranks than long-term risks and this is indeed the case. In fact, the majority of risks identified are short-term risks with nearly three times more short-term risks being identified than long term risks. Mid-term risks are, unsurprisingly in the middle. A better exposure number might be had by taking into account risks’ time frame values.

Graph showing the count of risks per time frame by rank

The second possible explanation I have is that 3 – 4 buckets isn’t sufficient to allow for enough variation to form a strong correlation between rank and exposure. Indeed this is one of the greatest differences between this data set and traditional risk exposure calculations in which impact might take on nearly any number and exposure is usually a percentage from 10 – 100%. That said there still is a general trend which shows that most of the time, multi-vote ranking very roughly corresponds to exposure.

There is one more catch about this data and it’s a subtle but important one. Values for probability, impact, and time frame were determined as a team using a sort of rough average approach where team members vote and the approximate averages are rounded to the nearest bucket. Since all the values and rankings were determined through a group effort, it would make sense that they should roughly correspond.

Conclusions

As it turns out, risk exposure is a rough and somewhat accurate indicator for relative risk priority, at least when calculating exposure or rank using group-driven techniques. Teams relying only on exposure are likely to rank some risks higher than they otherwise might. Part of this is due to exclusion of the concept of time from traditional exposure, part of it might be differences of opinion within the group as far as impact or probability are concerned.

Talking with other MSE alumni, and I mostly agree with them, the most important thing about risk management is bringing up concerns and talking about them. Delphi mutli-voting is an easy way to encourage conversation since differences of opinion are addressed as part of the multi-voting process. No matter what technique you use, exposure (with time somehow included), multi-voting, or some combination, do not reduce risk management to simple numbers. It’s really all about communication. Encourage this communication using whatever techniques work for your team.

Raw data used for analysis in CSV format.

Identifying Process Affordances: Nudging Toward Change

This post is a recap of a talk I gave this weekend at the Carnegie Mellon University Master of Software Engineering 20th Anniversary Mini-Conference. I’ve made the paper this talk is based on (pdf) as well as the slides I used during the talk (pdf) available. I’ve also linked to as many of the primary sources I used in research as I could so please, check out those papers if this is something that interests you.

About a year ago I discussed the idea of using affordances to help figure out how to make software processes work more smoothly for a team. Back then the idea spawned from a moment of crisis and self-reflection on my studio team, but having thought about it for a while and noticing the phenomena occurring on other teams I decided to revisit this idea and see if there is a way to proactively use affordances to avert problems rather than merely explaining problems as they occur. As it turns out there is a precedent for affordance-driven design in object engineering. With a few basic assumptions I think affordance-driven design can be extended to software processes as well.

Michael Keeling giving a talk at the MSE mini conference

To better explain how to proactively use affordance to pick or tailor a software process it helps to know a little about the Theory of Affordances (pdf). If you’ve read The Design of Everyday Things by Donald Norman then you probably already have a pretty good understanding of how to use affordances in the context of object design and usability. From an ecological psychology perspective, affordances are a way of helping explain or predict how an animal will behave in the context of their environment. In the context of humans, our environment is influenced by our experience, culture, and background as well as our current goals within that environment.

A simple example is the play button on a DVD player. A triangle to the right means play. This is obvious to us because we know what a DVD player does (we have experience with similar devices), we turn to the device when we want to watch movies (we’re looking for the play button), and culturally, our notion of time is left to right (so a triangle pointing right means play while a triangle facing left means reverse). But what if you come from a culture where time is generally represented as passing from down to up vice left to right? In this case, a triangle facing up might be a better symbol on a play button than a triangle pointing right. The value of the affordance changed based on a cultural bias and the user’s background.

This is all well and good, but what does it have to do with software process? For the Theory of Affordances to apply to software processes (or any process) we have to assume that process is a part of the environment. This is an interesting proposition since process really only exists in our minds. Process is something that we make up, and like our understanding of an object your knowledge and experience with a process will influence your perception of that process as an environmental influence. As long as you believe in, understand, and follow a process, the steps of that process exist as much as any other object in the real world. Logically this seems to make sense. Even the US patent office will grant patents for a process just as it will a physical invention.

Back to the core problem of identifying affordances, something I did not have an answer for in my original post a year ago. As best as I can tell, the only way to identify affordances is to reverse engineer a process focusing on the affordances using a technique known as Affordance Driven Design. Affordance Driven Design has three basic steps (pdf).

  1. Identify a user’s needs in terms of functions.
  2. Identify the desired functional affordances necessary to achieve the previously identified user’s functions.
  3. Choose affordances to design into artifacts which are mostly likely to help achieve those desired functional affordances.

As an example, consider the task of blending a drink using a blender (pdf). Typical functions a user might want to perform are preparing the blender, blending, and cleaning up afterward. Functional affordances might include the “countertop-ability” (the ease with which a blender can be moved to a countertop), the “clean-ability” (the ease with which a user can clean a blender), and “transportability” (the ease with which a user can move a blender around). A person might choose any number of blenders to perform the desired functions but each blender will fulfill the functional affordances in different ways. A hand blender is extremely portable while a gas-powered “whacker” (powered by a 26cc engine, complete with motorcycle throttle – it makes the smoothest margarita you’ll eve have) isn’t really intended for indoor use.

To software engineers, functional affordances should look very familiar – they’re essentially quality attributes. Thinking about affordances from this perspective gives us a huge advantage since, as software engineers, we are already extremely familiar with quality attributes and quality attributes scenarios. Affordances, therefore either promote or inhibit desired quality attributes in your process.

Thinking about software processes, the functions will all be related to software: writing, designing, testing, and releasing are just a few possibilities. Some process quality attributes might include:

  • Plan-ability (How far ahead does a process help you to plan?)
  • Predictability (How well can you see into the future?)
  • Changeability (When the course of a project needs to shift, how well does the process support chaging plans or direction?)
  • Quality (The degree to which your process promotes “quality”)
  • Cost (The amount of resources you’re willing to spend on process to achieve specific functions)
  • Harmony (How well the team gets along)
  • Reliability (How consistently the process helps you perform)
  • Performance (Could be speed of development or quantity of code – define what you mean in the quality attribute scenario.)

As an example, say changeability is a desired quality on your team. A specific changeability scenario might go something like this. In order to meet business needs in an aggressive market, the team needs to be able to shift focus and answer competitors’ challenges within five business days. What are the things that might get in the way of this kind of rapid change? Heavy documentation could be one thing, as it nudges teams into keeping a single course. Long iterations also make it difficult to shift focus since more effort is required to make longer term plans. Going light and having short iterations, on the other hand promotes a team’s ability to change. But like every design decision there are trade-offs. Going light in terms of documentation might make it harder to achieve certain kinds of quality, for example.

The main idea here is that it’s relatively easy to identify process affordances by thinking like a designer and applying the skills we’ve already acquired as software engineers. I propose that evaluating process affordances as I’ve discussed here is a great way to pick a process and also to tailor processes. When tailoring simply identify affordances that are helping the team (be sure to keep those), and identify the affordances that are nudging the team in the wrong direction (replace those with affordances that help you do the right things). And above all, remember that if things are going wrong, it isn’t always your fault. The process is a part of the environment and if your process is giving you the wrong cues for your project or team, then it’s the wrong process for you. So change it!

Process Affordances: Ignore at Your own Peril

The Amsterdam airport was able to reduce the amount of urine “spillage” that hit the men’s room floor by 80% simply by etching a life-like image of a fly near the urinals’ drains. The fly was specifically engineered into the urinals to alter gentlemen’s behavior without their having to think about it. The concept is called nudging and it’s been used in domains other than restroom sanitation to encourage desired behavior. Other examples include the use of uncomfortable chairs in fast food restaurants to encourage people not to linger and real-time gas mileage displays in cars to encourage more economical driving. If you’ve read Donald Norman’s The Design of Everyday Things then you’ll know this as an affordance – a hint given to the user prompting them to take a specific action at a specific time.

Obviously the idea of affordances is directly applicable to devices as well as software usability but it wasn’t until I read about the urinal flies that I realized affordances don’t always have to have a physical representation. For example, a well designed software process should gently nudge a team to do the right thing. Since there is no one-size-fits-all process that works for all teams it is essential that the process complements the team and that the process’s affordances nudge team members to do what’s best for the project and the team.

Using a process that lacks the right affordances could have one of two possible outcomes. In the best case, the team abandons the process because they realize subconsciously that it is telling them to do the wrong things at the wrong times. This is bad because it sacrifices repeatability; you’ve regressed back to an ad hoc, “make it up as we go” state. In the worst case, the team sticks with the process and it leads them astray. This introduces risks into the project and could lead to complete project failure.

Software is already difficult enough to build successfully and processes are supposed to make software development easier. Unfortunately, knowing when something isn’t working is not an exact science, but with a dash of experience and little team reflection (for example from regular postmortems) it is possible to figure out when you are working for your process instead of your process working for you. To demonstrate this I am going to tell you a story.

Our Process

My studio team in the Carnegie Mellon software engineering program is charged with building a web-based requirements elicitation tool that helps users follow the SQUARE process out of the SEI. About halfway through the Elaboration Phase of the project (sometime in the spring semester) the project was going downhill. The warning signs were fairly apparent, we were missing milestones, tasking priorities were confusing, and a lot of work was stalling out at different levels of partial completion. Though we knew there was something wrong we weren’t really sure what was causing it, what we were doing wrong in our planning and tracking process.

The planning process we were using was fairly simple. At the beginning of the phase we looked at all the activities and artifacts that need to be completed by the end of that phase. For each identified milestone we enumerate specific entry criteria, general tasking, validation procedures, and exit criteria. This is a technique known as ETVX (entry, tasking, validation, and exit). Next we used planning poker to estimate how long we thought each milestone would take to complete. Finally, with this information we created a phase timeline which includes known due dates and dependencies between milestones.

Since we’re using an iterative approach to complete work in a phase, iterations follow largely the same planning process on a smaller scale. As a team we identify the milestones on which we will work during the iteration. Each milestone is assigned an owner whose job it is to ensure the milestone is completed by either delegating tasks or working on it themselves. The planning poker estimate is used to determine the approximate workload allocation on the team. This estimate is validated with bottom-up estimates that team members create based on their individual tasking.

There are several good things about this process. First, it’s written down and the team follows it. This is good because it means we can produce repeatable results over time. Second, this process makes use of several practices that are generally considered “good” by software experts. ETVX is a great way to clearly identify project milestones. Planning poker is similar to the wide-band Delphi estimation technique. Third, we’re using two forms of estimation to validate the plan as more information becomes known. Finally, the engineers responsible for the work determine the specific tasking and creating the bottom-up estimates.

You’re Good, but not That Good

In spite of all the good things we were doing, something still wasn’t connecting. The big aha! moment occurred about two weeks into the second iteration. Up to that point I had been working on my tasks that had carried over from the first iteration. The team leader noticed that almost no work had been started on the milestones I owned. [An aside: this, to me, says that at least our tracking process works somewhat well.] During the discussion that followed I became extremely defensive when the team leader asked me to shift priorities for the rest of the iteration. What should have been a simple request turned into a heated debate over tasking. I felt compelled to complete the past due work and here was this jerk trying to stop me. “Sure,” I thought, “I’ll do what you ask, buddy, but when this whole project comes crashing down it’s on your head, not mine.”

Later, as I looked back at the incident, I wondered to myself, “Why was I so defensive in light of such a simple request?” The reality was that the project wouldn’t come crashing down if I shifted priorities and I knew that. So why defend these older tasks when it was obvious that there were more immediate needs?

It turns out that the affordances built into the planning process were encouraging my behavior. There were a few simple things at play that, when combined, decreased our ability to plan effectively.

First, our process encouraged us to plan more work than time allowed. This was due to there being a missing connection between day-to-day progress and the “big picture,” the overall plan. Second, though the new team leader may have believed there was consensus, the team in fact did not wholly agree with the priorities for iterations. This behavior was not specifically discouraged by our planning process and so allowed to persist. Third, leftover work was not addressed during planning. Some tasks might simply expire while others may change priority, becoming more or less important with a new iteration. Since this wasn’t addressed it created a sense of urgency for individuals carrying over work from iteration to iteration. Finally, assigning milestone owners had unanticipated side effects. The goal was to ensure that someone was taking responsibility for coordinating and monitoring milestone work. This worked so effectively that milestone owners exhausted themselves attempting to finish milestones and resisted changes to the plan that prevented them from finishing what was promised.

When it came time to make a necessary modification to the plan, our process encouraged us to fight against the best course of action for the team. We didn’t have the level of flexibility needed due to our process’s affordances nudging us to do the wrong things. Milestones were slipping and people wanted to finish what they started. Project priorities were shifting as the project matured but team members were wearing blinders, ignoring the changing facts around us. To stand a chance at success we had to change the affordances in the planning process. We had to nudge the team in a new direction.

Our Solution

To try to solve this problem we decided to incorporate some of the planning principles from Scrum, specifically the product backlog, sprint backlog, and sprint planning meeting, into our planning process. Scrum takes a more task-oriented approach when planning iterations and correlates the sprint backlog with the product backlog. This better encourages the team to not plan more work than there is time to complete while connecting day-to-day work with the overall plan. Scrum also requires that the team reprioritize work when planning iterations and that we agree on the resulting priorities. This will hopefully eliminate the prioritization conflicts we experienced during iterations. With Scrum, leftover work from iterations is saved in the product backlog. This change decreases the anxiety team members feel when work is left undone (because the work is not forgotten) while simultaneously giving the team more flexibility to change direction as the project progresses. Finally, the team, rather than individuals, takes ownership over the milestones held in the product backlog. With each commitment made during iteration planning, the whole team buys in effectively shifting the passion and dedication individuals held for owned milestones to the commitments we agreed on as a team.

I’m not really sure how Scrum is going to turn out for us. I think the most important thing is that we recognized that something was not working and took action to correct it. I personally would rather see the team fail in a new and spectacular way rather than repeating the same mistakes again and again.

Add This to Your Silver Toolbox

Unfortunately, I don’t think there is a trick for detecting these sorts of process failures. Data and metrics can help but only if the process is repeatable and the team has the knowledge and discipline to collect the data in the first place. Team postmortems can help but if individuals are afraid to raise concerns, you’ll find yourself on a trip to Abilene before you realize it. In many cases, if you think something isn’t going well, others are probably thinking the same thing. Once I spoke up I found out that others thought something wasn’t working also. I was just the first person who was able to articulate it.

Affordances are powerful but subtle mechanisms. In well designed things, we aren’t supposed to be consciously aware of them. But that doesn’t mean they always nudge us to do the right thing.