Software Craftsmanship: Engineering by Coincidence

I was extremely disappointed to read a recent article on Coding Horror reflecting on an IEEE editorial written by Tom DeMarco. If you have not already, please read Tom DeMarco’s article now. It’s only two pages and it’s well written.

With all due respect, Tom DeMarco is wrong.

And Jeff Atwood made things worse.

According to Atwood’s interpretation of DeMarco, since we can’t control software projects, there is no sense in trying to engineer software.

What DeMarco seems to be saying — and, at least, what I am definitely saying — is that control is ultimately illusory on software development projects. If you want to move your project forward, the only reliable way to do that is to cultivate a deep sense of software craftsmanship and professionalism around it.

Atwood’s conclusion simply is not supported by DeMarco’s article. DeMarco made two points in his piece.

  1. We don’t have as much control over software as we think we do — even when we can measure the software on which we work.
  2. We should be focusing more on the upfront “conception” activities than the areas that currently receive the most attention, construction.

My interpretation of “conception” activities are things like requirements, architecture, and design — details that ultimately help you figure out whether it makes sense to build the thing you think you want to build. By framing DeMarco’s argument as “craftsmanship” vs. “engineering” Atwood misses the whole point and reopens the tired art or engineering debate.

Overlooked by Atwood, DeMarco never questioned the idea that software should be engineered.

I’m gradually coming to the conclusion that software engineering is an idea whose time has come and gone. I still believe it makes excellent sense to engineer software. But that isn’t exactly what software engineering has come to mean. The term encompasses a specific set of disciplines including defined process, inspections and walkthroughs, requirements engineering, traceability matrices, metrics, precise quality control, rigorous planning and tracking, and coding and documentation standards. All these strive for consistency of practice and predictability.

DeMarco is really saying that the engineering part of software engineering has become overshadowed by a collection of best practices for building software. In my mind this isn’t necessarily a bad thing. All it means is that what has become known as “software engineering” is different than the original definition intended by the NATO Conference on Software Engineering.

But by discounting current software engineering practices, DeMarco dismisses the real engineering that went into advancing the field to where it is today.

DeMarco seems to imply that what we really want software engineering to be–the application of systematic, disciplined, quantifiable approaches to the development of software–and what software engineering has become cannot coexist. Essentially, to reach a state where metrics and measures, quantifiable approaches, are used correctly and consistently by the software development community we must stop using the term “engineering” to describe the current set of practices.

This is backwards thinking.

Engineering is more than something you do; it’s also a way of thinking about problems and solutions. Reaching the point in software engineering that we are today required systematic, disciplined, and quantifiable thinking. Over time results of this thinking have been codified into the set of best practices that most developers now take for granted.

For example, we know that there is a 100x or more difference in costs between defects discovered later in the software lifecycle than earlier. We know that certain practices can effectively remove defects at different costs and at different times throughout the lifecycle (for example, inspection vs. prototyping vs. unit testing vs. system testing). We also know that historical data is an excellent indicator of future performance on software projects.

Systematic, disciplined, quantifiable thinking was required to make these discoveries.

Because of these codified best practices, it is not always necessary to conduct experiments on a project to trust that they are working. I know unit testing combined with regular system integrations will flush certain defects from my software before those defects become a problem during system testing. I know that statistical analysis of collected task tracking data will help me better predict how long future tasks of a similar size will take. It doesn’t matter whether I completely understand the engineering behind the practice or whether I simply follow the process or use the tool. The benefits will be the same.

Does that make me less of an engineer? I don’t think so.

Using best practices codified as processes, methods, or tools on a software project means you are engineering software whether you like it or not. With many of these practices, the control mechanisms are already built in so you don’t realize that you’re already controlling your project. As DeMarco points out, it simply isn’t necessary that every engineering detail be painstakingly scrutinized for a project to be successful. For many projects, the essence of the project is sufficient to overcome the accidents encountered when engineering by coincidence.

But just because you engineer by coincidence it doesn’t make you a software craftsman. To prove it, I’m calling Jeff Atwood out. Jeff, I dare you and the Stack Overflow team to take the PSP Challenge. Take a course on the Personal Software Process, honestly give it a try — use actual software engineering for a few weeks — then tell me that software engineering is dead. But don’t knock it until you’ve tried it.

Relationships Matter

Building a great software product is only half the battle. The relationship the user builds with your software has the single greatest influence on how awesome that software is. Sorry. It’s not the language the software is written in. It’s not the algorithms. It’s not all the processes you used behind the scenes to create it. If the relationship sucks, your software sucks. Period.

To prove this I’m going to use a tangentially-related personal experience that has absolutely nothing to do with software. In this case, it’s ok, because realistically, this bit of reflection actually has nothing to do with software – remember it’s about relationships. Let’s talk about two concerts I’ve seen this summer.

Coldplay: “You’re Awesome!”

I’ve heard Coldplay’s radio hits but I was never really a huge fan of the band. Had it not been for a friend’s invitation I would never have gone to this concert and I would have missed out on one of the best live shows I’ve seen. Acceptable radio hits became handcrafted masterpieces, more amazing live than on FM. The most interesting part of the show, and the thing that left the greatest impression on me, was the band’s absolute humble devotion toward the fans in attendance. At one point the band walked out to mini-stages set up in the seated area under the pavilion and way out in the lawn, giving all fans an equal chance to be a part of the show, not just the diehards that purchased overpriced tickets.

Coldplay’s attitude created a completely positive vibe that made me want to enjoy myself.  After they had exhausted themselves for close to two hours, doing everything possible to create a positive experience for the audience, Coldplay thanked us, not for coming to see them, but for allowing them to play for us, “You’re awesome, thank you for letting us play for you tonight!”

Dave Matthews Band – “We’re Awesome!”

I didn’t realize that Dave Matthews Band live shows are really just one long meandering jam session.  Every one of the band members is an amazing musician, but 14+ minute songs start to wear after a while.  Dave Matthews rarely spoke (when he did it was incomprehensible) and most of the band wore sunglasses throughout the entire show. Overall, it was a very mediocre concert that was better suited to an afternoon picnic setting than an evening at a large outdoor amphitheatre.  From a technical perspective it was amazing but overall it failed to entertain me.

The most interesting thing about this concert was the crowd. In spite of the fact that the band seemed to be playing solely for their own amusement, the audience was full of absolute fanatics, complete with their own dress code – plaid shorts and tie dyed shirts (though rarely together) with sandals – and a shared emotional connection to every song the band played. Dave Matthews Band is here to play music – If you like it, whatever, if you don’t, whatever.  “We had a fun time tonight and I hope you did too!”

It’s all About the Relationship

It turns out that the relationship the bands built with the audience during the show was directly related to how much I enjoyed it. It should be pretty obvious how this extends to software.  The relationship your software builds with your users will determine how much they enjoy using it. Is your software helping users be amazing or does your software expect users to change their behaviors to use it? Are you building something for yourself that others might happen to like or are you targeting a specific audience?

Software that is technically the best will always fall short to similar software that builds a relationship. Kathy Seirra expressed a similar sentiment with her Dating Rules for Software. But there’s a small catch to Kathy’s rules as exhibited by the Dave Matthews Band. Apple tends to act this way too. Sometimes if you do your own thing and you actually are great, you can afford not to focus on the relationship. It’s extremely difficult but sometimes it works. Sacrificing the relationship creates a razor thin margin for error but when you get it right, you’ll breed fanatics that will love you (almost) unconditionally and critics who will absolutely hate you – there is no mediocrity.

Sacrifice the relationship and you could be the best, but only until someone almost as good comes along who thinks about others before themselves.  I think Microsoft has learned that lesson the hard way.  No matter what you do, relationships rule the day.

Kenny Rogers’ Guide to Software Process Improvement

Process improvement is a tricky mistress.  Usually it’s sufficient to feel that something can be done better, but often, especially for larger organizations or folks looking for higher CMMI rankings, it’s necessary to quantify improvements in terms of measurable effects.  There is a cost to process improvement; every hour spent beefing up process is an hour less spent doing “real” work.  And if you optimize out the wrong processes it could seriously damper development.  The key is to determine the most cost-effective changes and only execute those process improvements.  Return on investment analysis is one way to understand cost-effectiveness (PDF), but I’ve found this technique conceptually difficult to apply to software.  How much is quality worth?  How do you put a dollar amount on developer happiness?  What’s a line of code worth to you?  It’s enough to make your head spin, but a good friend of mine, Kenny, once gave me some great advice that helps me keep things straight.

You’ve got to know when to hold ‘em

When you’ve got a process that works, keep using it.  It’s a great feeling and there isn’t much better than finding a process that works well for you and your team.  When you finally settle on a process that works it can lead to a boon of productivity.  If you’re disciplined enough to capture data, use it to prove that your processes are actually working.  If you aren’t the data type you’ll have to rely on heuristics and intuition and the best heuristic I’ve found is the KA Index.  The KA Index is a subjective measure of how much your team Kicks Ass.  The best processes are the ones that make you awesome – fewer defects discovered in the field, better predictions of when you’ll finish tasks, fewer fires to fight – all of these are “gut measurable” data points indicating that your process is probably working.

Your gut can get you far but eventually you’ll need data to convince your boss to make the investment.  I can highly recommend the Personal Software Process if you’re interested in learning how to use data to objectively measure your processes.

Know when to fold ‘em

When you’ve got a process that isn’t working, drop it.  Data can be used here too if you’re collecting it.  If you’re not, I’ve found carefully examining process affordances to be helpful in uncovering bum processes but affordances can be extremely subtle.  The trick when giving up a process is to quit strategically – not too early, before you might see benefits, and not too late, after you’re experiencing severe problems.  (The Dip by Seth Godin is an excellent, short read on this subject.)  The other thing to look out for is the cost of dropping the process.  Even if your process is terrible, if worse habits fill the process vacuum you could be worse off.

Know when to walk away

Once you’ve decided to abandon a problem process it usually pays to gently transition to something else.  Changing processes can be expensive between learning curve, tool changes, and the requisite burn-in period when trying out something new.  If your bum process is only killing your team slowly, take a little time to read up on a replacement and avoid setting yourself up for failure.  For example, when my team switched to Scrum, we finished the iteration we were in when we decided to make the switch before making drastic changes to our operating procedures.

Know when to run

If your process is absolutely killing you, drop it.  Immediately.  When considering this route I recommend a thorough impact analysis including some back-of-the-envelope ROI calculations.  Changing a horse mid race will be expensive, I’ve seen it happen enough times in the DoD to know that you will pay dearly in the short term, but if the long term costs justify the change and you’re willing to go through the pain, do not hesitate to make the change.

Never Count your money when you’re sitting at the table

Generally I’ve found that processes out of a book need a few iterations of development before I’m able to understand how they really work and how I’m supposed to use them.  If you’re following an established process (XP, Scrum, TSP, ACDM, and so on), then try to follow it to the letter before tailoring it.  Most processes have built in mechanisms for changing them once you’ve gained some experience and by all means, tailor once you know something doesn’t work.

There’ll be time enough for counting when the dealings done

Postmortems are one of the most powerful tools in your silver toolbox.  No matter how the project goes down, always take the time to reflect on what happened, why, and what you should have done so you can use that knowledge next time.  Personally, I hate the idea “lessons learned” and much prefer recommendations for next time.  “Lessons learned” implies that you’ve learned the lesson and no further action is required but this is hardly the right attitude for effective software process improvement.

Every process is a winner and every process is a loser

So every software engineer knows that the secret to surviving, is knowing what to throw away and knowing what to keep.  And in the right hands, the right process can be a savior while under the wrong circumstances the same process might kill a project.  I can’t offer any concrete advice other than to not be afraid of data, listen to your instincts, and look back on what you’ve done.

Software Engineering: Art or Engineering?

For some reason software engineers and others who deal with software love sitting around debating whether software engineering is more art or engineering. Part of the problem comes from how software engineering came to be, the term assigned as more a challenge by the NATO Science Committee for the software industry to get its act together; one great big label to talk about many things. Software engineering covers all aspects of software development including requirements, formal modeling, estimation, planning, tracking, managing people, processes, designing, coding, testing, deploying, and maintaining just to mention the most obvious required skill sets. Some aspects of software engineering are not heavily engineering oriented and require no mathematical or scientific reasoning such as requirements and managing people. Many aspects can benefit greatly from a mathematical background but math is not required to “get by” such as when writing code, testing, or planning a project. This is unfortunate. Confusion and arguments abound among both the academic and professional software engineering communities creating holes in the professional software world where no talent amateurs defend abominable code and runaway projects under the guise of “artistic expression.” Software engineering is an engineering discipline.

I’m not saying there isn’t a creative component to software engineering. In fact, there is a huge creative component and creative problem solving is essential for building great software. I’m also not saying that code and the resultant software can’t be aesthetically pleasing. It absolutely can and should have good aesthetics. I enjoy reading well written, elegant code and always do the best I can when writing software myself. What I am saying is that art is more than just creativity and elegance and in almost all cases, software falls well short of art. Since the vast majority of software is not art, why should the people who wrote it be considered artists?

My neighbor upstairs is an artist in the traditional sense. He writes and plays music, he paints, he makes stained glass, and he’s a professor of eurhythmics (more specifically Dalcroze Eurhythmics) at Carnegie Mellon University. As you can imagine, he’s a really cool guy. He’s actually one of a handful of eurhythmics grand masters in the world and attended the world eurhythmics conference in Tokyo this past summer. From what I gathered from talking with him, eurhythmics is a way of studying art through the body’s motions. The most obvious application is music and dance but it’s not too difficult to imagine a painter creating a masterpiece; moving with the flow of the painting as it is created.

Some have even tried applying eurhythmics to study other, non-traditional art forms such as architecture and even mathematics. He explained it to me something like this. A brilliant mathematician working on an elegant proof might be considered more of an artist. But only someone truly brilliant who is in the zone, able to see all the connections of the work with unbridled clarity, tuning out the outside world (much like flow when programming), with body and mind working as one to solve a proof might be considered to be making art rather than doing math. It’s an extreme extension of eurhythmics theory, but it seems plausible.

The argument for treating software engineering as an art form is similar. Writing software could be considered art but only in the hands of truly brilliant software developers (Paul Graham calls them hackers) and only when those truly brilliant software developers are in the zone, building something amazing. Everyone else is just writing code. Much like how no one would consider multiplication tables artistic, anyone who considers churning out code without intention or understanding of what is written as art is kidding themselves. Looking at software engineering from the perspective of eurhythmics lets us state this more as a fact than an opinion.

Making this distinction is important because treating software like art and software engineers like artists is damaging to the industry. Only the greatest, most brilliant developers might ever have the ability to build software as artists. Everyone else is just trying to get by, much like the many thousands of starving artists who are never discovered, never hit it big, or whose muse never makes it around for a visit. Treating software like art allows developers who don’t have the artistic chops to have the excuse of creating poor quality software that is not fit for the purposes of its creation.

Fitness for purpose is another key distinction. Art exists to evoke emotion but software has a utility. In many cases, software is too important for it to not work correctly. Approaching software development from an engineering perspective creates an environment where it is possible for the rest of us to create working software that is still fit for purpose. Again, I’m not saying that software can’t be have good aesthetics, it should, I’m simply saying that a systematic, critical, methodical, mathematical, scientific approach to building software is required for most people to succeed – much like how architects need a solid foundation in physics and engineering to build beautiful buildings.

I like the idea of software as art. Artists are cool and thinking of myself as an artist allows me to justify extravagant behavior that, as a creative knowledge worker, I know I need. On the other hand, I rarely care about using software as a medium for creating an emotional experience for my users. I only really care about creating something great that helps users kick ass while accomplishing whatever it is that they want to do. Following sound engineering principles helps me meet this goal consistently.

Building software is a creative endeavor and the engineering part of the field is still very young compared to other engineering disciplines. But treating software engineering like an art form is a copout, an excuse for ignoring the sound engineering practices that do exist because they’re difficult or boring or unappealing. There are important creative components to building software but that shouldn’t hide the fact that proven engineering practices can and should be used too.

Securing the Internet

Over Spring break I was fortunate to receive an invitation to the exclusive CERT (Computer Emergency Response Team) 2009 Technical Symposium.  The symposium was held in celebration of the twentieth anniversary of CERT but the main theme of the symposium was an examination of the internet, how the choices made just over 20 years ago have resulted in a huge security mess and what we can do about it today to avert the utter destruction of our modern way of life.  I wish I was exaggerating but the internet is rapidly becoming the exclusive means for sharing and managing the world’s data.  The world’s economy is following suit which heightens the need for security even more.  Speakers at the symposium included Vinton Cerf (sometimes known as “The Father of the Internet”), Scott Charney (VP for Microsoft’s Trustworthy Computing Division), Lawrence Roberts (invented packet switching), Steve Crocker (invented “Request for Comments”), Paul Mockapetris (invented DNS), and many other brilliant people who have been instrumental in creating and defining the internet.  Each speaker brought slightly different experiences to bear but everyone had the same message: if we don’t do something to secure the internet soon, something very, very bad will happen.

First, a little history

Security was never considered when the internet was originally conceived.  I guess that’s not entirely accurate, security was considered but quickly dismissed in favor of “the smallest set of tools that would give us the largest set of possibilities for the future,” according to Mockapetris.  Considering the original operating environment, the closed ARPANET/DARPANET, it makes sense that security was traded in favor of other quality attributes such as performance, reliability, and modifiability.  In the early 1980s, hosts numbered in the hundreds and there was an established and respected community of trust among the fraternity of hosts.

Once the internet became a public entity things changed.  Slowly, the community of trust that kept everything together began to unravel.  It started with news group flame wars.  Shortly after viruses started showing up.  There was even some direct hacking, albeit primarily by misled white hat hackers trying to help.  Then on November 2, 1988 the Morris Worm was released by a researcher from MIT and within 72 hours most of DARPANET was taken offline by the resulting accidental denial-of-service attack.  The community of trust that kept the internet running was permanently shattered.

CERT was created to deal with the problems that resulted from the world’s first denial-of-service attack.  While CERT’s mission is noble and its function critical to dealing with all the threats faced on the internet today, CERT is not the panacea required to save the world from the ever increasing internet threats coming from all angles in the form of botnet armies, distributed denial-of-service-attacks, and targeted attacks by sophisticated hackers motivated purely by profit.  The main problem is that CERT, by nature, is a reactive organization put in place to respond to emergencies as they happen.

As Vinton Cerf pointed out at the symposium, “the internet is always under attack.  It’s not enough to respond to each of the different incidents as they happen.  Reactivity is not enough.”  To save the world, we need to proactively address the security holes that were originally designed into the internet.

Short Term Solutions

Fixing the internet is an architectural problem.   It’s impossible to focus on a single layer of the protocol stack to make the internet more secure.  Each layer has different kinds of vulnerabilities and each layer can be exploited using different methods to achieve the same ends.  Unfortunately, replacing the entire protocol stack is difficult, thanks in a large part to how successful the internet has been.  But there are several short term strategies we can begin working toward today that work within the constraints of the current system.

One of the easiest and most significant things we can do is take humans out of the security loop.  It’s been proven time and again that people don’t follow instructions for securing systems.  Both John Gilligan and Vinton Cerf stated that improperly configured software accounted for over 80% of attacks on the internet.  Fix this problem and we’ve essentially removed 80% of cyber security threats.  Fixing the way we identify ourselves online (i.e. passwords) is another easy way to take humans out of the security loop.  A recurring theme from most speakers but most prominently discussed by Scott Charney was redefining identity online.  There are solutions that can work now to solve each of these problems.

It turns out that fixing the improper configuration problem is easy but it’s going to be a difficult sell.  The basic idea is that a computer or device has to start with a good configuration and then control mechanisms have to be put in place to prevent the configuration from getting hosed.  Some vendors such as Apple, Tivo, and Blackberry are already doing a great job at keeping their device configurations under control.  The trade-off for security is going to be freedom (modifiability if you prefer quality attributes).  In exchange for a usable, safe, and enjoyable internet experience that even your grandmother can handle you have to give up the ability to tweak anything you want in the system.  If iPod, iPhone, Tivo, and Blackberry sales are any indicator, people are willing to exchange freedom for something that just works.

Is it really fair to compare devices such as phones and PDAs with full blown computers?  Absolutely.  The key phrase is “Internet Enabled Device.”  An iPhone or Tivo can be hijacked and recruited into a botnet army just as easily as a standard desktop PC.  If even only 1% of the more than 4 billion mobile phones in use throughout the world were compromised, that would make 40 million bots ready to unleash mountains of spam and unending DDoS attacks.  The only reason this doesn’t happen is because mobile devices are effectively locked down to prevent such compromises from occurring.

With regards to passwords, I hate to say it but I think the Department of Defense is actually on the right track.  Every DoD employee and contractor is issued what is known as a Common Access Card (CAC), an ID card attached to a digital identification certificate issued by a third party authority that has verified that the person to whom the certificate was issued is indeed that person.  Both Vinton Cerf and Scott Charney were very clear that online identity can only be solved by combining something you know, usually a password, with something you have, a physical thing.  This something you have could be a digital identification certificate issued by an authority.

While it would be awesome to issue CACs to every citizen in the United States, perhaps attached to their driver’s licenses as Charney suggested, a staged success approach is possible.  PayPal has taken the initiative with a hardware device called Security Key.  Security Key is small enough to fit on a keychain and it generates a secret token (in this case a string of numbers) every thirty seconds.  When you purchase through PayPal you enter your username, password, and the token shown on the device.  Even if you accidentally gave this information to a phishing website, the token will be worthless after 30 seconds protecting your online wallet from fraud.

Though Vinton Cerf didn’t know the first thing about what I was talking about when I asked him about it, I see a lot of potential with OpenId.  OpenId is a protocol that provides something akin to a universal password that you can use to access thousands of websites and, with a little more work, could solve the identification problem across the internet.  First, OpenId providers would need to take on the role of verifying user’s real life identities and connecting the physical identity with the digital identity.  Second, OpenId providers have to offer some kind of physical verification when logging in, the something you have.  This could be in the form of a CAC, something biometric like a fingerprint reader, or even something as simple as PayPal’s Security Key.  In my opinion, OpenId is poised to become the third-party identification authority the world needs.

The Next Step

Another big theme at the symposium was trying to come up with a plan of action for moving forward, for fixing these problems.  Most speakers agreed that it would be unfortunate if it took a Pearl Harbor-like event to be the catalyst that motivated people to action.  Given the sophistication of attacks and the amount of value on the internet, the magnitude of damages could bring the United States, and in turn the world thanks to the global economy, to its knees.  The most frustrating part is that we know about the problems and I think there are a lot of smart people who know how to technically solve the problems.

By the end of the second day of the symposium I was left with more questions than answers and I am completely convinced that there will be a cyber security failure of a magnitude we have not yet seen in the near future.  I’m not suggesting that everyone start stocking canned goods in the cellar, but now is the time to proactively start working for change.  For as much flack as Microsoft has taken over the years, they seem to be on the forefront of effecting change both in terms of technology and public policy.  It’s difficult to know where to start, but Microsoft’s Trustworthy Computing resources are both comprehensive and practical.  The knowledge to fix the problems seems to be out there, the trick is motivating the world to change.

Process Affordances: Ignore at Your own Peril

The Amsterdam airport was able to reduce the amount of urine “spillage” that hit the men’s room floor by 80% simply by etching a life-like image of a fly near the urinals’ drains. The fly was specifically engineered into the urinals to alter gentlemen’s behavior without their having to think about it. The concept is called nudging and it’s been used in domains other than restroom sanitation to encourage desired behavior. Other examples include the use of uncomfortable chairs in fast food restaurants to encourage people not to linger and real-time gas mileage displays in cars to encourage more economical driving. If you’ve read Donald Norman’s The Design of Everyday Things then you’ll know this as an affordance – a hint given to the user prompting them to take a specific action at a specific time.

Obviously the idea of affordances is directly applicable to devices as well as software usability but it wasn’t until I read about the urinal flies that I realized affordances don’t always have to have a physical representation. For example, a well designed software process should gently nudge a team to do the right thing. Since there is no one-size-fits-all process that works for all teams it is essential that the process complements the team and that the process’s affordances nudge team members to do what’s best for the project and the team.

Using a process that lacks the right affordances could have one of two possible outcomes. In the best case, the team abandons the process because they realize subconsciously that it is telling them to do the wrong things at the wrong times. This is bad because it sacrifices repeatability; you’ve regressed back to an ad hoc, “make it up as we go” state. In the worst case, the team sticks with the process and it leads them astray. This introduces risks into the project and could lead to complete project failure.

Software is already difficult enough to build successfully and processes are supposed to make software development easier. Unfortunately, knowing when something isn’t working is not an exact science, but with a dash of experience and little team reflection (for example from regular postmortems) it is possible to figure out when you are working for your process instead of your process working for you. To demonstrate this I am going to tell you a story.

Our Process

My studio team in the Carnegie Mellon software engineering program is charged with building a web-based requirements elicitation tool that helps users follow the SQUARE process out of the SEI. About halfway through the Elaboration Phase of the project (sometime in the spring semester) the project was going downhill. The warning signs were fairly apparent, we were missing milestones, tasking priorities were confusing, and a lot of work was stalling out at different levels of partial completion. Though we knew there was something wrong we weren’t really sure what was causing it, what we were doing wrong in our planning and tracking process.

The planning process we were using was fairly simple. At the beginning of the phase we looked at all the activities and artifacts that need to be completed by the end of that phase. For each identified milestone we enumerate specific entry criteria, general tasking, validation procedures, and exit criteria. This is a technique known as ETVX (entry, tasking, validation, and exit). Next we used planning poker to estimate how long we thought each milestone would take to complete. Finally, with this information we created a phase timeline which includes known due dates and dependencies between milestones.

Since we’re using an iterative approach to complete work in a phase, iterations follow largely the same planning process on a smaller scale. As a team we identify the milestones on which we will work during the iteration. Each milestone is assigned an owner whose job it is to ensure the milestone is completed by either delegating tasks or working on it themselves. The planning poker estimate is used to determine the approximate workload allocation on the team. This estimate is validated with bottom-up estimates that team members create based on their individual tasking.

There are several good things about this process. First, it’s written down and the team follows it. This is good because it means we can produce repeatable results over time. Second, this process makes use of several practices that are generally considered “good” by software experts. ETVX is a great way to clearly identify project milestones. Planning poker is similar to the wide-band Delphi estimation technique. Third, we’re using two forms of estimation to validate the plan as more information becomes known. Finally, the engineers responsible for the work determine the specific tasking and creating the bottom-up estimates.

You’re Good, but not That Good

In spite of all the good things we were doing, something still wasn’t connecting. The big aha! moment occurred about two weeks into the second iteration. Up to that point I had been working on my tasks that had carried over from the first iteration. The team leader noticed that almost no work had been started on the milestones I owned. [An aside: this, to me, says that at least our tracking process works somewhat well.] During the discussion that followed I became extremely defensive when the team leader asked me to shift priorities for the rest of the iteration. What should have been a simple request turned into a heated debate over tasking. I felt compelled to complete the past due work and here was this jerk trying to stop me. “Sure,” I thought, “I’ll do what you ask, buddy, but when this whole project comes crashing down it’s on your head, not mine.”

Later, as I looked back at the incident, I wondered to myself, “Why was I so defensive in light of such a simple request?” The reality was that the project wouldn’t come crashing down if I shifted priorities and I knew that. So why defend these older tasks when it was obvious that there were more immediate needs?

It turns out that the affordances built into the planning process were encouraging my behavior. There were a few simple things at play that, when combined, decreased our ability to plan effectively.

First, our process encouraged us to plan more work than time allowed. This was due to there being a missing connection between day-to-day progress and the “big picture,” the overall plan. Second, though the new team leader may have believed there was consensus, the team in fact did not wholly agree with the priorities for iterations. This behavior was not specifically discouraged by our planning process and so allowed to persist. Third, leftover work was not addressed during planning. Some tasks might simply expire while others may change priority, becoming more or less important with a new iteration. Since this wasn’t addressed it created a sense of urgency for individuals carrying over work from iteration to iteration. Finally, assigning milestone owners had unanticipated side effects. The goal was to ensure that someone was taking responsibility for coordinating and monitoring milestone work. This worked so effectively that milestone owners exhausted themselves attempting to finish milestones and resisted changes to the plan that prevented them from finishing what was promised.

When it came time to make a necessary modification to the plan, our process encouraged us to fight against the best course of action for the team. We didn’t have the level of flexibility needed due to our process’s affordances nudging us to do the wrong things. Milestones were slipping and people wanted to finish what they started. Project priorities were shifting as the project matured but team members were wearing blinders, ignoring the changing facts around us. To stand a chance at success we had to change the affordances in the planning process. We had to nudge the team in a new direction.

Our Solution

To try to solve this problem we decided to incorporate some of the planning principles from Scrum, specifically the product backlog, sprint backlog, and sprint planning meeting, into our planning process. Scrum takes a more task-oriented approach when planning iterations and correlates the sprint backlog with the product backlog. This better encourages the team to not plan more work than there is time to complete while connecting day-to-day work with the overall plan. Scrum also requires that the team reprioritize work when planning iterations and that we agree on the resulting priorities. This will hopefully eliminate the prioritization conflicts we experienced during iterations. With Scrum, leftover work from iterations is saved in the product backlog. This change decreases the anxiety team members feel when work is left undone (because the work is not forgotten) while simultaneously giving the team more flexibility to change direction as the project progresses. Finally, the team, rather than individuals, takes ownership over the milestones held in the product backlog. With each commitment made during iteration planning, the whole team buys in effectively shifting the passion and dedication individuals held for owned milestones to the commitments we agreed on as a team.

I’m not really sure how Scrum is going to turn out for us. I think the most important thing is that we recognized that something was not working and took action to correct it. I personally would rather see the team fail in a new and spectacular way rather than repeating the same mistakes again and again.

Add This to Your Silver Toolbox

Unfortunately, I don’t think there is a trick for detecting these sorts of process failures. Data and metrics can help but only if the process is repeatable and the team has the knowledge and discipline to collect the data in the first place. Team postmortems can help but if individuals are afraid to raise concerns, you’ll find yourself on a trip to Abilene before you realize it. In many cases, if you think something isn’t going well, others are probably thinking the same thing. Once I spoke up I found out that others thought something wasn’t working also. I was just the first person who was able to articulate it.

Affordances are powerful but subtle mechanisms. In well designed things, we aren’t supposed to be consciously aware of them. But that doesn’t mean they always nudge us to do the right thing.

Applying Leadership Styles

The setting is a small library filled with various books on software engineering. Five people are sitting around a single, small table, the room overcrowded with the group. The Square Root team is reviewing the mid-semester presentation I threw together last night before we were to show it to our mentors in a few hours. I was showing the team the last slide. Up to that point everyone had pretty much agreed with the content in the presentation.

“Does anyone have any other risks they’d like to bring up?” I asked, confident that, as the team leader, I had a firm grasp on how the team was doing and where our current problems lay.

I waited a few seconds.

“OK, if nobody’s got anything I’ll go ahead and email this out…”

“Communication,” the quietest member of the team chimed in. “We have problems with our team communication.”

I was in shock. Surely this must just be an isolated problem. “Do you mean team communication or do you mean you don’t understand some aspect of the project?”

“I agree. Team communication seems to be problematic.” Now there were two dissenters.

A few seconds later there were three. Three fifths of the team felt we had team communication problems that were directly impacting the outcome of the project. I sat there shocked for a moment. I had just been blindsided by a team communication problem, therefore, obviously, we have a team communication problem.

Looking back at this incident I have decided that leadership, my leadership, was to blame. From the beginning, I had led the way I liked to be led: hands-off with enough space to make my own decisions and get things done the way I wanted to do them, the polar opposite of micromanagement. Of course, for me this worked out really well. This was the way I liked to work and the way I was used to working. Unfortunately, for a new team, a team lacking in trust, whose members didn’t know one another and had other commitments and priorities (school work) my ideal working environment and preferred leadership style was disastrous.

To prevent utter team destruction I changed perspectives. From what I could tell, team members didn’t know what they were supposed to be doing. I had assumed that this team would operate similarly to my last team where everyone would come together like a well-oiled machine to get work done. Shortly after the incident in the library I remembered that my last team took almost a year to become that awesome. Of course there were going to be problems in my new team.

Since my team seemed not to know how to take initiative in completing tasks and getting work done I thought I’d set an even better example for them. By doing so, maybe they’d get the hint and follow suit. The next day I kicked things into gear. I took on more work. I picked up all the slack I could and then some. I took on tasking outside of work specifically assigned to my role. I set high standards for myself and my team and we were going to meet those standards if I had anything to do with it.

Leadership disaster number two was prevented thanks to a timely assignment in my Managing Software Developers course. Leadership that Get’s Results by Daniel Goleman defines six distinct leadership styles and gives a little advice on when each style is appropriate. Up to that point I had no knowledge of such styles and had been flying on instincts. The following table is taken from Goleman’s paper but I encourage you to read the full paper to gain a better understanding of these ideas in a more complete context. There’s also tons of information on the web, a search away.

Leadership Style Description When the style works best Impact on team climate
Coercive Demands immediate compliance, “Do what I tell you.” Crisis, kick-start a turnaround, problem employees Negative
Authoritative Mobilize people toward a vision, “Come with me.” When changes require a new vision or clear direction is needed Most strongly positive
Affiliative Creates harmony and builds emotional bonds, “People come first.” Heal rifts in a team or motivate people during stressful circumstances Positive
Democratic Forge consensus through participation, “What do you think?” Build buy in or consensus, or to get input from valuable team members Positive
Pacesetting Set high performance standards, “Do as I do, now.” Get quick results from a highly motivated, competent team Negative
Coaching Develop people for the future, “Try this.” Help team members improve performance or develop long-term strengths Positive

As it turns out, I started with the right idea by using an authoritative style of leadership even though I didn’t know the name for it. I failed with this style because the team wasn’t ready for it yet. We didn’t have a clear vision or clear goals. We weren’t yet working well together. There was little buy-in to my leadership or the few goals the team did have. We were a full-fledged dysfunctional team. Switching to a pacesetting style was just about the worst possible thing I could have done. If I had gone on for too long I am fully confident that I would have destroyed the team and we wouldn’t have gotten anything done during our first semester. With my new found knowledge of the different leadership styles I now had new options. I immediately put three styles to use: affiliative, coaching, and coercive.

The team was clearly broken. My hope was that affiliative leadership might help build trust and better bonds among teammates. Being a programmer I tend to be a little more logical than emotional so trying to tune in more to how people felt was tough. I found that affiliative leadership goes hand in hand with vulnerability and trust and that individually thanking someone for their hard work, even if it’s relatively small, and meaning it is one of the most important things you can do as a leader.

In addition to team harmony, it was obvious that some team members were struggling with the tasks they had been given. Rather than taking over those tasks as I had been doing, I decided to try taking on more of a coaching role. In some cases I would help team members directly, other times I encouraged other team members to work together on tasks. The end result was a team better able to work together to accomplish tasks.

In spite of all this, I had the feeling that if we didn’t do something immediately, the entire semester would be a wash. To prevent this from happening I used coercive leadership in an attempt to get us back on track quickly even though the overall impact could be negative in the long term. In a sense, I was willing to be a bad guy so the team had a chance of meeting its goals. The gamble paid off and the way the team operated turned around almost instantly.

About two months after I had been blindsided by unseen communication problems, my team seemed to be working together much better. Problems were being flushed out into the open more quickly and everyone on the team seemed to enjoy working with one another on the project. I am not going to take full credit for the change but I will take credit for being the catalyst that put the change into motion. All because I changed how I led the team.

There is a small downside to the changes, but it’s only really a downside because I enjoy doing technical work. The leadership role on my team has evolved. By the end of the semester I found myself directly responsible for almost no work but rather, I was the go to guy for all problems the team was facing. I had my hands in everything but I wasn’t able to really sink my teeth into anything. If this is what management is like and I ever do become a project manager I will need to find ways to remain technically involved in something or I think I’d eventually go insane.

The biggest lesson I’m taking away from this experiences is that sometimes the team’s needs and my needs aren’t going to line up. Recognizing when this is occurring and finding a balance between these conflicting needs is critical to team success.

Thoughts on Formal Methods in Software Engineering

Are formal methods in software engineering really all they are cracked up to be?

First, let me clarify what I mean by formal methods.  Formal methods use mathematical representations of software to formally prove that the software behaves as expected.  Most formal methods use predicate logic and set theory along with other mathematical notations to describe software in a formal specification.  Examples include Z (pronounced zed), state machines (such as the finite state process algebra FSP), Petri Nets, the use of pre and post conditions, and the Object Constraint Language used in conjunction with class diagrams in UML.  Most formal methods have some kind of automated proof capability be it by brute force or through the use of more sophisticated theorem proving programs.

Let’s take a quick look at Z which is probably the most widely available formal specification method in terms of books, online documentation, and automated theorem provers.  Here’s a simple example of the birthday book taken from “An introduction to Z and formal specifications” by Mike Spivey.

Looks like gibberish, right?  It isn’t actually important that you understand what this means.  In fact, if you don’t understand it then I’ve already made my point.  Writing formal specifications using a formal modeling language is time consuming, costly, and in the end, not effective since it is too difficult to understand what the specification is actually telling me, the programmer, even if I’m trained in Z.

Formal methods get a lot of traction from the ridiculously high cost of defect injection early in the software lifecycle and the unacceptably high cost of defects in safety critical systems.  Injecting a bug in the requirements phase that is not caught and fixed until testing has been estimated to cost upwards of 100 times the cost of finding and fixing the same bug while the requirements are being written.  In safety critical systems, where human life is literally in the hands of software, deployed defects could result in injury or death (the Therac-25 accidents being some of the most famous).

Formal methods folks argue that a clear, precise, unambiguous, and proven formal specification can prevent many of these early defects from making their way into the software and thus prevent these problems.  Economic claims aside, formal methods fail because they still ignore the weakest link in building software: the human programmers.

Mathematics tells us that it is possible to create functions that map one representation of a set to another equivalent set.  For example, given a cube in three-dimensional space, a rotation function will maintain the dimensions of the cube while rotating the shape about a desired axis.  A similar principle is applied when considering Z and other formal methods.  Mapping functions are used to transform a high-level Z specification into a more concrete representation of the same specification.  Eventually the specification reaches a level of detail that is easily translated to code.  In practices this might mean that a set, such as birthday in the BirthdayBook schema above might be represented by a pair of arrays in a more concrete version of the schema.

While theoretically this sounds great, the problem is that the mapping cannot be automated in most cases so we’re essentially back to square one in terms of what we can say about the more concrete version of the specification.  This is thanks to the inevitable mistakes injected during the transformation process.  Kennedy-Carter’s iUML from the Model Driven Architecture camp is the closest thing to an automated transformation tool I’ve yet seen and even that did not include OCL or any sort of pre and post condition analysis.

So formal methods are difficult to write, difficult to understand, few people know them, and they don’t translate easily into code.  Is there a silver lining?  After all, a lot of time and research has gone into devising formal methods and some of the greatest minds in software engineering are behind them.

I think the answer is a big fat maybe.

Formal methods, in the right hands on the right project can make it easier to understand the system being specified.  A Navy track manager might be one good example.  Rules for how ships assume reporting responsibility might be easily recorded as a predicate.  Safety is certainly a concern and though the validation process will have to begin anew once code is written at least the basic algorithms will be clear and unambiguous.  That is of course assuming the right people are working on the project.  Then again, in the case of the IABM, any specification would have been better than the big plate of nothing the developers had to work with.

For almost every project in the world, I think formal methods should be generally avoided.  Given the option of spending money and time on mathematicians or extremely smart coders I would chose the latter.  With smart coders, code inspection is a fun and effective defect filtering option.  And let’s face it.  Why would you have your amazing coders do something other than write amazing code?