Food for Thought - An e-newsletter published by Software Quality Consulting, Inc.
June 2008, Vol. 5 No. 5
Managing Software Risks - Part 1

What topics would you like to see in this newsletter? Each month, this newsletter tries to provide you with useful information. This is a two-way street and your feedback is important. Please send your thoughts and comments to steve@swqual.com.

--------------------------------------------------------------------------------

Welcome to Food for Thought(TM), an e-newsletter from Software Quality Consulting (http://www.swqual.com/index.html?Intro). I've created free subscriptions for my valued business contacts. If you find this newsletter informative, I encourage you to continue reading and to pass it along to colleagues.

If you’ve received this newsletter from a colleague and would like to subscribe, please use the Enter New Subscription link (http://www.swqual.com/newsletter/Subscribe.htm?Newsletter).

Your continued feedback on this newsletter is most welcome. Please send your comments and suggestions to info@swqual.com.

--------------------------------------------------------------------------------

*** In This Issue ***

In This Month’s Topic, I discuss software risk management...
Regular features to look for each month are:

- Monthly Morsels
  Hints, tips, techniques and reference info related to this month’s topic
- Calendar
  Conferences, workshops, and meetings of interest to software engineers, QA engineers and anyone interested in software development

--------------------------------------------------------------------------------

*** This Month’s Topic ***

MANAGING SOFTWARE RISKS - PART 1

True story - on a recent business trip to Europe, the in-flight entertainment system crashed several times during the flight. When the flight attendant rebooted the system, I watched the screen as the boot loader loaded binary files, executables, libraries and drivers. At first, I thought this was very amusing - yet another example of defective software. However, whenever the in-flight entertainment system crashed, the reclining seats also stopped working. So if your seat was in the reclined position when this occurred, it became difficult if not impossible to leave your seat until the system was up and running again. It was no longer amusing...

What does this have to do with managing software risks? Clearly, the company that designed the software failed to identify a potential risk: that passengers could be stuck in their seats when the system crashed. They probably didn’t anticipate the situation where the system would not be functioning...

Risk is something we deal with every day. On many software projects, we acknowledge that there are risks but often fail to address them proactively. As a result, many software projects are negatively impacted by risks that were usually known but were not effectively managed.

- Risk is defined as the probability of incurring a loss or enduring a negative impact. [8]
- Risk management is an organized process for identifying and handling risk factors; it includes initial identification and handling of risk factors as well as continuous risk management. [8]

“Software has long been regarded as one of the most risk-prone of all engineering activities. Risks such as schedule slips and cost overruns tend to occur on more than 50% of all large systems. Even more severe risks, such as cancellation of the project prior to completion or serious quality deficiencies are not uncommon.” [1]

Effective risk management has become increasingly important, given the exponential increase in software complexity (see the HTML version for the graph illustrating this). [2]

In this first part of a two-part discussion on managing software risks, we’ll look at the many different types of risk we need to be aware of and at some organizational barriers that may inhibit the identification of these risks...

SOFTWARE HAS MANY DIFFERENT TYPES OF RISKS...

We all know that using software has risks - these are External Risks and may include:

- Economic Risks
  Software failures have caused significant financial losses. For example, in 1996, an unmanned Ariane-5 rocket exploded 37 seconds after liftoff. The cause of the failure was traced to software specification and design errors in the inertial reference system. Property losses were estimated at over $500 million. [3]

  In 2003, defective software was a major contributor to the Northeast power blackout, the worst power system failure in North America. Over 50 million customers lost power as 100 power plants were shut down. Financial losses from this failure were estimated at $6 billion. [4]

- Social Risks
  During the first Gulf War, an Iraqi Scud missile killed 28 American soldiers and wounded 98 others. The Patriot missile was deployed to protect US forces against Scud missile attacks and should have intercepted the incoming Scud. However, a defect in the Patriot’s target acquisition software prevented detection of the incoming Scud missile. Ironically, this defect was found and fixed, but the fix arrived one day too late.
  An investigation of this incident revealed that the original design requirements for the target acquisition software were based on an assumption that the radar set associated with the Patriot would be operating for not more than 14 hours in a given 24-hour period. Under this assumption, the error associated with tracking potential targets would meet the design specifications. When deployed, however, this assumption turned out to be false: the radar set was on continuously, and as a result the targeting error exceeded the limits in the design specification. This is an example of a risk arising from an invalid assumption. [5]

- Political Risks
  Software has even been involved in politics. The public’s confidence in electronic voting machines has been tainted due to poor design and ignored risks. A recent controversy in California highlights this issue:

  “California Secretary of State Debra Bowen announced on Friday that the state hopes to recertify and continue using electronic voting machines produced by Diebold, Sequoia, and Hart, even though the machines have known security vulnerabilities and severe flaws. The state government decided that the machines can still be used as long as the vendors adhere to a lengthy list of requirements that aim to limit the potential for security breaches and machine failure.

  This announcement from the state follows extensive red team security audits that illuminated profound security failings in all of the electronic voting machines that were subjected to scrutiny. The security researchers who analyzed the voting machines found ways to modify firmware, gain root access, trivially circumvent voting machine physical security mechanisms, install self-propagating trojan horses, and manipulate mock elections.
  On Diebold’s voting machine, which uses the Windows operating system, researchers even found a remotely-accessible administrative account that wasn’t protected by a password.” [6]

On the other hand, developing software also has risks - these are Internal Risks [7] and may include:

- Schedule Risks
  - Is the schedule realistic?
  - What assumptions were made in developing the schedule?
  - Are all of the resources identified in the schedule available from the start?
- Staffing Risks
  - Are the best people available?
  - Do they have the right skills for this project?
  - Are enough people with the right skills available?
  - Are people committed for the duration of the project?
  - Have staff members received the necessary training?
  - Is turnover likely to affect the project?
- Process Risks
  - Are requirements well defined and reviewed for ambiguity?
  - Is there a documented development process?
  - Does management support following the development process?
  - Is the development process followed?
  - Are published software standards provided to staff?
  - Are peer reviews part of the process?
  - Have staff been trained in peer reviews?
  - Are CM tools, procedures, and training in place?
  - Is there a process for changing requirements?
- Technology Risks
  - Is the technology new to your organization?
  - Are new algorithms required?
  - Does the software interface with new or unproven hardware?
  - Does the software interface with unproven 3rd-party software?
  - Are there unreasonable performance requirements?

Now that we have identified some broad categories of risk, let’s look at some ways to deal with risk. DeMarco and Lister [9] identify four things you can do:

- You can avoid it
  You avoid risk when you decide to cancel a project or not do a part of a project that has risk. As an example of avoiding Internal Risks, the brokerage firm Merrill Lynch decided to cancel a project to develop web-based stock trading - before any company had done this.
  Companies such as Fidelity, E-Trade and others were willing to accept this risk and, as a result, reaped significant financial gains.

- You can contain it
  Containing risk means you plan to deal with it when it occurs. You may set aside additional time and resources to deal with an internal risk, for example. One of the best-known examples of trying to contain an external risk is the case of the Ford Pinto and its infamous exploding gas tank...

  “Through early production of the model, it became a focus of a major scandal when it was alleged that the car's design allowed its fuel tank to be easily damaged in the event of a rear-end collision which sometimes resulted in deadly fires and explosions. Critics argued that the vehicle's lack of a true rear bumper as well as any reinforcing structure between the rear panel and the tank, meant that in certain collisions, the tank would be thrust forward into the differential, which had a number of protruding bolts that could puncture the tank. This, and the fact that the doors could potentially jam during an accident (due to poor reinforcing) made the car a potential deathtrap.

  Ford was aware of this design flaw but allegedly refused to pay what was characterized as the minimal expense of a redesign. Instead, it was argued, Ford decided it would be cheaper to pay off possible lawsuits for resulting deaths. Mother Jones magazine obtained the cost-benefit analysis that it said Ford had used to compare the cost of an $11 repair against the cost of paying off potential lawsuits, in what became known as the Ford Pinto Memo (http://www.calbaptist.edu/dskubik/pinto.htm). The characterization of Ford's design decision as gross disregard for human lives in favor of profits led to major lawsuits, criminal charges, and a costly recall of all affected Pintos.” [10-http://en.wikipedia.org/wiki/Ford_Pinto#Safety_problems]

  There were at least 27 deaths attributed to this design decision and several lawsuits totaling about $121 million.
  Ford thought they could contain this risk...

- You can mitigate it
  Mitigating risk means that you take proactive steps to minimize the impact of a risk should it occur. Memory leaks are a very common example of an internal risk that exists on many software projects. Using memory leak detection software is an example of mitigating this risk.

- You can evade it
  Evading risk means you take no action at all - other than hoping that the risk doesn’t materialize. Of all the ways of dealing with risk, this is clearly the least expensive but also the least effective. You may be lucky enough to dodge a bullet once or even twice, but eventually you will get nailed. And Murphy’s Law says that when you do get nailed, it will be at the worst possible time...

CREATING A CULTURE OF RISK AWARENESS

DeMarco and Lister refer to Risk Management as “project management for grown-ups.” Only naive project managers and immature organizations pretend that risks will not affect projects in some way.

The culture at many organizations can suppress open and frank discussion of risk. One very sad example of this was associated with the 1986 explosion of the space shuttle Challenger. The commission which investigated the accident found that:

“...the Challenger accident was caused by a failure in the O-rings sealing the aft field joint on the right solid rocket booster, which allowed pressurized hot gases and eventually flame to "blow by" the O-ring and make contact with the adjacent external tank, causing structural failure. The failure of the O-rings was attributed to a design flaw, as their performance could be too easily compromised by factors including the low temperature on the day of launch.

More broadly, the report also considered the contributing causes of the accident. Most salient was the failure of both NASA and its contractor, Morton Thiokol, to respond adequately to the design flaw. This led the Rogers Commission to conclude that the Challenger disaster was "an accident rooted in history."”
[11-http://en.wikipedia.org/wiki/Space_Shuttle_Challenger_disaster]

The culture at both NASA and Morton Thiokol essentially suppressed concerns that engineers had about the safety of the O-rings.

DeMarco and Lister [9] identify cultural barriers to risk awareness in many organizations. These barriers are often in the form of “unwritten rules” that make it difficult to discuss potential risks. For example:

- Don’t be a negative thinker
- Don’t raise problems unless you have a solution
- Don’t raise a problem unless you can prove it is a problem
- Don’t raise a problem unless you are willing to take responsibility for the solution

Does the culture at your company effectively prevent people from raising concerns and risks? If so, you need to bring this to the attention of senior managers and discuss ways of changing the culture. One good approach is to discuss the role of playing the devil’s advocate. By explicitly acknowledging this role, you are in effect freed from the culture that attempts to suppress discussion of possible risks.

SUMMARY

Risks are inherent in every software project. If there were no risks, the project wouldn’t be worth doing. In my next newsletter (Sept 2008), I will discuss the basic process of performing risk management. Until then, remember that the biggest risk on any software project is not knowing what the risks are!

I hope you have a pleasant and relaxing summer. Please look for my next newsletter in September...

--------------------------------------------------------------------------------

*** Monthly Morsels ***

Every month in this space you’ll find additional information related to this month’s topic.

- References
  1 Jones, C., Assessment and Control of Software Risks, Prentice-Hall, 1994.
  2 Higuera, R. P. and Haimes, Y. Y., Software Risk Management, Technical Report CMU/SEI-96-TR-012, ESC-TR-96-012, June 1996.
  3 Ariane-5 Rocket Explodes on Liftoff (http://www.ima.umn.edu/~arnold/disasters/ariane.html)
  4 2003 Northeast Power Blackout affects 50 million people (http://en.wikipedia.org/wiki/2003_North_America_blackout)
  5 Wiener, L., Digital Woes: Why We Should Not Depend On Software, Addison-Wesley, 1993.
  6 Paul, R., “California to recertify insecure voting machines”, ars technica, August 2007. (http://arstechnica.com/news.ars/post/20070806-california-to-recertify-insecure-voting-machines.html)
  7 Pressman, R., Software Engineering: A Practitioner’s Approach, 4th ed., McGraw-Hill, 1997.
  8 Fairley, R., “Software Risk Management Glossary”, IEEE Software, May-June 2005.
  9 Lister, T. and DeMarco, T., Waltzing With Bears: Managing Risk on Software Projects, Dorset House, 2003.
  10 Wikipedia, Ford Pinto Safety Problems (http://en.wikipedia.org/wiki/Ford_Pinto#Safety_problems)
  11 Wikipedia, Space Shuttle Challenger Disaster (http://en.wikipedia.org/wiki/Challenger_explosion#Rogers_Commission_investigation)

- Additional Resources
  Wiegers, K., “Know Your Enemy: Software Risk Management”, Software Development, October 1998.
  Boehm, B., Software Risk Management, IEEE Computer Society Press, 1989.
  Jones, C., Assessment and Control of Software Risks, Prentice-Hall, 1994.
  Yourdon, E., Death March: Managing Mission Impossible Projects, Prentice-Hall, 1997.
  Grey, Practical Risk Assessment for Project Management, Wiley, 1995.
  Charette, R. N., Application Strategies for Risk Analysis & Management, McGraw-Hill, 1990.
  Kaner, C. and Bach, J., Risk Based Testing Course (http://testingeducation.org/BBST/BBSTRisk-BasedTesting.html)

--------------------------------------------------------------------------------

*** Calendar ***

Every month you’ll find news here about local and national events that are of interest to the software community...
- Software Quality Calendar
  There are many organizations that sponsor monthly meetings, workshops, and conferences of interest to software professionals. Find out what’s happening... (http://www.swqual.com/links/upcoming.html)
- Workshops Offered by Software Quality Consulting
  Software Quality Consulting offers workshops in many topics related to software process improvement. Get more info... (http://www.swqual.com/seminars/courses.html)

--------------------------------------------------------------------------------

*** About SQC ***

Software Quality Consulting provides consulting, training, and auditing services tailored to meet the specific needs of clients. We help clients fine-tune their software development processes and improve the quality of their software products. The overall goal is to help clients achieve Predictable Software Development(TM) - so that organizations can consistently deliver quality software with promised features in the promised timeframe.

To learn more about how we can help your organization, visit our web site (http://www.swqual.com/index.html?AboutSQC) or send us an email (info@swqual.com).

--------------------------------------------------------------------------------

I hope this newsletter has been informative and helpful. Your comments and feedback are most welcome.

Thanks,
Steve Rakitin
info@swqual.com

Food for Thought, Predictable Software Development, Act Like a Customer, and ALAC are trademarks of Software Quality Consulting, Inc.
Copyright 2008. Software Quality Consulting, Inc. All rights reserved.
Graphic design by Sage Studio