An e-newsletter published by
May 2005, Vol. 2 No. 5
Welcome to Food for Thought™, an e-newsletter from Software Quality Consulting. I've created free subscriptions for my valued business contacts. If you find this newsletter informative, I encourage you to continue reading. Feel free to pass this newsletter along to colleagues by clicking this Forward Email link. If you’ve received this newsletter from a colleague and would like to subscribe, please click this Enter New Subscription link. If you don't wish to receive this newsletter, click the SafeUnSubscribe™ link at the bottom of this newsletter, and you won’t be bothered again.
In This Month’s Topic,
I discuss using Root Cause Analysis to find the real cause of Customer Reported Problems…
Getting to the Root of

Of all the kinds of problems that software development organizations face, Customer Reported Problems (CRPs) are clearly the most important. This is because CRPs represent potential gaps in your knowledge of how your customers use your software. CRPs may be the result of deficiencies in your product marketing, software development, test, or fulfillment processes. CRPs can often result in unplanned releases that are both disruptive and expensive. When the underlying causes of CRPs are not fully understood, they can result in poor solutions that often create more problems than they solve. Nothing frustrates customers more than a supplier who is unable to resolve problems quickly and correctly.

Motivation

By now, we should all know that the sooner a problem is found the easier and less costly it is to fix. Barry Boehm [1] demonstrated this almost 25 years ago. Current data [2] suggest that even the most experienced developers inject one defect for every 10 lines of code they write. While effective testing can find up to 95% of these defects prior to release, that still leaves quite a few defects for customers to find. At those rates, for example, a 100,000-line product would contain about 10,000 injected defects, and at least 500 of them would reach customers.

Finding critical defects in your software is very disruptive not only for your customers but for your software development organization as well. Unplanned releases to fix CRPs divert expensive development resources from tasks that generate revenue (new features, new products, etc.) to tasks that don’t generate revenue (bug fixes). Unplanned releases are clearly not good for your bottom line.

CRPs represent more than just defects. CRPs should be broadly defined to include any failure of software and services (including code, documentation, installation, customization, fulfillment, training, etc.) that negatively impacts customers.

Root Cause Analysis

Working in safety-critical industries has allowed me to become familiar with several tools not routinely used in the commercial software development industry.
One such tool is called Root Cause Analysis (RCA). This tool is commonly used within a Six Sigma framework. I’ve adapted the traditional RCA Process to make it work effectively within typical software development organizations. RCA helps people understand WHAT happened, WHY it happened, and HOW it happened when an event (a CRP) occurs.

Overview
RCA is routinely used to investigate the cause of major disasters including:
RCA helps us:
In applying RCA to a typical software development organization, we need to keep in mind the fact that finding the root cause of a CRP may be difficult because:
Let’s now look at terms specific to the RCA process.

Terminology

The RCA Process uses the following terms:
Let’s look a bit closer at the attributes of root causes:
Now that we have some terms defined, let’s look at the Root Cause Analysis Process.

RCA Process Overview

The RCA Process consists of investigating, understanding, and categorizing underlying root causes of observed CRPs. It can be best performed by a small cross-functional team and can be easily incorporated into your Defect Triage Process. The RCA Process includes a detailed analysis based on gathering factual information obtained from:
And the RCA Process uses simple tools including:
An effective RCA Process helps determine appropriate and effective corrective actions by identifying both an Immediate Corrective Action (what should be done today to resolve the CRP) and Long Term Corrective Action (what should be done to prevent recurrence). In applying the RCA Process, the Triage Team starts with a specific CRP and asks:
Most root causes are found in the way we operate. That includes:
The Triage Team asks questions about “Who does what”, “How things get done”, and “Why we behave the way we do”, in order to identify factual information that can be helpful in identifying real root causes. In asking these questions, the Triage Team uses a tool called the Why Tree. Why Trees are similar to Fault Trees in that the CRP is placed at the top. We then ask “Why did this happen?” and start drilling down into “Who does what”, “How things get done”, and “Why we behave the way we do”. At each level, the team continues to ask “Why” – usually at least five times (though for simpler problems, fewer than five Whys may suffice). The following illustrates a partially completed Why Tree for a simple problem:
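For readers who prefer to see the idea in code, a Why Tree can be sketched as a simple tree data structure. The Python below is only an illustration and is not part of the RCA Process itself: the class name `WhyNode`, the helper `probable_root_causes`, and the example CRP with all of its answers are hypothetical. The `probable_root_cause` flag plays the role of the green circles that mark probable root causes on a Why Tree diagram.

```python
from dataclasses import dataclass, field


@dataclass
class WhyNode:
    """One statement in a Why Tree; its children answer 'Why did this happen?'."""
    statement: str
    probable_root_cause: bool = False
    children: list = field(default_factory=list)

    def why(self, answer, probable_root_cause=False):
        """Record one answer to 'Why?' beneath this node and return the new node."""
        child = WhyNode(answer, probable_root_cause)
        self.children.append(child)
        return child


def probable_root_causes(node, depth=0):
    """Yield (number of Whys asked, statement) for each flagged node."""
    if node.probable_root_cause:
        yield depth, node.statement
    for child in node.children:
        yield from probable_root_causes(child, depth + 1)


# Hypothetical CRP, in the customer's words, placed at the top of the tree.
crp = WhyNode("Installer fails on the customer's server")

# First branch: keep asking "Why?" until there are no more answers.
a = crp.why("Installer assumes a default install path")
b = a.why("Non-default paths were never tested")
c = b.why("Test plan covers only the default configuration")
d = c.why("Functional Specification does not list supported configurations")
d.why("No review step checks specifications for deployment details",
      probable_root_cause=True)

# Second branch: a simpler cause found after a single "Why?".
crp.why("Customer did not receive the updated User Manual",
        probable_root_cause=True)

for depth, statement in probable_root_causes(crp):
    print(f"{depth} Whys deep: {statement}")
```

In this invented example the first branch takes five Whys to reach a probable root cause, while the second takes only one, which mirrors the point that simpler problems may need fewer than five Whys.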
Answers to Why questions may need to be determined from documents (like Functional Specifications, Test Plans, User Manuals, etc.), from records (like test results, shipping invoices, etc.), from interviews with staff and customers, and from brainstorming sessions. The information shown in green circles on the Why Tree example represents probable root causes. The Triage Team reaches consensus on the most probable root cause(s). Often, there will be more than one root cause.

Using the Why Tree, the Triage Team develops an Immediate CA (which could be a workaround, hot fix, patch, new CDs, new doc, etc.). The team also identifies effectiveness checks that can determine if the Immediate CA, once implemented, has effectively resolved the CRP.

Once the Immediate CA is implemented and the effectiveness checks are satisfactory, the Triage Team decides if a Long Term CA is needed. A Long Term CA would be appropriate if the root cause points to systemic problems. If so, they begin to develop a Long Term CA. The team does this by:
Once the team has completed work on the Long Term CA, it can be presented to Management and implemented. The team then collects data to determine if long term effectiveness checks are satisfactory.

Now let’s identify the specific steps needed to perform an effective RCA.

RCA Process Steps

Step 1 - Data Collection

The majority of time spent analyzing events will be spent gathering data. Complete information and a thorough understanding of events are required to identify causal factors and real root causes.
Step 2 – Determine What Happened

The Triage Team starts with the CRP in the Customer’s words and asks “Why did this happen?” As they start to drill down, they create the Why Tree and continue asking “Why?” until there are no more answers. Usually, you need to ask “Why?” a minimum of 5 times. This process will identify additional information to collect. For example:
When the team is satisfied that they have answered all the relevant questions and gathered all relevant information, the team is then ready to identify potential root causes.

Step 3 - Root Cause Identification

Based on the Why Tree, the Triage Team reviews results and identifies the most probable root causes. The team ensures that the most probable root causes meet the following criteria:
Once the team is satisfied that they have identified the most probable root cause(s), they document their results. With this information, the team can then identify an Immediate CA. These actions can be taken immediately to help resolve the original CRP. Effectiveness checks are included as part of the Immediate CA.

Step 4 – Long Term Corrective Action

Once an Immediate CA is implemented and determined to be effective, the Triage Team decides if a Long Term CA is warranted. Usually, root causes that identify underlying systemic problems are good candidates. Also, once root causes are identified, they should be added to a list, as illustrated below:

Example Root Cause List
The Pareto Principle tells us that, in many cases, 80% of all problems result from only 20% of root causes. Performing a Pareto Analysis based on the Root Cause List can help determine what areas should be the focus of Long Term CA in order to keep the ROI high. The following example illustrates a simple Pareto Analysis of observed root causes and their associated CRPs.

Example of Simple Pareto Analysis of Observed Root Causes
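The calculation behind a Pareto Analysis is simple enough to sketch in a few lines of Python. This is a rough illustration only, assuming the Root Cause List records one root cause per CRP. The function name `pareto`, the root cause descriptions, and every count below are invented for the purpose of the example and are not data from any real project.

```python
from collections import Counter


def pareto(root_causes):
    """Rank root causes by CRP count, with the cumulative share of all CRPs."""
    counts = Counter(root_causes)
    total = sum(counts.values())
    rows = []
    cumulative = 0
    # most_common() returns causes ordered from highest to lowest count.
    for cause, count in counts.most_common():
        cumulative += count
        rows.append((cause, count, round(100 * cumulative / total, 1)))
    return rows


# Hypothetical data: the root cause assigned to each of 20 CRPs.
observed = (
    ["#1 Incomplete requirements"] * 3
    + ["#2 Missing installation test cases"] * 12
    + ["#3 Unclear user documentation"] * 4
    + ["#4 Fulfillment error"] * 1
)

for cause, count, cumulative_pct in pareto(observed):
    print(f"{cause:<40} {count:>3} {cumulative_pct:>6}%")
```

With these invented counts, the top-ranked root cause alone accounts for 60% of the CRPs and the top two together account for 80%, which is the kind of concentration that makes a Long Term CA worthwhile.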
From this analysis it is clear that addressing Root Cause #2 with a Long Term CA would have the highest ROI. The Triage Team would identify and propose a Long Term CA and present recommendations to Management. Included with this are effectiveness checks. Once implemented, data is collected and reviewed by the Triage Team to ensure that the systemic issues have been effectively eliminated.

In Summary…

Incorporating RCA into your Triage Process can lead to several benefits:
By incorporating Root Cause Analysis into your Triage Process, the resolution of your CRPs will be more effective and your customers will certainly be happier.

Till next time…
Every month in this space you’ll find additional information related to this month’s topic.
Every month, you’ll find news here about local and national events that are of interest to the software community …
Software Quality Consulting provides consulting, training, and auditing services tailored to meet the specific needs of clients. We help clients fine-tune their software development processes and improve the quality of their software products. The overall goal is to help clients achieve Predictable Software Development™ – so that organizations can consistently deliver quality software with promised features in the promised timeframe. To learn more about how we can help your organization, visit our web site or send us an email.
I hope this newsletter has been informative and helpful. Your comments and feedback are most welcome. Send me your feedback… Thanks,