
Corrective Action "Whac-A-Mole"

While “Corrective Action” has traditionally been considered an essential component for any quality management system, surprisingly few quality professionals have a good understanding of it. This is most often because so many quality professionals lack a basic understanding of “Common Cause” vs “Assignable Cause” (aka “Special Cause”) variation in a process.

These concepts were first described by Walter A. Shewhart in his 1931 book, “Economic Control of Quality of Manufactured Product”, and promoted by the Western Electric Company in its 1956 book, “Statistical Quality Control Handbook” (1st edition). Later, these concepts were popularized by W. Edwards Deming in his 1982 book, “Out of the Crisis: Quality, Productivity and Competitive Position”. Unfortunately, there are still many quality professionals (and standards writing bodies) who remain oblivious to these concepts!

"Common Cause" vs “Assignable Cause” (aka “Special Cause”) Variation

Imagine that you frequent a particular restaurant. The service and food are typically great, but occasionally you find that the silverware (rolled in a napkin) is missing a piece… or has two of the same piece while missing the third piece (e.g., two forks and no knife). Whenever this happens, you simply inform the server, who immediately provides you another napkin neatly rolled around all of the required silverware. The “nonconforming” condition is quickly and easily corrected.

Common Cause: usual / normal, quantifiable, random variation
Assignable Cause: unusual / abnormal, not previously observed, non-quantifiable variation

The staff is well trained and rarely makes these mistakes. If one were to initiate a “corrective action” to address this situation, it would quickly prove futile because there is NO “assignable” cause to be eliminated. This type of error is a normal, random (common cause) variation in the restaurant's process due to the volume of silverware that is washed, sorted, and manually rolled into napkins by the staff.

However, one night you arrive and order dinner, only to find that the food takes much longer than normal to arrive at your table, has been over-cooked, and is cold. You complain to the server, who apologizes and explains that their chef is unexpectedly absent tonight due to an illness, and that the timing is terrible because the assistant chef is traveling on vacation. Consequently, a replacement chef, who is unfamiliar with both their kitchen and menu, had to be brought in to fill this temporary need.

Unlike the silverware issue, this is an unusual/abnormal situation that has not previously occurred. Therefore, this would be a “special cause” variation in their process with an “assignable” cause. Special causes produce systematic effects/errors in a process.

The Minitab® 18 Support web site offers “Examples of common-cause and special-cause variation” in a table (provided below with a few minor improvements).

| Process | Common Cause of Variation | Special Cause of Variation |
| --- | --- | --- |
| Baking a loaf of bread | The control loop in the oven's thermostat allows the temperature to drift up and down slightly. | Changing the oven's temperature or repeatedly opening the oven door during baking can cause the temperature to fluctuate needlessly. |
| Recording customer contact information | An experienced operator makes an occasional error. | An untrained operator new to the job makes numerous data-entry errors. |
| Injection molding of plastic toys | Slight variations in the plastic resin from a supplier result in minor variations in product strength from batch to batch. | Changing to a less reliable plastic resin supplier leads to an immediate shift in the strength and consistency of your final product. |

As shown in the above examples, a control chart isn't always needed to differentiate between a “Common Cause” and “Assignable Cause” variation. However, because far too many quality professionals fail to differentiate between “Common Cause” and “Assignable Cause” variations in a process, a very large number of nonconformities from “Common Cause” variations are incorrectly addressed through a corrective action process. This leads to a cycle resembling the 1976 arcade game, “Whac-A-Mole” (where moles pop up from their holes at random, and the player earns points by forcing them back into their hole by hitting them directly on the head with a mallet). In the end, nothing is accomplished… other than the player feeling a false sense of accomplishment reflected by their score. In this case, the quality team “feels” good (a false sense of accomplishment) about the apparent (short-term) success of each corrective action.
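When the distinction is not obvious from context, a control chart makes it explicit. The following is a minimal sketch, not a complete SPC implementation. It assumes a c-chart (counts of defects per night, modeled as Poisson) with 3-sigma limits. The nightly counts are invented for illustration, and the function names `c_chart_limits` and `classify` are hypothetical.

```python
import math


def c_chart_limits(baseline_counts):
    """Return (lower, center, upper) 3-sigma limits for a c-chart."""
    center = sum(baseline_counts) / len(baseline_counts)
    sigma = math.sqrt(center)  # Poisson assumption: variance equals the mean
    return max(0.0, center - 3 * sigma), center, center + 3 * sigma


def classify(count, limits):
    """Label a single count using only the 'point beyond the limits' rule."""
    lower, _, upper = limits
    if lower <= count <= upper:
        return "common cause"
    return "special cause signal"


# Hypothetical data: mis-rolled silverware sets found per night over 20 nights.
baseline = [2, 1, 3, 0, 2, 4, 1, 2, 3, 1, 2, 0, 3, 2, 1, 2, 4, 1, 2, 3]
limits = c_chart_limits(baseline)

for label, count in [("typical night", 3), ("unusual night", 9)]:
    print(label, count, classify(count, limits))
```

A count inside the limits is treated as part of the process's normal variation, so opening a corrective action for it would be a swing of the mallet. A point beyond the limits is only one possible signal. The Western Electric handbook describes additional pattern tests that this sketch leaves out.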

Reactive vs. Proactive Process Improvement

Reactive Process Improvement

In order to properly implement a “corrective action”, there must be an “assignable cause”. And in order for there to be an “assignable cause”, the nonconforming condition must be the result of a “special cause” variation (i.e., an “unnatural pattern” in the process, which is unusual, not previously observed, non-quantifiable variation in the process).

Consequently, because “special causes” produce systematic effects/errors in a process, properly implemented “corrective actions” are reactive process improvements… refining an existing process.

Proactive Process Improvement

Because “common cause” variations are inherent to the existing process, these variations can only be reduced or eliminated by “modifying” or “re-designing” the existing process. Proactive process improvements are most often realized through a “risk management” system (often mitigating the probability/occurrence and/or severity/consequences of nonconformities rather than eliminating them). The most effective process improvement activities are realized by applying the Lean Six Sigma concepts & methodologies.

What about "Zero Defects"?

Ironically, many traditional quality professionals continue to be obsessed with achieving “Zero Defects”, either ignoring or failing to understand the difference between “Common Cause” and “Assignable Cause” (aka “Special Cause”) variations!

“Zero Defects” is a motivational management approach that first appeared in the “Quality and Reliability Assurance Handbook – A Guide to Zero Defects” (4155.12H), published by the U.S. Department of Defense on November 1, 1965. The handbook explained: “Zero Defects is a motivational approach to the elimination of defects attributable to human error”. It continued: “Zero Defects is dedicated to preventing defects by detecting and removing the causes of their generation. This is an attempt to reverse the unquestioning acceptance of human error as a normal byproduct of personal effort”.

However, the “Zero Defects” concept didn’t gain widespread popularity until it was promoted by Philip B. Crosby, in his book “Quality is Free” (1979). Crosby is credited with having developed the “Zero Defects” concept during the early ‘60s while working at the Martin Company as the quality control manager overseeing the Pershing missile program.

This is diametrically opposed to the philosophy and teachings of W. Edwards Deming, who repeatedly showed that, no matter how vigilant the employees, every process contains inherent (i.e., natural “Common Cause”) variations resulting in defects. This was most popularly demonstrated through Deming’s “Red Bead Experiment”. In fact, point 10 of Deming's 14 points specifically rejects the “Zero Defects” concept, stating: “Eliminate slogans, exhortations, and targets for the work force asking for zero defects and new levels of productivity. Such exhortations only create adversarial relationships, as the bulk of the causes of low quality and low productivity belong to the system and thus lie beyond the power of the workforce”.
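The point of the Red Bead Experiment can be reproduced with a short simulation. This is a sketch under stated assumptions: the bowl of 800 red and 3,200 white beads, the 50-hole paddle, and the six workers over four days follow the commonly described setup, and the function name `red_bead_draws` is hypothetical. Every draw uses the same process, so any difference between workers is chance alone.

```python
import random


def red_bead_draws(n_draws, seed=0, red=800, white=3200, paddle=50):
    """Simulate paddle draws; return the number of red beads in each draw."""
    rng = random.Random(seed)
    bowl = [True] * red + [False] * white  # True marks a red ("defective") bead
    # Each draw samples `paddle` beads without replacement from the full bowl.
    return [sum(rng.sample(bowl, paddle)) for _ in range(n_draws)]


# Six "willing workers", four days each: identical process, identical effort.
counts = red_bead_draws(24)
for worker in range(6):
    print(f"worker {worker + 1}: {counts[worker * 4:(worker + 1) * 4]}")
print("average red beads per draw:", sum(counts) / len(counts))
```

With one bead in five being red, the long-run average is about 10 red beads per paddle, yet individual draws scatter around that figure. Ranking, rewarding, or exhorting the workers changes nothing, because the defects come from the bowl (the system) and not from the people.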

The "Human Factors" in Cause & Effect Analysis

While we're on the topic of “Human Errors”, these are very much prevalent in the Cause & Effect Chain Analysis process.

First, the individual(s) performing the Cause & Effect Chain Analysis have their own bias for what the “cause(s)” should be. Like it or not, a “Blame Game” often occurs during this process. No individual or manager wants to be the target of that “negative” perception. So “finger pointing” often ensues.

Another important “Human Factor” is that humans have evolved to “think” in a linear fashion. Consequently, people have difficulty making sense of non-linear events that have branches and/or parallel causes/effects. Read any Corrective Action Report and you will see a linear story detailing the events in a sequence… omitting any parallel causes/effects.

Sadly, most Quality Professionals have not been taught that every event must have at least one condition and one action (i.e., “Apollo Root Cause Analysis”). Similarly, most Quality Professionals have also not been taught that upon identifying a “Cause - Effect” relationship, the “Effect” now becomes the “Cause” for the next step in the analysis. In other words, whether something is a “Cause” or “Effect” depends upon the analyst's perspective.
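The two ideas above (every effect needs at least one action and one condition, and each cause is itself an effect to be analyzed) describe a tree rather than a straight line. Below is a minimal sketch of that structure. The `Cause` class, the helper names, and the action/condition labels assigned to the restaurant example are illustrative assumptions and are not taken from any formal method definition.

```python
from dataclasses import dataclass, field


@dataclass
class Cause:
    description: str
    kind: str  # "event" for the starting problem, else "action" or "condition"
    causes: list = field(default_factory=list)  # this node viewed as an effect


def incomplete_effects(node):
    """Return analyzed effects that lack either an action or a condition."""
    missing = []
    if node.causes:
        kinds = {cause.kind for cause in node.causes}
        if not {"action", "condition"} <= kinds:
            missing.append(node.description)
        for cause in node.causes:
            missing.extend(incomplete_effects(cause))
    return missing


def chains(node):
    """Enumerate every path from this effect to where the analysis stopped."""
    if not node.causes:
        return [[node.description]]
    return [[node.description] + tail
            for cause in node.causes
            for tail in chains(cause)]


meal = Cause("Meal arrived late, over-cooked, and cold", "event", [
    Cause("Replacement chef prepared the meal", "action", [
        Cause("Head chef called in sick", "action"),
        Cause("Assistant chef was away on vacation", "condition"),
    ]),
    Cause("Replacement chef was unfamiliar with kitchen and menu", "condition"),
])

print(incomplete_effects(meal))
for chain in chains(meal):
    print(" <- ".join(chain))
```

A purely linear report corresponds to a tree in which every node has a single cause. `incomplete_effects` would flag each of those nodes, which is a quick way to spot where parallel causes were left out.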

Also, the simple “5 Whys” method only goes to the point of the analyst's “ignorance”… NOT the mythical “Root Cause”. In other words, the analyst asks “Why?” until they cannot answer the question. Out of ideas, they assume that this is the “Root Cause”… when in reality, they have merely reached their “Point of Ignorance”. While the analyst could continue the analysis by obtaining participation from one or more people (Subject Matter Experts) who can answer that question, they too would eventually encounter their “Point of Ignorance”.

Consequently, the “5 Whys” method leads to untrained analysts ignoring multiple links in the cause-and-effect chain… many of which would prevent the recurrence of the nonconformity.

Conclusion

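Upon understanding “Common Cause” and “Assignable Cause” variations in a process, one soon realizes that these are both associated with “risk management”. And the most popular tools used for addressing quality-related risk are the Failure Mode and Effects Analysis (FMEA), the Failure Mode, Effects and Criticality Analysis (FMECA), and the “Risk Register”.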

The FMEA & FMECA are used for managing risks related to a single event project or design. A “Risk Register” is used (and maintained) for addressing ongoing process risks. Sometimes, Project Managers will use a “Risk Register” for a long-term project so as to address updated risk mitigation controls.

Some people have attempted to apply the single event FMEA to ongoing processes by simply renaming it “Process Failure Mode and Effects Analysis” (PFMEA). This doesn't work because the FMEA is designed for a single-event project or design. In contrast, a “Risk Register” is “maintained” through continuous updates as improvements are made to the process.
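To make the “maintained” aspect concrete, here is a minimal sketch of a risk register entry that is re-scored as controls are added. The 1 to 5 rating scales, the probability-times-severity score, the class and method names, and the example control are all assumptions chosen for illustration. Real registers vary in their scales and fields.

```python
from dataclasses import dataclass, field


@dataclass
class Risk:
    description: str
    probability: int  # 1 (rare) to 5 (frequent); scale is an assumption
    severity: int     # 1 (negligible) to 5 (critical); scale is an assumption
    controls: list = field(default_factory=list)

    @property
    def score(self):
        """Simple priority score: probability multiplied by severity."""
        return self.probability * self.severity

    def add_control(self, control, probability=None, severity=None):
        """Record a mitigating control and any re-assessed ratings."""
        self.controls.append(control)
        if probability is not None:
            self.probability = probability
        if severity is not None:
            self.severity = severity


register = [
    Risk("Silverware roll missing a piece", probability=3, severity=1),
    Risk("Head chef and assistant chef both unavailable", probability=1, severity=4),
]

# Hypothetical process change: lowers the likelihood, does not eliminate it.
register[0].add_control("Rolling station uses a three-slot tray", probability=2)

for risk in sorted(register, key=lambda r: r.score, reverse=True):
    print(risk.score, risk.description, risk.controls)
```

The entry for the silverware error stays in the register after the control is added. Its rating is lowered rather than closed out, which reflects mitigation of a common cause variation instead of a one-time “fix”.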

IF a nonconformity is due to a “Common Cause” variation, then “controls” can be put in place to “mitigate” the risk of the nonconformity recurring. However, IF a nonconformity is due to an “Assignable Cause” variation, then the process could possibly be changed to “eliminate” the cause and so prevent the nonconformity from recurring (i.e., reducing the probability OR impact of a risk to zero).

Upon completing an FMEA/FMECA or Risk Register, one quickly realizes just how few “true” corrective actions are actually possible. And that by shifting the mindset to “risk management”, rather than being obsessed with “risk elimination” (aka “Corrective Action”), significant quality improvements can finally be realized.