Special Cause Variation — Eliminate It Before Improving the System
Joiner’s rule, drawn from Deming and Shewhart: eliminate special cause variation before attempting to improve the system. Acting on a special cause as if it were a systemic problem, or ignoring it and “improving” a system that is not yet stable, produces worse outcomes than doing nothing. The Bootstrap CUSUM step-change framework is built on this distinction.
Two types of variation — and why the distinction matters
Every process produces variation. The critical question is not whether variation exists but what kind it is. Shewhart’s original insight, developed by Deming and then by Joiner, is that variation comes from two fundamentally different sources — and they require fundamentally different responses.
| Type | What it is | Source | Correct response |
|---|---|---|---|
| Common cause variation | The normal, expected variation inherent in a stable system. The process is doing exactly what it is designed to do — and this is the result. Predictable within a range. | The system itself — its design, its inputs, its structural conditions | Do not tamper. To reduce it, you must change the system. See Joiner’s approach to common cause variation. |
| Special cause variation | An unexpected, non-random signal that something outside the normal system has occurred. A spike, a shift, a one-off event. The process is behaving differently from its normal pattern. | A specific, identifiable cause — an event, an error, a change in conditions — that is external to the system’s normal operation | Identify and address the specific cause. Do not change the system to accommodate it. |
The consequences of confusing the two are severe. Treating common cause variation as if it were a special cause — reacting to every dip and spike — is tampering: it adds variation to a stable system and makes performance worse. Treating special cause variation as if it were common cause — accepting an unusual event as “just noise” — means missing a signal that something specific has gone wrong or, occasionally, right.
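The cost of tampering can be illustrated with a small simulation. This is a minimal sketch, not a model of any real process: it assumes a stable process with purely random noise around a target, and a hypothetical "tamper" rule that shifts the process setting to compensate for every deviation (in the spirit of Deming's funnel experiment). The function name, parameters, and noise level are all illustrative assumptions.

```python
import random
import statistics

def simulate(n_points, tamper, seed=0, sigma=1.0, target=0.0):
    """Simulate a stable process around `target`.

    If `tamper` is True, the setting is adjusted after every point to
    compensate for the latest deviation, as if it were a special cause.
    """
    rng = random.Random(seed)
    setting = target              # where the process is currently aimed
    results = []
    for _ in range(n_points):
        value = setting + rng.gauss(0.0, sigma)   # common cause noise only
        results.append(value)
        if tamper:
            # Over-reaction: move the setting by the size of the last miss.
            setting -= value - target
    return results

left_alone = simulate(5000, tamper=False)
tampered = simulate(5000, tamper=True)

print("spread when left alone:", round(statistics.stdev(left_alone), 2))
print("spread when tampered:  ", round(statistics.stdev(tampered), 2))
```

Under these assumptions each tampered result carries two noise terms (the current one and the previous one, which the adjustment fed back in), so the spread is expected to grow by a factor of roughly the square root of two rather than shrink.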
What special cause variation looks like
Special cause variation appears as a data point or sequence of data points that cannot be explained by the system’s normal behaviour. In Bootstrap CUSUM terms, a sustained special cause shows up as a shift in the process mean (a change point) that is attributable to a specific, identifiable external event rather than to a structural change in the system itself.
🔎 Common forms of special cause variation
Joiner’s rule — the correct sequence
Joiner states the rule precisely on page 138 of Fourth Generation Management: eliminate special cause variation before attempting to improve a stable system. The sequence matters. Trying to redesign a system that is not yet stable — that is still experiencing special cause events — produces unpredictable results, because you cannot measure the effect of your intervention against a baseline that keeps moving.
The Joiner sequence
Step 1 — Detect. Is the variation in your data common cause or special cause? Use Bootstrap CUSUM or a control chart to determine whether the process is stable (common cause only) or unstable (one or more special causes present).
Step 2 — Address special causes first. If special causes are present, identify each one and address it specifically — don’t change the system to accommodate it. Eliminate or standardise the special cause so the system returns to stable, predictable behaviour.
Step 3 — Confirm stability. Once special causes have been addressed, confirm the process is now stable — fluctuating within a predictable range with no further change points. Bootstrap CUSUM on recent data should show a flat line.
Step 4 — Then improve the system. Only now is a system-level improvement meaningful. You have a stable baseline to measure against. Any structural change you introduce will produce a detectable change point — and you will know it was your intervention that caused it, not a residual special cause.
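Steps 1 and 3 mention a control chart as an alternative to Bootstrap CUSUM. The article does not say which chart, so the sketch below assumes an individuals (XmR) chart, which sets limits at the mean plus or minus 2.66 times the average moving range. The function names and the sample figures are hypothetical.

```python
import statistics

def xmr_limits(baseline):
    """Return (lower limit, centre line, upper limit) for an XmR chart."""
    centre = statistics.fmean(baseline)
    # Moving range: absolute difference between consecutive points.
    moving_ranges = [abs(b - a) for a, b in zip(baseline, baseline[1:])]
    mean_mr = statistics.fmean(moving_ranges)
    spread = 2.66 * mean_mr       # standard XmR scaling constant (3 / 1.128)
    return centre - spread, centre, centre + spread

def points_outside_limits(baseline, new_points):
    """Indices of new points that fall outside the baseline's limits."""
    lower, _, upper = xmr_limits(baseline)
    return [i for i, x in enumerate(new_points) if x < lower or x > upper]

# Hypothetical baseline from a period believed to be stable.
baseline = [10, 12, 11, 13, 12, 11, 10, 12, 11, 12]
print(xmr_limits(baseline))
print(points_outside_limits(baseline, [11, 25, 12]))
```

Points inside the limits are treated as common cause variation and left alone. A point outside them is a candidate special cause to investigate before any system-level change is attempted.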
If you attempt a system improvement while special causes are still present, you cannot interpret the result. Suppose you introduce a new discharge protocol and the following month’s data improves significantly. Was that your protocol — or the resolution of the industrial action that had been suppressing performance for the previous three months? Without first eliminating the special cause and stabilising the baseline, you cannot answer that question. Attribution is impossible. The improvement programme gets credited for a change it may not have caused, and the underlying structural issue remains unaddressed.
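The attribution problem can be made concrete by listing every known event that falls close to a detected change point. This is a rough sketch under stated assumptions: the event log, the dates, the 45-day window, and the function name are all hypothetical, chosen only to mirror the discharge protocol example above.

```python
from datetime import date, timedelta

def candidate_causes(change_point, events, window_days=45):
    """List event names dated within `window_days` of the change point, nearest first."""
    window = timedelta(days=window_days)
    nearby = [name for name, when in events.items()
              if abs(when - change_point) <= window]
    return sorted(nearby, key=lambda name: abs(events[name] - change_point))

# Hypothetical event log and change point date, for illustration only.
events = {
    "new discharge protocol introduced": date(2024, 6, 1),
    "industrial action resolved": date(2024, 5, 20),
    "ward refurbishment": date(2023, 11, 1),
}
change_point = date(2024, 5, 27)

candidates = candidate_causes(change_point, events)
if len(candidates) > 1:
    print("Attribution is ambiguous between:", candidates)
```

When more than one event sits inside the window, the data alone cannot say which of them caused the shift. That is the situation the Joiner sequence is meant to avoid.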
How Bootstrap CUSUM detects special cause variation
Bootstrap CUSUM is designed specifically to detect structural shifts — sustained moves from one stable level to another. Its relationship to special cause variation is precise:
- Single spikes are generally not detected as change points. The cumulative sum algorithm needs a sustained shift to cross the detection threshold. A one-off extreme value contributes to the cumulative sum briefly but rarely triggers a change point by itself. This is a feature, not a limitation: it limits false positives from one-off events.
- Sustained special causes are detected as change points. If a special cause produces a lasting shift (a ward closed, a new system introduced, a key dependency removed), Bootstrap CUSUM is designed to detect it as a genuine change point. The investigation then determines whether the cause is a structural improvement or a specific addressable event.
- The change point date localises the special cause. Bootstrap CUSUM returns a date. That date is the starting point for identifying what changed. A special cause that predates the improvement programme by months is not the programme’s doing — the date tells you that.
- Multiple change points may indicate multiple special causes. If Bootstrap CUSUM returns two or three change points in a short period, this may indicate a system that has been subject to several distinct special causes rather than a single structural improvement. Each change point deserves its own investigation.
If Bootstrap CUSUM returns no change point, that is a finding: the system is stable. It may be stable at an unacceptably poor level — in which case the task is system redesign (see common cause variation). Or it may be stable at an acceptable level — in which case the task is to hold it there and resist the temptation to tamper. A flat line from Bootstrap CUSUM on a stable system is not a failure of the method. It is the honest answer to an honest question.
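The article does not spell out the algorithm behind the StepChange Analyzer, so the following is only a sketch of one common formulation of bootstrap CUSUM for a single change point. It assumes the usual recipe: take the cumulative sum of deviations from the mean, measure its range, and compare that range against the ranges from many random reorderings of the same data. All function names, the number of reshuffles, and the sample series are assumptions.

```python
import random
import statistics

def cusum(values):
    """Cumulative sum of deviations from the series mean, starting at 0."""
    mean = statistics.fmean(values)
    total, path = 0.0, [0.0]
    for x in values:
        total += x - mean
        path.append(total)
    return path

def cusum_range(values):
    path = cusum(values)
    return max(path) - min(path)

def bootstrap_cusum(values, n_boot=1000, seed=0):
    """Return (confidence, index) for a single candidate change point.

    confidence: share of reshuffled series whose CUSUM range is smaller
                than the observed range.
    index:      zero-based position of the last point before the shift.
    """
    rng = random.Random(seed)
    observed = cusum_range(values)
    shuffled = list(values)
    smaller = 0
    for _ in range(n_boot):
        rng.shuffle(shuffled)     # reordering destroys any time structure
        if cusum_range(shuffled) < observed:
            smaller += 1
    path = cusum(values)
    # path[i] is the sum after i points; its extreme marks the end of the old level.
    peak = max(range(1, len(path)), key=lambda i: abs(path[i]))
    return smaller / n_boot, peak - 1

# Hypothetical series: 30 points around 10, then 30 points around 14.
rng = random.Random(1)
series = [rng.gauss(10, 1) for _ in range(30)] + [rng.gauss(14, 1) for _ in range(30)]
confidence, index = bootstrap_cusum(series, n_boot=500, seed=2)
print(confidence, index)
```

A high confidence value means the observed pattern is hard to reproduce by shuffling, which points to a sustained shift. The returned index can be mapped back to a date as the starting point for the investigation. A low confidence value corresponds to the "no change point" finding described above.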
The two mistakes — and their consequences
| Mistake | What it looks like | Consequence | Joiner’s term |
|---|---|---|---|
| Mistake 1: Treating common cause as special cause | Reacting to every dip in the data with an intervention. Calling a meeting every time a metric falls below target. Changing the process in response to normal variation. | Tampering. Each reaction adds variation to the system. Performance becomes less predictable, not more. Staff learn that the response to data is always an intervention — so data reporting becomes political rather than analytical. | Tampering with a stable system |
| Mistake 2: Treating special cause as common cause | Accepting an unusual event as “just noise.” Not investigating a sustained shift. Attributing a genuine change point to random variation because it is inconvenient to investigate. | Missing a signal. An improvement goes uncredited and unstandardised; it drifts back. A deterioration goes unaddressed; it compounds. Special causes that are not identified recur. | Ignoring a signal in a changing system |
Deming estimated that the majority of management interventions in organisations he studied were Mistake 1 — reactions to common cause variation treated as if they were special causes. The result was systems that were more variable than they needed to be, with staff who had learned that numbers trigger reactions regardless of whether those numbers contained a real signal.
Once special causes are eliminated — what next
Once special cause variation has been identified and addressed, and the process is confirmed stable, you face a different question: is the stable level acceptable? Two paths follow.
▶ From stable system — two paths
Test your data for special cause variation
Upload your time-series data to the StepChange Analyzer. Bootstrap CUSUM will detect whether a structural shift — a special cause step change — is present, and will date it precisely so you can investigate the cause.