FEATURED ARTICLES

Classical versus Machine Powered Algorithms

The BIS.Net Team

Background

Although the term Statistical Process Control covers more than control charts, this article focuses only on statistically based control charts, more recently also known as process behavior charts.

SPC charts were introduced by Dr. Walter Shewhart in the 1920s, based on six years of investigation to “develop a scientific basis for attaining economic control of quality of manufactured product through the establishment of control limits to indicate at every stage in the production process from raw material to finished product when the quality of product is varying more than is economically desirable”. Shewhart used the control chart in particular to identify assignable causes, which are economically undesirable. Shewhart and his engineers realized that reacting to common causes that are inherent to the process is futile and can increase variation rather than reduce it.

Over the years since, many others such as W.E. Deming popularized the use of control charts to reduce variation and improve processes. Almost one hundred years later the technology remains virtually unchanged, in a ‘time capsule’. Even though today's processes are far more complicated than those in the limited environment of Bell Laboratories and the Western Electric Company early last century, the application of control charts has been somewhat dogmatically enforced without regard to suitability.

Notably, the focus of this article is the dependency on normality. Shewhart, in his book ‘Economic Control of Quality of Manufactured Product’, claims that even for a sample size of 4 the distribution of averages is approximately normal. It has thus been generally accepted that Shewhart X-bar charts, and later Moving Average and Exponentially Weighted Moving Average (EWMA) charts, are robust to even large departures from normality. The assumption that averages approximate the normal distribution is based on the central limit theorem.

Walter Shewhart was a practical man, as evidenced by his statements on control chart criteria: “Furthermore, we may say that mathematical statistics as such does not give us the desired criterion. What does this situation mean in plain everyday engineering English? Simply this: such criteria, if they exist cannot be shown to exist by theorizing alone, no matter how well equipped the theorist is in respect to probability theory. We see in this situation the long recognized dividing line between theory and practice. ……. In other words, the fact the criterion which we happen to use has a fine ancestry of high brow statistical theorems does not justify its use. Such justification must come from practical evidence that it works. As the practical engineer might say, the proof in the pudding is in the eating.”

For control charts, the criterion for detecting assignable cause variation is points falling outside control limits, with those limits based on an assumption of normality for X-bar, Individuals, Moving Average, EWMA and Modified control charts. The assumption of normality as part of this criterion has been accepted over the years, possibly because an alternative criterion was not available, and in some cases because of the ‘fine ancestry of high brow statistical theorems’, in this case introduced by Walter Shewhart himself. Yet Walter Shewhart would likely be the first to modify his approach if it could be proven that departure from normality does affect the performance of control charts. To requote: “the proof in the pudding is in the eating.”

The eating of the pudding

Let us examine the eating of the pudding in relation to the claim that x-bar charts are robust to non-normality, by investigating the effect of non-normality on false alarms with three different distributions, all from an IN-CONTROL process.

The following X-bar chart, shown in Figure 2, was obtained from data following the distribution in Figure 1.

Figure 1: A highly skewed distribution
Figure 2: X-bar & Range Chart applied to the highly skewed distribution with sub-group size of 5.

Although this example was an extreme case ‘cherry picked’ to emphasize the problem, these types of distributions have been encountered many times in our experience.

From 500 points the X-bar chart has 16 out-of-control points and 3 warning limit violations. The Range chart has 45 out-of-control points. The Range chart behaves even more poorly, as expected, since there is no averaging.
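The effect is easy to reproduce with a short simulation. The sketch below is a minimal Python illustration, assuming numpy is available; it uses a lognormal distribution as a stand-in for a highly skewed in-control process (not the exact data behind Figures 1 to 3), computes classical X-bar and Range limits from the usual constants for subgroups of 5, and counts points beyond the limits. Exact counts vary from run to run, but they sit well above the nominal rate.

```python
import numpy as np

rng = np.random.default_rng(1)

n, k = 5, 500                                              # subgroup size, number of subgroups
data = rng.lognormal(mean=0.0, sigma=1.0, size=(k, n))     # in control, but highly skewed

xbar = data.mean(axis=1)                                   # subgroup averages
subgroup_range = data.max(axis=1) - data.min(axis=1)       # subgroup ranges

# Classical Shewhart chart constants for subgroups of n = 5
A2, D3, D4 = 0.577, 0.0, 2.114

xbar_bar, r_bar = xbar.mean(), subgroup_range.mean()
ucl_x, lcl_x = xbar_bar + A2 * r_bar, xbar_bar - A2 * r_bar
ucl_r, lcl_r = D4 * r_bar, D3 * r_bar

print("X-bar points outside limits:", int(np.sum((xbar > ucl_x) | (xbar < lcl_x))))
print("Range points outside limits:", int(np.sum((subgroup_range > ucl_r) | (subgroup_range < lcl_r))))
```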

This alone demonstrates that the x-bar chart is not robust to non-normality to the degree that has been claimed, even by Walter Shewhart.

Figure 3 shows the distribution of the averages. Averaging did not significantly normalize the data.

Figure 3: The distribution of averages of five.

Figure 4 shows an X-bar chart obtained from the more moderately non-normal distribution shown in Figure 5.

Figure 4: X-bar & Range Chart applied to a more moderately skewed distribution with sub-group size of 5.
Figure 5: A more moderately skewed non-normal distribution compared with Figure 1.

Figure 6 shows that averaging normalized this distribution to a greater extent than it did the distribution in Figure 1. Nevertheless, out of 500 points the X-bar chart had 8 out-of-control points and the Range chart 12.

Figure 6: The distribution of averages of five based on the Figure 5 distribution data.

Now, as a final demonstration that the X-bar chart is not robust to non-normality, data from a simulated logistic distribution was used to obtain the X-bar/Range chart shown in Figure 7. The logistic distribution resembles a normal distribution in that it is bell shaped and has no skew; the difference is in the tails, which are heavier than those of a normal distribution. Figure 8 shows how closely the distribution resembles a normal distribution. Figure 9 shows that it nonetheless differs from a normal distribution at the tails. This alone was enough to cause problems with both charts.

Figure 7: X-bar & Range Chart applied to a zero skewed non-normal distribution with sub-group size of 3.

In this instance 6 out-of-control points were shown for the X-bar chart and 9 for the Range Chart.

Figure 8: A non-normal distribution that looks like a normal distribution.
Figure 9: The probability plot shows that the distribution is not normal.
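The heavier tails of the logistic distribution can be quantified directly. The short sketch below, assuming scipy is available, compares the probability that a single observation falls more than three standard deviations above the mean for a logistic distribution versus a normal distribution; the logistic tail probability works out to roughly three times the normal .00135.

```python
import numpy as np
from scipy.stats import logistic, norm

# Logistic distribution scaled to unit standard deviation: sd = scale * pi / sqrt(3)
scale = np.sqrt(3) / np.pi

p_logistic = logistic.sf(3.0, loc=0.0, scale=scale)   # P(X > mean + 3 sd), logistic
p_normal = norm.sf(3.0)                               # P(X > mean + 3 sd), normal = .00135

print(f"logistic tail: {p_logistic:.5f}   normal tail: {p_normal:.5f}")
```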

It is now appropriate to quote Shewhart from his book ‘Economic Control of Quality of Manufactured Product’: “It will have been observed that the factor c2 used in setting limits for standard deviation is based on the assumption that the samples are drawn from a normal universe, whereas in general, we know that this condition is not rigorously fulfilled. Furthermore, we have seen that the distribution function of both the average X bar and standard deviation sigma of samples of a given size depends upon the nature of the universe. Hence, the probabilities associated with the limits in the control charts for the average x-bar and the standard deviation sigma depend upon the universes from which the samples are drawn. Of course, the distribution of averages, even for samples of four, is approximately normal independent of the universe so that the probabilities associated with control charts for averages are closely comparable irrespective of the nature of the universes. This is not true, however, in respect to the distribution of standard deviations.”

The above examples demonstrate that the x-bar chart, and even more so the Range chart, is not robust to non-normality. This is to be expected: the central limit theorem states only that the distribution of averages of a continuous distribution approaches a normal distribution as the sample size approaches infinity. Typical sample sizes used in practice with X-bar charts are, however, very small. This conclusion therefore contradicts what Shewhart said about the distribution of averages being normal even for sample sizes as small as 4.

The above examples also demonstrate that the probabilities associated with control charts for averages are not closely comparable irrespective of the nature of the universes, even when the sub-group size is 5, i.e. higher than the 4 referred to by Shewhart. If the distribution is truly normal, the probability of exceeding the average + 3 * sigma/sqrt(5) is .00135, and the probability of falling below the average - 3 * sigma/sqrt(5) is also .00135. If the distribution follows the one in Figure 1, the probability of exceeding the average + 3 * sigma/sqrt(5) is .032 and the probability of falling below the average - 3 * sigma/sqrt(5) is practically zero. If the distribution follows the one in Figure 5, the probability of exceeding the average + 3 * sigma/sqrt(5) is practically zero and the probability of falling below the average - 3 * sigma/sqrt(5) is .013. If the distribution follows the one in Figure 8, which is symmetrical and resembles a normal distribution, the probability of exceeding the average + 3 * sigma/sqrt(5) is .0045, as is the probability of falling below the average - 3 * sigma/sqrt(5).
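Probabilities like these are straightforward to estimate by simulation. The sketch below, again assuming numpy and using a lognormal distribution as a stand-in for a highly skewed process (not the exact distribution of Figure 1), estimates the chance that an average of five in-control observations falls beyond the mean plus or minus three standard errors.

```python
import numpy as np

rng = np.random.default_rng(7)

n, trials = 5, 1_000_000
data = rng.lognormal(mean=0.0, sigma=1.0, size=(trials, n))   # highly skewed, in control

mu = np.exp(0.5)                               # true mean of the lognormal(0, 1)
sigma = np.sqrt((np.exp(1) - 1) * np.exp(1))   # true standard deviation
se = sigma / np.sqrt(n)                        # standard error of the average of 5

means = data.mean(axis=1)
p_upper = np.mean(means > mu + 3 * se)         # nominal .00135 if averages were normal
p_lower = np.mean(means < mu - 3 * se)         # nominal .00135 if averages were normal

print(f"upper tail: {p_upper:.5f}   lower tail: {p_lower:.5f}")
```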

In other words, a highly skewed distribution can on average have almost 24 times as many false alarms on one side of an x-bar chart as a normal distribution, whilst the other side has an almost zero chance of false alarms, making that side very insensitive for detecting assignable cause variation. Even for a non-normal curve which closely resembles a normal curve but has heavier tails, the false alarm rate beyond the average + 3 * sigma/sqrt(5) is more than 3 times as high, and likewise for the lower limit. For the in-between case the false alarm rate is about 10 times as high on one side and practically zero on the other. Other non-normal distributions will of course differ, and different subgroup sizes will give different results, so these results should not be generalized. They simply show that x-bar charts cannot be assumed to be robust to non-normality, especially with commonly used subgroup sizes such as 4 or 5.

Shewhart acknowledges that the condition of normality is not rigorously fulfilled. He acknowledges that the distributions of averages and standard deviations depend on the probability distribution of the underlying data. Yet one can say he seemingly contradicts himself by stating that even for small sample sizes the distribution of averages is approximately normal, and hence not practically dependent on the probability distribution of the data. There can be several reasons for this contradiction, all of which are of course speculation. It may simply be that he was not aware that the averages for commonly used subgroup sizes, such as 4 and 5, still depend on the underlying distribution. Shewhart lived in an era without today's computing power; he was unable to thoroughly study the effect of non-normality through, say, simulation. Shewhart may also have felt a need to downplay the effect of non-normality, knowing his critics would seize any opportunity to discredit his work, as is the case even today when new ideas are introduced.

One can only speculate. What is important is that Shewhart acknowledges that both the X-bar and standard deviation control limits depend on the underlying distribution. What is disputed is his statement that averages of four follow a normal distribution irrespective of the underlying distribution.

What is important is that the proof of the pudding was not found in the eating. Non-normality can have a significant effect, as has been shown above. The probabilities associated with control charts for averages cannot be assumed to be closely comparable irrespective of the nature of the universes. Simply inspecting the charts visually, knowing that all the data came from simulated in-control processes, is enough to conclude that the computed control limits are inadequate.

Even if Shewhart understood the extent of the problem, in Shewhart's day it would have been impossible to deal with non-normality. Even now, there are no formulae that can simply be applied manually as is possible with the standard x-bar chart. It is understandable, then, that until recently control chart technology has remained in a time capsule.

Thanks to modern computers it has now become possible to use Machine Power (the ‘number crunching’ power of the computer) to provide non-normal X-bar, Range, SD, Moving Average, EWMA and Modified control charts. Machine power alone of course does not provide the solutions; applying machine power to intelligently developed algorithms does. How these algorithms work is not important. Unlike simple formulae and functions they cannot be academically scrutinized from a theoretical perspective. The proof of the pudding remains in the eating, just as it was in Shewhart's day. The technology works, whereas the assumption of normality does not. To see how well the technology works, the data used for the above charts was reanalysed using Machine Powered Algorithms.
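For illustration only, the general idea behind distribution-based limits can be sketched as follows. This is not BIS.Net's proprietary machine powered algorithm, merely a simple percentile approach assuming scipy is available: fit several candidate distributions to the subgroup averages, keep the best fitting one, and place the limits at its .00135 and .99865 quantiles.

```python
import numpy as np
from scipy import stats

# Illustration of the general idea only -- not BIS.Net's proprietary algorithm.
rng = np.random.default_rng(3)
xbar = rng.lognormal(0.0, 1.0, size=(500, 5)).mean(axis=1)   # skewed, in-control averages

# Fit several candidate distributions and keep the best by log-likelihood
candidates = {stats.norm: {}, stats.lognorm: {"floc": 0},
              stats.gamma: {"floc": 0}, stats.weibull_min: {"floc": 0}}
fits = [(dist, dist.fit(xbar, **kw)) for dist, kw in candidates.items()]
best, params = max(fits, key=lambda f: np.sum(f[0].logpdf(xbar, *f[1])))

# Control limits at the .00135 / .99865 quantiles of the fitted distribution
lcl, ucl = best.ppf([0.00135, 0.99865], *params)
print(f"best fit: {best.name}   LCL = {lcl:.3f}   UCL = {ucl:.3f}")
print("points outside:", int(np.sum((xbar < lcl) | (xbar > ucl))))
```

On data that really is normal, the best fitting candidate is (or is close to) the normal distribution, so the quantile limits essentially coincide with the familiar 3 sigma limits, consistent with the behaviour described in the notes below.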

Figure 10 is the corresponding machine powered distribution optimized chart for the data that went into the Figure 2 chart. Whereas the standard control chart had 16 out-of-control points above the upper limit, the distribution optimized chart has only 2 above and 2 below (and those only closely hugging the limits), which is not unexpected for the chosen probability points of .00135 (equivalent to 3 sigma normal limits), taking ARL variability into account. The Range chart has only one out-of-control point, compared with 45 using standard x-bar and range charts.

Figure 10: Distribution Optimized X-bar & Range Chart applied to the same data used in Figure 2.

Figure 11 is the corresponding machine powered distribution optimized chart for the data that went into the Figure 4 chart. Whereas the standard control chart had 8 out-of-control points below the lower limit, the distribution optimized chart has only 2 above and 1 below, which again is not unexpected for the chosen probability points of .00135 (equivalent to 3 sigma normal limits). The Range chart has only 2 out-of-control points, compared with 12 using standard x-bar and range charts.

Figure 11: Distribution Optimized X-bar & Range Chart applied to the same data used in Figure 4.

Figure 12 is the corresponding machine powered distribution optimized chart for the data that went into the Figure 7 chart. Whereas the standard control chart had 3 out-of-control points below the lower limit and 3 above the upper limit, the distribution optimized chart has only 1 above. The Range chart has only 2 out-of-control points, compared with 9 using standard x-bar and range charts.

Figure 12: Distribution Optimized X-bar & Range Chart applied to the same data used in Figure 7.

Notes

  • The demonstrated charts, which may be considered ‘cherry picked’, are snapshot examples. A rerun of the simulations will show different results; the number of out-of-control points is itself a random variable for an in-control process. The degree of non-normality in the averages depends on the sample size and the underlying distribution. Sometimes the difference between standard and optimized x-bar control limits may be negligible, and it is impossible to predict when. The Range charts, however, fail badly whenever the data is non-normal.
  • A fortuitous fact is that the distribution optimized machine powered technology, when applied to data known to be normally distributed, will produce practically the same control limits as standard x-bar charts. The Analyst can thus be confident that the technology will work with normal and non-normal data. If a distribution cannot be fitted, the process is usually out of control to a level where control limits cannot be accurately determined; in that case a non-standard X-bar chart is applied which bases control limits on the sample averages directly.
  • The situation reported above is far worse for subgroup sizes of 2.
  • The Range chart was used for the above demonstrations because many people are still more comfortable interpreting the range than the standard deviation. The same analysis performed on control charts based on standard deviations showed slightly improved robustness, but still at unsatisfactory levels. Further details can be found in the Knowledge Centre.
  • Do, say, 10 false-alarm out-of-control points out of 500 (i.e. 2%) matter? That depends on the application. It can take days to identify assignable causes and remove them, and charts with too many false alarms lose credibility. Philosophically, 2% false alarms are a defect in the SPC system; if we are not willing to accept such a defect rate in our products and services, we should not accept it in our systems.
  • Standard X-bar charts are based on within-subgroup variability so as to remove the effect of assignable causes of variation. This works only when the data is normal, which is not the norm, and only applies to simple processes. Today's processes are highly complex, and it is not uncommon for there to be inherent time-to-time variation which must be treated as common cause variation (see the sketch after this list). Machine powered distribution optimized control charts are therefore not based on within-subgroup variation. This makes them more robust for today's complex processes, where special and common causes are not as simple as in Shewhart's day. They do, however, require the process to be in control, as do standard control charts. Histograms and change analysis are used to establish control before applying the machine powered technology. Comprehensive instructions are provided in the documentation.
  • Machine powered SPC should be used to obtain fixed control limits, not limits that are updated each time an extra point is added. Updating is a bad practice in classical SPC as well: as points are added, limits change and drift, especially if affected by assignable causes. Fixed control limits should be changed only after there is evidence of a process change.
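To illustrate the within-subgroup point above: the sketch below, assuming numpy, simulates subgroups whose means drift with inherent common cause variation. The classical within-subgroup estimate of sigma (R-bar/d2, with d2 = 2.326 for subgroups of 5) misses that drift, whereas an overall estimate captures it; limits based only on within-subgroup variation would therefore be too tight for such a process.

```python
import numpy as np

rng = np.random.default_rng(11)

# 200 subgroups of 5 whose means shift over time due to inherent common cause variation
k, n = 200, 5
subgroup_means = rng.normal(0.0, 0.5, size=(k, 1))
data = subgroup_means + rng.normal(0.0, 1.0, size=(k, n))

d2 = 2.326                                                          # Shewhart constant for n = 5
sigma_within = (data.max(axis=1) - data.min(axis=1)).mean() / d2    # R-bar / d2
sigma_overall = data.std(ddof=1)

print(f"within-subgroup estimate: {sigma_within:.2f}")   # ~1.0, ignores the drift
print(f"overall estimate:         {sigma_overall:.2f}")  # ~1.1, includes the drift
```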

Individuals Control Charts

The above problems are even worse for Individuals charts. This fact has been noted by scholars and, until recently, was addressed by fitting either Johnson curves or Pearson curves. These, however, cannot always be fitted. Machine powered algorithms are more versatile.
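Where a Johnson curve can be fitted, the approach looks roughly like the following sketch, assuming scipy (the Johnson SU family is one member of the Johnson system): fit the curve to the individuals data and place the chart limits at its .00135 and .99865 quantiles. As noted above, the fit does not always succeed, which is the limitation machine powered algorithms are designed to overcome.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
x = rng.lognormal(0.0, 0.5, size=300)        # skewed 'individuals' data, in control

# Fit a Johnson SU curve and place the individuals chart limits at its
# .00135 / .99865 quantiles instead of mean +/- 3 sigma
params = stats.johnsonsu.fit(x)
lcl, ucl = stats.johnsonsu.ppf([0.00135, 0.99865], *params)

print(f"LCL = {lcl:.3f}   UCL = {ucl:.3f}")
print("points outside:", int(np.sum((x < lcl) | (x > ucl))))
```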

Moving Average and EWMA Charts

These are typically applied to ‘individuals’ data. Even though some form of averaging is applied, they are not robust to non-normality. If sub-group sizes greater than 2 are used they can become more robust, depending on the period and sub-group size.
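For reference, the EWMA statistic and its usual normal-theory limits are sketched below, assuming numpy; lambda = 0.2 and L = 3 are conventional illustrative choices, not recommendations. It is these normal-theory limits that inherit the non-normality problem when the underlying data is skewed.

```python
import numpy as np

def ewma_chart(x, target, sigma, lam=0.2, L=3.0):
    """EWMA statistic with the usual time-varying normal-theory limits (sketch)."""
    z = np.empty(len(x))
    prev = target
    for t, xt in enumerate(x):
        prev = lam * xt + (1 - lam) * prev      # z_t = lam*x_t + (1 - lam)*z_{t-1}
        z[t] = prev
    i = np.arange(1, len(x) + 1)
    width = L * sigma * np.sqrt(lam / (2 - lam) * (1 - (1 - lam) ** (2 * i)))
    return z, target - width, target + width

rng = np.random.default_rng(2)
x = rng.lognormal(0.0, 0.5, size=100)           # skewed 'individuals' data, in control
z, lcl, ucl = ewma_chart(x, target=x.mean(), sigma=x.std(ddof=1))
print("EWMA points outside limits:", int(np.sum((z < lcl) | (z > ucl))))
```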

Modified Control Charts

Modified control charts are applicable when process capability is at a level where the process can afford to drift. Some may feel this contradicts the principles of quality improvement and that process drift should not be allowed. Some may refer to Taguchi's loss function, which implies that any departure from target is costly. The level of control is a management decision, which should not ignore the cost of controlling to tight limits. Taguchi's loss function, intentionally or unintentionally, overlooked the cost of control (see the article in the Knowledge Centre). As Shewhart's book ‘Economic Control of Quality of Manufactured Product’ implies, control of quality must be economic. Provided specifications are based on fitness-for-use, and the process is highly capable, some drift may be allowed. Six sigma processes allow for processes to drift by 1.5 standard deviations.

Modified control charts are also applicable to high speed automated filling lines and similar applications where the main objective is to target the process such that defective product is not produced.

Whatever the application, until recently modified control charts relied on normality to calculate control limits. These limits are easy to calculate when the distribution is normal. However, when the process is non-normal these limits risk allowing non-conforming product to be produced.
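For context, one common textbook form of the normal-theory modified upper control limit places it at USL - (Z_delta - Z_alpha/sqrt(n)) * sigma, where delta is the allowable fraction non-conforming and alpha the false alarm risk. The sketch below, assuming scipy and using hypothetical values for USL, sigma and subgroup size, computes that limit; a distribution optimized version would replace the normal Z values with quantiles of the fitted distributions of the individuals and the averages.

```python
import numpy as np
from scipy import stats

def modified_ucl_normal(usl, sigma, n, delta=0.00135, alpha=0.00135):
    """One textbook normal-theory form: UCL = USL - (Z_delta - Z_alpha/sqrt(n)) * sigma."""
    z_delta = stats.norm.ppf(1 - delta)    # distance from USL to the highest allowable mean
    z_alpha = stats.norm.ppf(1 - alpha)    # allowance for the sampling error of the average
    return usl - (z_delta - z_alpha / np.sqrt(n)) * sigma

# Hypothetical illustrative values for USL, sigma and subgroup size
print(f"UCL = {modified_ucl_normal(usl=110.0, sigma=1.5, n=5):.2f}")
```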

Figure 12 shows a highly capable process. The outer limits are the specifications and the inner limits the control limits for the averages.

Figure 12: Modified control chart with non-normal data

If the process drifts to the point where the averages just touch the upper control limit, action would not be taken and yet there would be a high level of non-conformance. This can be seen in Figure 13, recalling that the upper line is the specification limit and the dotted line the control limit.

Figure 13: Process is near the upper control limit with non-conformances

Clearly the averages, represented by the green points, are within control limits but many non-conformances are produced. When machine powered algorithms that account for non-normality are applied this is no longer a problem, as can be seen in Figure 14.

Figure 14: Process is near the upper control limit but no non-conformances

Only machine powered algorithms are able to determine the underlying distribution of the individuals and the averages and use this information to calculate realistic control limits. This applies equally, and indeed more so, to six sigma processes, where modified control limits are calculated differently.

Proof of the pudding is in the eating

Whether you use Classical or Machine Powered technology is your choice. Following Shewhart's reasoning, no matter how sophisticated the technology, sophistication alone is no justification for using it; the criterion must be what works for you, knowing the facts. The facts are that classical SPC is not robust to non-normality, and no process follows a perfect normal distribution. Machine powered SPC looks the same and is used the same way as classical SPC. Machine powered SPC will detect the correct distribution to use. If it truly is a normal distribution, the limits will be the same. If it cannot decide which distribution to use, then your process is invariably badly out of control; neither classical nor machine powered SPC will then work, and you will need to bring your process into control first.

Read the next article in the featured articles series?

Download the Inferences APP, comprised of mainstream and machine-powered analytics for statistical analysis

PC APP for SPC

Choice of 34 control charts - 17 classics AND 17 distribution optimized (includes non-normal x-bar charts) using machine-power!

Pull data from any source, share outputs, learn through our knowledge center, track usage, and more. Always up-to-date! Fast Support!

FREE download! NO monthly subscriptions! ONLY PAY FOR WHAT YOU USE!

Don't need the APP? Only want to use the 'classics'? Visit BIS.Net Analyst Online