Sample size for hypothesis testing on the standard deviation

The BIS.Net Team

It can be argued that, philosophically, the standard deviation is more important than the mean. The standard deviation is a measure of variability, and some have called variability quality’s biggest enemy. Variability must not be confused with variation, which is a good thing: human beings need variation, or change. Variability is synonymous with inconsistency and needs to be controlled and minimized.

To control variability, it needs to be measured and monitored for changes. If the standard deviation for blood pressure suddenly increases, it will alarm doctors. If the standard deviation of plating thickness suddenly increases, it will alarm the QA manager.

Although confidence intervals and hypothesis tests are related, there are sometimes advantages in conducting a hypothesis test instead of using confidence intervals to determine whether there has been a change in the standard deviation. The advantage of hypothesis testing for these applications is that the sample size can be tailored to detect a change that is considered important. Not every change, even if statistically significant, is of importance.

Determining sample size for a hypothesis test on the standard deviation is not as simple as determining the sample size to detect a change of concern in the mean.

The analyst needs to specify a reference value for the null hypothesis, which can be considered the status quo, and an amount of concern, i.e. the minimum change from the reference value that is important to detect. Additionally, the analyst must specify two types of risk. The first is the risk, or probability, of concluding there is a change at least equal to the specified amount of concern when there is none. This is called the alpha risk. The second is the risk, or probability, of concluding there has been no change when in fact there was a change. This is called the beta risk. Both risks are due to ‘the way the numbers fall’ when sampling: by chance, the sample may contain measurements that are larger or smaller than expected. Finally, the type of change must be specified, which can be greater than (>), less than (<) or not equal to (<>).

The image below shows these risks for a > change. Although the actual distributions are not symmetrical, the image helps demonstrate the problem.

Hypothesis testing on the standard deviation

The calculated sample size must be such that the specified values of alpha and beta are both obtained. Unfortunately, for the standard deviation, where the probability distribution depends on the standard deviation itself, there is no closed-form expression that gives the sample size directly while keeping both risks.
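To make the requirement concrete, the conditions for a > test can be written out as follows. This is the textbook chi-square formulation under normality, not necessarily the exact form used by the app; the symbols are introduced here for illustration: \sigma_0 is the reference value, \sigma_1 is the reference value plus the amount of concern, S is the sample standard deviation, and \chi^2_{p,\nu} is the p-quantile of the chi-square distribution with \nu degrees of freedom.

```latex
% Under normality:
\frac{(n-1)S^2}{\sigma^2} \sim \chi^2_{n-1}

% Alpha risk: reject H_0 : \sigma = \sigma_0 when
\frac{(n-1)S^2}{\sigma_0^2} > \chi^2_{1-\alpha,\,n-1}

% Beta risk: the power at \sigma_1 > \sigma_0 must satisfy
P\!\left(\chi^2_{n-1} > \frac{\sigma_0^2}{\sigma_1^2}\,\chi^2_{1-\alpha,\,n-1}\right) \ge 1-\beta
\quad\Longleftrightarrow\quad
\frac{\chi^2_{1-\alpha,\,n-1}}{\chi^2_{\beta,\,n-1}} \le \frac{\sigma_1^2}{\sigma_0^2}
```

Because n enters both quantiles through the degrees of freedom, the last inequality cannot be rearranged to give n explicitly; it has to be searched for.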

Instead, a machine learning algorithm is used, which learns to discard incorrect solution paths and ultimately finds the required sample size.
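The details of that algorithm are not given here. As a rough stand-in, the sketch below uses a plain incremental search for a > test: it tries n = 2, 3, ... and stops at the first sample size whose power at the alternative standard deviation reaches 1 - beta. It assumes normal data and the usual chi-square test, and all function names (chi2_cdf, chi2_ppf, power_greater, sample_size_greater) are hypothetical helpers written for this illustration, not part of any BIS.Net product.

```python
import math

def chi2_cdf(x, df):
    """P(X <= x) for X ~ chi-square(df), via the series expansion of the
    regularized lower incomplete gamma function (fine for moderate df)."""
    if x <= 0:
        return 0.0
    a, h = df / 2.0, x / 2.0
    # First series term h^a * exp(-h) / Gamma(a + 1), computed in log space.
    term = math.exp(a * math.log(h) - h - math.lgamma(a + 1.0))
    total, k = term, 0
    while term > 1e-16 * total:
        k += 1
        term *= h / (a + k)
        total += term
    return min(total, 1.0)

def chi2_ppf(p, df):
    """Quantile of chi-square(df) found by bisection on chi2_cdf."""
    lo, hi = 0.0, df + 10.0 * math.sqrt(2.0 * df) + 50.0
    for _ in range(60):
        mid = (lo + hi) / 2.0
        if chi2_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def power_greater(n, sigma0, sigma1, alpha):
    """Power of the '>' test at a true standard deviation sigma1."""
    df = n - 1
    crit = chi2_ppf(1.0 - alpha, df)  # reject H0 when (n-1)s^2/sigma0^2 > crit
    return 1.0 - chi2_cdf(crit * (sigma0 / sigma1) ** 2, df)

def sample_size_greater(sigma0, concern, alpha=0.05, beta=0.10, max_n=2000):
    """Smallest n whose power at sigma0 + concern is at least 1 - beta."""
    sigma1 = sigma0 + concern
    for n in range(2, max_n + 1):
        if power_greater(n, sigma0, sigma1, alpha) >= 1.0 - beta:
            return n
    raise ValueError("no sample size up to max_n meets both risks")

# Illustrative inputs: reference value 1, amount of concern 0.5.
n = sample_size_greater(sigma0=1.0, concern=0.5)
print(n, round(power_greater(n, 1.0, 1.5, 0.05), 3))
```

A linear search is the simplest choice and is fast enough when the amount of concern is not tiny relative to the reference value. Because n must be a whole number, the achieved power is usually slightly above the requested 1 - beta.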

There are some limitations, however:

  • The underlying population must be normally distributed.
  • Part of the solution requires use of the expression for confidence intervals for the variance. This expression does not provide an interval of minimum width. The actual risks are therefore only approximately equal to those specified, although the difference is small. The alternative is to arbitrarily choose a sample size providing risks that differ significantly from those desired.

Once the sample size is determined, it is possible to calculate critical values for the standard deviation. A sample standard deviation beyond the critical value implies that there is reasonable evidence of a change. For a > hypothesis an upper critical value is calculated. For a < hypothesis a lower critical value is calculated. For a <> hypothesis both a lower and an upper critical value are calculated.
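For reference, the standard chi-square expressions for these critical values are sketched below. They assume normal data, and the equal split of alpha between the two tails in the <> case is a common convention assumed here rather than something stated for the app. Here \sigma_0 is the reference value, n the sample size, and \chi^2_{p,\nu} the p-quantile of the chi-square distribution with \nu degrees of freedom.

```latex
% '>' hypothesis: upper critical value
s_U = \sigma_0 \sqrt{\frac{\chi^2_{1-\alpha,\,n-1}}{n-1}}

% '<' hypothesis: lower critical value
s_L = \sigma_0 \sqrt{\frac{\chi^2_{\alpha,\,n-1}}{n-1}}

% '<>' hypothesis: both, with alpha split equally between the tails
s_L = \sigma_0 \sqrt{\frac{\chi^2_{\alpha/2,\,n-1}}{n-1}}, \qquad
s_U = \sigma_0 \sqrt{\frac{\chi^2_{1-\alpha/2,\,n-1}}{n-1}}
```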

A preferred alternative to using critical values is to perform a hypothesis test after sampling with the calculated sample size and to base decisions on the p-value.
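A minimal sketch of such a test is shown below, assuming normal data and the usual chi-square statistic (n - 1)s^2 / sigma0^2. The function names and the measurements are hypothetical, and doubling the smaller tail for the <> case is one common convention, assumed here.

```python
import math
import statistics

def chi2_cdf(x, df):
    """P(X <= x) for X ~ chi-square(df), via the series expansion of the
    regularized lower incomplete gamma function (fine for moderate df)."""
    if x <= 0:
        return 0.0
    a, h = df / 2.0, x / 2.0
    term = math.exp(a * math.log(h) - h - math.lgamma(a + 1.0))
    total, k = term, 0
    while term > 1e-16 * total:
        k += 1
        term *= h / (a + k)
        total += term
    return min(total, 1.0)

def sd_test_p_value(data, sigma0, alternative=">"):
    """p-value of the chi-square test of H0: sigma = sigma0."""
    df = len(data) - 1
    stat = df * statistics.variance(data) / sigma0 ** 2
    lower = chi2_cdf(stat, df)
    if alternative == ">":
        return 1.0 - lower
    if alternative == "<":
        return lower
    # "<>": double the smaller tail (assumed convention), capped at 1.
    return min(1.0, 2.0 * min(lower, 1.0 - lower))

# Made-up plating thickness measurements, for illustration only.
sample = [10.2, 9.1, 11.8, 8.4, 10.9, 12.3, 9.7, 7.9, 11.1, 10.4]
print(round(sd_test_p_value(sample, sigma0=1.0, alternative=">"), 4))
```

A p-value below the chosen alpha risk is then taken as evidence of a change in the stated direction.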

The BIS.Net Sample Size app also displays OC and power curves, such as those shown in the image below.

Hypothesis testing on the standard deviation

Power is the probability of detecting the specified change. The power curve shows the probability of concluding there has been a change for various hypothesized values of the population standard deviation. The red lines correspond to the alternative standard deviations calculated from the amount of concern. In this instance the null hypothesis standard deviation is 1 and the amount of concern is 0.5, hence the upper alternative standard deviation is 1.5 and the lower is 0.5, each of which has a 90% power of being detected and a 10% probability of not being detected (obtained from the OC curve). Using this curve, the analyst can determine the power at other alternative standard deviations.
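The points on such a curve can be approximated with a few lines of code. The sketch below covers the > side only and, to stay short, uses the Wilson-Hilferty normal approximation to the chi-square distribution instead of exact values, so the figures will differ slightly from those in the app. The sample size of 27 and all function names are assumptions made for illustration.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()

def wh_quantile(p, df):
    """Approximate chi-square quantile (Wilson-Hilferty)."""
    c = 2.0 / (9.0 * df)
    return df * (1.0 - c + _STD_NORMAL.inv_cdf(p) * c ** 0.5) ** 3

def wh_cdf(x, df):
    """Approximate chi-square CDF (Wilson-Hilferty), for x > 0."""
    c = 2.0 / (9.0 * df)
    return _STD_NORMAL.cdf(((x / df) ** (1.0 / 3.0) - (1.0 - c)) / c ** 0.5)

def power_greater(sigma, n, sigma0=1.0, alpha=0.05):
    """Approximate power of the '>' test when the true SD is sigma."""
    df = n - 1
    crit = wh_quantile(1.0 - alpha, df)
    return 1.0 - wh_cdf(crit * (sigma0 / sigma) ** 2, df)

n = 27  # hypothetical sample size, chosen only for illustration
for sigma in (1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6):
    p = power_greater(sigma, n)
    # The OC value is the probability of not detecting the change: 1 - power.
    print(f"sigma={sigma:.1f}  power={p:.3f}  OC={1.0 - p:.3f}")
```

At the null value the power equals the alpha risk, and it rises towards 1 as the true standard deviation moves further above the reference value.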
