Sample size for hypothesis testing on the mean

The BIS.Net Team

There are many situations where it is necessary to obtain an average reading of some variable. One example is the average daily calorie consumption of people in various demographic categories. Often a sample needs to be taken because information is not available for every person in each category. If the only interest is in estimating the average, simple confidence intervals are all that is required. Sometimes, however, a statement needs to be made about the average.

For example, a medical doctor may wish to confirm that blood pressure medication has reduced a patient’s average systolic blood pressure from 152 to 140 or below. An R&D manager may be experimenting with a buffer that stabilizes the pH of swimming pool water against changes in acidity and alkalinity. The manager may test whether the pH has changed by more than plus or minus 1 after some period of normal usage of the swimming pool.

For applications where a treatment has been applied, a hypothesis test is more appropriate than confidence intervals alone, although confidence intervals and hypothesis tests are related. The advantage of hypothesis testing for these applications is that the sample size can be tailored to detect a change that is considered important. In the above examples the medical doctor is looking for a reduction of at least 12 units (from 152 to 140) and the R&D manager for a plus/minus change greater than 1.

Determining sample size for a hypothesis test on the average of some important variable is simple. The analyst will need to specify a reference value for the null hypothesis, which can be considered the status quo, and an amount of concern, i.e. the minimum change from the reference value that is important to detect. Additionally, the analyst must specify two types of risk. The first is the risk, or probability, of concluding there is a change at least equal to the specified amount of concern when there is none. This is called the alpha risk. The second is the risk, or probability, of concluding there has been no change when in fact there was a change. This is called the beta risk. Both risks are due to ‘the way the numbers fell’ when sampling: by chance, sampling may have selected items with measurements larger or smaller than expected. Finally, the type of change must be specified, which can be > (an increase), < (a decrease) or <> (a change in either direction).

The image below shows these risks for a <> change. In this instance each tail probability is equal to alpha/2. For a one-sided test the single tail probability equals alpha.

Hypothesis testing on the mean drawing

Mu0 is the null hypothesis mean, Mu1 is the alternative mean and c is the amount of concern.

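Sample size is computed using the following formula. For a <> change, alpha/2 is used in place of alpha, in line with the tail probabilities described above.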

N = ((Z_Value(alpha) + Z_Value(beta)) * Sd / Amount-of-concern) ^ 2

The standard deviation must be known, or be based on a prior estimate from a large sample. The practice of using a t_value together with the standard deviation of a previous small sample is unreliable and is not condoned by BIS.Net Analyst.
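The formula above can be sketched in a few lines of Python. This is only an illustration, not the BIS.Net Analyst implementation: the function name `sample_size`, its parameters and the default risks (alpha = 0.05, beta = 0.10) are assumptions made for the example, and the result is rounded up to the next whole unit.

```python
from math import ceil
from statistics import NormalDist


def sample_size(sd, concern, alpha=0.05, beta=0.10, two_sided=True):
    """Sample size for a z-based hypothesis test on the mean.

    sd        -- known (or large-sample) standard deviation
    concern   -- amount of concern, the smallest change worth detecting
    alpha     -- risk of signalling a change when there is none
    beta      -- risk of missing a change equal to the amount of concern
    two_sided -- True for a <> change, False for a > or < change
    """
    # A <> change splits alpha across both tails.
    tail = alpha / 2 if two_sided else alpha
    z_alpha = NormalDist().inv_cdf(1 - tail)
    z_beta = NormalDist().inv_cdf(1 - beta)
    n = ((z_alpha + z_beta) * sd / concern) ** 2
    # Round up so the achieved risks do not exceed the specified ones.
    return ceil(n)


# Illustrative values only: sd = 1 and an amount of concern of 0.5.
print(sample_size(sd=1.0, concern=0.5))                   # 43
print(sample_size(sd=1.0, concern=0.5, two_sided=False))  # 35
```

A one-sided test needs fewer samples for the same risks because the whole of alpha sits in one tail.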

Once the sample size is determined it is possible to calculate critical values for the average. A sample average beyond the critical value implies that there is reasonable evidence of a change. For a > hypothesis an upper critical value is calculated. For a < hypothesis a lower critical value is calculated. For a <> hypothesis both a lower and an upper critical value are calculated.
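As a rough sketch of how such critical values could be obtained, the code below places them a multiple of the standard error (Sd divided by the square root of N) away from the null hypothesis mean. The function name `critical_values` and the example inputs are assumptions for illustration; BIS.Net Analyst may present its results differently.

```python
from math import sqrt
from statistics import NormalDist


def critical_values(mu0, sd, n, alpha=0.05, change="<>"):
    """Return (lower, upper) critical values for the sample average.

    A side that does not apply to the chosen hypothesis is None.
    """
    se = sd / sqrt(n)  # standard error of the sample average
    if change == "<>":
        z = NormalDist().inv_cdf(1 - alpha / 2)
        return (mu0 - z * se, mu0 + z * se)
    z = NormalDist().inv_cdf(1 - alpha)
    if change == ">":
        return (None, mu0 + z * se)
    if change == "<":
        return (mu0 - z * se, None)
    raise ValueError("change must be '>', '<' or '<>'")


# Illustrative values only: null mean 10, sd 1, sample size 43.
lower, upper = critical_values(mu0=10.0, sd=1.0, n=43)
print(round(lower, 3), round(upper, 3))
```

A sample average below `lower` or above `upper` would then be taken as evidence of a change.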

The preferred and recommended alternative to using critical values is to take a sample of the calculated size, perform a hypothesis test on it and base decisions on the p value.
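A minimal sketch of such a test, assuming a z test with a known standard deviation (consistent with the sample size formula), is shown below. The function name `z_test_p_value` and the sample figures are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def z_test_p_value(sample_mean, mu0, sd, n, change="<>"):
    """p value of a z test on the mean with known standard deviation."""
    z = (sample_mean - mu0) / (sd / sqrt(n))
    cdf = NormalDist().cdf(z)
    if change == ">":
        return 1 - cdf
    if change == "<":
        return cdf
    if change == "<>":
        # Two-sided: double the smaller tail probability.
        return 2 * min(cdf, 1 - cdf)
    raise ValueError("change must be '>', '<' or '<>'")


# Hypothetical sample: average 10.4 from 43 readings, null mean 10, sd 1.
p = z_test_p_value(sample_mean=10.4, mu0=10.0, sd=1.0, n=43)
print(f"p value = {p:.4f}")
```

A p value smaller than the chosen alpha risk leads to the same decision as a sample average beyond the critical value.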

BIS.Net Analyst also displays an OC curve and a power curve, such as those shown in the image below.

Hypothesis testing on the mean

Power refers to the probability of detecting the specified change. The power curve shows the probability of concluding there has been a change for various hypothesized values of the population mean. The red lines correspond to the specified alternative means calculated from the amount of concern. In this instance the null hypothesis mean is 10 and the amount of concern is 0.5, hence the upper alternative mean is 10.5 and the lower is 9.5. Each of these has a 90% probability (power) of being detected and a 10% probability of not being detected, the latter being read from the OC curve. Using the power curve the analyst can determine the power at different alternative means.
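The points on a power curve for a <> test can be approximated as follows. The null mean of 10 and amount of concern of 0.5 come from the example above; the standard deviation of 1, the alpha risk of 0.05 and the resulting sample size of 43 are assumptions, since the values behind the displayed curve are not stated. The function name `power` is also hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def power(mu1, mu0, sd, n, alpha=0.05):
    """Power of a two-sided z test when the true mean is mu1."""
    se = sd / sqrt(n)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    lower, upper = mu0 - z * se, mu0 + z * se
    # Distribution of the sample average if the true mean is mu1.
    avg = NormalDist(mu1, se)
    # Probability that the average falls beyond either critical value.
    return avg.cdf(lower) + (1 - avg.cdf(upper))


# Assumed inputs: sd = 1, alpha = 0.05, n = 43 (beta = 0.10, concern = 0.5).
for mu1 in (9.5, 9.75, 10.0, 10.25, 10.5):
    p = power(mu1, mu0=10.0, sd=1.0, n=43)
    print(f"mean {mu1:5.2f}: power {p:.3f}, OC {1 - p:.3f}")
```

Under these assumptions the power at 9.5 and 10.5 comes out slightly above 90% because the sample size is rounded up, and the power at the null mean equals alpha. The OC value is simply one minus the power.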
