Sample size for confidence intervals on the proportion using classical theory

The BIS.Net Team

Knowing the proportion of some occurrence over all possible occurrences is important in countless applications. A politician would like to know what proportion of voters in his electorate will vote for him. A market research manager would like to know what proportion of customers are highly satisfied with after-sales service. A Quality Assurance manager would like to know what proportion of products are returned as faulty under warranty. A medical researcher would like to know what proportion of patients reacted favorably to a new pain relief medication.

In all these applications a sample must be taken, because the population is too large to survey in full. The question is how large the sample needs to be to provide the required precision. Consider a politician who has conducted a mini-survey of 30 constituents and found that 54% of them would vote for him. He would be very foolish to conclude from a sample of this size that most voters in his electorate would vote for him. With a sample of 30, the result is also consistent with as many as 64 percent of voters voting for his opponent. The 54% is just the way the numbers fell in one sample of 30.

The classical approach to calculating confidence intervals for the proportion is to use the Wald formula, obtained by inverting the Wald statistic. The interval is:

p̂ ± z(α/2) × sqrt(p̂ × (1 − p̂) / n)

i.e. sample estimate of the proportion +/- z-value for the chosen confidence level * sqrt(sample estimate of proportion*(1-sample estimate of proportion)/sample size)
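As a rough illustration of the Wald interval, the following Python sketch applies it to the politician's mini-survey described earlier (54% support in a sample of 30). The function name `wald_interval` is hypothetical, and `NormalDist` from the Python standard library is assumed here as a way to obtain the z-value; neither is part of BISNET Analyst.

```python
from math import sqrt
from statistics import NormalDist


def wald_interval(p_hat, n, confidence=0.95):
    """Return the (lower, upper) Wald confidence limits for a proportion.

    p_hat      -- sample estimate of the proportion (between 0 and 1)
    n          -- sample size
    confidence -- two-sided confidence level, e.g. 0.95
    """
    alpha = 1 - confidence
    # Two-sided z-value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


# The politician's mini-survey: 54% support in a sample of 30.
lower, upper = wald_interval(0.54, 30)
print(f"95% Wald interval: {lower:.3f} to {upper:.3f}")
```

The lower limit comes out at roughly 0.36, which is why the survey cannot rule out the opponent holding about 64% of the vote.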

This formula is often used to compute the sample size by setting the margin of error to the required value and solving for n:

N = Int((Z_Value(alpha / 2) * Math.Sqrt(p * (1 - p)) / margin of error) ^ 2 + 1)
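The sample size calculation can be sketched in Python as below. This is a minimal sketch, not the BISNET Analyst implementation: the function name `wald_sample_size` is hypothetical, `NormalDist().inv_cdf` stands in for the `Z_Value` function, and the square root is taken of p × (1 − p), as the Wald interval requires. Rounding is assumed to follow the `Int(... + 1)` pattern of the expression above.

```python
from math import floor, sqrt
from statistics import NormalDist


def wald_sample_size(p, margin_of_error, confidence=0.95):
    """Sample size needed for a Wald interval with the given margin of error.

    p               -- assumed proportion (between 0 and 1)
    margin_of_error -- required half-width of the interval, e.g. 0.05
    confidence      -- two-sided confidence level, e.g. 0.95
    """
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)
    raw = (z * sqrt(p * (1 - p)) / margin_of_error) ** 2
    # Same rounding as Int(raw + 1): drop the fraction, then add one.
    return floor(raw) + 1


# Margin of error of 5 percentage points at 95% confidence.
print(wald_sample_size(0.5, 0.05))  # assumed proportion of 0.5
print(wald_sample_size(0.3, 0.05))  # assumed proportion of 0.3
```

The result depends on the assumed proportion, which is the difficulty discussed next.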

The problem is that the expression contains the sample proportion, which is not known until the sample has been taken.

It is possible to substitute a guess for the proportion. However, BISNET Analyst does not support guesses unless there is a sound reason for one. Instead, BISNET Analyst computes the sample size for the worst-case proportion, p = 0.5, which is the value that maximizes p × (1 − p) and therefore the required sample size. The analyst can then be assured that the margin of error of the Wald interval will be no greater than specified, whatever the sample proportion turns out to be.
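The worst-case argument can be checked numerically. The sketch below is illustrative only (the function names `worst_case_sample_size` and `wald_margin` are hypothetical): it sizes the sample at p = 0.5 and then computes the Wald margin of error that this sample size would give for several other proportions.

```python
from math import floor, sqrt
from statistics import NormalDist


def z_value(confidence):
    """Two-sided standard normal z-value for the given confidence level."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


def worst_case_sample_size(margin_of_error, confidence=0.95):
    """Sample size computed at the worst-case proportion p = 0.5."""
    raw = (z_value(confidence) * sqrt(0.5 * 0.5) / margin_of_error) ** 2
    return floor(raw) + 1


def wald_margin(p, n, confidence=0.95):
    """Half-width of the Wald interval for proportion p and sample size n."""
    return z_value(confidence) * sqrt(p * (1 - p) / n)


target = 0.05
n = worst_case_sample_size(target)
print(f"worst-case sample size: {n}")

# The margin is largest at p = 0.5 and shrinks towards 0 and 1.
margins = {p: wald_margin(p, n) for p in (0.1, 0.3, 0.5, 0.7, 0.9)}
for p, margin in margins.items():
    print(f"p = {p:.1f}: margin of error = {margin:.4f}")
```

Every margin printed stays at or below the 0.05 target, with the largest value at p = 0.5.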

The Wald method is used in the online version, where the classical methods are free.

However, the Wald method and the 'exact' method are unreliable: their actual coverage can differ considerably from the specified coverage. With the Wald method, even though a 95% confidence interval is specified, the actual coverage may be as low as 85%, leading to wrong conclusions.
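The actual coverage of the Wald interval can be computed directly for a given true proportion and sample size, by adding up the binomial probabilities of every sample outcome whose interval contains the true proportion. The following is a minimal sketch of that calculation; the function name `wald_coverage` is hypothetical, and the choice of n = 30 and the proportions shown are assumptions made for illustration.

```python
from math import comb, sqrt
from statistics import NormalDist


def wald_coverage(p, n, confidence=0.95):
    """Actual coverage probability of the Wald interval.

    p -- true population proportion
    n -- sample size
    Sums the binomial probability of each count k (0..n) whose
    Wald interval contains the true proportion p.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    coverage = 0.0
    for k in range(n + 1):
        p_hat = k / n
        half_width = z * sqrt(p_hat * (1 - p_hat) / n)
        if p_hat - half_width <= p <= p_hat + half_width:
            coverage += comb(n, k) * p**k * (1 - p) ** (n - k)
    return coverage


# Nominal 95% intervals, sample size 30, several true proportions.
for p in (0.05, 0.1, 0.3, 0.5):
    print(f"p = {p:.2f}: actual coverage = {wald_coverage(p, 30):.3f}")
```

For small true proportions the shortfall is large, because a sample with no occurrences produces an interval of zero width that cannot contain the true proportion.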

The machine-powered app version will base the sample size on more reliable technology that provides confidence intervals with coverage closer to the specified level of confidence. Related articles: machine powered confidence intervals of the proportion; machine powered sample size determination for the proportion.
