Confidence Intervals for Proportions

The BIS.Net Team

Inferences about proportions are needed in a large variety of situations. Examples of questions asked by analysts are:

What proportion of the population has an income lower than the legislated minimum wage?

What percentage of patients survive beyond 5 years after a cancer operation?

What percentage of patients develop cardiovascular disease five years after being diagnosed with hypertension?

What percentage of voters intend to vote for a political candidate prior to an election?

What percentage of warranty claims have been made in a year?

What percentage of crimes involve a particular race?

Answers to such questions can result in changes to legislation and campaign strategies, new medication, and new industrial processes. It is therefore important that inferences are reliable. However, this is often not the case. Inferences about proportions are susceptible to errors because of a failure to understand the difference between the specified confidence level and the actual coverage, which will become clearer below.

Arguably the Wald method, whose lower limit (LL) and upper limit (UL) are calculated as

$$LL = \hat{\pi} - z_{\alpha/2}\sqrt{\frac{\hat{\pi}(1-\hat{\pi})}{n}}$$

$$UL = \hat{\pi} + z_{\alpha/2}\sqrt{\frac{\hat{\pi}(1-\hat{\pi})}{n}}$$

where $\hat{\pi}$ is the sample proportion, $n$ is the sample size and $z_{\alpha/2}$ is the standard Normal quantile for the chosen level of significance $\alpha$, has until the time of writing been the predominant method and still receives the most attention in statistical textbooks. Yet it is the most unreliable method.
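The Wald limits can be computed in a few lines, which also makes their weaknesses easy to see. The following is a minimal sketch using only the Python standard library; the function name `wald_interval` and its parameters are illustrative choices, not part of any BIS.Net tool.

```python
from math import sqrt
from statistics import NormalDist


def wald_interval(successes, n, confidence=0.95):
    """Return the Wald limits (LL, UL) for a binomial proportion."""
    p_hat = successes / n
    # Two-sided critical value Z(alpha/2), about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


print(wald_interval(0, 10))  # no successes observed
print(wald_interval(2, 10))  # two successes in ten
```

With no successes the interval collapses to the single point 0, and with two successes in ten the lower limit falls below 0, which is not a possible value for a proportion.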

An alternative is the exact method, for example the Clopper–Pearson interval. However, ‘exact’ is misleading because it falsely implies that the coverage will match the specified confidence level. This misuse of the word can cause considerable harm when false conclusions feed into decisions. There is nothing ‘exact’ about the ‘exact’ method except that the confidence limits are calculated from cumulative probabilities of the binomial distribution instead of an approximation such as the Normal distribution.
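To illustrate what "based on cumulative probabilities of the binomial distribution" means in practice, here is a minimal sketch of the Clopper–Pearson limits. It finds each limit by bisection on the binomial tail probability rather than through the Beta quantile function that statistical libraries typically use; the helper names (`binom_cdf`, `_bisect`, `clopper_pearson`) are hypothetical.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))


def _bisect(f, lo, hi, tol=1e-10):
    """Root of f on [lo, hi], assuming f(lo) and f(hi) differ in sign."""
    f_lo_positive = f(lo) > 0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if (f(mid) > 0) == f_lo_positive:
            lo = mid  # mid is on the same side of the root as lo
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(successes, n, confidence=0.95):
    """Return the Clopper-Pearson ('exact') limits for a proportion."""
    alpha = 1 - confidence
    x = successes
    # Lower limit: the p at which P(X >= x) equals alpha/2 (0 when x == 0).
    lower = 0.0 if x == 0 else _bisect(
        lambda p: (1 - binom_cdf(x - 1, n, p)) - alpha / 2, 0.0, 1.0)
    # Upper limit: the p at which P(X <= x) equals alpha/2 (1 when x == n).
    upper = 1.0 if x == n else _bisect(
        lambda p: binom_cdf(x, n, p) - alpha / 2, 0.0, 1.0)
    return lower, upper


print(clopper_pearson(2, 10))
```

The limits are exact solutions of binomial tail equations, but nothing in the construction forces the resulting coverage to equal the specified confidence level.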

Countless academic papers have been published in this area, adding to the confusion without offering reliable practical alternatives. A number revolve around theoretical scenarios that are not known in advance, which makes the work impractical. A small percentage of authors' work is of a high standard and extremely practical, such as that published by the eminent statistician Alan Agresti.

Much research has been based on simulations; however, these simulations have often not been extensive enough to draw reliable conclusions. We have therefore performed several thousand simulations for various levels of confidence and hypothetical population proportions, which led to the conclusion that the Wilson Score method and the Likelihood ratio method have the best overall performance.

Figure 1 shows a small snapshot of simulation output for a sample size of 10 at hypothetical population proportions of 0.1 to 0.9. (Other levels were also investigated, such as 0.01 to 0.10.) Each simulation was based on 1 million runs.

Simulated Coverage

Exact CI Score CI Wald CI Likelihood CI
0.1 95.8 92.9 92.9 64.45 98.75
0.2 96.7 96.7 96.7 88.7 96.43
0.3 93.8 95.2 92.4 84.1 95.1
0.4 96.3 89.9 98.2 90.3 94.68
0.5 93.45 89.6 97.9 88.8 88.8
0.6 96.15 89.8 98.2 89.7 94.5
0.7 93.95 92.4 91.4 84.2 95
0.8 96.7 85.9 96.7 88.9 96.9
0.9 95.8 58.1 92.8 65 98.6

Figure 1 : Specified confidence coefficient = 95. Sample Size 10
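Coverage at an assumed population proportion can also be calculated without simulation, because a sample of size n has only n + 1 possible outcomes. The sketch below sums the binomial probability of every outcome whose interval contains the assumed proportion. It is an alternative to, not a reproduction of, the 1 million-run simulations reported here, so small differences from the simulated figures are to be expected. The names `wald_interval` and `coverage` are hypothetical, and only the Wald interval is shown.

```python
from math import comb, sqrt
from statistics import NormalDist


def wald_interval(successes, n, confidence=0.95):
    """Return the Wald limits (LL, UL) for a binomial proportion."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


def coverage(interval, n, true_p, confidence=0.95):
    """Probability that interval(x, n, confidence) contains true_p."""
    total = 0.0
    for x in range(n + 1):
        lower, upper = interval(x, n, confidence)
        if lower <= true_p <= upper:
            # Add P(X = x) for every outcome whose interval covers true_p.
            total += comb(n, x) * true_p**x * (1 - true_p)**(n - x)
    return total


for true_p in (0.1, 0.3, 0.5):
    print(true_p, round(100 * coverage(wald_interval, 10, true_p), 1))
```

Any other interval function with the same `(successes, n, confidence)` signature can be passed to `coverage` to compare methods on equal terms.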

Clearly, when the coverage is compared with the specified 95%, the standard Wald interval and the exact method are inadequate and should not be used in critical research. Both the Score and Likelihood ratio methods provide coverage far closer to the specified level.

Figure 2 is another snapshot, for a sample size of 100 at different assumed population proportions.

Proportion  Exact CI  Score CI  Wald CI  Likelihood CI
0.01        92.6      92        63.1     98.2
0.02        94.9      95        86.6     98.5
0.03        96.9      96.9      80       96.9
0.04        86.3      93.1      90.6     98
0.05        93.2      96.8      87.8     93.7
0.06        90.5      94.7      94.4     96.8
0.07        92.7      97.2      91       95.2
0.08        93.7      96        90.4     93.5
0.09        91.8      94.8      94.7     96.6
0.1         90.5      93.6      93.4     95
0.2         94.38     94        93       95.5
0.3         93        93.7      94.8     95.3
0.4         93.6      94.8      94.8     94.7
0.5         94        94.3      94.2     94.3
0.6         93.3      95        95       94.6
0.7         93.7      93.7      94.7     95
0.8         94        94        95       95
0.9         90.6      93.6      93       95.5

Figure 2: Specified confidence coefficient = 95. Sample Size 100

The conclusions are similar. Although the Exact and Wald methods improve with the larger sample size, as reported in the literature, their performance at low assumed proportions remains vastly inferior even at this sample size.

Considering all simulations performed, our conclusion is that the Likelihood ratio method provides the best overall performance, closely matched by the Wilson Score method. If the population proportion were known in advance, it would be possible to better match the method to it, but then an estimate would not be required.

The BIS.Net Inferences app uses both the Likelihood ratio and Wilson Score methods.

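The Wilson Score interval is equal to

$$\frac{\hat{\pi} + \dfrac{z_{\alpha/2}^{2}}{2n} \pm z_{\alpha/2}\sqrt{\dfrac{\hat{\pi}(1-\hat{\pi})}{n} + \dfrac{z_{\alpha/2}^{2}}{4n^{2}}}}{1 + \dfrac{z_{\alpha/2}^{2}}{n}}$$

where $\hat{\pi}$ is the sample proportion, $n$ the sample size and $z_{\alpha/2}$ the standard Normal quantile for the chosen level of significance $\alpha$.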
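Because the Wilson Score interval has a closed form, it can be computed directly. The following is a minimal sketch of the standard formula using only the Python standard library; the function name `wilson_interval` is illustrative and is not taken from the BIS.Net app.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(successes, n, confidence=0.95):
    """Return the Wilson Score limits for a binomial proportion."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    denominator = 1 + z * z / n
    # The centre is pulled from p_hat towards 0.5; the pull shrinks as n grows.
    centre = (p_hat + z * z / (2 * n)) / denominator
    half_width = (
        z * sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denominator
    )
    return centre - half_width, centre + half_width


print(wilson_interval(0, 10))
print(wilson_interval(2, 10))
```

Unlike the Wald interval, the result has non-zero width even when no successes are observed, and its limits stay within the range 0 to 1 apart from floating-point rounding.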


The likelihood ratio-based method is more complex. Its limits are the two values of the population proportion $\pi$ that solve the following expression, in which $\hat{\pi}$ is the sample proportion and $n$ the sample size. In its standard form the expression is

$$2n\left[\hat{\pi}\ln\frac{\hat{\pi}}{\pi} + (1-\hat{\pi})\ln\frac{1-\hat{\pi}}{1-\pi}\right] = \chi^{2}_{1,\alpha}$$

The right-hand side is the ChiSquare value with 1 degree of freedom at the chosen level of significance $\alpha$, e.g. 0.05. The expression has no closed-form solution, so a machine-powered iterative algorithm is used to solve it.
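One way to carry out the iterative solution is bisection on either side of the sample proportion. The sketch below assumes the standard form of the likelihood ratio statistic for a binomial proportion and uses the fact that the ChiSquare quantile with 1 degree of freedom equals the square of the two-sided Normal quantile. It is not the algorithm used in the BIS.Net app, and all function names are hypothetical.

```python
from math import log
from statistics import NormalDist


def _xlogy(x, y):
    """x * log(y), with the convention that 0 * log(0) = 0."""
    return 0.0 if x == 0 else x * log(y)


def lr_statistic(successes, n, pi):
    """Likelihood ratio statistic for the hypothesis 'proportion = pi'."""
    x = successes
    p_hat = x / n
    return 2 * (_xlogy(x, p_hat / pi) + _xlogy(n - x, (1 - p_hat) / (1 - pi)))


def likelihood_ratio_interval(successes, n, confidence=0.95, tol=1e-10):
    """Return the likelihood ratio limits for a binomial proportion."""
    x = successes
    p_hat = x / n
    # ChiSquare quantile with 1 degree of freedom = squared Normal quantile.
    critical = NormalDist().inv_cdf(1 - (1 - confidence) / 2) ** 2

    def solve(inside, outside):
        # Bisection: 'inside' has statistic <= critical, 'outside' does not.
        while abs(outside - inside) > tol:
            mid = (inside + outside) / 2
            if lr_statistic(x, n, mid) <= critical:
                inside = mid
            else:
                outside = mid
        return (inside + outside) / 2

    lower = 0.0 if x == 0 else solve(p_hat, 0.0)
    upper = 1.0 if x == n else solve(p_hat, 1.0)
    return lower, upper


print(likelihood_ratio_interval(2, 10))
```

The statistic is zero at the sample proportion and grows as the candidate value moves away in either direction, so each limit is the single crossing point on its side and bisection is sufficient.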
