# Confidence Intervals for Proportions

Inferences about proportions are needed in a large variety of situations. Examples of questions asked by analysts are:

What proportion of the population has an income lower than the legislated minimum wage?

What percentage of patients survive beyond 5 years after a cancer operation?

What percentage of patients develop cardiovascular disease five years after being diagnosed with hypertension?

What percentage of voters intend to vote for a political candidate prior to an election?

What percentage of warranty claims have been made in a year?

What percentage of crimes involve a particular race?

The answers to such questions can lead to changes in legislation, campaign strategies, medication and industrial processes, so it is important that the inferences are reliable. Often they are not. Inferences about proportions are prone to error because analysts fail to distinguish between the specified confidence level and the actual coverage, that is, the proportion of intervals that really contain the population proportion. The difference will become clearer below.

The Wald interval is calculated as:

$$\mathrm{LL} = \hat{\pi} - z_{\alpha/2}\sqrt{\frac{\hat{\pi}(1-\hat{\pi})}{n}}$$

$$\mathrm{UL} = \hat{\pi} + z_{\alpha/2}\sqrt{\frac{\hat{\pi}(1-\hat{\pi})}{n}}$$

where LL and UL are the lower and upper limits, $\hat{\pi}$ is the sample proportion, $n$ is the sample size and $z_{\alpha/2}$ is the upper $\alpha/2$ quantile of the standard Normal distribution.

Arguably, the Wald method has been the predominant method up to the time of writing and still receives the most attention in statistics textbooks. Yet it is the most unreliable method.
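As a minimal sketch, the Wald limits can be computed in a few lines of Python. The function name `wald_interval` and its parameters are illustrative choices, not taken from any particular package. The two calls at the end show why the method struggles with small samples: the lower limit can fall below zero, and with no observed successes the interval collapses to a single point.

```python
from math import sqrt
from statistics import NormalDist

def wald_interval(successes, n, confidence=0.95):
    """Wald limits for a binomial proportion (not clipped to [0, 1])."""
    p_hat = successes / n
    # Upper alpha/2 quantile of the standard Normal distribution.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width

print(wald_interval(1, 10))  # lower limit is negative
print(wald_interval(0, 10))  # zero-width interval at 0
```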

An alternative is the exact method, for example the Clopper–Pearson interval. However, ‘exact’ is misleading because it implies that the coverage will match the specified confidence level. The word is misused, which can cause considerable harm given the effect that false conclusions have on decisions. There is nothing ‘exact’ about the ‘exact’ method except that the confidence limits are calculated from cumulative probabilities of the binomial distribution instead of from an approximation such as the Normal distribution.
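To make "based on cumulative probabilities of the binomial distribution" concrete, here is a hedged sketch of the Clopper–Pearson limits. It assumes the usual equal-tailed definition (each limit puts a tail probability of α/2 on the observed count) and finds the limits by simple bisection. The helper names `binom_cdf`, `_bisect` and `clopper_pearson` are hypothetical.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def _bisect(func, tol=1e-10):
    """Root of a function that rises from negative to positive on (0, 1)."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if func(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def clopper_pearson(successes, n, confidence=0.95):
    alpha = 1 - confidence
    # Lower limit: the p at which P(X >= successes) equals alpha / 2.
    lower = 0.0 if successes == 0 else _bisect(
        lambda p: (1 - binom_cdf(successes - 1, n, p)) - alpha / 2)
    # Upper limit: the p at which P(X <= successes) equals alpha / 2.
    upper = 1.0 if successes == n else _bisect(
        lambda p: alpha / 2 - binom_cdf(successes, n, p))
    return lower, upper

print(clopper_pearson(3, 10))
```

Nothing in this calculation ties the actual coverage to the specified confidence level, which is the point made above.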

Countless academic papers have been published in this area, adding to the confusion without offering reliable, practical alternatives. A number revolve around theoretical scenarios that are not known in advance, making the work impractical. A small percentage of authors' work is of a high standard and extremely practical, such as that published by the eminent statistician Alan Agresti.

Much research has been based on simulations, but these have often not been extensive enough to draw reliable conclusions. We have therefore performed several thousand simulations for various levels of confidence and hypothetical population proportions. From these we concluded that the Wilson Score method and the Likelihood ratio test method have the best overall performance.

Figure 1 shows a small snapshot of the simulation output for a sample size of 10 at hypothetical population proportions of 0.1 to 0.9. (Other levels, such as 0.01 to 0.10, were also investigated.) Each simulation was based on 1 million runs.

**Simulated Coverage**

| Population proportion | | Exact CI | Score CI | Wald CI | Likelihood CI |
|---|---|---|---|---|---|
| 0.1 | 95.8 | 92.9 | 92.9 | 64.45 | 98.75 |
| 0.2 | 96.7 | 96.7 | 96.7 | 88.7 | 96.43 |
| 0.3 | 93.8 | 95.2 | 92.4 | 84.1 | 95.1 |
| 0.4 | 96.3 | 89.9 | 98.2 | 90.3 | 94.68 |
| 0.5 | 93.45 | 89.6 | 97.9 | 88.8 | 88.8 |
| 0.6 | 96.15 | 89.8 | 98.2 | 89.7 | 94.5 |
| 0.7 | 93.95 | 92.4 | 91.4 | 84.2 | 95 |
| 0.8 | 96.7 | 85.9 | 96.7 | 88.9 | 96.9 |
| 0.9 | 95.8 | 58.1 | 92.8 | 65 | 98.6 |

**Figure 1: Specified confidence coefficient = 95%. Sample size 10**

Comparing the coverage with the specified 95% shows that the standard Wald interval and the exact method are inadequate and should not be used in critical research. Both the Score and Likelihood ratio methods provide coverage far closer to the specified level.
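For a fixed sample size and population proportion, actual coverage can also be calculated directly rather than simulated: it is the sum of the binomial probabilities of every outcome whose interval contains the population proportion. The sketch below uses this enumeration instead of the simulation approach described above. `actual_coverage` is a hypothetical helper, and the Wald limits are included only as an example method; any function with the same signature can be passed in.

```python
from math import comb, sqrt
from statistics import NormalDist

def wald_interval(successes, n, confidence=0.95):
    """Wald limits, used here only as an example method."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width

def actual_coverage(interval, n, p, confidence=0.95):
    """Probability that the interval contains p when X ~ Binomial(n, p)."""
    total = 0.0
    for x in range(n + 1):
        lower, upper = interval(x, n, confidence)
        if lower <= p <= upper:
            # Add the probability of observing exactly x successes.
            total += comb(n, x) * p**x * (1 - p)**(n - x)
    return total

for p in (0.1, 0.3, 0.5):
    print(p, round(100 * actual_coverage(wald_interval, 10, p), 1))
```

Values from such a calculation can be set beside simulated ones as a cross-check.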

Figure 2 is another snapshot, for a sample size of 100 at different assumed population proportions.

| Population proportion | Exact CI | Score CI | Wald CI | Likelihood CI |
|---|---|---|---|---|
| 0.01 | 92.6 | 92 | 63.1 | 98.2 |
| 0.02 | 94.9 | 95 | 86.6 | 98.5 |
| 0.03 | 96.9 | 96.9 | 80 | 96.9 |
| 0.04 | 86.3 | 93.1 | 90.6 | 98 |
| 0.05 | 93.2 | 96.8 | 87.8 | 93.7 |
| 0.06 | 90.5 | 94.7 | 94.4 | 96.8 |
| 0.07 | 92.7 | 97.2 | 91 | 95.2 |
| 0.08 | 93.7 | 96 | 90.4 | 93.5 |
| 0.09 | 91.8 | 94.8 | 94.7 | 96.6 |
| 0.1 | 90.5 | 93.6 | 93.4 | 95 |
| 0.2 | 94.38 | 94 | 93 | 95.5 |
| 0.3 | 93 | 93.7 | 94.8 | 95.3 |
| 0.4 | 93.6 | 94.8 | 94.8 | 94.7 |
| 0.5 | 94 | 94.3 | 94.2 | 94.3 |
| 0.6 | 93.3 | 95 | 95 | 94.6 |
| 0.7 | 93.7 | 93.7 | 94.7 | 95 |
| 0.8 | 94 | 94 | 95 | 95 |
| 0.9 | 90.6 | 93.6 | 93 | 95.5 |

**Figure 2: Specified confidence coefficient = 95%. Sample size 100**

The conclusions are similar. The Exact and Wald methods improve with the larger sample size, as reported in the literature, but at low assumed proportions their performance is still vastly inferior.

Considering all simulations performed, our conclusion is that the Likelihood ratio method provides the best overall performance, closely matched by the Wilson Score method. If the population proportion were known in advance, it would be possible to match the method to it more closely, but then an estimate would not be required.

The BIS.Net Inferences app uses both the Likelihood ratio and Wilson Score methods.

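The Wilson Score interval is equal to

$$\frac{\hat{\pi} + \dfrac{z_{\alpha/2}^2}{2n} \pm z_{\alpha/2}\sqrt{\dfrac{\hat{\pi}(1-\hat{\pi})}{n} + \dfrac{z_{\alpha/2}^2}{4n^2}}}{1 + \dfrac{z_{\alpha/2}^2}{n}}$$

where $\hat{\pi}$ is the sample proportion, $n$ is the sample size and $z_{\alpha/2}$ is the upper $\alpha/2$ quantile of the standard Normal distribution.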
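A minimal Python sketch of the Wilson Score interval in its standard closed form follows. The function name `wilson_interval` and its parameters are illustrative, and this is not a description of how the BIS.Net app implements it.

```python
from math import sqrt
from statistics import NormalDist

def wilson_interval(successes, n, confidence=0.95):
    """Wilson score limits for a binomial proportion."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    denom = 1 + z**2 / n
    # The centre is pulled from p_hat towards 0.5.
    centre = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width

print(wilson_interval(0, 10))
print(wilson_interval(5, 10))
```

Unlike the Wald limits, these stay within 0 and 1 and do not collapse to a point when no successes are observed.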

The likelihood ratio-based method is more complex. Its limits are the two values of $\pi$ that satisfy the following expression, where $\hat{\pi}$ is the sample proportion and $n$ is the sample size:

$$2n\left[\hat{\pi}\ln\frac{\hat{\pi}}{\pi} + (1-\hat{\pi})\ln\frac{1-\hat{\pi}}{1-\pi}\right] = \chi^2_{1,\alpha}$$

The right-hand side is the ChiSquare value with 1 degree of freedom at the chosen level of significance $\alpha$, e.g. 0.05. The expression has no closed-form solution, so a machine-powered iterative algorithm is used to solve it.
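One way to carry out the iterative solution is bisection. This is a sketch under stated assumptions, not the algorithm the BIS.Net app uses. It assumes the standard binomial likelihood ratio statistic, written in terms of the success count, and uses the fact that the ChiSquare critical value with 1 degree of freedom equals the square of the corresponding Normal quantile. The names `lr_statistic` and `lr_interval` are hypothetical.

```python
from math import log
from statistics import NormalDist

def lr_statistic(pi, successes, n):
    """Likelihood ratio statistic for a hypothesised proportion 0 < pi < 1."""
    failures = n - successes
    p_hat = successes / n
    stat = 0.0
    if successes > 0:  # a term with a zero count contributes nothing
        stat += successes * log(p_hat / pi)
    if failures > 0:
        stat += failures * log((1 - p_hat) / (1 - pi))
    return 2 * stat

def lr_interval(successes, n, confidence=0.95, tol=1e-10):
    # ChiSquare(1 df) critical value = squared standard Normal quantile.
    critical = NormalDist().inv_cdf(1 - (1 - confidence) / 2) ** 2
    p_hat = successes / n

    def root(inside, outside):
        # 'inside' has a statistic below the critical value, 'outside' above.
        while abs(outside - inside) > tol:
            mid = (inside + outside) / 2
            if lr_statistic(mid, successes, n) < critical:
                inside = mid
            else:
                outside = mid
        return (inside + outside) / 2

    lower = 0.0 if successes == 0 else root(p_hat, 0.0)
    upper = 1.0 if successes == n else root(p_hat, 1.0)
    return lower, upper

print(lr_interval(3, 10))
```

Bisection is slow but dependable here because the statistic rises steadily on either side of the sample proportion.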
