TECHNOLOGY OVERVIEW

Hypothesis testing on the difference between two binomial proportions and parts per million

The BIS.Net Team

The following applies to the difference between two proportions. The same principle applies to parts per million (ppm), where n1 and n2 are each equal to one million and R1 and R2 are the parts per million.

One option is to calculate a confidence interval from the sample information. If the confidence interval covers the reference value, the null hypothesis is not rejected. If it does not cover the reference value, the null hypothesis is rejected.
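The confidence-interval route can be sketched as follows. This is only an illustration: it uses the simple Wald interval because it has a closed form, not the Agresti-Min method that this overview recommends, and the function name `wald_interval` and the sample counts are assumptions made up for the example.

```python
import math
from statistics import NormalDist


def wald_interval(x1, n1, x2, n2, confidence=0.95):
    """Wald confidence interval for p1 - p2 (illustrative only)."""
    p1, p2 = x1 / n1, x2 / n2
    diff = p1 - p2
    # Unpooled standard error of the difference in sample proportions.
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    # Two-sided Normal quantile for the requested confidence level.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return diff - z * se, diff + z * se


# Hypothetical data: 50 successes out of 200 versus 30 out of 200.
low, high = wald_interval(50, 200, 30, 200)
reference = 0.0
# Reject the null hypothesis only if the interval excludes the reference value.
reject_null = not (low <= reference <= high)
print(f"95% interval: ({low:.4f}, {high:.4f}), reject null: {reject_null}")
```

With these made-up counts the interval lies entirely above zero, so a reference value of zero would be rejected.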

Alternatively, randomly sample n1 items from population 1 and n2 items from population 2, then:

  • Count the number of occurrences (successes) of the variable in question in both samples e.g. number of red marbles, number of defectives, number of voters voting for your party.
  • Specify a level of significance
  • Specify the direction of the alternative hypothesis (<, > or <>)
  • Specify a reference value for the difference in the two proportions to test against (usually zero)
  • Determine the test statistic
    • Mainstream technology relies on a Normal approximation related to Wald confidence intervals to obtain the test statistic. The free online analysis service BISNET.com uses this technology.
    • However, the Normal approximation is known to be unreliable, particularly for small samples and for proportions close to 0 or 1, so the BISNET Inferences app instead determines its criteria with the Agresti-Min method, which is far more reliable.
    • Agresti A and Min Y. On small-sample confidence intervals for parameters in discrete distributions. Biometrics 2001; 57: 963-971.
  • Determine the critical region for the test statistic. The critical region depends on the direction of the alternative hypothesis, i.e. <, > or <>
  • Compare the test statistic with the critical region and conclude significance if the test statistic falls inside the critical region.
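The steps above can be sketched for the mainstream Normal-approximation test. This is a minimal sketch under stated assumptions, not the BISNET implementation: the function name `wald_z_test`, the use of the unpooled standard error, and the string codes "<", ">" and "<>" for the alternative are choices made for the example.

```python
import math
from statistics import NormalDist


def wald_z_test(x1, n1, x2, n2, alpha=0.05, alternative="<>", reference=0.0):
    """Return (z, critical value, significant?) for H0: p1 - p2 = reference."""
    p1, p2 = x1 / n1, x2 / n2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    if se == 0:
        raise ValueError("standard error is zero; the Normal approximation fails")
    z = (p1 - p2 - reference) / se
    normal = NormalDist()
    if alternative == "<":      # critical region: z below the lower quantile
        critical = normal.inv_cdf(alpha)
        significant = z < critical
    elif alternative == ">":    # critical region: z above the upper quantile
        critical = normal.inv_cdf(1 - alpha)
        significant = z > critical
    elif alternative == "<>":   # critical region: |z| above the two-sided quantile
        critical = normal.inv_cdf(1 - alpha / 2)
        significant = abs(z) > critical
    else:
        raise ValueError("alternative must be '<', '>' or '<>'")
    return z, critical, significant


# Hypothetical data: 50 successes out of 200 versus 30 out of 200.
z, critical, significant = wald_z_test(50, 200, 30, 200)
print(f"z = {z:.3f}, critical value = {critical:.3f}, significant: {significant}")
```

The test statistic is significant exactly when it lands in the critical region, which is why the region has to be chosen to match the direction of the alternative.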

Alternatively, a p-value can be calculated. If the p-value falls below the specified level of significance, the null hypothesis is rejected in favour of the alternative hypothesis.
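For a Normal-approximation test statistic, the p-value rule can be illustrated as below. The helper name `p_value_from_z` and the "<", ">", "<>" codes are assumptions for this sketch; it shows the general idea only and is not the Agresti-Min calculation.

```python
from statistics import NormalDist


def p_value_from_z(z, alternative="<>"):
    """p-value of a standard Normal test statistic for the given alternative."""
    normal = NormalDist()
    if alternative == "<":
        return normal.cdf(z)                    # lower tail
    if alternative == ">":
        return 1 - normal.cdf(z)                # upper tail
    if alternative == "<>":
        return 2 * (1 - normal.cdf(abs(z)))     # both tails
    raise ValueError("alternative must be '<', '>' or '<>'")


alpha = 0.05                      # level of significance chosen by the analyst
p = p_value_from_z(2.52, "<>")    # hypothetical test statistic
print(f"p-value = {p:.4f}, reject null: {p < alpha}")
```

Reporting the p-value itself lets each reader apply their own level of significance instead of a single preset one.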

The BIS.Net Inferences app uses p-values for hypothesis testing, as these are more flexible than imposing a predefined level of significance on users. To put the p-value in perspective, a P Curve is provided for the analyst.

The p-value is calculated by adapting the Agresti-Min method for confidence intervals of two independently sampled proportions, understanding that hypothesis testing and confidence intervals are related.

As there is no closed-form equation for this p-value, it is computed numerically by computer algorithm.
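To give a feel for what such a numerical solution can look like, here is a rough sketch of an unconditional exact two-sided test in the spirit of the Agresti-Min approach. It is not the BIS.Net algorithm. It assumes a reference value of zero, uses the pooled score statistic, and maximises the tail probability over a simple grid of values for the unknown common proportion. All function names and the grid size are assumptions, and it is only practical for small samples.

```python
import math


def pooled_z(x1, n1, x2, n2):
    """Score statistic for H0: p1 = p2 (0 when the pooled variance is 0)."""
    pooled = (x1 + x2) / (n1 + n2)
    var = pooled * (1 - pooled) * (1 / n1 + 1 / n2)
    return 0.0 if var == 0 else (x1 / n1 - x2 / n2) / math.sqrt(var)


def binom_pmf(k, n, p):
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def unconditional_p_value(x1, n1, x2, n2, grid=200):
    """Two-sided p-value: largest tail probability over the nuisance grid."""
    t_obs = abs(pooled_z(x1, n1, x2, n2))
    # Every possible outcome at least as extreme as the one observed.
    extreme = [(a, b) for a in range(n1 + 1) for b in range(n2 + 1)
               if abs(pooled_z(a, n1, b, n2)) >= t_obs - 1e-12]
    worst = 0.0
    for i in range(1, grid):
        p = i / grid  # candidate common proportion under H0
        pmf1 = [binom_pmf(a, n1, p) for a in range(n1 + 1)]
        pmf2 = [binom_pmf(b, n2, p) for b in range(n2 + 1)]
        worst = max(worst, sum(pmf1[a] * pmf2[b] for a, b in extreme))
    return min(worst, 1.0)


# Hypothetical small samples: 7 successes out of 10 versus 2 out of 10.
print(f"p-value = {unconditional_p_value(7, 10, 2, 10):.4f}")
```

Because the answer is a maximum over all values of the unknown proportion, there is no formula to evaluate directly. A confidence interval is obtained from the same idea by repeating the test for many reference values and keeping those that are not rejected.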

Difference between two binomial probabilities and ppms in BIS.Net Analyst