Gauge R&R (Multiple Equipment, Multiple Appraisers, One Part)

The BIS.Net Team

A Gauge R&R (GRR) study is used to assess repeatability (by the same operator) and reproducibility (by different operators).

There are many applications where an organization has multiple sets of measuring devices (equipment). This opens the possibility of variation between the different devices, which can introduce an additional source of measurement error.

For this application, an Analysis of Variance (ANOVA) is required to extract as much information as possible. Notably, the ANOVA method, as opposed to the historical control chart methods, can detect interaction effects between appraisers and equipment.

Interaction effects may be strong if different models of measurement devices are used. However, caution should be exercised when using different models because the standard deviation may vary from model to model. If the standard deviations can be assumed to be constant, then this analysis is a great way to compare different measurement devices. If not, then perform the analysis only on different devices of the same model.

For this type of study, each appraiser must measure the same part at least twice with each measuring device. All appraisers must take the same number of measurements with each device, and every appraiser must use every device.

This is called a balanced crossed design.
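As a rough illustration of what "balanced" and "crossed" mean in practice, the following Python sketch checks a list of measurements before analysis. The record layout and the function name check_balanced_crossed are assumptions made for this example and are not part of BIS.Net MSA; the values are made up.

```python
from collections import Counter

# Hypothetical records for a single part: (appraiser, equipment, measurement).
records = [
    ("A", "E1", 10.1), ("A", "E1", 10.2),
    ("A", "E2", 11.0), ("A", "E2", 11.1),
    ("B", "E1", 10.3), ("B", "E1", 10.2),
    ("B", "E2", 11.2), ("B", "E2", 11.0),
]

def check_balanced_crossed(records):
    """Return the replicate count per cell, or raise ValueError."""
    counts = Counter((a, e) for a, e, _ in records)
    appraisers = sorted({a for a, _, _ in records})
    equipment = sorted({e for _, e, _ in records})
    # Crossed: every appraiser has used every device.
    missing = [(a, e) for a in appraisers for e in equipment
               if (a, e) not in counts]
    if missing:
        raise ValueError(f"not crossed, missing cells: {missing}")
    # Balanced: every appraiser-device cell holds the same number of readings.
    replicate_counts = set(counts.values())
    if len(replicate_counts) != 1:
        raise ValueError(f"not balanced, cell counts: {dict(counts)}")
    n = replicate_counts.pop()
    if n < 2:
        raise ValueError("each appraiser needs at least two readings per device")
    return n

replicates = check_balanced_crossed(records)
print(replicates)
```

Dropping any single record from the list makes the cell counts unequal, so the check raises a ValueError instead of returning a replicate count.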

An example of input is shown below.

Appraisers in this instance are coded as A, B, C. Equipment is coded as E1, E2, E3.

The BIS.Net APP provides the following output:

Tabular output

The Analysis of Variance table is included for completeness for statisticians. For non-statisticians, the last column conveniently lists whether an effect is significant or not. The Appraiser effect is commonly called Reproducibility and the Equipment Error is called Repeatability. The Equipment effect is the variation caused by differences between the measuring devices. The Appraiser-Equipment interaction is used to determine whether measurements depend on the combination of appraiser and equipment. This can be the case if different models of measuring device are used.

If a result is significant, then there is statistical evidence that the differences are not due to chance alone. In this instance there is significant variation between equipment and between appraisers. The interaction between appraiser and equipment is not significant and may thus be due to chance.

If there is no interaction effect the ANOVA pools the variation due to error and interaction for significance testing.
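The sketch below shows one way the sums of squares and F ratios behind such a table could be computed for a balanced crossed design with one part. It assumes the conventional random effects tests (main effects tested against the interaction, or against the pooled error when the interaction is dropped). The data values and the function names anova_table and f_ratios are hypothetical, and this is not a description of how BIS.Net MSA computes its output.

```python
from statistics import mean

# Hypothetical measurements of one part: data[appraiser][equipment] = replicates.
data = {
    "A": {"E1": [10.0, 10.2], "E2": [11.0, 11.2]},
    "B": {"E1": [10.4, 10.2], "E2": [11.4, 11.6]},
}

def anova_table(data):
    """Sums of squares and degrees of freedom for a balanced crossed design."""
    apps = list(data)
    eqs = list(data[apps[0]])
    a, b, n = len(apps), len(eqs), len(data[apps[0]][eqs[0]])
    grand = mean(x for i in apps for j in eqs for x in data[i][j])
    row = {i: mean(x for j in eqs for x in data[i][j]) for i in apps}
    col = {j: mean(x for i in apps for x in data[i][j]) for j in eqs}
    cell = {(i, j): mean(data[i][j]) for i in apps for j in eqs}
    ss = {
        "appraiser": b * n * sum((row[i] - grand) ** 2 for i in apps),
        "equipment": a * n * sum((col[j] - grand) ** 2 for j in eqs),
        "interaction": n * sum((cell[i, j] - row[i] - col[j] + grand) ** 2
                               for i in apps for j in eqs),
        "error": sum((x - cell[i, j]) ** 2
                     for i in apps for j in eqs for x in data[i][j]),
    }
    df = {"appraiser": a - 1, "equipment": b - 1,
          "interaction": (a - 1) * (b - 1), "error": a * b * (n - 1)}
    return ss, df

def f_ratios(ss, df, pool_interaction=False):
    """F ratios; with pooling, interaction and error form one error term."""
    ms = {k: ss[k] / df[k] for k in ss}
    if pool_interaction:
        pooled = (ss["interaction"] + ss["error"]) / (df["interaction"] + df["error"])
        return {"appraiser": ms["appraiser"] / pooled,
                "equipment": ms["equipment"] / pooled}
    return {"appraiser": ms["appraiser"] / ms["interaction"],
            "equipment": ms["equipment"] / ms["interaction"],
            "interaction": ms["interaction"] / ms["error"]}

ss, df = anova_table(data)
print(f_ratios(ss, df))
print(f_ratios(ss, df, pool_interaction=True))
```

Turning the F ratios into significance decisions additionally requires the F distribution with the matching degrees of freedom, for example scipy.stats.f.sf, which is left out here to keep the sketch free of third-party dependencies.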

The approximate confidence intervals are the intervals within which the reproducibility, the repeatability and the variation between equipment, each measured as a standard deviation, are likely to fall at the chosen level of significance. If the default of 0.05 has been used, then the confidence level is 100 - 0.05 × 100 = 95%.

The measurement system performance table shows how much each component of variation, measured as a standard deviation, contributes as a percentage of the total study variation. The last column uses variance instead of standard deviation. A zero value is shown for the interaction component if there is no statistically significant interaction effect.

Gauge R&R is the total variation due to appraisers (reproducibility), equipment error (repeatability) and the appraiser-equipment interaction. It does not include the variation between equipment. This follows a common approach used in industry. However, since variation between equipment is part of the measuring system, there is an argument for including this component.

Another consideration is the model used for the Analysis of Variance. Conventionally, a Random Effects model is assumed. One can argue for a Fixed Effects model. For the sake of convention, the Random Effects model is used by BIS.Net MSA.
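To make the relationship between these components concrete, here is a minimal sketch that derives variance components from ANOVA mean squares using the standard two-factor random effects estimators, then expresses each as a percentage of the total. The mean squares are invented, the function names are hypothetical, and truncating negative estimates to zero is an assumption of this example rather than a documented BIS.Net MSA rule.

```python
import math

def variance_components(ms, a, b, n):
    """ms: mean squares keyed by appraiser, equipment, interaction, error.
    a = number of appraisers, b = number of devices, n = replicates per cell."""
    repeatability = ms["error"]  # pure equipment error
    interaction = max(0.0, (ms["interaction"] - ms["error"]) / n)
    reproducibility = max(0.0, (ms["appraiser"] - ms["interaction"]) / (b * n))
    equipment = max(0.0, (ms["equipment"] - ms["interaction"]) / (a * n))
    # Gauge R&R as described above: excludes the between-equipment component.
    gauge_rr = repeatability + reproducibility + interaction
    return {
        "repeatability": repeatability,
        "reproducibility": reproducibility,
        "interaction": interaction,
        "gauge_rr": gauge_rr,
        "equipment": equipment,
        "total": gauge_rr + equipment,
    }

def performance_table(components):
    """Map each component to (% of study variation by Sd, % by variance)."""
    total = components["total"]
    return {name: (100.0 * math.sqrt(var / total), 100.0 * var / total)
            for name, var in components.items()}

# Hypothetical mean squares for 2 appraisers, 2 devices, 2 replicates.
ms = {"appraiser": 0.18, "equipment": 2.42, "interaction": 0.02, "error": 0.02}
components = variance_components(ms, a=2, b=2, n=2)
for name, (pct_sd, pct_var) in performance_table(components).items():
    print(f"{name:16s} {pct_sd:6.1f}% of Sd   {pct_var:6.1f}% of variance")
```

The variance percentages of the individual components add up to 100, whereas the standard deviation percentages do not, because standard deviations are not additive.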

Dart Board

The dart board is a visual tool which enables the analyst to see at a glance how reproducible the measurements are and how good or bad the repeatability is and the effect of the different equipment.

The circles are placed at 1 standard deviation (green), 2 standard deviations (yellow), 3 standard deviations (orange) and beyond (red) around zero. The standard deviation is the total standard deviation of all measurements, or the process variation if chosen. The dart board is designed to put the components of variation into perspective relative to the study variation, NOT process variation.

Each of the small circles corresponds to pure equipment measurement error after removing the effects of equipment differences and appraisers.

Using a sophisticated algorithm, the circles have been randomly placed around the centre just as if they were thrown darts. This is an effective way for visualizing repeatability. Each coloured point corresponds to a different appraiser.
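The actual placement algorithm is not described here. Purely as an illustration of the idea, the sketch below places each residual at a distance from the centre equal to its size in standard deviations and picks the direction at random. The function names, the seeded random generator and the residual values are all assumptions of this example.

```python
import math
import random

def dart_positions(residuals, total_sd, seed=0):
    """Place each residual at radius |residual| / total_sd, random direction."""
    rng = random.Random(seed)  # seeded so the picture is repeatable
    points = []
    for r in residuals:
        radius = abs(r) / total_sd
        angle = rng.uniform(0.0, 2.0 * math.pi)
        points.append((radius * math.cos(angle), radius * math.sin(angle)))
    return points

def ring_colour(x, y):
    """Colour of the ring a point lands in, using the 1/2/3 Sd circles."""
    distance = math.hypot(x, y)
    if distance <= 1.0:
        return "green"
    if distance <= 2.0:
        return "yellow"
    if distance <= 3.0:
        return "orange"
    return "red"

# Hypothetical residuals (pure measurement error) and total standard deviation.
residuals = [0.15, -0.22, 0.05, -0.31, 0.18, 2.4]
total_sd = 1.1
for x, y in dart_positions(residuals, total_sd):
    print(f"({x:+.2f}, {y:+.2f}) lands in {ring_colour(x, y)}")
```

With this kind of placement, a tight cloud near the centre corresponds to good repeatability, since only the distance from the centre carries information and the angle is arbitrary.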

The larger black coloured clusters reflect reproducibility. The larger red clusters reflect differences in Equipment.

The example indicates that repeatability is very small compared to the total variation, as seen by the tight cluster of small circles. The reproducibility is also very tight, as shown by the tightly clustered black circles. However, variation due to the different equipment is very large. These results confirm the Measurement System Performance Table above, which reports that the standard deviations for both reproducibility and repeatability are around 0.2, whereas the standard deviation due to differences in equipment is 1.04. The dart board is hence a very powerful visualization tool.

Appraiser Reproducibility Chart

The appraiser reproducibility chart is used to identify appraisers that differ significantly from expectation. All appraisers should fall inside the two red limits. Those that fall outside the limits may need to be retrained.

The above instance shows that there is a reproducibility problem as two appraisers fall outside the red limits. This is confirmed by the Analysis of Variance. The larger white scatter for each appraiser in this instance is due to variation between the equipment.

Equipment Reproducibility Chart

The equipment reproducibility chart is used to identify equipment that differs significantly from expectation. All equipment should fall inside the two red limits.

The above instance shows that there is a major equipment reproducibility problem: all but two measuring devices fall outside the limits. The much smaller scatter for each device in this instance is due to variation between the appraisers and pure measurement error. This chart thus also confirms that the difference between equipment is the major source of error.

Variation between equipment means that there are bias issues, and a bias study should be performed with BIS.Net MSA, either device by device or using the Multiple Equipment Bias Study.

Probability Plot

The probability plot is used to establish normality of the measurement error (residuals). BIS.Net MSA uses the Anderson-Darling statistic and will advise if there is evidence of non-normality. The ANOVA method does assume normality. Fortunately, measurement error tends to follow a normal distribution.
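For readers who want to see what the Anderson-Darling statistic involves, the following sketch computes it for a normal distribution whose mean and standard deviation are estimated from the data. It applies the usual small-sample adjustment and compares the result with 0.752, the commonly tabulated 5% critical value for that case. The residual values and the function name are hypothetical, and the exact procedure used by BIS.Net MSA may differ.

```python
import math
from statistics import NormalDist, mean, stdev

def anderson_darling_normal(values):
    """Return (A2, adjusted A2) for normality with estimated mean and Sd."""
    n = len(values)
    fitted = NormalDist(mean(values), stdev(values))
    ordered = sorted(values)
    total = 0.0
    for i in range(1, n + 1):
        lower = fitted.cdf(ordered[i - 1])   # F at the i-th smallest value
        upper = fitted.cdf(ordered[n - i])   # F at the i-th largest value
        total += (2 * i - 1) * (math.log(lower) + math.log(1.0 - upper))
    a2 = -n - total / n
    # Small-sample adjustment used when both parameters are estimated.
    adjusted = a2 * (1.0 + 0.75 / n + 2.25 / n ** 2)
    return a2, adjusted

# Hypothetical residuals from a measurement study.
residuals = [-0.21, 0.05, 0.12, -0.08, 0.17, -0.02, 0.09, -0.15, 0.01, 0.03]
a2, a2_adjusted = anderson_darling_normal(residuals)
evidence_of_non_normality = a2_adjusted > 0.752  # 5% level, assumed here
print(round(a2, 3), round(a2_adjusted, 3), evidence_of_non_normality)
```

A larger adjusted statistic means a poorer fit to the normal distribution, which on a probability plot shows up as points bending away from the straight line.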

Interaction Effects

The Appraiser-Equipment interaction chart is used to visually detect if there are interactions. An appraiser-equipment interaction means that the measurements obtained by each appraiser depend on the equipment used. The above example shows that the measurements for all 3 appraisers follow the same pattern over all devices, hence there is no interaction, as confirmed by the ANOVA.
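The "same pattern" idea can also be checked numerically. In the sketch below, the interaction term for each appraiser-equipment cell is the cell mean minus the appraiser mean and the equipment mean, plus the grand mean. When the appraisers' profiles are parallel, every term is zero. The cell means and the function name are made up for this illustration.

```python
from statistics import mean

# Hypothetical cell means: cell_means[appraiser][equipment].
cell_means = {
    "A": {"E1": 10.1, "E2": 11.1, "E3": 9.6},
    "B": {"E1": 10.3, "E2": 11.3, "E3": 9.8},
    "C": {"E1": 10.0, "E2": 11.0, "E3": 9.5},
}

def interaction_terms(cell_means):
    """Cell mean - appraiser mean - equipment mean + grand mean, per cell."""
    apps = list(cell_means)
    eqs = list(cell_means[apps[0]])
    grand = mean(cell_means[i][j] for i in apps for j in eqs)
    row = {i: mean(cell_means[i][j] for j in eqs) for i in apps}
    col = {j: mean(cell_means[i][j] for i in apps) for j in eqs}
    return {(i, j): cell_means[i][j] - row[i] - col[j] + grand
            for i in apps for j in eqs}

terms = interaction_terms(cell_means)
largest = max(abs(t) for t in terms.values())
print(f"largest interaction term: {largest:.4f}")
```

Here each appraiser differs from the others by a constant offset on every device, so the largest term is zero apart from floating-point rounding. Terms that are clearly non-zero relative to the measurement error would correspond to crossing or diverging lines on the interaction chart.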
