“Tell Me About Statistics!” Vol. 29: When to use the multiple comparison method?

on August 10, 2023

In this article, we introduce methods for testing which groups differ significantly after an analysis of variance (ANOVA) has been performed.

Multiple comparison methods can be roughly divided into the following three types:

[Parametric multiple comparison method]
This method examines the significance of differences in group means when the data are measured on a numerical (interval) scale and the normality and homoscedasticity of the population can be examined.

[Nonparametric multiple comparison method]
This method examines the significance of differences between groups when the data are measured on an ordinal scale, such as ranks or a 5-point scale, and the distribution of the population cannot be specified.

[Multiple comparison method for population ratio]
This method examines the significance of differences in group proportions when the data are categorical, taking values of 1 and 0, and the distribution of the population cannot be specified.

There are many such multiple comparison methods. Therefore, we explain only the representative parametric multiple comparison methods and which one should be used depending on the characteristics of the data to be compared; we omit the detailed calculations for each method.

How to decide the analysis method in the parametric multiple comparison method

The choice of analysis method depends on the content of the following seven items.

1) Which type of comparison is being made? Pairwise comparison/comparison with control/contrast
2) Is the amount of data in the groups different or the same?
3) Is the population normally distributed? Normality known/normality unknown
4) Is the variance of each group equal? Homogeneity of variance known/homogeneity of variance unknown
5) Is an ANOVA conducted beforehand?
6) Is the p-value calculated?
7) Distribution of the test statistic: t-distribution/F-distribution/q-distribution/unique distribution
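
Item 1 also determines how many tests are run at once, which is what the multiple comparison correction has to account for. The short Python sketch below counts the comparisons for the pairwise case and for the comparison-with-control case. The function names and the group labels are illustrative assumptions, not part of any particular package.

```python
from itertools import combinations


def pairwise_comparisons(groups):
    """All unordered pairs of groups (pairwise comparison)."""
    return list(combinations(groups, 2))


def control_comparisons(control, treatments):
    """Each treatment group paired with the single control group."""
    return [(control, t) for t in treatments]


# Illustrative group labels (assumed for this example).
groups = ["placebo", "drug X", "drug Y", "drug Z"]

pairs = pairwise_comparisons(groups)
versus_control = control_comparisons(groups[0], groups[1:])

print(len(pairs))           # k(k-1)/2 comparisons for k groups
print(len(versus_control))  # k-1 comparisons for k groups
```

With four groups, the pairwise case involves six comparisons, whereas the comparison with control involves only three.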

[Figure 1] Main types and characteristics of analytical methods

Overview of seven analysis methods

1. Bonferroni method
This method was introduced in Vol. 26. It corrects the significance level using a simple formula: the significance level is divided by the number of comparisons. Although it can be used even if the normality and homoscedasticity of the population are unknown, it is a slightly stricter test than other multiple comparison methods, making it difficult to obtain a significant difference. In addition, its power decreases as the number of groups increases (five or more).
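
As a minimal sketch of why the test becomes stricter as groups are added, the Python snippet below computes the Bonferroni-corrected significance level for all pairwise comparisons. It assumes an overall significance level of 0.05 and that every pair of groups is compared; the function name is hypothetical.

```python
def bonferroni_level(alpha, n_groups):
    """Corrected significance level when all pairs of groups are compared."""
    n_comparisons = n_groups * (n_groups - 1) // 2
    return alpha / n_comparisons, n_comparisons


for k in (3, 5, 8):
    level, m = bonferroni_level(0.05, k)
    print(f"{k} groups: {m} comparisons, corrected level = {level:.5f}")
```

The corrected level shrinks quickly: each p-value must fall below 0.05 divided by the number of comparisons, which is 10 comparisons for five groups and 28 for eight.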

2. Holm’s method
This is a modification of the Bonferroni method. The test statistic and p-value are the same as those of the Bonferroni method, but the significance level with which each p-value is compared is different. Bonferroni adopts the same significance level (0.05/number of combinations) for all comparisons between groups, whereas Holm’s significance level varies according to the size of the p-value. The advantage is that a significant difference is easier to obtain; the drawback is that each combination’s p-value must be compared with a different significance level to determine significance, which is cumbersome.
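
The following Python sketch illustrates Holm’s step-down rule next to the Bonferroni rule. The p-values are made up for illustration, and the function names are hypothetical. The p-values are sorted from smallest to largest; the smallest is compared with 0.05/m, the next with 0.05/(m-1), and so on, stopping at the first comparison that is not significant.

```python
def holm(p_values, alpha=0.05):
    """Return True/False (significant or not) for each p-value, in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices, smallest p first
    significant = [False] * m
    for rank, i in enumerate(order):
        threshold = alpha / (m - rank)  # alpha/m, alpha/(m-1), ..., alpha/1
        if p_values[i] > threshold:
            break  # this and all larger p-values are not significant
        significant[i] = True
    return significant


def bonferroni(p_values, alpha=0.05):
    """Same threshold alpha/m for every comparison."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


# Made-up p-values for three pairwise comparisons (illustration only).
p = [0.010, 0.020, 0.030]
print(bonferroni(p))
print(holm(p))
```

With these p-values, Bonferroni declares only the first comparison significant, whereas Holm declares all three significant, because the thresholds 0.05/3, 0.05/2, and 0.05/1 become progressively less strict.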

3. Tukey’s method
This is the most common multiple comparison method, and when the number of groups is large, it is easier to obtain a significant difference than with the Bonferroni method. On the other hand, this method also has certain limitations, such as “the amount of data in each group must be the same,” “when the number of groups is small, it is difficult to obtain a significant difference compared to Bonferroni,” “the population must have a normal distribution,” “the population variances of each group must be equal,” and “p-values are not output.”

4. Tukey-Kramer method
This method tests for significant differences in population means among three or more groups. When the number of groups is large, it is easier to obtain a significant difference than with the Bonferroni method, and it can be applied when the amount of data differs between groups. On the other hand, it also has limitations, such as “when the number of groups is small, it is difficult to obtain a significant difference compared to Bonferroni,” “the population must have a normal distribution,” and “p-values are not output.”
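
Although this article omits the detailed calculations, a rough Python sketch of the Tukey-Kramer test statistic may help show how unequal group sizes are handled. For groups i and j, the statistic is q = |mean_i - mean_j| / sqrt((MSE/2)(1/n_i + 1/n_j)), where MSE is the within-group (error) mean square from the ANOVA. The data and the function name below are assumptions for illustration; the final step of comparing q with the critical value of the q-distribution (studentized range distribution) is not shown.

```python
from itertools import combinations
from math import sqrt


def tukey_kramer_q(samples):
    """Return {(i, j): q} for every pair of groups in `samples` (a list of lists)."""
    k = len(samples)
    sizes = [len(s) for s in samples]
    means = [sum(s) / len(s) for s in samples]
    # Within-group (error) mean square with N - k degrees of freedom.
    ss_within = sum((x - m) ** 2 for s, m in zip(samples, means) for x in s)
    mse = ss_within / (sum(sizes) - k)
    q = {}
    for i, j in combinations(range(k), 2):
        # The 1/n_i + 1/n_j term is what allows different group sizes.
        se = sqrt(mse / 2 * (1 / sizes[i] + 1 / sizes[j]))
        q[(i, j)] = abs(means[i] - means[j]) / se
    return q


# Made-up data: three groups with different amounts of data.
data = [[4, 5, 6], [7, 8, 9, 10], [1, 2, 3]]
for pair, value in tukey_kramer_q(data).items():
    print(pair, round(value, 3))
```

When all groups have the same size n, the standard error reduces to sqrt(MSE/n), which is the statistic used in Tukey’s method.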

5. Dunnett’s method
Instead of evaluating all pairwise combinations, as Tukey’s method does, Dunnett’s method compares a control with each of the remaining groups, such as a placebo with drug X, drug Y, and drug Z.

Dunnett’s method is a multiple comparison method that simultaneously tests only the comparisons between one control group and two or more treatment groups with respect to population means. It can be used when the aim is to determine whether the population mean of each treatment group is not only “different” from but also “smaller” or “larger” than the population mean of the control group, so one can choose between two-sided and one-sided tests. It is easy to obtain a significant difference, and the method can be applied even if the equality of the group variances is unknown. On the other hand, it has the disadvantage that “p-values are not output.”

6. Williams’ method
In a comparison of four groups, i.e., a placebo group and Drug A groups at 100 mg, 200 mg, and 300 mg, Williams’ method may be more appropriate than Dunnett’s method. Dunnett’s method is also acceptable, but when the direction of the effect can be estimated to some extent (when monotonicity can be assumed), such as “effect of placebo < effect of 100 mg < effect of 200 mg < effect of 300 mg,” Williams’ method can be used.

Dunnett’s method does not reflect this directional information, which can lead to contradictions, such as “placebo vs. 200 mg” being significant but “placebo vs. 300 mg” not being significant. Williams’ method can be used when the normality of the population is known, even if the homoscedasticity of the population is unknown. However, it is a one-sided test and does not output a p-value.

7. Scheffé’s method
Scheffé’s method tests for significant differences in population means among three or more groups. In addition to pairwise comparisons, it can compare the means of two sets formed by dividing several groups into two (contrasts). It is applicable when the population is normally distributed and the population variance of each group is equal, even when the amount of data differs between groups. Although it can be applied to the comparison of such combined means, it is difficult to obtain a significant difference.

Various other multiple-comparison methods exist. Make sure you understand the process of first “assessing whether there is a difference” with ANOVA and then “assessing where there is a difference” with one of the multiple comparison methods.
