Summary of statistical tests and effect sizes

Here is a summary table of all the statistical tests currently supported across various functions:

Functions	Type	Test	Effect size	95% CI available?
`expr_anova_parametric` (2 groups)	Parametric	Student’s and Welch’s t-test	Cohen’s d, Hedge’s g	\(\checkmark\)
`expr_anova_parametric` (> 2 groups)	Parametric	Fisher’s and Welch’s one-way ANOVA	\(\eta^2, \eta^2_p, \omega^2, \omega^2_p\)	\(\checkmark\)
`expr_anova_nonparametric` (2 groups)	Non-parametric	Mann-Whitney U-test	r	\(\checkmark\)
`expr_anova_nonparametric` (> 2 groups)	Non-parametric	Kruskal-Wallis Rank Sum Test	\(\epsilon^2\)	\(\checkmark\)
`expr_anova_robust` (2 groups)	Robust	Yuen’s test for trimmed means	\(\xi\)	\(\checkmark\)
`expr_anova_robust` (> 2 groups)	Robust	Heteroscedastic one-way ANOVA for trimmed means	\(\xi\)	\(\checkmark\)
`expr_anova_parametric` (2 groups)	Parametric	Student’s t-test	Cohen’s d, Hedge’s g	\(\checkmark\)
`expr_anova_parametric` (> 2 groups)	Parametric	Fisher’s one-way repeated measures ANOVA	\(\eta^2_p, \omega^2\)	\(\checkmark\)
`expr_anova_nonparametric` (2 groups)	Non-parametric	Wilcoxon signed-rank test	r	\(\checkmark\)
`expr_anova_nonparametric` (> 2 groups)	Non-parametric	Friedman rank sum test	\(W_{Kendall}\)	\(\checkmark\)
`expr_anova_robust` (2 groups)	Robust	Yuen’s test on trimmed means for dependent samples	\(\xi\)	\(\checkmark\)
`expr_anova_robust` (> 2 groups)	Robust	Heteroscedastic one-way repeated measures ANOVA for trimmed means	\(\times\)	\(\times\)
`expr_contingency_tab` (unpaired)	Parametric	\(\text{Pearson's}~ \chi^2 ~\text{test}\)	Cramér’s V	\(\checkmark\)
`expr_contingency_tab` (paired)	Parametric	McNemar’s test	Cohen’s g	\(\checkmark\)
`expr_contingency_tab`	Parametric	One-sample proportion test	Cramér’s V	\(\checkmark\)
`expr_corr_test`	Parametric	Pearson’s r	r	\(\checkmark\)
`expr_corr_test`	Non-parametric	\(\text{Spearman's}~ \rho\)	\(\rho\)	\(\checkmark\)
`expr_corr_test`	Robust	Percentage bend correlation	r	\(\checkmark\)
`expr_t_onesample`	Parametric	One-sample t-test	Cohen’s d, Hedge’s g	\(\checkmark\)
`expr_t_onesample`	Non-parametric	One-sample Wilcoxon signed rank test	r	\(\checkmark\)
`expr_t_onesample`	Robust	One-sample percentile bootstrap	robust estimator	\(\checkmark\)
`expr_meta_parametric`	Parametric	Meta-analysis via random-effects models	\(\beta\)	\(\checkmark\)
`expr_meta_robust`	Robust	Meta-analysis via robust random-effects models	\(\beta\)	\(\checkmark\)

Note that the following recommendations on how to interpret the effect sizes are just suggestions and there is nothing universal about them. The interpretation of any effect size measures is always going to be relative to the discipline, the specific data, and the aims of the analyst. Here the guidelines are given for small, medium, and large effects and references should shed more information on the baseline discipline with respect to which these guidelines were recommended. This is important because what might be considered a small effect in psychology might be large for some other field like public health.

(Additionally, you will also see which function is used internally to compute the effect size and their confidence intervals.)

One-sample tests

parametric

Test: One-sample t-test
Effect size: Cohen’s d, Hedge’s g
Function: effectsize::cohens_d

Effect size	Small	Medium	Large	Range
Cohen’s d	0 – < 0.20	0.20 – < 0.50	≥ 0.80	[-Inf,Inf]
Hedge’s g	0 – < 0.20	0.20 – < 0.50	≥ 0.80	[-Inf,Inf]

non-parametric

Test: One-sample Wilcoxon Signed-rank Test
Effect size: \(r\) ( = \(Z/\sqrt(N_{obs})\))
Function: rcompanion::wilcoxonOneSampleR

Effect size	Small	Medium	Large	Range
r	0.10 – < 0.30	0.30 – < 0.50	≥ 0.50	[0,1]

robust

Test: One-sample percentile bootstrap test
Effect size: robust location measure
Function: WRS2::onesampb

Two-sample tests

within-subjects design

parametric

Test: Student’s dependent samples t-test
Effect size: Cohen’s d, Hedge’s g
Function: effectsize::cohens_d

Effect size	Small	Medium	Large	Range
Cohen’s d	0.20	0.50	0.80	[0,1]
Hedge’s g	0.20	0.50	0.80	[0,1]

non-parametric

Test: Wilcoxon signed-rank test
Effect size: \(r\) ( = \(Z/\sqrt(N_{pairs})\))
Function: rcompanion::wilcoxonPairedR

Effect size	Small	Medium	Large	Range
r	0.10 – < 0.30	0.30 – < 0.50	≥ 0.50	[0,1]

robust

Test: Yuen’s dependent sample trimmed means t-test
Effect size: robust (trimmed-Winsorized) standardized difference similar to Cohen’s d
Function: WRS2::dep.effect

Effect size	Small	Medium	Large	Range
\(\delta_{R}\)	0.10 – < 0.30	0.30 – < 0.50	≥ 0.50	[0,1]

Reference: - https://CRAN.R-project.org/package=WRS2/vignettes/WRS2.pdf - https://journals.sagepub.com/doi/10.1177/0013164406288161

between-subjects design

parametric

Test: Student’s and Welch’s independent samples t-test
Effect size: Cohen’s d, Hedge’s g
Function: effectsize::cohens_d

Effect size	Small	Medium	Large	Range
Cohen’s d	0.20	0.50	0.80	[-Inf,Inf]
Hedge’s g	0.20	0.50	0.80	[-Inf,Inf]

non-parametric

Test: Two-sample Mann–Whitney U Test
Effect size: \(r\) ( = \(Z/\sqrt(N_{obs})\))
Function: rcompanion::wilcoxonR

Effect size	Small	Medium	Large	Range
r	0.10 – < 0.30	0.30 – < 0.50	≥ 0.50	[0,1]

Reference: https://rcompanion.org/handbook/F_04.html

robust

Test: Yuen’s independent sample trimmed means t-test
Effect size: Explanatory measure of effect size (\(\xi\))
Function: WRS2::yuen.effect.ci

Effect size	Small	Medium	Large	Range
\(\xi\)	0.10 – < 0.30	0.30 – < 0.50	≥ 0.50	[0,1]

Reference: https://CRAN.R-project.org/package=WRS2/vignettes/WRS2.pdf

One-way ANOVAs

within-subjects design

parametric

Test: Fisher’s repeated measures one-way ANOVA
Effect size: \(\eta^2_p\), \(\omega^2\)
Function: effectsize::eta_squared and effectsize::omega_squared

Effect size	Small	Medium	Large	Range
\(\omega^2\)	0.01 – < 0.06	0.06 – < 0.14	≥ 0.14	[0,1]
\(\eta^2_p\)	0.01 – < 0.06	0.06 – < 0.14	≥ 0.14	[0,1]

Reference:

non-parametric

Test: Friedman’s rank sum test
Effect size: Kendall’s W
Function: rcompanion::kendallW

In the following table, k is the number of treatments, groups, or things being rated.

k	Small	Medium	Large	Range
k = 3	< 0.10	0.10 – < 0.30	≥ 0.30	[0,1]
k = 5	< 0.10	0.10 – < 0.25	≥ 0.25	[0,1]
k = 7	< 0.10	0.10 – < 0.20	≥ 0.20	[0,1]
k = 9	< 0.10	0.10 – < 0.20	≥ 0.20	[0,1]

robust

Test: Heteroscedastic one-way repeated measures ANOVA for trimmed means
Effect size: Not available

between-subjects design

parametric

Test: Fisher’s or Welch’s one-way ANOVA
Effect size: \(\eta^2\), \(\eta^2_p\), \(\omega^2\), \(\omega^2_p\)
Function: effectsize::eta_squared and effectsize::omega_squared

Effect size	Small	Medium	Large	Range
\(\eta^2\)	0.01 – < 0.06	0.06 – < 0.14	≥ 0.14	[0,1]
\(\omega^2\)	0.01 – < 0.06	0.06 – < 0.14	≥ 0.14	[0,1]
\(\eta^2_p\)	0.01 – < 0.06	0.06 – < 0.14	≥ 0.14	[0,1]
\(\omega^2_p\)	0.01 – < 0.06	0.06 – < 0.14	≥ 0.14	[0,1]

Reference:

non-parametric

Test: Kruskal–Wallis test
Effect size: \(\epsilon^2\)
Function: rcompanion::epsilonSquared

Effect size	Small	Medium	Large	Range
\(\epsilon^2\)	0.01 – < 0.08	0.08 – < 0.26	≥ 0.26	[0,1]

Reference: https://rcompanion.org/handbook/F_08.html

robust

Test: Heteroscedastic one-way ANOVA for trimmed means
Effect size: Explanatory measure of effect size (\(\xi\))
Function: WRS2::t1way

Effect size	Small	Medium	Large	Range
\(\xi\)	0.10 – < 0.30	0.30 – < 0.50	≥ 0.50	[0,1]

Reference: https://CRAN.R-project.org/package=WRS2/vignettes/WRS2.pdf

Contingency table analyses

association test - unpaired

Test: Pearson’s \(\chi^2\)-squared test
Effect size: Cramér’s V
Function: rcompanion::cramerV

In the following table, k is the minimum number of categories in either rows or columns.

k	Small	Medium	Large	Range
k = 2	0.10 – < 0.30	0.30 – < 0.50	≥ 0.50	[0,1]
k = 3	0.07 – < 0.20	0.20 – < 0.35	≥ 0.35	[0,1]
k = 4	0.06 – < 0.17	0.17 – < 0.29	≥ 0.29	[0,1]

Reference: https://rcompanion.org/handbook/H_10.html

association test - paired

Test: McNemar’s test
Effect size: Cohen’s g
Function: rcompanion::cohenG

Effect size	Small	Medium	Large	Range
Cohen’s g	0.05 – < 0.15	0.15 – < 0.25	≥ 0.25	[0,1]

Reference: https://rcompanion.org/handbook/H_05.html

goodness-of-fit test

Test: Pearson’s \(\chi^2\)-squared goodness-of-fit test
Effect size: Cramér’s V
Function: rcompanion::cramerVFit

In the following table, k is the number of categories.

k	Small	Medium	Large	Range
k = 2	0.100 – < 0.300	0.300 – < 0.500	≥ 0.500	[0,1]
k = 3	0.071 – < 0.212	0.212 – < 0.354	≥ 0.354	[0,1]
k = 4	0.058 – < 0.173	0.173 – < 0.289	≥ 0.289	[0,1]
k = 5	0.050 – < 0.150	0.150 – < 0.250	≥ 0.250	[0,1]
k = 6	0.045 – < 0.134	0.134 – < 0.224	≥ 0.224	[0,1]
k = 7	0.043 – < 0.130	0.130 – < 0.217	≥ 0.217	[0,1]
k = 8	0.042 – < 0.127	0.127 – < 0.212	≥ 0.212	[0,1]
k = 9	0.042 – < 0.125	0.125 – < 0.209	≥ 0.209	[0,1]
k = 10	0.041 – < 0.124	0.124 – < 0.207	≥ 0.207	[0,1]

Reference: https://rcompanion.org/handbook/H_03.html

Correlation analyses

parametric

Test: Pearson product-moment correlation coefficient
Effect size: Pearson’s correlation coefficient (r)
Function: correlation::correlation

Effect size	Small	Medium	Large	Range
Pearson’s r	0.10 – < 0.30	0.30 – < 0.50	≥ 0.50	[-1,1]

non-parametric

Test: Spearman’s rank correlation coefficient
Effect size: Spearman’s rank correlation coefficient (\(\rho\))
Function: correlation::correlation

Effect size	Small	Medium	Large	Range
Spearman’s \(\rho\)	0.10 – < 0.30	0.30 – < 0.50	≥ 0.50	[-1,1]

robust

Test: Percentage bend correlation coefficient
Effect size: Percentage bend correlation coefficient (\(\rho_{pb}\))
Function: correlation::correlation

Effect size	Small	Medium	Large	Range
\(\rho_{pb}\)	0.10 – < 0.30	0.30 – < 0.50	≥ 0.50	[-1,1]

Meta-analysis

parametric

Test: Parametric random-effects meta-analysis
Effect size: Regression estimate (\(\beta\))
Function: metafor::rma

robust

Test: Random-effects meta-analysis using a mixture of normals for the random effect
Effect size: Regression estimate (\(\beta\))
Function: metaplus::metaplus

Bayesian

Test: Bayesian random-effects meta-analysis
Effect size: Regression estimate (\(\beta\))
Function: metaBMA::meta_random

Suggestions

If you find any bugs or have any suggestions/remarks, please file an issue on GitHub: https://github.com/IndrajeetPatil/ggstatsplot/issues

Session Information

For details, see- https://indrajeetpatil.github.io/ggstatsplot/articles/web_only/session_info.html

Test and effect size details

Indrajeet Patil

2020-06-19

Summary of statistical tests and effect sizes

One-sample tests

parametric

non-parametric

robust

Two-sample tests

within-subjects design

parametric

non-parametric

robust

between-subjects design

parametric

non-parametric

robust

One-way ANOVAs

within-subjects design

parametric

non-parametric

robust

between-subjects design

parametric

non-parametric

robust

Contingency table analyses

association test - unpaired

association test - paired

goodness-of-fit test

Correlation analyses

parametric

non-parametric

robust

Meta-analysis

parametric

robust

Bayesian

Suggestions

Session Information