DOI of the published article 10.7717/peerj.10314
Does One Effect Size Fit All? The Case Against Default Effect Sizes for Sport and Exercise Science
DOI:
https://doi.org/10.31236/osf.io/tfx95
Keywords:
applied statistics, exercise, kinesiology, physiology, sport, statistics
Abstract
Recent discussions in the sport and exercise science community have focused on the appropriate use and reporting of effect sizes. Sport and exercise scientists often analyze repeated-measures data, from which mean differences are reported. To aid the interpretation of these data, standardized mean differences (SMDs) are commonly reported as a description of effect size. In this manuscript, we hope to alleviate some of the confusion surrounding their use. First, we provide a philosophical framework for conceptualizing SMDs by dichotomizing them into two groups: magnitude-based and signal-to-noise-based SMDs. Second, we describe the statistical properties of SMDs and their implications. Finally, we provide high-level recommendations for how sport and exercise scientists can thoughtfully report raw effect sizes, SMDs, or other effect sizes for their own studies. This conceptual framework provides sport and exercise scientists with the background necessary to make and justify their choice of an SMD. The code to reproduce all analyses and figures within the manuscript can be found at the following link: https://www.doi.org/10.17605/OSF.IO/FC5XW.
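To make the two groups of SMDs concrete, here is a minimal Python sketch for pre/post repeated-measures data. It rests on an assumption the abstract does not spell out: that a magnitude-based SMD divides the mean change by the pre-test standard deviation, and a signal-to-noise-based SMD divides it by the standard deviation of the change scores. The function name `paired_smds`, the dictionary keys, and the example data are hypothetical and are not taken from the authors' code.

```python
import statistics


def paired_smds(pre, post):
    """Return the raw mean change and two SMDs for paired pre/post scores.

    Assumption (not stated in the abstract): the magnitude-based SMD uses
    the pre-test SD as the standardizer, and the signal-to-noise-based SMD
    uses the SD of the change scores.
    """
    if len(pre) != len(post) or len(pre) < 2:
        raise ValueError("pre and post must be paired and have at least 2 values")

    # Per-participant change scores (post minus pre).
    changes = [b - a for a, b in zip(pre, post)]
    mean_change = statistics.mean(changes)

    # Sample standard deviations (n - 1 denominator).
    sd_pre = statistics.stdev(pre)
    sd_change = statistics.stdev(changes)
    if sd_pre == 0 or sd_change == 0:
        raise ValueError("a standardizer of zero leaves the SMD undefined")

    return {
        "mean_change": mean_change,
        "magnitude_based": mean_change / sd_pre,
        "signal_to_noise": mean_change / sd_change,
    }


if __name__ == "__main__":
    # Hypothetical data: every participant improves by a similar amount.
    pre = [100, 110, 120, 130, 140]
    post = [104, 116, 124, 136, 145]
    for name, value in paired_smds(pre, post).items():
        print(f"{name}: {value:.3f}")
```

With these made-up scores the raw change is the same in both calculations, yet the two SMDs differ by more than an order of magnitude, because the participants differ widely from one another at baseline but respond very consistently. This is the sense in which the choice of standardizer needs to be justified rather than defaulted.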
License
Copyright (c) 2020 Aaron Caldwell, Andrew Vigotsky
This work is licensed under a Creative Commons Attribution 4.0 International License.