Comparative effect size distributions in strength and conditioning and implications for future research
Keywords: Sample size, Power, S&C, Applied statistics, Bayesian
Background Controlled experimental designs are frequently used in strength and conditioning (S&C) to determine which interventions are most effective. The purpose of this large meta-analysis was to quantify the distribution of comparative effect sizes in S&C to determine likely magnitudes and inform future research regarding sample sizes and inference methods.
Methods Baseline and follow-up data were extracted from a large database of studies comparing at least two active S&C interventions. Pairwise comparative standardised mean difference effect sizes were calculated and categorised according to the outcome domain measured. Hierarchical Bayesian meta-analyses and meta-regressions were used to model overall comparative effect size distributions and correlations, respectively. The direction of each comparative effect size within a study was assigned arbitrarily (e.g. A vs. B, or B vs. A), with bootstrapping performed to ensure effect size distributions were symmetric and centred on zero. The middle 25, 50, and 75% of the distributions were used to define small, medium, and large thresholds, respectively.
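The threshold definition above can be illustrated with a minimal sketch. It is not the hierarchical Bayesian model used in the study: it works on raw effect sizes, uses simulated data (the normal distribution and its standard deviation of 0.4 are assumptions for illustration only), and lets repeated random sign assignment stand in for the bootstrapping step. The function names `quantile` and `middle_thresholds` are hypothetical.

```python
import random


def quantile(sorted_values, p):
    """Linearly interpolated quantile of an already sorted list (0 <= p <= 1)."""
    position = p * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    weight = position - lower
    return sorted_values[lower] * (1 - weight) + sorted_values[upper] * weight


def middle_thresholds(effect_sizes, n_resamples=100, seed=1):
    """Half-widths of the middle 25, 50 and 75% of a sign-symmetrised distribution.

    Each resample assigns the direction of every comparison at random
    (A vs. B or B vs. A), so the distribution is centred on zero on average.
    """
    rng = random.Random(seed)
    levels = {"small": 0.25, "medium": 0.50, "large": 0.75}
    totals = dict.fromkeys(levels, 0.0)
    for _ in range(n_resamples):
        flipped = sorted(d * rng.choice((-1, 1)) for d in effect_sizes)
        for label, coverage in levels.items():
            low = quantile(flipped, 0.5 - coverage / 2)
            high = quantile(flipped, 0.5 + coverage / 2)
            totals[label] += (high - low) / 2  # half-width of the central interval
    return {label: total / n_resamples for label, total in totals.items()}


# Simulated effect sizes (assumption: not the study data).
data_rng = random.Random(42)
simulated = [data_rng.gauss(0.0, 0.4) for _ in range(5000)]
print(middle_thresholds(simulated))
```

Because the distribution is symmetric about zero, each threshold is simply the distance from zero to the edge of the corresponding central interval.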
Results A total of 3874 pairwise effect sizes were obtained from 417 studies comprising 958 active interventions. Threshold values were estimated as: small = 0.14 [95%CrI: 0.12 to 0.15]; medium = 0.29 [95%CrI: 0.28 to 0.30]; and large = 0.51 [95%CrI: 0.50 to 0.53]. No differences were identified in the threshold values across different outcome domains. Correlations ranged widely (0.06 ≤ r ≤ 0.36), but were larger when outcomes within the same outcome domain were considered.
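To give a sense of scale for these thresholds, the sketch below applies the standard normal-approximation formula for a two-group comparison, n per group = 2(z_{1-α/2} + z_{1-β})² / d². The two-sided α of 0.05 and power of 0.80 are conventional assumptions, not values taken from the study. The optional `pre_post_corr` argument applies the common (1 − ρ²) reduction for an ANCOVA that adjusts for baseline; it is included only as an illustration, and the function name is hypothetical.

```python
import math
from statistics import NormalDist


def n_per_group(d, alpha=0.05, power=0.80, pre_post_corr=0.0):
    """Approximate participants per group to detect a standardised difference d.

    Normal approximation for a two-sided, two-group comparison. A non-zero
    pre_post_corr scales the result by (1 - rho^2), the usual approximation
    for ANCOVA with a baseline covariate.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    n = 2 * (z_alpha + z_beta) ** 2 / d ** 2
    return math.ceil(n * (1 - pre_post_corr ** 2))


for label, d in (("small", 0.14), ("medium", 0.29), ("large", 0.51)):
    print(label, d, n_per_group(d))
# small 0.14 801
# medium 0.29 187
# large 0.51 61
```

Under these assumptions, even the large threshold implies roughly 60 participants per group when baseline data are not used in the analysis.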
Conclusions The finding that comparative effect sizes in S&C are typically below 0.30 and can be moderately correlated has important implications for future research. Sample sizes should be substantively increased to appropriately power controlled trials with pre-post intervention data. Alpha adjustment approaches used to control for multiple testing should account for correlations between outcomes and not assume independence.
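One way to see why assuming independence matters is the Dubey/Armitage-Parmar adjustment, which interpolates between a Šidák correction (mean correlation 0) and no correction (mean correlation 1). The sketch below is an illustration of that single procedure, not necessarily the approach the authors recommend; the choice of five outcomes is an assumption, and the function name is hypothetical. The procedure is an approximation and is not guaranteed to control the family-wise error rate in every setting.

```python
def correlation_adjusted_alpha(alpha, n_outcomes, mean_corr):
    """Per-test alpha under the Dubey/Armitage-Parmar adjustment.

    The effective number of tests is n_outcomes ** (1 - mean_corr):
    mean_corr = 0 gives the Sidak correction, mean_corr = 1 gives alpha itself.
    """
    effective_tests = n_outcomes ** (1 - mean_corr)
    return 1 - (1 - alpha) ** (1 / effective_tests)


# Five outcomes (assumption) at the ends of the reported correlation range.
for r in (0.0, 0.06, 0.36, 1.0):
    print(r, round(correlation_adjusted_alpha(0.05, 5, r), 4))
```

As the mean correlation between outcomes rises, the per-test alpha moves away from the independence-based value and towards the unadjusted 0.05, so a correction that ignores correlation is more conservative than needed.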
Copyright (c) 2022 Paul Swinton, Andrew Murphy
This work is licensed under a Creative Commons Attribution 4.0 International License.