Preprint has been published in a journal as an article
Preprint / Version 1

Myths and Methodologies: The use of equivalence and non-inferiority tests for interventional studies in exercise physiology and sport science




intervention efficacy, methodology, statistical review


Exercise physiology and sport science have traditionally relied on the null hypothesis of no difference to make decisions about experimental interventions. This article reviews the statistical approaches that exercise physiologists and sport scientists typically use to design and analyse experimental interventions, and highlights the importance of including equivalence and non-inferiority studies, which address research questions other than whether two interventions work differently. First, we briefly describe the most common approaches to investigating the effects of different interventions, along with their rationale. We then discuss the main steps involved in the design and analysis of equivalence and non-inferiority studies, which are commonly performed in other research fields, with worked examples from exercise physiology and sport science scenarios. Finally, we provide recommendations for exercise physiologists and sport scientists who would like to apply the different approaches in future research. We hope this work will promote the correct use of equivalence and non-inferiority designs in exercise physiology and sport science whenever the research context, conditions, applications, researchers’ interests, or reasonable beliefs justify them.



