Preprint / Version 1

Regression models are not inherently predictive

An educational review on why sports exercise and medicine research should be more careful with the term “predictors”


  • José Afonso, University of Porto - Faculty of Sport
  • Chris Bishop
  • Rodrigo Ramirez-Campillo
  • Filipe Manuel Clemente
  • Renato Andrade
  • Tania Pizzari
  • Anthony Turner
  • Richard Inman



Keywords: regression, sports sciences, sports medicine, predictive models, predictors


The term “predictor” is commonly used in regression modelling in place of the more accurate “independent variable”, suggesting a predictive capacity that regression inherently lacks. The goal of this educational review is to raise awareness of the misuse of the term “predictor” in connection with regression models, with a focus on sports exercise and medicine. We start by elucidating the fundamentals of regression modelling and explaining its descriptive rather than predictive nature. We then address key conceptual prerequisites for predictive modelling: sample representativeness, context, expected consistency of relationships over time, trustworthiness of measurements, problems with multiple testing, and confounders. Next, we establish why external validation is warranted before deeming a model “predictive” and present a conceptual model for progressive extrapolation. While these steps apply to other statistical models, regression modelling is particularly prone to the use of “predictors” as default terminology, fostering the misconception that regression models are inherently predictive. Although regression models provide relevant insights into the relationships between a chosen set of variables, they are not inherently predictive, and their extrapolation is contingent upon rigorous validation and contextual appropriateness. Lastly, we provide an algorithm and checklist to guide researchers on when the terms “predictor” or “predictive” may be applicable. We advocate that research using regression modelling should eschew the default use of “predictive” terminology to avoid inaccurate interpretations and misleading the audience. Awareness of these nuances is crucial for upholding scientific integrity and for appropriately interpreting findings from research that uses regression models.
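As an informal illustration of why external validation matters, the following minimal Python sketch fits a simple least-squares line to a simulated development sample and then scores the same fitted line on a second simulated sample in which the underlying relationship differs. Everything here is an assumption made for illustration and is not taken from the review: the function names (`fit_simple_ols`, `r_squared`, `simulate`), the sample sizes, the slopes, and the noise level are all hypothetical.

```python
import random

def fit_simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def r_squared(x, y, intercept, slope):
    """R^2 of a given line on data (x, y); it can be negative on new data."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot

def simulate(n, true_slope, rng):
    """Simulate y = 2 + true_slope * x + Gaussian noise (hypothetical data)."""
    x = [rng.uniform(0, 10) for _ in range(n)]
    y = [2.0 + true_slope * xi + rng.gauss(0, 1.0) for xi in x]
    return x, y

rng = random.Random(0)
# Development sample: the data the regression model is fitted to.
x_dev, y_dev = simulate(200, 1.0, rng)
# Hypothetical external sample: a different context where the relationship differs.
x_ext, y_ext = simulate(200, -0.5, rng)

intercept, slope = fit_simple_ols(x_dev, y_dev)
r2_dev = r_squared(x_dev, y_dev, intercept, slope)  # describes the fitted sample
r2_ext = r_squared(x_ext, y_ext, intercept, slope)  # tests the same line elsewhere

print(f"R^2 in development sample: {r2_dev:.2f}")
print(f"R^2 in external sample:    {r2_ext:.2f}")
```

In this deliberately extreme scenario, a high in-sample R² describes the development data well, yet the same coefficients perform poorly on the external sample. The independent variable "explains" the outcome in the sample at hand, but that alone does not justify calling it a predictor.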


