This is an outdated version published on 2022-11-16.
Preprint / Version 1

An overview of machine learning applications in sports injury prediction


  • Alfred Amendolara, New Jersey Institute of Technology
  • Devin Pfister
  • Marina Settelmayer
  • Mujtaba Shah
  • Veronica Wu
  • Sean Donnelly
  • Brooke Johnston
  • Race Peterson
  • David Sant
  • John Kriak
  • Kyle Bills



Machine learning, Injury, Injury prediction


Sports injuries represent a serious and intractable problem in athletics, and their prevention has traditionally relied on historic datasets and human experience. Existing methodologies have been frustratingly slow at developing higher-precision prevention practices. Technological advancements have permitted the emergence of artificial intelligence and machine learning (ML) as promising toolsets to enhance both injury mitigation and rehabilitation protocols. This article provides a comprehensive overview of ML techniques as they have been applied to sports injury prediction and prevention to date. Literature from the last five years has been compiled and the findings presented. Given the current lack of open-source, uniform datasets, as well as a reliance on dated regression models, no strong conclusions about the real-world efficacy of ML as it applies to sports injury prediction can be made. However, it is suggested that addressing these two issues will allow powerful, novel ML architectures to be deployed, thus rapidly advancing the state of this field and providing validated clinical tools.



