Recommended reading
Hastie, T., Tibshirani, R., & Friedman, J. (2009). The Elements of Statistical Learning: Data Mining, Inference, and Prediction (2nd ed.). Springer.
A foundational text covering the theory and mathematics behind modern machine learning methods, including regression, classification, and ensemble techniques.

James, G., Witten, D., Hastie, T., & Tibshirani, R. (2021). An Introduction to Statistical Learning with Applications in R (2nd ed.). Springer.
A more accessible companion to The Elements of Statistical Learning, focused on practical implementation in R with extensive code examples.

Bishop, C. M. (2006). Pattern Recognition and Machine Learning. Springer.
A comprehensive reference for probabilistic and Bayesian approaches to machine learning, suitable for readers seeking a deeper mathematical understanding.

Ripley, B. D. (1996). Pattern Recognition and Neural Networks. Cambridge University Press.
A classic text that bridges traditional statistical modeling with early neural network methods, emphasizing connections between machine learning and statistics.

Wood, S. N. (2017). Generalized Additive Models: An Introduction with R (2nd ed.). CRC Press.
The definitive reference for understanding and applying GAMs using the mgcv package. Wood introduces spline-based smoothing, penalization, model selection, and diagnostics with clear R examples.

Wand, M. P., & Jones, M. C. (1995). Kernel Smoothing. Chapman & Hall/CRC.
A classic text offering a rigorous treatment of kernel smoothing methods, bandwidth selection, and bias-variance trade-offs, suitable for readers seeking a mathematical foundation.

Simonoff, J. S. (1996). Smoothing Methods in Statistics. Springer.
A comprehensive overview of smoothing techniques across regression, density estimation, and nonparametric modeling, with intuitive explanations and numerous examples.

Kaufman, L., & Rousseeuw, P. J. (2005). Finding Groups in Data: An Introduction to Cluster Analysis. Wiley.
A foundational text on clustering, introducing key methods such as partitioning around medoids, fuzzy clustering, and agglomerative and divisive hierarchical clustering, with practical insights and examples.

Maechler, M., Rousseeuw, P., Struyf, A., Hubert, M., & Hornik, K. (2025). cluster: Cluster Analysis Basics and Extensions. R package version 2.1.8. Available at: https://cran.r-project.org/package=cluster
The main R package for clustering, providing implementations of many of the methods described in Kaufman & Rousseeuw (2005), along with practical extensions and tools for applied work.

Kuhn, M., & Johnson, K. (2013). Applied Predictive Modeling. Springer.
A practical guide to building, tuning, and evaluating predictive models in R, with extensive coverage of the caret package.