Machine learning is one of the fastest-growing branches of computer science. Are you looking for the very best books in the area of machine learning? Over the last few years, we have asked some top experts, many of them world-renowned, for book recommendations in machine learning and artificial intelligence. We have also searched the web to find which books experts use, or have used, in the courses they teach. Here is a list of machine learning books that have been recommended by at least two experts, ordered by the number of recommendations.

The Elements of Statistical Learning

by: Trevor Hastie, Robert Tibshirani, and Jerome Friedman

This book describes the important ideas in a variety of fields such as medicine, biology, finance, and marketing in a common conceptual framework. While the approach is statistical, the emphasis is on concepts rather than mathematics. Many examples are given, with a liberal use of colour graphics. It is a valuable resource for statisticians and anyone interested in data mining in science or industry. The book’s coverage is broad, from supervised learning (prediction) to unsupervised learning. The many topics include neural networks, support vector machines, classification trees and boosting—the first comprehensive treatment of this topic in any book.

Recommended by: Andrew Zisserman (Oxford), Georgios Giannakis (University of Minnesota), Carlo Tomasi (Duke University), Bradley Efron (Stanford), Yann LeCun (New York University, Facebook), Michael I. Jordan (UC Berkeley).

Pattern Recognition and Machine Learning

by: Christopher M. Bishop

This is the first textbook on pattern recognition to present the Bayesian viewpoint. The book presents approximate inference algorithms that permit fast approximate answers in situations where exact answers are not feasible. It also uses graphical models to describe probability distributions, at a time when no other book applied graphical models to machine learning. No previous knowledge of pattern recognition or machine learning concepts is assumed. Familiarity with multivariate calculus and basic linear algebra is required, and some experience in the use of probabilities would be helpful though not essential, as the book includes a self-contained introduction to basic probability theory.

Recommended by: Daphne Koller (Stanford, Coursera, Insitro), Vahid Tarokh (Duke University), Yann LeCun (New York University, Facebook), Geoffrey Hinton (University of Toronto, Google).

Deep Learning

by: Ian Goodfellow, Yoshua Bengio, and Aaron Courville

This book offers mathematical and conceptual background, covering relevant concepts in linear algebra, probability theory and information theory, numerical computation, and machine learning. It describes deep learning techniques used by practitioners in industry, including deep feedforward networks, regularization, optimization algorithms, convolutional networks, sequence modeling, and practical methodology; and it surveys such applications as natural language processing, speech recognition, computer vision, online recommendation systems, bioinformatics, and videogames. Finally, the book offers research perspectives, covering such theoretical topics as linear factor models, autoencoders, representation learning, structured probabilistic models, Monte Carlo methods, the partition function, approximate inference, and deep generative models.

Recommended by: Kees Schouhamer Immink (Turing Machines Inc.), Yoshua Bengio (University of Montreal), Daphne Koller (Stanford, Coursera, Insitro).

Understanding Machine Learning

by: Shai Shalev-Shwartz and Shai Ben-David

Machine learning is one of the fastest-growing areas of computer science, with far-reaching applications. The aim of this textbook is to introduce machine learning, and the algorithmic paradigms it offers, in a principled way. The book provides a theoretical account of the fundamentals underlying machine learning and the mathematical derivations that transform these principles into practical algorithms. Following a presentation of the basics, the book covers a wide array of central topics unaddressed by previous textbooks. These include a discussion of the computational complexity of learning and the concepts of convexity and stability; important algorithmic paradigms including stochastic gradient descent, neural networks, and structured output learning; and emerging theoretical concepts such as the PAC-Bayes approach and compression-based bounds. Designed for advanced undergraduates or beginning graduate students, the text makes the fundamentals and algorithms of machine learning accessible to students and non-expert readers in statistics, computer science, mathematics, and engineering.

Recommended by: Robert Schapire (Princeton, Microsoft), Carlo Tomasi (Duke University).