I just finished reading Machine Learning With Random Forests And Decision Trees: A Mostly Intuitive Guide, But Also Some Python (amazon affiliate link).
The short review
This is a great introductory book for anyone looking to learn more about Random Forests and Decision Trees. You won’t be an expert after reading this book, but you’ll understand the basic theory and how to implement random forests in Python.
The long(ish) review
This is a short book, only 76 pages, but those 76 pages are full of good, introductory information on Random Forests and Decision Trees. Even though I’ve been using random forests and other machine learning approaches in Python for years, I can easily see the value for people who are just starting out with machine learning and/or random forests. That said, there were a few things in the book that I had either forgotten or didn’t know (the entropy criterion for choosing splits, for example).
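For anyone else who is fuzzy on the entropy criterion, here is a minimal sketch of the idea. It is not taken from the book; the function names `entropy` and `information_gain` and the toy labels are my own assumptions for illustration. A split is scored by how much it reduces the entropy of the class labels.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a sequence of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def information_gain(parent, left, right):
    """Entropy of the parent node minus the size-weighted entropy of its children."""
    total = len(parent)
    weighted = (len(left) / total) * entropy(left) + (len(right) / total) * entropy(right)
    return entropy(parent) - weighted


# Toy data (made up): a perfectly mixed parent node and one candidate split.
parent = ["yes"] * 5 + ["no"] * 5
left = ["yes"] * 4 + ["no"] * 1
right = ["yes"] * 1 + ["no"] * 4

print("parent entropy:", entropy(parent))
print("information gain:", information_gain(parent, left, right))
```

A decision tree using this criterion would try many candidate splits and keep the one with the largest gain.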
While the entire book is excellent, the section on Feature Importance is the best part. This section provides a very good description of the ‘why’ and the ‘how’ of feature importance (and therefore, feature selection) for use in random forests and decision trees. There are some very good points made in this section regarding how to get started with feature selection and cross-validation.
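To make feature importance a bit more concrete, here is a rough sketch using scikit-learn. It is my own example rather than the book’s code, and the synthetic dataset, the parameter values and the variable names are all assumptions. It fits a random forest and ranks the features by the model’s `feature_importances_` attribute.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data (an assumption for this sketch): 8 features, of which only
# the first 3 carry signal. shuffle=False keeps the informative columns first.
X, y = make_classification(
    n_samples=500,
    n_features=8,
    n_informative=3,
    n_redundant=0,
    shuffle=False,
    random_state=0,
)

forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X, y)

# Impurity-based importances, one value per feature, normalized to sum to 1.
importances = forest.feature_importances_

# Rank features from most to least important.
ranked = sorted(enumerate(importances), key=lambda pair: pair[1], reverse=True)
for index, score in ranked:
    print(f"feature {index}: {score:.3f}")
```

The low-ranked features are the natural candidates to drop, and cross-validating before and after is one way to check that dropping them did not hurt the model.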
Additionally, the book provides a decent overview of ‘out-of-sample’ data and the related idea of ‘out-of-bag’ data, meaning the rows a given tree never sees because they were left out of its bootstrap sample. I’m a huge believer in keeping some data out of your initial training data set to use for validation after you’ve built your models.
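Here is a small sketch of both ideas with scikit-learn. Again, this is my own example under assumed settings (synthetic data, a 25% holdout, 200 trees), not code from the book. The out-of-bag score comes from the rows each tree did not see during training, while the holdout score comes from data kept out of training entirely.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic classification data (an assumption for this sketch).
X, y = make_classification(
    n_samples=1000, n_features=10, n_informative=4, random_state=0
)

# Keep 25% of the rows out of training to validate on afterwards.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# oob_score=True asks the forest to score each training row using only
# the trees whose bootstrap sample did not include that row.
forest = RandomForestClassifier(n_estimators=200, oob_score=True, random_state=0)
forest.fit(X_train, y_train)

oob_accuracy = forest.oob_score_
holdout_accuracy = forest.score(X_test, y_test)

print(f"out-of-bag accuracy: {oob_accuracy:.3f}")
print(f"holdout accuracy:    {holdout_accuracy:.3f}")
```

If the two numbers are close, that is some reassurance that the out-of-bag estimate is a reasonable stand-in for performance on unseen data.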
If you’re looking for a good introductory book on random forests and decision trees, pick this one up (amazon affiliate link). It’s only $2.99 for the Kindle version. Like I mentioned earlier, this book won’t make you an expert, but it will provide a solid grounding to get started on the topic of random forests, decision trees and machine learning.
One negative comment I have is that there is very little Python in the book. It isn’t marketed as strictly a Python book, but I would have expected a bit more Python to help drive home some of the theory with runnable code. That said, this is a very small negative for the book overall.
The post Book Review – Machine Learning With Random Forests And Decision Trees by Scott Hartshorn appeared first on Python Data.