
6 books to help you learn data science

Data Science Dojo
Nathan Piccini

October 28

Learning different concepts in data science can often be daunting. Here are 6 books to help lift the burden.


List of books:

Check out the list of 6 data science books below to help you kick off your learning journey.

1. Machine Learning: A probabilistic perspective

This is an almost *exhaustive* book on machine learning topics ranging from the very basics of probability to mixture models, variational inference, and deep learning. Even though I first encountered this book as a companion textbook for a university course, I think calling this a textbook is doing it a disservice. It is an encyclopedia and can serve as a detailed reference for any data scientist or machine learning engineer.

The book doesn’t shy away from proper mathematical notation, which might be jarring for some readers. That is why the first couple of chapters on the basics are so important for getting your feet wet. There are diagrams exploring the characteristics of models, pseudocode, fully worked examples, and even exercises at the end of chapters.

There are a bunch of fantastic online learning resources for stats, ML, and data science topics, but most of them shy away from the maths and theoretical aspects, which is where this book shines.

  • Author: Kevin Murphy
  • Education Level: Beginner – Advanced

2. Fundamentals of deep learning (O’Reilly)

Deep learning grows more popular each year, and with that, the wealth of online tutorials and courses on the topic keeps increasing. My main issue with most of these is that they are either too focused on the implementation (feeling more like a tutorial for Keras than for deep learning as a field) or they skip key theoretical concepts.

Deep Learning by Ian Goodfellow, while being a very detailed exploration of the field and its roots, is (in my opinion) not the best jumping-off point for beginners or even many people who understand the basics.

This book, Fundamentals of Deep Learning (O’Reilly), doesn’t have this problem. It uses easy-to-understand notation and minimal derivation while still covering the breadth of the field’s most common concepts (though it is less ‘complete’ than the Goodfellow text).

The major advantage this book has as an introductory text is the inclusion of companion code samples in TensorFlow (a widely used DL framework), which makes the jump from reading about a topic in the book to implementing and experimenting with it seamless.

  • Author: Nikhil Buduma
  • Education Level: Beginner

3. An introduction to statistical learning (with Applications in R)

An Introduction to Statistical Learning (popularly known as ‘ISLR’) is easily one of the most popular textbooks available on machine learning. The text builds your understanding of machine learning concepts step by step. Also, despite consciously stopping a little short of detailed ‘mathematical derivations’ and ‘statistical jargon’, the text gives a complete treatment of each topic.

  • Author: Gareth James, Daniela Witten, Trevor Hastie, and Robert Tibshirani
  • Education Level: Beginner

4. The elements of statistical learning: Data mining, inference, and prediction

The Elements of Statistical Learning (popularly known as ‘ESL’) is often recommended as the next step in learning machine learning (ISLR being the first step). In my opinion, the ESL text demands an advanced-level facility with algebra, calculus, and statistics. Like ISLR, ESL is often listed as either an assigned or a recommended textbook in leading master’s programs in Data Science, Statistics, and Business Analytics.

  • Author: Trevor Hastie, Robert Tibshirani, Jerome Friedman
  • Education Level: Advanced

5. R for everyone

R for Everyone addresses the common perception that R demands too much background knowledge from non-statisticians by making learning easy and intuitive. This book starts with the basics, walking you through downloading and installing R, and then takes you through more advanced problems so you’ll be able to “tackle statistical problems you care about the most”.

You can expect to build both linear and non-linear models, use data mining techniques, and use LaTeX, RMarkdown, and Shiny to make your code reproducible.

“This guide focuses on the 20 percent of R functionality you’ll need to accomplish 80 percent of modern data tasks.”

  • Author: Jared P. Lander
  • Education Level: Beginner

6. The cartoon guide to statistics

Used as a textbook in Data Science Dojo’s data science bootcamp, The Cartoon Guide to Statistics covers everything needed for a basic understanding of statistics. The authors use cartoons and humor to explain concepts that many find hard to learn. This book is great if you’re just starting to learn statistics and data science, or if you want a good laugh while you refresh your memory.

The last page reads: “Well, that’s it! By now, you should be able to do anything with statistics, except lie, cheat, steal, and gamble. We left those subjects to the bibliography.”

  • Author: Larry Gonick and Woollcott Smith
  • Education Level: Beginner