Data science is the organizing and analyzing of massive amounts of data. Professionals in this field are called data scientists, and their skills are very much in demand by companies worldwide. Data scientists work in various fields, including business, academia, and medicine. Here we share the best data science books for people whose relationship with data runs from personal interest to professional use. Unlike some book lists that focus on one area of data science, we feature books with broad applicability. We’ve also chosen books by authors from a diverse array of backgrounds, from hobbyists to professors.
The 20 Best Data Science Books
Data science books are a genre of nonfiction. Often, they are instructional and provide readers with exercises to improve data science skills. Some books focus on one aspect of data science, like big data or machine learning, while others are more comprehensive. Data science books can help those looking to increase their knowledge of data science for educational and career advancement.
How We Picked the Best Books for Data Science
To compile our list of the top books about data science, we gathered 110 titles from the New York Times bestseller list, Amazon recommendations, and lists provided by data scientists at Tableau and other industry leaders.
Based on our research, we narrowed our selection to these twenty books. It was important to consider a broad intended audience and the book’s appeal to people interested in data science, whether as a hobby or professionally. Some of these books are thought-provoking and are best for people wanting to learn more about data science theory, while others have practical applications for business leaders and burgeoning data scientists. If you’re interested in learning more about data science, any one of these books is a great starting point.
Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow by Aurélien Géron
Our top pick in data science books is Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow by Aurélien Géron. It focuses on the concepts and tools for building intelligent systems. The book is written for anyone interested in machine learning, especially beginners in the field and those without prior knowledge of the topics. Subjects covered range from simple linear regression to deep neural networks. Géron is a machine learning trainer, founder, and CTO of two consulting firms.
Big Data: A Revolution That Will Transform How We Live, Work and Think by Viktor Mayer-Schönberger and Kenneth Cukier
Harnessing information is the theme of Big Data: A Revolution That Will Transform How We Live, Work and Think by Viktor Mayer-Schönberger and Kenneth Cukier. The book is written for anyone interested in data, especially for those working in information technology, public policy, intelligence, and medicine. It talks not only about the positives of big data but also about the potential hazards of misuse. Mayer-Schönberger is a professor at the University of Oxford, and Cukier is the deputy executive editor at The Economist.
Data Science for Business: What You Need to Know about Data Mining and Data-Analytic Thinking by Foster Provost and Tom Fawcett
The fundamental principles of data science are the focus of Data Science for Business: What You Need to Know about Data Mining and Data-Analytic Thinking by Foster Provost and Tom Fawcett. The book is based on an MBA course Provost teaches at New York University and is appropriate for business students and anyone interested in data science. Fawcett is an author who works in research and development.
Data Science from Scratch: First Principles with Python by Joel Grus
In Data Science from Scratch: First Principles with Python, author Joel Grus teaches readers fundamental data science concepts. It’s illustrated in monochrome and written for those who have some skills in Python, algebra, statistics, and probability. Topics covered include machine learning, natural language processing, and linear algebra. Grus is a research engineer at the Allen Institute for Artificial Intelligence who formerly worked at Google.
Naked Statistics: Stripping the Dread from the Data by Charles Wheelan
Author Charles Wheelan takes a fresh approach to understanding statistical analysis in Naked Statistics: Stripping the Dread from the Data. The book is witty and easy to understand, geared toward those interested in statistics but without much background in the field. He uses real-world examples from game shows, politics, business, sports, and other areas to explain basic statistics concepts engagingly. Wheelan is a columnist for Yahoo! and a professor at the University of Chicago.
Practical Statistics for Data Scientists by Peter Bruce, Andrew Bruce, and Peter Gedeck
Practical Statistics for Data Scientists by Peter Bruce, Andrew Bruce, and Peter Gedeck is geared toward data scientists with some familiarity with the R and/or Python programming languages and some exposure to statistics. The book covers methods that evolved from statistics rather than those that evolved from computer science. The authors are the founder of the Institute for Statistics Education at Statistics.com, a principal research scientist at Amazon, and a senior data scientist at Collaborative Drug Discovery, respectively.
An Introduction to Statistical Learning by Gareth James, Daniela Witten, Trevor Hastie, and Robert Tibshirani
An Introduction to Statistical Learning by Gareth James, Daniela Witten, Trevor Hastie, and Robert Tibshirani provides a straightforward application of mathematical statistics and the programming language R. Although it is an introduction to statistics, the book is academic in tone and intended for readers with a college-level mathematical background. Each chapter concludes with questions readers can use to test themselves. James is a professor at the University of Southern California, Witten is a professor at the University of Washington, and Hastie and Tibshirani are professors at Stanford University.
Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems by Martin Kleppmann
Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems by Martin Kleppmann is a comprehensive look at all aspects of data engineering. It is written for anyone who develops web-based applications or is interested in doing so. Specifics covered include how to make data systems scalable, the architecture of data systems, and the benefits of using open platforms. Kleppmann is a distributed systems researcher at the University of Cambridge.
Introduction to Machine Learning with Python: A Guide for Data Scientists by Andreas C. Müller and Sarah Guido
Introduction to Machine Learning with Python: A Guide for Data Scientists by Andreas C. Müller and Sarah Guido is written for readers with some computer science and programming knowledge who wish to learn more about machine learning. It focuses on the practical aspects of using machine learning algorithms and contains hands-on exercises. The authors also suggest readers have some familiarity with the NumPy and Matplotlib libraries. Müller is a researcher at New York University’s Center for Data Science, and Guido is a data scientist.
The Hundred-Page Machine Learning Book by Andriy Burkov
At just over one hundred pages, Andriy Burkov’s The Hundred-Page Machine Learning Book is a compact and practical manual on the basics of machine learning. It’s geared toward beginners in the machine learning field and is especially useful for students. Topics include classical linear and logistic regression, deep learning, and modern support vector machines. Burkov leads the machine learning developers at Gartner.
Algorithms of Oppression by Safiya Noble
Data discrimination is the theme of Algorithms of Oppression by Safiya Noble, who argues search engines use algorithms that discriminate against people of color, especially women of color. The book has won numerous awards and was endorsed by the International Journal of Information, Diversity, & Inclusion. The book is an excellent choice for anyone interested in racial and gender bias, critical race theory, and/or bias in algorithms. Noble is a professor at UCLA and a research associate at the University of Oxford’s Internet Institute.
Data Science For Dummies by Lillian Pierson
Lillian Pierson’s Data Science For Dummies is an easy-to-read primer on data science for those with no prior knowledge of the subject. Topics covered include machine learning, artificial intelligence, and big data. The book uses an instructional tone and is published by For Dummies. It’s part of the publisher’s extensive series of nearly 2,000 reference books, which began in the early 1990s and is known for its distinctive yellow covers. Pierson is a licensed professional engineer and the CEO of Data-Mania.
Everybody Lies: Big Data, New Data, And What the Internet Can Tell Us About Who We Really Are by Seth Stephens-Davidowitz
What people reveal online when they think no one is watching is the theme of Everybody Lies: Big Data, New Data, And What the Internet Can Tell Us About Who We Really Are by Seth Stephens-Davidowitz. Drawing on internet data such as Google searches, the book explores what large datasets can show about human behavior. It’s a good fit for readers curious about big data and social science. Stephens-Davidowitz is a former data scientist at Google.
Practical Data Science with R, 1st Edition by Nina Zumel and John Mount
For readers looking to learn both R programming and statistics, there’s Practical Data Science with R, by Nina Zumel and John Mount. It’s geared toward those without a background in data science but some familiarity with basic statistics, R, or another scripting language. The book has three sections: introduction to data science, modeling methods, and delivering results. Zumel and Mount are the founders of Win-Vector, a data science consulting firm.
Python Data Science Handbook by Jake VanderPlas
The Python Data Science Handbook by Jake VanderPlas is a comprehensive reference book. It’s written for a broad audience of readers, from those looking to learn the essentials of machine learning in Python to seasoned professionals in the field. Several libraries are taught in this book, including NumPy, pandas, Matplotlib, and Scikit-Learn. VanderPlas is an interdisciplinary research director at the University of Washington.
R for Data Science: Import, Tidy, Transform, Visualize, and Model Data, by Hadley Wickham and Garrett Grolemund
R for Data Science: Import, Tidy, Transform, Visualize, and Model Data, by Hadley Wickham and Garrett Grolemund, is for aspiring data scientists with no previous experience working in R and RStudio. The book provides an understanding of the data science cycle, and each section is paired with exercises. Some familiarity with statistical programming is recommended for readers. The authors both work at RStudio – Wickham as the chief scientist and Grolemund as a data scientist and master instructor.
Storytelling with Data: A Data Visualization Guide for Business Professionals by Cole Nussbaumer Knaflic
The fundamentals of data visualization are the focus of Storytelling with Data: A Data Visualization Guide for Business Professionals by Cole Nussbaumer Knaflic. The book is written for professionals who do presentations with data but is also valuable for graduate students preparing to defend a dissertation or thesis. Topics covered include understanding your audience, determining which graphs to use for your presentation, and how to tell visual stories. Knaflic is the founder and CEO of Storytelling With Data.
The Data Science Handbook: Advice and Insights from 25 Amazing Data Scientists by Carl Shan, William Chen, Henry Wang, and Max Song
Pearls of wisdom from 25 top data scientists are featured in The Data Science Handbook: Advice and Insights from 25 Amazing Data Scientists by Carl Shan, William Chen, Henry Wang, and Max Song. The data scientists spotlighted include those who work for Facebook, LinkedIn, Airbnb, Khan Academy, and other top companies. It’s written for aspiring and current data scientists looking for career advice and inspiration. The authors are all data scientists.
Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy by Cathy O’Neil
Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy by Cathy O’Neil challenges readers to look at the dark side of big data. The book was a New York Times bestseller and National Book Award Longlist selection. Anyone interested in sociology, big data, predictive analytics, artificial intelligence, and/or algorithms will find this book a fascinating read. O’Neil is a data scientist who started the Lede Program in Data Journalism at Columbia University.
Build a Career in Data Science by Emily Robinson and Jacqueline Nolis
Students and career-changers interested in working in the data science field are the target audience for Build a Career in Data Science by Emily Robinson and Jacqueline Nolis. This book begins with an explanation of what data science is, and then proceeds to explain how you can find a job in this field and grow into your role as a data scientist. It also includes interviews with professional data scientists. Robinson is a senior data scientist at Warby Parker, and Nolis is the chief product officer at Saturn Cloud.