All right. So, last, but certainly not least, is data exploration and visualization. Data exploration and visualization are critically important to the practice of data science. You need to understand what your data looks like before you can start to model it properly.

So, what is data exploration? Essentially, data exploration is visualization and calculation that allow us to better understand the characteristics of a dataset. There are two key motivations. First, we want to be sure we select the right tools for preprocessing and analysis. Second, it uses our human mind's really powerful ability to recognize patterns. A person will recognize a pattern that a data analysis tool won't in a lot of contexts.

Building a neural network that will tell you whether a picture is of a face is a massive, very complicated endeavor, but humans can do it. Most humans can do it innately, automatically, and very quickly. This is, of course, related to the historical field of exploratory data analysis, EDA. The original book is Exploratory Data Analysis by John Tukey. And if you're interested in data exploration specifically, there's some information here.

The original focus of the field of EDA is not the same as our focus as data scientists. As data scientists, our focus is on summary statistics and visualization. EDA also used clustering and anomaly detection as exploratory techniques. In our context, clustering and anomaly detection are now major areas of data science interest, sub-fields of their own, not just a piece of exploratory analysis. That said, clustering for exploratory purposes is still used a great deal. Good clustering algorithms and good clustering practice are some of your more powerful tools if you have a very complicated dataset.
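
To make the "summary statistics and visualization" idea a bit more concrete, here is a minimal sketch in Python using only the standard library. It is not from the lecture: the `summarize` and `text_histogram` helpers and the `sample` values are hypothetical, made up purely for illustration, and a real workflow would more likely use a plotting library for the visual part.

```python
import statistics


def summarize(values):
    """Return basic summary statistics for a list of numbers (illustrative helper)."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartile cut points
    return {
        "count": len(values),
        "min": min(values),
        "q1": q1,
        "median": statistics.median(values),
        "q3": q3,
        "max": max(values),
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values),
    }


def text_histogram(values, bins=5):
    """Bucket values into equal-width bins and render one text bar per bin."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1  # fall back to 1 if all values are equal
    counts = [0] * bins
    for v in values:
        # Clamp so the maximum value lands in the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    lines = [f"{lo + i * width:8.2f} | {'#' * c}" for i, c in enumerate(counts)]
    return counts, lines


# Hypothetical measurements, invented for this example; one value is far from the rest.
sample = [2.1, 2.4, 2.5, 2.7, 3.0, 3.1, 3.3, 3.4, 3.6, 9.8]

for name, value in summarize(sample).items():
    print(f"{name:>6}: {value}")

_, bars = text_histogram(sample, bins=5)
print("\n".join(bars))
```

With this made-up sample, the mean is pulled above the median by the one large value, and the histogram puts that value alone in the last bin. That is the kind of pattern a person spots at a glance, and it would inform choices about preprocessing before any modeling.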