So, now we get to the much-foreshadowed data preprocessing section.

Data preprocessing is sometimes called data cleaning, but preprocessing should involve more steps than just cleaning the data, that is, more than just removing the problems with the data. So data cleaning is a subset of preprocessing, but most of what we do during preprocessing is, in fact, data cleaning. Again, there are lots of different terms that refer to basically the same thing.

There are a lot of different types of preprocessing, and I’m going to talk about a lot of different strategies: aggregation, sampling, all the ones on the screen here. But we don’t want to use all of these strategies on every data set, right? For any given data set, we’re usually only going to use a couple of them. We don’t want to overwhelm you. We’re not going to need every technique and every tool in our toolbox every time.

Another note before we keep going: not all of these are strictly independent. These terms and categories are all things you’ll see used around the industry, but because data science is such a heterogeneous field, they don’t divide up cleanly. So if you see some overlap in what I’m talking about from one strategy to the next, that’s why.