Ordered Data & Graph Data
Introduction to graph data and ordered data with detailed explanations and examples of both. We'll also discuss important types of ordered data that can come in handy during the analysis.
What You'll Learn
> Graph data and its examples
> Ordered data and its examples
> Types of ordered data: spatial and temporal data
The next big category of data that we’ll talk about briefly here is graph data.
Graph data (the classic example, of course, is the World Wide Web, a set of HTML pages that link to one another) is defined by nodes, which are the vertices in our graph, and by edges. So every webpage is a node, and then there’s a set of edges, which point from one node to another. Those edges can be one-directional or they can be bidirectional. And then in addition to edges and nodes, some graphs have weights on their edges. So, in this case, for a website, the weight might be a count of the number of times one page links to another. So it links five times here but only two times here.
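One way to picture the nodes, directed edges, and weights described above is a small adjacency mapping. This is only a minimal sketch: the page names and link counts are made up for illustration, and the helper names `link_count` and `is_bidirectional` are hypothetical, not part of any library.

```python
# Hypothetical pages (nodes). Each inner dict holds the outgoing edges of a page,
# and each weight is the number of times that page links to the target page.
links = {
    "home.html": {"about.html": 5, "blog.html": 2},
    "about.html": {"home.html": 1},
    "blog.html": {},  # a node with no outgoing edges
}


def link_count(graph, source, target):
    """Return the weight of the edge source -> target, or 0 if there is no edge."""
    return graph.get(source, {}).get(target, 0)


def is_bidirectional(graph, a, b):
    """Return True when there is an edge in both directions between a and b."""
    return link_count(graph, a, b) > 0 and link_count(graph, b, a) > 0
```

Here `link_count(links, "home.html", "about.html")` gives the weight of a one-directional edge, while `is_bidirectional` checks whether two pages link to each other.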
We won’t talk about this in great detail, because it’s sort of its own sub-problem that we don’t have a lot of time to cover, but it’s good to be aware of. When you’re dealing with graph data, you have to put a lot of thought into how you capture the relationships between the nodes, how you encode your edges and vertices. You don’t get the same kind of neat layout where there are n attributes that can be represented by n columns, right? Each vertex can have any number of edges coming out of it, anywhere from zero to arbitrarily many. So when you’re analyzing graph data, you have to handle it differently.
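A common way to get around the fixed-column problem is to store one row per edge rather than one row per node. The sketch below assumes that encoding, with made-up page names and a hypothetical helper `out_degrees`; it is one possible layout, not the only one.

```python
from collections import Counter

# Edge list: one row per edge, as (source, target, weight).
# A node with many edges simply contributes many rows.
edges = [
    ("home.html", "about.html", 5),
    ("home.html", "blog.html", 2),
    ("about.html", "home.html", 1),
]
nodes = ["home.html", "about.html", "blog.html"]


def out_degrees(nodes, edges):
    """Count the outgoing edges of every node, including nodes that have none."""
    counts = Counter(source for source, _target, _weight in edges)
    return {node: counts[node] for node in nodes}
```

Calling `out_degrees(nodes, edges)` shows that the number of edges differs from node to node, which is why a table with a fixed set of attribute columns per node does not fit.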
The last big category of data is ordered data. Now, ordered data is data which has some sort of order, where the position of each data object matters. So in the case of a genomic sequence, for instance, the ordering of the nucleotides here, GGTTCC, et cetera, is important, right? The fact that we have GGTTCC here is different than if we had CCTT and then GG. Those are fundamentally different sequences, so we have to encode it in some way that preserves that ordering.
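To make the point about ordering concrete, here is a small sketch comparing the two sequences mentioned above. The variable and function names are chosen for illustration only.

```python
from collections import Counter

seq_a = "GGTTCC"
seq_b = "CCTTGG"  # the same bases as seq_a, in a different order


def base_counts(sequence):
    """Count each base, which throws away the order of the sequence."""
    return Counter(sequence)


# An unordered summary cannot tell the two sequences apart...
same_composition = base_counts(seq_a) == base_counts(seq_b)

# ...but an order-preserving encoding (here, a plain string) can.
same_sequence = seq_a == seq_b
```

Both sequences contain two of each base, so `same_composition` is true while `same_sequence` is false. Any encoding that only keeps counts would lose exactly the information that distinguishes them.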
Another example, and sort of your classic example, of ordered data is spatial and temporal data. So this little GIF here represents the average monthly temperature of both land and ocean over the course of a year. In this case, the spatial aspect of the data is important. Where we are in the world certainly matters when we’re looking at a data object. And in this case, if we were getting this data, every row in, say, a database table might have a location and a time associated with it, and there is an implicit ordering there, especially to the time, but also to the location.
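As a rough sketch of rows that carry both a location and a time, the example below uses made-up coordinates and temperatures (they are not taken from the animation) and two hypothetical helpers, `series_for` and `monthly_changes`.

```python
from datetime import date

# Made-up readings: each row has a location (lat, lon), a time, and a value.
# The rows are deliberately stored out of time order.
readings = [
    {"lat": 47.6, "lon": -122.3, "month": date(2020, 3, 1), "temp_c": 8.0},
    {"lat": 47.6, "lon": -122.3, "month": date(2020, 1, 1), "temp_c": 5.0},
    {"lat": -33.9, "lon": 151.2, "month": date(2020, 1, 1), "temp_c": 23.0},
    {"lat": 47.6, "lon": -122.3, "month": date(2020, 2, 1), "temp_c": 6.5},
]


def series_for(readings, lat, lon):
    """Select the rows for one location and put them in time order."""
    rows = [r for r in readings if (r["lat"], r["lon"]) == (lat, lon)]
    return sorted(rows, key=lambda r: r["month"])


def monthly_changes(series):
    """Month-to-month temperature change; only meaningful on a time-ordered series."""
    temps = [r["temp_c"] for r in series]
    return [round(later - earlier, 1) for earlier, later in zip(temps, temps[1:])]
```

The month-to-month change is only meaningful once the rows for a single location have been put in time order, which is the implicit ordering the table itself does not enforce.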
So, when we’re handling ordered data, we have to be very careful about preserving that order. And this is very important, because anytime you do any kind of sensing, on any material or anything like that, you get time series data. Time series are the most common type of ordered data, and we’ll talk, especially during the back half of the Bootcamp, about how we handle time series data.
Data Science Dojo Instructor - Data Science Dojo is a paradigm shift in data science learning. We enable all professionals (and students) to extract actionable insights from data.