The Unordered Data Structures course covers the data structures and algorithms needed to implement hash tables, disjoint sets and graphs. These fundamental data structures are useful for unordered data. For example, a hash table provides immediate access to data indexed by an arbitrary key value, which could be a number (such as a memory address for cached memory), a URL (such as for a web cache) or a dictionary. Graphs are used to represent relationships between items, and this course covers several different data structures for representing graphs and several different algorithms for traversing graphs, including finding the shortest route from one node to another. These graph algorithms also depend on another concept called disjoint sets, so this course also covers the disjoint-set data structure and its associated algorithms.

Instructor:

Wade Fagen-Ulmschneider

Transcript

Before we dive into actually implementing graphs, let's go over a little bit of vocabulary so that we have a common understanding of exactly what we're talking about. Looking at a simple example, we have a series of graphs here. I'm always going to refer to the big graph as capital G, and G is a collection of vertices and edges. Inside the graph G, we have three subgraphs: G1, G2 and G3. These subgraphs are disconnected from each other because there is no edge that connects, for example, a vertex in G1 to a vertex in G2. But G1 itself is a connected graph, because every node inside G1 can be reached from every other node in G1 through its edges.
I'll use the terms "node" and "vertex" interchangeably. When I say the set of nodes or the set of vertices, I mean the same set; the two terms are completely interchangeable. If you come from a background in mathematics, you're probably used to the term vertex. If you come from a background in networking, you're probably used to the term node. On the other hand, the term edge is universally used. So whenever you hear the term edge, it always means the connection between two nodes, or between two vertices, depending on your choice of words.
We're going to define the variable n to be the number of vertices in the graph and the variable m to be the number of edges. So n and m are the two quantities we're going to talk about a lot.
There are a few more terms to cover for this graph. The first is the idea of an incident edge: an edge is incident to a node if it is directly connected to that node. So all of the edges on a node are its incident edges. The second is the degree of a node. The degree of a node is the count of its incident edges.
So if you look at how many edges are on this node, for example, it has three incident edges, so we say the degree of this node is three. The third term is adjacent vertices. A vertex is adjacent to a node if there is an edge between them, so if we travel over all of a node's incident edges and see which nodes are on the other end, those are its adjacent nodes, or adjacent vertices. So a single node has a number of incident edges, the count of those incident edges is the degree of the node, and the nodes on the other side of those edges are its adjacent nodes.
We also have terms that you may remember from trees. For example, we can have a "path". A path is a sequence of vertices, connected by edges, through our graph. We can also have a "cycle". A cycle is a path that starts and ends at the same node. So if we travel through a bunch of nodes, we have a path, and if that path loops back to where it started, we call it a cycle in the graph.
This semester we're mostly going to talk about simple graphs. A simple graph is a graph that has no self-loops and no multi-edges. No self-loops means there is no edge that links a node back to itself, because that would cause problems as you move through the graph: you could follow an edge and simply arrive at your own location. No multi-edges means there is no set of edges that connect the same two vertices. That is, if there's a connection between vertex a and vertex b, there is exactly one edge between those two vertices.
We've already used the term "subgraph". A subgraph is a subset of a graph: it is made up of some of the graph's vertices and some of its edges, where every edge in the subgraph connects two vertices that are also in the subgraph. We have three subgraphs here, as I mentioned earlier. Finally, there are a number of other terms that we'll introduce as we come across them.
These are things like a complete subgraph, a connected subgraph, a connected component, an acyclic graph and a spanning tree. We'll dive into each of these terms as we get to them. So this is just a little bit of vocabulary to ensure that we're on the same page as we talk about graph algorithms.
To begin this discussion, let's do a little bit of math to find out what we can know about a graph just from the terms we've already introduced. Part of that is going to be some simple questions about what it means to be a graph. So let's dive in.
First, what is the minimum number of edges in a graph that is not connected? If the graph is not connected, there do not need to be any edges whatsoever. A not-connected graph can be the graph with vertices a, b and c, each of which is its own subgraph with no connections. So the minimum number of edges in a not-connected graph is zero.
Now consider the case where the graph is connected. If the graph is connected, there must be a path from every node to every other node. So a minimally connected graph is a graph where there is a path from each node to every other node in the graph, and only one path to get there. Here's a minimally connected graph; you'll notice that we need exactly one fewer edge than we have nodes. So the minimum number of edges in a connected graph is the total number of nodes minus one, n minus one.
The next thing we can ask is: what is the maximum number of edges? For most of our graphs, we're going to assume that we always have a simple graph. Always having a simple graph means that there are no self-loops and no multi-edges.
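The vocabulary so far, and the two minimum-edge answers, can be sketched in a few lines of C++. This is only an illustrative sketch, not the graph class the course goes on to implement: the names `SimpleGraph`, `addEdge`, `degree`, `isConnected` and `makePath` are assumptions made up for this example, vertices are assumed to be numbered 0 to n minus one, and connectivity is checked with a depth-first search, which the lecture has not covered yet.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical type for illustration only; vertices are numbered 0 .. n-1.
struct SimpleGraph {
  std::vector<std::set<int>> adjacent;  // adjacent[v] = vertices adjacent to v
  std::size_t m = 0;                    // number of edges

  explicit SimpleGraph(std::size_t n) : adjacent(n) {}

  // Number of vertices.
  std::size_t n() const { return adjacent.size(); }

  // Adds the undirected edge between u and v. Returns false and changes
  // nothing if the edge would be a self-loop or a multi-edge.
  bool addEdge(int u, int v) {
    assert(u >= 0 && v >= 0);
    assert(static_cast<std::size_t>(u) < n());
    assert(static_cast<std::size_t>(v) < n());
    if (u == v) return false;                    // no self-loops
    if (adjacent[u].count(v) > 0) return false;  // no multi-edges
    adjacent[u].insert(v);
    adjacent[v].insert(u);
    ++m;
    return true;
  }

  // In a simple graph, the degree (count of incident edges) equals the
  // number of adjacent vertices.
  std::size_t degree(int v) const { return adjacent[v].size(); }

  // True if every vertex can be reached from vertex 0 (depth-first search).
  // An empty graph is treated as connected by convention (an assumption).
  bool isConnected() const {
    if (n() == 0) return true;
    std::vector<bool> seen(n(), false);
    std::vector<int> stack{0};
    seen[0] = true;
    std::size_t reached = 1;
    while (!stack.empty()) {
      int v = stack.back();
      stack.pop_back();
      for (int w : adjacent[v]) {
        if (!seen[w]) {
          seen[w] = true;
          ++reached;
          stack.push_back(w);
        }
      }
    }
    return reached == n();
  }
};

// Builds the path 0 - 1 - ... - (n-1), one example of a minimally
// connected graph on n vertices.
SimpleGraph makePath(std::size_t n) {
  SimpleGraph g(n);
  for (std::size_t v = 1; v < n; ++v) {
    g.addEdge(static_cast<int>(v - 1), static_cast<int>(v));
  }
  return g;
}
```

With this sketch, `SimpleGraph(3)` is the not-connected graph a, b, c with zero edges, and `makePath(4)` is a connected graph with exactly three edges, one fewer than its number of nodes. Storing each adjacency list as a `std::set` is just a convenient way to reject multi-edges here; other representations are possible.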
So if we start drawing our graphs with every possible edge, for n equals one we have just a single node, for n equals two we have two nodes, for n equals three we have three nodes, all connected to each other, and for n equals four we have four nodes, again all connected to each other. Now we can look at how many edges each one has. When n equals one, the number of edges is zero. When n equals two, there is one edge. When n equals three, there are three edges. When n equals four, we have one, two, three, four, five, six edges. So this looks like a nice sequence, and it is our familiar arithmetic sum: n times n minus one over two, n(n-1)/2, which is on the order of n squared total edges.
Here is why. Every single node has an edge out to every other node. So each of the n nodes connects to n minus one other nodes, which gives n times n minus one. But that counts every edge twice, because the edge from A to B and the edge from B to A are the same edge, so we divide that total by two. This is exactly what we found: n times n minus one over two is the maximum number of edges in a simple graph.
Now, if the graph is not a simple graph and we can have multi-edges, there is no limit on the number of edges. Think about a graph with just two nodes: if it's not simple, we can have as many multi-edges between them as we want. So you can imagine an unbounded number of edges in a non-simple graph that contains multi-edges. Because of that, we're always going to constrain ourselves to simple graphs this semester, and in future courses you can dive into exactly what it means when we start having multi-edges.
Finally, the very last thing we can ask is: what is the sum of the degrees of all of the nodes? If we think about the degree of a vertex, we know that's how many edges are incident to it.
If we sum all of the degrees up, every edge gets counted at both of its endpoints: every outbound edge at one node is an inbound edge at another node. So the sum of the degrees of all of the vertices is equal to 2m, two times the number of edges in the graph, because we're double counting the edges. For an edge between A and B, we count it once as an outbound edge on A and once as an inbound edge on B, despite the fact that the inbound edge and the outbound edge are exactly the same edge: the edge between A and B.
In the next lecture, we're going to start implementing graphs and doing exciting things with graphs in C++. I'll see you there.
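As a quick check of the two counting results from this lecture, the maximum of n(n-1)/2 edges and the degree sum of 2m, here is a minimal C++ sketch. It is only an illustration, not the course's implementation: the names `Edge`, `maxSimpleEdges`, `completeGraphEdges`, `degrees` and `degreeSum` are assumptions invented for this example, and a graph is assumed to be given as a plain list of edges over vertices numbered 0 to n minus one.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical representation for illustration: an edge is an unordered
// pair of vertex indices, stored as (smaller, larger) by the helpers below.
using Edge = std::pair<std::size_t, std::size_t>;

// Maximum number of edges in a simple graph on n vertices: n(n-1)/2.
std::size_t maxSimpleEdges(std::size_t n) {
  return n == 0 ? 0 : n * (n - 1) / 2;
}

// Lists every edge of the graph in which each vertex is connected to every
// other vertex. Each pair appears once (u < v), so there are no self-loops
// and no multi-edges.
std::vector<Edge> completeGraphEdges(std::size_t n) {
  std::vector<Edge> edges;
  for (std::size_t u = 0; u < n; ++u) {
    for (std::size_t v = u + 1; v < n; ++v) {
      edges.push_back({u, v});
    }
  }
  return edges;
}

// Computes the degree of every vertex by counting each edge once at each
// of its two endpoints.
std::vector<std::size_t> degrees(std::size_t n, const std::vector<Edge>& edges) {
  std::vector<std::size_t> deg(n, 0);
  for (const Edge& e : edges) {
    assert(e.first < n && e.second < n);  // endpoints must be valid vertices
    ++deg[e.first];
    ++deg[e.second];
  }
  return deg;
}

// Sum of the degrees of all vertices; this should always equal 2m.
std::size_t degreeSum(std::size_t n, const std::vector<Edge>& edges) {
  std::size_t total = 0;
  for (std::size_t d : degrees(n, edges)) {
    total += d;
  }
  return total;
}
```

For n equals four, `completeGraphEdges(4)` produces the six edges counted in the lecture, and `degreeSum(4, completeGraphEdges(4))` comes to twelve, which is two times six. The same relationship holds for any edge list, because each edge adds exactly one to the degree of each of its two endpoints.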