How does Google Maps plan the best route for getting around town given current traffic conditions? How does an internet router forward packets of network traffic to minimize delay? How does an aid group allocate resources to its affiliated local partners?
To solve such problems, we first represent the key pieces of data in a complex data structure. In this course, you’ll learn about data structures, like graphs, that are fundamental for working with structured real world data. You will develop, implement, and analyze algorithms for working with this data to solve real world problems. In addition, as the programs you develop in this course become more complex, we’ll examine what makes for good code and class hierarchy design so that you can not only write correct code, but also share it with other people and maintain it in the future.
The backbone project in this course will be a route planning application. You will apply the concepts from each Module directly to building an application that allows an autonomous agent (or a human driver!) to navigate its environment. And as usual we have our different video series to help tie the content back to its importance in the real world and to provide tiered levels of support to meet your personal needs.

MK

Excellent course. I have a 32 year old CS degree and needed a Java Course which did not cover the basics of programming but rather covered advanced CS topics in a Java Context. This is the course.

BS

Jan 12, 2016

Filled StarFilled StarFilled StarFilled StarFilled Star

Splendidly covers path finding basics. Enjoyed the course.\n\nIt would be great to go into advanced algorithms useful for programming challenges (like SPOJ) with the same instructors.

À partir de la leçon

Introduction to Graphs

This week we'll start getting technical, introducing you to the central data structure in the course: Graphs. You'll learn the basics and then have a chance to dive in a little deeper into the code, getting ready to start building that Google Maps-like application.

Enseigné par

Leo Porter

Associate Teaching Professor

Mia Minnes

Assistant Teaching Professor

Christine Alvarado

Associate Teaching Professor

Transcription

[MUSIC] Okay so we've got the abstract class spelled out in terms of what methods our graphs ought to be able to implement, and now let's think about how we would actually go ahead and implement some of those methods. So, in this video, you'll be able to see how to do at least one implementation using something called an adjacency matrix. So, let's think back to an example of a graph that we've been going back to a little bit in this sequence of videos. And so, this is a small example. Most of the real world examples that you think about will have many, many, many more nodes and many, many, many more edges. Especially when you're thinking of the applications we care about. For example, in the project where the nodes are representing cities or intersections of roads, and then we have the edges indicate ways of getting from one place to the other. So, think about the number of cities in the world or the number of roads in the world and you start getting to really big graphs. But when we're doing the definitions, it's good to have a toy example of just a few vertices, a few edges, so we can see how the definitions play out. And then, afterwards we'll scale up to the realistic problems. So with this example, we just have six vertices with some edges between them. And we could label the vertices one through six. Could use these labels of vertices to index into some array structures, and so if we think about representing the relationships between these vertices, so we can think of a grid. So this grid would be indexed by the labels of these vertices. And what we'd like to do is store information in a particular grid location based on whether these two vertices do or don't have an edge between them. So for example, if we think about the vertex labeled 1 and the vertex labeled 2. In this graph there's no edge between them, and so we'd like to store that information and so in this grid representation we're going to say there's no edge and represent that with a 0. Now there are some edges in the graph that we're trying to represent, and so for example if we look at the vertex of 0, it's adjacent to the vertex 1, namely there's an edge that starts at 0 and goes to 1. And so we'd like to record that, and we're going to record that with a 1. Now notice that at this point, we've got 0 and 1 as the entries that we've used in this grid. It's possible to have bigger integers and that might indicate multiple edges from one vertex to another. Now notice that in this graph there's an edge that goes from zero to one. But there isn't an edge that goes from one to zero. And so in the symmetric location in the grid with row one and column zero we put a zero. If we were thinking about representing an undirected graph as a matrix, using these zeros and ones to indicate whether the pairs of vertices are related or not. Then in those undirected graphs, this grid would by symmetric. Namely, if we tried to imagine flipping it across the diagonal, then we'd get the result of the flip would look exactly the same as the original matrix. So with undirected graphs we get more structure in this adjacency matrix. But for directed graphs we need to think about every single grid point on its own independently. So if we want to fill in the rest of this adjacency matrix, and matrix here just means 2D array, then we notice that we are going to get a one for every single edge in the graph. And there's one, two, three, four, five edges,we see one, two, three, four, five 1s in the array. And the rest of the entries, well, they're just 0s, because the rest of those entires indicate pairs of vertices that happen not to be adjacent in this graph. And so we indicate that in the adjacency matrix by putting a 0 in each of those locations. All right, so we've got a representation of this graph as this 2D array, this grid of 0s and 1s. But really our computers, we know, don't really get as input this grid of 0s and 1s, we still wanna translate how we're doing this representation to Java. And so let's think back about the class structure that we have to write in order to represent the graphs. And from the previous video, we saw we've already set up an abstract class and now we'd like to defined a sub class of it that's going to implement those abstract methods. So in our new class, graph adjacency matrix, so GraphAdjMatrix, we're going to extend that abstract class we started with. And the new piece, the new field, that we're going to define for objects that are of type graph adjacency matrix are these adjacency matrix that are going to be 2D arrays of integers. And so what we'd like to do now is implement those add vertices, and add edge, and get neighbors methods using this idea that the relationships in the graph are encapsulated in this matrix. So let's think about how to do at least one of these methods to start, and let's think about implementing adding an edge. And so if we're adding an edge from a vertex v to a vertex w and each of these vertices is encoded by an integer. Then really what we're doing is setting that location in the matrix to say, yes, there's an edge between my two vertices. So in row v and column w, there ought to be a 1. Now, notice here, we're saying this flag is set to one, which means that we're not going to take into account multiple edges between vertices. We're going to say that if we are adding an edge between two vertices, we're going to assume that there wasn't an edge there before, or if there was an edge, we're not going to double count that edge at all. We're just going to say, if there's an edge, it's going to be set to one. All right, that wasn't too bad. So what about some of the other functionality that we wanted in our graphs? So in particular, how do we add a vertex to the graph? So think about how vertices are represented in this graph, and we don't have a dedicated data structure that tells us about all the vertices. The way that we have access to the vertices is as the row labels and the column labels of the adjacency matrix. So if we want to add another vertex what we need to do is add another row and add another column to correspond to that new vertex. But that means increasing the dimensions of this adjacency matrix, and so that's a little bit tricky. Now what we want to do is we want to say okay, so if the vertex that we need to add has to really increase the size of our adjacency matrix, then we're going to plan ahead. And we're not going to change this adjacency matrix just by adding one row and one column at a time, but we are going to say if we are adding one vertex, then maybe we'll need to add more vertices later. And so we are going to make our new adjacency matrix much bigger. So in particular we're going to double the number of vertices, double the number of columns and that way at least for a while, while we're adding more vertices we won't have to increase the size of our array each time. Because if you remember when we want to increase the size of our array we can't just take our old array and fiddle around with it. What we need to do is create a new array of the new size and then copy over all the contents from the older rate to the newer rate. That takes a bunch of time, and so we don't want to have to do that too often. And, so our idea is going to be to add a vertex. We're going to if we need to change the size of the array at all, we're gonna change it by a lot. Double the size, so double the number of rows, double the number of columns. And then copy over all of the values from the old adjacency matrix to the new adjacency matrix. And, then figure how to deal with the new vertex that we're dealing with. Namely we're going to thing adding this vertex and not connecting it to any other vertices. So there's no up gonna be any ones in the row or the column corresponding to that new vertex, okay. So we saw that adding an edge was really very quick. Adding a vertex had to be done with a little bit more care. And we can think of a filling in the rest of this class definition. We need to, of course, define a constructor which is going to initialize our adjacency matrix. And we're going to fill in some of the other methods as well. Now in particular, when we're talking about graphs, a useful method to keep in mind is to think about whether two vertices are in fact neighbors of one another, it there's an edge that goes from one vertex to another. So thinking about this implementation of the graph where we have our edge relationship defined through this adjacency matrix. How long will it take, to test whether there's an edge between one vertex and another? Okay, so let's come back and think about how to finish off this adjacency matrix representation of the graph. And in particular, let's reflect on what properties we've seen as we've built up this class. So you'll write up some of these other methods as part of the project, and also we'll guide you through some of them in the support videos. But as you're writing these methods, you want to think about the performance implications of the data structure that we've chosen. So we've chosen a data structure that is very algebraic. A 2D array, which we can think of abstractly as a matrix that we can do linear algebra on. And so it turns out that once we represent a graph as this algebraic representation, then the linear algebra tools are very powerful. And you'll see one example of that in a later video. But as you go on and think about more graph algorithms and more applications of graphs, you see linear algebra coming in again, and again, and again. And so, thinking of the adjacency matrix as a crucial representation of the graph is indeed very powerful. But it does have a lot of drawbacks. Because even though it was really fast for us to implement the test for edges method as you saw in the video quiz. And even though it's really fast to add and remove edges because we're just dealing with a single entry in the 2D array. If we want to change the big structure, the underlying structure of the graph by adding vertices, that took a lot of time. And the most important piece that is a drawback for the adjacency matrix is that when we think about this storage that's required an adjacency matrix, we need to store a 2D array, whose dimensions, are the number of the vertices. So that means, that we're storing n squared pieces of information if n is the number of vertices. And that can get really, really big. So think back to what these vertices represent. These vertices are cities, or they're intersections, or in biological representations, they might be genomes or they might be little pieces of genomes. Or if you think back to the World Wide Web analysis, then each of these vertices is a unique URL, unique website. Or if you're thinking about social networks, then each of these vertices is a person. And in all of these, think in all of these applications, the set of vertices is huge. Not only is the set of vertices huge, most vertices are not related to one another. So most people are not friends with most other people in this world. Our communities don't encompass the whole globe, don't encompass the whole world. Similarly, most cities don't have nonstop flights to all other cities or to many, many other cities in the world. Usually we have to think about going out gradually and doing lots of shorter flights in order to get from one place in the world to another. And so, even though when we have a lot of vertices there's a lot of potential edges, there's a lot of potential relationships, many of those edges just aren't in the graph. Whereas storing the adjacency matrix of a graph requires us to store information about every single pair of vertices. And that can take up a huge amount of space. And so if space is a concern, if we're trying to depict a big graph, then often adjacency matrices are prohibitively big. We can't store the whole adjacency matrix. And so, we're going to think about another implementation of the edge relation description of the graph that will lend itself better in many practical applications.