Taught By

Ira Pohl

Transcript

Okay, let's turn to yet another example. Most of us, students and myself included, learn from experience, from practice, from seeing example code. Once we see an example rather than some abstract document, and the example is archetypal, that idiom lets us understand the technique and absorb it.

So, in this example, we're going to read a series of words from a file into a vector, and then we're gonna sort them. You can imagine that you're a humanities professor and you want to verify that a certain play was written by Shakespeare, where maybe there was some controversy. One of the methods you might use is to look at the usage of words, how often they get used. Every author seems to have usage patterns, and there are statistics you can apply to their texts to verify that it's likely that it was Shakespeare. These techniques also get used, for example, on the Bible, to figure out whether there are multiple authors and where there seem to be stylistic changes from one part of the text to another.

And again, we're going to demonstrate how powerful STL is. If we tried to write this in old-style C or even pre-STL C++, we would have had to write many more lines of code. All of that code would be subject to difficulty; it would make for pretty low-level routines that could have lots of runtime errors. But here, we have the ability to just reuse code. Reusing code is a silver bullet that allows us to quickly and efficiently write large pieces of code.

Again, we need to know what STL headers we need: iostream, iterator, and fstream. fstream, again, is for the files. And now we're also gonna choose to use algorithm. algorithm is where, in our case, there's going to be a sorting routine.
Again, this would've required a bunch of lines of code. The sorting routine, as you'll see later on when we talk a little more about it, is based, of course, on the quicksort algorithm. And again, one of the most useful container classes is our vector class.

So, we open word.txt. We can imagine some Shakespeare play, Hamlet say, sitting in a big document; you can find this stuff all over the Internet. You can get some big file, and it has words in it. And we need an istream iterator that's gonna migrate over and visit that file. And we're gonna use what we used before, which is two pointer types, a start and an end. The start will point to the initial place in the open text file, word.txt. word.txt is a local file that contains the information we want to process, and that information is going to be a series of strings. Words are strings. And, again, we can use this kind of constructor to basically go off to these pointers, which are now pointing at a file that's to be read, and initialize this vector.

And critical to that initialized vector is that the actual reading of the file uses white space to distinguish individual strings. So we don't have to write, for example, some elementary routine that would crunch characters looking for white space to separate out the individual strings. And we also get the ability, cuz string is a dynamic type, to represent everything from small words like "a" and "the" to large words like "polysyllabic". So arbitrary strings, which in this case are hoped to be words, can be dynamic.

And here we go and start processing those words. So the words are read in, and now we're gonna use our range for statement. Again, a very powerful feature that simplifies a lot of coding in modern C++11. So what does a range for do? It looks at, in general, some kind of container, and it visits everything in turn. So we can't get a range error. And it decides what it needs in visiting the individual items, so auto. auto is going to pick out an individual string.
So auto is going to use the compiler to infer what the appropriate type is. And then all we wanna do, for example, is print out and see what's actually in that file. So this stuff is just going to print out that file. So if the file said "The quick brown fox", that's what we would see on the screen, and each of those words would be separated by a tab character.

And here we invoke sort, the standard library sort out of algorithm. It's a two-argument thing. The two arguments are iterators. If you look at the logic of STL, any time you see an STL algorithm, it's generally going to work with iterators, because that's the most powerful logic. That's how all of this was designed to be very integrative.

And again we use our range for. And this time, after the sort, we're going to see something different. So if it was just "The quick brown", these would be reordered. I think we actually would see "The brown quick", and check me on this. The reason we'd see it in that order is that the first letter of "The" is uppercase, and uppercase letters have lower ASCII values than lowercase letters like the b's and q's.

So the range for idiom simplifies how we visit a container class. The sort is STL, so it's an extremely efficient algorithm we don't have to rewrite. And once the words are sorted, we can do interesting things like frequency counts. So again, try it, and see if you can do some other things. Another thing you might consider is that maybe it's inappropriate to distinguish between capitalized and uncapitalized letters, lower and uppercase. So maybe you would want to preprocess the file so that everything is lowercase. So that would be an exercise: change everything to lowercase first, then process the file. And that's all. This is very powerful. You can see that we're getting a lot done, manipulating a text file with rather little effort.
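One possible take on the sorting step and the lowercase exercise is sketched below. It is only an illustration under stated assumptions: to_lower and sorted_words are hypothetical names, the fold_case flag is an invented parameter, and the lowercase conversion relies on std::tolower, so it only handles single-byte characters in the current C locale.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <vector>

// Hypothetical helper: return a lowercase copy of a word.
std::string to_lower(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(), [](unsigned char c) {
        return static_cast<char>(std::tolower(c));
    });
    return s;
}

// Hypothetical helper: return the words in sorted order. When fold_case is
// true, every word is lowercased first, as suggested in the exercise.
std::vector<std::string> sorted_words(std::vector<std::string> words,
                                      bool fold_case) {
    if (fold_case) {
        for (auto& w : words) {
            w = to_lower(w);
        }
    }
    // std::sort takes two iterators marking the range to be sorted.
    std::sort(words.begin(), words.end());
    return words;
}
```

With the input words The, quick, brown, fox, calling sorted_words with fold_case set to false gives The, brown, fox, quick, because the uppercase T compares lower than any lowercase letter. With fold_case set to true the result is brown, fox, quick, the.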
If you know STL, it just brings a lot of firepower and lets you write very concise, but very understandable and remarkably correct code. Why? Because all of these sub-pieces have been thoroughly checked out.

So, conventionally, that algorithm, quicksort by Hoare, is called with an iterator range. And again, I told you, we're gonna see this a lot. We're gonna see an iterator range: a starting iterator and an ending iterator. So we could have sorted something that was a shorter range. We didn't have to pick the whole vector of words from the file; we could do something else. But this would be very typical, the starting iterator and the ending iterator.

The other thing we have to know about this is that the algorithm requires random access. So these iterator values must be random access iterators for this to work. If they weren't random access, this would fail.

And notice in my comments on the slide: see if you could try to write your own code, if you're up for that. Write your own quicksort, compare it timing-wise, and see if you could do better. This is remarkably efficient code, and it's done by professionals in this library. And they know the libraries wouldn't be used if they were easily improved on. Professional use requires that the code be highly efficient.
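The two points about iterator ranges can be sketched as follows. This is a minimal illustration with hypothetical helper names (sort_prefix and sort_list). It assumes the usual behavior that passing non-random-access iterators to std::sort is rejected by the compiler, which is why the list version uses the container's own sort member instead.

```cpp
#include <algorithm>
#include <cstddef>
#include <list>
#include <string>
#include <vector>

// Hypothetical helper: sort only the first n words, leaving the rest alone.
// This works because vector iterators are random access iterators.
void sort_prefix(std::vector<std::string>& words, std::size_t n) {
    n = std::min(n, words.size());  // never run past the end
    std::sort(words.begin(),
              words.begin() + static_cast<std::ptrdiff_t>(n));
}

// Hypothetical helper: std::list only offers bidirectional iterators, so
// std::sort(words.begin(), words.end()) would not compile here.
// The list's member function sort() is used instead.
void sort_list(std::list<std::string>& words) {
    words.sort();
}
```

For example, given the words d, c, b, a, calling sort_prefix with n equal to 2 leaves the vector as c, d, b, a: only the first two elements are put in order.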
