Build an eDoc Reader for your iPod, Part 2

This is the second part of a three-part series that teaches you how to make reading
electronic documents (such as books and PDF documents) on your iPod easy and
enjoyable. This installment delves into the engine of our application and adds
some user interface conveniences through NSUserDefaults. Next time, we finish by
adding text extraction from PDF documents by exploiting the Cocoa-Java bridge.

Last Time

Let's briefly review the openFileDialog: method from last time and address
memory management as it applies to this project. The first thing you should notice
about openFileDialog: is that it receives a pointer to the object that triggered
it, the sender. For buttons, the condition [sender isEqualTo:sourceButton] comes
down to a pointer comparison, so it's easy to tell which button was clicked.

If Source File was pressed, we configure the NSOpenPanel to disable the selection
of directories via [openPanel setCanChooseDirectories:NO] because
we want to guarantee that a file is chosen. We also limit the selection to
files ending in "txt" or "pdf" because they're the formats we'll work with
in this series.

If the Destination button was clicked, we guarantee that a directory is
chosen, because we need a location to copy the source file's pieces to after it's
parsed. The other messages we pass to the NSOpenPanel should be pretty clear
just from reading the code, which is one of the many nice things about Objective-C
syntax. You can always use Xcode's help menu if you need it.

There are a few interesting string methods in openFileDialog:. On Unix-based
systems such as Mac OS X, the tilde (~) is a shortcut for the current user's home
directory. Type echo ~ in Terminal to check it out for yourself. The Cocoa
method stringByExpandingTildeInPath does just what it sounds like. This is
nice because we often like to start navigation in the user's home directory
as a convenience to them, and this is a really simple way of making that happen.
As another convenience, we start navigation for the destination directory in /Volumes,
because the iPod is normally mounted there.
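As a minimal sketch of the starting-directory choice just described (the helper name PRStartDirectory is hypothetical and not part of the article's code), the logic might look like this:

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: picks the directory where the open panel starts.
static NSString *PRStartDirectory(BOOL forSourceFile) {
    if (forSourceFile) {
        // "~" expands to the current user's home directory.
        return [@"~" stringByExpandingTildeInPath];
    }
    // The iPod is normally mounted under /Volumes.
    return @"/Volumes";
}
```

The returned path could then be handed to whichever NSOpenPanel call starts navigation in a given directory.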

The line of code if (NSOKButton == [openPanel runModalForTypes:typesArray]) opens
the file dialog window, and if the user has clicked on OK (as opposed to Cancel),
we're guaranteed to have a valid selection because we configured our NSOpenPanels
accordingly. Because NSOpenPanel can handle multiple selections, the result
is contained in an NSArray. We retrieve the selection with [selection
lastObject] because it is the only object in the array and use this
value to set the corresponding NSTextField. Finally, we enable the Copy It button
if both text fields have valid values.

A final note to take from last time is the reason why we passed the retain message
to the NSArray in the init: method. In general, you should assume that objects
created with convenience constructors like arrayWithObjects: are autoreleased,
meaning they sit in the current autorelease pool. Thus, if you don't explicitly
retain them, they're gone with the wind once that pool is released, which typically
happens at the end of the current pass through the event loop. Objects you create
with alloc or new, and objects you retain, must be explicitly released with a
release message, usually in dealloc. If you're craving more information on memory
or have no idea what that just meant, check out the reference I pointed you to last
time, "Introduction to Memory Management."
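To make the retain-and-release pairing concrete, here is a small sketch. It assumes manual reference counting (no ARC), as was the norm when this series was written; the class name PRTypesHolder and its accessor are hypothetical and only stand in for the controller from last time.

```objc
#import <Foundation/Foundation.h>

// Hypothetical class for illustration; assumes manual reference counting.
@interface PRTypesHolder : NSObject {
    NSArray *typesArray;
}
- (NSArray *)types;
@end

@implementation PRTypesHolder

- (id)init {
    self = [super init];
    if (self != nil) {
        // arrayWithObjects: returns an autoreleased array, so retain it
        // to keep it alive after the current autorelease pool is released.
        typesArray = [[NSArray arrayWithObjects:@"txt", @"pdf", nil] retain];
    }
    return self;
}

- (NSArray *)types {
    return typesArray;
}

- (void)dealloc {
    // Balance the retain from init.
    [typesArray release];
    [super dealloc];
}

@end
```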

On to the Model

With that recapitulation, we can now proceed to build our engine. In your
Xcode project, choose File -> New File, select Objective-C class, and
click on Next. Name the class "TextChunker.m", ensure that "Also create TextChunker.h" is
checked, and click on Finish to create and add the files to our PodReader
target. Drag the TextChunker files into the Classes folder of your project.
Once that's done, replace the TextChunker files that Xcode generated with these files: TextChunker.h and TextChunker.m.
I recommend you do this through Finder, rather than through Xcode.

The header file, TextChunker.h, tells you all you need to know if you simply
want to use the TextChunker class and are not interested in how it works. The
method signature for chunkIt: specifies a source file, a destination
directory, and a value like CHAPTER or SCENE that should be used to segment
the chunks of text.

The file TextChunker.m is where all of the work happens. At the top of the
method chunkIt:, there are three constants that designate the guidelines
for chunking the text. We use the constant PR_CHUNK as a guideline to keep
the chunk sizes smaller than 4K. Technically, 4K is 4096 bytes, but we're using
some magic to link "pages" of our book together using the HTML anchor tag (this will be discussed
in a moment), so we leave an ample margin to account for it.

The constant PR_INITIAL_CUT_POINT provides a starting location to segment
a chunk if it is larger than PR_CHUNK, and PR_CUT_DELTA is used to incrementally
decrease PR_INITIAL_CUT_POINT if cutting at PR_INITIAL_CUT_POINT still
doesn't decrease the chunk size. These values will make more sense as we move on.
Specifying these values one time at the top of the method allows for easy adjustment.
For example, if you should want 2K chunks instead of 4K chunks, simply change PR_CHUNK and
you're all done.

The next few lines of code and the first while loop create and load an
NSMutableArray with chunks of text that are partitioned according to a separator
value. Notice how simple it is to load an entire text file into a string value
with [NSMutableString stringWithContentsOfFile:fileName]. The
NSString method rangeOfString: returns an NSRange, so if you're unfamiliar
with NSRange, look it up. You'll notice that it's a struct with two parts,
a location and a length. We're interested in using the location to determine
where a particular section of text in fileContents starts.

We start the substring search at index 1 instead of 0 because we want to
skip an occurrence of the separator at the very beginning of the string, which
is quite likely to happen at some point. For example, if we started at index 0 and
segmented the following excerpt from Shakespeare's Macbeth on the word "ACT," our
code would spin in an infinite loop because it could never shorten the string
containing the text. It would repeatedly identify the first word of the excerpt as
the separator and never move on.

ACT I. SCENE I.
A desert place. Thunder and lightning.
Enter three Witches.
FIRST WITCH. When shall we three meet again?
In thunder, lightning, or in rain?
SECOND WITCH. When the hurlyburly's done,
When the battle's lost and won.
THIRD WITCH. That will be ere the set of sun.
FIRST WITCH. Where the place?
SECOND WITCH. Upon the heath.
THIRD WITCH. There to meet with Macbeth.
FIRST WITCH. I come, Graymalkin.
ALL. Paddock calls. Anon!
Fair is foul, and foul is fair.
Hover through the fog and filthy air. Exeunt.
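The following is a sketch of the separator search described above, not the actual loop from TextChunker.m. The function name PRSplitOnSeparator is hypothetical; the point it illustrates is that searching from index 1 guarantees the remaining text gets shorter on every pass.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: splits text at each occurrence of separator, keeping
// the separator at the start of the chunk it introduces.
static NSArray *PRSplitOnSeparator(NSString *text, NSString *separator) {
    NSMutableArray *chunks = [NSMutableArray array];
    NSString *remaining = text;

    while ([remaining length] > 1) {
        // Search from index 1 so a separator sitting at index 0 is skipped.
        NSRange searchRange = NSMakeRange(1, [remaining length] - 1);
        NSRange hit = [remaining rangeOfString:separator
                                       options:0
                                         range:searchRange];
        if (hit.location == NSNotFound) {
            break;
        }
        // hit.location is at least 1, so remaining always shrinks.
        [chunks addObject:[remaining substringToIndex:hit.location]];
        remaining = [remaining substringFromIndex:hit.location];
    }

    if ([remaining length] > 0) {
        [chunks addObject:remaining];
    }
    return chunks;
}
```

Called on the excerpt above with @"ACT" as the separator, this sketch would return the whole excerpt as a single chunk, because the only occurrence of the separator is at index 0.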

Another detail of interest is that we use NSMutableArray and NSMutableString
as opposed to NSArray and NSString. Cocoa distinguishes between mutable and
immutable classes. In general, there's a performance overhead for handling
mutable classes for reasons that deal with memory allocation and management.
Immutable objects, on the other hand, do not require this overhead. Use
immutable objects whenever you don't need to change their contents.

An interesting implementation detail that you might like to know is that most
mutable classes inherit from immutable parents. Looking up NSMutableArray
in Xcode's help shows you that it inherits from NSArray. We use the mutable
versions of these classes because the next block of code is very likely to
alter both the string values in the array and the array itself.
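As a brief illustration of the two kinds of change mentioned here (the function name PRMutableDemo is hypothetical), consider:

```objc
#import <Foundation/Foundation.h>

// Hypothetical demonstration of why the mutable classes are needed.
static NSMutableArray *PRMutableDemo(void) {
    NSMutableString *first = [NSMutableString stringWithString:@"ACT I."];
    NSMutableArray *chunks = [NSMutableArray arrayWithObject:first];

    // Alter a string that is already stored in the array.
    [first appendString:@" SCENE I."];

    // Alter the array itself.
    [chunks addObject:[NSMutableString stringWithString:@"ACT II."]];
    return chunks;
}
```

Neither change would be possible with plain NSString and NSArray instances, which offer no methods for modifying their contents.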

A final note to take from this first block of code regards some of the many NSString
methods: length, paragraphRangeForRange:, lineRangeForRange:, and stringByTrimmingCharactersInSet:. They're
all easily found in Xcode's help, which you should be a big fan of by this
point, so check them out. There's no better way to learn what tools you have
available to you than to dig through the API.

The next block of code resizes the chunks of text as needed. The high-level
algorithm is like so:

Walk through the array:

  If the current chunk is larger than PR_CHUNK, create two chunks:

    Make the first chunk as close to size PR_CHUNK as possible.

    Whatever is left over forms the next chunk.

  Otherwise, try to combine this chunk with the previous chunk in the array.

The logic is designed to produce neatly organized sections of text where it can,
but it also tries to maximize the reading time on each "page" of your iPod by
filling up as much of the PR_CHUNK size as possible.
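The high-level algorithm above can be sketched as follows. This is a simplified stand-in, not the code in TextChunker.m: the function name PRResizeChunks is hypothetical, the size limit is passed in as a parameter rather than read from PR_CHUNK, and the cut point is simply the last newline that fits instead of the PR_INITIAL_CUT_POINT and PR_CUT_DELTA search.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns chunks of at most maxLength characters each.
static NSArray *PRResizeChunks(NSArray *chunks, NSUInteger maxLength) {
    if (maxLength == 0) {
        return chunks;  // nothing sensible to do
    }
    NSMutableArray *pending = [NSMutableArray arrayWithArray:chunks];
    NSMutableArray *result = [NSMutableArray array];
    NSUInteger i = 0;

    while (i < [pending count]) {
        NSString *chunk = [pending objectAtIndex:i];

        if ([chunk length] > maxLength) {
            // Prefer to cut just after the last newline that still fits.
            NSRange newline = [chunk rangeOfString:@"\n"
                                           options:NSBackwardsSearch
                                             range:NSMakeRange(0, maxLength)];
            // Simplification: fall back to a hard cut if no newline fits.
            NSUInteger cut = (newline.location == NSNotFound)
                ? maxLength
                : newline.location + 1;
            [result addObject:[chunk substringToIndex:cut]];
            // The leftover takes the chunk's place and is examined next.
            [pending replaceObjectAtIndex:i
                               withObject:[chunk substringFromIndex:cut]];
        } else {
            NSString *previous = [result lastObject];
            if (previous != nil &&
                [previous length] + [chunk length] <= maxLength) {
                // Combine with the previous chunk to fill the "page".
                [result replaceObjectAtIndex:[result count] - 1
                                  withObject:[previous stringByAppendingString:chunk]];
            } else {
                [result addObject:chunk];
            }
            i++;
        }
    }
    return result;
}
```

Because a leftover piece is put back in line rather than written out immediately, it can be split again if it is still too large, or merged with the chunk that follows it.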

The final block of code writes each of the chunks to its own file. Files
in the iPod's Notes directory will appear in alphanumeric order, so naming
the files isn't as simple as just appending a number to the end of their file names. For example,
if you parse Oscar Wilde's The Picture of Dorian Gray from Project
Gutenberg, you'll end up with 122 different files. Naming the files by
simply appending a number to the end of the original file name dgray results
in these files appearing in this order on your iPod: dgray1, dgray10, dgray11,
... dgray19, dgray2, dgray20, ... and so on. Clearly, this doesn't make
navigation convenient. Therefore, we must pad the numeric value on the end
of dgray with the correct number of leading zeros so that "pages" of text
appear in sequential order.
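One way to do the padding is sketched below. The helper name PRPaddedName is hypothetical, and the article's own code may compute the width differently; this version derives it from the total number of files.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: builds names such as dgray007 so files sort in reading order.
static NSString *PRPaddedName(NSString *baseName, NSUInteger index, NSUInteger total) {
    // The largest number we will ever write determines the padding width.
    NSUInteger width = [[NSString stringWithFormat:@"%lu",
                         (unsigned long)total] length];
    NSString *number = [NSString stringWithFormat:@"%lu", (unsigned long)index];

    // Prepend zeros until the number is as wide as the largest one.
    while ([number length] < width) {
        number = [@"0" stringByAppendingString:number];
    }
    return [baseName stringByAppendingString:number];
}
```

With 122 files, the width is three digits, so the second and tenth files become dgray002 and dgray010, and they sort in the order you would read them.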

The last interesting step appends a hyperlink to the next "page" using
a simple HTML anchor tag. The hyperlink appears in dark underlined letters.
Once you're at the
bottom of a "page," just press the button in the center of your scroll wheel
to proceed to the next "page." Press the Menu button to go back. Question:
could reading on the iPod get any easier?
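A minimal sketch of the linking step follows. The helper name PRAppendNextLink, the link text "Next", and the exact form of the href are assumptions made for illustration; check TextChunker.m for what the project really writes.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: appends a link to the next "page" at the end of a chunk.
static NSString *PRAppendNextLink(NSString *pageText, NSString *nextFileName) {
    // The link text "Next" and the bare file name as href are assumptions.
    return [pageText stringByAppendingFormat:@"\n<a href=\"%@\">Next</a>",
            nextFileName];
}
```

Keep in mind that the link itself counts toward the size of the note, which is why PR_CHUNK leaves a margin below 4096 bytes.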

Icing on the Cake

At this point, you could allocate a TextChunker to do your bidding when the Copy
It button is clicked, and be done with it. But we'll quickly throw in a few
more niceties. One such thing is to remember the last used source file
and destination directory. Users will typically store their electronic
books in the same location, and the iPod's location doesn't change. Small conveniences
like these are icing on the cake, but significantly enhance usability.

We'll set the values of the NSTextFields in our application by storing their
previous values with the NSUserDefaults class: simply record these values each
time the user changes them, and load in their previous values each time the
application starts up. If you'd like additional information on NSUserDefaults,
you can check out "Mac OS X's Preferences System (and More!)" here on Mac DevCenter, or Xcode's
help. One thing you'll want to do in your project is to open the Info Window
on PodReader, which is located under Targets, and input an appropriate identifier
value and version, which you'll find in the Properties tab. The identifier
is formed like a Java package structure, so you can enter something like "com.yourname.PodReader" and
it will work fine. Once you've run the application, you can go to ~/Library/Preferences if
you'd like to investigate the preferences file, which is named after your identifier
and is simply a special kind of XML file called a plist.

The Info Window for the PodReader target. Set the Identifier and Version
fields here.
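Recording and restoring a text field's value with NSUserDefaults might look like the sketch below. The key name PRSourcePath and both function names are hypothetical; in the application, you would call the save function whenever the user picks a new source file and the load function at startup.

```objc
#import <Foundation/Foundation.h>

// Hypothetical defaults key for the last used source file.
static NSString * const PRSourcePathKey = @"PRSourcePath";

// Record the value each time the user changes it.
static void PRSaveLastSourcePath(NSString *path) {
    NSUserDefaults *defaults = [NSUserDefaults standardUserDefaults];
    [defaults setObject:path forKey:PRSourcePathKey];
}

// Load the previous value at startup, falling back to the home directory.
static NSString *PRLoadLastSourcePath(void) {
    NSString *saved = [[NSUserDefaults standardUserDefaults]
                       stringForKey:PRSourcePathKey];
    if (saved != nil) {
        return saved;
    }
    return [@"~" stringByExpandingTildeInPath];
}
```

The same pair of functions, with a second key, would cover the destination directory.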