Displaying the Texts

When I first integrated the texts, the parser, and the database, I created a Web site to display the few plays of Open Source Shakespeare. There were two Web pages for each play. The first was the menu page, which showed the play’s acts and scenes on the left and a character list on the right (Figure 5). This page linked to the text display page, which showed the text of a range of scenes (Figure 6). The range might include anything from a single scene to the entire play. These pages are still in use, although they have since been refined many times.

At first, the text display page just showed the act and scene indicators, with the characters’ lines and stage directions underneath. The only navigational aid was a link back to the play menu. Users could not jump from one scene to the next, nor from one act to the next. I thought that creating fancier navigation aids, which would require at least one or two additional database queries, would slow down the page display and frustrate users. When I tested those features, however, they slowed the page by only a fraction of a second, so I gladly included them.

Looking at an open-source encyclopedia, I noticed a small yet nifty feature. When a user double-clicks on any word, the site redirects the user to a page with a definition of that word. I appropriated this feature for OSS, and so when you click on a word while viewing a work, or you click on a word in the search results, it pulls up that word in the concordance.

The last significant addition to the play view function was the line number display. This was less straightforward than it sounds. Displaying a number to the right of every line would have been easy to program, but the result would have looked ugly. The convention of displaying line numbers every five lines, followed by Harrison and others, looked quite readable on the screen. (The print version of the Globe shows them every ten lines, but the typeface is very small – perhaps 6.5 points, about half the height of the text on this page – and the lines are much closer together.)

The problem was that the text lines are not stored one by one in the database. They are stored as part of a character’s line, so a soliloquy spanning forty lines of text is stored as a single long string of data, with the indicator [p] showing where each line break occurs within that line. That soliloquy might begin on line 937 within the play, so the first line would not be numbered because 937 is not divisible by five. The numbering would need to begin with the fourth line of the speech (line 940) and continue every five lines through line 975.

The play view function does this by looping through each break within the line. If the resulting line number is a multiple of five, the number is displayed at the right of the line, separated by an adequate amount of whitespace. I feared that performing these calculations might slow down the play view process. It did, but by less than a second, a trivial cost for so valuable a feature.
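The loop described above can be sketched roughly as follows. This is a minimal illustration in Python, not the site's actual code; the function name number_lines, the interval and width parameters, and the fixed-width padding are all assumptions made for the example. Only the [p] marker and the every-five-lines rule come from the description above.

```python
def number_lines(text, first_line_number, interval=5, width=60):
    """Split one stored speech on the [p] marker and number every fifth line.

    Hypothetical helper: `first_line_number` is the play-wide number of the
    speech's first line; `width` is an assumed column for the numbers.
    """
    rendered = []
    for offset, line in enumerate(text.split("[p]")):
        line = line.strip()
        line_number = first_line_number + offset
        if line_number % interval == 0:
            # Pad the text so the number sits in a column at the right.
            rendered.append(f"{line:<{width}}{line_number:>6}")
        else:
            rendered.append(line)
    return rendered


# A five-line speech that starts on line 937: only line 940 gets a number.
speech = "first[p]second[p]third[p]fourth[p]fifth"
for row in number_lines(speech, 937):
    print(row)
```

Because the line number is derived from the speech's starting number plus an offset, no extra database query is needed for the numbering itself.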

Although they are stored in the same table as the plays, the poems and sonnets must be displayed differently because they are laid out differently. The poems were rather easy, although their forms vary significantly. The page that displays the poems, poem_view.php, has to take into account which poem it is displaying, as some poems have more than one part. (Figure 8 shows the poem list, and Figure 9 shows the poem view.)

To display one sonnet is a simple thing, but not as useful as being able to display more than one (Figure 10). I settled on four different ways of viewing sonnets:

A single sonnet;

Two sonnets side-by-side;

A range of sonnets selected by the user; and

All sonnets at once.

This arrangement lets readers and scholars compare sonnets as their needs require. The only difficulty I ran into was sonnet 99, which has fifteen lines instead of the usual fourteen. The parser, when it was reading the sonnets, looped through all of them sequentially, expecting to see the same number of lines in each one. I spent about a half-hour in frustration, looking through the code and wondering why the parser was misreading sonnets 100 through 154, thinking it was a flaw in the program itself. Once I saw the error’s cause, I added a few lines of code to handle the exception, and all was well (Figure 11).
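One way to handle that kind of exception is a small lookup table of irregular lengths, as in the sketch below. This is an assumed reconstruction in Python, not the parser's real code: the names split_sonnets, DEFAULT_LENGTH, and IRREGULAR_LENGTHS are hypothetical, and the input is assumed to be a flat list of verse lines with headings already removed.

```python
DEFAULT_LENGTH = 14
# Sonnet 99 has fifteen lines; the table is a hypothetical place to record
# any other sonnet that departs from the default length.
IRREGULAR_LENGTHS = {99: 15}


def split_sonnets(lines, first_number=1):
    """Group a flat list of verse lines into sonnets, keyed by sonnet number."""
    sonnets = {}
    position = 0
    number = first_number
    while position < len(lines):
        length = IRREGULAR_LENGTHS.get(number, DEFAULT_LENGTH)
        sonnets[number] = lines[position:position + length]
        position += length
        number += 1
    return sonnets


# Three sonnets' worth of placeholder lines: 14 + 15 + 14 = 43.
lines = [f"line {i}" for i in range(43)]
sonnets = split_sonnets(lines, first_number=98)
print({number: len(body) for number, body in sonnets.items()})
```

Without the table entry, sonnet 99 would be cut one line short and every later sonnet would start one line early, which matches the symptom described above.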

There was a popular Shakespeare concordance at www.concordance.com, but unfortunately the owner died years ago, and his site disappeared shortly thereafter. The Works of the Bard can pull up all the instances of a word and display their contexts (Farrow), but no other site I found could do even that – the other sites had search mechanisms which returned a list of scenes that you could view if you clicked on them, but they did not provide the word’s context. I wanted to go beyond a listing of instances, and set up a “real” concordance where people could browse and look up words, like a printed concordance.

To do this, I added a function to the parser so it would keep a count of each individual word form as lines were added to the database. I use the term “word form” to mean an inflected instance of a particular word. (Lexicologists would call the base word a “lemma,” but OSS is supposed to include a non-academic audience, and I thought using that term might turn off potential users.) Thus play is the word, and plays and playing are the word forms. I use “word instance” to describe a word form at a particular place in a particular work.

Now, you can tell at a glance how many instances there are of a particular word form, and OSS does not have to do any extra calculations – the parser has already performed all of those counts. Once you find a word form you wish to see, either in a list or through the specialized word search function, you can click to see a breakdown of how many times it appears in each work (Figure 13). You can then display the lines containing the word form.
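The idea of counting word forms once, at parse time, can be sketched as follows. This is a simplified Python illustration, not the OSS parser: the function name count_word_forms, the input shape (a mapping from work name to lines), and the lowercase letters-only tokenization are all assumptions. Each inflected form is counted separately, so play and plays get their own totals.

```python
import re
from collections import Counter, defaultdict

# Assumed tokenization: runs of letters, compared case-insensitively.
WORD_PATTERN = re.compile(r"[a-z]+")


def count_word_forms(lines_by_work):
    """Return overall and per-work counts of each word form."""
    totals = Counter()
    per_work = defaultdict(Counter)
    for work, lines in lines_by_work.items():
        for line in lines:
            forms = WORD_PATTERN.findall(line.lower())
            totals.update(forms)
            per_work[work].update(forms)
    return totals, per_work


sample = {
    "Hamlet": ["The play is the thing", "Plays and playing"],
    "Macbeth": ["the play"],
}
totals, per_work = count_word_forms(sample)
print(totals["play"], per_work["Hamlet"]["play"], per_work["Macbeth"]["play"])
```

Storing both the totals and the per-work breakdown means the concordance pages only have to read precomputed numbers rather than scan the texts on every request.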

The word form information also undergirds much of the data for the Statistics page (Figure 14). The top 15 word forms are listed, as well as some individual facts that shed some light on Shakespeare’s use of language. For instance, there are 12,493 word forms that are used only once in all of his works. Also, the top 100 word forms make up 53.9% of all the word instances.
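Figures like these follow directly from the stored counts. The Python sketch below shows one way they might be derived; the function name summarize and the tiny sample counts are invented for illustration and are not the site's data or code.

```python
from collections import Counter


def summarize(totals, top_n=100):
    """Derive summary statistics from a Counter of word-form counts."""
    instances = sum(totals.values())
    # Word forms that occur exactly once across all works.
    used_once = sum(1 for count in totals.values() if count == 1)
    top_total = sum(count for _, count in totals.most_common(top_n))
    top_share = top_total / instances if instances else 0.0
    return {"instances": instances, "used_once": used_once, "top_share": top_share}


# Invented sample counts, with top_n lowered to suit the tiny sample.
sample_totals = Counter({"the": 5, "and": 3, "play": 1, "bodkin": 1})
print(summarize(sample_totals, top_n=2))
```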

One final, modest feature is the character search (Figure 15). As there are over 1,200 characters in Shakespeare’s plays, and some of them have similar or identical names, it is useful to have help when sifting through them. There are two Portias, three Demetriuses, five Antonios, twenty-one characters listed as “Servant,” many lines listed as “All,” etc. If you know the name, you can search for it; if you are not sure of the spelling, you can search for the first part of the name.
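A search on the first part of a name amounts to a case-insensitive prefix match. The Python sketch below illustrates the idea with an in-memory list; the function name search_characters and the (name, play) tuple layout are assumptions, and a database-backed site would more likely express the same match as a query against its character table.

```python
def search_characters(characters, query):
    """Return the (name, play) pairs whose name starts with the query text."""
    needle = query.strip().lower()
    return [
        (name, play)
        for name, play in characters
        if name.lower().startswith(needle)
    ]


characters = [
    ("Portia", "The Merchant of Venice"),
    ("Portia", "Julius Caesar"),
    ("Antonio", "The Tempest"),
]
# A partial spelling finds both Portias, each listed with her play.
for name, play in search_characters(characters, "port"):
    print(name, "-", play)
```

Returning the play alongside each name is what lets a reader tell identically named characters apart.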