A little foresight in planning your
input can save a lot of time and frustration. In Part III, the author
tells how to handle some common input problems and offers some advice
on how to prevent problems down the road.
In the first two installments we discussed setting goals for the kind
of system you want, the types of files, and what kind of output is
best. For most cases, a relative file structure with index files gives
flexibility and speed. The index files will be composed of index words
which will be either shortened versions of data in the records
themselves or bytes encoded with some kind of bitmapping.
Before discussing input strategies, let's review
some of the ideas from Part II in a bit more detail. We discussed
setting up a buffer for inputting keys or index words. This buffer can
be any free area of RAM. It must be large enough to
accommodate the record or field to be compared. For example, if your
index word is the first eight letters of the author's name, create an
eight-byte buffer for your comparison.

A Closer Look At Indexing
Another technique we discussed was building your index file into your
record format. For example:

| AUTHOR | SUBJECT | TITLE | YEAR | INDEX | AUTHOR ...
|←-------------- Record 1 ---------------→|← Record 2 ...

After entering your first record - author, subject,
title, year - you can reserve several bytes at the end of that record
to create an index file. If you choose to bitmap here, as illustrated
in last month's installment, you gain search efficiency, although
creating the index this way may seem tedious at first.
If you use one byte in the index for each field, you
then have eight bits, or 256 possible combinations, for each field,
which in most cases would be more than adequate. Using last month's
illustration, a bit
configuration of 1000 0000 would indicate a subject on computers. Since
the integer equivalent of a binary 1000 0000 is 128, you can use this
with an AND for compare. Let's say you've chosen the variable SU (for
subject field). The appropriate line would be:

IF SU AND 128 THEN GOTO n

where n is a line that will direct a PRINT to screen or printer.
When using an AND, the computer will test individual
bits. The value in SU, 1000 0000, is compared to 128:

1000 0000 (SU)
1000 0000 (128)
1000 0000 (result)

By the Boolean truth table, remember, a bit in the result is 1 only
where both values have a 1. The result here is nonzero, which BASIC
treats as "true," thus a "hit" is made in your search.
In some cases, depending on the total number of
subjects you want to index, it might be practical to assign variable
names to the binary equivalents:

A=1    E=16
B=2    F=32
C=4    G=64
D=8    H=128

Then, IF SU AND H THEN n.
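For readers who want to try the idea away from the machine, here is a minimal sketch in Python rather than the BASIC used in this series. The letter names and values come from the table above; the helper `has_subject` is a hypothetical name introduced only for illustration.

```python
# Flag values from the table above: one bit per subject.
A, B, C, D, E, F, G, H = 1, 2, 4, 8, 16, 32, 64, 128

def has_subject(su, flag):
    """Return True when the bit for `flag` is set in the subject byte `su`."""
    return (su & flag) != 0

su = 0b10000000               # subject byte of a record about computers
print(has_subject(su, H))     # tests bit 7, which is set
print(has_subject(su, G))     # tests bit 6, which is clear
```

Naming the values this way keeps the search lines readable once more than two or three subjects are in use.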
Let's say you're searching for a more specific
subject, computers in education. We'll assign the subject of education
a binary 0100 0000 (or integer 64). A computer subject, remember, was
assigned 1000 0000 (128). A book dealing with computers and education
would then be 1100 0000 (192). Our search statement would be:

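IF (SU AND 192) = 192 THEN n

The comparison with 192 matters. SU AND 192 by itself is nonzero, and
therefore "true," when either bit is set, so it would also select books
on computers alone or on education alone.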

Obviously, if you use this method, you'll have to be
very thorough in creating your index. No matter what method of indexing
you choose, do it carefully - your search speed and accuracy depend on
it.
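The difference between matching any of the chosen bits and matching all of them is easy to overlook, so here is a small Python sketch of both tests (Python, not the BASIC of this series). The values 128 and 64 are the ones assigned above; the function names and the sample subject bytes are assumptions made up for the illustration. In BASIC, the "all" test corresponds to IF (SU AND 192) = 192.

```python
COMPUTERS = 128   # 1000 0000, as assigned in the article
EDUCATION = 64    # 0100 0000, as assigned in the article

def matches_any(su, mask):
    """True when at least one bit of `mask` is set in `su`."""
    return (su & mask) != 0

def matches_all(su, mask):
    """True when every bit of `mask` is set in `su`."""
    return (su & mask) == mask

mask = COMPUTERS | EDUCATION          # 192

# Made-up subject bytes, one for each case worth checking.
samples = {"computers only": 128, "education only": 64,
           "both": 192, "neither": 1}
for name, su in samples.items():
    print(name, matches_any(su, mask), matches_all(su, mask))
```

Only the "both" entry passes `matches_all`, while the first three entries all pass `matches_any`.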
If you choose not to use the bitmapping method, a
word of caution is in order: be sure to write the data that makes up
your index file(s) also in the records themselves. You may later decide
to change the format of an index file to rewrite a search routine.
Maybe you will be forced to do this to accommodate an index file you
found you needed. The easiest way to create the new index file is to
read the records item by item from the disk and assemble the index that
way, rather than to type it in by hand. The accuracy will be much greater.
Remember that one wrong bit in an index makes the record it refers to
"invisible" to a search.

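Rebuilding an index from the records themselves might look something like the following Python sketch. It is only an illustration: the record layout, the two sample records, and the function names are all invented here, and the index word is simply the first eight characters of the author field in capitals.

```python
def author_index_word(author, length=8):
    """Hypothetical index word: first `length` characters, in capitals, space-padded."""
    return author.upper()[:length].ljust(length)

def rebuild_author_index(records):
    """Read each record in order and derive its index word from the stored data."""
    return [author_index_word(rec["author"]) for rec in records]

# Made-up records standing in for what would be read back from the disk.
records = [
    {"author": "Smith,J.", "title": "First Book"},
    {"author": "Richardson,A.B.", "title": "Second Book"},
]
index = rebuild_author_index(records)
print(index)   # ['SMITH,J.', 'RICHARDS']
```

Because every index word is derived from data already on the disk, a change of index format means rerunning a loop like this one instead of retyping anything.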
System Input Problems
Now for the problems with input. You want a system which is easy to
use. This means giving cues that tell the user what is going on. One
way is to use the top one or two lines on the screen to indicate what
the program is doing or expecting at all times. Another important
feature is to make the screen format logical and easy to understand.
Finally, when inputting new records, there should be
ample opportunity to edit, erase, change, or abort without disturbing
or crashing the program.
Some computers, including my CBM, cannot handle a
string input containing commas. The operating system looks for these
delimiters in an input string. When I input titles of publications,
commas are important punctuation. That means I have to use a roundabout
way of getting the string in without having it cut off at the comma.
There are several ways of doing this. You can use GET and assemble the
string byte by byte.
I have used a nice routine for Commodore equipment
written by Jerry Dunmire (COMPUTE!, December 1981). This routine takes
up to 80 characters in a string which can contain any symbols you
want. If the 80-character limit is exceeded, you can tell by the value
of ST, a status byte in the operating system. Problems like this should
be handled at the outset. Make the system easy to use. A little
frustration becomes a big one when you are typing in data. Having to
substitute something else for commas would be very frustrating.
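The byte-by-byte approach can be sketched in Python as follows. This is not Jerry Dunmire's routine, only a rough illustration of the idea: characters are taken one at a time until a carriage return arrives, so a comma is just another character. The function name, the overflow flag, and the sample title are assumptions.

```python
import io

def read_line_by_char(stream, limit=80):
    """Assemble a string one character at a time, stopping at RETURN.

    Commas are kept because nothing here treats them as delimiters.
    Returns the text and a flag that is True when more than `limit`
    characters arrived before the end of the line.
    """
    chars = []
    while True:
        ch = stream.read(1)
        if ch in ("", "\r", "\n"):      # end of input or RETURN pressed
            return "".join(chars), False
        if len(chars) == limit:          # one character more than will fit
            return "".join(chars), True
        chars.append(ch)

# A made-up title with commas, fed from a string instead of the keyboard.
text, overflow = read_line_by_char(io.StringIO("Computers, Schools, and You\r"))
print(text, overflow)
```

The overflow flag plays the part that the article gives to ST: the caller can tell that the entry was too long and ask for it again.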
One thing to remember in connection with input is
that the program must "know" at all times the number of records on the
disk and the length of each index file. When you enter a new record, it
must go into the very next empty location on the disk. The new record's
index words must be put at the end of the appropriate index files. The
way to save this information from one run to the next is to have a
register pointing to the next record number. Inputting a new record
will cause the register to be incremented by one. When you SAVE the
index files, you should also SAVE this register. If the register is
adjacent to the index files, you can save them all at once.
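As a rough Python sketch of this bookkeeping, the class below keeps the register and one index together and saves them as a single unit. Everything about it is an assumption made for illustration: the class and method names, the use of JSON in place of a memory SAVE, and the choice to number records from 0.

```python
import json

class Catalog:
    """Hypothetical stand-in for the record register plus one index file."""

    def __init__(self):
        self.next_record = 0      # the "register": number of the next empty slot
        self.author_index = []    # one index word per record, in record order

    def add(self, author_word):
        """Put the new index word at the end and advance the register by one."""
        record_number = self.next_record
        self.author_index.append(author_word)
        self.next_record += 1
        return record_number

    def dumps(self):
        """Save the register and the index together, as one unit."""
        return json.dumps({"next_record": self.next_record,
                           "author_index": self.author_index})

    @classmethod
    def loads(cls, text):
        """Restore the register and the index saved by dumps()."""
        data = json.loads(text)
        cat = cls()
        cat.next_record = data["next_record"]
        cat.author_index = data["author_index"]
        return cat

cat = Catalog()
cat.add("SMITH")
cat.add("JONES")
restored = Catalog.loads(cat.dumps())   # the next run picks up where this one stopped
print(restored.add("BROWN"))            # 2
```

Saving the two together means the register can never disagree with the length of the index after a reload.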

Writing The Input
Any writing of data should be done as it is input. For example, if
there is to be a change from ASCII letters (or in my case, PETSCII),
then that ought to be done when the time delay is not objectionable.
After you type a name, and after you have a chance to edit it, you
should be asked to give a final approval. Once this is given, the
program ought to translate parts of the input before writing (sending
the input) to the disk. This might take a few seconds, but if you are
typing records from a list or card file, you will be reading the next
item or moving the pointer on the copy stand while this goes on.
For example, this is how I handle my index file of
authors. On the disk, the author's name is in capital and lowercase,
last name first, with commas and periods after initials. In the index
file all letters are written as pseudo-ASCII caps, and the index word
ends with the eighth letter of the last name. To make pseudo-ASCII, all
you need to do is shorten each ASCII byte to five bits with "AND 31"
(or AND #$1F). If the last name is shorter than eight letters, I let
the following comma and initials appear, too. The key used in searching
for an author is also changed to pseudo-ASCII caps. After the last
letter, the extra bytes, if any, are nulls. As mentioned, the search
program then considers it a match when the next byte of the key is a
null. That way you can search for SMITH,J. or SMITH, or even all the
S's. That's very helpful when you aren't sure about the spelling of a
name. Program 1 in the previous article illustrates this search
technique.
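The translation and the null-terminated comparison can be sketched in Python like this. It is an illustration under stated assumptions, not the search routine from Program 1: the function names are invented, and the characters are ordinary ASCII rather than PETSCII.

```python
KEY_LENGTH = 8  # index word length used for the author file in the article

def pseudo_ascii(text, length=KEY_LENGTH):
    """Keep the low five bits of each character code (AND 31), padded with nulls."""
    codes = [ord(ch) & 31 for ch in text[:length]]
    return bytes(codes + [0] * (length - len(codes)))

def key_matches(key, index_word):
    """Compare byte by byte; a null in the key ends the comparison as a match."""
    for k, w in zip(key, index_word):
        if k == 0:
            return True
        if k != w:
            return False
    return True

word = pseudo_ascii("Smith,J.")                   # index word built from the record
print(key_matches(pseudo_ascii("SMITH"), word))   # True
print(key_matches(pseudo_ascii("S"), word))       # True
print(key_matches(pseudo_ascii("SMYTH"), word))   # False
```

One thing to watch: a space (code 32) also reduces to 0 under AND 31, so this sketch assumes names are stored without spaces, as in SMITH,J.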
Bitmapping is not hard. You can do it in machine
language, but there is no particular advantage in doing so, except
saving program space. The byte in question is zeroed and then the nth
power of two is added to it whenever you want the bit in the nth
position set. You can clear the same bit by subtracting the same power
of two. Be sure the bit is set before you subtract, and be sure it is
clear before you add. You must arrange it so the user cannot
inadvertently set a bit twice or clear a bit that isn't set. The table
shows a routine for inputting subjects by bitmapping.
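The add-and-subtract method, with the guards just described, can be sketched in Python as follows. The function names are assumptions, and this is not the routine from the table, only an illustration of why the checks are needed.

```python
def set_bit(byte, n):
    """Set bit n (0-7) by adding 2**n, but only if it is currently clear."""
    value = 2 ** n
    if byte & value:          # already set: adding again would corrupt the byte
        return byte
    return byte + value

def clear_bit(byte, n):
    """Clear bit n (0-7) by subtracting 2**n, but only if it is currently set."""
    value = 2 ** n
    if not byte & value:      # already clear: subtracting would corrupt the byte
        return byte
    return byte - value

su = 0
su = set_bit(su, 7)     # 128
su = set_bit(su, 7)     # still 128, the second request is ignored
su = set_bit(su, 6)     # 192
su = clear_bit(su, 6)   # back to 128
print(su)               # 128
```

Without the guards, setting bit 7 twice would add 128 twice and overflow the byte. An alternative that needs no guard is to set with a bitwise OR and clear with an AND against the complement of the bit value.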
Particularly sticky situations can always be handled
with a table. An array holding the stored value for each possible value
of the input is one way of doing this: A(N) contains the value used for
N, the input value.
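A table lookup of this kind might look like the Python sketch below. The contents of the table are invented for the example: the user is assumed to type a menu number, and the table supplies the bit value actually stored for that choice.

```python
# Hypothetical table: A[N] holds the value stored for input value N.
# A[0] is unused so that menu choices can start at 1.
A = [0, 128, 64, 32]

def stored_value(n):
    """Translate input value n into the value kept in the index, via the table."""
    if not 1 <= n < len(A):
        raise ValueError("no table entry for input %d" % n)
    return A[n]

print(stored_value(1))   # 128
print(stored_value(3))   # 32
```

The range check takes care of an input that has no entry in the table, which is the kind of case that otherwise crashes a program in the middle of data entry.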

Editing The Files
By all means, make it easy to display a record entered some time ago,
edit the display, and write the newly changed data in place of the
original record. If you use subroutines for inputting each kind of
data, this is easy to program.
For example, I have a subroutine that takes as input
an author's name, then when it's acknowledged to be correct, writes it
in the correct place on record "n" and also puts a corrected entry in
the author index file in the right place. The record "n" may be an old
one or the one we are writing for the first time. All you need to do is
branch to such routines as one of the options given on a menu at the
top of the program. Some errors will inevitably get by in your initial
input. You need a way to correct errors both at the input and later as
well.
Next issue we will outline the main program and talk
about other techniques.

Routine To Set Bits In An Index Word.

(This routine is based on a Y/N response with the cursor moving down a
list on screen. You must arrange a stop or wraparound when N gets to
maximum and P=7. Same at N=0:P=0.)

1. DIM the array IW(x) to the number of bytes in the index word.
Zero IW if not already done initially.