Grepple Deliverables
Each group is responsible for submitting a final
version of their program, including all relevant
source files and a Makefile. The Makefile should
include comments noting which variables are
site-specific (e.g., particular to acpub machines)
to ease porting the code to other systems.

You can assume that all libtapestry.a files are accessible
so you don't need to submit any of those unless you
modified them. You should submit all map/hmap/hiterator, etc.
files because it's easier to compile with all source code
available.

Ideally, and certainly in the future, you will submit a
tar file containing an archive of all your files
and directory structure. This will allow you to submit
a directory hierarchy rather than a flat set of files.

You can find out about the tar command by typing man tar.
We'll go over this in class, and simple use is explained
in another document.
You do NOT need to use tar for the grepple assignment.

You must turn in a design document. Turn it in as hard copy
rather than submitting it electronically. This will become part of
your software portfolio described below. Ideally the design document
will include a class-relationship diagram using some kind of
modified Booch diagram, as we've discussed in class and as
described in Horstmann's book (and also in the optional
Design Patterns book).
Syam Gadde, the TA, has made his
class diagrams accessible, so you can
look at one example.

You must also document what each member function does. Some of
this documentation may be accessible in your header files, but there
should be programmer documentation so that if someone else takes
over your grepple project they can understand what you've done.

You should also include design decisions you made, and a justification
for the decisions. For example, if you use a hash table, how do you
choose the number of buckets (if you're using chaining)? If you
modified someone else's code/classes, you should explain why you
did so. This coding document should be a help to anyone who
has to read and understand your program at a coding level.
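
As an illustration of the kind of decision worth justifying, here is a
minimal sketch of one way to pick a bucket count for a chained hash
table. It assumes you can estimate the number of keys and that you want
a prime number of buckets with a bounded average chain length. The
names isPrime and chooseBucketCount are hypothetical and are not part
of any class library used in the course.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: true if n is prime. Trial division is
// adequate for the table sizes a program like this would use.
bool isPrime(std::size_t n)
{
    if (n < 2) return false;
    for (std::size_t d = 2; d * d <= n; ++d) {
        if (n % d == 0) return false;
    }
    return true;
}

// Hypothetical helper: smallest prime bucket count that keeps the
// average chain length at or below maxChainLength when the table
// holds expectedKeys keys.
std::size_t chooseBucketCount(std::size_t expectedKeys,
                              std::size_t maxChainLength)
{
    assert(maxChainLength > 0);
    // Round up so the average chain never exceeds the limit.
    std::size_t n = (expectedKeys + maxChainLength - 1) / maxChainLength;
    if (n < 2) n = 2;
    while (!isPrime(n)) ++n;
    return n;
}
```

Whatever rule you actually use, the design document should state the
rule and the reasoning behind it (expected number of words, acceptable
chain length, memory limits).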

Decisions and Rationale

As part of your design you should explain what you decided to do
about weaknesses in the original specification and how you chose
to deal with problematic parts of the program. For example, what
do you do if the user specifies unread foo and there are
many files stored named foo? You should be sure to address what you
decided to do, and any criteria you used in reaching your decisions.
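
One possible way to approach the unread foo case is sketched below.
This is only an assumption about how stored files might be tracked (as
a list of path strings); the helper names baseName and matchesFor are
hypothetical. The policy shown treats an exact path match as
unambiguous and otherwise collects every stored path whose final
component matches, leaving it to the caller to decide what to do when
more than one candidate comes back.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: the part of a path after the last '/'.
std::string baseName(const std::string & path)
{
    std::string::size_type slash = path.rfind('/');
    return slash == std::string::npos ? path : path.substr(slash + 1);
}

// Hypothetical helper: every stored path that name could refer to.
// An exact path match wins outright; otherwise all paths whose
// base name equals name are returned as candidates.
std::vector<std::string> matchesFor(const std::vector<std::string> & stored,
                                    const std::string & name)
{
    assert(!name.empty());
    std::vector<std::string> result;
    for (std::size_t i = 0; i < stored.size(); ++i) {
        if (stored[i] == name) {
            result.push_back(stored[i]);
            return result;
        }
    }
    for (std::size_t i = 0; i < stored.size(); ++i) {
        if (baseName(stored[i]) == name) {
            result.push_back(stored[i]);
        }
    }
    return result;
}
```

If matchesFor returns more than one path, your program might unread all
of them, refuse and list the candidates, or ask the user to choose.
Whichever you pick, document the choice and the reason for it.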

You should include man page(s) or a user manual that describes
how to use your grepple program. If you deviate from the
specifications in any way you should be sure to document why
this is the case. You must have a compelling reason to change
what's in the specifications, but it's ok to do this if you think
it's important. Note that you cannot choose to
ignore a command such as recunread, but you can choose to
format the output differently than what is specified if you have
a compelling reason to do this.

The man pages/user manual should allow an intelligent Unix user
to use your program. You should be sure to explain all options and
you should describe buggy behavior when you know the behavior is
buggy (this should be in a separate section rather than sprinkled
throughout the manual).

Each group must submit a document explaining how many hours were
spent on the project. Each person in the group must sign it to
confirm having read it. You do not need to allocate hours to
individuals, but you should estimate the total number of
person-hours spent on the project. You should include, if possible,
descriptions of problems you had with coding, design, or other
aspects of the problem. You should mention good and bad things
encountered by the group as you have worked together.

Each member of the group should also submit an individual
assessment of the project. This should include your perspectives
on the group project and an assessment of your contributions to
the project. No one will read this part of the writeup except
the professor in charge of the course, Owen Astrachan. You are
free to comment about contributions from others, but these comments
should be in a positive tone. Remember that each person in a group
has different strengths and it is not possible for everyone to be a
coding expert (or any other kind of expert). Of course it is possible
that someone did not contribute at all. Hopefully all assessments
will be done honestly and with no malice.

The individual assessments can be turned in electronically to
ola@cs.duke.edu, but it is also fine to turn in hard copy.

Although all code is submitted electronically, each group's final
deliverable is a copy of everything the group has done that will
go into each group member's software portfolio. This includes
a copy of all code (preferably printed 2-up to avoid a mass of
paper; you can use enscript -2rG foo.cc to do this).

Each group member should have a complete copy of exactly what is
turned in. If a group has trouble financing the copying of all
documents, please ask for help; we can make departmental copying
machines available if there is a real hardship.

The 10% bonus for early submission is based on when code is submitted,
not the final document and deliverables. Code is due on Friday. The
final deliverables can be turned in at the beginning of class on Monday.
Note that you cannot change your code after Friday without incurring
a penalty.

Extra Credit

There are two things that can earn extra credit.

For extra credit you can display a list of the files stored in your
grepple program in some "neat" way that helps a user understand the
recursive structure of the directories you've read in. This is kind
of wide-open and there are probably many time-consuming things you
could do, so be prudent in how much time you spend.

For extra credit, the part of your program that deals with
printing lines for each word (when the verbose option is set)
should use an "intelligent" caching strategy. This means that
you can read the files to find lines, but you'll store the lines so
that future look up of the line will find the line in memory rather
than going to disk. You'll need a cache management policy
to decide what to do when the cache becomes full (you cannot assume
that you have unlimited memory, because you don't).

For example, you could stamp each line with a virtual time (based, say,
on how many find operations have been performed). Then "old" lines
could be the first to be removed when the line cache is full. You
could also decide to keep frequently accessed lines, in which case
you'd need to keep track of how many times each line is "found".
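
The virtual-time idea can be sketched as follows. This is a minimal
illustration under stated assumptions, not a required design: the class
name LineCache, its member names, and the choice to key lines by
(file name, line number) are all hypothetical. Each lookup advances a
virtual clock, each cached line remembers when it was last used, and
the line with the oldest stamp is evicted when the cache is full. A hit
count is also recorded so that a frequency-based policy could be added
later, although this sketch does not use it when evicting.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>

// Hypothetical line cache; all names are illustrative only.
class LineCache
{
  private:
    typedef std::pair<std::string, int> Key;   // (file name, line number)
    struct Entry
    {
        std::string text;
        unsigned long lastUsed;   // virtual time of the last store or hit
        unsigned long hits;       // number of times the line was found
        Entry() : lastUsed(0), hits(0) {}
    };
    typedef std::map<Key, Entry> LineMap;

    LineMap myLines;
    std::size_t myCapacity;
    unsigned long myClock;        // counts find operations

    // Removes the entry with the smallest lastUsed stamp (linear scan).
    void evictOldest()
    {
        LineMap::iterator oldest = myLines.begin();
        for (LineMap::iterator it = myLines.begin(); it != myLines.end(); ++it) {
            if (it->second.lastUsed < oldest->second.lastUsed) oldest = it;
        }
        if (oldest != myLines.end()) myLines.erase(oldest);
    }

  public:
    explicit LineCache(std::size_t capacity)
        : myCapacity(capacity), myClock(0)
    {
        assert(capacity > 0);
    }

    // Returns true and fills line if (file, lineNumber) is cached.
    // Every lookup, hit or miss, advances the virtual clock.
    bool find(const std::string & file, int lineNumber, std::string & line)
    {
        ++myClock;
        LineMap::iterator it = myLines.find(Key(file, lineNumber));
        if (it == myLines.end()) return false;
        it->second.lastUsed = myClock;
        it->second.hits += 1;
        line = it->second.text;
        return true;
    }

    // Stores a line read from disk, evicting the oldest line first
    // if the cache is full and the line is not already present.
    void store(const std::string & file, int lineNumber,
               const std::string & text)
    {
        Key key(file, lineNumber);
        if (myLines.find(key) == myLines.end() && myLines.size() >= myCapacity) {
            evictOldest();
        }
        Entry & e = myLines[key];
        e.text = text;
        e.lastUsed = myClock;
    }

    std::size_t size() const { return myLines.size(); }
};
```

In use, the verbose printing code would call find first and go to disk
only on a miss, calling store with the line it read. The linear scan in
evictOldest is simple but slow for large caches, which is the kind of
trade-off your documentation should discuss.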

You can do many sorts of things here, and any kind of caching policy
can earn extra points. You should be sure to document the policy and
explain how your cache works and how it implements your cache policy.

This extra credit can add up to 20% to your final grade depending on
how much you do and how well you do it.