CS475 S2019 – Homework 2

CS 475, Spring 2019, Homework 2, Due March 4, 11:00 am

Start by downloading the handout:

2/16: Fixed testRecursiveDelete in Part1FunctionalTests. If you have the original version, you can download the new handout, or just copy that one fixed file in place.

2/18: Fixed two assertion errors in Part2ConcurrentTests.java. Fixes:

Line 81 (testGetSameKeySimultaneously) error should read “Expected to be able to concurrently get the same key, but did not finish within 1 sec”

Line 112 (testGetAndSetSameKeySimultaneously) error should read “Expected to not be able to concurrently get and set on the same key, but did”

2/20: Clarified: your KVStore will not allow lists to be added/retrieved with get or set (and hence does not have the addToList and removeFromList operations), but will use lists to keep track of directory contents.

2/23: Fixed assertion error in Part2ConcurrentTests.java, test testListDirSimultaneously, assertion should read “Expected to be able list the same directory simultaneously, but did not within 1sec”

2/28: Clarified: linearization points

This assignment will build on your knowledge of locking and concurrency control in Java, extending our KV Store to have a notion of files and directories. Normally, KV stores are “dumb” about keys: there is no relationship between keys. However, some applications demand a notion of hierarchy more similar to file systems. That is, rather than have no relationship between keys, keys are organized like paths on a file system. In this scheme, the key “/a/b/c.txt” represents a file which is stored within a folder (the folder “/a/b/”), which is itself stored within a folder (“/a/”), which is itself stored within a folder (“/”). A user could query the KV store for the contents of “/”, which would return “/a/” — one of the directories that the key “/a/b/c.txt” is stored within. In this assignment, you will implement a concurrent KV store that automatically maintains directory structures for keys.

As in Homework 1, you will implement a simple KV store that operates entirely within memory, and on a single machine. However, the KV store will be accessed by multiple client threads, and hence, must maintain thread safety.

General requirements:

We have provided you a baseline implementation of the KV store that has various methods stubbed out, which you will implement. You may not include any additional libraries (e.g. download and require additional JARs), although feel free to use any of the various libraries included with Java or already included by the starter project. Your project will be compiled and tested using Java 8, so please do not use any Java 9+ exclusive features (the first long-term support version of Java > 8 just came out in September; we’ll switch to Java 11 next semester, sorry).

Your KV Store will be compiled and tested using Apache Maven, which automatically executes the provided JUnit tests. Please install Maven on your computer. Unfortunately, Maven is not installed on Zeus; however, you can download the most recent version (e.g. apache-maven-3.5.2-bin.zip) and unzip it in your home directory. Then, to run Maven, type ~/apache-maven-3.5.2/bin/mvn in the snippets below (instead of just mvn). Note that you can easily import Maven projects into Eclipse and IntelliJ (we suggest IntelliJ over Eclipse).

Your KV store will be automatically graded for correctness (note that there will be a manual grading phase to check hard-to-automatically-catch concurrency issues). Included with your handout are all of the automated tests that we will use to test your assignment. To run these tests, simply execute mvn test (of course, if you do this first off, you’ll see that they all fail!). Upon submitting your assignment, our server will automatically compile and test your assignment and provide you with test results. For this assignment, we will also use a state-of-the-art race detector to check for races in your program; this will run automatically in Autolab. You can resubmit 50 times before the deadline without penalty.

Note: Your code must compile and run on the autograder, under Java 8. It is unlikely that you will have any difficulties developing on Mac, Linux or Windows. When you feel satisfied with implementing one phase of the assignment, submit to Autolab and verify that Autolab agrees.

Academic honesty reminder: You may NOT share any of your code with anyone else. You may NOT post your code in a publicly viewable place (e.g. in a public GitHub repository). You may face severe penalties for sharing your code, even “unintentionally.” Please review the course’s academic honesty policy.

Tips and References

You might find it useful to reference the official Java tutorial on concurrency, which does a nice job outlining at a high level how you can create threads and locks. If you find any useful references, please feel free to share them on Piazza and we will update this page as well.

Part 1: The Hierarchical Key Key-Value Store (32%)

To get started, you’ll implement the basic functionality of the key value store, and ensure that it works in a normal (single-threaded) environment. This will act as something of a baseline for Java programming without concurrency. The tricky parts will come when we make this functionality thread-safe (in Part 2).

Unlike in Homework 1, this KV store will only allow get and set to be called on strings as values (no addToList, hooray!). However, we’ll have some new methods and new semantics to handle hierarchical keys — and to handle those, you may need to use lists.

Your KVStore will deal in two types of keys: Directory Keys, which end in /, and Non-Directory Keys, which do not end in /. Both directory and non-directory keys will be internally managed in your code by calling _get and _set. The get, set, and remove operations operate only on non-directory keys, and the listDirectory, removeDirectory and removeDirectoryRecursively operations operate only on directory keys. listKeys returns all keys (both those that are a directory and those that are not). All keys must start with a /; that is, all keys are (at least transitively) contained by the special directory /.
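The key rules above can be checked before an operation touches the store. The following is a minimal sketch, not code from the handout: the class KeyChecks and its method names are hypothetical, and the exception types follow the Javadoc of the methods listed later in this assignment.

```java
import java.util.Objects;

// Hypothetical helper (not part of the handout) that checks the key rules.
final class KeyChecks {
    private KeyChecks() {}

    /** True if the key names a directory, i.e. it ends in "/". */
    static boolean isDirectoryKey(String key) {
        return key.endsWith("/");
    }

    /** Validates a key for get, set, or remove (non-directory operations). */
    static void requireFileKey(String key) {
        Objects.requireNonNull(key, "key"); // throws NullPointerException on null
        if (!key.startsWith("/")) {
            throw new IllegalArgumentException("Key must start with /: " + key);
        }
        if (isDirectoryKey(key)) {
            throw new IllegalArgumentException("Key must not end in /: " + key);
        }
    }

    /** Validates a key for the directory operations such as listDirectory. */
    static void requireDirectoryKey(String key) {
        Objects.requireNonNull(key, "key");
        if (!key.startsWith("/")) {
            throw new IllegalArgumentException("Key must start with /: " + key);
        }
        if (!isDirectoryKey(key)) {
            throw new IllegalArgumentException("Directory key must end in /: " + key);
        }
    }

    public static void main(String[] args) {
        requireFileKey("/a/b/c.txt"); // accepted
        requireDirectoryKey("/a/b/"); // accepted
        try {
            requireFileKey("/a/b/");  // a directory key passed to a file operation
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Note that / itself passes requireDirectoryKey, since it both starts with and ends in /.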

Your KVStore will automatically create and maintain directory structures so that it can efficiently implement the listDirectory and removeDirectory functions; it must implement these functions without calling listKeys. You’ll do this by creating a key/value pair for the directory key, and storing the object of your choice (e.g. a list containing all of the key names within that directory) as the value for that directory key. Hence (assuming that the KVStore starts empty), calling set("/foo/bar/b","someValueToStoreHere") will result in the following directories being created:

/ (with contents /foo/)

/foo/ (with contents /foo/bar/)

/foo/bar/ (with contents /foo/bar/b)

Hence, valid keys are now: /, /foo/, /foo/bar/ and /foo/bar/b
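The list of directories implied by a key can be computed directly from the key string. This is a small sketch under the assumption that keys have already been validated; the class KeyPaths and the method ancestorDirectories are hypothetical names, not part of the handout.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of the handout).
final class KeyPaths {
    private KeyPaths() {}

    /**
     * Returns every directory that contains the given key, outermost first.
     * Works for both file keys and directory keys; "/" has no ancestors.
     */
    static List<String> ancestorDirectories(String key) {
        List<String> dirs = new ArrayList<>();
        // Every '/' except one in the final position ends an enclosing directory.
        for (int i = 0; i < key.length() - 1; i++) {
            if (key.charAt(i) == '/') {
                dirs.add(key.substring(0, i + 1));
            }
        }
        return dirs;
    }

    public static void main(String[] args) {
        System.out.println(ancestorDirectories("/foo/bar/b"));
        // prints [/, /foo/, /foo/bar/]
    }
}
```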

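There will now be the following key and value pairs in your KV store (each directory value is shown here as a list of its contents):

/ → [/foo/]

/foo/ → [/foo/bar/]

/foo/bar/ → [/foo/bar/b]

/foo/bar/b → "someValueToStoreHere"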

To continue this example: if, after that set operation, set("/foo/bar/anotherDirectory/anotherFile","otherValue") is called, the following new key and value pairs will be created by your KV store:

/foo/bar/anotherDirectory/ (with contents /foo/bar/anotherDirectory/anotherFile)

/foo/bar/anotherDirectory/anotherFile (with value "otherValue")

The existing directory structure for /foo/bar/ will be updated, to now have the contents /foo/bar/b (from the prior call) and also now /foo/bar/anotherDirectory/.

You will still use the same interface to store keys and their values: _set, _get, and _remove. Each directory can be represented with a list of the object of your choice (Strings that represent the contents of that directory, or some class that you write for the same purpose). Since the directory itself is a key, you must store the directory structure using _set, _get, and _remove. Furthermore, each of the above directories must have its own entry (in the above example, you must call _set to create each of these directories). Note that although your get and set functions will not work for directories, directory contents will still be stored and retrieved using the _get and _set functions.

Your KVStore should implement the following invariants:

All valid keys must have non-null representations in the underlying map (e.g. you will need to call _set for every key, including nested directories that must be created).

If a key exists, and that key is nested in a directory, then its directory parent must exist as well.

The primary implications of these invariants are that: (1) When creating (or checking to create) nested directories, you must always start at the “bottom” (/), and work your way up; and (2) your directories must be stored “flat” (the contents of directory /foo/bar/ are stored in the KVStore in the key /foo/bar/).
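The two implications above can be illustrated with a single-threaded sketch. Everything here is an assumption made for illustration: the class FlatDirectorySketch is hypothetical, rawGet and rawSet are stand-ins for the handout's _get and _set (whose exact signatures are not shown in this document), directory contents are kept as a Set of Strings, and key validation is omitted.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical single-threaded sketch; not the reference solution.
final class FlatDirectorySketch {
    // Stand-in for the store behind _get and _set.
    private final Map<String, Object> backing = new HashMap<>();

    private Object rawGet(String key) { return backing.get(key); }             // stand-in for _get
    private void rawSet(String key, Object value) { backing.put(key, value); } // stand-in for _set

    /** Stores value at key, creating or updating each enclosing directory, starting at "/". */
    @SuppressWarnings("unchecked")
    void set(String key, String value) {
        String dir = "/";
        while (true) {
            // The entry of dir on the way to key: the next directory, or key itself.
            int next = key.indexOf('/', dir.length());
            String child = (next < 0) ? key : key.substring(0, next + 1);
            Set<String> contents = (Set<String>) rawGet(dir);
            if (contents == null) {
                contents = new LinkedHashSet<>();
            }
            contents.add(child);
            rawSet(dir, contents); // each directory is stored flat under its own key
            if (next < 0) {
                break;
            }
            dir = child;
        }
        rawSet(key, value);
    }

    /** Returns a copy of the directory's contents, or null if it does not exist. */
    @SuppressWarnings("unchecked")
    Set<String> listDirectory(String directory) {
        Set<String> contents = (Set<String>) rawGet(directory);
        return contents == null ? null : new LinkedHashSet<>(contents);
    }

    public static void main(String[] args) {
        FlatDirectorySketch store = new FlatDirectorySketch();
        store.set("/foo/bar/b", "someValueToStoreHere");
        store.set("/foo/bar/anotherDirectory/anotherFile", "otherValue");
        System.out.println(store.listDirectory("/foo/bar/"));
        // prints [/foo/bar/b, /foo/bar/anotherDirectory/]
    }
}
```

Because every directory is written before any entry nested inside it, a key never exists without its parent directory.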

For the first part of the assignment, you do not need to meet thread safety requirements.

You will implement this behavior in edu.gmu.cs475.KeyValueStore, specifically in the following seven methods:

/**
 * Retrieve an element from this key value store
 *
 * @param key the key to retrieve
 * @return The value mapped to this key, if one exists, otherwise null
 * @throws NullPointerException if key is null
 * @throws IllegalArgumentException if the key represents a directory (ends in /)
 * @throws IllegalArgumentException if the key does not start with a /
 */
public abstract String get(String key);

/**
 * Lists all of the keys that are currently known to this key-value store
 *
 * @return A set containing all currently valid keys
 */
public abstract Set<String> listKeys();

/**
 * Sets a key to be the given value
 *
 * @param key key to set
 * @param value value to store
 * @throws NullPointerException if key or value is null
 * @throws IllegalArgumentException if key does not start with a /
 * @throws IllegalArgumentException if key represents a directory (ends with a /)
 */
public abstract void set(String key, String value);

/**
 * Removes this key and any values stored at it from the map
 *
 * @param key key to remove
 * @return true if the key was successfully deleted, or false if it did not exist
 * @throws NullPointerException if key is null
 * @throws IllegalArgumentException if key does not start with a /
 * @throws IllegalArgumentException if key represents a directory (ends with a /)
 */
public abstract boolean remove(String key);

/**
 * Lists the contents of a directory (non-recursively)
 *
 * @param directory path of the directory
 * @return unsorted set of the files and directories inside of this directory, or null if the directory doesn't exist.
 *
 * @throws NullPointerException if key is null
 * @throws IllegalArgumentException if the key does not represent a directory (does not end in a /)
 * @throws IllegalArgumentException if the key does not start with a /
 */
public abstract Set<String> listDirectory(String directory);

/**
 * Deletes an empty directory
 *
 * @param directory path of the directory
 *
 * @throws IllegalArgumentException if the key does not represent a directory (does not end in a /)
 * @throws IllegalArgumentException if the key does not start with a /
 */
public abstract void removeDirectoryRecursively(String directory);

Note the following requirements:

You should feel free to add whatever methods you would like to the KeyValueStore class, but you must not change the names or parameters of the methods listed above or override any other methods of the AbstractKeyValueStore.

You should feel free to add new fields to KeyValueStore.

You cannot use your own HashMap to hold the data, but the alternative is almost as easy: you must call _get, _set, _remove, and _listKeys in your KeyValueStore to access the underlying data store.

Precise grading breakdown:

Automated tests: 4 points

9 JUnit tests, 3 points each

5 points from manual grading

Part 2: Managing Thread Safety (68%)

Once you are satisfied that your key-value store works correctly in a single-threaded environment, your next task will be to add locking to it. You are free to use: synchronized, ReentrantLock, ReentrantReadWriteLock, and Condition. Our reference solution uses ReentrantLock, ReentrantReadWriteLock and Condition (although perhaps several of each!). Unlike in the last assignment, we have ensured that _get, _set, _remove and _listKeys will correctly uphold their contract regardless of the locking that you do, so do not use a big lock around these methods again. Clarification 2/28: You must store your locks in (non-static) fields of your KeyValueStore (likely some in a HashMap, plus any others that you need). Do not store them in a static field, and do not store them using _set and _get.
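One way to keep per-key locks in a non-static field is a small registry that creates each lock on first use. This is only a sketch of that idea: the class LockRegistry and the method lockFor are hypothetical names, and the choice of a HashMap guarded by synchronized is an assumption, not a description of the reference solution.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch; an instance of this would live in a non-static field.
final class LockRegistry {
    // Instance field: locks are never static and never stored through _set/_get.
    private final Map<String, ReentrantReadWriteLock> locks = new HashMap<>();

    /**
     * Returns the lock for a key, creating it on first use.
     * The synchronized keyword only protects this brief map lookup;
     * it is not held while the caller works with the returned lock.
     */
    synchronized ReentrantReadWriteLock lockFor(String key) {
        return locks.computeIfAbsent(key, k -> new ReentrantReadWriteLock());
    }

    public static void main(String[] args) {
        LockRegistry registry = new LockRegistry();
        // The same key always maps to the same lock object.
        System.out.println(registry.lockFor("/a/b") == registry.lockFor("/a/b")); // prints true
    }
}
```

This sketch never removes locks from the map; whether and when to discard a lock for a removed key is left open.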

Note the following concurrency requirements:

For regular value operations (get, set, remove): these should be able to occur concurrently if they have different keys. Multiple gets should be permitted for the same key concurrently, but set and remove are exclusive operations (you must not allow two set calls on the same key at once, or a set and a get, etc.). Clarification 2/28: Consider the _get, _set and _remove methods to be your linearization points. This means that we will evaluate whether two set calls are concurrent based on whether the underlying _set calls are concurrent. This means that you must acquire whatever locks you need before calling _get (and likewise before calling _set or _remove).
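The read/write distinction and the linearization-point rule can be sketched for plain values as follows. The class LockedValueSketch is hypothetical, its ConcurrentHashMap named backing is a stand-in for the store behind _get and _set, and directory maintenance and key validation are deliberately left out.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch of per-key locking for values only; not the reference solution.
final class LockedValueSketch {
    // Stand-in for the store behind _get and _set.
    private final ConcurrentHashMap<String, String> backing = new ConcurrentHashMap<>();
    private final ConcurrentHashMap<String, ReentrantReadWriteLock> locks = new ConcurrentHashMap<>();

    private ReentrantReadWriteLock lockFor(String key) {
        return locks.computeIfAbsent(key, k -> new ReentrantReadWriteLock());
    }

    /** Many threads may hold the read lock of the same key at once. */
    String get(String key) {
        Lock read = lockFor(key).readLock();
        read.lock(); // acquired before the underlying read
        try {
            return backing.get(key); // linearization point (stand-in for _get)
        } finally {
            read.unlock();
        }
    }

    /** The write lock excludes every other reader and writer of this key. */
    void set(String key, String value) {
        Lock write = lockFor(key).writeLock();
        write.lock(); // acquired before the underlying write
        try {
            backing.put(key, value); // linearization point (stand-in for _set)
        } finally {
            write.unlock();
        }
    }

    public static void main(String[] args) {
        LockedValueSketch store = new LockedValueSketch();
        store.set("/a/b", "hello");
        System.out.println(store.get("/a/b")); // prints hello
    }
}
```

Releasing the lock in a finally block ensures that an exception thrown by the underlying call does not leave the key locked forever.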

For directory-level operations (list directory contents, remove empty directory and implicitly creating a directory or adding to its contents through set key): These should always be able to occur concurrently if they have different keys and if the two directories are not hierarchically nested (e.g. /my/path/to/key1/ and /my/path/to/key1/key2/ are nested, since key2 is a directory in /my/path/to/key1/). Clarification: these operations should be able to occur concurrently on two different keys as long as neither is the parent of the other key, and they do not share the same immediate parent. Multiple threads can concurrently list the contents of the same directory (or of nested directories), but these reads must not overlap concurrently with any write to those directories’ structure (e.g. creating a directory, adding a file to a directory, removing a directory).
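When one operation needs locks on several nested paths, a standard way to avoid deadlock is for every thread to acquire them in the same order (for example, outermost directory first) and to release them in reverse. The sketch below shows only that ordering discipline; the class OrderedLocking is hypothetical, and deciding which paths need a read lock and which need a write lock is left to your design.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical helper illustrating ordered acquisition; not the reference solution.
final class OrderedLocking {
    private OrderedLocking() {}

    /** Acquires the locks in list order and returns them as a stack for release. */
    static Deque<Lock> lockAll(List<? extends Lock> outermostFirst) {
        Deque<Lock> held = new ArrayDeque<>();
        for (Lock lock : outermostFirst) {
            lock.lock();
            held.push(lock); // the most recently acquired lock ends up on top
        }
        return held;
    }

    /** Releases the locks in reverse order of acquisition. */
    static void unlockAll(Deque<Lock> held) {
        while (!held.isEmpty()) {
            held.pop().unlock();
        }
    }

    public static void main(String[] args) {
        ReentrantReadWriteLock parent = new ReentrantReadWriteLock(); // e.g. for /foo/bar/
        ReentrantReadWriteLock file = new ReentrantReadWriteLock();   // e.g. for /foo/bar/b
        Deque<Lock> held = lockAll(Arrays.asList(parent.writeLock(), file.writeLock()));
        try {
            System.out.println(parent.isWriteLocked() && file.isWriteLocked()); // prints true
        } finally {
            unlockAll(held);
        }
    }
}
```

Because all threads walk from the outermost path inward, no two threads can each hold a lock that the other is waiting for.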

For global operations (listKeys): It must be possible for multiple threads to concurrently call listKeys, but it must not be possible for any operation that creates or removes keys to occur concurrently. Hint for listKeys: Look back at our discussion of the Reader-Writer problem, and of ReadWriteLocks. This requirement is similar to the Reader-Writer problem (applied to some global lock on the KeyValueStore), where the writers are set and remove, and the readers are listKeys. Notice that in the normal Reader-Writer problem, we want to allow any number of readers OR a single writer access to the resource, but in this case, we need to allow any number of readers OR any number of writers. If you find it helpful, you can create new classes in the edu.gmu.cs475 package to support your implementation.
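The "any number of readers OR any number of writers" rule can be sketched with a ReentrantLock and a Condition. This is one possible shape, not the reference solution: the class GroupLock, the Mode enum, and the methods enter, tryEnter, and exit are all hypothetical names, and the sketch makes no attempt at fairness, so one group can starve the other.

```java
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch: many LIST holders or many MUTATE holders, never both.
final class GroupLock {
    enum Mode { LIST, MUTATE } // LIST for listKeys; MUTATE for set and remove

    private final ReentrantLock mutex = new ReentrantLock();
    private final Condition released = mutex.newCondition();
    private Mode active; // group currently inside; meaningful only while holders > 0
    private int holders; // number of threads currently inside

    /** Blocks while the other group is inside, then joins the caller's group. */
    void enter(Mode mode) {
        mutex.lock();
        try {
            while (holders > 0 && active != mode) {
                released.awaitUninterruptibly();
            }
            active = mode;
            holders++;
        } finally {
            mutex.unlock();
        }
    }

    /** Non-blocking variant: returns false if the other group is inside. */
    boolean tryEnter(Mode mode) {
        mutex.lock();
        try {
            if (holders > 0 && active != mode) {
                return false;
            }
            active = mode;
            holders++;
            return true;
        } finally {
            mutex.unlock();
        }
    }

    /** Leaves the group; the last one out wakes every waiting thread. */
    void exit() {
        mutex.lock();
        try {
            if (--holders == 0) {
                released.signalAll();
            }
        } finally {
            mutex.unlock();
        }
    }

    public static void main(String[] args) {
        GroupLock lock = new GroupLock();
        lock.enter(Mode.MUTATE);
        lock.enter(Mode.MUTATE);                      // a second mutator may join
        System.out.println(lock.tryEnter(Mode.LIST)); // prints false
        lock.exit();
        lock.exit();
        System.out.println(lock.tryEnter(Mode.LIST)); // prints true
        lock.exit();
    }
}
```

The mutex here protects only the two counters and is held briefly; it is not held while a caller performs its set, remove, or listKeys work.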

Another way to think of these semantics is that each resource that is shared (the listing of keys in each directory, and the value of each key itself) may be protected with its own lock. Multiple threads can concurrently read the same resource, but no two threads can write the same resource at once, or read/write the same resource at once. A single high-level operation, like set, may touch multiple such resources, especially in the case of a nested key, where it is necessary for your code to also create the parent directories. Hence, you will have to have a lock for each key, and a single operation might require acquiring multiple locks.

Precise grading breakdown:

15 JUnit tests, 4 points each

8 points from manual grading

Grading

Your assignment will be graded on a series of functional tests, making sure that it implements the specification above. We have provided a (not exhaustive) test suite which should help you judge (on your own) how well you have done with the functional requirements. We may add additional tests beyond what we are providing you with now, so please do not rely entirely on them.

We also provide you with feedback from the RV-Predict tool, running on your code. You can see this in the job details when you submit on Autolab. RV-Predict will either report some races, or “No races found.”

In accordance with the “reasonable person principle,” we reserve the right to audit your code and correct any marks that are improperly assigned, for instance, due to your code incorrectly following the specification, but passing the test. We would encourage you to spend your time correctly implementing the assignment, and not trying to force it to pass the test, even if it is not following the spirit of the assignment.

Hand In Instructions

You must turn in your assignment using Autolab (You MUST be on the campus network, or connected to the GMU VPN to connect to Autolab). If you did not receive a confirmation email from Autolab to set a password, enter your @gmu.edu (NOT @masonlive) email, and click “forgot password” to get a new password.

Create a zip file of only the src directory in your assignment (please: .zip, not .tgz or .7z etc). When you upload your assignment, Autolab will automatically compile and test it. You should verify that the result that Autolab generates is what you expect. Your code is built and tested in a Linux VM. Assignments that do not compile using our build script will receive a maximum of 50%. Note that we have provided ample resources for you to verify that our view of your assignment is the same as your own: you will see the result of the compilation and test execution for your assignment when you submit it.

You can resubmit your assignment an unlimited number of times before the deadline. Note the course late-submission policy: assignments will be accepted up until 24 hours past the deadline at a penalty of 10%; after 24 hours, no late assignments will be accepted, no exceptions.