You may want to look into the basic methods Object.wait() and Object.notifyAll() [1] and perhaps also read through some of the threading tutorials [2]. For inspiration you can take a peek at the standard BlockingQueue interface.

For this assignment we are not allowed to use anything from java.util.concurrent.

With that in mind, how could i implement a thread safe list type data structure? I assume i would call wait() before doing any operations on the list so that a thread can only do something if the list has an item in it. Would i just call notifyAll on the list if it holds 1 or more items after i remove one?

In theory this should work but i still get exceptions caused by accessing an empty list. Strangely, the exceptions do not always occur. If i have 100 tasks and 5 threads, it works fine, but if i go with 500 tasks and 10 threads i get exceptions.

Ah, i created my own TaskList class and put the synchronized function within that class and now my program works.

Part of my assignment requires me to optimize the code such that a greater number of threads will result in faster code execution. Right now if i run the program with large data sets and only one thread, it takes 8ms. If i use 5 threads for the same data sets, i get a time of 9ms.

According to the assignment:

The actual difference will vary considerably due to (1) machine architecture (2) the specific
algorithm(s) used within the code.

That requirement, as stated, is impossible to comply with. If the number of threads is much greater than the parallelism the underlying hardware can support, adding more threads will make the system slower, not faster.

Wouldn't it also depend on the computer running the code and therefore be very subjective when it comes to optimization and speed gains?

I have some more snippets from the assignment handout. The teacher seems to think we can achieve close to linear speed increases, though i have no idea how. My code is already pretty barebones. I am using a Stack for my task list, but i don't think Stack.Pop is inefficient by any stretch.

The primary objective of the assignment is to write thread based code that will run
correctly. By “correct”, we mean that (1) multiple threads are created and they concurrently process
the task list, (2) you produce the proper results (e.g., with the FIND and ADD operations), and (3) the
application does not crash. The grader will evaluate this using a series of tests cases (i.e., various test
parameters. You do not get a copy of the test parms beforehand ).
One thing that you will probably realize, however, is that concurrent doesn’t always mean parallel. In
other words, if your computer has more than one core (as almost all do nowadays), you will see that
running the code with 2 or 4 threads may not make the code any faster than when it is run with one
thread. That can be a little discouraging after all of this work. In fact, this is quite common with parallel
or multi-threaded applications. This is why they are quite difficult to write (well) in practice.
So the last 10% of your grade for this assignment will involve improving the performance of the
application. Ideally, we want to get something closer to linear speedup (e.g., 2 X faster if you have 2
cores). Note that you may not see truly linear speedup since there is typically overhead associated with
thread creation and scheduling. In any case, to improve performance, you can “tweak” the code any way
that you like (algorithms, data structures, etc.), as long as the command line parameters do not change
and the final output is the same

edit:

With very large data sets, my program takes 75ms using 1 thread and 38ms with 2 threads. So i guess it depends on the size of the data set being used.

Without knowing what that "Perform" thing does, however, I do not think I can say anything specific. Non-specifically, I can say that it should minimize or ideally avoid interaction with other threads. Interaction means any sort of thing when it has to wait for some other threads to do something, which includes access to "synchronized" resources.

Well in order for my program to be thread safe wouldn't i have to implement blocking in some way to protect the shared resource (which is the task list in this case)?

Perform is a method which just checks if a static TreeSet contains an arbitrary value... I assume it was just meant to be some arbitrary operation which would take time to complete so we could see meaningful speed differences with larger data sets.

The only blocking taking place is when a thread accesses the stack of tasks in order to pop() and retrieve a task to execute. This would require blocking in order to avoid having threads call pop() on an empty stack.

If the only thread interaction is in the GetTask() method, then your code is close to optimal in that case. I would not attempt any further optimization unless a very specific optimization target was set.

I have not read every detail you have provided in this thread, but it seems that your code is not waiting (i.e. invoking one of the wait methods) in order to retrieve a new task. If so, that means that all worker threads will continuously poll (also known as busy-wait resulting in 100% CPU usage) until the queue releases a task for them and you should probably consider if this is acceptable or not in the solution for your assignment. In real life, this would of course not be acceptable and one would normally use a wait/notify exchange to make sure that threads only wake up when a task enters the queue. Note however, that it is easy to code an incorrect wait/notify exchange so you should make sure you understand the mechanism first or at least read some guides on how to do it before you start coding it.

Yes, that code has to get stuck, because the thread first waits, then notifies. It is going to wait forever before it can notify (and unblock other threads).

More fundamentally, what are trying to achieve here? Yesterday I had an impression that the task list was populated before the threads were launched, and no new tasks were ever added afterwards. In which case, there is no reason to wait. What are going to wait for now?

The task list is populated prior to creating the threads. Does this mean that waiting would be unnecessary?

I don't wanna lose marks on this assignment and my code seems to run correctly... There is no mention of wait/notify in the assignment handout and my code is thread safe, but the real question i suppose is should i really be using wait/notify for this?

Unless you can clearly state a reason why you should add waiting to your code, stay away from that. I do not think there is anything in the art of computer programming that is worse than adding code that does things for no apparent reason. It is called "cargo cult programming" (look this up).

Yesterday I had an impression that the task list was populated before the threads were launched, and no new tasks were ever added afterwards. In which case, there is no reason to wait.

Reading this thread more carefully (ok, pun intended for once) I agree that there is nothing in the material that twoski has presented that should indicate that this assignment is about constructing a general thread-safe task queue as I was rambling about in my previous post.

I apologize if my previous comment erroneously prompted twoski to get sidetracked into consider adding an unnecessary wait/notify coordination.