At a quick glance... yes, it is possible "for multiple threads to concurrently obtain a lock on the same record", albeit, it will be *very* unlikely. You have some serious race conditions...

First, the notifications. It is very easy for notifications to be lost -- which may cause threads to wait(), for a record to be freed, when the record has already been freed. This is actually a likely scenario.

Second, you don't protect the locks object, in one place of the unlock() method. It is unlikely that this will cause two threads to grab the same lock, but it can happen... Something with two threads trying to free the record at the same time, while another two threads trying to lock the record, and the scheduler just timing it in a really bad way.

When I initially read your code, I also thought that there was the potential for multiple threads to own the same lock. It was not immediately apparent what was happening in some parts of the code, and what the effects would be. After working on it for a while, I believe that two clients can't own a lock on the same record at the same time, however the rest of Henry's comments still apply.As a side note - it appears from the comments in your code that two clients could believe that they own the same lock. If your comments are to be believed, then an automatic test application could report that two of it's clients obtained the same lock. More on this later.One of the issues I had is with your obtaining and releasing the mutex on the locks object, as typified in this bit of your code:There is no need to do this, and it lowers performance and simultaneously reduces readability. This was an easy to read example, but in the lock method it was not so easy to read, and obscured what the method was actually doing.

The problem that this causes is that if the assessor suspects that your lock method was faulty (and the test harness implies might imply that it is), and they have to try and read through 54 lines of obscure code (refactoring can bring your code down to 35 lines - I tried ) - they might just decide that the code is faulty - hence your score.

Back to that side note about two clients potentially thinking they own the same lock simultaneously. I am just guessing here, based on what code I have seen, and the comments surrounding it, but it looks like you are releasing locks if they are owned for too long a period of time. The problem with this is that this functionality is neither required nor implied by the instructions, and could cause a problem with a test harness (and in real life).

You didnt provide code for the RecordLock class - is this just a bean? Does it provide any functionality other than associating the record number with the cookie? If not then this might be causing further obfuscation without gaining you anything, especially since the locks map also associates the record number with this object - so you may have further redundancy that can be removed.

After working on it for a while, I believe that two clients can't own a lock on the same record at the same time, however the rest of Henry's comments still apply.

I believe that it will be very unlikely to almost impossible to ever see two threads own the same lock, at the same time, but IMHO, the issue isn't whether it will happen under current conditions, but can it happen under any condition? However, unlikely?

Also, what is the difference between two threads believing they both own the lock, and actually owning the lock? This is the only protection for the record, so if two threads believe they own the lock, it is just as bad.

Anyway, here is my very unlikely condition that will yield two threads owning the lock...

1. Thread A tries to free the record. It confirms that there is such a lock, and the record is not active, and this is all done without any synchronization protection. And before it actually grabs the synch lock to remove the record...

2. Thread B tries to free the record. It confirms that there is such a lock, and the record is not active, and this is all done without any synchronization protection. And before it actually grabs the synch lock to remove the record...

3. Thread C determines that the record is not active, and tries to get the record lock. In this case, it does grab the synch lock, which will block A and B. It determines that it needs to wait, and does so... freeing the synch lock.

4. Thread D determines that the record is not active, and tries to get the record lock. In this case, it does grab the synch lock, which will block A and B. It determines that it needs to wait, and does so... freeing the synch lock.

5. Thread A removes the record. Notifies the other threads. And return from unlock() method.

6. Thread C wakes up, grabs the record lock. And return from the lock() method. Later, it is assume that the record will be changed to active.

7. Thread D wakes up, noticed that the lock is still not available, and returns back to the wait state.

8. Thread B removes the record. Notifies the other threads. And return from the unlock() method. (note: This thread has already check whether the record is active, and has already determined that it isn't)

9. Thread D wakes up, grabs the record lock. And return from the lock() method. Later, it is assumed that the record will be changed to active.

Now both thread C and D owns the lock, or believe they own the lock, and will modify the same record.

Originally posted by Henry Wong: I believe that it will be very unlikely to almost impossible to ever see two threads own the same lock, at the same time, but IMHO, the issue isn't whether it will happen under current conditions, but can it happen under any condition? However, unlikely?

You are quite correct - if it can be proven that there is any way that two clients can own the same lock, then the code fails.

Originally posted by Henry Wong: Also, what is the difference between two threads believing they both own the lock, and actually owning the lock? This is the only protection for the record, so if two threads believe they own the lock, it is just as bad.

In the SCJD assignment, this is not the only protection - the update and delete methods both require the lock cookie to be provided and valid. So the particular record is safe.

However in real life there could easily be bigger issues at stake. A simple example might be having a transaction that spans two tables, where the assumption is that it is safe to modify the related records in the second table since you own the lock on the primary record in the first table - this is not a valid assumption and might cause irrevocable errors if two threads act on the belief that they own the lock.

Originally posted by Henry Wong: Anyway, here is my very unlikely condition that will yield two threads owning the lock...

Correct me if I am wrong, but doesn't your condition start with the requirement that both thread A and thread B own the same lock?

Originally posted by Henry Wong:[/QB] Later, it is assume that the record will be changed to active.

All the code presented assumes that the record is active - Khaled throws an exception whenever isActiveRecord returns false.

This does open up a separate problem though - this is presumably Khaled's verification that a record has not been deleted. And if the thread which deletes a record also releases it's lock, then there is the potential for another thread to gain the lock on a deleted record since Khaled does not reverify that condition. However the unlock method does not allow a deleted record (an inactive record) to be unlocked, so it would have to be the delete method itself that removes the lock for this scenario to become true.

In your scenario, you had threads C and D both waiting for the same lock. This means that they must have both been at the call to wait at line 2 in the following snippet:When thread C woke up in your step 8, it verified that no other thread owned the lock for the record it wanted in line 5, then added a marker to the lock map in line 7 so that others would know that it owns that lock.

Your step 8 says that thread B is removing [its lock on] the record. Note that this must have been a release of a lock on a different record than thread A owned, or else we are starting from an invalid state to prove that we can get to an invalid state. Regardless, when thread B removes it's lock, it calls notifyAll which wakes all waiting threads, which at this time is only thread D.

In your step 9, you note that thread D wakes up. I am happy with that. However you then state that it grabs the record lock. However when thread D gets to line 5 above, it will recheck the map, and notice that somebody has put their marker against the desired record number. So thread D will have to go back to sleep.

[Andrew]: Correct me if I am wrong, but doesn't your condition start with the requirement that both thread A and thread B own the same lock?

Well, it requires that both threads be trying to unlock the same record. And it requires that they both have the correct cookie. The second is not all that difficult to achieve - since the cookie is generated from system time, it's possible for two threads operating concurrently to see the same time, and get the same cookie independently. The first condition is more problematic - two threads shouldn't be trying to unlock records unless they already own the record lock - which should be impossible if Henry's scheme for having two threads with the same record lock requires two threads with the same record lock as a precondition.

However, if all the other code in the project only did what it's supposed to, there would be no need for cookies at all. I would argue that the existence of cookies in the requirements implies they want added protection in case there's either a bug or a malicious client. Malicious clients are maybe outside the scope of the project - they could create all sorts of problems, beyond what we're discussing here. But it wouldn't be difficult for some maintenance programmer to write some code that accidentally used the same record number and cookie value in two different threads. Could be something as simple as accidentally using static variables rather than instance variables. That could easily lead to two threads trying to unlock the same record, even if they don't really both own the lock. It's the responsibility of the locking mechanism to let one unlock, and throw an error for the second. That doesn't necessarily happen in this code, and as a result it is possible (however unlikely) that two threads might later get the same lock. I think.

[Khaled]: Personally i have done unit testing on the LockManager component alone.With many threads trying to lock and unlock.But in all cases no clients could obtain a lock on a record at the same time.

Unfortunately, it's extremely difficult to prove thread safety with unit testing. You may have a lot of really good tests. But there's always a possibility that (a) there's some scenario you didn't think of, (b) the chance of errors is very small but not zero; you mmight not have tested long enough, or (c) threads may behave differently on different machines. It may be that it's impossible to observe a bug on your machine, but it shows up more easily on another machine. Threads can be very unpredictable. I'm not saying you shouldn't try to test them - I'm just saying that successfully running tests don't always imply that you have no bugs. That's true for testing in general, but it's particularly true when threads are involved.

Ultimately Khaled, I think Sun is correct here, and your locking code has a number of problems. Aside from the remote chance of two threads getting the same record lock, there's a somewhat higher chance of an infinite wait because another thread called notify() before you entered wait(). And the code is generally more complex than it needs to be. I see repeated code in the lock() method that is unnecessary. And most of your problems would go away if you made you sync blocks a bit bigger, covering more of the surrounding code. (Which would also mean you don't need so many sync blocks.) Maybe you deserve more than zero points for locking, and maybe those points would be enough to pass - or maybe not. I think you'd be best off trying to simplify the code, remove the bugs, and resubmit, rather than try to convince Sun to reevaluate.

"I'm not back." - Bill Harding, Twister

Khaled Mahmoud

Ranch Hand

Posts: 361

posted 9 years ago

Regardless whether i am right or not (probably i am sure of my code).The messages describing were i lost most of my points should be more descriptive.Since one of the main objectives of getting Certification is to learn, I expect from Sun to give more information about why my record locking mechanism is not safe and to be more professional than that.I expected a more professional way in assessment.

Anyway, even if my record locking is not correct, i need to know why it is like that. And i think i have the right to know.

I have contacted Sun by phone and sent emails to both Sun and Prometric and Sun I hope that they will reply soon.

By they,even the grade on the GUI was not fair.I expected to lose much marks on the GUI section, but I got 31/40 which i think a very good mark in comparison to other candidates taking the SCJD.

[Khaled]: Since one of the main objectives of getting Certification is to learn

Mmm, that's your objective, not Sun's. With most exams you are expected to know the material before to take the exam, and if you don't know it, you get a poor grade. The SCJd is unusual in that it's possible to learn while you're taking the exam. But that doesn't mean it's Sun's responsibility to teach you - even when grading. You may be able to get more information, but I think you need to be prepared for the possibility that you won't get it. On the other hand you have suggestions from three people here on how you could improve your code. Maybe that's useful, or not.

"I'm not back." - Bill Harding, Twister

Nicholas Cheung

Ranch Hand

Posts: 4982

posted 9 years ago

Regardless whether i am right or not (probably i am sure of my code).The messages describing were i lost most of my points should be more descriptive.Since one of the main objectives of getting Certification is to learn, I expect from Sun to give more information about why my record locking mechanism is not safe and to be more professional than that.I expected a more professional way in assessment.

I really doubt about this. I believe you would expect the same case being applied to other Professional Exams, in addition to SCJD. However, none of them will give you any feedback after the exams (except the section breakdown). You would neither know which questions you answered wrongly, nor why you perform so well/poor in that area, just like the case for your SCJP. Would you then mean none of them are professional?

I did not say that i dont know the material.What I said is that even if my locking mechanism is incorrect, I need to get an explanation about why it is incorrect. Knowing the reason or knowing that very very abnormal case in which two threads could access the same record, will make me learn something new and to some extent strange.

Thanks for all the support i got from all. I will synchronize all the hashtable instances in all code pieces and will resubmit the assignment again. [ January 30, 2007: Message edited by: Khaled Mahmoud ]

- I agree that it's not likely that Sun is going to provide mentoring as a part of reviewing an SCJD project. From an economic perspective it's simply not feasible.

- I think that you've gotten an AMAZING amount of help from the JavaRanch moderators.

- I know that at a bery high level, the SCJD evaluators are looking for standard solutions. The philosphy is that SCJD candidates should strive to be team players, not inventors The evaluators would much prefer to see candidates use standard, tried and true algorithms that everyone can understand, rather than see a candidate attempt to "invent" some new, custom solution.

- If the scenario can be solved 100% of the time, then a solution that works 99.9999% of the time isn't good enough - that's a really key part of this exam! So, any discussion of how "likely" a situation is, is really irrelevent.

I have discovered what happened.Simply i have submitted an older version of the jar file to Sun when i uploaded the assignment. When I have revised the old code, yes there is multiple threads that can have a lock on the same record at the same time.

Thanks to every one for help and i hope that all understands the shock of an unexpected failure in such a tough exam. [ January 30, 2007: Message edited by: Khaled Mahmoud ]