Suresh Srinivas
added a comment - 22/Jul/11 22:47 > Can't change Trash#getTrashPolicy() to private because it is used by TestTrash.java.
You can change it to package private not private. TestTrash and Trash are in the same package.

Hadoop QA
added a comment - 22/Jul/11 22:31 +1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12487485/PluggableTrash_V6.patch
against trunk revision 1148933.
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 3 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed core unit tests.
+1 system test framework. The patch passed system test framework compile.
Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/759//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HADOOP-Build/759//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/759//console
This message is automatically generated.

Hadoop QA
added a comment - 22/Jul/11 07:31 +1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12487397/PluggableTrash_V5.patch
against trunk revision 1148933.
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 3 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed core unit tests.
+1 system test framework. The patch passed system test framework compile.
Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/758//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HADOOP-Build/758//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/758//console
This message is automatically generated.

I know what the issue is, I was asking for ideas for a solution. The problem is if someone calls setConf(...) on Trash post construction, then it should also change the conf for TrashPolicy, no? If this isn't needed, then I can just remove the overridden setConf(...) in Trash and the issue will be resolved.

Usman Masood
added a comment - 20/Jul/11 08:38 I know what the issue is, I was asking for ideas for a solution. The problem is if someone calls setConf(...) on Trash post construction, then it should also change the conf for TrashPolicy, no? If this isn't needed, then I can just remove the overridden setConf(...) in Trash and the issue will be resolved.

Suresh Srinivas
added a comment - 20/Jul/11 08:33 Configured class should not call non final method in it's constructor. The problem starts there.
Do you expect the conf for TrashPolicy to change post construction? If not then you do not need to override the setConf() in Trash. Just call setConf() for TrashPolicy, post it's construction.

current and homesParent seem too tied up to TrashPolicyDefault to be moved up to TrashPolicy. I don't think any other TrashPolicy would make use of these two vars. I will move fs, trash and deletionInterval to TrashPolicy.

I will also make TrashPolicy InterfaceAudience.Public.

Does anyone have any ideas for how to resolve the FindBug issue? I explained it in a previous comment.

Usman Masood
added a comment - 20/Jul/11 08:19 current and homesParent seem too tied up to TrashPolicyDefault to be moved up to TrashPolicy. I don't think any other TrashPolicy would make use of these two vars. I will move fs, trash and deletionInterval to TrashPolicy.
I will also make TrashPolicy InterfaceAudience.Public.
Does anyone have any ideas for how to resolve the FindBug issue? I explained it in a previous comment.

> I think fs, trash can be moved up to TrashPolicy. Not sure about current, homesParent, deletionInterval though. Some TrashPolicy might not require these vars?

They can ignore it then right?

> but thinking about it more I don't really see any reason why TrashPolicy should only be accessible via Trash
That was not the reason why I was thinking about making it public. If new trash policy can be implemented by users, then it should be a public API.

Suresh Srinivas
added a comment - 18/Jul/11 21:49 > I think fs, trash can be moved up to TrashPolicy. Not sure about current, homesParent, deletionInterval though. Some TrashPolicy might not require these vars?
They can ignore it then right?
> but thinking about it more I don't really see any reason why TrashPolicy should only be accessible via Trash
That was not the reason why I was thinking about making it public. If new trash policy can be implemented by users, then it should be a public API.

@Suresh - Initially I thought making TrashPolicy InterfaceAudience.Private would be better, but thinking about it more I don't really see any reason why TrashPolicy should only be accessible via Trash. So yeah, I think we can make itInterfaceAudience.Public.

Usman Masood
added a comment - 18/Jul/11 21:47 @Suresh - Initially I thought making TrashPolicy InterfaceAudience.Private would be better, but thinking about it more I don't really see any reason why TrashPolicy should only be accessible via Trash. So yeah, I think we can make itInterfaceAudience.Public.

Usman Masood
added a comment - 18/Jul/11 21:02 I think fs, trash can be moved up to TrashPolicy. Not sure about current, homesParent, deletionInterval though. Some TrashPolicy might not require these vars?

The problem is that the constructor of Configured calls setConf(...) which I've overridden in Trash so that it invokes setConf(...) for trashPolicy as well. But this leads to the bug that FindBug highlights. I think this error will occur as long as a constructor calls an overridden method. But it doesn't make sense for us to setConf(...) of Trash and not change it for the underlying TrashPolicy. Any ideas how to work around it?

We don't really have another TrashPolicy, should I just set the conf variable fs.trash.classname to TrashPolicyDefault and see if that works as well?

Usman Masood
added a comment - 18/Jul/11 20:57 The problem is that the constructor of Configured calls setConf(...) which I've overridden in Trash so that it invokes setConf(...) for trashPolicy as well. But this leads to the bug that FindBug highlights. I think this error will occur as long as a constructor calls an overridden method. But it doesn't make sense for us to setConf(...) of Trash and not change it for the underlying TrashPolicy. Any ideas how to work around it?
We don't really have another TrashPolicy, should I just set the conf variable fs.trash.classname to TrashPolicyDefault and see if that works as well?

-1 tests included. The patch doesn't appear to include any new or modified tests.
Please justify why no new tests are needed for this patch.
Also please list what manual steps were performed to verify this patch.

+1 javadoc. The javadoc tool did not generate any warning messages.

+1 javac. The applied patch does not increase the total number of javac compiler warnings.

Hadoop QA
added a comment - 18/Jul/11 19:51 -1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12486738/PluggableTrash_V4.patch
against trunk revision 1147971.
+1 @author. The patch does not contain any @author tags.
-1 tests included. The patch doesn't appear to include any new or modified tests.
Please justify why no new tests are needed for this patch.
Also please list what manual steps were performed to verify this patch.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
-1 findbugs. The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed core unit tests.
+1 system test framework. The patch passed system test framework compile.
Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/743//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HADOOP-Build/743//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/743//console
This message is automatically generated.

Usman Masood
added a comment - 16/Jul/11 21:58 Uma, great point about Trash not extending Configured. Trash now extends Configured so that no unexpected behavior is seen by a client.
Removed unused vars/imports.

2. I think previous Trash users can do setConf and getConf because it extends Configured. But now any way to handle it? TrashPolicy extended it, but as per my understanding User will deal with Trash only right.

Uma Maheswara Rao G
added a comment - 16/Jul/11 11:57 Hi Usman,
New patch looks good to me...good job!
There are some points to address:
TrashPolicy.java
1. unused variable.
private static final Log LOG =
LogFactory.getLog(TrashPolicy.class);
Trash.java:
1. unused imports are there. can you please remove them?
2. I think previous Trash users can do setConf and getConf because it extends Configured. But now any way to handle it? TrashPolicy extended it, but as per my understanding User will deal with Trash only right.
3.Also unused log variable.
TrashPolicyDefault.java:
1. Here also please clean the unused imports.
Also after attaching the patch, you need to Submit it.

I am glad you agree. By using Trash as Facade you can avoid the interface changes getting to the user.

> Though I'm just wondering if there are some clients which make some assumptions based on the implementation of Trash when they use it? This might break things for them.
Clearly articulating what Trash is and what it is not is a good start here. We could also in javadoc say, see the implementation of TrashPolicy for more details. I can help with you in this regard, in the next patch review.

Suresh Srinivas
added a comment - 15/Jul/11 17:52 I am glad you agree. By using Trash as Facade you can avoid the interface changes getting to the user.
> Though I'm just wondering if there are some clients which make some assumptions based on the implementation of Trash when they use it? This might break things for them.
Clearly articulating what Trash is and what it is not is a good start here. We could also in javadoc say, see the implementation of TrashPolicy for more details. I can help with you in this regard, in the next patch review.

Oh, I think I didn't get you the last time. Yeah I think adding a layer of indirection should definitely make this cleaner. We can rename checkpoint() and expunge() to create/deleteSnapshot() in TrashPolicy this way as well.

Though I'm just wondering if there are some clients which make some assumptions based on the implementation of Trash when they use it? This might break things for them.

Usman Masood
added a comment - 15/Jul/11 02:35 Oh, I think I didn't get you the last time. Yeah I think adding a layer of indirection should definitely make this cleaner. We can rename checkpoint() and expunge() to create/deleteSnapshot() in TrashPolicy this way as well.
Though I'm just wondering if there are some clients which make some assumptions based on the implementation of Trash when they use it? This might break things for them.

> To preserve backward compatibility, I chose not to rename Trash to something like TrashPolicyDefault or something. If Trash uses/contains TrashPolicy then it doesn't make sense to have an implementation in it. I think it's best to separate implementation of a Trash policy from what the policy actually is. This is similar to how BlockPacementPolicy has been set up.

I am not sure what you mean by this.

All I am suggesting is, use Trash as Facade. It reference TrashPolicy (base class) and points to the implementation of the TrashPolicy which provide specialization. Some what along the lines of -> http://en.wikipedia.org/wiki/Strategy_pattern.

Suresh Srinivas
added a comment - 15/Jul/11 02:20 > To preserve backward compatibility, I chose not to rename Trash to something like TrashPolicyDefault or something. If Trash uses/contains TrashPolicy then it doesn't make sense to have an implementation in it. I think it's best to separate implementation of a Trash policy from what the policy actually is. This is similar to how BlockPacementPolicy has been set up.
I am not sure what you mean by this.
All I am suggesting is, use Trash as Facade. It reference TrashPolicy (base class) and points to the implementation of the TrashPolicy which provide specialization. Some what along the lines of -> http://en.wikipedia.org/wiki/Strategy_pattern .

Moving factory methods from TrashPolicy to TrashPolicyFactory is something I'm open to. BlockPacementPolicy doesn't use a separate factory class, so maybe just for consistency reasons we can have them in TrashPolicy?

Usman Masood
added a comment - 15/Jul/11 02:11 Moving factory methods from TrashPolicy to TrashPolicyFactory is something I'm open to. BlockPacementPolicy doesn't use a separate factory class, so maybe just for consistency reasons we can have them in TrashPolicy?

1. To preserve backward compatibility, I chose not to rename Trash to something like TrashPolicyDefault or something. If Trash uses/contains TrashPolicy then it doesn't make sense to have an implementation in it. I think it's best to separate implementation of a Trash policy from what the policy actually is. This is similar to how BlockPacementPolicy has been set up.

2. ReflectionUtils calls the no parameter constructor by default. Therefore we can only really "initialize" the instance after it been returned by ReflectionUtils.newInstance(...).

3. Trash was an instance of Configured, so I pushed that up to TrashPolicy. Another reason why this is useful is that ReflectionUtils does an internal check to see if the Class is an instanceof Configured and calls setConf(...). I don't think it's strictly needed though.

4. checkpoint(), expunge() and getEmptier() were part of the public interface exposed by Trash and since the interface was marked stable, renaming/removing these methods doesn't seem favorable right now. I also think eventually we should remove them from TrashPolicy because it doesn't make sense to expect every Trash policy to have a checkpoint mechanism.

5. If there's a startEmptier() method in Trash, we either fix it to run in a separate thread or in the same thread. By exposing a Runnable, we push this decision to the client which I think might be better.

Usman Masood
added a comment - 15/Jul/11 02:09 1. To preserve backward compatibility, I chose not to rename Trash to something like TrashPolicyDefault or something. If Trash uses/contains TrashPolicy then it doesn't make sense to have an implementation in it. I think it's best to separate implementation of a Trash policy from what the policy actually is. This is similar to how BlockPacementPolicy has been set up.
2. ReflectionUtils calls the no parameter constructor by default. Therefore we can only really "initialize" the instance after it been returned by ReflectionUtils.newInstance(...).
3. Trash was an instance of Configured, so I pushed that up to TrashPolicy. Another reason why this is useful is that ReflectionUtils does an internal check to see if the Class is an instanceof Configured and calls setConf(...). I don't think it's strictly needed though.
4. checkpoint(), expunge() and getEmptier() were part of the public interface exposed by Trash and since the interface was marked stable, renaming/removing these methods doesn't seem favorable right now. I also think eventually we should remove them from TrashPolicy because it doesn't make sense to expect every Trash policy to have a checkpoint mechanism.
5. If there's a startEmptier() method in Trash, we either fix it to run in a separate thread or in the same thread. By exposing a Runnable, we push this decision to the client which I think might be better.
I will fix the javadocs/indentation issues.

I prefer Trash uses/contains TrashPolicy insted Trash is TrashPolicy relationship. With this Trash does not need to have incompatible changes or deprecation. Trash could use TrashPolicyFactory to create a policy that it references. This also makes other code changes in this patch unnecessary.

Why are you making some members non final? You could call from one constructor another constructor with this(...) and still keep it final without introducing initialize method?

There are empty changes (with just space added).

Indentation of Trash class definition is not quite right.

TrashPolicy.java

Can Trash#fs, Trash#homeParent, Trash#deletionInterval and any other members be common to all TrashPolicy and hence be moved to TrashPolicy?

Move factory methods out of TrashPolicy to a separate class TrashPolicyFactory?

What is the need to make TrashPolicy extend Configured?

I am not sure what checkpoint() method really does - the javadoc is quite vague. Is the correct name snapshot for this? Is the right name for expunge() deleteSnapshots()?

It does not look like getEmptier() method is needed? Just a start method on Trash should suffice right? Why expose emptier? Running the thread as part of Trash is better than trash policy?

Suresh Srinivas
added a comment - 15/Jul/11 01:31 I took a quick look at it. Some comments:
Trash.java
I prefer Trash uses/contains TrashPolicy insted Trash is TrashPolicy relationship. With this Trash does not need to have incompatible changes or deprecation. Trash could use TrashPolicyFactory to create a policy that it references. This also makes other code changes in this patch unnecessary.
Why are you making some members non final? You could call from one constructor another constructor with this(...) and still keep it final without introducing initialize method?
There are empty changes (with just space added).
Indentation of Trash class definition is not quite right.
TrashPolicy.java
Can Trash#fs, Trash#homeParent, Trash#deletionInterval and any other members be common to all TrashPolicy and hence be moved to TrashPolicy?
Move factory methods out of TrashPolicy to a separate class TrashPolicyFactory?
What is the need to make TrashPolicy extend Configured?
I am not sure what checkpoint() method really does - the javadoc is quite vague. Is the correct name snapshot for this? Is the right name for expunge() deleteSnapshots()?
It does not look like getEmptier() method is needed? Just a start method on Trash should suffice right? Why expose emptier? Running the thread as part of Trash is better than trash policy?

Doug Cutting
added a comment - 15/Jul/11 00:58 - edited This looks mostly reasonable to me. A few nits:
The TrashPolicy#getInstance() implementations need javadoc comments.
TrashPolicy#getEmptier() should perhaps not claim that, "only one checkpoint is kept", as not all implementations might implement that policy.
Trash methods that override TrashPolicy methods should have an @Override and don't need javadoc unless it's different than that inherited from TrashPolicy.

Can we have some kind of TrashFactory, which will create the instance of Trash policy implementaions.Bydefault it can create Trash.java instance. Trash.java class will not be changed and deprecate the constructor. If we configure some class for Trash policy(fs.trash.classname), then factory Should create that instance. We can define one Base interface for Trash policies and in Future let all Trash policy implemantations implement those APIs(looks you already identified the APIs) from Base trash interface.

In Future only TrashFactory will be used to get the Trash instance.
As Doug said, Presntly let dont distrurb the Trash class. So that no existing users will get effected.

Also, Currently Trash implementaions present in Common, So can we move this issue to common?

Uma Maheswara Rao G
added a comment - 14/Jul/11 10:46 Can we have some kind of TrashFactory, which will create the instance of Trash policy implementaions.Bydefault it can create Trash.java instance. Trash.java class will not be changed and deprecate the constructor. If we configure some class for Trash policy(fs.trash.classname), then factory Should create that instance. We can define one Base interface for Trash policies and in Future let all Trash policy implemantations implement those APIs(looks you already identified the APIs) from Base trash interface.
In Future only TrashFactory will be used to get the Trash instance.
As Doug said, Presntly let dont distrurb the Trash class. So that no existing users will get effected.
Also, Currently Trash implementaions present in Common, So can we move this issue to common?

I think the patch looks generally good and this is a fine direction to pursue. But the Trash API is declared Stable, and removing the public constructor is an incompatible API change. Perhaps instead the default implementation should remain on the base class, but the factory should be used. That way, so long as folks are using the default implementation, existing code that calls 'new Trash()' will still work. That constructor can be deprecated, and the refactoring of the default implementation could be made later.

Doug Cutting
added a comment - 14/Jul/11 01:15 I think the patch looks generally good and this is a fine direction to pursue. But the Trash API is declared Stable, and removing the public constructor is an incompatible API change. Perhaps instead the default implementation should remain on the base class, but the factory should be used. That way, so long as folks are using the default implementation, existing code that calls 'new Trash()' will still work. That constructor can be deprecated, and the refactoring of the default implementation could be made later.

Usman Masood
added a comment - 07/Jul/11 00:13 The issue is to choose the right interface for pluggable Trash modules. Currently the public methods are:
moveToTrash(..)
getEmptier()
checkpoint()
expunge()
The first two methods should be part of the Trash interface, but I'm not sure about the last two. Not every Trash policy should be required to implement a checkpoint mechanism.
Currently expunge() and checkpoint() are used by FsShell for the "-expunge" arg.