Hemanth Yamijala
added a comment - 30/Sep/09 17:30 As discussed offline with Amar, a couple of points to consider:
Should we display as Amar defined above? Or should we show separate columns? Showing separate columns might make it appear more like an extension than a change to the existing interface. Put differently: is it OK to change the web interface as shown above?
The second point is that other schedulers, which don't support high-RAM jobs, will treat tasks and slots identically, so the information could be redundant for them. Is this OK?

Amar Kamat
added a comment - 06/Oct/09 10:23 One disconnect here is that the slots info is computed using the JobInProgress, while the cluster summary is computed via TaskTracker join-backs. Hence there is a small window where the data will be inconsistent. So another proposal is:
Show the slots info along with running jobs and show the total count at the end.
Show the slots info in a separate section.
Keep the slots info in the cluster summary but don't share it with clients (i.e., don't serialize it).
Thoughts?
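The third option above can be sketched as follows. This is a minimal standalone illustration, not the actual Hadoop `ClusterStatus`/`Writable` code; the class and field names are invented. The point is that the slot counts live on the server-side status object but are skipped by the wire format, so clients never receive them.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical sketch of option 3: slot counts are kept on the server-side
// status object but omitted from serialization, so clients never see them.
public class SlotAwareStatus {
    public int runningMaps;        // shared with clients
    public int occupiedMapSlots;   // server/UI only, NOT serialized

    public void write(DataOutput out) throws IOException {
        out.writeInt(runningMaps);
        // occupiedMapSlots intentionally omitted from the wire format
    }

    public static SlotAwareStatus read(DataInput in) throws IOException {
        SlotAwareStatus s = new SlotAwareStatus();
        s.runningMaps = in.readInt();
        return s;  // occupiedMapSlots stays at its default (0) on the client
    }

    public static void main(String[] args) throws IOException {
        SlotAwareStatus server = new SlotAwareStatus();
        server.runningMaps = 4;
        server.occupiedMapSlots = 7;  // e.g. a high-RAM task holds extra slots

        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        server.write(new DataOutputStream(buf));
        SlotAwareStatus client =
            read(new DataInputStream(new ByteArrayInputStream(buf.toByteArray())));

        System.out.println(client.runningMaps);       // 4
        System.out.println(client.occupiedMapSlots);  // 0 - never left the server
    }
}
```

This keeps the web UI's slot display intact while leaving the client-visible serialized form unchanged.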

This looks more like a bug than a feature. ClusterStatus.getMapTasks/getReduceTasks is called in existing code (in schedulers, examples, etc.) to get the number of occupied slots.
I would say the cluster summary on the JobTracker web UI can show occupied slots instead of running tasks.
The statistics displayed by Amar would look like:

Occupied Map Slots:    30
Occupied Reduce Slots: 76
Total Submissions:     55
Nodes:                 38
Map Slot Capacity:     228
Reduce Slot Capacity:  76
Avg. Tasks/Node:       8.00
Blacklisted Nodes:     0
Excluded Nodes:        0

ClusterStatus.getMapTasks/getReduceTasks will return number of slots occupied (not running tasks). Already ClusterStatus.getMaxMapTasks/getMaxReduceTasks return total number map/reduce slots in the cluster.
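A minimal standalone sketch (hypothetical classes, not the Hadoop source) of why the two counts diverge: a single running high-RAM task can occupy several slots, so "running tasks" and "occupied slots" are only equal when every task uses one slot.

```java
// Hypothetical model illustrating why "running tasks" and "occupied slots"
// diverge for high-RAM jobs: one running task may hold more than one slot.
public class SlotCounts {
    /** slotsPerTask[i] = slots held by running task i. */
    public static int runningTasks(int[] slotsPerTask) {
        return slotsPerTask.length;
    }

    public static int occupiedSlots(int[] slotsPerTask) {
        int total = 0;
        for (int s : slotsPerTask) total += s;  // a high-RAM task counts as >1 slot
        return total;
    }

    public static void main(String[] args) {
        // Two normal tasks (1 slot each) plus one high-RAM task (3 slots):
        int[] tasks = {1, 1, 3};
        System.out.println("running tasks  = " + runningTasks(tasks));   // 3
        System.out.println("occupied slots = " + occupiedSlots(tasks));  // 5
    }
}
```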


Amareshwari Sriramadasu
added a comment - 08/Oct/09 10:30 In 0.21, corresponding methods for ClusterStatus.getMapTasks/getReduceTasks are ClusterMetrics.getOccupiedMapSlots/getOccupiedReduceSlots and for ClusterStatus.getMaxMapTasks/getMaxReduceTasks are ClusterMetrics.getMapSlotCapacity/getReduceSlotCapacity.

Vinod Kumar Vavilapalli
added a comment - 09/Oct/09 08:39 ClusterStatus.getMapTasks/getReduceTasks will return the number of slots occupied (not running tasks). Already, ClusterStatus.getMaxMapTasks/getMaxReduceTasks return the total number of map/reduce slots in the cluster.
I think we should distinguish between the two. We should have separate ClusterStatus.get{Map|Reduce}Tasks() and ClusterStatus.get{Map|Reduce}Slots() APIs.
Also, changing it the way it is suggested is a regression.

Vinod Kumar Vavilapalli
added a comment - 09/Oct/09 08:40 Also changing it the way it is suggested is a regression.
OK, it isn't a regression code-wise. But the names don't quite reflect the intention.

Amareshwari Sriramadasu
added a comment - 09/Oct/09 09:54 OK, it isn't a regression code-wise. But the names don't quite reflect the intention.
The javadoc is updated for the change.
We should have separate ClusterStatus.get{Map|Reduce}Tasks() and ClusterStatus.get{Map|Reduce}Slots() APIs.
Cluster status need not know about tasks. If you look at o.a.h.mapreduce.ClusterMetrics, the same methods are renamed. See my comments https://issues.apache.org/jira/browse/MAPREDUCE-1048?focusedCommentId=12763430&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12763430 and
https://issues.apache.org/jira/browse/MAPREDUCE-1048?focusedCommentId=12763004&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12763004

Sharad Agarwal
added a comment - 14/Oct/09 10:22 Overall looks fine. A few minor comments:
Move the methods JobTracker#getReserved{Map/Reduce}Slots to TaskTracker.
Add reserved slots to ClusterMetrics.
In the heartbeat, the reserved slots may not be incremented if the call gets short-circuited after the decrement. It would be better to update the reserved slots at a more granular level in processHeartbeat and around the assignTasks call.
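The granular accounting suggested above can be sketched as follows. All names here are invented for illustration (this is not the JobTracker source): bookkeeping happens at one choke point per tracker, so a short-circuited assignTasks call, a repeated heartbeat, or a lost tracker cannot leave the global count stale.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch (invented names) of per-tracker reservation bookkeeping updated at a
// single choke point: the old contribution is dropped and the new one added in
// one step, so repeated or short-circuited heartbeats cannot double-count.
public class ReservationBook {
    private final Map<String, Integer> reservedPerTracker = new HashMap<>();
    private int totalReservedSlots = 0;

    /** Called from processHeartbeat, around the assignTasks call. */
    public synchronized void updateReservation(String tracker, int nowReserved) {
        int before = reservedPerTracker.getOrDefault(tracker, 0);
        totalReservedSlots += nowReserved - before;   // decrement old, add new
        if (nowReserved == 0) reservedPerTracker.remove(tracker);
        else reservedPerTracker.put(tracker, nowReserved);
    }

    /** Called from the lost-tracker / blacklisting paths. */
    public synchronized void removeReservation(String tracker) {
        updateReservation(tracker, 0);
    }

    public synchronized int getTotalReservedSlots() { return totalReservedSlots; }

    public static void main(String[] args) {
        ReservationBook book = new ReservationBook();
        book.updateReservation("tt1", 3);   // high-RAM job reserves 3 slots
        book.updateReservation("tt1", 3);   // repeated heartbeat: no double count
        System.out.println(book.getTotalReservedSlots());  // 3
        book.removeReservation("tt1");      // tracker lost
        System.out.println(book.getTotalReservedSlots());  // 0
    }
}
```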

Hemanth Yamijala
added a comment - 19/Oct/09 11:52 I am seeing some issues with the 0.20 patch:
The slot information is being accessed in an unsynchronized manner from the UI.
There is non-atomic access of this information. In other words, the map slots and reduce slots are read from the UI in different calls, and a heartbeat could update them in between.
Note that for the above two points, the cluster status model actually works correctly, because the cluster status is read from the UI synchronized on the JobTracker, and a snapshot of the values is captured in a new ClusterStatus object when the UI reads it.
Reservation tracking seems broken in several ways:
removeTrackerReservations is called in lostTaskTracker after reservations are cancelled, so the information that needs to be removed has already been cleared.
It seems ExpireLaunchingTasks can result in a tracker being globally blacklisted, but I don't see any add/removeTrackerReservations there.
In processHeartbeat, there seem to be code paths where removeTrackerReservations is called twice, for example when the tracker is declared lost.
In general, it is very hard to verify the correctness of this patch in its current form, as the logic is spread across multiple code paths and it is difficult to verify that all of them are covered.
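The snapshot pattern the cluster status model uses can be sketched like this (names invented, not the Hadoop source): the UI takes one synchronized snapshot instead of reading each counter in a separate unsynchronized call, so a heartbeat cannot update the numbers between the map-slot read and the reduce-slot read.

```java
// Sketch of the snapshot pattern: one synchronized read yields a mutually
// consistent, immutable pair of counters for the UI.
public class SlotSnapshot {
    /** Immutable snapshot handed to the UI. */
    public static final class Snapshot {
        public final int occupiedMapSlots, occupiedReduceSlots;
        Snapshot(int m, int r) { occupiedMapSlots = m; occupiedReduceSlots = r; }
    }

    private int occupiedMapSlots, occupiedReduceSlots;

    /** Heartbeat path: updates both counters under the same lock. */
    public synchronized void onHeartbeat(int maps, int reduces) {
        occupiedMapSlots = maps;
        occupiedReduceSlots = reduces;
    }

    /** UI path: captures both values atomically in a new object. */
    public synchronized Snapshot snapshot() {
        return new Snapshot(occupiedMapSlots, occupiedReduceSlots);
    }

    public static void main(String[] args) {
        SlotSnapshot jt = new SlotSnapshot();
        jt.onHeartbeat(30, 76);
        Snapshot s = jt.snapshot();
        // Later heartbeats cannot change the snapshot the UI already holds:
        jt.onHeartbeat(31, 75);
        System.out.println(s.occupiedMapSlots + " " + s.occupiedReduceSlots);  // 30 76
    }
}
```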

Sharad Agarwal
added a comment - 20/Oct/09 06:30 Changes look good to me. Minor comments:
It would be better to rename JobTracker#updateReservations to JobTracker#incrReservations.
In the JSP, all slot info should be together. We can move the occupied/reserved slot info after the Nodes column.

Iyappan Srinivasan
added a comment - 20/Oct/09 11:51 +1 from QA on reservation.
Ran some reservation-specific jobs:
1) In an 8-node cluster, run a normal job which takes up 7 slots. Then run a high-RAM job which takes up 3 slots for 1 map. This will cause the high-RAM job to reserve the extra 1 slot and wait, since it needs the other 2 slots to start running. At this point:
a) Kill a TaskTracker which has that 1 reserved slot and make it lost. The reserved slot should disappear.
b) Kill the TaskTracker and start it again. It should again get that 1 reservation.
c) Blacklist that TaskTracker. The reservation status should reflect this accordingly. Bring the node back to a healthy state. The reservation status should again reflect this.
d) Decommission that TaskTracker. The reservation status should reflect this accordingly. Bring the node back to a healthy state. The reservation status should again reflect this.

Hemanth Yamijala
added a comment - 21/Oct/09 10:49 There are two possible numbers that we can show for the occupied slots. It could be the number of slots running tasks, or it could be the number of slots running tasks plus the number of virtual reserved slots (virtual reserved slots for a job = number of trackers with reservations for the job * number of slots per task for the job). Showing the actual number of slots running tasks gives a more correct view, but the second number is what the scheduler sees as the capacity consumed. I think it may be good to show only the first number - it is more intuitive and more correct. Thoughts?
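Plugging in the 8-node QA scenario above makes the two candidate numbers concrete. This is a worked example with invented names, using the formula from the comment (virtual reserved slots = trackers with reservations * slots per task):

```java
// Worked example of the two candidate "occupied slots" numbers, using the
// formula: virtual reserved slots = trackers with reservations * slots per task.
public class OccupiedSlotNumbers {
    public static int virtualReservedSlots(int trackersWithReservations,
                                           int slotsPerTask) {
        return trackersWithReservations * slotsPerTask;
    }

    public static void main(String[] args) {
        int slotsRunningTasks = 7;                        // the normal job's 7 maps
        int virtualReserved = virtualReservedSlots(1, 3); // one tracker reserved
                                                          // for a 3-slot task
        // The "actual" number, vs the capacity the scheduler sees as consumed:
        System.out.println("actual    = " + slotsRunningTasks);                     // 7
        System.out.println("scheduler = " + (slotsRunningTasks + virtualReserved)); // 10
    }
}
```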

Hemanth Yamijala
added a comment - 22/Oct/09 07:02 In offline discussions with Arun and Eric, we decided to stick to my last proposal of showing the actual number for occupied slots - not including the virtual reserved slots in the count.

Hemanth Yamijala
added a comment - 22/Oct/09 16:19 This looks fine. I have a few minor comments:
It is confusing that incrementReservations can also decrement. I think it's better to split the calls into two.
The ClusterMetrics javadoc needs to be updated with the total number of job submissions.
In the javadoc of ClientProtocol for version 29, please also include total job submissions. I think we should also include the JIRA which made the change in the comment.
TestClusterStatus can also have a check with a TT coming back twice, so that we can verify that oldStatus is also used to decrement the old slot counts correctly.
Similarly, we can also have a check with re-reservation of slots.
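The join-back case the suggested test would cover can be sketched as follows (invented names, not TestClusterStatus itself): when a tracker reports again, its previous status's slot counts are subtracted before the new ones are added, so a tracker coming back twice cannot inflate the cluster totals.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch (invented names) of join-back accounting: the old per-tracker count
// is decremented before the new one is added, so a double join-back from the
// same tracker leaves the cluster total correct.
public class JoinBackAccounting {
    private final Map<String, Integer> occupiedByTracker = new HashMap<>();
    private int totalOccupiedMapSlots = 0;

    public synchronized void trackerReported(String tracker, int occupiedMapSlots) {
        Integer old = occupiedByTracker.put(tracker, occupiedMapSlots);
        if (old != null) totalOccupiedMapSlots -= old;  // drop the stale count
        totalOccupiedMapSlots += occupiedMapSlots;
    }

    public synchronized int getTotalOccupiedMapSlots() { return totalOccupiedMapSlots; }

    public static void main(String[] args) {
        JoinBackAccounting jt = new JoinBackAccounting();
        jt.trackerReported("tt1", 5);
        jt.trackerReported("tt1", 5);   // same tracker joins back twice
        System.out.println(jt.getTotalOccupiedMapSlots());  // 5, not 10
    }
}
```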

Hemanth Yamijala
added a comment - 22/Oct/09 16:36 Another issue is jobtracker.jspx, which is now (almost) a copy of jobtracker.jsp. It seems a little odd that we must update the pages independently to keep them in sync. Also, since jobtracker.jspx returns data in an XML format, I am not sure whether it is intended to be used as an interface; in that case, things could break if changes are made to it. I am thinking we should update jobtracker.jspx in a separate JIRA after confirming that the users who are using it are OK with the change. In that JIRA we could also figure out ways of avoiding having to change both JSP pages, by sharing code.