Hi Vinod, I had the node assignment at first, but in my second email I explained how I want to change the order of data partition execution. The default is to run tasks based on the *size* of the partition assigned to each. Now I want to run tasks so that the partitions are executed in a specific order.

E.g., first assume the input is the directory Houses/ with files {Villa, Apartment, Room}, such that file "Villa" is larger than "Apartment", which is larger than "Room".

> I assume you are talking about MapReduce. And 1.x release or 2.x?
>
> In either of the releases, this cannot be done directly.
>
> In 1.x, the framework doesn't expose a feature like this as it is a shared
> service, and if enough jobs flock to a node, it will lead to utilization
> and failure handling issues.
>
> In Hadoop 2 YARN, the platform does expose this functionality. But
> MapReduce framework doesn't yet expose this functionality to the end users.
>
> What exactly is your use case? Why are some nodes of higher priority than
> others?
>
> Thanks,
> +Vinod Kumar Vavilapalli
> Hortonworks Inc.
> http://hortonworks.com/
>
> On Sep 11, 2013, at 10:09 AM, Mark Olimpiati wrote:
>
> Thanks for replying Rev, but the link is talking about reducers which
> seems to be like a similar case but what if I assigned priorities to the
> data partitions (eg. partition B=1, partition C=2, partition A=3,...) such
> that first map task is assigned partition B to run first. Then second map
> is given partition C, .. etc. This is instead of assigning based on
> partition size. Is that possible?
>
> Thanks,
> Mark
>
>
> On Mon, Sep 9, 2013 at 11:17 AM, Ravi Prakash <[EMAIL PROTECTED]> wrote:
>
>>
>> http://lucene.472066.n3.nabble.com/Assigning-reduce-tasks-to-specific-nodes-td4022832.html
>>
>> ------------------------------
>> *From:* Mark Olimpiati <[EMAIL PROTECTED]>
>> *To:* [EMAIL PROTECTED]
>> *Sent:* Friday, September 6, 2013 1:47 PM
>> *Subject:* assign tasks to specific nodes
>>
>> Hi guys,
>>
>> I'm wondering if there is a way for me to assign tasks to specific
>> machines or at least assign priorities to the tasks to be executed in that
>> order. Any suggestions?
>>
>> Thanks,
>> Mark
>>
>>
>
> CONFIDENTIALITY NOTICE
> NOTICE: This message is intended for the use of the individual or entity
> to which it is addressed and may contain information that is confidential,
> privileged and exempt from disclosure under applicable law. If the reader
> of this message is not the intended recipient, you are hereby notified that
> any printing, copying, dissemination, distribution, disclosure or
> forwarding of this communication is strictly prohibited. If you have
> received this communication in error, please contact the sender immediately
> and delete it from your system. Thank You.

Potentially you would be able to, but I guess you will have to update the partitioning code and, correspondingly, the RMContainerAllocator (YARN MapReduce) code. Today all map tasks share one priority and all reduce tasks share another (same priority for all map tasks < same priority for all reduce tasks). What you can do is change the map task priorities based on partition size (file size). When you assign priorities to the container requests, make sure the priorities of the containers for the corresponding map tasks follow apartment > room > villa ...

However, you should note a few things here, and I have a few questions for you:
1) I don't see why you want to do this; for your task to succeed you will need all of the map tasks to finish anyway. Why do you want this ordering? Are there any benefits?
2) Even if you submit all the requests with the specified priorities, you are not guaranteed to get them in the same order, because most of these requests are for specific host machines (node managers), so we don't know in advance whether sufficient resources will be available there or not.
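
As a rough illustration of the ordering being discussed, here is a minimal, self-contained Java sketch that sorts partitions by a user-assigned priority instead of by size. It is not the RMContainerAllocator change itself and uses no Hadoop or YARN APIs; the class name `PartitionOrder` and the method `order` are hypothetical, and the sketch assumes that priority 1 should run first, as in the Houses/ example. It only shows the requested order; as noted in point 2, the scheduler may still hand out containers in a different order.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Hadoop: orders partitions by an assigned priority.
public class PartitionOrder {

    // Returns a copy of the partitions sorted by priority (1 runs first).
    // Partitions without an assigned priority go last, keeping their input order.
    public static List<String> order(List<String> partitions, Map<String, Integer> priorities) {
        List<String> sorted = new ArrayList<>(partitions);
        // List.sort is stable, so ties keep the original (size-based) order.
        sorted.sort(Comparator.comparingInt(p -> priorities.getOrDefault(p, Integer.MAX_VALUE)));
        return sorted;
    }

    public static void main(String[] args) {
        // Default order from the example: largest file first.
        List<String> bySize = List.of("Villa", "Apartment", "Room");
        // Priorities from the example: Apartment=1, Room=2, Villa=3.
        Map<String, Integer> priorities = Map.of("Apartment", 1, "Room", 2, "Villa", 3);
        System.out.println(order(bySize, priorities)); // prints [Apartment, Room, Villa]
    }
}
```

In a real change, the position in this list would have to be translated into a distinct priority on each map task's container request, which is the part that needs the allocator modification described above.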

> Hi Vinod, I had the node assignment at first but in my second email I
> explained how I want to change the order of data partition execution. The
> default is run tasks based on the *size* of the assigned partition to it.
> Now I want to run tasks such that specific order of partitions is to be
> executed.
>
> Eg. First assume input is directory Houses/ with files {Villa, Apartment,
> Room} such that file "Villa" is larger in size than "Apartments" than
> "Room".
>
> The default hadoop would run :
> map1 --> Villa
> map2 --> Apartment
> map3 --> Room
>
> I want to assign priorities to the *data partitions* such that
> Apartment=1, Room=2, Villa=3 then the scheduler will run the following in
> this order:
> map1 --> Apartment
> map2 --> Room
> map3 --> Villa
>
> My question is that possible? Notice this is regardless of the assigned
> node.
> Thank you,
> Mark
>
>
> On Wed, Sep 11, 2013 at 10:45 AM, Vinod Kumar Vavilapalli <
> [EMAIL PROTECTED]> wrote:
>
>>
>> I assume you are talking about MapReduce. And 1.x release or 2.x?
>>
>> In either of the releases, this cannot be done directly.
>>
>> In 1.x, the framework doesn't expose a feature like this as it is a
>> shared service, and if enough jobs flock to a node, it will lead to
>> utilization and failure handling issues.
>>
>> In Hadoop 2 YARN, the platform does expose this functionality. But
>> MapReduce framework doesn't yet expose this functionality to the end users.
>>
>> What exactly is your use case? Why are some nodes of higher priority than
>> others?
>>
>> Thanks,
>> +Vinod Kumar Vavilapalli
>> Hortonworks Inc.
>> http://hortonworks.com/
>>
>> On Sep 11, 2013, at 10:09 AM, Mark Olimpiati wrote:
>>
>> Thanks for replying Rev, but the link is talking about reducers which
>> seems to be like a similar case but what if I assigned priorities to the
>> data partitions (eg. partition B=1, partition C=2, partition A=3,...) such
>> that first map task is assigned partition B to run first. Then second map
>> is given partition C, .. etc. This is instead of assigning based on
>> partition size. Is that possible?
>>
>> Thanks,
>> Mark
>>
>>
>> On Mon, Sep 9, 2013 at 11:17 AM, Ravi Prakash <[EMAIL PROTECTED]> wrote:
>>
>>>
>>> http://lucene.472066.n3.nabble.com/Assigning-reduce-tasks-to-specific-nodes-td4022832.html
>>>
>>> ------------------------------
>>> *From:* Mark Olimpiati <[EMAIL PROTECTED]>
>>> *To:* [EMAIL PROTECTED]
>>> *Sent:* Friday, September 6, 2013 1:47 PM
>>> *Subject:* assign tasks to specific nodes
>>>
>>> Hi guys,
>>>
>>> I'm wondering if there is a way for me to assign tasks to specific
>>> machines or at least assign priorities to the tasks to be executed in that
>>> order. Any suggestions?
>>>
>>> Thanks,
>>> Mark



Omkar Joshi 2013-09-16, 18:03
