I'm not sure I am exactly the right person for this, but I assume that you are familiar with genetic algorithms. The Mahout Project (http://mahout.apache.org/) is probably a good place to start; they have a number of machine learning algorithms that run on top of Hadoop. I did a search and it looks like there may already be some support for genetic algorithms in Mahout, but I don't know the current state of it. It looked like there was some discussion about it being abandoned, and it might be deleted. Either way it would be a good starting point. Commons Math may be a good place to look too, because there is an implementation there that is already Apache licensed, so borrowing some of that code is not an issue: http://commons.apache.org/math/userguide/genetics.html

--Bobby Evans

On 1/16/13 8:24 AM, "Varsha Raveendran" <[EMAIL PROTECTED]> wrote:

>Hello!
>
>I require information regarding a project given on the Hadoop website. Can
>anyone guide me in the right direction?
>
>The project is "Implement a library/framework to support Genetic
>Algorithms <http://en.wikipedia.org/wiki/Genetic_algorithm> on Hadoop
>Map-Reduce."
>
>Regards,
>Varsha
>
>New to Hadoop :)


Robert Evans 2013-01-18, 18:46


Re: More information regarding the Project suggestions given on the Hadoop website

-- *-Varsha *


Varsha Raveendran 2013-01-19, 04:12


Re: More information regarding the Project suggestions given on the Hadoop website

Hello! Based on a couple of existing genetic algorithm libraries available on the net, my team and I have come up with a design for the library. But we do not understand how to validate the library.

Are there any established test designs for checking whether a library is working correctly? I would like to mention again that we are graduate students and have just started working on Hadoop.

> Thank you! I will check with the Mahout team and also go through Commons
> Math site.
>
> Thanks & Regards,
> Varsha

-- *-Varsha *


Varsha Raveendran 2013-02-07, 06:55


Re: More information regarding the Project suggestions given on the Hadoop website

This conversation is probably better suited to common-user@, so I am moving it over there; I put common-dev@ in the BCC.

I am not really sure what you mean by validate. I assume you want to test that your library does what you want it to do. I would start out with unit tests to validate that the individual pieces work as you designed them to. After that you want to do some system-level testing. When I port an algorithm over to Hadoop I typically have one of two goals: I either want to reproduce the original algorithm exactly, or I want to create a good-enough approximation of it that is extremely scalable.
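As a rough illustration of unit-testing one individual piece, here is a minimal sketch in Java. The `crossover` operator and the `check` helper are hypothetical names made up for this example, not part of Hadoop, Mahout, or Commons Math, and plain checks stand in for a test framework such as JUnit.

```java
import java.util.Arrays;

final class CrossoverCheck {
    // Hypothetical single-point crossover: the child takes a[0..point)
    // followed by b[point..n). Both parents must have the same length.
    static int[] crossover(int[] a, int[] b, int point) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("parents must have equal length");
        }
        if (point < 0 || point > a.length) {
            throw new IllegalArgumentException("crossover point out of range");
        }
        int[] child = new int[a.length];
        System.arraycopy(a, 0, child, 0, point);
        System.arraycopy(b, point, child, point, b.length - point);
        return child;
    }

    // Tiny stand-in for a test framework assertion.
    static void check(boolean condition, String message) {
        if (!condition) {
            throw new AssertionError(message);
        }
    }

    public static void main(String[] args) {
        int[] a = {1, 1, 1, 1};
        int[] b = {0, 0, 0, 0};
        // Typical case plus the two boundary points.
        check(Arrays.equals(crossover(a, b, 2), new int[] {1, 1, 0, 0}), "mid-point split");
        check(Arrays.equals(crossover(a, b, 0), b), "point 0 copies second parent");
        check(Arrays.equals(crossover(a, b, 4), a), "point n copies first parent");
        System.out.println("all crossover checks passed");
    }
}
```

The same pattern (a small deterministic input, a known expected output, and the boundary cases) would apply to selection and mutation operators before any of them are wired into a MapReduce job.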

If you recreated the algorithm exactly, you could validate it against the single-computer reference implementation and check that the results are identical. With machine learning this is often difficult because many algorithms use random numbers as part of the process. To get around this you sometimes have to modify both implementations to be able to use a consistent set of pseudo-random numbers.
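A minimal sketch of the consistent pseudo-random numbers idea, in Java: every random decision is drawn from a `java.util.Random` that the caller passes in, so two runs given the same seed can be compared for exact equality. The `mutate` operator, the `rngFor` helper, and the seed values are assumptions made up for this example.

```java
import java.util.Arrays;
import java.util.Random;

final class SeededMutationDemo {
    // Hypothetical bit-flip mutation: flips each gene with probability `rate`.
    // All randomness comes from the supplied generator, never a hidden one.
    static boolean[] mutate(boolean[] genes, double rate, Random rng) {
        boolean[] out = genes.clone();
        for (int i = 0; i < out.length; i++) {
            if (rng.nextDouble() < rate) {
                out[i] = !out[i];
            }
        }
        return out;
    }

    // Hypothetical helper: one generator per individual, derived from a base
    // seed and the individual's id, so the result does not depend on the
    // order in which individuals happen to be processed.
    static Random rngFor(long baseSeed, long individualId) {
        return new Random(baseSeed * 31L + individualId);
    }

    public static void main(String[] args) {
        boolean[] parent = new boolean[16];
        long baseSeed = 42L;
        boolean[] first = mutate(parent, 0.25, rngFor(baseSeed, 7L));
        boolean[] second = mutate(parent, 0.25, rngFor(baseSeed, 7L));
        System.out.println("identical: " + Arrays.equals(first, second));
    }
}
```

Because both calls use a generator built from the same seed, the two mutated arrays are identical. Deriving the seed per individual is one possible way to keep a distributed run comparable with a single-machine run, where records may be visited in a different order; whether it fits depends on how the library partitions its population.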

The other alternative is to use statistics, and this works fairly well no matter how you ported the algorithm. Train using the same input data multiple times using each implementation. Compare the results against a test set. As grad students you probably already understand the stats necessary to do this correctly. Your advisor will probably also be able to give you better advice on this, because they can sit down with you and give you much faster feedback.
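One possible way to make that comparison concrete is sketched below in Java, assuming each run produces a single score on the test set. Welch's t-statistic is a choice made for this example, not something prescribed above, and the scores in `main` are placeholder values, not measurements.

```java
final class RunComparison {
    // Arithmetic mean of the scores.
    static double mean(double[] xs) {
        double sum = 0.0;
        for (double x : xs) {
            sum += x;
        }
        return sum / xs.length;
    }

    // Unbiased sample variance (divides by n - 1); needs at least two scores.
    static double sampleVariance(double[] xs) {
        double m = mean(xs);
        double sumSq = 0.0;
        for (double x : xs) {
            sumSq += (x - m) * (x - m);
        }
        return sumSq / (xs.length - 1);
    }

    // Welch's t-statistic for two independent samples with possibly
    // different variances: (mean(a) - mean(b)) / standard error.
    static double welchT(double[] a, double[] b) {
        double standardError =
            Math.sqrt(sampleVariance(a) / a.length + sampleVariance(b) / b.length);
        return (mean(a) - mean(b)) / standardError;
    }

    public static void main(String[] args) {
        // Placeholder test-set scores, one per repeated run (made up).
        double[] referenceRuns = {0.81, 0.79, 0.83, 0.80, 0.82};
        double[] hadoopRuns = {0.80, 0.78, 0.84, 0.79, 0.81};
        System.out.println("Welch t = " + welchT(referenceRuns, hadoopRuns));
    }
}
```

A t-statistic near zero suggests the two implementations score similarly on average. Turning it into a p-value needs the t distribution, for which a statistics library is the safer route than hand-rolled code.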

--Bobby

On 2/7/13 12:55 AM, "Varsha Raveendran" <[EMAIL PROTECTED]> wrote:

>Hello!
>
>Based on couple of existing genetic algorithms library available on the
>net, my team and I have come up with a design for the library. But we are
>not able to understand how to validate the library -
>
>Are there any test designs followed to test if a library is working
>correctly?
>
>I would like to again mention that we are graduate students and have just
>started working on Hadoop.
>
>Thanks in advance,
>Varsha


Robert Evans 2013-02-07, 15:13

