I haven't been reading the list for the past couple of weeks, as I've been quite busy... but I've searched and didn't find any discussions related to my current issue, so I thought I'd ask while I'm still investigating on my own!

We've been running a Kafka 0.7.0 cluster without problems for a while now. A while ago, I played around <http://felixgv.com/post/88/kafka-distributed-incremental-hadoop-consumer/> with importing data from our Kafka cluster into Hadoop, using the simple Kafka consumer located in the contrib directory of the Kafka source, and that worked properly. At the time, the Hadoop cluster I was running was CDH3u3, IIRC.

I'm now revisiting that project with a brand new CDH4.1.2 Hadoop cluster (using MR1, not YARN), and I'm having difficulty getting it to work.

At first, the run-class.sh script in kafka/contrib/hadoop-consumer wasn't using the proper Hadoop jars to connect to my cluster, so I tweaked it so that it includes the output of the `hadoop classpath` command in its classpath. It's now able to connect to my Hadoop cluster, but it's telling me that the versions don't match:

Exception in thread "main" org.apache.hadoop.ipc.RemoteException: Server IPC version 7 cannot communicate with client version 3
    at org.apache.hadoop.ipc.Client.call(Client.java:740)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
    at $Proxy0.getProtocolVersion(Unknown Source)
    ...

(I could give the whole stack trace if you want, but I didn't think that's really relevant...)
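For reference, the classpath tweak described above might look roughly like the sketch below. This is not the actual change made to run-class.sh: the function name `append_hadoop_classpath` is made up for illustration, and the sketch assumes the `hadoop` command is on the PATH of the machine running the consumer.

```shell
#!/bin/sh
# Hypothetical helper, NOT the actual contents of run-class.sh.
# Appends the output of `hadoop classpath` to an existing classpath string.
append_hadoop_classpath() {
  base="$1"
  # Only call `hadoop` if the command is actually available.
  if command -v hadoop >/dev/null 2>&1; then
    extra="$(hadoop classpath)"
  else
    extra=""
  fi
  # Join with ':' only when both parts are non-empty.
  if [ -n "$base" ] && [ -n "$extra" ]; then
    printf '%s:%s\n' "$base" "$extra"
  else
    printf '%s%s\n' "$base" "$extra"
  fi
}

# Extend whatever classpath has been built so far (empty if unset).
CLASSPATH="$(append_hadoop_classpath "${CLASSPATH:-}")"
export CLASSPATH
```

Note that this only changes which jars are visible at run time. It does not change which Hadoop version the consumer code was compiled against.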

So anyway, I've messed around with the kafka/project/build/KafkaProject.scala file so that it uses the "2.0.0-mr1-cdh4.1.2" version of hadoop-core, and fetches it from the Cloudera repo. I've added the Cloudera repo by adding this line at the beginning of the HadoopConsumerProject class section:

When I run ./sbt update, it fetches the new jars correctly, but then, when I run ./sbt package, it's not able to find a bunch of Hadoop-related classes and packages in the hadoop-consumer code, which I guess means that a few APIs have changed between the two versions of CDH.

I've tried this on the 0.7.0 branch of Kafka (from the Apache git repo) as well as on the 0.7.2 branch, and I get the same result on both (I can't successfully run ./sbt package). The easiest for me would be to get it to work on Kafka 0.7.0, but I guess I could persuade my people to upgrade to 0.7.2 if it's necessary (I'd like us to upgrade, but I guess you all know how it is... getting a working system to change is a political hassle). I don't think we'd be willing to move to Kafka 0.8 just yet, so hopefully that won't be necessary.

*TLDR: Is anyone pumping data from Kafka 0.7.x to CDH4.x? And if so, how? Using the example consumer from Kafka's contrib, or another one?* Perhaps this one <https://github.com/miniway/kafka-hadoop-consumer>? (I'll probably give it a try soon, BTW, so I'll keep you guys posted...). I may also try porting the hadoop-consumer contrib to CDH4.

Finally, I haven't seen anything mentioned about the LinkedIn kafka/avro/hadoop ETL stuff we've been hearing about for a while. I saw the new LinkedIn DataFu stuff but it seems unrelated. Are there any updates about whether or when the ETL code would get open sourced? As far as we're concerned, we're using Avro quite a bit, so in our case, the Avro coupling would definitely not be an issue. I don't know what version(s) of Hadoop LinkedIn is running, though, so perhaps their stuff wouldn't work out of the box with CDH4 either anyway...

Any advice would be appreciated!

Thanks :) !

--Felix

Felix GV 2013-01-07, 19:21

Re: Is anyone able to consume from Kafka 0.7.x and write into Hadoop CDH 4.x ?

> > Finally, I haven't seen anything mentioned about the LinkedIn
> > kafka/avro/hadoop ETL stuff we've been hearing about for a while.
>
> The LinkedIn ETL kafka/avro/hadoop project is open sourced. See here -
> https://github.com/linkedin/camus/wiki/Camus-Overview
>
> Thanks,
> Neha

Felix GV 2013-01-07, 19:32

Re: Is anyone able to consume from Kafka 0.7.x and write into Hadoop CDH 4.x ?

Just curious, were you able to make Camus work with CDH4 then?

Cheers,
Matt

Matt Lieber 2013-03-14, 21:34

Re: Is anyone able to consume from Kafka 0.7.x and write into Hadoop CDH 4.x ?

I have used KafkaETLJob to write a job that consumes from Kafka and writes to HDFS. Kafka version 0.7.2 rc5 and CDH 4.1.2.

Is anything in particular not working?

-David

On 3/14/13 5:31 PM, Matt Lieber wrote:
> Just curious, were you able to make Camus work with CDH4 then?
>
> Cheers,
> Matt

David Arthur 2013-03-15, 03:29

Re: Is anyone able to consume from Kafka 0.7.x and write into Hadoop CDH 4.x ?

David, we use a subset of the KafkaETLJob in CDH4 with great success. Just make sure to compile your MapReduce job against CDH4.

On Thu, Mar 14, 2013 at 10:28 PM, David Arthur <[EMAIL PROTECTED]> wrote:

> I have used KafkaETLJob to write a job that consumes from Kafka and writes
> to HDFS. Kafka version 0.7.2 rc5 and CDH 4.1.2.
>
> Is anything in particular not working?
>
> -David
>
> On 3/14/13 5:31 PM, Matt Lieber wrote:
>
>> Just curious, were you able to make Camus work with CDH4 then?
>>
>> Cheers,
>> Matt

--
Matthew Rathbone
Foursquare | Software Engineer | Server Engineering Team
[EMAIL PROTECTED] | @rathboma <http://twitter.com/rathboma> | 4sq <http://foursquare.com/rathboma>

Matthew Rathbone 2013-03-15, 16:20

Re: Is anyone able to consume from Kafka 0.7.x and write into Hadoop CDH 4.x ?

> David, we use a subset of the KafkaETLJob in cdh4 with great success. Just
> make sure to compile your mapreduce against CDH4
>
> On Thu, Mar 14, 2013 at 10:28 PM, David Arthur <[EMAIL PROTECTED]> wrote:
>
> > I have used KafkaETLJob to write a job that consumes from Kafka and writes
> > to HDFS. Kafka version 0.7.2 rc5 and CDH 4.1.2.
> >
> > Is anything in particular not working?
> >
> > -David
> >
> > On 3/14/13 5:31 PM, Matt Lieber wrote:
> >
> >> Just curious, were you able to make Camus work with CDH4 then?
> >>
> >> Cheers,
> >> Matt
>
> --
> Matthew Rathbone
> Foursquare | Software Engineer | Server Engineering Team
> [EMAIL PROTECTED] | @rathboma <http://twitter.com/rathboma> | 4sq <http://foursquare.com/rathboma>

Craig Lancaster 2013-03-15, 17:44
