You can use the "avro" utility that comes when you install the Python package (or fastavro if you need 3.X support). Then run "avro cat --print-schema /path/to/avro/file".

On Sat, Nov 17, 2012 at 5:41 AM, ranjith raghunath <[EMAIL PROTECTED]> wrote:

> I could really use some advice on this topic.
>
> I am pulling files in avro format from an external source (outside of the
> cluster). How can I generate the avro schema file? The end goal is to have
> it exposed in Hive.
>
> Thanks,
> Ranjith

Thanks for the response. When you say avro tools, you mean avro-tools-.....jar, right?

Let me also run the flow by all of you. Use sqoop to download data from an rdbms to avro format. Use avro tools to extract the schema file. Use the avro serde to generate/update the hive table. So this would eliminate the need for statically mapping the fields in hive.

Is this flow one that makes sense?
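For the last step of that flow, here is a rough sketch of the kind of Hive DDL the Avro SerDe expects once the schema file exists. It only builds the statement as a string. The table name, data location and schema URL in the usage note are hypothetical placeholders; the SerDe and input/output format class names are the ones shipped with Hive's Avro support, and it is assumed that the schema file has been copied somewhere Hive can read it.

```python
import re

# Hive identifier, optionally qualified with a database name (db.table).
_IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)?")


def avro_table_ddl(table, data_location, schema_url):
    """Build a CREATE EXTERNAL TABLE statement backed by the Hive Avro SerDe.

    No columns are listed: the SerDe derives them from the schema that
    'avro.schema.url' points to, so fields are not mapped by hand.
    """
    if not _IDENT.fullmatch(table):
        raise ValueError(f"unexpected table name: {table!r}")
    for value in (data_location, schema_url):
        if "'" in value:
            raise ValueError(f"single quote not allowed in: {value!r}")
    return (
        f"CREATE EXTERNAL TABLE {table}\n"
        "ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'\n"
        "STORED AS\n"
        "  INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'\n"
        "  OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'\n"
        f"LOCATION '{data_location}'\n"
        f"TBLPROPERTIES ('avro.schema.url'='{schema_url}');"
    )
```

For example, avro_table_ddl("orders", "/data/orders", "hdfs:///schemas/orders.avsc") returns a statement that could be passed to the hive command line; all three arguments here are made-up values.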


> Thanks for response. When you say avro tools you mean avro-tools-.....jar
> right?
>
> Let me also run the flow by all of you. Use sqoop to download data from an
> rdbms to avro format. Use avro tools to extract schema file. Use avro serde
> to generate/update hive table. So this would eliminate the need for
> statically mapping the fields in hive.
>
> Is this flow one that makes sense?
> On Nov 17, 2012 9:36 AM, "Miki Tebeka" <[EMAIL PROTECTED]> wrote:
>
>> You can use the "avro" utility that comes when you install the Python
>> package (or fastavro if you need 3.X support). Then run "avro cat
>> --print-schema /path/to/avro/file".
>>
>>
>> On Sat, Nov 17, 2012 at 5:41 AM, ranjith raghunath <
>> [EMAIL PROTECTED]> wrote:
>>
>>> I could really use some advice on this topic.
>>>
>>> I am pulling files in avro format from an external source (outside of
>>> the cluster). How can I generate the avro schema file? The end goal is to
>>> have it exposed in Hive.
>>>
>>> Thanks,
>>> Ranjith

> I mean the Python tools (easy_install avro).
>
>
> On Sat, Nov 17, 2012 at 7:46 AM, ranjith raghunath <
> [EMAIL PROTECTED]> wrote:
>
>> Thanks for response. When you say avro tools you mean avro-tools-.....jar
>> right?
>>
>> Let me also run the flow by all of you. Use sqoop to download data from
>> an rdbms to avro format. Use avro tools to extract schema file. Use avro
>> serde to generate/update hive table. So this would eliminate the need for
>> statically mapping the fields in hive.
>>
>> Is this flow one that makes sense?
>> On Nov 17, 2012 9:36 AM, "Miki Tebeka" <[EMAIL PROTECTED]> wrote:
>>
>>> You can use the "avro" utility that comes when you install the Python
>>> package (or fastavro if you need 3.X support). Then run "avro cat
>>> --print-schema /path/to/avro/file".
>>>
>>>
>>> On Sat, Nov 17, 2012 at 5:41 AM, ranjith raghunath <
>>> [EMAIL PROTECTED]> wrote:
>>>
>>>> I could really use some advice on this topic.
>>>>
>>>> I am pulling files in avro format from an external source (outside of
>>>> the cluster). How can I generate the avro schema file? The end goal is to
>>>> have it exposed in Hive.
>>>>
>>>> Thanks,
>>>> Ranjith
