Then I'm just storing a Date (timestamp) with setTimestamp(...), but I'm not sure if readFields(...) is implemented correctly (because of paramIn.readLine()). I assume I have to append a newline in write(...).

On 10/07/2010 05:41 PM, Ted Yu wrote:
> Since mFormatter.format() returns a String, you don't need to introduce
> newline.
> You can call paramOut.writeUTF() to save the String and call
> paramIn.readUTF() to read it back.
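
The writeUTF()/readUTF() suggestion above could look roughly like the following. This is only a sketch: the class name, the date pattern and the roundTrip helper are assumptions (the thread does not show how mFormatter is configured), and the class only mirrors the write/readFields signatures of Hadoop's Writable instead of implementing it, so that it compiles without Hadoop on the classpath. Since writeUTF() stores a length prefix, no newline or other delimiter is needed.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

// Hypothetical key class, not the poster's actual code.
class TimestampKeySketch {
    // Assumed pattern with second precision; milliseconds are not preserved.
    private final SimpleDateFormat mFormatter =
        new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss'Z'");
    private Date mTimestamp = new Date(0L);

    TimestampKeySketch() {
        mFormatter.setTimeZone(TimeZone.getTimeZone("UTC"));
    }

    void setTimestamp(final Date paramTimestamp) {
        mTimestamp = paramTimestamp;
    }

    Date getTimestamp() {
        return mTimestamp;
    }

    // Same signature as Writable.write(DataOutput).
    void write(final DataOutput paramOut) throws IOException {
        // writeUTF() is length-prefixed, so no trailing newline is required.
        paramOut.writeUTF(mFormatter.format(mTimestamp));
    }

    // Same signature as Writable.readFields(DataInput).
    void readFields(final DataInput paramIn) throws IOException {
        try {
            mTimestamp = mFormatter.parse(paramIn.readUTF());
        } catch (final ParseException e) {
            throw new IOException("Malformed timestamp", e);
        }
    }

    // Helper for illustration: serialize to a byte array and read it back.
    static Date roundTrip(final Date paramOriginal) {
        try {
            final TimestampKeySketch source = new TimestampKeySketch();
            source.setTimestamp(paramOriginal);
            final ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            source.write(new DataOutputStream(bytes));

            final TimestampKeySketch copy = new TimestampKeySketch();
            copy.readFields(new DataInputStream(
                new ByteArrayInputStream(bytes.toByteArray())));
            return copy.getTimestamp();
        } catch (final IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

If the textual form is not needed, writing the value with writeLong(date.getTime()) and reading it with readLong() would avoid the formatter altogether.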

Ok, that's what I did right after replying. My value is a List<XMLEvent>, or more precisely now it's a ListWritable which extends ArrayWritable:

I'm currently writing the XMLEventWritable class, which will be a slightly bigger thing (serialization depends on the EventType). I haven't coded any serialization/deserialization of objects before, so I have for example something like

Yes, my approach is to parse a very big XML file (Wikipedia revisions) with StAX in my RecordReader implementation. The key is a timestamp and the values are List<XMLEventWritable>s, because I don't want to have to set up a new StAX parser in every Map. But OK, I assume the cost of setting up StAX parsers is negligible, so I can write Text values instead of List<XMLEventWritable> values.

You can use an XMLOutputFactory to create an XMLEventWriter, and then use an XMLEventFactory to create events that can then be written to the XMLEventWriter.

On Thu, Oct 7, 2010 at 4:05 PM, Johannes.Lichtenberger <[EMAIL PROTECTED]> wrote:
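
A minimal sketch of the XMLOutputFactory / XMLEventWriter / XMLEventFactory combination described in this message, using only the javax.xml.stream API that ships with the JDK. The class name, the method name and the "revision" element are made up for illustration; namespaces and attributes are left out.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLEventFactory;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;

// Hypothetical helper, not part of the poster's code.
class EventWriterSketch {
    // Builds three events with an XMLEventFactory and writes them
    // through an XMLEventWriter into a String.
    static String writeRevision(final String paramText) {
        try {
            final StringWriter out = new StringWriter();
            final XMLEventWriter writer =
                XMLOutputFactory.newInstance().createXMLEventWriter(out);
            final XMLEventFactory events = XMLEventFactory.newInstance();

            // Arguments are (prefix, namespaceUri, localName).
            writer.add(events.createStartElement("", "", "revision"));
            writer.add(events.createCharacters(paramText));
            writer.add(events.createEndElement("", "", "revision"));
            writer.close();

            return out.toString();
        } catch (final XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The resulting String could then be stored in a Text value, which is the alternative to List<XMLEventWritable> discussed in this thread.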

I just rethought the List with Writable XMLEvents. I think it would probably be better; otherwise I have to create an XMLEventReader in the RecordReader, Mapper and Reducer, and convert the events back to Text every time, whereas a List of XMLEvents would be sufficient. I'm currently not sure.

On 10/08/2010 02:38 AM, Johannes.Lichtenberger wrote:
> On 10/08/2010 01:29 AM, Ted Yu wrote:
>> http://download.oracle.com/javase/6/docs/api/javax/xml/stream/XMLEventWriter.html
>>
>> You can use an XMLOutputFactory to create an XMLEventWriter, and then use an
>> XMLEventFactory to create events that can then be written to the
>> XMLEventWriter.
>
> I just rethought about the List with Writable XMLEvents. I think it
> would probably be better, otherwise I have to create an XMLEventReader
> in the RecordReader, Mapper and Reducer, and everytime convert the
> events back to Text, whereas a List of XMLEvents would be sufficient.
> I'm currently not sure.

Hm, maybe I can just call mEvent.writeAsEncodedUnicode(writer), then output the writer, but then I'm not entirely sure how to do the reverse thing, reading the input (how to implement readFields(...)).

Hm, I work with StAX really often, and I'm using the event reader in my custom input format to create the records and produce a List of XMLEvents, or more precisely now a ListWritable, which has a member variable List<XMLEventWritable> and extends ArrayWritable:

But my problem now is readFields(final DataInput paramIn). I assume I have to create an XMLEventFactory, but I'm not sure how to determine the type of the event and so on (normally event.getEventType()). So I would have to get the whole input:

final String line = paramIn.readUTF();

and parse it to see whether it's a start tag, end tag, comment, characters... and then create the appropriate event?

Baaah, horrible ;-)
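
One way to avoid re-parsing the text in readFields(...) would be to write the event type as an int in front of the payload, so that the reading side can switch on it and ask an XMLEventFactory for the matching event. The following is only a sketch under several assumptions: the class and method names are invented, only four event types are handled, and attributes, namespaces and prefixes are dropped. It uses plain static methods on DataOutput/DataInput instead of a Hadoop Writable so that it compiles without Hadoop.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import javax.xml.stream.XMLEventFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.events.Comment;
import javax.xml.stream.events.XMLEvent;

// Hypothetical codec, not the poster's XMLEventWritable.
class XMLEventCodecSketch {
    // Writes the event type first, then a type-dependent payload.
    // Note: writeUTF() fails for strings longer than 65535 encoded bytes,
    // so long text nodes would need a different encoding.
    static void write(final XMLEvent paramEvent, final DataOutput paramOut)
        throws IOException {
        final int type = paramEvent.getEventType();
        final String payload;
        switch (type) {
            case XMLStreamConstants.START_ELEMENT:
                payload = paramEvent.asStartElement().getName().getLocalPart();
                break;
            case XMLStreamConstants.END_ELEMENT:
                payload = paramEvent.asEndElement().getName().getLocalPart();
                break;
            case XMLStreamConstants.CHARACTERS:
                payload = paramEvent.asCharacters().getData();
                break;
            case XMLStreamConstants.COMMENT:
                payload = ((Comment) paramEvent).getText();
                break;
            default:
                throw new IOException("Unsupported event type: " + type);
        }
        paramOut.writeInt(type);
        paramOut.writeUTF(payload);
    }

    // Reads the type tag and rebuilds the event through an XMLEventFactory.
    static XMLEvent read(final DataInput paramIn) throws IOException {
        final XMLEventFactory factory = XMLEventFactory.newInstance();
        final int type = paramIn.readInt();
        final String payload = paramIn.readUTF();
        switch (type) {
            case XMLStreamConstants.START_ELEMENT:
                return factory.createStartElement("", "", payload);
            case XMLStreamConstants.END_ELEMENT:
                return factory.createEndElement("", "", payload);
            case XMLStreamConstants.CHARACTERS:
                return factory.createCharacters(payload);
            case XMLStreamConstants.COMMENT:
                return factory.createComment(payload);
            default:
                throw new IOException("Unsupported event type: " + type);
        }
    }

    // Helper for illustration: serialize one event and read it back.
    static XMLEvent roundTrip(final XMLEvent paramEvent) {
        try {
            final ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            write(paramEvent, new DataOutputStream(bytes));
            return read(new DataInputStream(
                new ByteArrayInputStream(bytes.toByteArray())));
        } catch (final IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A complete version would also have to carry attributes and namespace declarations for start elements, which is where most of the remaining work would be.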
