chukwa-dev mailing list archives

Hi Eric,
Can you create another class that takes a writer and makes it a pipeline
writer? The pipeline logic should be extracted, and the current writers
should be kept clean.
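Jerome's suggestion -- a wrapper that decorates any existing writer with the pipeline hand-off -- could be sketched roughly like this. The interfaces below are simplified stand-ins, not Chukwa's real API (the real writers take a List<Chunk> and have init()/close() as well); the class name PipelineWrapperWriter is an illustrative assumption:

```java
import java.util.List;

// Simplified stand-ins for Chukwa's writer interfaces; the real API passes
// List<Chunk> and includes init()/close(), omitted here for brevity.
interface ChukwaWriter {
    void add(List<String> chunks);
}

interface PipelineableWriter extends ChukwaWriter {
    void setNextStage(ChukwaWriter next);
}

// The wrapper Jerome proposes: it adds the pipeline hand-off around any
// plain writer, so the wrapped writer itself stays pipeline-free.
class PipelineWrapperWriter implements PipelineableWriter {
    private final ChukwaWriter wrapped;
    private ChukwaWriter next;

    PipelineWrapperWriter(ChukwaWriter wrapped) {
        this.wrapped = wrapped;
    }

    @Override
    public void setNextStage(ChukwaWriter next) {
        this.next = next;
    }

    @Override
    public void add(List<String> chunks) {
        wrapped.add(chunks);      // write locally first (e.g. to HDFS)
        if (next != null) {
            next.add(chunks);     // then mirror the same chunks downstream
        }
    }
}
```

With this shape, SeqFileWriter (or Jerome's new writer) never needs to know about pipelines; only the wrapper does.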
I'm saying that because I have a new writer implementation, and I would have
to do something similar to what you're doing for near-real-time monitoring.
Thanks,
/Jerome.
On 12/18/09 9:16 AM, "Eric Yang" <eyang@yahoo-inc.com> wrote:
> Correction, the data has been written to HDFS correctly. It was stuck at
> post data processing because the postProcess program crashed. I still need
> to determine the cause of the postProcess crash. I think the modified
> SeqFileWriter does what I wanted, and I will implement next.add() to ensure
> the ordering can be interchanged.
>
> Regards,
> Eric
>
> On 12/18/09 8:59 AM, "Eric Yang" <eyang@yahoo-inc.com> wrote:
>
>> I'd like to make a T on the incoming data. One writer goes into HDFS, and
>> another writer enables real-time pub/sub monitoring of the data. In my
>> case, the data are mirrored, not filtered. However, I am not getting the
>> right result: the data isn't getting written into HDFS regardless of
>> the ordering of the writers.
>>
>> Regards,
>> Eric
>>
>> On 12/17/09 9:53 PM, "Ariel Rabkin" <asrabkin@gmail.com> wrote:
>>
>>> What's the use case for this?
>>>
>>> The original motivation for pipelined writers was so that we could do
>>> things like filtering before data got written. Then it occurred to me
>>> that SocketTeeWriter fit fairly naturally into a pipeline.
>>>
>>> Putting it "after" the seq file writer wouldn't be too bad --
>>> SeqFileWriter.add() would need to call next.add(). But I would be
>>> hesitant to commit that change without a really clear use case.
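The change Ari describes -- SeqFileWriter extending PipelineableWriter and forwarding each batch via next.add() -- might look roughly like this. The types are simplified stand-ins (the real Chukwa writers take List<Chunk>, and the real SeqFileWriter writes a Hadoop sequence file rather than a StringBuilder), so treat this as a sketch of the forwarding logic only:

```java
import java.util.List;

// Simplified stand-ins; the real Chukwa interfaces pass List<Chunk>.
interface ChukwaWriter {
    void add(List<String> chunks);
}

abstract class PipelineableWriter implements ChukwaWriter {
    protected ChukwaWriter next;
    public void setNextStage(ChukwaWriter next) {
        this.next = next;
    }
}

class SeqFileWriter extends PipelineableWriter {
    // Stands in for the HDFS sequence file the real writer appends to.
    private final StringBuilder hdfsFile = new StringBuilder();

    @Override
    public void add(List<String> chunks) {
        for (String c : chunks) {
            hdfsFile.append(c).append('\n');  // persist to "HDFS" as before
        }
        if (next != null) {
            next.add(chunks);  // forward so a later stage (e.g. SocketTeeWriter)
                               // sees the same data
        }
    }

    String contents() {
        return hdfsFile.toString();
    }
}
```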
>>>
>>> --Ari
>>>
>>> On Thu, Dec 17, 2009 at 8:39 PM, Eric Yang <eyang@yahoo-inc.com> wrote:
>>>> It works fine after I put SocketTeeWriter first. What needs to be
>>>> implemented in SeqFileWriter for it to pipe correctly?
>>>>
>>>> Regards,
>>>> Eric
>>>>
>>>> On 12/17/09 5:26 PM, "asrabkin@gmail.com" <asrabkin@gmail.com> wrote:
>>>>
>>>>> Put the SocketTeeWriter first.
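In config terms, Ari's fix is just reversing the class list in the pipeline property Eric quotes further down the thread (assuming the same comma-separated chukwaCollector.pipeline setting):

```xml
<property>
  <name>chukwaCollector.pipeline</name>
  <value>org.apache.hadoop.chukwa.datacollection.writer.SocketTeeWriter,org.apache.hadoop.chukwa.datacollection.writer.SeqFileWriter</value>
</property>
```

This works because SocketTeeWriter already extends PipelineableWriter and forwards to the next stage, whereas SeqFileWriter as a terminal stage does not.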
>>>>>
>>>>> sent from my iPhone; please excuse typos and brevity.
>>>>>
>>>>> On Dec 17, 2009, at 8:12 PM, Eric Yang <eyang@yahoo-inc.com> wrote:
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> I'd set up SocketTeeWriter by itself, with data streaming to the next
>>>>>> socket reader program. When I tried to configure two writers, i.e.,
>>>>>> SeqFileWriter followed by SocketTeeWriter, it didn't work, because
>>>>>> SeqFileWriter doesn't extend PipelineableWriter. I went ahead and
>>>>>> extended SeqFileWriter as a PipelineableWriter, implemented the
>>>>>> setNextStage method, and configured the collector with:
>>>>>>
>>>>>> <property>
>>>>>>   <name>chukwaCollector.writerClass</name>
>>>>>>   <value>org.apache.hadoop.chukwa.datacollection.writer.PipelineStageWriter</value>
>>>>>> </property>
>>>>>>
>>>>>> <property>
>>>>>>   <name>chukwaCollector.pipeline</name>
>>>>>>   <value>org.apache.hadoop.chukwa.datacollection.writer.SeqFileWriter,org.apache.hadoop.chukwa.datacollection.writer.SocketTeeWriter</value>
>>>>>> </property>
>>>>>>
>>>>>> SeqFileWriter writes the data correctly, but when connecting to
>>>>>> SocketTeeWriter, there is no data visible in SocketTeeWriter. Its
>>>>>> commands work fine, but data streaming doesn't happen. How do I
>>>>>> configure the collector and PipelineStageWriter to write data into
>>>>>> multiple writers? Is there something in SeqFileWriter that could
>>>>>> prevent this from working?
>>>>>>
>>>>>> Regards,
>>>>>> Eric
>>>>>>
>>>>
>>>>
>>>
>>>
>>
>