Implementing XML Decoder for Apache MINA

Apache MINA has a wonderful concept, the ProtocolDecoder, for decoding protocol-specific messages. XML is one of the most widely used formats for EDA. Let's see how we can implement a protocol decoder for XML on Apache MINA.

Algorithm

The basic algorithm for constructing an XML message from bytes is described below.

The logic is simple: keep reading bytes until the XML message is balanced. Balanced here means that the end of the root element has been reached. For example, once the start tag of the root element has been read, we have to keep reading bytes until the matching end tag is received.
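As a rough illustration of the balance check, the sketch below scans the text received so far and reports where the root element closes. It is not the decoder from this post: the class XmlBalanceScanner and its methods are hypothetical names, it uses no MINA classes, and it assumes well-formed input without a DOCTYPE internal subset.

```java
/** Hypothetical helper (not a MINA class): locates the end of the root element. */
final class XmlBalanceScanner {

    /** Returns the index just past the root element's end tag, or -1 if incomplete. */
    static int endOfDocument(String xml) {
        int depth = 0;
        int i = 0;
        while (i < xml.length()) {
            int next;
            if (xml.charAt(i) != '<') {
                next = i + 1;                            // character data
            } else if (xml.startsWith("<?", i)) {
                next = skipPast(xml, "?>", i + 2);       // XML declaration or PI
            } else if (xml.startsWith("<!--", i)) {
                next = skipPast(xml, "-->", i + 4);      // comment
            } else if (xml.startsWith("<![CDATA[", i)) {
                next = skipPast(xml, "]]>", i + 9);      // CDATA section
            } else if (xml.startsWith("<!", i)) {
                next = skipPast(xml, ">", i + 2);        // simple DOCTYPE (assumption)
            } else {
                int gt = endOfTag(xml, i + 1);
                if (gt < 0) {
                    return -1;                           // tag not fully received yet
                }
                boolean endTag = xml.charAt(i + 1) == '/';
                boolean emptyTag = xml.charAt(gt - 1) == '/';
                if (endTag) {
                    depth--;
                } else if (!emptyTag) {
                    depth++;
                }
                if (depth == 0) {
                    return gt + 1;                       // root element is closed
                }
                next = gt + 1;
            }
            if (next < 0) {
                return -1;                               // construct not terminated yet
            }
            i = next;
        }
        return -1;
    }

    /** Index just past the first occurrence of terminator at or after from, or -1. */
    private static int skipPast(String xml, String terminator, int from) {
        int at = xml.indexOf(terminator, from);
        return at < 0 ? -1 : at + terminator.length();
    }

    /** Index of the '>' that ends a tag, ignoring '>' inside quoted attribute values. */
    private static int endOfTag(String xml, int from) {
        char quote = 0;
        for (int i = from; i < xml.length(); i++) {
            char c = xml.charAt(i);
            if (quote != 0) {
                if (c == quote) {
                    quote = 0;
                }
            } else if (c == '"' || c == '\'') {
                quote = c;
            } else if (c == '>') {
                return i;
            }
        }
        return -1;
    }
}
```

A return value of -1 corresponds to "keep reading"; any other value marks the point up to which the accumulated bytes form one complete document.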

It is important to note that large XML packets sent over TCP may get fragmented, and we will then receive a corresponding number of read events when using Apache MINA's low-level APIs.

Situations like this, where we need to wait for the data to arrive completely, call for the use of CumulativeProtocolDecoder. As the name suggests, the decoder accumulates data until we have balanced XML. Once the balanced XML is found, we write the parsed object to the output, to be processed further.

The implementation is pretty straightforward. We take each character and try to match it against the constructs defined in the XML specification.

Some key things in the implementation:
1. The decode function (doDecode()) just collects bytes until we have a balanced XML document.
2. Once we have a balanced XML document, we call the abstract function parseXML(). The function has been kept abstract so that it is easy to implement custom parsing with the desired XML library, such as JAXB or JiBX.
3. We have to return true from doDecode() the moment we have balanced XML. Returning true tells the framework that we are not waiting for any more data for this message. Returning false makes the framework keep accumulating data until we write a message to the output. Now it should be clear why it is called a cumulative decoder.
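The accumulate-then-emit contract in the points above can be sketched without any MINA classes. This is only a stand-in to show the control flow, not the post's decoder: CumulativeXmlBuffer, onFragment, and decodeOne are hypothetical names, and the function that finds the end of a document is passed in as an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToIntFunction;

/** Hypothetical stand-in for a cumulative decoder; it does not use MINA classes. */
final class CumulativeXmlBuffer {

    private final StringBuilder pending = new StringBuilder();
    private final List<String> output = new ArrayList<>();

    // Assumed helper: returns the index just past a complete document, or -1.
    private final ToIntFunction<String> endOfDocument;

    CumulativeXmlBuffer(ToIntFunction<String> endOfDocument) {
        this.endOfDocument = endOfDocument;
    }

    /** Called once per read event with whatever fragment arrived. */
    void onFragment(String fragment) {
        pending.append(fragment);
        // Keep decoding while complete documents are available,
        // so several documents in one read are all emitted.
        while (pending.length() > 0 && decodeOne()) {
            // decodeOne() already moved one document to the output
        }
    }

    /**
     * Mirrors the true/false contract described above:
     * true  = one balanced document was written to the output,
     * false = not enough data yet, keep what we have and wait.
     */
    private boolean decodeOne() {
        int end = endOfDocument.applyAsInt(pending.toString());
        if (end < 0) {
            return false;
        }
        output.add(pending.substring(0, end));
        pending.delete(0, end);   // leftover bytes stay for the next document
        return true;
    }

    /** Documents decoded so far, in arrival order. */
    List<String> output() {
        return output;
    }
}
```

In a real CumulativeProtocolDecoder the framework owns the accumulation and re-invokes doDecode(); the loop in onFragment only imitates that behaviour so the sketch can run on its own.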

Still have questions? Please leave a comment and I will get back to you.

19 thoughts on “Implementing XML Decoder for Apache MINA”

I have tried to implement a protocol using the CumulativeProtocolDecoder. Then I wanted to test it using JUnit, but when I simulate a two-step decoding (a fragmented packet) I lose the data that came from the first call to doDecode … Is the ioBuffer cleared after returning false?

Nope, the ioBuffer is not cleared after returning false. Returning false is an indicator to keep buffering until we receive the complete payload. I tested the code above with 4 fragmented packets and it worked fine. If you can send the code, I can give it a try.

Yup, this is a problem, and unfortunately I don’t have a solution as of now. My thinking on handling this is to keep tabs on the XML size and disconnect violating clients. Also, since we know the type of XML we are handling, it should be easy. There should be a way around this. Let me check the async web implementation for this. It would be a good idea to post this query in the MINA forum as well.

I am the one who started the thread “http://www.nabble.com/MINA—design-guidelines-td20077627.html#a20095703”

Do you have any updates on how to handle clients sending very long XML documents? Also, any thoughts on how to process multiple XML documents that arrive as part of a single read?

What are your thoughts on the following ideas?
1) Keep reading the data until all the xml documents are read and constructed and then process the xml documents one at a time and write the result back to client session.

2) Create a new thread as soon as an xml document is read and process the constructed xml document using the new thread.

Could you please post the complete code (server and client implementation, including the encoders and decoders) for testing the XMLDecoder program in the blog?

Hi,
I want to use a TCP MINA endpoint to read an XML file. I have written a custom XML codec by implementing the ProtocolCodecFactory interface. It has two unimplemented methods, for the ProtocolDecoder and the ProtocolEncoder, and I just implemented these interfaces. When I try to get the entire XML, I am not able to. It is broken into 4 to 5 pieces, and I receive 4 to 5 files at the file endpoint. What could be the problem?

Thanks avinash. But I don’t have any decoding logic in the decode method of my ProtocolDecoder. My requirement is to receive the complete XML file as one file at the file endpoint. My decode method keeps getting called until the buffer is empty, but I want the decode method to be called only once. Can you send me sample code to do that?

Thanks ashish for the quick reply for new beginners.
In the post the decoder extends CumulativeProtocolDecoder, but I am implementing ProtocolDecoder in my case.
Here is the complete code. Please help me work out how I can get the full XML file at the file endpoint.

The decode method of XMLDecoder is called 4 to 5 times. In the outbox folder, the uploaded XML file is split into 4 or 5 files.
Tell me how to receive the uploaded file in the outbox folder as a single file. Please tell me where I am making a mistake.

Thank you for posting this example. It was very helpful — even several years later. I also came across your StaxXmlDecoder in the mina email archives (2009).

I’m sure you know this by now, but for those who find that email (or only care about the string in the above): the extra characters after the end element are due to ioBuffer.array() returning the whole backing array, including the bytes between the position marked by limit() and the end (i.e., capacity()).

I dealt with this by extracting the substring of interest from the resulting string. I’m new to MINA (and NIO), so I don’t know whether this is the best approach. Since I have one XML doc per connection, it should be OK for me.
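The array() behaviour mentioned in this comment can be reproduced with a plain java.nio.ByteBuffer. The sketch below assumes MINA's IoBuffer behaves the same way for its backing array; BufferText, readable, and wholeArray are hypothetical names used only for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper showing why array() can yield trailing junk characters. */
final class BufferText {

    /** Decodes only the readable bytes, from position() up to limit(). */
    static String readable(ByteBuffer buf) {
        return new String(
                buf.array(),
                buf.arrayOffset() + buf.position(),
                buf.remaining(),
                StandardCharsets.UTF_8);
    }

    /** The pitfall: decodes the entire backing array, up to capacity(). */
    static String wholeArray(ByteBuffer buf) {
        return new String(buf.array(), StandardCharsets.UTF_8);
    }
}
```

Decoding only the range from position() to limit() avoids having to trim the string afterwards, which matters once more than one document can arrive on a connection.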