It looks straightforward in that it wants the output string doesn't have any XML declaration and it must contain indentation.

But I wonder how the actual output should be, suppose I have an XML node:

<p><media type="audio" id="au008093" rights="wbowned">
<title>Bee buzz</title>
</media>Most other kinds of bees live alone instead of in a colony. These bees make
tunnels in wood or in the ground. The queen makes her own nest.</p>

Could I assume the resulting String after applying the above transformation is:

You have an XML respesentation in a DOM tree.
For example you have opened an XML file and you have passed it in the DOM parser.
As a result a DOM tree in memory with your XML is created.
Now you can only access the XML info via traversal of the DOM tree.
If you need though, a String representation of the XML info of the DOM tree you use a transformation.
This happens since it is not possible to get the String representation directly from a DOM tree.
So if for example as Node node you pass in nodeToString is the root element of the XML doc then the result is a String containing the original XML data.
The tags will still be there. I.e. you will have a valid XML representation. Only this time will be in a String variable.