Description

The _findPath method of Products.PortalTransforms.TransformEngine performs and scales very poorly, especially as the number of mimetypes supported by transforms grows. For instance, if you have an image transform that supports, say, 15 mimetypes as both input and output, the current naive algorithm will do about 15! (factorial 15 = 1,307,674,368,000) loops.
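To give a sense of scale, the factorial figure can be checked directly. The comparison bound below is an assumption added for illustration (a search that visits each mimetype once and looks at each input/output pair at most once); it is not a measurement of the actual code.

```python
import math

n_types = 15  # number of mimetypes in the example above

# Number of orderings of 15 distinct mimetypes: 15!
orderings = math.factorial(n_types)

# Assumed bound for a search that visits each mimetype once and
# inspects each (input, output) pair at most once.
visit_once_bound = n_types + n_types * n_types

print(orderings)         # 1307674368000
print(visit_once_bound)  # 240
```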

The attached patch proposes two levels of optimization:

1) simple optimization heuristics (early path pruning) in _getPaths (this changes the semantics of _getPaths slightly, but that shouldn't matter)

2) a better shortest-path finding algorithm in _findPath (which, BTW, no longer calls _getPaths, so _getPaths could then be removed)

If you would rather not deal with the proposed path-finding algorithm, please note that you may ignore this second level of optimization (the changes in _findPath) and commit only the first one (the changes in _getPaths). The improvement in scalability and performance will already be significant.
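For readers unfamiliar with the idea, here is a minimal sketch of a shortest-path search over transforms using breadth-first search. It is not the code from the patch: the function name find_shortest_chain, the shape of the transforms mapping, and the transform names are all hypothetical and chosen only for illustration.

```python
from collections import deque


def find_shortest_chain(transforms, source, target):
    """Return the shortest list of transform names from source to target.

    `transforms` is a hypothetical mapping: name -> (input mimetypes, output).
    Returns [] if source == target and None if no chain exists.
    """
    if source == target:
        return []

    # Index the transforms by input mimetype for quick lookup.
    by_input = {}
    for name, (inputs, output) in transforms.items():
        for mimetype in inputs:
            by_input.setdefault(mimetype, []).append((name, output))

    visited = {source}
    queue = deque([(source, [])])
    while queue:
        mimetype, chain = queue.popleft()
        for name, output in by_input.get(mimetype, []):
            if output in visited:
                continue  # already reached by a chain at least as short
            new_chain = chain + [name]
            if output == target:
                return new_chain
            visited.add(output)
            queue.append((output, new_chain))
    return None


# Hypothetical transforms, for illustration only.
example_transforms = {
    "plain_to_html": (["text/plain"], "text/html"),
    "html_to_pdf": (["text/html"], "application/pdf"),
    "plain_to_pdf_slow": (["text/plain", "text/html"], "application/x-unused"),
}
chain = find_shortest_chain(example_transforms, "text/plain", "application/pdf")
print(chain)  # ['plain_to_html', 'html_to_pdf']
```

Because each mimetype is marked as visited the first time it is reached, the work grows with the number of mimetypes and transforms rather than with the number of possible paths.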

Even in some simple cases the new algorithm didn't provide correct results. We got a test failure in linkintegrity showing that the reStructuredText transform wasn't called anymore. This is a rather simple transform chain from:

text/x-rst -> text/html -> text/x-html-safe

The first step is done by the rest transform and the second by the safe-html transform. The new algorithm found only the first step and failed at the second.

Details:
We are iterating over typesToStartFrom (line 402 of the patched version) while removing items from that same typesToStartFrom list (line 421).
I am checking whether the removal line (typesToStartFrom.remove(startingType)) can simply be dropped or whether something else is needed. I will keep you informed.
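The general Python pitfall can be shown in isolation. This sketch does not reproduce the patched _findPath; the function names and the list contents are hypothetical. It only shows that removing from a list while iterating over it makes the loop skip elements.

```python
def visit_while_removing(items):
    """Iterate over a list while removing from it (the buggy pattern)."""
    visited = []
    for item in items:
        visited.append(item)
        items.remove(item)  # shifts later items left, so the next one is skipped
    return visited


def visit_over_copy(items):
    """Iterate over a copy, so removals do not disturb the loop."""
    visited = []
    for item in list(items):
        visited.append(item)
        items.remove(item)
    return visited


types = ["text/x-rst", "text/html", "text/plain"]
skipped = visit_while_removing(list(types))
complete = visit_over_copy(list(types))
print(skipped)   # ['text/x-rst', 'text/plain']
print(complete)  # ['text/x-rst', 'text/html', 'text/plain']
```

In the first function the middle entry is never visited, which is the kind of silent skip that can make a search miss an intermediate step.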

Cool. Since the algorithm isn't quite straightforward, it would be good to get decent test coverage of it. Something simple that just tests for the correct results given a set of transforms and various transform paths. I think test_engine would be a good place for that.

Here (attachment above) is a revised version of the patch. It fixes the bug in the previous version (see my comment above), and it also contains a unit test.
This unit test passes against both the original version and the patched version (the patch is just a performance optimization after all, and performance is not tested in this unit test). It fails against the first, buggy version of this patch.
The other PortalTransforms and linkintegrity tests also pass with this new version of the patch.
It should be fine. If not, the unit test will help catch any further regression.