Will Processing in QGIS 3 support parallelization?

Hi,

I'm currently writing a script that could benefit a lot from
parallelization. Will Processing in QGIS 3 provide parallelization support?
Or is it better to look into how to achieve parallelization independent of
Processing, e.g. using Python's multiprocessing library?

Re: Will Processing in QGIS 3 support parallelization?

On 3 January 2017 at 18:19, Anita Graser <[hidden email]> wrote:
> Hi,
>
> I'm currently writing a script that could benefit a lot from
> parallelization. Will Processing in QGIS 3 provide parallelization support?
> Or is it better to look into how to achieve parallelization independent of
> Processing, e.g. using Python's multiprocessing library?

Depends what you're after. Once Processing is ported to the new task
manager framework, algorithms will be able to run in parallel with
each other (where possible). E.g. a buffer for one layer can run while
a transform occurs on another layer.

If you're after parallelization *within* a single algorithm (e.g.
buffering features using multiple threads), then I'm unaware of any
plans in place to handle this.

* That said.... read on for some thinking aloud....

I think this will become relatively straightforward once we port the
guts of Processing over to C++. I'd see it happening like this:
- Algorithms which operate feature-by-feature inherit from a special
  algorithm subclass (say "QgsFeatureBasedAlgorithm" or something) and
  override a base class "QgsFeature processFeature( QgsFeature )"
  method. E.g. a buffer alg would implement this to buffer the passed
  feature's geometry and return a new, modified feature.
- QgsFeatureBasedAlgorithm could take advantage of something like
  QtConcurrent::mappedReduced to call processFeature on multiple
  threads and use the reduce function to write out the result of each
  processFeature call.

That would be a nice and (theoretically) easy way to gain
multithreaded algorithms, and it would be simple to adapt many
existing algorithms to it (buffer, centroid, transform, translate,
... basically anything which operates on each feature in isolation).
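To make the idea concrete, here is a minimal sketch of that pattern in plain Python rather than C++/Qt. Everything in it is an assumption for illustration: the Feature class is a stand-in for QgsFeature with a one-dimensional "geometry", and FeatureBasedAlgorithm, BufferAlgorithm and run() are hypothetical names, not an existing QGIS API. It uses a thread pool only to show the shape of the map step and the single-threaded write-out step; in standard CPython, threads will not speed up pure-Python, CPU-bound work.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Feature:
    """Hypothetical stand-in for QgsFeature: an id plus a 1D extent (min, max)."""
    fid: int
    extent: tuple


class FeatureBasedAlgorithm:
    """Hypothetical base class: subclasses only override process_feature()."""

    def process_feature(self, feature):
        raise NotImplementedError

    def run(self, features, max_workers=4):
        sink = []
        with ThreadPoolExecutor(max_workers=max_workers) as pool:
            # "map" step: process_feature() runs on worker threads.
            # "reduce" step: results are written to the sink by the calling
            # thread only, in input order, so the sink needs no locking.
            for result in pool.map(self.process_feature, features):
                sink.append(result)
        return sink


class BufferAlgorithm(FeatureBasedAlgorithm):
    """Grows each feature's extent by a fixed distance on both sides."""

    def __init__(self, distance):
        self.distance = distance

    def process_feature(self, feature):
        lo, hi = feature.extent
        # Return a new, modified feature; the input is left untouched.
        return replace(feature, extent=(lo - self.distance, hi + self.distance))


features = [Feature(1, (0.0, 1.0)), Feature(2, (5.0, 7.0))]
buffered = BufferAlgorithm(0.5).run(features)
```

The point of the design is that an algorithm author only writes process_feature(), which touches one feature in isolation, while the base class decides how the calls are scheduled and how results are written out.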

A side benefit of this refactoring is that it would allow something
I've wanted for a while: a way for processing algorithms to modify a
selection inside a layer "in place". E.g. select a bunch of polygons,
run the buffer alg on the selection (not sure of the best UI to expose
this!) and each selected feature will be buffered. Currently there's
no easy way to do this in QGIS: you've got to run the alg on a
selection and get a new layer, then delete the selection, and finally
copy features back from the new output layer to the source layer. Yuck.

Re: Will Processing in QGIS 3 support parallelization?

Hi Nyall,

Interesting.

Your "thinking aloud": is this something that would have to wait for QGIS 4.x, or could it be integrated in 3.x already? Just wondering ...

From a user point of view it would be very interesting to have processing within a layer in update mode without having to create separate new layers. Also, of course, the parallelization option (for certain algorithms where feasible).

Re: Will Processing in QGIS 3 support parallelization?

On 3 January 2017 at 18:19, Anita Graser <[hidden email]> wrote:
> I'm currently writing a script that could benefit a lot from
> parallelization. Will Processing in QGIS 3 provide parallelization support?
> Or is it better to look into how to achieve parallelization independent of
> Processing, e.g. using Python's multiprocessing library?

Depends what you're after. When processing is ported to the new task
manager framework then algorithms will be able to run in parallel
(where possible). Eg a buffer for one layer can run while a transform
occurs on another layer.

If you're after parallelization *within* a single algorithm (Eg
buffering features using multiple threads) then I'm unaware of any
plans in place to handle this.

Yes, definitely within a single algorithm.

Feature-by-feature parallelization would certainly be helpful for many algorithms.

For my current use case, I'd like to process groups of features (identified by common attribute value) in parallel ... but that's probably more complex and less common.
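As a rough sketch of what I mean, independent of Processing: group the features by the attribute, then hand each group to a worker process. This assumes the needed attributes have first been copied out of the layer into plain Python dicts, since whatever is sent to a worker process has to be picklable. The field names ("track_id", "length") and the functions group_by_attribute, total_length and process_groups are made up for illustration.

```python
from collections import defaultdict
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def group_by_attribute(features, field):
    """Bucket feature dicts by the value of one attribute."""
    groups = defaultdict(list)
    for feature in features:
        groups[feature[field]].append(feature)
    return dict(groups)


def total_length(group):
    """Hypothetical per-group job. It is defined at module level so that
    it can be pickled and sent to worker processes."""
    return sum(feature["length"] for feature in group)


def process_groups(features, field, job, executor_cls=ProcessPoolExecutor):
    """Run job once per group, one group per task; returns {key: result}.

    Pass executor_cls=ThreadPoolExecutor to debug without extra processes.
    """
    groups = group_by_attribute(features, field)
    keys = list(groups)
    with executor_cls() as pool:
        # map() preserves input order, so results line up with keys.
        results = list(pool.map(job, [groups[key] for key in keys]))
    return dict(zip(keys, results))


if __name__ == "__main__":
    demo = [
        {"track_id": "a", "length": 1.0},
        {"track_id": "b", "length": 2.0},
        {"track_id": "a", "length": 3.0},
    ]
    by_process = process_groups(demo, "track_id", total_length)
    by_thread = process_groups(demo, "track_id", total_length, ThreadPoolExecutor)
    assert by_process == by_thread
    print(by_process)
```

The groups are independent of each other, so no locking is needed; the cost is that every group has to be serialized to the worker and the result serialized back, which only pays off when the per-group work is heavy.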

Re: [Qgis-developer] Will Processing in QGIS 3 support parallelization?

Has there been any change to the status of parallelization support?

If nobody is working on built-in support in Processing, I'd still be interested in working examples of custom implementations for individual algorithms - or failing that - warnings about what won't work (so I don't waste time trying).