Examples

The output from all the example programs from PyMOTW has been
generated with Python 2.7.4, unless otherwise noted. Some
of the features described here may not be available in earlier
versions of Python.

As with threads, a common use pattern for multiple processes is to
divide a job up among several workers to run in parallel. Effective
use of multiple processes usually requires some communication between
them, so that work can be divided and results can be aggregated.

A simple way to communicate between processes with
multiprocessing is to use a Queue to pass messages
back and forth. Any pickle-able object can pass through a
Queue.

import multiprocessing


class MyFancyClass(object):

    def __init__(self, name):
        self.name = name

    def do_something(self):
        proc_name = multiprocessing.current_process().name
        print 'Doing something fancy in %s for %s!' % (proc_name, self.name)


def worker(q):
    obj = q.get()
    obj.do_something()


if __name__ == '__main__':
    queue = multiprocessing.Queue()

    p = multiprocessing.Process(target=worker, args=(queue,))
    p.start()

    queue.put(MyFancyClass('Fancy Dan'))

    # Wait for the worker to finish
    queue.close()
    queue.join_thread()
    p.join()

This short example only passes a single message to a single worker,
then the main process waits for the worker to finish.

A more complex example shows how to manage several workers consuming
data from a JoinableQueue and passing results back to the
parent process. The poison pill technique is used to stop the
workers. After setting up the real tasks, the main program adds one
“stop” value per worker to the job queue. When a worker encounters
the special value, it breaks out of its processing loop. The main
process uses the task queue’s join() method to wait for all of
the tasks to finish before processing the results.

import multiprocessing
import time


class Consumer(multiprocessing.Process):

    def __init__(self, task_queue, result_queue):
        multiprocessing.Process.__init__(self)
        self.task_queue = task_queue
        self.result_queue = result_queue

    def run(self):
        proc_name = self.name
        while True:
            next_task = self.task_queue.get()
            if next_task is None:
                # Poison pill means shutdown
                print '%s: Exiting' % proc_name
                self.task_queue.task_done()
                break
            print '%s: %s' % (proc_name, next_task)
            answer = next_task()
            self.task_queue.task_done()
            self.result_queue.put(answer)
        return


class Task(object):

    def __init__(self, a, b):
        self.a = a
        self.b = b

    def __call__(self):
        time.sleep(0.1)  # pretend to take some time to do the work
        return '%s * %s = %s' % (self.a, self.b, self.a * self.b)

    def __str__(self):
        return '%s * %s' % (self.a, self.b)


if __name__ == '__main__':
    # Establish communication queues
    tasks = multiprocessing.JoinableQueue()
    results = multiprocessing.Queue()

    # Start consumers
    num_consumers = multiprocessing.cpu_count() * 2
    print 'Creating %d consumers' % num_consumers
    consumers = [Consumer(tasks, results)
                 for i in xrange(num_consumers)]
    for w in consumers:
        w.start()

    # Enqueue jobs
    num_jobs = 10
    for i in xrange(num_jobs):
        tasks.put(Task(i, i))

    # Add a poison pill for each consumer
    for i in xrange(num_consumers):
        tasks.put(None)

    # Wait for all of the tasks to finish
    tasks.join()

    # Start printing results
    while num_jobs:
        result = results.get()
        print 'Result:', result
        num_jobs -= 1

Although the jobs enter the queue in order, their execution is
parallelized, so there is no guarantee about the order in which they
will be completed.

The Event class provides a simple way to communicate state
information between processes. An event can be toggled between set
and unset states. Users of the event object can wait for it to change
from unset to set, using an optional timeout value.

import multiprocessing
import time


def wait_for_event(e):
    """Wait for the event to be set before doing anything"""
    print 'wait_for_event: starting'
    e.wait()
    print 'wait_for_event: e.is_set()->', e.is_set()


def wait_for_event_timeout(e, t):
    """Wait t seconds and then timeout"""
    print 'wait_for_event_timeout: starting'
    e.wait(t)
    print 'wait_for_event_timeout: e.is_set()->', e.is_set()


if __name__ == '__main__':
    e = multiprocessing.Event()
    w1 = multiprocessing.Process(name='block',
                                 target=wait_for_event,
                                 args=(e,))
    w1.start()

    w2 = multiprocessing.Process(name='non-block',
                                 target=wait_for_event_timeout,
                                 args=(e, 2))
    w2.start()

    print 'main: waiting before calling Event.set()'
    time.sleep(3)
    e.set()
    print 'main: event is set'

When wait() times out it returns without an error. The caller
is responsible for checking the state of the event using
is_set().
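That check can be seen in isolation with a minimal sketch (the timeout values here are arbitrary):

```python
import multiprocessing

if __name__ == '__main__':
    e = multiprocessing.Event()

    # wait() with a timeout returns quietly when the time expires;
    # no exception is raised and the event is still unset
    e.wait(0.1)
    print('after timeout: is_set() -> %s' % e.is_set())

    e.set()
    e.wait(0.1)  # returns immediately now that the event is set
    print('after set:     is_set() -> %s' % e.is_set())
```

Because the timed-out and successful cases return the same way, code that uses a timeout should always follow the wait() with an is_set() check.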

Condition objects can be used to synchronize parts of a
workflow so that some run in parallel but others run sequentially,
even if they are in separate processes.

import multiprocessing
import time


def stage_1(cond):
    """perform first stage of work, then notify stage_2 to continue"""
    name = multiprocessing.current_process().name
    print 'Starting', name
    with cond:
        print '%s done and ready for stage 2' % name
        cond.notify_all()


def stage_2(cond):
    """wait for the condition telling us stage_1 is done"""
    name = multiprocessing.current_process().name
    print 'Starting', name
    with cond:
        cond.wait()
        print '%s running' % name


if __name__ == '__main__':
    condition = multiprocessing.Condition()
    s1 = multiprocessing.Process(name='s1',
                                 target=stage_1,
                                 args=(condition,))
    s2_clients = [
        multiprocessing.Process(name='stage_2[%d]' % i,
                                target=stage_2,
                                args=(condition,))
        for i in range(1, 3)
    ]

    for c in s2_clients:
        c.start()
        time.sleep(1)
    s1.start()

    s1.join()
    for c in s2_clients:
        c.join()

In this example, two processes run the second stage of a job in
parallel, but only after the first stage is done.

Sometimes it is useful to allow more than one worker access to a
resource at a time, while still limiting the overall number. For
example, a connection pool might support a fixed number of
simultaneous connections, or a network application might support a
fixed number of concurrent downloads. A Semaphore is one way
to manage those connections.

In this example, the ActivePool class simply serves as a
convenient way to track which processes are running at a given
moment. A real resource pool would probably allocate a connection or
some other value to the newly active process, and reclaim the value
when the task is done. Here, the pool is just used to hold the names
of the active processes to show that only three are running
concurrently.
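The approach described above can be sketched as follows. This is a reconstruction, not the original listing: the ActivePool constructor signature and the timing values are assumptions made for the sketch, with the Manager-backed list and lock created in the main program and passed in.

```python
import multiprocessing
import random
import time


class ActivePool(object):
    """Track which workers are active, via a Manager-backed list."""

    def __init__(self, active_list, lock):
        self.active = active_list  # proxy for a list created by a Manager
        self.lock = lock

    def makeActive(self, name):
        with self.lock:
            self.active.append(name)

    def makeInactive(self, name):
        with self.lock:
            self.active.remove(name)

    def __str__(self):
        with self.lock:
            return str(self.active)


def worker(s, pool):
    name = multiprocessing.current_process().name
    with s:  # the semaphore admits at most three workers at once
        pool.makeActive(name)
        print('Now running: %s' % str(pool))
        time.sleep(random.random() * 0.1)  # simulate using the resource
        pool.makeInactive(name)


if __name__ == '__main__':
    mgr = multiprocessing.Manager()
    pool = ActivePool(mgr.list(), multiprocessing.Lock())
    s = multiprocessing.Semaphore(3)
    jobs = [
        multiprocessing.Process(target=worker, name=str(i), args=(s, pool))
        for i in range(10)
    ]
    for j in jobs:
        j.start()
    for j in jobs:
        j.join()
```

Acquiring the semaphore with the `with` statement blocks a worker until one of the three slots is free, so at no point does the printed list contain more than three names.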

In the previous example, the list of active processes is maintained
centrally in the ActivePool instance via a special type of
list object created by a Manager. The Manager is
responsible for coordinating shared information state between all of
its users.
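As a minimal illustration of that coordination, the sketch below shares a Manager-backed dictionary instead of a list; the worker function and names here are invented for the example and are not part of the original article.

```python
import multiprocessing


def worker(d, key, value):
    # each child process writes into the shared, Manager-backed dict
    d[key] = value


if __name__ == '__main__':
    mgr = multiprocessing.Manager()
    d = mgr.dict()
    jobs = [
        multiprocessing.Process(target=worker, args=(d, i, i * 2))
        for i in range(5)
    ]
    for j in jobs:
        j.start()
    for j in jobs:
        j.join()
    print('Results: %s' % dict(d))
```

The Manager runs a server process holding the real dictionary; each child talks to it through a proxy, so updates made in one process are visible to all of the others.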

The Pool class can be used to manage a fixed number of
workers for simple cases where the work to be done can be broken up
and distributed between workers independently. The return values from
the jobs are collected and returned as a list. The pool arguments
include the number of processes and a function to run when starting
the task process (invoked once per child).

import multiprocessing


def do_calculation(data):
    return data * 2


def start_process():
    print 'Starting', multiprocessing.current_process().name


if __name__ == '__main__':
    inputs = list(range(10))
    print 'Input   :', inputs

    builtin_outputs = map(do_calculation, inputs)
    print 'Built-in:', builtin_outputs

    pool_size = multiprocessing.cpu_count() * 2
    pool = multiprocessing.Pool(processes=pool_size,
                                initializer=start_process,
                                )
    pool_outputs = pool.map(do_calculation, inputs)
    pool.close()  # no more tasks
    pool.join()   # wrap up current tasks

    print 'Pool    :', pool_outputs

The result of the map() method is functionally equivalent to the
built-in map(), except that individual tasks run in parallel.
Since the pool is processing its inputs in parallel, close() and
join() can be used to synchronize the main process with the task
processes to ensure proper cleanup.

By default Pool creates a fixed number of worker processes
and passes jobs to them until there are no more jobs. Setting the
maxtasksperchild parameter tells the pool to restart a worker
process after it has finished a few tasks. This can be used to avoid
having long-running workers consume ever more system resources.

import multiprocessing


def do_calculation(data):
    return data * 2


def start_process():
    print 'Starting', multiprocessing.current_process().name


if __name__ == '__main__':
    inputs = list(range(10))
    print 'Input   :', inputs

    builtin_outputs = map(do_calculation, inputs)
    print 'Built-in:', builtin_outputs

    pool_size = multiprocessing.cpu_count() * 2
    pool = multiprocessing.Pool(processes=pool_size,
                                initializer=start_process,
                                maxtasksperchild=2,
                                )
    pool_outputs = pool.map(do_calculation, inputs)
    pool.close()  # no more tasks
    pool.join()   # wrap up current tasks

    print 'Pool    :', pool_outputs

The pool restarts the workers when they have completed their allotted
tasks, even if there is no more work. In this output, eight workers
are created, even though there are only 10 tasks, and each worker
completes at most two of them before it is replaced.