ℹ️ Skipped - page is already crawled
| Filter | Status | Condition | Details |
|---|---|---|---|
| HTTP status | PASS | download_http_code = 200 | HTTP 200 |
| Age cutoff | PASS | download_stamp > now() - 6 MONTH | 0 months ago |
| History drop | PASS | isNull(history_drop_reason) | No drop reason |
| Spam/ban | PASS | fh_dont_index != 1 AND ml_spam_score = 0 | ml_spam_score=0 |
| Canonical | PASS | meta_canonical IS NULL OR = '' OR = src_unparsed | Not set |
| Property | Value |
|---|---|
| URL | https://chryswoods.com/parallel_python/async_map.html |
| Last Crawled | 2026-04-06 05:26:39 (17 hours ago) |
| First Indexed | 2020-11-25 08:22:05 (5 years ago) |
| HTTP Status Code | 200 |
| Meta Title | chryswoods.com | Part 2: Asynchronous Mapping |
| Meta Description | null |
| Meta Canonical | null |
| Markdown |
# Part 2: Asynchronous Mapping
Asynchronous functions allow you to give different tasks to different members of the `multiprocessing.Pool`. However, submitting function calls one at a time is not very efficient. It would be good to combine mapping with asynchronous functions, i.e. to give several mapping tasks to the pool of workers simultaneously. Fortunately, `Pool.map_async` provides exactly that: an asynchronous parallel map.
Create a new Python script called `asyncmap.py` and copy into it
```
from functools import reduce
from multiprocessing import Pool, current_process
import time


def add(x, y):
    """Return the sum of the arguments"""
    print("Worker %s is processing add(%s, %s)" % (current_process().pid, x, y))
    time.sleep(1)
    return x + y


def product(x, y):
    """Return the product of the arguments"""
    print("Worker %s is processing product(%s, %s)" % (current_process().pid, x, y))
    time.sleep(1)
    return x * y


if __name__ == "__main__":
    a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    b = [11, 12, 13, 14, 15, 16, 17, 18, 19, 20]

    # Now create a Pool of workers
    with Pool() as pool:
        # Each call returns a future immediately, so both maps run together
        sum_future = pool.starmap_async(add, zip(a, b))
        product_future = pool.starmap_async(product, zip(a, b))

        # Wait inside the with block, as leaving it terminates the pool
        sum_future.wait()
        product_future.wait()

    total_sum = reduce(lambda x, y: x + y, sum_future.get())
    total_product = reduce(lambda x, y: x + y, product_future.get())

    print("Sum of sums of 'a' and 'b' is %s" % total_sum)
    print("Sum of products of 'a' and 'b' is %s" % total_product)
```
Running this script, e.g. via `python asyncmap.py`, should result in something like
```
Worker 32722 is processing add(1, 11)
Worker 32723 is processing add(2, 12)
Worker 32724 is processing add(3, 13)
Worker 32725 is processing add(4, 14)
Worker 32722 is processing add(5, 15)
Worker 32724 is processing add(6, 16)
Worker 32725 is processing add(7, 17)
Worker 32723 is processing add(8, 18)
Worker 32722 is processing add(9, 19)
Worker 32724 is processing add(10, 20)
Worker 32725 is processing product(1, 11)
Worker 32723 is processing product(2, 12)
Worker 32722 is processing product(3, 13)
Worker 32723 is processing product(4, 14)
Worker 32724 is processing product(5, 15)
Worker 32725 is processing product(6, 16)
Worker 32722 is processing product(7, 17)
Worker 32725 is processing product(8, 18)
Worker 32723 is processing product(9, 19)
Worker 32724 is processing product(10, 20)
Sum of sums of 'a' and 'b' is 210
Sum of products of 'a' and 'b' is 935
```
This script provides two functions, `add` and `product`, which are mapped asynchronously using the `Pool.starmap_async` function. This is the variant of `Pool.map_async` that unpacks each `(x, y)` tuple produced by `zip(a, b)` into separate arguments. Both behave like the `Pool.map` function that you used before, except that the map is performed asynchronously. This means that the resulting list is returned in a future (in this case, the futures `sum_future` and `product_future`). The results are waited for using the `.wait()` functions, which are called inside the `with` block because leaving the block terminates the pool, so we must not exit it until all results are available. Then, the results of mapping are retrieved using the `.get()` function of the futures.
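The script above calls `starmap_async` rather than `map_async` because `add` and `product` each take two arguments. The following is a minimal sketch of the difference, not part of the original script. The names `add_pair` and `sums_with_both_maps` are assumptions made for this illustration, while `map_async`, `starmap_async` and `.get()` are the standard `multiprocessing.Pool` calls.

```python
from multiprocessing import Pool


def add(x, y):
    """Return the sum of the arguments"""
    return x + y


def add_pair(pair):
    """Hypothetical helper: unpack one (x, y) tuple and add its members"""
    x, y = pair
    return x + y


def sums_with_both_maps(pool, a, b):
    """Return the sums of a and b computed via map_async and starmap_async"""
    # map_async passes each item to the function as a single argument
    via_map = pool.map_async(add_pair, list(zip(a, b)))
    # starmap_async unpacks each tuple into separate positional arguments
    via_starmap = pool.starmap_async(add, list(zip(a, b)))
    # get() blocks until the results are ready and returns them in input order
    return via_map.get(), via_starmap.get()


if __name__ == "__main__":
    with Pool() as pool:
        print(sums_with_both_maps(pool, [1, 2, 3], [11, 12, 13]))
```

Both futures should hold the same list of sums. The only difference is whether the function receives one tuple or two separate arguments.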
## Chunking
With a short list of work like this one, the pool divides the work over its workers by assigning pieces of work one by one. In the example above, the work to be performed was:
```
add(1, 11)
add(2, 12)
add(3, 13)
etc.
add(10, 20)
product(1, 11)
product(2, 12)
product(3, 13)
etc.
product(10, 20)
```
The work was assigned one by one to the four workers on my computer, i.e. the first worker process was given `add(1, 11)`, the second `add(2, 12)`, the third `add(3, 13)` and then the fourth `add(4, 14)`. The first worker to finish was then given `add(5, 15)`, the next was given `add(6, 16)`, and so on.
Giving work one by one can be very inefficient for quick tasks, as the time a worker process needs to stop and fetch new work can be longer than the time it takes to complete the task. To solve this problem, you can control how many work items are handed to each worker process at a time. This is known as chunking, and the number of work items handed out together is known as the chunk size.
You can control the number of work items to perform per worker (the chunk size) by setting the `chunksize` argument, e.g.
```
sum_future = pool.starmap_async(add, zip(a, b), chunksize=5)
```
would suggest to `pool` that each worker be given a chunk of five pieces of work. Note that this is just a suggestion, and `pool` may decide to use a slightly smaller or larger chunk size depending on the amount of work and the number of workers available.
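To make the idea of a chunk concrete, here is a pure-Python sketch of how a list of work items could be grouped into chunks of five. It only illustrates the grouping and is not the code that `multiprocessing` itself runs; the function name `split_into_chunks` is an assumption made for this example.

```python
from itertools import islice


def split_into_chunks(items, chunksize):
    """Group items into consecutive lists of at most chunksize items"""
    iterator = iter(items)
    chunks = []
    while True:
        # Take the next chunksize items (fewer if the input is running out)
        chunk = list(islice(iterator, chunksize))
        if not chunk:
            return chunks
        chunks.append(chunk)


if __name__ == "__main__":
    a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    b = [11, 12, 13, 14, 15, 16, 17, 18, 19, 20]

    # Each chunk is the batch of work handed to one worker in one go
    for chunk in split_into_chunks(zip(a, b), 5):
        print(chunk)
```

With ten pairs and a chunk size of five this gives two chunks per map, so the two asynchronous maps together produce four chunks of work.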
Modify your `asyncmap.py` script and set the `chunksize` to 5 for both of the asynchronous maps for `add` and `product`. Re-run your script. You should see something like:
```
Worker 658 is processing add(1, 11)
Worker 659 is processing add(6, 16)
Worker 660 is processing product(1, 11)
Worker 661 is processing product(6, 16)
Worker 659 is processing add(7, 17)
Worker 660 is processing product(2, 12)
Worker 661 is processing product(7, 17)
Worker 658 is processing add(2, 12)
Worker 660 is processing product(3, 13)
Worker 659 is processing add(8, 18)
Worker 661 is processing product(8, 18)
Worker 658 is processing add(3, 13)
Worker 660 is processing product(4, 14)
Worker 659 is processing add(9, 19)
Worker 661 is processing product(9, 19)
Worker 658 is processing add(4, 14)
Worker 659 is processing add(10, 20)
Worker 660 is processing product(5, 15)
Worker 661 is processing product(10, 20)
Worker 658 is processing add(5, 15)
Sum of sums of 'a' and 'b' is 210
Sum of products of 'a' and 'b' is 935
```
My laptop has four workers. The first worker is assigned the first five items of work, i.e. `add(1, 11)` to `add(5, 15)`, and it starts by running `add(1, 11)`, which is why `add(1, 11)` is printed first.
The next worker is given the next five items of work, i.e. `add(6, 16)` to `add(10, 20)`, and starts by running `add(6, 16)`, which is why `add(6, 16)` is printed second.
The next worker is given the next five items of work, i.e. `product(1, 11)` to `product(5, 15)`, and it starts by running `product(1, 11)`, which is why this is printed third.
The last worker is given the next five items of work, i.e. `product(6, 16)` to `product(10, 20)`, and it starts by running `product(6, 16)`, which is why this is printed fourth.
Once each worker has finished its first item of work, it moves on to its second. This is why `add(2, 12)`, `add(7, 17)`, `product(2, 12)` and `product(7, 17)` are printed next. Then each worker moves on to its third piece of work, and so on.
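If you don’t specify the `chunksize` then the pool chooses one for you. In CPython, the `map` and `starmap` family of functions derive it from the number of work items and the number of workers, and for the ten-item lists used here it works out to `1`. When writing a new script you should experiment with different values of `chunksize` to find the value that gives the best performance.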
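As a rough guide to what happens when `chunksize` is left out, the sketch below mirrors the calculation that CPython's `multiprocessing/pool.py` uses for the `map` and `starmap` family at the time of writing. This is an implementation detail that may change between versions, and the function name `default_chunksize` is an assumption made for this illustration.

```python
def default_chunksize(n_items, n_workers):
    """Estimate the chunk size chosen when chunksize is not given"""
    if n_items == 0:
        return 0
    # Aim for roughly four chunks per worker, rounding up
    chunksize, extra = divmod(n_items, n_workers * 4)
    if extra:
        chunksize += 1
    return chunksize


if __name__ == "__main__":
    for n_items in (10, 100, 1000):
        print(n_items, "items on 4 workers ->", default_chunksize(n_items, 4))
```

For the ten-item lists and four workers used on this page the estimate is 1, which matches the one-by-one assignment seen in the first run.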
***
## Exercise
Edit the script you wrote in answer to [exercise 2 of Parallel Map/Reduce](https://chryswoods.com/parallel_python/mapreduce_part2.html), in which you count all of the words used in all Shakespeare plays (an example answer [is here](https://chryswoods.com/parallel_python/mapreduce2_answer2.html)).
Edit the script so that you use an asynchronous map to distribute the work over the pool. This will free up the master process to give feedback to the user of the script, e.g. to print a progress or status message while the work is running to reassure the user that the script has not frozen. For example
```
while not future.ready():
    print("Work is in progress...")
    time.sleep(0.1)
```
Add a status message to your script to reassure the user that your script hasn’t frozen while it is processing.
Note that you can run your script using `python -u countwords.py shakespeare/*`. The `-u` argument stops Python from buffering text written to standard output.
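The polling loop can also be wrapped in a small reusable helper. The following is a minimal sketch under stated assumptions rather than the answer to the exercise: `slow_square` and `map_with_status` are hypothetical names, and `slow_square` simply stands in for a slow piece of work. It uses `future.wait(timeout)` in place of `time.sleep`, and `flush=True` as an alternative to running Python with `-u`.

```python
import time
from multiprocessing import Pool


def slow_square(x):
    """Hypothetical stand-in for a slow piece of work"""
    time.sleep(0.5)
    return x * x


def map_with_status(pool, func, items, interval=0.1):
    """Map func over items asynchronously, reporting progress while waiting"""
    future = pool.map_async(func, items)

    while not future.ready():
        # flush=True pushes the message out immediately, even without python -u
        print("Work is in progress...", flush=True)
        # wait() returns early if the results arrive before the interval ends
        future.wait(interval)

    return future.get()


if __name__ == "__main__":
    with Pool() as pool:
        print(map_with_status(pool, slow_square, [1, 2, 3, 4]))
```

Waiting on the future rather than sleeping means the loop ends as soon as the results are available instead of finishing the current sleep first.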
If you get stuck or want inspiration, a possible answer is given [here](https://chryswoods.com/parallel_python/async_map_answer1.html).
|
| Readable Markdown | null |
| Shard | 133 (laksa) |
| Root Hash | 1202965565032433533 |
| Unparsed URL | com,chryswoods!/parallel_python/async_map.html s443 |