đŸ•ˇī¸ Crawler Inspector

URL Lookup

Direct Parameter Lookup

Raw Queries and Responses

1. Shard Calculation

Query:
Response:
Calculated Shard: 143 (from laksa053)

2. Crawled Status Check

Query:
Response:

3. Robots.txt Check

Query:
Response:

4. Spam/Ban Check

Query:
Response:

5. Seen Status Check

â„šī¸ Skipped - page is already crawled

đŸšĢ
NOT INDEXABLE
✅
CRAWLED
7 months ago
🤖
ROBOTS ALLOWED

Page Info Filters

FilterStatusConditionDetails
HTTP statusPASSdownload_http_code = 200HTTP 200
Age cutoffFAILdownload_stamp > now() - 6 MONTH7.5 months ago
History dropPASSisNull(history_drop_reason)No drop reason
Spam/banPASSfh_dont_index != 1 AND ml_spam_score = 0ml_spam_score=0
CanonicalPASSmeta_canonical IS NULL OR = '' OR = src_unparsedNot set

Page Details

PropertyValue
URLhttps://kirk86.github.io/2017/08/python-multiprocessing/
Last Crawled2025-08-29 07:21:40 (7 months ago)
First Indexed2017-08-26 04:38:51 (8 years ago)
HTTP Status Code200
Meta TitleRandom thoughts — Multiprocessing in python
Meta Descriptionnull
Meta Canonicalnull
Boilerpipe Text
In this blog post I'll try to cover the difference between the functionality found in the different functions that the multiprocessing package offers in python. The multiprocessing package offers the ability to work with multiple threads and if you wish with multiplce cores. First I'll try to cover the multiple core scenario and explain the difference between map, map_async , imap, imap_unorderd , apply, apply_async . In a later blog post I'll try to cover the multithreaded scenario. First I'll start with map/map_async and imap/imap_unorderd . map operates on function that accepts only one arguement, it is concurrent, it blocks your code, meaning nothing else can be executed until its finished and finally it returns your results in an ordered manner. map consumes your iterable by converting the iterable to a list assuming it isn't already, breaking it into chunks, and sending those chunks to the worker processes in the Pool . Breaking the iterable into chunks performs better than passing each item in the iterable between processes one item at a time - particularly if the iterable is large. However, turning the iterable into a list in order to chunk it can have a very high memory cost, since the entire list will need to be kept in memory. map_async on the other hand can operate on a function accepting multiple arguements it is not concurrent, it allows excution of code while it is still running, and finally it doesn't return your results in an ordered manner. imap doesn't turn the iterable you give it into a list, nor does break it into chunks by default. It will iterate over the iterable one element at a time, and send them each to a worker process. This means you don't take the memory hit of converting the whole iterable to a list, but it also means the performance is slower for large iterables, because of the lack of chunking. This can be mitigated by passing a chunksize argument larger than default of 1, however. The other major difference between imap/imap_unordered and map/map_async , is that with imap/imap_unordered , you can start receiving results from workers as soon as they're ready, rather than having to wait for all of them to be finished. With map_async , an AsyncResult is returned right away, but you can't actually retrieve results from that object until all of them have been processed, at which points it returns the same list that map does, basically map is actually implemented internally as map_async(...).get() . There's no way to get partial results, you either have the entire result, or nothing. imap and imap_unordered both return iterables right away. With imap , the results will be yielded from the iterable as soon as they're ready, while still preserving the ordering of the input iterable. With imap_unordered , results will be yielded as soon as they're ready, regardless of the order of the input iterable. To summarize the key differences between imap/imap_unordered and map/map_async are: 1. The way they consume the iterable you pass to them. 2. The way they return the result back to you. apply sends a single task off to a worker process, and then blocks until it's complete. apply_async sends a single task off to a work process, and then immediately returns an AsyncResult object, which can be used to wait for the task to finish and retrieve the result. apply is implemented by simply calling apply_async(...).get() . With apply_async , if an exception occurs inside of your calling function, you won't know about it unless you explicitly call Pool['process_x'].get() on the failing AsyncResult object, which would require iterating over all of Pool . With map_async the exception will be raised if you call ~async_result.get() - no iteration required. map_async has built-in chunking functionality, which will make your code perform noticeably better if arguement list is very large. Here is a final table summarizing all of those properties and diffferencies: Multi-args Concurrence Blocking Ordered-results map no yes yes yes map-async no yes no yes apply yes no yes no apply-async yes yes no no imap no yes yes yes imap-async no yes no no
Markdownnull
Readable Markdownnull
Shard143 (laksa)
Root Hash2566890010099092343
Unparsed URLio,github!kirk86,/2017/08/python-multiprocessing/ s443