🕷️ Crawler Inspector

URL Lookup

Direct Parameter Lookup

Raw Queries and Responses

1. Shard Calculation

Query:
Response:
Calculated Shard: 99 (from laksa039)

2. Crawled Status Check

Query:
Response:

3. Robots.txt Check

Query:
Response:

4. Spam/Ban Check

Query:
Response:

5. Seen Status Check

Query:
Response:
🚫 NOT INDEXABLE
NOT CRAWLED
🚫 ROBOTS BLOCKED
👁️ SEEN

Page Info Filters

Filter       | Status | Condition                                        | Details
HTTP status  | N/A    | download_http_code = 200                         | Not crawled
Age cutoff   | N/A    | download_stamp > now() - 6 MONTH                 | Not crawled
History drop | N/A    | isNull(history_drop_reason)                      | Not crawled
Spam/ban     | FAIL   | fh_dont_index != 1 AND ml_spam_score = 0         | fh_dont_index=1, ml_spam_score=0
Canonical    | N/A    | meta_canonical IS NULL OR = '' OR = src_unparsed | Not crawled
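
The filter conditions above can be read as a small set of predicates over one page record. The following is a minimal sketch of that reading, not the inspector's actual implementation. The function name `evaluate_filters`, the `crawled` flag, and the use of 183 days as a stand-in for "6 MONTH" are assumptions made for illustration; only the field names and conditions come from the table. It also assumes that the spam/ban filter is evaluated even for uncrawled pages, since the table shows FAIL for it while every other filter shows N/A.

```python
from datetime import datetime, timedelta, timezone


def evaluate_filters(page, crawled, now=None):
    """Return a {filter name: "PASS" | "FAIL" | "N/A"} mapping for one page record.

    `page` is a plain dict keyed by the column names shown in the table.
    """
    now = now or datetime.now(timezone.utc)
    # Assumption: approximate "now() - 6 MONTH" with a fixed 183-day window.
    cutoff = now - timedelta(days=183)

    results = {}

    # Spam/ban: fh_dont_index != 1 AND ml_spam_score = 0
    # Assumption: checked regardless of crawl status, as in the example row.
    spam_ok = page.get("fh_dont_index") != 1 and page.get("ml_spam_score") == 0
    results["Spam/ban"] = "PASS" if spam_ok else "FAIL"

    # The remaining filters depend on download data, so they are N/A
    # when the page has not been crawled.
    crawl_checks = {
        "HTTP status": lambda p: p.get("download_http_code") == 200,
        "Age cutoff": lambda p: p.get("download_stamp") is not None
        and p["download_stamp"] > cutoff,
        "History drop": lambda p: p.get("history_drop_reason") is None,
        "Canonical": lambda p: p.get("meta_canonical")
        in (None, "", p.get("src_unparsed")),
    }
    for name, check in crawl_checks.items():
        if not crawled:
            results[name] = "N/A"
        else:
            results[name] = "PASS" if check(page) else "FAIL"

    return results


# Record matching the Details column: fh_dont_index=1, ml_spam_score=0, not crawled.
example = evaluate_filters({"fh_dont_index": 1, "ml_spam_score": 0}, crawled=False)
```

With the values from the Details column, this sketch yields FAIL for Spam/ban and N/A for the four crawl-dependent filters, which matches the Status column.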

Page Details

Property     | Value
URL          | https://www.theguardian.com/technology/2011/oct/06/steve-jobs-pancreas-cancer
Shard        | 99 (laksa)
Root Hash    | 4161074618625082499
Unparsed URL | com,theguardian!www,/technology/2011/oct/06/steve-jobs-pancreas-cancer s443
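
The Unparsed URL value looks like a reversed-host key: the domain labels reversed and comma-separated, a `!`, the subdomain labels, the path, and a scheme/port suffix. The sketch below reproduces that one observed example and nothing more. The function name `to_unparsed_key` is hypothetical, the rule "the domain is the last two host labels" is an assumption (a real system would likely consult a public suffix list), and the handling of query strings and non-HTTPS URLs is unknown, so the sketch ignores the former and rejects the latter.

```python
from urllib.parse import urlsplit


def to_unparsed_key(url):
    """Build a reversed-host key in the style of the 'Unparsed URL' row.

    Inferred from a single example; treat every rule here as an assumption.
    """
    parts = urlsplit(url)
    if parts.scheme != "https":
        # Only the https form ("s443") appears in the source.
        raise ValueError("only https URLs are covered by this sketch")

    labels = parts.hostname.split(".")
    # Assumption: the domain is the last two labels, e.g. theguardian.com.
    domain = ",".join(reversed(labels[-2:]))
    # Remaining labels (e.g. "www") follow the "!", each ending in a comma.
    subdomain = "".join(label + "," for label in reversed(labels[:-2]))

    port = parts.port or 443
    return f"{domain}!{subdomain}{parts.path} s{port}"


key = to_unparsed_key(
    "https://www.theguardian.com/technology/2011/oct/06/steve-jobs-pancreas-cancer"
)
```

For the URL in the table, `key` equals the Unparsed URL value shown there. Hosts with multi-part suffixes such as `.co.uk` would need a different domain rule than the two-label assumption used here.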