🕷️ Crawler Inspector

Raw Queries and Responses

1. Shard Calculation

Query:
Response:
Calculated Shard: 181 (from laksa000)

2. Crawled Status Check

Query:
Response:

3. Robots.txt Check

Query:
Response:

4. Spam/Ban Check

Query:
Response:

5. Seen Status Check

Query:
Response:
📄 INDEXABLE
NOT CRAWLED
🤖 ROBOTS ALLOWED
👁️ SEEN

Page Info Filters

Filter | Status | Condition | Details
HTTP status | N/A | download_http_code = 200 | Not crawled
Age cutoff | N/A | download_stamp > now() - 6 MONTH | Not crawled
History drop | N/A | isNull(history_drop_reason) | Not crawled
Spam/ban | PASS | fh_dont_index != 1 AND ml_spam_score = 0 | ml_spam_score=0
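
The four filter conditions above can be sketched in Python as a rough illustration. This is not the inspector's actual code: the function name evaluate_filters, the dict-shaped inputs, the "FAIL" status, and the 183-day approximation of 6 MONTH are all assumptions. Only the column names and conditions come from the table.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Assumption: "6 MONTH" is approximated as 183 days for this sketch.
AGE_CUTOFF = timedelta(days=183)


def evaluate_filters(row: Optional[dict], spam: dict,
                     now: Optional[datetime] = None) -> dict:
    """Return {filter name: (status, details)} for one URL.

    row  -- hypothetical crawl record, or None when the page was not crawled
    spam -- hypothetical record holding fh_dont_index and ml_spam_score
    """
    now = now or datetime.now(timezone.utc)
    results = {}

    if row is None:
        # Without a crawl record the three crawl-based filters cannot be judged.
        for name in ("HTTP status", "Age cutoff", "History drop"):
            results[name] = ("N/A", "Not crawled")
    else:
        http_ok = row["download_http_code"] == 200
        age_ok = row["download_stamp"] > now - AGE_CUTOFF
        drop_ok = row["history_drop_reason"] is None
        results["HTTP status"] = (
            "PASS" if http_ok else "FAIL",
            f"download_http_code={row['download_http_code']}",
        )
        results["Age cutoff"] = (
            "PASS" if age_ok else "FAIL",
            f"download_stamp={row['download_stamp'].isoformat()}",
        )
        results["History drop"] = (
            "PASS" if drop_ok else "FAIL",
            f"history_drop_reason={row['history_drop_reason']}",
        )

    # The spam/ban filter does not depend on the page having been crawled.
    spam_ok = spam["fh_dont_index"] != 1 and spam["ml_spam_score"] == 0
    results["Spam/ban"] = (
        "PASS" if spam_ok else "FAIL",
        f"ml_spam_score={spam['ml_spam_score']}",
    )
    return results
```

Calling evaluate_filters(None, {"fh_dont_index": 0, "ml_spam_score": 0}) reproduces the statuses shown in the table: N/A for the three crawl-based filters and PASS for Spam/ban.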

Page Details

Property | Value
URL | https://ecommons.cornell.edu/bitstreams/42174463-3aa7-4e34-8af3-47c12d1f8348/download
Shard | 181 (laksa)
Root Hash | 14620342054419313781
Unparsed URL | edu,cornell!ecommons,/bitstreams/42174463-3aa7-4e34-8af3-47c12d1f8348/download s443
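
The Unparsed URL value appears to be a reversed-host key: the registered domain labels reversed and comma-joined, then "!", the subdomain, the path, and a scheme/port suffix. The sketch below reconstructs that layout from this single row only. The function name unparsed_url_key, the registered_labels parameter, the ordering of multiple subdomain labels, and the reading of "s443" as "https on port 443" are all guesses, not the crawler's documented format. Query strings and non-https schemes are not handled because the row shows neither.

```python
from urllib.parse import urlsplit


def unparsed_url_key(url: str, registered_labels: int = 2) -> str:
    """Build a key shaped like the 'Unparsed URL' row (inferred layout)."""
    parts = urlsplit(url)
    if parts.scheme != "https" or parts.hostname is None:
        # Only the https form is visible in the source, so nothing else is guessed.
        raise ValueError("only absolute https URLs are covered by this sketch")

    labels = parts.hostname.split(".")
    registered = labels[-registered_labels:]   # e.g. ['cornell', 'edu']
    subdomain = labels[:-registered_labels]    # e.g. ['ecommons']

    domain_part = ",".join(reversed(registered))
    # Assumption: multiple subdomain labels would also be reversed.
    sub_part = ",".join(reversed(subdomain))

    port = parts.port or 443
    return f"{domain_part}!{sub_part},{parts.path} s{port}"


key = unparsed_url_key(
    "https://ecommons.cornell.edu/bitstreams/"
    "42174463-3aa7-4e34-8af3-47c12d1f8348/download"
)
print(key)
```

For the URL in the table this yields the same string as the Unparsed URL row. A real implementation would likely need a public suffix list rather than a fixed label count to split the registered domain.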