🕷️ Crawler Inspector

Raw Queries and Responses

1. Shard Calculation

Query:
Response:
Calculated Shard: 199 (from laksa027)

2. Crawled Status Check

Query:
Response:

3. Robots.txt Check

Query:
Response:

4. Spam/Ban Check

Query:
Response:

5. Seen Status Check

ℹ️ Skipped - page is already crawled

📍 LOCATION: Host 199 · Partition 44 · laksa199 · 2488006646444648999
📄 INDEXABLE: CRAWLED · 1 month ago
🤖 ROBOTS ALLOWED

Page Info Filters

Filter | Status | Condition | Details
HTTP status | PASS | download_http_code = 200 | HTTP 200
Age cutoff | PASS | download_stamp > now() - 6 MONTH | 1.3 months ago
History drop | PASS | isNull(history_drop_reason) | No drop reason
Spam/ban | PASS | fh_dont_index != 1 AND ml_spam_score = 0 | ml_spam_score=0
Canonical | PASS | meta_canonical IS NULL OR = '' OR = src_unparsed | Not set
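
The five conditions above can be read as boolean checks on a single page record. The following is a minimal sketch of that reading, not the tool's actual implementation: the column names come from the Condition column, while the function name `evaluate_filters`, the 183-day approximation of "6 MONTH", the `fh_dont_index` value, and the reference time `now` are assumptions made for illustration.

```python
from datetime import datetime, timedelta


def evaluate_filters(page, now):
    """Return {filter name: passed?} for the five page info filters."""
    canonical = page.get("meta_canonical")
    return {
        "HTTP status": page.get("download_http_code") == 200,
        # Assumption: "6 MONTH" is approximated as 183 days.
        "Age cutoff": page["download_stamp"] > now - timedelta(days=183),
        "History drop": page.get("history_drop_reason") is None,
        "Spam/ban": page.get("fh_dont_index") != 1
        and page.get("ml_spam_score") == 0,
        # Passes when no canonical is set or it points back at the page itself.
        "Canonical": canonical is None
        or canonical == ""
        or canonical == page.get("src_unparsed"),
    }


# Values taken from this report where shown; the rest are labeled assumptions.
page = {
    "download_http_code": 200,
    "download_stamp": datetime(2026, 4, 24, 12, 19, 46),
    "history_drop_reason": None,
    "fh_dont_index": 0,  # assumed; the report only shows ml_spam_score=0
    "ml_spam_score": 0,
    "meta_canonical": None,
    "src_unparsed": "fr,paris!catacombes,www,/en s443",
}
now = datetime(2026, 6, 3)  # hypothetical reference time, about 1.3 months later

results = evaluate_filters(page, now)
for name, passed in results.items():
    print(f"{name}: {'PASS' if passed else 'FAIL'}")
```

Note that SQL treats comparisons against NULL differently from Python's `None`, so a faithful port would need to decide how missing `fh_dont_index` values should behave.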

Page Details

Property: Value
URL: https://www.catacombes.paris.fr/en
Last Crawled: 2026-04-24 12:19:46 (1 month ago)
First Indexed: 2020-01-23 16:20:03 (6 years ago)
HTTP Status Code: 200
Content
Meta Title: The Paris Catacombs | Official website
Meta Description: null
Meta Canonical: null
Boilerpipe Text: heavy column, fetched on demand
Markdown: heavy column, fetched on demand
Readable Markdown: heavy column, fetched on demand
ML Classification
ML Categories
/Arts_and_Entertainment: 93.2%
/Arts_and_Entertainment/Events_and_Listings: 47.8%
/Arts_and_Entertainment/Events_and_Listings/Expos_and_Conventions: 24.6%
/Travel_and_Transportation: 23.9%
/Travel_and_Transportation/Tourist_Destinations: 23.3%
/Travel_and_Transportation/Tourist_Destinations/Historical_Sites_and_Buildings: 23.1%
Raw JSON
{
    "/Arts_and_Entertainment": 932,
    "/Arts_and_Entertainment/Events_and_Listings": 478,
    "/Arts_and_Entertainment/Events_and_Listings/Expos_and_Conventions": 246,
    "/Travel_and_Transportation": 239,
    "/Travel_and_Transportation/Tourist_Destinations": 233,
    "/Travel_and_Transportation/Tourist_Destinations/Historical_Sites_and_Buildings": 231
}
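
Comparing the raw JSON with the percentages shown, each score appears to be stored as an integer in tenths of a percent (932 corresponds to 93.2%). A short sketch of that conversion, assuming this scaling holds for all ML score fields; the variable names are illustrative only:

```python
import json

# Raw ML category scores as shown in this report.
raw = """
{
    "/Arts_and_Entertainment": 932,
    "/Arts_and_Entertainment/Events_and_Listings": 478,
    "/Arts_and_Entertainment/Events_and_Listings/Expos_and_Conventions": 246,
    "/Travel_and_Transportation": 239,
    "/Travel_and_Transportation/Tourist_Destinations": 233,
    "/Travel_and_Transportation/Tourist_Destinations/Historical_Sites_and_Buildings": 231
}
"""

scores = json.loads(raw)

# Assumption: integers are tenths of a percent, so divide by 10.
percentages = {label: value / 10 for label, value in scores.items()}

# Print highest-scoring labels first.
for label, pct in sorted(percentages.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{label}: {pct:.1f}%")
```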
ML Page Types
/Core_Page: 85.6%
/Core_Page/Services_Page: 63.3%
Raw JSON
{
    "/Core_Page": 856,
    "/Core_Page/Services_Page": 633
}
ML Intent Types
Transactional: 79.0%
Informational: 63.7%
Commercial: 14.7%
Navigational: 13.2%
Raw JSON
{
    "Transactional": 790,
    "Informational": 637,
    "Commercial": 147,
    "Navigational": 132
}
Content Metadata
Language: en
Author: null
Publish Time: not set
Original Publish Time: 2020-01-23 16:20:03 (6 years ago)
Republished: No
Word Count (Total): 535
Word Count (Content): 247
Links
External Links: 6
Internal Links: 80
Technical SEO
Meta Nofollow: No
Meta Noarchive: No
JS Rendered: No
Redirect Target: null
Performance
Download Time (ms): 344
TTFB (ms): 344
Download Size (bytes): 8,568
Location
Host ID: 199 (laksa199)
Partition ID: 44
Root Hash: 2488006646444648999
Unparsed URL: fr,paris!catacombes,www,/en s443
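
The unparsed URL key looks like a reversed-host form of the page URL. The sketch below shows one way it might map back to https://www.catacombes.paris.fr/en. The format is inferred from this single example, so every rule here is an assumption: that `,` and `!` both separate host labels (with `!` possibly marking the registered-domain boundary), that the first `/` starts the path, and that the trailing `s443` means HTTPS on port 443. The function name `decode_unparsed_url` is hypothetical.

```python
def decode_unparsed_url(key):
    """Rebuild a URL from a key such as 'fr,paris!catacombes,www,/en s443'."""
    # Assumption: the last space separates the location from scheme and port.
    location, _, scheme_port = key.rpartition(" ")

    # Assumption: the first '/' ends the host labels and starts the path.
    slash = location.find("/")
    if slash == -1:
        host_part, path = location, "/"
    else:
        host_part, path = location[:slash], location[slash:]

    # Assumption: ',' and '!' both separate labels, stored in reverse order.
    labels = [part for part in host_part.replace("!", ",").split(",") if part]
    host = ".".join(reversed(labels))

    # Assumption: a leading 's' means https; the digits are the port.
    scheme = "https" if scheme_port.startswith("s") else "http"
    port = "".join(ch for ch in scheme_port if ch.isdigit())
    default_port = "443" if scheme == "https" else "80"
    netloc = host if port in ("", default_port) else f"{host}:{port}"

    return f"{scheme}://{netloc}{path}"


print(decode_unparsed_url("fr,paris!catacombes,www,/en s443"))
```

Storing the host in reversed order keeps URLs from the same domain adjacent when keys are sorted, which is a common reason for this kind of layout, though the report itself does not state why it is used here.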