🕷️ Crawler Inspector

Raw Queries and Responses

1. Shard Calculation

Query:
Response:
Calculated Shard: 11 (from laksa148)

2. Crawled Status Check

Query:
Response:

3. Robots.txt Check

Query:
Response:

4. Spam/Ban Check

Query:
Response:

5. Seen Status Check

ℹ️ Skipped - page is already crawled

📍 LOCATION: Host 11 · Partition 22 · laksa011 · 1910924616934004411
📄 INDEXABLE: CRAWLED 12 days ago
🤖 ROBOTS ALLOWED

Page Info Filters

Filter | Status | Condition | Details
HTTP status | PASS | download_http_code = 200 | HTTP 200
Age cutoff | PASS | download_stamp > now() - 6 MONTH | 0.4 months ago
History drop | PASS | isNull(history_drop_reason) | No drop reason
Spam/ban | PASS | fh_dont_index != 1 AND ml_spam_score = 0 | ml_spam_score=0
Canonical | PASS | meta_canonical IS NULL OR = '' OR = src_unparsed | Not set
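
The five filter conditions listed above can be read as predicates over a single page record. The following is a minimal Python sketch of that reading, not the inspector's actual implementation. The column names are taken from the conditions themselves; the dict layout of `row`, the `evaluate_filters` helper, and the approximation of 6 months as 183 days are assumptions made for illustration.

```python
from datetime import datetime, timedelta

# Assumption: "6 MONTH" is approximated as 183 days for this sketch.
AGE_CUTOFF = timedelta(days=183)


def evaluate_filters(row: dict, now: datetime) -> dict:
    """Return PASS/FAIL per filter for one page record (hypothetical helper)."""
    canonical = row.get("meta_canonical")
    checks = {
        "HTTP status": row.get("download_http_code") == 200,
        "Age cutoff": row["download_stamp"] > now - AGE_CUTOFF,
        "History drop": row.get("history_drop_reason") is None,
        "Spam/ban": row.get("fh_dont_index") != 1
        and row.get("ml_spam_score") == 0,
        "Canonical": canonical is None
        or canonical == ""
        or canonical == row.get("src_unparsed"),
    }
    return {name: "PASS" if ok else "FAIL" for name, ok in checks.items()}


# Values taken from the page shown here. fh_dont_index=0 is an assumption,
# since only ml_spam_score is displayed in the Details column.
row = {
    "download_http_code": 200,
    "download_stamp": datetime(2026, 5, 21, 23, 40, 57),
    "history_drop_reason": None,
    "fh_dont_index": 0,
    "ml_spam_score": 0,
    "meta_canonical": None,
    "src_unparsed": "com,glean!www,/careers s443",
}

# "now" is set 12 days after the last crawl, matching "12 days ago".
results = evaluate_filters(row, now=datetime(2026, 6, 2, 23, 40, 57))
for name, status in results.items():
    print(f"{name}: {status}")
```

Keeping each filter as a named boolean makes it easy to see which single condition caused a page to be excluded.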

Page Details

Property: Value
URL: https://www.glean.com/careers
Last Crawled: 2026-05-21 23:40:57 (12 days ago)
First Indexed: 2021-09-15 23:20:55 (4 years ago)
HTTP Status Code: 200
Content
Meta Title: Careers at Glean | Glean Work AI
Meta Description: We’re on a mission to bring people the knowledge they need to make a difference in the world. And we’re always looking for curious and creative people to help us make that happen.
Meta Canonical: null
Boilerpipe Text: heavy column, fetched on demand
Markdown: heavy column, fetched on demand
Readable Markdown: heavy column, fetched on demand
ML Classification
ML Categories
/Jobs_and_Education: 96.8%
/Jobs_and_Education/Jobs: 96.6%
/Jobs_and_Education/Jobs/Job_Listings: 91.2%
/Business_and_Industrial: 17.1%
/Computers_and_Electronics: 14.3%
/Computers_and_Electronics/Software: 11.3%
/Business_and_Industrial/Business_Services: 10.3%
Raw JSON
{
    "/Jobs_and_Education": 968,
    "/Jobs_and_Education/Jobs": 966,
    "/Jobs_and_Education/Jobs/Job_Listings": 912,
    "/Business_and_Industrial": 171,
    "/Computers_and_Electronics": 143,
    "/Computers_and_Electronics/Software": 113,
    "/Business_and_Industrial/Business_Services": 103
}
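
The Raw JSON values appear to be the displayed percentages multiplied by ten (968 corresponds to 96.8%). A short Python sketch of that conversion follows. The 0 to 1000 scale is inferred only from the numbers shown on this page, and `to_percentages` is a hypothetical helper name.

```python
import json

raw_json = """
{
    "/Jobs_and_Education": 968,
    "/Jobs_and_Education/Jobs": 966,
    "/Jobs_and_Education/Jobs/Job_Listings": 912,
    "/Business_and_Industrial": 171,
    "/Computers_and_Electronics": 143,
    "/Computers_and_Electronics/Software": 113,
    "/Business_and_Industrial/Business_Services": 103
}
"""


def to_percentages(scores: dict) -> dict:
    """Convert integer scores (assumed 0-1000 scale) to percentage strings,
    ordered from highest to lowest score."""
    ordered = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return {label: f"{value / 10:.1f}%" for label, value in ordered}


percentages = to_percentages(json.loads(raw_json))
for label, pct in percentages.items():
    print(f"{label}: {pct}")
```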
ML Page Types
/Core_Page: 89.5%
/Core_Page/Careers_Page: 89.3%
Raw JSON
{
    "/Core_Page": 895,
    "/Core_Page/Careers_Page": 893
}
ML Intent Types
Transactional: 52.8%
Navigational: 28.2%
Commercial: 25.8%
Informational: 21.6%
Raw JSON
{
    "Transactional": 528,
    "Navigational": 282,
    "Commercial": 258,
    "Informational": 216
}
Content Metadata
Language: en-us
Author: null
Publish Time: not set
Original Publish Time: 2021-09-15 23:20:55 (4 years ago)
Republished: No
Word Count (Total): 2,335
Word Count (Content): 711
Links
External Links: 345
Internal Links: 92
Technical SEO
Meta Nofollow: No
Meta Noarchive: No
JS Rendered: Yes
Redirect Target: null
Performance
Download Time (ms): 86
TTFB (ms): 77
Download Size (bytes): 92,529
Location
Host ID: 11 (laksa011)
Partition ID: 22
Root Hash: 1910924616934004411
Unparsed URL: com,glean!www,/careers s443
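
The Unparsed URL value looks like a reversed-host key for the page URL. The Python sketch below maps it back to https://www.glean.com/careers under a format guessed from this single example: registrable domain labels reversed and comma-separated, `!` before the subdomain, the path after the host, and a trailing token where a leading `s` means HTTPS and the digits are the port. None of this is confirmed by the tool, and `unparsed_to_url` is a hypothetical helper. Multi-label subdomains and plain HTTP keys are not covered by the example and may be encoded differently.

```python
def unparsed_to_url(unparsed: str) -> str:
    """Rebuild a URL from an 'unparsed' key (format assumed from one example)."""
    # Split off the trailing scheme/port token, e.g. "s443".
    key, _, suffix = unparsed.rpartition(" ")
    # Everything before the first "/" is the encoded host.
    host_part, slash, path = key.partition("/")
    # "com,glean!www," -> reversed domain "com,glean" and subdomain "www".
    domain_rev, _, subdomain = host_part.rstrip(",").partition("!")
    domain = ".".join(reversed(domain_rev.split(",")))
    host = f"{subdomain}.{domain}" if subdomain else domain
    # Assumption: a leading "s" marks HTTPS; the digits are the port.
    scheme = "https" if suffix.startswith("s") else "http"
    port = "".join(ch for ch in suffix if ch.isdigit())
    default_port = "443" if scheme == "https" else "80"
    netloc = host if port in ("", default_port) else f"{host}:{port}"
    return f"{scheme}://{netloc}{slash}{path}"


print(unparsed_to_url("com,glean!www,/careers s443"))
# prints https://www.glean.com/careers
```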