Using the GSC URL Inspection API to Monitor Indexing at Scale
- December 5, 2025
- Analytics & Measurement
Google's URL Inspection API is the only programmatic way to read the same index, canonical, and coverage data that powers the URL Inspection tool inside Search Console. For sites with thousands of pages, clicking through that tool one URL at a time is useless. This guide shows you how to query the API at scale, what each field actually means, and how to turn raw responses into an indexing monitoring system you can run on a schedule.
What the API returns and why it beats the Coverage report
The Coverage (Page Indexing) report in Search Console aggregates URLs into buckets, but it samples, lags, and rarely exposes the specific URL list behind a status. The URL Inspection API gives you per-URL truth pulled from Google's index, including fields the aggregate report hides.
A single call to urlInspection.index.inspect returns an inspectionResult whose indexStatusResult object carries the indexing data. The fields that matter most for monitoring:
- verdict: PASS, FAIL, NEUTRAL, or PARTIAL. This is the headline status.
- coverageState: the human-readable reason, e.g. "Submitted and indexed", "Crawled - currently not indexed", "Discovered - currently not indexed", "Duplicate without user-selected canonical".
- googleCanonical vs userCanonical: Google's chosen canonical vs the one you declared. A mismatch here is the single most actionable signal the API offers.
- indexingState: INDEXING_ALLOWED or a blocking reason like BLOCKED_BY_META_TAG or BLOCKED_BY_ROBOTS_TXT.
- robotsTxtState, pageFetchState, lastCrawlTime, and crawledAs (desktop vs mobile smartphone).
- referringUrls and sitemap: where Google found the URL.
You also get mobileUsabilityResult, richResultsResult, and ampResult in the same payload, so one call covers indexing and structured-data health together.
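As a rough sketch of how those fields might be pulled into one flat record, the function below reads a response dict and returns a row. It assumes the response nests the index data under inspectionResult.indexStatusResult; the snake_case column names and the flatten_inspection helper are illustrative choices, not part of the API.

```python
from typing import Any, Dict

# API field name -> column name. The API names are the indexStatusResult
# fields listed above; the column names are this sketch's own convention.
INDEX_FIELDS = {
    "verdict": "verdict",
    "coverageState": "coverage_state",
    "googleCanonical": "google_canonical",
    "userCanonical": "user_canonical",
    "indexingState": "indexing_state",
    "robotsTxtState": "robots_txt_state",
    "pageFetchState": "page_fetch_state",
    "lastCrawlTime": "last_crawl_time",
    "crawledAs": "crawled_as",
}


def flatten_inspection(response: Dict[str, Any]) -> Dict[str, Any]:
    """Pull the monitoring fields out of one inspect() response dict."""
    status = response.get("inspectionResult", {}).get("indexStatusResult", {})
    # Missing fields become None rather than raising, since not every
    # field is present for every URL.
    row = {column: status.get(field) for field, column in INDEX_FIELDS.items()}
    # List-valued fields default to empty lists so callers can iterate safely.
    row["referring_urls"] = list(status.get("referringUrls", []))
    row["sitemaps"] = list(status.get("sitemap", []))
    return row
```

Returning None for absent fields keeps the row shape stable from run to run, which makes later comparisons between runs simpler.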
Setup: auth, scope, and the one hard limit
The API uses the Search Console API surface, so authenticate with a Google Cloud service account or OAuth client that has the https://www.googleapis.com/auth/webmasters.readonly scope. The authenticated principal must be a verified user on the property you inspect: add the service account email as a full or restricted user in Search Console settings, and use the exact property string (including the sc-domain: prefix for domain properties).
The constraint that shapes your entire architecture: the quota is 2,000 queries per day per property, with a short-term ceiling around 600 per minute. You cannot inspect a 50,000-URL site daily; at 2,000 a day, a single pass over 50,000 URLs takes 25 days. Plan around the quota rather than fighting it. That is the central engineering problem this API presents.
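One way to make the two limits explicit in code is a small budget object that every call must pass through. This is a minimal sketch, not a definitive implementation: the QuotaBudget class, its method names, and the injectable clock are all hypothetical, and when the daily quota actually resets is something to verify against Google's documentation.

```python
import time
from collections import deque
from typing import Callable, Deque


class QuotaBudget:
    """Tracks a daily cap and a rolling 60-second cap for one property."""

    def __init__(
        self,
        per_day: int = 2000,
        per_minute: int = 600,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.per_day = per_day
        self.per_minute = per_minute
        self.clock = clock
        self.used_today = 0
        self.recent: Deque[float] = deque()  # timestamps of calls in the last 60 s

    def try_acquire(self) -> bool:
        """Record one call and return True if both limits allow it."""
        now = self.clock()
        # Drop timestamps that have aged out of the rolling window.
        while self.recent and now - self.recent[0] >= 60:
            self.recent.popleft()
        if self.used_today >= self.per_day or len(self.recent) >= self.per_minute:
            return False
        self.recent.append(now)
        self.used_today += 1
        return True

    def reset_day(self) -> None:
        """Call when the quota day rolls over (reset timing is an assumption)."""
        self.used_today = 0
```

A scheduler would call try_acquire() before each inspection and stop, or wait, when it returns False. As written the class is not thread-safe, so concurrent workers would need a lock around try_acquire().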
A minimal Python request
Using the google-api-python-client library, a single inspection looks like this:
- Build the service: service = build('searchconsole', 'v1', credentials=creds)
- Call it: service.urlInspection().index().inspect(body={'inspectionUrl': url, 'siteUrl': property_url, 'languageCode': 'en-US'}).execute()
The response nests everything under inspectionResult.indexStatusResult. Extract the fields you care about and write them to a row. The pattern that scales is a worker pool of 5 to 10 concurrent threads with retry-on-429 backoff, throttled to stay under the per-minute ceiling.
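The worker-pool pattern described above could be sketched as follows. To keep the example self-contained, the real API call is passed in as inspect_one, a hypothetical callable that takes a URL, returns the response dict, and raises the RateLimited exception defined here when it sees an HTTP 429. In a real script that wrapper would translate the client library's HTTP error into RateLimited. All names in the block are assumptions for illustration.

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable, Optional


class RateLimited(Exception):
    """Raised by the caller-supplied inspect function on an HTTP 429."""

    def __init__(self, retry_after: Optional[float] = None) -> None:
        super().__init__("rate limited")
        self.retry_after = retry_after  # seconds from Retry-After, if present


def inspect_with_backoff(
    inspect_one: Callable[[str], dict],
    url: str,
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call inspect_one(url), retrying on RateLimited with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return inspect_one(url)
        except RateLimited as exc:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the failure instead of hiding it
            if exc.retry_after is not None:
                delay = exc.retry_after  # honor the server's hint when given
            else:
                delay = base_delay * 2 ** attempt + random.uniform(0, 1)  # jitter
            sleep(delay)


def inspect_many(
    inspect_one: Callable[[str], dict],
    urls: Iterable[str],
    workers: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, dict]:
    """Inspect URLs concurrently; returns {url: response}."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {
            url: pool.submit(inspect_with_backoff, inspect_one, url, sleep=sleep)
            for url in urls
        }
        return {url: future.result() for url, future in futures.items()}
```

One caveat when wiring this to google-api-python-client: its documentation notes that the underlying httplib2 transport is not thread-safe, so inspect_one would likely need to build or reuse a separate service object per thread.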
Prioritizing which URLs to inspect
Because you can only afford ~2,000 inspections a day, never inspect your whole site blindly. Build a priority queue:
- New and recently changed URLs: pages published or updated in the last few days, pulled from your CMS or sitemap lastmod. Confirm they get indexed.
- Revenue and conversion pages: a fixed watchlist inspected every run regardless of other signals.
- Pages with traffic anomalies: cross-reference the Search Analytics API. Any URL whose clicks dropped sharply gets inspected to check for a coverage or canonical change.
- A rotating sample of the long tail: cycle through the remaining inventory so every URL is checked every N days. At 2,000/day a 30,000-URL site gets full coverage roughly every 15 days if the whole budget goes to the rotation, and somewhat longer once the tiers above take their share. That is fine for a baseline.
Store a last_inspected timestamp per URL and let the scheduler pick the oldest, highest-priority candidates each run.
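A minimal sketch of that selection step, assuming each URL carries a numeric priority tier and a last_inspected timestamp. The Candidate dataclass, the tier numbering, and pick_batch are hypothetical names introduced here for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Iterable, List, Optional


@dataclass
class Candidate:
    url: str
    priority: int  # lower = more important, e.g. 0 = watchlist, 3 = long tail
    last_inspected: Optional[datetime]  # timezone-aware; None = never inspected


# Sorts before any real timestamp, so never-inspected URLs go first in their tier.
_NEVER = datetime.min.replace(tzinfo=timezone.utc)


def pick_batch(candidates: Iterable[Candidate], budget: int) -> List[str]:
    """Return up to `budget` URLs: best tier first, oldest inspection first."""
    ordered = sorted(
        candidates,
        key=lambda c: (c.priority, c.last_inspected or _NEVER),
    )
    return [c.url for c in ordered[:budget]]
```

Because the sort is strictly tier-first, the long tail only rotates when the higher tiers together fit inside the daily budget. If they do not, reserving a fixed share of the budget for the rotation would be one way to avoid starving it.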
Turning responses into monitoring signals
Raw verdicts are not alerts. Persist every inspection to a database (one row per URL per run) and derive state transitions. The conditions worth alerting on:
- Canonical mismatch: googleCanonical != userCanonical and the user canonical is self-referential. Google is overriding your canonical, often consolidating the page into a near-duplicate. This silently removes pages from the index.
- Indexed → not indexed: a URL whose coverageState flips from "Submitted and indexed" to "Crawled - currently not indexed". This is your earliest warning of quality demotion or thin-content suppression.
- Stuck in discovery: "Discovered - currently not indexed" persisting across runs signals crawl-budget or quality starvation, common on large or scaled-content sites.
- Unexpected noindex/robots block: indexingState becomes BLOCKED_BY_META_TAG or robotsTxtState changes to DISALLOWED, usually a deploy regression. These deserve a same-day alert.
- Crawl failures: pageFetchState anything other than SUCCESSFUL.
Compute these as diffs against the previous row for each URL, then push only the transitions to Slack, email, or a dashboard. Logging current state without diffing buries the signal.
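The diffing step might look like the sketch below, which compares the previous and current stored rows for one URL and returns only the conditions that became true in this run. The row keys (coverage_state, google_canonical, and so on), the alert labels, and detect_transitions itself are assumptions for illustration, and the canonical values are assumed to be normalized already.

```python
from typing import List, Optional

INDEXED = "Submitted and indexed"
CRAWLED_NOT_INDEXED = "Crawled - currently not indexed"


def _canonical_mismatch(row: dict) -> bool:
    """True when both canonicals are present and differ."""
    google, user = row.get("google_canonical"), row.get("user_canonical")
    return bool(google and user and google != user)


def detect_transitions(prev: Optional[dict], curr: dict) -> List[str]:
    """Return alert labels for conditions that newly appeared in `curr`."""
    if prev is None:
        return []  # first observation: nothing to diff against
    alerts = []
    if _canonical_mismatch(curr) and not _canonical_mismatch(prev):
        alerts.append("canonical_mismatch")
    if (prev.get("coverage_state") == INDEXED
            and curr.get("coverage_state") == CRAWLED_NOT_INDEXED):
        alerts.append("indexed_to_not_indexed")
    if (curr.get("indexing_state") == "BLOCKED_BY_META_TAG"
            and prev.get("indexing_state") != "BLOCKED_BY_META_TAG"):
        alerts.append("noindex_added")
    if (curr.get("robots_txt_state") == "DISALLOWED"
            and prev.get("robots_txt_state") != "DISALLOWED"):
        alerts.append("robots_blocked")
    if (curr.get("page_fetch_state") != "SUCCESSFUL"
            and prev.get("page_fetch_state") == "SUCCESSFUL"):
        alerts.append("fetch_failed")
    return alerts
```

The "stuck in discovery" condition is deliberately left out: it is about persistence rather than change, so it would need a count of consecutive runs instead of a two-row diff.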
Schema for storing results
A flat table is enough. Index it on (site_url, inspection_url, inspected_at) and store at minimum: verdict, coverage_state, google_canonical, user_canonical, indexing_state, robots_txt_state, page_fetch_state, last_crawl_time, crawled_as, and the raw JSON for anything you query later. Keeping the raw response means you never have to re-spend quota to backfill a field you forgot to extract.
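As one possible shape for that table, here is a sketch using SQLite from the Python standard library. The table name, the column types, and the helper functions are assumptions for illustration; any relational store with the same key would work.

```python
import json
import sqlite3
from typing import Optional, Tuple

COLUMNS = [
    "verdict", "coverage_state", "google_canonical", "user_canonical",
    "indexing_state", "robots_txt_state", "page_fetch_state",
    "last_crawl_time", "crawled_as",
]

SCHEMA = f"""
CREATE TABLE IF NOT EXISTS inspections (
    site_url TEXT NOT NULL,
    inspection_url TEXT NOT NULL,
    inspected_at TEXT NOT NULL,  -- ISO 8601 UTC timestamp of the run
    {", ".join(f"{name} TEXT" for name in COLUMNS)},
    raw_json TEXT NOT NULL,      -- full response, kept for later backfills
    PRIMARY KEY (site_url, inspection_url, inspected_at)
)
"""


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the store and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def save_inspection(conn, site_url, inspection_url, inspected_at, row, raw) -> None:
    """Insert one row per URL per run; `row` is keyed by the COLUMNS names."""
    names = ["site_url", "inspection_url", "inspected_at", *COLUMNS, "raw_json"]
    values = [site_url, inspection_url, inspected_at,
              *(row.get(name) for name in COLUMNS), json.dumps(raw)]
    placeholders = ", ".join("?" for _ in names)
    conn.execute(
        f"INSERT INTO inspections ({', '.join(names)}) VALUES ({placeholders})",
        values,
    )
    conn.commit()


def latest_state(conn, site_url, inspection_url) -> Optional[Tuple[str, str]]:
    """Return (inspected_at, coverage_state) of the most recent run, if any."""
    cursor = conn.execute(
        "SELECT inspected_at, coverage_state FROM inspections "
        "WHERE site_url = ? AND inspection_url = ? "
        "ORDER BY inspected_at DESC LIMIT 1",
        (site_url, inspection_url),
    )
    return cursor.fetchone()
```

The composite primary key doubles as the index on (site_url, inspection_url, inspected_at). Sorting by inspected_at as text only works if every timestamp uses the same ISO 8601 format.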
Common mistakes
- Treating it like an indexing-request API. Inspection is read-only. To request indexing, use the separate Indexing API (and only for JobPosting/BroadcastEvent per Google's terms) or submit sitemaps.
- Inspecting the whole site daily. You will exhaust quota by mid-morning and get nothing useful. Prioritize ruthlessly.
- Ignoring lastCrawlTime. A "PASS" verdict from a crawl three months ago tells you little about the current page. Weight freshness into your alerting.
- Not handling 429s. Burst past the per-minute limit and calls fail silently in naive scripts. Implement exponential backoff and respect Retry-After.
- Using the wrong property string. Domain properties require the sc-domain: prefix; URL-prefix properties need the exact protocol and trailing slash. A mismatch returns a permission error, not a helpful message.
- Comparing canonicals without normalizing. Trailing slashes, protocol, and parameter order cause false mismatch alerts. Normalize both sides before diffing.
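For the last mistake in the list, canonical normalization could be sketched with the standard library as below. The specific rules (forcing https, stripping trailing slashes, sorting query parameters, dropping fragments) are assumptions about what counts as "the same URL" on a given site, and normalize_url and canonicals_match are hypothetical helper names. Adjust the rules if, for example, a protocol difference is something you do want to be alerted on.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def normalize_url(url: str) -> str:
    """Reduce a URL to a comparable form for canonical diffing."""
    parts = urlsplit(url.strip())
    host = (parts.hostname or "").lower()
    # Keep only non-default ports so :443 and :80 do not cause mismatches.
    if parts.port and parts.port not in (80, 443):
        host = f"{host}:{parts.port}"
    # Strip trailing slashes, but keep "/" for the site root.
    path = parts.path.rstrip("/") or "/"
    # Sort parameters so ?a=1&b=2 and ?b=2&a=1 compare equal.
    query = urlencode(sorted(parse_qsl(parts.query, keep_blank_values=True)))
    # Assumption: http and https variants are treated as the same page.
    return urlunsplit(("https", host, path, query, ""))


def canonicals_match(google_canonical: str, user_canonical: str) -> bool:
    """Compare two canonicals after normalization."""
    return normalize_url(google_canonical) == normalize_url(user_canonical)
```

For example, canonicals_match("http://Example.com/page/?b=2&a=1", "https://example.com/page?a=1&b=2") returns True under these rules, so that pair would not raise a mismatch alert.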
FAQ
How fresh is the data? It reflects Google's last index/crawl of the URL, not a live re-crawl. The tool's "Live Test" feature is not exposed via the API; you only get the indexed snapshot.
Can I raise the 2,000/day quota? No. It is a fixed per-property limit and not adjustable through Cloud Console, which is why multi-property accounts and prioritization matter.
Does it work on domain properties? Yes, as long as you pass the sc-domain: property string and the principal is verified on it.
Built this way, the API becomes a daily indexing health monitor: it catches canonical hijacks, noindex regressions, and quality-driven deindexing days or weeks before they surface in your traffic reports.
Claude Vincent is a technical SEO consultant focused on crawlability, rendering, and AI-search visibility. He writes the field guides and case studies at SEO ProCheck, with a bias toward the durable, unglamorous work that decides whether search engines and AI answer engines can actually read and cite a site.