Google Search Console API for Developers
The Search Console UI is good enough for ad hoc investigation. The Search Console API is what you reach for when you need reproducibility, scale, or automation — nightly performance reports, post-deployment regression checks, multi-property dashboards, and alert pipelines that fire before a traffic drop becomes a crisis. This section covers everything you need to go from zero to a working monitoring system.
What the API Exposes
The Search Console API is a collection of distinct endpoints, each mapping to a product area you can also access in the UI:
- Search Analytics — query-level performance data: clicks, impressions, CTR, average position, segmented by query, page, country, device, date, and search appearance
- URL Inspection — indexed status for individual URLs: whether a page is indexed, which canonical Google selected, mobile usability, and rich results eligibility
- Sitemaps — submit, delete, and retrieve sitemap status programmatically
- Sites — list the properties your credentials can access, with their permission levels

Note that the UI's Index Coverage, Mobile Usability, and Rich Results reports have no aggregate API endpoints; that data is only exposed per-URL through URL Inspection.
For most monitoring use cases, you will spend 80% of your time in Search Analytics and URL Inspection. The others are useful for one-off automation (auto-submitting sitemaps after deployments, for example) but are not primary monitoring surfaces.
Authentication: Service Accounts Over OAuth
Two authentication paths exist:
OAuth 2.0 is the right choice when you are building a tool that accesses another user's Search Console property with their explicit consent. The authorization flow is familiar — redirect to Google, user approves, you receive a token.
Service Accounts are the right choice for server-to-server automation: cronjobs, CI/CD hooks, monitoring daemons, and reporting pipelines accessing your own properties. A service account is a non-human Google identity that you create in Google Cloud Console, generate a JSON key for, and then grant access to your Search Console property as you would any team member.
For long-term automation, always use a service account. OAuth credentials are tied to an individual's Google account: consent can be revoked, refresh tokens can be invalidated, and monitoring reliability ends up depending on one person's session. Service accounts have none of these failure modes.
Setting Up a Service Account
```
# 1. Create a project in Google Cloud Console (if you don't have one)
# 2. Enable the "Google Search Console API" under APIs & Services
# 3. Create a service account: IAM & Admin > Service Accounts > Create
# 4. Generate a JSON key: click the account > Keys > Add Key > JSON
# 5. In Search Console: Settings > Users and permissions > Add user
#    Enter the service account email (looks like:
#    name@your-project.iam.gserviceaccount.com)
#    Grant "Full" permission ("Restricted" is enough for read-only
#    Search Analytics queries; "Full" also covers sitemap submission)
```
Install the required Python libraries:
```shell
pip install google-auth google-auth-httplib2 google-api-python-client
```
Authenticating and Building a Client
```python
from google.oauth2 import service_account
from googleapiclient.discovery import build

SCOPES = ["https://www.googleapis.com/auth/webmasters.readonly"]
SERVICE_ACCOUNT_FILE = "path/to/service-account-key.json"

credentials = service_account.Credentials.from_service_account_file(
    SERVICE_ACCOUNT_FILE,
    scopes=SCOPES,
)

# "searchconsole" v1 is the current service name. The legacy
# "webmasters" v3 service still works for Search Analytics and
# sitemaps, but it does not expose URL Inspection.
service = build("searchconsole", "v1", credentials=credentials)
```
The service exposes .searchanalytics(), .urlInspection(), .sitemaps(), and .sites() resource objects.
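Before querying anything, it is worth confirming that the service account can actually see your property. A minimal sketch (the helper name is our own, not part of the client library):

```python
def list_accessible_properties(service):
    """Return {siteUrl: permissionLevel} for every property the
    authenticated identity can access. A quick sanity check that
    the service account was added to Search Console correctly."""
    response = service.sites().list().execute()
    return {
        entry["siteUrl"]: entry["permissionLevel"]
        for entry in response.get("siteEntry", [])
    }
```

If your property is missing from the result, revisit step 5 of the setup: the service account email was not granted access.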
Querying Search Analytics
The core endpoint is searchanalytics.query. Every request specifies a property URL, a date range, a list of dimensions to group by, and optional row filters.
```python
def query_search_analytics(service, site_url, start_date, end_date,
                           dimensions, row_limit=25000):
    """
    Fetch search analytics data for a given property and date range.
    Returns a list of row dicts with keys matching the requested dimensions
    plus clicks, impressions, ctr, and position.
    """
    request_body = {
        "startDate": start_date,   # "YYYY-MM-DD"
        "endDate": end_date,
        "dimensions": dimensions,  # e.g. ["query", "page", "country", "device"]
        "rowLimit": row_limit,     # max 25,000 per request
        "startRow": 0,
    }
    response = (
        service.searchanalytics()
        .query(siteUrl=site_url, body=request_body)
        .execute()
    )
    return response.get("rows", [])
```
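Each returned row carries its dimension values in a keys list, ordered to match the dimensions you requested. A small convenience helper (our own, not part of the client library) makes the rows easier to work with downstream:

```python
def rows_to_records(rows, dimensions):
    """Flatten Search Analytics rows into plain dicts: dimension
    values are zipped with their dimension names, and the four
    metrics are copied through with defaults."""
    records = []
    for row in rows:
        record = dict(zip(dimensions, row.get("keys", [])))
        record.update(
            clicks=row.get("clicks", 0),
            impressions=row.get("impressions", 0),
            ctr=row.get("ctr", 0.0),
            position=row.get("position", 0.0),
        )
        records.append(record)
    return records
```

With this in place, a row queried with dimensions=["query", "page"] becomes a dict with "query", "page", "clicks", "impressions", "ctr", and "position" keys, ready for a DataFrame or a datastore insert.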
Dimensions and What They Give You
| Dimension | Use case |
|---|---|
| `query` | Which search terms drive traffic |
| `page` | Per-URL performance breakdown |
| `country` | Geographic distribution of impressions/clicks |
| `device` | Desktop vs. mobile position gaps |
| `searchAppearance` | Rich result type filtering (AMP, FAQ, etc.) |
| `date` | Time-series view for trend analysis |
You can combine multiple dimensions in a single request. Combining page + query gives you
landing-page-level keyword analysis. Combining page + device lets you detect pages where mobile
position diverges from desktop — a common sign of a CWV or UX regression.
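The page + device combination can be turned directly into a gap report. A sketch, assuming rows were fetched with dimensions=["page", "device"] (device values are DESKTOP, MOBILE, and TABLET; the five-rank cutoff is our choice):

```python
def mobile_position_gaps(rows, min_gap=5.0):
    """Flag pages where average mobile position trails desktop by
    more than min_gap ranks. Expects Search Analytics rows queried
    with dimensions=["page", "device"]."""
    positions = {}  # page -> {device: average position}
    for row in rows:
        page, device = row["keys"]
        positions.setdefault(page, {})[device] = row["position"]
    gaps = []
    for page, by_device in positions.items():
        if "DESKTOP" in by_device and "MOBILE" in by_device:
            gap = by_device["MOBILE"] - by_device["DESKTOP"]
            if gap > min_gap:
                gaps.append({
                    "page": page,
                    "desktop": by_device["DESKTOP"],
                    "mobile": by_device["MOBILE"],
                    "gap": round(gap, 1),
                })
    # Worst divergences first
    return sorted(gaps, key=lambda g: g["gap"], reverse=True)
```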
Data constraints to plan around:
- Maximum date range: 16 months of historical data
- Data latency: 2–3 days (yesterday's data is not yet available)
- Max rows per request: 25,000
- Rate limit: 1,200 requests per minute per property (a per-site quota; separate per-project limits also apply)
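The 16-month retention limit is easy to trip over when date ranges are computed dynamically. A defensive sketch (approximating 16 months as 480 days, which is our simplification):

```python
from datetime import date, timedelta

def clamp_start_date(start_date, today=None):
    """Clamp a requested ISO start date to Search Console's
    ~16-month retention window, approximated here as 480 days.
    Requesting earlier dates simply returns empty data, so
    clamping keeps reports honest about what they cover."""
    today = today or date.today()
    earliest = today - timedelta(days=480)
    requested = date.fromisoformat(start_date)
    return max(requested, earliest).isoformat()
```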
When you need more than 25,000 rows (large sites with thousands of URLs × queries), paginate using
startRow:
```python
def paginated_query(service, site_url, start_date, end_date, dimensions):
    all_rows = []
    start_row = 0
    batch_size = 25000
    while True:
        request_body = {
            "startDate": start_date,
            "endDate": end_date,
            "dimensions": dimensions,
            "rowLimit": batch_size,
            "startRow": start_row,
        }
        response = (
            service.searchanalytics()
            .query(siteUrl=site_url, body=request_body)
            .execute()
        )
        rows = response.get("rows", [])
        all_rows.extend(rows)
        if len(rows) < batch_size:
            break  # last page
        start_row += batch_size
    return all_rows
```
URL Inspection API
The URL Inspection API answers the question: "Does Google currently see this URL the way I expect?" For each URL you inspect, the response includes:
- Index status: a verdict of PASS, NEUTRAL, or FAIL, plus a human-readable coverageState (e.g. "Submitted and indexed")
- Crawl time: when Googlebot last fetched the page
- Canonical URL: what Google considers the canonical (may differ from your declared canonical)
- Mobile usability: any mobile rendering issues detected
- Rich results: which schema types were detected and their validation status

(Core Web Vitals field data is not part of the inspection response; pull it from the CrUX API if you need it.)
```python
def inspect_url(service, site_url, page_url):
    """
    Returns the full indexing and rich results status for a single URL.
    site_url must match the Search Console property (e.g. "https://example.com/")
    """
    response = (
        service.urlInspection()
        .index()
        .inspect(
            body={
                "inspectionUrl": page_url,
                "siteUrl": site_url,
            }
        )
        .execute()
    )
    result = response.get("inspectionResult", {})
    index_status = result.get("indexStatusResult", {})
    return {
        "verdict": index_status.get("verdict"),
        "crawled_as": index_status.get("crawledAs"),
        "last_crawl_time": index_status.get("lastCrawlTime"),
        "canonical": index_status.get("googleCanonical"),
        "mobile_usability": result.get("mobileUsabilityResult", {}).get("verdict"),
        "rich_results": result.get("richResultsResult", {}).get("detectedItems", []),
    }
```
Three High-Impact Automation Patterns
1. Post-Deployment QA Check
After shipping a major release, automatically inspect your ten most strategically important URLs and verify they remain indexed, their canonicals are unchanged, and their rich results are still detected. Pipe failures to your deployment notification channel. Keep the list short: the URL Inspection API has its own daily quota (on the order of 2,000 inspections per property per day).
```python
STRATEGIC_URLS = [
    "https://example.com/",
    "https://example.com/pricing",
    "https://example.com/docs/getting-started",
    # ...
]

def post_deploy_check(service, site_url):
    issues = []
    for url in STRATEGIC_URLS:
        result = inspect_url(service, site_url, url)
        if result["verdict"] != "PASS":
            issues.append({
                "url": url,
                "verdict": result["verdict"],
                "canonical": result["canonical"],
            })
    if issues:
        # send_slack_alert is your own notification helper
        send_slack_alert(
            f"⚠️ Post-deploy index check: {len(issues)} URLs need attention",
            issues,
        )
    return issues
```
2. Multi-Site Reporting Dashboard
If you manage multiple Search Console properties (www vs. non-www variants, subdomains, different TLDs), define a standardized weekly extraction function that produces consistent 7-day, 28-day, and 90-day comparison periods for each property. Feed results into a shared datastore and build a single dashboard.
Important scope note: Search Console treats https://www.example.com/, https://example.com/, and https://subdomain.example.com/ as separate URL-prefix properties. Define your property list explicitly; never assume www and non-www are combined. (A Domain property, passed to the API as sc-domain:example.com, does aggregate every subdomain and protocol.)
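The comparison periods can be generated once and reused across every property so all dashboards line up. A sketch (the window lengths and the 3-day data-lag offset are our choices):

```python
from datetime import date, timedelta

def comparison_windows(today=None, lag_days=3):
    """Build aligned (start, end) ISO date pairs for 7-, 28-, and
    90-day windows, all ending at the most recent date for which
    Search Console data should be complete (today minus lag_days)."""
    today = today or date.today()
    end = today - timedelta(days=lag_days)
    windows = {}
    for label, days in (("7d", 7), ("28d", 28), ("90d", 90)):
        start = end - timedelta(days=days - 1)  # inclusive range
        windows[label] = (start.isoformat(), end.isoformat())
    return windows
```

Feeding the same windows to query_search_analytics for every property guarantees the per-property numbers in the shared datastore are directly comparable.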
3. Automated Performance Drop Monitoring
This is the highest-value ongoing automation: detect CTR or position regressions before they show up as traffic losses in GA4.
```python
from datetime import date, timedelta

def detect_ctr_drops(service, site_url, threshold=0.20):
    """
    Compares CTR for each page: previous 7 days vs the 7 days before that.
    Alerts if any page drops by more than threshold (default 20%).
    """
    today = date.today()
    end_recent = (today - timedelta(days=3)).isoformat()   # respect 2-3 day lag
    start_recent = (today - timedelta(days=9)).isoformat()
    end_prior = (today - timedelta(days=10)).isoformat()
    start_prior = (today - timedelta(days=16)).isoformat()

    recent_rows = query_search_analytics(
        service, site_url, start_recent, end_recent, ["page"]
    )
    prior_rows = query_search_analytics(
        service, site_url, start_prior, end_prior, ["page"]
    )
    recent_map = {r["keys"][0]: r["ctr"] for r in recent_rows}
    prior_map = {r["keys"][0]: r["ctr"] for r in prior_rows}

    alerts = []
    for page, recent_ctr in recent_map.items():
        prior_ctr = prior_map.get(page)
        if prior_ctr and prior_ctr > 0:
            drop = (prior_ctr - recent_ctr) / prior_ctr
            if drop > threshold:
                alerts.append({
                    "page": page,
                    "prior_ctr": round(prior_ctr * 100, 2),
                    "recent_ctr": round(recent_ctr * 100, 2),
                    "drop_pct": round(drop * 100, 1),
                })
    alerts.sort(key=lambda x: x["drop_pct"], reverse=True)
    return alerts
```
Run this nightly via a cron job or CI schedule and route alerts to Slack, PagerDuty, or email depending on drop severity.
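Severity routing can be a simple threshold ladder over the drop_pct values the detector emits. A sketch (the tier cutoffs are illustrative, not prescriptive):

```python
def route_alert(drop_pct):
    """Map a CTR drop percentage to a notification channel.
    The cutoffs here are illustrative; tune them to your traffic
    volume and on-call tolerance."""
    if drop_pct >= 50:
        return "pagerduty"  # likely an incident: deindexing, noindex, etc.
    if drop_pct >= 30:
        return "slack"      # investigate within the day
    return "email"          # digest-level: watch the trend
```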
What to Monitor Beyond CTR
CTR drops are the most actionable signal, but there are four other patterns worth automating:
- Indexing regressions after deployments: pages that were indexed on Monday and fail inspection on Tuesday warrant immediate investigation (robots.txt change, noindex tag accidentally deployed, canonical pointing elsewhere)
- New content indexation speed: for content-heavy sites, track how long it takes new URLs to enter Search Console data; a useful indicator of crawl budget health
- Mobile vs. desktop position gaps: pull page + device weekly; flag URLs where mobile position is more than five ranks below desktop
- Directory-level aggregation: aggregate impressions and clicks by URL prefix (e.g., /blog/, /docs/, /product/) to identify which content areas are gaining or losing ground over time
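The directory-level aggregation is a small fold over page rows. A sketch, assuming rows were fetched with dimensions=["page"] (the "(other)" bucket name is our convention):

```python
from urllib.parse import urlparse

def aggregate_by_prefix(rows, prefixes):
    """Sum clicks and impressions per URL path prefix. `rows` are
    Search Analytics rows queried with dimensions=["page"]; pages
    matching no prefix fall into an "(other)" bucket."""
    totals = {p: {"clicks": 0, "impressions": 0} for p in prefixes}
    totals["(other)"] = {"clicks": 0, "impressions": 0}
    for row in rows:
        path = urlparse(row["keys"][0]).path
        bucket = next((p for p in prefixes if path.startswith(p)), "(other)")
        totals[bucket]["clicks"] += row.get("clicks", 0)
        totals[bucket]["impressions"] += row.get("impressions", 0)
    return totals
```

Run it weekly and store the totals; the week-over-week deltas per prefix are what reveal a content area losing ground.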
Robots.txt: Verifying Crawler Access
Before any audit, confirm your robots.txt is not inadvertently blocking Googlebot or AI crawlers from strategic sections:
```
User-agent: Googlebot
Allow: /

User-agent: GPTBot
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: Google-Extended
Allow: /

Sitemap: https://example.com/sitemap.xml
```
The URL Inspection API will surface crawl anomalies caused by robots.txt rules, but it is faster to audit the file directly before debugging individual URL inspection results.
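That direct audit can itself be automated with the standard library's robotparser: feed it the raw robots.txt contents and check your strategic URL and user-agent combinations, no network access required.

```python
from urllib.robotparser import RobotFileParser

def audit_robots(robots_txt, urls, agents):
    """Return (agent, url) pairs that robots.txt blocks.
    `robots_txt` is the raw file contents as a string."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    blocked = []
    for agent in agents:
        for url in urls:
            if not parser.can_fetch(agent, url):
                blocked.append((agent, url))
    return blocked
```

Wiring this into the post-deployment check catches an accidental Disallow before it shows up as a crawl anomaly in URL Inspection results.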
Practical Quotas and Error Handling
At 1,200 requests per minute per property, the Search Console API is generous for monitoring use cases, but fine-grained queries (page × query × date × country) can hit the 25,000-row limit per request faster than expected. Always implement exponential backoff for 429 responses:
```python
import time

from googleapiclient.errors import HttpError

def execute_with_retry(request, max_retries=5):
    for attempt in range(max_retries):
        try:
            return request.execute()
        except HttpError as e:
            # Retry rate-limit (429) and transient server errors with
            # exponential backoff: 1s, 2s, 4s, 8s, 16s
            if e.resp.status in (429, 500, 502, 503):
                time.sleep(2 ** attempt)
            else:
                raise
    raise RuntimeError("Max retries exceeded for Search Console API request")
```
Routing all .execute() calls through this retry wrapper makes your monitoring pipeline resilient to the transient quota errors that occur when multiple jobs run simultaneously.