What Scraping 1.1 Million Search Results Taught Us About SEO

Authority Hacker conducted a study scraping and analyzing 1.1 million Google search results to identify patterns in what ranks well. The research examined page-level factors, domain characteristics, and SERP features across diverse query types, providing insights into both traditional ranking factors and the changing landscape of search results.

Domain Age and Authority

The study found that domain age correlated with higher rankings, though the relationship was more nuanced than simply "older is better." Domains that had been actively building content and links over many years ranked better than older domains that had been neglected. The combination of age plus ongoing development proved most important.

Domain authority metrics showed expected correlations, but the research noted significant variation. Lower-authority domains regularly appeared in top positions for specific queries where they had strong topical relevance. This suggests that topical authority in specific niches can compete with broad domain authority in the right contexts.

Content Characteristics

First-page results averaged longer content than second-page results, but with significant variance by query type. Informational queries showed the strongest length correlation, while navigational and some transactional queries showed shorter top results. The key insight: content length should match user intent and query type rather than following universal targets.
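
One way to check this against your own data is to group word counts by intent and by results page. The following is a minimal sketch, not the study's method: the `rows` structure, its field names, and every value in it are hypothetical and invented purely for illustration.

```python
from collections import defaultdict
from statistics import median

# Hypothetical scraped rows; values are illustrative only, not study data.
rows = [
    {"intent": "informational", "position": 3, "word_count": 2100},
    {"intent": "informational", "position": 7, "word_count": 1900},
    {"intent": "informational", "position": 14, "word_count": 1200},
    {"intent": "navigational", "position": 1, "word_count": 350},
    {"intent": "navigational", "position": 12, "word_count": 400},
]


def median_length_by_intent_and_page(rows, page_size=10):
    """Return {(intent, page): median word count}; page 1 covers positions 1..page_size."""
    buckets = defaultdict(list)
    for row in rows:
        page = (row["position"] - 1) // page_size + 1
        buckets[(row["intent"], page)].append(row["word_count"])
    return {key: median(values) for key, values in buckets.items()}


summary = median_length_by_intent_and_page(rows)
for (intent, page), length in sorted(summary.items()):
    print(f"{intent:15s} page {page}: median {length} words")
```

The sketch uses the median rather than the mean because a handful of very long pages can otherwise dominate a bucket; swap in `statistics.mean` if you want to mirror an "average length" comparison.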

Keyword usage in titles and headings remained correlated with rankings, but exact-match optimization appeared less important than topical relevance. Pages ranking for competitive terms typically covered the topic comprehensively rather than repeating exact keywords. The research supports semantic optimization over keyword-focused approaches.

SERP Feature Prevalence

Featured snippets appeared in approximately 12% of analyzed results, with their presence varying dramatically by query type. "How to" and "what is" queries showed much higher featured snippet rates. Knowledge panels, People Also Ask boxes, and image carousels appeared with varying frequency depending on query intent and topic.
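
To see how snippet rates differ by query pattern in your own keyword set, you can bucket queries by their leading phrase and compute a rate per bucket. This is a rough sketch under stated assumptions: the `serps` list, the prefix list, and the bucketing rule are all hypothetical, and the sample values are made up rather than drawn from the study.

```python
from collections import Counter

# Hypothetical (query, has_featured_snippet) pairs; illustrative only.
serps = [
    ("how to prune roses", True),
    ("how to fix a flat tire", False),
    ("what is a canonical tag", True),
    ("best running shoes", False),
    ("facebook login", False),
]

# Assumed question prefixes to track; extend as needed.
PREFIXES = ("how to", "what is")


def query_bucket(query):
    """Return the matching prefix for a query, or "other" if none matches."""
    normalized = query.lower().strip()
    for prefix in PREFIXES:
        if normalized.startswith(prefix + " "):
            return prefix
    return "other"


def snippet_rate_by_bucket(serps):
    """Return {bucket: share of queries in that bucket showing a featured snippet}."""
    totals, with_snippet = Counter(), Counter()
    for query, has_snippet in serps:
        bucket = query_bucket(query)
        totals[bucket] += 1
        if has_snippet:
            with_snippet[bucket] += 1
    return {bucket: with_snippet[bucket] / totals[bucket] for bucket in totals}


rates = snippet_rate_by_bucket(serps)
for bucket, rate in sorted(rates.items()):
    print(f"{bucket:8s} {rate:.0%}")
```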

The increasing prevalence of SERP features means traditional "10 blue links" analysis understates competition for visibility. Top organic positions may appear below multiple SERP features, making actual visibility lower than position numbers suggest. When sizing a keyword opportunity, account for the SERP features likely to sit above the organic results, not just the organic position you expect to reach.
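
The gap between organic position and on-page placement can be made concrete by counting every block that renders above a result. The sketch below assumes a hypothetical `layout` list describing one SERP from top to bottom; the block names are illustrative labels, not an official taxonomy, and treating each feature as a single slot is a simplification, since real features vary widely in height.

```python
def visual_slots(layout):
    """Map each organic position to its 1-based slot on the page.

    Every block in `layout`, organic or not, occupies one slot, so a
    result's slot is its organic position plus the features above it.
    """
    slots = {}
    organic_position = 0
    for slot, block in enumerate(layout, start=1):
        if block == "organic":
            organic_position += 1
            slots[organic_position] = slot
    return slots


# Hypothetical SERP layout, top to bottom; illustrative only.
layout = [
    "featured_snippet",
    "people_also_ask",
    "organic",
    "image_carousel",
    "organic",
    "organic",
]

slots = visual_slots(layout)
for position, slot in slots.items():
    print(f"organic #{position} renders in slot {slot}")
```

In this made-up layout the first organic result lands in the third slot, which is the kind of discount worth applying before estimating traffic from a target position.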

Methodology Considerations

The researchers emphasized three limitations: correlation studies can't prove causation, scraping captures a point-in-time snapshot of rankings that shift continuously, and the choice of queries shapes the findings. Treat such research as directional, and validate it against your own data and tests.
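
For validating a claimed correlation against your own rankings, a rank correlation is a common choice because position is an ordinal value. The sketch below computes Spearman's rho with the standard library only. It is an assumption about how such a check might be done, not a description of the study's analysis, and the `positions` and `referring_domains` values are invented for illustration.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation; raises ZeroDivisionError if either input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def spearman(xs, ys):
    """Spearman's rho: Pearson correlation of the average ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Hypothetical data for one query; illustrative only.
positions = [1, 2, 3, 4, 5]
referring_domains = [900, 400, 450, 120, 60]

rho = spearman(positions, referring_domains)
print(f"Spearman rho: {rho:.2f}")
```

Because position 1 is the best rank, a negative rho here means that higher values of the factor go with better rankings. Even a strong rho on a sample this small says nothing about causation.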

Source: Authority Hacker
