Two developments have impacted how Google goes about indexing. As the open web has shrunk, Google must crawl through massive content platforms like YouTube, Reddit, and TikTok, which are often built on "complicated" JS frameworks, to find new content. At the same time, AI is changing the underlying dynamics of the web by making mediocre and poor content redundant.
In my work with some of the biggest sites on the web, I recently noticed an inverse relationship between indexed pages and organic traffic. More pages aren't automatically bad, but they often don't meet Google's quality expectations. Or, put better, the definition of quality has changed. The stakes for SEOs are high: grow too aggressively, and your whole domain might suffer. We need to change our mindset about quality and develop monitoring systems that help us understand domain quality on a page level.
Satiated
Google has changed the way it treats domains, starting around October 2023: no example showed the inverse relationship before October. Also, Google had indexing issues when it launched the October 2023 Core algorithm update, just as happened again during the August 2024 update.
Before the change, Google indexed everything and prioritized the highest-quality content on a domain. Think of it like gold panning, where you fill a pan with gravel, soil and water and then swirl and stir until only valuable material remains.
Now, a domain and its content have to prove themselves before Google even tries to dig for gold. If the domain has too much low-quality content, Google might index only some pages, or none at all in extreme cases.
One example is doordash.com, which added many pages over the last 12 months and lost organic traffic in the process. At least some, maybe all, of the new pages didn't meet Google's quality expectations.

But why? What changed? I reason that:
- Google wants to save resources and costs as the company moves to an operational-efficiency mindset.
- Partial indexing is more effective against low-quality content and spam. Instead of indexing and then trying to rank new pages of a domain, Google observes the overall quality of a domain and treats new pages with corresponding skepticism.
- If a domain repeatedly produces low-quality content, it doesn't get a chance to pollute Google's index further.
- Google's bar for quality has increased because there is so much more content on the web, but also to optimize its index for RAG (grounding AI Overviews) and model training.
This emphasis on domain quality as a signal means you have to change the way you monitor your website to account for quality. My guiding principle: "If you can't add anything new or better to the web, it's likely not good enough."
Quality Food
Domain quality is my term for the ratio of indexed pages that meet Google's quality standard vs. those that don't. Note that only indexed pages count toward quality. The maximum share of "bad" pages before Google reduces traffic to a domain is unclear, but we can certainly see when it's met:



I define domain quality as a signal composed of three areas: user experience, content quality and technical condition (a rough scoring sketch follows this list):
- User experience: are users finding what they're looking for?
- Content quality: information gain, content design, comprehensiveness
- Technical condition: duplicate content, rendering, on-page content for context, "crawled / discovered - currently not indexed" statuses, soft 404s
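To make the ratio concrete, here is a minimal sketch (not Google's actual logic, just an assumption-laden heuristic) that scores indexed pages on a few proxy signals for the three areas and computes the share that clears a quality bar. The column names (`engaged_sessions`, `word_count`, `is_soft_404`, etc.) and thresholds are hypothetical stand-ins for whatever your analytics and crawl exports provide.

```python
import pandas as pd

# Hypothetical page-level export: one row per *indexed* page, joined from
# analytics (engagement), the CMS (content), and a crawler (technical state).
pages = pd.DataFrame([
    {"url": "/guide-a", "engaged_sessions": 420, "sessions": 600,
     "word_count": 1800, "is_duplicate": False, "is_soft_404": False},
    {"url": "/tag/red-shoes-cheap", "engaged_sessions": 3, "sessions": 90,
     "word_count": 120, "is_duplicate": True, "is_soft_404": False},
    {"url": "/old-landing", "engaged_sessions": 1, "sessions": 40,
     "word_count": 300, "is_duplicate": False, "is_soft_404": True},
])

def meets_quality_bar(row) -> bool:
    """Crude proxies for the three areas: UX, content quality, technical condition."""
    ux_ok = row["sessions"] > 0 and row["engaged_sessions"] / row["sessions"] >= 0.4
    content_ok = row["word_count"] >= 500            # stand-in for information gain
    technical_ok = not row["is_duplicate"] and not row["is_soft_404"]
    return ux_ok and content_ok and technical_ok

pages["good"] = pages.apply(meets_quality_bar, axis=1)
domain_quality = pages["good"].mean()  # share of indexed pages meeting the bar
print(f"{domain_quality:.0%} of indexed pages meet the quality bar")
```

The specific cutoffs (40% engagement, 500 words) are placeholders; the useful part is tracking the ratio over time, not the absolute numbers.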

A sudden spike in indexed pages usually indicates a technical issue like duplicate content from parameters, internationalization or broken pagination. In the example below, Google immediately reduced organic traffic to this domain when a pagination logic broke, causing lots of duplicate content. I've never seen Google react so quickly to technical bugs, but that's the new state of SEO we're in.
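A simple way to catch such spikes early, assuming you keep a daily indexed-page count (e.g., exported from Search Console's page indexing report), is to alert when the latest count jumps well beyond its recent baseline. A rough sketch:

```python
from statistics import mean

def detect_index_spike(daily_counts, window=28, threshold=1.5):
    """Return True if the latest indexed-page count exceeds the trailing
    average of the previous `window` days by more than `threshold`x."""
    if len(daily_counts) <= window:
        return False  # not enough history to establish a baseline
    baseline = mean(daily_counts[-window - 1:-1])
    return daily_counts[-1] > baseline * threshold

# Example: a broken pagination suddenly doubles the number of indexed URLs.
history = [10_000] * 28 + [21_500]
if detect_index_spike(history):
    print("Indexed pages spiked - check for parameter/pagination duplicates")
```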

In other cases, a spike in indexed pages indicates a programmatic SEO play where the domain launched lots of pages on the same template. When the content quality on programmatic pages isn't good enough, Google quickly turns off the traffic faucet.


In response, Google often reduces the number of keywords ranking in the top 3 positions. The number of keywords ranking in other positions is often relatively stable.



Size compounds the problem: domain quality tends to be a bigger issue for larger sites, even though smaller ones can also be affected.
Adding new pages to your domain is not bad per se. You just want to be careful about it. For example, publishing new thought leadership or product marketing content that doesn't directly target a keyword can still be very valuable to site visitors. That's why measuring engagement and user satisfaction on top of SEO metrics is critical.
Diet Plan
The most important way to keep the "fat" (low-quality pages) off and reduce the risk of getting hit by a Core update is to put the right monitoring system in place. It's hard to improve what you don't measure.
At the heart of a domain quality monitoring system is a dashboard that tracks metrics for each page and measures them against the average. If I could pick only three metrics, I'd measure inverse bounce rate, conversions (soft and hard), and clicks + ranks by page type per page against the average. Ideally, your system alerts you when a spike in crawl rate happens, especially for new pages that weren't crawled before.
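As a rough illustration of that dashboard logic (the field names and thresholds here are assumptions, not a prescribed setup), you could compare each page's core metrics against the average for its page type and flag the laggards:

```python
import pandas as pd

# Hypothetical per-page export: inverse bounce rate (1 - bounce rate),
# soft + hard conversions, and clicks from search, grouped by page type.
df = pd.DataFrame([
    {"url": "/blog/a", "page_type": "blog", "inverse_bounce_rate": 0.62, "conversions": 14, "clicks": 900},
    {"url": "/blog/b", "page_type": "blog", "inverse_bounce_rate": 0.18, "conversions": 0, "clicks": 35},
    {"url": "/product/x", "page_type": "product", "inverse_bounce_rate": 0.71, "conversions": 48, "clicks": 2100},
])

metrics = ["inverse_bounce_rate", "conversions", "clicks"]

# Compare every page against the average of its page type.
type_avg = df.groupby("page_type")[metrics].transform("mean")
below_avg = (df[metrics] < type_avg).sum(axis=1)

# Flag pages that trail the average on at least two of the three metrics.
df["flagged"] = below_avg >= 2
print(df.loc[df["flagged"], ["url", "page_type"]])
```

The same pattern works for the crawl-rate alert: keep a per-URL crawl count from your log files and flag new URLs whose Googlebot hits suddenly jump above the baseline.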
As I write in How the best companies measure content quality:
1/ For production quality, measure metrics like SEO editor score, Flesch/readability score, or # of spelling/grammatical errors
2/ For performance quality, measure metrics like # of top 3 ranks, ratio of time on page vs. estimated reading time, inverse bounce rate, scroll depth or pipeline value (see the sketch after this list)
3/ For preservation quality, measure performance metrics over time and year-over-year
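For two of these, here is a minimal sketch: a Flesch reading-ease score for production quality and the ratio of actual time on page to estimated reading time for performance quality. The 200-words-per-minute reading speed and the vowel-group syllable counter are simplifying assumptions, not an exact implementation.

```python
import re

def flesch_reading_ease(text: str) -> float:
    """Flesch reading ease: 206.835 - 1.015*(words/sentences) - 84.6*(syllables/words)."""
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z']+", text)
    # Rough syllable count: runs of vowels per word, at least one per word.
    syllables = sum(max(1, len(re.findall(r"[aeiouy]+", w.lower()))) for w in words)
    n = max(1, len(words))
    return 206.835 - 1.015 * (n / sentences) - 84.6 * (syllables / n)

def read_ratio(avg_time_on_page_sec: float, word_count: int, wpm: int = 200) -> float:
    """Ratio of actual time on page to estimated reading time (1.0 = read in full)."""
    estimated_sec = word_count / wpm * 60
    return avg_time_on_page_sec / estimated_sec if estimated_sec else 0.0

print(round(flesch_reading_ease("Google rewards sites that stay fit. Short sentences help."), 1))
print(round(read_ratio(avg_time_on_page_sec=75, word_count=1200), 2))  # ~0.21
```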
Ignore pages like Terms of Service or About Us when monitoring your site because their function is unrelated to SEO.
Gain Phase
Monitoring is the first step to understanding your site's domain quality. You don't always need to add more pages to grow. Often, you can improve your existing page inventory, but you need a monitoring system to figure this out in the first place.
Adidas is a good example of a site that was able to grow organic traffic just by optimizing its existing pages.

Another example is Redfin, which maintained a consistent number of pages while significantly growing organic traffic.

Quoting Redfin's Sr. Director of Product Growth in my Redfin Deep Dive about meeting the right quality bar:
"Bringing our local expertise to the website – being the authority on the housing market, answering what it's like to live in an area, offering a complete set of for-sale and rental inventory across the United States.
Maintaining technical excellence – our website is large (100m+ pages) so we can't sleep on things like performance, crawl health, and data quality. Often the least 'sexy' efforts can be the most impactful."
Companies like LendingTree or Progressive saw significant gains by reducing pages that didn't meet their quality standards (see screenshots from the Deep Dives below).


Conclusion
Google rewards sites that stay fit. In 2020, I wrote about how Google's index might be smaller than we think. Index size used to be a goal early on. But today, it's less about getting as many pages indexed as possible and more about having the right pages. The definition of "good" has evolved. Google is pickier about who it lets into the club.
In the same article, I put forward the hypothesis that Google would switch to an indexing API and let site owners take responsibility for indexing. That hasn't come to fruition, but you could say Google is using more APIs for indexing:
- The $60M/y agreement between Google and Reddit provides one-tenth of Google's search results (assuming Reddit is present in the top 10 for almost every keyword).
- In e-commerce, where more organic listings show up higher in search results, Google relies more on the product feed in Merchant Center to index new products and groom its Shopping Graph.
- SERP features like Top Stories, which are critical in the News industry, are small services with their own indexing logic.
Looking down the road, the big question about indexing is how it will morph as more users search through AI Overviews and AI chatbots. Assuming LLMs will still need to be able to render pages, technical SEO work remains essential. However, the incentive for indexing shifts from surfacing web results to training models. As a result, the value of pages with nothing new to offer will be even closer to zero than it is today.