InPublishing: How the world of search is changing

The rollercoaster of algorithm updates and spam policy changes has brought seismic change to the publishing industry. Let’s start with the site reputation abuse (SRA) policy. In March 2024, Google released the first core update of the year, one of the search engine’s most complex ever, ‘involving changes to multiple core systems’ and marking an ‘evolution’ in how the ‘helpfulness of content’ is identified. At the same time, it gave publishers a heads-up that it was also introducing some new spam policies.

As reputable publishers, we don’t often need to pay that much attention to spam updates, but this one was different. Google defined site reputation abuse as ‘when third-party pages are published with little or no first-party oversight or involvement, where the purpose is to manipulate Search rankings by taking advantage of the first-party site’s ranking signals’¹. With SRA, Google explicitly singled out news websites hosting coupons provided by a third party, a common practice for many publishers. While some sites took steps to clarify their relationships with third parties and explain why working with them provided a better service to their readers, Google doubled down, equating trying to rank well in search results with bad user experience, as if ranking well and serving readers well were mutually exclusive. Ultimately, the result was the same for all of us: with the vague but menacing threat of a potential infection spreading to the rest of our websites, publishers had no choice but to block the affected subdomains and subfolders from Google completely, effectively shutting down parts of the business and leading to significant revenue loss.

SRA update

We didn’t know it then, but this was just the start of a series of attacks on publisher traffic and revenue. The most recent volley came on November 19th, ten days before Black Friday, the biggest shopping event of the year, with an update to the SRA policy. It completely reversed the March guidance by announcing that it didn’t matter how much first-party involvement there was: any third-party content was a potential violation of the policy and would result in a dreaded manual action². There was no official warning of this update, although the SEO community had been in uproar for weeks about the dramatic declines in visibility for publisher affiliate content in the US; CNN, Time, WSJ and Forbes had all been noticeably demoted in Google for their buying guide sections. And then, immediately following the release of the updated guidance, the manual actions started rolling out. At the time of writing, the first salvo has been unleashed but the industry is braced for further disruption.

There are many things about these updates which feel wrong and unfair, not least the fact that some of the sites hit with manual actions weren’t even in violation of the policy: their content is produced entirely in-house with zero third-party involvement. And yet, somewhat pointedly, Google said in the same update: ‘we don’t simply take a site’s claims about how the content was produced at face value’. Amid all the kerfuffle and confusion, the elephant in the room was the fundamental question: what’s wrong with third-party content? If it’s providing a valuable service, aligns with brand values and is something users want to consume, what really is the issue? In North America, the News Media Alliance has written an open letter to the Department of Justice and Federal Trade Commission expressing its concerns about Google’s ‘overbroad, unilateral change’ in policy and the impact it has on ‘critical revenue / traffic streams’ for already struggling news outlets³.

What all this has confirmed, if there was any lingering doubt, is that the relationship between news publishers and Google has fundamentally broken down. When the SRA policy was first unveiled, there were many within the industry who were convinced they would not be impacted; then when the penalties started to hit, we believed if we could just explain to Google how we collaborated with third parties and show them how integrated they were within our systems and the immense amount of oversight we had over them, then the Google spam moderators would see the error of their ways and the traffic would return. How naive that seems now. In fact, all we were doing was giving Google a guided tour of our internal operations and all the ammunition they needed to take further action in the future.

Further disruption and evidence of the breakdown in trust has arisen from the proliferation of chatbots and the furore over publisher content being crawled and used for training large language models (LLMs) even when explicitly blocked from doing so. Two camps — sign or sue — are starting to establish themselves, with passionate arguments on both sides. It’s too soon to know how this is going to play out but one thing most of us can agree on is that AI is only as good as the resources it’s trained on. Without our content, its full potential feels unachievable.

The impact of AI

Google’s entry into the generative AI space has not been smooth. In 2023, in a move often criticised as a rushed-out not-quite-ready response to Microsoft’s Bing Chat, Google unveiled Bard (now Gemini) and almost immediately got into hot water when a viral post showed Gemini’s image generation tool creating a picture of German WW2 soldiers incorrectly including an Asian woman and a black man.

Then in May 2024, it was the turn of AI Overviews (formerly known as Search Generative Experience), with its advice to put glue on pizza, eat rocks and so on. On the one hand, it was embarrassing for Google, and the rollout of AI Overviews (AIOs) was scaled back. On the other, it highlighted the rise of a single player, Reddit, to become the third most visible domain in Google Search after Wikipedia and Amazon in the US and the fourth in the UK, according to Sistrix⁴, an increase of over 1,000% since August 2023. Reddit’s dominance in search started in the summer of 2023 but seemed to skyrocket in March 2024, not long after the announcement of a deal with Google (reportedly worth $60m per year) to allow the search engine to access its data for AI training.

Google described the partnership as facilitating ‘more content-forward displays of Reddit information’. If this meant ranking Reddit prominently for almost every search query then that has certainly happened. With the benefit of hindsight, this move towards a results page focused on ‘real-life experience’ has been gaining momentum for some time. The 2022 Helpful Content Update wasn’t the first time Google told publishers they should focus on users, of course, but it did mark a turning point for SEO.

With the emphasis firmly on showcasing content ‘written by people, for people... rather than content made primarily to gain search engine traffic’, Google’s attempts to tweak and refine their systems and signals to better identify and promote this holy grail of user-centric content became relentless. First came the ‘Discussions and forums’ feature, a carousel of UGC which showcases ‘helpful content from a variety of popular forums and online discussions across the web’⁵. Mumsnet and Quora do pop up but the list of sources is almost always topped by Reddit. Then in March 2023, Google launched the Perspectives carousel, which aims to showcase a diverse range of voices from journalists and experts — again, you won’t have to look for long before you see a Reddit thread pop up. Next came ‘hidden gems’ — an enhancement aimed at surfacing more content from, no surprises, forum thread comments, blog posts and articles with unique topical expertise⁶.

This move towards prioritising content that demonstrates personal experience and perspectives⁷, in combination with Google’s focus on AI to better answer individual user queries, seems to be taking us firmly down the road of personalisation, with the very real consequence that there are just fewer clicks to go around. If AIOs don’t immediately answer the question directly in the search result, there is a whole new smorgasbord of surfaces to scroll through and click on to keep you on the results page and away from the publishers who are often the original source of the content in the first place.

Zero-click searches are not a new concept to anyone who has worked in Search over the last few years. They are a real and concerning development that runs the risk of bringing down the quality of content as publishers strive to make headlines as clickable and attention-grabbing as possible. For the top 500 non-brand keywords of the top news publishers in the UK and the US, analysis by SimilarWeb shows that between 54% and 64% of searches result in zero clicks. In the States, where the rollout of AIOs is further advanced, this trend reached a record high in November, with no sign the trajectory will level out.

What publishers need to do

What publishers can actually do in the face of Reddit domination will vary depending on the type of publication. On the one hand, Reddit is now ranking in Google for queries previously owned by media brands and poses a threat; on the other, being active on a platform that is now so visible in Google presents an opportunity to drive significant additional referral traffic. But managing this is a completely different proposition to posting content on Facebook or X and some brands will struggle to make inroads with fiercely protective moderators.

There’s been a lot written over the past few months about whether publishers should be taking legal action against generative AI companies for using their content or agreeing deals with them. In the high-profile ‘suing’ camp are Mumsnet and The New York Times, which are both taking action against OpenAI, while News Corp has filed a lawsuit against Perplexity. Far more have agreed partnerships which, to varying degrees, give OpenAI (and other platforms) access to their content for training models. In return, as well as payment, publishers’ content can be cited in the AI summaries and attributed as original sources.

The jury is out on whether being cited with a link in AIOs or chatbot answers is really going to lead to more traffic. Google claims the overviews generate more clicks for publishers than traditional search results. Numerous studies analysing the impact of AIOs suggest otherwise. And of course, there’s no easy way to measure it because Google does not provide any insight into it in its analytics tools.

Understanding the impact of AIOs remains a priority, however, so that means using third-party tools plus your own data to track not just which keywords the overviews appear for but how much traffic you’re losing (or gaining) when they are generated. Analysing your content not just by category but by the user intent behind the queries and measuring this against user satisfaction with the AI-generated result will give you insights into which types of content are most resilient and where AI capabilities do not meet user expectations.
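One way to put this kind of tracking into practice is sketched below in Python. It is a minimal illustration rather than a definitive method: the `QueryRow` fields, the intent labels and the sample figures are all assumptions invented for the example, and the `has_aio` flag would have to come from a third-party tracker. The sketch compares click-through rate for queries with and without an AI Overview, grouped by user intent.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class QueryRow:
    """One keyword's performance (hypothetical schema for this sketch)."""
    keyword: str
    intent: str       # label you assign, e.g. "informational"
    has_aio: bool     # did a third-party tracker see an AI Overview?
    impressions: int
    clicks: int


def ctr_by_intent(rows):
    """Return click-through rate with and without AIOs for each intent."""
    # totals[intent][has_aio] holds [clicks, impressions]
    totals = defaultdict(lambda: {True: [0, 0], False: [0, 0]})
    for row in rows:
        bucket = totals[row.intent][row.has_aio]
        bucket[0] += row.clicks
        bucket[1] += row.impressions

    report = {}
    for intent, split in totals.items():
        rates = {}
        for flag, name in ((True, "with_aio"), (False, "without_aio")):
            clicks, impressions = split[flag]
            # None marks "no data" so it is not confused with a 0% rate
            rates[name] = clicks / impressions if impressions else None
        if rates["with_aio"] is None or rates["without_aio"] is None:
            rates["delta"] = None
        else:
            rates["delta"] = rates["with_aio"] - rates["without_aio"]
        report[intent] = rates
    return report


# Invented sample data, for illustration only
sample_rows = [
    QueryRow("how to descale a kettle", "informational", True, 1000, 50),
    QueryRow("what is a heat pump", "informational", False, 1000, 200),
    QueryRow("best air fryer deals", "transactional", True, 500, 100),
    QueryRow("buy air fryer", "transactional", False, 500, 110),
]

for intent, rates in ctr_by_intent(sample_rows).items():
    print(intent, rates)
```

A negative `delta` for an intent suggests that queries showing an overview earn fewer clicks per impression than those without one. It is a correlation rather than proof of cause, so it is best read alongside your own traffic data and over a reasonable time period.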

This is where the opportunity lies to establish a defence. David Buttle, from DJB Strategies, has undertaken some fascinating research into how AI will affect different types of publishers and the actions we can take to mitigate the risks. Ultimately, it comes down to providing trusted content that helps readers make better decisions and offering a distinctive voice: ‘AI summarisation cannot supplant reading, first-hand, that columnist around which national, local or sectoral debate centres. Be the provider of that.’⁸

The challenges and opportunities

As painful as it can be to admit, the enforced self-reflection necessitated by Google’s rule changes has been valuable. We’ve had to think hard about how to best serve our readers — what information is really useful to them, what unique expertise can we draw on to deliver genuine insights that will help them make informed decisions. I do not agree with the fundamental principle behind the site reputation abuse update — it’s a cynical, brazen move which ultimately penalises publishers and leads to a worse user experience on the web — but I do see this moment as an opportunity to reset, redefine and invest in quality.

The proliferation of AIOs and chatbots only compounds this further — the less reliant publishers are on Google for traffic, the more protected our brands are against future headwinds. That’s easier said than done of course and paints a rather bleak picture. The era of easy traffic is well and truly over and the only guarantee for the next twelve months is that there will be further change and more disruption.

Footnotes

1. Google Search Central, March 5, 2024: What web creators should know about our March 2024 core update and new spam policies

2. Google Search Central, November 19, 2024: Updating our site reputation abuse policy

3. News Media Alliance, December 2, 2024: Open letter to United States Department of Justice and Federal Trade Commission

4. Sistrix, October 16, 2024: Reddit Domain Analysis — Valuation, Competitors, and Stability

5. The Keyword, September 28, 2022: Bringing more voices to Search

6. The Keyword, May 10, 2023: Learn from others’ experiences with more perspectives on Search

7. The Keyword, November 15, 2023: New ways to find just what you need on Search

8. David Buttle, Press Gazette, May 23, 2024: Google AI Overviews breaks search giant’s grand bargain with publishers

This article was first published in InPublishing magazine. If you would like to be added to the free mailing list to receive the magazine, please register here.
