The Google API Leak for Everyone Else

If You Don’t Follow SEO You Probably Didn’t Hear

Google recently leaked API documentation revealing how their Search Engine, YouTube, and Chrome work together.

This could impact the Rumble vs. Google case. It shows exactly how Google interfered with “Election Misinformation” and “Covid Misinformation”, and how it hides other manipulations as well.

This summary is for everyone else outside of the Digital Marketing and SEO spaces. This needs attention as Rumble and the honest parts of the Government need to be asking pointed questions about this leak.

A very brief history

Google took over as a search engine with its unique method of ranking sites. Initially called “BackRub”, this technology was expanded on and improved over the years.

However, until recently, all of Google’s ranking factors boiled down to two things:

1. Results were democratic – the collective internet was eventually the deciding factor of how Google ranked information, not Google directly picking winners.

2. Searcher Intent – Google served as a vehicle for fulfilling what you were trying to accomplish with your search, not controlling your actions.

They have been slowly moving away from that core model in recent years, and what we are seeing today is just the latest in a long line of scummy moves.

SEOs have been in an adversarial relationship with Google for a long time. Granted, some SEOs do try to game the system (commonly called Black Hat SEOs), but most SEOs are looking to do the same thing Google once said it did: serve searcher intent. SEOs try to understand Google and other search engines to do that.

We want to create buzz and strong social signals, and serve searcher intent so that people can find what they are looking for.

Google once said that was its mission.

Now I doubt it.

This is about control.

You Guys Are Leaving Something Out

The API was first leaked publicly by Rand Fishkin on May 27th and seems to have been the live code as of March of this year. Michael King also assisted in the initial analysis of the leak.

The thing that immediately caught my eye was the section of Fishkin’s article discussing Covid and the Election whitelists.

References in several places to flags for “isCovidLocalAuthority” and “isElectionAuthority” further suggests that Google is whitelisting particular domains that are appropriate to show for highly controversial or potentially problematic queries.

I have obviously been digging into the API leak myself and have been finding things as well. However, most of the SEO community is ignoring the wider implications of the API documentation due to their political tendencies. Outside of the SEO space, this story hasn’t been getting any traction. It’s most likely being suppressed as much as possible.

The leak is preserved on HexDocs and can be accessed in full there.

Caveats About This Leak

None of the revealed information tells us how things are “weighted” or the strength of each factor. It’s like a skeleton with no muscles or organs.

Many documents refer back to Google Intranet documentation, leaving us with a partial picture.

I furthermore suspect that there’s more that was not uploaded that factors into the rankings. Assuming that this is everything would be folly. Nothing in here seems to reference the Android ecosystem. Knowing how important mobile search is, this tells us that we are still missing big pieces of the puzzle.

Here are the key takeaways that I think the general public, independent journalists, and Rumble specifically would be most interested in.

Key points from the leak

Google uses the “isVideoFocusedSite” attribute to potentially hinder competition from sites like Rumble.

If 50% or more of a website’s pages have video, the site gets this tag. Right now we don’t know for sure what this does; however, it may flag the site for later review and marking as a “Known Video Hosting Domain”.
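To make the threshold idea concrete, here is a minimal sketch of how a flag like this could be computed. This is not Google’s code. The function names, the URL-pattern check, and the exact cutoff are all assumptions made up for illustration; the leak only names the attribute.

```python
from urllib.parse import urlparse

# Hypothetical cutoff, taken from the "50% or more" description above.
VIDEO_FOCUS_THRESHOLD = 0.5


def looks_like_watch_page(url: str) -> bool:
    """Crude stand-in for real video-page detection (an assumption)."""
    path = urlparse(url).path.lower()
    return "/watch" in path or "/video" in path


def is_video_focused_site(urls: list[str]) -> bool:
    """Return True when at least half of a site's URLs look like video pages."""
    if not urls:
        return False
    video_pages = sum(1 for url in urls if looks_like_watch_page(url))
    return video_pages / len(urls) >= VIDEO_FOCUS_THRESHOLD
```

The point of the sketch is how little it takes: one ratio and one cutoff are enough to sort every site into “video site” or “not video site”, and whatever happens after that is invisible from the outside.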

How Google treats video hosting domains has not been surfaced yet. This is one aspect I’m digging through the documentation to learn more about.

But ask yourself this: do you really trust Google, which owns and runs the largest video hosting platform, to not flag and demote websites that compete with YouTube?

Covid and Election labels boost certain sites deemed “trustworthy” on these topics.

These are probably best thought of as whitelists that pick and choose who to surface for searches around Covid or the elections. Right now, I’m sure that there are more running: if not outright whitelists, then Twiddlers like the ones we’ll talk about later.

For those of you who have been hit with Channel Strikes on YouTube over one of these topics: this is part of the machine used to target you.

This was also how voices pushing back against the major narrative were silenced. Between the difficulty of breaking into the News section and how Google evaluates any site related to a Your Money, Your Life topic, it was an uphill battle getting awareness out there about the risks. And here we see Google putting their finger on the scale.

NlpSemanticParsingLocalLocationConstraint

"Twiddlers" can alter search results before serving you content

Now most of them have a purpose, like giving you locally relevant search results. However, they can be used to manipulate the organic results in inorganic ways.

Of the Twiddlers I have identified so far, the Skin Tone Twiddler is the most hilarious to me. I remember a while back when people were starting to notice that search results were getting “diverse”. This Twiddler is the one responsible for making sure that search results are appositely diverse. Google literally is being racist here.

ImageQualitySensitiveMediaOrPeopleEntities

However, Twiddlers could be spun up for just about anything, manipulating the results on the fly when needed, then turned off when no longer needed.
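For readers who have never seen a re-ranking step, here is a rough sketch of the general idea. It is not taken from the leak, which does not show how Twiddlers are implemented. The `Result` class, the `local_boost` function, the 1.5 multiplier, and the `enabled` switch are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Result:
    url: str
    score: float           # base "organic" score (hypothetical)
    is_local: bool = False


# In this sketch, a twiddler is any function that returns an adjusted score.
Twiddler = Callable[[Result], float]


def local_boost(result: Result) -> float:
    """Hypothetical twiddler: nudge locally relevant results upward."""
    return result.score * 1.5 if result.is_local else result.score


def rerank(
    results: list[Result],
    twiddlers: dict[str, Twiddler],
    enabled: set[str],
) -> list[Result]:
    """Apply every enabled twiddler in turn, then sort by the adjusted score."""
    adjusted = []
    for r in results:
        score = r.score
        for name, twiddle in twiddlers.items():
            if name in enabled:
                score = twiddle(Result(r.url, score, r.is_local))
        adjusted.append((score, r))
    adjusted.sort(key=lambda pair: pair[0], reverse=True)
    return [r for _, r in adjusted]
```

Calling `rerank(results, {"local_boost": local_boost}, set())` leaves the organic order alone, while passing `{"local_boost"}` as the enabled set reorders the same results. The underlying scores never change, so flipping the switch back leaves no trace in the base ranking.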

Another thing of note is that YouTube functions off of Twiddlers specifically. Random strike on your channel for a years-old video? It was probably a Twiddler that got spun up to target something.

[Image: tweet screenshot about Twiddlers and YouTube]

Chrome Tracks User Interactions Extensively

If you are at all concerned about privacy online, you probably don’t use Google Chrome. But if you do, Google is keeping tabs on all of your activities, from where you go after a search to where you go next. Chrome was built with the express purpose of capturing all of the behavior data from users, and this API leak shows us the extent of the tracking from point A to B and beyond.

google_api_content_warehouse

Smaller sites are labeled "smallPersonalSite," likely making it harder for them to rank.

This would be especially true for independent journalists as they have even more going against them. The “Google News” ecosystem is very difficult to break into, favoring big players, and punishing smaller ones.

Authors of articles on websites are tracked and evaluated for Expertise, Authority, and Trust

Furthermore, they may have the sentiment around them and any mentions of them monitored and manipulated in the rankings. The same could be said for any entity that Google identifies, brand, person, etc.

I am testing this hypothesis on controversial figures like Alex Jones, and the results are interesting so far. Nothing positive or neutral about him comes up for a general search for his name, even with all search personalization removed or the search location changed. I’ll publish my findings in full once I have been able to complete my research. But for the moment I heavily suspect that sentiment can be skewed via the metrics that are tracked by Google.

Model.ScienceIndexSignalAuthor

NlpSciencelitAuthor

SentimentSentiment

NlpSaftMention

Wrapping Up

In summary, this leak offers concrete evidence of Google’s manipulation of search results, giving us, if not the smoking gun, an empty shell case at the scene of the crime. I know of no one in the SEO community who is actively looking for this kind of evidence of manipulation, and even if they saw it, they would not care to call it out.

I’ll leave you with this:

Google has poisoned their product, searching for and organizing information. The only way to undo this would be to return to purely organic ranking and fulfilling searcher intent.

I want to see a Search Tool that does this, and right now there’s truly nothing on the market. Google manipulates things, Bing does the same, and DuckDuckGo has introduced bias much like Google.

I’m currently building a map of how this API works for myself. However, it is slow going, both because of the sheer number of documents that need to be reviewed and mapped to each other and because of the demands of running my own business.

If you want to help dissect this API, be aware of a few things.

SEO industry terms won’t be used, and shorthand and intranet links will lead to dead ends. But this is something that needs to be highlighted and discussed, despite Google’s best efforts to keep this from making the wider rounds. There’s likely evidence of bias that won’t be obvious unless we’re looking for it.

Share this, poke around the API, share your findings and ideas.

Use #GoogleAPIMadePeopleDie

Let’s make Google regret their manipulation.
