Alphabet, Google’s parent company, makes the bulk of its revenue from Google. And Google makes 70% of its revenue from AdWords and almost all of it from advertising generally. The importance of search to Alphabet, which is heavily invested in AI, is obvious. So it’s not a great logical leap to assume the end goal of its AI work is to improve its core offering: Search.
RankBrain
RankBrain is where some of the most impressive AI is deployed within search, and it's (probably) the third most important ranking factor. This is according to Andrey Lipattsev, the Search Quality Senior Strategist at Google Ireland, who revealed that (unsurprisingly) content and links are the top two factors.
Since Google has previously stated that:
RankBrain is one of the “hundreds” of signals that go into an algorithm that determines what results appear on a Google search page and where they are ranked, Corrado said. In the few months it has been deployed, RankBrain has become the third-most important signal contributing to the result of a search query.
we can, therefore, place the order as Content, Links and RankBrain. Maybe. After all, what Google stated was that RankBrain had ‘become’ the third most important signal, which suggests its position in the hierarchy is not fixed. Treating it as a permanent number three may therefore misunderstand what RankBrain is and how it works.
Does RankBrain include machine learning?
It’s also difficult to unpick precisely which parts of the algorithm are RankBrain and which are other forms of machine learning. Ultimately, I’m not sure it matters that much whether these signals are collectively called RankBrain or not. However, according to Search Engine Land:
All learning that RankBrain does is offline, Google told us. It’s given batches of historical searches and learns to make predictions from these.
Those predictions are tested, and if proven good, then the latest version of RankBrain goes live. Then the learn-offline-and-test cycle is repeated.
https://searchengineland.com/faq-all-about-the-new-google-rankbrain-algorithm-234440
This means anything Google does to optimize results on the fly, if it does so at all, logically has to sit outside RankBrain. That would be restrictive: RankBrain could not, for example, run a query, get a poor result set and have the AI create a new set of results for the next person searching for that query. That seems pretty core to how machine learning could be used to improve the search experience. So either RankBrain now does impact live searches, or the real-time machine learning element sits outside RankBrain.
RankBrain and Word2Vec
RankBrain (or at least part of it) is the on-page ranking factor that matches content to queries that don’t use the same words. It does this using a system behind the scenes called Word2Vec which builds something called word vectors.
Word vectors can be created in several different ways. In essence, they are a way of placing words or groups of words into a mathematical model. By looking at the placement of words within the model you can infer relationships between words.
From TensorFlow.com
The placement within the model depends on the type of vector being created, but all will broadly use the principle that words which are used in the same way will have similar meanings. This is called the ‘Distributional Hypothesis’.
Within this, word vector models can be split into two (very broad) types: count-based and predictive. Count-based models look at how often a word appears next to other words in a large volume of material.
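To make "placement within the model" a little more concrete, here is a minimal sketch of how closeness between word vectors is commonly measured, using cosine similarity. The three-dimensional vectors are invented purely for illustration and are not real Word2Vec output. Nothing here is taken from Google's implementation.

```python
import math

# Toy 3-dimensional vectors, made up for this example.
# Real learned word vectors typically have hundreds of dimensions.
TOY_VECTORS = {
    "king": [0.9, 0.8, 0.1],
    "queen": [0.9, 0.7, 0.2],
    "banana": [0.1, 0.2, 0.9],
}


def cosine_similarity(a, b):
    """Return the cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(word, vectors):
    """Return the other word whose vector points in the most similar direction."""
    return max(
        (other for other in vectors if other != word),
        key=lambda other: cosine_similarity(vectors[word], vectors[other]),
    )


if __name__ == "__main__":
    print(most_similar("king", TOY_VECTORS))
```

With these toy values, "king" sits far closer to "queen" than to "banana", which is the kind of relationship a real model would be expected to learn from how the words are used.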
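As a rough sketch of the count-based idea, the snippet below counts how often each word appears within a small window of every other word. The three-phrase corpus, the window size and the function name are all assumptions chosen for illustration. A real system would use a vastly larger body of text and further processing.

```python
from collections import Counter, defaultdict


def cooccurrence_counts(sentences, window=2):
    """Count how often each word appears within `window` words of another."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            start = max(0, i - window)
            end = min(len(tokens), i + window + 1)
            for j in range(start, end):
                if j != i:  # do not count a word as its own neighbour
                    counts[word][tokens[j]] += 1
    return counts


# Hypothetical mini-corpus of search-style phrases.
CORPUS = [
    "cheap flights to paris",
    "cheap flights to rome",
    "cheap hotels in paris",
]

counts = cooccurrence_counts(CORPUS)

if __name__ == "__main__":
    print(dict(counts["paris"]))
    print(dict(counts["rome"]))
```

In this tiny corpus "paris" and "rome" both appear next to "flights" and "to". Under the principle described above, shared neighbours like these are what mark two words as related.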
Content: ranking factor 1 and 3?
Predictive models, such as Word2Vec, instead learn by attempting to predict words from the words around them. A simple way to picture this is with n-grams, the short word groups into which the source text can be broken down. More commonly occurring word groups can be scored with higher confidence, and thus the next word in a chain can be predicted by looking at high-scoring groups.
As RankBrain uses Word2Vec, which is of course purely dependent on content, this kind of makes content ranking factor 1 and 3. It becomes a bit more complex, though, when you factor in the other forms of machine learning Google is applying to the system, whether these are considered part of RankBrain or not.
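The n-gram idea can be sketched in a few lines. To be clear, this is a plain frequency-based n-gram predictor, not Word2Vec itself, which trains a small neural network. The corpus and function names are hypothetical and only meant to show how word groups that occur more often yield higher-confidence predictions.

```python
from collections import Counter, defaultdict


def train_ngram_model(sentences, n=2):
    """Map each run of n words to a count of the words seen directly after it."""
    model = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i in range(len(tokens) - n):
            context = tuple(tokens[i:i + n])
            model[context][tokens[i + n]] += 1
    return model


def predict_next(model, context):
    """Return (most likely next word, share of times it followed), or None."""
    followers = model.get(tuple(context))
    if not followers:
        return None
    word, count = followers.most_common(1)[0]
    return word, count / sum(followers.values())


# Hypothetical mini-corpus.
CORPUS = [
    "how to bake bread",
    "how to bake cake",
    "how to bake bread quickly",
]

model = train_ngram_model(CORPUS, n=2)

if __name__ == "__main__":
    print(predict_next(model, ["to", "bake"]))
```

Here "bread" follows "to bake" in two of the three phrases, so it is returned as the prediction with a confidence of roughly two thirds. A word group the model has never seen returns `None` rather than a guess.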
Online or offline?
It makes sense that RankBrain works offline so that the results can be checked, as AI has a habit of generating surprising results. In particular, one current limitation is that the way these systems behave is shaped by the content they consume. Remember how it was quickly discovered that Google’s autocomplete feature was showing unsavoury results? It was just operating on the information it was given, and of course that information can’t be blindly trusted. Because these AIs can make unpredictable connections, it makes good sense that Google wants human verification of results before including them in the live algorithm.
RankBrain within Search
There is a fantastic article over on Backlinko which takes you through the various impacts of RankBrain on SEO and how to optimize for it. It treats all the machine learning algorithms as part of RankBrain, rather than just the Word2Vec content analysis.
One point on which I disagree a little with the article is long tail keywords. Long tail keywords remain important, but how we use them does indeed need to change. Rather than isolating single terms and optimizing a single piece of content for each one, it’s about isolating groups of terms and creating naturally written content around that group. This should create pages where the word vectors are all mathematically close, building higher relevance and therefore ranking higher.
That’s why we have already integrated NLP technology into keyword research with Wordtracker Inspect. We take existing pages and content, analyse them using NLP and suggest additional keywords for the page. So you can make your content more RankBrain friendly.
The long game
Just listen to this clip of the new Google Duplex AI. It’s astonishing. As well as rather creepy. See if you can guess which voice is the AI and which is the real person. Information sites are likely to remain, but more and more of our commercial queries will go through these new technologies, and that is where Google’s interest lies.
Google knows AI is the next mobile
This is why Google is fighting to make sure that it’s their AI that you’re using, building a personal assistant that can not only give you information but actually interact with people and book appointments for you. After all, it’s the commercial queries where Google makes its money, with the bulk of all revenue provided by AdWords. Years before mobile search took over, Google created and released Android, ensuring it came to dominate the mobile search market. Well, it's now doing exactly the same with AI.
This is all a long way off, but it is happening. Although if Google still can’t stop autocomplete from suggesting that "climate change... is a hoax" and isn’t even confident enough to let RankBrain run in real time, I don’t think we need worry about the robots taking over just yet.