How digital marketers can embrace Google Search’s biggest ever algorithmic update: BERT

Google has just made an algorithmic update that some believe is the biggest change in search ever. We asked leading SEO and digital marketing expert Dawn Anderson, founder of Bertey and lecturer at Manchester Metropolitan University, to brief us on the changes and how to embrace them in digital strategy.

What is BERT, when did it launch and how does it fit into the natural language processing (NLP) landscape?

Dawn: “BERT is two separate things – well, many different things actually. Firstly, BERT is an open-source research contribution from the Google AI team, released alongside an academic paper published in October 2018 and entitled ‘BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding’.

“Secondly, BERT is the name of a recent algorithmic update which Google Search has rolled out and announced as one of the biggest changes it has made in five years, and maybe even in the history of search.

“When we look at the first type of BERT – the research piece and academic paper – we can break it down further and literally pick apart BERT's name to understand a little bit more about what BERT actually is and what it means.

“Essentially BERT is a machine learning, natural language processing framework which can be utilised by anyone who has a mind to do so because of its open source nature.

“BERT is pre-trained on the whole of the English Wikipedia and the Books Corpus combined, which amount to over 3 billion words between them. The difference between BERT and many past natural language processing models is that older approaches relied on text that had to be manually labelled with different parts of speech so that machines could begin to understand what words meant in context. This is both very time consuming and expensive because it often requires huge teams of linguists working on datasets. Furthermore, the context, and even the meaning, of a word continually changes as a sentence develops, and a word can suddenly move from being one part of speech to another once the full context is appreciated.
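
Because the pre-trained model is open source, anyone can load it and start experimenting without assembling a labelled dataset first. The sketch below is illustrative only: it assumes the third-party Hugging Face Transformers library and the publicly released ‘bert-base-uncased’ checkpoint, neither of which is named in the interview.

```python
# Illustrative sketch (assumes: pip install transformers torch).
# Loads publicly released pre-trained BERT weights - no manually
# labelled linguistic data is needed to get started.
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")

# The base model stacks 12 Transformer encoder layers.
print(model.config.num_hidden_layers)
```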

“BERT is an acronym: B stands for bi-directional, E and R stand for encoder representations, and T stands for transformers.

“All of these different parts of BERT are very important and to a large extent groundbreaking. The B part – bi-directional – means that the language model which BERT is pre-trained on looks at all the words in a sentence on either side of a target word. As mentioned, a word's meaning can literally change as you add more words, and it is therefore really important to be able to see the whole sentence simultaneously to appreciate true context. Past language models were mostly pre-trained in a uni-directional way, in that they could not see both sides of the sentence around a target word. BERT is also pre-trained to predict whether one sentence follows on from another, not just to model the current sentence.
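
One way to see bi-directionality in action is masked-word prediction, where the model uses the words on both sides of a gap to guess what is missing. This is a minimal sketch, again assuming the Hugging Face Transformers library rather than anything cited in the interview.

```python
# Illustrative sketch: BERT fills in a masked word using context on BOTH sides.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

# The words after the gap ("to withdraw some cash") are what make
# "bank" a far more likely guess than, say, "river".
for prediction in fill_mask("She went to the [MASK] to withdraw some cash."):
    print(prediction["token_str"], round(prediction["score"], 3))
```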

“The E and R parts of BERT are more functional, in that they essentially relate to the feed-in and feed-out process: words or sentences go into the BERT system and contextual representations come out.
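
In practical terms, ‘encoder representations’ means one context-dependent vector per token comes out of the encoder. The sketch below (same assumed library and checkpoint as above, with a hypothetical helper function) shows the same word ending up with noticeably different vectors in different sentences.

```python
# Illustrative sketch: the encoder turns each token into a contextual vector.
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")

def vector_for(sentence, word):
    """Return the contextual vector the encoder produces for `word` in `sentence`."""
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state[0]  # shape: (tokens, 768)
    tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0].tolist())
    return hidden[tokens.index(word)]

# The vector for "bank" differs depending on the surrounding words.
money = vector_for("he deposited money at the bank", "bank")
river = vector_for("he sat on the bank of the river", "bank")
print(torch.cosine_similarity(money, river, dim=0))  # typically well below 1.0
```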

“The Transformers part of BERT is also huge. Transformers have the potential to help a lot with the issue of pronouns and something called coreference resolution, as well as some issues with disambiguation. The Transformers part of BERT uses something called ‘attention’, which focuses on the noun or ‘entity’ being referred to in a sentence, in either written or spoken form. It's a way for machines to keep track of who, or what, is being talked about.
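
The attention weights can be inspected directly, which gives a rough intuition for how the model tracks what a pronoun refers to. The sketch below is a toy illustration, not a full coreference resolver, and again assumes the Hugging Face library rather than anything from the interview.

```python
# Illustrative sketch: inspect attention weights for a sentence with a pronoun.
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased", output_attentions=True)

sentence = "the trophy did not fit in the suitcase because it was too big"
inputs = tokenizer(sentence, return_tensors="pt")
with torch.no_grad():
    attentions = model(**inputs).attentions  # one tensor per layer

tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0].tolist())
it_position = tokens.index("it")

# Average attention from "it" to every other token, across the last layer's heads.
weights = attentions[-1][0].mean(dim=0)[it_position]
for token, weight in sorted(zip(tokens, weights.tolist()), key=lambda x: -x[1])[:5]:
    print(token, round(weight, 3))
```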

“In terms of BERT's fit into the natural language processing, or NLP, landscape, BERT, the open-sourced pre-trained application, is huge and has driven the biggest step change NLP has ever seen. However, it should be noted that BERT builds upon important open-source previous work such as ELMo and ULMFiT, which are other NLP models.

“The whole open source community has come together to take learnings forward in the natural language understanding space, and there are also many different types of BERTs now because the original BERT has been extended and improved. For example, Facebook's AI team produced RoBERTa, which was supposedly a more robust version of BERT – hence the name, a combination of ‘robust’ and ‘BERT’. Microsoft, IBM and other major leading AI companies have also produced models. Even Google has now produced a further descendant of BERT, called ALBERT, which is supposedly much leaner and more efficient than the original BERT and even has some improvements in the area of next sentence prediction.
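
Because these descendants expose the same general interface, swapping one for another is often just a matter of changing the checkpoint name. The sketch below assumes the Hugging Face library and its publicly hosted ‘roberta-base’ and ‘albert-base-v2’ checkpoints, none of which are cited in the interview, and simply compares model sizes to show how much leaner ALBERT is.

```python
# Illustrative sketch: BERT's descendants can often be swapped in by name.
from transformers import AutoModel, AutoTokenizer

for checkpoint in ["bert-base-uncased", "roberta-base", "albert-base-v2"]:
    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModel.from_pretrained(checkpoint)
    params = sum(p.numel() for p in model.parameters())
    print(f"{checkpoint}: ~{params / 1e6:.0f}M parameters")
```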

“Aside from in parts of Google Search, BERT's implementation in wider industry overall is somewhat unknown. Whilst BERT is something of a Swiss army knife, in that there are many different natural language processing tasks BERT helps with, the use cases are fairly granular in themselves. So, it's likely many bits of BERT will be used all over the place in industry over time.

“BERT helps with a whole range of NLP tasks, including entity determination, textual entailment, next sentence prediction, coreference resolution, polysemy resolution, and many more tasks.”
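
Next sentence prediction, one of the tasks mentioned above, is exposed directly on the pre-trained model. The sketch below is illustrative only and relies on the Hugging Face library's BertForNextSentencePrediction head, which is an assumption on our part rather than something named in the interview.

```python
# Illustrative sketch: score whether sentence B plausibly follows sentence A.
import torch
from transformers import BertForNextSentencePrediction, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForNextSentencePrediction.from_pretrained("bert-base-uncased")

sentence_a = "She opened her laptop and searched for flights."
sentence_b = "The cheapest option left early the next morning."

inputs = tokenizer(sentence_a, sentence_b, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# In this head's convention, index 0 = "B follows A", index 1 = "B is unrelated".
probabilities = torch.softmax(logits, dim=1)[0]
print(f"probability that B follows A: {probabilities[0]:.2f}")
```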

What is the implication for digital marketing – what is different?

Dawn: “Search engines just got a whole lot better at understanding how humans interpret language.

“All the tiny little words which have multiple meanings and act as the glue between words in sentences and phrases suddenly begin to make much more sense. Whilst we know only of BERT being implemented in Google Search, it would be surprising if there were not elements of something similar already in production across a number of different platforms that matter to digital marketing – for example, social media platforms and websites generally. Natural language processing and improved AI are going to play an increasingly important part in digital marketing going forward.”

If you were a digital marketer reading this interview, what would be the first steps you would take to benefit from BERT?

Dawn: “The main point of BERT is that it's designed to understand natural language. There are a number of sites out there with content that looks anything but ‘natural’ – there is a lot of ‘keyword-ese’ content in existence. I'd spend some time checking that pages read naturally. That said, there is no ‘optimisation’ to do for BERT per se. We should naturally be sharing content that is well formed, well structured and well written, designed for humans in the first place.”