I remember using search engines in the mid-to-late 90s as a kid. Yahoo!, WebCrawler, HotBot, Excite… AltaVista was my personal favourite. What a different world that was. The indexing and retrieval algorithms were relatively primitive; intelligent interpretation of search intent was still many, many years away, so you had to be careful about how you phrased your queries, and you accepted that you might have to dig a little in the search results to find what you were looking for. Web surfers in the know could leverage the indexing algorithms by enclosing each term in quotation marks (or joining terms with a + sign) to refine their search. Without these punctuation marks, the search engine was more likely to rank each term individually, clueless as to the semantic context of the longer query. Looking back, we might call these interfaces ‘cumbersome’. How far our online world has come.
In 1997, there were only 70 million Internet users across the world, a mere 1.7% of the world’s then population, so to be an active user – bewildered by the amount of information I could get about things like knights, skating, video games, Metallica, The Hobbit, and so on – was a privilege.
Having cut my teeth on these lifeless, rudimentary search engine platforms (excellent as they were for the times), it’s interesting to observe how the nature of the search query has changed 20 years later.
We no longer have to address the search engine in such mechanical ways, with such calculated precision, to even begin to hope to find the results we’re after. Instead, we can approach subjects flexibly, from varying angles, phrased in fairly user-specific language, and be quite confident that the results we’re after will be ranked sufficiently well to make our lives easier (i.e. no digging), spelling error or not. Search engines are a far more user-friendly, forgiving product than they once were. Hell, you can even talk to them now (and we do – over 20% of searches in the Google App are now by voice [i]).
Let’s say we wanted to find out how high Mount Everest was, right now. Any of these enquiries will do:
mount everest height
mt everest high
what is the height of mount everest
height of Everest
everest summit height
Twenty years ago, the surest way of finding a fast, reliable result would have been to phrase our query:
“mount everest” “height” (or mount+everest+height),
with little wriggle-room in variations before the SERP became a bloated mess of overly broad, low-usability results. We’d be served a page that would give us the info we wanted, but the way the result itself was indexed and ranked was very much a function of it matching our query terminologically – not semantically.
Today, we use “best burgers Joburg” to find a desirable burger restaurant; “live Jazz” to find a local bar or venue with a Jazz band; and “advantages of omega” to find out the health benefits of the essential oil groups (and not the watch manufacturer).
One of the most important revisions to search engines in this regard is the rise of the semantic web, a migration of the SERP towards an ‘answer engine’ – as opposed to a retrieval one. We see this both in latent semantic indexing and the use of ‘rich results’ and Google Knowledge Graph cards that use structured data markup schemas – basically a network of standards for machine-readable data describing things and their properties on the web.
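To make the structured data behind rich results concrete, here is a minimal sketch of a schema.org item expressed as JSON-LD, built in Python. The @type names and properties (Mountain, elevation, QuantitativeValue) are genuine schema.org vocabulary, but this particular example is my own invention; in practice the generated JSON would be embedded in the page’s HTML inside a script tag for crawlers to read.

```python
# Sketch: generating schema.org JSON-LD structured data for a rich result.
# The example entity (Mount Everest) is illustrative, not from a live page.
import json

markup = {
    "@context": "https://schema.org",
    "@type": "Mountain",
    "name": "Mount Everest",
    "elevation": {
        "@type": "QuantitativeValue",
        "value": 8848,        # metres
        "unitCode": "MTR",
    },
}

# This string would be embedded in a <script type="application/ld+json"> tag.
json_ld = json.dumps(markup, indent=2)
print(json_ld)
```

This machine-readable description is what lets a search engine answer “height of Everest” directly on the SERP, rather than merely linking to a page that contains the words.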
Latent Semantic Indexing – LSI for short – is an algorithm that helps search engines discover and understand the contextual relationships between words, and therefore understand intent. Today, we say that search is driven by user intent, not by keywords as such.
LSI is traditionally a positivist mathematical tool used in linguistic content analysis to understand text. It allows databases to be compiled that track the co-occurrences between particular words. For search engines, LSI compares how often words appear together in the same document against how often those co-occurrences happen across all the documents Google has indexed (the ‘latent’ part of ‘LSI’). Organising co-occurring words into databases lets semantic categories be built around particular themes, so the conceptual content of a text can be understood by associating its categories with those occurring in other texts.
Its incorporation within search engines has produced more sophisticated AI, providing better document categorisation and improved dynamic clustering of content. It also helps overcome several issues with Boolean keyword queries, such as synonymy (many words sharing a similar meaning) and polysemy (the same word having multiple meanings).
LSI also facilitates a fluid, user-centred approach to understanding words, capable of adapting to new and changing terminology. All these are useful developments in further aligning human and algorithmic understandings of ‘relevancy’ from a technical viewpoint.
For all the gains in intelligence brought on by LSI, the urgency of its emergence was, interestingly, largely a response to content marketers manipulating search engine results with black- and grey-hat keyword strategies. This meant that quality, usable content was marginalised while content written ‘for the machine’ was rewarded. Ranking this content – less natural, more spammy and superficial in character – negatively impacted the user experience of the search engine.
That’s why real search marketing success in ultra-competitive online markets today should never be based on methodically satisfying the holy-grail “Google’s 200 ranking factors”.
The problem with that more formulaic, detached, ‘laundry list’ approach to SEO is that it encourages optimisation tactics that are increasingly at odds with the more fluid, natural, organic approach to content generation that – through greater alignment between the algorithmic and the human evaluation of ‘relevant content’ – is required for successful SEO today.
One of these outdated approaches to content generation that not all marketers have yet fully relinquished is the use of traditional keyword densities to ‘optimise’ their on-page content. In 2016, the highest-ranking URLs for keywords across business and consumer search had on-page densities of only around 1.4%. Compared with 2015, these pages used 20% fewer keywords.
The same keyword sample shows that around 55% of the top 20-ranked URLs have the keyword they’re ranking for in their title (down 20% from 2015’s figure); just over 50% have the keyword in the meta description (down 5%); and only around 38% in the H1 (down 10%) [ii].
All this illustrates the increasing irrelevance of individual keyword use as a ranking factor.
…But keywords are far from dead – they just need to be re-conceptualised within the framework of LSI. This is exciting for SEOs, because it puts more power in the hands of able marketers who can create rich, tailored content built around targeting very specific search intents. This builds more effective communications with potential customers and means we can expect to increase conversion rates, improve time on site and lower bounce rates of landing pages.
We can say then that LSI keyword optimisation is important to both SEO and conversion rate optimisation (CRO). SEO by showing Google that you have provided semantically rich, relevant content; and CRO because using LSI provides a user-centred semantic framework for how to speak about a given topic that aligns with what a user would naturally expect to find.
So how should we then approach on-page content optimisation?
The first task is to compile lists of terms constituting various semantic universes relevant to the intent behind our keyword. These include lexicons, synonyms, natural and localised language variations, long-tail keywords, and so on. We use a range of tools to achieve this. Our emphasis here shouldn’t be on incorporating LSI keywords via quantity; it should rather be limited to quality keywords that clearly align with the user intent addressed by the page objective.
To better understand which keywords best align with a page objective, consider whether the user intent is navigational, informational or transactional. (As SEOs, our main focus is on informational vs transactional intent.)
Each intent will have its own semantic universes, although there are generally some overlapping terms. Let’s take as a very basic example the keyword ‘iPhone’. We can see how the relationship between semantic keywords is clear along informational and transactional lines:
Informational: iPhone vs Android
Transactional: used iPhone 6
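Purely to illustrate the distinction, here is a deliberately naive, rule-based sketch that sorts queries by intent using assumed cue words. The cue lists are my own invention for the example; real engines infer intent from far richer signals than surface keyword matching.

```python
# Naive intent classification sketch. Cue words are illustrative assumptions.
TRANSACTIONAL_CUES = {"buy", "price", "cheap", "used", "deal", "order"}
INFORMATIONAL_CUES = {"vs", "how", "what", "why", "review", "comparison"}

def classify_intent(query: str) -> str:
    words = set(query.lower().split())
    if words & TRANSACTIONAL_CUES:
        return "transactional"
    if words & INFORMATIONAL_CUES:
        return "informational"
    return "navigational/ambiguous"

print(classify_intent("iPhone vs Android"))  # informational
print(classify_intent("used iPhone 6"))      # transactional
```

Even this crude version shows why the two intents pull in different semantic universes: the vocabulary itself signals what the searcher wants to do.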
It should be obvious, then, that the keywords you include are just as important as the keywords you don’t. We’re almost identifying negative keywords – terms we should avoid using in our copy.
We now have a language for how we should talk about a given theme, and should have a general idea of what our content might ultimately look like. Now consider that in 2016, average word counts for top-ranking URLs were around 1 600 words (and about 1 200 words for mobile [iii]). That seems like a sound guide for article length in a competitive keyword environment. We’re ready to start writing.
Depending on the industry and the competitiveness of the search term, I aim for natural densities of the target keyword between 1.5% and 2.5%. Be wary of anything over 3% (word on the street is that anything over 3.5% is stuffing).
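For anyone wanting to sanity-check their own copy against figures like those above, here is a minimal density calculator. The formula used (phrase occurrences × phrase length, divided by total word count, as a percentage) is one common convention rather than an official standard, and the function name and sample text are my own.

```python
# Sketch: keyword density as (phrase hits * phrase length) / total words * 100.
# This is one common convention; tools vary in how they count phrases.
import re

def keyword_density(text: str, keyword: str) -> float:
    words = re.findall(r"[a-z0-9']+", text.lower())
    phrase = keyword.lower().split()
    n = len(phrase)
    hits = sum(words[i:i + n] == phrase for i in range(len(words) - n + 1))
    return 100.0 * hits * n / len(words) if words else 0.0

sample = "Mount Everest height: the height of Mount Everest is 8848 m."
print(round(keyword_density(sample, "mount everest"), 1))  # prints 36.4
```

A figure that high would be absurd keyword stuffing in real copy, of course; the tiny sample just makes the arithmetic easy to follow.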
There are various opinions on what the overall density of all your LSI keywords together should be: some say as low as 2%, some 4%, some 7% or higher. I try not to focus on this too much, viewing LSI keywords instead as each carrying different weightings, both in degree of absolute semantic relevancy and in traffic volume. Their use should mirror that. I also like to think of them in different levels (keywords of keywords). We should use them as much as we can in relevant and important places within the content, but ensure that their inclusion is always seamless, never unnatural.
I also like to use them to contextualise links and other on-page elements with semantic relevance to leverage maximum SEO value from each object.
The main principle you want to follow when optimising content in 2017 is: bring together and structure individual search terms into complete topical areas that summarise semantic terms relevant to the same or similar themes according to search intentions.
SEO is very much alive and well in 2017 – 66% of marketers say improving SEO and growing organic presence is their top inbound marketing priority. It’s not about throwing out your SEO strategy – it’s about making sure all your tactics are aligned to best practices in 2017 and revising outdated techniques.
In addition to enhancing your rank for competitive keywords – as well as other semantically related words – LSI gives content a natural look and feel that is much more usable (improving UX); helps earn high-quality backlinks; and can sustain rank for a longer period of time.
A well-executed LSI-based strategy to SEO is a critical part of achieving better organic market share today.
And by the way, the height of Everest is 8 848 m.
Kieran Tavener-Smith, M.A summa cum laude (Media Studies), cert. UXD
i Think with Google: Marketing Research & Digital Trends. 2017. 4 Things You Need to Know About the Future of Marketing. https://www.thinkwithgoogle.com/marketing-resources/micro-moments/future-of-marketing-mobile-micro-moments/
ii Searchmetrics. 2017. Rebooting Ranking Factors. http://www.searchmetrics.com/knowledge-base/ranking-factors