Content strategists can discover what keywords their competitors are targeting by deconstructing and analyzing individual pages. You can repeat the process for as many pages as you want and aggregate the results.
Simply speaking, by analyzing a webpage’s content, and more specifically its noun phrases, you can understand the focus keyword(s) of any page. Reverse engineering the most used words and phrases may be tedious when done manually, but there are plenty of tools out there to help you do so automatically.
However, this is only valid for a single page. In order to know what keywords an entire website is targeting, you must do the same for each blog post ever published. This is obviously more time consuming, but you get a more overarching view of what topics they are trying to be authoritative over.
Manual Competitor Keyword Analysis
Analyzing your competitor’s keywords is an easy task. Read their blog posts and list what phrases are repeated. This is the poor man’s method but it works. Simply by taking a look at a blog post’s title you can figure out what keyword is targeted. Reading the article body will give you a bunch of secondary and supporting keywords.
Doing it manually is great for granular comparisons. For example, let’s say you want to write a new blog post targeting a specific keyword. Researching and listing the supporting and longer-tail keywords used in the top-ranking blog posts for this keyword is an important step. Search for the keyword in Google, along with some slight variations, and analyze the top-ranking pages’ titles, headings, introductions, conclusions, and bodies.
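If you would rather not copy titles and headings by hand, a short script can list them for you. The following is only a minimal sketch using Python's standard library: the class name HeadingCollector, the helper headline_phrases, and the sample HTML are all made up for illustration, and real pages may need more robust parsing.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collects the text of <title> and <h1>-<h3> elements (hypothetical helper)."""

    TAGS = {"title", "h1", "h2", "h3"}

    def __init__(self):
        super().__init__()
        self.current = None   # tag we are currently inside, if any
        self.buffer = []      # text fragments seen inside that tag
        self.found = []       # (tag, text) pairs in page order

    def handle_starttag(self, tag, attrs):
        if tag in self.TAGS:
            self.current = tag
            self.buffer = []

    def handle_data(self, data):
        if self.current:
            self.buffer.append(data)

    def handle_endtag(self, tag):
        if tag == self.current:
            # Join fragments and collapse runs of whitespace.
            text = " ".join("".join(self.buffer).split())
            self.found.append((tag, text))
            self.current = None


def headline_phrases(html):
    """Return the title and headings of an HTML document as (tag, text) pairs."""
    parser = HeadingCollector()
    parser.feed(html)
    parser.close()
    return parser.found


# Made-up sample page, standing in for a competitor's blog post.
sample = """
<html><head><title>How to Choose a Dog Breeder</title></head>
<body><h1>Choosing a Dog Breeder</h1>
<p>Intro text.</p>
<h2>Questions to ask <em>dog breeders</em></h2></body></html>
"""

for tag, text in headline_phrases(sample):
    print(tag, "->", text)
```

Running this over the top results for a keyword gives you a quick list of the phrases each page puts in its most visible spots.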
How to deal with keyword variations?
Obviously, no blog writer is naive enough to just write an article with the same keyword repeated over and over. So, how can you count occurrences of a specific keyword when it is written using variations and inflections?
In natural language processing, there is a process called lemmatization. A sister process, simpler but a lot less precise, is called stemming. Both take a word and reduce it to a basic form (singular, present tense, without inflection). For lemmatization, this basic form is called the lemma; for stemming, it is called the stem.
- dogs has dog for lemma
- constructed has construct for lemma
- beautiful has beautiful for lemma (the same!)
- was has be for lemma
You get the gist! Take a word and reduce it to its simplest correct form. Stemming uses a rule-based strategy, which causes some problems, especially with plurals. Lemmatization is smarter but more computationally hungry. It often relies on dictionaries or on machine learning models trained on huge corpora of millions of documents.
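To make the contrast concrete, here is a toy sketch in Python. It is not how production stemmers or lemmatizers work: the suffix rules and the lookup table below are invented for this example, and the table merely stands in for the dictionary or model a real lemmatizer would use.

```python
# Toy suffix rules, tried in order. Real stemmers use many more rules.
SUFFIX_RULES = [("ies", "y"), ("ing", ""), ("ed", ""), ("s", "")]

# Tiny hand-written table standing in for a real lemmatizer's knowledge.
LEMMA_TABLE = {
    "dogs": "dog",
    "constructed": "construct",
    "beautiful": "beautiful",
    "was": "be",
}


def toy_stem(word):
    """Strip the first matching suffix, with no knowledge of the language."""
    word = word.lower()
    for suffix, replacement in SUFFIX_RULES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 2:
            return word[: len(word) - len(suffix)] + replacement
    return word


def toy_lemmatize(word):
    """Look the word up; fall back to the word itself when it is unknown."""
    word = word.lower()
    return LEMMA_TABLE.get(word, word)


for w in ["dogs", "constructed", "beautiful", "was"]:
    print(f"{w:12} stem={toy_stem(w):12} lemma={toy_lemmatize(w)}")
```

The blind rules handle dogs and constructed, but they turn was into wa, while the lookup returns be. That is the kind of mistake a purely rule-based approach makes.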
Once each word in a document is lemmatized, you can reduce the last word of each keyword to its lemma. The output is a normalized keyword that treats most variations, if not all, as identical keywords (e.g. variations such as dog breeders all become dog breed after preprocessing).
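A rough sketch of that normalization step could look like the following. The suffix list is an assumption made for this example (a real lemmatizer or stemmer would replace reduce_word), and the variations other than dog breeders are made-up inputs chosen to show the grouping.

```python
from collections import Counter

# Assumed suffix rules, longest first. A real lemmatizer would replace this.
SUFFIXES = ("ers", "ing", "er", "s")


def reduce_word(word):
    """Strip the first matching suffix, keeping at least three letters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def normalize_keyword(keyword):
    """Lowercase the keyword and reduce its last word to a base form."""
    words = keyword.lower().split()
    if not words:
        return ""
    words[-1] = reduce_word(words[-1])
    return " ".join(words)


# Hypothetical variations found across a page.
variations = ["dog breeders", "Dog breeder", "dog breeding", "dog breeds", "dog food"]
counts = Counter(normalize_keyword(k) for k in variations)
print(counts)
```

Counting the normalized form rather than the raw phrase is what lets several spellings add up to a single keyword.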
Automated Competitor Keyword Analysis
As explained above, gathering and counting a blog post’s focus keywords by hand is tedious, and even more so when you want to do it for dozens or hundreds of published pages. At that scale, your only realistic solution is to use a tool that will do the heavy lifting for you. These tools typically produce reports or visualizations so you can see the results at a glance.
These tools do something very simple but time consuming, over and over again, for each article on a website:
- Fetch the page source
- Extract the article content
- Tokenize the text
- Save relevant keywords
- Count occurrences
- Generate variations
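The first five steps above can be sketched in a few lines of Python using only the standard library. This is an illustration under simplifying assumptions, not how any particular SEO tool works: it treats the text inside paragraph tags as the article content, uses a tiny hand-picked stopword list, counts two-word phrases only, and leaves out the variation step. All function names are made up for the example.

```python
import re
from collections import Counter
from html.parser import HTMLParser
from urllib.request import urlopen

# Small, hand-picked stopword list (an assumption; real tools use larger ones).
STOPWORDS = {"a", "an", "and", "the", "of", "to", "for", "in", "is", "on", "with", "your"}


class ParagraphText(HTMLParser):
    """Keeps only the text found inside <p> elements."""

    def __init__(self):
        super().__init__()
        self.in_paragraph = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self.in_paragraph = True

    def handle_endtag(self, tag):
        if tag == "p":
            self.in_paragraph = False

    def handle_data(self, data):
        if self.in_paragraph:
            self.chunks.append(data)


def fetch(url):
    """Step 1: fetch the page source (not called below, to avoid network access)."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def extract_text(html):
    """Step 2: extract the article content, approximated as paragraph text."""
    parser = ParagraphText()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def tokenize(text):
    """Step 3: split the text into lowercase word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def count_keywords(tokens, size=2):
    """Steps 4 and 5: keep n-grams that do not start or end with a stopword, then count them."""
    grams = zip(*(tokens[i:] for i in range(size)))
    return Counter(
        " ".join(g) for g in grams
        if g[0] not in STOPWORDS and g[-1] not in STOPWORDS
    )


# Made-up page source standing in for the result of fetch(url).
sample = (
    "<html><body><nav>Home</nav>"
    "<p>Finding a dog breeder takes time.</p>"
    "<p>A good dog breeder answers questions.</p>"
    "</body></html>"
)
tokens = tokenize(extract_text(sample))
print(count_keywords(tokens).most_common(3))
```

To process a whole website, you would call fetch on each article URL and add up the resulting counters.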
Most of these SEO tools also have data streams telling them what keywords people searched for to reach a website. Additionally, they crawl the web to learn which anchor texts were used to link to a particular page.
Competing for the same keywords
Knowing what keywords your competitors are targeting is a first step towards outranking them, but actually outranking them is the hard part. You must now increase your authority over these topics and subtopics through hero content and topically relevant backlinks.
Refreshing your old content is also strongly recommended. Mention newer and fresher content related to these keywords you are competing for. Reshuffle your internal links to reflect your focus on highly competitive keywords.
Chasing your competitor’s tail, however, is not the right strategy. Stop reacting to the content they publish, and start being proactive and anticipating. Whoever publishes first has somewhat of an advantage. Therefore, find keyword opportunities so that your competitors end up copying you instead.