
Keyword Analysis for Determining Relevance

Most SEOs think of keyword analysis as keyword research: choosing keywords, phrases and related words and working them into a page for optimal search engine performance. In the context of this page, the term "keyword analysis" refers to what a web crawler or spider does in order to determine what a webpage is about. We'll discuss the many variables involved in analyzing a page's content, which is perhaps the single most important aspect of search engine optimization. The content of a page alone determines how relevant the page is to a user's query, and relevance is always the first thing a search engine should consider when ranking pages.

Page Content
Metadata and Page Details
Creator: Devin Peterson
Date: Created 02/13/2014 - (Updated 05/26/2014)
Subject: Social Media, Writing, SEO, Author Status, Google Plus
Publisher: DNM Int'l
Peer Review:
Citation: Peterson, D. (2014), "Content Keywords: Document Ranking Based Keyword Analysis", Retrieved (date), from


What is a Keyword?

The term keyword is ambiguous. It is often used to refer to a person's search query; however, it can also refer to words, phrases or strings of text in a document. Part of a search engine's job is to find the connection between a user's search query and the text of a given document. The relationship between those two things can be measured as "relevance", i.e. how closely the text is related to the search query.

Analyzing this relationship is extremely difficult, which is precisely why a company that was able to do it effectively became one of the most lucrative businesses in the world.

Search Queries

The words that a user (the searcher) uses to find something of interest are the "query".

The Importance of Matching Keywords in Search Results

Matching words in the body of a document is a good way to understand what the document is about. (This section is under construction.)
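As a rough illustration of matching query words against a document body, the sketch below reports what fraction of the distinct query words also appear in the document. The function name `query_match_ratio` and the tokenization rule are assumptions made for this example; nothing here is claimed to be how any real search engine does it.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def query_match_ratio(query, document):
    """Return the fraction of distinct query words found in the document."""
    query_words = set(tokenize(query))
    if not query_words:
        return 0.0
    document_words = set(tokenize(document))
    return len(query_words & document_words) / len(query_words)


if __name__ == "__main__":
    body = "Keyword density is a ratio of keyword mentions to total words."
    print(query_match_ratio("keyword density", body))
    print(query_match_ratio("keyword stuffing", body))
```

A ratio like this ignores word order, frequency and meaning, which is why the components discussed in the rest of the page are needed on top of simple matching.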

The Components of Keyword Analysis

Frequency and Density of Keywords

The concept of keyword density has most likely changed significantly over the years. SEO experts argue back and forth over whether this metric is actually measured and used by search engines as part of their ranking algorithms. The short answer is that some form of frequency ratio or tf-idf (term frequency-inverse document frequency) is very likely used to score a webpage's relevance to a particular search term or keyword. This mathematical idea essentially compares how often keywords and phrases appear in a document with how often they appear in generic samples of text gathered from a corpus, so that terms which are unusually frequent in the document count for more than words that are common everywhere.
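To make the tf-idf idea concrete, here is a minimal sketch in Python. It uses one common smoothed variant of the formula; the function names, the tiny example corpus and the smoothing choice are all assumptions for illustration, and search engines do not publish the exact formula they use.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def tf_idf(term, document, corpus):
    """Score `term` for `document` against a reference `corpus` (a list of strings)."""
    tokens = tokenize(document)
    if not tokens:
        return 0.0
    # Term frequency: share of the document's words that are this term.
    tf = Counter(tokens)[term] / len(tokens)
    # Document frequency: number of corpus texts that contain the term.
    df = sum(1 for text in corpus if term in tokenize(text))
    # Smoothed inverse document frequency; the +1 terms avoid division by zero.
    idf = math.log((1 + len(corpus)) / (1 + df))
    return tf * idf


if __name__ == "__main__":
    # Hypothetical generic corpus; a real one would be far larger.
    corpus = [
        "the cat sat on the mat",
        "the dog chased the cat",
        "the weather is nice today",
    ]
    page = "the keyword analysis of the page"
    for word in ("the", "keyword"):
        print(word, tf_idf(word, page, corpus))
```

In this toy corpus "the" appears in every text, so its score drops to zero even though it is the most frequent word on the page, while "keyword" scores above zero because it is rare in the corpus.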


A body of text will usually mention the specific keywords and phrases it is about early in the document. In particularly lengthy bodies of text, the document might discuss other related information and subtopics later on, which gives reason to assign more value to text that appears earlier in the document when trying to determine the main intent and focus of a webpage.
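One way to picture position weighting is a count in which each occurrence of a term is worth less the later it appears. The linear decay and the `floor` parameter below are assumptions chosen for simplicity, not a known search engine formula.

```python
import re


def position_weighted_count(term, text, floor=0.5):
    """Sum the occurrences of `term`, weighting early occurrences more.

    The weight falls linearly from 1.0 at the first word towards `floor`
    at the end of the text.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    total = len(tokens)
    score = 0.0
    for index, token in enumerate(tokens):
        if token == term:
            score += 1.0 - (1.0 - floor) * (index / total)
    return score


if __name__ == "__main__":
    early = "seo tips for beginners and other stuff here"
    late = "tips for beginners and other stuff here seo"
    print(position_weighted_count("seo", early))
    print(position_weighted_count("seo", late))
```

Both example texts mention "seo" exactly once, but the text that opens with it receives the higher score.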

Word Stemming and Variations

Linguistic morphology includes the concept of word stems and how different words are related or synonymous based on their root. Conflation is a process that can be used by search engines to analyze word stems and to treat them as similar or even the same when analyzing search queries and webpages.
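A deliberately naive sketch of conflation follows: it strips a few common English suffixes and groups words that end up with the same stem. The suffix list and function names are assumptions for this example. Real stemmers, such as the Porter stemmer, apply many more rules, and this toy version will mis-stem plenty of words.

```python
# Suffixes are checked in this order, longest first.
SUFFIXES = ("ing", "ed", "es", "s")


def naive_stem(word):
    """Strip one common suffix, keeping at least three letters of stem."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def conflate(words):
    """Group words by their naive stem."""
    groups = {}
    for word in words:
        groups.setdefault(naive_stem(word), []).append(word)
    return groups


if __name__ == "__main__":
    print(conflate(["rank", "ranks", "ranked", "ranking", "link", "linking"]))
```

With grouping like this, a query for "ranking" could be treated as matching a page that only uses "ranked" or "ranks".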

Semantics and Related Words

Semantic analysis is used by search engines primarily to give meaning to ambiguous words (words with many potential meanings). An example is the word "bark", which can mean the sound a dog makes or the outer layer of a tree. A human can easily work out the meaning from context: the words around the term indicate which meaning is intended. This is why using related keywords is a big part of SEO: the presence or absence of related keywords can indicate the relevance of a body of text. Once again, a corpus would be used to determine which words are related to each other and to what degree. A discussion about the NFL might not get very far without using the term "football", and the converse can be true as well. Words related to a lesser degree might be "field", "quarterback", "sport", etc. Use of these words will likely indicate a higher degree of relevance compared to a document that lacks them.

LSI or Latent Semantic Indexing is an expansion of this concept.
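The "bark" example can be sketched as a simple overlap test: count how many words related to each meaning appear in the surrounding text and pick the meaning with the most matches. The related-word lists below are hand-written and hypothetical; as noted above, a real system would derive them from a corpus, and this is not an implementation of LSI.

```python
import re

# Hypothetical related-word lists for two meanings of "bark".
SENSES = {
    "dog sound": {"dog", "loud", "growl", "puppy", "howl"},
    "tree covering": {"tree", "trunk", "wood", "oak", "branch"},
}


def guess_sense(text, senses=SENSES):
    """Return the sense whose related words overlap most with the text.

    Returns None when no related word appears at all.
    """
    tokens = set(re.findall(r"[a-z']+", text.lower()))
    scores = {name: len(tokens & related) for name, related in senses.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


if __name__ == "__main__":
    print(guess_sense("The bark of the old oak tree was rough"))
    print(guess_sense("The dog gave a loud bark"))
```

The same counting idea works in the other direction: a page about the NFL that never uses any related term gives the scorer nothing to match.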

Over Optimizing, Keyword Stuffing, and Black Hat Tactics

Do not stuff all kinds of words into your text because you think it will trigger extreme relevance. A keyword really only needs a few mentions. This particular page does contain the word "keywords" quite a lot, so its density is very high, but I am not worried since the language is natural.

Best Practices

Use natural language. Write the article or document as you normally would, for the best possible user experience. AFTER you have written it, compile a short list of keywords and possible search queries people might use. There are plenty of tools for doing keyword research. Also, write a brief description or, better yet, an EXPLANATION of what the text is about. Make sure all the words within your explanation appear somewhere within the actual content of your article.
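The last check above is easy to automate. The sketch below lists the words of an explanation that never appear in the content. Ignoring a small set of filler words is an assumption added here, since the advice above says "all the words"; pass an empty set as `ignore` to check every word. The function name and stopword list are hypothetical.

```python
import re

# Small, hypothetical list of filler words to skip.
STOPWORDS = frozenset({"a", "an", "and", "is", "of", "the", "to"})


def words_in(text):
    """Return the set of lowercase words in the text."""
    return set(re.findall(r"[a-z']+", text.lower()))


def missing_terms(explanation, content, ignore=STOPWORDS):
    """List explanation words that are absent from the content, sorted."""
    return sorted(words_in(explanation) - words_in(content) - set(ignore))


if __name__ == "__main__":
    explanation = "A guide to keyword density"
    content = "This guide explains keyword usage"
    print(missing_terms(explanation, content))
```

Any word the function returns is either a candidate to work into the article naturally or a sign that the explanation describes something the article does not actually cover.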

For those who like to quantify things, I would recommend a keyword density of 1-3%, depending on many things, primarily the length of the document and the scope of your discussion. This value is not arbitrary; it is based on information gathered from some of the best written novels on the planet, encyclopedias, other high ranking websites, the Bible, and Google's own blog.
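For anyone who wants to measure this, keyword density is simply the number of times a keyword occurs divided by the total number of words, expressed as a percentage. The sketch below handles single-word keywords only, which is a simplifying assumption; the function name is made up for this example.

```python
import re


def keyword_density(keyword, text):
    """Return how often `keyword` occurs as a percentage of all words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    return 100.0 * tokens.count(keyword.lower()) / len(tokens)


if __name__ == "__main__":
    # One mention in fifty words.
    sample = "keyword " + "filler " * 49
    density = keyword_density("keyword", sample)
    print(density, 1.0 <= density <= 3.0)
```

Treat the result as a sanity check rather than a target, since the right value depends on document length and scope.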