What is Content Analytics - MeasureCamp London 2016
What is
Content Analytics?
Content is King
...and yet
what content metrics and dimensions do you use?
On Google Analytics
Some dimensions:
Title
URL
Keywords (or what is left of them)
No actual metrics directly related to content
What should we get?
NLP Data
Natural Language Processing statistics
New data:
How many times do the main keywords appear in my content?
How many times are these keywords the subject of a sentence?
How relevant are the words I am using?
Quick poll
Who has ever heard about TF-IDF metric?
Metric: TF - IDF
A numerical statistic intended to reflect how important a word is to a document in a corpus
It starts from the frequency of a word (or series of words) in a document. To discount words that are common in every document, that frequency is weighed against how many documents in the corpus contain the word
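As a rough illustration of the definition above, here is a minimal sketch of the textbook TF-IDF formula (term frequency times the log of inverse document frequency). The function name `tf_idf`, the toy corpus and the choice of an unsmoothed logarithm are assumptions made for this example; real libraries use several different weighting variants.

```python
import math

def tf_idf(term, document, corpus):
    """Textbook TF-IDF for one term. Documents are lists of tokens."""
    if not document:
        return 0.0
    # Term frequency: share of the document's tokens that are this term.
    tf = document.count(term) / len(document)
    # Document frequency: number of corpus documents containing the term.
    df = sum(1 for doc in corpus if term in doc)
    if df == 0:
        return 0.0
    # Inverse document frequency: terms found in every document score 0.
    idf = math.log(len(corpus) / df)
    return tf * idf

# Toy corpus, invented for illustration only.
corpus = [
    ["measure", "camp", "london"],
    ["analytics", "conference", "london"],
    ["click", "here", "london"],
]
doc = corpus[0]

print(tf_idf("london", doc, corpus))   # in every document, so the score is zero
print(tf_idf("measure", doc, corpus))  # only in this document, so the score is positive
```

A word that shows up in every document gets no weight, however often it is used, while a word concentrated in one document stands out.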
Quick poll
Who knows what an n-gram is?
N-gram
What is an n-gram?
An n-gram is a contiguous sequence of n items from a given sequence of text.
Example of 2-grams
I am attending Measure Camp in London
I am
am attending
attending Measure
Measure Camp
Camp in
in London
If you remove useless words (stop words)
attending Measure
Measure Camp
Camp London
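The two lists above can be reproduced with a few lines of Python. This is only a sketch: the helper name `ngrams` and the three-word `STOP_WORDS` set are assumptions chosen to match the example sentence, not a real stop-word list.

```python
# Tiny stop-word set, just enough for the example sentence (an assumption).
STOP_WORDS = {"i", "am", "in"}

def ngrams(text, n=2, stop_words=frozenset()):
    """Return the contiguous n-grams of a text, optionally dropping stop words first."""
    tokens = [t for t in text.split() if t.lower() not in stop_words]
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

sentence = "I am attending Measure Camp in London"

print(ngrams(sentence))
# ['I am', 'am attending', 'attending Measure', 'Measure Camp', 'Camp in', 'in London']

print(ngrams(sentence, stop_words=STOP_WORDS))
# ['attending Measure', 'Measure Camp', 'Camp London']
```

Note that stop words are removed before the pairs are built, which is why "Camp London" appears even though the two words are not adjacent in the original sentence.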
Let's say you want to be as relevant as possible (and therefore rank on Google) for Measure Camp
1st step
Analyse your content with an n-gram analysis
2nd - Topic Corpus
Now, create a topic corpus around your keyword (basically, pages ranked in Google)
Let's get the top 100 results for these keywords:
Analytics event
Analytics conference
Measure Camp
Get the n-grams within all the documents (around 200 documents if you remove duplicates)
Calculate TF-IDF for each n-gram
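The corpus-building step can be sketched as follows: drop duplicate pages, extract the 2-grams of each remaining page, and count how many pages contain each 2-gram. The `pages` strings are invented stand-ins for downloaded search results (fetching pages is out of scope here), and `bigrams` and `document_frequencies` are hypothetical helper names.

```python
from collections import Counter

def bigrams(text):
    """Lowercased 2-grams of a text, split on whitespace."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + 2]) for i in range(len(tokens) - 1)]

def document_frequencies(documents):
    """Count, for each 2-gram, how many distinct documents contain it."""
    # dict.fromkeys drops exact duplicates while keeping the original order.
    unique_docs = list(dict.fromkeys(documents))
    counts = Counter()
    for doc in unique_docs:
        # set() so that a 2-gram counts once per document.
        counts.update(set(bigrams(doc)))
    return counts, len(unique_docs)

# Invented stand-ins for pages ranked for the chosen keywords.
pages = [
    "measure camp is an analytics conference",
    "measure camp london is an analytics event",
    "measure camp is an analytics conference",  # duplicate, will be dropped
]

counts, n_docs = document_frequencies(pages)
print(n_docs)
print(counts.most_common(3))
```

These per-document counts are the raw material for the scores on the next slide; any TF-IDF variant can be computed from them.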
YAY!!! My first relevant content metrics :)
measure camp: 100 (very frequent)
analytics conference: 60 (quite frequent)
Peter O'Neill: 50 (quite frequent)
Stay (in) London: 30 (somewhat frequent)
* not actual data. Simplified version of TF-IDF
3rd - Topic-neutral corpus
Now, create a topic-neutral corpus (basically, take thousands and thousands of random webpages and build a corpus from them)
Get the n-grams out of it
Extract:
Click here (very frequent)
Stay London (appears a few times)
Peter O'Neill (nowhere to be found)
Measure Camp (1 time in the corpus)
4 - Now let's compare
Stay London: somewhat frequent in both corpora: not so relevant for your content
Peter O'Neill: Yay!
Measure Camp: not so frequent in English, very frequent in our topic corpus: I shall use it
Big data: very frequent in the topic corpus, not so frequent. Oh, sounds like something people want to hear about. Let's write content about it.
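One way to make the comparison above mechanical is to divide each expression's share in the topic corpus by its share in the topic-neutral corpus. This is a sketch under assumptions: the shares below are invented (not actual data, like the scores on the earlier slide), and the `relevance` helper and its `floor` parameter are hypothetical names for this example.

```python
def relevance(topic_share, neutral_share, floor=0.001):
    """Ratio of an expression's share in the topic corpus to its share
    in the neutral corpus. `floor` avoids dividing by zero for
    expressions that never appear in the neutral corpus."""
    return {
        gram: share / max(neutral_share.get(gram, 0.0), floor)
        for gram, share in topic_share.items()
    }

# Invented shares: fraction of documents containing each expression.
topic = {"measure camp": 0.50, "peter o'neill": 0.25, "stay london": 0.15, "click here": 0.40}
neutral = {"measure camp": 0.001, "stay london": 0.10, "click here": 0.60}

scores = relevance(topic, neutral)
ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
for gram, score in ranked:
    print(f"{gram}: {score:.1f}")
```

Expressions with a ratio far above 1 are typical of the topic; those near or below 1 are just ordinary language and say little about the subject.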
5 - Optimize your content
Proofread your content with these new relevant expressions in mind.
Can I add more value to the user? Can it help improve my organic ranking?
Let's discuss
What kind of other content metrics or dimensions would we use?