The Owl and The Hummingbird
Ontology & SEO
Dawn Anderson
How Can SEO Be Dead?
• For Obsessive Compulsive Link Building Disorder – it may be, BUT...
• It was NEVER just about the links
• It was ALWAYS about ‘ontology’
• It’s ALWAYS been about library science, lexicons, relationships
• We still need to ‘search engine optimize’ sites for this – thinking like a machine (SEO)
Sergey’s Studies
http://infolab.stanford.edu/~sergey/
Sergey’s Studies
PICTURE SOURCE: The Anatomy of a Large-Scale Hypertextual Web Search Engine (Brin / Page – Stanford EDU) (http://infolab.stanford.edu/~backrub/google.html)
Lexicon – 14 million words in 1998 – if straightforward winner??
Added last in photo-finish
If further info is needed (no clear winner??)
Red and green parts NOT Sergey’s work
Ontology Definition (Information Science):
“In computer science and information science, an ontology formally represents knowledge as a hierarchy of concepts within a domain, using a shared vocabulary to denote the types, properties and interrelationships of those concepts.
Ontologies are the structural frameworks for organizing information and are used in artificial intelligence, the Semantic Web, systems engineering, software engineering, biomedical informatics, library science, enterprise bookmarking, and information architecture as a form of knowledge representation about the world or some part of it. The creation of domain ontologies is also fundamental to the definition and use of an enterprise architecture framework.”
Meanwhile….. In Semantic Web
• People have been busy
• The W3C Working Group
• Why is it taking so long - decades? (it’s complicated – using RDF, OWL, XML, SKOS)
• The AAA Principle (Anyone can say Anything about Any topic)
• Always about relationships between things (entities)
• Machines / humans both understanding the web
• Formal Ontologies are key to this
• Web of Data was always the vision
• The ‘Network Effect’ has not yet happened
What Is OWL?
• Web Ontology Language
• Why Not W.O.L. – Why OWL?
• “A Semantic Web language designed to represent rich and complex knowledge about things, groups of things, and relations between things.” (source: W3C.org)
• The W3C chartered the OWL Working Group as part of the Semantic Web Activity in September 2007
Meanwhile….. In SEO
• Google Penguin & Panda cause chaos
• Hummingbird slips quietly in
• Link removal becomes the new link building
• ‘Toxic’ links are removed or disavowed
• What to do next???
• A lot of titles get changed on LinkedIn
• Some even leave SEO
• SEO budgets get pulled / reduced ???
What About The Hummingbird?
• A complete rewrite
• Beginning to connect relationships between things (Semantics) – step towards semantic web (lexical ‘onomies – e.g. synonyms)
• Is it a coincidence Penguin and Hummingbird arrived within a short period of each other?
• It didn’t just appear – a lot of work has been going on teaching machines via lexical-semantic ontology learning / natural language processing
– Did Google’s Lexicon get better? Did links matter less because the Lexicon word catalogue got ‘grammar’?? (natural language)
Meanwhile….. In Content Marketing
• Copywriters go crazy with WordPress WYSIWYG editors ….. Everywhere
• Fabulous content gets released…. Everywhere
• With great PR headlines / titles …. Everywhere
• Written for humans …. Not machines (with little or NO SEO)
• We begin to drown in a sea of content
• And everybody’s traffic ends up in their blog ;)
• Demand for CRO goes through the roof ;)
Meanwhile….. Everywhere
We end up with ‘fuzzy ontologies’
“A fuzzy ontology is one of vagueness, …. A domain or knowledge representation which is unclear and imprecise in nature as to what it relates to”
This exists, and ontologists / information scientists work on ways to measure ‘fuzziness’ with measurements of logic, e.g. 0.6 chance it might mean this
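The “0.6 chance it might mean this” idea can be pictured as degrees of membership. The following is only a minimal Python sketch: the term “jaguar”, the candidate concepts and every score are invented for illustration, and the `fuzziness` measure is a crude stand-in, not a method from the fuzzy-ontology literature.

```python
# Hypothetical membership degrees: how strongly an ambiguous term
# belongs to each candidate concept (0.0 = not at all, 1.0 = certain).
fuzzy_memberships = {
    "jaguar": {"animal": 0.6, "car brand": 0.3, "operating system": 0.1},
}


def most_likely_concept(term, memberships):
    """Return the (concept, degree) pair with the highest degree for a term."""
    candidates = memberships[term]
    concept = max(candidates, key=candidates.get)
    return concept, candidates[concept]


def fuzziness(term, memberships):
    """Crude fuzziness score: 1 minus the top degree. Higher means vaguer."""
    _, degree = most_likely_concept(term, memberships)
    return 1.0 - degree


print(most_likely_concept("jaguar", fuzzy_memberships))
```

A page that names the animal's habitat, diet and species would, in this picture, push the “animal” degree closer to 1.0 and the fuzziness closer to 0.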
DON’T BE FUZZY
Don’t Be Fuzzy – There Is Another Way
You can write for humans and machines too
Remember Google’s origins in data mining to organise books(library science)
MAKE LIKE A LIBRARY
Is Your Library Organised?
Does It Look Like This?
Is Your Library Organised?
Or Does It Look Like This?
Vocabularies
Very loose and informal
Taxonomies
Animal
  Mammal
    Canine
    Feline
    Human
  Reptile
“All About Categories and Subcategories – broad to narrow”
Much more powerful – clear order for crawlers (and people too)
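A taxonomy like this one can be held as a nested structure and walked from broad to narrow, which is roughly what a breadcrumb trail does. A minimal Python sketch follows; it assumes Canine, Feline and Human sit under Mammal, with Reptile as a sibling of Mammal, and the function name `path_to` is made up for the example.

```python
# The slide's taxonomy as nested dicts: each category maps to its subcategories.
taxonomy = {
    "Animal": {
        "Mammal": {"Canine": {}, "Feline": {}, "Human": {}},
        "Reptile": {},
    }
}


def path_to(term, tree, trail=()):
    """Return the broad-to-narrow path to a term, or None if it is absent."""
    for name, children in tree.items():
        current = trail + (name,)
        if name == term:
            return list(current)
        found = path_to(term, children, current)
        if found:
            return found
    return None


print(" > ".join(path_to("Feline", taxonomy)))  # Animal > Mammal > Feline
```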
Relationship Ontologies (Knowledge Domain)
[Diagram: the entities Shakespeare, Anne Hathaway, King Lear, Macbeth, UK, England, Scotland and Stratford, connected by the relationships Married, Is In, Lived In, Set In, Wrote and Part Of]
“All About Relationships”
structure AND cross relationships
Avoid Fuzz - When Writing For Web
• Write for people AND machines
• Googlebot can’t read infographics – put words with them
• Don’t dilute your domain ‘theme’ or internal anchor cloud - Don’t link too many ‘irrelevant’ posts to ‘irrelevant’ posts
• Talk about what you do – Obvious (you’d be surprised)
• ‘Related content’ CAN be your commercial pages
• Check content keywords in GWT (Google Webmaster Tools) (surprised?)
• Avoid generics – use a category name in blog post structures (NOT ‘CATEGORY’)
• When engaging - Build a subject domain ‘Lexicon’ and weave it in - Use lexical ‘onomies - synonyms / hyponyms, meronyms, verbs, holonyms, antonyms, associated keywords (Look in GWT content keywords, thesauri, SERPs) - Remember Hummingbird treats synonyms the same so avoid stuffing
Avoid Fuzz - When Writing For Web
• Always get primary terms in somewhere
• Use relevant lexicon words in H1’s, H2’s, H3’s, image alt tags, image file names
• Get primary keywords and sectional keywords in URLs and titles
• Avoid PR headline type titles in your meta title and H1 (riddled with stop words)
• Less is More – It’s all in the hints, clues, site structure (not stuffing or spinning)
• Connect ‘relationships’ where possible - Use contextual internal linking as long as it’s RELEVANT
• Don’t use nonsense / generic / irrelevant post tags (you’d be surprised)
• Build taxonomical cluster menus where possible (very powerful)
• Build sectional themes in categories (keep it very narrow)
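The warning about PR-style titles “riddled with stop words” can be checked roughly by measuring what share of a title is stop words. This is only a sketch: the stop-word list is a small invented sample, the example titles are hypothetical, and no search engine is claimed to score titles this way.

```python
import re

# Small illustrative stop-word list (an assumption; real lists are much longer).
STOP_WORDS = {
    "a", "an", "and", "the", "of", "to", "in", "is", "it", "you",
    "your", "why", "how", "what", "for", "on", "with", "this", "that",
}


def stop_word_ratio(title):
    """Return the fraction of words in a title that are stop words."""
    words = re.findall(r"[a-z0-9']+", title.lower())
    if not words:
        return 0.0
    return sum(word in STOP_WORDS for word in words) / len(words)


for title in ("Why It Is Time To Think About What You Do",
              "Blue Widget Repair Guide"):
    print(f"{stop_word_ratio(title):.2f}  {title}")
```

A high ratio suggests the title carries few primary or sectional keywords and may be worth rewriting.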
Avoid Fuzz – When Developing For Web
• Infinite loops (even in small WordPress sites) (each churn leaks relevance)
• Thin ‘panda vulnerable’ irrelevant content
• Incorrectly implemented canonicalization (pages NOT the same)
• 301 redirects to irrelevant pages
• Dilution through poor use of URL parameters and faceted navigations
• Bulk ‘relevance’ together in sections – connect horizontally and vertically - Use ‘flattish’ silos for strong site sections combined with cross module internal contextual linking
• Build primary keyword presence through boilerplates
• Keep things moving – ‘action’ in commercial pages (something that changes that is HIGHLY relevant to the sectional theme)
• Products?? What products? Avoid Generics
• All roads eventually lead to commercial targets – most internally linked to
• Use breadcrumbs & mega menus but avoid ‘jumble sales’ – look at conditional / highly related sectional menus (e.g. Widget Logic in WordPress)
• Name and categorise XML sitemaps, add categories as sites in GWT
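Naming and categorising XML sitemaps could be automated along the following lines. This is a hedged sketch using only the Python standard library: it assumes the first URL path segment is the site section, the `sitemap-<section>.xml` naming scheme and the example.com URLs are invented, and it only builds the XML (submitting sitemaps or sections in GWT is a separate manual step).

```python
import xml.etree.ElementTree as ET
from collections import defaultdict
from urllib.parse import urlparse

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_category_sitemaps(urls):
    """Group URLs by first path segment and return {filename: xml_string}."""
    grouped = defaultdict(list)
    for url in urls:
        segments = [part for part in urlparse(url).path.split("/") if part]
        category = segments[0] if segments else "root"
        grouped[category].append(url)

    sitemaps = {}
    for category, members in grouped.items():
        urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
        for url in members:
            ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = url
        filename = f"sitemap-{category}.xml"
        sitemaps[filename] = ET.tostring(urlset, encoding="unicode")
    return sitemaps


example_urls = [
    "https://example.com/widgets/blue",
    "https://example.com/widgets/red",
    "https://example.com/gadgets/mini",
]
for name in sorted(build_category_sitemaps(example_urls)):
    print(name)
```

One sitemap per section makes it easier to see which part of the site has indexing problems.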
Bring It On…..
It’s easy to drop the ‘relevance’ ball at first base through SEO neglect in pursuit of links / content marketing
Exhaust all possibilities via lexical relevance and internal links first then move up
A page beats a page, not a site beats a site
A more ‘relevant’ page will still beat you – even without external links
Because…. Where is the ontology??
Dumb Machines
• Disambiguation - A tomato is a fruit
• So a tomato goes in a fruit salad?? Duh
• Machines are still a bit stupid
Less Dumb Machines
Irony - Still Quite Dumb Though
Happy Ending
Not So Happy Ending
Don’t Neglect SEO
Take it for what it was always meant to be – relationships, domain ontology, library science
- Winning on relevance first then getting the final votes in a photo finish
- Googlebot is still your primary persona (unless you don’t want organic traffic ;))
- Everything affects everything in your site’s ‘world’ representation (every word, every internal link, every developer file upload)
REMEMBER THIS
“You can have the best dress in the world…. But if you’re in a dark room with the lights turned off, no-one will see it…”
SEO still counts
“If you build it RIGHT… they will come”
Me: @dawieando
Talk In Triples (RDF)
SUBJECT | PREDICATE | OBJECT
Shakespeare | Wrote | Hamlet
England | Is Part Of | UK
Brad Pitt | Is Married To | Angelina Jolie
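The three triples above can be stored and queried with very little code. The sketch below uses plain Python rather than an RDF library, so it only illustrates the subject / predicate / object shape; the `Triple` type and the `match` helper are made up for the example.

```python
from collections import namedtuple

# A triple is just three slots: subject, predicate, object.
Triple = namedtuple("Triple", "subject predicate object")

triples = [
    Triple("Shakespeare", "wrote", "Hamlet"),
    Triple("England", "is part of", "UK"),
    Triple("Brad Pitt", "is married to", "Angelina Jolie"),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return triples matching every given field; None acts as a wildcard."""
    return [
        t for t in triples
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.object == obj)
    ]


for t in match(triples, predicate="wrote"):
    print(t.subject, t.predicate, t.object)  # Shakespeare wrote Hamlet
```

Asking “what is England part of?” becomes `match(triples, subject="England", predicate="is part of")`, which is the kind of question a machine can answer once content is expressed as relationships.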
Use Semantic Relationships (Lexical ‘Onomies)
In words and taxonomies (linked and unlinked)
FUNCTIONALITY | RELATIONSHIP | CONCEPT | EXAMPLES
Describing relationships | Synonymy | Similarities | “buy” and “purchase”, “big” and “large”
Describing relationships | Antonymy | Differences (opposites) | “wet” and “dry”
Describing relationships | Hyponymy | Specialization | “Red is a colour”
Describing relationships | Meronymy | Part / Whole | “Finger is a meronym of hand”
Describing relationships | Holonymy | Whole / Part | “Hand is a holonym of finger”
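A subject-domain lexicon built on these relationship types might be kept as a simple lookup. The following Python sketch is hand-built from the examples in the table; the relation keys (`hyponym_of`, `meronym_of`, `holonym_of`) and both helper functions are hypothetical names, not part of any lexical database.

```python
# Hand-built lexicon reusing the table's examples.
lexicon = {
    "synonym": {"buy": {"purchase"}, "big": {"large"}},
    "antonym": {"wet": {"dry"}},
    "hyponym_of": {"red": {"colour"}},    # red is a kind of colour
    "meronym_of": {"finger": {"hand"}},   # finger is a part of hand
    "holonym_of": {"hand": {"finger"}},   # hand is the whole containing finger
}


def related(word, relation):
    """Return the words linked to `word` by the given relation, sorted."""
    return sorted(lexicon.get(relation, {}).get(word, set()))


def synonym_group(word):
    """Return a word together with its synonyms, looked up in either direction."""
    group = {word}
    for head, synonyms in lexicon["synonym"].items():
        if word == head or word in synonyms:
            group |= {head} | synonyms
    return sorted(group)


print(related("finger", "meronym_of"))  # ['hand']
print(synonym_group("purchase"))        # ['buy', 'purchase']
```

Grouping synonyms this way makes it easy to spot copy that repeats one idea under several words, which is the stuffing the slides warn against.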
Further Reading
1. Semantic Web For The Working Ontologist – Dean Allemang / Jim Hendler
2. Studies On The Semantic Web – Perspectives On Ontology Learning – Jens Lehmann / Johanna Völker
3. OWL: Representing Information Using The Web Ontology Language – Lee W Lacy
4. Semantic Web Programming – Hebeler, Fisher, Blace, Perez-Lopez
5. Programming The Semantic Web – Segaran, Evans & Taylor
Further Reading
1. http://wortschatz.uni-leipzig.de/~cbiemann/pub/2005/OntoML05proceedings.pdf
2. http://infolab.stanford.edu/~backrub/google.html
3. http://searchengineland.com/killer-seo-string-entity-optimization-171094
4. http://www.seobythesea.com/category/fact-extraction/
5. http://ilpubs.stanford.edu:8090/421/1/1999-65.pdf
6. Fuzzy Ontologies - http://link.springer.com/chapter/10.1007%2F978-3-540-77581-2_10#page-1