Technology for E-commerce, Helena Ahonen-Myka

Post on 21-Jan-2016

Technology for E-commerce

Helena Ahonen-Myka

In this part...

search tools

metadata

personalization

collaborative filtering

data mining

Search tools

the site has to be accessible

site architecture and navigation structure is important

… but some users prefer search

keep users on the site

usage can be monitored: useful knowledge about the users’ needs

Users’ preferences

search: 50%

navigation: 20%

mixed: the rest...

Search tools

Indexer: gathers the words from documents (HTML pages, local files, database records) and puts them into an index file

Search engine: accepts queries, locates the relevant pages in the index, and formats the results in an HTML page
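
The split between indexer and search engine can be illustrated with a small in-memory sketch. The function names `build_index` and `search`, the sample pages, and the simple tokenization are assumptions made for illustration; a real search tool would also rank the results and format them as an HTML page.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Indexer: map each word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index


def search(index, query):
    """Search engine: return the ids of documents containing every query word."""
    words = tokenize(query)
    if not words:
        return []
    hits = set.intersection(*(index.get(word, set()) for word in words))
    return sorted(hits)


# Hypothetical site content, keyed by page path.
documents = {
    "/books.html": "Books on data mining and search",
    "/music.html": "Music and CDs",
    "/help.html": "How to search the site",
}
index = build_index(documents)
print(search(index, "search"))
print(search(index, "data mining"))
```

The index is built once and queried many times, which is why indexing cost and index size matter more than the cost of a single lookup.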

Remote vs local search

the search tool can reside on a different server, also in a remote location

indexing may take a lot of processing time, and the resulting index may need a lot of space

local software may be faster

Indexer

local: scans directories

web spider: an indexing robot begins at a given page, then follows the links and stores the words of the pages

'robots.txt' file: which robots are allowed

HTML meta elements:

<meta name="robots" content="noindex, follow">
<meta name="robots" content="index,nofollow">
<meta name="robots" content="noindex,nofollow">
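
A well-behaved spider checks both mechanisms before indexing a page. The following is a minimal sketch using Python's standard `urllib.robotparser` and `html.parser` modules; the robot name `demo-bot`, the `example.com` URLs, and the helper names are hypothetical.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


def allowed_by_robots_txt(robots_txt, agent, url):
    """Check a URL against the text of a robots.txt file."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


class RobotsMetaParser(HTMLParser):
    """Collect the index/follow directives of <meta name="robots"> elements."""

    def __init__(self):
        super().__init__()
        self.index = True
        self.follow = True

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            content = attrs.get("content") or ""
            directives = {d.strip().lower() for d in content.split(",")}
            if "noindex" in directives:
                self.index = False
            if "nofollow" in directives:
                self.follow = False


def robots_meta(html):
    """Return (may_index, may_follow) for one HTML page."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.index, parser.follow


# Hypothetical robots.txt content for the example.
ROBOTS_TXT = "User-agent: *\nDisallow: /private/\n"
print(allowed_by_robots_txt(ROBOTS_TXT, "demo-bot", "http://example.com/private/a.html"))
print(robots_meta('<meta name="robots" content="noindex, follow">'))
```

The robots.txt file applies to whole paths on the server, while the meta element lets an individual page opt out of indexing, link following, or both.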

Indexer

link structure should reach all the pages that should be indexed

non-text links (imagemaps etc.): robots may not be able to follow them -> also provide text links

frames: provide some navigational links to give a context, if the page is retrieved by a query

Search page

search forms are the user interface of the search engine

simple form: just a text field and a button

or an advanced search page: Boolean search, date ranges, subscopes...

Search results

the occurrences of the query terms are located in the index

the results are sorted according to their (assumed) relevance to the query

the results page should have the same look-and-feel as the other pages on the site

Why do searches fail?

empty searches: people just press the search button without giving any words

wrong scope: people think they are searching the entire web

vocabulary mismatch: terms are too specific, too general, or just not used

spelling mistakes

query requirements not met

Why do searches fail?

problems with query syntax: spaces, parentheses, etc.

capitalization and special characters: exact matches required

stopwords: some common words are not indexed

short words: short words are not indexed

numbers are not indexed
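
Several of these failures appear when the query contains words that the indexer never stored. As a rough sketch, the same normalization can be applied to documents and queries, and the search page can report which query words were ignored. The stopword list, the minimum length, and the function names are illustrative assumptions, not the rules of any particular search tool.

```python
import re

STOPWORDS = {"a", "an", "and", "of", "or", "the", "to"}  # illustrative list
MIN_LENGTH = 2  # assumed threshold for "short words"


def raw_tokens(text):
    """Lowercase the text and drop special characters."""
    return re.findall(r"[a-z0-9]+", text.lower())


def normalize(text, keep_numbers=True):
    """Return the terms an indexer with these rules would actually store."""
    terms = []
    for token in raw_tokens(text):
        if token in STOPWORDS:
            continue
        if len(token) < MIN_LENGTH:
            continue
        if token.isdigit() and not keep_numbers:
            continue
        terms.append(token)
    return terms


def dropped_terms(query, keep_numbers=True):
    """List the query words that would be silently ignored."""
    kept = set(normalize(query, keep_numbers))
    return [token for token in raw_tokens(query) if token not in kept]


print(normalize("The C++ Guide to Windows 95"))
print(dropped_terms("The C++ Guide to Windows 95", keep_numbers=False))
```

Showing the dropped words on the results page gives the user a concrete explanation instead of an unexplained empty result.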

No-matches pages

pages shown to the user if the search does not return any matches

should have the same look-and-feel as the other pages + navigation aids + search again field

explanations why the search might have failed and what to do next

Some usability issues

web design: strong sense of structure and navigation support

some people do not like to search

people who search end up on some page: they should know where they are

people need to move around in the neighborhood

search should be available on every page

Some usability issues

scoped search: difficult for the users to understand what the scope is -> the scope should be stated clearly, and a search of the entire site has to be easy to reach

Boolean search is difficult: 'cats and dogs' vs 'cats or dogs' -> 'or' could be used in the query, 'and' in the ordering
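
The "'or' in the query, 'and' in the ordering" idea can be sketched as follows: every page that matches at least one term is returned, and pages matching more of the terms are listed first. The function name and the sample pages are hypothetical.

```python
def or_search_and_rank(documents, query):
    """Match pages containing any query term; order by number of terms matched."""
    terms = set(query.lower().split())
    scored = []
    for doc_id, text in documents.items():
        words = set(text.lower().split())
        matched = len(terms & words)
        if matched:
            scored.append((matched, doc_id))
    # Most matched terms first; ties are broken alphabetically by page id.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [doc_id for _, doc_id in scored]


# Hypothetical pages for the example.
pages = {
    "a": "cats and dogs living together",
    "b": "all about cats",
    "c": "dogs for sale",
    "d": "parrots",
}
print(or_search_and_rank(pages, "cats dogs"))
```

The user never has to choose an operator: pages about both cats and dogs come first, and pages about only one of them still appear further down.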

Metadata

often a search results in a long list of matches; many of them may be irrelevant

metadata can make the queries more powerful

HTML meta elements

<head profile="http://www.acme.com/profiles/core">
  <title>How to complete memo cover sheets</title>
  <meta name="author" content="John Doe">
  <meta name="copyright" content="&copy; 2000 Acme">
  <meta name="keywords" content="corporate, guidelines, cataloging">
  <meta name="date" content="2000-10-17">
</head>
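
An indexer can read such meta elements and store them as separate fields, so that a query can be restricted to, say, the author or the keywords. The following is a minimal sketch with Python's standard `html.parser`; the class and function names are assumptions for illustration.

```python
from html.parser import HTMLParser


class MetaExtractor(HTMLParser):
    """Collect name/content pairs from the meta elements of a page."""

    def __init__(self):
        super().__init__()
        self.metadata = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attrs = dict(attrs)
        name, content = attrs.get("name"), attrs.get("content")
        if name and content is not None:
            self.metadata[name.lower()] = content


def extract_metadata(html):
    parser = MetaExtractor()
    parser.feed(html)
    return parser.metadata


page = """<head profile="http://www.acme.com/profiles/core">
<title>How to complete memo cover sheets</title>
<meta name="author" content="John Doe">
<meta name="keywords" content="corporate, guidelines, cataloging">
<meta name="date" content="2000-10-17">
</head>"""

metadata = extract_metadata(page)
keywords = [k.strip() for k in metadata["keywords"].split(",")]
print(metadata["author"], keywords)
```

With the fields separated, a search for "John Doe" as author no longer matches pages that merely mention the name in their body text.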

Metadata

RDF (Resource Description Framework):

– Provides a means to define metadata for XML and HTML documents

– Provides a means to interchange it between different applications on the Web

Example: Dublin Core metadata

– Contains 15 elements (title, creator, date…)

Dublin Core

Dublin Core Metadata Elements:

Content: Title, Subject, Description, Language, Relation, Coverage

Intellectual Property: Creator, Publisher, Contributor, Rights

Instance: Date, Type, Format, Identifier

Dublin Core in RDF

<RDF:RDF>
  <RDF:Description RDF:HREF="URI">
    <DC:Relation>
      <RDF:Description>
        <DC:Relation.Type>isPartOf</DC:Relation.Type>
        <RDF:Value RDF:HREF="URI2"/>
      </RDF:Description>
    </DC:Relation>
  </RDF:Description>
</RDF:RDF>

Dublin Core represented in RDF

Searching XML documents

structure of XML documents can be used to make more precise queries, e.g. find Albert Einstein in Author element only

problem: how does the user specify the structure?

Searching XML documents

1) The user specifies the hierarchy in the query: Einstein in Author

2) The user makes a simple query, but the search engine presents the alternative contexts: Einstein can be in Author or in Street or in School
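
Both approaches can be sketched with Python's standard `xml.etree.ElementTree`. The sample document, its element names, and the two helper functions are invented for illustration; matching is a simple case-insensitive substring test.

```python
import xml.etree.ElementTree as ET

# Hypothetical document in which "Einstein" occurs in several contexts.
CATALOG = """<catalog>
  <book><author>Albert Einstein</author><title>Relativity</title></book>
  <book><author>Abraham Pais</author><title>Subtle is the Lord: Albert Einstein</title></book>
  <school><name>Einstein High School</name><street>Einstein Road</street></school>
</catalog>"""


def find_in_element(root, term, tag):
    """Approach 1: return the texts of <tag> elements that contain the term."""
    term = term.lower()
    return [el.text for el in root.iter(tag)
            if el.text and term in el.text.lower()]


def contexts_of(root, term):
    """Approach 2: list the element names in which the term occurs."""
    term = term.lower()
    return sorted({el.tag for el in root.iter()
                   if el.text and term in el.text.lower()})


root = ET.fromstring(CATALOG)
print(find_in_element(root, "Einstein", "author"))
print(contexts_of(root, "Einstein"))
```

In the second approach the search engine would present the list of contexts and let the user pick one, so the user does not need to know the element names in advance.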

Using links

good site: many links into the site, particularly from other good sites

text surrounding the link describes (probably) what the target of the link is about

the knowledge above + the contents of the page itself are taken into account

e.g. Google (www.google.com)
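
The idea that links from good sites count for more than links from other sites can be sketched with a simplified PageRank-style iteration. This is an illustrative assumption about how link-based scoring can work, not a description of any search engine's actual ranking; the damping factor, the iteration count, and the toy link graph are hypothetical.

```python
def link_scores(links, damping=0.85, iterations=50):
    """Score pages so that links from highly scored pages are worth more."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    score = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        new = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its score on in equal shares to its targets.
                share = damping * score[page] / len(targets)
                for target in targets:
                    new[target] += share
            else:
                # A page without outgoing links spreads its score evenly.
                for other in pages:
                    new[other] += damping * score[page] / n
        score = new
    return score


# Hypothetical link graph: page -> pages it links to.
links = {
    "hub": ["shop", "blog"],
    "blog": ["shop"],
    "shop": ["hub"],
    "orphan": ["shop"],
}
scores = link_scores(links)
print(sorted(scores, key=scores.get, reverse=True))
```

A real search engine would combine such a link score with the link text and the contents of the page itself, as noted above.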

Natural language queries

E.g. Ask Jeeves

questions and answers prepared by human editors

user’s query is mapped to the prepared queries

Personalization

goal: the right people receive the right information at the right time

but: people do not like to state complex queries, or initialize a service (like answering a questionnaire)

user profiles have to be generated and stored, preferably automatically

User profiles

may contain data like: interests, geographical area, age

could be collected once, and shared with many services

trust of the user: the profile should only be used to offer better service, and only if the user allows a service to use it

Recommendations

users who bought this book also bought these books / liked these CDs etc.

rating movies, TV programs, wines…

recommending paths on a site

Recommendations

based on the user’s former behavior and profile data

based on social (collaborative) filtering: what similar users liked

User’s former behavior

if used as the only source: the user never sees anything new

a new user in particular gets hardly any recommendations

Collaborative filtering

draws on the experiences of a population or community of users

the profile information of the target user is compared to the profiles of nearest-neighbor users

look for correlation between users in terms of their ratings: recommend items that are included in the neighbors’ profiles but not in the target user’s profile
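
A minimal sketch of this nearest-neighbor scheme, assuming ratings are stored as one dictionary per user and that similarity is measured with the Pearson correlation over the items both users have rated. The user names, items, ratings, and the choice of `k=2` neighbors are invented for illustration.

```python
from math import sqrt


def pearson(a, b):
    """Correlation of two users' ratings over the items both have rated."""
    common = sorted(set(a) & set(b))
    if len(common) < 2:
        return 0.0
    mean_a = sum(a[i] for i in common) / len(common)
    mean_b = sum(b[i] for i in common) / len(common)
    cov = sum((a[i] - mean_a) * (b[i] - mean_b) for i in common)
    var_a = sum((a[i] - mean_a) ** 2 for i in common)
    var_b = sum((b[i] - mean_b) ** 2 for i in common)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)


def recommend(ratings, target, k=2):
    """Recommend items rated by the k most similar users but not by the target."""
    neighbors = sorted(
        ((pearson(ratings[target], profile), user)
         for user, profile in ratings.items() if user != target),
        reverse=True,
    )[:k]
    scores = {}
    for similarity, user in neighbors:
        if similarity <= 0:
            continue  # ignore users whose taste is unrelated or opposite
        for item, rating in ratings[user].items():
            if item not in ratings[target]:
                scores[item] = scores.get(item, 0.0) + similarity * rating
    return sorted(scores, key=lambda item: (-scores[item], item))


# Hypothetical rating profiles (1 = disliked, 5 = liked).
ratings = {
    "ann":  {"book_a": 5, "book_b": 3, "book_c": 4},
    "bob":  {"book_a": 5, "book_b": 2, "book_c": 4, "book_d": 5},
    "cara": {"book_a": 1, "book_b": 5, "book_c": 2, "book_e": 4},
    "dave": {"book_a": 4, "book_b": 3, "book_c": 5, "book_f": 3},
}
print(recommend(ratings, "ann"))
```

Here "cara" rates the shared books in the opposite way to "ann", so her items are not recommended even though "ann" has not seen them.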

Collaborative filtering

Problems:

cannot recommend new items (some users have to rate an item before it can be recommended)

an unusual user may not get (good) recommendations: no neighbors that are close enough

Matching engines

Apply one set of complex characteristics to another

e.g., recruiting sites: match a job seeker and a job

Data mining for e-commerce

users’ behavior on the web site provides a lot of information:

Which pages do the users view?

Which paths do the users navigate?

How long do the users spend on the site?

What is the ratio of viewing a product to purchasing it?

Data mining process

Gathering the data

Cleaning/preprocessing the data

Transforming the data

Analysis / finding general models

Interpreting the results

Using the knowledge

Data collection

clickstream logging: web server logs or packet sniffers

business event logging

Clickstream logging

web log: page requested, time of request, client IP address, etc.

a lot of requests for images -> have to be filtered out

users and user sessions difficult to identify

requests for a page: the same page, but different dynamic content
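
The filtering step can be sketched as follows, assuming the web server writes its log in the widely used Common Log Format. The regular expression, the list of image suffixes, and the sample log lines are assumptions for illustration.

```python
import re
from datetime import datetime

# One Common Log Format line: host, identity, user, [time], "request", status, size.
LOG_LINE = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+'
)
IMAGE_SUFFIXES = (".gif", ".jpg", ".jpeg", ".png")  # assumed list


def page_requests(lines):
    """Parse log lines and keep only the requests that are not for images."""
    requests = []
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that do not follow the expected format
        path = match.group("path")
        if path.split("?", 1)[0].lower().endswith(IMAGE_SUFFIXES):
            continue
        when = datetime.strptime(match.group("time"), "%d/%b/%Y:%H:%M:%S %z")
        requests.append((match.group("host"), when, path))
    return requests


# Hypothetical log lines.
sample = [
    '192.0.2.1 - - [17/Oct/2000:13:55:36 +0300] "GET /index.html HTTP/1.0" 200 2326',
    '192.0.2.1 - - [17/Oct/2000:13:55:37 +0300] "GET /img/logo.gif HTTP/1.0" 200 4512',
    '192.0.2.7 - - [17/Oct/2000:13:56:02 +0300] "GET /product?id=42 HTTP/1.0" 200 1834',
]
for host, when, path in page_requests(sample):
    print(host, when.isoformat(), path)
```

The sketch also shows why users and sessions are hard to identify from such logs: the only identifying field is the client address, which several users may share.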

Clickstream logging

more efficient at the application server layer

instead of just pages, knowledge on products

user and session tracking possible

can also track information absent from web server logs: pages that were aborted while being downloaded

Business event logging

looking at subsets of requests as one logical event or episode:

add/remove item to/from shopping cart

initiate/finish checkout

search (log keywords and number of results)

register

From order data to customers

collected data is order-oriented

data for each customer is spread over many records

information on customers is the real target

information for each customer has to be aggregated

From order data to customers

What percentage of each customer’s orders used a VISA credit card?

How much money does each customer spend on books?

What is the frequency of each customer’s purchases?
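
Answering such questions means turning order records into one record per customer. The following is a minimal sketch, assuming each order is a dictionary with hypothetical `customer`, `card`, `category`, and `amount` fields; the number of orders stands in for purchase frequency, which would otherwise need order dates and an observation period.

```python
from collections import defaultdict


def customer_summary(orders):
    """Aggregate order-oriented records into one summary per customer."""
    grouped = defaultdict(list)
    for order in orders:
        grouped[order["customer"]].append(order)

    summary = {}
    for customer, rows in grouped.items():
        visa_orders = sum(1 for row in rows if row["card"] == "VISA")
        book_spend = sum(row["amount"] for row in rows if row["category"] == "books")
        summary[customer] = {
            "orders": len(rows),
            "visa_share": visa_orders / len(rows),
            "book_spend": book_spend,
        }
    return summary


# Hypothetical order records.
orders = [
    {"customer": "c1", "card": "VISA", "category": "books", "amount": 30.0},
    {"customer": "c1", "card": "VISA", "category": "music", "amount": 15.0},
    {"customer": "c1", "card": "other", "category": "books", "amount": 20.0},
    {"customer": "c2", "card": "other", "category": "books", "amount": 12.5},
]
summary = customer_summary(orders)
print(summary["c1"])
```

The resulting per-customer records are the kind of input that the model generation step works on.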

Model generation

Answer questions like:

What characterizes heavy spenders?

What characterizes customers that prefer promotion X over Y?

What characterizes customers that buy quickly?

What characterizes visitors that do not buy?

Data mining tools

e.g., classification rules

IF Income > $80,000
AND Age <= 30
AND Average Session Duration is between 10 AND 20 minutes
THEN Heavy spender
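
Once found, such a rule is easy to apply to new customer records. A sketch of the rule above as a predicate follows; the function and parameter names are hypothetical, and treating "between 10 and 20 minutes" as inclusive at both ends is an assumption.

```python
def is_heavy_spender(income, age, avg_session_minutes):
    """Apply the example classification rule to one customer record."""
    return (
        income > 80_000
        and age <= 30
        # Assumption: the bounds of "between 10 and 20 minutes" are inclusive.
        and 10 <= avg_session_minutes <= 20
    )


print(is_heavy_spender(income=95_000, age=28, avg_session_minutes=15))
print(is_heavy_spender(income=95_000, age=45, avg_session_minutes=15))
```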

Understanding the results

result of a data mining process may be difficult for a business user to understand: e.g. thousands of rules

visualization is important, tailored for a specific domain

Using the results

site structure can be updated

procedures like registering or checking out can be simplified

metadata can be added to make search more efficient

personalization rules, recommender systems