7ET023 – MSc Dissertation
Student Name: Colin Hopson
Student Number: 0482647
Course Title: MSc Computer Science (Internet Engineering)
Research Question: What is the most suitable web mining technique for a specified business and mobile application case study?
Contents:
1 – Introduction to the subject of web mining and techniques
2 – Overview of research conducted (both theory and practical)
3 – Software applications on which to test web mining techniques
4 – Demonstration (Digital Solutions and Repairs)
5 – Evaluating results (suitability and practicality)
1 – Introduction to the subject of web mining and techniques
Sequential research of techniques for an empirical study
Initial research into data mining (databases)
Previous knowledge of web services (RSS, REST, etc.)
Research into theory of web mining
Web usage mining – logs to examine navigation patterns
Web structure mining – examine link hierarchy
Web content mining – “the discovery of useful information from the Web by examining the data that is contained in the Web site” (Pendharkar, 2003, p. 243)
* Pendharkar, P.C. (2003) Managing data mining technologies in organizations: techniques and applications, Idea Group Pub, Hershey.
Data extraction from HTML (machine learning algorithms)
Wrapper Induction
Semi-Automatic Extraction
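Wrapper induction can be illustrated with a very small sketch. It assumes the simplest "left/right delimiter" form of wrapper: from a few labelled example pages it learns the text that always precedes and always follows the target value, then uses those delimiters on an unseen page. The function names and the sample markup are hypothetical and are not taken from the dissertation.

```python
import os


def common_prefix(strings):
    """Longest prefix shared by every string."""
    return os.path.commonprefix(list(strings))


def common_suffix(strings):
    """Longest suffix shared by every string."""
    reversed_prefix = os.path.commonprefix([s[::-1] for s in strings])
    return reversed_prefix[::-1]


def induce_wrapper(examples):
    """Learn (left, right) delimiters from (page, target) pairs."""
    befores, afters = [], []
    for page, target in examples:
        start = page.index(target)          # first occurrence of the label
        befores.append(page[:start])        # text before the target
        afters.append(page[start + len(target):])  # text after the target
    return common_suffix(befores), common_prefix(afters)


def apply_wrapper(wrapper, page):
    """Extract the value that sits between the learned delimiters."""
    left, right = wrapper
    start = page.index(left) + len(left)
    end = page.index(right, start)
    return page[start:end]


# Hypothetical labelled examples: each page with the price it contains.
EXAMPLES = [
    ('<li><b>Laptop</b> <span class="price">499.00</span></li>', "499.00"),
    ('<li><b>Router</b> <span class="price">35.50</span></li>', "35.50"),
]

wrapper = induce_wrapper(EXAMPLES)
unseen = '<li><b>Monitor</b> <span class="price">120.00</span></li>'
print(apply_wrapper(wrapper, unseen))
```

A wrapper learned this way only holds while the site keeps the same markup, which is the usual weakness of format-specific extraction.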
2 – Overview of research conducted (both theory and practical)
Researching Theory of Data and Web Mining
Empirical research method to acquire knowledge,
Research into data mining, web mining, data extraction algorithms, etc.,
Sequential investigation of applicable techniques.
Artefact Design and Development
E-commerce prototype website (Digital Solutions and Repairs),
Mobile application (Mobile Shopper).
Practical Research to Implement Techniques
Resolution of web services (Amazon APIs),
HTML extraction technique using XML, DOM, XPath and PHP arrays,
Consuming Google API with REST, DOM, XPath and PHP arrays,
Third-party software (Newprosoft and Automation Anywhere),
Functionality of XSLT.
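The HTML extraction approach listed above (parse the page into a DOM, select nodes with XPath, collect the values into arrays) can be sketched as follows. The slides name PHP's DOMDocument; this sketch swaps in Python's standard-library ElementTree, which supports only a subset of XPath and needs well-formed markup. The sample page, class names and function name are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed product listing used only for illustration.
SAMPLE_PAGE = """
<html><body>
  <div class="product"><h2>USB Hub</h2><span class="price">12.99</span></div>
  <div class="product"><h2>HDMI Cable</h2><span class="price">7.50</span></div>
</body></html>
"""


def extract_products(markup):
    """Return one dict per product node found in the markup."""
    root = ET.fromstring(markup.strip())
    products = []
    # XPath subset: every <div class="product"> anywhere below the root.
    for node in root.findall(".//div[@class='product']"):
        products.append({
            "name": node.findtext("h2"),
            "price": float(node.findtext("span[@class='price']")),
        })
    return products


for product in extract_products(SAMPLE_PAGE):
    print(product["name"], product["price"])
```

The XPath expressions are tied to one site's markup, so a layout change on that site would require the expressions to be rewritten.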
3 – Software applications on which to test web mining techniques
4 – Demonstration (Digital Solutions and Repairs)
Web Mining Technique 1: Amazon API (coded class/methods)
Web Mining Technique 2: HTML Extraction (DOMDocument, XPath and PHP Arrays)
Web Mining Technique 3: Google API (REST, DOMDocument, XPath and PHP Arrays)
Web Mining Technique 4: Third-Party Software (Automation Anywhere and Newprosoft)
Web Mining Technique 5: None implemented, but XSLT investigated
Website Demonstration >>>
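The REST-based technique in the list above pairs a keyed HTTP request with XML parsing of the response. A minimal sketch of that pattern follows, in Python rather than the PHP used for the demonstration. The endpoint, parameter names and response layout are hypothetical and do not describe Google's actual API; no network request is made, and a canned response stands in for the service.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Hypothetical endpoint, for illustration only.
BASE_URL = "https://api.example.com/products"

# Hypothetical XML response standing in for a real web-service reply.
SAMPLE_RESPONSE = """<feed>
  <entry><id>g-001</id><title>USB Hub</title><price>12.99</price></entry>
  <entry><id>g-002</id><title>HDMI Cable</title><price>7.50</price></entry>
</feed>"""


def build_search_url(api_key, query):
    """Compose the GET URL for a keyed product search."""
    params = {"key": api_key, "q": query, "alt": "xml"}
    return BASE_URL + "?" + urlencode(params)


def parse_entries(xml_text):
    """Turn each <entry> element into a plain dict."""
    root = ET.fromstring(xml_text)
    return [
        {
            "id": entry.findtext("id"),
            "title": entry.findtext("title"),
            "price": float(entry.findtext("price")),
        }
        for entry in root.findall("./entry")
    ]


print(build_search_url("DEMO_KEY", "usb hub"))
print(parse_entries(SAMPLE_RESPONSE))
```

In a live setting the URL would be fetched over HTTP and the response body passed to the parser; the registration key mentioned in the slides would take the place of the placeholder key.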
5 – Evaluating results (suitability and practicality)
Web Mining Technique 1: Amazon API
Requires registration and associate keys,
Product Advertising API covers most requirements (plus more),
ASINs assist administration system,
Top quality delivery and discounts,
Regular updates although lengthy documentation.
Web Mining Technique 2: HTML Extraction
No cost, but requires programming knowledge,
Bespoke algorithm specific for HTML format,
Limited to one online organisation.
Web Mining Technique 3: Google API
Requires registration and associate keys,
Searches products from many online organisations,
GoogleId does not assist administration system,
Web service retrieves limited product information,
Top security measures, but lengthy documentation.
Web Mining Technique 4: Third-Party Software
Limited free trial with subscription costs,
Possible difficulty with integration with administration system.
Web Mining Technique 5: XSLT investigated
Limited free trial with subscription costs,
Integration difficulties with administration system.
SUMMARY
Study of web mining and some of its techniques
Empirical study, data mining, web services, web content mining, data extraction algorithms.
Sequential research conducted (theory and practical)
Web services (APIs), HTML extraction, third-party software, XSLT.
E-commerce prototype website and mobile application
‘Digital Solutions and Repairs’ and ‘Mobile Shopper’.
Demonstration of web mining techniques
DSR computer repairs administration system.
Evaluation of web mining techniques investigated
Comparison between APIs, HTML extraction, third-party software and XSLT.
Questions?