thesis presentation v4
Cork Institute of Technology - Candidate for Master of Science Degree
Using Big Data Analytics in a Social Domain
Master’s in Cloud Computing 2013/2014
Ahmed Abdel-Aziz May 2015
EMCCAe, CISSP, PMP
Objective
1) Social Media, Analytics and the Marketing Campaign
2) Sentiment Analysis – Methodology & Techniques
3) The Need for Case-Study & an Analytics Prototype
4) Learning Outcomes & Future Work
Section 1 of 4: Social Media & Social Analytics
• Social media sites are the offspring of the Web 2.0 movement
  – Based on the cloud computing model (Software-as-a-Service)
• 88% of companies use social media for marketing
Section 1 of 4: Marketing Campaign Lifecycle
• Consists of 5 phases
• Social analytics answers social questions for each phase
  – Example: What is the sentiment trend?
Section 2 of 4: Sentiment Analysis Methodology & Techniques
• Social analytics projects based on sentiment analysis benefit from a well-thought-out methodology
Section 2 of 4: Sentiment Analysis Methodology & Techniques
• Social sentiment analysis starts with social listening
  – Social listening can be performed using a variety of open source tools such as PostgreSQL, R, Wordle, and Circos, as well as tools such as Attensity 360 and Analyze
• Social data comes from 3 main categories of sources
  – Social user's account: analytic capability is limited by the social media provider (FB, Twitter, LinkedIn)
  – Social APIs: the social media provider offers an API to tap into social data, which allows development of unique analytic programs
  – 3rd party tools: provide very fast results but do not offer the same level of analytic capability as a custom program
Section 2 of 4: Sentiment Analysis Methodology & Techniques
• Sentiment analysis techniques are grouped into two main categories:
  – Supervised machine learning method
  – Unsupervised method
• The supervised learning method learns the features/words that correlate with +ve/-ve sentiment, and can then identify the sentiment of new text
• In unsupervised methods, a lexicon is used with words pre-scored for polarity values; the sum of the scores gives the sentiment
• Both techniques are widely used and offer comparable results
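The lexicon-based (unsupervised) approach can be illustrated with a minimal R sketch. The lexicon, its polarity values, the function names `score_text` and `label_sentiment`, and the sample texts are all assumptions made up for illustration; they are not taken from the thesis, and a real lexicon would be far larger.

```r
# Hypothetical mini-lexicon: words pre-scored for polarity (illustrative values only).
lexicon <- c(good = 1, great = 2, love = 2, bad = -1, awful = -2, hate = -2)

# Score one text: sum the polarity scores of the words found in the lexicon.
score_text <- function(text, lexicon) {
  words <- strsplit(tolower(text), "[^a-z]+")[[1]]
  words <- words[nzchar(words)]      # drop empty tokens
  scores <- lexicon[words]           # NA for words that are not in the lexicon
  sum(scores, na.rm = TRUE)
}

# Map a summed score to a sentiment label.
label_sentiment <- function(score) {
  if (score > 0) "positive" else if (score < 0) "negative" else "neutral"
}

# Made-up sample texts.
texts <- c("I love this phone, great screen",
           "Awful battery, I hate it",
           "It is a phone")

scores <- vapply(texts, score_text, numeric(1), lexicon = lexicon, USE.NAMES = FALSE)
labels <- vapply(scores, label_sentiment, character(1))
print(data.frame(text = texts, score = scores, label = labels))
```

Summing raw scores is the simplest possible rule; it ignores negation ("not good") and word order, which is one reason a supervised model may be preferred in practice.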
Section 3 of 4: Need for Case-Study/Analytics Prototype
• Company launched a new product to market
  – Marketing campaign was launched long ago and is in the Account Performance phase
• Marketing team needs to measure upticks in the sentiment trend regarding the new product in order to take appropriate actions
• Data science team believes continuous user surveys are ineffective and a computational approach is necessary -> better results and much less intrusive
Section 3 of 4: Need for Case-Study/Analytics Prototype
• Decision made to build a prototype of a tool to measure the sentiment trend, starting with Twitter specifically
• Twitter found to be the social network of choice for brand and product sentiment topics -> thus Twitter
• Data science team key objectives:
  – Produce useful results quickly and cost-efficiently -> cloud computing value proposition
  – Get buy-in from marketing management to build the full app
Section 3 of 4: Need for Case-Study/Analytics Prototype
• Applying the Analytic Project Lifecycle to the Prototype
  – Data Preparation
    Input data: raw tweets
    Output data: clean tweet text ready for sentiment analysis
  – Analytic Model Planning & Building
    Input data: clean tweet text and learnt Naïve Bayesian model
    Output data: sentiment of analyzed tweets
  – Communicate Results
    Input data: tweets and the sentiment of analyzed tweets
    Output data: sentiment trend graph for both +ve and -ve sentiments
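The data preparation step of the lifecycle (raw tweets in, clean tweet text out) might look roughly like the base R sketch below. The helper name `clean_tweet_text`, the specific cleaning rules, and the sample tweets are assumptions for illustration; the presentation does not show how its own cleaning function is implemented.

```r
# Hypothetical helper: turn raw tweet text into clean text for sentiment analysis.
# The cleaning rules below are illustrative assumptions, not the thesis code.
clean_tweet_text <- function(x) {
  x <- gsub("https?://\\S+", " ", x, perl = TRUE)              # remove links
  x <- gsub("\\bRT\\b", " ", x, perl = TRUE)                   # remove retweet markers
  x <- gsub("@\\w+", " ", x, perl = TRUE)                      # remove user mentions
  x <- gsub("[^[:alpha:][:space:]]", " ", x, perl = TRUE)      # remove punctuation, digits, symbols
  x <- tolower(x)
  x <- gsub("\\s+", " ", x, perl = TRUE)                       # collapse repeated whitespace
  trimws(x)
}

# Made-up raw tweets.
raw <- c("RT @user: Loving the new iPhone!!! http://t.co/abc123 #apple",
         "  The battery is BAD :( ")

clean <- clean_tweet_text(raw)
print(clean)
```

Hashtag words are kept (only the `#` symbol is dropped) because they often carry sentiment or topic information; a different project might choose to remove them.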
Section 3 of 4: Need for Case-Study/Analytics Prototype
• Technology decisions made by data science team
  – R programming language for social listening
  – Twitter Social APIs as the source of social data
  – Leverage ready-made R packages to accelerate build time
  – R programming for data preparation
  – Leverage analytics cloud services such as Datumbox: supervised machine learning method using Naïve Bayesian
  – R programming to build the main body of the prototype analytics application
  – R plotting capabilities to present easy-to-understand results to non-technical members of the Marketing team
  – Technologies settled on for building the full-blown application dealing with much larger data sets: GPText/Pivotal HD
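To give a feel for what a supervised Naïve Bayesian classifier does, here is a small local sketch in base R. It is not Datumbox's implementation (the prototype calls that as a cloud service); it is a textbook multinomial Naive Bayes with add-one smoothing, and the function names, training texts, and labels are all made up for illustration.

```r
# Split text into lowercase word tokens.
tokenize <- function(text) {
  words <- strsplit(tolower(text), "[^a-z]+")[[1]]
  words[nzchar(words)]
}

# Train a multinomial Naive Bayes model with add-one (Laplace) smoothing.
train_nb <- function(texts, labels) {
  classes <- sort(unique(labels))
  tokens <- lapply(texts, tokenize)
  vocab <- sort(unique(unlist(tokens)))
  counts <- matrix(0, nrow = length(vocab), ncol = length(classes),
                   dimnames = list(vocab, classes))
  for (cl in classes) {
    words <- unlist(tokens[labels == cl])
    counts[, cl] <- as.numeric(table(factor(words, levels = vocab)))
  }
  smoothed <- counts + 1
  log_likelihood <- log(sweep(smoothed, 2, colSums(smoothed), "/"))
  log_prior <- log(as.numeric(table(factor(labels, levels = classes))) / length(labels))
  names(log_prior) <- classes
  list(classes = classes, vocab = vocab,
       log_prior = log_prior, log_likelihood = log_likelihood)
}

# Predict the most probable class for a new text.
predict_nb <- function(model, text) {
  words <- tokenize(text)
  words <- words[words %in% model$vocab]   # ignore words never seen in training
  scores <- model$log_prior +
    colSums(model$log_likelihood[words, , drop = FALSE])
  model$classes[which.max(scores)]
}

# Made-up training data.
train_texts <- c("love this phone", "great camera and great screen",
                 "hate the battery", "awful screen and bad battery")
train_labels <- c("positive", "positive", "negative", "negative")

model <- train_nb(train_texts, train_labels)
print(predict_nb(model, "great phone, love it"))
print(predict_nb(model, "bad battery"))
```

A production classifier would be trained on thousands of labeled tweets; four sentences are only enough to show the mechanics of learning which words correlate with each sentiment.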
Section 3 of 4: Need for Case-Study/Analytics Prototype
• Snippet of R code for the analytics application – Main Loop
    # Running counts of positive and negative tweets, one entry per polling round
    possentiments = 0
    negsentiments = 0
    for (i in 1:5)
    {
      # Fetch the latest English tweets mentioning the product
      tweets = searchTwitter("iPhone", n=5, lang="en")
      tweet_txt = sapply(tweets, function(x) x$getText())
      tweet_clean = clean.text(tweet_txt)
      tweet_num = length(tweet_clean)
      ……..
      # Classify each tweet via the sentiment service (API key shown as a placeholder)
      for (j in seq_len(tweet_num))
      {
        tmp = getSentiment(tweet_clean[j], "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa")
        tweet_df$sentiment[j] = tmp$sentiment
        ………..
      }
      # Append this round's totals to the trend vectors
      possentiments <- c(possentiments, sum(tweet_df$sentiment=="positive"))
      negsentiments <- c(negsentiments, sum(tweet_df$sentiment=="negative"))
      Sys.sleep(5)
    }
Section 3 of 4: Need for Case-Study/Analytics Prototype
• Plotting the trend of both positive and negative sentiments
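A trend plot of this kind could be produced with base R graphics along the lines of the sketch below. The counts are made-up numbers, and the function name `plot_sentiment_trend`, the colours, and the axis labels are assumptions; the original slide shows only the resulting graph, not the plotting code.

```r
# Hypothetical counts per polling round (made-up numbers for illustration).
# The first entry is the initial 0 that the main loop starts from.
possentiments <- c(0, 2, 3, 3, 4, 5)
negsentiments <- c(0, 3, 2, 2, 1, 0)

# Draw positive and negative counts on one chart and return the plotted data.
plot_sentiment_trend <- function(pos, neg) {
  rounds <- seq_along(pos) - 1   # round 0 is the initial zero entry
  plot(rounds, pos, type = "o", col = "darkgreen",
       ylim = range(c(pos, neg)),
       xlab = "Polling round", ylab = "Number of tweets",
       main = "Sentiment trend")
  lines(rounds, neg, type = "o", col = "red")
  legend("topleft", legend = c("Positive", "Negative"),
         col = c("darkgreen", "red"), lty = 1, pch = 1)
  invisible(data.frame(round = rounds, positive = pos, negative = neg))
}

# In non-interactive runs, draw to a null device so no display is needed.
if (!interactive()) pdf(NULL)
trend <- plot_sentiment_trend(possentiments, negsentiments)
if (!interactive()) invisible(dev.off())
```

Plotting both series on one set of axes lets a non-technical reader see at a glance whether positive sentiment is rising while negative sentiment falls.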
Section 4 of 4: Learning Outcomes
• Initial state
  – Good foundation in cloud computing and data analytics
  – Very little knowledge of the social domain – not even an FB account :)
  – Last coding experience was Java, 13 years back
• Initial research project stages
  – Social media university
  – Addictive analytics workshop -> Introduction to the Marketing domain
  – Pivotal workshop to learn data analytics in the social domain -> Relevant Pivotal data analytics platforms: GPText and Pivotal HD
• Later research project stages – practical
  – Learning enough about R to build a small-scale analytics application
  – How to leverage the Datumbox analytics-as-a-service offering
Summary
• Cloud, social, and data analytics synergy serves Marketing
• "Is there an uptick in +ve/-ve sentiments of my product?" is a strategically important question in the Account Performance phase of a marketing campaign
• The research answered the question using a computational approach based on a cloud-based supervised learning method for sentiment analysis
• Data source and data analytics are in the cloud; data preparation and data presentation are on-premise using R
• Future work: optimize and tune for large datasets -> can be all cloud