using social media data in research - southampton
Using Social Media Data in Research
WebDataRA from WSI
Prof Leslie Carr
Web Data Research Assistant
• Scrapes Twitter, Facebook and Google data into a spreadsheet
• Uniquely allows free historic data capture
• No programming required
• Browser extension, one-click install
Available from the Chrome Web Store at bit.ly/WebDataRA
Overview of Use
• The WebDataRA will capture Twitter, Facebook and Google data from a browser and allow you to paste a table of information directly into a spreadsheet. This tutorial focuses on its use with Twitter.
1. Visit bit.ly/WebDataRA in Chrome and click on the blue “+ Add to Chrome” button. The small green icon will appear in the top right of the browser window, next to the URL bar.
2. Go to twitter.com and create a Twitter search or display a timeline.
3. Click on the WebDataRA icon to start collecting tweets.
   • Every 5 secs the browser will automatically scroll to the bottom of the page to make Twitter load the next batch of results and add the updates to the clipboard.
4. When you have collected enough results, paste the data into an Excel spreadsheet.
5. Use Excel to analyse data, or export to other programs such as Gephi or Voyant for other kinds of analysis.
WebDataRA Tables
The tweet data, with author, mentions, hashtags, text and counts of retweets, replies and likes broken out in separate columns.
Account occurrence summary, a count of the number of times that each Twitter account appears in the dataset as author or a mention (including retweets).
Counts of the appearances of each hashtag.
A table of edges of the conversational network, i.e. the number of times each pair of accounts communicate with each other.
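As a rough illustration of how the three summary tables relate to the tweet data, the sketch below derives them from a small list of tweets. The sample tweets and the field names (author, mentions, hashtags) are invented for illustration, and treating an edge as directed from author to mentioned account is an assumption; the extension's actual columns and counting rules may differ.

```python
from collections import Counter

# Hypothetical sample rows; not real data and not the extension's exact columns.
tweets = [
    {"author": "alice", "mentions": ["bob"], "hashtags": ["digitaldetox"]},
    {"author": "bob", "mentions": ["alice", "carol"], "hashtags": ["digitaldetox", "offline"]},
    {"author": "alice", "mentions": [], "hashtags": ["offline"]},
]


def summarise(rows):
    """Return (account counts, hashtag counts, edge counts) for the given tweets."""
    accounts, hashtags, edges = Counter(), Counter(), Counter()
    for row in rows:
        accounts[row["author"]] += 1  # appearance as author
        for name in row["mentions"]:
            accounts[name] += 1  # appearance as a mention
            # Assumption: an edge runs from the author to the mentioned account.
            edges[(row["author"], name)] += 1
        hashtags.update(row["hashtags"])
    return accounts, hashtags, edges


accounts, hashtags, edges = summarise(tweets)
print(accounts.most_common())
print(hashtags.most_common())
print(edges.most_common())
```

Each Counter corresponds loosely to one of the summary tables: account occurrences, hashtag counts and conversational edges.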
Using The Tweet Data Table
• The tweet data (gray) contains the basic data about each tweet: what was said, when, by whom and to whom.
• Use this data to form a general overview of the communication over time and identify the most significant tweets.
• Examine specific tweets and their context by referring back to the Twitter site using each tweet’s URL.
Pivot Table Visual Twitter Timeline
• Click on any gray cell in the Tweet Data table.
• Choose “PivotTable” from the Insert ribbon.
• In the PivotTable builder
   • drag “Author” from the Field Name panel into the “Rows” panel
   • drag “Timestamp” into the “Columns” panel
   • drag “Author” (again) into the “Values” panel (it will automatically turn into “Count of Author”).
• Reformat to create a helpful Timeline summary of contributors (vertical axis) by days (horizontal axis).
   • narrow the columns, slant the column headings, change the angle of the text to 60°
   • use the “Row Labels” control to sort by the author count
   • show only the rows where the total author count is greater than a chosen threshold.
   • use conditional formatting to highlight the most extreme values.
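For readers who want to see what the pivot table computes, here is a minimal sketch of the same author-by-day count outside Excel. The rows, the timestamp format and the function name pivot_by_day are assumptions made for illustration; they are not taken from the extension's output.

```python
from collections import Counter
from datetime import datetime

# Hypothetical (author, timestamp) pairs; the timestamp format is an assumption.
rows = [
    ("alice", "2017-03-01 09:15"),
    ("alice", "2017-03-01 17:40"),
    ("bob", "2017-03-01 10:02"),
    ("alice", "2017-03-02 08:30"),
    ("carol", "2017-03-02 12:00"),
]


def pivot_by_day(rows, threshold=0):
    """Count tweets per (author, day); keep authors whose total exceeds threshold."""
    cells = Counter()
    totals = Counter()
    for author, stamp in rows:
        day = datetime.strptime(stamp, "%Y-%m-%d %H:%M").date().isoformat()
        cells[(author, day)] += 1
        totals[author] += 1
    days = sorted({day for _, day in cells})
    # Sort authors by total count (most prolific first), then apply the threshold.
    authors = [a for a, n in totals.most_common() if n > threshold]
    table = {a: [cells[(a, d)] for d in days] for a in authors}
    return days, table


days, table = pivot_by_day(rows, threshold=1)
print("author", *days)
for author, counts in table.items():
    print(author, *counts)
```

With a threshold of 1, only authors with more than one tweet in the sample remain, which mirrors the row filter described above.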
Other Questions to ask of the Data
• All kinds of summaries and analyses are possible using Excel on this data, including:
   • Showing the distribution of the tweet sample through time
   • Identifying the most prolific and/or popular actors, and showing their activity through time
   • Showing the use of individual hashtags (this might be useful in a big conversation, or one that evolves over a longer period)
   • Comparing the relative proportion of contributions from different actors/hashtags
Using The Account Data Table
• The account table (green) shows
   • the most active tweeters,
   • the most frequent repliers,
   • the most retweeted users.
• This shows the key actors in a conversation, and the main roles that they take.
• Get detailed information by clicking on the account names (linked) to see the account bios and the relevant timelines of these actors in the Twitter website.
• Understand whether they are corporate accounts, private individuals, bots or trolls.
Inspecting Twitter Accounts
Account | # | Bio
ItsTimeToLogOff | 30 | Time To Log Off is the home of digital detox. We’re spearheading the movement to disconnect regularly from digital devices and reconnect with the world offline. We do this through collecting facts on the need for digital detox, running campaigns to get everyone off their screens and hosting retreats, events and workshops.
DinnerTableMBA | 9 | A commercial organisation working together to help families become more confident, successful, and self-empowered
SpareFoot | 8 | A storage company. We make it easy to move and store your stuff. Reserve storage for free and get your mind out of the clutter.
CultureEffect | 5 | Author of Digitox: How to Find a Healthy Balance for your Family’s Digital Diet
The account names in the account “author and mentions” (green) table are clickable, and open the page of the account profile in your default web browser.
Following the account hyperlinks for the most prolific authors in the green table, we see that they are all commercial or institutional actors to one extent or another.
Using The Hashtag Data Table
• The hashtag table (blue) shows you the most frequently used hashtags. This can help you extend your data gathering to look for more tweets relevant to your research question.
Using The Edge Data Table
• The edge table (yellow) will help you to see the interactions between actors, and help you to understand groupings of actors, and the pattern of their interaction.
   • Is a key account dominating a conversation and talking to many others?
   • Are they responding or just being passive recipients of marketing messages?
   • Is there a group of equals having a balanced conversation with equal participation?
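One way to put rough numbers on the questions above is sketched below. The edge rows are invented, and both measures (share of outgoing interactions, and the fraction of directed pairs that are answered in the other direction) are illustrative choices rather than anything the WebDataRA computes.

```python
from collections import Counter

# Hypothetical edge rows: (source account, target account, number of interactions).
edges = [
    ("brand", "alice", 4),
    ("brand", "bob", 3),
    ("brand", "carol", 2),
    ("alice", "brand", 1),
]


def outgoing_share(edges):
    """Return each account's share of all outgoing interactions."""
    sent = Counter()
    for source, _target, weight in edges:
        sent[source] += weight
    total = sum(sent.values())
    return {account: count / total for account, count in sent.items()}


def reciprocity(edges):
    """Fraction of directed pairs that also have an edge in the opposite direction."""
    pairs = {(s, t) for s, t, _ in edges}
    return sum((t, s) in pairs for s, t in pairs) / len(pairs)


shares = outgoing_share(edges)
print(shares)
print(reciprocity(edges))
```

A single account with a very high outgoing share and low reciprocity would suggest broadcasting to passive recipients; shares that are close to equal with high reciprocity would suggest a more balanced conversation.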
Inspecting the Conversation Network
• Copy and paste the yellow table into a separate spreadsheet and save it as a CSV file (call it edgetable.csv or similar).
• Load up the network visualisation program “Gephi”, and start a new project.
• In the “Data Laboratory”, choose “Import Spreadsheet” and load up the CSV data as an edge table. You can then apply a variety of network layout algorithms in the “Overview” pane.
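If the edge data is already in a script rather than a spreadsheet, the CSV file can be written directly, as in the sketch below. It assumes the Source, Target and Weight column headers that Gephi's spreadsheet importer recognises for edge tables; whether the pasted yellow table already uses those headers is not stated here, and the sample edges are invented.

```python
import csv

# Hypothetical edge rows; account names and counts are invented.
edges = [("alice", "bob", 3), ("bob", "carol", 1)]


def write_edge_csv(path, edges):
    """Write edges as CSV with Source, Target and Weight column headers."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["Source", "Target", "Weight"])
        writer.writerows(edges)


write_edge_csv("edgetable.csv", edges)
```

The resulting edgetable.csv can then be loaded through “Import Spreadsheet” as described above.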
Understanding the Conversation Network
• Many summaries and analyses are possible using Gephi’s network visualisations.
   • Showing the interaction of the network actors
   • Identifying the communities and active participant subgroups within the larger sample
   • Identifying the roles of different actors in the communications network
Textual Analyses of the Social Conversation
• In the gray table, copy the “Sanitised Text” column.
   • This contains the text of all the tweets, but with all the Twitter features (@names, #hashtags, URLs) removed to leave only the English text.
• Go to the Voyant-Tools.org website.
   • Voyant Tools is a textual corpus analyser. It considers a Twitter conversation as a single document and individual tweets as individual sentences.
• Paste the text into the text box.
• Press the “Reveal” button.
• You will see a screen with several panels that help you explore the text of the tweets in different ways.
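To make the idea of the “Sanitised Text” column concrete, the sketch below strips @names, #hashtags and URLs from a tweet with regular expressions. The patterns and the function name sanitise are assumptions for illustration; the rules the extension actually applies are not documented here and may differ.

```python
import re

# Patterns are assumptions about what counts as a Twitter feature.
URL = re.compile(r"https?://\S+")
MENTION = re.compile(r"@\w+")
HASHTAG = re.compile(r"#\w+")


def sanitise(text):
    """Strip URLs, @names and #hashtags, then collapse leftover whitespace."""
    for pattern in (URL, MENTION, HASHTAG):
        text = pattern.sub(" ", text)
    return " ".join(text.split())


print(sanitise("@alice try a #digitaldetox this weekend https://example.com/tips"))
```

Note that removing a hashtag used as a word inside a sentence leaves a gap in the text, which is one reason to read sanitised tweets alongside the originals.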
Textual Analyses
• Voyant includes a variety of textual analysis components
   • Word cloud
   • Trend analyser
   • Concordance
   • Summary
   • Vocabulary cluster analysis
   • Dimensional Reductions
   • Co-occurrence Network
Sentiment Analysis
• Paste the “Sanitised Text” column into sentigem.com.
• Consider to what extent the results seem accurate to you. How well does it identify positive and negative ‘sentiment’ in a tweet?
• What kinds of inaccuracies can you see?
• Does it help you to identify any points of interest in your data for more thorough investigation?
• Sentiment analysis can help you identify positive or negative comments in your sample.
• This is a popular method in industry, especially with brand management companies. However, it is academically contested, and does not have a high degree of transparency in the lexical processing.
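To see why lexical sentiment scoring can mislead, consider the toy word-list scorer below. It is not how sentigem.com works (its processing is not documented here); the word lists and the function name lexicon_score are invented purely to illustrate the kind of inaccuracy worth looking for.

```python
# Toy word lists, invented for illustration only.
POSITIVE = {"love", "great", "happy", "calm"}
NEGATIVE = {"hate", "awful", "tired", "stressed"}


def lexicon_score(text):
    """Return the number of positive words minus the number of negative words."""
    words = [w.strip(".,!?") for w in text.lower().split()]
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


print(lexicon_score("Love my phone-free weekend, so calm!"))
print(lexicon_score("I do not love this"))
```

The second example scores as positive because the scorer counts “love” and ignores the negation, the sort of error that a word-counting approach cannot avoid without further processing.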