Islamophobia on Twitter: March to July 2016
Carl Miller, Josh Smith, Jack Dale
Centre for the Analysis of Social Media, Demos
RESULTS
The Centre for the Analysis of Social Media (CASM) at Demos is conducting continuous research on hateful, xenophobic, anti-disability, anti-Semitic and anti-Islamic ideas and expressions on Twitter. This is part of a broad effort to understand the scale, scope and nature of uses of social media that are possibly socially problematic and damaging.
This short paper details recent results for the use of Twitter to share expressions which are identified as Islamophobic, derogatory and hateful. It covers a broad stretch of time, from the 22nd February 2016 to the time of writing, August 4th, but focuses especially on activity over the month of July 2016. For a discussion of how these Tweets were collected and analysed, see the methodology section of this report.
July 2016
Over July, we identified 215,247 Tweets, sent in English and from around the world, as highly likely to be hateful, derogatory, and anti-Islamic. [1] On average, this is 289 per hour, or 6,943 per day. This is the highest monthly average since measurements began at the end of February. However, the rate of anti-Islamic activity on Twitter changed significantly over the month. Twitter is, in general, a real-time, reactive and event-specific platform, and most of the anti-Islamic activity identified was likewise linked to an event that had recently happened. The most significant increase was in the immediate wake of the terrorist attack in Nice, on July 14th, with another appreciable increase in the rate of anti-Islamic expressions in the aftermath of the killing of Jacques Hamel in Normandy. The five most significant spikes are analysed in greater depth, below. This is to try to uncover the triggers, drivers and dynamics of anti-Islamic hatred online.
Figure 1 – Islamophobic Tweets collected during July 2016
[1] See methodology for more information on how this research was conducted.
Spike 1 – July 5th
We identified 9,220 Islamophobic Tweets on 5 July. It is difficult to identify one singular event that triggered this rise in Islamophobic language. One possible explanation is that this was 4 days after the 12-hour siege by IS militants in a café in Bangladesh. American, Italian, Indian, Japanese and Bangladeshi victims were among the 22 people killed in this attack. Furthermore, this day marked the end of Ramadan before the start of Eid al-Fitr, perhaps intensifying a global focus on Islam. Examples of tweets from this day include: “Nobody can stop Muslims committing jihad attacks any more than they can stop Buddhists meditating or Mormons knocking on people's doors”; “Morocco deletes a whole section of the Koran from school curriculum as it’s full of jihad incitement and violence. The Religion of peace”; and “I fucking hate pakis.”
Spike 2 – July 8th
11,320 Islamophobic tweets were sent on 8 July. Again, it is difficult to attribute this rise to one specific event, though this was the day after the shootings in Dallas, U.S., in which Micah Xavier Johnson shot and killed 5 police officers, wounded 7 others and wounded 2 civilians. Many tweets appeared to try and link this event to Islam. An example is: “Obama is a damn Raghead explains a lot”.
Spike 3 – July 15th
21,190 tweets sent on 15 July were identified as Islamophobic. This was the day after the attack in Nice, in which an armed IS militant drove a truck through crowds of people celebrating Bastille Day; 84 people were killed and many more injured. Tweets sent on this day focused on the attack: “Sorry to hear about france - These muzzies just dont quit”; and “Stop saying 'the majority are peace-loving'. Until the majority denounce every jihadi & turn them in, we are safer believing the evidence”.
Spike 4 – July 17th
10,610 tweets were sent on 17 July. This was the day after an attempted military coup in Turkey failed; a faction within the Turkish Armed Forces organised themselves under the ‘Peace at Home Council’. The Council cited the erosion of secularism as one reason behind the attempted coup. As such, some tweets commented on this: “That's the end of Turkey. Another country ruined by Islam and its terrorist culture. Same shit happened to Persia. Now its Islamic Republic”. Other tweets were more general: “France's Islamic population is at 9.6%. 10% is usually when Jihad begins. They're starting early because the French are so weak”; “ALL this because of the muzzie in the White House”.
Spike 5 – July 26th
8,950 tweets were sent on 26 July. This was the day of the Normandy church attack, in which IS militants killed Father Jacques Hamel, and seriously wounded another person. Tweets sent on this day comment on this attack: “Normandy is reason 1488 that you should elect Marine Le Pen! > close borders > deport murderous Islam & Muzzies > deport the rioting negroes”; “So some sleazy scum committed jihad in the name of #Islam this time in Normandy but hey let's keep telling Muslims we love them”; and “Priest killed in #Normandy today by a Radical Islamic Terrorist yet Hillary says that Islam is peaceful! 1274 attacks this year = peaceful? Ok.”
GEOGRAPHY
The analysis was conducted only for anti-Islamic expressions in the English language. Consequently, the vast majority of the Tweets that could be located to Europe came from the United Kingdom. However, as the maps below illustrate, anti-Islamic Tweets were sent from every European Union member state, with other concentrations in the Netherlands, France and Germany.
Geo-located anti-Islamic Tweets over July 2016
Country concentrations of geo-located anti-Islamic Tweets
Geographical concentrations of anti-Islamic expressions on Twitter also varied across the month. The tweets sent in reaction to the Nice attack (spike 3, above), as shown below, had higher concentrations of anti-Islamic expressions in the Netherlands and France than the reaction to the Turkish coup attempt (spike 4).
Geo-located anti-Islamic Tweets for 15th July
Country concentrations of geo-located anti-Islamic Tweets for 15th July
Geo-located anti-Islamic Tweets for 17th July
Country concentrations of geo-located anti-Islamic Tweets for 17th July
VOLUME
From the beginning of March to the end of July 2016, an average of 4,972 Tweets were identified a day. [2] The rate dropped sharply between March (the month of the terrorist attacks in Brussels, and the subject of earlier Demos work in this area) and April, and has since then been increasing month-on-month. [3]
July, with an average of 6,943 anti-Islamic Tweets per day, or 215,247 across the month, has the highest rate of anti-Islamic Tweets of any month analysed, and is considerably above the monthly average (168,595) during this time.
Islamophobic tweets sent each month, from May to July
Month    Islamophobic Tweets sent per day (average)    % increase/decrease on previous month
March    5,024    N/A
April    2,512    -50%
May      3,985    +37%
June     5,480    +27%
July     6,943    +21%
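The month-on-month column can be re-derived from the daily averages in the table above. The sketch below is illustrative only: `percent_change` is a hypothetical helper, not part of the original analysis, and it assumes the change is measured against the previous month's value, as the column header states.

```python
# Daily averages as reported in the table above.
daily_average = {
    "March": 5024,
    "April": 2512,
    "May": 3985,
    "June": 5480,
    "July": 6943,
}


def percent_change(previous: float, current: float) -> float:
    """Change from `previous` to `current`, as a percentage of `previous`."""
    return (current - previous) / previous * 100


months = list(daily_average)
# Pair each month with the one before it and round to whole percentages.
changes = {
    current: round(percent_change(daily_average[prev], daily_average[current]))
    for prev, current in zip(months, months[1:])
}

for month, change in changes.items():
    print(f"{month}: {change:+d}%")
```

On this assumption the calculation gives -50%, +59%, +38% and +27%. The April figure agrees with the table; the reported +37%, +27% and +21% match a calculation taken relative to the later month rather than the previous one.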
[2] N.B. During the initial pilot of this system, only 19 days were analysed in March and April, and averages were based on the days measured.
[3] For Demos’ analysis of the online reaction to the Brussels attacks, see http://www.demos.co.uk/project/hate-speech-after-brexit/
Over the most recent period, the highest volume of Islamophobic Tweets was sent from 11th to 17th July (64,143), when both the Nice attack and the attempted military coup in Turkey occurred. This week was one of the biggest spikes in Islamophobia throughout the dataset, second only to 13th to 19th June.
Islamophobic tweets sent each week, from 25th April to 31st July
ISLAMOPHOBIC TWEETS FROM THE UK: MAY TO JULY
The report also analysed Islamophobic Tweets that were likely sent from the UK. It is important to note that not all Tweets can be geographically placed. A small number of tweets carry definitive information about where they were sent from. These are geo-tags: longitude and latitude coordinates that indicate very precisely where the tweet was posted. Only users who proactively turn on the geo-location facility on their smartphone will include this information. A larger number of tweets can be algorithmically located based on geo-location meta-data attached to the Tweet. These include (in addition to the longitude and latitude data described above) the ‘location field’, where users report where they are from, and time zone. In tests, this method of geolocating Tweets has been found to be between 80 and 90 per cent accurate for those Tweets it could locate, and able to locate between 40 and 70 per cent of Tweets. [4]
From the beginning of March to the end of July 2016, an average of 367 Islamophobic Tweets from the UK were identified a day. Unlike the global data, the rate of Islamophobic Tweets decreased slightly between May and June. However, consistent with the global picture, the rate then increased sharply between June and July.
Month    Islamophobic tweets sent per day (average)    % increase/decrease on previous month    Total for month
May      380    N/A     11,766
June     351    -8%     10,557
July     468    +33%    14,512
[4] For more information on this, see the Demos paper The Road to Representivity, page 30. http://www.demos.co.uk/files/Road_to_representivity_final.pdf?1441811336
July, with an average of 468 anti-Islamic tweets per day, or 14,512 across the month, has the highest rate of anti-Islamic tweets of any of the three months, and is above the monthly average of 12,278 for the three months.
Islamophobic tweets sent each month, from May to July
Over the most recent period (July), the highest volume of Islamophobic tweets was sent from 11th to 17th July (3,958), when both the Nice attack and the attempted coup in Turkey occurred. This week was one of the biggest spikes in Islamophobia throughout the dataset, second only to 2nd–8th May.
Islamophobic tweets sent each week, from 25th April to 31st July
ETHICS
At Demos we believe it is important that the principle of internet freedom should be maintained, and that the internet should be a place where people feel they can speak their mind openly and freely. However, racist, xenophobic, Islamophobic and misogynistic abuse can curtail freedom, and the capacity to speak and act freely online, as much as it can be an expression of it. It is important, as society confronts the ways that social media acts as a new platform for the expression and dissemination of these kinds of views, to understand as best as possible the scale, scope, nature and severity of these kinds of practices: when they happen, who they happen to, and why. This is what this research hopes to contribute to.
CASM has conducted extensive work on the ethics and public acceptability of social media research. [5] An ethical framework has been applied to this project, such that:
• The research only uses publicly available data, viewable and visible to any Twitter user;
• The research conducted is aggregated and anonymous: the research does not seek to identify any specific user or users, but to understand the overall scale and nature of Islamophobic abuse on Twitter;
• Where quotations are used as examples and elaborations, they have been altered to maintain the overall meaning, but to prevent the retrospective identification of any Twitter user on the basis of the quotation;
• There is no suggestion of any illegality of any of the content measured: the purpose of the research was not to look for content that was illegal, and it does not suggest that the content that was found was illegal. This research is not seeking to inform how laws should be enforced on social media. This research, and Demos’ broader research agenda, seeks instead to inform the broader question of how people from different races, religions, sexualities and genders are spoken about on social media, and the extent that people from different backgrounds face abuse and hostility.
[5] See, for instance, Demos’ recent paper with Ipsos MORI, #socialethics: A Guide to Embedding Ethics in Social Media Research, https://www.ipsos-mori.com/Assets/Docs/Publications/im-demos-social-ethics-in-social-media-research-summary.pdf
OVERALL METHODOLOGY
Twitter data is often challenging to analyse. Data drawn from social media are often too large to fully analyse manually, and also often not amenable to the conventional research methods of social science. The research team used a technology platform called Method52, developed by CASM technologists based at the Text Analytics Group at the University of Sussex. [6] It is designed to allow non-technical researchers to analyse very large datasets like Twitter.
Defining Islamophobia
This paper is predicated on the training of a machine to be able to distinguish between an expression that is Islamophobic and one that isn’t. An Islamophobic expression was defined as the illegitimate and prejudicial dislike of Muslims because of their faith. However, Islamophobia can take on a very large number of different forms, and its identification, especially within Twitter research, was often challenging. Ultimately, this research comes down to the judgement of the researchers involved. Four main qualitative categories of Islamophobia, on the judgement of the researchers conducting the analysis, were identified:
• ‘Islam is the enemy’: The idea that it is a fundamental injunction of Islam for all of its followers to be engaged in a violent struggle against non-Muslims and the West;
• The conflation of Muslim populations with sexual violence and a proclivity towards rape;
• Especially in the wake of terrorist attacks, the apportioning of blame for the attack not on the terrorists themselves, or on Islamist militancy, but on the Muslim population generally;
• General abuse, and the general use of anti-Islamic slurs and derogatory descriptions of Muslims.
Data Collection
Method52 was used to directly collect Tweets from Twitter’s Stream and Search ‘Application Programming Interfaces’ (or APIs). They allow all Tweets to be collected that contain one of a number of specified keywords. The keywords used in the various collections used in this research are detailed in the annex.
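Keyword-based collection can be illustrated locally with a simple text filter. This is a minimal sketch, not Method52's code and not Twitter's own matching logic, whose exact rules are not described here: the helper names `build_keyword_pattern` and `collect` are hypothetical, and whole-word, case-insensitive matching is an assumption.

```python
import re
from typing import Iterable, Iterator


def build_keyword_pattern(keywords: Iterable[str]) -> re.Pattern:
    """Compile one case-insensitive pattern that matches any keyword
    as a whole word, optionally written as a hashtag."""
    # Longest keywords first so multi-word terms are tried before their parts.
    escaped = sorted((re.escape(k) for k in keywords), key=len, reverse=True)
    return re.compile(r"(?<!\w)#?(?:" + "|".join(escaped) + r")(?!\w)", re.IGNORECASE)


def collect(tweets: Iterable[str], keywords: Iterable[str]) -> Iterator[str]:
    """Yield only the Tweets whose text contains at least one keyword."""
    pattern = build_keyword_pattern(keywords)
    for text in tweets:
        if pattern.search(text):
            yield text


# Three of the collection keywords listed in the annex, and made-up sample text.
keywords = ["jihad", "hijab", "terrorist"]
sample = [
    "Reading about the history of the hijab",
    "News report on a terrorist attack",
    "Lovely weather today",
    "#Jihad is trending",
]
matched = list(collect(sample, keywords))
```

A filter like this only narrows the stream. As the source notes, most Tweets that contain a keyword are not hateful, which is why classification follows collection.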
Data Analysis
Method52 allows researchers to train algorithms to split apart (‘to classify’) Tweets into categories, according to the meaning of the Tweet, and on the basis of the text they contain. To do this, it uses a technology called natural language processing. Natural language processing is a branch of artificial intelligence research, and combines approaches developed in the fields of computer science, applied mathematics, and linguistics. An analyst ‘marks up’ which category he or she considers a tweet to fall into, and this ‘teaches’ the algorithm to spot patterns in the language use associated with each category chosen. The algorithm looks for statistical correlations between the language used and the categories assigned to determine the extent to which words and bigrams are indicative of the pre-defined categories. Details about how these algorithms were used, and how well they worked, are provided below. [7]
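The idea of learning from marked-up examples using words and bigrams can be sketched with a small Naive Bayes classifier. This is an assumption made for illustration: the text does not say which algorithm Method52 uses, and the class `NaiveBayes`, the helper `features`, and the neutral training sentences are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def features(text):
    """Lower-cased words plus adjacent word pairs (bigrams)."""
    words = text.lower().split()
    return words + [f"{a} {b}" for a, b in zip(words, words[1:])]


class NaiveBayes:
    def __init__(self):
        self.feature_counts = defaultdict(Counter)  # category -> feature -> count
        self.doc_counts = Counter()                  # category -> marked-up examples
        self.vocabulary = set()                      # every feature seen in training

    def train(self, text, category):
        """Record one analyst decision: `text` belongs to `category`."""
        self.doc_counts[category] += 1
        for feature in features(text):
            self.feature_counts[category][feature] += 1
            self.vocabulary.add(feature)

    def classify(self, text):
        """Return the category with the highest log-probability for `text`."""
        total_docs = sum(self.doc_counts.values())
        best, best_score = None, -math.inf
        for category, n_docs in self.doc_counts.items():
            counts = self.feature_counts[category]
            # Add-one smoothing so unseen features do not zero out a category.
            denominator = sum(counts.values()) + len(self.vocabulary)
            score = math.log(n_docs / total_docs)
            for feature in features(text):
                score += math.log((counts[feature] + 1) / denominator)
            if score > best_score:
                best, best_score = category, score
        return best


classifier = NaiveBayes()
classifier.train("the match was a great game", "sport")
classifier.train("our team won the game", "sport")
classifier.train("parliament passed the new bill", "politics")
classifier.train("the minister gave a speech", "politics")

print(classifier.classify("a great game for the team"))
```

Each call to `train` plays the role of one analyst mark-up. With only four examples the model is a toy, but it shows how word and bigram counts per category drive the decision.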
The Accuracy of Algorithms
To measure how accurately the algorithms sorted Tweets into the categories chosen by the analyst, we used a ‘gold standard’ approach. For each classifier, around 100 Tweets were randomly selected from the relevant dataset to form a gold standard test set. These were manually coded into the categories defined above. These Tweets were then removed from the main dataset and so were not used to train the classifier.
[6] This group is led by Professor David Weir and Dr Jeremy Reffin. More information is available about their work at: http://users.sussex.ac.uk/~davidw/styled-3/
[7] For a more detailed description of this methodology, see the Demos paper Vox Digitas,
As the analyst trained the classifier, the software reported back on how accurate the classifier was at categorising the gold standard, as compared to the analyst’s decisions. On the basis of this comparison, classifier performance statistics (‘recall’, ‘precision’, and ‘F-score’) are created and appraised by a human analyst. Each measures the ability of the classifier to make the same decisions as a human in a different way:
Overall accuracy: This represents the percentage likelihood of any randomly selected Tweet within the dataset being placed into the appropriate category by the algorithm. It is based on three other measures (below).
Recall: The number of correct selections that the classifier makes as a proportion of the total correct selections it could have made. If there were 10 relevant Tweets in a dataset, and a relevancy classifier successfully picks 8 of them, it has a recall score of 80 per cent.
Precision: This is the number of correct selections the classifier makes as a proportion of all the selections it has made. If a relevancy classifier selects 10 Tweets as relevant, and 8 of them are indeed relevant, it has a precision score of 80 per cent.
F-Score: All classifiers are a trade-off between recall and precision. Classifiers with a high recall score tend to be less precise, and vice versa. The ‘overall’ score reconciles precision and recall to create one, overall measurement of performance for each decision branch of the classifier.
N.B. the values for each algorithm (called a classifier) are presented within the detailed methodology of this report. The values are expressed as a value between 0 and 1: a value of 0.76, for instance, indicates 76% accuracy.
CAVEATS
Research on large social media datasets is a reasonably new undertaking. The results must be understood in the light of the following caveats related to the research methodology:
• The algorithms used are not perfect: throughout the report, some of the data will be mis-classified. The technology used to analyse Tweets is inherently probabilistic, and none of the algorithms trained and used to produce the findings for this paper were 100% accurate. The accuracy of all algorithms used is clearly set out in this report.
• Some data will be missed: Acquiring Tweets on the basis of the keywords that they contain presents two possible problems. First, the initial dataset may contain Tweets that are irrelevant to the thing being studied. Secondly, it may miss Tweets that are relevant to the thing being studied. Researchers worked to construct as comprehensive a list of keywords as possible (these are detailed in the report, below). However, it is likely some were missed, and the numbers presented in this report are likely a subset of the total.
• Twitter is not a representative window into British society: Twitter is not evenly used by all parts of British society. It tends to be used by groups that are younger, more socio-economically privileged and more urban. Additionally, the poorest, most marginalised and most vulnerable groups of society are least represented on Twitter; an issue especially important when studying the prevalence of xenophobia, Islamophobia and the reporting of hate incidents. [8]
• Overall, this research is intended to be an indicative, first take on the reaction on Twitter to these important events. It is not presented as either exhaustive or definitive; and it is very much hoped that it will stimulate further research on this vital topic in the future.
[8] For a longer discussion of this issue, see the Demos paper The Road to Representivity
DETAILED METHODOLOGY
Identifying Tweets that were hateful, derogatory and anti-Islamic was a formidable analytical challenge. First, all Tweets were collected that contained one of an extensive list of terms that could be used in an anti-Islamic way (see annex). This collection began on February 29th and continued until the 2nd August. It returned a very large number of Tweets over this period, over 34,000,000. The very large majority of these Tweets were not anti-Islamic or hateful. A series of algorithms was built to respond to the different challenges that this dataset posed, in order to identify the anti-Islamic subset within the larger body of data. Each was designed to remove Tweets which were not Islamophobic from the dataset:
• A large number of Tweets contained the word ‘Paki’. [9] A classifier was used to separate derogatory uses of this word from non-derogatory uses.
• A large number of Tweets also contained the word ‘terrorist’. Of course, many Tweets containing this word were in no way derogatory or anti-Islamic. Two classifiers were built to analyse tweets containing this word:
  • First, a classifier was trained to separate Tweets referring to Islamist terrorism from other forms of terrorism.
  • Second, of the Tweets referring to Islamist terrorism, a classifier was trained to distinguish views broadly attacking Muslim communities in the context of terrorism from those broadly defending Muslim communities.
• A classifier was trained to separate all other Tweets in the dataset into those that were derogatory and anti-Islamic and those which were not.
• Last, the Tweets that, based on the above, (a) used the term ‘Paki’ in a derogatory way, (b) used the term ‘terrorist’ to broadly attack Muslims or Muslim communities, or (c) used the other possible slur terms in the collection in a way that was anti-Islamic were combined. These were then filtered to include only Tweets sent from the UK. This resulted in the final total of Islamophobic Tweets.
The accuracy of these algorithms is as follows: [10]
[9] N.B. whilst this word refers to an ethnic rather than religious group, it was found that it was often used interchangeably to refer to Muslim communities.
[10] Due to the large number of classifiers used, the accuracy was checked by taking a random sample of 100 Tweets that, according to the system of algorithms described above, were classified as derogatory and anti-Islamic. 75 of these 100 were identified by an analyst as derogatory and anti-Islamic.
These algorithms were connected together into an ‘architecture’, shown below. Each Tweet collected passed through the architecture on the basis of how it was classified. Overall, this system of algorithms succeeded in filtering the very large number of Tweets collected (over 34,000,000) into a much smaller subset (657,650) that was much more likely to be hateful, derogatory and anti-Islamic.
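The routing described in this section can be sketched as a chain of decisions. This is a loose illustration under stated assumptions, not the actual Method52 architecture: each trained classifier is replaced by a hypothetical yes/no function, the order in which the keyword branches are checked is assumed, and the final UK location filter is left out.

```python
from typing import Callable, Iterable, List

Predicate = Callable[[str], bool]  # stand-in for one trained classifier's decision


def passes_architecture(
    text: str,
    slur_term: str,
    slur_is_derogatory: Predicate,
    is_islamist_terrorism: Predicate,
    attacks_muslims: Predicate,
    other_is_derogatory: Predicate,
) -> bool:
    """Route one Tweet through the classifier chain and report whether it is kept."""
    lowered = text.lower()
    if slur_term.lower() in lowered:
        # Branch 1: keep only derogatory uses of the term.
        return slur_is_derogatory(text)
    if "terrorist" in lowered:
        # Branch 2: must refer to Islamist terrorism AND attack Muslim communities.
        return is_islamist_terrorism(text) and attacks_muslims(text)
    # Branch 3: every other collected Tweet goes to the general classifier.
    return other_is_derogatory(text)


def filter_tweets(tweets: Iterable[str], **classifiers) -> List[str]:
    """Keep the Tweets that any branch of the architecture flags."""
    return [t for t in tweets if passes_architecture(t, **classifiers)]


# Toy stand-ins so the sketch runs; real classifiers would be trained models.
toy = dict(
    slur_term="termx",
    slur_is_derogatory=lambda t: "bad" in t,
    is_islamist_terrorism=lambda t: "topic" in t,
    attacks_muslims=lambda t: "attack" in t,
    other_is_derogatory=lambda t: "bad" in t,
)
sample = [
    "termx bad", "termx neutral",
    "terrorist topic attack", "terrorist topic defend",
    "other bad", "other fine",
]
kept = filter_tweets(sample, **toy)
```

Chaining classifiers this way lets each one be trained on a narrower question, at the cost that errors compound along a branch. That is consistent with the source checking accuracy on a sample of the final output rather than per classifier alone.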
Annex - Data Collection Keywords
The annex contains the keywords used to collect Tweets analysed throughout this report.
1. Words/Hashtags used to collect Tweets that could be derogatory and anti-Islamic
• Jihad• Jihadi• SandFlea• Terrorist• hijab• CamelFucker• CarpetPilot• Clitless• DerkaDerka• Diaper-Head• DiaperHead• DuneCoon• DuneNigger• Durka-durka• Jig-Abdul• Muzzie• Q-TipHead• Rab• Racoon• Rag-head• RugPilot• Rug-Rider• SandMonkey• SandMoolie• SandNigger• SandRat• SlurpeeNigger• Towel-head• MuslimPaedos• Muslimpigs• Muslimscum• Muslimterrorists• Muzrats• muzzies• Paki• Pakis• Pisslam• raghead• ragheads• Towelhead• FuckMuslims• WhiteGenocide• Pegida• EDL• BNP• Rapefugee• Rapeugee• mudshark• kuffar• kafEir