In-depth analysis of the Great Firewall of China
Chao Tang
December 14, 2016
Special thanks to Martin Johnson, Charlie Smith from GreatFire and
Ming Chow for all the help they have provided.
Abstract
Created by the Golden Shield Project, the Great Firewall of China (GFW) is the backbone
of the world’s largest system of censorship. As an on-path system, the GFW can monitor traffic and
inject additional packets, but cannot stop in-flight packets from reaching their destinations. It
achieves censorship using three main techniques. First, it inspects all Internet traffic between
China and the rest of the world, then terminates connections containing censored content by
injecting forged TCP Reset packets to both ends. With the advent of HTTPS, which cannot be
decrypted by the GFW, TCP RST has seen less use in recent years. Second, the GFW blocks
access to specific IP addresses through the gateway routers of all Chinese ISPs. Third, it uses
DNS tampering to return false IP addresses in response to DNS queries for blocked domains. This
affects queries to both domestic and foreign DNS services. IP blocking and DNS tampering
together are the bread and butter of the GFW, effectively cutting off all access to blocked websites.
However, such draconian methods inevitably cause over-censoring and collateral damage, both to
international web traffic flowing through China and to innocent websites. The three main ways a
user can bypass the GFW are VPNs, proxies, and Tor. However, the GFW can use deep
packet inspection and machine learning to shut down suspected VPN or proxy tunnels, and use
an active probing system to shut down Tor bridge relays. As of today, only a few commercial VPN
services and the latest Tor protocols using Pluggable Transports remain viable approaches.
INTRODUCTION
In China, the first recorded connection to the global Internet was an email sent to Karlsruher
Institut für Technologie in Germany on September 14, 1987. Ironically, the message said
“Across the Great Wall, we can reach every corner in the world.” [i] True Internet access came to China
in 1994, as an extension of the “Open Door policy” that opened the country to the Western
world. In the following years, as more and more citizens adopted the Internet, the Chinese
government found itself losing control over the spread and availability of information.
“Determined to control online content and its citizens with regards to the kinds of information
to which they have accessed. MPS, the branch of the government that deals with this issue,
immediately took action by launching the Golden Shield Project.” [ii]
The Golden Shield Project officially made its debut in 2000, and has been constantly evolving
since. “The government initially envisioned the Golden Shield Project to be a comprehensive
database-driven surveillance system that could access every citizen’s record as well as link
national, regional, and local security together.” [ii] However, the rapid expansion of the Internet in
China rendered this goal infeasible, and the project pivoted from “generalized content control
at the gateway level to individual surveillance of users at the edge of the network.” [ii] It was this
ideology that made the GFW what it is today.
On March 16th, 2015, the Chinese censorship apparatus unveiled a new tool, dubbed the
“Great Cannon” (GC), to the rest of the world. It made its grand entrance by engineering a
denial-of-service attack on GreatFire.org, an organization dedicated to collecting data about the GFW and
sharing it with the rest of the world. In the days that followed, GreatFire servers received up to
2.6 billion requests per hour, about 2,500 times more than their normal load. After further research, it was
determined that the GC is a separate but related in-path system with the ability to interfere
with traffic directly through injection, redirection, and suppression. Since its debut, it has been
used to DDoS multiple websites with great success.
To the Community
China has 721,434,547 Internet users [iii], the most of any country in the world and 3 times
the number in the US. The world must consider the implications of having such a large number of
people living under a heavily censored and monitored Internet. Without delving into politics, it is
undeniable that the relative stability the Chinese Communist Party has enjoyed is due in
no small part to the effectiveness of the GFW. All major global social media websites, most of
Google’s services, and any websites with information about civil unrest past and present are only a
short list of the websites blocked by the GFW. Hundreds of thousands of foreign companies
operating in China must also operate under the constraints of the GFW and GC, with some
altering their business practices to comply with the restrictions imposed, and others constantly
finding new ways to circumvent them. Finally, seeing the success of the large-scale Internet
censorship program in China, other countries such as Cuba, Zimbabwe, and Belarus are
considering adopting similar programs. [iv] Thus, computer science students today must keep the
capabilities and limitations of the GFW and GC in mind when developing products for the future.
Great Firewall Overview
The name Great Firewall is a misnomer, as traditional firewalls are in-path barriers that
control traffic flowing between networks. The GFW is an on-path system, meaning it can passively
read all traffic between China and the rest of the world and inject additional packets, but it cannot
drop packets already in flight. Compared to in-path systems, on-path systems are less disruptive
and do not dramatically slow down all traffic passing through the network. [v] On the other hand,
they are less flexible because they cannot interfere with existing traffic. As a result, they are also
less stealthy. One can generally detect when traffic has been altered by the GFW by observing
anomalous injected packets using server logs and packet analyzers.
The GFW has three main weapons it employs for censorship: TCP Reset, IP address blocking,
and DNS poisoning. They will be individually examined below.
TCP Reset
Figure-1: An illustration of TCP Reset
Once the crème de la crème of the GFW, TCP reset is a direct answer to the limitations of an
on-path architecture. “Most content inspection schemes work by passing all traffic through a proxy
that refuses to serve results for forbidden material. However, a proxy-based system that can cope
with the traffic volumes of a major network, or an entire country, would be extremely expensive
and difficult to scale.” [vi] Instead, the GFW inspects traffic by passing copies to out-of-band devices
based on Intrusion Detection Systems (IDS). [vii] The original packets are unaffected, while the IDS
inspects the content of the packet and the requested URL, matching them against a blacklist of
keywords. Since late 2008, only the first HTTP GET request after a TCP handshake is inspected,
improving the efficiency of the system without losing too much accuracy. [viii] More impressively,
the GFW is now capable of reassembling both IP fragments and TCP segments while maintaining
state. [ix] Before this feature was introduced, the inability to reassemble segments was considered
a major flaw in the system, and simply splitting packets into smaller pieces was an effective way of
bypassing it.
Once the IDS detects blacklisted keywords, the GFW router injects multiple forged TCP
RST packets to both endpoints, forcing the connection to be dropped. Sending multiple packets with
different ACK numbers ensures that the connection is blocked even if the original packet
reaches its destination before the RST. The GFW then maintains flow state on the source
and destination IP addresses, port number, and protocol of the denied request in order to block all
further communication between them for up to hours at a time.
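The inspection and injection steps described above can be sketched as a small simulation. This is only an illustration under stated assumptions, not the GFW's actual implementation: the blacklist entry, the `GUESS_OFFSETS` values, the one-hour block duration, and all class and function names are hypothetical, and a single counter stands in for the direction-specific numbers that real forged packets would carry.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple
from urllib.parse import unquote

# Hypothetical blacklist; the real keyword list is not public.
BLACKLIST = ("falun",)
# Illustrative offsets used to vary the forged numbers (an assumption).
GUESS_OFFSETS = (0, 1460, 4380)


@dataclass(frozen=True)
class Flow:
    client_ip: str
    client_port: int
    server_ip: str
    server_port: int


@dataclass(frozen=True)
class ForgedReset:
    to_ip: str            # endpoint that receives the forged packet
    spoofed_from_ip: str  # endpoint the packet pretends to come from
    ack: int


def first_get_is_blocked(payload: bytes) -> bool:
    """Check the request line of an HTTP GET against the blacklist."""
    request_line = payload.split(b"\r\n", 1)[0].decode("latin-1")
    if not request_line.startswith("GET "):
        return False
    return any(word in unquote(request_line).lower() for word in BLACKLIST)


def forge_resets(flow: Flow, next_ack: int) -> List[ForgedReset]:
    """Build one forged RST per offset for each endpoint."""
    resets = []
    for offset in GUESS_OFFSETS:
        resets.append(ForgedReset(flow.client_ip, flow.server_ip, next_ack + offset))
        resets.append(ForgedReset(flow.server_ip, flow.client_ip, next_ack + offset))
    return resets


def inspect(flow: Flow, payload: bytes, next_ack: int, now: float,
            blocked_until: Dict[Tuple[str, str, int], float],
            block_seconds: float = 3600.0) -> List[ForgedReset]:
    """Return the packets to inject; the observed packet itself is never dropped."""
    key = (flow.client_ip, flow.server_ip, flow.server_port)
    still_blocked = blocked_until.get(key, 0.0) > now
    if not still_blocked and not first_get_is_blocked(payload):
        return []
    if not still_blocked:
        blocked_until[key] = now + block_seconds
    return forge_resets(flow, next_ack)
```

Note that `inspect` only ever returns extra packets to send. Nothing in the sketch removes the original request from the wire, which mirrors the on-path constraint.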
Figure-2: This is a screenshot taken from Wireshark when the author attempted to trigger TCP RST using a VPN. While connected to a VPN server in Shenzhen, the author used Yahoo to search for the censored string “falun”. Although the search returned results, the author was unable to connect to most websites from the results page. The TCP Retransmission shown above is evidence of the failure to connect. The author initially thought the five TCP RST packets in red were the doings of the GFW. However, the ACK numbers of the packets were all 0, which is uncharacteristic of forged TCP RST packets. Thus, although it was a valiant attempt, it is unlikely that the GFW was at play here.
After numerous other attempts at triggering TCP RST without conclusive evidence, the
author reached out to GreatFire for advice. Martin Johnson sent back the following response:
“Keyword resets matter much less now with most big websites using HTTPS and so many major
ones being blocked wholesale anyway. I just tested a couple of sensitive keywords and the
connection was not reset, so perhaps the GFW is using it much less now than it used to.” HTTPS
encrypts the payload of packets in transit, so the GFW IDS has no way of inspecting the content of
HTTPS traffic, rendering keyword-based TCP RST useless against it. In addition, creating rules on
both endpoints to ignore TCP RST packets can also completely bypass TCP RST. These crippling
weaknesses have led to the demise of TCP RST in recent years.
IP Address Blocking
Figure-3: An illustration of IP Address Blocking.
IP address blocking is a simple, lightweight, yet extremely effective censorship tool. “By
peering with the gateway routers of all Chinese ISPs, GFW injects a list of blacklisted destination
addresses into BGP (Border Gateway Protocol) and hijacks all traffic to blocked websites.” [x] In
other words, the GFW forces routers to drop all traffic destined for blocked IPs. This technique is called
null routing. It can only block outbound traffic from China, while inbound traffic is still permitted. This is
sufficient in most cases, as most current Internet communication requires a three-way
handshake to function.
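A short sketch may help show why dropping only outbound packets is enough to break a TCP connection in either direction. It is a simplified model, assuming a single gateway and a hypothetical null-routed prefix; the addresses come from documentation ranges and the function names are invented for illustration.

```python
import ipaddress
from typing import Set

# Hypothetical null-routed prefix (a documentation range, not a real blacklist entry).
NULL_ROUTES = [ipaddress.ip_network("198.51.100.0/24")]


def gateway_forwards(dst: str) -> bool:
    """The gateway drops any outbound packet whose destination is null-routed."""
    addr = ipaddress.ip_address(dst)
    return not any(addr in net for net in NULL_ROUTES)


def handshake_completes(initiator: str, responder: str, inside: Set[str]) -> bool:
    """Walk the SYN, SYN-ACK, ACK exchange and apply the outbound-only filter."""
    packets = [
        (initiator, responder),  # SYN
        (responder, initiator),  # SYN-ACK
        (initiator, responder),  # ACK
    ]
    for src, dst in packets:
        outbound = src in inside and dst not in inside
        if outbound and not gateway_forwards(dst):
            return False  # one lost leg is enough to stall the handshake
    return True


inside_hosts = {"192.0.2.10"}  # hypothetical client inside China
print(handshake_completes("192.0.2.10", "198.51.100.7", inside_hosts))  # SYN is dropped
print(handshake_completes("198.51.100.7", "192.0.2.10", inside_hosts))  # SYN-ACK is dropped
print(handshake_completes("192.0.2.10", "203.0.113.9", inside_hosts))   # not blacklisted
```

In the second call the inbound SYN from the blocked address does arrive, but the reply is an outbound packet to a null-routed destination, so the connection still cannot be established.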
IP address blocking is a “lightweight solution as the government can maintain a
centralized blacklist without much involvement from the ISPs, and thus without much risk of
leakage.” It also adds only a small load to the gateway routers of ISPs, and doesn’t require any
additional dedicated infrastructure. However, IP blocking does have two key limitations. First,
the effectiveness of IP address blocking relies on the accuracy of the blacklist. It needs to be
carefully maintained and updated, and websites can keep switching to new IP addresses to stay
ahead of the GFW. Second, as many legitimate websites share the same IP addresses or address
blocks with banned sites, over-censoring is an unavoidable side effect. In the past, censored
websites have exploited this to pressure the government into unblocking them. For
example, the heavily targeted site www.falundafa.org began resolving to the same IP address
as www.mit.edu at one point, which the GFW then blocked. MIT’s OpenCourseWare site
was blocked as well, which caused such a public outcry that the block was revoked.
Figure-4: This screenshot is an example of IP blocking. The author tried to access Google via the IP 216.58.200.46. No data was received and the connection eventually timed out, as evidenced by the TCP Retransmission packets in black.
DNS Tampering
Figure-5: An illustration of DNS Tampering.
DNS tampering is used in conjunction with IP address blocking, as changing domain
names is much harder than changing IP addresses. The first step in DNS tampering is DNS
injection. When a user attempts to connect to a domain, the computer queries DNS servers for
the IP address associated with the domain name. The GFW monitors each DNS query originating
from clients inside China at the border of the Chinese Internet. If it detects a query for a blocked
domain name, it injects a fake DNS reply with an invalid IP or, in some rare cases, the IP of
another website. This fake DNS reply then trickles down to internal recursive DNS servers in
China, with the incorrect pairing cached along the way, achieving DNS poisoning. Thus, almost
all DNS resolvers in China have poisoned caches.
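The injection race and the resulting cache poisoning can be modelled in a few lines. This is a toy simulation under assumptions, not a DNS implementation: the resolver is assumed to accept whichever reply arrives first, the round-trip times are made up, the "real" address is a documentation-range placeholder, and the forged address is the poisoned IP reported in Figure-6 of this paper.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

BLOCKED_DOMAINS = {"www.facebook.com"}  # example of a blocked name from this paper
FAKE_IP = "93.46.8.89"                  # poisoned address observed in Figure-6

Reply = Tuple[float, str]  # (arrival time in ms, IP address carried by the reply)


def replies_seen(name: str, real_ip: str, real_rtt_ms: float,
                 injector_rtt_ms: float) -> List[Reply]:
    """The genuine reply always arrives; a forged one is added for blocked names."""
    replies = [(real_rtt_ms, real_ip)]
    if name in BLOCKED_DOMAINS:
        replies.append((injector_rtt_ms, FAKE_IP))
    return replies


@dataclass
class Resolver:
    cache: Dict[str, str] = field(default_factory=dict)

    def resolve(self, name: str, replies: List[Reply]) -> str:
        """Serve from cache if possible, else accept and cache the first reply."""
        if name in self.cache:
            return self.cache[name]
        _, ip = min(replies, key=lambda reply: reply[0])
        self.cache[name] = ip
        return ip


resolver = Resolver()
# The injector sits at the border, so its reply (20 ms) beats the real one (180 ms).
answer = resolver.resolve(
    "www.facebook.com",
    replies_seen("www.facebook.com", "203.0.113.10", 180.0, 20.0),
)
print(answer)
```

Because the forged answer is cached, every later lookup served by this resolver returns the poisoned address even if no new forged reply is injected.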
When a site’s domain name gets blocked in this way, there is little the site can do
besides changing it. Therefore, DNS tampering and IP address blocking used together can
effectively seal off censored sites at all levels. As with IP blocking, there are two major
downsides to DNS tampering. First, large-scale collateral damage is unavoidable because
the GFW does not distinguish between DNS queries that originate from China and those that simply
pass through China. [xi] Research showed that “Chinese DNS injection affected 15,225 open
resolvers (6% of tested resolvers) outside China, from 79 countries.” [xii] Second, as another unintended
consequence, huge volumes of traffic can be suddenly directed to innocent websites,
seriously disrupting their normal operation and forcing them to block all communications from
China. Craig Hockenberry, a network engineer who maintains a simple web server for
Iconfactory, wrote a fascinating blog entry about his encounter with the GFW titled “Fear China”. [xiii]
On January 20th, 2015, Craig’s single four-core server was suddenly hit with traffic that peaked
at 52 Mbps which, assuming each request was 500 bytes, is about a third of Google’s global
search traffic. Upon reviewing server logs, he saw that his server was hit with connections targeted at
212 different domains, from www.youtube.com to cdn.gayhotlove.com, all originating from
China. [xiv] Essentially, the GFW has inadvertently weaponized its network to DDoS innocent IPs.
This has a high potential for abuse by the government, and if this trend continues, more and
more websites will have no choice but to block all traffic from China in anticipation.
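To put the peak figure in perspective, the bandwidth can be converted into a request rate using the 500-byte assumption stated above. The only additional assumption in this short calculation is that 1 Mbps means 1,000,000 bits per second.

```python
PEAK_BITS_PER_SECOND = 52_000_000  # 52 Mbps peak reported in the text
BYTES_PER_REQUEST = 500            # request size assumed in the text


def requests_per_second(bits_per_second: float, bytes_per_request: float) -> float:
    """Convert a bandwidth figure into a request rate."""
    return bits_per_second / (bytes_per_request * 8)


rate = requests_per_second(PEAK_BITS_PER_SECOND, BYTES_PER_REQUEST)
print(f"{rate:,.0f} requests per second")  # prints "13,000 requests per second"
```

A sustained rate of this size is far beyond what a single four-core web server is normally provisioned to handle.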
Figure-6: This screenshot is an example of DNS tampering. The author tried to access www.facebook.com, as can be seen from the standard query. The DNS server returned a poisoned address, 93.46.8.89. The TCP retransmissions to that IP are evidence that the IP is invalid. Further research revealed that this is one of seven poisoned IPs regularly used by the GFW, and is owned by the company Fastweb in Italy. [xv]
Bypassing the GFW
While the majority of Internet users in China are aware of the existence of the GFW, few are
actually interested in bypassing the censorship and accessing blocked websites. This is mainly
due to political propaganda and the popularity of Chinese “clones” of sites such as Facebook
and Twitter. For the few technologically savvy netizens of China, VPNs and proxies remain the
most accessible ways of avoiding the GFW. Virtual Private Networks (VPNs) work by routing all traffic
to and from a computer through a server over an encrypted tunnel, using one of several secure
protocols. Thus, all connections to the outside web appear to be coming from the location of the
VPN server instead of the user’s actual location, and the user can effectively bypass the GFW.
Proxies function similarly, except that typically only browser traffic is relayed and encrypted.
Although the GFW has no way of interpreting encrypted content between the user and
the VPN server, it understands popular VPN protocols well enough that it can
use deep packet inspection and machine learning to identify and shut down VPN connections. [xvi]
A user who sets up their own VPN using a basic OpenVPN setup will find that the VPN works fine
for a few minutes before latency starts increasing rapidly and the connection eventually times out. The
GFW uses heuristics to guess which TCP/UDP connections are used for VPNs, then simply drops
all packets once it has enough “proof”. One user on Hacker News pointed out that the only
way to manually disguise VPN traffic is to make it look like standard HTTPS sessions. “For
example in a traditional HTTPS session, if the client browser downloads a 500 kB image over
HTTPS, it will send periodical empty TCP ACK packets as it receives the data. But when using a
VPN that encrypts data at the IP layer, these empty ACK packets will be encrypted, so The Great
Firewall will see the client sending small ~80-120 bytes encrypted packets, and will count this as
one more sign that this might be a VPN.” [xvii]
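The packet-size signal described in the quote can be turned into a toy heuristic. This is a hedged sketch of one possible classifier, not the GFW's actual logic: the 80 to 120 byte range comes from the quote, while the 50% threshold, the minimum sample size, and the example packet sizes are all hypothetical.

```python
from typing import Sequence

SMALL_MIN, SMALL_MAX = 80, 120  # byte range mentioned in the quote


def small_packet_ratio(upstream_sizes: Sequence[int]) -> float:
    """Fraction of client-to-server packets that fall in the suspicious size range."""
    if not upstream_sizes:
        return 0.0
    small = sum(1 for size in upstream_sizes if SMALL_MIN <= size <= SMALL_MAX)
    return small / len(upstream_sizes)


def looks_like_vpn(upstream_sizes: Sequence[int],
                   threshold: float = 0.5, min_packets: int = 20) -> bool:
    """Flag a flow once enough packets are seen and most of them are small and encrypted."""
    if len(upstream_sizes) < min_packets:
        return False  # not enough "proof" yet
    return small_packet_ratio(upstream_sizes) >= threshold


# Hypothetical upstream traces during a large download.
https_like = [66] * 40 + [517] + [66] * 10  # bare ACKs plus one request
vpn_like = [96] * 40 + [600] * 5            # every ACK wrapped in an encrypted packet

print(looks_like_vpn(https_like))
print(looks_like_vpn(vpn_like))
```

A real classifier would presumably combine many such signals. The `min_packets` guard reflects the observation above that a connection tends to work for a few minutes before it is disrupted.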
Considering that the GFW has the capability to shut down VPN connections, it remains
a mystery why the government allows commercial VPNs such as ExpressVPN, Astrill VPN, and
HMA! to operate freely. The officially accepted answer is that the Chinese government is
willing to give legitimate foreign businesses some breathing room, as many firms rely on the
use of VPNs in their day-to-day operations. Conspiracy theorists speculate that the Chinese
government has already taken control of these commercial VPN services, and is actively
spying on supposedly encrypted connections using man-in-the-middle attacks. One blog post by Marc
Bevand on Jan 14, 2016 pointed out that ExpressVPN, one of the top 3 VPN services used in
China, used a CA certificate with an RSA key of only 1024 bits. [xviii] It is believed that such a key can be
factored with about $10 million worth of specialized hardware, which is hardly infeasible considering the
benefits it would bring the Chinese government. On February 15, 2016, ExpressVPN upgraded
their CA keys from 1024 to 4096 bits, and no one will ever know whether ExpressVPN was
compromised or not.
Figure-7: An illustration of Active Probing System
More advanced users can leverage Tor, the infamous anonymity network, to
circumvent the GFW. In simple terms, Tor users connect
through a series of virtual tunnels rather than making a direct connection. “It is an effective
censorship circumvention tool, allowing its users to reach otherwise blocked destinations or
content.” [xix] However, Tor has not always been viable in China. In 2012, Chinese users started
having issues connecting to the Tor network. [xx] After an extended investigation, it was revealed
that the GFW uses an active probing system to dynamically recognize Tor usage. [xxi] Tor relies on
a large number of entry guards and bridge relays as endpoints to offer connections to censored
regions. These bridges are Tor relays that aren’t listed in the main Tor directory, so they should
theoretically be untraceable. The GFW implemented a real-time probing system that “searches
for bytes that identify a network connection as Tor. If these bytes are found, the firewall
initiates a scan of the host which is believed to be a bridge and shuts it down.” This works
because the GFW “is able to (partially) speak the vanilla Tor protocol, obfs2, and obfs3 to probe
bridges”, and it functions in real time: “on average, it takes only half a second after a bridge
connection for an active probe to show up”. [xxii] The scans are run from seemingly arbitrary
computers strewn throughout China, and cannot be predicted by the Tor network.
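The two-stage process described above, passive fingerprinting followed by an active probe, can be sketched as follows. This is a simplified simulation under assumptions: the fingerprint bytes are a made-up marker rather than the real signature, hosts are modelled as a dictionary mapping an address to the protocol they answer, and all names are hypothetical. Only the list of protocols the prober speaks is taken from the text.

```python
from typing import Dict, Optional, Set, Tuple

Endpoint = Tuple[str, int]  # (IP address, port)

# Protocols the text says the GFW can (partially) speak.
PROBE_PROTOCOLS = ("tor", "obfs2", "obfs3")
# Made-up marker standing in for the byte pattern the DPI stage searches for.
FINGERPRINT = b"TOR-LIKE-HELLO"


def flow_is_suspicious(first_bytes: bytes) -> bool:
    """Stage 1: passive inspection looks for bytes that suggest a Tor connection."""
    return FINGERPRINT in first_bytes


def probe_succeeds(host_protocol: Optional[str]) -> bool:
    """Stage 2: the probe confirms a bridge only if the host answers a known protocol."""
    return host_protocol in PROBE_PROTOCOLS


def process_flow(dst: Endpoint, first_bytes: bytes,
                 hosts: Dict[Endpoint, str], blocked: Set[Endpoint]) -> bool:
    """Return True if the destination was confirmed as a bridge and blocked."""
    if not flow_is_suspicious(first_bytes):
        return False  # never probed
    if probe_succeeds(hosts.get(dst)):
        blocked.add(dst)
        return True
    return False  # suspicious bytes, but the host did not behave like a bridge


hosts = {
    ("198.51.100.20", 443): "obfs3",  # hypothetical bridge
    ("198.51.100.21", 443): "https",  # hypothetical ordinary web server
}
blocked: Set[Endpoint] = set()
process_flow(("198.51.100.20", 443), b"..." + FINGERPRINT, hosts, blocked)
process_flow(("198.51.100.21", 443), b"..." + FINGERPRINT, hosts, blocked)
print(blocked)
```

Confirming with a probe before blocking keeps false positives low: the web server in the example is left alone even though a flow to it matched the fingerprint.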
The situation only changed in 2015, when the Tor Project released obfs4 and Meek, two
protocols that use Pluggable Transports. [xxiii] Per the Tor Project, Pluggable Transports transform the
Tor traffic between client and bridge. [xxiv] Specifically, obfs4 is an obfuscation protocol that uses
Pluggable Transports to further encrypt the connection between client and bridge. This relies
on a shared secret distributed out of band. Since probes do not have access to the secret key, they
cannot identify the bridges. Alternatively, Meek is a transport protocol that relays traffic through
popular cloud computing services, such as Microsoft Azure, by imitating regular traffic. Instead
of taking the HTTPS approach often used against GFW, it uses HTTP and TLS for obfuscation.
The GFW cannot distinguish between Tor traffic and normal cloud traffic, and it cannot afford to
block the IPs of cloud computing services for business reasons.
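The idea that a bridge only reveals itself to clients holding the out-of-band secret can be illustrated with a keyed hash. This is a deliberately simplified stand-in and not the actual obfs4 handshake, which uses different cryptography: here the client proves knowledge of the secret with an HMAC over a random nonce, and the bridge stays silent otherwise. The message layout and function names are assumptions made for this sketch.

```python
import hashlib
import hmac
import os
from typing import Optional

NONCE_LEN = 16


def client_hello(secret: bytes) -> bytes:
    """Client message: a random nonce followed by an HMAC keyed with the shared secret."""
    nonce = os.urandom(NONCE_LEN)
    return nonce + hmac.new(secret, nonce, hashlib.sha256).digest()


def bridge_reply(secret: bytes, hello: bytes) -> Optional[bytes]:
    """Answer only if the tag verifies; otherwise return None (send nothing)."""
    nonce, tag = hello[:NONCE_LEN], hello[NONCE_LEN:]
    expected = hmac.new(secret, nonce, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return None  # a prober without the secret learns nothing about the host
    return hmac.new(secret, b"server" + nonce, hashlib.sha256).digest()


bridge_secret = os.urandom(32)  # distributed to legitimate users out of band

legit = bridge_reply(bridge_secret, client_hello(bridge_secret))
probe = bridge_reply(bridge_secret, client_hello(b"prober-guess"))
print(legit is not None, probe is None)
```

From the prober's point of view, a host that never answers an invalid handshake gives no protocol behaviour to recognize, which is why the active probing approach described earlier fails against such bridges.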
Obfs4 and Meek take opposite approaches to evading the GFW: obfs4 uses an extra
layer of encryption to hide in the shadows, while Meek imitates regular traffic to hide in plain
sight. Nonetheless, both have contributed to the resurgence of Tor in bypassing the GFW in the
past year.
Conclusion
The Great Firewall is a powerful and sophisticated censorship tool unlike any the world
has seen before. It uses a combination of DNS tampering and IP address blocking to completely
seal off access. In addition, it uses an IDS-like system to inspect traffic for blacklisted keywords
and terminate connections by injecting RST packets. While these tools can cause significant
collateral damage, they are extremely effective at blocking almost any website for the vast
majority of Internet users in China. On the other hand, netizens can use VPNs, proxies, and Tor
to bypass the GFW. Yet the GFW leverages machine learning and deep packet inspection to
shut down VPN and proxy tunnels, and deploys an active probing system that can shut down
Tor relays running anything but the latest Tor protocols. With the recent introduction of the
attack-oriented Great Cannon, another piece of the Golden Shield Project is now complete. Internet
censorship will continue to evolve and grow, and many expect the Internet in China to become
even more controlled. At the same time, the battle against censorship rages on, but it would be
hard to convince anyone that the Chinese government is losing.
References
[i] http://news.sciencenet.cn/htmlnews/2014/8/301669.shtm
[ii] https://cs.stanford.edu/people/eroberts/cs181/projects/2010-11/FreedomOfInformationChina/the-great-firewall-of-china-background/index.html
[iii] http://www.internetlivestats.com/internet-users-by-country/
[iv] http://networkcultures.org/query/wp-content/uploads/sites/4/2014/06/10.Min_Jiang.pdf
[v] https://www.usenix.org/system/files/conference/foci15/foci15-paper-marczak.pdf
[vi] https://blog.thousandeyes.com/deconstructing-great-firewall-china/
[vii] https://www.cl.cam.ac.uk/~rnc1/ignoring.pdf
[viii] http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.191.206&rep=rep1&type=pdf
[ix] http://0b4af6cdc2f0c5998459-c0245c5c937c5dedcca3f1764ecc9b2f.r43.cf2.rackcdn.com/12389-foci13-khattak.pdf
[x] http://queue.acm.org/detail.cfm?id=2405036
[xi] http://conferences.sigcomm.org/sigcomm/2012/paper/ccr-paper266.pdf
[xii] http://ieeexplore.ieee.org/stamp/stamp.jsp?reload=true&arnumber=6814824
[xiii] http://furbo.org/2015/01/22/fear-china/
[xiv] https://gist.github.com/chockenberry/c3e584c28ad6ab6e5faa
[xv] http://viewdns.info/research/dns-cache-poisoning-in-the-peoples-republic-of-china/
[xvi] http://link.springer.com/article/10.1007/s10796-008-9131-2
[xvii] https://news.ycombinator.com/item?id=10101653
[xviii] https://news.ycombinator.com/item?id=10101653
[xix] https://www.torproject.org/about/overview
[xx] https://blog.torproject.org/blog/knock-knock-knockin-bridges-doors
[xxi] http://www.cs.kau.se/philwint/gfw/
[xxii] https://blog.torproject.org/category/tags/gfw
[xxiii] https://plus.google.com/+GhostAssassin/posts/aLcyVfcH7mP
[xxiv] https://www.torproject.org/docs/pluggable-transports.html.en