Post on 25-Apr-2018
George Markomanolis
IO500 Committee: John Bent, Julian M. Kunkel, Jay Lofstead
2017-11-12 http://www.io500.org
IBM Spectrum Scale User Group, Denver, Colorado, USA
Why?
• The growth of the studied domains leads to larger data output and thus more stress on the file system
• Customers buy storage by evaluating only the maximum GB/s achieved by IOR, while many real applications cannot achieve similar performance
• I/O efficiency can be degraded by interference from multiple users
• A real case: a commercial application running on one node was consuming more than 15% of the overall metadata capacity
• We need a suite of benchmarks in order to understand what the real performance expectations are
• Tracking storage performance and sharing best practices
How?
• Community-driven effort, discussed through the mailing list, Slack, etc. Everything is on GitHub (https://github.com/VI4IO/io-500-dev.git)
• Patterns: metadata, data, search
• Easy for optimized patterns
• Hard for naïve patterns
• Relies on community benchmarks, such as IOR and mdtest (for now)
What is IO-500?
IOR Easy: This is what is used during procurements, where we measure the most efficient I/O pattern; the user can declare the parameters and we save one file per MPI process
IOR Hard: Single shared file, 47008-byte random access, POSIX
MD Easy: Create rank directories with N empty files
MD Hard: Single shared directory, files of 3901 bytes, POSIX
Find: The find functionality searches for files of 3901 bytes across all the created files. Sven added the mmfind.sh script for Spectrum Scale environments (io-500-dev/utilities/find/mmfind.sh)
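To make the Find phase concrete, here is a minimal serial sketch of a "find files of exactly 3901 bytes" search. It is not the pfind or mmfind.sh implementation used by IO-500; the function name `find_by_size` and the throwaway directory layout are hypothetical and exist only for illustration.

```python
import os
import stat
import tempfile


def find_by_size(root, size_bytes=3901):
    """Return sorted paths of regular files under root that are exactly size_bytes long."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            info = os.lstat(path)  # lstat: inspect the entry itself, do not follow symlinks
            if stat.S_ISREG(info.st_mode) and info.st_size == size_bytes:
                matches.append(path)
    return sorted(matches)


if __name__ == "__main__":
    # Build a tiny throwaway tree: two 3901-byte files and one empty file.
    with tempfile.TemporaryDirectory() as root:
        os.mkdir(os.path.join(root, "rank0"))
        for rel, size in [("rank0/a", 3901), ("b", 3901), ("rank0/empty", 0)]:
            with open(os.path.join(root, rel), "wb") as fh:
                fh.write(b"x" * size)
        print(len(find_by_size(root)), "matching files")
```

A serial walk like this touches every inode once, which is why the benchmark reports the phase as a metadata rate (kiops) and why parallel variants are of interest on large trees.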
Challenges & Approach I
• Representative of applications and user requirements
• Using different workloads to extract upper and lower performance bounds for optimized and non-optimized applications, respectively
• Report meaningful metrics
• Implement a find functionality (we tried 3 different versions)
• Libcircle is used by the parallel find and is not friendly to machines that do not provide the mpicc wrapper; the problem is solved with some manual modifications
Challenges & Approach II
• Concurrent runs are to be integrated; initial tests already provide interesting results
• 5-minute limit per experiment to avoid long runs
• Extended IOR/mdtest with phase-out stonewalling options
• Easy to build: less than 70 seconds for the basic version to be installed
How to run IO-500
• git clone https://github.com/VI4IO/io-500-dev
• cd io-500-dev
• ./utilities/prepare.sh
• ./io500.sh (submit this script if you use a scheduler)
• email results to submit@io500.org
Demo installation of IO500
Modify IO-500
• Modify io500.sh accordingly, for example:
io500_mpirun="mpirun"
io500_mpiargs="-np 2"
io500_ior_easy_params="-t 2048k -b 2g -F"
io500_mdtest_easy_files_per_proc=25000
Modify IO-500 II
• Modify io500.sh accordingly, selecting which experiments to execute:
io500_run_ior_easy="True"
io500_run_md_easy="True"
…
io500_run_md_hard_delete="True"
• For a valid submission, you need to execute all the tests, and the write phases should take at least 5 minutes
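The 5-minute rule can be checked mechanically once the per-phase runtimes are known. The sketch below is one way to do that, under a loud assumption: a "write phase" is taken to be any phase whose name ends in `_write`, which the slides do not define precisely. The helper name `short_write_phases` is hypothetical, and the runtimes are copied from the example test case shown later in this transcript.

```python
MIN_WRITE_SECONDS = 300.0  # "at least 5 minutes" from the slide


def short_write_phases(phase_times, minimum=MIN_WRITE_SECONDS):
    """Return the write phases whose runtime is below the minimum, sorted by name.

    Assumption: a "write phase" is any phase whose name ends in "_write".
    """
    return sorted(
        name
        for name, seconds in phase_times.items()
        if name.endswith("_write") and seconds < minimum
    )


if __name__ == "__main__":
    # Runtimes (seconds) copied from the example test case in this transcript.
    example = {
        "ior_easy_write": 187.24,
        "ior_hard_write": 46.79,
        "ior_easy_read": 164.76,
        "mdtest_easy_write": 19.61,
        "mdtest_hard_write": 17.05,
    }
    for name in short_write_phases(example):
        print(f"{name}: {example[name]:.2f} s is under {MIN_WRITE_SECONDS:.0f} s")
```

Under this reading, every write phase in that example finishes in under 300 seconds, so its parameters would need to be increased before submitting.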
Modify IO-500 III
• Modify io500.sh accordingly: uncomment these lines and declare the path to your pfind wrapper:
#io500_find_mpi="True"
#io500_find_cmd="$PWD/bin/pfind"
Example of a test case
[RESULT] BW   phase 1  ior_easy_write       96.133 GB/s  : time 187.24 seconds
[RESULT] BW   phase 2  ior_hard_write       11.230 GB/s  : time  46.79 seconds
[RESULT] BW   phase 3  ior_easy_read       109.249 GB/s  : time 164.76 seconds
[RESULT] BW   phase 4  ior_hard_read         7.871 GB/s  : time  66.74 seconds
[RESULT] IOPS phase 1  mdtest_easy_write    49.231 kiops : time  19.61 seconds
[RESULT] IOPS phase 2  mdtest_hard_write    15.444 kiops : time  17.05 seconds
[RESULT] IOPS phase 3  find                  8.120 kiops : time  98.45 seconds
[RESULT] IOPS phase 5  mdtest_easy_stat      5.313 kiops : time 127.18 seconds
[RESULT] IOPS phase 6  mdtest_hard_stat      6.772 kiops : time  30.43 seconds
[RESULT] IOPS phase 7  mdtest_easy_delete   14.873 kiops : time  49.98 seconds
[RESULT] IOPS phase 8  mdtest_hard_read     45.599 kiops : time  10.16 seconds
[RESULT] IOPS phase 9  mdtest_hard_delete   30.776 kiops : time  11.84 seconds
[SCORE] Bandwidth 31.04 GB/s : IOPS 16.1537 kiops : TOTAL 501.4108
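The slides do not state how the [SCORE] line is derived, but the numbers above are consistent with a geometric mean of the four bandwidth results, a geometric mean of the eight IOPS results, and a TOTAL equal to their product. The sketch below reproduces that arithmetic. The formula is an inference from this one sample rather than something the deck confirms, and the regular expression and function names are hypothetical.

```python
import math
import re

# Matches e.g. "[RESULT] BW phase 1 ior_easy_write 96.133 GB/s : time 187.24 seconds"
RESULT_LINE = re.compile(
    r"\[RESULT\]\s+(BW|IOPS)\s+phase\s*\d+\s+(\S+)\s+([\d.]+)\s+(?:GB/s|kiops)"
)

SAMPLE = """\
[RESULT] BW phase 1 ior_easy_write 96.133 GB/s : time 187.24 seconds
[RESULT] BW phase 2 ior_hard_write 11.230 GB/s : time 46.79 seconds
[RESULT] BW phase 3 ior_easy_read 109.249 GB/s : time 164.76 seconds
[RESULT] BW phase 4 ior_hard_read 7.871 GB/s : time 66.74 seconds
[RESULT] IOPS phase 1 mdtest_easy_write 49.231 kiops : time 19.61 seconds
[RESULT] IOPS phase 2 mdtest_hard_write 15.444 kiops : time 17.05 seconds
[RESULT] IOPS phase 3 find 8.120 kiops : time 98.45 seconds
[RESULT] IOPS phase 5 mdtest_easy_stat 5.313 kiops : time 127.18 seconds
[RESULT] IOPS phase 6 mdtest_hard_stat 6.772 kiops : time 30.43 seconds
[RESULT] IOPS phase 7 mdtest_easy_delete 14.873 kiops : time 49.98 seconds
[RESULT] IOPS phase 8 mdtest_hard_read 45.599 kiops : time 10.16 seconds
[RESULT] IOPS phase 9 mdtest_hard_delete 30.776 kiops : time 11.84 seconds
"""


def geometric_mean(values):
    """Geometric mean computed in log space to avoid overflow on long lists."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def score(text):
    """Return (bandwidth, iops, total) under the assumed geometric-mean scoring."""
    groups = {"BW": [], "IOPS": []}
    for kind, _name, value in RESULT_LINE.findall(text):
        groups[kind].append(float(value))
    bw = geometric_mean(groups["BW"])
    iops = geometric_mean(groups["IOPS"])
    return bw, iops, bw * iops  # assumed: TOTAL is the product of the two means


if __name__ == "__main__":
    bw, iops, total = score(SAMPLE)
    print(f"Bandwidth {bw:.2f} GB/s : IOPS {iops:.4f} kiops : TOTAL {total:.4f}")
```

A geometric mean keeps one very fast phase from masking a slow one: the easy IOR phases here are roughly ten times faster than the hard ones, yet the bandwidth score stays near 31 GB/s.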
Experience with IO500 benchmark
• Without proper tuning, the benchmark will finish either too fast or too slow
• Start tuning with small values and increase them until you find the ones that produce the required outcome
• Be sure that you have enough space for the output data
• Check from the IOR output whether it correctly recognizes the number of processes and how many are used per node
• If the benchmark is too slow without reason, check whether other users are executing I/O-intensive applications
• Be sure that you do not harm the system; try to execute the benchmark when the system is not too busy or during maintenance
• For IOR Hard, you could stripe the corresponding folder
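The "start small and increase" advice can be given a first-guess rule: run a short trial, measure how long a phase took, and scale the parameter linearly toward the 5-minute target. The sketch below assumes runtime grows linearly with the parameter, which caching and contention can easily break, so treat the result as a starting point for the next trial rather than a final setting. The function name `scale_to_target` is hypothetical.

```python
import math

TARGET_SECONDS = 300.0  # the 5-minute write-phase requirement


def scale_to_target(current_value, measured_seconds, target_seconds=TARGET_SECONDS):
    """Estimate the parameter value needed to reach target_seconds.

    Assumption: runtime is proportional to the parameter (e.g. files per process).
    """
    if current_value <= 0 or measured_seconds <= 0:
        raise ValueError("current_value and measured_seconds must be positive")
    return math.ceil(current_value * target_seconds / measured_seconds)


if __name__ == "__main__":
    # Hypothetical trial: 1000 files per process took 100 seconds.
    print(scale_to_target(1000, 100.0))
```

For example, a trial of 1000 files per process that took 100 seconds suggests trying about 3000 files per process next.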
KAUST – Cray DataWarp – IO-500
• 300 compute nodes, 2400 processes, 268 DataWarp nodes
• ior_easy_params="-t 2m -b 192616m"
• ior_hard_writes_per_proc=77872
• mdtest_hard_files_per_proc=1630
• mdtest_easy_files_per_proc=10800
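To get a feel for the scale these parameters imply, the sketch below multiplies them out across the 2400 processes. It rests on several assumptions: `-b 192616m` is read as a per-process block of 192616 MiB with a single segment, each IOR Hard write is the 47008-byte record from the "What is IO-500?" slide, and every phase runs to completion. With stonewalling, a phase may stop early, so these are upper bounds.

```python
MIB = 1024 ** 2
TIB = 1024 ** 4

PROCESSES = 2400
IOR_EASY_BLOCK_MIB = 192616        # from "-b 192616m" (assumed per process, one segment)
IOR_HARD_WRITES_PER_PROC = 77872
IOR_HARD_RECORD_BYTES = 47008      # IOR Hard record size from the earlier slide
MDTEST_HARD_FILES_PER_PROC = 1630
MDTEST_EASY_FILES_PER_PROC = 10800


def totals(processes=PROCESSES):
    """Return aggregate data volumes (bytes) and file counts across all processes."""
    return {
        "ior_easy_bytes": processes * IOR_EASY_BLOCK_MIB * MIB,
        "ior_hard_bytes": processes * IOR_HARD_WRITES_PER_PROC * IOR_HARD_RECORD_BYTES,
        "mdtest_hard_files": processes * MDTEST_HARD_FILES_PER_PROC,
        "mdtest_easy_files": processes * MDTEST_EASY_FILES_PER_PROC,
    }


if __name__ == "__main__":
    t = totals()
    print(f"IOR easy: {t['ior_easy_bytes'] / TIB:.2f} TiB")
    print(f"IOR hard: {t['ior_hard_bytes'] / TIB:.2f} TiB")
    print(f"mdtest hard: {t['mdtest_hard_files']:,} files")
    print(f"mdtest easy: {t['mdtest_easy_files']:,} files")
```

Under these assumptions the easy IOR phase could write up to roughly 441 TiB and the hard phase about 8 TiB, which ties in with the earlier advice to check free space before running.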
Presenting data in radar chart
[Figure: radar chart of ranked systems #1 and #2, with axes Score, IO, MD, and TotIOPs on a 0 to 1 scale]
The best storage I/O system would appear as a full diamond in the graph
NASA - IOPS Galore Encore
Some GPFS systems are in the first IO500 list, which will be presented on Wednesday at the IO500 BOF.
Would you be interested in providing new results?
Conclusions
• Until now, IOR easy has been considered the normal approach for procurement; however, it does not correspond to real applications
• We need a better way to understand the procurement of storage, and IO500 seems to be heading in the right direction
• A customer can reach decisions based on their application requirements
• We plan some future additions, such as mixed workloads
• The more submissions we have, the better we can understand the various file systems
You are welcome to the IO500 BOF!