EMI INFSO-RI-261611
EMI proposal for a Storage Accounting Record standard
Jon Kerr Nilsen Dept. of Physics, Univ. of Oslo
On behalf of the EMI StAR team
Outline
• Why storage accounting?
• Storage accounting vs. job accounting
• Processing model
• Storage accounting work in EMI and OGF
• StAR proposal
• Way forward
Why storage accounting?
• Storage resources are shared between organizations
  – More than one owner
• Developing a storage infrastructure
  – We need to know how much storage space is used, by which group/user, on which storage media
  – To know where to put the money when increasing the storage space
  – To know who to ask for the money to increase the storage space
• Basis for billing
  – Storage is expensive
  – Some non-academic resource owners may not like to give it away for free
Accounting
• Accounting is the recording and summarizing of the consumption of a resource by an individual user or a group of users
• Typically used to find out who uses how much of a set of resources
• Typically not used to find out how individual resource components are used
Storage vs. job accounting
Job accounting
• Job accounting is the recording and summarizing of the consumption of computing resources
• Recording wall-time, CPU time, etc.
  – Strictly increasing values
• When a job finishes it stays finished
• Typically measured per job
  – Information readily available from local batch system
• Standardized through UR1.0
Storage accounting
• Storage accounting is the recording and summarizing of the consumption of a storage resource
• Storage usage varies in time
  – Recorded usage is only relevant in a given time frame
• Data is rarely in a finished state and can live for a long time
• Typically recorded through (in)frequent usage snapshots
  – High fluctuation rate
• No standardized way to record storage usage in a distributed environment
Storage vs. job summary
• Storage records must be treated differently from compute records with respect to aggregation
• Storage records need special treatment
  – Only valid between measurement time and expiration time
  – Can be invalidated if a newer record is registered for the same resource
  – Nontrivial processing model
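The two validity rules above can be illustrated with a minimal sketch. The `Record` class and its field names (`resource`, `measure_time`, `expiration_time`, `used_bytes`) are hypothetical and chosen only for illustration; they are not the StAR schema. The sketch also assumes that a newer record starts to invalidate an older one from its own measurement time onwards.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Record:
    """Hypothetical storage usage snapshot (names are illustrative only)."""
    resource: str              # what was measured, e.g. a share or a user's area
    measure_time: datetime     # when the snapshot was taken
    expiration_time: datetime  # after this the snapshot no longer counts
    used_bytes: int


def is_valid_at(record, others, t):
    """True if `record` is inside its validity window at time t and no newer
    record for the same resource has been measured at or before t."""
    if not (record.measure_time <= t < record.expiration_time):
        return False
    return not any(
        o.resource == record.resource
        and record.measure_time < o.measure_time <= t
        for o in others
    )


t0 = datetime(2010, 11, 9, 9, 0)
old = Record("share-a", t0, t0 + timedelta(hours=24), 100)
new = Record("share-a", t0 + timedelta(hours=1), t0 + timedelta(hours=25), 150)
records = [old, new]
```

In this example `old` counts until `new` is measured one hour later, even though its own expiration time has not yet passed.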
Processing model
• Records can overlap in time
  – Cannot simply sum up to identify total consumption
• Must ask for a certain point in time and sum up the most recent, still-valid records
• To get usage per time, you must choose a set of timestamps and repeat the process per timestamp
  – Numerical integration
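The processing model above could be sketched roughly as follows. The `Snapshot` class, its field names and the use of plain seconds for time are assumptions made for illustration, not part of the proposal. The integration step uses a simple left Riemann sum, weighting each sampled usage by the interval to the next timestamp; other quadrature rules would work equally well.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical usage snapshot; times are plain seconds for brevity."""
    resource: str
    measure_time: float
    expiration_time: float
    used_bytes: int


def usage_at(snapshots, t):
    """Sum, over all resources, the most recent snapshot still valid at t."""
    latest = {}
    for s in snapshots:
        if s.measure_time <= t:
            best = latest.get(s.resource)
            if best is None or s.measure_time > best.measure_time:
                latest[s.resource] = s
    # An older snapshot stays invalidated even if the newest one has expired.
    return sum(s.used_bytes for s in latest.values() if t < s.expiration_time)


def integrate_usage(snapshots, timestamps):
    """Left Riemann sum of usage over the chosen timestamps (byte-seconds)."""
    total = 0.0
    for t, t_next in zip(timestamps, timestamps[1:]):
        total += usage_at(snapshots, t) * (t_next - t)
    return total


snaps = [
    Snapshot("share-a", 0, 100, 10),
    Snapshot("share-a", 50, 150, 30),  # overlaps with and supersedes the first
    Snapshot("share-b", 0, 100, 5),
]
point_usage = usage_at(snaps, 60)
total_usage = integrate_usage(snaps, [0, 50, 100])
```

Naively summing all three snapshots would give 45, whereas the point-in-time lookup at t = 60 counts only the newer `share-a` snapshot plus `share-b`.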
$\int_{t_0}^{t_n} \mathrm{consumption}\,dt$
$\sum_{i}^{n} \mathrm{consumption}(t_i)$
Storage accounting in EMI
• From the EMI Description of Work:
  – EMI will address known UR issues and extend accounting records to include VO-aware storage usage accounting
  – The refined and extended standard UR format, service interface and data transport protocol will be implemented in gLite, ARC, dCache and UNICORE
  – EMI will thus contribute to the definition of UR 2.0 of OGF
Storage accounting in OGF
• 41 mails to UR-WG since September
  – Almost all about storage accounting
• “Next Generation Usage Records” OGF session on Tuesday
  – Discussed the way forward for UR
  – A properly defined storage record and an improved compute record are good steps towards UR2.0
  – Lessons learned in storage records can be ported back into compute records (e.g. long-lived services)
  – EMI will work actively to have standard ready before end of EMI
Why a separate storage record?
• There is already an OGF UR 1.0 (GFD.98)
• Only a limited number of properties in common with what is needed in StAR
• UR1 is not readily extendable and is missing VO information
• No draft for service record in UR 2.0 yet
• EMI needs record format now to implement before end of EMI project
• Decided to keep the StAR format close to OGF UR 1.0, but still separate
• Merge with UR 2.0 when available
StAR
• EMI is finalizing a proposal for a storage accounting record (StAR)
• Contacted and received input from organizations and potential user groups
  – NDGF, OGF, EGI, OSG, INFN, CMS, ATLAS, …
• Current draft: StAR-EMI-tech-doc-v7.pdf
• Final version will be proposed to OGF as a standard
• Will be implemented by EMI storage providers
StAR structure
• Resource
  – Fields describing the system the resource was consumed on
  – StorageSystem, StorageShare, StorageMedia, DirectoryPath, …
• Resource consumption
  – ResourceCapacityUsed, LogicalCapacityUsed
• Identity
  – Describes the person or group accountable for the consumption
  – SubjectIdentity block (UserIdentity, …
Local example
<sr:StorageUsageRecord
    xmlns:sr="http://eu-emi.eu/namespaces/2011/02/storagerecord">
  <sr:RecordIdentity sr:createTime="2010-11-09T09:06:52Z"
      sr:recordId="host.example.org/sr/87912469269276"/>
  <sr:StorageSystem>host.example.org</sr:StorageSystem>
  <sr:SubjectIdentity>
    <sr:LocalUser>johndoe</sr:LocalUser>
  </sr:SubjectIdentity>
  <sr:StorageMedia>ssd</sr:StorageMedia>
  <sr:FileCount>42</sr:FileCount>
  …
</sr:StorageUsageRecord>
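As a rough illustration of how a consumer might read a record like the local example, here is a sketch using Python's standard `xml.etree.ElementTree`. It uses only the elements visible in the example; the `parse_record` helper and the keys of the returned dictionary are hypothetical names, and a real StAR record is expected to carry further elements not handled here.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the example record.
NS = {"sr": "http://eu-emi.eu/namespaces/2011/02/storagerecord"}

RECORD = """\
<sr:StorageUsageRecord
    xmlns:sr="http://eu-emi.eu/namespaces/2011/02/storagerecord">
  <sr:RecordIdentity sr:createTime="2010-11-09T09:06:52Z"
      sr:recordId="host.example.org/sr/87912469269276"/>
  <sr:StorageSystem>host.example.org</sr:StorageSystem>
  <sr:SubjectIdentity>
    <sr:LocalUser>johndoe</sr:LocalUser>
  </sr:SubjectIdentity>
  <sr:StorageMedia>ssd</sr:StorageMedia>
  <sr:FileCount>42</sr:FileCount>
</sr:StorageUsageRecord>
"""


def parse_record(xml_text):
    """Extract the fields shown in the example into a plain dictionary."""
    root = ET.fromstring(xml_text)
    ident = root.find("sr:RecordIdentity", NS)
    # Namespaced attributes are stored by ElementTree as "{uri}name".
    prefix = "{%s}" % NS["sr"]
    return {
        "record_id": ident.get(prefix + "recordId"),
        "create_time": ident.get(prefix + "createTime"),
        "storage_system": root.findtext("sr:StorageSystem", namespaces=NS),
        "local_user": root.findtext(
            "sr:SubjectIdentity/sr:LocalUser", namespaces=NS
        ),
        "storage_media": root.findtext("sr:StorageMedia", namespaces=NS),
        "file_count": int(root.findtext("sr:FileCount", namespaces=NS)),
    }


record = parse_record(RECORD)
```

Because the attributes in the example are namespace-qualified (`sr:recordId`), they have to be looked up with the full namespace URI rather than the bare attribute name.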
Grid example
<sr:StorageUsageRecord
    xmlns:sr="http://eu-emi.eu/namespaces/2011/02/storagerecord">
  <sr:RecordIdentity sr:createTime="2010-11-09T09:06:52Z"
      sr:recordId="host.example.org/sr/87912469269276"/>
  <sr:StorageSystem>host.example.org</sr:StorageSystem>
  <sr:SubjectIdentity>
    <sr:Group>binarydataproject.example.org</sr:Group>
    <sr:GroupAttribute sr:attributeType="subgroup">ukusers</sr:GroupAttribute>
  </sr:SubjectIdentity>
  <sr:StorageMedia>…</sr:StorageMedia>
  …
</sr:StorageUsageRecord>
Way forward
• Definition of StAR is a first step to establish a common storage accounting record
• A step towards a common usage record (UR2.0)
• Next jump (two steps in one go)
  – Propose StAR to OGF to start standardization process
    • Will take time
    • Will include further discussions
    • May lead to changes in StAR
    • EMI will take active part in this process
    • Draft has been made available through UR-WG mailing list
    • Will be made available in SourceForge
  – Implement StAR record into EMI storage middleware
    • EMI ends in two years
    • Want to gain user experience before end of EMI
• Comments and input more than welcome throughout the process!
THANK YOU!
Contact: emi-jra1-data-sar@eu-emi.eu
Web page: http://twiki.cern.ch/twiki/bin/view/EMI/StorageAccounting
Current StAR draft:
http://twiki.cern.ch/twiki/pub/EMI/StorageAccounting/StAR-EMI-tech-doc-v7.pdf
http://twiki.cern.ch/twiki/pub/EMI/StorageAccounting/StAR-EMI-tech-doc-v7.doc