lug11 zfs on linux for lustre
TRANSCRIPT
-
8/13/2019 LUG11 ZFS on Linux for Lustre
Lawrence Livermore National Laboratory
Brian Behlendorf
Lawrence Livermore National Laboratory, P. O. Box 808, Livermore, CA 94551
This work performed under the auspices of the U.S. Department of Energy by
Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344
ZFS on Linux for Lustre
LUG11, April 13, 2011
LLNL-PRES-479831
-
ZFS/Lustre History
2007 Livermore raises ldiskfs scalability/performance concerns
Fsck, filesystem size, random IO, data integrity, etc.
Alternate backend is needed for large Lustre filesystems
ZFS identified as technically the best solution
Addresses all known ldiskfs limitations
Proven production quality implementation
Licensing concerns can be addressed
Must be ported to Linux
CFS/Sun start ZFS/Lustre user space implementation
-
ZFS/Lustre History
2008 Livermore starts porting ZFS to the kernel
Intended to determine viability of a kernel port
No unsurmountable technical issues discovered
Initial performance results are encouraging
Sun Lustre-osd development
Shift in strategy, the Livermore kernel port is adopted
Brian joins the Sun Lustre-osd development team
Continued Lustre-osd development
Licensing concerns unresolved, work continues
-
ZFS/Lustre History
2009 Livermore ZFS development
Focus on a production quality ZFS port
Built quarter scale prototype ZFS/Lustre filesystem
Sun/Oracle Lustre-osd development
Oracle acquires Sun
Lustre-osd development continues unchanged
Zero-copy, grants, large dnodes, quotas, utilities, etc.
Licensing concerns unresolved, work continues
-
ZFS/Lustre History
2010 Livermore ZFS development
Linux integration (utilities, udev, zevents, disk failures)
Built a full scale ZFS/Lustre filesystem
Oracle Lustre-osd development
Announced ZFS/Lustre only available for Solaris
Lustre-osd development continues on Linux
Oracle cancels Lustre, progress is delayed
Licensing concerns unresolved, work continues at LLNL
-
ZFS/Lustre History
2011 Livermore ZFS development
ZFS Posix Layer (ZPL) added
Lustre-osd development branch publicly available
Whamcloud Lustre-osd development
Contracted by Livermore to complete Lustre-osd
Most of the original Lustre-osd developers are at Whamcloud
Licensing concerns unresolved, work continues
Late 2011
Livermore plans a ZFS/Lustre filesystem for Sequoia
50 PB capacity, 512 GB/s - 1 TB/s bandwidth
-
ZFS Overview
Developed by Sun (now Oracle) on Solaris
Combined filesystem, logical volume manager, RAID
Copy-on-write
Built-in data integrity
Intelligent online scrubbing and resilvering
Very large filesystem limits
-
LLNL's Reasons for porting ZFS
Lustre servers currently use ext4 (ldiskfs)
Random writes bound by disk IOPS rate, not disk bandwidth
OST size limits
fsck time is unacceptable
Expensive hardware required to make disks reliable
Late 2011 requirement:
50 PB, 512 GB/s - 1 TB/s
At a price we can afford
COW sequentializes random writes
No longer bound by drive IOPS
Single volume size limit of 16 EiB
Zero fsck time
Online data integrity and error handling
Expensive RAID controllers are unnecessary
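The point that COW sequentializes random writes can be illustrated with a toy model. This is only a sketch: `ToyCowDevice` and its fields are hypothetical names made up for the illustration, and it does not reflect how the ZFS allocator actually works. It shows only that when every write goes to freshly allocated space instead of overwriting in place, randomly ordered logical block updates reach the disk at consecutive physical addresses.

```python
import random


class ToyCowDevice:
    """Toy copy-on-write block map (hypothetical, not the ZFS allocator)."""

    def __init__(self):
        self.next_free = 0      # next unallocated physical block
        self.block_map = {}     # logical block -> current physical block
        self.write_log = []     # physical blocks in the order they were written

    def write(self, logical_block):
        # Never overwrite in place: allocate the next free physical block
        # and repoint the logical block at it.
        physical = self.next_free
        self.next_free += 1
        self.block_map[logical_block] = physical
        self.write_log.append(physical)
        return physical


def is_sequential(addresses):
    """True if each address is exactly one past the previous one."""
    return all(b == a + 1 for a, b in zip(addresses, addresses[1:]))


rng = random.Random(0)
logical_writes = [rng.randrange(1_000_000) for _ in range(1000)]

dev = ToyCowDevice()
for lb in logical_writes:
    dev.write(lb)

print(is_sequential(logical_writes))   # order of the logical block numbers
print(is_sequential(dev.write_log))    # order of the physical addresses written
```

In this model the disk sees one streaming write regardless of the logical access pattern, which is why throughput would track drive bandwidth rather than drive IOPS.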
-
Licensing Concerns
[Diagram: Linux Kernel (GPL), SPL (GPL), ZFS (CDDL), Lustre (GPL)]
CDDL: Common Development and Distribution License
GPL: GNU General Public License
-
Licensing Concerns
Distributing Source
CDDL is an open source license
CDDL provides an explicit patent license
ZFS changes contributed as CDDL code
ZFS sources kept separate from all GPL code
Distributing Binaries
Linux kernel allows non-GPL third party modules
Nvidia, ATI, etc.
Linus views the kernel module interface as LGPL
ZFS uses no GPL-only symbols
Included headers do not make a derived work
-
Licensing Concerns
ZFS is NOT a derived work of Linux
"It would be rather preposterous to call the Andrew FileSystem a 'derived
work' of Linux, for example, so I think it's perfectly OK to have a AFS
module, for example."
Linus Torvalds
"Our view is that just using structure definitions, typedefs, enumeration
constants, macros with simple bodies, etc., is NOT enough to make a
derivative work. It would take a substantial amount of code (coming from
inline functions or macros with substantial bodies) to do that."
Richard Stallman (The FSF's view)
-
Solaris Porting Layer
Linux/ZFS Glue
[Diagram: Linux Kernel, SPL, ZFS]
-
Ported by LLNL
[Diagram: ZFS/Lustre software stack]
User: ZFS CLI, libzfs
Kernel:
Interface Layer: ZPL, ZVOL, /dev/zfs
Lustre: OST, OFD, ZFS OSD, MDT, MDD
Transactional Object Layer: DMU, ZAP, ZIL, DSL, Traversal
Pooled Storage Layer: ARC, ZIO, VDEV, Configuration
-
CFS + Sun + Oracle + Whamcloud
[Diagram: ZFS/Lustre software stack]
User: ZFS CLI, libzfs
Kernel:
Interface Layer: ZPL, ZVOL, /dev/zfs
Lustre: OST, OFD, ZFS OSD, MDT, MDD
Transactional Object Layer: DMU, ZAP, ZIL, DSL, Traversal
Pooled Storage Layer: ARC, ZIO, VDEV, Configuration
-
ZFS/Lustre Prototype (Zeno)
-
OSS SSU (Zeno)
896 TB / SSU
25.6 GB/s
70 2TB Disks / Host
7 8+2 RaidZ2 groups, 1 112 TB OST / Host
Component    Bandwidth
QDR IB       25.6 GB/s
Host SAS     60 GB/s
JBOD SAS     60 GB/s
Disk         56.0 GB/s
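The per-host OST size follows from the RAID-Z2 layout: each 8+2 group contributes 8 data disks, so 7 groups of 2 TB disks give 7 x 8 x 2 = 112 TB from 70 disks. A minimal sketch of that arithmetic follows. The function name and parameters are hypothetical, and it ignores filesystem metadata overhead and TB/TiB rounding.

```python
def raidz2_usable_tb(groups, data_disks, parity_disks, disk_tb):
    """Return (total_disks, usable_tb) for a host built from RAID-Z2 groups.

    Usable capacity counts only the data disks in each group; parity
    disks add to the disk count but not to capacity.
    """
    total_disks = groups * (data_disks + parity_disks)
    usable_tb = groups * data_disks * disk_tb
    return total_disks, usable_tb


# 7 groups of 8+2 with 2 TB disks, as on this slide
print(raidz2_usable_tb(groups=7, data_disks=8, parity_disks=2, disk_tb=2))
# (70, 112)
```

The same function applied to 5 groups of 8+2 gives 50 disks and 80 TB per host.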
-
OSS SSU (Zeno3)
960 TB / SSU
38.4 GB/s
50 2TB Disks / Host
5 8+2 RaidZ2 groups, 1 80TB OST / Host
Component    Bandwidth
QDR IB       38.4 GB/s
Host SAS     38.4 GB/s
JBOD SAS     60 GB/s
Disk         60.0 GB/s
-
ZFS Performance Comparison
[Bar chart: Write and Read bandwidth in GiB/s (axis 0 to 30) for Current SSU Lustre IOR, Current SSU Hardware Limit, ZFS SSU Lustre IOR, ZFS SSU ZPIOS, ZFS SSU Hardware Limit]
Same number of drives
SATA vs SAS disk
RAIDZ2 vs RAID6
Write Performance is Limited by the ZFS Port
Read Performance is Limited by Lustre/CPU
ZFS is unoptimized, this can all be improved!
-
Single Node Write Performance
Write performance is consistent with Lustre
Lustre workload
Random 1MiB I/Os
128 thrs to
-
Single Node Read Performance
Read performance is significantly better than Lustre
Lustre Workload
Random 1MiB I/Os
128 thrs to
-
More Information
ZFS & SPL
http://zfsonlinux.org
Mailing Lists
zfs-announce@zfsonlinux.org
zfs-discuss@zfsonlinux.org
zfs-devel@zfsonlinux.org
Download software
Documentation
Lustre support for ZFS
http://zfsonlinux.org/lustre.html
Licenses
CDDL http://hub.opensolaris.org/bin/view/Main/licensing_faq
GPLv2 http://www.gnu.org/licenses/gpl-2.0.html
Linus http://linuxmafia.com/faq/Kernel/proprietary-kernel-modules.html
RMS http://lkml.indiana.edu/hypermail/linux/kernel/0301.1/0362.html