Oracle Solaris ZFS

Upload: ashwin-warghade

Post on 16-Jul-2015


Oracle Solaris ZFS
Satyajit Tripathi

Objectives
To understand what makes ZFS unique
To know what facilities are provided by ZFS
To learn how to do ZFS administration

Agenda
Introduction to ZFS
  ZFS Setup
  ZFS Components
  ZFS Storage Pool and File System
  ZFS Properties
General Architecture
ZFS Features Simplifying Deployment
ZFS Administration
  Web-Based Management UI
ZFS Limitations
Best Practices

Introduction to ZFS
First-of-its-kind 128-bit file system
Name derives from Zettabyte File System, or simply ZFS
Storage capacity of 256 quadrillion zettabytes
Directories with up to 256 trillion entries
No limits on the number of file systems or files
Dynamic metadata allocation, e.g. no inode pre-allocation
Data integrity management using 256-bit checksums
It is a:
  revolution over traditional file systems
  fundamentally new approach to data and volume management
  transactional file system with self-healing capabilities
  design for robustness, scalability, and easy administration
  architecture with a storage pool of heterogeneous devices
ZFS is the default root file system in Oracle Solaris 11
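
The capacity figures on this slide can be sanity-checked with a little arithmetic. The sketch below assumes binary prefixes (a zettabyte read as 2**70 bytes, "quadrillion" as 2**50, "trillion" as 2**40); the constant and function names are illustrative and not part of any ZFS tool.

```python
# Back-of-the-envelope check of the slide's capacity figures.
# Assumption: binary prefixes are intended throughout.

ADDRESSABLE_BYTES = 2 ** 128     # what a 128-bit file system can address
ZETTABYTE = 2 ** 70              # binary zettabyte (zebibyte)
BINARY_QUADRILLION = 2 ** 50     # "quadrillion" read as 2**50
BINARY_TRILLION = 2 ** 40        # "trillion" read as 2**40


def capacity_in_quadrillion_zettabytes() -> int:
    """Express 2**128 bytes in units of one quadrillion zettabytes."""
    return ADDRESSABLE_BYTES // (ZETTABYTE * BINARY_QUADRILLION)


def directory_entry_limit() -> int:
    """256 trillion directory entries, written out as an integer."""
    return 256 * BINARY_TRILLION
```

Under these assumptions the first function returns 256, and the directory limit equals 2**48.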

Setup Requirement
A SPARC or x86/x64 machine with Solaris 10 6/06 or newer
Minimum disk size of 128 MB for the ZFS environment
Minimum disk space of 64 MB required for a storage pool
Recommended memory of at least 1 GB

ZFS Components
ZFS components comprise virtual devices such as:
  Whole disk (recommended), disk slice, or file
Components should follow these naming conventions:
  Empty components are not permitted
  Names can contain only alphanumeric characters, plus:
    Underscore (_), hyphen (-), colon (:), period (.)
Pool names should begin with a letter, with these restrictions:
  The beginning sequence c[0-9] is not allowed
  The names log and cache are reserved
  Names beginning with mirror, raidz, and spare are not allowed
  Names beginning with a percent symbol (%) are not allowed
Dataset names must begin with an alphanumeric character
Dataset names must not contain a percent symbol (%)
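
The pool naming rules listed on the ZFS Components slide can be expressed as a small checker. This is a minimal sketch of those rules as stated on the slide, not the validation that zpool itself performs; the function name and messages are hypothetical.

```python
import re
import string
from typing import List

RESERVED_POOL_NAMES = {"log", "cache"}
FORBIDDEN_PREFIXES = ("mirror", "raidz", "spare")


def pool_name_problems(name: str) -> List[str]:
    """Return the slide's naming rules that a candidate pool name violates."""
    if not name:
        return ["empty names are not permitted"]
    problems = []
    # Alphanumerics plus underscore, hyphen, colon, and period only.
    if not re.fullmatch(r"[A-Za-z0-9_.:-]+", name):
        problems.append("only alphanumerics and _ - : . are allowed")
    if name[0] not in string.ascii_letters:
        problems.append("must begin with a letter")
    if re.match(r"c[0-9]", name):
        problems.append("must not begin with c[0-9]")
    if name in RESERVED_POOL_NAMES:
        problems.append("log and cache are reserved")
    if name.startswith(FORBIDDEN_PREFIXES):
        problems.append("must not begin with mirror, raidz, or spare")
    return problems
```

An empty list means the name passes every rule on the slide; for example, poolN passes while c0pool and mirror1 do not.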

Storage Pool and ZFS File System
zpool is constructed using virtual devices
zpool can be configured as:
  Non-redundant (similar to traditional RAID-0)
  Mirrored (similar to traditional RAID-1)
  RAID-Z (similar to RAID-5, or single parity using 3 devices)
  RAID-Z2 (group of 4 or more devices)
  RAID-Z3 (featuring triple parity)
zpool may additionally consist of:
  Hot spares for failing disks
  Read cache devices (L2ARC)
  Write cache devices (ZIL)
zpool can contain heterogeneous storage devices, with no limitation
zpool can be dynamically expanded without reconfiguration
Multiple ZFS file systems or datasets can be created in a zpool
File system properties quota and reservation can be set
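
To compare the pool configurations above, a rough capacity estimate can help. The sketch below is a simplified model that assumes a single vdev and ignores metadata, padding, and reserved space, so real pools report less; the layout labels and function name are illustrative only.

```python
from typing import Sequence

# Number of devices' worth of space consumed by parity in each RAID-Z level.
PARITY_DEVICES = {"raidz": 1, "raidz2": 2, "raidz3": 3}


def nominal_usable_gb(layout: str, device_sizes_gb: Sequence[int]) -> int:
    """Estimate usable space for one vdev of the given layout (rough model)."""
    if not device_sizes_gb:
        raise ValueError("at least one device is required")
    smallest = min(device_sizes_gb)
    if layout == "stripe":
        # Non-redundant: every device contributes its full size.
        return sum(device_sizes_gb)
    if layout == "mirror":
        # Every device holds the same data, limited by the smallest one.
        return smallest
    if layout in PARITY_DEVICES:
        parity = PARITY_DEVICES[layout]
        if len(device_sizes_gb) <= parity:
            raise ValueError("not enough devices for this parity level")
        return (len(device_sizes_gb) - parity) * smallest
    raise ValueError(f"unknown layout: {layout}")
```

With three 100 GB devices this model gives 300 GB for a stripe and 200 GB for RAID-Z; a two-way mirror of 100 GB devices gives 100 GB.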

ZFS Properties
ZFS properties are of two types: native and user-defined
Native properties control file system behavior; user-defined properties do not
Native properties can be read-only or settable
Many settable properties are inherited from the parent
All settable properties have an associated source, one of:
  default (neither set locally nor inherited)
  local (explicitly set on the dataset)
  inherited from (specifies the source dataset)
Native read-only properties comprise:
  available compressratio creation mounted origin type used
Settable properties include (see Appendix-I for the complete list):
  aclinherit aclmode canmount checksum compression dedup devices encryption mountpoint quota readonly recordsize reservation setuid sharenfs snapdir volsize zoned
User properties can be specified as module:property=value
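
The three property sources (local, inherited from, default) follow a simple lookup order. The sketch below models that order with a hypothetical Dataset class; the default values in DEFAULTS are illustrative assumptions, not a statement of what a given Solaris release ships with.

```python
from typing import Dict, Optional, Tuple

# Illustrative defaults only; real defaults depend on the release.
DEFAULTS: Dict[str, str] = {"compression": "off", "readonly": "off"}


class Dataset:
    """Toy dataset that resolves a property value and reports its source."""

    def __init__(self, name: str, parent: Optional["Dataset"] = None) -> None:
        self.name = name
        self.parent = parent
        self.local: Dict[str, str] = {}

    def set(self, prop: str, value: str) -> None:
        self.local[prop] = value

    def get(self, prop: str) -> Tuple[str, str]:
        # 1. A value set on this dataset wins.
        if prop in self.local:
            return self.local[prop], "local"
        # 2. Otherwise take the nearest ancestor that set it.
        ancestor = self.parent
        while ancestor is not None:
            if prop in ancestor.local:
                return ancestor.local[prop], f"inherited from {ancestor.name}"
            ancestor = ancestor.parent
        # 3. Otherwise fall back to the built-in default.
        return DEFAULTS[prop], "default"
```

Setting compression on a parent such as poolN changes the source reported for poolN/zfsN from default to inherited from poolN, until the child sets its own value.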

General Architecture
[Architecture diagram, top to bottom]
User: File System, Device, GUI, Application, JNI, libzfs
Kernel, reached through /dev/zfs:
  Interface: ZFS POSIX Layer, ZFS Volume
  Transactional Objects: ZFS Attribute Processor, Traversal, Dataset and Snapshot Layer, ZFS Intent Log, Data Management Unit
  Pooled Storage: Adaptive Replacement Cache (ARC), ZFS I/O Pipeline, Virtual Device, Configuration
  Layered Driver Interface (LDI)


ZFS Features Simplifying Deployment
Root Pool File System
Boot Environment Enhancement
Delegation in a Zone
Deduplication, Compression, Encryption
Snapshot
Clone
ZFS Application Interface

Root Pool File System
To boot from a ZFS file system, specify:
  A boot device identified as a storage pool, and a ZFS root file system within the pool
Install the ZFS root file system using:
  Automated Installer on a SPARC or x86 based system
  Live CD on an x86 based system
Recommendations for the ZFS root file system:
  Memory capacity: 1 GB
  Disk capacity: 13 GB
    Swap (sized from physical memory) and dump devices in the root pool
    Boot environment size of 4-6 GB
    Solaris OS components residing in the root file system
Requirements for storage pool configuration:
  Disk should be labelled SMI and should be < 2 TB
  Disk must contain a Solaris fdisk partition (x86 system)
  Configure the root pool mirror only after the root pool is installed
  Enable compression only after the root pool is installed
  Do not rename the root pool after initial installation

Boot Environment Enhancement
Boot Environment (BE) enhanced:
  Managed using beadm(1M)
  Deprecated Live Upgrade lu* commands replaced
Auto install (new feature) facilitates:
  Mirrored ZFS root with the boot block applied automatically
  Creation of swap and dump devices on the root pool
  Boot support for hot spares
IPS facilitates BE update using the latest build:
  No need to apply individual patches
Use pkg image-update or pkg update:
  Just boot using the new boot environment
No need to create a separate boot environment:
  pkg update creates a new BE automatically
Send stream to manage ZFS root properties

Oracle Solaris ZFS Boot Environment


Delegation in a Zone
For a zone, use zonecfg add device to add a ZFS volume
In a zone, creating or modifying a zpool is not allowed
In a zone, a privileged user can modify ZFS properties, except:
  sharenfs zoned quota reservation
In the global zone, the administrator can modify ZFS properties, except:
  sharenfs mountpoint
Delegation of a dataset to a zone:
  Property zoned must be specifically marked
On first boot of a zone containing a ZFS dataset:
  Property zoned (boolean) is automatically turned on
A dataset cannot be mounted or shared in the global zone if:
  Dataset property zoned=on
Removing a dataset from a zone does not reset zoned to off

Deduplication, Compression, Encryption
Enable property dedup on a ZFS file system to:
  Synchronously remove redundant data blocks
  Property scope is the entire zpool
  Unique data is stored on shared common components
  Use the dedup ratio to estimate space saving possibilities
Enable property compression on a ZFS file system to:
  Transparently compress the data blocks
  Property scope is the individual ZFS file system
  Retrieve the compression ratio using the zfs utility
Enable property encryption on a ZFS file system to:
  Encrypt data before storing it in the ZFS file system
  Property scope is the file system, inherited by descendants
  File system owner's key is required to access encoded data
  Wrapping key encrypts the data encryption key
    Stored in a file (as raw or hex) or derived from the passphrase
  Encryption policy is inherited by the descendant file system
    Policy of the inherited file system cannot be modified
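
The idea behind the dedup ratio can be shown with a toy calculation: split data into fixed-size blocks, fingerprint each block, and compare the block count with the number of distinct fingerprints. This is a simplified model, not how ZFS computes its ratio internally; the 4096-byte block size and the function name are assumptions made for the example.

```python
import hashlib


def dedup_ratio(data: bytes, block_size: int = 4096) -> float:
    """Ratio of total blocks to unique blocks for fixed-size blocks."""
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    if not blocks:
        return 1.0
    # Identical blocks share a SHA-256 digest, so the set keeps one per content.
    unique = {hashlib.sha256(block).digest() for block in blocks}
    return len(blocks) / len(unique)
```

Three identical blocks followed by one different block give a ratio of 2.0, meaning the data would occupy roughly half the space once duplicates are stored only once.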

Snapshot
Use zfs snapshot fs@snapN to create a snapshot:
  Takes only one argument, i.e. the snapshot name fs@snapN
  Instantly creates a read-only snapshot of fs
  fs@snapN is stored on the same zpool
  Initially the space is shared by fs and the snapshot
  fs@snapN initially consumes no additional disk space
  fs@snapN grows in size as the active dataset fs changes
  Provides persistence across system reboots
  Directory .zfs/snapshot lists all snapshots
  Use command zfs list -t snapshot
  Theoretical maximum N = 2^64
Use zfs rollback -r fs@snapN to roll back to a snapshot:
  By default, rollback is to the most recent snapshot
  To roll back to N, intermediate snapshots must be destroyed
  To roll back, the file system must be unmounted and remounted
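
The snapshot space behaviour described above (no extra space at first, growth as the live file system changes, and rollback destroying later snapshots) can be modelled with a toy block map. This is a conceptual sketch only; the ToyFileSystem class and its methods are hypothetical and bear no relation to the on-disk format.

```python
from typing import Dict


class ToyFileSystem:
    """Block map with snapshots that share unchanged blocks with the live data."""

    def __init__(self) -> None:
        self.live: Dict[int, bytes] = {}
        self.snapshots: Dict[str, Dict[int, bytes]] = {}

    def write(self, block_no: int, content: bytes) -> None:
        # New content replaces the live reference; snapshots keep the old one.
        self.live[block_no] = content

    def snapshot(self, name: str) -> None:
        # Copies the block map (references), not the block contents.
        self.snapshots[name] = dict(self.live)

    def snapshot_unique_bytes(self, name: str) -> int:
        # Space held only by the snapshot: blocks the live data no longer shares.
        snap = self.snapshots[name]
        return sum(len(c) for n, c in snap.items() if self.live.get(n) != c)

    def rollback(self, name: str) -> None:
        # Like rollback -r: snapshots newer than the target are destroyed.
        names = list(self.snapshots)  # insertion order is creation order
        for later in names[names.index(name) + 1:]:
            del self.snapshots[later]
        self.live = dict(self.snapshots[name])
```

Right after snapshot() the unique size is zero; it grows only when a block that the snapshot references is overwritten in the live data.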

Clone
Use zfs clone pool/fs1@snapN pool/fs2:
  Takes two arguments, the snapshot name and the new file system
  Created using a snapshot only
  Results in a new file system with the contents of the original file system
  The new file system is writable
  Snapshot cannot be deleted while a clone exists
Use zfs send or recv to save or replicate a ZFS file system:
  Creates a stream representation of a snapshot to transfer
  Incremental changes between snapshots can be saved
    Individual file restoration is not possible
    Entire file system must be restored
  Receive a full stream to recreate the entire file system
Different property values of ZFS snapshot streams:
  Receive a stream with a property value different from the one sent
  Specify at receive to use the original property
  Specify at receive to disable a specific file system property
Use zfs send -I and -R, or recv -F, for complex streams

Delegated Administration
Refined permissions for a specific user, group, or everyone
Delegated permissions supported by ZFS are of two types:
  Individual permissions
    zfs allow satya create,destroy,mount,snapshot zfsN
  Permission sets
    zfs allow mystaff @myset zfsN
Advantages of delegated administration:
  Permissions follow the zpool when it is migrated
  Control over permission propagation or dynamic inheritance
  Newly created file systems can automatically pick up permissions
  Ability to create snapshots over NFS
Delegation can be disabled through the delegation property of the ZFS pool:
  By default, zpool property delegation=on

New ACL Model
ZFS provides a pure ACL model, and all files have an associated ACL
The new Solaris ACL model is based on the NFSv4 specification
Set and view ACLs using chmod and ls, not setfacl and getfacl
An ACL comprises multiple Access Control Entries (ACEs)
ACLs are fine-grained compared to standard file permissions
Use ACL-aware cp mv tar cpio rcp to transfer UFS files to ZFS:
  Translates POSIX-draft based ACLs to equivalent NFSv4 ACLs
Use ufsrestore on ZFS to restore, unlike tar cpio (UFS)
By default, ACLs are not inherited unless a flag is specified:
  file_inherit inherit_only dir_inherit no_propagate
Set ZFS property aclinherit to restricted (default) or:
  discard, noallow, passthrough, passthrough-x
Set ZFS property aclmode to groupmask (default) or:
  discard, passthrough


Administration
ZFS supports both CLI and web-based administration
Use CLI command to create ZFS pool poolN:
  zpool create poolN c0t0d0 c0t1d0 c0t1d2
Use command to create zpool with disk mirroring:
  zpool create poolN mirror c0t0d0 c0t1d0
Use command to define log or cache devices in poolN:
  zpool create poolN c0t0d0 log c0t1d0 cache c0t1d2
Use command to create a file system in poolN:
  zfs create poolN/zfsN
Use command to add devices to poolN:
  zpool add poolN c1t1d1
Use command to set file system property:
  zfs set compression=on poolN/zfsN
Use command to get property value:
  zfs get compressratio
Use command to remove a file system:
  zfs destroy poolN/zfsN
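
When the zpool create variants above are scripted, it can help to assemble the argument list in one place. The helper below is a hypothetical sketch that only builds the argument list in the same order as the slide's examples; it does not run anything, and executing the result requires a system with ZFS and suitable privileges.

```python
from typing import List, Sequence


def zpool_create_args(
    pool: str,
    devices: Sequence[str],
    mirror: bool = False,
    log: Sequence[str] = (),
    cache: Sequence[str] = (),
) -> List[str]:
    """Build the argv for a zpool create call without executing it."""
    if not devices:
        raise ValueError("at least one device is required")
    args = ["zpool", "create", pool]
    if mirror:
        if len(devices) < 2:
            raise ValueError("a mirror needs at least two devices")
        args.append("mirror")
    args.extend(devices)
    if log:
        args += ["log", *log]      # separate log devices
    if cache:
        args += ["cache", *cache]  # cache devices
    return args
```

Joining the returned list with spaces reproduces the command lines shown on the slide, for example zpool create poolN mirror c0t0d0 c0t1d0.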

Web-Based Management
Use https://host:6789/zfs for ZFS administration:
  Create new storage pool
  Add capacity to existing pool
  Export zpool to another system
  Import zpool from another system
  View and monitor storage pools
  Create new file system
  Create volume configuration
  Take snapshots
  Rollback using snapshot
To start the web console server:
  Use /usr/sbin/smcwebserver start or enable

ZFS Limitations
It is not possible to reduce the number of top-level vdevs in a zpool
It is not possible to add a disk as a column to a RAID-Z vdev
Virtual devices cannot be nested in a zpool:
  Mirror or RAID-Z top-level vdevs can only contain files or disks
ZFS cannot provide concurrent access from multiple hosts
ZFS expects a disk cache flush command to commit data to media
ZFS fragmentation can impact sequential read performance:
  Block Pointer Rewrite functionality will eliminate the fragmentation issue
ZFS can only detect and report, but not repair, silent data corruption errors:
  Unless copies=N (where N>1) is explicitly specified
ZFS RAID resilvering may take a long time
ZFS does not support TRIM, which is used with SSDs
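
The difference between detecting and repairing silent corruption, and why copies=N with N>1 matters, can be illustrated with a toy checksummed block. This is a conceptual sketch, not ZFS's implementation; the ChecksummedBlock class is hypothetical, and SHA-256 is used here simply as a 256-bit checksum.

```python
import hashlib
from typing import List


class ChecksummedBlock:
    """Stores N copies of a block plus one checksum used to verify reads."""

    def __init__(self, data: bytes, copies: int = 1) -> None:
        self.checksum = hashlib.sha256(data).digest()
        self.copies: List[bytes] = [data for _ in range(copies)]

    def _is_valid(self, candidate: bytes) -> bool:
        return hashlib.sha256(candidate).digest() == self.checksum

    def read(self) -> bytes:
        good = next((c for c in self.copies if self._is_valid(c)), None)
        if good is None:
            # Every copy fails verification: detected, but nothing to repair from.
            raise IOError("checksum mismatch on every copy")
        # Self-heal: overwrite any damaged copy with the verified one.
        self.copies = [c if self._is_valid(c) else good for c in self.copies]
        return good
```

With a single copy, a corrupted block can only raise an error on read; with two copies, the read returns the intact data and rewrites the damaged copy.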

Best Practices
Create zpool using whole disks instead of disk slices (label EFI):
  Provides file system safety by automatically enabling the write cache
For a root pool, use a disk slice instead of the whole disk (label SMI):
  Allocate the entire disk capacity to slice 0
Create zpool with several groups of vdevs instead of a single large vdev:
  Improves IOPS performance
Keep vdevs belonging to one zpool of similar size:
  Reads get skewed to the larger vdev as the zpool fills up, which hurts performance
Do not create a zpool that contains components from another zpool
RAID-Z is not recommended for random reads, e.g. databases
Variable covariance between random and sequential reads:
  Sequential reads of fragmented files adversely impact random reads
Match ZFS record size to the database block size for OLTP workloads
Keep pool space under 80% utilization to maintain performance
A mirrored pool or hardware RAID is preferred over RAID-Z

Appendix-I
Use zfs set for ZFS Properties in Oracle Solaris 11 Express
PROPERTY              EDIT  INHERIT  VALUES
available             NO    NO
compressratio         NO    NO
creation              NO    NO
defer_destroy         NO    NO       yes | no
keystatus             NO    NO       undefined | unavailable | available
mounted               NO    NO       yes | no
origin                NO    NO
referenced            NO    NO
rekeydate             NO    NO
type                  NO    NO       filesystem | volume | snapshot
used                  NO    NO
usedbychildren        NO    NO
usedbydataset         NO    NO
usedbyrefreservation  NO    NO
usedbysnapshots       NO    NO
userrefs              NO    NO
aclinherit            YES   YES      discard | noallow | restricted | passthrough | passthrough-x
atime                 YES   YES      on | off
canmount              YES   NO       on | off | noauto
casesensitivity       NO    YES      sensitive | insensitive | mixed
checksum              YES   YES      on | off | fletcher2 | fletcher4 | sha256
compression           YES   YES      on | off | lzjb | gzip | gzip-[1-9] | zle
copies                YES   YES      1 | 2 | 3
dedup                 YES   YES      on | off | verify | sha256[,verify]
devices               YES   YES      on | off
encryption            NO    YES      on | off | aes-128-ccm | aes-192-ccm | aes-256-ccm | aes-128-gcm | aes-192-gcm | aes-256-gcm
exec                  YES   YES      on | off
keysource             YES   YES      raw | hex | passphrase,prompt | file://
logbias               YES   YES      latency | throughput
mlslabel              YES   YES
mountpoint            YES   YES      | legacy | none
nbmand                YES   YES      on | off
normalization         NO    YES      none | formC | formD | formKC | formKD
primarycache          YES   YES      all | none | metadata
quota                 YES   NO       | none
readonly              YES   YES      on | off
recordsize            YES   YES      512 to 128k, power of 2
refquota              YES   NO       | none
refreservation        YES   NO       | none
reservation           YES   NO       | none
rstchown              YES   YES      on | off
secondarycache        YES   YES      all | none | metadata
setuid                YES   YES      on | off
sharenfs              YES   YES      on | off | share(1M) options
sharesmb              YES   YES      on | off | sharemgr(1M) options
snapdir               YES   YES      hidden | visible
sync                  YES   YES      standard | always | disabled
utf8only              NO    YES      on | off
version               YES   NO       1 | 2 | 3 | 4 | current
volblocksize          NO    YES      512 to 128k, power of 2
volsize               YES   NO
vscan                 YES   YES      on | off
xattr                 YES   YES      on | off
zoned                 YES   YES      on | off
userused@...          NO    NO
groupused@...         NO    NO
userquota@...         YES   NO       | none
groupquota@...        YES   NO       | none


Oracle Solaris 11 Express ZFS