© Copyright IBM Corporation, 2012
Linux I/O performance
An end-to-end methodology for maximizing Linux I/O performance on IBM System x servers in a typical SAN environment.
David Quenzler, IBM Systems and Technology Group, ISV Enablement
June 2012
Linux I/O Performance
Table of contents

Abstract
Introduction
External storage subsystem - XIV
External SAN switches
    Bottleneck monitoring
    Fabric parameters
    Basic port configuration
    Advanced port configuration
Host adapter placement rules
System BIOS settings
HBA BIOS settings
Linux kernel parameters
Linux memory settings
    Page size
    Transparent huge pages
Linux module settings - qla2xxx
Linux SCSI subsystem tuning - /sys
Linux XFS file system create options
Linux XFS file system mount options
Red Hat tuned
    ktune.sh
    ktune.sysconfig
    sysctl.ktune
    tuned.conf
Linux multipath
Sample scripts
Summary
Resources
About the author
Trademarks and special notices
Abstract
This white paper discusses an end-to-end approach for Linux I/O tuning in a typical data center environment consisting of external storage subsystems, storage area network (SAN) switches, IBM System x Intel servers, Fibre Channel host bus adapters (HBAs) and 64-bit Red Hat Enterprise Linux.
Anyone with an interest in I/O tuning is welcome to read this white paper.
Introduction
Linux® I/O tuning is complex. In a typical environment, I/O makes several transitions from the client application out to disk and vice versa. There are many pieces to the puzzle.
We will examine the following topics in detail:
External storage subsystems
External SAN switches
Host adapter placement rules
System BIOS settings
Adapter BIOS settings
Linux kernel parameters
Linux memory settings
Linux module settings
Linux SCSI subsystem settings
Linux file system create options
Linux file system mount options
Red Hat tuned
Linux multipath
You should follow an end-to-end tuning methodology in order to minimize the risk of poor tuning.
Recommendations in this white paper are based on the following environment under test:
IBM® System x® 3850 (64 processors and 640 GB RAM)
Red Hat Enterprise Linux 6.1 x86_64
The Linux XFS file system
IBM XIV® external storage subsystem, Fibre Channel (FC) attached
An architecture comprising IBM hardware and Red Hat Linux provides a solid framework for maximizing
I/O performance.
External storage subsystem - XIV
The XIV has few manual tunables. Here are a few tips:
Familiarize yourself with the XIV command-line interface (XCLI) as documented in the IBM XIV Storage System User Manual.
Ensure that you connect the XIV system to your environment in the FC fully redundant configuration as documented in the XIV Storage System: Host Attachment and Interoperability guide from IBM Redbooks®.
Figure 1: FC fully redundant configuration
Although you can define up to 12 paths per host, a maximum of six paths per host provides sufficient
redundancy and performance.
Useful XCLI commands:
# module_list -t all
# module_list -x
# fc_port_list
The XIV storage subsystem contains six FC data modules (4 to 9), each with 8 GB memory. The FC rate
is 4 Gbps and the data partition size is 1 MB.
Check the host HBA queue depth setting: the higher the host HBA queue depth, the more parallel I/O goes to the XIV system, but each XIV port can sustain only up to 1400 concurrent I/Os to the same target or logical unit (LUN). Therefore, the number of connections multiplied by the host HBA queue depth should not exceed that value. The number of connections should take the multipath configuration into account.
Note: The XIV queue limit is 1400 per XIV FC host port and 256 per LUN per worldwide port name (WWPN) per port.
Twenty-four multipath connections to the XIV system would dictate that host queue depth be set to 58. (24*58=1392)
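The arithmetic above can be sketched as a small shell calculation; the 24-connection count is an example, so substitute your own path count:

```shell
# Largest host HBA queue depth that keeps the total outstanding I/O from
# all multipath connections under the XIV limit of 1400 per FC host port.
# The connection count below is a placeholder for your environment.
XIV_PORT_QUEUE_LIMIT=1400
connections=24
queue_depth=$(( XIV_PORT_QUEUE_LIMIT / connections ))   # integer division rounds down
echo "queue_depth=${queue_depth}"
```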
Check the operating system (OS) disk queue depth (see below).
Make use of the XIV host attachment kit for RHEL.
Useful commands:
# xiv_devlist
External SAN switches
As a best practice, set SAN switch port speeds to Auto (auto-negotiate).
Typical bottlenecks are:
Latency bottlenecks
Congestion bottlenecks
Latency bottlenecks occur when frames are sent faster than they can be received. This can be due to
buffer credit starvation or slow drain devices in the fabric.
Congestion bottlenecks occur when the required throughput exceeds the physical data rate for the connection.
Most SAN Switch web interfaces can be used to monitor the basic performance metrics, such as throughput utilization, aggregate throughput, and percentage of utilization.
The Fabric OS command-line interface (CLI) can also be used to create frame monitors. These monitors
analyze the first 64 bytes of each frame and can detect various types of protocols that can be monitored. Some performance features, such as frame monitor configuration (fmconfig), require a license.
Some of the useful commands:
switch:admin> perfhelp
switch:admin> perfmonitorshow
switch:admin> perfaddeemonitor
switch:admin>fmconfig
Bottleneck monitoring
Enable bottleneck monitoring on SAN switches by using the following command:
switch:admin> bottleneckmon --enable -alert
Useful commands
switch:admin> bottleneckmon --status
switch:admin> bottleneckmon --show -interval 5 -span 300
switch:admin> switchstatusshow
switch:admin> switchshow
switch:admin> configshow
switch:admin> configshow -pattern "fabric"
switch:admin> diagshow
switch:admin> porterrshow
Fabric parameters
Fabric parameters are described in the following list. Default values are in brackets []:

BBCredit -- Increasing the buffer-to-buffer (BB) credit parameter may increase performance by buffering FC frames coming from 8 Gbps FC server ports and going to 4 Gbps FC ports on the XIV; SAN segments can run at different rates. Frame pacing (BB credit starvation) occurs when no more BB credits are available. AVG FRAME PACING should always be zero; if it is not, increase buffer credits. Over-increasing the number of BB credits, however, does not increase performance. [16]
E_D_TOV -- Error Detect TimeOut Value [2000]
R_A_TOV -- Resource Allocation TimeOut Value [10000]
dataFieldSize -- 512, 1024, 2048, or 2112 [2112]
Sequence Level Switching -- Under normal conditions, disable for better performance (interleave frames, do not group frames) [0]
Disable Device Probing -- Set this mode only if N_Port discovery causes attached devices to fail [0]
Per-Frame Routing Priority -- [0]
Suppress Class F Traffic -- Used with ATM gateways only [0]
Insistent Domain ID Mode -- fabric.ididmode [0]
Table 1: Fabric parameters - default values are in brackets []
Basic port configuration
Target rate limiting (ratelim) is used to minimize congestion at the adapter port caused by a slow-drain device operating in the fabric at a slower speed (for example, a 4 Gbps XIV system).
Advanced port configuration
Turning on Interrupt Control Coalesce and increasing the latency monitor timeout value can improve
performance by reducing interrupts and processor utilization.
Host adapter placement rules
It is extremely important to follow the adapter placement rules for your server in order to minimize PCI bus saturation.
System BIOS settings
Use the recommended CMOS settings for your IBM System x server.
You can use the IBM Advanced Settings Utility (asu64) to modify the System x BIOS settings from the Linux command line. It is normally installed in /opt/ibm/toolscenter/asu.
ASU normally tries to communicate over the LAN through the USB interface. Disable the LAN over USB interface with the following command:
# asu64 set IMM.LanOverUsb Disabled --kcs
The following settings can result in better performance:
uEFI.TurboModeEnable=Enable
uEFI.PerformanceStates=Enable
uEFI.PackageCState=ACPI C3
uEFI.ProcessorC1eEnable=Disable
uEFI.DDRspeed=Max Performance
uEFI.QPISpeed=Max Performance
uEFI.EnergyManager=Disable
uEFI.OperatingMode=Performance Mode
Note that enabling or disabling Hyper-Threading can also affect application performance; test both settings with your workload.
Useful commands:
# asu64 show
# asu64 show --help
# asu64 set IMM.LanOverUsb Disabled --kcs
# asu64 set uEFI.OperatingMode Performance
HBA BIOS settings
You can use the QLogic SANSurfer command-line utility (scli) to show or modify HBA settings.
Task Command
Display current HBA Parameter settings
# scli -c
Display WWPNs only # scli -c | grep WWPN
Display settings only # scli -c | grep \: | grep -v WWPN | sort | uniq -c
Restore default settings # scli -n all default
Table 2: Modifying HBA settings
WWPNs can also be determined from the Linux command line or using a small script
#!/bin/sh
###
hba_location=$(lspci | grep HBA | awk '{print $1}')
for adapter in $hba_location
do
cat $(find /sys/devices -name \*${adapter})/host*/fc_host/host*/port_name
done
Listing 1: Determining WWPNs
HBA parameters as reported by the scli command appear in the following table:
Parameter Default value
Connection Options 2 - Loop Preferred, Otherwise Point-to-Point
Data Rate Auto
Enable FC Tape Support Disabled
Enable Hard Loop ID Disabled
Enable Host HBA BIOS Disabled
Enable LIP Full Login Yes
Enable Target Reset Yes
Execution Throttle 16
Frame Size 2048
Hard Loop ID 0
Interrupt Delay Timer (100ms) 0
Link Down Timeout (seconds) 30
Login Retry Count 8
Loop Reset Delay (seconds) 5
LUNs Per Target 128
Operation Mode 0
Out Of Order Frame Assembly Disabled
Port Down Retry Count 30 seconds
Table 3: HBA BIOS tunable parameters (sorted)
Use the lspci command to show which type(s) of Fibre Channel adapters exist in the system. For example:
# lspci | grep HBA
Note: Adapters from different vendors have different default values.
Linux kernel parameters
The available options for the Linux I/O scheduler are noop, anticipatory, deadline, and cfq.
echo "Linux: SCHEDULER"
cat /sys/block/*/queue/scheduler | grep -v none | sort | uniq -c
echo ""
Listing 2: Determining the Linux scheduler for block devices
The Red Hat enterprise-storage tuned profile uses the deadline scheduler. The deadline scheduler can be enabled by adding the elevator=deadline parameter to the kernel command line in grub.conf.
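The scheduler can also be changed per device at run time by writing to sysfs; the change does not persist across reboots. A minimal sketch (sdb is a placeholder device name):

```shell
# Write a scheduler name into a device's sysfs queue/scheduler file.
set_io_scheduler() {
    local sysfs_file="$1" scheduler="$2"
    echo "${scheduler}" > "${sysfs_file}"
}

# Typical use (requires root):
# set_io_scheduler /sys/block/sdb/queue/scheduler deadline
```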
Useful commands:
# cat /proc/cmdline
Linux memory settings
This section describes the Linux memory settings that affect I/O performance.
Page size
The default page size for Red Hat Linux is 4096 bytes.
# getconf PAGESIZE
Transparent huge pages
The default size for huge pages is 2048 KB for most large systems.
echo "Linux: HUGEPAGES"
cat /sys/kernel/mm/redhat_transparent_hugepage/enabled
echo ""
Listing 3: Determining the Linux huge page setting
The Red Hat enterprise-storage tuned profile enables huge pages.
Linux module settings - qla2xxx
You can see the parameters for the qla2xxx module using the following script:
#!/bin/sh
###
for param in $(ls /sys/module/qla2xxx/parameters)
do
echo -n "${param} = "
cat /sys/module/qla2xxx/parameters/${param}
done
Listing 4: Determining qla2xxx module parameters
Disable QLogic failover. If the output of the following command shows the -k driver (not the -fo failover driver), failover is disabled:
# modinfo qla2xxx | grep -w ^version
version: <some_version>-k
QLogic lists the highlights of the 2400 series HBAs:
150,000 IOPS per port
Out-of-order frame reassembly
T10 CRC for end-to-end data integrity
Useful commands:
# modinfo -p qla2xxx
The qla_os.c file in the Linux kernel source contains information on many of the qla2xxx module parameters. Some parameters as listed by modinfo -p do not exist in the Linux source code. Others
are not explicitly defined but may be initialized by the adapter firmware.
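Module parameters can be set when the driver loads. An illustrative modprobe configuration fragment follows; the file name follows the usual /etc/modprobe.d convention, and the parameter values are placeholders to be validated, not recommendations:

```shell
# Contents of a hypothetical /etc/modprobe.d/qla2xxx.conf:
options qla2xxx ql2xmaxqdepth=58 ql2xloginretrycount=8
```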
Descriptions of module parameters appear in the following table:

Parameter | Description | Linux kernel source | Default value
ql2xallocfwdump | Allocate memory for a firmware dump during HBA initialization | 1 | 1 - allocate memory
ql2xasynctmfenable | Issue TM IOCBs asynchronously via IOCB mechanism | does not exist | 0 - issue TM IOCBs via mailbox mechanism
ql2xdbwr | Scheme for request queue posting | does not exist | 1 - CAMRAM doorbell (faster)
ql2xdontresethba | Reset behavior | does not exist | 0 - reset on failure
ql2xenabledif | T10-CRC-DIF | does not exist | 1 - DIF support
ql2xenablehba_err_chk | T10-CRC-DIF error isolation by HBA | does not exist | 0 - disabled
ql2xetsenable | Firmware ETS burst | does not exist | 0 - skip ETS enablement
ql2xextended_error_logging | Extended error logging | not explicitly defined | 0 - no logging
ql2xfdmienable | FDMI registrations | 1 | 0 - no FDMI
ql2xfwloadbin | Location from which to load firmware | not explicitly defined | 0 - use default semantics
ql2xgffidenable | GFF_ID checks of port type | does not exist | 0 - do not use GFF_ID
ql2xiidmaenable | iIDMA setting | 1 | 1 - perform iIDMA
ql2xloginretrycount | Alternate value for NVRAM login retry count | 0 | 0
ql2xlogintimeout | Login timeout value in seconds | 20 | 20
ql2xmaxqdepth | Maximum queue depth for target devices -- used to seed queue depth for SCSI devices | 32 | 32
ql2xmaxqueues | MQ | 1 | 1 - single queue
ql2xmultique_tag | CPU affinity | not defined | 0 - no affinity
ql2xplogiabsentdevice | PLOGI | not defined | 0 - no PLOGI
ql2xqfulrampup | Time in seconds to wait before ramping up the queue depth for a device after a queue-full condition has been detected | does not exist | 120 seconds
ql2xqfulltracking | Track and dynamically adjust queue depth for SCSI devices | does not exist | 1 - perform tracking
ql2xshiftctondsd | Control shifting of command type processing based on total number of SG elements | does not exist | 6
ql2xtargetreset | Target reset | does not exist | 1 - use hw defaults
qlport_down_retry | Maximum number of command retries to a port in PORT-DOWN state | not defined | 0
Table 4: qla2xxx module parameters
Linux SCSI subsystem tuning - /sys
See /sys/block/<device>/queue/<parameter>.
Block device parameter values can be determined using a small script:
#!/bin/sh
###
param_list=$(find /sys/block/sda/queue -maxdepth 1 -type f -exec basename '{}' \; | sort)
dev_list=$(ls -l /dev/disk/by-path | grep -w fc | awk -F \/ '{print $3}')
dm_list=$(ls -d /sys/block/dm-* | awk -F \/ '{print $NF}')
for param in ${param_list}
do
echo -n "${param} = "
for dev in ${dev_list} ${dm_list}
do
cat /sys/block/${dev}/queue/${param}
done | sort | uniq -c
done
echo -n "queue_depth = "
for dev in ${dev_list}
do
cat /sys/block/${dev}/device/queue_depth
done | sort | uniq -c
Determining block device parameters
To send down large-size requests (greater than 512 KB on 4 KB page size systems):
Consider increasing max_segments to 1024 or greater.
Set max_sectors_kb equal to max_hw_sectors_kb.
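Setting max_sectors_kb equal to max_hw_sectors_kb can be scripted. A sketch, reusing the FC device discovery pattern shown elsewhere in this paper (root is required against the real sysfs paths):

```shell
# Copy the hardware limit into max_sectors_kb for one queue directory so
# the block layer can issue requests up to the hardware maximum.
raise_max_sectors() {
    local qdir="$1"
    cat "${qdir}/max_hw_sectors_kb" > "${qdir}/max_sectors_kb"
}

# Apply to every FC-attached disk:
# for dev in $(ls -l /dev/disk/by-path | grep -w fc | awk -F/ '{print $3}' | sort -u)
# do
#     raise_max_sectors /sys/block/${dev}/queue
# done
```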
SCSI device parameters appear in the following table. Values that can be changed are shown as (rw):

Parameter | Description | Value
hw_sector_size (ro) | Hardware sector size in bytes | 512
max_hw_sectors_kb (ro) | Maximum number of kilobytes supported in a single data transfer | 32767
max_sectors_kb (rw) | Maximum number of kilobytes that the block layer will allow for a file system request | 512
nomerges (rw) | Enable or disable lookup logic | 0 - all merges are enabled
nr_requests (rw) | Number of read or write requests that can be allocated in the block layer | 128
read_ahead_kb (rw) | Readahead size in kilobytes | 8192
rq_affinity (rw) | Complete a request on the same CPU that queued it (1 - CPU group affinity, 2 - strict CPU affinity) | 1 - CPU group affinity
scheduler (rw) | I/O scheduler in use for the device | deadline
Table 5: SCSI subsystem tunable parameters
Using max_sectors_kb:
By default, Linux devices are configured for a maximum 512 KB I/O size. When using a larger file system block size, increase the max_sectors_kb parameter. max_sectors_kb must be less than or equal to max_hw_sectors_kb.
The default queue_depth is 32 and represents the total number of transfers that can be queued to a device. You can check the queue depth by examining /sys/block/<device>/device/queue_depth.
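queue_depth is writable through the same sysfs tree. A sketch follows; the value 58 matches the 24-path XIV example earlier in this paper, so treat it as a placeholder for your own calculation:

```shell
# Write a new depth into a device's sysfs queue_depth file.
set_queue_depth() {
    local sysfs_file="$1" depth="$2"
    echo "${depth}" > "${sysfs_file}"
}

# Apply to every FC-attached disk (requires root):
# for dev in $(ls -l /dev/disk/by-path | grep -w fc | awk -F/ '{print $3}' | sort -u)
# do
#     set_queue_depth /sys/block/${dev}/device/queue_depth 58
# done
```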
Linux XFS file system create options
Useful commands:
# getconf PAGESIZE
# man mkfs.xfs
Note: XFS writes are not guaranteed to be committed unless the program issues a fsync() call afterwards.
Red Hat: Optimizing for a large number of files
If necessary, you can increase the amount of space allowed for inodes using the mkfs.xfs -i maxpct= option. The default percentage of space allowed for inodes varies by file system size. For example, a file system between 1 TB and 50 TB in size will allocate 5% of the total space for inodes.
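As a quick sanity check of the 5% default, shell arithmetic shows how much space a hypothetical 2 TiB file system could devote to inodes:

```shell
# With maxpct=5, a 2 TiB (2048 GiB) file system can use up to ~102 GiB
# of its space for inodes (integer arithmetic rounds down).
fs_size_gib=2048
maxpct=5
inode_space_gib=$(( fs_size_gib * maxpct / 100 ))
echo "${inode_space_gib} GiB"
```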
Red Hat: Optimizing for a large number of files in a single directory
Normally, the XFS file system directory block size is the same as the file system block size. Choose a larger value for the mkfs.xfs -n size= option, if there are many millions of directory entries.
Red Hat: Optimizing for concurrency
Increase the number of allocation groups on systems with many processors.
Red Hat: Optimizing for applications that use extended attributes
1. Increasing inode size might be necessary if applications use extended attributes.
2. Multiple attributes can be stored in an inode provided that they do not exceed the maximum size
limit (in bytes) for attribute+value.
Red Hat: Optimizing for sustained metadata modifications
1. Systems with large amounts of RAM could benefit from larger XFS log sizes.
2. The log should be aligned with the device stripe size (the mkfs command may do this automatically)
The metadata log can be placed on another device, for example, a solid-state drive (SSD) to reduce disk seeks.
Specify the stripe unit and width for hardware RAID devices
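A hypothetical mkfs.xfs invocation illustrates the stripe options; su/sw and the device name are placeholders that must match your array, and because mkfs.xfs is destructive the command is shown with -N (dry run, print geometry only):

```shell
# Dry-run geometry check for a RAID LUN with a 64 KB stripe unit across
# 8 data disks (all values are placeholders):
# mkfs.xfs -N -d su=64k,sw=8 -l size=128m /dev/mapper/mpathb
```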
Syntax (options not related to performance are omitted)
# mkfs.xfs [ options ] device
-b block_size_options
size=<int> -- size in bytes
default 4096
minimum 512
maximum 65536 (must be <= PAGESIZE)
-d data_section_options
More allocation groups imply that more parallelism can be achieved when allocating blocks and inodes
agcount=<int> -- number of allocation groups
agsize
name
file
size
sunit
su
swidth
sw
-i inode_options
size
log
perblock
maxpct
align
attr
-l log_section_options
internal
logdev
size
version
sunit
su
lazy-count
-n naming_options
size
log
version
-r realtime_section_options
rtdev
extsize
size
-s sector_size
log
size
-N
Dry run. Print out filesystem parameters without creating the filesystem.
Listing 5: Create options for XFS file systems

Linux XFS file system mount options

Useful commands:
# xfs_info
# xfs_quota
# grep xfs /proc/mounts
# mount | grep xfs

nobarrier
noatime
inode64 -- XFS is allowed to create inodes at any location in the file system. Starting from kernel 2.6.35, XFS file systems will mount either with or without the inode64 option.
logbsize -- Larger values can improve performance. Smaller values should be used with fsync-heavy workloads.
delaylog -- RAM is used to reduce the number of changes to the log.

The Red Hat 6.2 Release Notes mention that XFS has been improved in order to better handle metadata-intensive workloads. The default mount options have been updated to use delayed logging.
Red Hat tuned
Red Hat Enterprise Linux has a tuning package called "tuned" which sets certain parameters based on a chosen profile.
Useful commands:
# tuned-adm help
# tuned-adm list
# tuned-adm active
The enterprise-storage profile contains the following files. When comparing the enterprise-storage profile with the throughput-performance profile, some files are identical:
# cd /etc/tune-profiles
# ls enterprise-storage/
ktune.sh ktune.sysconfig sysctl.ktune tuned.conf
# sum throughput-performance/* enterprise-storage/* | sort
03295 2 throughput-performance/sysctl.s390x.ktune
08073 2 enterprise-storage/sysctl.ktune
15419 2 enterprise-storage/ktune.sysconfig
15419 2 throughput-performance/ktune.sysconfig
15570 1 enterprise-storage/ktune.sh
43756 1 enterprise-storage/tuned.conf
43756 1 throughput-performance/tuned.conf
47739 2 throughput-performance/sysctl.ktune
57787 1 throughput-performance/ktune.sh
ktune.sh
The enterprise-storage ktune.sh is the same as the throughput-performance ktune.sh but adds functionality for disabling or enabling I/O barriers. The enterprise-storage profile is preferred when using XIV storage. Important functions include:
set_cpu_governor performance -- uses cpuspeed to set the performance governor
enable_transparent_hugepages -- enables transparent huge pages
remount_partitions nobarrier -- disables write barriers
multiply_disk_readahead -- modifies /sys/block/sd*/queue/read_ahead_kb
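When checking the effect of multiply_disk_readahead, note that blockdev --getra and --setra work in 512-byte sectors, while the sysfs read_ahead_kb file is in KB. A small converter sketch:

```shell
# Convert between blockdev's 512-byte sectors and sysfs KB units.
sectors_to_kb() { echo $(( $1 / 2 )); }
kb_to_sectors() { echo $(( $1 * 2 )); }

# Example: a read_ahead_kb of 8192 corresponds to 16384 sectors.
kb_to_sectors 8192
```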
ktune.sysconfig
ktune.sysconfig is identical for both throughput-performance and enterprise-storage profiles:
# grep -h ^[A-Za-z] enterprise-storage/ktune.sysconfig \ throughput-performance/ktune.sysconfig | sort | uniq -c
2 ELEVATOR="deadline"
2 ELEVATOR_TUNE_DEVS="/sys/block/{sd,cciss,dm-}*/queue/scheduler"
2 SYSCTL_POST="/etc/sysctl.conf"
2 USE_KTUNE_D="yes"
Listing 6: Sorting the ktune.sysconfig file
sysctl.ktune
sysctl.ktune is functionally identical for both throughput-performance and enterprise-storage profiles:
# grep -h ^[A-Za-z] enterprise-storage/sysctl.ktune \ throughput-performance/sysctl.ktune | sort | uniq -c
2 kernel.sched_min_granularity_ns = 10000000
2 kernel.sched_wakeup_granularity_ns = 15000000
2 vm.dirty_ratio = 40
Listing 7: Sorting the sysctl.ktune file
tuned.conf
tuned.conf is identical for both throughput-performance and enterprise-storage profiles:
# grep -h ^[A-Za-z] enterprise-storage/tuned.conf \ throughput-performance/tuned.conf | sort | uniq -c
12 enabled=False
Listing 8: Sorting the tuned.conf file
Linux multipath
Keep it simple: configure just enough paths for redundancy and performance.

Typical multipath output for an XIV device includes lines such as:
features='1 queue_if_no_path' hwhandler='0' wp=rw
policy='round-robin 0' prio=-1

The features='1 queue_if_no_path' setting queues I/O indefinitely when all paths are lost. To bound the retries instead, set 'no_path_retry N' and then remove the features='1 queue_if_no_path' option, or set 'features 0'.
Multipath configuration defaults
Parameter Default value
polling_interval 5
udev_dir /dev
multipath_dir /lib/multipath
find_multipaths no
verbosity 2
path_selector round-robin 0
path_grouping_policy failover
getuid_callout /lib/udev/scsi_id --whitelisted --device=/dev/%n
prio const
features queue_if_no_path
path_checker directio
failback manual
rr_min_io 1000
rr_weight uniform
no_path_retry 0
user_friendly_names no
queue_without_daemon yes
flush_on_last_del no
max_fds determined by the calling process
checker_timeout /sys/block/sdX/device/timeout
fast_io_fail_tmo determined by the OS
dev_loss_tmo determined by the OS
mode determined by the process
uid determined by the process
gid determined by the process
Table 6: Multipath configuration options
The default load balancing policy (path_selector) is round-robin 0. Other choices are queue-length 0 and
service-time 0.
Consider using the XIV Linux host attachment kit to create the multipath configuration file.
# cat /etc/multipath.conf
devices {
device {
vendor "IBM"
product "2810XIV"
path_selector "round-robin 0"
path_grouping_policy multibus
rr_min_io 15
path_checker tur
failback 15
no_path_retry 5
#polling_interval 3
}
}
defaults {
...
user_friendly_names yes
...
}
Listing 9: A sample multipath.conf file
Sample scripts
You can use the following script to query various settings related to I/O tuning:
#!/bin/sh
# Query scheduler, hugepages, and readahead settings for fibre channel scsi devices
###
#hba_pci_loc=$(lspci | grep HBA | awk '{print $1}')
echo "Linux: HUGEPAGES"
cat /sys/kernel/mm/redhat_transparent_hugepage/enabled
echo ""
echo "Linux: SCHEDULER"
cat /sys/block/*/queue/scheduler | grep -v none | sort | uniq -c
echo ""
echo "FC: max_sectors_kb"
ls -l /dev/disk/by-path | grep -w fc | awk -F'/' '{print $3}' | xargs -n1 -i cat /sys/block/{}/queue/max_sectors_kb | sort | uniq -c
echo ""
echo "Linux: dm-* READAHEAD"
ls /dev/dm-* | xargs -n1 -i blockdev --getra {} | sort | uniq -c
blockdev --report /dev/dm-*
echo ""
echo "Linux: FC disk sd* READAHEAD"
ls -l /dev/disk/by-path | grep -w fc | awk -F'/' '{print $3}' | xargs -n1 -i blockdev --getra /dev/{} | sort | uniq -c
ls -l /dev/disk/by-path | grep -w fc | awk -F'/' '{print $3}' | xargs -n1 -i blockdev --report /dev/{} | grep dev
echo ""
Listing 10: Querying I/O tuning settings
Summary
This white paper presented an end-to-end approach for Linux I/O tuning in a typical data center environment consisting of external storage subsystems, storage area network (SAN) switches, IBM System x Intel servers, Fibre Channel HBAs and 64-bit Red Hat Enterprise Linux.
Visit the links in the “Resources” section for more information on topics presented in this white paper.
Resources
The following websites provide useful references to supplement the information contained in this paper:

XIV Redbooks
ibm.com/redbooks/abstracts/sg247659.html
ibm.com/redbooks/abstracts/sg247904.html
Note: IBM Redbooks are not official IBM product documentation.
XIV Infocenter
http://publib.boulder.ibm.com/infocenter/ibmxiv/r2
XIV Host Attachment Kit for RHEL can be downloaded from Fix Central
ibm.com/support/fixcentral
Qlogic
http://driverdownloads.qlogic.com
ftp://ftp.qlogic.com/outgoing/linux/firmware/rpms
Red Hat Enterprise Linux Documentation
http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux
IBM Advanced Settings Utility
ibm.com/support/entry/portal/docdisplay?lndocid=TOOL-ASU
Linux
Documentation/kernel-parameters.txt
Documentation/block/queue-sysfs.txt
Documentation/filesystems/xfs.txt
drivers/scsi/qla2xxx
http://xfs.org/index.php/XFS_FAQ
About the author
David Quenzler is a consultant in the IBM Systems and Technology Group ISV Enablement organization. He has more than 15 years' experience working with the IBM System x (Linux) and IBM Power Systems (IBM AIX®) platforms. You can reach David at [email protected].
Trademarks and special notices
© Copyright IBM Corporation 2012.
References in this document to IBM products or services do not imply that IBM intends to make them
available in every country.
IBM, the IBM logo, and ibm.com are trademarks or registered trademarks of International Business Machines Corporation in the United States, other countries, or both. If these and other IBM trademarked
terms are marked on their first occurrence in this information with a trademark symbol (® or ™), these symbols indicate U.S. registered or common law trademarks owned by IBM at the time this information was published. Such trademarks may also be registered or common law trademarks in other countries. A
current list of IBM trademarks is available on the Web at "Copyright and trademark information" at www.ibm.com/legal/copytrade.shtml.
Java and all Java-based trademarks and logos are trademarks or registered trademarks of Oracle and/or
its affiliates.
Microsoft, Windows, Windows NT, and the Windows logo are trademarks of Microsoft Corporation in the United States, other countries, or both.
Intel, Intel Inside (logos), MMX, and Pentium are trademarks of Intel Corporation in the United States, other countries, or both.
UNIX is a registered trademark of The Open Group in the United States and other countries.
Linux is a trademark of Linus Torvalds in the United States, other countries, or both.
SET and the SET Logo are trademarks owned by SET Secure Electronic Transaction LLC.
Other company, product, or service names may be trademarks or service marks of others.
Information is provided "AS IS" without warranty of any kind.
All customer examples described are presented as illustrations of how those customers have used IBM products and the results they may have achieved. Actual environmental costs and performance
characteristics may vary by customer.
Information concerning non-IBM products was obtained from a supplier of these products, published announcement material, or other publicly available sources and does not constitute an endorsement of
such products by IBM. Sources for non-IBM list prices and performance numbers are taken from publicly available information, including vendor announcements and vendor worldwide homepages. IBM has not tested these products and cannot confirm the accuracy of performance, capability, or any other claims
related to non-IBM products. Questions on the capability of non-IBM products should be addressed to the supplier of those products.
All statements regarding IBM future direction and intent are subject to change or withdrawal without
notice, and represent goals and objectives only. Contact your local IBM office or IBM authorized reseller for the full text of the specific Statement of Direction.
Some information addresses anticipated future capabilities. Such information is not intended as a
definitive statement of a commitment to specific levels of performance, function or delivery schedules with respect to any future products. Such commitments are only made in IBM product announcements. The
information is presented here to communicate IBM's current investment and development activities as a good faith effort to help with our customers' future planning.
Performance is based on measurements and projections using standard IBM benchmarks in a controlled environment. The actual throughput or performance that any user will experience will vary depending upon considerations such as the amount of multiprogramming in the user's job stream, the I/O
configuration, the storage configuration, and the workload processed. Therefore, no assurance can be given that an individual user will achieve throughput or performance improvements equivalent to the ratios stated here.
Photographs shown are of engineering prototypes. Changes may be incorporated in production models.
Any references in this information to non-IBM websites are provided for convenience only and do not in any manner serve as an endorsement of those websites. The materials at those websites are not part of
the materials for this IBM product and use of those websites is at your own risk.