
© 2012 Demartek® www.demartek.com Email: [email protected]

Emulex LPe16000B Gen 5 Fibre Channel

HBA Feature Comparison

Evaluation report prepared under contract with Emulex

Executive Summary

Explosive growth in the complexity and amount of data of today’s data center environments is pervasive. We see more users, more devices and increasing numbers of virtual machines per physical server than in the past. Cloud computing, big data, and increased backup and recovery burdens are driving the need for higher storage throughput and bandwidth and lower latency. The next-generation Gen 5 (16Gb) Fibre Channel HBAs were designed specifically to meet the challenges faced by today’s datacenters while maintaining backward compatibility to previous generations of Fibre Channel. Emulex is well known in the industry for rock-solid reliability and mission critical high performance capabilities to meet the demands of enterprise workloads like databases, business intelligence, virtualization, and backup.

In September 2012, Emulex released its LPe16000B second-generation Gen 5 Fibre Channel host bus adapter (HBA). This report compares the out-of-box experience with the generally available versions of the LPe16002B and the QLogic QLE2672 FC HBAs as they are currently shipping. Based on these tests, the report finds that the Emulex LPe16000B can be the better choice for servers running server virtualization, transactional database and data warehouse workloads in terms of performance, CPU efficiency, reliability, and power efficiency.

Key Findings

The Emulex LPe16000B is the fastest generally available FC HBA evaluated by Demartek to date for these tests. The architecture allows all resources to be applied to a single port, enabling 1.2 million IOPS on that port when needed.

LPe16000B delivers up to eleven times better CPU utilization in these tests.

LPe16000B delivers up to seven times the IOPS in these tests.

The LPe16002B has the highest published MTBF, delivering best enterprise reliability.

Demartek December 2012

The previous Demartek research report featured a version of the QLogic QLE2672 HBA that is different from the HBA used in this report. The HBA in this report is the generally available and currently shipping model available through the reseller channel. This shipping QLogic QLE2672 includes firmware version 6.02.00.


The shipping QLogic QLE2672 has a fan and heat sink, which differs from what some of the marketing photos suggest.

The Emulex LPe16000B is up to eleven times more server energy efficient in terms of IOPS per server watt consumed compared to the QLE2672.

Although both the LPe16002B and QLE2672 are able to achieve full line rate, the LPe16002B achieves near line rate at the 4K block size for reads. The QLE2672, however, does not reach full line rate until the 16K block size for reads. As most enterprise applications use either 4K (Oracle database) or 8K (Microsoft SQL Server, Oracle database) block sizes, this is an important limitation. The LPe16002B is up to 3.16x faster for 4K combined read/write operations and up to 2.29x faster for 8K combined read/write operations.

Up to 4.6X faster in 8Gb mode than the QLogic adapters – The LPe16002B achieved more than 922K IOPS running at 8Gb with a 512-byte block size, while both the QLE2672 and QLE2562 were limited to 200K IOPS, using the same test parameters as the 16Gb tests.


Product Details

Emulex LPe16002B

The dual-port model of this adapter is known as the LPe16002B. The single-port model is the LPe16000B. This adapter has a heat sink but no fan. The LPe16000B supports up to 1.2 million IOPS on a single channel, making it the fastest generally available HBA tested by Demartek to date. It also includes some unique features such as advanced data integrity with support for the T10 Protection Information (T10 PI) standard with high-performance offload. Emulex claims that T10 PI offload eliminates the 30 percent performance penalty to the CPU seen in firmware-based T10 PI implementations. T10 PI helps prevent costly and time-consuming silent data corruption, which can sometimes lead to total data loss. The adapter has also earned Intel PCIe 3.0 certification. In terms of reliability, Emulex claims to have increased the reliability of this model, which is now rated at over 10 million hours mean time between failures (MTBF).

Photos

These photographs of the generally available adapters, as shipped in December 2012, were taken in the Demartek lab. Each adapter shown is a dual-port model. Each adapter ships mounted on a full-height bracket, with a half-height bracket included in the package. The LPe16002B uses a passive heat sink for heat management, with no moving parts.

Figure 1 – Emulex LPe16002B


QLogic QLE2672

The dual-port model of this adapter is known as the QLE2672. The single-port model is the QLE2670. This adapter includes a heat sink and fan, and has ventilation holes in the full-height bracket. The half-height bracket does not have ventilation holes.

Figure 2 – QLogic QLE2672

Figure 3 – QLogic QLE2672 photo from earlier report – no fan or heatsink


Fibre Channel Host Bus Adapter Architecture

The Emulex LPe16000B FC HBA features a high-performance 8-core architecture that can flexibly scale resources between ports based on resource requirements, applying double the resources to a single port if needed so that SLAs can be met. This is in contrast to older designs that do not allow this flexibility. The QLogic design puts SLAs at risk if a port fails because the other port cannot scale its resources to handle the workload required to meet SLAs. Additionally, if the ports of a QLogic dual-port HBA are deployed in an active/standby configuration, the performance resources of the standby port cannot be used by the active port when the server workload spikes. The Emulex architecture echoes current data center efficiency practices of virtual resource allocation used by server virtualization and storage thin provisioning, in that resources are not wasted while waiting in reserve. This is in contrast to the QLE2672, which supports fewer resources (exchanges and logins) and does not allow for redirection of controller performance based on the quality of service required by the individual port.

We found that the Emulex flexible architecture had no downside in terms of port dependencies; in fact, we conducted a number of experiments showing that the Emulex adapter has better port independence, which results in higher availability, significantly reduced management time and higher SLAs.

1. The Emulex LPe16002B can reset ports independently online in an instant without disturbing the neighboring port or running server applications. The QLogic HBA does not have this ability and requires a server reboot to reset a port, which of course stops both ports and results in significant application downtime.

2. Switch port topology or speed changes can be handled online on any port independently with the Emulex LPe16002B. The same test on the QLE2672 took down both ports and required a server reboot to make the changes effective.

3. Cable pulls, switch port resets and traffic error injection performed on one port had no effect on the neighboring Emulex port.


Power Consumption

Server power efficiency is a key focus of today’s data centers, which are trying to lower their operating costs. Emulex’s design is such that the whole chip can provide the maximum capability to any port that needs it and can deliver up to 1.2 million IOPS on a single port. Power efficiency in chip design is an area of innovation that we see in many areas of the IT industry today. The latest generation of data center servers employs power-saving strategies when the server isn’t under a heavy CPU load. An HBA that offloads I/O processing allows the server to use less of its CPU resources, thus lowering the server’s input power draw. The charts below show the server power consumption while running high transaction I/O loads with one and two HBA ports active. The Emulex LPe16000B is up to eleven times more efficient in terms of IOPS/server watt (16,527 vs. 1,487) and allows the server to draw 45 fewer active watts (311 vs. 356 watts, about 13% less) while performing more than seven times the I/O traffic. Over a three-year server life at that load, the server using the Emulex HBA saves roughly 1 megawatt-hour of energy compared to the server using the QLogic HBA.
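The power figures quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original test procedure: it uses only the four numbers quoted in the paragraph, and it assumes (this is not stated in the report) that the server runs continuously at the measured load for the full three years. The function name is hypothetical.

```python
# Figures quoted in the power consumption paragraph of this report.
EMULEX_IOPS_PER_WATT = 16_527
QLOGIC_IOPS_PER_WATT = 1_487
EMULEX_SERVER_WATTS = 311
QLOGIC_SERVER_WATTS = 356

# Assumption (not from the report): 24x7 operation at the measured load.
HOURS_PER_YEAR = 24 * 365


def energy_saved_kwh(high_watts: float, low_watts: float, years: float) -> float:
    """Energy difference in kilowatt-hours between two constant power draws."""
    return (high_watts - low_watts) * HOURS_PER_YEAR * years / 1000.0


efficiency_ratio = EMULEX_IOPS_PER_WATT / QLOGIC_IOPS_PER_WATT
watts_saved = QLOGIC_SERVER_WATTS - EMULEX_SERVER_WATTS
# The percentage depends on the baseline; here it is relative to the
# higher-drawing (QLogic-equipped) server.
percent_fewer_watts = 100.0 * watts_saved / QLOGIC_SERVER_WATTS
three_year_kwh = energy_saved_kwh(QLOGIC_SERVER_WATTS, EMULEX_SERVER_WATTS, 3)

print(f"IOPS per watt ratio: {efficiency_ratio:.1f}x")
print(f"Active watts saved: {watts_saved} ({percent_fewer_watts:.1f}% of the higher draw)")
print(f"Energy saved over three years: {three_year_kwh:.1f} kWh")
```

Under these assumptions the three-year difference comes out a little above one megawatt-hour of energy, which is a quantity of energy rather than of power.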

Power Consumption: One Port Active


Power Consumption: Two Ports Active


Performance of Generally Available Adapters

Demartek audited the results of tests performed in the Emulex lab with the two competing adapters, connected to DRAM storage targets. A series of IOmeter runs was performed using the generally available models of each adapter, with the drivers and firmware generally available from the respective vendor websites in December 2012. Tests were run with a single port (“1P”) and with both ports (“2P”) active. Each port was running at 16 Gbps.

Operating System

Windows Server 2012 Datacenter

IOmeter Parameters

IOmeter version: 2006.07.27
Block sizes: 512 bytes through 1MB
Queue Depth: 41
LUNs/Targets/Workers: 20 (1 target per worker)
I/O profiles: 100% read, 100% write, 50% read / 50% write
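To make the size of this test matrix concrete, the parameters can be modeled in a few lines of Python. This is a sketch under stated assumptions and is not an IOmeter configuration file: the class and method names are hypothetical, the block sizes are assumed to double at each step between the stated bounds, and the queue depth is assumed to apply to each worker separately.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SweepConfig:
    """Plain-Python model of the IOmeter parameters listed in this report."""
    min_block_bytes: int = 512
    max_block_bytes: int = 1024 * 1024  # 1MB
    queue_depth: int = 41
    workers: int = 20  # 1 target per worker
    # (read %, write %) for each I/O profile named in the report
    profiles: tuple = ((100, 0), (0, 100), (50, 50))

    def block_sizes(self) -> list:
        # Assumption: sizes double at each step between the stated bounds.
        sizes = []
        size = self.min_block_bytes
        while size <= self.max_block_bytes:
            sizes.append(size)
            size *= 2
        return sizes

    def total_outstanding_ios(self) -> int:
        # Assumption: the queue depth applies to each worker separately.
        return self.queue_depth * self.workers


config = SweepConfig()
sizes = config.block_sizes()
print(len(sizes), "block sizes from", sizes[0], "to", sizes[-1], "bytes")
print("runs per port configuration:", len(sizes) * len(config.profiles))
print("outstanding I/Os (per-worker assumption):", config.total_outstanding_ios())
```

If the doubling assumption holds, each port configuration (1P or 2P) involves one run per block size and I/O profile combination.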


Read (100%)

Chart 1 – IOPS for Read

In the IOPS chart above, at the 512-byte block size the Emulex LPe16002B delivers up to 7x better maximum IOPS, reaching around 1.2 million IOPS on a single port. The QLE2672 delivers under 200,000 IOPS on one port and requires both ports to reach its maximum of 323,000 IOPS.

Chart 2 - MBPS for Read

In the MBPS chart above, the LPe16002B runs at nearly full line rate for SQL Server/Oracle workloads (4K and 8K block sizes), so SLAs are met. The QLE2672 can’t achieve full line rate until it reaches 16K block sizes. The LPe16002B is up to 2.24x faster for 4K reads and up to 1.37x faster for 8K reads for these tests.

Charts 1 and 2 each plot four series: QLE2672-2P-Read, QLE2672-1P-Read, LPe16002B-2P-Read and LPe16002B-1P-Read. The y-axes are IOPS (0 to 1,400,000) and MBPS (0 to 3,500).
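The relationship between block size, IOPS and line rate can be sketched as follows. This is an illustration only: it assumes the commonly quoted nominal 16Gb Fibre Channel throughput of about 1,600 MB/s per port in each direction, a figure that does not appear in this report, and the function names are hypothetical.

```python
# Assumption (not from the report): nominal 16GFC throughput of roughly
# 1,600 MB/s per port per direction, with 1 MB taken as 1,000,000 bytes.
NOMINAL_MB_PER_S_PER_PORT = 1600


def throughput_mb_per_s(iops: float, block_bytes: int) -> float:
    """Throughput implied by an IOPS rate at a given block size."""
    return iops * block_bytes / 1_000_000


def iops_for_line_rate(block_bytes: int, ports: int = 1,
                       mb_per_s_per_port: float = NOMINAL_MB_PER_S_PER_PORT) -> float:
    """IOPS an adapter must sustain to fill the link at a given block size."""
    return ports * mb_per_s_per_port * 1_000_000 / block_bytes


for block_bytes in (4096, 8192, 16384):
    needed = iops_for_line_rate(block_bytes)
    print(f"{block_bytes:>6} bytes: about {needed:,.0f} IOPS per port to reach line rate")
```

The smaller the block, the more I/O operations per second an adapter has to process to fill the same link, which is why an adapter with a lower IOPS ceiling only reaches line rate at larger block sizes.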

Chart 3 – % CPU Utilization during Read

In the CPU utilization chart above, the QLE2672 consumes more than 60% of the server CPU in processing I/O. The Emulex LPe16002B consumes less than 30%. The flat line profile for the QLE2672 at the smaller block sizes may indicate an internal resource bottleneck or inefficient host processor core usage. It is desirable to consume as few CPU resources as possible for I/O processing so that the server CPU is free to perform other processing.

Chart 4 – CPU Efficiency for Read

CPU efficiency considers the amount of I/O being performed by the HBA in relation to the server CPU utilization being consumed. The Emulex HBA is more efficient by a factor of 13x for reads.
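The CPU efficiency metric described here is IOPS divided by the percentage of server CPU consumed. A minimal sketch of that calculation follows; the function name is hypothetical, and the input values are made-up illustrations chosen only to show the arithmetic, not measurements from these tests.

```python
def cpu_efficiency(iops: float, cpu_percent: float) -> float:
    """IOPS delivered per percentage point of server CPU consumed."""
    if cpu_percent <= 0:
        raise ValueError("cpu_percent must be greater than zero")
    return iops / cpu_percent


# Hypothetical inputs for illustration only (not data from this report).
adapter_a = cpu_efficiency(iops=300_000, cpu_percent=20.0)
adapter_b = cpu_efficiency(iops=150_000, cpu_percent=60.0)

print(f"Adapter A: {adapter_a:,.0f} IOPS per %CPU")
print(f"Adapter B: {adapter_b:,.0f} IOPS per %CPU")
print(f"Efficiency ratio: {adapter_a / adapter_b:.1f}x")
```

Because the metric is a ratio, an adapter can score higher either by delivering more IOPS, by consuming less CPU, or both.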

Chart 3 plots % CPU utilization for reads (lower is better, y-axis 0 to 70) for four series: QLE2672-2P-Read, QLE2672-1P-Read, LPe16002B-2P-Read and LPe16002B-1P-Read. Chart 4 plots CPU efficiency for reads in IOPS per %CPU (higher is better, y-axis 0 to 80,000) for the LPe16002B and the QLE2672 at the 4K and 8K block sizes.


Read/Write (50%/50%)

Chart 5 – IOPS for Read/Write

The LPe16002B achieves nearly 1.2 million IOPS for dual port read/write, while the QLE2672 achieves less than 400K IOPS. The LPe16002B is up to 3.16x faster for read/write operations at the 4K block size, and up to 2.29x faster for read/write operations at the 8K block size. The IOPS an HBA can support impacts the workload that can be performed per server. More IOPS per HBA equals better server utilization.

Chart 6 – MBPS for Read/Write

Charts 5 and 6 each plot four series: QLE2672-2P-RW, QLE2672-1P-RW, LPe16002B-2P-RW and LPe16002B-1P-RW. The y-axes are IOPS (0 to 1,400,000) and MBPS (0 to 7,000).


LPe16002B achieves better throughput compared to the QLE2672.

Chart 7 - CPU Utilization during Read/Write

Chart 7 demonstrates that the LPe16002B uses significantly fewer CPU cycles to perform I/O operations, requiring approximately 28% of the CPU when running both ports for read/write operations. The QLogic QLE2672 requires more than 60% of the CPU. This means a server with the Emulex HBA has more CPU capacity left to run enterprise applications.

Chart 7 plots % CPU utilization for read/write (lower is better, y-axis 0 to 70) for four series: QLE2672-2P-RW, QLE2672-1P-RW, LPe16002B-2P-RW and LPe16002B-1P-RW.


Performance Comparison Summary

Performance Attribute: Maximum IOPS
Results: LPe16000B is 7x better than the QLE2672.
Implication: The IOPS an HBA can support impacts the workload that can be performed per server. More IOPS per HBA equals better server utilization.

Performance Attribute: IOPS/server watt efficiency
Results: LPe16002B is 11x better than the QLE2672.
Implication: HBAs can significantly impact server power draw. By choosing an HBA that enables the lowest server power draw, meaningful power (and cost) savings can be achieved.

Performance Attribute: MBPS
Results: LPe16002B achieves nearly full line rate for SQL Server and Oracle database read workloads (4K and 8K), whereas the QLE2672 underperforms, not achieving line rate until the 16K block size.
Implication: As most enterprise applications use either 4K or 8K block sizes, this is an important limitation.

Performance Attribute: % CPU Utilization
Results: LPe16002B is 16x better than the QLE2672.
Implication: HBAs that have better CPU efficiency use fewer CPU resources for processing I/O and therefore leave more resources available for application processing. This enables servers to be utilized more efficiently, ultimately enabling more applications to run per server.


LPe16002B Performance Running at 8Gb

At the time of this testing, December 2012, Gen 5 FC was still in the early stages of adoption, but because of its backward compatibility, Gen 5 FC HBAs can be deployed in 8Gb and 4Gb FC SAN infrastructures. We found that many of the LPe16002B’s high-performance attributes carried over well to an 8Gb FC SAN, where it significantly outperformed both the QLE2672 (16Gb FC HBA running at 8Gb) and the QLE2562 (8Gb FC HBA).

The LPe16002B achieved over 922K IOPS running at 8Gb with a 512-byte block size, while both the QLE2672 and QLE2562 were limited to 200K IOPS. The same IOmeter parameters were used for the tests running at 8Gb as for the previous tests running at 16Gb.

Chart: Maximum Read IOPS at 8Gb FC link speeds. The x-axis is block size (512 bytes through 1 MB) and the y-axis is IOPS (0 to 1,000,000). Series: LPe16002B (16G@8G), QLE2672 (16G@8G) and QLE2562 (8G@8G).


Summary and Conclusion

The purpose of this report was to compare the Emulex Gen 5 Fibre Channel and QLogic 16Gb Fibre Channel adapters that are generally available as of this writing. The performance work published in the previous Demartek report did not use a generally available version of the QLE2672 and selectively used the Emulex LPe16002B (which was available 2 months prior to the QLE2672) and the older Emulex LPe16002 HBAs. In this evaluation, we found that the Emulex LPe16002B is a more reliable adapter with more than 10 million hours MTBF. In addition, the Emulex LPe16002B offers:

Superior server power efficiency in terms of IOPS per server watt in these tests.

Better overall performance with 7 times the IOPS in these tests.

Higher throughput at smaller block sizes in these tests.

Sixteen times better CPU efficiency in these tests.

Up to 3.16x faster 4K combined read/write operations and up to 2.29x faster 8K combined read/write operations in these tests. These block sizes are used by Oracle database (4K, 8K) and SQL Server (8K).

Such performance advantages offer significant benefits for most enterprise applications, including databases, business intelligence, virtualization and private clouds, backup, and many workloads with flash acceleration. This report discusses Emulex LPe16002B-series and QLogic QLE2672 HBAs that are generally available and shipping through the public reseller markets as of December 2012.

Emulex and LightPulse are registered trademarks of Emulex Corporation.

Intel is a registered trademark of Intel Corporation.

QLogic is a registered trademark of QLogic Corporation.

Demartek is a trademark of Demartek, LLC.

All other trademarks are the property of their respective owners.