Protecting Your Data with Remote Data Replication Solutions
2015 STORAGE NETWORKING INDUSTRY ASSOCIATION EUROPE
Fausto Vaninetti
SNIA Europe Board of Directors
(Cisco)
Table of Contents
Protecting Your Data with Remote Data Replication Solutions
Achieving Data Protection
RAID and RAIN
Local, Metro, Geo
Remote Data Replication for Fibre Channel-Based Disk Arrays
Advanced TCP/IP Stack
Optimization and Efficiency in IP-based Storage Replication Solutions
More To The Game
Summary
About SNIA Europe
September 2015
Protecting Your Data with Remote Data Replication Solutions
Achieving Data Protection
No one doubts that the amount of data being generated across the world is increasing exponentially. Data generated by organizations is stored, mined, transformed and utilized continuously. Data represents a critical component of the operation and function of organizations, and consequently data protection methodologies are required to avoid disruptions in business operations. In fact, every company should consider its data the second most valuable asset after its employees and should implement some form of data protection.
This paper examines some of the more common and effective data protection schemes in use today, offering a concise and simple-to-understand point of view. Remote data replication solutions are also covered in some technical detail.
RAID and RAIN
The first approach to data protection is typically the adoption of a disk array with an embedded mechanism known as Redundant Array of Independent Disks (RAID), a term dating back to 1988. In short, this is a data virtualization technology that combines multiple disk drives into a logical group for the purpose of data protection (and performance improvement as well). Data is distributed across the set of drives according to the desired RAID level schema, and a specific balance is achieved among reliability, performance and capacity.
RAID is categorized according to levels. The Common RAID Disk Data Format specification by SNIA defines a standard data structure describing how data is formatted across the disks in a RAID group for every RAID level. The primary levels include RAID 0 (striping without redundancy), RAID 1 (mirroring), RAID 5 (striping with single distributed parity), RAID 6 (striping with double distributed parity) and RAID 10 (mirrored stripes).
With RAID levels higher than 0, damage to individual disk sectors or the failure of one or more hard disks can be tolerated while still preserving data integrity. Data is not actually copied, but rather complemented with an amount of redundancy so that the original data can be reconstructed via appropriate mathematical algorithms even if a limited portion of it becomes unavailable due to failure. In order to improve performance without sacrificing fault tolerance, the use of a fast cache to front-end the RAID group has become the norm, both on servers and within disk arrays. This explains why RAID 5 and RAID 6 have become very popular implementations. It is also worth mentioning that solid state disks, rather than magnetic disks, are the new trend and, according to many analysts, represent the single biggest revolution in the storage industry in a long time.
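The reconstruction mechanism mentioned above can be illustrated with single-parity (RAID 5-style) protection, where the parity block is simply the bitwise XOR of the data blocks in a stripe. A minimal sketch in Python; the block contents are purely illustrative:

```python
# RAID 5-style parity: the parity block is the XOR of the data blocks,
# so any single lost block can be rebuilt from the surviving ones.
from functools import reduce

def xor_blocks(blocks):
    """XOR same-sized byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, chunk) for chunk in zip(*blocks))

data = [b"AAAA", b"BBBB", b"CCCC"]   # data blocks in one stripe
parity = xor_blocks(data)            # stored on the parity drive

# Simulate losing the second drive and rebuilding its block
# from the two surviving data blocks plus the parity block:
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
print(rebuilt == data[1])            # True
```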
A variation of the RAID approach has recently gained industry attention. This time, multiple copies of data are spread across multiple computational nodes. To maintain naming similarity with the previously described mechanism, this approach has been referred to as Redundant Array of Independent Nodes (RAIN), and it forms the basis of data protection within multiple commercially available implementations of big data Hadoop clusters and hyper-converged systems.
All of the above solutions are in wide use by virtually all organizations and do provide effective local data protection. Nevertheless, in order to protect against local site disasters, a copy of all data should also be stored in a properly identified alternative location. Whether it is your secondary datacenter or a third-party managed datacenter, remote data replication across a distance is the way to go.
Some organizations are also keen to keep copies of data on different physical media, in an effort to further minimize the chances of concurrent disruption related to the technology itself. When a time lapse of 24 hours between production data and its copy is acceptable, tape backups can also be used as another option for data protection. Tapes can store huge quantities of data at a fraction of the cost of disk arrays, consume negligible power and are compatible with the strictest security standards, which require that tape cartridges be stored inside underground bank vaults for long-term retention. Organizations tend to use tape backups as a complement to disk-based remote data protection solutions.
Local, Metro, Geo
The essence of data protection is to securely store multiple copies of data onto independent physical media. Doing this within a single datacenter is clearly a local solution. If something goes wrong with the facility (flood, fire, hurricane, power blackout, sabotage), data can be inaccessible or even completely lost. Having a copy of data in another location removes the criticality of a single-site disaster. Interest in data recovery solutions is well demonstrated by surveys of CIOs and further underlined by recent forecasts that indicate Disaster Recovery as a Service (DRaaS) as one of the fastest growing segments of the cloud business.
The secondary site has to be carefully chosen, outside the so-called "threat radius", so that the chances of any failure affecting both datacenters at the same time are negligible. As a result, distances above 300 km are the norm when looking for true protection from natural calamities or sudden and unforeseen major system failures.
Organizations with even higher requirements for data availability and uptime have now adopted the three-site approach, whereby twin datacenters are deployed within a short distance of each other and both of them are active at the same time to achieve business continuity. The third site is far away and is used for simple data recovery needs or true disaster recovery purposes. In this situation, failure of one of the twin datacenters will not prevent the business from remaining up and running. Applications will always be on, and no downtime will be required to recover them after the failure. Technically, this can be expressed as a Recovery Time Objective (RTO) equal to zero.
Within the twin datacenters, data is kept in sync and can be assumed to be identical in both sites. In fact, every write needs to be acknowledged by both storage arrays before being considered complete. This imposes a practical restriction on the maximum distance between the two locations, typically in the range of about 100 km. Longer distances would drive application performance down to unacceptable levels.
From the point of view of the synchronous replication software in use, it is actually better to consider an upper limit on round-trip latency rather than on distance. To some degree this depends on the vendor of choice, but a valid rule of thumb calls for 2 msec as the limit for these kinds of metro implementations. As a matter of fact, when round-trip latency exceeds 8 msec, the deployment is clearly a geographical implementation and data replication is achieved asynchronously: writes are considered complete when acknowledged by the local storage array alone, and data is then transferred to the remote disk array with a small delay. In other words, data in the two locations is slightly different and the copy lags the source. The acceptable temporal difference between them is called the Recovery Point Objective (RPO).
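The distance figures above follow directly from signal propagation: light travels through optical fiber at roughly 200,000 km/s, i.e. about 5 µs per kilometer one way. A rough sketch, using that common approximation and ignoring equipment and queuing delays:

```python
# Estimate round-trip latency from site separation over optical fiber.
# Assumes ~5 microseconds per km one way (speed of light in fiber),
# ignoring switch, router and transceiver delays.

US_PER_KM_ONE_WAY = 5.0

def fiber_rtt_ms(distance_km: float) -> float:
    """Round-trip propagation delay in milliseconds."""
    return 2 * distance_km * US_PER_KM_ONE_WAY / 1000.0

for km in (100, 300, 1000):
    mode = "synchronous" if fiber_rtt_ms(km) <= 2.0 else "asynchronous"
    print(f"{km:>5} km -> {fiber_rtt_ms(km):.1f} ms RTT ({mode})")
```

At 100 km the round trip is about 1 ms, comfortably within the 2 msec rule of thumb; at 300 km and beyond, only asynchronous replication remains practical.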
For both cost and technical reasons, nearly all geographical solutions rely upon the Internet Protocol (IP) for transport across the Wide Area Network (WAN). This is the true realm of disaster recovery, where both RPO and RTO values are above zero and, if disaster strikes, manual intervention is required to transition activity to the secondary site, with expected downtime while applications recover. There are many factors involved in choosing the correct remote data replication solution for any specific business, such as the amount of data that can be lost, the time taken to recover and the distance between sites.
Remote Data Replication for Fibre Channel-Based Disk Arrays
Fibre Channel (FC) has been the technology of choice for storage connectivity since the inception of storage networks. Even today, despite being well past the hype cycle and no longer prominent in the press, it still dominates over alternatives in terms of adoption for shared external disk arrays.
For Fibre Channel-based disk arrays, two main alternative approaches for remote data replication using IP are currently available.
The first one leverages dedicated IP replication ports on the disk array itself, whereby servers access their local data via FC fabrics but the remote connection between peer disk arrays goes straight to the IP network. Clearly, this method implies the availability of a sufficient number of native IP ports on the disk array, and this condition is not always met. The second option makes use of a multi-service appliance that not only provides local FC switching capabilities but also enables FC encapsulation within IP packets for optimized transmission through the Wide Area Network (WAN). A variation of this approach sees the same functionality hosted on a specific line-card within a highly available FC modular switch, known as a director.
In most, but not all, cases data transmission is unidirectional, from the production datacenter toward the disaster recovery site. Twin datacenters with active/active operation, or occasional data recovery situations, may require data to flow in the opposite direction as well.
Companies should carefully evaluate the range of technical solutions on the market by comparing them against decision criteria that include performance, security, flexibility, reliability, diagnostic tools and price. Price should not be the main decision factor, since a consistent disaster recovery project requires an overall level of investment that far exceeds the price of the data replication solution alone, whatever it may be. For large organizations with large storage environments, the adoption of IP replication ports on disk arrays may not be optimal. The number of IP replication ports that can be used concurrently on disk arrays is limited, and in any case lower than the number of 16G FC ports connected toward the production fabrics. This can potentially create a bottleneck, since the aggregate throughput of the 16G FC ports on most arrays far exceeds the capability of their native 10G IP counterparts.
For their flexibility and performance, as well as the capability to use a single remote data replication solution for multiple disk arrays from different vendors, Fibre Channel over IP (FCIP) encapsulation engines are the most widely used implementation to extend a Storage Area Network (SAN) across geographically separated datacenters. Moreover, FCIP is not limited to remote data replication: it supports other applications, such as centralized SAN backup and data migration over very long distances. As distances grow, it becomes impractical, or very costly, to rely upon native Fibre Channel connections, possibly transported over optical transmission equipment.
FCIP tunnels, built on a physical connection between two SAN extension switches or blades, allow Fibre Channel frames to pass through the existing IP WAN. The TCP connections ensure in-order delivery of Fibre Channel frames and lossless transmission.
The Fibre Channel fabric and all Fibre Channel targets and initiators are unaware of the presence of the IP WAN. The TCP/IP stack ensures that all data lost in flight is retransmitted and placed back in order before being delivered to upper-layer protocols. This is an essential feature to prevent SCSI timeouts for open systems-based replication. This stack is also capable of automatically and quickly adjusting the traffic rate on the WAN connection between user-defined minimum and maximum bandwidth values. In other words, a feedback mechanism ensures that the quality of the long-distance IP link dynamically affects the FCIP transmission rate, permitting optimal throughput for all flows. Evidently, the user-defined minimum bandwidth value should be carefully chosen so that it does not exceed the bandwidth available on the WAN link.
As a best practice, this minimum bandwidth should be available at all times, because the need for replication may arise at any time. This can be achieved either by specifically reserving bandwidth for FCIP or by having available bandwidth that far exceeds the current needs of all uses. Furthermore, whenever possible, adopt a reliable IP connection that drops very few packets, since the performance of FCIP, like that of any high-performance TCP connection, greatly depends on a low retransmission rate.
An enterprise-class remote data replication solution should excel in performance (achieved throughput, tolerated latency, packet drop handling), monitoring (port and flow visibility and statistics) and diagnostic capabilities (ping, traceroute, logging). The group of advanced features that are its main constituents starts with a sophisticated TCP/IP stack.
Advanced TCP/IP Stack
Although software implementations on top of a general-purpose processor are possible, the performance and reliability levels that disaster recovery projects impose are considerable. For that reason, most solutions use hardware-assisted implementations, where custom ASICs sustain the most demanding computational tasks such as compression and encryption.
A valid remote data replication solution should be able to operate in both asynchronous and synchronous mode. In the first case, the most typical one, distances up to 10,000 km should be supported to address the needs of multinational companies with datacenters on multiple continents. For synchronous replication, latency is a gating factor and extra care is required to minimize it. Not all solutions are equal in this respect.
Data needs to be encapsulated before transmission over a long-distance IP network. In general, the efficiency of the chosen transport method depends on its capability to reduce overhead by filling datagrams to the supported Maximum Transmission Unit (MTU), maximizing payload per unit of overhead.
The best approach is to use "frame batching", so that a stream of data frames (typically 4 or 64 of them) is worked on at the same time, compressed and fit into the available MTU size. When a single data frame is compressed and mapped to an Ethernet frame, wasted payload bytes cause inefficiency and, consequently, higher overhead for the same traffic. In this case, the bigger the MTU the better: jumbo frames up to 9000 bytes are preferred to the standard MTU of 1500 bytes when best performance is desired. Since a Fibre Channel frame can be up to 2148 bytes, and considering some margin for additional headers, an MTU size of 2300 bytes is the minimum recommended value to use.
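The effect of MTU size and batching on encapsulation efficiency can be estimated with a simple model. The header sizes below are simplified assumptions (20-byte IP plus 20-byte TCP per datagram, ignoring Ethernet and FCIP encapsulation overhead), so the numbers are indicative only:

```python
# Rough payload efficiency of encapsulating Fibre Channel frames into
# TCP/IP datagrams at different MTU sizes.
import math

FC_FRAME = 2148      # maximum Fibre Channel frame size in bytes
HDRS = 40            # assumed IP + TCP header bytes per datagram

def efficiency(mtu: int, batch: int = 1) -> float:
    """Payload bytes delivered per wire byte for `batch` FC frames."""
    payload = batch * FC_FRAME
    per_dgram = mtu - HDRS                   # payload room per datagram
    datagrams = math.ceil(payload / per_dgram)
    return payload / (payload + datagrams * HDRS)

print(f"MTU 1500, single frame : {efficiency(1500):.1%}")
print(f"MTU 2300, single frame : {efficiency(2300):.1%}")
print(f"MTU 9000, batch of 4   : {efficiency(9000, batch=4):.1%}")
```

Raising the MTU from 1500 to 2300 bytes lets a full FC frame fit into one datagram instead of two, and batching several frames into a jumbo frame reduces the per-frame overhead further still.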
Some implementations also incorporate a way to determine the smallest MTU along the path to the remote target, a feature known as Path MTU Discovery (PMTUD). PMTUD is described in RFC 1191 and works well with pure L3 networks. It is worth mentioning that the TCP Maximum Segment Size (MSS) is slightly smaller than the MTU, in order to accommodate the TCP and IP headers.
In the end, data frames need to go through the TCP/IP stack, and here is where some solutions may fall short of expectations due to technical trade-offs. On one side, a long-distance IP network poses challenges in terms of available bandwidth, available paths and packet drops. On the other side, data replication dislikes instability and variability and would prefer guaranteed bandwidth with no packet drops. Distance, and therefore latency, has a negative effect on throughput. Put simply, with standard TCP/IP, information transfer suffers the farther you go. This is because of the flow control mechanism that is part of the TCP protocol: link latency, and the wait for acknowledgment of each set of packets sent, prevent long, fat pipes from being efficiently utilized.
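The throughput penalty can be quantified: with a fixed transmission window, a sender can deliver at most one window of data per round trip. A sketch using the classic 64 KB TCP window, the maximum available without window scaling:

```python
# Why latency caps standard TCP throughput: with a fixed window, at most
# one window of data can be delivered per round trip, regardless of
# how fast the link itself is.

def max_throughput_MBps(window_bytes: int, rtt_ms: float) -> float:
    """Upper bound on throughput in MB/s for a given window and RTT."""
    return window_bytes / (rtt_ms / 1e3) / 1e6

# Classic 64 KB TCP window (no window scaling):
for rtt in (1, 10, 100):
    print(f"RTT {rtt:>3} ms -> {max_throughput_MBps(65535, rtt):8.2f} MB/s max")
```

At 100 msec of round-trip latency, a 64 KB window caps a link of any speed below 1 MB/s, which is exactly why WAN-optimized stacks rely on much larger windows.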
Optimization and Efficiency in IP-based Storage Replication Solutions
For these reasons, an efficient remote data replication solution cannot do without a purpose-built, WAN-optimized TCP/IP stack. Thanks to that, it becomes possible to achieve wire-rate transmission on high-speed links, with application throughput up to 1250 MBytes/s on 10 Gbps ports. It is also possible to overcome 100+ msec of latency on the WAN and tolerate excessive jitter, bit errors and a loss of 1 out of 2000 transmitted packets. Experience has shown that general-purpose WAN optimization devices cannot provide better performance than purpose-built remote data replication solutions; rather, they introduce complexity, another point of failure, and another asset to configure, manage, monitor and troubleshoot. Moreover, being general-purpose, they have no specific storage protocol awareness (FC, SCSI) and consequently fail to add real value to the solution.
Transmitting data over an IP network avoids the constraints and distance limitations suffered by native Fibre Channel links, whereby a buffer-to-buffer credit mechanism is used to make sure frames are not lost due to congestion between source and target. The burden of properly handling congestion, and flow control in general, is offloaded to the TCP layer and its native capabilities.
One of them is the transmission window size, dynamically adjustable in response to WAN conditions. In TCP, the amount of outstanding unacknowledged data needed to fully utilize a WAN connection is tightly associated with the Bandwidth-Delay Product (BDP), derived by multiplying link bandwidth by link round-trip time. A solid remote data replication solution will support a large BDP value, even in excess of 120 MB, and avoid any drooping effect over long, fat pipes.
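A back-of-the-envelope calculation shows where a figure on the order of 120 MB comes from; the link parameters below are illustrative:

```python
# Bandwidth-Delay Product: the amount of data that must be in flight
# (sent but not yet acknowledged) to keep a link fully utilized.

def bdp_bytes(bandwidth_gbps: float, rtt_ms: float) -> float:
    """BDP = bandwidth * round-trip time, converted to bytes."""
    return bandwidth_gbps * 1e9 / 8 * rtt_ms / 1e3

# A 10 Gbps WAN link with 100 ms of round-trip latency:
print(f"{bdp_bytes(10, 100) / 1e6:.0f} MB")   # 125 MB in flight
```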
Another optimization that comes in handy when more efficiency over the WAN is required is known as Selective Acknowledgment (SACK), described in RFC 2018. Although TCP has been a very robust and adaptable protocol since the very beginning, it has gone through several iterations to enhance its ability to perform in environments combining high latency with high bandwidth. The goal is to minimize TCP control traffic and allow the protocol to recover faster from dropped frames.
Standard TCP implements reliability by sending a cumulative acknowledgment for received data segments that are complete and in sequence. In case of packet loss, subsequent segments will not be acknowledged by the receiver, and the sender will retransmit all segments after the loss is detected. This behavior is quite inefficient, since it leads to retransmission of segments that were actually received successfully and provokes a sharp reduction in the congestion window size, so that subsequent transmissions happen at a slower rate than before. By using the SACK mechanism, a receiver is able to selectively acknowledge segments received after a packet loss. The sender then has the capability to retransmit only the lost segments and fill the holes in the data stream.
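The difference can be sketched with a toy model of the two recovery strategies (segment numbers are illustrative, and real TCP retransmission behavior is more nuanced than this):

```python
# Retransmission after a single lost segment: cumulative acknowledgment
# (go-back-N style) vs. Selective Acknowledgment (SACK).

def retransmit_cumulative(sent, lost):
    """Without SACK the sender resends everything from the first loss on."""
    first_loss = min(lost)
    return [s for s in sent if s >= first_loss]

def retransmit_sack(sent, lost):
    """With SACK the receiver reports the holes, so only those are resent."""
    return sorted(lost)

segments = list(range(1, 11))   # segments 1..10 in flight
dropped = {4}                   # segment 4 lost on the WAN

print(retransmit_cumulative(segments, dropped))  # [4, 5, 6, 7, 8, 9, 10]
print(retransmit_sack(segments, dropped))        # [4]
```

A single drop costs seven retransmitted segments under cumulative acknowledgment but only one under SACK, and the gap widens with larger windows.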
More often than not, organizations try to save money on the WAN connectivity service by enabling compression before sending data across. The simple idea is to transmit the same amount of data over a lower-bandwidth link. The compression engines are typically based on the well-known "deflate" algorithm described in RFC 1951, even if derivative implementations provide different trade-offs of throughput vs. compression ratio. The achieved results are very dependent on the data to be compressed, but a good implementation is normally capable of a 4:1 compression ratio for real data (not test data).
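The data dependence of deflate is easy to demonstrate with Python's zlib module, which implements RFC 1951; the sample payloads below are purely illustrative:

```python
# Deflate (RFC 1951) via zlib: the achievable compression ratio depends
# entirely on the data being compressed.
import os
import zlib

def ratio(data: bytes) -> float:
    """Original size divided by compressed size."""
    return len(data) / len(zlib.compress(data, level=6))

text = b"WRITE block 0042 to LUN 7; " * 400   # repetitive, compresses well
noise = os.urandom(len(text))                  # random data barely compresses

print(f"repetitive payload : {ratio(text):5.1f}:1")
print(f"random payload     : {ratio(noise):5.1f}:1")
```

Highly repetitive payloads compress far beyond 4:1, while already-compressed or encrypted data may even grow slightly, which is why quoted ratios should always be validated against real production data.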
Last but not least, with the ever-increasing amount of data generated across the globe, there is also a clear trend toward higher-speed remote replication solutions. Up to a couple of years ago, Gigabit Ethernet speeds were adequate for most companies; nowadays the sweet spot is certainly 10 Gigabit per second, with 40 Gigabit Ethernet looming as the next candidate for market adoption.
The TCP/IP implementation on enterprise-class remote data replication solutions is clearly optimized for carrying storage traffic. It can accommodate long, fat pipes, avoid the low-throughput slow-start behavior of normal TCP implementations and recover more quickly from packet loss, as described in several documents including RFC 1323 (window scaling) and RFC 5681 (slow start, congestion avoidance, fast retransmit and fast recovery). It also employs variable, per-flow traffic shaping that yields high instantaneous throughput while minimizing the possibility of overruns on downstream routers.
More To The Game
Security can be an added feature of the chosen implementation. By using 256-bit keys and hardware-assisted encryption engines compliant with the Advanced Encryption Standard (AES), high performance can be achieved despite the complexity of the algorithms. Various situations determine where it is best to apply encryption for data in flight: for example, if only disk-to-disk replication traffic needs this level of security, it can be advantageous to enable it on the dedicated remote data replication solution. If other traffic needs to be encrypted between the two datacenters, it is preferable to enable encryption on the datacenter exit routers, where wire-speed encrypted traffic on 100G ports is now possible. Alternative implementations on hosts, dedicated security engines or DWDM muxponders are also available, but they do not offer the same benefits in real-world deployments and are confined to more specific use cases.
Ideally, the same remote data replication solution will be capable of both open systems logical disk replication and mainframe volume replication, providing a consistent and homogeneous answer to both FC and FICON replication needs. This capability, sometimes referred to as multimodality or FC/FICON intermix, helps justify the investment in high-performance extension technologies, since it can now be leveraged across the enterprise to include mainframe volume replication and tape vaulting in addition to a variety of open systems disk replication solutions and tape libraries.
Large-scale storage deployments often require support for multimodality (disk, tape, open systems, mainframe), heterogeneous arrays, large bandwidth, high throughput, nonstop operations, tools for administration and configuration, and robust diagnostics. Some leading SAN extension solutions can accommodate all of these requirements and allow them to be managed by different administrator groups within an enterprise, using INCITS T11 Virtual Fabrics (VF) technology for logical partitioning and Role Based Access Control (RBAC) for user profiling and privilege assignment. This brings welcome multi-tenancy to storage area networks.
For high availability, it is also recommended to architect the overall solution in such a way that replication traffic can continue to operate during firmware upgrades and through single replication port or device failures. That is why link aggregation groups are configured and where equipment redundancy comes into play.
The remote data replication network can be incorporated into production FC fabrics or kept separate. Separation can be achieved logically or physically, using INCITS T11 Virtual Fabrics (VF) technology or dedicated devices. When physical separation is desired, the disk array will host onboard dedicated FC ports for replicas, connected to the SAN extension network. In small environments, however, the disk array will have a limited number of FC ports, all of them shared between production and replication traffic. In this case, the SAN extension appliance will need to provide specific functionality to avoid merging the SAN in the primary datacenter with the one in the secondary, so that issues at the remote site, or even on the WAN, will not negatively affect production traffic.
Now that migration from IPv4 to IPv6 addressing is underway in many datacenters, IPv6 compatibility is also a very reasonable requirement for any modern remote data replication solution. Where strong asymmetry in scale (and budget) between the two datacenters exists, there can also be a need to support mismatched speeds on the WAN ports at the two ends of the replication link, so that 10G is used in one location and 1G in the other. IP sub-interfaces and VLAN tagging are extra features that are sometimes required for a properly architected solution. Cloud providers and hosted managed service providers tend to make use of these capabilities when offering storage private clouds to their customers.
Most replication protocols today support unsolicited writes and thus require a single round trip to write data to a remote disk array. Where they do not, multiservice FCIP engines can provide ad-hoc acceleration capabilities to compensate. The industry has thus developed a wide range of specialized acceleration solutions, falling under the names of Write Acceleration, Read Acceleration, Tape Acceleration, Input/Output Acceleration and the like.
Summary
Organizations looking for a remote data protection solution across geographically separated datacenters can nowadays choose among a variety of options, including Fibre Channel-based disk array IP replication and multiservice appliances. Enterprise-class features and a low Total Cost of Ownership (TCO) represent valid decision criteria, just like multimodality and multi-tenancy. The ability to integrate into any IP network without special tuning considerations is enabled by an optimized TCP/IP stack and the resulting capability to handle glitches over the WAN. Ease of configuration and comprehensive management tools help provide insight and end-to-end visibility for proper performance assessment and troubleshooting. FCIP has emerged over alternative protocols, and for many years purpose-built FCIP devices have represented the preferred solution for remote data replication, especially for medium and large companies. Thanks to this technology, it is now possible to alleviate the distance barrier and achieve near-local replication performance, securely, over long distances.
About SNIA Europe
SNIA Europe advances the interests of the storage industry by empowering organizations to
translate data and information into business value by promoting the adoption of enabling
technologies and standards. As a Regional Affiliate of SNIA Worldwide, we represent storage
product and solutions manufacturers and the channel community across EMEA. For more
information, visit http://www.snia-europe.org/.