oceanspace vtl3500v100r002 white paper

Upload: utopia-media

Post on 29-May-2018


  • 8/9/2019 Oceanspace VTL3500V100R002 White Paper


    Doc. code

    Oceanspace VTL3500 White Paper

    Issue 1.0

    Date 2010-05-18

    Huawei Symantec Technologies CO., LTD.


    Copyright Huawei Symantec Technologies Co., Ltd. 2009. All rights reserved.

    No part of this document may be reproduced or transmitted in any form or by any means without

    prior written consent of Huawei Symantec Technologies Co., Ltd.

    Trademarks and Permissions

    and other Huawei Symantec trademarks are trademarks of Huawei Symantec Technologies

    Co., Ltd. All other trademarks and trade names mentioned in this document are the property of their

    respective holders.

    Notice

    The purchased products, services and features are stipulated by the commercial contract made between Huawei Symantec and the customer. All or partial products, services and features described in this document may not be within the purchased scope or the usage scope. Unless otherwise agreed by the contract, all statements, information, and recommendations in this document are provided AS IS without warranties, guarantees or representations of any kind, either express or implied.

    The information in this document is subject to change without notice. Every effort has been made in the preparation of this document to ensure accuracy of the contents, but all statements, information, and recommendations in this document do not constitute the warranty of any kind, express or implied.

    Huawei Symantec Technologies Co., Ltd.

    Address: Building 1

    The West Zone Science Park of UESTC, No.88, Tianchen Road

    Chengdu, 611731

    P.R.China

    Website: http://www.huaweisymantec.com

    Email: [email protected]


    VTL3500 Product

    Description

    1.0 (2010-05-18)Huawei Symantec Proprietary and Confidential

    Copyright Huawei Symantec Technologies Co., Ltd. Page 3 of 25

    Contents

    1 Executive Summary

    2 Introduction

    2.1 Background of the VTL Technology

    2.2 De-duplication

    3 Solution

    3.1 Advantages of the VTL3500

    3.2 Powerful Virtualization Capability

    3.3 On-Demand Capacity Expansion

    3.4 IP Replication

    3.5 Tape Caching

    3.6 Tape Encryption

    3.7 De-duplication

    4 Experience

    4.1 Powerful Virtualization Capability

    4.2 On-Demand Capacity Expansion

    4.3 Application Scenario

    5 Conclusion

    6 Acronyms and Abbreviations


    1 Executive Summary

    As data volumes grow rapidly and market competition heats up, customers place higher requirements on the reliability and performance of data backup and recovery. The traditional physical tape library can no longer meet these requirements. With the development of virtual storage technology and the emergence of SATA hard disks, the virtual tape library (VTL) has become a mature and cost-effective kind of data backup device. The VTL uses disk arrays as the storage device and, through built-in virtualization software, presents the disks as a mainstream tape library. The VTL combines the high reliability, high performance, and ease of management of disk devices with the mature media management of tape devices. Therefore, the VTL has attracted more and more attention.

    Because SATA disks are advantageous in cost and performance, more and more users adopt disk-to-disk (D2D) backup to construct a fast and reliable backup system. The capacity of the disk backup device, however, tends to become insufficient as the data amount soars, and large amounts of duplicate data consume much of the capacity. Under this circumstance, de-duplication technology has emerged and become popular in recent years. De-duplication can greatly reduce the amount of data that needs to be stored. In addition, de-duplication can dramatically decrease the amount of data replicated between remote nodes, thus reducing bandwidth usage.

    This document introduces some key technologies of the VTL and analyzes their value for customers. These technologies include virtualization, on-demand capacity expansion, multi-stream backup, FC/IP SAN backup, remote IP replication, tape caching, and de-duplication.


    2 Introduction

    2.1 Background of the VTL Technology

    2.1.1 Deficiencies of the Physical Tape Library

    As informatization develops and data grows explosively, more and more users recognize the importance of data protection and purchase tape libraries and data backup software to construct their own data backup systems. With tape libraries, users can manage the media comprehensively and use the backup software to automate backup. Tapes are easy to preserve offline, and can be taken out of the physical tape library and transported to another site to implement remote disaster recovery. Users now find, however, that while the automated data backup system brings convenience, it also poses new problems that threaten the practicability of existing data backup solutions.

    Reliability

    Figure 2-1 shows the analysis of backup failures by IDC.

    Figure 2-1 Analysis of backup failures

    [Bar chart: "What are the most common causes of a backup failure?" Percent of all users (multiple responses accepted), N = 222. Categories: Media Failure, Hardware Failure, Human Error, Software Failure, Network Failure, Other, Don't Know. Bar values, largest to smallest: 59%, 53%, 47%, 40%, 32%, 3%, 3%.]

    A tape library consists of mechanical parts. The tape drive is rated for hundreds of thousands of hours of operating life, but in practice it often becomes faulty within one or two years. The robot of the tape library has a high fault probability. A large proportion of the users of low-end and mid-range tape libraries suffer at least one backup failure due to a fault of the tape library. The tape library is vulnerable to failures resulting from the external environment, such as dust and moisture. The combination of components degrades the overall system availability.

    The tape library is not fault tolerant. A single failure of the tape drive, tape slot, robot, controller, barcode scanning system, or tape loading and ejecting device can make the whole tape library run abnormally and even bring down the whole backup system. The low availability raises the maintenance cost. According to statistics, in 2002 the average yearly maintenance cost of tape libraries accounted for 10% to 15% of the procurement cost. What troubles users more is that tape libraries must be repaired by professionals, and the long repair period disrupts daily operation. That compels users to purchase multiple tape drives, which are the most expensive parts of a tape library. As a result, users' total cost of ownership (TCO) increases.
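    The effect of combining components can be illustrated with a short calculation. The sketch below assumes a simple series model in which the library is available only when every component is available; the component names follow the text, but the availability figures are hypothetical and are not taken from this document.

```python
from math import prod


def series_availability(component_availabilities):
    """Availability of a system that is down whenever any one component is down."""
    return prod(component_availabilities)


# Hypothetical per-component availabilities, for illustration only.
components = {
    "tape drive": 0.99,
    "robot": 0.98,
    "controller": 0.995,
    "barcode scanning system": 0.995,
}

overall = series_availability(components.values())
print(f"overall availability: {overall:.4f}")
```

    Under this model the overall figure is always lower than that of the weakest component, which is why adding mechanical parts lowers the availability of the library as a whole.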

    To improve the reliability of tape-based storage, many users replicate tapes to keep dual backups of data. This time-consuming and labor-intensive method brings extra operation costs. In essence, backup itself is not the objective; a backup only counts when it can ensure data recovery. The reliability of the backup media determines the reliability of the backup data. Tapes are exposed to the air and vulnerable to electromagnetism, dust, moisture, magnetic particles, sticking, and mold. Users sometimes find the tapes already damaged when they start data recovery.

    Performance

    As service requirements grow, each system requires a shorter backup window. The performance bottleneck of tape devices lies in data reading and writing and also in tape loading, which sometimes takes more time than the reading and writing itself. If data on multiple tapes needs to be recovered, a complete system recovery takes a long time. To back up more data in a shorter time, users need to install more tape drives in their tape libraries. That means higher expenses, higher fault probabilities, and higher investment when the tape technology is updated. In fact, the design of the tape library limits the number of tape drives that can be added.
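    A rough estimate shows how drive count and tape loading both enter the backup window. The model below is a deliberate simplification (all drives stream at the same rate, and tapes are loaded in rounds of one tape per drive); the function name and every number in the usage line are hypothetical.

```python
import math


def backup_hours(data_gb, drives, drive_mb_per_s, tapes, load_seconds_per_tape):
    """Rough backup duration: streaming time across all drives plus tape load time."""
    streaming_s = data_gb * 1024 / (drives * drive_mb_per_s)
    # Tapes are loaded in rounds, one tape per drive per round.
    loading_s = math.ceil(tapes / drives) * load_seconds_per_tape
    return (streaming_s + loading_s) / 3600


# Hypothetical example: 2000 GB, 80 MB/s per drive, 10 tapes, 60 s per load.
for n in (2, 4):
    print(n, "drives:", round(backup_hours(2000, n, 80, 10, 60), 2), "hours")
```

    Doubling the number of drives roughly halves the estimate, which matches the point made above: a shorter window on tape is bought with more drives.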

    Scalability

    On the one hand, the data amount increases ceaselessly; on the other hand, the expansion space of the tape library is limited. If the user purchases a large tape library (with over 200 slots, for example), the procurement cost is very high even if a relatively low configuration is chosen.

    Return on Investment

    As the data amount increases, each system requires a shorter backup window. With current backup systems, data backup and recovery take more and more time. Consequently, users are required to increase the performance and capacity of their existing tape libraries. The results, however, are higher hardware costs, more difficult media management, higher software costs, higher fault probabilities, and higher maintenance costs. Moreover, the return on investment is reduced because of the low utilization of tapes and tape libraries, the high maintenance costs of tape libraries, and the short lifecycle of tape drive technologies.


    Eventually, users find that the investment in data protection is higher than expected and the return is far lower than expected, and that the backup system itself increases the workload of maintaining the whole storage system. That has become a common problem for many organizations.

    2.1.2 Disk-based Backup

    Faced with the preceding problems, some users and consultants have turned their attention to disk-based backup. As SATA disks become popular, large-capacity disks offer a low price and high performance.

    Against this background, backup solutions based on disk arrays have emerged. They are realized in the following ways:

    Employing disk arrays with standard FC, SAS, or iSCSI interfaces and connecting high-capacity, low-cost SATA disks directly to the backup server

    Using the space of NAS for backup

    Adopting the mainstream backup software that supports disk-based backup

    This type of backup solution uses disks, formatted into file systems, as the storage device. It solves many problems found in the tape-based solution:

    Eliminating the reliability limitations of the tape library and media

    Avoiding the effect of tape loading and unloading on performance (the sequential read/write performance equals or exceeds that of mid-range tape libraries)

    Increasing the utilization of storage space greatly

    Facilitating maintenance and reducing the maintenance cost (disk arrays are common and can be easily managed by administrators who do not have specialist knowledge)

    Theoretically, the investment is low, for the user only needs to purchase one storage array. In practice, however, the user finds that this backup solution based on disk arrays is not perfect. It is disadvantageous in the following aspects.

    Sharing

    If the user implements LAN-free backup in a multi-server environment, the complexity and cost of configuration increase.

    Generally speaking, the backup software can identify and use a disk array only after a file system is set up on it. Moreover, most file systems cannot be shared by multiple servers, whereas a tape library can.

    That is to say, if the user wants multiple servers to share the same storage array over a SAN, as they would share a tape library, the user must set up multiple logical devices in the storage array and assign one logical device to each backup server.

    The user consequently faces a series of management problems:

    How to determine the number of disks assigned to each server?

    How to expand the capacity online when the allocated capacity is insufficient?


    How to reduce the capacity online when the allocated capacity is excessive?

    Must this function be realized through expensive volume management software? Some types of backup software support a backup storage pool, but they can only share data within the same platform and not across platforms. Moreover, this function requires additional data sharing options and supports a limited number of platforms. The function still needs to be improved.

    Security

    This type of storage device is simply based on disk arrays and appears as a file system on the server. This file system can be operated on by any tool and accessed by any user. One unintentional rm -r or del * command can destroy all backups. In short, the backups are as vulnerable as any other files in the file system. That means many risks:

    Will data be lost due to misoperations by the administrator or malicious deletion by others?

    Will data be copied by others and recovered on another computer, thus leaking confidential information?

    Will viruses make the backup data unusable for data recovery?

    Performance

    First, the file system itself may be a performance bottleneck, especially when it processes multiple tasks and processes.

    Second, the file system cannot solve the problem of disk fragmentation. Fragmentation degrades the performance of the file system, and it can hardly be resolved when a large amount of data is processed.

    Function

    The backup management software is specially designed for tape libraries. Currently, most types of backup software support the use of disk arrays as the backup device, but the functions differ from those available with tape. These differences can cause some serious problems:

    The current backup policy of the existing backup environment must be changed. Seamless integration cannot be achieved.

    Hardware data compression is unavailable with disk-based backup. The backup performance or storage space cannot be optimized effectively.

    The data backups saved on disk arrays cannot be copied to removable media for remote data storage. Therefore, the flexibility advantages of tape, such as offline storage, data migration, and remote disaster recovery, are lost.

    According to the preceding analysis, using disk arrays as the backup device solves some problems found in tape libraries, but it also brings new problems, which are more difficult to overcome.

    In fact, disk arrays used as the backup device are mostly restricted to serving as the cache for tape-based backup. This function is supported by the mainstream backup software, such as the Disk Staging of VERITAS NetBackup and the Disk Backup Option of Legato NetWorker. That is to say, the backup operation is implemented on disks within the time window, and then the data is migrated from the disks to tapes in the background. This solution also has the preceding problems. Users must still rely on tape libraries for data storage. This solution is only a supplement to disk-based backup and is used for accelerating backup and recovery.

    Management

    Disk-based backup is one of the functions of the backup software. Backup software from different vendors implements disk-based backup in different ways, and no universal standard exists. As a result, in an environment with multiple backup systems, users cannot realize centralized backup management or protect their investment.

    2.1.3 VTL Function Provided by the Backup Software

    At present, some types of backup software have a VTL function, such as the Virtual Disk Library of BakBone NetVault. A VTL software module is installed on the backup server, and through it part of the storage space of the backup server is virtualized into a tape library.

    The solution is easy to implement and also cheap. It provides the basic VTL function and partly solves the performance problem of tape-based backup. Some users have started to adopt it.

    This solution, however, has some obvious disadvantages in sharing, management of LAN-free backup, security, and the high consumption of backup server resources. In short, the solution can only be considered a supplement to the disk-based backup method and is mainly used as the cache of tape-based backup. It cannot work independently of tape libraries.

    According to the preceding analysis, various problems arise whether data is backed up through physical tape libraries, disk arrays, or the VTL function of the backup software. The VTL technology can solve these problems effectively.

    2.2 De-duplication

    As the Internet develops, large organizations, governments, and financial institutions have ever-growing data centers. The increasing requirement for storage space drives up the storage cost. IT personnel must deal with three top issues: saving energy, reducing power consumption, and lowering the system cost. As a popular technology in the storage field, de-duplication addresses these issues.

    De-duplication is developed to reduce the space occupied by duplicate data and thus lower costs and energy consumption. When adopting de-duplication technology, the user must consider the following factors:

    Effect of de-duplication on the backup performance

    De-duplication ratio

    Efficiency of remote replication

    Total benefits

    Scalability
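    The de-duplication ratio in the list above can be made concrete with a small calculation. The sketch below is not the VTL3500's algorithm, which this document does not describe; it assumes fixed-size chunks identified by SHA-256 fingerprints, and the function name and chunk size are illustrative choices.

```python
import hashlib


def dedup_ratio(data: bytes, chunk_size: int = 4096) -> float:
    """Ratio of logical chunks to unique chunks under fixed-size chunking."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    if not chunks:
        return 1.0
    # Chunks with the same fingerprint are stored only once.
    unique = {hashlib.sha256(chunk).digest() for chunk in chunks}
    return len(chunks) / len(unique)


# Ten identical 4 KB chunks need the space of only one.
print(dedup_ratio(b"A" * 4096 * 10))
```

    A ratio of 10 means that ten units of backup data occupy one unit of disk. Real ratios depend heavily on the data and on how often full backups repeat unchanged content.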


    3 Solution

    The Oceanspace VTL3500 virtual tape library (hereinafter referred to as the VTL3500) is a backup solution developed by Huawei Symantec Technologies Co., Ltd. (hereinafter referred to as Huawei Symantec) for the low-end market. Through software, the VTL3500 presents SATA disk arrays to backup servers as a physical tape library. The VTL3500 provides high performance and supports seamless deployment. Moreover, the VTL3500 supports de-duplication and integrated backup software to reduce users' investment in the IT infrastructure.

    3.1 Advantages of the VTL3500

    3.1.1 VTL3500 vs. Physical Tape Library

    By using virtualization technology, the VTL3500 emulates the parts of a physical tape library. The robot, drives, tapes, and slots of the physical tape library exist only logically and do not need manual maintenance. This avoids inherent mechanical deficiencies such as tape positioning and tape errors, as well as the short service life that results from tapes being exposed to the air and vulnerable to electromagnetism, dust, moisture, magnetic particles, sticking, and mold. The costs of managing the media and maintaining the device decrease greatly, and the reliability of backup data increases. The VTL3500 stores backup data on high-speed disks protected by high-reliability RAID technologies. The VTL3500 improves the performance of backup and recovery, greatly shortens the time they take, provides high scalability, and increases the return on investment.

    3.1.2 VTL3500 vs. Disk-based Backup

    After the VTL3500 creates VTLs and assigns them to the backup servers, the backup servers recognize them as physical devices and share them with each other. As for the use and allocation of storage space, even when the libraries are shared between multiple servers, the user can create new tapes for multiple backup servers to use according to the sharing mechanism specified by the backup software. Therefore, the user does not need to worry about how to allocate proper space to different backup servers.

    With disk-based backup, the backup data is saved in the file system and can be accessed by any user or virus. Disk-based backup cannot prevent human misoperations, malicious destruction, or virus attacks. The VTL3500 simulates the data read/write method of physical tape and stores the backup data on the raw device. Thus, users cannot operate on the backup data directly and viruses cannot destroy the data. The VTL3500 solves the security problems found in disk-based backup.

    In disk-based backup, a file system needs to be created in the storage unit. Data is read and written first through the I/O interface of the file system and then through the I/O interface of the raw device that the file system invokes. During data transfer, the overhead of invoking the two interfaces degrades the system performance. In addition, the file system itself may be a performance bottleneck. The VTL3500 transfers data by reading and writing the raw device directly. This method fully utilizes the high speed of the raw device and increases the transfer efficiency.

    3.1.3 VTL3500 vs. VTL Module of the Backup Software

    As an extension of disk-based backup, the VTL module of the backup software cannot solve the management and security problems found in LAN-free backup. This module consumes a large amount of server resources and may even degrade the backup performance. In addition, this module cannot work independently of physical tape libraries to meet the backup requirements of complicated environments. The VTL3500 has independent hardware and functional components. It fully emulates physical tape libraries and works as a backup device independent of physical tape libraries. While the VTL3500 provides effective data backup and recovery, it hardly occupies any server resources.

    3.2 Powerful Virtualization Capability

    The VTL3500 can virtualize 16/64/128 tape libraries/tape drives and emulate more than 60 types of tape libraries and tape drives from mainstream vendors such as HP, IBM, and Quantum. The backup servers treat the VTL3500 the same as physical tape libraries. Therefore, the VTL3500 can be seamlessly deployed into an existing backup system that is based on physical tape libraries.

    3.3 On-Demand Capacity Expansion

    At the level of individual virtual tapes, the VTL3500 uses the Capacity-on-Demand technology. The user can set a small initial capacity for the virtual tapes. As more data is written to the virtual tapes, the VTL3500 automatically allocates more space to them. In physical tape libraries, media management wastes space (50% or more of the total space) because a large number of tapes cannot be fully written. Compared with physical tape libraries and disks, the VTL3500 increases the utilization of storage space dramatically.

    At the system level, the VTL3500 manages disks in the common way. New disks can be easily added to expand the capacity. Therefore, the user does not need to purchase a high configuration up front, as with tape libraries, and can add new disks incrementally as the data amount grows. Thus, the initial procurement cost is much lower than that of tape libraries. The cost of routine maintenance is also much lower, for disks are free of the various mechanical faults found in tape libraries.
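    The Capacity-on-Demand behavior described in this section can be sketched as a toy model. The class below is illustrative only: the class name, the 5 GB initial size, and the 5 GB increment are assumptions, not VTL3500 parameters.

```python
class VirtualTape:
    """Toy model of a virtual tape whose backing space grows as data is written."""

    def __init__(self, max_gb, initial_gb=5, increment_gb=5):
        self.max_gb = max_gb                # nominal size reported to the backup software
        self.increment_gb = increment_gb    # hypothetical growth step
        self.allocated_gb = min(initial_gb, max_gb)
        self.used_gb = 0

    def write(self, size_gb):
        if self.used_gb + size_gb > self.max_gb:
            raise ValueError("virtual tape is full")
        self.used_gb += size_gb
        # Allocate further increments only when the written data needs them.
        while self.allocated_gb < self.used_gb:
            self.allocated_gb = min(self.allocated_gb + self.increment_gb, self.max_gb)


tape = VirtualTape(max_gb=400)
tape.write(12)
print(tape.used_gb, tape.allocated_gb)
```

    In this model a tape that reports 400 GB to the backup software holds only 15 GB of disk after a 12 GB write, whereas a partially written physical cartridge occupies a whole slot regardless of how much data it holds.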


    3.4 IP Replication

    Replication is a common technology used for disaster recovery. Data replication refers to using data replication software to copy data from one medium onto another and thus generate a data copy.

    Traditional disaster recovery generally relies on transportation. The backup software copies data onto tapes in a physical tape library, and the tapes are transported to a remote place for preservation. During transportation, tapes may get lost or damaged; thus, the effect of disaster recovery cannot be ensured.

    Over an IP network, the local VTL3500 copies data on virtual tapes to the remote VTL3500. In this way, the VTL3500 utilizes the convenience and high speed of the network and saves the transportation cost. The local VTL3500 encrypts the tape data before data transfer, and the remote VTL3500 decrypts the data after receiving it. As a result, data security during transfer is ensured.

    The VTL3500 provides four options for the IP replication:

    Remote Copy

    Auto Replication

    IP Replication

    Replication upon De-duplication

    Among the four options, three support automatic replication and one supports manual

    replication. Table 3-1 lists the four options of IP replication.

    Table 3-1 Four options of IP replication

    Option                           Type       Description
    Auto Replication                 Automatic  When a virtual tape is exported from the VTL, the system automatically copies the data on the virtual tape to another VTL3500.
    Remote Copy                      Manual     The data on a virtual tape is copied to another VTL as required.
    IP Replication                   Automatic  Within the specified interval and according to the user-defined policy, the changed data on the primary virtual tape is copied to the same or another VTL.
    Replication upon De-duplication  Automatic  When the de-duplication function is enabled, the deletion policy is integrated with the replication policy. The changed data is copied to another VTL3500 according to the replication policy.

    These four options differ mainly in the replication triggering mechanism.

    Auto Replication is triggered by the backup software. If the VTL is set to Auto Replication, the replication of a virtual tape is triggered when the VTL receives the eject command from the backup software. (For a physical tape library, the eject command means ejecting the tape from the library; for a virtual tape library, it means moving the virtual tape into the virtual vault.)

    Remote Copy is triggered manually. The user can copy the data on the selected virtual tape to the VTL3500 in the disaster recovery center. The VTL3500 in the disaster recovery center allocates space equal to that of the source tape to the target tape and sets the same barcode. When the copy is complete, the system automatically promotes the target tape to the virtual vault of the remote VTL3500 for future use. Through the Remote Copy function, the whole virtual tape can be copied to the remote VTL3500 without creating a new virtual tape there first. Before the copy, no virtual tape in the remote VTL3500 may have the same name as a virtual tape in the local VTL3500.

    IP Replication is triggered based on the policy.

    The policy can be:

    Data increment-based replication.

    The VTL3500 can identify the amount of data backed up to the tape each time. If the data increment exceeds the preset threshold, replication is triggered automatically after the copy.

    Time point-based replication.

    The user can specify the time point of the first replication and the replication interval for each virtual tape. The data on the virtual tape is then copied according to this schedule. A remote virtual tape that adopts IP Replication must be promoted manually before use.

    Replication upon De-duplication is manually triggered based on the policy.
    The triggering condition can be a specific date or time point, or the completion
    of the backup operation. The local VTL3500 transfers the de-duplicated data to
    the remote VTL3500 over an IP network. Because de-duplication has already been
    performed, data blocks rather than the complete data are transferred during IP
    replication. Bandwidth occupation decreases and transfer efficiency increases.
    As a result, remote data-level disaster recovery can be implemented at low cost,
    with easy deployment and high efficiency.
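
    As an illustration only, the two IP Replication policies described above (data
    increment-based and time point-based) can be sketched in Python. The class names,
    fields, and units are hypothetical; the paper does not describe the internal
    interfaces of the VTL3500.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class IncrementPolicy:
    """Hypothetical model: replicate once new data exceeds a preset threshold."""
    threshold_bytes: int

    def should_replicate(self, bytes_since_last_replication: int) -> bool:
        return bytes_since_last_replication > self.threshold_bytes


@dataclass
class TimePointPolicy:
    """Hypothetical model: first replication at a time point, then at a fixed interval."""
    first_run: datetime
    interval: timedelta

    def next_run(self, now: datetime) -> datetime:
        """Return the first scheduled time point at or after `now`."""
        if now <= self.first_run:
            return self.first_run
        elapsed = now - self.first_run
        # Ceiling division: whole intervals needed to reach or pass `now`.
        periods = -(-elapsed // self.interval)
        return self.first_run + periods * self.interval
```

    For example, a policy with a first run at 01:00 and a 24-hour interval, queried at
    02:00 on the following day, yields 01:00 on the day after that.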

    The remote IP replication has the following scenarios:

    One VTL3500 copies data to the remote VTL3500.

    Figure 3-1 Networking of one-to-one remote disaster recovery

    Multiple VTL3500s copy data to the remote VTL3500.

    Figure 3-2 Networking of many-to-one remote disaster recovery

    3.5 Tape Caching

    Tape Caching is an advanced function of the VTL3500. This function uses the
    high-speed VTL3500 as a cache for the physical tape library. The backup data is
    written to the VTL3500 first. After the backup operation is complete, the VTL3500
    migrates the backup data to the physical tape library according to the preset
    policy. In this way, a hierarchical storage architecture is formed.

    The VTL3500 can shorten the backup window and recover data quickly, whereas
    physical tape libraries are suitable for large-capacity offline data. Therefore,
    the VTL3500 can be combined with physical tape libraries to implement hierarchical
    storage.

    The principles of the hierarchical storage include:

    The data that needs to be archived for a long time is stored on the physical tape
    libraries.

    The frequently used data is stored in the VTL.

    The VTL takes over the physical tape libraries.

    Physical tape libraries have a slow backup speed, and disks are unsuited to the
    long-term storage of seldom-accessed data. Hierarchical storage offsets the
    shortcomings of both physical tape libraries and disks.

    Figure 3-3 shows the networking of the hierarchical storage.

    Figure 3-3 Networking of the hierarchical storage

    Data can be recovered directly from the VTL or the physical tape library. To fully
    utilize the high-speed cache, the VTL3500 provides various migration triggering
    policies and space reclaiming policies.

    3.5.1 Data Migration Policies

    Tape Caching provides two policies for triggering data migration between the
    VTL3500 and the physical tape library: 1) time-based migration; 2) intelligent
    migration. Table 3-2 and Table 3-3 list the two policies.

    Table 3-2 Time-based migration policy

    Policy Name                      Description
    Certain time point each day      Migration is performed in a one-day cycle. The VTL3500
                                     starts data migration at the specified time point each
                                     day.
    Certain time point each week     Migration is performed in a one-week cycle. The VTL3500
                                     starts data migration at the specified time point each
                                     day from Monday to Saturday.

    Table 3-3 Intelligent migration policy

    Policy Name                      Description
    And/Or                           Conjunction/disjunction of the intelligent policy. "And"
                                     means that migration is triggered only when all
                                     conditions are met; "Or" means that migration is
                                     triggered when any condition is met.
    Data storage period              Migration is triggered when the backup data has been
                                     stored on the VTL3500 for a specified period.
    Watermark                        Migration is triggered when the usage of the disk space
                                     of the VTL3500 reaches 90%.
    After backup                     Migration is triggered after each backup. "Tape space
    (tape space used out)            used out" is an additional policy for "After backup". If
                                     both options are chosen, the VTL3500 checks the usage of
                                     a virtual tape when it is ejected from the tape drive.
                                     If the space of this tape is used up, migration is
                                     triggered.
    Postponed to a certain           Migration is postponed to a specific time point after
    time point                       the condition is met. This policy must be used together
                                     with the preceding three policies: when the condition of
                                     any preceding policy is met, migration can be postponed
                                     to a specific time point.

    The time-based migration policy and the intelligent migration policy cannot be used
    simultaneously. Within the time-based migration policy, "Certain time point each day"
    and "Certain time point each week" cannot be used at the same time; the user can
    select only one of them as the condition for triggering migration. Multiple options of
    the intelligent policy can be chosen simultaneously and combined to meet different
    migration requirements.
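
    The way the intelligent options in Table 3-3 combine under And/Or can be sketched
    as follows. This is a simplified model written for illustration: the class and field
    names are hypothetical, the 90% watermark is taken from the table, and the
    "Postponed to a certain time point" option is left out.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class TapeState:
    """Hypothetical snapshot of the values the policy checks."""
    stored_since: datetime   # when the backup data was written to the VTL
    disk_usage: float        # used fraction of the VTL disk space, 0.0 to 1.0
    backup_finished: bool    # a backup onto this tape has just completed
    tape_full: bool          # the virtual tape has no free space left


@dataclass
class IntelligentPolicy:
    combine_with_and: bool = False              # And/Or option
    storage_period: Optional[timedelta] = None  # "Data storage period"
    watermark: bool = False                     # 90% disk usage, per Table 3-3
    after_backup: bool = False                  # "After backup"
    require_tape_full: bool = False             # "Tape space used out"

    def triggered(self, tape: TapeState, now: datetime) -> bool:
        results = []
        if self.storage_period is not None:
            results.append(now - tape.stored_since >= self.storage_period)
        if self.watermark:
            results.append(tape.disk_usage >= 0.90)
        if self.after_backup:
            hit = tape.backup_finished
            if self.require_tape_full:
                hit = hit and tape.tape_full
            results.append(hit)
        if not results:
            return False  # no option selected, nothing to trigger
        return all(results) if self.combine_with_and else any(results)
```

    With a storage period of two days and the watermark both selected, a tape stored
    for three days on a half-full VTL triggers migration under "Or" but not under "And".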

    3.5.2 Space Reclamation Policy

    To fully utilize the cache, the VTL3500 provides two space reclamation policies to
    ensure the space utilization: 1) intelligent reclamation; 2) reclamation upon
    de-duplication. Table 3-4 lists the reclamation methods.

    Table 3-4 Reclamation methods

    Policy Name                      Description
    Intelligent reclamation          The space occupied by the virtual tapes of the VTL3500
                                     used as the cache is reclaimed. That is, the data on
                                     these virtual tapes is deleted and only the indexes to
                                     the physical tapes are retained.
    Reclamation upon                 Through the de-duplication algorithm, the duplicate data
    de-duplication                   is deleted to release the storage space of the VTL3500.

    Table 3-5 lists the methods of triggering space reclamation.

    Table 3-5 Methods of triggering space reclamation

    Policy Name                      Description
    Immediate reclamation            After the migration is complete, the space originally
                                     occupied by the migrated data is reclaimed.
    Watermark                        When the remaining disk space accounts for less than 10%
                                     of the total space, the space originally occupied by the
                                     migrated data is reclaimed. This trigger method is
                                     available only under intelligent reclamation.
    Storage period                   When the backup data has been stored on the VTL3500 for
                                     a specified period, the space occupied by the backup
                                     data is reclaimed.

    Users do not need to worry about data loss. The VTL3500 reclaims only the space
    originally occupied by the migrated data. The space occupied by other data is not
    reclaimed. Thus, data security and consistency are ensured.
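
    A minimal sketch of the watermark trigger under intelligent reclamation, assuming a
    simple in-memory model: only tapes already migrated to physical tape lose their cached
    data, and only when free space falls below 10% of the total. The VirtualTape type and
    the function name are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class VirtualTape:
    barcode: str
    size_gb: int
    migrated: bool       # True once the data is on a physical tape
    cached: bool = True  # True while the data still occupies VTL disk space


def reclaim_on_watermark(tapes, total_gb, low_free_fraction=0.10):
    """Free the cache space of migrated tapes when free space is below the watermark.

    Returns the barcodes of the tapes whose space was reclaimed.
    """
    used_gb = sum(t.size_gb for t in tapes if t.cached)
    free_fraction = (total_gb - used_gb) / total_gb
    if free_fraction >= low_free_fraction:
        return []
    reclaimed = []
    for tape in tapes:
        if tape.cached and tape.migrated:
            # Data is dropped from the cache; the index to the physical tape remains.
            tape.cached = False
            reclaimed.append(tape.barcode)
    return reclaimed
```

    Data that has not yet been migrated is never touched by this routine, which mirrors
    the guarantee stated above.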

    3.6 Tape Encryption

    To ensure the security of the data stored on tapes, the VTL3500 encrypts tapes when
    data is transferred to physical tape libraries.

    Figure 3-4 Tape encryption

    The tape encryption function of the VTL3500 uses the 128-bit Advanced Encryption
    Standard (AES) algorithm. The user can create one or more tape keys to encrypt the
    data exported to physical tapes and decrypt the data imported to virtual tapes. The
    data on the tape library is inaccessible unless the correct key is used to decrypt
    it. Moreover, the user can set a password for each key. Only when the correct
    password is provided can the key name, password, and password hint be changed, or
    the key be deleted or exported.

    When data is exported to a physical tape library or transferred through IP
    replication, the user can apply a created key to encrypt the data, thus protecting
    the tape data. Even if tapes are lost or stolen, or data packets are intercepted in
    transit, the user does not need to worry about data security. Without the correct
    key, the data on the tapes is inaccessible.

    3.7 De-duplication

    The VTL3500 only saves one copy of the backup data in the Single Instance

    Repository (SIR). The redundant part of the original data is replaced by the index to

    the single instance. This index can be used to read and recover data.

    3.7.1 Types of De-duplication

    According to where it happens, de-duplication can be divided into front-end deletion

    and back-end deletion.

    Front-end deletion means that duplicate data is deleted on the backup servers. This
    type of deletion reduces the amount of transferred data and the bandwidth occupation.
    It has, however, the following disadvantages:

    Front-end deletion occupies the CPU resources of backup servers and degrades the
    backup performance. Many users consider it unacceptable because it lengthens the
    backup window.

    Front-end deletion has a low deletion ratio of duplicate data. De-duplication on a
    backup server can delete only the duplicate data on that server. That is, front-end
    deletion cannot remove data that is duplicated across servers.

    To implement front-end deletion, the user needs to replace the existing backup
    software and reconfigure the clients. As a result, the investment in the existing
    backup software is wasted and the current applications are affected.

    In contrast, back-end deletion happens on the storage device. This type of deletion
    cannot reduce the amount of data transferred between backup servers and storage
    devices, but it solves the three problems of front-end deletion listed above.
    Back-end deletion has no impact on backup servers, provides a higher deletion ratio
    of duplicate data, and requires no change to the existing backup network, which
    protects users' investment. According to when it happens, back-end deletion can be
    divided into in-line deletion and post-processing deletion.

    In-line deletion means that de-duplication works the instant data reaches the
    storage device. The de-duplicated data is then written to the storage media.

    Post-processing deletion means that de-duplication happens on the storage device
    after the backup operation is complete.

    In-line deletion degrades the backup performance, whereas post-processing deletion
    prolongs the time that the storage device spends processing the data.

    The VTL3500 adopts back-end post-processing de-duplication, which does not affect the
    backup performance. The VTL3500 provides a 20:1 deletion ratio of duplicate data and
    increases the utilization of the storage space. During remote replication, the
    VTL3500 transfers the de-duplicated data and reduces the bandwidth occupation. The
    VTL3500 provides a raw capacity of up to 24 TB. At a 20:1 deletion ratio, enabling
    the de-duplication function gives the user an effective capacity equivalent to
    480 TB (24 TB x 20). As a result, the return on investment increases greatly.
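
    The capacity figure above is simple arithmetic, shown here as a short sketch. The
    function name is hypothetical, and the 20:1 ratio is the paper's stated figure rather
    than a value that every data set will reach.

```python
def effective_capacity_tb(raw_tb: float, dedup_ratio: float) -> float:
    """Logical backup data that fits on raw_tb of disk at a given deletion ratio."""
    return raw_tb * dedup_ratio


print(effective_capacity_tb(24, 20))  # prints 480
```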

    3.7.2 Procedure for De-duplication

    When the conditions for triggering de-duplication are met, the VTL3500 scans the tape
    data, which includes the metadata and the file data to be backed up. The file data is
    the object to be processed. The VTL3500 partitions the file data into data blocks of
    the same size according to a specific algorithm, and performs de-duplication in the
    following steps.

    Figure 3-5 Initializing the tape data

    Figure 3-6 Processing procedure

    Step 1 Read the data blocks and calculate the index value (content identity) of each
    data block.

    Step 2 Compare the index value with all the values in the original index table.

    1) If the index value of the data block already exists in the index table, a data
    block with the same content already exists in the SIR. The VTL3500 deletes this data
    block and replaces it with a link to the SIR.

    2) If the index value of the data block does not exist in the index table, the
    VTL3500 saves the index value into the index table, saves the data block into the
    SIR, and generates a link that specifies the location of the data block in the SIR.

    Step 3 Repeat the preceding steps until all the data blocks are processed.

    ----End

    After all the data blocks are processed, the file data is extracted and added into
    the SIR. The file data zone on the tape saves only the links to the locations of the
    data blocks in the SIR. When the data needs to be accessed, the data blocks can be
    quickly read according to these links.
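
    The three steps above can be sketched in a few lines of Python. This is an
    illustration under stated assumptions, not the VTL3500 implementation: the paper
    names neither the block size nor the algorithm behind the index value, so a 4 KB
    block and SHA-256 are used here, and the SIR is modeled as a plain list.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size; the paper does not state one


def deduplicate(file_data: bytes, index_table: dict, sir: list) -> list:
    """Replace each block of file_data with a link into the SIR.

    index_table maps a block's index value to its position in sir.
    Returns the links that would remain in the tape's file data zone.
    """
    links = []
    for offset in range(0, len(file_data), BLOCK_SIZE):
        block = file_data[offset:offset + BLOCK_SIZE]
        # Step 1: calculate the index value (content identity) of the block.
        index_value = hashlib.sha256(block).hexdigest()
        # Step 2: look the value up in the index table.
        if index_value not in index_table:
            # New content: store the block once and record where it lives.
            index_table[index_value] = len(sir)
            sir.append(block)
        # Known or new, the tape keeps only a link to the single instance.
        links.append(index_table[index_value])
    return links


def restore(links: list, sir: list) -> bytes:
    """Rebuild the original file data by following the links."""
    return b"".join(sir[link] for link in links)
```

    Passing the same index_table and sir to successive calls models the shared
    repository: a block already stored by one backup is never stored again by a
    later one.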

    Figure 3-7 Tape data after de-duplication

    4 Experience

    The VTL3500 is a low-end VTL designed for small and medium businesses (SMBs).
    According to their requirements on data protection, the VTL3500 provides a series of
    technologies and functions to help SMBs solve the problems in data backup and
    disaster recovery. The VTL3500 increases the return on investment and reduces the
    TCO of the IT infrastructure.

    4.1 Powerful Virtualization Capability

    With its powerful virtualization capability, the VTL3500 can take over physical tape
    libraries and tape drives from most mainstream tape library suppliers, thus realizing
    hierarchical storage. The user obtains both the performance of disk-based backup and
    the long-term archiving capability of tapes.

    The VTL3500 can virtualize multiple tape libraries and tape drives at no extra cost.
    The user can assign each server a specific tape library that has its own tape drives,
    thus improving management and backup.

    The VTL3500 can be seamlessly deployed in an existing backup system without any
    change to the existing backup policies and configurations. The backup servers can
    manage the VTL3500 in much the same way as they manage physical tapes.

    4.2 On-Demand Capacity Expansion

    The on-demand capacity expansion function of the VTL3500 automatically allocates
    storage space to increase the utilization of the disk space. In physical tape
    libraries, by contrast, media management wastes space (50% or more of the total
    space) because a large number of tapes cannot be fully written.

    The VTL3500 expands capacity through the addition of disks. Therefore, the user does
    not need to purchase a large configuration up front, as with tape libraries, and can
    add new disks incrementally as the data amount grows. The initial procurement cost is
    thus much lower than that of tape libraries. The routine maintenance cost is also much
    lower, because disks are free of the various mechanical faults found in tape
    libraries. The VTL3500 uses SATA disks to provide a high capacity-price ratio without
    degrading reliability or performance. For most users, the VTL3500 requires a lower
    investment than physical tape libraries.

    4.3 Application Scenario

    The VTL3500 has the following four typical application scenarios.

    4.3.1 Integrated and Economical Backup

    The data on heterogeneous hosts is backed up to the VTL3500 over an FC/IP SAN. The
    performance of multi-host concurrent backup can reach up to 1.44 TB/h. The duplicate
    data deletion ratio (20:1) and compression ratio (2:1) of the VTL3500 increase the
    utilization of the storage space and meet the requirements of ever-increasing backup
    data.
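
    To relate the 1.44 TB/h figure to a backup window, a rough estimate can be computed
    as below. The function names are hypothetical, decimal units are assumed
    (1 TB = 1,000,000 MB), and the stated figure is a peak, so real windows are likely
    to be longer.

```python
def backup_hours(data_tb: float, throughput_tb_per_hour: float = 1.44) -> float:
    """Hours needed to back up data_tb at a sustained throughput."""
    if throughput_tb_per_hour <= 0:
        raise ValueError("throughput must be positive")
    return data_tb / throughput_tb_per_hour


def tb_per_hour_to_mb_per_second(tb_per_hour: float) -> float:
    """Convert TB/h to MB/s using decimal units."""
    return tb_per_hour * 1_000_000 / 3600
```

    At 1.44 TB/h, which is about 400 MB/s, a 7.2 TB backup set would need roughly five
    hours.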

    4.3.2 Hierarchical Backup

    The production data is backed up to the VTL3500 via backup servers. Through the
    auto archiving/tape caching function, the user can export the data to physical tape
    libraries, thus implementing data archiving. When the data needs to be recovered, the
    user can read the backup data directly from the VTL3500 to achieve a high recovery
    performance. The user can also read the archived data on the tapes of the physical
    tape libraries to implement data recovery.

    4.3.3 Remote Disaster Recovery

    The data center and the remote disaster recovery center are each deployed with one
    VTL3500. Through remote replication, data is copied to the disaster recovery center
    over the wide area network (WAN). The remote replication function of the VTL3500
    supports incremental replication. In addition, it supports the replication of
    de-duplicated data to reduce the bandwidth occupation. The VTL3500 can encrypt the
    data for remote replication to ensure the security of data transfer.

    4.3.4 Distributed Backup

    For a multi-branch organization, one or more VTL3500s can be deployed according to
    the data amount of each node. For a branch that has a small data amount, data can be
    backed up to the data center over the WAN. For a branch that has a large data amount,
    data can be first backed up to the local VTL3500 to achieve a high backup and
    recovery performance. Then, the data is copied to the data center through remote
    replication to back up and manage data in a centralized manner. The data center can
    be deployed with multiple VTL3500s that comprise a storage pool. The VTL3500 has a
    unified management interface, through which the resources in the storage pool can be
    centrally managed.

    5 Conclusion

    The VTL technology is indispensable to the storage market, because it provides a high
    backup/recovery performance and can be combined with physical tape libraries to
    implement hierarchical storage. De-duplication is an emerging storage technology
    that can help solve the problem of soaring costs due to explosive data growth.

    The VTL3500 developed by Huawei Symantec inherits the advantages of physical tape
    libraries and disk arrays. While eliminating the hardware deficiencies of tape
    libraries, the VTL3500 provides a higher backup/recovery performance than disk
    arrays, thus meeting the requirements for various backup windows.

    The powerful virtualization capability meets users' requirements for sharing backup
    devices.

    The on-demand capacity expansion and high duplicate data deletion ratio improve the
    utilization of the storage space, thus increasing the return on investment and
    reducing the TCO.

    The tape caching function can be used to easily deploy a hierarchical storage
    system and to implement automatic backup and hierarchical storage. At the same time,
    the tape encryption technology reduces the risk of data leakage and safeguards the
    archived data.

    Combined with the de-duplication technology and tape encryption technology, the IP
    replication function can be used to realize replication with low bandwidth
    occupation. The user can construct reliable and safe remote data-level disaster
    recovery with low investment.

    6 Acronyms and Abbreviations

    Table 6-1 List of Acronyms and Abbreviations

    Abbreviation Full Spelling

    VTL Virtual Tape Library

    SIR Single Instance Repository