oceanspace vtl3500v100r002 white paper
8/9/2019 Oceanspace VTL3500V100R002 White Paper
Doc. code
Oceanspace VTL3500 White Paper
Issue 1.0
Date 2010-05-18
Huawei Symantec Technologies CO., LTD.
Copyright Huawei Symantec Technologies Co., Ltd. 2009. All rights reserved.
No part of this document may be reproduced or transmitted in any form or by any means without
prior written consent of Huawei Symantec Technologies Co., Ltd.
Trademarks and Permissions
and other Huawei Symantec trademarks are trademarks of Huawei Symantec Technologies
Co., Ltd. All other trademarks and trade names mentioned in this document are the property of their
respective holders.
Notice
The purchased products, services and features are stipulated by the commercial contract made
between Huawei Symantec and the customer. All or partial products, services and features
described in this document may not be within the purchased scope or the usage scope. Unless
otherwise agreed by the contract, all statements, information, and recommendations in this document are provided AS IS without warranties, guarantees or representations of any kind, either
express or implied.
The information in this document is subject to change without notice. Every effort has been made in
the preparation of this document to ensure accuracy of the contents, but all statements, information,
and recommendations in this document do not constitute the warranty of any kind, express or implied.
Huawei Symantec Technologies Co., Ltd.
Address: Building 1
The West Zone Science Park of UESTC, No.88, Tianchen Road
Chengdu, 611731
P.R.China
Website: http://www.huaweisymantec.com
Email: [email protected]
VTL3500 Product
Description
1.0 (2010-05-18) Huawei Symantec Proprietary and Confidential
Copyright Huawei Symantec Technologies Co., Ltd. Page 3 of 25
Contents
1 Executive Summary
2 Introduction
2.1 Background of the VTL Technology
2.2 De-duplication
3 Solution
3.1 Advantages of the VTL3500
3.2 Powerful Virtualization Capability
3.3 On-Demand Capacity Expansion
3.4 IP Replication
3.5 Tape Caching
3.6 Tape Encryption
3.7 De-duplication
4 Experience
4.1 Powerful Virtualization Capability
4.2 On-Demand Capacity Expansion
4.3 Application Scenario
5 Conclusion
6 Acronyms and Abbreviations
1 Executive Summary
As data volumes grow rapidly and market competition intensifies, customers place higher requirements on the reliability and performance of data backup and recovery. Traditional physical tape library technology can no longer meet these requirements. With the development of virtual storage technology and the emergence of SATA hard disks, the virtual tape library (VTL) has become a mature and cost-effective data backup device. The VTL uses disk arrays as the storage device and, through built-in virtualization software, presents the existing hard disks as mainstream tape libraries. The VTL combines the high reliability, high performance, and ease of management of disk devices with the mature media management of tape devices. As a result, the VTL has attracted more and more attention.
Because SATA disks offer advantages in cost and performance, more and more users adopt disk-to-disk (D2D) backup to build a fast and reliable backup system. The capacity of the disk backup device, however, tends to become insufficient as the data volume soars, and large amounts of duplicate data consume much of that capacity. De-duplication technology emerged in response and has attracted great interest in recent years. De-duplication can greatly reduce the amount of data that needs to be stored. It can also dramatically decrease the amount of data replicated between remote nodes, thus reducing bandwidth usage.
This document introduces some key VTL technologies and analyzes their value for customers. These technologies include virtualization, on-demand capacity expansion, multi-stream backup, FC/IP SAN backup, remote IP replication, tape caching, and de-duplication.
2 Introduction
2.1 Background of the VTL Technology
2.1.1 Deficiencies of the Physical Tape Library
As informatization has advanced and data has grown explosively in recent years, more and more users have recognized the importance of data protection and purchased tape libraries and data backup software to build their own data backup systems. With tape libraries, users can manage the media comprehensively and use the backup software to automate backup. Tapes are easy to preserve offline, and can be taken out of the physical tape library and transported to another site for remote disaster recovery. Users now find, however, that while the automated data backup system brings convenience, it also poses new problems that threaten the practicality of existing data backup solutions.
Reliability
Figure 2-1 shows the analysis of backup failures by IDC.
Figure 2-1 Analysis of backup failures
"What are the most common causes of a backup failure?" Percent of all users (multiple responses accepted), N = 222
Cause | Percent of all users
Media Failure | 59%
Hardware Failure | 53%
Human Error | 47%
Software Failure | 40%
Network Failure | 32%
Other | 3%
Don't Know | 3%
A tape library consists of mechanical parts. The tape drive boasts hundreds of
thousands of hours of operating life, but it often becomes faulty within one or two
years in practical use. The robot of the tape library has a high fault probability. A large proportion of the users of low-end and mid-range tape libraries suffer at least one backup failure due to a fault in the tape library. The tape library is vulnerable to failures caused by the external environment, such as dust and moisture. The combination of components degrades overall system availability.
The tape library is not fault tolerant. A single failure of a tape drive, tape slot, robot, controller, barcode scanning system, or tape loading and ejecting device can cause the whole tape library to run abnormally or even bring down the whole backup system. This low availability raises the maintenance cost. According to statistics, in 2002 the average yearly maintenance cost of tape libraries accounted for 10% to 15% of the procurement cost. What troubles users more is that tape libraries must be repaired by professionals, and the long repair period disrupts daily operation. This compels users to purchase multiple tape drives, which are the most expensive parts of a tape library. As a result, users' total cost of ownership (TCO) increases.
To improve the reliability of tape-based storage, many users replicate tapes to keep dual backups of their data. This time-consuming and labor-intensive method brings extra operating costs. In essence, backup itself is not the objective: a backup only counts when it can ensure data recovery. The reliability of the backup media determines the reliability of the backup data. Tapes are exposed to the air and are vulnerable to electromagnetic fields, dust, moisture, magnetic particles, sticking, and mold. Users sometimes find the tapes damaged when they attempt data recovery.
Performance
As service requirements grow, each system requires a shorter backup window. The performance bottleneck of tape devices lies not only in data reading and writing but also in tape loading, which sometimes takes more time than the reading and writing itself. If the data on multiple tapes needs to be recovered, a complete system recovery takes a long time. If users want to back up more data in a shorter time, they need to install more tape drives in their tape libraries. That means higher expenses, higher fault probabilities, and further investment whenever the tape technology is updated. Moreover, the design of the tape library limits the number of tape drives that can be added.
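The relationship between drive count, tape loading, and the backup window can be made concrete with a rough estimate. The following is a minimal sketch under stated assumptions: the function name `backup_hours` and all of its parameters are hypothetical, and any throughput or load-time values passed in are the reader's own figures, not measurements from this document.

```python
import math


def backup_hours(data_gb, drives, mb_per_s_per_drive, tapes, load_s_per_tape):
    """Rough backup-window estimate: streaming time plus tape load time.

    All inputs are hypothetical values supplied by the reader.
    """
    # Streaming time: the data is assumed to be spread evenly across the drives.
    stream_s = data_gb * 1024 / (drives * mb_per_s_per_drive)
    # Load time: each drive loads its share of the tapes one after another.
    load_s = math.ceil(tapes / drives) * load_s_per_tape
    return (stream_s + load_s) / 3600


# Doubling the number of drives halves the streaming time in this model,
# which is why users are pushed toward buying more tape drives.
one_drive = backup_hours(3600, 1, 1024, 0, 60)
two_drives = backup_hours(3600, 2, 1024, 0, 60)
```

The model ignores drive contention, robot travel, and tape positioning, so it gives a lower bound rather than a prediction.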
Scalability
On the one hand, the data volume increases continuously; on the other hand, the expansion space of the tape library is limited. If the user purchases a large tape library (with over 200 slots, for example), the procurement cost is very high even if a relatively low configuration is chosen.
Return on Investment
As the data volume increases, each system requires a shorter backup window, yet under current backup systems data backup and recovery take more and more time. Consequently, users must increase the performance and capacity of their existing tape libraries. The results, however, are higher hardware costs, more difficult media management, higher software costs, higher fault probabilities, and higher maintenance costs. Moreover, the return on investment is reduced by the low utilization of tapes and tape libraries, the high maintenance costs of tape libraries, and the short lifecycle of tape drive technologies.
Eventually, users find that the investment in data protection is higher than expected and the return is far lower than expected, and that the backup system itself increases the workload of maintaining the whole storage system. This has become a common problem for many organizations.
2.1.2 Disk-based Backup
Faced with the preceding problems, some users and consultants have turned their attention to disk-based backup. As SATA disks become popular, large-capacity disks offer a low price and high performance.
Against this background, backup solutions based on disk arrays have emerged. They are implemented in the following ways:
Employing disk arrays with standard FC, SAS, or iSCSI interfaces and connecting high-capacity, low-cost SATA disks directly to the backup server
Using NAS space for backup
Adopting mainstream backup software that supports disk-based backup
This type of backup solution uses disks, formatted into file systems, as the storage device. It solves many problems found in the tape-based solution:
Eliminating the reliability limitations of the tape library and media
Avoiding the effect of tape loading and unloading on the performance (the
sequential read/write performance equals or exceeds that of mid-range tape
libraries)
Increasing the utilization of storage space greatly
Facilitating maintenance and reducing the maintenance cost (disk arrays are common and can be easily managed by administrators without specialist knowledge)
In theory, the investment is low, because the user only needs to purchase one storage array. In practice, however, the user finds that this backup solution based on disk arrays is not perfect. It has disadvantages in the following aspects.
Sharing
If the user implements LAN-free backup in a multi-server environment, the complexity and cost of configuration increase.
Generally speaking, a disk array can be identified and used by the backup software only after a file system is set up on it. Moreover, most file systems cannot be shared by multiple servers, whereas a tape library can.
That is to say, if the user wants multiple servers to share the same storage array over a SAN, as they would share a tape library, the user must set up multiple logical devices in the storage array and assign one logical device to each backup server.
The user consequently faces a series of management problems:
How to determine the number of disks assigned to each server?
How to expand the capacity online when the allocated capacity is insufficient?
How to reduce the capacity online when the allocated capacity is excessive?
Must this function be realized through expensive volume management software? Some backup software supports a backup storage pool, but it can share data only within the same platform, not across platforms. Moreover, this function requires additional data sharing options and supports a limited number of platforms. The function still needs to be improved.
Security
This type of storage device is simply a disk array that appears as a file system on the server. This file system can be operated on by any tool and accessed by any user. One unintentional rm -r or del * command can destroy all backups. In short, the backups are as vulnerable as any other files in the file system. That means many risks:
Will data be lost due to misoperations by the administrator or malicious deletion by others?
Will data be copied by others and recovered on another computer, thus leaking confidential information?
Will viruses make the backup data unusable for data recovery?
Performance
First, the file system itself may be a performance bottleneck, especially when it processes multiple tasks and processes.
Second, the file system cannot solve the problem of disk fragmentation, which degrades file system performance. When a large amount of data is processed, fragmentation can hardly be avoided.
Function
Backup management software is designed specifically for tape libraries. Currently, most backup software supports disk arrays as the backup device, but the functions differ from those available with tape. These differences can cause some serious problems:
The backup policy of the existing backup environment must be changed. Seamless integration is not possible.
Hardware data compression is not available for disk-based backup, so the backup performance and storage space cannot be optimized effectively.
The backups saved on disk arrays cannot be copied to removable media for remote storage. Therefore, the flexibility advantages of tape, such as offline storage, data migration, and remote disaster recovery, are lost.
According to the preceding analysis, using disk arrays as the backup device solves some problems found in tape libraries, but it also brings new problems, which are more difficult to overcome.
In practice, applications that use disk arrays as the backup device are restricted to using disks as a cache for tape-based backup. This function is supported by mainstream backup software, such as the Disk Staging of VERITAS NetBackup and the Disk Backup Option of Legato NetWorker. That is to say, the backup operation is performed on disks within the backup window, and the data is then migrated from the disks to tapes in the background. This solution also has the preceding problems. Users must rely on tape libraries to implement data
storage. This solution is only a supplement to the disk-based backup and is used for accelerating backup and recovery.
Management
Disk-based backup is one of the functions of the backup software. Backup software from different vendors implements disk-based backup in different ways, and no universal standard exists. As a result, in an environment with multiple backup systems, users cannot realize centralized backup management or protect their investment.
2.1.3 VTL Function Provided by the Backup Software
At present, some backup software has a VTL function, such as the Virtual Disk Library of BakBone NetVault. A VTL software module is installed on the backup server, and through it part of the storage space of the backup server is virtualized into a tape library.
The solution is easy to implement and inexpensive. It provides the basic VTL function and partly solves the performance problem of tape-based backup. Some users have begun to adopt it.
This solution, however, has some obvious disadvantages in sharing, management of LAN-free backup, security, and high consumption of the backup server's system resources. In short, the solution can only be considered a supplement to the disk-based backup method and is mainly used as a cache for tape-based backup. It cannot work independently of tape libraries.
As the preceding analysis shows, various problems arise when physical tape libraries, disk arrays, or the VTL function of backup software are used to back up data. The VTL technology can solve these problems effectively.
2.2 De-duplication
As the Internet develops, large organizations, governments, and financial institutions have ever-growing data centers. The increasing demand for storage space drives up the storage cost. IT personnel must deal with three top issues: saving energy, reducing power consumption, and lowering the system cost. As a widely discussed technology in the storage field, de-duplication addresses these problems.
De-duplication is designed to reduce the space occupied by duplicate data and thus lower costs and energy consumption. When adopting de-duplication technology, the user must consider the following factors:
Effect of de-duplication on the backup performance
De-duplication ratio
Efficiency of remote replication
Total benefits
Scalability
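To illustrate what the de-duplication ratio measures, the following is a minimal sketch that splits data into fixed-size blocks and stores each distinct block once. It is an assumption made for illustration only: this document does not state which de-duplication algorithm or block size the VTL3500 uses, and the function name `dedup_stats` is hypothetical.

```python
import hashlib


def dedup_stats(data: bytes, block_size: int = 4096):
    """Return (logical_bytes, stored_bytes, ratio) for fixed-size block de-duplication.

    Illustrative only: block size and hashing scheme are assumptions.
    """
    unique_blocks = {}  # SHA-256 digest -> block length, one entry per distinct block
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        unique_blocks.setdefault(hashlib.sha256(block).digest(), len(block))
    stored = sum(unique_blocks.values())
    ratio = len(data) / stored if stored else 1.0
    return len(data), stored, ratio


# Three identical blocks plus one different block: only two blocks are stored.
sample = b"A" * 4096 * 3 + b"B" * 4096
logical, stored, ratio = dedup_stats(sample)
```

In this model the stored bytes are also the amount that would have to cross the network during remote replication, which is why the ratio and replication efficiency appear together in the list of factors.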
3 Solution
The Oceanspace VTL3500 virtual tape library (hereinafter referred to as the VTL3500) is a backup solution developed by Huawei Symantec Technologies Co., Ltd. (hereinafter referred to as Huawei Symantec) for the low-end market. Through software, the VTL3500 virtualizes SATA disk arrays so that they appear as physical tape libraries. The VTL3500 provides high performance and supports seamless deployment. Moreover, the VTL3500 supports de-duplication and integrated backup software to reduce users' investment in IT infrastructure.
3.1 Advantages of the VTL3500
3.1.1 VTL3500 vs. Physical Tape Library
By using virtualization technology, the VTL3500 emulates the parts of a physical tape library. The robot, drives, tapes, and slots of the physical tape library exist only logically and do not need to be maintained manually. This avoids the inherent mechanical deficiencies, such as tape positioning and tape errors, and the short service life that results from exposure to the air and vulnerability to electromagnetic fields, dust, moisture, magnetic particles, sticking, and mold. The costs of managing the media and maintaining the device decrease greatly, and the reliability of backup data increases. The VTL3500 stores backup data on high-speed disks protected by high-reliability RAID technologies. It improves the performance of backup and recovery, greatly shortens backup and recovery times, provides high scalability, and increases the return on investment.
3.1.2 VTL3500 vs. Disk-based Backup
After the VTL3500 creates VTLs and assigns them to the backup servers, the backup servers recognize them as physical devices and share them with each other. As for the use and allocation of storage space, even when the libraries are shared between multiple servers, the user can create new tapes to be used by multiple backup servers according to the sharing mechanism specified by the backup software. Therefore, the user does not need to worry about how to allocate the proper space to different backup servers.
With disk-based backup, the backup data is saved in the file system and can be accessed by any user or virus. Disk-based backup cannot prevent human
misoperations, malicious destruction, or virus attacks. The VTL3500 simulates the data read/write method of physical tape and stores the backup data on the raw device. Thus, users cannot operate on the backup data directly and viruses cannot destroy the data. The VTL3500 solves the security problems found in disk-based backup.
In disk-based backup, a file system needs to be created on the storage unit. Data is read and written first through the I/O interface of the file system and then through the I/O interface of the raw device that the file system invokes. During data transfer, the overhead of invoking the two interfaces degrades system performance. In addition, the file system itself may be a performance bottleneck. The VTL3500 transfers data by reading and writing the raw device directly. This method fully utilizes the high speed of the raw device and increases the transfer efficiency.
3.1.3 VTL3500 vs. VTL Module of the Backup Software
As an extension of disk-based backup, the VTL module of the backup software cannot solve the management and security problems found in LAN-free backup. This module consumes a large amount of server resources and may even degrade the backup performance. In addition, this module cannot work independently of physical tape libraries to meet the backup requirements of a complicated environment. The VTL3500 has independent hardware and functional components. It fully emulates physical tape libraries and works as a backup device independent of physical tape libraries. While the VTL3500 provides effective data backup and recovery, it occupies hardly any server resources.
3.2 Powerful Virtualization Capability
The VTL3500 can virtualize 16/64/128 tape libraries/tape drives and can emulate more than 60 types of tape libraries and tape drives from mainstream vendors such as HP, IBM, and Quantum. Backup servers treat the VTL3500 the same as a physical tape library. Therefore, the VTL3500 can be seamlessly deployed into an existing backup system that is based on physical tape libraries.
3.3 On-Demand Capacity Expansion
At the level of individual virtual tapes, the VTL3500 uses the Capacity-on-Demand technology. The user can set a small initial capacity for the virtual tapes. As more data is written to the virtual tapes, the VTL3500 automatically allocates more space to them. In physical tape libraries, media management wastes space (50% or more of the total space) because a large number of tapes cannot be fully written. Compared with physical tape libraries and disks, the VTL3500 increases the utilization of storage space dramatically.
At the system level, the VTL3500 manages disks in the usual way. New disks can be easily added to expand the capacity. Therefore, the user does not need to purchase a high disk configuration up front, as is necessary with tape libraries. The user can add new disks incrementally as the data volume grows. Thus, the initial procurement cost is much lower than that of tape libraries. The routine maintenance cost is also much lower, because disks are free of the various mechanical faults found in tape libraries.
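The Capacity-on-Demand behavior described above can be sketched as a virtual tape whose disk allocation starts small and grows only as data is written. This is a minimal illustration, not the VTL3500 implementation: the class name `VirtualTape`, its attributes, and the fixed growth increment are all assumptions.

```python
class VirtualTape:
    """Hypothetical thin-provisioned virtual tape (illustrative, not the VTL3500 API)."""

    def __init__(self, max_gb, initial_gb=1, increment_gb=1):
        self.max_gb = max_gb              # nominal capacity seen by the backup software
        self.increment_gb = increment_gb  # assumed growth step when space runs out
        self.allocated_gb = min(initial_gb, max_gb)  # disk space actually reserved
        self.used_gb = 0

    def write(self, gb):
        if self.used_gb + gb > self.max_gb:
            raise ValueError("virtual tape is full")
        self.used_gb += gb
        # Grow the allocation in fixed increments, only as far as needed.
        while self.allocated_gb < self.used_gb:
            self.allocated_gb = min(self.allocated_gb + self.increment_gb, self.max_gb)


# A 400 GB virtual tape that initially reserves only 5 GB of disk space.
tape = VirtualTape(max_gb=400, initial_gb=5, increment_gb=5)
tape.write(12)  # the allocation grows from 5 GB to 15 GB, not to 400 GB
```

A partly filled virtual tape therefore occupies only slightly more disk space than the data written to it, in contrast to a partly filled physical cartridge.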
3.4 IP Replication
Replication is a common technology used for disaster recovery. Data replication refers to copying data from one medium onto another medium by using data replication software, thereby generating a data copy.
Traditional disaster recovery generally relies on physical transportation. The backup software copies data onto tapes in a physical tape library, and the tapes are transported to a remote place for preservation. During transportation, tapes may get lost or damaged; thus, the effect of disaster recovery cannot be ensured.
Over an IP network, the local VTL3500 copies data on virtual tapes to the remote VTL3500. In this way, the VTL3500 uses the convenience and high speed of the network to save the transportation cost. The local VTL3500 encrypts the tape data with an encryption algorithm before transfer, and the remote VTL3500 decrypts the data after receiving it. As a result, data security during transfer is ensured.
The VTL3500 provides four options for IP replication:
Remote Copy
Auto Replication
IP Replication
Replication upon De-duplication
Of the four options, three are triggered automatically and one is triggered manually. Table 3-1 lists the four options of IP replication.
Table 3-1 Four options of IP replication
Option | Type | Description
Auto Replication | Automatic | When a virtual tape is exported from the VTL, the system automatically copies the data on the virtual tape to another VTL3500.
Remote Copy | Manual | The data on a virtual tape is copied to another VTL as required.
IP Replication | Automatic | Within the specified interval and according to the user-defined policy, the changed data on the primary virtual tape is copied to the same or another VTL.
Replication upon De-duplication | Automatic | When the de-duplication function is enabled, the deletion policy is integrated with the replication policy. The changed data is copied to another VTL3500 according to the replication policy.
These four options differ mainly in the replication triggering mechanism.
Auto Replication is triggered by the backup software. If the VTL is set to Auto Replication, the replication of a virtual tape is triggered when the VTL receives the eject command from the backup software (for a physical tape library, the eject command from the backup software means ejecting the tape from the physical
tape library; for a virtual tape library, it means putting the virtual tape into the virtual vault).
Remote Copy is triggered manually. The user can copy the data on the selected tape to the VTL3500 in the disaster recovery center. The VTL3500 in the disaster recovery center then allocates space equal to that of the source tape to the target tape and sets the same barcode. When the copy is complete, the system automatically promotes the tape to the virtual vault of the remote VTL3500 for future use. Through the Remote Copy function, the whole virtual tape can be copied to the remote VTL3500 without creating a new virtual tape there. Before the copy, no virtual tape in the remote VTL3500 may have the same name as a virtual tape in the local VTL3500.
IP Replication is triggered based on the policy.
The policy can be:
Data increment-based replication.
The VTL3500 can identify the amount of data backed up to the tape each time. If the data increment exceeds the preset threshold, replication is triggered automatically after the copy.
Time point-based replication.
The user can specify the time point for the first replication and the replication interval for each virtual tape. The data on the virtual tape is then copied according to the specified schedule. A remote virtual tape that adopts IP Replication must be promoted manually before use.
Replication upon De-duplication is automatically triggered based on the policy.
The triggering condition can be a specific date or time point, or the completion of the backup operation. The local VTL3500 transfers the de-duplicated data to the remote VTL3500 over an IP network. After de-duplication, data blocks rather than the full data are transferred during the IP replication, so bandwidth usage decreases and transfer efficiency increases. As a result, remote data-level disaster recovery can be implemented with low cost, easy deployment, and high efficiency.
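The two IP Replication policies described above, data increment-based and time point-based, can be sketched as two small functions. This is a hedged illustration: the function names, parameters, and the exact comparison used for the threshold are assumptions, not the VTL3500 interface.

```python
from datetime import datetime, timedelta


def increment_trigger(bytes_since_last_replication, threshold_bytes):
    """Data increment-based policy: replicate once new data exceeds the threshold."""
    return bytes_since_last_replication > threshold_bytes


def next_replication_time(first_run, interval, now):
    """Time point-based policy: first run at `first_run`, then every `interval`."""
    if now <= first_run:
        return first_run
    elapsed = now - first_run
    # Ceiling division on timedeltas: number of intervals up to the next slot.
    slots = -(-elapsed // interval)
    return first_run + slots * interval


# First replication at 22:00 on 2010-05-18, then every 6 hours (assumed values).
first = datetime(2010, 5, 18, 22, 0)
upcoming = next_replication_time(first, timedelta(hours=6), datetime(2010, 5, 19, 5, 0))
```

Keeping the two checks separate mirrors the text: the increment check is evaluated after each backup, while the schedule is evaluated against the clock for each virtual tape.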
The remote IP replication has the following scenarios:
One VTL3500 copies data to the remote VTL3500.
Figure 3-1 Networking of one-to-one remote disaster recovery
Multiple VTL3500s copy data to the remote VTL3500.
Figure 3-2 Networking of many-to-one remote disaster recovery
3.5 Tape Caching
Tape Caching is an advanced function of the VTL3500. This function uses the VTL3500 as a high-speed cache for the physical tape library. The backup data is written to the VTL3500 first. After the backup operation is complete, the VTL3500 migrates the backup data to the physical tape library according to the preset policy. In this way, a hierarchical storage architecture is formed.
The VTL3500 can shorten the backup window and recover data quickly. Physical tape libraries are suitable for large-capacity offline data. Therefore, the VTL3500 can be combined with physical tape libraries to implement hierarchical storage.
The principles of the hierarchical storage include:
The data that needs to be archived for a long time is stored on the physical tape
libraries.
The frequently-used data is stored in the VTL.
The VTL takes over the physical tape libraries.
Physical tape libraries back up data slowly, and disks are ill-suited to holding
seldom-accessed data for long periods. Hierarchical storage offsets the shortcomings
of both physical tape libraries and disks.
Figure 3-3 shows the networking of the hierarchical storage.
Figure 3-3 Networking of the hierarchical storage
Data can be recovered directly from the VTL or physical tape library. To fully utilize
the high-speed cache, the VTL3500 provides various migration triggering policies and
space reclaiming policies.
3.5.1 Data Migration Policies
Tape Caching provides two policies for triggering data migration between the
VTL3500 and the physical tape library: 1) time-based migration; 2) intelligent
migration. Table 3-2 and Table 3-3 list the two policies.
Table 3-2 Time-based migration policy
Policy Name                     Description
Certain time point each day     Migration is performed in a one-day cycle. The VTL3500
                                starts data migration at the specified time point each
                                day.
Certain time point each week    Migration is performed in a one-week cycle. The VTL3500
                                starts data migration at the specified time point each
                                day from Monday to Saturday.
Table 3-3 Intelligent migration policy
Policy Name                     Description
And/Or                          Conjunction/disjunction of the intelligent policy. The
                                option And means that migration is triggered only when
                                all conditions are met; Or means that migration is
                                triggered when any condition is met.
Data storage period             Migration is triggered when the backup data has been
                                stored on the VTL3500 for a specified period.
Watermark                       Migration is triggered when the usage of the disk space
                                of the VTL3500 reaches 90%.
After backup                    Migration is triggered after each backup. "Tape space
(tape space used out)           used out" is an additional policy for "after backup".
                                If both options are chosen, the VTL3500 checks the
                                usage of the virtual tape when the tape is ejected from
                                the tape drive. If the space of this tape is used up,
                                migration is triggered.
Postponed to a certain          Migration is postponed to a specific time point after
time point                      the condition is met. This policy must be used together
                                with the preceding three policies. When the condition
                                of any preceding policy is met, migration can be
                                postponed to a specific time point.
The time-based migration policy and intelligent migration policy cannot be used
simultaneously. For the time-based migration policy, "Certain time point each day"
and "Certain time point each week" cannot be used at the same time. The user can only
select either for the condition of triggering migration. Multiple options of the
intelligent policy can be chosen simultaneously. The options can be combined to meet
different requirements of migration.
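How the options of the intelligent policy combine can be sketched as follows. This is
an illustration under stated assumptions, not the VTL3500 implementation: the class
names TapeState and IntelligentPolicy and all field names are hypothetical, the
watermark is expressed as a fraction (0.90 for the 90% in Table 3-3), and the
"Postponed to a certain time point" option is left out for brevity.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class TapeState:
    written_at: datetime     # when the backup data landed on the virtual tape
    disk_usage: float        # fraction of VTL disk space in use, 0.0 to 1.0
    backup_finished: bool    # a backup to this tape has just completed
    tape_full: bool          # the virtual tape has no free space left


@dataclass
class IntelligentPolicy:
    combine_with_and: bool = False               # True = "And", False = "Or"
    storage_period: Optional[timedelta] = None   # "Data storage period"
    watermark: Optional[float] = None            # "Watermark", e.g. 0.90
    after_backup: bool = False                   # "After backup"
    require_tape_full: bool = False              # "(tape space used out)"

    def should_migrate(self, tape: TapeState, now: datetime) -> bool:
        checks = []                              # one entry per enabled option
        if self.storage_period is not None:
            checks.append(now - tape.written_at >= self.storage_period)
        if self.watermark is not None:
            checks.append(tape.disk_usage >= self.watermark)
        if self.after_backup:
            met = tape.backup_finished
            if self.require_tape_full:
                met = met and tape.tape_full
            checks.append(met)
        if not checks:
            return False                         # no option enabled
        return all(checks) if self.combine_with_and else any(checks)


now = datetime(2010, 5, 18, 12, 0)
tape = TapeState(written_at=now - timedelta(days=3), disk_usage=0.50,
                 backup_finished=False, tape_full=False)
or_policy = IntelligentPolicy(storage_period=timedelta(days=2), watermark=0.90)
and_policy = IntelligentPolicy(combine_with_and=True,
                               storage_period=timedelta(days=2), watermark=0.90)
migrate_with_or = or_policy.should_migrate(tape, now)
migrate_with_and = and_policy.should_migrate(tape, now)
```

With the sample tape, the storage period has elapsed but the watermark has not been
reached, so the Or combination triggers migration and the And combination does not.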
3.5.2 Space Reclamation Policy
To fully utilize the cache, the VTL3500 provides two space reclamation policies to
ensure the space utilization: 1) intelligent reclamation; 2) reclamation upon
de-duplication. Table 3-4 lists the reclamation methods.
Table 3-4 Reclamation methods
Policy Name                     Description
Intelligent reclamation         The space occupied by the virtual tapes of the VTL3500
                                used as the cache is reclaimed. That is, the data on
                                these virtual tapes is deleted and only the indexes to
                                the physical tapes are reserved.
Reclamation upon                Through the de-duplication algorithm, the duplicate
de-duplication                  data is deleted to release the storage space of the
                                VTL3500.
Table 3-5 lists the methods of triggering space reclamation.
Table 3-5 Methods of triggering space reclamation
Policy Name                     Description
Immediate reclamation           After the migration is complete, the space originally
                                occupied by the migrated data is reclaimed.
Watermark                       When the remaining disk space accounts for less than
                                10% of the total space, the space originally occupied
                                by the migrated data is reclaimed. This trigger method
                                is available only under intelligent reclamation.
Storage period                  When the backup data is stored on the VTL3500 for a
                                specified period, the space occupied by the backup data
                                is reclaimed.
Users do not need to worry about data loss. The VTL3500 only reclaims the space
originally occupied by the migrated data. The space occupied by the other data will not
be reclaimed. Thus, the data security and consistency are ensured.
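The safety rule above, together with the three triggers in Table 3-5, can be sketched
as follows. The names CachedTape and reclaimable and the trigger strings are
hypothetical, and applying the "already migrated" guard to every trigger is an
assumption drawn from the statement that only migrated data is reclaimed.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class CachedTape:
    name: str
    migrated: bool         # data has already been copied to the physical tape library
    written_at: datetime   # when the backup data was written to the VTL


def reclaimable(tapes, trigger, now, free_fraction=1.0, storage_period=timedelta(0)):
    """Return the names of tapes whose cache space may be reclaimed."""
    if trigger not in ("immediate", "watermark", "storage_period"):
        raise ValueError(f"unknown trigger: {trigger}")
    names = []
    for tape in tapes:
        if not tape.migrated:
            continue                        # never reclaim data held only on the VTL
        if trigger == "immediate":
            due = True                      # reclaim as soon as migration is complete
        elif trigger == "watermark":
            due = free_fraction < 0.10      # less than 10% of disk space remains
        else:
            due = now - tape.written_at >= storage_period
        if due:
            names.append(tape.name)
    return names


now = datetime(2010, 5, 18, 12, 0)
tapes = [
    CachedTape("T1", migrated=True, written_at=now - timedelta(days=5)),
    CachedTape("T2", migrated=False, written_at=now - timedelta(days=5)),
    CachedTape("T3", migrated=True, written_at=now - timedelta(hours=1)),
]
immediate = reclaimable(tapes, "immediate", now)
aged = reclaimable(tapes, "storage_period", now, storage_period=timedelta(days=2))
```

Tape T2 is never returned, because its data has not yet been migrated to a physical
tape.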
3.6 Tape Encryption
To ensure the security of the data stored on tapes, the VTL3500 encrypts tapes when
data is transferred to physical tape libraries.
Figure 3-4 Tape encryption
The tape encryption function of the VTL3500 uses the 128-bit Advanced Encryption
Standard (AES) algorithm. The user can create one or more tape keys to encrypt the
data exported to physical tapes and decrypt the data imported to virtual tapes. The
data on the tape library is inaccessible unless the correct key is used to decrypt it.
Moreover, the user can set a password for each key. Only when the correct password is
provided can the key name, password, and password hint be changed, or the key be
deleted or exported.
When data is being exported to a physical tape library or during IP replication, the
user can employ a created key to encrypt the data, thus ensuring the security of the
tape data. Even if tapes are lost or stolen, or data packets are intercepted in
transit, the user does not need to worry about data security. Without the correct key,
the data on the tapes is totally inaccessible.
3.7 De-duplication
The VTL3500 only saves one copy of the backup data in the Single Instance
Repository (SIR). The redundant part of the original data is replaced by the index to
the single instance. This index can be used to read and recover data.
3.7.1 Types of De-duplication
According to where it happens, de-duplication can be divided into front-end deletion
and back-end deletion.
The front-end deletion means that the duplicate data is deleted on backup servers. This
type of deletion can reduce the amount of transferred data and the occupation of
bandwidth. This type of deletion, however, has the following disadvantages:
The front-end deletion occupies the CPU resources of backup servers and degrades
the backup performance. Many users consider this type of deletion unacceptable
because it lengthens the backup window.
The front-end deletion has a low deletion ratio of duplicate data. De-duplication
on a backup server can only delete the duplicate data on that server. That is to
say, the front-end deletion cannot remove data that is duplicated across servers.
To implement the front-end deletion, the user needs to replace the existing
backup software and reconfigure the client. As a result, the investment in the
existing backup software is wasted and the current applications are affected.
In contrast to the front-end deletion, the back-end deletion happens on the storage
device. This type of deletion cannot reduce the amount of the data transferred between
backup servers and storage devices, but it solves the preceding three problems
found in the front-end deletion. The back-end deletion has no impact on backup
servers, provides a higher deletion ratio of duplicate data, and requires no change to
the existing backup network, which protects users' investment. According to when it
happens, the back-end deletion can be divided into in-line deletion and post-processing
deletion.
The in-line deletion means that de-duplication works the instant data reaches the
storage device. Then the data after de-duplication is backed up on the storage
media.
The post-processing deletion means that de-duplication happens on the storage
device after the backup operation is complete.
The in-line deletion degrades the backup performance, whereas the post-processing
deletion prolongs the time that the storage device spends processing the data.
The VTL3500 adopts the advanced back-end post-processing de-duplication that does
not affect the backup performance. The VTL3500 provides a 20:1 deletion ratio of
duplicate data and increases the utilization of the storage space. During remote
replication, the VTL3500 transfers the data after de-duplication and reduces the
bandwidth occupation. The VTL3500 provides a raw capacity of up to 24 TB. By
enabling the de-duplication function, the user can obtain an effective capacity
equivalent to 480 TB (24 TB at a 20:1 ratio). As a result, the return on investment
increases greatly.
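The capacity figures follow from simple arithmetic, sketched here for reference. The
function names are hypothetical, and the calculation assumes that the quoted 20:1
ratio is actually achieved, which depends on how much of the backup data is duplicate.

```python
def effective_capacity_tb(raw_tb, dedup_ratio):
    """Logical backup data that fits on raw_tb of disk at the given ratio."""
    return raw_tb * dedup_ratio


def stored_size_tb(logical_tb, dedup_ratio):
    """Physical disk space needed for logical_tb of backup data."""
    return logical_tb / dedup_ratio


capacity = effective_capacity_tb(24, 20)   # 24 TB raw at 20:1
footprint = stored_size_tb(480, 20)        # the inverse calculation
```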
3.7.2 Procedure for De-duplication
When the conditions for triggering de-duplication are met, the VTL3500 scans the tape
data, which includes the metadata and the file data to be backed up. The file data is
the object of processing. The VTL3500 partitions the file data into data blocks of the
same size according to a specific algorithm, and performs de-duplication in steps.
Figure 3-5 Initializing the tape data
Figure 3-6 Processing procedure
Step 1 Read the data blocks and calculate the index value (content identity) of each
data block.
Step 2 Compare the index value with all the values in the original index table.
1) If the index value of the data block already exists in the index table, it indicates
that a data block of the same content already exists in the SIR. In this case, the
VTL3500 deletes this data block and replaces it with a link to the SIR.
2) If the index value of the data block does not exist in the index table, the VTL3500
saves this index value into the index table, saves the data block into the SIR, and
generates a link that specifies the location of the data block in the SIR.
Step 3 Repeat the preceding steps until all the data blocks are processed.
----End
After all the data blocks are processed, the file data is extracted and added into the
SIR. The file data zone on the tape only saves the links to the locations of the data
blocks in the SIR. When the data needs to be accessed, the data blocks can be quickly
read according to these links.
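The procedure above can be sketched in a few lines. This is a simplified illustration
under stated assumptions, not the VTL3500 algorithm: the index value is taken to be a
SHA-256 digest, the block size of 4 bytes is chosen only to keep the example readable,
and the class and function names are hypothetical.

```python
import hashlib

BLOCK_SIZE = 4  # hypothetical; a real system would use far larger blocks


def split_blocks(data, size=BLOCK_SIZE):
    """Partition the file data into blocks of the same size (the last may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


class SingleInstanceRepository:
    def __init__(self):
        self.index = {}    # index table: index value -> location in self.blocks
        self.blocks = []   # the single stored instance of each unique block

    def deduplicate(self, file_data):
        """Return the links that replace the file data on the tape."""
        links = []
        for block in split_blocks(file_data):
            index_value = hashlib.sha256(block).digest()    # Step 1
            location = self.index.get(index_value)          # Step 2
            if location is None:                            # case 2): new content
                location = len(self.blocks)
                self.blocks.append(block)
                self.index[index_value] = location
            links.append(location)                          # case 1) or 2): keep a link
        return links                                        # Step 3 is the loop itself

    def read(self, links):
        """Rebuild the original file data by following the links."""
        return b"".join(self.blocks[location] for location in links)


sir = SingleInstanceRepository()
data = b"ABCDABCDEFGHABCD"
links = sir.deduplicate(data)
```

Here the 16 bytes of file data reduce to two stored blocks plus four links, and
sir.read(links) returns the original bytes.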
Figure 3-7 Tape data after de-duplication
4 Experience
The VTL3500 is a low-end VTL designed for small and medium businesses (SMBs).
Based on their data protection requirements, the VTL3500 provides a series of
technologies and functions to help SMBs solve the problems in data backup and
disaster recovery. The VTL3500 increases the return on investment and reduces the
total cost of ownership (TCO) of the IT infrastructure.
4.1 Powerful Virtualization Capability
With its powerful virtualization capability, the VTL3500 can take over the physical tape
libraries and tape drives of most mainstream tape library suppliers, thus realizing
hierarchical storage. The user obtains both the performance of disk-based backup
and the long-term archiving capability of tapes.
The VTL3500 can virtualize multiple tape libraries and tape drives at no extra cost.
The user can assign each server a specific tape library that has its own tape
drives, thus improving management and backup.
The VTL3500 can be seamlessly deployed in an existing backup system without
any change to the existing backup policies and configurations. The backup
servers manage the VTL3500 in the same way as they manage physical tape libraries.
4.2 On-Demand Capacity Expansion
The on-demand capacity expansion function of the VTL3500 allocates storage space
automatically, which increases the utilization of the disk space. In physical tape
libraries, by contrast, media management wastes space (50% or more of the total
space) because a large number of tapes cannot be fully written.
The VTL3500 expands capacity through the addition of disks. Therefore, the user does
not need to purchase a large configuration up front, as is necessary with tape
libraries, and can add new disks incrementally as the data amount grows. Thus,
the initial procurement cost is much lower than that of tape libraries. The routine
maintenance cost is also much lower, because disks are free of the various mechanical
faults found in tape libraries. The VTL3500 uses SATA disks to provide a high
capacity-to-price ratio without degrading reliability or performance. For most users,
the VTL3500 requires a lower investment than physical tape libraries.
4.3 Application Scenario
The VTL3500 has the following four typical application scenarios.
4.3.1 Integrated and Economical Backup
The data on heterogeneous hosts is backed up to the VTL3500 over an FC/IP SAN.
The performance of multi-host concurrent backup can reach up to 1.44 TB/h. The
de-duplication ratio (20:1) and compression ratio (2:1) of the VTL3500
increase the utilization of the storage space and meet the requirements of
ever-increasing backup data.
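To put the throughput figure in perspective, the following sketch converts it to
MB/s and estimates a backup window. It assumes decimal units (1 TB = 1,000,000 MB)
and sustained operation at the peak rate, which real backups rarely achieve; the
function names and the 7.2 TB data set are hypothetical.

```python
def tb_per_hour_to_mb_per_s(tb_per_hour):
    """Convert TB/h to MB/s using decimal units (1 TB = 1,000,000 MB)."""
    return tb_per_hour * 1_000_000 / 3600


def backup_window_hours(data_tb, throughput_tb_per_hour=1.44):
    """Best-case time to back up data_tb at a sustained throughput."""
    return data_tb / throughput_tb_per_hour


peak_mb_per_s = tb_per_hour_to_mb_per_s(1.44)   # about 400 MB/s
window = backup_window_hours(7.2)               # about 5 hours for 7.2 TB
```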
4.3.2 Hierarchical Backup
The production data is backed up to the VTL3500 via backup servers. Through the
auto archiving/tape caching function, the user can export the data to physical tape
libraries, thus implementing data archiving. When the data needs to be recovered, the
user can read the backup data directly from the VTL3500 to achieve high recovery
performance. The user can also read the archived data on the tapes of the physical
tape libraries to implement data recovery.
4.3.3 Remote Disaster Recovery
The data center and the remote disaster recovery center are each deployed with
one VTL3500. Through remote replication, data is copied to the disaster recovery
center over the wide area network (WAN). The remote replication function of the
VTL3500 supports incremental replication. In addition, it supports data replication
after de-duplication to reduce the bandwidth occupation. The VTL3500 can encrypt
the data for remote replication to ensure the security of data transfer.
4.3.4 Distributed Backup
For a multi-branch organization, one or more VTL3500s can be deployed according to
the data amount of each node. For a branch that has a small data amount, data can be
backed up to the data center over the WAN. For a branch that has a large data amount,
data can be first backed up to the local VTL3500 to achieve a high backup and
recovery performance. Then, the data is copied to the data center through remote
replication to back up and manage data in a centralized manner. The data center can be
deployed with multiple VTL3500s that comprise a storage pool. The VTL3500 has a
unified management interface, through which the resources in the storage pool can be
centrally managed.
5 Conclusion
The VTL technology is indispensable to the storage market, for this technology
provides a high backup/recovery performance and can be combined with physical tape
libraries to implement the hierarchical storage. The de-duplication technology is an
emerging storage technology, which can help solve the problem of soaring costs due to
explosive data growth.
The VTL3500 developed by Huawei Symantec inherits the advantages of physical
tape libraries and disk arrays. While eliminating the hardware deficiencies
of tape libraries, the VTL3500 provides a higher backup/recovery performance than
disk arrays, thus meeting the requirements for various backup windows.
The powerful virtualization capability meets users' requirements for sharing
backup devices.
The on-demand capacity expansion and high de-duplication ratio improve
the utilization of the storage space, thus increasing the return on investment and
reducing the TCO.
The tape caching function can be used to easily deploy the hierarchical storage
system, and implement automatic backup and hierarchical storage. At the same
time, the tape encryption technology reduces the risks of data leakage and
safeguards the archived data.
Combined with the de-duplication technology and tape encryption technology,
the IP replication function can be used to realize low bandwidth-occupation
replication. The user can construct a reliable and safe remote data-level disaster
recovery system with low investment.
6 Acronyms and Abbreviations
Table 6-1 List of Acronyms and Abbreviations
Abbreviation    Full Spelling
VTL             Virtual Tape Library
SIR             Single Instance Repository