Sunday, March 27, 2022 Data Minimisation Managing Data Growth While Containing Cost and Carbon Footprint Ken Hall, Dimension Data

Upload: darleen-peters

Post on 18-Dec-2015


Tuesday, April 18, 2023

Data Minimisation

Managing Data Growth While Containing Cost and Carbon Footprint

Ken Hall, Dimension Data

Agenda

Introductions

Today’s data management challenges

Energy efficiency in the data centre

What is Data Minimisation?

Online Active Archiving

Backup Data De-Duplication

Data Minimisation effects

Developing the business case

Questions & Answers

Dimension Data - ‘Data Centre & Storage Solutions’

Network Integration

Microsoft Solutions Infrastructure

Microsoft Solutions Application Integration

Security

Managed Services

Customer Interactive Solutions

Data Centre & Storage Solutions – Availability, Compliance & Optimisation

• Storage Solutions – SAN, NAS, CAS

• Virtualisation Solutions – DR, Server & Desktop Consolidation

• Backup, Recovery & Archiving Solutions

• Data Centre Environmentals – Power, Cooling & Rack Solutions

Key Technology Partners

• APC, Cisco, EMC, HDS, HP, IBM, Microsoft, NetApp, Quantum, Symantec, Sun

The Digital Universe is Rapidly Expanding

Source: IDC White Paper, "The Diverse and Exploding Digital Universe," March 2008

Ten-fold growth in five years!

[Chart: Amount of Digital Information Created and Replicated Each Year. Y-axis: exabytes (0 to 1,800); x-axis: 2006 to 2011. Labelled values: 173 exabytes and 1,773 exabytes.]
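The "ten-fold growth in five years" claim implies a compound annual growth rate. The sketch below works it out, assuming the chart's two labelled values (173 and 1,773 exabytes) correspond to 2006 and 2011; that pairing and the function name are illustrative assumptions, not part of the original slide.

```python
def compound_annual_growth_rate(start: float, end: float, years: int) -> float:
    """Return the constant yearly growth rate that takes `start` to `end` in `years`."""
    if start <= 0 or end <= 0 or years <= 0:
        raise ValueError("start, end and years must be positive")
    return (end / start) ** (1 / years) - 1


# Assumed pairing of the chart labels: 173 EB in 2006, 1,773 EB in 2011.
growth_factor = 1773 / 173
cagr = compound_annual_growth_rate(173, 1773, years=5)

print(f"Growth factor over five years: {growth_factor:.1f}x")
print(f"Implied compound annual growth: {cagr:.1%}")
```

Under these assumptions the implied rate is a little under 60% a year, the same order as the 65% annual compound growth quoted for a typical customer.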

Typical DD Customer – Exponential Data Growth

• Annual Compound Data Growth of 65%

• Daily Incremental and Weekly Full

• 2 Week Retention on Disk (3 Fulls, 10 Incrementals)

• 4 Week Retention on Tape

• 12 Monthlies on Tape kept indefinitely

• Having to squeeze more into Backup Window

• B2D Requirement Growing Rapidly

• Backup Media Server/s Under Pressure

• Network Bandwidth Constraints

• Tape Infrastructure & Handling Costs Increasing
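To see why the backup-to-disk requirement grows so quickly under this profile, the sketch below projects primary data at 65% compound annual growth and estimates the disk needed to hold 3 fulls and 10 incrementals. The 10 TB starting size and the 5% daily change rate are hypothetical values chosen for illustration; only the growth rate and the retention counts come from the slide.

```python
def projected_size_tb(initial_tb: float, annual_growth: float, years: int) -> float:
    """Data size after `years` of compound growth."""
    return initial_tb * (1 + annual_growth) ** years


def disk_retention_tb(full_tb: float, fulls: int = 3, incrementals: int = 10,
                      daily_change: float = 0.05) -> float:
    """Disk needed for the retained fulls plus incrementals, before any de-duplication.

    daily_change is a hypothetical fraction of the full that changes per day.
    """
    return fulls * full_tb + incrementals * daily_change * full_tb


INITIAL_TB = 10.0      # hypothetical starting size
ANNUAL_GROWTH = 0.65   # from the slide

for year in range(4):
    primary = projected_size_tb(INITIAL_TB, ANNUAL_GROWTH, year)
    b2d = disk_retention_tb(primary)
    print(f"Year {year}: primary {primary:6.1f} TB, backup-to-disk {b2d:6.1f} TB")
```

Because every retained full is a complete copy, the disk footprint in this model is about 3.5 times the primary data and grows at the same 65% rate.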

Coping with Information Growth in Today’s Economy

*“Global purchases of IT goods and services… will equal $1.66 trillion in 2009, declining by 3 percent after an 8 percent rise in 2008.”

Global IT Market Outlook: 2009, Forrester Research, January 12, 2009

In 2009, IT budgets are flat or declining*

Escalating costs for primary storage

Difficulty meeting backup and recovery windows

Ensuring high availability of information

Providing timely access to historical information

Data Center Energy Use is Doubling

IT energy use has doubled since 2000 and will likely double again by 2011

Energy operating costs will soon exceed the cost of purchase for servers

Existing conservation technologies can reduce consumption to 2002 levels

Comparison of Projected Electricity Use, 2007 to 2011

Source: EPA report to Congress, 2007

[Chart: Annual Electricity Use (billion kWh/year), 2000 to 2011; y-axis 0 to 140. Series: Historical energy use; State of the art scenario.]

Available Capabilities for Energy Efficiency

Improve Efficiency – Reduce Energy Consumption

INCREASE UTILIZATION

REDUCE CAPACITY

Storage tiering

Virtual LUNS

File and e-mail tiering

Storage virtualisation

Large-capacity drives

Replication across storage tiers

Snaps

Clones

Compression

De-duplication

Archiving

Server virtualisation

Data migration

Storage consolidation

Virtual Provisioning

Flash drives

Optimisation algorithms

Automated discovery

Document management

How can we...

Implement a Data Minimisation Strategy

Manage exponential data growth, while...

Improving access to organisational data

Containing data management and infrastructure costs

Reducing the data centre’s carbon footprint...

Online archiving of e-mail and file systems

Backup with data de-duplication

Data Minimisation Elements

Retention and compliance

Data reduction

Universal access

Simplify management

Tier backup infrastructure

Optimise media: B2D, VTL, de-dupe and tape

Address security issues

Simplify management

Identify candidates for archiving

Classify and move

Establish SLAs based on information class

New Technologies and Services are Enablers

Primary Storage

Backup

Archive

Data Minimisation – How it works

1. Archive the inactive data before you perform the backup process

• Identify inactive data based on policies

• Automate the movement of the data to a lower-cost storage tier or dedicated archive platform, leaving stubs behind

• Items are retrieved from the online archive on user demand

• Back up the archive infrequently or never

2. Back up the remaining data using resource-efficient data de-duplication

• Rapid ‘Full Backups’ – only the ‘sub-file’ changes are sent and stored on disk

• Minimal Bandwidth – only a fraction of the typical 200% is sent over the wire

• Minimal Storage Consumption – only unique ‘sub-file’ blocks are stored

• Protect more, with less, for longer
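Step 1 can be sketched as a policy scan that separates inactive files from the data that still needs regular backup. This is a minimal illustration, not the method of any particular archiving product: the policy ("not modified in N days"), the function name and the 180-day threshold mentioned afterwards are all assumptions.

```python
import time
from pathlib import Path


def split_by_activity(root, max_idle_days, now=None):
    """Return (inactive, active) lists of files under `root`.

    A file counts as inactive when its last-modified time is older than
    `max_idle_days`. Inactive files are candidates for the archive tier;
    only the active list would remain in the regular backup set.
    """
    now = time.time() if now is None else now
    cutoff = now - max_idle_days * 86400  # seconds per day
    inactive, active = [], []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        if path.stat().st_mtime < cutoff:
            inactive.append(path)
        else:
            active.append(path)
    return inactive, active
```

Calling `split_by_activity("/some/share", max_idle_days=180)` on a hypothetical share would return the archive candidates and the remaining backup set. Moving the candidates and leaving stubs behind is deliberately left out of the sketch.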

Today: Energy-Efficient Storage Design

1 TB Data on Different Capacity/Performance Drives

CONSUME LESS ENERGY BY CAPACITY

• 15K, 73 GB: 6,096 kWh/yr

• 15K, 146 GB: 3,048 kWh/yr (50%)

• 10K, 300 GB: 1,434 kWh/yr (73%)

• 7.2K, 500 GB: 787 kWh/yr (87%)

• 7.2K, 1 TB: 393 kWh/yr (94%)

• 73 GB Flash drive: 3,790 kWh/yr, 30x IOPS, 38% Less Energy
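The energy figures on this slide can be compared against the 15K 73 GB drive as a baseline. The sketch below assumes a pairing of kWh values to drive types reconstructed from the chart labels, so treat the pairing as an assumption rather than the author's stated data.

```python
BASELINE = "15K 73 GB"

# Assumed pairing of the chart's kWh/yr labels to drive types (for 1 TB of data).
kwh_per_year = {
    "15K 73 GB": 6096,
    "15K 146 GB": 3048,
    "10K 300 GB": 1434,
    "7.2K 500 GB": 787,
    "7.2K 1 TB": 393,
    "73 GB flash": 3790,
}


def energy_saving(drive: str, baseline: str = BASELINE) -> float:
    """Fraction of energy saved by `drive` relative to `baseline`."""
    return 1 - kwh_per_year[drive] / kwh_per_year[baseline]


for drive in kwh_per_year:
    print(f"{drive:12s} {kwh_per_year[drive]:5d} kWh/yr  saving {energy_saving(drive):.0%}")
```

With this pairing the computed savings agree with the slide's 50%, 87%, 94% and 38% labels. The 10K 300 GB drive works out a few points higher than the printed 73%.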


File System Archiving

Extract inactive, final-form data to an archive

Enhance performance of production applications

Reduce size of backup datasets

Free up expensive Tier 1 disk

Store archived data on high density low cost energy efficient storage

Before: back up full, 10 TB

After: back up 4 TB, active data only

[Diagram: 10 TB of production data on primary storage. 6 TB of inactive data is extracted to an active archive on secondary storage, where it is always available. 4 TB of active data remains on primary storage; the rest is reclaimed storage.]

E-Mail Archiving

Space saved on e-mail server is typically 60–80%

User’s Inbox on the Message Server (archived items replaced by shortcuts):

• Message 1, Jan. 1, 2008, To: Rick, Subject: Question, Attached: (attachment)

• Message 2, Jan. 1, 2008, To: Ron, Subject: Update, Attached: (attachment)

• Message 3, Feb. 1, 2008, To: Bill, Subject: Training

Mail archival automatically creates shortcuts to archived messages and attachments, and deletes the original attachments from the e-mail server.

E-mail Archive on the E-mail Archive Server: holds the same three messages and their attachments.

Definition of De-duplication

“The process of detecting and identifying the unique data segments within a given set of information, enabling the elimination of redundancy when stored or moved.”

Before: total segments = 39

After: unique segments = 6

[Diagram: Data Set 1, Data Set 2 and Data Set 3 reduced by de-duplication]
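The definition above can be illustrated by counting total and unique segments across several data sets. In this sketch the three byte strings are made-up data chosen so that the counts match the slide (39 total, 6 unique). Fixed-size segmenting and SHA-1 fingerprints are assumptions for illustration; real products often use variable-length segments.

```python
import hashlib


def segment(data: bytes, size: int) -> list:
    """Split data into fixed-size segments (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def dedupe_stats(data_sets, segment_size: int):
    """Return (total_segments, unique_segments) across all data sets."""
    total = 0
    unique = set()
    for data in data_sets:
        for seg in segment(data, segment_size):
            total += 1
            unique.add(hashlib.sha1(seg).digest())  # fingerprint identifies the segment
    return total, len(unique)


# Hypothetical data: three data sets of 13 one-byte segments over six distinct values.
data_sets = [b"ABCDEFABCDEFA", b"BCDEFABCDEFAB", b"CDEFABCDEFABC"]
total, unique = dedupe_stats(data_sets, segment_size=1)
print(f"Before: total segments = {total}; after: unique segments = {unique}")
```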

Data De-duplication: How it Works

First Instance (May 2007): segments A, B, C, D. Only unique data segments are backed up. Unique data is stored on disk, available for immediate recovery.

Duplicate Instance (May 2007): segments A, B, C, D. Data already backed up, so only a unique ID pointer is stored (20 bytes).

Modified Instance (June 2008): segments B, C, D, E. New data segment E is identified and backed up.
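The first, duplicate and modified instances can be traced through a small content-addressed store. This is a sketch under stated assumptions: the `DedupeStore` class is hypothetical, and SHA-1 is used only because its digest happens to be 20 bytes, matching the pointer size on the slide. The slide does not say which fingerprint is used.

```python
import hashlib


class DedupeStore:
    """Toy backup store that keeps each unique segment once."""

    def __init__(self):
        self.segments = {}  # digest -> segment bytes, stored once
        self.backups = {}   # backup name -> ordered list of digests (pointers)

    def backup(self, name, segments):
        """Record a backup and return how many new segments had to be stored."""
        new_segments = 0
        pointers = []
        for seg in segments:
            digest = hashlib.sha1(seg).digest()  # 20-byte pointer
            if digest not in self.segments:
                self.segments[digest] = seg
                new_segments += 1
            pointers.append(digest)
        self.backups[name] = pointers
        return new_segments

    def restore(self, name):
        """Rebuild a backup from its pointers."""
        return [self.segments[d] for d in self.backups[name]]


store = DedupeStore()
first = store.backup("first", [b"A", b"B", b"C", b"D"])
duplicate = store.backup("duplicate", [b"A", b"B", b"C", b"D"])
modified = store.backup("modified", [b"B", b"C", b"D", b"E"])
print(f"New segments stored: first={first}, duplicate={duplicate}, modified={modified}")
```

Each backup still restores in full from its pointer list, even though the duplicate instance stored no new segments and the modified instance stored only E.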

Key Point – Data Minimisation requires a platform that doesn’t need to be backed up!

WORM Disk – Active Archiving

WORM delivers unique features for online archives

• Location independence

• Self-healing and management

• Guaranteed authenticity

• Single-instancing

Tier 3 Disk – Online Archiving

Tier 3 Disk with SATA and NAS with ATA

Tape – Offline Archiving

Tape is best suited for offline archives

[Chart: Customer Archival Requirements. Axes: Archiving Functionality and Management Efficiency.]

Data Minimisation Strategy - How it all fits together

Tier 1: Primary Storage

Tier 2: Secondary Storage

Tier 3: Archive, long-term retention on disk (80% of data)

Tier 4: Backup to disk (De-Dupe), quick recovery (optional 20%)

Tier 5: Legacy long-term retention on tape (optional 20%)

[Diagram labels: Daily data backups; O H; De-duped Data; Static Data growth; Tier 3 Data Growth; No management required]

Quantified Results – Reduce Tier 1/2 with Archiving

Major reduction in expensive Tier 1/2 Storage

Tier 3 Archive storage minimised due to single instancing & compression

73% reduction in power and cooling requirements for archived data

Quantified Results – The Data Minimisation Leverage

Good Tier 4 Savings with Archiving or De-Duplication

Excellent results by combining Archiving & Backup Data De-Duplication

6x reduction in power and cooling requirements for B2D storage

Quantified Results – Less Tape Infrastructure

Associated reduction in Tape Library Slots, Drives, Management & Handling

Power of combining Archiving & De-Duplication – 560 fewer LTO4 Tapes in Year 3

Tape could be removed altogether – Offsite Replication & Disk Spin-Down

Data management cost comparison – Data Minimisation

Significant Reduction of Backup Infrastructure and Tape Management

• Tape Drive, Tape Licences, Slots, Library, Backup Server, Tape Media, Offsite Storage & Recall Costs, Admin Costs

© Copyright Dimension Data 2000 - 2006

Data Minimisation Assessment – Business Case

• Current backup minimisation methods give you more efficient backups

• However, they don't fix the cause of the problem, which is data growth

• A combination of data archival, backup de-duplication and compression represents the most effective manner to contain data within your environment

• Helps quantify business case for archiving (or other appropriate solution)

• Workshop to identify costs/issues

© Copyright Dimension Data 2000 - 2008

Data Minimisation – Input Variables


Data Minimisation – Graphical View


Data Minimisation – Graphical View (Cont.)


Data minimisation strategy achieved by...

Archiving over 70% of data to a protected environment, which removed the need to back up that data

Minimised the impact of data backup via de-duplication and compression (reduction in data volume and backup data by 80%)

Minimised the impact of VMware on the environment through de-duplication

Contained Tier 1 disk growth and spend

Provided the most storage efficient backup method possible today

Estimated savings of over 5 million dollars in 5 years.

‘My initial sync took 12 hours; now I back up in 50 mins’ – Dimension Data Customer

[Chart: Estimated Infrastructure Run Rate, 2006 to 2016. Axes: Units / kW / Tons (0 to 4,500) and Footprint in sq. ft. (0 to 25,000). Series: Equipment (Units), Power (kW), Cooling (Tons), Footprint (sq. ft.).]

[Chart: cost comparison in K$, y-axis $0 to $4,500]

                  Year 1    Year 2    Year 3    Total
Cost BAU          $708      $1,410    $2,107    $4,226
Cost Optimized    $278      $560      $840      $1,678
Savings           $430      $850      $1,267    $2,548
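The savings row is the difference between the business-as-usual (BAU) and optimised costs. A short sketch recomputing it from the yearly figures on the slide (in K$) follows; the variable names are illustrative.

```python
# Yearly costs in K$, taken from the slide's table.
cost_bau = {"Year 1": 708, "Year 2": 1410, "Year 3": 2107}
cost_optimized = {"Year 1": 278, "Year 2": 560, "Year 3": 840}

# Savings per year: BAU cost minus optimised cost.
savings = {year: cost_bau[year] - cost_optimized[year] for year in cost_bau}

total_bau = sum(cost_bau.values())
total_savings = sum(savings.values())
saving_ratio = total_savings / total_bau

for year, value in savings.items():
    print(f"{year}: saves ${value:,}K")
print(f"Three-year saving: ${total_savings:,}K ({saving_ratio:.0%} of BAU)")
```

The per-year savings reproduce the slide's row exactly. The summed totals can differ by 1 from the slide's Total column, presumably because the source figures were rounded.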


Questions & Answers