Using Classification for Data Security and Data Management
Clyde Law, Software Design Engineer, Microsoft Corporation
SVR02




Agenda

> Motivation
> File Classification Infrastructure (FCI) Overview and Demo
> FCI Architecture
> Retrieving Properties from Files
> Custom File Management Tasks
> FCI Extensions
> Extensibility Demo

Data Management Challenges

Replication

Backup

HSM

Security

Archive

Encryption

Expiration

Storage Growth

Storage Costs

Compliance

Security and Information Leakage

Increasing data management needs with disparate data management products

Managing Data by Location

Business

IT

Need per-project file share

Ensure business secret files do not leak out

Back up files with personal information to encrypted store

Expire low business impact files created over three years ago and not touched in the past year
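
As a rough illustration of the expiration policy above, the check could be written as a simple predicate. This is a minimal sketch, not FCI code: `FileInfo`, `ShouldExpire`, and the "Low" impact label are hypothetical names, "touched" is assumed to mean the last access or modification time, and a year is approximated as 365 days.

```cpp
#include <cassert>
#include <chrono>
#include <string>

typedef std::chrono::system_clock::time_point TimePoint;

// Hypothetical, simplified view of a file's metadata (not an FCI type).
struct FileInfo {
    std::string businessImpact;  // assumed labels such as "Low" or "High"
    TimePoint created;
    TimePoint lastTouched;       // assumed: last access or modification
};

// Returns true when the file is low business impact, was created more than
// three years before `now`, and has not been touched in the year before `now`.
// A year is approximated as 365 days.
bool ShouldExpire(const FileInfo& file, TimePoint now) {
    const std::chrono::hours day(24);
    const std::chrono::hours threeYears = 3 * 365 * day;
    const std::chrono::hours oneYear = 365 * day;
    return file.businessImpact == "Low"
        && (now - file.created) > threeYears
        && (now - file.lastTouched) > oneYear;
}
```

With location-based management, a rule like this has to be re-implemented per share; with classification, the "Low" label travels with the file.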

Managing Data using Classification

Mitigate costs and risks

Manage data based on business value

Classify data

Apply policy

File Classification Infrastructure

Classify Manage Report Extend

Introducing the File Classification Infrastructure

Clyde Law, Software Design Engineer, File Server Management Team

demo

Benefits of Classification

Manage Risk

Find sensitive files on public servers

Watermark documents with confidential data

Encrypt backups of files with personal information

Apply rights management to high-secrecy files

Comply with retention policies

Reduce Cost

Optimize backup SLAs

Replicate only business-related documents

Expire files to reduce storage purchasing needs

Move files to less expensive storage

Available in Windows

Extend through IT or ISV solutions

FCI Architecture
Classification Pipeline

> Designed to enable an ecosystem around classification
> Comprehensive API for solutions
> Extensible classification infrastructure

Discover Data

Extract Existing Classification Properties

Classify Data

Store Classification Properties

Apply Policies Based on Classification

File Classification Extensibility Points

Get/Set Property API for external applications

Get/Set Property API

> Consume properties by specifying files
> Automation-compatible COM API
  > Works with native code, managed code, or scripts
> Available through classification manager object
> Set is meant for manual classification
  > Use extensibility modules instead to extend rule-based automatic classification

Get/Set Property API
Using PowerShell

# Get an instance of the Classification Manager
$cm = New-Object -ComObject Fsrm.FsrmClassificationManager

# Enumerate and display all properties associated with a file
$props = $cm.EnumFileProperties("P:\foo\bar.txt", 0)
foreach ($prop in $props) {
    Write-Host $prop.Name = $prop.Value
}

# Get and display the value of the "Secrecy" property
$secrecyProp = $cm.GetFileProperty("P:\foo\bar.txt", "Secrecy", 0)
Write-Host $secrecyProp.Value

# Set the value of the "Secrecy" property to "High"
$cm.SetFileProperty("P:\foo\bar.txt", "Secrecy", "High")

Get/Set Property API
Using native C++

// Get an instance of the Classification Manager
CComPtr<IFsrmClassificationManager> spClassMgr;
HRESULT hr = CoCreateInstance(CLSID_FsrmClassificationManager,
                              NULL,
                              CLSCTX_LOCAL_SERVER,
                              __uuidof(IFsrmClassificationManager),
                              reinterpret_cast<void**>(&spClassMgr));
// Check FAILED(hr) before using spClassMgr

// Get the "PII" property (no special retrieval options)
CComBSTR bstrFilename(L"P:\\foo\\bar.txt");
CComBSTR bstrPropName(L"PII");
CComPtr<IFsrmProperty> spPIIProp;
hr = spClassMgr->GetFileProperty(bstrFilename,
                                 bstrPropName,
                                 FsrmGetFilePropertyOptions_None,
                                 &spPIIProp);

Custom File Management Tasks

> Apply policies by running custom commands on files that match specified criteria
> Faster than scanning and retrieving properties yourself
> No control over file order
> Task runs command in a new process per file

FCI Extensions

> Classification modules
  > Determine values of properties to apply to files
  > Available in Windows:
    > Folder classifier – assigns properties based on file location
    > Content classifier – assigns properties based on string and regular expression matches in file content
> Storage modules
  > Supply and persist properties associated with files
  > Available in Windows:
    > System storage module for all file types
      > Uses NTFS named stream to store properties
      > Functions as a cache for fast retrieval
    > Office 97-2003 and Office 2007 in-file storage
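
The two built-in classifiers listed above decide by file location and by string or regular-expression matches in content. The sketch below only illustrates those two decisions in plain C++; it is not how the Windows modules are implemented. `ContentRule`, `RuleMatches`, and `IsUnderFolder` are hypothetical names, and the prefix check is case-sensitive for brevity even though Windows paths usually are not.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical content rule: when `pattern` matches anywhere in the file
// content, the property `property` should receive `value`.
struct ContentRule {
    std::string property;
    std::string value;
    std::regex pattern;
};

// Content-style decision: true when the rule's regular expression matches
// somewhere in the content.
bool RuleMatches(const ContentRule& rule, const std::string& content) {
    return std::regex_search(content, rule.pattern);
}

// Folder-style decision: true when `path` starts with `folderPrefix`.
// Case-sensitive here for simplicity.
bool IsUnderFolder(const std::string& path, const std::string& folderPrefix) {
    return path.compare(0, folderPrefix.size(), folderPrefix) == 0;
}
```

A rule such as `{"PII", "Yes", std::regex("\\d{3}-\\d{2}-\\d{4}")}` would flag content containing a pattern shaped like a U.S. Social Security number.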

Pipeline Anatomy

[Diagram: a Classification Runtime Process and three Hosting Processes. Pipeline stages and the modules shown for each:]

> Discover Data: Scanner (gets basic file properties)
> Extract Properties: Office Storage [Load] (loads embedded properties)
> Classify Data: Folder Classifier (classifies based on location), Content Classifier (classifies based on content)
> Store Properties: Office Storage [Save] (saves embedded properties)
> Apply Policies: Reporting Engine (adds files to report)

> Streams can cross processes
  > Security checks are performed on cross-process data transfers
> Most modules are hosted within a separate process
> Each module passes streams of property bags to the next one
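
The hand-off between modules can be pictured as a chain of stages that each receive a property bag, possibly add properties, and pass it on. This is a minimal single-process sketch under that assumption; `PropertyBag`, `Stage`, and `RunPipeline` are hypothetical stand-ins, and the real pipeline streams bags between COM modules in separate hosting processes.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a property bag: a file path plus the
// name/value properties accumulated as the bag moves through the pipeline.
struct PropertyBag {
    std::string path;
    std::map<std::string, std::string> properties;
};

// A stage inspects and possibly updates one property bag.
typedef std::function<void(PropertyBag&)> Stage;

// Sends every bag through every stage in order, so each stage sees the
// properties assigned by the stages before it.
void RunPipeline(std::vector<PropertyBag>& bags,
                 const std::vector<Stage>& stages) {
    for (PropertyBag& bag : bags) {
        for (const Stage& stage : stages) {
            stage(bag);
        }
    }
}
```

Because later stages read what earlier ones wrote, a reporting stage can act on a property that a classifier stage assigned moments before.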

Custom Pipeline Modules

> Register module by creating a module definition through the Classification Manager
  > Typically once during installation
> Module is a COM server that implements IFsrmClassifierModuleImplementation or IFsrmStorageModuleImplementation
  > Both native and managed are supported
> Pipeline calls OnLoad to initialize module
  > Module needs to return connector object to connect hosting process
  > Instructions in MSDN documentation

Classifier Modules
Models for classification

> Yes/no
  > Pipeline asks module whether or not a property value applies to the file
> Explicit value
  > Pipeline asks module what value to assign to a specified property
> Controlled by NeedsExplicitValue flag in module definition

Classifier Modules
Classification session call sequence

> UseRulesAndDefinitions called at start of session
  > Module can choose to cache these rules
> For each file:
  > OnBeginFile – specifies the property bag of the file to classify and the rules to classify it with
    > Module can choose to process file right away
  > For each rule:
    > Yes/no – DoesPropertyValueApply
      > Return TRUE or FALSE
    > Explicit value – GetPropertyValueToApply
      > Return value to apply, or return error code FSRM_E_NO_PROPERTY_VALUE if no value should be applied
  > OnEndFile – indicates end of file processing
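
The call sequence above can be mocked up in plain C++ to show the order of calls and the difference between the two models. This is a sketch only: the method names are borrowed from the slide, but the signatures, `Rule`, `PropertyBag`, `KeywordClassifier`, and `RunSession` are simplified and hypothetical (the real methods are COM methods with different parameters). A `false` return from `GetPropertyValueToApply` stands in for FSRM_E_NO_PROPERTY_VALUE.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins for the COM types.
struct Rule {
    std::string property;  // property the rule assigns, e.g. "Secrecy"
    std::string value;     // value the rule assigns, e.g. "High"
    std::string keyword;   // text this toy classifier searches for
};

struct PropertyBag {
    std::string path;
    std::string content;
    std::map<std::string, std::string> properties;
};

// Toy classifier whose methods mirror the callback names on the slide.
class KeywordClassifier {
public:
    // Called once at the start of a session; this module caches the rules.
    void UseRulesAndDefinitions(const std::vector<Rule>& rules) { rules_ = rules; }

    // Called once per file, before any rule is evaluated.
    void OnBeginFile(const PropertyBag& bag) { current_ = &bag; }

    // Yes/no model: does the rule's value apply to the current file?
    bool DoesPropertyValueApply(const Rule& rule) const {
        return current_ != nullptr &&
               current_->content.find(rule.keyword) != std::string::npos;
    }

    // Explicit-value model: writes the value to apply into `valueOut`.
    // Returning false plays the role of FSRM_E_NO_PROPERTY_VALUE.
    bool GetPropertyValueToApply(const Rule& rule, std::string& valueOut) const {
        if (!DoesPropertyValueApply(rule)) return false;
        valueOut = rule.value;
        return true;
    }

    // Called once per file, after the last rule.
    void OnEndFile() { current_ = nullptr; }

private:
    std::vector<Rule> rules_;
    const PropertyBag* current_ = nullptr;
};

// Drives one session in the order listed above, using the yes/no model.
void RunSession(KeywordClassifier& module, const std::vector<Rule>& rules,
                std::vector<PropertyBag>& files) {
    module.UseRulesAndDefinitions(rules);
    for (PropertyBag& file : files) {
        module.OnBeginFile(file);
        for (const Rule& rule : rules) {
            if (module.DoesPropertyValueApply(rule)) {
                file.properties[rule.property] = rule.value;
            }
        }
        module.OnEndFile();
    }
}
```

In the yes/no model the value comes from the rule and the module only votes; in the explicit-value model the module supplies the value itself, which suits classifiers that compute a result rather than confirm one.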

Storage Modules

> Supply or persist properties associated with file
> Two types supported: InFile and Database
  > Cache is reserved for the built-in System Cache Module
> Capabilities field in module definition determines whether module is instantiated for loading and/or saving properties
  > Separate instances created for load and save
> LoadProperties – provide property values by calling SetFileProperty in the property bag
> SaveProperties – retrieve properties in the property bag and persist them
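
The load/save split can be illustrated with a toy storage module backed by an in-memory map. This is a sketch under stated assumptions, not the FCI interface: `PropertyBag`, `InMemoryStorageModule`, and the method signatures are hypothetical, and a single instance handles both directions here, whereas the pipeline creates separate instances for load and save.

```cpp
#include <cassert>
#include <map>
#include <string>

typedef std::map<std::string, std::string> Properties;

// Hypothetical property bag: the pipeline's view of one file.
struct PropertyBag {
    std::string path;
    Properties properties;

    void SetFileProperty(const std::string& name, const std::string& value) {
        properties[name] = value;
    }
};

// Toy "Database"-style storage module that keeps properties in memory,
// keyed by file path.
class InMemoryStorageModule {
public:
    // Load: supply every stored value for this file by calling
    // SetFileProperty on the bag.
    void LoadProperties(PropertyBag& bag) const {
        auto it = store_.find(bag.path);
        if (it == store_.end()) return;  // nothing stored for this file
        for (const auto& entry : it->second) {
            bag.SetFileProperty(entry.first, entry.second);
        }
    }

    // Save: read the properties in the bag and persist them.
    void SaveProperties(const PropertyBag& bag) {
        store_[bag.path] = bag.properties;
    }

private:
    std::map<std::string, Properties> store_;
};
```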

Accessing File Contents

> Modules should never open files directly
  > May not have proper permissions
  > Stream state may not be consistent with metadata
> Use GetFileStreamInterface in the property bag
  > Supports ILockBytes and IStream interfaces
  > Takes care of getting the right permissions
  > Ensures last access and last modified times are unchanged
  > Ensures changes are properly committed (for storage modules)

PowerShell Host Classifier

> Included in Windows SDK
> Presents itself as a classifier to FCI that hosts PowerShell scripts to do the actual classification
> Create custom classifiers without compiling and registering your own modules
> Simpler to build, but has slower performance
> Intended for in-house IT solutions and prototyping
> More information at http://blogs.technet.com/filecab/archive/2009/08/14/using-windows-powershell-scripts-for-file-classification.aspx

Putting it all together

Clyde Law, Software Design Engineer, File Server Management Team

demo

Developer Opportunities
Call to action

> FCI provides many avenues to be part of end-to-end data lifecycle management solutions
  > Classifiers – provide classification based on content, identity, regulations, etc.
  > Data management products – leverage classification in solutions for backup, archival, leakage prevention, etc.
  > Storage modules – provide property storage for new file formats
> Flexible COM API
  > Native code, managed code, or scripting
  > PowerShell support enables fast deployment of solutions

Additional Resources

> FCI Overview
  > http://microsoft.com/fci/
> Microsoft TechNet
  > http://technet.microsoft.com/en-us/library/dd758765%28WS.10%29.aspx
  > http://technet.microsoft.com/en-us/library/dd758756%28WS.10%29.aspx
> Developing for FCI
  > Windows SDK
    > http://msdn.microsoft.com/en-us/windows/bb980924.aspx
  > FSRM API Documentation on MSDN
    > http://msdn.microsoft.com/en-us/library/bb972746%28VS.85%29.aspx
  > FCI Code Gallery
    > http://code.msdn.microsoft.com/fci/

Contact Us

> Storage Team Blog
  > http://blogs.technet.com/filecab/default.aspx
> E-mail
  > FCI Team
    > [email protected]
  > Clyde Law, Developer
    > [email protected]
  > Matthias Wollnik, Program Manager
    > [email protected]

YOUR FEEDBACK IS IMPORTANT TO US!

Please fill out session evaluation forms online at MicrosoftPDC.com

Learn More On Channel 9

> Expand your PDC experience through Channel 9

> Explore videos, hands-on labs, sample code and demos through the new Channel 9 training courses

channel9.msdn.com/learn

Built by Developers for Developers…

© 2009 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.

Appendix
Property aggregation and conflict resolution

> Values from Storage:
  > In-file > Database > Cache
> Values from Classification Rules:
  > Default values applied once if not already present
  > Can also choose to explicitly aggregate or overwrite existing values
> Ordered lists, Booleans, Multi-choice lists, and Multi-strings can be aggregated

[Default] Apply only if there is no value stored in the file

[Consider Existing] Apply but aggregate with values from Storage and Default rules

[Ignore Existing] Apply and ignore (replace) values from Storage and Default rules
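
The three rule modes can be sketched for a multi-string property, treating aggregation as a set union. This is an illustration of the slide's wording only, under several assumptions: `Values`, `RuleMode`, `PickStoredValue`, and `ApplyRule` are hypothetical names, "In-file > Database > Cache" is read as a precedence order, and only one rule is combined with the existing value at a time (how FCI orders several rules of different modes is not covered here).

```cpp
#include <cassert>
#include <set>
#include <string>

typedef std::set<std::string> Values;  // one multi-string property value

enum class RuleMode { Default, ConsiderExisting, IgnoreExisting };

// Picks the stored value by assumed precedence: in-file first, then
// database, then cache. An empty set means "no value from that source".
Values PickStoredValue(const Values& inFile, const Values& database,
                       const Values& cache) {
    if (!inFile.empty()) return inFile;
    if (!database.empty()) return database;
    return cache;
}

// Combines the existing value with the value produced by one rule.
Values ApplyRule(const Values& existing, RuleMode mode, const Values& fromRule) {
    switch (mode) {
        case RuleMode::Default:
            // Apply only if there is no value yet.
            return existing.empty() ? fromRule : existing;
        case RuleMode::ConsiderExisting: {
            // Aggregate: union of existing and rule values.
            Values merged = existing;
            merged.insert(fromRule.begin(), fromRule.end());
            return merged;
        }
        case RuleMode::IgnoreExisting:
            // Replace whatever was there.
            return fromRule;
    }
    return existing;  // not reached for valid modes
}
```

For Booleans or ordered lists the union would be replaced by a type-specific combination, which is left out of this sketch because the slide does not define it.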

Appendix
Property bags

> Property bag object holds the metadata of a file being classified
> The object flows through the classification pipeline
> Each pipeline module can assign property values

[Diagram: contents of a property bag and of each property it holds]

Property Bag
> File System Info: Relative Path, Creation Time, etc.
> Properties
> Messages
> Read Stream, Write Stream
> Current Context: Module Type, Rule, etc.

Property
> Name
> Type
> Assigned Values and Sources
  > From Storage Modules
  > From Default and CE (Consider Existing) Rules
  > From IE (Ignore Existing) Rules
> Aggregated Value
> Aggregated Sources

Appendix
Connecting a module to the pipeline

STDMETHODIMP CCustomModule::OnLoad(
    __in IFsrmPipelineModuleDefinition *pDefinition,
    __deref_out IFsrmPipelineModuleConnector **ppModuleConnector
    )
{
    HRESULT hr = S_OK;
    ...perform module initialization...

    // Create the connector
    CComPtr<IFsrmPipelineModuleConnector> spConnector;
    hr = CoCreateInstance(CLSID_FsrmPipelineModuleConnector,
                          NULL,
                          CLSCTX_LOCAL_SERVER,
                          __uuidof(IFsrmPipelineModuleConnector),
                          reinterpret_cast<void**>(&spConnector));
    ...handle any errors...

    // Get this module's implementation interface
    CComQIPtr<IFsrmPipelineModuleImplementation> spModuleImpl = GetControllingUnknown();
    if (spModuleImpl == NULL)
        ...handle error...

    // Bind the connector to the module
    hr = spConnector->Bind(pDefinition, spModuleImpl);
    ...handle any errors...

    // Return the connector
    *ppModuleConnector = spConnector.Detach();

    return hr;
}