Using Classification for Data Security and Data Management
Clyde Law
Software Design Engineer
Microsoft Corporation
SVR02
Agenda
> Motivation
> File Classification Infrastructure (FCI) Overview and Demo
> FCI Architecture
> Retrieving Properties from Files
> Custom File Management Tasks
> FCI Extensions
> Extensibility Demo
Data Management Challenges
Replication
Backup
HSM
Security
Archive
Encryption
Expiration
Storage Growth
Storage Costs
Compliance
Security and Information Leakage
Increasing data management needs with disparate data management products
Managing Data by Location
Business
IT
Need per-project file share
Ensure business secret files do not leak out
Back up files with personal information to encrypted store
Expire low business impact files created over three years ago and not touched in the past year
Managing Data using Classification
Mitigate costs and risks
Manage data based on business value
Classify data
Apply policy
File Classification Infrastructure
Classify Manage Report Extend
Introducing the File Classification Infrastructure
Clyde Law
Software Design Engineer
File Server Management Team
demo
Benefits of Classification
Manage Risk
Find sensitive files on public servers
Watermark documents with confidential data
Encrypt backups of files with personal information
Apply rights management to high-secrecy files
Comply with retention policies
Reduce Cost
Optimize backup SLAs
Replicate only business-related documents
Expire files to reduce storage purchasing needs
Move files to less expensive storage
Available in Windows
Extend through IT or ISV solutions
FCI Architecture
Classification Pipeline
> Designed to enable an ecosystem around classification
> Comprehensive API for solutions
> Extensible classification infrastructure
Discover Data
Extract Existing Classification Properties
Classify Data
Store Classification Properties
Apply Policies Based on Classification
File Classification Extensibility Points
Get/Set Property API for external applications
Get/Set Property API
> Consume properties by specifying files
> Automation-compatible COM API
  > Works with native code, managed code, or scripts
> Available through classification manager object
> Set is meant for manual classification
  > Use extensibility modules instead to extend rule-based automatic classification
Get/Set Property API
Using PowerShell
# Get an instance of the Classification Manager
$cm = New-Object -ComObject Fsrm.FsrmClassificationManager
# Enumerate and display all properties associated with a file
$props = $cm.EnumFileProperties("P:\foo\bar.txt", 0)
foreach ($prop in $props) {
    Write-Host $prop.Name = $prop.Value
}
# Get and display the value of the "Secrecy" property
$secrecyProp = $cm.GetFileProperty("P:\foo\bar.txt", "Secrecy", 0)
Write-Host $secrecyProp.Value
# Set the value of the "Secrecy" property to "High"
$cm.SetFileProperty("P:\foo\bar.txt", "Secrecy", "High")
Get/Set Property API
Using native C++
// Get an instance of the Classification Manager
CComPtr<IFsrmClassificationManager> spClassMgr;
HRESULT hr = CoCreateInstance(CLSID_FsrmClassificationManager,
                              NULL,
                              CLSCTX_LOCAL_SERVER,
                              __uuidof(IFsrmClassificationManager),
                              reinterpret_cast<void**>(&spClassMgr)); // CoCreateInstance expects void**
...handle any errors...
// Get the "PII" property
CComBSTR bstrFilename(L"P:\\foo\\bar.txt");
CComBSTR bstrPropName(L"PII");
CComPtr<IFsrmProperty> spPIIProp;
hr = spClassMgr->GetFileProperty(bstrFilename, bstrPropName, 0, &spPIIProp);
Custom File Management Tasks
> Apply policies by running custom commands on files that match specified criteria
> Faster than scanning and retrieving properties yourself
> No control on file order
> Task runs command in new process per file
FCI Extensions
> Classification modules
  > Determine values of properties to apply to files
  > Available in Windows:
    > Folder classifier – assigns properties based on file location
    > Content classifier – assigns properties based on string and regular expression matches in file content
> Storage modules
  > Supply and persist properties associated with files
  > Available in Windows:
    > System storage module for all file types
      > Uses NTFS named stream to store properties
      > Functions as a cache for fast retrieval
    > Office 97-2003 and Office 2007 in-file storage
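The content classifier listed above assigns properties from string and regular expression matches. A minimal sketch of that idea in plain C++ follows; it does not use the FSRM API, and the helper names, the "PII" property, and the pattern are hypothetical choices made only for illustration.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: true when the text contains something shaped like
// ddd-dd-dddd. The pattern is illustrative only, not an FCI built-in rule.
inline bool looksLikePii(const std::string& content) {
    static const std::regex pattern(R"(\b\d{3}-\d{2}-\d{4}\b)");
    return std::regex_search(content, pattern);
}

// Hypothetical rule: map a content match to a value for a "PII" property.
inline std::string classifyPii(const std::string& content) {
    return looksLikePii(content) ? "Yes" : "No";
}
```

In FCI itself the match would be expressed as a rule on the built-in content classifier rather than as custom code; the sketch only shows the matching step.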
Pipeline Anatomy
Classification Runtime Process
Hosting Process (shown three times in the diagram)
Modules:
> Scanner: gets basic file properties
> Office Storage [Load]: loads embedded properties
> Folder Classifier: classifies based on location
> Content Classifier: classifies based on content
> Office Storage [Save]: saves embedded properties
> Reporting Engine: adds files to report
Pipeline stages:
> Discover Data
> Extract Properties
> Classify Data
> Store Properties
> Apply Policies
Streams can cross processes
> Security checks are performed on cross-process data transfers
Most modules are hosted within a separate process
Each module passes streams of property bags to the next one
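The hand-off between modules can be pictured with a small model in plain C++. This is only a sketch under assumptions: `PropertyBag`, `Module`, `runPipeline`, and the "hr/" folder rule are hypothetical stand-ins, everything runs in one process, and bags are processed one at a time rather than streamed.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for an FCI property bag: a path plus its properties.
struct PropertyBag {
    std::string relativePath;
    std::map<std::string, std::string> properties;
};

// A module is modelled as a function that may read or modify the bag.
using Module = std::function<void(PropertyBag&)>;

// Hand each bag to every module in order, mimicking the pipeline stages.
inline void runPipeline(const std::vector<Module>& modules,
                        std::vector<PropertyBag>& bags) {
    for (PropertyBag& bag : bags) {
        for (const Module& module : modules) {
            module(bag);
        }
    }
}

// Hypothetical folder classifier: tags every file under "hr/".
inline void folderClassifier(PropertyBag& bag) {
    if (bag.relativePath.rfind("hr/", 0) == 0) {
        bag.properties["PII"] = "Yes";
    }
}
```

Later modules see whatever earlier modules assigned, which is why the order of modules in the list matters in this model.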
Custom Pipeline Modules
> Register module by creating a module definition through the Classification Manager
  > Typically once during installation
> Module is a COM server that implements IFsrmClassifierModuleImplementation or IFsrmStorageModuleImplementation
  > Both native and managed are supported
> Pipeline calls OnLoad to initialize module
  > Module needs to return connector object to connect hosting process
  > Instructions in MSDN documentation
Classifier Modules
Models for classification
> Yes/no
  > Pipeline asks module whether or not a property value applies to the file
> Explicit value
  > Pipeline asks module what value to assign to a specified property
> Controlled by NeedsExplicitValue flag in module definition
Classifier Modules
Classification session call sequence
> UseRulesAndDefinitions called at start of session
  > Module can choose to cache these rules
> For each file:
  > OnBeginFile – specifies the property bag of the file to classify and the rules to classify it with
    > Module can choose to process file right away
  > For each rule:
    > Yes/no – DoesPropertyValueApply
      > Return TRUE or FALSE
    > Explicit value – GetPropertyValueToApply
      > Return value to apply, or return error code FSRM_E_NO_PROPERTY_VALUE if no value should be applied
  > OnEndFile – indicates end of file processing
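The call sequence above can be walked through with a simplified model in plain C++. The method names follow the slide, but the signatures, the `Rule` and `FileBag` types, and the keyword-matching logic are assumptions made for illustration and are not the FSRM COM interfaces. A `false` return from `GetPropertyValueToApply` stands in for FSRM_E_NO_PROPERTY_VALUE.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-ins for a rule and a file's property bag.
struct Rule { std::string property; std::string keyword; std::string value; };
struct FileBag { std::string path; std::string content; };

class KeywordClassifier {
public:
    // Start of session: the module may cache the rules.
    void UseRulesAndDefinitions(const std::vector<Rule>& rules) { rules_ = rules; }
    // Start of file: remember which file is being classified.
    void OnBeginFile(const FileBag& bag) { current_ = &bag; }
    // Yes/no model: does this rule's value apply to the current file?
    bool DoesPropertyValueApply(const Rule& rule) const {
        return current_ != nullptr &&
               current_->content.find(rule.keyword) != std::string::npos;
    }
    // Explicit-value model: write the value and return true, or return
    // false when no value should be applied.
    bool GetPropertyValueToApply(const Rule& rule, std::string* value) const {
        if (!DoesPropertyValueApply(rule)) return false;
        *value = rule.value;
        return true;
    }
    // End of file: drop the reference to the current file.
    void OnEndFile() { current_ = nullptr; }
    const std::vector<Rule>& rules() const { return rules_; }
private:
    std::vector<Rule> rules_;
    const FileBag* current_ = nullptr;
};

// Drives one file through the sequence, as the pipeline would, and
// returns the (property, value) pairs that were assigned.
inline std::vector<std::pair<std::string, std::string>>
classifyFile(KeywordClassifier& classifier, const FileBag& bag) {
    std::vector<std::pair<std::string, std::string>> assigned;
    classifier.OnBeginFile(bag);
    for (const Rule& rule : classifier.rules()) {
        std::string value;
        if (classifier.GetPropertyValueToApply(rule, &value)) {
            assigned.push_back(std::make_pair(rule.property, value));
        }
    }
    classifier.OnEndFile();
    return assigned;
}
```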
Storage Modules
> Supply or persist properties associated with file
> Two types supported: InFile and Database
  > Cache is reserved for the built-in System Cache Module
> Capabilities field in module definition determines whether module is instantiated for loading and/or saving properties
  > Separate instances created for load and save
> LoadProperties – provide property values by calling SetFileProperty in the property bag
> SaveProperties – retrieve properties in the property bag and persist them
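A rough model of the load and save roles, written in plain C++ rather than against the FSRM interfaces. `Bag` and `DatabaseStorage` are hypothetical; the in-memory map stands in for an external database, and a single object handles both roles here even though FCI creates separate instances for load and save.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for the FCI property bag.
struct Bag {
    std::string path;
    std::map<std::string, std::string> props;
    void SetFileProperty(const std::string& name, const std::string& value) {
        props[name] = value;
    }
};

// Models a "Database"-type storage module: properties live outside the
// file, keyed by path.
class DatabaseStorage {
public:
    // Load: supply stored values by calling SetFileProperty on the bag.
    void LoadProperties(Bag& bag) const {
        auto it = db_.find(bag.path);
        if (it == db_.end()) return;
        for (const auto& entry : it->second) {
            bag.SetFileProperty(entry.first, entry.second);
        }
    }
    // Save: read the properties in the bag and persist them.
    void SaveProperties(const Bag& bag) { db_[bag.path] = bag.props; }
private:
    std::map<std::string, std::map<std::string, std::string>> db_;
};
```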
Accessing File Contents
> Modules should never open files directly
  > May not have proper permissions
  > Stream state may not be consistent with metadata
> Use GetFileStreamInterface in the property bag
  > Supports ILockBytes and IStream interfaces
  > Takes care of getting the right permissions
  > Ensures last access and last modified times are unchanged
  > Ensures changes are properly committed (for storage modules)
PowerShell Host Classifier
> Included in Windows SDK
> Presents itself as a classifier to FCI that hosts PowerShell scripts to do the actual classification
> Create custom classifiers without compiling and registering your own modules
> Simpler to build, but has slower performance
> Intended for in-house IT solutions and prototyping
> More information at http://blogs.technet.com/filecab/archive/2009/08/14/using-windows-powershell-scripts-for-file-classification.aspx
Developer Opportunities
Call to action
> FCI provides many avenues to be part of end-to-end data lifecycle management solutions
  > Classifiers – provide classification based on content, identity, regulations, etc.
  > Data management products – leverage classification in solutions for backup, archival, leakage prevention, etc.
  > Storage modules – provide property storage for new file formats
> Flexible COM API
  > Native code, managed code, or scripting
  > PowerShell support enables fast deployment of solutions
Additional Resources
> FCI Overview
  > http://microsoft.com/fci/
> Microsoft TechNet
  > http://technet.microsoft.com/en-us/library/dd758765%28WS.10%29.aspx
  > http://technet.microsoft.com/en-us/library/dd758756%28WS.10%29.aspx
> Developing for FCI
  > Windows SDK
    > http://msdn.microsoft.com/en-us/windows/bb980924.aspx
  > FSRM API Documentation on MSDN
    > http://msdn.microsoft.com/en-us/library/bb972746%28VS.85%29.aspx
  > FCI Code Gallery
    > http://code.msdn.microsoft.com/fci/
Contact Us
> Storage Team Blog
  > http://blogs.technet.com/filecab/default.aspx
> E-mail
  > FCI Team
    > [email protected]
  > Clyde Law, Developer
    > [email protected]
  > Matthias Wollnik, Program Manager
YOUR FEEDBACK IS IMPORTANT TO US!
Please fill out session evaluation forms online at MicrosoftPDC.com
© 2009 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.
Appendix
Property aggregation and conflict resolution
> Values from Storage:
  > In-file > Database > Cache
> Values from Classification Rules:
  > Default values applied once if not already present
  > Can also choose to explicitly aggregate or overwrite existing values
> Ordered lists, Booleans, Multi-choice lists, and Multi-strings can be aggregated
[Default] Apply only if there is no value stored in the file
[Consider Existing] Apply but aggregate with values from Storage and Default rules
[Ignore Existing] Apply and ignore (replace) values from Storage and Default rules
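The three rule modes can be sketched in plain C++ for a single multi-choice property. The mode names come from the slide; `MultiChoice`, `ApplyMode`, and `applyRule` are hypothetical, and treating aggregation as a set union is an assumption, since the slide only states that multi-choice lists can be aggregated.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical model of a multi-choice property value: a set of choices.
using MultiChoice = std::set<std::string>;

enum class ApplyMode { Default, ConsiderExisting, IgnoreExisting };

// Combine a stored value with a rule's value according to the rule's mode.
inline MultiChoice applyRule(const MultiChoice& stored,
                             const MultiChoice& ruleValue,
                             ApplyMode mode) {
    switch (mode) {
    case ApplyMode::Default:
        // Apply only if nothing is stored for the file.
        return stored.empty() ? ruleValue : stored;
    case ApplyMode::ConsiderExisting: {
        // Aggregate with the existing value (assumed here to be a union).
        MultiChoice merged = stored;
        merged.insert(ruleValue.begin(), ruleValue.end());
        return merged;
    }
    case ApplyMode::IgnoreExisting:
        // Replace whatever was stored.
        return ruleValue;
    }
    return stored; // not reached; keeps compilers quiet
}
```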
Appendix
Property bags
> Property bag object holds the metadata of a file being classified
> The object flows through the classification pipeline
> Each pipeline module can assign property values
Property Bag
> File System Info: Relative Path, Creation Time, etc.
> Properties
> Messages
> Read Stream, Write Stream
> Current Context: Module Type, Rule, etc.
Property
> Name
> Type
> Assigned Values and Sources
  > From Storage Modules
  > From Default and CE (Consider Existing) Rules
  > From IE (Ignore Existing) Rules
> Aggregated Value
> Aggregated Sources
Appendix
Connecting a module to the pipeline
STDMETHODIMP CCustomModule::OnLoad(
    __in IFsrmPipelineModuleDefinition *pDefinition,
    __deref_out IFsrmPipelineModuleConnector **ppModuleConnector
    )
{
    ...perform module initialization...

    // Create the connector
    CComPtr<IFsrmPipelineModuleConnector> spConnector;
    hr = CoCreateInstance(CLSID_FsrmPipelineModuleConnector,
                          NULL,
                          CLSCTX_LOCAL_SERVER,
                          __uuidof(IFsrmPipelineModuleConnector),
                          reinterpret_cast<void**>(&spConnector)); // CoCreateInstance expects void**
    ...handle any errors...

    // Get this module's implementation interface
    CComQIPtr<IFsrmPipelineModuleImplementation> spModuleImpl = GetControllingUnknown();
    if (spModuleImpl == NULL)
        ...handle error...

    // Bind the connector to the module
    hr = spConnector->Bind(pDefinition, spModuleImpl);
    ...handle any errors...

    // Return the connector
    *ppModuleConnector = spConnector.Detach();

    return hr;
}