Post on 21-Jan-2016

Digital Vault

Kick-off 02/12/2015

© Trust1Team 2015

Intro Digital Vault Engine: our understanding
• Fast & scalable object-level storage
• Secure content persistence
• Secure bi-directional content sharing
• Secure content provenance
• Shared content libraries
• Private-hosted file synchronization
• Built upon open source components
• Basic security model with optional extensions (consumer-driven security enforcement)
• Vault-in-vault concept

Concept: assumptions
• Policies for user storage quota available
• API Engine available
• On-premise solution
• User OAuth2 consent available in existing authorization infrastructure
• User authentication available in existing authentication infrastructure

Concept: secured content at the center of the solution

Concept: security standards
Content encryption standards
• X.509 private keys for desktop clients
• PBKDF2 session keys using RSA encryption
• AES-256/CBC encryption for data transfer
• ISO 32000 AES-256 encryption for PDF encryption
• ISO/IEC 9899:1999
• Digital signing of shared documents using ETSI AdES and ASiC
Adaptable feature-rich security model
• Optional password protection on shared link
• Optional expiration time on shared link
• Optional signing for content integrity
• Optional X.509 public key signing for content transfer
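
The slide names PBKDF2 for session keys but gives no parameters. As a minimal sketch, the snippet below derives a 32-byte key (the size AES-256 expects) with Python's standard library. The hash function (SHA-256), the iteration count, the salt size and the function name derive_session_key are assumptions made for illustration, not values taken from the project. RSA wrapping and AES-256/CBC are left out because they need a third-party crypto library.

```python
import hashlib
import hmac
import os


def derive_session_key(passphrase: str, salt: bytes, iterations: int = 200_000) -> bytes:
    """Derive a 32-byte key with PBKDF2-HMAC-SHA256 (parameters are assumptions)."""
    return hashlib.pbkdf2_hmac(
        "sha256", passphrase.encode("utf-8"), salt, iterations, dklen=32
    )


salt = os.urandom(16)  # random per-session salt, stored next to the ciphertext
key = derive_session_key("correct horse battery staple", salt)
same_key = derive_session_key("correct horse battery staple", salt)
other_key = derive_session_key("wrong passphrase", salt)

# Same passphrase and salt give the same key; compare in constant time.
print(len(key), hmac.compare_digest(key, same_key), key == other_key)
```

The salt has to be kept with the encrypted data, since the same passphrase with a different salt yields an unrelated key.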

Concept: security at all levels
Vault Security
• Secured local storage
• Secured cloud storage
• Secured content transfer
• Trusted list of sync devices
• Secured token distribution
• Content provenance and content integrity
Vault Archiving
• Content retention using CMIS (Apache Chemistry)
• Content retention to private cloud distributed storage

Concept: design principles
• Micro-service design
• Stateless services design
• API-first design
• Behavior driven development
• User-centric
• Semantic Versioning
Diagram labels: resilient, elastic, stateless, responsive

Concept: wireframes
In scope – Demo DV application
• OAuth2 enabled
• AngularJS
• To test all endpoints with user actions:
  – upload, share, download, …
Synchronization client
• cf. Seafile clients available
• OSX, Windows, Linux, terminal based
• Mobile Android
• Mobile iOS

Concept: wireframes
Mobile applications (Android & iOS)

Concept: architecture component model
Assumption: existing search engine

Concept: architecture component model
• Client side (front-end)
  – 3rd party web applications for a variety of devices
  – Demo DV application made within the scope of the project
  – Desktop synchronization clients
  – Mobile synchronization clients
• Server side (back-end)
  – Digital Vault Engine
  – Integration with API Engine
  – Integration with Search Engine
• Server side (storage)
  – Storage and storage replication (quota storage policy)
  – Archiving to private distributed cloud storage
  – Archiving to ECM via Apache Chemistry layer

Concept: proof of concept
• Basic version of the DV Demo application
• Connects directly to the micro-service API
• Implements the following user stories:
  – 1) upload file from DV Demo app into existing DV folder
  – 2) share file from DV Demo app => mail to user with link
  – 3) user downloads file using the link from the received mail
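
User stories 2 and 3 revolve around a shared link, and the security model lists optional password protection and optional expiration for such links. The sketch below shows one way those two options could be checked before a download is allowed. ShareLink, create_share_link and may_download are hypothetical names, and the token length, hash and iteration count are assumptions; none of this is taken from the Digital Vault API.

```python
import hashlib
import hmac
import secrets
import time
from dataclasses import dataclass
from typing import Optional


@dataclass
class ShareLink:
    token: str                              # random value embedded in the mailed URL
    file_id: str
    expires_at: Optional[float] = None      # Unix timestamp; None means no expiry
    password_hash: Optional[bytes] = None   # None means no password required
    salt: bytes = b""


def _hash_password(password: str, salt: bytes) -> bytes:
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


def create_share_link(file_id, ttl_seconds=None, password=None, now=None) -> ShareLink:
    now = time.time() if now is None else now
    salt = secrets.token_bytes(16)
    return ShareLink(
        token=secrets.token_urlsafe(32),
        file_id=file_id,
        expires_at=None if ttl_seconds is None else now + ttl_seconds,
        password_hash=None if password is None else _hash_password(password, salt),
        salt=salt,
    )


def may_download(link: ShareLink, password=None, now=None) -> bool:
    now = time.time() if now is None else now
    if link.expires_at is not None and now >= link.expires_at:
        return False                        # link has expired
    if link.password_hash is not None:
        if password is None:
            return False                    # password required but not supplied
        candidate = _hash_password(password, link.salt)
        return hmac.compare_digest(link.password_hash, candidate)
    return True


link = create_share_link("doc-1", ttl_seconds=60, password="s3cret", now=1000.0)
print(may_download(link, "s3cret", now=1030.0))   # within the lifetime, right password
print(may_download(link, "s3cret", now=1060.0))   # at the expiry time
```

Passing now explicitly keeps the expiry check easy to test; a service would normally rely on the default clock.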

Technology: file system design
Files are organized into Libraries – designed for synchronization
• Network/storage deduplication
• No upload/download limit
• Fast upload (back-end daemons)
• Data model and sync similar to Git (Repo, Branch, Commit, FS, Block)
• Selective sync of libraries to devices
• Sync with existing folder
• Client-side end-to-end data encryption during sync
• Full platform support: Win, OSX, Linux, mobile
• Share to a person or a group
• Share specific content or a folder
• Read-write and read-only share
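
The Git-like data model (Repo, Branch, Commit, FS, Block) and the deduplication it enables can be illustrated with a small in-memory sketch. It is a simplification under stated assumptions: blocks are cut at a fixed size of 4 bytes to keep the example readable, IDs are SHA-256 digests, and the class names BlockStore, FileEntry, Commit and Repo are invented for this illustration rather than taken from Seafile.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

BLOCK_SIZE = 4  # unrealistically small, chosen only for readability


@dataclass
class BlockStore:
    blocks: Dict[str, bytes] = field(default_factory=dict)

    def put(self, data: bytes) -> str:
        """Store a block under its content hash; identical blocks are stored once."""
        block_id = hashlib.sha256(data).hexdigest()
        self.blocks.setdefault(block_id, data)
        return block_id


@dataclass(frozen=True)
class FileEntry:            # plays the role of an "FS" object
    name: str
    block_ids: Tuple[str, ...]


@dataclass(frozen=True)
class Commit:
    parent: Optional[str]
    files: Tuple[FileEntry, ...]


@dataclass
class Repo:
    store: BlockStore = field(default_factory=BlockStore)
    commits: Dict[str, Commit] = field(default_factory=dict)
    branches: Dict[str, str] = field(default_factory=dict)  # branch name -> commit id

    def add_file(self, name: str, data: bytes) -> FileEntry:
        ids = tuple(
            self.store.put(data[i:i + BLOCK_SIZE])
            for i in range(0, len(data), BLOCK_SIZE)
        )
        return FileEntry(name, ids)

    def commit(self, branch: str, files) -> str:
        commit = Commit(self.branches.get(branch), tuple(files))
        commit_id = hashlib.sha256(repr(commit).encode("utf-8")).hexdigest()
        self.commits[commit_id] = commit
        self.branches[branch] = commit_id   # the branch now points at the new commit
        return commit_id


repo = Repo()
a = repo.add_file("a.txt", b"AAAABBBBCCCC")
b = repo.add_file("b.txt", b"AAAABBBBDDDD")
head = repo.commit("master", [a, b])
print(len(repo.store.blocks))  # six blocks referenced, fewer stored
```

The two files share their first two blocks, so the store keeps each of those only once. That is the storage side of deduplication; the network side follows from sending only blocks the other end does not yet hold.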

Technology: deduplication

Technology: high-level architecture

Technology stack
Seafile
• C, C++
• OpenSSL
Java EE
• JAX-RS, CDI
• Maven
• Bouncy Castle Crypto API
Sync desktop clients
• Qt4/5
• C++
Sync mobile clients
• Android
• iOS

Technology stack: innovative features in the solution
Different from cloud storage solutions for personal use
• Content Integrity and Content Provenance
• Archiving to cloud storage
• Archiving to ECM platforms
• Basic security on all levels
• Customizable security
Open API security: every application can enforce strong security

Approach: sprint planning with monthly releases
• Digipolis and T1T agree on a list of detailed product requirements
• T1T creates the product backlog based on the product requirements
• Sprints of 2 weeks
• Sprint demo
• Transparency via JIRA project
• Regular sync meetings with Digipolis stakeholder

Approach: milestones part 1
Sprint 1-2
• password, AES folder
• storage
• Account Mgmt
• synch
• Token Distribution
• Content Sharing
Sprint 3-4
• security features
• key store management
• zip creation & encryption
• PDF encryption
Sprint 5-6
• content provenance
• archiving to ECM
• integration with search engine
Milestones: POC 0.0.1, Version 0.0.5

Approach: milestones part 2
Sprint 7-8
• archiving to personal cloud storage
• trusted devices list
• bug fixing
Sprint 9
• bug fixing
• move to Acceptance
Sprint 10
• move to production
Milestones: Version 0.5.0, Version 1.0.0

Approach: deliverables and project closing
Deliverables
• Source code
• Builds
• Technical documentation
• User documentation
Project closing
• Hand-over to technical team
• User training
Duration of the project is approximately 4 months

Thank you for your kind attention

Do you have any questions?

Annex 1: Synch algorithm
A typical synchronization workflow consists of the following steps:
• The Seafile client daemon detects changes in the worktree (via inotify etc.).
• The daemon commits the changes to the local branch.
• It downloads new changes from the master branch on the server (if any).
• It merges the downloaded branch into the local branch (and checks out the changes to the worktree).
• It fast-forward uploads the local branch to the server's master branch.
Custom merge algorithm
• Auto-sync with plain Git is unreliable
• Merge once file write-protection releases the lock
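
The five steps can be walked through with a toy in-memory model. This is a sketch under heavy assumptions, not Seafile's implementation: a "tree" is a plain dict of path to content, change detection is a dict comparison standing in for inotify, and the Server and Client classes are invented for the example. Conflicting edits are saved under a name suffixed with the client's name, in the spirit of the conflict handling described in Annex 6.

```python
from dataclasses import dataclass, field
from typing import Dict

Tree = Dict[str, str]  # path -> file content (stand-in for real file data)


@dataclass
class Server:
    version: int = 0                       # bumped on every accepted upload
    tree: Tree = field(default_factory=dict)

    def fast_forward(self, base_version: int, new_tree: Tree) -> bool:
        """Accept an upload only if it builds on the current master."""
        if base_version != self.version:
            return False
        self.version += 1
        self.tree = dict(new_tree)
        return True


@dataclass
class Client:
    name: str
    worktree: Tree = field(default_factory=dict)
    base_tree: Tree = field(default_factory=dict)   # state at the last successful sync

    def sync(self, server: Server) -> bool:
        base, local = self.base_tree, dict(self.worktree)
        # 1. Detect changes in the worktree.
        changed = {p for p in local.keys() | base.keys() if local.get(p) != base.get(p)}
        # 2. Commit: `local` is the snapshot of the local branch.
        # 3. Download the current master from the server.
        remote, remote_version = dict(server.tree), server.version
        # 4. Merge the downloaded state with the local changes.
        merged = dict(remote)
        for path in changed:
            remote_untouched = remote.get(path) == base.get(path)
            if path not in local:                    # deleted locally
                if remote_untouched:
                    merged.pop(path, None)
            elif remote_untouched or remote.get(path) == local[path]:
                merged[path] = local[path]
            else:                                    # both sides changed the file
                merged[f"{path} ({self.name})"] = local[path]
        self.worktree = dict(merged)                 # check out the merge result
        # 5. Fast-forward upload, only needed when the merge added something.
        if merged != remote and not server.fast_forward(remote_version, merged):
            return False                             # caller retries the workflow later
        self.base_tree = dict(merged)
        return True


server, alice, bob = Server(), Client("alice"), Client("bob")
alice.worktree["report.txt"] = "v1"
alice.sync(server)
bob.sync(server)                                     # bob receives report.txt
bob.worktree["notes.txt"] = "hello"
bob.sync(server)
alice.worktree["report.txt"] = "v2"
alice.sync(server)                                   # alice also receives notes.txt
bob.worktree["report.txt"] = "bob-edit"              # bob edits a stale copy
bob.sync(server)
print(sorted(server.tree))
```

In the last sync bob's edit collides with alice's "v2", so the server keeps "v2" under the original name and bob's version under a suffixed name.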

Annex 2: Git approach – why?

Synchronization may be interrupted at any point by shutting down the program or the computer. After a reboot, all notifications from the OS are lost. We therefore need a reliable and efficient way to determine which files in the worktree have been changed (even after reboots).

Git's index file is used to do this. It caches the timestamp of every file in the worktree at the moment the last commit is generated. So we can easily and reliably detect changed files in the worktree since the latest commit by comparing timestamps.

Another notable case is what happens if two clients try to upload to the server simultaneously. The commit procedure on the server ensures atomicity, so only one client will update the master branch successfully, while the other will fail.

The failing client will restart the sync workflow later. It will first merge the changes from the client that succeeded, then upload again.
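
Both ideas above can be sketched briefly. The first part caches modification times the way the text describes for the index file and reports paths whose timestamp differs or that appeared or disappeared. The second part models the server's atomic commit as a compare-and-swap on the branch head. snapshot_mtimes, changed_since and MasterBranch are hypothetical names, and the lock-based head update is an assumption about how atomicity could be achieved, not a description of the Seafile server.

```python
import os
import threading
from typing import Dict, Optional, Set


def snapshot_mtimes(root: str) -> Dict[str, int]:
    """Record the modification time (in ns) of every file under root."""
    index = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            index[os.path.relpath(full, root)] = os.stat(full).st_mtime_ns
    return index


def changed_since(index: Dict[str, int], root: str) -> Set[str]:
    """Paths that were added, removed or re-timestamped since the snapshot."""
    current = snapshot_mtimes(root)
    return {p for p in index.keys() | current.keys() if index.get(p) != current.get(p)}


class MasterBranch:
    """Server-side branch head that can only be moved atomically."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self.head: Optional[str] = None

    def update(self, expected_head: Optional[str], new_head: str) -> bool:
        with self._lock:
            if self.head != expected_head:
                return False        # another client got there first
            self.head = new_head
            return True


master = MasterBranch()
first = master.update(None, "commit-A")        # client A uploads first
second = master.update(None, "commit-B")       # client B still expects the old head
retry = master.update("commit-A", "commit-B2") # B merged A's changes, then retries
print(first, second, retry)
```

Because the snapshot survives on disk in a real client, the comparison still works after a reboot, when no OS notifications are available.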

Annex 5: Why Seafile?
• https://github.com/haiwen/seafile - 3200+ stars
• Estimated at least 200K users worldwide, mostly in Europe
• Open Source Software (AGPLv2)
• Open source sync clients available for desktop and mobile
• Git approach but enhanced for auto-sync and handling large files
• Custom merge algorithm
• Basic privacy protection
• Efficient network transfer (LBFS-based)
• "Only does what it should do best" approach

Annex 6: What are the differences between Seafile and Git?
• Automatic synchronization.
• Clients do not store file history, thus they avoid the overhead of storing data twice. Git is not efficient for larger files such as images.
• Files are further divided into blocks for more efficient network transfer and storage usage.
• File transfer can be paused and resumed.
• Support for different storage backends on the server side.
• Support for downloading from multiple block servers to accelerate file transfer.
• More user-friendly file conflict handling. (Seafile adds the user's name as a suffix to conflicting files.)
• Graceful handling of files the user modifies while auto-sync is running. Git is not designed to work in these cases.
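
Dividing files into blocks is what makes pausing and resuming a transfer cheap: only blocks the receiver does not yet hold need to be sent. The sketch below illustrates that idea with fixed-size blocks and SHA-256 block IDs, both of which are assumptions for the example. split_blocks, transfer and reassemble are hypothetical helpers, not Seafile functions.

```python
import hashlib
from typing import Dict, List, Optional


def split_blocks(data: bytes, size: int) -> List[bytes]:
    return [data[i:i + size] for i in range(0, len(data), size)]


def block_id(block: bytes) -> str:
    return hashlib.sha256(block).hexdigest()


def transfer(blocks: List[bytes], received: Dict[str, bytes],
             max_blocks: Optional[int] = None) -> int:
    """Send blocks the receiver lacks; stop after max_blocks to simulate a pause."""
    sent = 0
    for block in blocks:
        bid = block_id(block)
        if bid in received:
            continue                    # already there, nothing to resend
        if max_blocks is not None and sent >= max_blocks:
            break                       # paused
        received[bid] = block
        sent += 1
    return sent


def reassemble(block_ids: List[str], received: Dict[str, bytes]) -> bytes:
    return b"".join(received[bid] for bid in block_ids)


data = b"".join(bytes([i]) * 256 for i in range(4))   # four distinct 256-byte blocks
blocks = split_blocks(data, 256)
ids = [block_id(b) for b in blocks]

received: Dict[str, bytes] = {}
first_run = transfer(blocks, received, max_blocks=2)  # transfer paused after 2 blocks
second_run = transfer(blocks, received)               # resume: only the rest is sent
print(first_run, second_run, reassemble(ids, received) == data)
```

The same lookup by block ID would let a client fetch different blocks from different block servers in parallel.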
