grasp(eo) versioning system final presentation
Master semester project presentation
Versioning system in Graspeo
Nikita Grishin, EPFL, 02.06.2014
Supervised by Andrii Vozniuk
Motivation
Every document management system provides versioning.
Resources coming from various locations need to be traceable back to earlier states.
A system with collaborators needs a recovery mechanism.
State of the art
• Mercurial, SVN/CVS
  • On document change, only the modifications are saved
  • Easy to track modifications
  • Difficult to recover old versions
• Git
  • On file change, a new version of the file is created
  • A pointer refers to the latest version
  • Easy to recover old versions
  • Difficult to track modifications
These tools are suited only to text documents.
They work very poorly with media content.
• Snapshots: a new version of the document is created for each modification
• Recovering an older version creates a copy of the requested version on top of the latest version
• Google may remove some old versions when space runs short
• Only users who can edit the document can see its modification history
Apple TimeMachine
• Hard drive snapshots
• Keeps:
  • Hourly backups for the last 24 hours
  • Daily backups for the past month
  • Weekly backups until the backup drive is full
• When the backup drive is full, TimeMachine removes the oldest backups to free space
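The retention policy on this slide can be sketched as a function that decides which snapshots survive. This is only an illustration of the hourly/daily/weekly thinning idea, not Apple's implementation: the name `snapshots_to_keep`, the 30-day approximation of "the past month", and the choice to keep the newest snapshot of each bucket are all assumptions.

```python
from datetime import datetime, timedelta


def snapshots_to_keep(snapshots, now):
    """Return the snapshots retained under an hourly/daily/weekly policy.

    Assumptions (not taken from Apple documentation): a "month" is 30 days
    and the newest snapshot of each bucket is the one that is kept.
    """
    kept = {}
    for ts in sorted(snapshots):
        age = now - ts
        if age <= timedelta(hours=24):
            bucket = ("hour", ts.year, ts.month, ts.day, ts.hour)
        elif age <= timedelta(days=30):
            bucket = ("day", ts.year, ts.month, ts.day)
        else:
            iso = ts.isocalendar()
            bucket = ("week", iso[0], iso[1])
        kept[bucket] = ts  # snapshots are visited oldest first, so the newest wins
    return sorted(kept.values())
```

Removing the oldest backups when the drive fills up would be a separate step on top of this selection and is not modeled here.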
Graspeo versioning system solution
Versioning
• Snapshots: a new version of an item is created for each significant modification
• A significant modification is a modification that needs to be stored as a version:
  • Space: creation of the space, modification of its subitems list
  • Resource: creation of the resource, modification of its content (resource replacement)
• Restoring a version amounts to copying the content of the version document into the item document. This automatically creates a new version on top of the versioning tree.
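The snapshot-and-restore behaviour described on this slide can be illustrated with a small in-memory model. It is a sketch under assumptions, not the Graspeo code: the class `VersionedItem` and its methods are hypothetical, and the history is kept as a plain list rather than in a database.

```python
import copy


class VersionedItem:
    """An item whose significant modifications are kept as full snapshots."""

    def __init__(self, content):
        self.content = content
        # A first version is stored as soon as the item is created.
        self.versions = [copy.deepcopy(content)]

    def save(self, content):
        """Store a significant modification as a new snapshot."""
        self.content = content
        self.versions.append(copy.deepcopy(content))

    def restore(self, index):
        """Copy an old snapshot back into the item.

        Going through save() means the restored state becomes a new
        version on top of the history; older versions are never rewritten.
        """
        self.save(copy.deepcopy(self.versions[index]))
```

Because a restore is itself recorded as a version, it can always be undone by restoring the version that preceded it.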
How does it work?
• A separate collection for versions: scales easily
• A version document is almost a full copy of the item document
• Version creation happens in a mongoose post-save hook
• The versions controller decides whether a new version is needed (i.e. it defines the notion of a significant modification)
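The flow on this slide (save, then a hook that lets a controller decide whether to snapshot) can be sketched without a database. Everything here is a stand-in: `Store` imitates the items and versions collections in memory, and `versions_controller` is a hypothetical name. Handing the previous document to the hook is a simplification made for the sketch; a real mongoose post-save hook does not receive it.

```python
import copy


class Store:
    """In-memory stand-in for the items and versions collections."""

    def __init__(self):
        self.items = {}        # item id -> item document
        self.versions = []     # separate collection of version documents
        self.post_save_hooks = []

    def save(self, item):
        previous = self.items.get(item["_id"])
        self.items[item["_id"]] = copy.deepcopy(item)
        # Hooks run only after the write has happened, like a post-save hook.
        for hook in self.post_save_hooks:
            hook(self, previous, item)


def versions_controller(store, previous, item):
    """Decide whether the save was significant and, if so, snapshot the item."""
    created = previous is None
    subitems_changed = not created and previous.get("subitems") != item.get("subitems")
    if created or subitems_changed:
        version = copy.deepcopy(item)           # almost a full copy of the item
        version["originId"] = version.pop("_id")  # link back to the item
        store.versions.append(version)
```

A short usage example: registering the controller once is enough for every later save to be checked.

```python
store = Store()
store.post_save_hooks.append(versions_controller)
store.save({"_id": 1, "name": "space", "subitems": []})     # creation: versioned
store.save({"_id": 1, "name": "renamed", "subitems": []})   # rename only: skipped
store.save({"_id": 1, "name": "renamed", "subitems": [2]})  # subitem added: versioned
```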
How does it work?
• Resource versions:
  • GridFS stores old file versions
  • The latest version of a file is also kept on the local filesystem, because of performance issues with GridFS and because BTSync is used
How does it work?
• Version schema
  • Extends the Item schema
  • Adds a field called «originId» to track the modifications of a single item
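The relation between the two schemas can be pictured with dataclasses. This is a sketch, not the actual mongoose schema: only the origin id comes from the slide (written `origin_id` here), while the item fields `name`, `path`, `subitems`, the `created_at` timestamp, and the helper `version_of` are assumptions chosen for illustration.

```python
from dataclasses import dataclass, field, fields
from datetime import datetime


@dataclass
class Item:
    id: str
    name: str
    path: str
    subitems: list = field(default_factory=list)


@dataclass
class Version(Item):
    # Inherits every Item field and adds the link back to the versioned item.
    origin_id: str = ""
    created_at: datetime = field(default_factory=datetime.now)


def version_of(item, version_id):
    """Build a version document from an item (hypothetical helper)."""
    data = {f.name: getattr(item, f.name) for f in fields(item)}
    data["id"] = version_id                 # a version is its own document
    data["subitems"] = list(item.subitems)  # copy, so later edits do not leak in
    return Version(origin_id=item.id, **data)
```

All versions of one item share the same origin id, so its history can be fetched with a single query on that field.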
How does it work?
• A version is created on item creation and on each significant modification
• Space:
  • Modification of the subitems list (subitem created/removed)
  • Resource replacement
• Resource:
  • Replacement
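The rules on this slide can be written as a single predicate. This is a minimal sketch under assumptions: the fields `kind` and `checksum` are hypothetical, a replacement is detected by comparing checksums, and the way a resource replacement is reflected in its parent space is not modeled.

```python
def is_significant(previous, current):
    """Return True when saving `current` over `previous` should create a version."""
    if previous is None:
        return True  # item creation always produces a first version
    if current["kind"] == "space":
        # A subitem was created or removed.
        return set(previous["subitems"]) != set(current["subitems"])
    if current["kind"] == "resource":
        # The content was replaced (detected here through a checksum).
        return previous["checksum"] != current["checksum"]
    return False
```

With this rule, renaming a space or a resource changes the item document but leaves the version history untouched.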
How does it work?
• Versioning API
  • List of version timestamps
  • Version by date
  • Restore a chosen version of a space
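The first two API operations can be sketched as lookups over a sorted list of timestamps. The class `VersionIndex` and its method names are hypothetical, and "version by date" is interpreted here as "the latest version created at or before the given date", which is an assumption about the intended semantics.

```python
from bisect import bisect_right


class VersionIndex:
    """Versions of one item, ordered by creation time."""

    def __init__(self):
        self._timestamps = []  # creation times, in increasing order
        self._versions = []    # version documents, in the same order

    def add(self, timestamp, version):
        # In this sketch versions are always appended in chronological order.
        self._timestamps.append(timestamp)
        self._versions.append(version)

    def timestamps(self):
        """List of version timestamps."""
        return list(self._timestamps)

    def at(self, date):
        """Version that was current at `date`, or None if none existed yet."""
        pos = bisect_right(self._timestamps, date)
        return self._versions[pos - 1] if pos else None
```

Any comparable timestamp type works, for example `datetime` objects or epoch seconds.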
How does it work?
• Version restore
  • The user asks to restore the version of a space for a given date
  • The system looks up the space in the items collection and its version in the versions collection
  • It copies the subitems list from the version to the item document
  • It removes from the items collection all items that are no longer in the recovered space
  • It recursively restores the subitems of that space
How does it work?
• Version restore
  • While the subitems of a space are restored recursively, some of them may already have been removed from the items collection. In that case:
    • The system creates a new document in the items collection and fills it with data from the version document
    • New document = new id: the system fixes all path- and id-related problems
  • This is the most expensive part of the restore process
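The restore steps from the last two slides can be put together in a compact in-memory sketch. It is an illustration under assumptions, not the Graspeo implementation: collections are plain Python containers, the helper names and the id generator are hypothetical, and the only id fix-up modeled is the parent's subitems list (paths, and the new version that a restore creates, are left out).

```python
import itertools

_new_ids = itertools.count(1000)  # hypothetical id generator


def version_at(versions, origin_id, date):
    """Latest version of `origin_id` created at or before `date`."""
    matches = [v for v in versions
               if v["originId"] == origin_id and v["created"] <= date]
    return max(matches, key=lambda v: v["created"]) if matches else None


def remove_tree(items, item_id):
    """Remove an item and everything below it from the items collection."""
    item = items.pop(item_id, None)
    for child in (item or {}).get("subitems", []):
        remove_tree(items, child)


def restore(items, versions, item_id, date):
    """Restore `item_id` to its state at `date`; return the restored item's id."""
    version = version_at(versions, item_id, date)
    if version is None:
        return None
    if item_id in items:
        item = items[item_id]
    else:
        # The item was removed since then: recreate it under a new id.
        item = {"_id": next(_new_ids)}
        items[item["_id"]] = item
    item["name"] = version["name"]
    # Drop items that are not part of the recovered space.
    for child in set(item.get("subitems", [])) - set(version["subitems"]):
        remove_tree(items, child)
    # Restore the subitems recursively; recreated ones come back with new ids.
    restored = [restore(items, versions, c, date) for c in version["subitems"]]
    item["subitems"] = [c for c in restored if c is not None]
    return item["_id"]
```

The recreation branch is where the cost comes from: every reference to the old id has to be rewritten, and in this sketch that is only the parent's subitems list.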
How does it work?
• Version restore: Resource
  • Perform the restore in the database as described above
  • Write the requested version of the resource to disk
  • Regenerate thumbnails for the new resource content
How does it work?
• On the user side:
  • Version creation is completely transparent
  • Version restore is somewhat slow
• On the server side:
  • Version creation is asynchronous and puts a low load on the database
  • Version restore is expensive in terms of time and database load
Future Plans
• Optimize the version restore process
• Extend the current versioning system to multiuser versioning
• Integrate Git with the current versioning system for text files
• Create a movie-style visualization of versioning and modification tracking
Thank you for your attention
Questions