datafinder concepts and example: general (20100503)
Slide 1
DataFinder General > 03.05.2010
DataFinder: Concepts and Usage
German Aerospace Center (DLR), Cologne/Berlin/Braunschweig
http://www.dlr.de/sc
Slide 2
Outline
Introduction
Configuration and Customization
Requirements Analysis
Installation
Configuration
Customization
Data Migration
Slide 3
DataFinder Introduction
Background: Data Management Problem
Absent organizational structures
No central data management policy
Every employee organizes his/her data individually
Researchers spend about 30% of their time searching for data
Problem with data left behind by temporary staff
Increase of data because of growing size and regulations
Rapidly growing volume of simulation and experimental data
Legal requirements for long-term availability of data (up to 50 years!)
The situation is similar in every DLR institute, in many research labs and agencies, and even in industry
Slide 4
DataFinder Introduction
Basic Concept
Lightweight Client-Server solution
Based on open and stable standards, such as XML and WebDAV
Extensible through Python scripts to fit multiple scenarios
Slide 5
DataFinder Introduction
Graphical User Interfaces of DataFinder 1.x
User Client: implemented in Python with Qt/PyQt; the current version differs
Administrator Client: implemented in Python with Qt/PyQt; the current version differs
Slide 6
DataFinder Introduction
Data Store Concept
Diagram labels: Logical View, User Client, Meta Data Server, Data Access, Storage Locations
Logical View entries: Department, Employee, Simulation, Geometry, Grid Generation, Flow Solution, Visualisation
Storage Locations: WebDAV Server, FTP/GridFTP Server, Tivoli Storage Manager, Storage Resource Broker, File System, Amazon S3, External Media (CD, DVD, …)
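The separation between the logical view and the storage locations can be illustrated with a small sketch. The names DataStore, Item and resolve_location, the store names and the example URLs are hypothetical and are not part of the DataFinder API. The sketch only assumes that each entry of the logical view records the name of the data store that holds its content.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataStore:
    """Hypothetical description of one storage location."""
    name: str
    kind: str        # e.g. "webdav", "ftp", "s3", "filesystem"
    base_url: str


@dataclass(frozen=True)
class Item:
    """Hypothetical entry of the logical view."""
    logical_path: str
    data_store_name: str


def resolve_location(item, stores):
    """Map a logical path to the physical location of its content."""
    store = stores[item.data_store_name]  # KeyError if the store is unknown
    return store.base_url.rstrip("/") + item.logical_path


# Example stores; the hosts are placeholders, not real servers.
STORES = {
    "Archive": DataStore("Archive", "ftp", "ftp://archive.example.org/data"),
    "Project Share": DataStore(
        "Project Share", "webdav", "https://dav.example.org/share/"
    ),
}

item = Item("/department/employee/simulation/grid.dat", "Archive")
location = resolve_location(item, STORES)
print(location)
```

The user only ever sees the logical path; moving the content to another store would change the data store name recorded for the item, not its position in the hierarchy.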
Slide 7
DataFinder Configuration and Customization
Slide 8
DataFinder Configuration and Customization
Preparing DataFinder for certain “use cases”
Requirements Analysis
  Analyze data, working environment and user workflows
Configuration
  Server and client setup
  Define and configure data model
  Configure distributed storage resources (Data Stores)
Customization
  Write functional extensions with Python scripts (GUI)
  Tool integration
Data Migration
  Analyze current data
  Migrate the data into the new system
Slide 9
DataFinder Configuration and Customization
Installation
Meta data server
  Apache and Catacomb (based on the WebDAV protocol)
  Apache and mod_dav (XAMPP)
Data server
  Apache and Catacomb (based on the WebDAV protocol)
  Apache and mod_dav (XAMPP)
Administrator and user client
  Source and precompiled versions (for Windows XP and 64-bit SUSE) available
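After installing a WebDAV-based meta data or data server, one quick sanity check is to send an HTTP OPTIONS request and look at the DAV response header, which a WebDAV server uses to advertise its compliance classes. The following is a minimal sketch of such a check using only the Python standard library. The function names are illustrative, and the host passed to check_server would be whatever server was installed; none of this is taken from the DataFinder distribution.

```python
import http.client


def dav_compliance_classes(headers):
    """Return the WebDAV compliance classes advertised in a header mapping."""
    # Header names are case-insensitive, so normalise them first.
    lowered = {name.lower(): value for name, value in headers.items()}
    value = lowered.get("dav", "")
    return [part.strip() for part in value.split(",") if part.strip()]


def check_server(host, path="/", port=80, timeout=10):
    """Send OPTIONS to a server and report its WebDAV compliance classes."""
    connection = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        connection.request("OPTIONS", path)
        response = connection.getresponse()
        response.read()  # drain the body before closing the connection
        return dav_compliance_classes(dict(response.getheaders()))
    finally:
        connection.close()


# Offline demonstration with a typical header set; no network access needed.
sample = {"Allow": "OPTIONS, GET, PROPFIND, PROPPATCH", "DAV": "1,2"}
classes = dav_compliance_classes(sample)
print(classes)
```

An empty list from check_server suggests that the WebDAV module is not enabled for the requested path.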
Slide 10
DataFinder Configuration and Customization
Data Model: Mapping of Organizational Data Structures
Diagram nodes: User; Project A, Project B, Project C; Simulation I, Experiment, Simulation II; File 1, File 2
Legend: Object (directory), Object (file), Relation
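A data model of this kind can be thought of as a set of object types plus the relations that say which type may appear below which. The sketch below is an illustration only: the concrete relations in ALLOWED_CHILDREN are an assumption, since the diagram labels do not state them, and the function names are hypothetical rather than DataFinder API.

```python
# Hypothetical data model: which object types may appear below which parent.
# The relations are assumed for illustration; they are not given in the slides.
ALLOWED_CHILDREN = {
    "User": {"Project"},
    "Project": {"Simulation", "Experiment"},
    "Simulation": {"File"},
    "Experiment": {"File"},
    "File": set(),  # file objects are leaves
}


def may_contain(parent_type, child_type):
    """Return True if the data model relates child_type to parent_type."""
    return child_type in ALLOWED_CHILDREN.get(parent_type, set())


def validate_chain(types):
    """Check a chain of object types from the root down to a leaf."""
    return all(may_contain(p, c) for p, c in zip(types, types[1:]))


print(validate_chain(["User", "Project", "Simulation", "File"]))
print(validate_chain(["User", "File"]))
```

A client that consults such a table before creating a directory or file can refuse layouts the data model does not allow.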
Slide 11
DataFinder Configuration and Customization
Excursus: Meta Data
Describe and annotate data (“files”) and collections (“directories”)
Different levels of meta data
  Required meta data defined by administrator
  User is free to choose additional ones
Different types of meta data
  String
  Numbers (float, double, …)
  Lists
  Dates
User can search in meta data
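To make the meta data types and the search concrete, here is a minimal in-memory sketch. The item paths, property names and the search function are hypothetical; they only show how string, number, list and date values can be queried, not how DataFinder stores or searches them.

```python
from datetime import date

# Hypothetical items with meta data of the four types named above.
ITEMS = {
    "/project_a/run_01.dat": {
        "author": "jdoe",                # string
        "mach_number": 0.8,              # number
        "tags": ["wing", "coarse"],      # list
        "created": date(2010, 4, 12),    # date
    },
    "/project_a/run_02.dat": {
        "author": "jdoe",
        "mach_number": 0.85,
        "tags": ["wing", "fine"],
        "created": date(2010, 4, 28),
    },
    "/project_b/notes.txt": {
        "author": "asmith",
        "tags": ["draft"],
        "created": date(2010, 5, 1),
    },
}


def search(items, predicate):
    """Return the sorted paths of all items whose meta data satisfy predicate."""
    return sorted(path for path, meta in items.items() if predicate(meta))


fine_runs = search(ITEMS, lambda m: "fine" in m.get("tags", []))
recent = search(ITEMS, lambda m: m["created"] >= date(2010, 4, 20))
fast = search(ITEMS, lambda m: m.get("mach_number", 0.0) > 0.8)
print(fine_runs, recent, fast)
```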
Slide 12
DataFinder Configuration and Customization
Excursus: Meta Data and the User Impact
“Damn! I’m a great scientist! I want freedom to have my own directory layout…”
DataFinder restricts the rights of users!
Enforcement of “good behavior”
Users must comply with organizational standards
Data is stored in a defined (directory) hierarchy on the data server
Required meta data must be set prior to upload
Users have certain access rights within the hierarchy
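The rule that required meta data must be set before an upload can be sketched as a simple precondition check. The REQUIRED table and the function name are hypothetical; in a real installation the required properties come from the data model defined by the administrator.

```python
from datetime import date

# Hypothetical required meta data for one data type (name -> expected type).
REQUIRED = {"author": str, "mach_number": float, "created": date}


def check_before_upload(meta, required=REQUIRED):
    """Raise ValueError unless all required meta data are set with the right type."""
    missing = sorted(name for name in required if name not in meta)
    if missing:
        raise ValueError("missing required meta data: " + ", ".join(missing))
    wrong = sorted(
        name for name, expected in required.items()
        if not isinstance(meta[name], expected)
    )
    if wrong:
        raise ValueError("wrong type for meta data: " + ", ".join(wrong))
    return True


complete = {
    "author": "jdoe",
    "mach_number": 0.8,
    "created": date(2010, 5, 3),
    "comment": "additional, user-defined property",  # extras are allowed
}
print(check_before_upload(complete))
```

Additional user-defined properties pass unchanged, which mirrors the two levels of meta data: required ones are enforced, optional ones are left to the user.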
Slide 13
DataFinder Configuration and Customization
Customization: Python Scripting for Extension and Automation
Integration of DataFinder with environment
User, infrastructure, software, …
Extension of DataFinder by Python scripts
Actions for resources (i.e., files, directories)
User interface extensions
Typical automations and customizations
Data migration and data import
Start of external application (with downloaded data files)
Extraction of meta data from result files
Automation of recurring tasks (“workflows”)
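As an illustration of extracting meta data from result files, the sketch below parses a header of "# key = value" lines at the top of a text file. The header format, the property names and the function names are assumptions made for this example; real result files need a parser written for their actual format, and the extracted values would then be attached to the item through the DataFinder script API.

```python
import re

# Assumed header format: lines such as "# mach_number = 0.8".
HEADER_LINE = re.compile(r"^#\s*(\w+)\s*=\s*(.+?)\s*$")


def _convert(raw):
    """Convert a header value to float where possible, else keep the string."""
    try:
        return float(raw)
    except ValueError:
        return raw


def extract_meta_data(text):
    """Collect key/value pairs from the leading comment block of a result file."""
    meta = {}
    for line in text.splitlines():
        if not line.startswith("#"):
            break  # the header ends at the first data line
        match = HEADER_LINE.match(line)
        if match:
            key, raw = match.groups()
            meta[key] = _convert(raw)
    return meta


SAMPLE = """# solver = hypothetical_flow_solver
# mach_number = 0.8
# iterations = 1500
0.0 1.0
0.1 0.98
"""

meta = extract_meta_data(SAMPLE)
print(meta)
```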
Slide 14
DataFinder Configuration and Customization
Example: Downloading File and Starting Application
# Creating a file "/test.txt" using data store "Data Store".
from datafinder.gui.user import script_api as gui_api
from datafinder.script_api.repository import setWorkingRepository
from datafinder.script_api.item.item_support import createLeaf

# Get representation of the current managed repository
mr = gui_api.managedRepositoryDescription()
# Get currently selected collection in DataFinder Server-View
if mr is not None:
    setWorkingRepository(mr)

    def _createLeaf():
        # Properties required for the new item: data format and data store name
        properties = dict()
        properties["____dataformat____"] = "TEXT"
        properties["____datastorename____"] = "Data Store"
        # …
        createLeaf("/test.txt", properties)

    # Run the creation with a progress dialog in the user client
    gui_api.performWithProgressDialog(_createLeaf)
Slide 15
DataFinder Demo
Example
Live Demo DataFinder
Server structure
Admin client: showing the meta model as an XML file and in the client
Admin client: setting up a DataStore for development files
Admin client: loading a script extension
User client: loading a script extension
User client: making a structure
User client: uploading an experimental file into the store
User client: double-clicking the file to open it
User client: script extension: creating a file
Slide 16
Availability
DataFinder core available as Open Source
Current stable release: DataFinder 2.0
Simplified BSD License
Open Source platforms
Launchpad
Sourceforge
Freshmeat
Precompiled for Windows XP and SLED 64-bit
Become a DataFinder fan on Facebook!
Slide 17
Links
DataFinder Web site: http://www.dlr.de/datafinder
DataFinder Open Source: http://sourceforge.net/projects/datafinder and http://launchpad.net/datafinder
DataFinder Wiki: http://wiki.sistec.dlr.de/DataFinderOpenSource
Catacomb (recommended server): http://catacomb.tigris.org