drupal campchicago2010.rachel.datamigration
DESCRIPTION
Promet Source - Rachel Joaro - Drupal Camp Chicago presentation on Data Migration. Migrating 100,000 pages of content from legacy CMS to Drupal.
TRANSCRIPT
Drupal Migration
Migrating 100,000 pages of content
From Legacy CMS to Drupal
Rachel Jaro
Solutions Architect at PrometSource
www.prometsource.com
Overview
We’ll talk about:
• Successful migration recipe
• Common questions you should be asking before you start
• Top 3 tools to do migration in Drupal
• Issues
  • Tools to use in URL rewriting
  • File management comparison in D6
• Testing
• Deploying solution
Data Migration
“Data migration solutions extract data from a source system, correct errors, reformat, restructure and load the data into a replacement target system”.
It sounds simple, but poorly managed data migration is the most common cause of failure in implementing a replacement system.
-- Gershon Pick, March 2001
Successful Migration Recipe
Planning
Source: http://www.flickr.com/photos/bjornmeansbear/4380595283/
Plan: What to Ask
Node types (content separation, fields)
• Do you want to separate content into pages, articles, biographies, news, etc.?
• What fields are needed for each node?
• Who can access it?
• Do you really need that content type? Or can we just use taxonomies instead for similar content?
Plan: What to Ask
Taxonomy (categorization, tags)
• Do you need to categorize nodes? Would you need different access?
• What kind of taxonomy groups or vocabularies would you need?
Permissions (per node) and user roles
• Who is going to use the site? What are their access rights?
Plan: What to Ask
New URL mapping
• Do you need to make SEO-friendly URLs?
Files, file permissions and file directory
• Do you need an advanced file management or document management tool?
• Do you need a simpler solution? How simple?
• Do you need access rights for each folder?
• Do you need a browser-type interface to access them?
• What kind of files do you need to store? Images, PDFs?
Build
Requirements
• Use CSV files to import data
• Divide the migration into groups or sections
• Map and replace old URLs with SEO-friendly URLs
  Before: 05-200.htm
Data in CSV Example
December 13, 2005 3:39:54 PM||||||||||December 13, 2005||||||||||Report Spotlights Need for Reform in Jackpot Jurisdictions||||||||||/press/releases/2005/december/||||||||||05-200||||||||||{UUID}|||||||||| Economics^^^^^^^^^^Economy ||||||||||
<p> Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. </p>
<p> Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. </p>
$$$$$$$$$$
Separator: ||||||||||
End of Row: $$$$$$$$$$
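The export format above can be read with a few lines of code before any Drupal-side import runs. The following Python sketch is only an illustration under stated assumptions: the column names are hypothetical (the slide does not label the fields), and treating `^^^^^^^^^^` as a separator for multiple values inside one field is inferred from the sample row, not stated in the talk.

```python
FIELD_SEP = "|" * 10   # field separator shown on the slide
ROW_END = "$" * 10     # end-of-row marker shown on the slide
MULTI_SEP = "^" * 10   # assumption: separates multiple values within one field

# Hypothetical column names; the slide does not name the fields.
COLUMNS = ["created", "date", "title", "old_dir", "old_name", "uuid", "terms", "body"]


def parse_export(text):
    """Split the raw export text into a list of row dictionaries."""
    rows = []
    for chunk in text.split(ROW_END):
        if not chunk.strip():
            continue  # skip the empty remainder after the last row marker
        values = [value.strip() for value in chunk.split(FIELD_SEP)]
        row = dict(zip(COLUMNS, values))
        # Turn the multi-value field into a list of clean terms.
        row["terms"] = [
            term.strip()
            for term in row.get("terms", "").split(MULTI_SEP)
            if term.strip()
        ]
        rows.append(row)
    return rows
```

Long, unusual separators like these make it unlikely that HTML in the body field collides with a delimiter, which is presumably why they were chosen over commas.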
Content Type Division
Example: CNN.com
Divide migration sequences into US, World, Politics, Justice, etc.
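Dividing the migration into sections can be as simple as grouping the exported rows by the first part of their old path. The Python sketch below assumes each row is a dictionary with a hypothetical `old_dir` key holding the legacy directory; neither the key name nor the grouping rule comes from the talk.

```python
from collections import defaultdict


def group_by_section(rows, depth=1):
    """Group rows by the first `depth` segments of their legacy directory."""
    groups = defaultdict(list)
    for row in rows:
        # "old_dir" is a hypothetical field name for the legacy directory.
        parts = [part for part in row["old_dir"].split("/") if part]
        key = "/".join(parts[:depth]) or "(root)"
        groups[key].append(row)
    return dict(groups)
```

Each group can then be imported, tested, and deployed as its own sequence.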
Solutions/Tools
• TW and Migrate modules combo
• node_import()
• Drush + custom script
TW & Migrate Module Combo
http://drupal.org/project/tw
• Supports Migrate module to run views of source data
http://drupal.org/project/migrate
• A flexible framework for migrating content
Migrate Module
Features:
• Users browse their legacy data using views
• Support for creating Drupal nodes, users, and comments is included
• Hooks permit migration of other types of content
• Provides a dashboard for running mini migrations
• Drush support
Why I did not choose Migrate
• Importing to MySQL was not an option. CSV files were used instead
• Cannot map old URL to new URL
node_import()
http://drupal.org/project/node_import
Features:
• Easy to learn, point and click
• Uses CSV to upload content
• Can easily delete previously imported data
• Can download errors when an import fails, for easy reference when fixing issues
node_import() Problems
• I can’t map old URLs to new URLs
• No Drush support
• It doesn’t save my old settings for a CSV
Drush + Custom script
Flexibility - I can do whatever I want with the data
Create your own migration script
[demo]
Issues
• File Management
• URL Rewriting
File Management
Client requirements
• Intuitive
• Has WYSIWYG support
• Access control – upload, edit, delete, revise files by different roles
• Revision control – optional but good to have
• Limited time!
File Management Modules
*DbFm was not included due to problems encountered during tests in D6
URL Rewriting
Source: http://www.flickr.com/photos/randomfactor/483264915/
URL Rewriting Solution
Not recommended
• .htaccess
  Too many URLs to handle. Too much server load
Recommended
• pathauto + path_redirect modules
  automated alias settings, 301 redirect set
• global redirect
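Before the redirects are loaded into Drupal, the old-to-new mapping has to be produced from the exported data. The Python sketch below shows one way to build such a map. The field names (`old_dir`, `old_name`, `title`) and the alias pattern (old directory plus a slug of the title) are assumptions for illustration; this is not how pathauto itself generates aliases, and the talk does not show the actual target URLs.

```python
import re


def slugify(title):
    """Lowercase the title and replace runs of non-alphanumerics with hyphens."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower())
    return slug.strip("-")


def build_redirects(rows):
    """Return a dict mapping each legacy path to its assumed new alias."""
    redirects = {}
    for row in rows:
        directory = row["old_dir"].strip("/")
        old_path = directory + "/" + row["old_name"] + ".htm"
        # Assumed alias pattern: keep the directory, swap the file name for a slug.
        new_path = directory + "/" + slugify(row["title"])
        redirects[old_path] = new_path
    return redirects
```

Each pair in the resulting map corresponds to one 301 redirect from a legacy URL to its new alias.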
Additional reference:
http://acquia.com/blog/migrating-drupal-way-part-ii-saving-those-old-urls
URL Checker
http://drupal.org/project/linkchecker
Access control Alternative
/default/files/PressReleases
/default/files/Documents
/default/files/International
/default/files/International/America
/default/files/International/England
/default/files/International/Asia
Test, Test and did I say Test?
Source: http://www.flickr.com/photos/paperpariah/2424107350/
Common problems
• Broken links
• Misconfigured pages
• Empty pages
• Invalid dates
• File not found or orphan pages
• Page format
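Some of the problems listed above, such as empty pages and invalid dates, can be caught in the source data before anything is imported. The Python sketch below is a minimal pre-import check, assuming each row is a dictionary with hypothetical `date` and `body` keys and that dates follow the "December 13, 2005" pattern seen in the sample data. Broken links and page formatting still need to be tested on the migrated site.

```python
from datetime import datetime

DATE_FORMAT = "%B %d, %Y"  # assumed from the sample value "December 13, 2005"


def find_problems(rows):
    """Return (row_index, problem) pairs for empty bodies and unparseable dates."""
    problems = []
    for index, row in enumerate(rows):
        if not row.get("body", "").strip():
            problems.append((index, "empty page"))
        try:
            datetime.strptime(row.get("date", ""), DATE_FORMAT)
        except ValueError:
            problems.append((index, "invalid date"))
    return problems
```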
Test when CACHE is on
Deployment
Deployment
2 ways to deploy your data to the live environment:
1. All at once
2. Divide and conquer
Deployment: Divide and Conquer
Example: CNN Division
Deployment Mockup
* shadow box is your migrated data’s production box
* old CMS is still active at this time
Deployment
• Coordination between the old CMS and Drupal
• URL Testing
Deployment Mockup
* shadow box is your migrated data’s production box
* replacing old CMS with Drupal
Deployment
Pros
• Less risk, less stress
• Editors can do continuous data entry daily
Cons
• URL rewriting can be tricky
• Updating the production box with new content can be an arduous task
Deployment: Updating Production
Automation
• SVN
• Drush scripts to migrate content from tester’s box to shadow box
• Deploy – http://drupal.org/project/deploy
Manual
• Document configuration changes
• Document database changes
Recap
• SDLC + Agile
• Common questions you should be asking before you start
• Top 3 tools to do migration in Drupal
  • TW & Migrate, node_import(), Drush
• Issues
  • File management comparison in D6
  • Tools to use in URL rewriting
• Testing
• Deployment solution
Questions?
Resources
http://groups.drupal.org/content-migration-import-and-export
http://drupal.org/handbook/migrating