Migrating from PostgreSQL to MySQL at Cocolog
Naoto Yokoyama, NIFTY Corporation
Garth Webb, Six Apart
Lisa Phillips, Six Apart
Credits: Kenji Hirohama, Sumisho Computer Systems Corp.
Agenda
1. What is Cocolog
2. History of Cocolog
3. DBP: Database Partitioning
4. Migration from PostgreSQL to MySQL
1. What is Cocolog
What is Cocolog
NIFTY Corporation
Established in 1986
A Fujitsu Group company
NIFTY-Serve (licensed and interconnected with CompuServe)
One of the largest ISPs in Japan
Cocolog
First blog community at a Japanese ISP
Based on TypePad technology by Six Apart
Several hundred million PV/month
History
Dec/02/2003: Cocolog launch (for ISP users)
Nov/24/2005: Cocolog Free launch (free service)
Apr/05/2007: Cocolog for Mobile Phone launch
2008/04: 700 thousand users
Cocolog (Screenshot of home page)
TypePad / Cocolog
Cocolog template sets
Cocolog Growth (Users) ■ Cocolog ■ Cocolog Free (chart spanning phases 1–4)
Cocolog Growth (Entries) ■ Cocolog ■ Cocolog Free (chart spanning phases 1–4)
Technology at Cocolog
Core System
Linux 2.4/2.6
Apache 1.3/2.0/2.2 & mod_perl
Perl 5.8 + CPAN
PostgreSQL 8.1
MySQL 5.0
memcached / TheSchwartz / cfengine
Eco System
LAMP, LAPP, Ruby + ActiveRecord, Capistrano, etc.
Monitoring Management Tool
Proprietary in-house development with PostgreSQL, PHP, and Perl
Monitoring points (in order of priority)
response time of each post
number of spam comments/trackbacks
number of comments/trackbacks
source IP address of spam
number of entries
number of comments via mobile devices
page views via mobile devices
time of batch completion
amount of API usage
bandwidth usage
DB
Disk I/O
Memory and CPU usage
time of VACUUM ANALYZE
APP
number of active processes
CPU usage
Memory usage
Monitored layers: Service / APL / DB / Hardware
2. History of Cocolog
Phase 1: 2003/12 ~ (Entries: 0.04 million)
Register
PostgreSQL
NAS
WEB
Static contents Published
Before DBP: 10 servers
TypePad
PodcastPortal
Profile Etc..
Phase 2: 2004/12 ~ (Entries: 7 million)
Rich template / Publish Book
Tel Operator Support
NAS
WEB
Static contents Published
PostgreSQL
Register
TypePad 2004/12 ~
2005/5 ~
Before DBP: 50 servers
Phase 2 – Problems
The system is tightly coupled: the database server receives queries from multiple points, so it is difficult to change the system design and database schema.
Phase 3: 2006/3 ~ (Entries: 12 million)
NAS
WEB
Static contents Published
Web-API
memcached
PodcastPortal
Profile Etc..
PostgreSQL
Rich template / Publish Book
Tel Operator Support
Register / TypePad
Before DBP: 200 servers
Phase 4: 2007/4 ~ (Entries: 16 million)
Web-API
NAS
WEB
Static contents Published
memcached
Atom
MobileWEB
Rich template / Publish Book
Tel Operator Support
Register
Typepad
PostgreSQL
Before DBP: 300 servers
Now 2008/4 ~
Web-API
NAS
WEB
Static contents Published
memcached
Atom
MobileWEB
Typepad
Rich template / Publish Book
Tel Operator Support
Register
Multi MySQL
After DBP: 150 servers
3. TypePad Database Partitioning
Steps for Transitioning
• Server Preparation Hardware and software setup
• Global Write Write user information to the global DB
• Global Read Read/write user information on the global DB
• Move Sequence Table sequences served by global DB
• User Data Move Move user data to user partitions
• New User Partition All new users saved directly to user partition 1
• New User Strategy Decide on a strategy for the new user partition
• Non User Data Move Move all non-user owned data
Storage
TypePad Overview (Pre-DBP)
Database(Postgres)
Static Content (HTML, Images, etc)
ApplicationServer
WebServer
TypeCastServer
ATOMServer
MEMCACHED
Data Caching servers to reduce DB load
Dedicated Server for TypeCast (via ATOM)
https(443) / http(80)
http(80): atom api
memcached(11211)
postgres(5432)
MailServer
Internet
nfs(2049)
ADMIN(CRON)Server
smtp(25) / pop(110)
Blog Readers
Blog Owners
Mobile Blog Readers
smtp(25) / pop(110)
Cron Server for periodic asynchronous tasks
TypePad
Non-User Role
Why Partition?
TypePad
User Role(User0)
All inquiries (access) go to one DB (Postgres)
Current setup → After DBP
Inquiries (access) are divided among several DBs (MySQL)
TypePad
GlobalRole
Non-UserRole
User Role(User1)
User Role(User2)
User Role(User3)
Non-User Role
Server Preparation
TypePad
User Role(User0)
DB(PostgreSQL)
User Role(User1)
User Role(User2)
User Role(User3)
GlobalRole
Non-UserRole
New expanded setup
DB(MySQL) for partitioned data
Current Setup
Job Server+ TypePad + Schwartz
SchwartzDB
User information is partitioned
Maintains user mapping and primary key generation
Stores job details
Server for executing Jobs
※Grey areas are not used in current steps
Asynchronous Job Server
Information that does not need to be partitioned (such as session information)
Global WriteCreating the user map
Non-User Role
TypePad
User Role(User0)
DB(PostgreSQL)
User Role(User1)
User Role(User2)
User Role(User3)
GlobalRole
Non-UserRole
Job Server+ TypePad + Schwartz
SchwartzDB
①
②
Explanation
①: For new registrations only, uniquely identifying user data is written to the global DB
②: This same data continues to be written to the existing DB
DB(MySQL) for partitioned data
Asynchronous Job Server
Maintains user mapping and primary key generation
※Grey areas are not used in current steps
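The dual write described on this slide can be sketched as follows. This is an illustrative Python sketch, not TypePad's actual (Perl) code; `Registration` and its fields are hypothetical names, and plain dicts stand in for the global and legacy database handles.

```python
class Registration:
    """Hypothetical sketch of the 'Global Write' step."""

    def __init__(self):
        self.global_db = {}    # new global role: user map (MySQL)
        self.existing_db = {}  # legacy data (PostgreSQL)

    def register(self, user_id, record):
        # (1) identifying data is written to the global DB;
        #     every user still lives on the original partition
        self.global_db[user_id] = {"partition": "user0"}
        # (2) the same data continues to be written to the existing DB
        self.existing_db[user_id] = record
```

Writing to both stores during the transition means the global user map can be built up without any read path depending on it yet.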
Global ReadUse the user map to find the user partition
Non-User Role
TypePad
User Role(User0)
DB(PostgreSQL)
User Role(User1)
User Role(User2)
User Role(User3)
GlobalRole
Non-UserRole
Job Server+ TypePad + Schwartz
SchwartzDB
Explanation
①: Migrate existing user data to the global DB
②: At the start of the request, the application queries the global DB for the location of the user's data
③: The application then talks to this DB for all queries about this user. At this stage the global DB points to the user0 partition in all cases.
DB(MySQL) for partitioned data
Maintains user mapping and primary key generation
①Migrate existing
user data
Asynchronous Job Server
②
③
※Grey areas are not used in current steps
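A minimal sketch of the lookup this slide describes, assuming a user map on the global DB keyed by user ID; all names (`user_map`, `partitions`) are illustrative, and dicts stand in for real DB connections.

```python
# Global DB: maps each user to the partition holding their data.
# At this stage every user still maps to user0, as the slide notes.
user_map = {101: "user0", 102: "user0"}

# Partition handles (plain dicts here; really separate DB connections)
partitions = {"user0": {101: {"blog": "cocolog"}, 102: {"blog": "typepad"}}}

def db_for_user(user_id):
    # One query against the global DB at the start of the request
    return partitions[user_map[user_id]]

def load_user(user_id):
    # Every subsequent query about this user goes to that partition
    return db_for_user(user_id)[user_id]
```

The key property is that the application resolves the partition once per request, so moving a user later only requires updating the map.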
Move SequenceMigrating primary key generation
Non-User Role
TypePad
User Role(User0)
DB(PostgreSQL)
User Role(User1)
User Role(User2)
User Role(User3)
GlobalRole
Non-UserRole
Job Server+ TypePad + Schwartz
SchwartzDB
Explanation
①: Postgres sequences (for generating unique primary keys) are migrated to tables on the global DB that act as “pseudo-sequences”.
②: The application requests new primary keys from the global DB rather than the user partition.
DB(MySQL) for partitioned data
Maintains user mapping and primary key generation
①
※Grey areas are not used in current steps
Migrate sequence management
Asynchronous Job Server
②
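A “pseudo-sequence” can be sketched as a one-row-per-sequence counter table. SQLite stands in for MySQL below; on real MySQL this increment-and-read is typically done atomically with `UPDATE pseq SET v = LAST_INSERT_ID(v + 1)` followed by `SELECT LAST_INSERT_ID()`. The table and column names are hypothetical.

```python
import sqlite3

# Counter table on the global DB replacing Postgres sequences
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pseq (name TEXT PRIMARY KEY, v INTEGER)")
conn.execute("INSERT INTO pseq VALUES ('entry_id', 0)")

def nextval(name):
    # Increment and read inside one transaction so two callers can
    # never observe the same value
    with conn:
        conn.execute("UPDATE pseq SET v = v + 1 WHERE name = ?", (name,))
        row = conn.execute(
            "SELECT v FROM pseq WHERE name = ?", (name,)).fetchone()
        return row[0]
```

Centralizing key generation on the global DB keeps primary keys unique across all user partitions.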
User Data MoveMoving user data to the new user-role partitions
Non-User Role
TypePad
User Role(User0)
DB(PostgreSQL)
User Role(User1)
User Role(User2)
User Role(User3)
GlobalRole
Non-UserRole
Job Server+ TypePad + Schwartz
SchwartzDB
Explanation
①: Existing users to be migrated are submitted to the Job Server as new Schwartz jobs; user data is then migrated asynchronously
②: If a comment arrives while the user is being migrated, it is saved in the Schwartz DB to be published later
③: After being migrated, all of a user's data exists on the user-role DB partitions
④: Once all user data is migrated, only non-user data remains on Postgres
DB(MySQL) for partitioned data
Stores job details
Server for executing Jobs
Maintains user mapping and primary key generation
User information is partitioned
①
②
※Grey areas are not used in current steps
③
Migrating each user data
DB(MySQL) for partitioned data
④
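The deferred-comment handling in steps ① and ② can be sketched like this. It is an illustrative Python sketch, not the Schwartz-based Perl implementation; the `deferred` list stands in for the Schwartz DB, and dicts stand in for the source and target partitions.

```python
class UserMigrator:
    """Hypothetical sketch of the 'User Data Move' step."""

    def __init__(self, source, target):
        self.source = source      # PostgreSQL user0 partition (dict)
        self.target = target      # MySQL user-role partition (dict)
        self.migrating = set()    # users currently being moved
        self.deferred = []        # comments parked during a move

    def post_comment(self, user_id, comment):
        if user_id in self.migrating:
            self.deferred.append((user_id, comment))  # publish later
        elif user_id in self.target:
            self.target[user_id].append(comment)      # already moved
        else:
            self.source[user_id].append(comment)      # not yet moved

    def migrate(self, user_id):
        self.migrating.add(user_id)
        self.target[user_id] = self.source.pop(user_id)  # copy user data
        self.migrating.discard(user_id)
        for uid, c in list(self.deferred):  # replay parked comments
            if uid == user_id:
                self.target[uid].append(c)
                self.deferred.remove((uid, c))
```

Parking writes during the copy avoids losing comments without taking the blog offline.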
New User PartitionNew registrations are created on one user role partition
Non-User Role
TypePad
User Role(User0)
DB(PostgreSQL)
User Role(User1)
User Role(User2)
User Role(User3)
GlobalRole
Non-UserRole
Job Server+ TypePad + Schwartz
SchwartzDB
Explanation
①: When new users register, their data is written to a user-role partition
②: Non-user data continues to be served off Postgres
DB(MySQL) for partitioned data
Maintains user mapping and primary key generation
User information is partitioned
①
②
※Grey areas are not used in current steps
Asynchronous Job Server
New User StrategyPick a scheme for distributing new users
Non-User Role
TypePad
User Role(User0)
DB(PostgreSQL)
User Role(User1)
User Role(User2)
User Role(User3)
GlobalRole
Non-UserRole
Job Server+ TypePad + Schwartz
SchwartzDB
Explanation
①: When new users register, their data is written to one of the user-role partitions according to a set distribution method (round robin, random, etc.)
②: Non-user data continues to be served off Postgres
DB(MySQL) for partitioned data
Maintains user mapping and primary key generation
User information is partitioned
①
②
※Grey areas are not used in current steps
Asynchronous Job Server
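Round robin, one of the distribution methods the slide mentions, can be sketched as follows; the names are hypothetical, and the `user_map` dict stands in for the mapping table on the global DB.

```python
import itertools

# New-user partitions available after DBP (illustrative names)
partitions = ["user1", "user2", "user3"]
_next_partition = itertools.cycle(partitions)

def assign_partition(user_id, user_map):
    # Pick the next partition in rotation and record the choice in the
    # user map (which really lives on the global DB)
    user_map[user_id] = next(_next_partition)
    return user_map[user_id]
```

Because every lookup already goes through the user map, the distribution scheme can be changed later without touching existing users.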
Non User Data MoveMigrate data that cannot be partitioned by user
Non-User Role
TypePad
User Role(User0)
DB(PostgreSQL)
User Role(User1)
User Role(User2)
User Role(User3)
GlobalRole
Non-UserRole
Job Server+ TypePad + Schwartz
SchwartzDB
Explanation ①: Migrate non-user role data left on PostgreSQL to the MySQL side.
DB(MySQL) for partitioned data
Maintains user mapping and primary key generation
User information is partitioned
①
※Grey areas are not used in current steps
Migrate non-User data
Asynchronous Job Server
Information that does not need to be partitioned (such as session information)
Data migration done
Non-User Role
TypePad
User Role(User0)
DB(Postgres)
User Role(User1)
User Role(User2)
User Role(User3)
GlobalRole
Non-UserRole
Job Server+ TypePad + Schwartz
SchwartzDB
Explanation
①: All data access is now done through MySQL
②: Continue to use TheSchwartz for asynchronous jobs
DB(MySQL) for partitioned data
Stores job details
Server for executing Jobs
Maintains user mapping and primary key generation
User information is partitioned
①
※Grey areas are not used in current steps
①
② Asynchronous Job Server
Information that does not need to be partitioned (such as session information)
Storage
The New TypePad configuration
Database(MySQL)
Static Content (HTML, Images, etc)
ApplicationServer
WebServer
TypeCastServer
ATOMServer
MEMCACHED
Data Caching servers to reduce DB load
Dedicated Server for TypeCast (via ATOM)
https(443)http(80)
http(80) : atom api
memcached(11211)
MySQL(3306)
MailServer
Internet
nfs(2049)
ADMIN(CRON)Server
smtp(25) / pop(110)
Blog Readers
Blog Owners (management interface)
Mobile Blog Readers
smtp(25) / pop(110)
Cron Server for periodic asynchronous tasks
JobServer
TheSchwartz server for running ad-hoc jobs asynchronously
4. Migration from PostgreSQL to MySQL
DB Node Spec History
Time: 2003/12 → 2007/11 (in order of upgrade)

OS (RedHat)     CPU (Xeon)                               MEM    DiskArray
7.4   (2.4.9)   1.8GHz / 512K × 1                        1GB    No
ES2.1 (2.4.9)   3.2GHz / 1M × 2                          4GB    No
ES2.1 (2.4.9)   3.2GHz / 1M × 2                          4GB    Yes
AS2.1 (2.4.9)   3.2GHz / 1M × 4                          12GB   Yes
AS4   (2.6.9)   3.2GHz / 1M × 4                          12GB   Yes
AS4   (2.6.9)   MP 3.3GHz / 1M × 4 (2 cores × 4)         16GB   Yes
History of scale up PostgreSQL server, Before DBP
DB DiskArray Spec [FUJITSU ETERNUS8000]
Best I/O transaction performance in the world (vendor claim)
146GB (15 krpm) × 32 disks, RAID-10
MultiPath FibreChannel 4Gbps
QuickOPC (One Point Copy): OPC copy functions let you create a duplicate copy of any data from the original at any chosen time.
http://www.computers.us.fujitsu.com/www/products_storage.shtml?products/storage/fujitsu/e8000/e8000
History of scale up PostgreSQL server, Before DBP
Scale out MySQL servers, After DBP
Role configuration: each role is configured as an HA cluster
HA software: NEC ClusterPro
Shared Storage
Scale out MySQL servers, After DBP
PostgreSQL
FibreChannel SAN
DiskArray
…
heart beat
MySQLRole3
MySQLRole2
MySQLRole1
TypePadApplication
Scale out MySQL servers, After DBP
Backup Replication w/ Hot backup
Scale out MySQL servers, After DBP
PostgreSQL
FibreChannel SAN
DiskArray
…
heart beat
MySQLRole3
MySQLRole2
MySQLRole1
MySQLBackupRole
TypePadApplication
mysqld mysqld mysqld
rep rep rep
opc
mysqld mysqld mysqld
Troubles with PostgreSQL 7.4 – 8.1
Data size over 100 GB; 40% is index
Severe data fragmentation
VACUUM
“VACUUM ANALYZE” causes performance problems
Takes too long to VACUUM large amounts of data
dump/restore is the only solution for de-fragmentation
Auto-vacuum
We don't use auto-vacuum because we are worried about latent response time
Troubles with PostgreSQL 7.4 – 8.1
Character set
PostgreSQL accepted out-of-boundary UTF-8: Japanese extended character sets and multibyte sequences that should normally be rejected with an error were stored instead.
“Cleaning” data
Removing character sequences that are outside the boundaries of UTF-8.
Steps: PostgreSQL dumpall → split for piconv → UTF8 -> UCS2 -> UTF8 & merge → PostgreSQL restore
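The UTF8 -> UCS2 -> UTF8 round trip cleans the data because invalid byte sequences fail to decode and UCS-2 cannot represent code points above U+FFFF. A Python sketch of the same idea (the real pipeline ran piconv over split dump files; `clean_utf8` is a hypothetical name):

```python
def clean_utf8(raw: bytes) -> bytes:
    """Emulate the piconv UTF8 -> UCS2 -> UTF8 round trip."""
    # Drop byte sequences that are not valid UTF-8
    text = raw.decode("utf-8", errors="ignore")
    # The UCS-2 leg cannot represent code points above U+FFFF,
    # so those characters are dropped as well
    text = "".join(ch for ch in text if ord(ch) <= 0xFFFF)
    return text.encode("utf-8")
```

After this pass the dump contains only data MySQL's stricter character-set handling will accept.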
Migration from PostgreSQL to MySQL using a TypePad script
Steps:
PostgreSQL -> PerlObject & tmp publish
MySQL -> PerlObject & last publish
diff tmp & last objects (data check)
diff tmp & last published files (file check)
PostgreSQL
Document
Object
tmp
Document
Object
last
File check
data check
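The data check and file check above can be sketched as follows; `load_and_publish` is a hypothetical stand-in for loading a Perl object through the TypePad script and publishing it, and dicts stand in for the two databases.

```python
def load_and_publish(db, entry_id):
    # Stand-in for "DB -> PerlObject & publish" in the deck
    obj = db[entry_id]
    html = "<p>%s</p>" % obj["body"]   # pretend this is the published file
    return obj, html

def verify(pg_db, mysql_db, entry_id):
    tmp_obj, tmp_file = load_and_publish(pg_db, entry_id)       # tmp publish
    last_obj, last_file = load_and_publish(mysql_db, entry_id)  # last publish
    data_ok = tmp_obj == last_obj      # diff tmp & last objects
    file_ok = tmp_file == last_file    # diff tmp & last published files
    return data_ok and file_ok
```

Comparing both the loaded objects and the published output catches corruption that a row count alone would miss.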
Troubles with MySQL
convert_tz function doesn't support input values outside the range of Unix time
sort order: rows come back in a different order without an “ORDER BY” clause
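The sort-order difference bites wherever the application relied on implicit row order; the fix is an explicit ORDER BY. A sketch of the safe pattern, with SQLite standing in for either engine:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entries (id INTEGER, title TEXT)")
conn.executemany("INSERT INTO entries VALUES (?, ?)", [(2, "b"), (1, "a")])

# Without ORDER BY, row order is engine-dependent: PostgreSQL and MySQL
# may legitimately return the same rows in different orders. Always
# state the intended order explicitly.
rows = conn.execute("SELECT id, title FROM entries ORDER BY id").fetchall()
```

Queries whose results feed published pages are the ones worth auditing first, since a reordered result changes the rendered output.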
Cocolog Future Plans
Dynamic Job queue
Consulting by Sumisho Computer Systems Corp.
System integrator; first and best partner of MySQL in Japan since 2003
Provides MySQL consulting, support, and training services
HA, maintenance, online backup, Japanese character support
Questions