Planning for Synchronization with Browser-Local Databases

Eric Farrar, Product Manager, Sybase iAnywhere, October 20th, 2009. Preparing for Synchronization with Browser-local Databases


Talk by Eric Farrar of Sybase at ZendCon 2009

TRANSCRIPT

Page 1: Planning for Synchronization with Browser-Local Databases

Eric Farrar, Product Manager
Sybase iAnywhere
October 20th, 2009

Preparing for Synchronization with Browser-local Databases

Page 2: Planning for Synchronization with Browser-Local Databases

Web applications are always online. Aren’t they?

• Traditionally, but the lines are being blurred
• Web applications are starting to act like desktop applications
• Desktop applications are starting to act like web applications

• Offline Application Caching
  – Gears
  – HTML5
  – Adobe AIR

• This is a new space for web applications, but a well-known area for desktop and mobile applications

Page 3: Planning for Synchronization with Browser-Local Databases

It Isn’t Only About Going Offline

• It’s also about speed-ups for online applications
• Data-intensive applications can save many round-trips to the server by storing reference data locally

Page 4: Planning for Synchronization with Browser-Local Databases

Browser-Local Databases

• SQLite
  – Gears (Chrome, Android)
  – HTML 5 Specification Draft (Firefox, Safari)
  – Adobe AIR
• Semi-structured Storage (key-value pairs)
  – Cookies
  – Flash Storage
• Isolated Storage
  – Silverlight

Page 5: Planning for Synchronization with Browser-Local Databases

But...

• For most applications, the ultimate destination for local data is a central server (“consolidated database”)

• This means two or more copies of the same data that can be changed independently

• This introduces synchronization!

Page 6: Planning for Synchronization with Browser-Local Databases

Synchronization

• Synchronization is a huge topic
• This talk aims to show two things:
  1. Synchronization is a complex problem that requires careful planning
  2. The more planning done upfront, the easier the synchronization process will be

• This talk discusses synchronization of application data, not of the application itself (a related task that is usually handled separately)

Page 7: Planning for Synchronization with Browser-Local Databases

How often are you synchronizing?

• Occasionally-connected
  – Normal application state is connected
  – Disconnection is treated as a special case

• Occasionally-disconnected
  – Normal application state is disconnected
  – Connection is treated as a special case

Page 8: Planning for Synchronization with Browser-Local Databases

What are you synchronizing?

• Object-based synchronization
  – Serialized objects (perhaps hierarchical) are the basic object of synchronization
  – Similar to document-based synchronization (CouchDB)

• Action-based synchronization
  – Actions on data, rather than the data itself, are synchronized
  – Store a log of actions to take once a connection is available
  – Difficult to implement for moderately complex systems

• Row-based synchronization
  – Individual database rows are the basic object of synchronization
  – The problem is the best defined of the three
  – Often the ultimate destination of the data will be a relational database

Page 9: Planning for Synchronization with Browser-Local Databases

“A few rows up, a few rows down. What’s so hard about that?”

• Synchronization is deceptively complicated
• Let’s build an application that synchronizes only a single table holding a contact list
• Typical record

Contact ID   Name            Address                 City          Last Contact
102          Homer Simpson   742 Evergreen Terrace   Springfield   2009-10-10 10:00

Page 10: Planning for Synchronization with Browser-Local Databases

Data Subsetting

• Each user should get all of their data, and only their data
• Data must be filtered at the central server, not the end clients
  – Network Traffic
  – Data Storage
  – Application Performance
  – User Experience
  – Security

• Clients should only need to provide a user name and some subscriptions, and the servers should be able to figure out what they need

Page 11: Planning for Synchronization with Browser-Local Databases

Implementing Subsetting for Contacts

• Create a mapping (“subscription”) table

RowID   User ID   City
1       0001      Springfield
2       0002      Ogdenville
3       0003      Springfield
4       0003      Ogdenville
5       ...       ...
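As a rough sketch of how a server could use the subscription table to subset contacts, the following joins contacts to subscriptions on city and filters by user. The table names, column names, and the second contact row are illustrative assumptions, not part of the talk.

```python
import sqlite3

# Hypothetical server-side schema; all names here are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE contacts (
        contact_id INTEGER PRIMARY KEY,
        name TEXT, address TEXT, city TEXT, last_contact TEXT
    );
    CREATE TABLE subscriptions (
        row_id INTEGER PRIMARY KEY,
        user_id TEXT, city TEXT
    );
    INSERT INTO contacts VALUES
        (102, 'Homer Simpson', '742 Evergreen Terrace', 'Springfield', '2009-10-10 10:00'),
        (103, 'Sample Contact', '1 Example Road', 'Ogdenville', '2009-10-11 09:00');
    INSERT INTO subscriptions (user_id, city) VALUES
        ('0001', 'Springfield'), ('0002', 'Ogdenville'),
        ('0003', 'Springfield'), ('0003', 'Ogdenville');
""")

def contacts_for_user(conn, user_id):
    """Return only the contacts in cities the given user subscribes to."""
    rows = conn.execute(
        """SELECT c.contact_id, c.name, c.city
           FROM contacts AS c
           JOIN subscriptions AS s ON s.city = c.city
           WHERE s.user_id = ?
           ORDER BY c.contact_id""",
        (user_id,),
    )
    return rows.fetchall()
```

The client only supplies its user ID; the filtering happens in the query on the server, so rows outside the subscription never cross the network.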

Page 12: Planning for Synchronization with Browser-Local Databases

Adding Contacts

• Add the following contact

Contact ID   Name        Address           City          Last Contact
????         Mr. Teeny   123 Fake Street   Springfield   2009-10-10 10:00

• What do we use as a Contact ID?
  – Autoincrement won’t work: disconnected clients would independently hand out the same values
• What can we do?

Page 13: Planning for Synchronization with Browser-Local Databases

GUIDs to the Rescue. Maybe...

• The simplest solution is to use Globally Unique Identifiers (GUIDs)
• GUIDs are 128-bit numbers (often expressed as a 32-character hexadecimal string)
• Can be generated independently and treated as unique if...
  – ...you know the same generation algorithm is being used
  – ...you have a good pseudo-random number generator
• Keys are meaningless, large, and awkward
• May not integrate very well into an existing system
• Can never use self-checking features in your primary keys
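A minimal sketch of client-side GUID generation, using Python's standard `uuid` module. The function name `new_contact_id` is a hypothetical helper, and random (version 4) UUIDs are just one possible generation algorithm.

```python
import uuid

def new_contact_id():
    """Return a random (version 4) GUID as a 32-character hexadecimal string."""
    return uuid.uuid4().hex

key = new_contact_id()
```

Each client can call this while offline; no coordination with the central server is needed, at the cost of the large, meaningless keys noted above.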

Page 14: Planning for Synchronization with Browser-Local Databases

Composite Keys

• Assign each application (not each user) a unique ID number
• Combine this unique ID with a regular autoincrementing number
  – Can be concatenated to create a single key, or used as a composite-column key
• Keys are much smaller than GUIDs, and they carry built-in meaning
• Possible that you will exhaust your “key range” before you can synchronize and be assigned a new ID
• Application IDs can be assigned by the central server using a simple autoincrement
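The concatenation variant might look like the sketch below. The class name, the decimal concatenation scheme, and the `counter_digits` parameter are assumptions chosen for illustration; the talk does not prescribe a particular encoding.

```python
class CompositeKeyGenerator:
    """Builds keys from a server-assigned application ID and a local counter."""

    def __init__(self, app_id, counter_digits=6):
        self.app_id = app_id
        self.limit = 10 ** counter_digits  # counter values 1 .. limit - 1 are usable
        self._next = 1

    def next_key(self):
        if self._next >= self.limit:
            # The "key range" is used up; a new application ID is needed.
            raise RuntimeError("key range exhausted; synchronize to get a new application ID")
        key = self.app_id * self.limit + self._next  # decimal concatenation
        self._next += 1
        return key

gen = CompositeKeyGenerator(app_id=17, counter_digits=3)
first_key = gen.next_key()  # application 17, counter 001
```

Because the application ID occupies the high digits, two applications with different IDs can never produce the same key, and the key itself reveals where a row was created.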

Page 15: Planning for Synchronization with Browser-Local Databases

Primary Key Pools

• Composite keys use the Application ID to implicitly reserve a range of keys
• Primary key pools explicitly reserve a range of keys by taking them!
• At each synchronization, the application requests a range of unassigned keys from the central server
• Every time the local application needs a key, it takes one from the primary key pool
• Does not require a unique application identifier
• Possible you will exhaust your pool of keys before synchronizing
• Keys don’t carry any meaning
• Complex to set up
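On the client, a key pool could be as simple as a table of reserved values, as in this sketch. The table name `contacts_key_pool` follows the naming used later in the talk; the helper functions and the idea that the server hands over a plain list of integers are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contacts_key_pool (key_value INTEGER PRIMARY KEY)")

def refill_pool(conn, reserved_keys):
    """Store keys that the central server reserved for this client during a sync."""
    with conn:
        conn.executemany(
            "INSERT INTO contacts_key_pool (key_value) VALUES (?)",
            [(k,) for k in reserved_keys],
        )

def take_key(conn):
    """Remove and return the smallest unused key; raise if the pool is empty."""
    with conn:
        row = conn.execute("SELECT MIN(key_value) FROM contacts_key_pool").fetchone()
        if row[0] is None:
            raise RuntimeError("key pool exhausted; synchronize to reserve more keys")
        conn.execute("DELETE FROM contacts_key_pool WHERE key_value = ?", (row[0],))
        return row[0]
```

The exhaustion case is the one to plan for: the size of each reserved range has to cover the longest period a client is expected to stay disconnected.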

Page 16: Planning for Synchronization with Browser-Local Databases

Deleting a Contact

• Once you delete something, it is gone
• If it is gone, how do you know you deleted it?
• Must implement some method to “remember” that something has been deleted
  – deleted status column
  – shadow (“tombstone”) table
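The shadow-table approach can be sketched with a SQLite trigger that records each deleted key. The schema below is a simplified assumption (only two columns on `contacts`), and the trigger name is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE contacts (
        contact_id INTEGER PRIMARY KEY,
        name TEXT
    );
    -- Shadow table: remembers which rows were deleted since the last sync.
    CREATE TABLE contacts_deleted (
        contact_id INTEGER PRIMARY KEY,
        deleted_at TEXT DEFAULT CURRENT_TIMESTAMP
    );
    CREATE TRIGGER contacts_after_delete AFTER DELETE ON contacts
    BEGIN
        INSERT OR REPLACE INTO contacts_deleted (contact_id) VALUES (OLD.contact_id);
    END;
""")

conn.execute("INSERT INTO contacts VALUES (102, 'Homer Simpson')")
conn.execute("DELETE FROM contacts WHERE contact_id = 102")

# Keys to upload as deletes at the next synchronization.
pending_deletes = [row[0] for row in conn.execute("SELECT contact_id FROM contacts_deleted")]
```

After a successful upload, the client would clear the rows it has reported from `contacts_deleted`.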

Page 17: Planning for Synchronization with Browser-Local Databases

Updating a Column

• Need a method to detect that a column has been modified
  – Status column
  – Last-Modified timestamp
  – Version number

• What happens if two people update the same row?
  – Detect that a change conflict has happened
    – Row-level vs column-level conflict detection
  – Resolve the conflict

• Idempotent changes
• The “Delete-then-Insert” problem
• Mapping data types to the consolidated database
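One way to detect a row-level conflict with a version number is an update that only succeeds when the server's row still has the version the client started from. This is a sketch under assumed names (`apply_update`, a `version` column); it detects the conflict but leaves resolution to the caller.

```python
import sqlite3

server = sqlite3.connect(":memory:")
server.execute(
    "CREATE TABLE contacts ("
    "contact_id INTEGER PRIMARY KEY, city TEXT, version INTEGER NOT NULL)"
)
server.execute("INSERT INTO contacts VALUES (102, 'Springfield', 1)")

def apply_update(conn, contact_id, base_version, new_city):
    """Apply an uploaded change only if the row is unchanged since base_version.

    Returns True if the change was applied. Returns False if the row's version
    no longer matches (a conflict) or the row does not exist.
    """
    with conn:
        cur = conn.execute(
            """UPDATE contacts
               SET city = ?, version = version + 1
               WHERE contact_id = ? AND version = ?""",
            (new_city, contact_id, base_version),
        )
    return cur.rowcount == 1
```

If two clients both edit version 1 of the same row, the first upload wins and bumps the version to 2; the second upload returns False and has to go through whatever conflict resolution policy the application chooses.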

Page 18: Planning for Synchronization with Browser-Local Databases

Non-Synchronized Deletes

• What if you want to delete something from your application, but not delete it system-wide?
• Need some method to turn off your change tracking algorithm
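A possible way to switch change tracking off is a one-row settings table that the delete trigger consults. The `sync_settings` table, its `tracking_enabled` column, and the `delete_locally` helper are all assumptions for the sake of the sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE contacts (contact_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE contacts_deleted (contact_id INTEGER PRIMARY KEY);
    -- Single-row switch: 1 = record deletes for upload, 0 = local-only deletes.
    CREATE TABLE sync_settings (tracking_enabled INTEGER NOT NULL);
    INSERT INTO sync_settings VALUES (1);
    CREATE TRIGGER contacts_after_delete AFTER DELETE ON contacts
    WHEN (SELECT tracking_enabled FROM sync_settings) = 1
    BEGIN
        INSERT OR REPLACE INTO contacts_deleted (contact_id) VALUES (OLD.contact_id);
    END;
    INSERT INTO contacts VALUES (102, 'Homer Simpson'), (103, 'Mr. Teeny');
""")

def delete_locally(conn, contact_id):
    """Delete a row from this client only, without recording a tombstone."""
    with conn:
        conn.execute("UPDATE sync_settings SET tracking_enabled = 0")
        try:
            conn.execute("DELETE FROM contacts WHERE contact_id = ?", (contact_id,))
        finally:
            # Always re-enable tracking, even if the delete fails.
            conn.execute("UPDATE sync_settings SET tracking_enabled = 1")
```

Calling `delete_locally(conn, 102)` removes the row without a tombstone, while an ordinary `DELETE` on contact 103 is still recorded for upload.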

Page 19: Planning for Synchronization with Browser-Local Databases

Our Simple Contact List

• Went from...
  – Contacts table

Page 20: Planning for Synchronization with Browser-Local Databases

Our Not-So-Simple Synced Contact List

• To...
  – Contacts table
  – Contacts_deleted shadow table
  – Contacts_Users_subscription table
  – Contacts_key_pool table
  – TRIGGER AFTER UPDATE on Contacts
  – TRIGGER AFTER INSERT on Contacts
  – TRIGGER AFTER DELETE on Contacts
  – Disable/Enable change tracking

• And that says nothing about the actual synchronization or conflict resolution logic!

Page 21: Planning for Synchronization with Browser-Local Databases

Data Reassignment

• A user is reassigned to a new city. What needs to happen?

1. Complete a final upload of all changes on the “old” set of data

2. Download a set of operations that delete all the old contacts

3. Download the contacts of the new city

• Easy enough for one table. But what happens when we have two or more tables with a foreign key relationship?

Page 22: Planning for Synchronization with Browser-Local Databases

Referential Integrity: Friend or Foe?

• Reassignment problems can quickly become lost in a referential integrity nightmare

• It may become tempting to disable (or at least never enforce) referential integrity checks

• This is usually a bad idea:
  – Referential integrity should be your friend. It ensures your data stays consistent
  – Your server-side database will likely enforce referential integrity, so it is better to do a client-side check before sending the data up
  – There are performance benefits to defining foreign key relations
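To illustrate keeping referential integrity on during a reassignment, the sketch below enforces foreign keys in SQLite and applies the downloaded deletes children-first inside one transaction. The `notes` child table and the `remove_city` helper are hypothetical additions, not part of the talk's example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default
conn.executescript("""
    CREATE TABLE contacts (contact_id INTEGER PRIMARY KEY, city TEXT);
    -- Hypothetical child table with a foreign key to contacts.
    CREATE TABLE notes (
        note_id INTEGER PRIMARY KEY,
        contact_id INTEGER NOT NULL REFERENCES contacts(contact_id),
        body TEXT
    );
    INSERT INTO contacts VALUES (102, 'Springfield');
    INSERT INTO notes VALUES (1, 102, 'Example note');
""")

def remove_city(conn, city):
    """Apply the deletes for one city, children first, as a single transaction."""
    with conn:
        conn.execute(
            "DELETE FROM notes WHERE contact_id IN "
            "(SELECT contact_id FROM contacts WHERE city = ?)",
            (city,),
        )
        conn.execute("DELETE FROM contacts WHERE city = ?", (city,))
```

Deleting the contact first would raise `sqlite3.IntegrityError` because a note still references it, which is exactly the early client-side check the slide argues for.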

Page 23: Planning for Synchronization with Browser-Local Databases

Application and Schema Upgrades

• The traditional problems associated with dealing with legacy deployed software are usually avoided by web applications

• Cached versions of applications mean you can no longer guarantee that everyone is running the latest version

• Need some provision to let “older” applications synchronize against logic that is correct for their schema

• This is typically achieved by adding a full level of indirection between the local database and the consolidated database
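The level of indirection could take the form of per-version translation logic on the server, selected by the schema version a client reports. Everything in this sketch is assumed for illustration: the version numbers, the column names, and the idea that version 2 splits the name into two columns.

```python
# Hypothetical translators from each client schema version to the
# consolidated database's row layout.
def upload_v1(row):
    # Version 1 clients (assumed) store a single "name" column.
    return {"name": row["name"], "city": row["city"]}

def upload_v2(row):
    # Version 2 clients (assumed) split the name into two columns.
    return {"name": f'{row["first_name"]} {row["last_name"]}', "city": row["city"]}

UPLOAD_TRANSLATORS = {1: upload_v1, 2: upload_v2}

def translate_upload(schema_version, row):
    """Convert an uploaded row using the logic that matches the client's schema."""
    try:
        translator = UPLOAD_TRANSLATORS[schema_version]
    except KeyError:
        raise ValueError(f"unsupported client schema version: {schema_version}")
    return translator(row)
```

Older cached clients keep synchronizing against `upload_v1` while newer ones use `upload_v2`, and the consolidated schema can change without touching either client.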

Page 24: Planning for Synchronization with Browser-Local Databases

Data Integrity in the Field

• Everyone should always have a consistent view of the data at every moment
  – Inconsistent data can quickly propagate through a synchronization environment and infect everyone
• This typically means synchronizations must be fully atomic at both ends
  – Error reporting must happen outside this atomic transaction, otherwise it would be lost
• Most applications cannot handle half-synchronized data
• Need to handle broken and partial synchronizations
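On the client side, an all-or-nothing download with error reporting kept outside the rolled-back transaction might be sketched as follows. The `sync_errors` table and the `apply_download` helper are assumptions; a real client would also need to report the failure to the server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE contacts (contact_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    -- Error log written in its own transaction, separate from the sync.
    CREATE TABLE sync_errors (message TEXT);
""")

def apply_download(conn, rows):
    """Apply every downloaded row or none of them; log failures separately."""
    try:
        with conn:  # commits on success, rolls back if any statement fails
            conn.executemany(
                "INSERT INTO contacts (contact_id, name) VALUES (?, ?)", rows
            )
        return True
    except sqlite3.Error as exc:
        with conn:  # separate transaction, so the report survives the rollback
            conn.execute("INSERT INTO sync_errors VALUES (?)", (str(exc),))
        return False
```

If any row in the batch violates a constraint, none of the batch is kept, so the application never sees half-synchronized data, yet the error message is still recorded.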

Page 25: Planning for Synchronization with Browser-Local Databases

What else might you want?

• High-priority synchronization
• Implementing secure authentication at every point in the chain
• Encryption
• Server-initiated synchronization (“Push” synchronization)
• Lots more...

Page 26: Planning for Synchronization with Browser-Local Databases

Summary

• Synchronization is deceptively hard
  – It is relatively easy to put together a simple, controlled synchronization in a lab between two computers
  – The real complications only show up in the real world
• A full synchronization strategy should be planned at the start
  – All projects suffer from scope creep
  – It is better to decide early on what you will and won’t do, and architect for it
• Synchronization is rarely, if ever, a simple “bolt-on” solution
• Test under realistic conditions and realistic load!

Page 27: Planning for Synchronization with Browser-Local Databases

• A patched version of Gears that adds synchronization functionality
• Allows the use of the Sybase UltraLite relational database
• UltraLite is a small-footprint, fully relational database capable of totally self-contained synchronization
• Contains all its own change tracking and synchronization logic that is totally transparent to the end user
• Synchronizes with a MobiLink synchronization server providing out-of-the-box synchronization with Oracle, SQL Server, DB2, ASE, SQL Anywhere, and MySQL
  – 10 years old
  – Heavily deployed and tested

• Handles all of the mechanics and plumbing of synchronization, and lets you focus on your business logic

• Business logic can be written in SQL, .NET, or Java

Page 28: Planning for Synchronization with Browser-Local Databases

• Implemented as an open-source patch to the Gears project, released under the Apache 2 license
• Beta went live last night
• Available for Internet Explorer (Windows) and Firefox (Windows and Linux)
• Free deployment for SQL Anywhere and MySQL-based applications

www.sybase.com/ultraliteweb

Page 29: Planning for Synchronization with Browser-Local Databases

Thank You

Eric Farrar
[email protected]

http://iablog.sybase.com/efarrar