webinar- couchbase mobile sync gateway configuration and management webinar mobile advanced...
TRANSCRIPT
Sync Gateway In-Depth
Chris Anderson
June 24, 2014
V1.0 - 3/10/14
Architecture
What’s It For?
1. Syncing With Couchbase Lite
• Multi-master sync, scalable to any number of clients
• Based on CouchDB replication protocol
2. Authentication
• Sync Gateway is publicly reachable outside the firewall
• Security barrier between clients and Couchbase Server
• User accounts
  • Stored as docs in bucket
  • Passwords stored as digests, using bcrypt
• Logins
  • HTTP Basic auth
  • Cookie-based sessions
  • App server can generate sessions for custom auth
3. Authorization
• Channel tags grant read access to documents
  • Each user has access to a specific set of channels
  • Channel access can be configured via REST API
  • Sync function can grant access on the fly
• Sync function performs data validation
  • JavaScript function defined by app developer, provided in config
  • Called on every document update
  • Throws exception to reject a change
  • Must programmatically enforce write access to channels
4. App Workflow
• Channels route documents between users
• Sync function drives the entire workflow
  • Tagging documents with channels
  • Granting channel access to users
  • Validating data
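The read-authorization rule above (a user may read a document when the document is tagged with a channel the user has access to) can be sketched as a small in-memory model. This is only an illustration: `userChannels`, `canRead`, and `grantAccess` are hypothetical names, and the real Sync Gateway keeps this state in the bucket, not in a JavaScript object.

```javascript
// Hypothetical in-memory model of which channels each user may read.
const userChannels = {
  alice: new Set(["list-1", "profiles"]),
  bob: new Set(["profiles"]),
};

// A user can read a document if at least one of the document's channel
// tags is in the user's channel set.
function canRead(user, docChannels) {
  const granted = userChannels[user] || new Set();
  return docChannels.some((ch) => granted.has(ch));
}

// Models granting access "on the fly", as a sync function might do.
function grantAccess(user, channel) {
  if (!userChannels[user]) userChannels[user] = new Set();
  userChannels[user].add(channel);
}

console.log(canRead("bob", ["list-1"])); // false
grantAccess("bob", "list-1");
console.log(canRead("bob", ["list-1"])); // true
```

Note that this model covers read access only. As the slide says, write access has to be enforced programmatically inside the sync function.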
Architecture Diagram
[Diagram: Couchbase Lite for iOS and Android clients sync over channels with Sync Gateway instances, which sit in front of a Couchbase Server cluster (Server 1, Server 2, Server 3). Deployable on premise or in the cloud.]
Sync Gateway Components
[Diagram labels: Sync REST API (to client), Authentication, External Auth Services, App’s Sync Function, Revision/Conflict Management, Channel Change Tracking, Couchbase Smart Client (to Couchbase Server)]
Sync Gateway: Incoming Changes
• Sync REST API: pushes changes from client via POST /db/_revs_diff and POST /db/_bulk_docs
• Authentication: HTTP Basic (over SSL), or session cookie
• External Auth Services: Facebook, Persona (email-based), Custom, LDAP, etc. *
Sync Gateway: App Logic & Storage
[Diagram labels: App’s Sync Function, Revision/Conflict Management, Couchbase Smart Client (to Couchbase Server)]
function(doc, oldDoc) {
  …
  requireUser(oldDoc.owner);        // validation
  …
  channel(doc.channel);             // routing
  …
  access(doc.members, doc.roomID);  // access control
}
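To see how the three calls in the slide's sync function behave, here is a minimal test harness, assuming stub versions of `requireUser`, `channel`, and `access`. The harness (`runSync`), the stubs, and the `if (oldDoc)` guard for first revisions are all assumptions made for illustration; inside Sync Gateway these helpers are supplied by the gateway itself.

```javascript
// Stub helpers, rebound on every call by runSync (hypothetical harness).
let requireUser, channel, access;

// Sync function shaped like the one on the slide.
function syncFn(doc, oldDoc) {
  if (oldDoc) requireUser(oldDoc.owner); // validation: only the owner may update
  channel(doc.channel);                  // routing: tag the doc with a channel
  access(doc.members, doc.roomID);       // access control: grant channel access
}

// Runs a sync function and records what it did instead of applying it.
function runSync(fn, doc, oldDoc, currentUser) {
  const out = { channels: [], access: {}, rejected: null };
  requireUser = (name) => {
    if (name !== currentUser) throw { forbidden: "wrong user" };
  };
  channel = (ch) => { out.channels.push(ch); };
  access = (users, ch) => {
    for (const u of users) (out.access[u] = out.access[u] || []).push(ch);
  };
  try {
    fn(doc, oldDoc);
  } catch (e) {
    out.rejected = e; // a thrown exception rejects the change
  }
  return out;
}

const roomDoc = { channel: "room-7", roomID: "room-7", members: ["alice", "bob"] };
console.log(runSync(syncFn, roomDoc, { owner: "alice" }, "alice").rejected);   // null
console.log(runSync(syncFn, roomDoc, { owner: "alice" }, "mallory").rejected); // { forbidden: 'wrong user' }
```

Because `requireUser` runs first, a rejected update never reaches the routing or access calls, so no channels or grants are recorded for it.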
[Diagram: revision tree with rev 1, rev 2, and sibling revisions rev 3a and rev 3b, shown alongside the Authentication component]
Sync Gateway Components
[Diagram labels: Sync REST API, Authentication, App’s Sync Function, Revision/Conflict Management, Channel Change Tracking (TAP feed from server), Couchbase Smart Client]
Multi-Master Replication
Features
• Any number of clients
• Arbitrary topologies (from centralized to P2P)
• Occasionally-connected clients
• Conflict resolution
  • No data loss
  • Usually client-driven
  • Asynchronous
• Some delta optimizations
  • Unchanged attachments aren’t sent
  • Lots of room to optimize here (delta encoding)
Replication Types
• One-directional
  • “Push” to server
  • “Pull” from server
• One-shot or continuous
  • Continuous offers low-latency changes but locks up a server socket
  • Polling is a compromise
• Always client-initiated
  • Sync Gateway is passive
“Push” Replication
• Consult local db for revisions added since last checkpoint
• POST list of {doc ID, rev ID} tuples to _revs_diff
  • Response contains subset that are new to the server, plus latest rev IDs known to server
• PUT each new revision to server
  • Including revision history to incorporate into tree
  • and attachments added since server’s last known revision
• Save checkpoint with latest sequence processed
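The push steps above can be simulated in memory. In this sketch the "server" is just an object mapping doc IDs to known rev IDs, and `revsDiff` and `pushOnce` are hypothetical names. The `{docID: {missing: [...]}}` response shape follows the CouchDB `_revs_diff` convention; the "latest rev IDs known to server" part of the response and the attachment handling are left out for brevity.

```javascript
// Server side of _revs_diff: given {docID: [revIDs]}, report which
// revisions the server does not have yet.
function revsDiff(serverRevs, offered) {
  const result = {};
  for (const [docID, revIDs] of Object.entries(offered)) {
    const known = serverRevs[docID] || [];
    const unknown = revIDs.filter((r) => !known.includes(r));
    if (unknown.length > 0) result[docID] = { missing: unknown };
  }
  return result;
}

// Client side: one push pass over local changes made since the checkpoint.
function pushOnce(localChanges, serverRevs, checkpoint) {
  // 1. Consult the local db for revisions added since the last checkpoint.
  const pending = localChanges.filter((c) => c.seq > checkpoint);
  // 2. Offer {doc ID, rev ID} pairs to the server.
  const offered = {};
  for (const c of pending) (offered[c.docID] = offered[c.docID] || []).push(c.revID);
  const diff = revsDiff(serverRevs, offered);
  // 3. Send only the revisions the server reported as missing.
  let sent = 0;
  for (const [docID, entry] of Object.entries(diff)) {
    for (const revID of entry.missing) {
      (serverRevs[docID] = serverRevs[docID] || []).push(revID); // stands in for the upload
      sent++;
    }
  }
  // 4. Save the checkpoint with the latest sequence processed.
  const last = pending.length > 0 ? pending[pending.length - 1].seq : checkpoint;
  return { sent: sent, checkpoint: last };
}
```

A second call with the returned checkpoint finds nothing pending, which is what makes repeated one-shot pushes cheap.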
“Pull” Replication
• Read server’s “_changes” feed
  • Starting from just past last checkpoint sequence
  • List of {sequence, doc ID, leaf rev ID(s)} tuples
• Consult local db to find unknown revisions
• GET each revision
  • Tell server latest rev ID(s) I have, to prune unchanged attachments
  • Server includes rev history to incorporate into tree
  • Response usually MIME multipart/related
• Save checkpoint with latest sequence processed
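The pull steps can be sketched the same way. Here the changes feed is modelled as an array of `{seq, docID, revs}` entries and the GET is a lookup in a plain object; `changesSince`, `pullOnce`, and the `docID/revID` key format are assumptions made for this illustration, and attachments and revision history are omitted.

```javascript
// Models the server's _changes feed: entries after the given sequence.
function changesSince(serverLog, since) {
  return serverLog.filter((e) => e.seq > since);
}

// Client side: one pull pass starting just past the last checkpoint.
function pullOnce(serverLog, serverDocs, localRevs, checkpoint) {
  const fetched = [];
  let last = checkpoint;
  for (const entry of changesSince(serverLog, checkpoint)) {
    for (const revID of entry.revs) {
      const known = localRevs[entry.docID] || [];
      // Consult the local db: only fetch revisions we do not already have.
      if (!known.includes(revID)) {
        fetched.push(serverDocs[entry.docID + "/" + revID]); // stands in for GET
        localRevs[entry.docID] = known.concat([revID]);
      }
    }
    last = entry.seq;
  }
  // Save the checkpoint with the latest sequence processed.
  return { fetched: fetched, checkpoint: last };
}
```

Revisions the client already holds (for example ones it pushed itself) appear in the feed but are skipped, so only the checkpoint moves for them.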
Changes Feed
• Most difficult part of entire project
  • Linear history of all document changes, per channel
  • Must be efficient, scalable, reliable
• Source of truth: a view
  • But view queries are expensive
• Source of speed: Tap feed
  • Parse incoming document changes
  • Queue in sequence order
  • Cache by channel
  • Consult view for older changes
for (channel in doc.channels) {
  emit([channel, doc.sequence], doc.revID, doc.deleted);
}
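One way to picture the "cache by channel, consult view for older changes" idea is a bounded per-channel log that remembers how far back it is complete. This is a sketch under assumptions, not Sync Gateway's actual implementation: `ChannelCache`, `validFrom`, and the `queryView` callback are hypothetical names, and the cache is assumed to have seen every change since sequence 0.

```javascript
// Bounded per-channel cache of recent changes, fed in sequence order.
class ChannelCache {
  constructor(maxPerChannel) {
    this.max = maxPerChannel;
    this.logs = new Map(); // channel -> { validFrom, entries }
  }

  // Called for each change parsed from the feed.
  add(change) {
    for (const ch of change.channels) {
      if (!this.logs.has(ch)) this.logs.set(ch, { validFrom: 0, entries: [] });
      const log = this.logs.get(ch);
      log.entries.push({ seq: change.seq, docID: change.docID, revID: change.revID });
      // After evicting an entry, the cache is only complete past its sequence.
      if (log.entries.length > this.max) log.validFrom = log.entries.shift().seq;
    }
  }

  // Changes in `channel` after `since`. Falls back to the (expensive)
  // view query when the cache does not reach back far enough.
  changes(channel, since, queryView) {
    const log = this.logs.get(channel);
    if (log && since >= log.validFrom) {
      return log.entries.filter((e) => e.seq > since);
    }
    return queryView(channel, since);
  }
}
```

Clients that are nearly caught up are served from memory, and only clients with an old checkpoint pay for the view query.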
Demo Application
Sync Function
JSON Document Schema
Sync Function
• Task documents belong to lists
• List documents specify owners and members
• This app shares the list info with all members (easy to change)
• Profile documents are distributed to all users
• Owned by the user they describe
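The slides showing the demo app's actual sync function did not survive in this transcript, so the following is only a sketch of what rules like these could look like. The document schema (`type`, `list_id`, `owner`, `members`, `user_id`), the channel names, and the recording stubs for `requireUser`, `channel`, and `access` are all assumptions, not the demo app's real code.

```javascript
// Recording stubs for the helpers Sync Gateway provides to sync functions.
const effects = { channels: [], grants: [] };
let currentUser = "alice"; // hypothetical authenticated user

function requireUser(name) {
  if (name !== currentUser) throw { forbidden: "must be " + name };
}
function channel(ch) { effects.channels.push(ch); }
function access(users, ch) {
  [].concat(users || []).forEach((u) => effects.grants.push(u + ":" + ch));
}

function sync(doc, oldDoc) {
  if (doc.type === "task") {
    // Task documents belong to lists: route them to the list's channel.
    channel("list-" + doc.list_id);
  } else if (doc.type === "list") {
    // List documents specify owners and members; only the owner may edit.
    requireUser(oldDoc ? oldDoc.owner : doc.owner);
    const ch = "list-" + doc._id;
    channel(ch); // share the list info with all members
    access(doc.owner, ch);
    access(doc.members, ch);
  } else if (doc.type === "profile") {
    // Profiles are owned by the user they describe and go to all users.
    requireUser(doc.user_id);
    channel("profiles");
    access(doc.user_id, "profiles");
  } else {
    throw { forbidden: "unknown document type" };
  }
}

sync({ _id: "groceries", type: "list", owner: "alice", members: ["bob"] }, null);
console.log(effects.channels.join(", ")); // list-groceries
console.log(effects.grants.join(", "));   // alice:list-groceries, bob:list-groceries
```

This sketch does not check that the author of a task is a member of its list. A real sync function would need such a check, since write access to channels has to be enforced programmatically.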
Admin API
Port :4985
• Bind only to localhost
• Superset of the REST API on :4984
Read all changes
• Bypasses authentication / authorization
• Great for admin tasks
Admin UI
• Browse channels
• Simulate results of changing the sync function
http://localhost:4985/_admin/
Edit User Accounts
• Add admin_channels, change password, etc
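As an illustration of editing a user account through the admin port, the sketch below only builds a request description and does not send it. The `/{db}/_user/{name}` path and the `password` and `admin_channels` fields follow the Sync Gateway admin REST API as I understand it, so treat them as assumptions to check against the documentation for your version; `userUpdateRequest`, the `todos` database name, and the sample values are hypothetical.

```javascript
const ADMIN_BASE = "http://localhost:4985"; // admin port, bound to localhost

// Builds (but does not send) a request that creates or updates a user.
function userUpdateRequest(db, name, fields) {
  return {
    method: "PUT",
    url: ADMIN_BASE + "/" + encodeURIComponent(db) + "/_user/" + encodeURIComponent(name),
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(Object.assign({ name: name }, fields)),
  };
}

const req = userUpdateRequest("todos", "alice", {
  password: "letmein",
  admin_channels: ["profiles"],
});
console.log(req.method, req.url); // PUT http://localhost:4985/todos/_user/alice
```

Because the admin API bypasses authentication, a request like this should only ever be issued from the gateway host or a trusted app server.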
Document Model
Differences from Couchbase Server
• Metadata is inside the JSON
  • “_”-prefixed fields (“_id”, “_rev”, etc.)
• More-robust MVCC
  • Digest-based “_rev” property, not uint64 CAS
  • “_rev” identifies a revision globally across all replicas
  • Ties into revision tree (q.v.)
• Attachments
  • Arbitrary-size binary blobs
  • Tagged with name and MIME type
  • Metadata visible as “_attachments” property
Revision Trees
• Documents store revision trees (“hash histories”)
• Tree stores metadata
  • Revision ID (based on SHA-1 digest of contents)
  • Deletion status (“tombstone”)
  • JSON contents of old revs deleted during compaction
• “Pruning” eventually deletes oldest tree items
• Tree structure supports conflicts
  • Conflicts are not errors!
  • Resolution can be deferred until convenient, or never
  • There is always a single “default” or “winning” revision
Fitting This Into Couchbase Server
{
  "_sync": {
    "channels": { "short": null, "word": null },
    "history": {
      "bodies": [ "" ],
      "channels": [ ["short", "word"] ],
      "parents": [ -1 ],
      "revs": [ "1-86effb929acbf953905dd0e3974f6051" ]
    },
    "rev": "1-86effb929acbf953905dd0e3974f6051",
    "sequence": 1,
    "time_saved": "0001-01-01T00:00:00Z"
  },
  "word": "cat"
}
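The `history` object above stores the revision tree as parallel arrays: `revs[i]` is a revision ID and `parents[i]` is the index of its parent, with -1 for the root. A sketch of how the leaves and a "winning" revision could be derived from that shape follows. The tie-break used here (highest generation, then highest rev ID) is the CouchDB convention and is assumed rather than confirmed by the slides, and tombstones are ignored for brevity.

```javascript
// Leaves are revisions that no other revision names as its parent.
function leaves(history) {
  const hasChild = new Set(history.parents.filter((p) => p >= 0));
  return history.revs.filter((rev, i) => !hasChild.has(i));
}

// The generation is the number before the dash in a rev ID like "3-abc".
function generation(revID) {
  return parseInt(revID.split("-")[0], 10);
}

// Picks one deterministic winner among the leaves:
// highest generation first, then highest rev ID as a tie-break.
function winner(history) {
  return leaves(history).sort((a, b) => {
    const byGen = generation(b) - generation(a);
    if (byGen !== 0) return byGen;
    return a < b ? 1 : a > b ? -1 : 0;
  })[0];
}

// A tree with a conflict: rev 2 has two children.
const conflicted = {
  revs: ["1-aaa", "2-bbb", "3-ccc", "3-ddd"],
  parents: [-1, 0, 1, 1],
};
console.log(leaves(conflicted)); // [ '3-ccc', '3-ddd' ]
console.log(winner(conflicted)); // 3-ddd
```

Every replica that holds the same tree picks the same winner without coordination, which is why conflict resolution can be deferred.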
Document Types
• Application data documents
• “Local” documents
  • Used by client replicators to store checkpoints
• User accounts
• Roles
• Binary attachments
• Obsolete revisions
  • Removed when database is compacted
• A single sequence counter
Coexistence With Couchbase Apps
Sharing Isn’t Easy
• App servers reading from Gateway’s bucket
  • “What’s this ‘_sync’ crap in my data?”
  • “What are all these extra docs like ‘_sync:user:snej’?”
• App servers updating docs in the bucket is worse
  • Gateway: “Who moved my cheese?!”
  • App removing “_sync” property is disastrous
  • App preserving “_sync” property is still bad:
    • Rev tree isn’t updated
    • Sequence number isn’t bumped
    • But Gateway can’t tell anything’s wrong
Bucket Shadowing
• Give app and Gateway their own buckets
• Shadower task watches both buckets’ Tap feeds
  • Adds changes from app bucket to Gateway docs as new revisions
  • Copies current rev of Gateway doc to app bucket
• Asynchronous replication
  • Shadowing is best for adding sync to existing high-volume web apps
Read-only direct access
• Reads, writes and channel subscriptions via Gateway
• Map reduce queries directly to Couchbase Server