Building clipboard.com: Architecture, Practices, and Lessons

Post on 24-Feb-2016

Building clipboard.com

Architecture, Practices, and Lessons

Gary Flake, Founder
gary@clipboard.com

Outline

• Introduction
• Architecture & Practices
• Lessons
• Q&A

Introduction

Demo

Backstory

• Founded by me (Gary Flake)
• Took ~1.4M angel investment in April 2011
• 6+ full-time employees (almost all dev):
  • Mark Dawson, Greg Pascale, Ken Perkins, Tommy Montgomery, Steve Courtney
• Investors include AH, Index, FRC, SVA, FCO, Betaworks, Crunchfund, … individual angels
• 12+ month runway left
• Looking to hire one more engineer

Scenarios

Near term:
• Individual saving
• Micro-blogging
• Curating
• Collaboration
• Shared curating
• Link aggregation

Long term:
• Application and service backup
• Personal data visualization
• Web search
• Advertising
• Clip platform

Overlapping Clip Spaces

[Figure: three overlapping clip spaces (Home, Shared, Public), with regions labeled as follows]

• Your private stuff
• Your stuff, selectively shared
• Others’ stuff, selectively shared with you
• Your public stuff
• Other people’s public stuff
• Others’ public stuff, explicitly shared with you
• Your public stuff, explicitly shared

Why Clipboard?

• Fidelity and functionality preserved
• Heterogeneous objects
• Simple overlapping spaces
• Shareable in several ways:
  • 1→1: @mention, email, permalinks
  • 1→N: @mentions, Facebook
  • 1→∞: publish, Twitter, embed
• Tagging and search

Architecture & Practices

Architectural Goals

• Development Efficiency – Development speed and cost are critical for startups.
• Scalability – We want to support millions of users without rewriting our whole backend.
• Simplicity – Little, clear code. Few moving parts. Painless operations.

The combination helps with our other goals.

Architecture

[Figure: cluster diagram showing the following hosts]

• web-01, web-02, web-03 (Node.js + Nginx)
• cache-01, cache-02, cache-03
• redis-01, redis-02
• riak-01, riak-02, riak-03, riak-04, riak-05
• thumb-01, thumb-02
• job-01
• admin-01

Other Infrastructure Parts

• Rackspace API for spinning up/down VMs
• AWS for thumbnail storage and CDN
• A few 3rd party components:
  • Mixpanel and Google for analytics
  • SendGrid for email
  • Papertrail for log aggregation
  • Scout for monitoring

Client – Single Page App

• All clip views use the same HTML page
• Dependencies on jQuery and a few plugins
• No fancy frameworks (sort of MVVMC)
• Express, EJS, & Less on backend help
• Almost no server-side composition
• Backend code is essentially an API

Nginx

• Lost faith in Apache long ago
• Nginx is wicked fast
• Handles static content (obviously)
• Can act as a micro-cache for static and dynamic content (FTW!)

App Logic – Node.js

You’ve heard the arguments, but for us…

• We like JavaScript
• 1 dev can develop features end-to-end
• JavaScript + JSON ≈ Buttah!
• Easy to make stateless, so easy to scale out
• Well-suited for Riak

Redis

Lightning-fast in-memory key-ADT store:

• Atomic operations for mutations, so no locks and no write contention
• Excellent complement to Riak
• Uses: top lists, session tokens, notifications, batch queue, invite tokens, promises, mutex
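
The mutex use listed above can be sketched roughly as follows. This is a minimal illustration, not Clipboard's code: `FakeRedis`, `tryLock`, `unlock`, and the `lock:` key prefix are hypothetical names, and the class is an in-memory stand-in that models only the "set if absent, with an expiry" behaviour a Redis-backed mutex relies on.

```typescript
// Hypothetical in-memory stand-in for the single Redis behaviour used here:
// "set a key only if it is absent, with an expiry". Not a real Redis client.
class FakeRedis {
  private store = new Map<string, { value: string; expiresAt: number }>();

  // Returns true if the key was set, false if a live key already existed.
  setIfAbsent(key: string, value: string, ttlMs: number, nowMs: number): boolean {
    const held = this.store.get(key);
    if (held !== undefined && held.expiresAt > nowMs) {
      return false; // someone else holds a live lock
    }
    this.store.set(key, { value, expiresAt: nowMs + ttlMs });
    return true;
  }

  // Deletes the key only if it still holds the caller's value.
  deleteIfEquals(key: string, value: string): boolean {
    const held = this.store.get(key);
    if (held === undefined || held.value !== value) {
      return false; // never release a lock owned by someone else
    }
    this.store.delete(key);
    return true;
  }
}

// Try to take the lock for a resource; the TTL guards against crashed owners.
function tryLock(redis: FakeRedis, resource: string, owner: string, ttlMs: number, nowMs: number): boolean {
  return redis.setIfAbsent("lock:" + resource, owner, ttlMs, nowMs);
}

// Release the lock, but only if this owner still holds it.
function unlock(redis: FakeRedis, resource: string, owner: string): boolean {
  return redis.deleteIfEquals("lock:" + resource, owner);
}
```

Against a real Redis server the acquire step would map onto an atomic set-if-absent command with an expiry (for example `SET` with the `NX` and `PX` options), and the compare-and-delete would also need to run atomically on the server.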

Memcached

• Simple cache invalidation for K/V reads.
• We make no attempt to do proper cache invalidation on the search cache.
• Instead, we embrace eventual consistency as a way of life.
• Translation: objects have type-specific TTLs that range from seconds to a few minutes.
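
A rough sketch of what type-specific TTLs could look like. The object types, the TTL values, and the `TtlCache` class are all assumptions made for illustration; the slides only say that TTLs range from seconds to a few minutes. The cache here is a plain in-memory map rather than Memcached.

```typescript
type ObjectType = "clip" | "user" | "search";

// Hypothetical TTLs, picked only to span "seconds to a few minutes".
const TTL_MS: Record<ObjectType, number> = {
  search: 10 * 1000,
  clip: 60 * 1000,
  user: 3 * 60 * 1000,
};

class TtlCache {
  private entries = new Map<string, { value: string; expiresAt: number }>();

  // Store a value; its lifetime depends only on the object's type.
  set(type: ObjectType, key: string, value: string, nowMs: number): void {
    this.entries.set(type + ":" + key, { value, expiresAt: nowMs + TTL_MS[type] });
  }

  // Return the cached value, or undefined on a miss or an expired entry.
  get(type: ObjectType, key: string, nowMs: number): string | undefined {
    const id = type + ":" + key;
    const entry = this.entries.get(id);
    if (entry === undefined) return undefined;
    if (nowMs >= entry.expiresAt) {
      this.entries.delete(id); // stale: drop it and report a miss
      return undefined;
    }
    return entry.value;
  }
}
```

Nothing is ever invalidated explicitly: a stale search result simply ages out, which is the "eventual consistency as a way of life" trade-off described above.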

Operations

• Hosted on VMs at Rackspace
• Staging and test clusters identical to production. Dev on Vagrant.
• Puppet for managing configurations
• Build and deployment done with homegrown tools:
  • Devdo: handles stuff on dev box side
  • Manage: handles stuff on cloud side

Riak

An awesome noSQL data store:

• Super easy to scale up AND down
• Fault tolerant – no SPoF
• Flexible schema
• Full-text search out of the box
• Can be fixed and improved in Erlang (the Basho folks awesomely take our commits)

Riak – Basics

• Data in Riak is grouped into buckets (effectively namespaces)
• Basic operations are:
  • Get, save, delete, search, map, reduce
• Eventual consistency managed through N, R, and W bucket parameters.
• Everything we put in Riak is JSON
• We talk to Riak through the excellent riak-js node library by Francisco Treacy
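
The N, R, and W parameters can be reasoned about with simple arithmetic. The sketch below assumes the usual Dynamo-style reading (N replicas per object, R replicas that must answer a read, W replicas that must acknowledge a write); the function names are hypothetical and the values are examples, not Clipboard's settings.

```typescript
interface QuorumSettings {
  n: number; // replicas stored per object
  r: number; // replicas that must answer a read
  w: number; // replicas that must acknowledge a write
}

// True when every read set must share at least one replica with every
// write set, i.e. R + W > N.
function readsOverlapWrites(q: QuorumSettings): boolean {
  return q.r + q.w > q.n;
}

// How many replicas can be unavailable while a strict quorum is still met.
function tolerableFailures(q: QuorumSettings): { reads: number; writes: number } {
  return { reads: q.n - q.r, writes: q.n - q.w };
}

const balanced: QuorumSettings = { n: 3, r: 2, w: 2 }; // overlapping quorums
const fastest: QuorumSettings = { n: 3, r: 1, w: 1 };  // lowest latency, no overlap
```

Lower R and W trade overlap for latency and availability, which is one way to picture the eventual-consistency knob the slide mentions.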

Data Model – Clips

[Figure: a clip annotated with its fields: annotation, title, author, ctime, tags, domain, mentions]

Data Model – Clips

Clips are the gateway to all of our data.

[Figure: a Clip alongside its Blob (key: abc, <html> </html>) and its Comment Cache (key: abc, comments on clip ‘abc’: “F1rst”, “Nice clip yo!”, “Saw this on Reddit…”)]
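
One way to picture the clip data model in code. The field names come from the slides; the field types, the `key` property, and the idea that the blob and comment cache are stored under the clip's own key are assumptions read off the figure. The buckets are plain in-memory maps, not Riak.

```typescript
// Field names follow the slide; the types are guesses.
interface Clip {
  key: string;
  title: string;
  author: string;
  annotation: string;
  ctime: number;
  tags: string[];
  domain: string;
  mentions: string[];
}

// Hypothetical in-memory stand-ins for three buckets.
const clips = new Map<string, Clip>();
const blobs = new Map<string, string>();            // clipped HTML
const commentCaches = new Map<string, string[]>();  // comments per clip

// Everything hangs off the clip key: one key reaches all three buckets.
function loadClipView(key: string): { clip: Clip; blob: string; comments: string[] } | undefined {
  const clip = clips.get(key);
  const blob = blobs.get(key);
  if (clip === undefined || blob === undefined) return undefined;
  return { clip, blob, comments: commentCaches.get(key) || [] };
}

clips.set("abc", {
  key: "abc",
  title: "Example clip",
  author: "greg",
  annotation: "",
  ctime: 98685879630026,
  tags: ["riak"],
  domain: "example.com",
  mentions: ["gary"],
});
blobs.set("abc", "<html></html>");
commentCaches.set("abc", ["F1rst", "Nice clip yo!"]);
```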

Other Buckets

• Users
• Blobs
• Comments
• Templates
• Counts
• Search Caches
• Transactions

Riak Search

• Gets many things out of Riak by something other than the primary key.
• You specify a schema (the types for the fields within a JSON object).
• Works great but with one big gotcha:
  – Index uses term-based partitioning instead of document-based partitioning
  – Implication: joins + sort + pagination suck
  – We know how to work around this

Riak Search – Querying

• Query syntax based on Lucene
• Basic query:
    text:funny
• Compound query:
    login:greg OR (login:gary AND tags:riak)
• Range query:
    ctime:[98685879630026 TO 98686484430026]
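
The three query shapes above can be assembled from small string helpers. This is only a sketch: `term`, `and`, `or`, and `range` are hypothetical helper names, not part of riak-js or Riak Search, and no escaping of special characters is attempted.

```typescript
// field:value
function term(field: string, value: string): string {
  return field + ":" + value;
}

// (left AND right), parenthesised so it can nest inside an OR
function and(left: string, right: string): string {
  return "(" + left + " AND " + right + ")";
}

// left OR right
function or(left: string, right: string): string {
  return left + " OR " + right;
}

// field:[from TO to], an inclusive range
function range(field: string, from: number, to: number): string {
  return field + ":[" + from + " TO " + to + "]";
}

const basicQuery = term("text", "funny");
const compoundQuery = or(term("login", "greg"), and(term("login", "gary"), term("tags", "riak")));
const rangeQuery = range("ctime", 98685879630026, 98686484430026);
```

The three constants reproduce the query strings shown on the slide.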

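Clipboard App Flow

[Sequence diagram between Client, node.js, and Riak]

• Client → node.js: go to clipboard.com/home
• node.js → Riak: search clips bucket, query = login:greg
• Riak → node.js: top 20 results
• node.js → Client: top 20 results
• Client: start rendering
• For each clip:
  • Client → node.js: API request for blob
  • node.js → Riak: GET from blobs bucket
  • node.js → Client: return blob to client
  • Client: render blob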
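
The flow above, sketched in callback style. All names (`apiHome`, `apiBlob`, `loadHome`) and the sample data are hypothetical, and the two buckets are in-memory stand-ins whose callbacks fire synchronously; real network and Riak calls would complete asynchronously.

```typescript
type Callback<T> = (err: Error | null, result?: T) => void;

interface ClipHit { key: string; login: string; title: string; }

// Hypothetical in-memory stand-ins for the clips and blobs buckets.
const clipsBucket: ClipHit[] = [
  { key: "abc", login: "greg", title: "Funny cat" },
  { key: "def", login: "gary", title: "Riak notes" },
  { key: "ghi", login: "greg", title: "Node tips" },
];
const blobsBucket = new Map<string, string>([
  ["abc", "<html>cat</html>"],
  ["def", "<html>riak</html>"],
  ["ghi", "<html>node</html>"],
]);

// node.js side: search the clips bucket (stand-in for query = login:<login>)
// and return at most the top 20 hits.
function apiHome(login: string, cb: Callback<ClipHit[]>): void {
  cb(null, clipsBucket.filter((c) => c.login === login).slice(0, 20));
}

// node.js side: GET one blob from the blobs bucket.
function apiBlob(key: string, cb: Callback<string>): void {
  const blob = blobsBucket.get(key);
  if (blob === undefined) {
    cb(new Error("no blob for clip " + key));
  } else {
    cb(null, blob);
  }
}

// Client side: fetch the clip list, then request and render each blob.
function loadHome(login: string, render: (clip: ClipHit, blob: string) => void): void {
  apiHome(login, (err, hits) => {
    if (err !== null || hits === undefined) return;
    for (const clip of hits) {
      apiBlob(clip.key, (blobErr, blob) => {
        if (blobErr === null && blob !== undefined) render(clip, blob);
      });
    }
  });
}
```

The list request is cheap and returns first, so the page can start rendering while the heavier blobs arrive one request per clip.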

Lessons

Web development doesn’t suck

We are all indebted to Google / Chrome for making web development better and more rewarding.

• “Edit build test” is the new REPL
• Good debugging within the client
• Fast runtime makes new apps possible

Bet on modules, not frameworks

• jQuery plugins are great working examples of modules that you can take a la carte.
• Frameworks are trickier because they permeate your entire code base.
• You can pick the wrong module and recover, but recovery from choosing the wrong framework is much harder.
• My advice: just use good code hygiene.

Open source and SaaS are critical

• Open source is like Lego for developers
• Paid SaaS is great too – I’ll happily pay for services when:
  • They are better than what we could build,
  • They are not part of our core offering,
  • They free up a dev to do something that only we can do in house.

Browsers and jQuery have bugs

We spent a lot of time tracking down bugs in surprising places:

• Chrome Google Apps break bookmarklets
• Safari layout can be corrupted by reading computed CSS
• jQuery mishandles position:relative on body
• IE8 and IE9 – don’t even get me started

Node.js is ready for prime time

• This wasn’t the case a year ago.
• Callback style takes time to get used to.
• Common coding patterns are still ugly.
• The result is pretty phenomenal: a backend that is effectively non-blocking.
• It’s really great to work with the same JSON / JS objects on all 3 tiers.

Redis & Riak are yin & yang

Redis:
• Abstract data types
• In RAM, single node
• Fast and atomic operations
• Can easily handle write contention

Riak:
• Documents
• On disk, many nodes
• Slow and eventually consistent
• Have to think about write contention

Think in terms of write contention

• noSQL patterns will have you writing a lot of independent objects.
• Simple contention can be managed with a mutex, keeping code simple.
• Complex contention can be batched into a work queue.
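
A sketch of the work-queue idea for contended writes. The scenario (many clients bumping the same counter) and all names are hypothetical; the queue and the counts store are in-memory stand-ins. The point is that writers only append events, and a single worker folds a whole batch into one write per key.

```typescript
interface CountEvent { key: string; delta: number; }

const pendingEvents: CountEvent[] = [];            // stand-in for a batch queue
const storedCounts = new Map<string, number>();    // stand-in for a counts-style bucket
let storeWrites = 0;                               // how many writes hit the store

// Writers never touch the stored object; they only append an event.
function enqueue(event: CountEvent): void {
  pendingEvents.push(event);
}

// A single worker drains the queue and issues one write per distinct key.
// Returns the number of keys written in this batch.
function drain(): number {
  const batch = pendingEvents.splice(0, pendingEvents.length);
  const totals = new Map<string, number>();
  for (const event of batch) {
    totals.set(event.key, (totals.get(event.key) || 0) + event.delta);
  }
  totals.forEach((delta, key) => {
    storedCounts.set(key, (storedCounts.get(key) || 0) + delta);
    storeWrites += 1;
  });
  return totals.size;
}
```

Because only the worker writes, there is no concurrent read-modify-write on the stored object, at the cost of counts lagging until the next drain.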

Cache, cache, cache

• There is more than one way to cache.
• Don’t get too clever (embrace noSQL and don’t worry about cache invalidation).
• Cache in multiple places and on multiple time scales.

Balance agility with process

• Devs do testing and deploying
• Code reviews are author-initiated
• Many small features are done on branches off of master. (No “dev” branch.)
• Bug fixes are done right on master.

Recap

• We don’t have big data… yet. But we think we can handle it.

• Our stack, architecture, and practices allow us to move fast while also designing for scalability.

• It’s also a really fun stack to work on.

We’re hiring!

www.clipboard.com/jobs

Or talk to us right now!

Thanks!

Questions?
