Transcript
Page 1: Realtime Data Visualization

REAL-TIME DATA VISUALIZATION

@phil_renaud

Page 2: Realtime Data Visualization

1. RECORDING DATA

Page 3: Realtime Data Visualization

Technology has allowed us to become pretty good at recording certain types of data.

Last.FM remembers what songs I listen to

In my case, 138,000+ tracks since 2005

I used to listen to an embarrassing amount of Coldplay

Still recording, years after I last interacted with their service.

GMail: shift from deletion to archival

Use tools like Google Takeout to export data, view in maps, etc.

Page 4: Realtime Data Visualization

But these are often passive modes of data reporting. Low-touch, but limited in scope.

What about actively-recorded data?

Page 5: Realtime Data Visualization

Apps like Reporter let us track certain metrics

Loosely dynamic: You set the questions, and it picks random intervals during the day to poll you.

Daily Weight; Amount of Coffee Consumed; Current Location, Time Arrived at Work, etc.

Data is aggregated to show trends over time

Page 6: Realtime Data Visualization

Yes, this is actively-recorded data.

However, you need to know the questions long before the answers become relevant.

What about real-time data?

Page 7: Realtime Data Visualization

Social networks like Twitter let us communicate quickly.

Users talk about the subject of their choosing, unprovoked.

Lots of noise for very little signal

Unless you want to know about what your friends are eating for breakfast. Then, lots of signal.

With the right tools, extraction of data is very rewarding

Shameless plug: Come work with me at Affin.io if you dig this sort of thing.

Page 8: Realtime Data Visualization

Yep! Twitter records data in real-time, generally actively.

But getting a clear picture of trends on a given subject is a hard problem.

Unprovoked nature of tweeting means that your control group is elastic. Tricky to get a response from a set audience.

Page 9: Realtime Data Visualization

The kind of data we're looking for is

Real-Time: Answers should be quickly available.

Active: The user providing the data decides whether or not they want to participate.

Directed: We should be able to prompt the user for an answer to a specific question, rather than loose thoughts.

Impartial: The data should not have an inherent bias (ex: don't ask me my daily weight right after I've had dinner)

Cohesive: Once aggregated, should show meaning / trends

Adaptable: The questions we ask should be able to change.

L ...couldn't think of an L

Page 10: Realtime Data Visualization

WHAT I'M DESCRIBING IS BASICALLY AUDIENCE POLLING.

Page 11: Realtime Data Visualization

But, live polling uses devices that look like this:

Page 12: Realtime Data Visualization

And that's a bummer.

Page 13: Realtime Data Visualization

SO LET'S BUILD A BETTER ONE.

Page 14: Realtime Data Visualization

2. WHEREIN T-SHIRTS ARE GIVEN AWAY

Page 15: Realtime Data Visualization

There's this company called Twilio

They provide an API layer to Telephony services

Example: Record any calls to a given number and use speech-to-text to make it searchable later

Example 2: all SMS messages to a given number can be recorded, parsed, and passed as a stream of JSON data.

Page 16: Realtime Data Visualization

Carter Rabasa is a developer evangelist at Twilio

He sent us a bunch of t-shirts and stickers to give away

Make sure you say thanks! / @CarterRabasa @Twilio

Page 17: Realtime Data Visualization

Here's how to get a free shirt:

Text your Name, Age, Gender and Shirt Size to 902-707-1128

Format: comma separated: "Phil Renaud, 30, Male, L"

Shirt Sizes: S, M, L, XL

Free shirts until we run out of your size!

Shirt Requests:

Average Age:
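The talk doesn't show how the shirt texts are parsed, so here is a minimal sketch of one way to turn the comma-separated format above into records and an average age. The function names are hypothetical, and "Sam Doe" and the malformed message are made-up sample input.

```javascript
// Hypothetical helpers; the slides do not include the real parsing code.
var SHIRT_SIZES = ['S', 'M', 'L', 'XL'];

// Parse one SMS body such as "Phil Renaud, 30, Male, L".
// Returns null when the message does not match the expected format.
function parseShirtRequest(body) {
  var parts = String(body).split(',').map(function (part) {
    return part.trim();
  });
  if (parts.length !== 4) { return null; }

  var age = parseInt(parts[1], 10);
  var size = parts[3].toUpperCase();
  if (parts[0] === '' || isNaN(age) || SHIRT_SIZES.indexOf(size) === -1) {
    return null;
  }
  return { name: parts[0], age: age, gender: parts[2].toLowerCase(), size: size };
}

// Mean age over the valid requests, or null when there are none.
function averageAge(requests) {
  if (requests.length === 0) { return null; }
  var total = requests.reduce(function (sum, r) { return sum + r.age; }, 0);
  return total / requests.length;
}

// Made-up sample messages: two valid, one that gets rejected.
var requests = ['Phil Renaud, 30, Male, L', 'Sam Doe, 24, female, xl', 'free shirt pls']
  .map(parseShirtRequest)
  .filter(function (r) { return r !== null; });
```

Rejecting anything that doesn't fit the format keeps one stray text from skewing the "Average Age" counter.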

Page 18: Realtime Data Visualization

3. A BRIEF HISTORY OF ASKING QUESTIONS.

Page 19: Realtime Data Visualization

In 1890, a weekly magazine called Literary Digest was founded. It primarily covered analysis of news and current opinion pieces.

Over the years, circulation increased. By 1922, the magazine had absorbed weekly magazines Public Opinion and Current Opinion.

Starting in 1916, Literary Digest conducted straw polls to determine the outcome of the United States presidential election.

In 1916, 1920, 1924, 1928, and 1932, they correctly predicted the outcome of the presidential race

Page 20: Realtime Data Visualization

In the 1936 presidential election, the incumbent Franklin D. Roosevelt was challenged by Governor Alfred Landon, of Kansas.

Literary Digest committed to creating the largest American presidential poll ever conducted.

They sent out approx. 10 million questionnaires in the form of postcards - representing about 1 in 4 potential voters - of which 2.3 million were returned.

The poll indicated that Landon would win in a landslide, with 57.1% of the popular vote, and taking the electoral college 370 to 161

Original Literary Digest Article: historymatters.gmu.edu/d/5168

Page 21: Realtime Data Visualization

As it turned out...

Roosevelt won 46 of the 48 states

...taking 60.8% of the popular vote

...and taking the electoral college 523 to 8; the largest ever difference in a presidential election.

Two years later, Literary Digest shut down forever.

Page 22: Realtime Data Visualization

When Literary Digest sent out their 10 million questionnaires, they used their subscription base, phone books, and club registries in order to find potential voters.

In doing so, they had an inherent bias working against them: club members, magazine subscribers, and owners of telephones in 1936 counted themselves on the wealthier side of Americans. Americans who notably voted differently from the average.

The same year, George Gallup predicted both the correct outcome of the election, and why the Literary Digest poll would be wrong, based on a relatively small sample of 50,000 Americans.

Gallup's methods in 1936 ushered in the modern era of scientific opinion research.
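A toy simulation can make the sampling point concrete. Everything in it is an illustrative assumption, not historical data: a population of 10,000 voters, a quarter of them "wealthy", with invented vote splits that put overall Roosevelt support near 61%. It compares a large sample drawn only from wealthy voters with a small sample drawn from everyone using a simple seeded generator. This is only a rough stand-in for Gallup's method, which the slides don't describe.

```javascript
// Build `count` voters, the first `rooseveltVotes` of whom back Roosevelt.
function makeVoters(count, wealthy, rooseveltVotes) {
  var voters = [];
  for (var i = 0; i < count; i++) {
    voters.push({ wealthy: wealthy, roosevelt: i < rooseveltVotes });
  }
  return voters;
}

// Invented numbers: wealthy voters lean Landon, everyone else leans Roosevelt.
var population = makeVoters(2500, true, 1000).concat(makeVoters(7500, false, 5100));

// Percentage of a sample that supports Roosevelt.
function rooseveltShare(sample) {
  var votes = sample.filter(function (v) { return v.roosevelt; }).length;
  return votes / sample.length * 100;
}

// Draw `size` voters at random (with replacement) using a small seeded
// linear congruential generator, so the run is repeatable.
function randomSample(voters, size, seed) {
  var state = seed;
  var sample = [];
  for (var i = 0; i < size; i++) {
    state = (1664525 * state + 1013904223) % 4294967296;
    sample.push(voters[Math.floor(state / 4294967296 * voters.length)]);
  }
  return sample;
}

// Big but biased: every wealthy voter, nobody else.
var biasedShare = rooseveltShare(population.filter(function (v) { return v.wealthy; }));

// Small but drawn from the whole population.
var randomShare = rooseveltShare(randomSample(population, 500, 1936));

var trueShare = rooseveltShare(population);
```

The biased sample is five times larger, yet it is the only one of the two that points to the wrong winner.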

Page 23: Realtime Data Visualization

THE POINT IS, IT'S REALLY EASY TO SCREW THIS UP.

Asking questions isn't all that hard. Asking them in the right way, and drawing meaningful conclusions: that's difficult.

Accidental bias is still a very real concern in the sort of polling we want to be doing:

Not everybody has mobile phones

Not everyone uses twitter

Not everyone is in the same time zone; not everyone is awake at the time a "real-time" poll might be conducted

Being aware of these limitations is the first step to overcoming them.

Page 24: Realtime Data Visualization

4. "IF YOU CAN'T EXPLAIN IT SIMPLY, YOU DON’T UNDERSTAND IT WELL ENOUGH"

A. Einstein

Page 25: Realtime Data Visualization

WITH HOW GOOD WE ARE AT RECORDING DATA, WE SURE END UP WITH A LOT OF IT.

On an average day, Twitter generates about 500 million tweets. That's about 15 times the number of books in the Library of Congress

These mountains of data add up. We record much more data than we can possibly analyze 1:1 in a lifetime.

Page 26: Realtime Data Visualization

EVEN WHEN WE AGGREGATE DATA, IT'S NOT ALWAYS CLEAR WHAT'S GOING ON.

Page 27: Realtime Data Visualization

And that's sort of a bummer. See...

DATA IS ONLY VALUABLE WHEN IT'S UNDERSTOOD

Page 29: Realtime Data Visualization

STATISTICAL DETECTION OF ELECTION FRAUD

source

Page 30: Realtime Data Visualization

VISUALIZATION LENDS MEANING TO RAW DATA.

Page 31: Realtime Data Visualization

.ENTER() D3

Page 32: Realtime Data Visualization

D3.js is a javascript library for manipulating documents based on data.

Has methods for binding data to DOM elements (often, but not necessarily, SVG shapes)

Allows for reflection of data state change

Modern methods for data loading (from JSON, from .csv, .tsv)

Good introduction to the library: bost.ocks.org/mike/bar
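As a rough illustration of "binding data to DOM elements", here is a minimal sketch of a D3 data join that draws div-based bars. It assumes D3 v5.8 or later (for selection.join) is loaded in the page and passed in as the first argument. The function name renderBars and the '.chart' container are hypothetical and do not come from the talk.

```javascript
// Assumes D3 v5.8+; selection.join does not exist in older versions.
// `d3` is passed in so the function has no hidden global dependency.
function renderBars(d3, selector, values) {
  // Map data values onto a 0-100 percentage range.
  var scale = d3.scaleLinear()
    .domain([0, d3.max(values)])
    .range([0, 100]);

  d3.select(selector)
    .selectAll('div.bar')
    .data(values)        // bind one datum per bar
    .join('div')         // add missing bars, remove surplus ones
    .attr('class', 'bar')
    .style('width', function (d) { return scale(d) + '%'; })
    .text(function (d) { return d; });
}

// In a page with D3 loaded and a (hypothetical) <div class="chart"></div>:
// renderBars(d3, '.chart', [80, 76, 11, 9, 35]);
```

Calling renderBars again with a new array reuses the existing bars and only updates their width and label, which is the "reflection of data state change" mentioned above.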

Page 33: Realtime Data Visualization

We'll be using D3 to help with some of the data visualization we do today, but basic visualizations like bar charts are pretty straightforward to create with just HTML, CSS and Javascript.

<div class="bar-chart"></div>
<a class="newArray">New Values in my Array, plz</a>

.bar {
  background: #ccc;
  color: #444;
  margin-bottom: 2px;
  padding: 5px;
  box-sizing: border-box;
  transition: .5s;
}

Page 34: Realtime Data Visualization

// Draw once on load, then redraw whenever the link is clicked
barChartDataHandler();
$(document).on('click', '.newArray', function () {
  barChartDataHandler();
});

function barChartDataHandler() {
  var arr = [];
  for (var i = 0; i < 10; i++) {
    arr.push(Math.round(Math.random() * 100));
  }
  // example: [80, 76, 11, 9, 35, 96, 54, 48, 34, 51]

  var maxValue = _.max(arr);
  var $chart = $('.bar-chart');
  if ($chart.is(':empty')) {
    // First run: create one empty bar per value
    $.each(arr, function () {
      $chart.append('<div class="bar"></div>');
    });
  }
  // Every run: size and label each bar relative to the largest value
  $.each(arr, function (i, value) {
    $('.bar').eq(i).css({ width: value / maxValue * 100 + '%' }).text(value);
  });
}

See it in action at realtime.affin.io/bars.html

Page 35: Realtime Data Visualization

No magic happening; just bar-width being a relative percentage of max-width.

Underscore.js is your friend here - underscorejs.org - toolbelt library for figuring out the _.max() value, mapping and reducing on the fly, etc.

But D3 affords us several things we can do with SVG shapes that aren't easily reproduced with plain HTML.

Let's leverage a charting library that sits on top of D3 called C3.js - find it at c3js.org
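The "relative percentage of max-width" idea from the bar chart can be isolated into a few lines of plain JavaScript, with no jQuery or Underscore. This is a sketch with a hypothetical function name, not code from the talk.

```javascript
// Convert raw values into CSS width strings relative to the largest value.
function barWidths(values) {
  var maxValue = Math.max.apply(null, values);
  return values.map(function (value) {
    // Guard against dividing by zero when every value is 0.
    var percent = maxValue > 0 ? value / maxValue * 100 : 0;
    return percent + '%';
  });
}

var widths = barWidths([50, 100, 25]);
// widths is ['50%', '100%', '25%']
```

The largest value always fills the container, so the chart rescales itself whenever the data changes.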

Page 36: Realtime Data Visualization

D3 GAUGE CHART

No CSS this time - we'll let D3 handle any styling needed

<a class="approve button">Love it</a>
<a class="disapprove button">Hate it</a>
<div id="gauge"></div>

Page 37: Realtime Data Visualization

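D3 GAUGE CHART

Call loadGauge():

var gaugeArr = []; // one '+' or '-' per vote
var gaugeChart;

// Percentage of '+' votes; 0 until the first vote arrives
function approvalPercent() {
  if (gaugeArr.length === 0) { return 0; }
  return (_.countBy(gaugeArr)['+'] || 0) / gaugeArr.length * 100;
}

function loadGauge() {
  gaugeChart = c3.generate({
    bindto: '#gauge', // C3 binds to '#chart' unless told otherwise
    data: {
      columns: [['Approve', approvalPercent()]],
      type: 'gauge'
    }
  }); // chart

  $(document).on('click', '.approve', function () {
    gaugeArr.push('+');
    gaugeHandler();
  });
  $(document).on('click', '.disapprove', function () {
    gaugeArr.push('-');
    gaugeHandler();
  });
} // loadGauge

// gaugeHandler() is not shown on the slide; this is an assumed version
// that pushes the recomputed percentage into the chart.
function gaugeHandler() {
  gaugeChart.load({ columns: [['Approve', approvalPercent()]] });
}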

See it in action at realtime.affin.io/gauge.html

Page 38: Realtime Data Visualization

So now we have the basic mechanism of binary voting!

Very little code outside of a template here - C3 and D3 do the heavy lifting

Underscore function to _.countBy and figure out the relative percentage of "Approve" votes.

Page 39: Realtime Data Visualization

5. DATA OVER THE WIRE

Page 40: Realtime Data Visualization

I've set up a project to handle the combination of RADICAL data/polling and Data Visualization. Here's a brief overview of my stack:

Twilio SMS account that will receive text messages and pass them along to...

Local server running Python/Flask, to handle stream from Twilio and feed it as an API for...

Javascript running in the browser, using a streaming API request service (Oboe.js) to poll data and pass it into...

D3.js, which renders it as a gauge.
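The browser end of that stack boils down to turning incoming text messages into a running approval percentage. The sketch below shows one way to do that. The message shape ({ body: ... }) and the function names are assumptions, since the slides don't show the JSON the Flask server emits. Treating en and em dashes as "-" is also an added assumption, since some phone keyboards substitute them.

```javascript
// Map an SMS body to '+', '-', or null (ignored).
function normalizeVote(body) {
  var text = String(body).trim();
  if (text === '+') { return '+'; }
  // Accept hyphen-minus plus en dash (U+2013) and em dash (U+2014).
  if (text === '-' || text === '\u2013' || text === '\u2014') { return '-'; }
  return null;
}

// Keeps the votes received so far and reports the share of '+' votes.
function createTally() {
  var votes = [];
  return {
    add: function (message) {
      var vote = normalizeVote(message.body);
      if (vote !== null) { votes.push(vote); }
    },
    approvalPercent: function () {
      if (votes.length === 0) { return 0; }
      var approvals = votes.filter(function (v) { return v === '+'; }).length;
      return approvals / votes.length * 100;
    }
  };
}

// Made-up messages: three approvals, one disapproval, one ignored.
var tally = createTally();
[{ body: '+' }, { body: ' + ' }, { body: '\u2014' }, { body: 'pizza!' }, { body: '+' }]
  .forEach(tally.add);
// tally.approvalPercent() is now 75
```

In the real page, each message that arrives from the streaming request would go through tally.add, and the resulting percentage would be handed to the gauge.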

Page 41: Realtime Data Visualization

Over the next few slides, I'm going to put a word on the screen and a phone number.

Text either + or - to that number to indicate your Approval or Disapproval of the subject at hand.

Page 42: Realtime Data Visualization

READY?

Page 43: Realtime Data Visualization

PIZZA

TEXT + OR - TO 902-707-1128

Gauge (0 to 100): 0.0%

UNDER 30: 0% 30+: 0%

Page 44: Realtime Data Visualization

MONTREAL CANADIENS

TEXT + OR - TO 902-707-1128

Gauge (0 to 100): 0.0%

UNDER 30: 0% 30+: 0%

Page 45: Realtime Data Visualization

BRUSSELS SPROUTS

TEXT + OR - TO 902-707-1128

Gauge (0 to 100): 0.0%

UNDER 30: 0% 30+: 0%

Page 46: Realtime Data Visualization

STAR TREK

TEXT + OR - TO 902-707-1128

Gauge (0 to 100): 0.0%

UNDER 30: 0% 30+: 0%

Page 47: Realtime Data Visualization

WORLD CUP SOCCER

TEXT + OR - TO 902-707-1128

Gauge (0 to 100): 0.0%

UNDER 30: 0% 30+: 0%

Page 48: Realtime Data Visualization

US FOREIGN POLICY

TEXT + OR - TO 902-707-1128

Gauge (0 to 100): 0.0%

UNDER 30: 0% 30+: 0%

Page 49: Realtime Data Visualization

JAR JAR BINKS

TEXT + OR - TO 902-707-1128

Gauge (0 to 100): 0.0%

UNDER 30: 0% 30+: 0%
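The UNDER 30 / 30+ split on the poll slides needs each vote to be matched to an age. One plausible way, and this is an assumption since the slides don't say how it was done, is to look up the sender's number in the ages collected from the earlier shirt-request texts. The record shapes, function name and phone numbers below are hypothetical.

```javascript
// votes: list of { from, vote } records, vote being '+' or '-'.
// agesByPhone: lookup from sender number to age.
function approvalByAgeGroup(votes, agesByPhone) {
  var groups = {
    under30: { approve: 0, total: 0 },
    thirtyPlus: { approve: 0, total: 0 }
  };

  votes.forEach(function (v) {
    var age = agesByPhone[v.from];
    if (typeof age !== 'number') { return; } // sender never told us their age
    var group = age < 30 ? groups.under30 : groups.thirtyPlus;
    group.total += 1;
    if (v.vote === '+') { group.approve += 1; }
  });

  // Approval percentage for one group; 0 when the group has no votes.
  function percent(g) {
    return g.total === 0 ? 0 : g.approve / g.total * 100;
  }
  return { under30: percent(groups.under30), thirtyPlus: percent(groups.thirtyPlus) };
}

// Made-up data: the last voter has no recorded age and is left out.
var ages = { '555-0001': 24, '555-0002': 35, '555-0003': 41 };
var result = approvalByAgeGroup([
  { from: '555-0001', vote: '+' },
  { from: '555-0002', vote: '-' },
  { from: '555-0003', vote: '+' },
  { from: '555-0009', vote: '+' }
], ages);
// result is { under30: 100, thirtyPlus: 50 }
```

Votes from numbers with no recorded age still count toward the main gauge; they just can't be placed in either group.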

Page 50: Realtime Data Visualization

SO THAT'S HOW I DO REAL-TIME DATA VISUALIZATION.

Page 51: Realtime Data Visualization

6. CLOSING REMARKS

Page 52: Realtime Data Visualization

DIVERSITY OF DEVELOPERS IN HALIFAX

When I polled for T-shirt sizes earlier, I asked for gender to be noted as "male" or "female" when you sent your SMS

Here's how that broke down:

Male:

Female:

This is not a problem exclusive to Halifax, nor is it a problem exclusive to Tech, but it is a problem we can do something about.

Next time you have the chance to attend a tech meetup, please encourage a female colleague or friend to attend with you! A lack of diversity only holds us back as a community.

Page 53: Realtime Data Visualization

OFFICE HOURS

If you're working on something cool and you need advice or just want to wax about it, I want to help!

I'm at Humani-T Café on South Park Street most evenings. Stop in and say hi anytime!

The Affinio offices are in Bedford - there's someone there more or less all day and we love drop-ins.

Finally, find me on twitter: @phil_renaud

Page 54: Realtime Data Visualization

THANK YOU

Find me at:

@phil_renaud | [email protected] | affin.io

Slides up at realtime.affin.io/slides

