Realtime Data Visualization


Post on 11-Jul-2015






<ul><li><p>REAL-TIME DATA VISUALIZATION</p><p>@phil_renaud</p></li>
<li><p>1. RECORDING DATA</p></li>
<li><p>Technology has allowed us to become pretty good at recording certain types of data.</p><p>Last.FM remembers what songs I listen to. In my case, 138,000+ tracks since 2005. I used to listen to an embarrassing amount of Coldplay. Still recording, years after I last interacted with their service.</p><p>GMail: shift from deletion to archival. Use tools like Google Takeout to export data, view in maps, etc.</p></li>
<li><p>But these are often passive modes of data reporting. Low-touch, but limited in scope.</p><p>What about actively-recorded data?</p></li>
<li><p>Apps like Reporter let us track certain metrics.</p><p>Loosely dynamic: you set the questions, and it picks random intervals during the day to poll you. Daily Weight; Amount of Coffee Consumed; Current Location; Time Arrived at Work; etc. Data is aggregated to show trends over time.</p></li>
<li><p>Yes, this is actively-recorded data. However, you need to know the questions long before the answers become relevant.</p><p>What about real-time data?</p></li>
<li><p>Social networks like Twitter let us communicate quickly.</p><p>Users talk about the subject of their choosing, unprovoked. Lots of noise for very little signal.</p><p>With the right tools, extraction of data is very rewarding. Shameless plug: Come work with me at</p><p> if you dig this sort of thing.</p><p>Unless you want to know about what your friends are eating for breakfast. Then, lots of signal.</p></li>
<li><p>Yep! Twitter records data in real time, generally actively. But getting a clear picture of trends on a given subject is a hard problem. The unprovoked nature of tweeting means that your control group is elastic.
Tricky to get a response from a set audience.</p></li>
<li><p>The kind of data we're looking for is:</p><p>Real-Time: Answers should be quickly available.</p><p>Active: The user providing the data decides whether or not they want to participate.</p><p>Directed: We should be able to prompt the user for an answer to a specific question, rather than loose thoughts.</p><p>Impartial: The data should not have an inherent bias (ex: don't ask me my daily weight right after I've had dinner).</p><p>Cohesive: Once aggregated, the data should show meaning / trends.</p><p>Adaptable: The questions we ask should be able to change.</p><p>L: ...couldn't think of an L</p></li>
<li><p>WHAT I'M DESCRIBING IS BASICALLY AUDIENCE POLLING.</p></li>
<li><p>But, live polling uses devices that look like this:</p></li>
<li><p>And that's a bummer.</p></li>
<li><p>SO LET'S BUILD A BETTER ONE.</p></li>
<li><p>2. WHEREIN T-SHIRTS ARE GIVEN AWAY</p></li>
<li><p>There's this company called Twilio. They provide an API layer to telephony services.</p><p>Example: Record any calls to a given number and use speech-to-text to make them searchable later.</p><p>Example 2: All SMS messages to a given number can be recorded, parsed, and passed as a stream of JSON data.</p></li>
<li><p>Carter Rabasa is a developer evangelist at Twilio. He sent us a bunch of t-shirts and stickers to give away. Make sure you say thanks! / @CarterRabasa @Twilio</p></li>
<li><p>Here's how to get a free shirt:</p><p>Text your Name, Age, Gender and Shirt Size to 902-707-1128.</p><p>Format: comma separated: "Phil Renaud, 30, Male, L"</p><p>Shirt Sizes: S, M, L, XL</p><p>Free shirts until we run out of your size!</p><p>Shirt Requests:</p><p>Average Age:</p></li>
<li><p>3. A BRIEF HISTORY OF ASKING QUESTIONS.</p></li>
<li><p>In 1890, a weekly magazine called Literary Digest was founded.
It primarily covered analysis of news and current opinion pieces. Over the years, circulation increased. By 1922, the magazine had absorbed weekly magazines Public Opinion and Current Opinion. Starting in 1916, Literary Digest conducted straw polls to predict the outcome of the United States presidential election.</p><p>In 1916, 1920, 1924, 1928, and 1932, they correctly predicted the outcome of the presidential race.</p></li>
<li><p>In the 1936 presidential election, the incumbent Franklin D. Roosevelt was challenged by Governor Alfred Landon, of Kansas. Literary Digest committed to creating the largest American presidential poll ever conducted. They sent out approx. 10 million questionnaires in the form of postcards - representing about 1 in 4 potential voters - of which 2.3 million were returned. The poll indicated that Landon would win in a landslide, with 57.1% of the popular vote, and taking the electoral college 370 to 161.</p><p>Original Literary Digest</p></li>
<li><p>As it turned out... Roosevelt won 46 of the 48 states... taking 60.8% of the popular vote... and taking the electoral college 523 to 8; the largest-ever difference in a presidential election. Two years later, Literary Digest shut down forever.</p></li>
<li><p>When Literary Digest sent out their 10 million questionnaires, they used their subscription base, phone books, and club registries in order to find potential voters. In doing so, they had an inherent bias working against them: club members, magazine subscribers, and owners of telephones in 1936 counted themselves on the wealthier side of Americans, who notably voted differently from the average.</p><p>The same year, George Gallup predicted both the correct outcome of the election, and why the Literary Digest poll would be wrong, based on a relatively small sample of 50,000 Americans. Gallup's methods in 1936 ushered in the modern era of scientific opinion research.</p></li>
<li><p>THE POINT IS, IT'S REALLY EASY TO SCREW THIS UP.</p><p>Asking questions isn't all that hard.
Asking them in the right way, and drawing meaningful conclusions: that's difficult.</p><p>Accidental bias is still a very real concern in the sort of polling we want to be doing:</p><p>Not everybody has a mobile phone.</p><p>Not everyone uses Twitter.</p><p>Not everyone is in the same time zone; not everyone is awake at the time a "real-time" poll might be conducted.</p><p>Being aware of these limitations is the first step to overcoming them.</p></li>
<li><p>4. "IF YOU CAN'T EXPLAIN IT SIMPLY, YOU DON'T UNDERSTAND IT WELL ENOUGH"</p><p>A. Einstein</p></li>
<li><p>WITH HOW GOOD WE ARE AT RECORDING DATA, WE SURE END UP WITH A LOT OF IT.</p><p>On an average day, Twitter generates about 500 million tweets. That's about 15 times the number of books in the Library of Congress. These mountains of data add up. We record much more data than we can possibly analyze 1:1 in a lifetime.</p></li>
<li><p>EVEN WHEN WE AGGREGATE DATA, IT'S NOT ALWAYS CLEAR WHAT'S GOING ON.</p></li>
<li><p>And that's sort of a bummer. See...</p><p>DATA IS ONLY VALUABLE WHEN IT'S UNDERSTOOD</p></li>
<li><p>STATISTICAL DETECTION OF ...
?</p><p>source</p></li>
<li><p>STATISTICAL DETECTION OF ELECTION FRAUD</p><p>source</p></li>
<li><p>VISUALIZATION LENDS MEANING TO RAW DATA.</p></li>
<li><p>.ENTER() D3</p></li>
<li><p>D3.js is a JavaScript library for manipulating documents based on data.</p><p>Has methods for binding data to DOM elements (often, but not necessarily, SVG shapes).</p><p>Allows for reflection of data state changes.</p><p>Modern methods for data loading (from JSON, from .csv, .tsv).</p><p>Good introduction to the library:</p></li>
<li><p>We'll be using D3 to help with some of the data visualization we do today, but basic visualizations like bar charts are pretty straightforward to create with just HTML, CSS and JavaScript.</p><p>New Values in my Array, plz</p><p>.bar { background: #ccc; color: #444; margin-bottom: 2px; padding: 5px; box-sizing: border-box; transition: .5s; }</p></li>
<li><p>barChartDataHandler();
$(document).on('click', '.newArray', function(){
  barChartDataHandler();
});</p><p>function barChartDataHandler(){
  // Ten random values between 0 and 100
  var arr = [];
  for (var i = 0; i &lt; 10; i++) {
    arr.push(Math.round(Math.random() * 100));
  }
  // example: [80, 76, 11, 9, 35, 96, 54, 48, 34, 51]
  var maxValue = _.max(arr);
  if ($('.bar-chart').is(':empty')) {
    // First run: create one bar per value.
    // The appended markup was lost from the slide; a div.bar is assumed because the next line selects '.bar'.
    $.each(arr, function(i){
      $('.bar-chart').append('&lt;div class="bar"&gt;&lt;/div&gt;');
      $('.bar').eq(i).css({'width': arr[i] / maxValue * 100 + '%'}).text(arr[i]);
    });
  } else {
    // Chart already populated: resize and relabel the existing bars
    $.each(arr, function(i){
      $('.bar').eq(i).css({'width': arr[i] / maxValue * 100 + '%'}).text(arr[i]);
    });
  }
}</p><p>See it in action at</p></li>
<li><p>No magic happening; just bar-width being a relative percentage of max-width.</p><p>Underscore.js is your friend here - a toolbelt library for figuring out the _.max() value, mapping and reducing on the fly, etc.</p><p>But D3 affords us several things we can do with SVG shapes that aren't easily reproduced with plain HTML.</p><p>Let's leverage a charting library that sits on top of D3 called C3.js - find it at </p></li>
<li><p>D3 GAUGE CHART</p><p>No CSS this time - we'll
let D3 handle any styling needed</p><p>Love it</p><p>Hate it</p></li>
<li><p>D3 GAUGE CHART</p><p>Call loadGauge():</p><p>function loadGauge(){
  gaugeArr = [];
  // Share of '+' votes; the guards avoid NaN when there are no votes (or no '+' votes) yet
  var pos = gaugeArr.length ? (_.countBy(gaugeArr)['+'] || 0) / gaugeArr.length * 100 : 0;
  gaugeChart = c3.generate({
    data: {
      columns: [ ['Approve', pos] ],
      type: 'gauge'
    }
  }); //chart

  // gaugeHandler() is not shown on this slide
  $(document).on('click', '.approve', function(){
    gaugeArr.push('+');
    gaugeHandler();
  });
  $(document).on('click', '.disapprove', function(){
    gaugeArr.push('-');
    gaugeHandler();
  });
} //loadGauge</p><p>See it in action at</p></li>
<li><p>So now we have the basic mechanism of binary voting!</p><p>Very little code outside of a template here - C3 and D3 do the heavy lifting.</p><p>Underscore's _.countBy function figures out the relative percentage of "Approve" votes.</p></li>
<li><p>5. DATA OVER THE WIRE</p></li>
<li><p>I've set up a project to handle the combination of RADICAL data/polling and data visualization. Here's a brief overview of my stack:</p><p>A Twilio SMS account that will receive text messages and pass them along to...</p><p>A local server running Python/Flask, to handle the stream from Twilio and feed it as an API for...</p><p>JavaScript running in the browser, using a streaming API request library (Oboe.js) to poll data and pass it into...</p><p>D3.js, which renders it as a gauge.</p></li>
<li><p>Over the next few slides, I'm going to put a word on the screen and a phone number.</p><p>Text either + or - to that number to indicate your Approval or Disapproval of the subject at hand.</p></li>
<li><p>READY?</p></li>
<li><p>PIZZA</p><p>TEXT + OR - TO 902-707-1128</p><p>[Live gauge: 0 to 100, currently 0.0%]</p><p>UNDER 30: 0% | 30+: 0%</p></li>
<li><p>MONTREAL CANADIENS</p><p>TEXT + OR - TO 902-707-1128</p><p>[Live gauge: 0 to 100, currently 0.0%]</p><p>UNDER 30: 0% | 30+: 0%</p></li>
<li><p>BRUSSELS SPROUTS</p><p>TEXT + OR - TO 902-707-1128</p><p>[Live gauge: 0 to 100, currently 0.0%]</p><p>UNDER 30: 0% | 30+: 0%</p></li>
<li><p>STAR TREK</p><p>TEXT + OR - TO 902-707-1128</p><p>[Live gauge: 0 to 100, currently 0.0%]</p><p>UNDER 30: 0% | 30+: 0%</p></li>
<li><p>WORLD CUP
SOCCER</p><p>TEXT + OR - TO 902-707-1128</p><p>[Live gauge: 0 to 100, currently 0.0%]</p><p>UNDER 30: 0% | 30+: 0%</p></li>
<li><p>US FOREIGN POLICY</p><p>TEXT + OR - TO 902-707-1128</p><p>[Live gauge: 0 to 100, currently 0.0%]</p><p>UNDER 30: 0% | 30+: 0%</p></li>
<li><p>JAR JAR BINKS</p><p>TEXT + OR - TO 902-707-1128</p><p>[Live gauge: 0 to 100, currently 0.0%]</p><p>UNDER 30: 0% | 30+: 0%</p></li>
<li><p>SO THAT'S HOW I DO REAL-TIME DATA VISUALIZATION.</p></li>
<li><p>6. CLOSING REMARKS</p></li>
<li><p>DIVERSITY OF DEVELOPERS IN HALIFAX</p><p>When I polled for T-shirt sizes earlier, I asked for gender to be noted as "male" or "female" when you sent your SMS. Here's how that broke down:</p><p>Male:</p><p>Female:</p><p>This is not a problem exclusive to Halifax, nor is it a problem exclusive to tech, but it is a problem we can do something about. Next time you have the chance to attend a tech meetup, please encourage a female colleague or friend to attend with you! A lack of diversity only holds us back as a community.</p></li>
<li><p>OFFICE HOURS</p><p>If you're working on something cool and you need advice or just want to wax about it, I want to help!</p><p>I'm at Humani-T Café on South Park Street most evenings. Stop in and say hi anytime!</p><p>The Affinio offices are in Bedford - there's someone there more or less all day and we love drop-ins.</p><p>Finally, find me on Twitter: @phil_renaud</p></li>
<li><p>THANK YOU</p><p>Find me at:</p><p> | | @phil_renaud</p><p>Slides up at</p></li></ul>
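<p>The poll slides show an overall approval percentage plus an "UNDER 30" / "30+" split, but the slides never show how those numbers are tallied. Here is a minimal sketch of one way to do it in plain JavaScript. The function names (approvalPercent, approvalByAge) and the voter shape ({ age, vote }) are assumptions made for illustration; they do not come from the talk, and the real project may well compute this differently (for example, on the Flask side).</p>

```javascript
// Hypothetical helpers for illustration; these names are not from the slides.

// Percentage of '+' votes in an array of '+' / '-' strings.
// Returns 0 for an empty array instead of NaN (0 / 0).
function approvalPercent(votes) {
  if (votes.length === 0) {
    return 0;
  }
  var approvals = votes.filter(function (v) { return v === '+'; }).length;
  return approvals / votes.length * 100;
}

// Tally approval overall and for the two age groups shown on the poll slides.
// Each voter is assumed to look like { age: 30, vote: '+' }.
function approvalByAge(voters) {
  var under30 = [];
  var thirtyPlus = [];
  voters.forEach(function (voter) {
    (voter.age < 30 ? under30 : thirtyPlus).push(voter.vote);
  });
  return {
    overall: approvalPercent(voters.map(function (v) { return v.vote; })),
    under30: approvalPercent(under30),
    thirtyPlus: approvalPercent(thirtyPlus)
  };
}

// Made-up sample data, not results from the talk.
var sample = [
  { age: 24, vote: '+' },
  { age: 27, vote: '-' },
  { age: 35, vote: '+' },
  { age: 41, vote: '+' }
];
console.log(approvalByAge(sample));
// { overall: 75, under30: 50, thirtyPlus: 100 }
```

<p>In the browser, the overall figure could then be pushed into the gauge with C3's load method, something like gaugeChart.load({ columns: [['Approve', result.overall]] }). Returning 0 for an empty group keeps the gauge from showing NaN before any texts arrive.</p>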