we help you understand audience attention. · 2020. 7. 13. · provide a real-time and historical...

Post on 31-Dec-2020


Website: parse.ly Blog: blog.parse.ly Email: andrew@parsely.com

We help you understand audience attention.

Follow me: @amontalenti Our research: @parsely Our podcast: @attnpod

How? Parse.ly Analytics.

Web content visits represent attention at global scale.

+ hundreds of other companies who run thousands of high-traffic sites. + the long tail.

Sites with content and audience Platforms

Parse.ly measures content and audience …

Page views Visitors Engaged time Social shares

Audience loyalty Devices Video Titles

Authors Sections Tags Referrers

Campaigns Publish dates Channels

+Much more

… to tell the story behind the story.

Our dashboard can answer this question: What’s gaining attention on your sites and apps?

Provide a real-time and historical window into what’s happening with your content when it comes to audience attention.

• 30,000 monthly active users across 350+ media companies.

• Measures the attention of over 2 million page views per minute at peak time.

• Sub-second data latency with 99.99% internal SLO.

We make data accessible and essential.

Parse.ly Analytics: What’s running under the hood?

Powered by mage:

• 100+ Elasticsearch nodes storing over 20 terabytes of production live query data.

• 3,600+ real-time processing CPU cores using Storm.

• Kafka and Cassandra for rock-solid distributed streaming data.

• Elastic scalability for hourly and nightly jobs using Spark.

Parse.ly Analytics: What does the team release publicly?

We love open source!

• streamparse is our publicly-maintained and popular project for running production parallel computation systems with Python 2.x and 3.x, using Apache Storm.

• PyKafka is the community’s fastest and most production-tested Python driver for Apache Kafka.

+ PyKafka

+ parsely_raw_data

+ time-engaged

+ others

Why now? Parse.ly Currents.

Aggregate attention data already guides the industry.

And answers questions it could never answer on its own.

Our network data can answer this question: What do people care about?

Front row seat to the web interests of over 1 billion people per month and 150 million people per day.

Categories include: news, entertainment, finance, politics, sports, opinion, culture, and more.

Apply modern machine learning and natural language processing techniques.

Parse.ly Currents: What is our petabyte-scale analysis stack?

Petabytes of event data and terabytes of web crawl data.

• BigQuery used with day-partitioned tables to do fast aggregation over petabyte-scale event data without running a cluster.

• PyData stack used for statistics and machine learning over time series data.

• Natural language processing on text data using Python, leveraging a web-based ontology (knowledge graph), domain-specific keyword/entity lists, word vectors, document classifiers, unsupervised clustering, and more.
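
The idea behind day-partitioned tables can be sketched in plain Python. This is only an in-memory stand-in, not the BigQuery API: the class `DayPartitionedEvents`, its methods, and the sample rows are hypothetical names and data chosen for illustration. It shows why a date-bounded aggregation only needs to read the partitions for the requested days.

```python
from collections import defaultdict
from datetime import date, timedelta


class DayPartitionedEvents:
    """Hypothetical in-memory stand-in for a day-partitioned event table."""

    def __init__(self):
        # One bucket of (url, views) rows per calendar day.
        self._partitions = defaultdict(list)

    def insert(self, day, url, views):
        self._partitions[day].append((url, views))

    def views_by_url(self, start, end):
        """Sum views per URL for start <= day <= end.

        Only partitions inside the date range are read; the number of
        partitions touched is returned alongside the totals.
        """
        totals = defaultdict(int)
        partitions_read = 0
        day = start
        while day <= end:
            rows = self._partitions.get(day)  # .get() does not create a bucket
            if rows is not None:
                partitions_read += 1
                for url, views in rows:
                    totals[url] += views
            day += timedelta(days=1)
        return dict(totals), partitions_read


# Made-up sample rows, for illustration only.
store = DayPartitionedEvents()
store.insert(date(2017, 1, 1), "/a", 10)
store.insert(date(2017, 1, 2), "/a", 5)
store.insert(date(2017, 1, 2), "/b", 7)
store.insert(date(2017, 1, 9), "/a", 100)

totals, partitions_read = store.views_by_url(date(2017, 1, 1), date(2017, 1, 3))
```

In this sketch the January 9 partition is never read, which is the property that keeps date-bounded queries cheap as the total volume of event data grows.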

1 billion unique visitors per month

20 billion page views per month

5 billion clicks from search, social, & others

900k posts published and analyzed each day

2 million topics, categories, and keywords

Does discovery vary by topic?

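Share of referral traffic by topic, with the number of posts for each topic:

• Job Postings (2.7k posts): Facebook 11.9%, Other 3.7%, Google 84.4%

• Business & Finance (39k posts): Facebook 14.1%, Other 39.0%, Google 14.1% [sic]

• Sports (210k posts): Facebook 19.2%, Other 30.4%, Google 50.4%

• Technology (67k posts): Facebook 21.3%, Other 18.0%, Google 60.8%

• State & Local Politics (17k posts): Facebook 35.5%, Other 22.3%, Google 42.2%

• World Economy (26k posts): Facebook 36.3%, Other 20.7%, Google 43.0%

• National Security (49k posts): Facebook 41.3%, Other 28.9%, Google 29.7%

• Local Crime & Incidents (98k posts): Facebook 52.7%, Other 22.6%, Google 24.6%

• Criminal Justice (55k posts): Facebook 53.5%, Other 22.2%, Google 24.4%

• Education & Research (36k posts): Facebook 58.9%, Other 19.8%, Google 21.3%

• U.S. Presidential Politics (110k posts): Facebook 59.5%, Other 15.9%, Google 24.6%

• Entertainment (190k posts): Facebook 60.8%, Other 10.1%, Google 29.1%

• Local Events (96k posts): Facebook 61.4%, Other 12.3%, Google 26.2%

• Lifestyle (110k posts): Facebook 87.1%, Other 6.2%, Google 6.7%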

Topics are derived from posts in the Parse.ly network of sites from 2016 using a topic modeling algorithm called LDA (Latent Dirichlet Allocation). For more information: parsely.com/authority

Device traffic breakdown and number of posts for each topic:

• U.S. Presidential Politics (110k posts): 43% desktop, 47% mobile, 10% tablet.

• World Economy (26k posts): 46% desktop, 45% mobile, 9% tablet.

Common words in posts:

• U.S. Presidential Politics: TRUMP, CLINTON, PRESIDENT, CAMPAIGN, DONALD, PRESIDENTIAL, OBAMA, ELECTION, PARTY, HILLARY, STATE, POLITICAL, DEMOCRATIC, REPUBLICAN, WHITE, CANDIDATE, VOTE, SANDERS, HOUSE, VOTERS, FORMER, AMERICAN, NEWS, STATES, COUNTRY, NATIONAL, DEBATE, WOMEN, AMERICA, CRUZ.

• World Economy: CHINA, OIL, EU, PERCENT, CHINESE, ENERGY, SINCE, PER, EUROPEAN, TRADE, STOCKS, BREXIT, PRICES, DEAL, BANK, CENT, NFL, AP, UK, ACCORDING, MARKETS, TRADING, BILLION, BRITAIN, MARKET, STOCK, WORLD, GLOBAL, POWER.

External referral sources:

• World Economy: Google Search 43.0%, Facebook 36.3%, Other 20.7%. Individual sources: news.google.com 4.6%, twitter.com 4.0%, yahoo! 2.4%, drudgereport.com 1.4%, flipboard.com 1.1%, bing 0.9%, linkedin.com 0.9%, reddit.com 0.8%, traffic.outbrain.com 0.7%.

• U.S. Presidential Politics: Facebook 59.5%, Google Search 24.6%, Other 15.9%. Individual sources: news.google.com 4.3%, twitter.com 4.1%, drudgereport.com 1.9%, yahoo! 1.1%, bing 0.9%, reddit.com 0.7%.
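
Referral shares like the ones on these slides can be computed from raw referrer URLs along the following lines. This is a minimal sketch, not Parse.ly's actual classification: the function names, the host-matching rules, and the sample URLs are all assumptions made for illustration.

```python
from collections import Counter
from urllib.parse import urlparse


def referrer_category(referrer_url):
    """Bucket a referrer URL as Facebook, Google Search, or Other.

    The matching rules here are hypothetical and deliberately simple.
    """
    host = urlparse(referrer_url).netloc.lower()
    if host == "facebook.com" or host.endswith(".facebook.com"):
        return "Facebook"
    if host in ("google.com", "www.google.com"):
        return "Google Search"
    # Everything else, including news.google.com and twitter.com.
    return "Other"


def referral_shares(referrer_urls):
    """Return each category's share of referrals as a percentage."""
    counts = Counter(referrer_category(url) for url in referrer_urls)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {category: round(100.0 * n / total, 1) for category, n in counts.items()}


# Made-up referrers for a single topic, for illustration only.
sample_referrers = [
    "https://www.facebook.com/",
    "https://m.facebook.com/story",
    "https://www.google.com/search?q=example",
    "https://twitter.com/example/status/1",
]
shares = referral_shares(sample_referrers)
```

Grouping the same calculation by topic would give one set of shares per topic, which is the shape of the data shown above.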

Can Internet attention predict public opinion?

Can Internet attention predict a film’s revenue?

Figure: Cumulative box office gross revenue (y-axis) plotted against three variables (x-axis), with movies rated PG and movies not rated PG marked separately. Pearson correlation coefficients are reported when excluding PG rated movies.

• Revenue compared to unique views for related web posts 3 days prior to release: 0.955.

• Revenue compared to print ad cost in US $ and revenue compared to production (negative) cost in US $: 0.474 and 0.829.

Total unique views for posts related to a movie three days prior to its release correlates with revenue more strongly than either production cost or advertising budget.
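
For reference, the Pearson correlation coefficient behind this comparison can be computed as in the sketch below. The function `pearson` and the sample numbers are illustrative assumptions, not the data behind the slide.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Covariance and variances, each left unnormalized because the
    # normalizing factor cancels in the ratio below.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Made-up values for five hypothetical movies, for illustration only.
unique_views = [200_000, 400_000, 600_000, 800_000, 1_000_000]
revenue = [110_000, 190_000, 330_000, 390_000, 520_000]

r = pearson(unique_views, revenue)
```

A coefficient close to 1 means revenue rises almost linearly with unique views; it says nothing about causation.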

Figure: Revenue compared to unique views for related web posts 3 days prior to release. Pearson correlation coefficient of 0.955 when excluding PG rated movies.

We are a partner you can trust. 400+ paying clients. 3000+ big sites. 1B+ network visitors.

We’re small and nimble, yet we operate with scale and integrity. We are 70+ people.

• A client services, support, and ops team of 40 people, with a head office in NYC.

• A fully distributed product team of engineers, data scientists, and designers. 30 people across US, Canada, and Europe.

• $12M+ USD in financing raised from 2011 to 2017.

Three asks for the audience today.

Sign up free, give us feedback!

http://parse.ly/currents

Follow me on Twitter!

@amontalenti

Website: parse.ly Blog: blog.parse.ly Email: andrew@parsely.com

Let’s continue the conversation about internet attention.

Follow me: @amontalenti Our research: @parsely Our podcast: @attnpod
