Real Time Pipeline at Terabyte Scale
TRANSCRIPT
WE MAKE SOCIAL DATA ACTIONABLE
Over 1B social signals are processed monthly by the ShareThis Social Intelligence Platform™ to generate insights about your brand, industry and events.
ENGAGEMENT
Users consume and share content
across web and mobile
TARGETING
Desktop and mobile targeting at scale
INSIGHTS
Actionable cross-device insights
DATA
1B+ first party Social Actions
Monthly
ENGAGEMENT
TARGETING INSIGHTS
DATA
• Lookalike Audiences
• Audience Segments
“Wow small SUVs are fuel efficient!”
User #12345
• Automotive Study
• Car Buying Infographic
Previous Architecture Problems
Duplicated Data
Query Engine
Share Data
Insights
Query Engine
Ad Tech
Query Engine
Consumer Engagement
Query Engine
Data Science
Fragmented & Siloed Data Sources
Query Engine
Share Data
Insights
Query Engine
Ad Tech
Query Engine
Consumer Engagement
Query Engine
Data Science
Campaign RTB Conversion
Summarization 3rd Party
Trends Studies
Generating Reports From Old Platform
Raw Data
Pre-Aggregation
Staged Data
Results
Consumers
Query
Rest API
New Report Type
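One reading of this slide is that reports were served from pre-aggregated, staged data, so a new report type needed a new aggregation pass over the raw data before it could be queried. The sketch below illustrates that constraint only. The event fields, report names, and functions are hypothetical and are not taken from the original platform.

```python
from collections import Counter

# Hypothetical raw share events; field names are illustrative only.
RAW_EVENTS = [
    {"domain": "example.com", "channel": "email", "region": "northeast"},
    {"domain": "example.com", "channel": "twitter", "region": "west"},
    {"domain": "news.example", "channel": "twitter", "region": "northeast"},
]


def pre_aggregate(events, dimension):
    """Batch step: collapse raw events into counts for ONE fixed dimension."""
    return Counter(event[dimension] for event in events)


# Staged data exists only for the dimensions that were planned ahead of time.
STAGED = {"shares_by_domain": pre_aggregate(RAW_EVENTS, "domain")}


def query_report(report_type):
    """Serve a report from staged data; unplanned report types cannot be answered."""
    if report_type not in STAGED:
        raise KeyError(f"no pre-aggregation staged for {report_type!r}")
    return dict(STAGED[report_type])


by_domain = query_report("shares_by_domain")
print(by_domain)

# A new report type needs a new batch job over the raw data before it can be served.
STAGED["shares_by_region"] = pre_aggregate(RAW_EVENTS, "region")
by_region = query_report("shares_by_region")
print(by_region)
```

Under this assumption, every new question costs a full pass over the raw data, which is the iteration delay an interactive query engine over raw rows avoids.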
Why Focus On These Problems?
Faster Iterations Data Science
New Applications
Business Value
Targeting
Real Time All The Things
Raw Social Data
DLX Geo Device Mappings
Sentiment
Social Keywords
Downstream Applications
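As a rough illustration of annotating raw social data in the stream, the sketch below attaches geo, device, sentiment, and keyword fields to a single event before it goes downstream. The lookup tables, word lists, and field names are all hypothetical stand-ins. The slides do not describe how the real annotations are computed.

```python
# Hypothetical lookup tables standing in for geo and device-mapping data sets.
GEO_BY_IP_PREFIX = {"203.0.113.": "northeast", "198.51.100.": "west"}
DEVICE_BY_COOKIE = {"c-1": "desktop", "c-2": "mobile"}
POSITIVE_WORDS = {"wow", "efficient", "great"}
NEGATIVE_WORDS = {"bad", "broken", "slow"}
KEYWORDS = {"suv", "suvs", "fuel", "car"}


def tokenize(text):
    """Lowercase the text and strip simple punctuation from each word."""
    return [word.strip("!?.,\"'").lower() for word in text.split()]


def annotate(event):
    """Return a copy of the event with region, device, sentiment and keyword fields."""
    words = tokenize(event.get("text", ""))
    score = sum(w in POSITIVE_WORDS for w in words) - sum(w in NEGATIVE_WORDS for w in words)
    region = next(
        (geo for prefix, geo in GEO_BY_IP_PREFIX.items() if event["ip"].startswith(prefix)),
        "unknown",
    )
    return {
        **event,
        "region": region,
        "device": DEVICE_BY_COOKIE.get(event["cookie"], "unknown"),
        "sentiment": "positive" if score > 0 else "negative" if score < 0 else "neutral",
        "keywords": sorted(set(words) & KEYWORDS),
    }


event = {"user_id": 12345, "ip": "203.0.113.7", "cookie": "c-2",
         "text": "Wow small SUVs are fuel efficient!"}
annotated = annotate(event)
print(annotated)
```

Because each annotation is a pure function of one event plus reference data, it can run per message as events arrive rather than in a later batch job.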
Kafka Architecture
Application
Logs
Producers
Brokers
Consumers
Data Loaders
Annotations
Filters
Destinations
Big Query
Social Ad Tech
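The property this architecture relies on is that a Kafka topic is an append-only log and each consumer group tracks its own read offset, so data loaders, annotators, and filters can all read the same stream independently. The sketch below imitates that behavior in memory. It does not use a Kafka client library, and the class, topic, and group names are hypothetical.

```python
from collections import defaultdict


class InMemoryBroker:
    """Toy stand-in for a Kafka broker: one append-only log per topic."""

    def __init__(self):
        self.logs = defaultdict(list)      # topic -> ordered list of messages
        self.offsets = defaultdict(int)    # (topic, group) -> next offset to read

    def produce(self, topic, message):
        """Append a message and return its offset in the topic's log."""
        self.logs[topic].append(message)
        return len(self.logs[topic]) - 1

    def poll(self, topic, group, max_messages=10):
        """Return unread messages for a consumer group and advance its offset."""
        start = self.offsets[(topic, group)]
        batch = self.logs[topic][start:start + max_messages]
        self.offsets[(topic, group)] = start + len(batch)
        return batch


broker = InMemoryBroker()

# Producers: an application and a log shipper write to the same topic.
broker.produce("social-events", {"source": "application", "action": "share"})
broker.produce("social-events", {"source": "logs", "action": "click"})

# Consumers: each group keeps its own offset, so every group sees every message.
loaded = broker.poll("social-events", group="data-loaders")
annotated = [dict(m, annotated=True) for m in broker.poll("social-events", group="annotations")]
shares = [m for m in broker.poll("social-events", group="filters") if m["action"] == "share"]

print(len(loaded), len(annotated), len(shares))
```

A real deployment would add partitions, replication, and persistent offsets, but the decoupling between producers and independent consumer groups is the same idea.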
Integrate Campaign
Social Data
DLX Geo Device Mappings
Sentiment
Social Keywords
RTB Bid Data
Campaign Data
Downstream Applications
Build An Active Warehouse
3 Trillion Row Interactive Query Engine
Share Data
Data Science
Ad Tech
Consumer Engagement
Sales Strategy
Insights
RTB Impressions & Clicks + RT
External
ATDs
DMPs
DSPs
Internal
Google Big Query
Reliability
Add redundancy and robustness to our data pipeline to protect against data loss.
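The slides do not say which mechanisms provide this protection. One common technique, shown here purely as an assumption, is at-least-once delivery: a consumer advances its committed offset only after the destination confirms the write, so a failure causes a replay instead of a gap. The function and the flaky destination below are hypothetical.

```python
def deliver(log, committed_offset, write, max_attempts=3):
    """Write log[committed_offset:] in order and return the new committed offset.

    The offset moves past a message only after write() succeeds, so repeated
    failures leave the message unacknowledged and it is replayed on the next run.
    """
    offset = committed_offset
    while offset < len(log):
        for _attempt in range(max_attempts):
            try:
                write(log[offset])
                break
            except IOError:
                continue
        else:
            return offset  # give up for now; nothing after this point is committed
        offset += 1
    return offset


stored = []
failures = {"remaining": 2}


def flaky_write(message):
    """Hypothetical destination that rejects the first two writes."""
    if failures["remaining"] > 0:
        failures["remaining"] -= 1
        raise IOError("destination unavailable")
    stored.append(message)


log = ["share-1", "share-2", "share-3"]
committed = deliver(log, 0, flaky_write)
print(committed, stored)
```

The trade-off is that a message may be written twice if a crash happens between the write and the commit, so destinations typically need to tolerate or deduplicate repeats.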
Monitoring Infrastructure
Consumer App
Metrics Library
Producer App
Metrics Library
Graphite
Slack
Dev Team
Seyren
Dashboards
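To make the monitoring flow concrete, the sketch below shows a small in-app metrics library that buffers counters and formats them in Graphite's plaintext protocol (one "path value timestamp" line per metric), followed by a warn/error threshold check of the kind an alerting tool such as Seyren applies before notifying a channel like Slack. The class, metric paths, and thresholds are hypothetical, and nothing here reflects the actual ShareThis metrics library.

```python
import time


def graphite_line(path, value, timestamp=None):
    """Format one metric as a Graphite plaintext line: path, value, timestamp."""
    ts = int(time.time()) if timestamp is None else int(timestamp)
    return f"{path} {value} {ts}\n"


class Metrics:
    """Hypothetical in-app metrics library that buffers counters for Graphite."""

    def __init__(self, prefix):
        self.prefix = prefix
        self.counters = {}

    def incr(self, name, amount=1):
        """Add to a named counter, creating it on first use."""
        self.counters[name] = self.counters.get(name, 0) + amount

    def flush(self, timestamp=None):
        """Return buffered counters as plaintext lines and reset the buffer."""
        lines = [graphite_line(f"{self.prefix}.{name}", value, timestamp)
                 for name, value in sorted(self.counters.items())]
        self.counters.clear()
        return lines


def check(value, warn, error):
    """Threshold check in the style of an alerting tool: OK, WARN or ERROR."""
    return "ERROR" if value >= error else "WARN" if value >= warn else "OK"


producer_metrics = Metrics("pipeline.producer")
producer_metrics.incr("messages_sent", 500)
producer_metrics.incr("send_errors", 7)
lines = producer_metrics.flush(timestamp=1700000000)
print("".join(lines), end="")
print(check(7, warn=5, error=50))
```

In a running system the flushed lines would be written to the Graphite server's plaintext listener (TCP port 2003 by default) rather than printed.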
Data Stream Filter Prototype
Real Time Pipeline
shares from top 100 domains
user actions in north east region
users who recently bought a car
users likely to buy a car soon
actions from user ids in (1234, 5432, 9999)
External Customers
Internal Teams
Predictive Algorithms
Dynamically create filters based on customers' needs. Filters can be created instantly, on demand.
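One way to create filters on demand is to describe each filter as data (field, operator, operand) and compile it into a predicate at runtime, so a new filter needs no code deployment. The sketch below assumes that design. The rule format, operator names, and event fields are hypothetical, and the prototype itself may work differently.

```python
import operator

# Supported comparison operators for filter rules (names are illustrative).
OPS = {
    "eq": operator.eq,
    "in": lambda value, allowed: value in allowed,
}


def build_filter(rules):
    """Turn a list of (field, op, operand) rules into a single predicate."""
    def predicate(event):
        return all(OPS[op](event.get(field), operand) for field, op, operand in rules)
    return predicate


# Filters can be registered at runtime, for example from a customer request.
filters = {
    "northeast_actions": build_filter([("region", "eq", "northeast")]),
    "selected_users": build_filter([("user_id", "in", {1234, 5432, 9999})]),
}

stream = [
    {"user_id": 1234, "region": "west", "action": "share"},
    {"user_id": 42, "region": "northeast", "action": "click"},
    {"user_id": 9999, "region": "northeast", "action": "share"},
]

matches = {name: [e["user_id"] for e in stream if keep(e)] for name, keep in filters.items()}
print(matches)
```

Filters that depend on model output, such as "users likely to buy a car soon", could plug into the same structure by adding an operator that calls a predictive score.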