L'évolution de l'infrastructure BI Viadeo, par François Le Lay (The evolution of the Viadeo BI infrastructure, by François Le Lay)

Techdays 22/11/2012 The evolution of Business Intelligence at Viadeo

Upload: olivier-dasini

Post on 01-Nov-2014


DESCRIPTION

http://fr.viadeo.com/fr/profile/francois.lelay

TRANSCRIPT

Page 1: L'évolution de l'infrastructure BI Viadeo par François Le Lay

Techdays 22/11/2012

The evolution of Business Intelligence at Viadeo

Page 2

Agenda

What is Business Intelligence?

Key Roles

Viadeo Data

Technical Solutions : a short history

Page 3

What is Business Intelligence ?

• Actions : Marketing Actions, Business Strategies, Operations
• Insights : Forecasting, Predicting, Statistics, Competitor Information, Analysis
• Awareness : Reports, Dashboards
• Application Stack : Meta Data, KPIs, Visual Templates, Security, Information Dissemination, Scheduling
• Data Warehouse & ETL : Plumbing of structured and unstructured data, logic to persist data

Feedback

Page 4

Key Roles : the Business Analyst

BI Dashboards Specification
• Simple (Metrics)
• Complex (Data viz)

Information Access
• BI Dashboards (Scalars)
• Direct (SQL, Datameer)

Analysis
• Followup
• Proactive

Web Product Specification
• Functional (Challenge PO)
• Technical (Enforce data quality)

Page 5

Key Roles : the Big Data Engineer

Data plumbing
• Real Time
• Batch

Expose to Apps
• REST/Scala/Java APIs
• JDBC/ODBC

Awareness
• Implement Data Visualization
• Enforce data quality

Page 6

Viadeo data : The Dynamics

Usage
• 45 million members
• Worldwide presence
  • China, India, Russia, Mexico, ...
• Mobile App, Web, API
• B2B / B2C

User Engagement

Mining

Page 7

Viadeo data : Graph

Page 8

Technical solutions : The Beginnings

Phase 1: 2006-2008
• MySQL
• Server name : Peach

Internal tool to allow C-Level, Sales, … to access data

Phase 2: 2008-2010
• MySQL
• Server name : Lakitu

Page 9

Technical solutions : A better architecture

Phase 3: 2010-2012

MySQL
• Server name : « Unified ODS »
• Server name : ODS LiveCluster 1
• Server name : ODS LiveCluster 2
• Server name : ODS LiveCluster 3
• Server name : ODS LiveCluster 5

Page 10

Technical solutions : 2 new internal products

Scala-centric, Play! framework

Cross-channel messaging system
• Email, Mobile, Social
• Flexible content management
• Flexible targeting of recipients
• Content testing strategies : A/B, multivariate
• Event-driven : web app events, mobile events, ad hoc events
• Automation, scheduling, frequency capping

Analytics
• Data visualization : based on Javascript (D3.js, processing.js, etc.)
• Tabular Reports, OLAP navigation
• Pluggable alerts : business activity monitoring

A common requirement : scalability!!!
• Viadeo data is Big
• Processing performance is not an option, it is mandatory
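To make the "frequency capping" item of the messaging system concrete, here is a minimal sketch of a sliding-window cap. It is written in Python for brevity rather than in the Scala/Play! stack named on the slide, and the class `FrequencyCap`, its parameters and the recipient id are hypothetical names chosen for illustration, not details of Viadeo's product.

```python
from collections import defaultdict, deque


class FrequencyCap:
    """Allow at most `max_messages` per recipient in any window of `window_seconds`.

    Hypothetical helper: the slide only names the feature, not its design.
    """

    def __init__(self, max_messages, window_seconds):
        self.max_messages = max_messages
        self.window_seconds = window_seconds
        # recipient id -> timestamps (seconds) of messages already sent
        self._sent = defaultdict(deque)

    def allow(self, recipient, now):
        sent = self._sent[recipient]
        # Forget sends that have fallen out of the sliding window.
        while sent and now - sent[0] >= self.window_seconds:
            sent.popleft()
        if len(sent) >= self.max_messages:
            return False  # cap reached: skip or postpone this message
        sent.append(now)
        return True


# At most 2 messages per hour for each recipient.
cap = FrequencyCap(max_messages=2, window_seconds=3600)
decisions = [cap.allow("member-1", t) for t in (0, 10, 20, 3600)]
print(decisions)
```

The third send is refused because two messages already fall inside the hour, and the fourth is accepted once the first send has aged out of the window. A production version would presumably keep these counters in shared storage rather than in process memory.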

Page 11

Technical solutions : a new architecture based on CQRS pattern
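The slide names the CQRS (Command Query Responsibility Segregation) pattern without detailing it. As a rough sketch of the idea only: writes go through a command model that records events, and reads are served from a separate, denormalized query model that is updated from those events. The code is Python for brevity, and the event type `ProfileViewed` and both classes are hypothetical, not Viadeo's actual design.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ProfileViewed:
    """Event produced by the write side (hypothetical event type)."""
    viewer_id: int
    profile_id: int


@dataclass
class CommandSide:
    """Write model: handles commands and appends the resulting events to a log."""
    log: list = field(default_factory=list)

    def view_profile(self, viewer_id, profile_id):
        event = ProfileViewed(viewer_id, profile_id)
        self.log.append(event)
        return event


@dataclass
class QuerySide:
    """Read model: a precomputed view, updated only by applying events."""
    views_per_profile: dict = field(default_factory=dict)

    def apply(self, event):
        current = self.views_per_profile.get(event.profile_id, 0)
        self.views_per_profile[event.profile_id] = current + 1

    def view_count(self, profile_id):
        return self.views_per_profile.get(profile_id, 0)


commands = CommandSide()
queries = QuerySide()
for viewer, profile in [(1, 42), (2, 42), (3, 7)]:
    # In a real system the event would typically travel through a queue or a batch job.
    queries.apply(commands.view_profile(viewer, profile))

print(queries.view_count(42), queries.view_count(7))
```

Because the read model can always be rebuilt by replaying the log, each side can be stored and scaled independently.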

Page 12

Technical solutions : a new architecture

• Master dataset :
  • Historical data stored in HBase
  • Provided as a service by architects team
• Datamarts :
  • Built on HDFS using MapReduce jobs
  • MapReduce eased by use of Cascading library and Scala DSL (Scalding)
  • Pushed to in-memory distributed storage
  • Elastic Search, Riak
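To illustrate how a datamart can be derived from the master dataset with a MapReduce job, here is a toy, single-process sketch of the map, shuffle and reduce phases. It is plain Python and does not use the Hadoop, Cascading or Scalding APIs mentioned on the slide. The event rows and the `logins_by_country` aggregate are invented for the example.

```python
from collections import defaultdict


def map_phase(records, mapper):
    """Apply the mapper to every record and stream out (key, value) pairs."""
    for record in records:
        yield from mapper(record)


def shuffle(pairs):
    """Group values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups, reducer):
    """Collapse each group of values into one result per key."""
    return {key: reducer(key, values) for key, values in groups.items()}


# Hypothetical master-dataset rows: (member_id, country, event_type)
events = [
    (1, "FR", "login"),
    (2, "FR", "profile_view"),
    (3, "CN", "login"),
    (1, "FR", "login"),
]


def logins_by_country(event):
    """Mapper: emit (country, 1) for each login event, nothing otherwise."""
    _member_id, country, event_type = event
    if event_type == "login":
        yield country, 1


datamart = reduce_phase(
    shuffle(map_phase(events, logins_by_country)),
    lambda _country, counts: sum(counts),
)
print(datamart)
```

The resulting dictionary plays the role of a small datamart that could then be pushed to a serving store. On a real cluster each phase runs in parallel across machines, which is the part the framework takes care of.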

Page 13

Technical solutions : A better architecture

[Architecture diagram; labels partly lost in extraction, "MySQL" is legible]

Page 14

Conclusion

• Many scalable data storage solutions
• Rapid application development frameworks and low-risk programming languages on the JVM
• Custom analytics = what we implement is what we use
• Analytical needs are very well identified
• Blend data stream and batch processing to answer different needs
• Pluggable Data mining R&D
• Analytics for Viadeo members/recruiters/companies : Social Media Monitoring as a Complex Event Processing topic

Page 15

?

Page 16

Thanks !

[email protected]

Tel : 01 75 70 12 93