© OpenBI, LLC 2009

OpenBI LLC
The Open Source Business Intelligence Experts

Extending Open Source BI platforms with R analytics.


Who we are

• Professional Services Firm
  – Specialized in BI and open source
  – Expert at bridging business and technology
  – Unwavering commitment to our customers

• Seasoned BI Professionals
  – Partners average 20+ years experience
  – Demonstrated BI thought leadership (e.g. DM Review, B-Eye Network)
  – Reputation for high-quality service, personal and professional integrity
  – Deep consulting/training services roots

• Extensive Expertise with BI technologies
  – Databases
  – ETL
  – Query/Reporting/OLAP
  – Dashboards & Scorecards
  – Statistical Modeling/Data Mining
  – Analytical CRM


Our Point in a Nutshell

“If the decision is going to be made by the facts, then everyone’s facts, as long as they are relevant, are equal. If the decision is going to be made on the basis of people’s opinions, then mine count for a lot more.”

Jim Barksdale, then CEO of Netscape


The Case for BI and Super-Crunching

Business Problem: Flawed Decision-Making

Technical Solution
• BI
• Analytics
• Experimentation

Business Solution: Performance Management Focus

Strategic Solution: Evidence or Fact-based Decision-Making


Evidence Based Management
A new philosophy for decision-making…

Management Mindset Change
• Downgrade conventional wisdom and ego
• Upgrade test results and facts

Attitude of Wisdom
• Humbly appreciate what you don’t know
• Constantly question what you do know
• Act on best knowledge available

Scientifically-Based
• Generate Hypothesis ->
• Conduct Research & Test ->
• Assemble Evidence & Draw Conclusions ->
• Act!

Strategy as Hypothesis
“The organization is an unfinished prototype requiring trial programs, pilot studies, experimentation, etc.”

… requires a performance based approach.


Performance Management directs BI investments to business results

How am I doing?
• Standard QRA
• Executives as Consumers

Why am I doing well or poorly?
• Analytical Apps
• Managers as Consumers

What opportunities exist to improve performance in the future?
• Data Mining & Operational BI
• Line Staff as Consumers
• Automated Decision-making

[Diagram: Performance Management & BI. Labels: Performance Relative to Objectives; Understand Trends & Anomalies; Enable Effective Business Decisions; Measure, Explore, Execute, Strategize]


The R Project for Statistical Computing

• www.r-project.org
• Derived from award-winning S language developed at Bell Labs by John Chambers
• Object-based and readily extensible
• Open source GPL
• R provides:
  – language,
  – storage,
  – data manipulation,
  – statistical/mathematical procedures,
  – production-quality graphics


Adoption of R

• One of the most successful open source projects

• Lingua franca of academic statistical computing

– Over 1M users world-wide

• Unix, Linux, Windows, MacOS ports

• Enthusiastic world-wide support forums

• 1650+ contributed packages to extend the base platform

• Latest procedures submitted by originators well before they're available commercially

• Cadre of R users/developers coming to the business world


Support for R

• Incredibly stable core product

• Excellent documentation/manuals/tutorials

• Large and growing # of R texts (42 in 2006-2009)

• Wiki, Newsletter

• 18 mailing/support lists

• International user conference, UseR! 2009, Agrocampus-Ouest, Rennes, France

• Strong inroads into financial services and health sciences


The Case for OSBI/R Integration

• Extend OSBI’s core QRA business intelligence capabilities
  – Statistical modelling
  – Advanced data visualizations

• Bridge R and OSBI communities
  – Generate broader adoption
  – Advance mutual innovations
  – Engage statistics and BI communities together


Jaspersoft Business Intelligence Suite
Reporting, Analysis and Data Integration

[Diagram. Components: Interactive, Ad Hoc, and Managed Query and Reporting Server; Interactive OLAP Data Analysis; High Performance Data Integration; Report Development Library. Audiences: Business Users; Developers; Developers & DBAs]


Jaspersoft BI Suite Architecture

[Architecture diagram. Capabilities: Production Reporting; End-User Ad Hoc Query & Reporting; Dashboards; Data Analysis / Exploration; Advanced Reporting. Shared services: Repository, Scheduling, Security, Integration. Data sources: Data Mart / Warehouse / ODS; Operational RDBMS; or POJO, EJB, XML, Hibernate, MDX, Custom]


Demo


Pentaho/R Integration - Components

To develop and deploy, the following must be downloaded and installed:

• Pentaho BI Suite 2.0
  – Note that R graphs are generated as image files that are returned via normal Pentaho action sequence processing. No “container” report is necessary.

• R v2.8.1 (http://cran.r-project.org)
  – The core R platform, which must be installed on your server

• RServe v0.5-3 (http://www.rforge.net/Rserve/files/)
  – TCP/IP server which allows clients to use the facilities of R

• REngine
  – Java class library which enables a Java client to interact with RServe


Pentaho Deployment Architecture
File Cache Approach

[Diagram components: Tomcat, Pentaho BI Server, ActionSequence, RConnection, RServe, R Function Code, R Datafile, Image File; numbered callouts match the steps below.]

1. User requests execution of a Pentaho component to create an R graph

2. Server initiates action sequence, passing session-maintained RConnection object and all user entered execution parameters.

3. Action Sequence JavaScript step performs the following by using the REngine API and supplied RConnection object:

   a. Source the R Function Code file

   b. Source the cached R Datafile

   c. Convert Pentaho parameters into R function parameters

   d. Invoke the R function to process and generate an Image File

4. An image file (jpg, png, etc) is created in the Pentaho data file repository cache

5. The image file is referenced in the html response and rendered to the user’s browser
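
Step 3c above (convert Pentaho parameters into R function parameters) can be sketched as plain string building. This is a minimal Java sketch under stated assumptions, not the presenters' code: the class RCallBuilder, its methods, and the parameter names in the usage note are hypothetical, and Java is used for illustration even though the slide describes a JavaScript step. The sketch only produces R source text. With the Rserve Java client that text would typically be passed to RConnection.eval(String), which is left out here so the example runs without Rserve.

```java
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical helper: turns report parameters into R source text. */
final class RCallBuilder {

    /** Quote a Java string as an R string literal (escapes backslash and double quote). */
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    /** Render one parameter value as an R literal. Finite numbers are assumed. */
    static String toRLiteral(Object value) {
        if (value == null) {
            return "NULL";
        }
        if (value instanceof Boolean) {
            return ((Boolean) value) ? "TRUE" : "FALSE";
        }
        if (value instanceof Number) {
            return value.toString();
        }
        return quote(value.toString());
    }

    /** Build a named-argument call; parameter names are assumed to be valid R names. */
    static String buildCall(String function, Map<String, ?> params) {
        StringJoiner args = new StringJoiner(", ", function + "(", ")");
        for (Map.Entry<String, ?> e : params.entrySet()) {
            args.add(e.getKey() + "=" + toRLiteral(e.getValue()));
        }
        return args.toString();
    }

    /** Build the source() command used in steps 3a and 3b. */
    static String sourceCommand(String path) {
        return "source(" + quote(path) + ")";
    }
}
```

For example, a LinkedHashMap holding region = "East", year = 2008, showLegend = true yields create_chart(region="East", year=2008, showLegend=TRUE). Keeping the conversion in one helper means the action sequence only has to collect parameters and name the R function, which matches the single-function-call approach described in the design tips.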


Pentaho Design Tips
Enable modular development and better runtime performance

• Place all R functionality within an R code file and provide a single function call for the Java client
  – Pentaho action sequence simply prepares parameters and calls an R function (e.g. create_chart) which generates a graph output file.
  – R programmer works in “R”, Pentaho developer works in Java/Pentaho

• Utilize power of Pentaho session parameters to create and maintain an RConnection object for a user session
  – Could be created on session startup

• At present, have not created capability to pass a Pentaho-generated dataset to R as a data frame
  – REngine API to create data frames will not work with JavaScript
  – Requires development of a “Pentaho ResultSet -> R Data Frame” utility Java class
  – Once created, Pentaho platform ability to create datasets from SQL, MDX, XQuery, ETL Script, JavaScript, etc. would be enabled for data acquisition.


JasperSoft/R Integration - Components

To develop and deploy, the following must be downloaded and installed:

• JasperServer v3

• iReport v3
  – To enable scriptlet development, copy tools.jar from your JDK/JRE deployment directory to ~iReport-3.x.x\lib
  – Note that R graphs are generated as an image field inside a JasperReport

• R v2.8.1 (http://cran.r-project.org)
  – The core R platform, which must be installed on your server

• RServe v0.5-3 (http://www.rforge.net/Rserve/files/)
  – TCP/IP server which allows clients to use the facilities of R

• REngine
  – Java class library which enables a Java client to interact with RServe


JasperSoft Deployment Architecture
File Cache Approach

[Diagram components: Tomcat, JasperServer and JasperReport, Report Scriptlet, RConnection, RServe, R Function Code, R Datafile, Image File; numbered callouts match the steps below.]

1. User requests report execution

2. Report execution causes afterReportInit() scriptlet method to fire

3. The code in the method uses the REngine API to:

   a. Obtain an RServe connection

   b. Source the R Function Code file

   c. Source the cached R Datafile

   d. Convert Jasper parameters into R function parameters

   e. Invoke the R function to process and generate an Image File

   f. Release the RServe connection

4. The report contains a single image field which is populated with the generated Image File

5. The report is rendered to the user
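
The order of steps 3a to 3f above, and in particular the guarantee that the connection is released even if R reports an error, can be sketched as follows. This is an illustration only, not the presenters' scriptlet: RSession, RecordingSession, and ChartScriptletSketch are hypothetical names, and RSession is a stand-in interface rather than part of REngine. A real scriptlet would wrap an Rserve RConnection instead of the recording stub used here, and the file paths in the usage note are made up.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for an Rserve connection; not part of REngine. */
interface RSession extends AutoCloseable {
    void eval(String rCode);

    @Override
    void close();
}

/** Records commands instead of talking to R, so the sketch runs without Rserve. */
final class RecordingSession implements RSession {
    final List<String> log = new ArrayList<>();

    @Override
    public void eval(String rCode) {
        log.add(rCode);
    }

    @Override
    public void close() {
        log.add("<closed>");
    }
}

final class ChartScriptletSketch {

    /**
     * Runs steps 3b to 3f against an already obtained session (step 3a).
     * Paths are assumed to contain no characters that need escaping in R.
     */
    static void renderChart(RSession session, String codeFile, String dataFile, String call) {
        try (RSession s = session) {
            s.eval("source(\"" + codeFile + "\")"); // 3b: load the R function definitions
            s.eval("source(\"" + dataFile + "\")"); // 3c: load the cached data
            s.eval(call);                           // 3d/3e: call the R function with converted parameters
        }                                           // 3f: close() runs even if eval throws
    }
}
```

Calling renderChart(new RecordingSession(), "/opt/r/charts.R", "/opt/r/sales_data.R", "create_chart(year=2008)") records the two source() commands, the function call, and finally the close marker, in that order. The try-with-resources block is the design point: releasing the connection does not depend on every earlier step succeeding.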


JasperSoft Deployment Architecture
Data Query Approach

[Diagram components: Tomcat, JasperServer and JasperReport, Report Scriptlet, RConnection, RServe, R Function Code, Image File; numbered callouts match the steps below.]

1. User requests report execution

2. Report execution causes afterReportInit() scriptlet method to fire after JasperReport executes its SQL

3. The code in the method uses the REngine API to:

   a. Obtain an RServe connection

   b. Source the R Function Code file

   c. Convert the SQL result set into an R data frame

   d. Convert Jasper parameters into R function parameters

   e. Invoke the R function to process and generate an Image File

   f. Release the RServe connection

4. The report contains a single image field which is populated with the generated Image File

5. The report is rendered to the user
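
Step 3c above (convert the SQL result set into an R data frame) is the part that differs from the file cache approach. One simple way to picture it is to render the rows as an R data.frame(...) expression. The sketch below does that with plain strings; it is an assumption-laden illustration, not the method used in the slides, which rely on the REngine API. The class DataFrameText and its methods are hypothetical, column names are assumed to be valid R names, at least one column is assumed, and text generation like this is only sensible for small result sets.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: renders tabular rows as an R data.frame expression. */
final class DataFrameText {

    /** Render one cell: null becomes NA, numbers stay bare, everything else is quoted. */
    static String toRValue(Object v) {
        if (v == null) {
            return "NA";
        }
        if (v instanceof Number) {
            return v.toString();
        }
        String s = v.toString().replace("\\", "\\\\").replace("\"", "\\\"");
        return "\"" + s + "\"";
    }

    /** Convert row-oriented data (as a result set delivers it) into column vectors. */
    static String toDataFrame(List<String> columnNames, List<? extends List<?>> rows) {
        List<String> columns = new ArrayList<>();
        for (int col = 0; col < columnNames.size(); col++) {
            StringBuilder vector = new StringBuilder("c(");
            for (int row = 0; row < rows.size(); row++) {
                if (row > 0) {
                    vector.append(", ");
                }
                vector.append(toRValue(rows.get(row).get(col)));
            }
            vector.append(")");
            columns.add(columnNames.get(col) + "=" + vector);
        }
        // stringsAsFactors=FALSE keeps text columns as character vectors
        return "data.frame(" + String.join(", ", columns) + ", stringsAsFactors=FALSE)";
    }
}
```

With columns region and sales and the rows ("East", 10.5) and ("West", null), the result is data.frame(region=c("East", "West"), sales=c(10.5, NA), stringsAsFactors=FALSE). The transposition from rows to columns is the essential step, because an R data frame is a list of equal-length column vectors.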


JasperSoft Design Tips
Enable modular development and better runtime performance

• Place all R functionality within an R code file and provide a single function call for the Java client
  – JasperReport Scriptlet simply prepares parameters from the Jasper environment and calls an R function (e.g. create_chart) which generates a graph output file.
  – R programmer works in “R”, Jasper developer works in Java/Jasper

• When to utilize a prepared R data file vs dynamically creating an R Data Frame?
  – Still being studied…

• JasperReport is simply structured
  – Single Image Field that is populated with the generated image file


JasperForge

• We plan to sponsor a project on JasperForge which will contain:
  – Examples
  – How-to Instructions
  – Forums, etc.

• Goal is to have first iteration available in Feb 09


Thank You!

Web
• www.openbi.com

Phone
• Office: 312.863.8660
• Kevin’s Cell: 773-425-6010
• Dave’s Cell: 630-405-8404
• Steve’s Cell: 847-778-1145

Email
• [email protected]
• [email protected]
• [email protected]