Business Intelligence on a Budget: Open Source BI Paul O’Rorke


Post on 26-May-2018


Business Intelligence on a Budget:

Open Source BI

Paul O’Rorke

Goals

• provide background & motivation

• discuss business models & licenses

• survey open source BI

• compare open versus closed BI

• identify trends

• provide conclusion & suggest next step

Background & Motivation: Why OSBI?

• it’s free: low cost

• it’s free: provides greater control over the code

• more knowledge (e.g., can examine source)

• can adapt, include, and use

• can redistribute

Viva Software Libre!

Background & Motivation: Why Now?

• OSBI is at a tipping point:

• open source is an integral part of most companies’ software and well trusted now

• OSBI has reached a level of maturity comparable to many of the earlier open source successes

• OSBI is “mega-trend number one” of nine listed by Intelligent Enterprise for 2009

• Economic hard times will encourage or force many companies to save money

OS is achieving world domination: according to Gartner, 85% of enterprises use it and the remaining 15% plan to do so in the next year.

Talend: performance, usability, avoid lock-in, no licensing costs, source code access

OSBI Business Models

• give away free version, eliminate costly software licenses and sell (typically by subscription)...

• advanced features / pro & enterprise editions

• consulting, services, support

• maintenance

• access to hosting & platforms

Open Source Licenses

• Goal: provide a quick overview of OS and OSBI licenses

• What are the (major) different licenses?

• How are they different?

• What are the important issues?

I AM NOT A LAWYER

OS Licenses: Key Concepts

• Reciprocal:

• give freedom to licensee

• but also bind the licensee (usually with the goal of preserving and propagating freedom)

• Copyleft:

• strong: all code that is based on, adapts, or links with open source must be open

• weak: adaptations must be open but linked software need not be open

Richard Stallman’s Free Software Foundation’s “copyleft”

OS Licenses: +

• increase developers’ and licensees’ freedom

• to study, adapt, and redistribute

• flexible:

• allow modifications

• support purchases and subscriptions

Viva Software Libre!

OS Licenses: Examples

• “Restricted”: GNU (GPL)

• Examples: Emacs, Linux, & MySQL (GPLv2)

• “Less Restricted”: Mozilla (MPL), Eclipse (EPL)

• Examples: Eclipse (EPL), Firefox (MPL)

• “Free”: Apache, Berkeley (BSD)

• Examples: Apache, PostgreSQL (BSD)

Whether a license is less or more “restricted” depends on point of view: licensor versus licensee.

“Strong copyleft” / GPL: good if building something with other GPL software; good for one-off projects if the client doesn’t care whether the software is made open; good for world domination.

“Weak copyleft” / LGPL: good if you want to build a proprietary system on an open library.
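The copyleft categories above can be sketched as a small lookup table. This is an illustrative sketch only, not legal guidance. The GPL, LGPL, Apache, and BSD entries follow the slides; treating MPL and EPL as weak copyleft is an assumption based on their “less restricted” grouping. The names `Copyleft`, `License`, `must_open_adaptations`, and `must_open_linked_code` are hypothetical and introduced here for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Copyleft(Enum):
    STRONG = "strong"  # adaptations and linked code must be open
    WEAK = "weak"      # adaptations must be open, linked code need not be
    NONE = "none"      # no reciprocal obligation


@dataclass(frozen=True)
class License:
    name: str
    copyleft: Copyleft


# Category assignments follow the slides; MPL/EPL as WEAK is an assumption.
LICENSES = {
    "GPL": License("GNU General Public License", Copyleft.STRONG),
    "LGPL": License("GNU Lesser General Public License", Copyleft.WEAK),
    "MPL": License("Mozilla Public License", Copyleft.WEAK),
    "EPL": License("Eclipse Public License", Copyleft.WEAK),
    "Apache": License("Apache License", Copyleft.NONE),
    "BSD": License("BSD License", Copyleft.NONE),
}


def must_open_adaptations(license_id: str) -> bool:
    """True if changes to the licensed code itself must be released as open."""
    return LICENSES[license_id].copyleft is not Copyleft.NONE


def must_open_linked_code(license_id: str) -> bool:
    """True if software that merely links with the licensed code must be open."""
    return LICENSES[license_id].copyleft is Copyleft.STRONG


if __name__ == "__main__":
    for key in LICENSES:
        print(key, must_open_adaptations(key), must_open_linked_code(key))
```

A table like this is only a first filter when shortlisting components; the actual license text (and a lawyer) decides the real obligations.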

OS Licenses: -

• Many companies insist on or prefer:

• indemnification

• SLA

• security

• support

• the right to include OSS without having to open their own code

Goals

• provide background & motivation

• discuss business models & licenses

• survey open source BI

• compare open versus closed BI

• identify trends

• provide conclusions & suggest next steps

Survey OSBI

• reporting & dashboards

• ETL & integration

• databases and data-warehouses

• OLAP

• analysis languages & tools

• data mining

• suites

• CRM, sales automation

add or subtract?

Reporting & Dashboards

• Business Intelligence and Reporting Tools - BIRT (Actuate, Eclipse)

• JasperReports (JasperSoft)

• JReport (JInfonet)

• OpenI (originally Loyalty Matrix, now OpenI)

• Palo (Jedox AG)

• Pentaho Reporting (includes JFreeReport, Pentaho)

OpenI: “visualize data from OLAP, RDBMS, and data mining tools, and intuitively build and publish interactive reports, analyses, and dashboards.”

Most of the pure dashboard companies on the web appear to be very small and do consulting and custom design mostly for small companies.

Excluded:

DataVision: Java-based but uses JRuby for formulas. Seems too slow, small & not enterprise level.

RLIB: C based although it has bindings for Java, PHP, etc.

ETL & Integration

• Jitterbit, easy to use. SaaS and SOA oriented

• Snaplogic, really simple open source data integration (OSDI) for SaaS

• Web-based, including IDE

• Python

• Talend (also offered by JasperSoft)

• Eclipse-based IDE “open studio”

• generates Java or Perl

Snaplogic - good example: mashup of LinkedIn and Salesforce.

According to a Talend survey of 1000 enterprises reported by ComputerWorld UK in InfoWeek 2/6/2009, “31.2 percent of respondents use open source tools in combination with commercial applications for data integration...”

“The key drivers for using open source tools were ease of use (59 percent), performance (53.9 percent), and no vendor lock-in (42.5 percent), followed by licensing costs with only 42.1 percent respondents”

performance, usability, avoid lock-in, no licensing costs, source code access
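For readers new to the term, the extract-transform-load pattern that the tools on this slide automate can be sketched with the Python standard library alone. This is a minimal sketch under stated assumptions, not how Jitterbit, Snaplogic, or Talend work internally: the sample data, the `sales` table, and the function names `extract`, `transform`, `load`, and `run_etl` are all hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical source data with the usual problems: stray spaces,
# inconsistent case, and a missing value.
RAW_CSV = """account,region,amount
Acme, west ,1200.50
Globex,EAST,300
Initech,west,
"""


def extract(text):
    """Extract: parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: trim fields, normalize region case, drop rows with no amount."""
    cleaned = []
    for row in rows:
        amount = row["amount"].strip()
        if not amount:
            continue  # skip incomplete records
        cleaned.append(
            (row["account"].strip(), row["region"].strip().lower(), float(amount))
        )
    return cleaned


def load(rows, conn):
    """Load: write the cleaned rows into a target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales (account TEXT, region TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()


def run_etl(text):
    """Run the three steps against an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    load(transform(extract(text)), conn)
    return conn


if __name__ == "__main__":
    connection = run_etl(RAW_CSV)
    for record in connection.execute("SELECT * FROM sales ORDER BY account"):
        print(record)
```

The dedicated tools add what this sketch lacks: connectors for many sources, scheduling, error handling, and a visual designer.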

Databases & Data Warehouses

• Column-oriented databases:

• LucidDB (LucidEra)

• Infobright

• MonetDB
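The idea behind column orientation can be illustrated in a few lines. This is a conceptual sketch only and does not reflect how LucidDB, Infobright, or MonetDB are implemented; the sample rows and the helper names `to_columns`, `total_amount_row_store`, and `total_amount_column_store` are hypothetical.

```python
# Row-oriented layout: each record is stored together.
ROWS = [
    {"order_id": 1, "region": "west", "amount": 120.0},
    {"order_id": 2, "region": "east", "amount": 80.0},
    {"order_id": 3, "region": "west", "amount": 50.0},
]


def to_columns(rows):
    """Pivot row records into one list per column."""
    columns = {}
    for row in rows:
        for key, value in row.items():
            columns.setdefault(key, []).append(value)
    return columns


def total_amount_row_store(rows):
    """Aggregate over rows: every full record is visited to read one field."""
    return sum(row["amount"] for row in rows)


def total_amount_column_store(columns):
    """Aggregate over a column: only the 'amount' values are read."""
    return sum(columns["amount"])


if __name__ == "__main__":
    cols = to_columns(ROWS)
    print(cols["amount"])
    print(total_amount_row_store(ROWS), total_amount_column_store(cols))
```

Analytic queries typically aggregate a few columns over many rows, which is why reading one column at a time suits BI workloads.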

OLAP

• Julian Hyde’s Mondrian

• now part of Pentaho
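The core OLAP operation, rolling a measure up along chosen dimensions, can be sketched in plain Python. This is a hedged illustration of the concept, not Mondrian's API (Mondrian is queried with MDX); the fact table, the dimension names, and the `roll_up` function are hypothetical.

```python
from collections import defaultdict

# Hypothetical fact table: (year, region, product, units sold).
DIMENSIONS = ("year", "region", "product")
FACTS = [
    ("2008", "west", "reports", 100),
    ("2008", "east", "reports", 40),
    ("2009", "west", "reports", 70),
    ("2009", "west", "dashboards", 30),
]


def roll_up(facts, keep):
    """Sum the measure, grouping by the dimensions named in `keep`."""
    positions = [DIMENSIONS.index(name) for name in keep]
    totals = defaultdict(int)
    for fact in facts:
        key = tuple(fact[i] for i in positions)
        totals[key] += fact[-1]  # the last field is the measure
    return dict(totals)


if __name__ == "__main__":
    print(roll_up(FACTS, ("year",)))
    print(roll_up(FACTS, ("year", "region")))
    print(roll_up(FACTS, ()))  # grand total
```

An OLAP server does the same grouping at scale, usually with precomputed aggregates and a query language on top.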

Analysis Tools

• Query and Reporting Tools

• OLAP based

• Languages for custom analysis:

• R

• replacing S

• in use at Facebook, Google, Spotfire (TIBCO)

Data Mining

• Weka (now part of Pentaho)

• commonly used machine learning algorithms (e.g., classification rule & tree learning, clustering, Bayes nets)

• Java

• Mahout

• Java, on the Hadoop MapReduce platform

Weka rhymes with Mecca.

Trying to schedule someone from Mahout (Jeff Eastman) for later in the year.

Suites

• JasperSoft

• Palo

• Pentaho

• SpagoBI

ignoring Palo as it is relatively small: started with reporting and expanded to OLAP server

are there more?

Suites: JasperSoft

• Components:

• JasperReports - for Java developers

• JasperStudio (aka/fka iReport)

• JasperServer - query & reporting server for end user

• JasperAnalysis - includes OLAP data analysis

• JasperETL (Talend) - for DBAs & developers

Suites: Pentaho

• Components:

• Reporting, Data Integration (Kettle), OLAP Server (Mondrian), Data Mining (Weka)

• Platform...

• engine: core, security, & services

• repository, & UI foundation

Suites: SpagoBI

• by Italian IT company Engineering Ingegneria Informatica (6k employees)

• Components:

• JasperReports

• Mondrian

• Talend

Customer Relationship Mgmt. & Sales Automation

• SugarCRM (30M downloads, PHP)

• Splendid CRM (.Net)

• Concursive (backed by Intel Capital, Java)

• Hipergate (Java)

• Compiere (Java & Javascript)

369 Sourceforge CRM projects!!!

Top 5 as of December 2008 according to InsideCRM / Sourceforge.

Database support differs: for example, Compiere uses PL/SQL and Oracle. Some others (e.g., Hipergate) are database independent.

OSBI Licenses

• GPL

• Pentaho (Platform v2 - GPLv2)

• SugarCRM (Community ed. - GPLv3)

• LGPL: SpagoBI

• MPL: Jitterbit (JPL ~ MPL), OpenI (MPL1.1)

Pentaho says it chose GPLv2 rather than GPLv3 for its v2 platform to make it easier for others to embed other GPLv2 software, such as MySQL, with Pentaho.

Closed versus Open BI

• CRM: Salesforce.com

• Reporting: BO/SAP Crystal, TIBCO Spotfire

• ETL, Integration: Datastage, Informatica

• column oriented databases:

• Sybase IQ (first), Vertica

• IBM, Oracle, MS expected to follow

• Data Mining: SGI?, IBM?, Oracle, SAS, Fair Isaac

Trends

• Self service / Simplification

• Rich Internet Applications (RIAs)

• Expansion (to Suite; to Platform)

• Platforms

• Cloud Computing

Everyone is trying to simplify and move their apps from developers to end-users (e.g., Snaplogic)

Trend toward RIAs has been underway for at least six years. Spotfire is a good example.

Many OSBI vendors start with a single offering (e.g., for reporting) and expand out to cover more of the BI spectrum. On the CBI side, Salesforce started with CRM & sales force automation and is expanding out.

Pentaho is a good example of a well developed OSBI Platform while Salesforce’s force.com is a good example of a CBI platform.

Conclusion / Next Step

• OSBI is bigger and better and may be a good choice for your next project

• consider OSBI alternatives to custom development or closed source BI

References & Sources

• 2006, Open Source BI, Ventana Research

• 2008, “Fine But Not Fine-Tuned Yet” InformationWeek.com

• 2009, Nine BI Megatrends for 2009, Intelligent Enterprise, http://www.intelligententerprise.com/showArticle.jhtml?articleID=212700482

• Free Software Foundation: http://www.gnu.org/copyleft/

• Wikipedia: Copyleft, FSF, GPL, LGPL

Resources

• SDForum archives contain presentations by or on

• Actuate

• Jaspersoft

• LucidEra

• Mondrian

• Snaplogic

• SugarCRM

OSBI Companies

• Actuate

• Infobright

• JasperSoft

• Jedox AG

• JInfonet

• Jitterbit

• LucidEra

• Kickfire

• Pentaho

• Snaplogic

• SugarCRM

• Talend

Acknowledgments

• thanks to Sonja London for her contribution to the title

• thanks to Richard Taylor for contributing info about databases and licenses