big data for small businesses & startups

Post on 19-Jun-2015

285 Views

Category:

Marketing

4 Downloads

Preview:

Click to see full reader

DESCRIPTION

Big Data is not just for Big Businesses. In this slideshare we will cover how small businesses and startups can leverage Big Data to increase revenue. HPCC Systems lets you get started with only one machine and grow to exabytes. 1. Mining and understanding customers behavior from data outside the firewall and joining it with internal data to turn it into actionable marketing strategies. 2. Understanding your whole business with BI tools. Learn how Big Data help join data from different parts of your business to see the big picture.

TRANSCRIPT

HPCC Systems !Big Data for Small

Business

By Fujio Turner

@FujioTurner

LexisNexis is a provider of legal, tax, regulatory, news, business information, and analysis to legal, corporate, government,!

accounting and academic markets. !!

LexisNexis has been in business since 1977 with over 30,000 employees worldwide. 

What is HPCC Systems?Who is ?

LexisNexis Risk is the division of the LexisNexis which focuses on data, Big Data processing, linking and vertical expertise and supports HPCC Systems as an open source project under Apache 2.0 License.

3 Parts to Big Data

Complex Query

SpeedAmount

3 Parts to Big Data

Complex Query

SpeedAmount

+ Low CostLet focus on

Comparison

JAVA C++

Petabytes

1-80,000 Jobs/day

Since 2005

Exabytes

Non-Indexed 4X-13X

Since 2000

Indexed: 2K-3K Jobs/sec

? ? ? ? ? ?

Thor Roxie

Speed

Block Based File Based

BusinessDevelopmentCustomers1 20

Non-Indexed Full Data Set

http://hpccsystems.com/why-hpcc/benchmarks Low Cost

Map/Reduce

SQL w/ JOINS

GraphDB

Machine Learning

Simple to Complex Queries

ECL (Enterprise Control Language) C++ based query language

Complex

Your!Business Analytics

You know whats going on inside your business

Your!Business

What do customers think

before interacting with you?

What do customers think after interacting

with you?

Analytics

You know whats going on inside your business

Your!Business

Before After

Analytics

?Your!Competition

You know whats going on inside your business

Power of Relationships

Friends

Other

Your Customers

Not Your Customers

CustomersBasic Social Media

Who are they following?

Customers Following

Followers (50)

Followers (13M)

Followers (5.8M)

Basic Social Media

Customers Friend

John Legend

News Source

Lets focus on one person

@waterboy11

Customers Following

Followers (5.8M)

Basic Social Media

John Legend

!15 - 20% of all your customers

follow John Legend

Customers Following

Followers (5.8M)

Basic Social Media

John Legend!

What % of John Legends !followers ..

.. are friends of my customers?

.. have heard of my company?

.. behave like my customers?

.. purchasing triggers are the same as my customers?

How do I Query HPCC Systems?ECL (Enterprise Control Language) is a C++ based query language for use with HPCC Systems Big Data platform. ECLs syntax and format is very simple and easy to learn.!!

Note - ECL is very similar to Hadoop’s pig ,but!more expressive and feature rich.

! {'waterboy11',[{'nytimes'},{'jsonlover'},{'johnlegend'}]}

Parent Data Set!“Customer”

Child Data Set!“List of who my customer is following”

HPCC Systems does De-normalized Data

//Schema childrec := {string15 users}; parentrec := {String15 customers, dataset(childrec) following}; !!!//Data cAll:= dataset([ {'waterboy11',[{'nytimes'},{'jsonlover'},{'johnlegend'}]}, {'rocky-o',[{'paulie'}, {'mickey'}]}, {'coolguy60', [{'walter78'}, {'johnlegend'}]} ], parentrec); output(cAll); //Output

“USE DATABASE;” (inline data or file)

Schema of CustomersSchema of who my customer is following

Use above schema on this inline data

Results

//Schema childrec := {string15 users}; parentrec := {String15 customers, dataset(childrec) following}; !!!//Data cAll := dataset([ {'waterboy11',[{'nytimes'},{'jsonlover'},{'johnlegend'}]}, {'rocky-o',[{'paulie'}, {'mickey'}]}, {'coolguy60', [{'walter78'}, {'johnlegend'}]} ], parentrec); //Filter legend := cAll(EXISTS(following(users = 'johnlegend'))); output(legend); //Output

Which customers are following “John Legend”?

Which customers are following “John Legend”?

Results

//Schema childrec := {string15 users}; parentrec := {String15 customers, dataset(childrec) following}; !//Data cAll := dataset([ {'waterboy11',[{'nytimes'},{'jsonlover'},{'johnlegend'}]}, {'rocky-o',[{'paulie'}, {'mickey'}]}, {'coolguy60', [{'walter78'}, {'johnlegend'}]} ], parentrec); !follower1 := GROUP(SORT(cAll.following,users)); follower2 := TABLE(follower1 , {users , uC := COUNT(GROUP)},users); !//Output output(follower2); //Output

Who are my customers following the most?

Who are my customers following the most?

Results

Your Customers

Advanced Social Media

How many different types of customers do you have? How did red customers become my customers?

X = Price Y = Discount %

Machine Learning!!

Grouping customers by attributes

http://cdn.hpccsystems.com/pdf/machinelearning.pdf

Your Customers

Advanced Social Media

How many different types of customers do you have? How did red customers become my customers?

X = Price Y = Discount %

Machine Learning!!

Grouping customers by multiple attributes

X = Price Y = Discount % Z = Zip Code

IMPORT * FROM ML; IMPORT * FROM ML.Cluster; IMPORT * FROM ML.Types; //{gender,age,price} // gender 1 = male , 2 = female , 1-99 age , Price Ex. $7 x2 := DATASET([ {1, 18, 11}, {1, 35, 30}, {2, 50, 12}, {2, 36, 7}, {1, 23, 67}, {2, 34, 44}, {1, 29, 70}, {1, 18, 20}, {1, 45, 90}, {2, 21, 95}, {2, 34, 44}, {1, 51, 34}, {2, 26, 9}, {2, 32, 76}],NumericField); c := DATASET([ {1, 20, 20}, {1, 30, 60}, {1, 50, 50}, {2, 20, 25}, {2, 35, 37}, {2, 50, 60}],NumericField); x3 := Kmeans(x2,c); OUTPUT(x3); http://hpccsystems.com/ml/ml-getting-started

Machine Learning!Built-in

Your Customers Not Your Customers

Advanced Social Media

How many different types of customers do you have? How did red customers become my customers?

How many non-customers are red ?

? ?

X = Price Y = Discount %

X = Price Y = Discount % Z = Zip Code

Combine!+

//Schema childrec := {string15 users}; parentrec := {String15 customers, dataset(childrec) following}; parentrec2 := {String15 noncustomers, dataset(childrec) following}; //Data cAll := dataset([ {'waterboy11',[{'nytimes'}, {'jsonlover'},{'johnlegend'}]}, {'rocky-o',[{'paulie'}, {'mickey'}]}, {'coolguy60', [{'walter78'}, {'johnlegend'}]} ], parentrec); ncAll := dataset([ {‘fish99',[{'cake_bake'}]}, {'johnlegend',[{'ralphy1'}, {‘fixxyman’},…….]}, {'nytimes', [{‘sub_lime9'}, {‘johnlegend’},…..]} ], parentrec2); j1:= JOIN(cAll, ncAll,LEFT.users = RIGHT.noncustomers, JoinThem(LEFT,RIGHT)); Output(j1); //Output

Combine Customers and Non-Customers

Your Customers Not Your Customers

Advanced Social Media

? ?

Correct Sales Pipe Line

Better Predict Conversion

Plugin for HPCC Systems

http://hpccsystems.com/products-and-services/products/plugins/r-integration

Plug n Play BI Tools

Social Meda CRMAnalytic3rd Party API

http://hpccsystems.com/products-and-services/products/plugins/ecl-pentaho-data-integration

http://www.slideshare.net/Engauge/pinterest-is-social-medias-newest-sweetheart-the-real-deal-11978662

+ HPCC Systems

Users

Use Case - Pinterest

http://www.slideshare.net/FujioTurner/

For More HPCC!“How To’s”!

Go to SlideShare

http://www.youtube.com/watch?v=8SV43DCUqJg

Watch how to install HPCC Systems

in 5 Minutes

Download HPCC Systems Open Source

Community Edition

or

Source Codehttps://github.com/hpcc-systems

http://hpccsystems.com/download/

top related