How Can Startups Leverage Big Data? Trudging Through Myth To Discover Real Value

Upload: rackspace

Post on 07-Jul-2015

Category:

Data & Analytics


DESCRIPTION

Data is being generated at a feverish pace, and forward-thinking companies are integrating big data and analytics as part of their core strategy from day one. However, it is often hard to sift through the hype around big data, and many companies start with only a small subset of data. Can smaller companies benefit from big data efforts? We will discuss several use cases and examples of how startups are using data to optimize their operations, connect with their users, and expand their market.

TRANSCRIPT

Page 1: How Startups can leverage big data?

How Can Startups Leverage Big Data? Trudging Through Myth To Discover Real Value

Page 2: How Startups can leverage big data?

What is Big Data?

• Mostly Unstructured Data

• Client Data

• Customer Data

• Social Data

• Driving towards insight

www.rackspace.com

Page 3: How Startups can leverage big data?

“Big Data is any dataset not suited to be processed by traditional legacy technology.”

Page 4: How Startups can leverage big data?

The Three V’s

V3C: Volume, Velocity, Variety, Complexity

• Mining social data for sentiment

• Analyzing web clickstreams

• Analyzing log data for security breaches

• Telemetry from sensors and machines

• eCommerce predictive analytics

Page 6: How Startups can leverage big data?

Evolution of Data

[Figure: evolution of data over time]

Page 7: How Startups can leverage big data?

Big Data is Here to Stay

• Big Data is now much more than hype – real customers with real use cases are adopting it daily

• A recent survey found that business leaders expected the deployment of Hadoop to result in a 3-year benefit ranging from $5M to $50M+

• Close to 100% of business leaders have already deployed or plan to deploy Apache™ Hadoop®

“Enterprises are showing increasing interest in the value provided by the large-scale data processing that Hadoop and Spark can provide, but can be wary of the upfront cost and complexity of setting up a cluster to prove that value. Managed services such as [OnMetal™ Cloud Big Data Platform] enable enterprises to focus their energies on generating business insights rather than configuring and managing infrastructure.”

Matt Aslett

451 Research Director, Data Platforms and Analytics

Page 8: How Startups can leverage big data?

Why leverage Big Data?

• To learn more about your customers

• To optimize your business processes

• To become a more targeted marketer

• To interact with users and customers in real time

• To add revenue and services

Page 9: How Startups can leverage big data?

What Is the Cost of Lacking a Big Data Strategy?

• Today every company can be a data company

• Successful companies will be data companies

• Under Armour isn’t just a fitness company – they’re a data company

Page 10: How Startups can leverage big data?

Hadoop Has Emerged As A Leader In Distributed Data Sets

• Open Source

• Able to process petabytes of data quickly

• Based on ideas published by Google (GFS, MapReduce); implemented at scale at Yahoo

• Handles unstructured data very well

• One of the fastest growing ecosystems

Page 11: How Startups can leverage big data?

Fundamentals of Hadoop v1

Layers shown: Core Services, Data Services, Operational Services

• HDFS: Distributed file system
• HBase: Distributed, scalable, non-relational database
• HCatalog: Metadata and table management system
• Pig: Data flow scripting language
• Hive: DW analysis layer through HiveQL (SQL-like) queries
• MapReduce: Data processing framework
• Ambari: Installation, monitoring, administration
• Oozie: Workflow and job scheduling
• ZooKeeper: Configuration, sync and naming registry
• Falcon: Data pipeline framework
• Knox: Auth and access
• Flume: Log data aggregation and movement
• Sqoop: Bulk data transfer from and to relational DB
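To make the "data processing framework" idea concrete, here is a minimal sketch of the map, shuffle, and reduce steps as a word count in plain Python. It runs in a single process and does not use Hadoop itself; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are illustrative assumptions, not part of any Hadoop API.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: collapse the values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over an iterable of lines."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data", "Big insight"]))
```

In a real cluster the map and reduce steps would run in parallel on many machines over data stored in HDFS; the sketch only shows the shape of the computation.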

Page 12: How Startups can leverage big data?

Hadoop is Hard

Only 3 in 10 businesses that plan to implement Hadoop have done so

• Biggest impediments include:

– Insufficient skills in-house to design and deploy

– Designing and deploying takes too long

– High cost of physical infrastructure

Page 13: How Startups can leverage big data?

Hadoop is Changing

• Original focus on batch processing

• Streaming and interactive use cases emerging

• Shift from jobs that take hours to jobs that take seconds

• Impala, Spark, and Presto are emerging tools

Page 14: How Startups can leverage big data?

But what are these companies doing with Big Data?

Gaining Insights!!!

Page 15: How Startups can leverage big data?

What are Companies Doing with Hadoop?

Use cases by vertical (use case: data type)

Financial Services
• New Account Risk Screens: Text, Server Logs
• Fraud Prevention: Server Logs
• Trading Risk: Server Logs
• Maximize Deposit Spread: Text, Server Logs
• Insurance Underwriting: Geographic, Sensor, Text
• Accelerate Loan Processing: Text

Telecom
• Call Detail Records (CDRs): Machine, Geographic
• Infrastructure Investment: Machine, Server Logs
• Next Product to Buy (NPTB): Clickstream
• Real-time Bandwidth Allocation: Server Logs, Text, Sentiment
• New Product Development: Machine, Geographic

Retail
• 360 View of the Customer: Clickstream, Text
• Analyze Brand Sentiment: Sentiment
• Localized, Personalized Promotions: Geographic
• Website Optimization: Clickstream
• Optimal Store Layout: Sensor

Manufacturing
• Supply Chain and Logistics: Sensor
• Assembly Line Quality Assurance: Sensor
• Proactive Maintenance: Machine
• Crowdsourced Quality Assurance: Sentiment

Page 16: How Startups can leverage big data?

Application Underpinning

• Mobile

– Enterprises consider support for mobility and productivity enhancement to mobile workers as their top-priority new application category, according to a recent survey by CIMI Corp. That means most companies that have adopted, or are adopting, Hadoop will likely have to integrate the framework with mobile applications.

• Data Aggregation

– "The two big use cases we're seeing for Impala are aggregating data in Hadoop to present analytic dashboards and improving data-discovery applications by providing faster performance than Hive," said Alex Gutow, Cloudera's product marketing manager.

• Dashboarding

– Users are increasingly choosing Hadoop as the underlying technology to power interactive dashboarding capability.

• Internet of Things

– As wearables and other data-generating devices become commonplace, the backend of your application needs to be built to handle the velocity and volume of data these devices produce.

People are building net-new applications with Hadoop as their database

Page 17: How Startups can leverage big data?

Clickstream Analysis

Your home page looks great. But how do you move customers on to bigger things—like submitting a form or completing a purchase? Get more granular with customer segmentation. Hadoop makes it easier to analyze, visualize and ultimately change how visitors behave on your website.

A clickstream is a series of page requests. Every page requested generates a signal. These signals can be graphically represented for clickstream reporting. The main point of clickstream tracking is to give webmasters insight into what visitors on their site are doing.

• Clickpath

– The study of human clicks on a website

• Tracking Cookies

– Tool used to understand and track online activity

• Data Mining

– Collecting data from websites and online properties

Understand how your users are behaving on your website and optimize your experience
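As a rough illustration of clickpath analysis, the sketch below rebuilds each session's ordered page requests and counts how many sessions reach each step of a funnel. The event layout `(session_id, timestamp, page)` and the page names are assumptions made for the example, not a real log format.

```python
from collections import defaultdict


def click_paths(events):
    """Group page requests by session and order them by timestamp.

    `events` is an iterable of (session_id, timestamp, page) tuples
    (a hypothetical layout chosen for this sketch).
    """
    by_session = defaultdict(list)
    for session_id, timestamp, page in events:
        by_session[session_id].append((timestamp, page))
    return {
        session: [page for _, page in sorted(hits)]
        for session, hits in by_session.items()
    }


def funnel(paths, steps):
    """Count the sessions that reached each funnel step, in order."""
    counts = [0] * len(steps)
    for pages in paths.values():
        position = 0  # index of the next funnel step this session must hit
        for page in pages:
            if position < len(steps) and page == steps[position]:
                counts[position] += 1
                position += 1
    return dict(zip(steps, counts))


if __name__ == "__main__":
    events = [
        ("a", 1, "/home"), ("a", 2, "/form"), ("a", 3, "/purchase"),
        ("b", 1, "/home"), ("b", 2, "/about"),
        ("c", 2, "/form"), ("c", 1, "/home"),
    ]
    print(funnel(click_paths(events), ["/home", "/form", "/purchase"]))
```

The drop between consecutive steps shows where visitors leave before submitting a form or completing a purchase.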

Page 18: How Startups can leverage big data?

Sentiment Analysis

Your customers are talking. With Hadoop, you can mine Twitter, Facebook and other social media conversations for sentiment data about you and your competition, and use it to make targeted, real-time decisions that increase market share.

Sentiment analysis aims to determine the attitude of a speaker or a writer with respect to some topic or the overall contextual polarity of a document.

• Social Media Feeds

– Many companies are now capturing entire Twitter and Facebook feeds to analyze.

• Data Mining

– Users are searching the web for comments, blogs, and whitepapers that can point to overall sentiment

• E-Communities

– Forums, user groups, Heroku

Find out what your users are saying about you. Are they happy? Does your product make them a promoter?
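One very simple way to estimate polarity is to count positive and negative words. The sketch below assumes a tiny hand-written word list (`POSITIVE`, `NEGATIVE`) purely for illustration; production systems typically use much larger lexicons or trained models.

```python
import re

# Hypothetical mini-lexicon, chosen only for this example.
POSITIVE = {"love", "great", "happy", "fast", "recommend"}
NEGATIVE = {"hate", "slow", "broken", "angry", "terrible"}


def polarity(text):
    """Classify a post as positive, negative, or neutral by word counts."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum((word in POSITIVE) - (word in NEGATIVE) for word in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def summarize(posts):
    """Tally how many posts fall into each polarity bucket."""
    summary = {"positive": 0, "negative": 0, "neutral": 0}
    for post in posts:
        summary[polarity(post)] += 1
    return summary


if __name__ == "__main__":
    posts = [
        "I love this product, great support!",
        "Checkout is slow and broken",
        "Just signed up",
    ]
    print(summarize(posts))
```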

Page 19: How Startups can leverage big data?

Machine Learning

Your machines know things. From out in the field to the assembly line floor—machines stream low-cost, always-on data. Hadoop makes it easier for you to store and refine that data and identify meaningful patterns, providing you with the insight to make proactive business decisions.

Machine Learning is a scientific discipline that deals with the construction and study of algorithms that can learn from data. Such algorithms operate by building a model based on inputs and using that to make predictions or decisions, rather than following only explicitly programmed instructions.

• Pattern Recognition

– Users are building clusters to detect patterns and identify anomalies in data that these devices are generating

• Decision Tree

– Allows the system to take action and make choices based on the data

• Predictive Modeling

– Aims to predict the most common mistakes and errors so they can be handled automatically as part of a preventative model

Interactive devices are now streamlining things like maintenance and troubleshooting
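A minimal sketch of anomaly detection on machine telemetry, assuming a simple z-score rule: a reading is flagged when it sits more than `threshold` sample standard deviations from the mean. The threshold of 2.5 and the sample readings are assumptions for the example; this is a basic statistical stand-in, not a trained model.

```python
from statistics import mean, stdev


def find_anomalies(readings, threshold=2.5):
    """Return (index, value) pairs whose z-score exceeds the threshold."""
    if len(readings) < 2:
        return []  # a standard deviation needs at least two points
    mu = mean(readings)
    sigma = stdev(readings)
    if sigma == 0:
        return []  # all readings identical, nothing stands out
    return [
        (index, value)
        for index, value in enumerate(readings)
        if abs(value - mu) / sigma > threshold
    ]


if __name__ == "__main__":
    temperatures = [20.1, 19.9, 20.0, 20.2, 19.8, 20.0, 20.1, 19.9, 20.0, 35.0]
    print(find_anomalies(temperatures))
```

With small samples a single outlier inflates the standard deviation, so the threshold may need tuning per device.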

Page 20: How Startups can leverage big data?

Fraud Detection

Fraud is a billion-dollar business and it is increasing every year. The PwC global economic crime survey of 2009 suggests that close to 30% of companies worldwide have reported being victims of fraud in the past year.

Fraud involves one or more persons who intentionally act secretly to deprive another of something of value, for their own benefit. Fraud is as old as humanity itself and can take an unlimited variety of different forms. However, in recent years, the development of new technologies has also provided further ways in which criminals may commit fraud.

• Rules-Based Detection

– Even though internet hackers have become better at tricking online systems, they still exhibit very calculated behavior, which fixed rules can catch.

• Machine Learning

– The aggregation of data points can help you collect more info about the potential sale and detect if it might be fraud.

• User Tagging and Tracing

– Once users are flagged as fraudulent, their repeated attempts can be prevented.

Users are detecting fraudulent online behavior and rejecting those users before they commit an offense
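The rules-based and tagging ideas above can be sketched together as follows. The `Order` fields, the three rules, and the two-hit rejection threshold are all hypothetical choices made for illustration; real rule sets are tuned to each business.

```python
from dataclasses import dataclass, field


@dataclass
class Order:
    user_id: str
    amount: float
    billing_country: str
    shipping_country: str
    attempts_last_hour: int


# Hypothetical rules: each maps a name to a check on one order.
RULES = {
    "high_amount": lambda o: o.amount > 1000,
    "country_mismatch": lambda o: o.billing_country != o.shipping_country,
    "rapid_retries": lambda o: o.attempts_last_hour > 5,
}


@dataclass
class FraudScreen:
    flagged_users: set = field(default_factory=set)
    min_hits: int = 2  # number of rules that must fire before rejecting

    def evaluate(self, order):
        """Return (decision, reasons) and remember users that were rejected."""
        if order.user_id in self.flagged_users:
            return "reject", ["previously_flagged"]
        hits = [name for name, rule in RULES.items() if rule(order)]
        if len(hits) >= self.min_hits:
            self.flagged_users.add(order.user_id)
            return "reject", hits
        return "accept", hits


if __name__ == "__main__":
    screen = FraudScreen()
    print(screen.evaluate(Order("u1", 50.0, "US", "US", 1)))
    print(screen.evaluate(Order("u2", 5000.0, "US", "BR", 1)))
    print(screen.evaluate(Order("u2", 10.0, "US", "US", 0)))
```

Once a user is flagged, later attempts are rejected immediately, which mirrors the tagging and tracing step.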

Page 21: How Startups can leverage big data?

Server Log Data

Security breaches happen. And when they do, your server logs may be your best line of defense. Hadoop takes server-log analysis to the next level by speeding and improving security forensics and providing a low cost platform to show compliance.

Server logs are generally small files that track user information inside a confined environment; they are often used to meet compliance requirements or troubleshoot an incident.

• Scrub Data for Forensics

– If a security incident occurs, it is important to remediate fast

• Identify Anomalies

– Anti-patterns are often the first sign

• Discover Trends

– Some types of errors might become common; learn to identify them

• Actively Automate to Solve Issues with Log Files

– Many of these errors can be proactively eliminated through the use of automation.

Aggregate server logs to find trends and anomalies in your security records
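As a small example of aggregating logs to spot anomalies, the sketch below counts failed logins per IP address and reports the addresses that exceed a limit. The log line layout (`timestamp ip EVENT user=name`) and the limit of three failures are assumptions invented for the example.

```python
import re
from collections import Counter

# Hypothetical log line layout: "<timestamp> <ip> <EVENT> user=<name>"
LINE = re.compile(
    r"^(?P<ts>\S+) (?P<ip>\S+) (?P<event>LOGIN_OK|LOGIN_FAILED) user=(?P<user>\S+)$"
)


def failed_logins_by_ip(lines):
    """Count LOGIN_FAILED events per source IP, skipping unparseable lines."""
    counts = Counter()
    for line in lines:
        match = LINE.match(line.strip())
        if match and match.group("event") == "LOGIN_FAILED":
            counts[match.group("ip")] += 1
    return counts


def suspicious_ips(lines, max_failures=3):
    """Return the IPs with more than `max_failures` failed logins, sorted."""
    counts = failed_logins_by_ip(lines)
    return sorted(ip for ip, n in counts.items() if n > max_failures)


if __name__ == "__main__":
    logs = [
        "2015-07-07T10:00:01 10.0.0.9 LOGIN_FAILED user=admin",
        "2015-07-07T10:00:02 10.0.0.9 LOGIN_FAILED user=admin",
        "2015-07-07T10:00:03 10.0.0.5 LOGIN_FAILED user=alice",
        "2015-07-07T10:00:04 10.0.0.9 LOGIN_FAILED user=root",
        "2015-07-07T10:00:05 10.0.0.5 LOGIN_OK user=alice",
        "2015-07-07T10:00:06 10.0.0.9 LOGIN_FAILED user=root",
    ]
    print(suspicious_ips(logs))
```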

Page 22: How Startups can leverage big data?

360 View of Customer – Dashboards and Analytics

Whenever a customer interacts with an organization, it is vital that the richness of information available on that customer informs and guides the processes that will help to maximize their experience, while simultaneously making the interaction as effective and efficient as possible. This includes everything from avoiding repetition or rekeying of information, to viewing customer history, establishing context and initiating desired actions.

A total 360 view often contains 3 views:

• The Past

– Understanding how your users acted in the past lets you understand who they are and serve them relevant content and products

• The Present

– Where are users coming from? What is their experience on your site right now? Do they need help?

• The Future

– Did they buy? Can we serve them more information to help their choice? Can we market to them better?

Create in-depth personas for your customers based on how they are actually behaving.

Page 23: How Startups can leverage big data?

What’s Next? Interactive Processing!

What if, instead of reacting to behavior, we could engage virtually with the user to inhibit behavior?

This is called interactive processing and it takes input from humans and reacts based on patterns and algorithms.

The quicker we can serve up this interaction to the user, the better equipped we are to inhibit their behavior!

Interact with customers in real-time offering suggestions and inhibiting behavior

[Figure: Input data → Process → Output data. Source: Teach-ICT.com]
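The input, process, output loop can be sketched as a function that maps each incoming user event to an immediate response. The event types (`cart_idle`, `search`), their fields, and the response messages are hypothetical; they only illustrate reacting to a pattern as the event arrives.

```python
def process(event):
    """Map one incoming user event to a response message, or None."""
    if event["type"] == "cart_idle" and event["seconds"] >= 60:
        return "Need help checking out? Chat with us."
    if event["type"] == "search" and event["results"] == 0:
        return "No matches found. Try a broader search term."
    return None


def interact(events):
    """Yield (user, response) for each event that triggers a response."""
    for event in events:
        response = process(event)
        if response is not None:
            yield event["user"], response


if __name__ == "__main__":
    stream = [
        {"user": "u1", "type": "cart_idle", "seconds": 75},
        {"user": "u2", "type": "search", "results": 12},
        {"user": "u3", "type": "search", "results": 0},
    ]
    for user, message in interact(stream):
        print(user, message)
```

Because `interact` is a generator, each response is produced as soon as its event is read, rather than after the whole batch has been processed.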

Page 24: How Startups can leverage big data?

Apache Spark

• Introducing support of Apache Spark™

• Apache Spark enables enterprises to combine the breadth of structured and unstructured data with the speed of in-memory processing to build streaming, machine learning, and graph-optimized applications that allow businesses to take action at the speed of insight.

Page 25: How Startups can leverage big data?

New Use Cases

• Deeper Integration with SQL Workloads

• Streaming Applications

• Machine Learning

• Iterative Processing

• Real-time Graphical Dashboards

Page 26: How Startups can leverage big data?

Does the delivery method matter?

YES

Page 27: How Startups can leverage big data?

Choose The Best Deployment Model

• Public Cloud

• Managed Cloud

• Private Cloud

• Your Private Cloud (on Premise)

Page 28: How Startups can leverage big data?

Page 29: How Startups can leverage big data?

Advantages of storing data in the cloud:

• Portability between providers

• Utility Pricing

• Minimal planning needed

• Scale to meet the exact demands

• Integration with data platforms

Page 30: How Startups can leverage big data?

Advantages of Dedicated Hosting/On-Premise

• Dedicated Hosting

– No Capex Investment

– Choose new hardware and software versioning easily

– Rely on extended support personnel

– Increased security options

– Concurrent and predictable performance

• On-Premise

– Control Data Access

– Integrate with core mainframe and systems

– Build your own IP

– Control every aspect of design and operation

Page 31: How Startups can leverage big data?

The Trade Off...

Custom Built

Consistent

Available

Performant

Purpose Built

Elastic

Flexible

On-Demand

Page 32: How Startups can leverage big data?

OnMetal Lets You Scale Like the Internet Giants

Bare metal servers:

• API-driven

• Instantly Available

• Highly Specialized

• No Hypervisor

“Rackspace Cloud, because of its single-tenant OnMetal line, is the only place on Earth where you can enjoy Facebook/Google-style infrastructure rented by the hour.”

Ev Kontsevoy

Director, Product, Rackspace

Page 33: How Startups can leverage big data?

Benefits of Outsourced Hosting

Deliver resources fast

Scale as you grow

Offload management responsibilities

Optimize around specified hardware

Page 34: How Startups can leverage big data?

The Level of Management You Need

Only you can decide what model is best for you!

• DIY

• Platform

• Managed Service

• Turnkey Service

Page 35: How Startups can leverage big data?

Data as a Service: more time building, less time managing databases

• For some businesses, database or infrastructure management IS core to the business

• For most software-based businesses, database or infrastructure management represents time and resources not spent building the application

• You must answer for yourself: are you in the business of managing infrastructure, or in the business of [your market here]?

[Figure: four database service models (1 Do-it-yourself database, 2 Provisioned database, 3 Automated database, 4 Data as a Service), showing for each model which tasks are performed by the developer and which are performed for the developer. Tasks shown: HW selection, Installation, Patch, Upgrade, Backup/Restore, Monitoring, Replication, Sharding, Scaling, Performance, Availability, Analytics, Optimization, Proactive tasks, Complex admin, App-specific data mgmt. The more tasks performed for the developer, the more time can be spent building the app.]

Page 36: How Startups can leverage big data?

Page 37: How Startups can leverage big data?

Page 38: How Startups can leverage big data?

Page 39: How Startups can leverage big data?

Rackspace Offerings for the Data Tier

Three offering areas:
• Infrastructure for Data
• Managed Offerings of Most Popular Big Data, SQL, & NoSQL Databases
• Managed Database Services for Production Apps

Infrastructure options:
• Cloud IaaS: Get started fast
• Dedicated Hosting: Predictable costs & performance
• OnMetal: Cloud Elasticity & Dedicated Performance

Managed database highlights:
• Automatic DBA: Sharding, Backup, & HA
• Entire Stack Optimized on Bare Metal
• Supported 24x7x365 by experts
• More than MongoDB…

DBA Services:
• Architecture & Design
• Tuning & Monitoring
• 24 x 7 x 365 Support
• Cost Effective

Page 40: How Startups can leverage big data?

What’s Next?

1. Sign up for a free trial

2. Want to know more?

– Read my blog and check out the articles

www.baremetalbigdata.com

Page 41: How Startups can leverage big data?

Questions?

Page 42: How Startups can leverage big data?

THANK YOU

RACKSPACE® | 1 FANATICAL PLACE, CITY OF WINDCREST | SAN ANTONIO, TX 78218

US SALES: 1-800-961-2888 | US SUPPORT: 1-800-961-4454 | WWW.RACKSPACE.COM

© RACKSPACE LTD. | RACKSPACE® AND FANATICAL SUPPORT® ARE SERVICE MARKS OF RACKSPACE US, INC. REGISTERED IN THE UNITED STATES AND OTHER COUNTRIES. | WWW.RACKSPACE.COM