why, how, when and when not of big data for startups

@yourFriendDhruv For Startup Saturday : Jan 2016 Big Data for Startups An introductory session about What, Why, When & When Not of Big Data for Startups Yes, You may not like it; but its really my twitter id Dhruv Gohil @yourFriendDhruv

Upload: dhruv-gohil

Post on 06-Jan-2017


Category:

Technology



TRANSCRIPT

@yourFriendDhruv

For Startup Saturday : Jan 2016

Big Data for Startups

– An introductory session about What, Why, When & When Not of Big Data for Startups

Yes, you may not like it, but it's really my Twitter ID

Dhruv Gohil @yourFriendDhruv

Welcome!

Why do you care to hear my opinion?

What is this “big data”?

Why should “startups” care about it?

When to “do big data”?

When NOT to “do big data”?


Seems too serious?

So, let's change the font!

Now, this is much better!


OK... Why do you care to hear this from me?

Meet me after the session, to compare favorites


OK... So what questions will I try to answer?

Big Data is not only ‘big’.
Why does a startup need 'Big data'?
What is 'Big data' NOT?
Fear of Big data?
Kick it off!
Big Data for “small startups”?


Let me tell you a story..

http://en.wikipedia.org/wiki/Information_Management_System


If you were thinking about RDBMS just now... then

Everything you have been taught in academia about databases is ALL WRONG.

http://slideshot.epfl.ch/play/suri_stonebraker


Big Data is...

http://www.ibmbigdatahub.com/infographic/four-vs-big-data


Big Data is not only ‘big’

Volume, Velocity, Variety
GB/TB vs PB/EB
Centralized vs Distributed
Structured vs Semi-Structured/Unstructured
Data Model vs Schema
Known relationships vs Flexible associations


What 'Big data' is NOT?

Hadoop exists because of Big data; Big data does not exist because of Hadoop!


What 'Big data' is NOT?

Applying for funding here?

Anything less than Hadoop is as good as an insult!


What 'Big data' is NOT?

Why do Hadoop and other technologies always come to mind with big data?
What else should we know?
Tools vs Methodologies
Being too futuristic vs. being practical/economical


Big Data in your startup

The cost of tools/software decreases, but the cost of knowledge increases

Being agile is the only way to deal with competition

Are you working with...
Social networking and media
Mobile devices
Internet transactions
Networked devices and sensors


Big Data in your product/service

Change your thinking to the perspective of access vs. storage
Design based on when/where data is used vs. when/where data is produced
Use redundancy, even at the expense of storage cost
Understand NoSQL = Not Only SQL
Streams
In-memory analytics
Massively parallel processing (data crunching)
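To make "design for where data is used" and "use redundancy" concrete, here is a minimal sketch in Python. It is only an illustration under assumptions: the `followers` and `timelines` stores and the `follow`, `publish` and `read_timeline` helpers are hypothetical names, not something from the talk. Each post is copied into every follower's timeline at write time, so a read is a single lookup.

```python
from collections import defaultdict

# Hypothetical in-memory stores; a real system would keep these in a database.
followers = defaultdict(set)   # author -> readers who follow that author
timelines = defaultdict(list)  # reader -> posts, stored ready to display


def follow(reader, author):
    """Register that `reader` wants to see posts from `author`."""
    followers[author].add(reader)


def publish(author, text):
    """Write path: store one redundant copy of the post per follower."""
    post = {"author": author, "text": text}
    for reader in followers[author]:
        timelines[reader].append(post)


def read_timeline(reader):
    """Read path: one lookup, no joins or scans at read time."""
    return timelines[reader]


follow("asha", "dhruv")
follow("ravi", "dhruv")
publish("dhruv", "Big data is not only big")
print(read_timeline("asha"))
```

The trade-off is the one named on the slide: more storage and more work on every write, in exchange for cheap reads.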


Big Data in your startup

Random research says... 99% of the clients of Big Data startups ended up with fewer paying customers than you have fingers.

A startup hits business scalability much, much earlier than technical scalability.


Big Data for your clients

Business first, technology second

Current reality for client projects:
Use big data tools which work at small scale :-)
Design with the domain in mind, not the database the client suggests
Always design with read optimization in mind (the golden rule)


Big Data project for small data startups

If you can do it in PostgreSQL, then do it in PostgreSQL (the blue elephant rule)


For tech-centric startups - The CAP theorem

Read a lot about the design of a database before using any non-traditional database. Or read good negative posts to know when NOT to use it.

e.g. : http://www.sarahmei.com/blog/2013/11/11/why-you-should-never-use-mongodb/


And Now... Quick Tips

Why Big Data?
Data == VALUE / MONEY $$$!
It's a buzzword, but ride on it like you mean it.
Your competitors do it, or at least claim to do it.
Think of your growth/exit strategy, again!

Yes, I have never owned or worked at a startup. Still advising you!


And Now... Quick Tips

When to actually do Big Data?
When the purpose of the data your startup has should change == to PIVOT

Do it for “Unfair advantage”, not for UVP (http://leanstack.com)

See, I did it again.


And Now... Quick Tips

How to do Big Data?

Big Data Storage
Use Big Data patterns, but don't use Big Data tools/technologies (yet)
Fact/event-based system design
CQRS (Command Query Responsibility Segregation)
An easy RDBMS, but with NON-relational design

Big Data Analytics
Until you hit 1K customers, use analytics-as-a-service
IBM Watson
Prediction.io
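The storage patterns on this slide (fact/event-based design, CQRS, a plain RDBMS used non-relationally) can be sketched roughly as follows. This is an assumption-laden illustration, not the speaker's implementation: it uses Python's built-in `sqlite3` as a stand-in for any ordinary RDBMS, and the `events` and `order_totals` tables and the `record`, `apply_event`, `total_for` and `rebuild` helpers are hypothetical names.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
# Write side: an append-only log of facts; rows are never updated or deleted.
conn.execute(
    "CREATE TABLE events (id INTEGER PRIMARY KEY AUTOINCREMENT, "
    "kind TEXT NOT NULL, payload TEXT NOT NULL)"
)
# Read side: a flat, query-ready table derived from the events.
conn.execute(
    "CREATE TABLE order_totals (customer TEXT PRIMARY KEY, total REAL NOT NULL)"
)


def apply_event(kind, payload):
    """Projection: fold one event into the read model."""
    sign = {"order_placed": 1, "order_refunded": -1}.get(kind)
    if sign is None:
        return  # event kinds the read model does not care about
    customer = payload["customer"]
    conn.execute(
        "INSERT OR IGNORE INTO order_totals (customer, total) VALUES (?, 0)",
        (customer,),
    )
    conn.execute(
        "UPDATE order_totals SET total = total + ? WHERE customer = ?",
        (sign * payload["amount"], customer),
    )


def record(kind, **payload):
    """Command: append a fact to the log, then update the read model."""
    conn.execute(
        "INSERT INTO events (kind, payload) VALUES (?, ?)",
        (kind, json.dumps(payload)),
    )
    apply_event(kind, payload)


def total_for(customer):
    """Query: read only from the read model, never from the event log."""
    row = conn.execute(
        "SELECT total FROM order_totals WHERE customer = ?", (customer,)
    ).fetchone()
    return row[0] if row else 0.0


def rebuild():
    """Replay the whole log to rebuild the read model from scratch."""
    conn.execute("DELETE FROM order_totals")
    rows = conn.execute("SELECT kind, payload FROM events ORDER BY id").fetchall()
    for kind, payload in rows:
        apply_event(kind, json.loads(payload))


record("order_placed", customer="acme", amount=120.0)
record("order_refunded", customer="acme", amount=20.0)
print(total_for("acme"))
```

Because the log keeps every fact, a new read model can be added later and filled by replaying the events, which is what makes a later move to dedicated big data tools less painful.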

Even more! I am liking it, though I'm not sure about you.


And Now... Quick Tips

How NOT to do Big Data?
If you are not selling your startup in the NEXT 6 months
Don't start with technology; start with a business case on NON-BIGDATA-TECHNOLOGY
If you have not pivoted even once!

Even more! I am liking it, though I'm not sure about you.


A few references used (AND this is not the last slide)

Basic Hadoop introductory material: http://www.coreservlets.com/hadoop-tutorial/
Evaluate Hadoop without installation: http://go.cloudera.com/cloudera-live.html
PostgreSQL good parts: http://www.slideshare.net/Aveic/postgresql-34323147
PostgreSQL as NoSQL column store: http://postgresguide.com/sexy/hstore.html
PostgreSQL for basic Elasticsearch functionality: http://blog.lostpropertyhq.com/postgres-full-text-search-is-good-enough/
Good big-data-compatible OSS software: http://netflix.github.io/
Practical HBase usage: https://www.facebook.com/UsingHbase
Why Big Data technologies are on Linux: https://www.youtube.com/watch?v=njos57IJf-0
Using Cassandra for write-heavy applications: http://www.datastax.com/1-million-writes
Online analytics in Storm: http://hortonworks.com/hadoop/storm/
E-commerce domain-specific use case: http://www.slideshare.net/jaykumarpatel/cassandra-at-ebay-13920376
Good use case of selecting a data store based on a proper understanding of the CAP theorem: http://tech-blog.flipkart.net/2013/01/nosql-for-a-user-engagement-platform/
Recommendation engine in Big Data scenarios: http://www.slideshare.net/hava101/recommendations-play-flipkart-14115791
High-volume log processing: http://www.splunk.com/view/product-tour/SP-CAAAAGV
Open source alternatives: http://logstash.net/ and http://graylog2.org/

Yes, it's unreadable and even incomplete, and has irrelevant YouTube video links!


Questions & Answers

Ask the question now if your question:
Is a one-liner
Is not personal, from either side

Ask it after today's session:
If your context is specific
If it's personal and you don't want to be humiliated in public
If it's technical, then attend the next 2 sessions

Ask in any café @Prahladnagar:
We advise startups and individuals for free (not joking this time)

Don't ask:
Why is Melody so chocolaty?

“No question unanswered” is not copyrighted by me, yet.