blind spots in big data erez koren @ forter

Upload: ido-shilon

Post on 12-Apr-2017

TRANSCRIPT

Page 1: Blind spots in big data erez koren @ forter

Blind Spots in BIG DATA

Erez Koren, Forter

Page 2: Blind spots in big data erez koren @ forter

About myself - Erez Koren

In the computer business since 2nd grade

Love building products and hacking stuff

Currently in my 3rd startup adventure

Working at Forter from before day one

2017

Page 3: Blind spots in big data erez koren @ forter

About Forter

We catch fraudsters and protect e-commerce merchants

Founded 3.5 years ago

~80 employees worldwide

Backed by

Page 4: Blind spots in big data erez koren @ forter
Page 5: Blind spots in big data erez koren @ forter

Forter - What & How We Do It

We detect fraud, return a real-time decision (approve/decline) every time, and guarantee it (chargeback protection), covering the whole customer lifecycle

We collect data from browsers (JS) and mobile apps (through an SDK)

We also receive order/account data server-to-server (S2S) into our API and reply with our decision in real time

Our stack:

Compliance:

And more...

Page 6: Blind spots in big data erez koren @ forter

The big data infrastructure...

Page 7: Blind spots in big data erez koren @ forter

But you have a feeling that something is wrong

Are you sure the data contains everything you need?

How do you ensure the quality of your data?

Page 8: Blind spots in big data erez koren @ forter

The COVERAGE Challenge

In some cases the data you are analyzing is only partial

Page 9: Blind spots in big data erez koren @ forter

The COVERAGE Challenge

Today’s internet is a jungle.

There are thousands of devices, platforms, browsers and configurations.

Are you sure you are collecting data from all / most of the relevant sources?

Page 10: Blind spots in big data erez koren @ forter

Demo time

This is how we do it

Page 11: Blind spots in big data erez koren @ forter

MULTIPLE DEVICES & PLATFORMS

Page 12: Blind spots in big data erez koren @ forter

MULTIPLE VERSIONS, INCLUDING DEV. VERSIONS

Page 13: Blind spots in big data erez koren @ forter

SENDING EVENTS FROM 25 DIFFERENT CONFIGS

Page 14: Blind spots in big data erez koren @ forter

SELENIUM TESTS COVER THE FULL CHECKOUT EXPERIENCE

Page 15: Blind spots in big data erez koren @ forter

IN THE REAL WORLD, SOME OF THE TESTS ALWAYS FAIL

Page 16: Blind spots in big data erez koren @ forter

EXAMPLE OF UNEXPECTED DATA IN THE REAL WORLD

Chrome, Safari, Mobile Safari, Firefox, IE, Android Browser, Edge, Chrome WebView, PhantomJS, undefined, Opera, WebKit

Page 17: Blind spots in big data erez koren @ forter

CLIENT SIDE CODE MONITORING

Detect exceptions that occur on the client side

Browsers (JS), mobile SDKs and any other client integrations

Page 18: Blind spots in big data erez koren @ forter

JS SCRIPT TIMEOUTS

A merchant checked the website with a browser that does not support JavaScript

Detect gaps between script requests made to the server and script events received
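A minimal sketch of the gap detection described on this slide, assuming the server logs every script request with a session id and a segment (for example the browser family), and that received events carry the same session id. The function name, field layout and sample values are illustrative, not Forter's implementation.

```python
from collections import Counter


def script_event_gaps(script_requests, script_events, min_requests=1):
    """Per segment, return the share of script requests that never produced an event.

    script_requests: iterable of (session_id, segment) pairs logged by the server.
    script_events:   iterable of session_id values seen in received events.
    """
    sessions_with_events = set(script_events)
    served = Counter()
    silent = Counter()
    for session_id, segment in script_requests:
        served[segment] += 1
        if session_id not in sessions_with_events:
            silent[segment] += 1
    return {
        segment: silent[segment] / total
        for segment, total in served.items()
        if total >= min_requests
    }


# Hypothetical logs: session s4 requested the script but never sent an event.
requests_log = [("s1", "Chrome"), ("s2", "Chrome"), ("s3", "IE"), ("s4", "IE")]
events_log = ["s1", "s2", "s3"]
gaps = script_event_gaps(requests_log, events_log)
print(gaps)
```

A segment whose gap share stays high over time is a candidate blind spot: the script is served there, but it times out or never runs.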

Page 19: Blind spots in big data erez koren @ forter

Takeaways

Compare the segment distribution of the general population with the segment distribution in your data

Test it as if you were a real user

Even if everything is working now, in the future it will not
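The comparison between the general population and your own data can be sketched roughly as follows. The population shares and collected counts are made-up illustrative numbers, and the `tolerance` parameter is an assumption: a segment is flagged when its share in your data is less than that fraction of its share in the population.

```python
def coverage_gaps(population_share, observed_counts, tolerance=0.5):
    """Return segments that are under-represented in the collected data.

    population_share: {segment: expected share, 0..1} from an external source.
    observed_counts:  {segment: number of events actually collected}.
    """
    total = sum(observed_counts.values())
    gaps = {}
    for segment, expected in population_share.items():
        observed = observed_counts.get(segment, 0) / total if total else 0.0
        if observed < expected * tolerance:
            gaps[segment] = (expected, observed)
    return gaps


# Illustrative values only: IE is 10% of the population but 0.5% of our data.
population = {"Chrome": 0.50, "Safari": 0.25, "Firefox": 0.15, "IE": 0.10}
collected = {"Chrome": 600, "Safari": 300, "Firefox": 95, "IE": 5}
under_covered = coverage_gaps(population, collected)
print(under_covered)
```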

Page 20: Blind spots in big data erez koren @ forter

The MONITORING Challenge

Page 21: Blind spots in big data erez koren @ forter

The MONITORING Challenge

Is “measuring everything” good enough?

How often are you checking the graphs?

Do you have enough alerts or too many?

There are always technical issues that can corrupt the alerting data

Page 22: Blind spots in big data erez koren @ forter

Demo 2 time

This is how we do it

Page 23: Blind spots in big data erez koren @ forter

API AVAILABILITY CHECK

External monitoring (watch the watcher), including round-trip

Pingdom and StatusCake
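One possible shape for a round-trip check, as a sketch: send a uniquely tagged probe through the front door and poll until it shows up at the far end of the pipeline. `send_probe` and `probe_arrived` are hypothetical callables you would wire to your own API and data store; this is not how Pingdom or StatusCake work internally.

```python
import time
import uuid


def round_trip_check(send_probe, probe_arrived, timeout_s=5.0, poll_s=0.1,
                     clock=time.monotonic, sleep=time.sleep):
    """Send a tagged probe and wait for it to arrive downstream.

    Returns (arrived, seconds_waited).
    """
    probe_id = str(uuid.uuid4())
    started = clock()
    send_probe(probe_id)
    while clock() - started <= timeout_s:
        if probe_arrived(probe_id):
            return True, clock() - started
        sleep(poll_s)
    return False, clock() - started


# In-memory stand-in for "API in, data store out".
store = set()
ok, elapsed = round_trip_check(store.add, store.__contains__, timeout_s=1.0)
print(ok)
```

Running the check from outside your own infrastructure is what makes it "watch the watcher": it still fires when the internal monitoring is the thing that broke.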

Page 24: Blind spots in big data erez koren @ forter

DEPENDENCIES MONITORING (RSS)

e.g. AWS, GitHub

Reported to our #productionroom in Slack
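A rough sketch of the RSS side of this, assuming the provider publishes a standard RSS 2.0 status feed. The sample feed, the `example-provider` label and the message format are invented for illustration; fetching the feed and posting to a chat room are left out.

```python
import xml.etree.ElementTree as ET


def new_feed_items(rss_xml, seen_guids):
    """Return feed items not seen before, and remember them in seen_guids."""
    root = ET.fromstring(rss_xml)
    fresh = []
    for item in root.iter("item"):
        guid = item.findtext("guid") or item.findtext("link") or item.findtext("title")
        if guid and guid not in seen_guids:
            seen_guids.add(guid)
            fresh.append({
                "title": item.findtext("title", default=""),
                "link": item.findtext("link", default=""),
            })
    return fresh


def format_chat_message(source, item):
    """Build a one-line message suitable for a production chat room."""
    return f"[{source}] {item['title']} {item['link']}".strip()


# Hypothetical feed content; item 1 was already reported earlier.
SAMPLE_FEED = """<rss version="2.0"><channel>
<title>Example provider status</title>
<item><guid>1</guid><title>Elevated error rates</title><link>https://status.example.com/1</link></item>
<item><guid>2</guid><title>Resolved</title><link>https://status.example.com/2</link></item>
</channel></rss>"""

seen = {"1"}
messages = [format_chat_message("example-provider", item)
            for item in new_feed_items(SAMPLE_FEED, seen)]
print(messages)
```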

Page 25: Blind spots in big data erez koren @ forter

API RESPONSES ANOMALY DETECTION

Detect an increase in the decline rate from X% to Y% within a given time window
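A sliding-window sketch of this check. Here `alert_rate` plays the role of Y; the normal rate X is assumed to be known from history and is not computed. The class name, the `min_decisions` guard and the sample traffic are assumptions for illustration.

```python
from collections import deque


class DeclineRateMonitor:
    """Track the decline rate over a sliding time window."""

    def __init__(self, window_s, alert_rate, min_decisions=20):
        self.window_s = window_s
        self.alert_rate = alert_rate
        self.min_decisions = min_decisions
        self.events = deque()  # (timestamp, declined) pairs, oldest first
        self.declines = 0

    def record(self, timestamp, declined):
        """Add one decision; return the window's decline rate if it breaches alert_rate."""
        self.events.append((timestamp, declined))
        self.declines += int(declined)
        cutoff = timestamp - self.window_s
        while self.events and self.events[0][0] <= cutoff:
            _, old_declined = self.events.popleft()
            self.declines -= int(old_declined)
        total = len(self.events)
        if total < self.min_decisions:
            return None  # too little traffic to judge
        rate = self.declines / total
        return rate if rate >= self.alert_rate else None


monitor = DeclineRateMonitor(window_s=60, alert_rate=0.30, min_decisions=10)
# Normal traffic: one decline in ten decisions.
quiet = [t for t in range(20) if monitor.record(t, declined=(t % 10 == 0)) is not None]
# Something breaks: every decision is a decline.
alert_times = [t for t in range(20, 40) if monitor.record(t, declined=True) is not None]
print(quiet, alert_times[:1])
```

The `min_decisions` guard matters at low traffic: two declines out of three decisions is a 67% rate but not evidence of anything.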

Page 26: Blind spots in big data erez koren @ forter

JS SCRIPT MONITORING

1. Making sure we don’t slow the site down or impact the checkout funnel, via automated Selenium tests (with and without our script, in multiple browsers)

2. Incremental deployment support for
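The "with and without our script" comparison could be evaluated along these lines, assuming page-load timings have already been collected by the automated browser tests. The 5% budget, the function name and the sample timings are illustrative assumptions.

```python
from statistics import median


def script_overhead(load_times_without, load_times_with, max_overhead_ratio=0.05):
    """Compare median page-load times and report whether the script stays within budget.

    Returns (relative_overhead, within_budget).
    """
    base = median(load_times_without)
    with_script = median(load_times_with)
    overhead = (with_script - base) / base
    return overhead, overhead <= max_overhead_ratio


# Hypothetical timings in milliseconds from repeated checkout runs.
without_ms = [1200, 1180, 1250, 1210, 1190]
with_ms = [1230, 1220, 1260, 1240, 1215]
overhead, within_budget = script_overhead(without_ms, with_ms)
print(round(overhead, 3), within_budget)
```

Medians are used rather than means so that a single slow run does not decide the result.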

Page 27: Blind spots in big data erez koren @ forter

ML FEATURES ANOMALY DETECTION

Monitoring the system’s health by measuring the distribution of our machine learning features over time
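The slide does not say which statistic is used to compare distributions. As one common option, here is a sketch using the Population Stability Index (PSI) over equal-width bins; the bin count, the `eps` floor and the sample values are assumptions.

```python
import math


def psi(baseline, current, bins=10, eps=1e-6):
    """Population Stability Index of `current` against `baseline` for one numeric feature."""
    lo, hi = min(baseline), max(baseline)
    width = (hi - lo) / bins or 1.0  # avoid a zero width for constant features

    def shares(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1  # clamp out-of-range values
        return [max(c / len(values), eps) for c in counts]

    b, c = shares(baseline), shares(current)
    return sum((ci - bi) * math.log(ci / bi) for bi, ci in zip(b, c))


# Hypothetical feature values: last month versus today.
baseline_values = [i / 100 for i in range(100)]
same_values = list(baseline_values)
shifted_values = [0.5 + i / 200 for i in range(100)]
print(psi(baseline_values, same_values), psi(baseline_values, shifted_values))
```

A PSI near zero means the feature looks like its baseline; values above roughly 0.25 are often treated as a significant shift, though that cutoff is a rule of thumb rather than something from the talk.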

Page 28: Blind spots in big data erez koren @ forter

VULNERABILITIES MONITORING (RSS)

OS, databases, libraries etc.

Page 29: Blind spots in big data erez koren @ forter

ALERTS DAILY SUMMARY

Summary of alerts from the last 24h + the ability to drill down into the graphs
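A small sketch of what such a summary could look like, assuming each alert is stored with a name and a timestamp and that every alert name maps to a graph URL. The alert names and the `graphs.example.com` address are placeholders.

```python
from collections import Counter
from datetime import datetime, timedelta


def daily_summary(alerts, now, graph_url_for):
    """Summarise alerts fired in the 24 hours before `now`, with a graph link per alert."""
    since = now - timedelta(hours=24)
    recent = [a for a in alerts if since <= a["at"] <= now]
    counts = Counter(a["name"] for a in recent)
    lines = [f"Alerts in the last 24h: {len(recent)}"]
    for name, count in counts.most_common():
        lines.append(f"- {name}: {count} ({graph_url_for(name)})")
    return "\n".join(lines)


now = datetime(2017, 4, 12, 9, 0)
alerts = [
    {"name": "decline_rate", "at": now - timedelta(hours=1)},
    {"name": "api_latency", "at": now - timedelta(hours=3)},
    {"name": "decline_rate", "at": now - timedelta(hours=5)},
    {"name": "script_timeouts", "at": now - timedelta(hours=30)},  # older than 24h
]
summary = daily_summary(alerts, now, lambda name: f"https://graphs.example.com/{name}")
print(summary)
```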

Page 30: Blind spots in big data erez koren @ forter

Takeaways

Make sure every alert can be drilled down into a graph and related back to the raw metric

Know how to investigate - leave breadcrumbs to raw data (even when the data is aggregated)

Differentiate between critical alerts and other alerts (that can be fixed the next morning)

Measure low values as well as high ones - alerts for low values (e.g. CPU) are something that most systems are missing
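The last two takeaways (critical versus next-morning alerts, and alerting on low values as well as high ones) can be combined in a sketch like this. The metric name, thresholds and severity labels are illustrative assumptions.

```python
def check_metric(name, value, low=None, high=None, critical=False):
    """Return an alert dict if value is outside [low, high], else None."""
    if low is not None and value < low:
        direction, bound = "below", low
    elif high is not None and value > high:
        direction, bound = "above", high
    else:
        return None
    return {
        "metric": name,
        "value": value,
        "direction": direction,
        "bound": bound,
        "severity": "critical" if critical else "next-morning",
    }


# A nearly idle CPU can mean the service has stopped receiving traffic.
idle = check_metric("cpu_percent", 2.0, low=5.0, high=90.0, critical=True)
busy = check_metric("cpu_percent", 95.0, low=5.0, high=90.0)
normal = check_metric("cpu_percent", 40.0, low=5.0, high=90.0)
print(idle, busy, normal)
```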

Page 31: Blind spots in big data erez koren @ forter

Takeaways

Understand the pipes and filters - make sure there are no hidden blockages in the data pipelines

Log errors both from client side and server side when possible and analyze together

Make sure incidents that affect input data are shared with your data scientists by using a “dirty” or “partial” flag
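One way to attach such flags, as a sketch: keep a list of incident windows and stamp every record whose timestamp falls inside one. The `Incident` type, the `quality_flags` field and the sample timestamps are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Incident:
    start: float  # epoch seconds, inclusive
    end: float    # epoch seconds, inclusive
    flag: str     # "dirty" or "partial"


def flag_records(records, incidents):
    """Return copies of the records with the flags of any overlapping incidents attached."""
    flagged = []
    for record in records:
        flags = sorted({i.flag for i in incidents if i.start <= record["ts"] <= i.end})
        flagged.append({**record, "quality_flags": flags})
    return flagged


incidents = [Incident(start=100, end=200, flag="partial"),
             Incident(start=150, end=160, flag="dirty")]
records = [{"id": 1, "ts": 50}, {"id": 2, "ts": 120}, {"id": 3, "ts": 155}]
flagged = flag_records(records, incidents)
print([r["quality_flags"] for r in flagged])
```

Data scientists can then exclude or down-weight flagged rows when training, instead of discovering the incident from a strange model.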

Page 32: Blind spots in big data erez koren @ forter

Thank you!