Square's Machine Learning Infrastructure and Applications - Rong Yan

Post on 10-May-2015


DESCRIPTION

http://www.hakkalabs.co/articles/squares-machine-learning-infrastructure-applications

TRANSCRIPT

May 15, 2014

Rong Yan

Machine Learning @ Square

Birth of Square

Payment: Stand, Reader (payment device), payment aggregation, risk model

Commerce: Cash, Market

Our Mission: Make commerce easy.

Payment + Data → Commerce: The Next Big Thing

Scale: 3M+ readers, $15B+ annualized

Offline and online data: Amount, Location, Item Desc., Card #, Credit Score, Friends, Activity History, Inventory, Sales Volume, Haircut Price

Turn Data into Business Value

‣ Fraud Detection
‣ Business Insight
‣ Customer Relation
‣ Information Discovery

Fraud Detection @ Square

Fraud Detection in the payment flow:

Payments (150,000 active sellers per day) → Risk ML Fraud Detection (near-real-time) → Risk Ops transaction review (~2,000 suspect sellers) → Bank clears for settlement

ML Architecture

Merchant, Devices, Bank Accounts → Machine Learning (300+ features) → Suspicions

Example features: Card not present: Yes; PAN Diversity: 0.05; Uses iPhone: No
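The slide's example signals can be sketched as a small featurizer; the feature names come from the slide, while the dictionary keys and encoding choices here are assumptions for illustration:

```python
# Hypothetical sketch: encode the slide's example risk signals into a
# numeric vector a model can consume. Key names are assumptions.

def featurize(txn):
    return [
        1.0 if txn["card_not_present"] else 0.0,
        txn["pan_diversity"],                 # assumed: share of distinct card numbers seen
        1.0 if txn["uses_iphone"] else 0.0,
        # ... the real system uses 300+ features
    ]

print(featurize({"card_not_present": True, "pan_diversity": 0.05, "uses_iphone": False}))
# [1.0, 0.05, 0.0]
```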

Feature Generation

‣ Easy to interpret
‣ Dimension reduction
‣ Very powerful in ensemble

Decision Tree Model

Example: Decline Rate >= 0.1? (No / Yes) → Amount <= $10000? (No / Yes) → Business Type = Auto repair? (No / Yes) → leaf scores 0.9 / 0.6
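The example tree evaluates as a few nested conditionals. A minimal sketch, assuming the splits and the 0.9 / 0.6 leaves from the slide; leaf values for the branches the slide does not show are made up:

```python
# Hypothetical scorer for the slide's example tree. Splits mirror the
# slide; the 0.3 and 0.1 leaves are assumed, not from the talk.

def score(txn):
    """Walk the example decision tree and return a fraud score."""
    if txn["decline_rate"] >= 0.1:
        if txn["amount"] <= 10_000:
            return 0.9 if txn["business_type"] == "Auto repair" else 0.6
        return 0.3  # assumed leaf
    return 0.1      # assumed leaf

print(score({"decline_rate": 0.2, "amount": 500, "business_type": "Auto repair"}))  # 0.9
```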

Random Forests: Decision Tree Ensemble

Breiman, Leo (2001). "Random Forests". Machine Learning 45 (1): 5–32.

Tree 1: splits on Decline Rate <= 0.1, Amount <= $10000, Business Type = Auto repair; leaf scores 0.9 / 0.6
Tree 2: splits on Success Rate <= 0.2, Age >= 20, Amount <= $1000; leaf scores 0.4 / 0.7
Tree N: splits on Decline Rate <= 0.3, Amount <= $20000, Age <= 22; leaf scores 0.8 / 0.6

Per-tree predictions for one example: Bad (0.9), Good (0.4), Bad (0.6)

Mode for classification = Bad; average for regression = 0.63
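The mode-vs-average aggregation above, applied to the three per-tree outputs from the slide:

```python
from collections import Counter
from statistics import mean

# The three per-tree outputs from the slide; aggregation mirrors
# "mode for classification, average for regression".
votes = [("Bad", 0.9), ("Good", 0.4), ("Bad", 0.6)]
labels = [label for label, _ in votes]
scores = [score for _, score in votes]

print(Counter(labels).most_common(1)[0][0])  # classification by mode: Bad
print(round(mean(scores), 2))                # regression by average: 0.63
```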


Random Forests - Build each Tree

‣ Sample from all data
‣ Randomly select sqrt(n) features (from e.g. Dollar Amount, Connected with bad user, Business Type, Decline Rate, Time of Day, Location)
‣ Find the best split: feature and value (e.g. Decline Rate <= 0.1 → 0.4 / 0.6)
‣ Grow the tree on each branch
‣ Stop when the sample size is small
‣ Repeat these steps multiple times to create a forest
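The per-tree recipe above can be sketched from scratch. This is an illustrative toy, not Square's Java learner: the variance-reduction split criterion, stopping size, and sample data are all assumptions.

```python
import math
import random

# Toy random forest following the slide's recipe: bootstrap sample,
# sqrt(n) random features per split, best split, grow, stop when small.

def best_split(rows, labels, feat_idx):
    """Pick the (feature, value) pair minimizing summed label variance."""
    def var(ys):
        if not ys:
            return 0.0
        m = sum(ys) / len(ys)
        return sum((y - m) ** 2 for y in ys)
    best = None
    for f in feat_idx:
        for v in {r[f] for r in rows}:
            left = [y for r, y in zip(rows, labels) if r[f] <= v]
            right = [y for r, y in zip(rows, labels) if r[f] > v]
            s = var(left) + var(right)
            if best is None or s < best[0]:
                best = (s, f, v)
    return best[1], best[2]

def grow_tree(rows, labels, min_size=2):
    if len(rows) <= min_size or len(set(labels)) == 1:  # stop when sample is small
        return sum(labels) / len(labels)                # leaf: average label
    n_feat = len(rows[0])
    k = max(1, int(math.sqrt(n_feat)))                  # randomly select sqrt(n) features
    f, v = best_split(rows, labels, random.sample(range(n_feat), k))
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= v]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > v]
    if not left or not right:
        return sum(labels) / len(labels)
    return (f, v,
            grow_tree([r for r, _ in left], [y for _, y in left], min_size),
            grow_tree([r for r, _ in right], [y for _, y in right], min_size))

def predict(node, row):
    while isinstance(node, tuple):
        f, v, lo, hi = node
        node = lo if row[f] <= v else hi
    return node

def build_forest(rows, labels, n_trees=10):
    forest = []
    for _ in range(n_trees):                                # repeat to create a forest
        idx = [random.randrange(len(rows)) for _ in rows]   # bootstrap sample
        forest.append(grow_tree([rows[i] for i in idx], [labels[i] for i in idx]))
    return forest

random.seed(0)
rows = [(i / 10.0, 0.0) for i in range(10)]
labels = [1.0 if r[0] > 0.45 else 0.0 for r in rows]
forest = build_forest(rows, labels, n_trees=5)
print(len(forest))  # 5
```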

Boosting Trees

Tree 1 → Tree 2 (helps Tree 1) → Tree 3 (helps Trees 1, 2) → Tree 4 (helps Trees 1, 2, 3) → ...

Stop when no help is needed (weights go to 0 on all samples)

Final score is the sum of the tree outputs: 7.5 = 8.0 + (-2.0) + 1.0 + 0.5

Boosting Trees - Algorithm

Objective function: loss

Friedman, J. H. "Greedy Function Approximation: A Gradient Boosting Machine." 1999
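Stripped to its essence, the additive scheme above can be sketched with constant "stumps" standing in for real trees: each round fits the residual the previous rounds left behind, and the final score is the sum. Purely illustrative, not Square's algorithm:

```python
# Toy gradient boosting with constant "trees": each round fits the
# mean of the current residuals, and predictions are summed.

def boost(labels, n_rounds=4):
    preds = [0.0] * len(labels)
    models = []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(labels, preds)]
        step = sum(residuals) / len(residuals)  # constant "tree": mean residual
        models.append(step)
        preds = [p + step for p in preds]
        if abs(step) < 1e-9:                    # stop when no help is needed
            break
    return models

print(boost([7.5, 7.5]))  # [7.5, 0.0]: one round captures everything, then stop
```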

Results - Precision (at a fixed recall level)

Model            April    May     June
Random Forest    76%      77%     80%
Boosting Trees   85%      82%     88%
Relative gain    +11.8%   +6.5%   +10%

Results - Fraud Detection Recall

[Chart: Fraud $ Prevented vs. # Payments to Reject, broken out by Easy, Medium, and Hard fraud cases]

Data Sampling

‣ Highly biased label distribution: fewer than 1 in 1000 samples are positive
‣ Weighted training: higher weights on positive samples => oscillation; lower weights on negative samples => no real gain
‣ Solution: keep the negative:positive ratio between 3:1 and 10:1; scale the final model if calibration is needed
‣ Fewer data requires fewer resources to train
‣ Observed +10% improvement going from 20:1 to 3:1
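The downsampling scheme described here, keep every positive and sample negatives to a target ratio, is a few lines. A minimal sketch, assuming `(features, label)` pairs with label 1 = fraud:

```python
import random

# Keep all positives; sample negatives down to a target neg:pos ratio
# (3:1 here, per the talk's recommended 3:1-10:1 range).

def downsample(samples, ratio=3, seed=0):
    pos = [s for s in samples if s[1] == 1]
    neg = [s for s in samples if s[1] == 0]
    rng = random.Random(seed)
    keep = rng.sample(neg, min(len(neg), ratio * len(pos)))
    return pos + keep

data = [((i,), 1) for i in range(5)] + [((i,), 0) for i in range(5000)]
train = downsample(data)
print(len(train))  # 20: 5 positives + 15 sampled negatives
```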

Productionalize Machine Learning

Startup Architecture
‣ Ruby-on-Rails + MySQL
‣ MySQL replication
‣ Tied to production schema
‣ Hard to do complex analysis

Scale it up: SOA + Data Warehouse
‣ Java services
‣ APIs
‣ HDFS

Scale it up: Data Transport

‣ Append-only feeds
‣ Kafka
‣ Replication
‣ Protocol buffers
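The key property of the transport layer is the append-only feed: producers only ever append, and any consumer can replay from an offset it tracks itself (the model Kafka provides). A toy in-memory sketch of that contract, with assumed field names:

```python
# Toy append-only feed illustrating the log/offset contract the talk's
# Kafka-based transport relies on. In-memory stand-in, not Kafka itself.

class Feed:
    def __init__(self):
        self._log = []                  # append-only; entries are never mutated

    def append(self, msg):
        self._log.append(msg)
        return len(self._log) - 1       # offset of the new entry

    def read(self, offset):
        return self._log[offset:]       # replay from any consumer-tracked offset

feed = Feed()
feed.append({"payment_id": 1, "amount": 500})
feed.append({"payment_id": 2, "amount": 120})
print(len(feed.read(0)))  # 2
```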

Highly Available

Payments, Merchant, Devices, Bank Accounts → Suspicions

Parallel Environments and Data Integrity

Blue / Green parallel environments behind a VIP upstream

Other ML @ Square

‣ Square Random Forest
‣ Learning Management
‣ Recommendation

Square Random Forest

RF Learner                                    Implementation                                   Time (Train / Test)
RiskML Random Forest (built on Scikit-Learn)  C / Cython / Python (open source + Square code)  72 minutes
WiseRF                                        C++ (proprietary)                                23 minutes
Square Random Forest                          Java (Square code)                               15 minutes

Note: time reported on 3M training and 15M testing data

Learning Management System

‣ Support non-sophisticated users
‣ Fast ad-hoc analytics
‣ Accessible to everyone for easy model generation and evaluation
‣ Tracks results to ensure different models can be compared

Square Market Recommendation

10x conversion rate vs. random baseline

ML @ Square

rongyan@squareup.com
