yiming - img.raqsoft.com.cn

One click intelligent data modeling YIMING


Post on 17-Oct-2021



Page 1: YIMING - img.raqsoft.com.cn

One click intelligent data modeling

YIMING

Page 2: YIMING - img.raqsoft.com.cn

Yiming intelligent modeling vs. manual modeling

Artificial intelligence

One-click intelligent data modeling

Manual? Intelligent!

The modeling process is fully automated.

One-click modeling: fast and effective.

No data scientist is needed to build models.

Manual modeling

Explore the data?

Missing values?

Noisy data?

High-cardinality variables?

Standardization?

Time features?

LR, RF, GBDT, ...: which algorithm should be used?

Parameter configuration?

How to evaluate the model effectively?

Long project cycle?

Non-normal distributions?

Many models required?

Page 3: YIMING - img.raqsoft.com.cn

Traditional modeling process

Data input: variable recognition; generation of basic statistics

Modeling data preprocessing: exception handling; missing value handling; high-cardinality variable processing; data smoothing; intelligent filtering of numerical variables; adding derived variables

Manual modeling: filtering important variables; optimizing model parameters; selecting the modeling method

Model performance: AUC, GINI, MSE, LIFT, KS, recall rate, ...

Model output

Many of these tasks must be done manually by data scientists.
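Two of the preprocessing tasks named on this slide, missing value handling and high-cardinality variable processing, can be illustrated with a minimal Python sketch. This is not Yiming's implementation: median imputation and rare-category grouping are common techniques chosen here as assumptions, and the function names and sample data are hypothetical.

```python
from collections import Counter
from statistics import median


def impute_missing(values):
    """Replace None in a numeric column with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def group_rare_categories(values, min_count=2, other="OTHER"):
    """Collapse categories seen fewer than min_count times into one bucket."""
    counts = Counter(values)
    return [v if counts[v] >= min_count else other for v in values]


# Hypothetical sample columns, for illustration only.
ages = [34, None, 51, 29, None, 46]
cities = ["Beijing", "Shanghai", "Beijing", "Lhasa", "Shanghai", "Urumqi"]

print(impute_missing(ages))           # [34, 40.0, 51, 29, 40.0, 46]
print(group_rare_categories(cities))  # rare cities become "OTHER"
```

Grouping rare levels keeps a high-cardinality column from producing many sparse indicator variables; a real tool would likely choose the threshold from the data rather than fix it at 2.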

Page 4: YIMING - img.raqsoft.com.cn

Intelligent modeling process

Data input: variable recognition; generation of basic statistics

Automatic data preprocessing: exception handling; missing value handling; high-cardinality variable processing; data smoothing; intelligent filtering of numerical variables; adding derived variables

Intelligent export of a data quality report

Intelligent modeling: automatic filtering of important variables; automatic and optimal setting of model parameters; automatic and optimal selection of the modeling method

Model performance: AUC, GINI, MSE, LIFT, KS, recall rate, ...

Model output

Many tasks originally performed by data scientists are completed by the intelligent modeling tool in one click, ensuring model quality and stability.
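Three of the evaluation indices listed on this slide (AUC, GINI, KS) can be computed from labels and model scores with the short Python sketch below. It uses the standard definitions (AUC as the probability that a positive outranks a negative, Gini = 2 × AUC − 1, KS as the largest gap between cumulative positive and negative rates); how Yiming itself computes them is not stated in the slides, and the sample data is hypothetical.

```python
def auc(labels, scores):
    """Probability that a random positive scores above a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def gini(labels, scores):
    """Gini coefficient derived from AUC."""
    return 2 * auc(labels, scores) - 1


def ks(labels, scores):
    """Largest gap between true-positive rate and false-positive rate over all thresholds."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best = 0.0
    for t in sorted(set(scores)):
        tpr = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= t) / n_pos
        fpr = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= t) / n_neg
        best = max(best, tpr - fpr)
    return best


# Hypothetical labels (1 = positive) and model scores.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

print(round(auc(labels, scores), 3))   # 0.889
print(round(gini(labels, scores), 3))  # 0.778
print(round(ks(labels, scores), 3))    # 0.667
```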

Page 5: YIMING - img.raqsoft.com.cn

Yiming intelligent modeling architecture

Application integration and invocation

Modeling tool; prediction model

Auto-modeling: Tree Based, Neural network, All Regression, GBDT

Data preprocessing: missing values, outliers; correction, smoothing; high-cardinality processing, derived variables, ...

Data preparation (ETL)

Data source: RDB, NoSQL, HDFS, LocalFile, HTTP
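The auto-modeling layer chooses among several model families. The general idea of automatic selection, fitting each candidate and keeping the one with the best validation score, can be sketched in Python as follows. The two candidates here (a constant-mean model and a one-variable least-squares line) are toy stand-ins for the tree-based, neural network, regression, and GBDT models on the slide; the selection rule, function names, and data are all assumptions, not Yiming's actual procedure.

```python
from statistics import mean


def fit_mean(xs, ys):
    """Baseline candidate: always predict the training mean."""
    m = mean(ys)
    return lambda x: m


def fit_linear(xs, ys):
    """Candidate: one-variable least-squares line."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return lambda x: my + slope * (x - mx)


def mse(model, xs, ys):
    """Mean squared error of a fitted model on a data set."""
    return mean((model(x) - y) ** 2 for x, y in zip(xs, ys))


def select_best(candidates, train, valid):
    """Fit every candidate on train and return the name with the lowest validation MSE."""
    scores = {name: mse(fit(*train), *valid) for name, fit in candidates.items()}
    return min(scores, key=scores.get), scores


# Hypothetical training and validation data.
train = ([1, 2, 3, 4], [2.1, 3.9, 6.2, 7.8])
valid = ([5, 6], [10.1, 11.9])

best, scores = select_best({"mean": fit_mean, "linear": fit_linear}, train, valid)
print(best)  # linear
```

Scoring on held-out data rather than on the training set is what keeps the selection from simply rewarding the most flexible candidate.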

Page 6: YIMING - img.raqsoft.com.cn

Yiming intelligent modeling process

Automatic identification of variable types

Dataset size statistics

Continuous variable

Categorical variable

Automatic preprocessing + modeling

Multiple model evaluation indices
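Automatic identification of variable types can be approximated with a simple heuristic, sketched below in Python. The rule used here (numeric columns with more than a fixed number of distinct values are continuous, everything else is categorical) and the `max_categories` threshold are assumptions for illustration; the slides do not describe the rule Yiming applies.

```python
def infer_variable_type(values, max_categories=10):
    """Classify a column as 'continuous' or 'categorical' with a distinct-count heuristic."""
    observed = [v for v in values if v is not None]
    all_numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in observed
    )
    # Numeric columns with few distinct values (e.g. counts, codes) are treated as categorical.
    if all_numeric and len(set(observed)) > max_categories:
        return "continuous"
    return "categorical"


# Hypothetical columns.
income = [3200.5, 4100.0, 2800.0, 5150.25, 3900.0, 4700.0,
          3050.0, 6200.0, 2950.0, 4400.0, 3600.0, 5300.0]
gender = ["M", "F", "F", "M"]
card_count = [1, 2, 1, 3, 2, 1]

for name, column in [("income", income), ("gender", gender), ("card_count", card_count)]:
    print(name, infer_variable_type(column))
```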

Page 7: YIMING - img.raqsoft.com.cn

Why us?

1. The painstaking work that a statistics expert has pursued all his life.

Decades of practical experience in data mining modeling; participated in and presided over many domestic and foreign data mining projects in the banking and insurance industries, and repeatedly led teams to win awards in the international SAS competition.

2. The exquisite product of the R&D team.

Deep mathematical understanding, excellent software implementation, and industry-leading high-performance big data technology.

Page 8: YIMING - img.raqsoft.com.cn

Case: personal credit default forecast

Target

• Establish a credit default model that gives the probability of each user's credit default

• Give users a reasonable credit line

• Let business personnel select data for modeling based on experience, and help business personnel accept the application and popularization of the model

• Improve the capture rate of defaulting customers

Pain points

• Find reasonable data dimensions

• The influence of high cardinality and nonlinear problems on the model

• Select a reasonable model or model combination

• Few positive samples; avoid model overfitting

Comparison of modeling results

                     Intelligent modeling                         Traditional modeling
Number of people     1                                            1
Modeling time        5 minutes (data preprocessing + modeling)    2 months
Number of models     1                                            1
Data size            100,000+ / 28 MB                             100,000+ / 28 MB
Model performance    0.9728 (test set 0.965)                      0.957

Figure: model performance (test set)

Page 9: YIMING - img.raqsoft.com.cn

Case: Marketing recommendation of bank financial products

                            Customer group 1   Customer group 2   Customer group 3   Customer group 4
Number of modeling people   1                  1                  1                  1
Number of models            13                 13                 13                 13
Modeling time               1.5 hours/model    1.5 hours/model    1 minute/model     2 minutes/model
Data volume                 1340k              1550k              6400               12k

1. Among the top 5% of customers ranked by the model, the purchase rate is 14.4 times the rate without the model: 24.77 transactions are completed per 100 selected customers, far above the average of 1.72 transactions per 100 customers.

2. The top 5% of customers ranked by the model contain 72.0% of the target customers; the top 20% contain 96.0% of the target customers.

             Cumulative improvement   Cumulative capture rate
First 5%     14.4                     72%
First 10%    9.4                      94%
First 15%    6.3                      94.5%
First 20%    4.8                      96%

The current purchase rate of the financial product is 1.72%
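The figures on this slide are linked by simple arithmetic: the cumulative improvement (lift) is the purchase rate in the selected group divided by the overall purchase rate, and lift multiplied by the selected share of customers gives the cumulative capture rate. A short Python check, using only the numbers from the slide (the function names are hypothetical):

```python
def lift_from_rates(top_rate, base_rate):
    """Lift: purchase rate in the selected group relative to the overall rate."""
    return top_rate / base_rate


def capture_rate(depth, lift):
    """Share of all buyers found in the top `depth` fraction of ranked customers."""
    return depth * lift


# 24.77 purchases per 100 selected customers vs. 1.72 per 100 overall.
print(round(lift_from_rates(24.77, 1.72), 1))  # 14.4

# (selection depth, cumulative improvement) pairs from the slide.
rows = [(0.05, 14.4), (0.10, 9.4), (0.15, 6.3), (0.20, 4.8)]
for depth, lift in rows:
    print(f"first {depth:.0%}: capture rate {capture_rate(depth, lift):.1%}")
```

The computed capture rates match the 72%, 94%, 94.5%, and 96% reported on the slide, so the two columns are consistent with each other.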

Intelligent modeling vs. manual modeling:

                       Number of models                 Time                       Number of project participants
Intelligent modeling   50-60                            2 weeks                    1
Manual modeling        Not suitable for mass modeling   1 week to 2 months/model   Several

Note: the manual modeling time depends on the complexity of the model and the skill of the modeler, and is uncontrollable.

Page 10: YIMING - img.raqsoft.com.cn

Characteristics of intelligent modeling of Yiming

Automatic modeling: efficient

No data scientist required: low cost

Model perfection: high accuracy

Artificial intelligence: fewer personnel

Intelligent modeling changes the application mode: business users lead, and modeling can be done anytime and anywhere in the application process.

Page 11: YIMING - img.raqsoft.com.cn

THANKS

Mining data value

Yiming intelligent modeling