Machine Learning Testing

Jie M. Zhang, CREST, UCL

[email protected]

About me

Research Fellow at CREST, UCL

Supervisor: Prof. Mark Harman

PhD student at Peking University | Sep 2015 – June 2018 | Supervisor: Lu Zhang

Research Interest

https://sites.google.com/view/jie-zhang/home

http://sei.pku.edu.cn/~zhanglu/


Why SE for ML Matters

Diversity Matters

Different communities: different perspectives

What Can SE Bring to ML?

ML Community



“Analysis destroys wholes. Some things, magic things, are meant to stay whole. If you look at their pieces, they go away.” -- The Bridges of Madison County


ML Community

SE Community

SE for ML: Where are we now?

ML Community

SE for ML: Can we take another path?

SE Community

ML Community

Number of ML testing publications

trustworthiness


Definitions of Machine Learning Testing

Definition 1 (ML Bug). An ML bug refers to any imperfection in a machine learning item that causes a discordance between the existing and the required conditions.

Definition 2 (ML Testing). Machine Learning Testing (ML testing) refers to any activities designed to reveal machine learning bugs.

Comparison between ML Testing & Software Testing

“There would be no need to write such programs, if the correct answer were known” -- Davis and Weyuker, 1981

ML Testing Workflow

Components Where the Bug May Exist

ML Properties to Test

● Correctness

● Overfitting degree

● Robustness

● Security

● Privacy

● Fairness

● Interpretability

● Efficiency

The slides group these properties into functional and non-functional ones.

Testing Workflow

● Input generation: fuzzing, symbolic execution, domain-specific synthesis (GAN)

● Oracle: metamorphic relations, differential testing, N-version programming, metrics

● Adequacy evaluation: coverage, mutation testing

Supervised/Unsupervised/Reinforcement Learning

Machine Learning Properties
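A metamorphic relation can act as an oracle when the correct label for an input is unknown. The sketch below is an illustration only, not taken from the slides: it assumes a small hand-written k-nearest-neighbour classifier (`knn_predict`) and one hypothetical relation, namely that multiplying every training and query feature by the same positive constant should leave the predicted label unchanged. All function names and the synthetic data are assumptions made for the example.

```python
import random


def knn_predict(train, query, k=3):
    """Majority vote among the k training points nearest to `query`."""
    def sq_dist(point):
        return sum((a - b) ** 2 for a, b in zip(point, query))

    nearest = sorted(train, key=lambda item: sq_dist(item[0]))[:k]
    votes = [label for _, label in nearest]
    # sorted(set(...)) makes tie-breaking between labels deterministic.
    return max(sorted(set(votes)), key=votes.count)


def scale(point, factor):
    """Multiply every feature of a point by the same factor."""
    return tuple(factor * v for v in point)


def scaling_violations(predict, train, queries, factor=2.0):
    """Return the queries whose prediction changes after uniform scaling."""
    scaled_train = [(scale(p, factor), label) for p, label in train]
    violations = []
    for q in queries:
        source = predict(train, q)                             # original run
        follow_up = predict(scaled_train, scale(q, factor))    # transformed run
        if source != follow_up:
            violations.append((q, source, follow_up))
    return violations


# Synthetic two-cluster data (an assumption for the example).
rng = random.Random(0)
train = [((rng.gauss(0, 1), rng.gauss(0, 1)), "a") for _ in range(20)]
train += [((rng.gauss(3, 1), rng.gauss(3, 1)), "b") for _ in range(20)]
queries = [(rng.uniform(-2, 5), rng.uniform(-2, 5)) for _ in range(50)]

violations = scaling_violations(knn_predict, train, queries)
print("violations:", len(violations))
```

No expected label is needed for any query: the check only compares the source and follow-up predictions. Whether a relation is valid depends on the model under test, so the relation itself has to be chosen per model.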

Challenges

● Test Input Generation:

○ Safety-critical: Space coverage: enormous behaviour space of ML models

● The Oracle Problem: how to reduce false positives.

● Testing Cost Reduction.

Research Opportunities

● Test Input Generation:

○ Safety-critical: Space coverage



SBST

Testability Transformation (Harman 2004)






Multiple oracles

Metamorphic relations: largely depend on inputs





● Testing Cost Reduction.

Test-cost reduction transformation


● Unsupervised/reinforcement learning testing

● More machine learning properties: privacy, fairness

● More testing benchmarks

● More testing tools

● More testing activities

Model Selection and Overfitting Problem

Overfitting happens when a model learns the detail and noise in the training data too well: the noise or random fluctuations in the training data are picked up and learned as concepts by the model.

Existing Research

Cross-validation: uses test error to approximate the generalisation error.

● The test sample may not be representative.

● Different models may have very similar cross-validation results.

Applied Machine Learning
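For reference, k-fold cross-validation can be sketched in a few lines. This is a minimal illustration under stated assumptions, not code from the talk: the nearest-centroid model, the helper names (`k_fold_indices`, `fit_centroids`, `cross_val_accuracy`), and the synthetic one-dimensional data are all hypothetical.

```python
import random


def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and split them into k disjoint folds."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def fit_centroids(samples):
    """Nearest-centroid model: the mean feature value of each label."""
    sums, counts = {}, {}
    for x, y in samples:
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}


def predict(model, x):
    """Return the label whose centroid is closest to x."""
    return min(model, key=lambda y: abs(model[y] - x))


def cross_val_accuracy(data, k=5, seed=0):
    """Train on k-1 folds, test on the held-out fold, once per fold."""
    scores = []
    for held_out in k_fold_indices(len(data), k, seed):
        held = set(held_out)
        train = [data[i] for i in range(len(data)) if i not in held]
        test = [data[i] for i in held_out]
        model = fit_centroids(train)
        correct = sum(predict(model, x) == y for x, y in test)
        scores.append(correct / len(test))
    return scores


# Synthetic 1-D data with two overlapping classes (an assumption).
rng = random.Random(0)
data = [(rng.gauss(0.0, 1.0), 0) for _ in range(50)]
data += [(rng.gauss(2.0, 1.0), 1) for _ in range(50)]

scores = cross_val_accuracy(data, k=5)
print("per-fold accuracy:", scores)
print("mean accuracy:", sum(scores) / len(scores))
```

The spread of the per-fold scores gives a rough sense of the first limitation listed above: each score depends on which samples happen to land in the held-out fold.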

VC-dimension and Rademacher Complexity:

● Only measure the complexity of the hypothesis space, not the fit.

● Both quantities are usually intractable in practice.

● Only provide bounds, and the bounds can be quite loose.

Statistical Machine Learning
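As an illustration of what such a bound looks like, one standard textbook form for binary classification with 0-1 loss is given here. It comes from the general learning-theory literature, not from the slides, and the notation is an assumption: $R(h)$ is the generalisation error, $\hat{R}_S(h)$ the empirical error on a sample $S$ of size $m$, $\mathfrak{R}_m(H)$ the Rademacher complexity of the hypothesis space $H$, and $\delta$ the allowed failure probability.

```latex
% With probability at least 1 - \delta over the draw of S, for every h in H:
R(h) \;\le\; \hat{R}_S(h) \;+\; \mathfrak{R}_m(H) \;+\; \sqrt{\frac{\log(1/\delta)}{2m}}
```

The complexity term depends only on $H$ and $m$, not on which hypothesis was actually fitted, which matches the first limitation listed above.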

Our approach

Create different data versions by injecting noise

Retrain the model on each data version

Observe the decrease of training accuracy
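The three steps above can be sketched as follows. This is a hedged reading of the bullet points, not the author's implementation: the slides do not specify the noise type, the models, or the noise rates, so label flipping, the two toy models (`Memorizer`, `Centroid`), the rates, and the synthetic data are all assumptions.

```python
import random


def inject_label_noise(data, rate, rng):
    """Return a copy of `data` with a fraction `rate` of binary labels flipped."""
    noisy = list(data)
    n_flips = round(rate * len(data))
    for i in rng.sample(range(len(data)), n_flips):
        x, y = noisy[i]
        noisy[i] = (x, 1 - y)
    return noisy


class Memorizer:
    """1-nearest-neighbour: stores the training set, so it can fit any labels."""

    def fit(self, data):
        self.data = list(data)
        return self

    def predict(self, x):
        return min(self.data, key=lambda item: abs(item[0] - x))[1]


class Centroid:
    """Nearest class mean: too simple to memorise individual flipped labels."""

    def fit(self, data):
        self.means = {}
        for label in (0, 1):
            xs = [x for x, y in data if y == label]
            self.means[label] = sum(xs) / len(xs)
        return self

    def predict(self, x):
        return min(self.means, key=lambda label: abs(self.means[label] - x))


def training_accuracy(model_cls, data):
    """Retrain on `data` and score the model on that same (noisy) data."""
    model = model_cls().fit(data)
    return sum(model.predict(x) == y for x, y in data) / len(data)


def accuracy_under_noise(model_cls, data, rates, seed=0):
    """Training accuracy for each noise rate, one data version per rate."""
    rng = random.Random(seed)
    return [
        training_accuracy(model_cls, inject_label_noise(data, rate, rng))
        for rate in rates
    ]


# Synthetic, well-separated 1-D data (an assumption for the example).
rng = random.Random(1)
clean = [(rng.gauss(0.0, 1.0), 0) for _ in range(100)]
clean += [(rng.gauss(4.0, 1.0), 1) for _ in range(100)]

rates = [0.0, 0.2, 0.4]
memorizer_curve = accuracy_under_noise(Memorizer, clean, rates)
centroid_curve = accuracy_under_noise(Centroid, clean, rates)
print("memorizer:", memorizer_curve)
print("centroid: ", centroid_curve)
```

In this toy setting the memorising model keeps fitting the training data however many labels are flipped, while the simpler model's training accuracy falls as noise is added. How the size of the decrease is turned into an overfitting measure is not stated on the slides and is left open here.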

Mutation Metamorphic Relations

On real data sets

● Currently addressing feedback from authors of the collected papers (around 70 responses so far).

● Will be on arXiv very soon.

● Feel free to ask for a draft ([email protected])

● Any feedback is very welcome!
