machine learning testing [email protected] jie m. zhang ... - … · definitions of machine...
TRANSCRIPT
Machine Learning TestingJie M. Zhang, CREST, UCL
About me
Research Fellow at CREST, UCL
Supervisor: Prof. Mark Harman
PHD student of Peking University | Sep, 2015
– June, 2018 | Supervisor : Lu Zhang
Research Interest
https://sites.google.com/view/jie-zhang/home
Why SE ML Matters?
Diversity Matters Different communities: different perspectives
What Can SE Bring to ML?
ML Community
What Can SE Bring to ML?
ML Community
“Analysis destroys wholes. Some things, magic things, are meant to stay whole. If you look at their pieces, they go away.” -- The Bridges of Madison County
What Can SE Bring to ML?
ML Community
SE Community
SE for ML: Where are we now?
ML Community
SE for ML: Can we take another path?
SE Community
ML Community
Number of ML testing publications
trustworthiness
Number of ML testing publications
Definitions of Machine Learning Testing
An ML bug refers to any imperfection in a
machine learning item that causes a discordance between the
existing and the required conditions.
Definition 1 (ML Bug).
Machine Learning Testing (ML testing) refers
to any activities designed to reveal machine learning bugs
Definition 2 (ML Testing).
Comparison between ML testing & software testing
Comparison between ML testing & software testing
Comparison between ML Testing & Software Testing
“There would be no need to write such programs, if the correct answer were known”------- Davis and Weyuker, 1981
Definitions of Machine Learning Testing
An ML bug refers to any imperfection in a
machine learning item that causes a discordance between the
existing and the required conditions.
Definition 1 (ML Bug).
Machine Learning Testing (ML testing) refers
to any activities designed to reveal machine learning bugs
Definition 2 (ML Testing).
ML Testing Workflow
ML Testing Workflow
Definitions of Machine Learning Testing
An ML bug refers to any imperfection in a
machine learning item that causes a discordance between the
existing and the required conditions.
Definition 1 (ML Bug).
Machine Learning Testing (ML testing) refers
to any activities designed to reveal machine learning bugs
Definition 2 (ML Testing).
Components Where the Bug May Exist
Definitions of Machine Learning Testing
An ML bug refers to any imperfection in a
machine learning item that causes a discordance between the
existing and the required conditions.
Definition 1 (ML Bug).
Machine Learning Testing (ML testing) refers
to any activities designed to reveal machine learning bugs
Definition 2 (ML Testing).
ML Properties to TestCorrectness
Overfitting degree
Robustness
Security
Privacy
Fairness
Interpretability
Efficiency
ML Properties to TestCorrectness
Overfitting degree
Robustness
Security
Privacy
Fairness
Interpretability
functional
ML Properties to TestCorrectness
Overfitting degree
Robustness
Security
Privacy
Fairness
Interpretability
Efficiency
non-functional
Testing Workflow
Metamorphic relationsDifferential TestingN-version ProgrammingMetrics
Fuzzing symbolic executionDomain-specific synthesis: GAN
CoverageMutation testing
Input generation
oracle
adequacyevaluation
Supervised/Unsupervised/Reinforcement Learning
Machine Learning Properties
Challenges● Test Input Generation:
○ Safety-critical: Space coverage: enormous behaviour space of ML models
● The Oracle Problem: how to reduce false positives.
● Testing Cost Reduction.
Research Opportunities
● Test Input Generation:
○ Safety-critical: Space coverage
● The Oracle Problem: how to reduce false positives.
● Testing Cost Reduction.
SBST
Testability Transformation(Harman 2004)
Research Opportunities
● Test Input Generation:
○ Safety-critical: Space coverage
● The Oracle Problem: how to reduce false positives.
● Testing Cost Reduction.
Multiple oracles
Metamorphic relation: largely depend on inputs
Research Opportunities
● Test Input Generation:
○ Safety-critical: Space coverage
● The Oracle Problem: how to reduce false positives.
● Testing Cost Reduction.Test-cost reduction Transformation
Research Opportunities
● Unsupervised/reinforcement learning testing
● More machine learning properties: privacy, fairness
● More testing benchmarks
● More testing tools
● More testing activities
Model Selection and Overfitting Problem
Overfitting happens when a model :learns the detail and noise in the training data too well; the noise or random fluctuations in the training data is picked up and learned as concepts by the model.
Existing Research
cross-validation:
test error to approximate the generalisation error.
● test sample may not be be representative● different models may have very similar cross validation results
Applied Machine Learning
VC-dimension and Rademacher Complexity:
● only measure the complexity of the hypothesis space, not the fit
● both quantities are usually intractable in practice,
● Only provide bounds and the bound can be quite loose.
Statistical Machine Learning
Our approach
Create different data versions by injecting noise
Retrain the model on each data version
Observe the decrease of training accuracy
Mutation Metamorphic Relations
On real data sets
● Is now dealing with feedback from authors of the collected papers (have got around 70 responses so far).
● Will be on arXiv very soon.● Feel free to ask for a draft ([email protected])● Any feedback is very welcome!