
Master Thesis Defense
Knowledge Production and Control of a Black Box Using Machine Learning
Student: Christopher W. Blake
Supervisor: Prof. Eirini Ntoutsi
November 2018

Christopher Blake, November 2018

Overview

§ Background + Motivation
§ Problem Statement
§ Related Work
§ Mathematical Formulation
§ Learning Theory
§ Experimental Setup
§ Results
§ Conclusion
§ Recommendations/Future

[Figure: two signal streams, Stream 1 and Stream 2, plotted against Time]

[Figure: system architecture. Blocks: Black Box, Discretization, Low Level Knowledge, Knowledge Production, High-Level Knowledge, Signal Interpreter, ID Manager, External Labels, Policy Trainer, Signal Generator. Signals: Control Information, Goal State, Result State.]

[Figure: learned decision tree. The root tests Bool2 (branches: (108) True, (112) Stay True, (147) False, (151) Stay False, (154) False to True, (156) True to False). Inner nodes test Bool1 (branches: (41) False, (45) Stay False, (164) True, (166) Stay True, (167) True to False, (168) False to True). Leaves give probabilities for the output knowledge, for example (28) False : 84%, (36) True : 16%.]


[Figure: robotic arm with labels Motor 1, Motor 2, Motor 3, X0, Y0, L1, L2, L3, T1, T3, x, y]


Background

§ 7 Years Industry, Product Development
§ B.S. Mechanical Engineering
§ Ex-Dancer
§ Ex-Gymnast

Motivation and Inspiration
§ Development Times
§ Testing Times
§ Learning Dance (as an Adult)
§ Observing Children (and Adults)


Problem Statement

§ Complex products/systems are regularly developed.
§ Requires a costly-to-develop control process.
§ Requires extensive analysis and domain knowledge.

Objectives
✓ Creation of an automatic or semi-automatic method for development of control systems.
✓ Identification of the unique data experienced by a model.
✓ Identification of the repeating structures within the unique data.
✓ Identification of the primitive functions of the model, providing the primitive control mechanism.

Enables
✓ Shorter time-to-market
✓ Extended analysis
✓ More capable products


Related Work

*Language Learning

§ Neural Networks
§ Genetic Algorithms
§ Traditional Feedback

Mixed Approach

Disadvantages
§ < 5 Parameter Optimization
§ Require Prior Knowledge
§ Require Domain Knowledge

§ Optimal Control
§ Adaptive Control
§ Robust Control
§ Intelligent Control

Iterative Learning Control


Most Similar Work

Process
1. Neural network (NN) trained to emulate the plant.
2. Controller trained on emulated plant NN.
3. NN backpropagation provides error for controller training.

Kevin L Moore. Iterative Learning Control. Iterative Learning Control for Deterministic Systems, pages 425–488, 1993.

Advantages
§ Handles more variables
§ Automatic method
§ Intelligent process

Disadvantages
§ Only deterministic systems
§ Prior info of plant
  § Feedback type
  § Degrees of freedom
§ Error info not real
§ Domain knowledge


This Work

[Figure: system architecture. Blocks: Black Box (Object of Control), Quantization, Low Level Knowledge, Knowledge Extraction, Signal Interpreter, ID Manager, External Labels, Policy Trainer, Signal Generator. Signals: Control Information, Goal State, Result State.]

Advantages
§ Easy-To-Read Results
§ Less Prior Information
§ Stream-Based
§ Non-Deterministic Systems
§ Extensible Architecture
§ Layered Knowledge


Formulation – Uncertain Model

Black Box (with internal memory)

Function Space: $F = \{f_1(x), f_2(x), \ldots, f_k(x)\}$
Input Space: $x \in X^n$
Output Space: $y \in Y^m$

$n$ = number of inputs
$m$ = number of outputs
$k$ = number of functions

Movement Sound Math




Formulation – Quantization

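Bin Statistics
$sum = \sum_i v_i$
$sqsum = \sum_i v_i^2$
$n = \sum_i 1$
$\mu = sum / n$
$\sigma$ = standard deviation of the bin, computed from $sqsum$, $sum$ and $n$

Action Space
$s = split(inner\_value)$
$m = merge(neighbor)$

[Figure: a bin spanning Low to High, with markers at $-6\sigma$, $\mu$ and $+6\sigma$]

Nomenclature
[Low (−6σ) |μ| (+6σ) High] [Data Count]
[0.00 (0.97) |5.01| (9.02) 10.00] [47]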
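The bin statistics above can be kept while streaming, without storing the data. The following is a minimal sketch, not the thesis implementation: the class name `Bin`, its field names, and the use of the population standard deviation are assumptions, and the two inner markers are taken to mean μ − 6σ and μ + 6σ.

```python
import math
from dataclasses import dataclass


@dataclass
class Bin:
    """One quantization bin over [low, high] with streaming statistics (hypothetical)."""
    low: float
    high: float
    n: int = 0              # data count
    total: float = 0.0      # running sum of values
    sq_total: float = 0.0   # running sum of squared values

    def add(self, v: float) -> None:
        """Update the running sums with one new value."""
        self.n += 1
        self.total += v
        self.sq_total += v * v

    @property
    def mean(self) -> float:
        return self.total / self.n

    @property
    def std(self) -> float:
        # Assumed form: population standard deviation from the running sums.
        var = self.sq_total / self.n - self.mean ** 2
        return math.sqrt(max(var, 0.0))

    def describe(self, k: float = 6.0) -> str:
        """Format the bin in the slide's nomenclature (markers assumed at mean -/+ k*std)."""
        m, s = self.mean, self.std
        return (f"[{self.low:.2f} ({m - k * s:.2f}) |{m:.2f}| "
                f"({m + k * s:.2f}) {self.high:.2f}] [{self.n}]")


b = Bin(low=0.0, high=10.0)
for value in [4.0, 5.0, 6.0]:
    b.add(value)
print(b.describe())
```

Only three numbers per bin are stored, so the memory cost stays constant as the stream grows. A bin with no data has no mean, so `describe` should only be called after at least one `add`.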




Theory - Quantization

Outside → split(v)
Flat → split(v)

Well-Formed Range

Low, −∞ → split(−6σ)
High, +∞ → split(+6σ)
Overlap, low → merge(low)
Overlap, high → merge(high)




Formulation – Knowledge Model

Knowledge Space: $c \in C$

$c_k = (c_a, c_b, c_c)$  Sequentiality
$c_k = (c_a; c_b)$  Simultaneity

[Figure: the stream c1, c2, c3, c3, c2, c1 over times t1 to t6 is combined into higher-level knowledge c4, c6, c8, then c9, c8, then c11]




Theory – Knowledge Extraction

[Figure: Sequentiality. A single stream of c1 and c2 symbols produces the new knowledge c3, c4, c5, c6.]

[Figure: Simultaneity. Stream 1 (symbols c1, c2) and Stream 2 (symbols c3, c4), observed at the same time, produce the new knowledge c5, c6, c7.]

Raw Value:    5.0  0.2  0.0 -0.1  0.1  0.3 -0.1  0.1  0.3  5.1  4.9  4.8  5.2  5.1  5.0 -0.2  0.1  0.2  0.0  0.1  0.1
Quantized:      5    0    0    0    0    0    0    0    0    5    5    5    5    5    5    0    0    0    0    0    0
Interpreted:   c2   c1   c1   c1   c1   c1   c1   c1   c1   c2   c2   c2   c2   c2   c2   c1   c1   c1   c1   c1   c1
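The step from raw values to quantized values to interpreted knowledge IDs can be sketched as follows. This is an illustration under assumptions: bins are reduced to two fixed centers (0 and 5), quantization picks the nearest center, and the function names and the ID numbering for simultaneity are hypothetical. IDs are plain integers, so 1 stands for c1, 2 for c2, and so on.

```python
def quantize(value, centers):
    """Return the bin center closest to the raw value (assumed quantization rule)."""
    return min(centers, key=lambda c: abs(value - c))


def interpret(values, centers, ids):
    """Map each raw value to the knowledge ID of its quantized level."""
    return [ids[quantize(v, centers)] for v in values]


def simultaneity(stream_a, stream_b, first_new_id):
    """Assign one new ID to each distinct pair of IDs seen at the same time step."""
    known = {}
    combined = []
    for pair in zip(stream_a, stream_b):
        if pair not in known:
            known[pair] = first_new_id + len(known)
        combined.append(known[pair])
    return combined, known


# Raw values in the order they appear on the slide.
raw = [5.0, 0.2, 0.0, -0.1, 0.1, 0.3, -0.1, 0.1, 0.3, 5.1, 4.9,
       4.8, 5.2, 5.1, 5.0, -0.2, 0.1, 0.2, 0.0, 0.1, 0.1]
centers = [0.0, 5.0]
ids = {0.0: 1, 5.0: 2}  # 0 -> c1, 5 -> c2

quantized = [quantize(v, centers) for v in raw]
interpreted = interpret(raw, centers, ids)

# Two short, made-up streams: stream 1 uses c1/c2, stream 2 uses c3/c4.
combined, pairs = simultaneity([1, 1, 2], [3, 4, 4], first_new_id=5)
```

With these inputs the quantized and interpreted sequences match the rows on the slide, and the three distinct simultaneous pairs receive the IDs 5, 6 and 7.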




Theory – Interpretation

[Figure: the stream c1, c2, c3, c3, c2, c1 over times t1 to t6, interpreted in four passes]

Pass  Interpretation           Generated Knowledge
1     c1; c2; c3; c3; c2; c1   c4 = (c1; c2), c5 = (c2; c3), c6 = (c3; c3), c7 = (c3; c2), c8 = (c2; c1)
2     c4; c6; c8               c9 = (c4; c6), c10 = (c6; c8)
3     c9; c8                   c11 = (c9; c8)
4     c11 (done)
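The four passes listed for the stream c1; c2; c3; c3; c2; c1 can be reproduced with a short loop. The sketch below is inferred from that one example and is not taken from the thesis: it assumes that every adjacent pair in a pass receives a new ID in order of first appearance, and that the next pass is built by pairing elements greedily from the left, carrying an unpaired last element over unchanged. The function name `run_passes` is hypothetical, and integers stand for knowledge IDs (1 for c1, and so on).

```python
def run_passes(stream, next_id):
    """Repeatedly combine adjacent knowledge IDs until one ID covers the stream."""
    knowledge = {}            # (left, right) -> generated ID
    passes = [list(stream)]
    while len(stream) > 1:
        # Generate knowledge: every adjacent pair gets an ID on first sight.
        for pair in zip(stream, stream[1:]):
            if pair not in knowledge:
                knowledge[pair] = next_id
                next_id += 1
        # Interpret: pair elements left to right without overlap (assumed rule).
        interpreted = []
        i = 0
        while i < len(stream):
            if i + 1 < len(stream):
                interpreted.append(knowledge[(stream[i], stream[i + 1])])
                i += 2
            else:
                interpreted.append(stream[i])  # unpaired element carries over
                i += 1
        stream = interpreted
        passes.append(stream)
    return knowledge, passes


knowledge, passes = run_passes([1, 2, 3, 3, 2, 1], next_id=4)
for number, interpretation in enumerate(passes, start=1):
    print(number, interpretation)
```

Each pass at least halves the stream (rounding up), so the loop always terminates, and the dictionary plays the role of the ID Manager by returning the same ID whenever a pair recurs.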



Formulation – RLDT

State $s \in S$
§ Set of feature-value pairs.

Actions $Q, R \in A$
§ Query: ask for another feature's value
§ Report: pick a classification label

Value Function $V$
§ (−) for each query
§ (+) for each correct classification
§ (−) for each incorrect classification

Policy (Decision Tree)
1. Good classification results
2. Fewer queries

[Figure: tree of states. The root state $(f_1, f_2, f_3)$ leads to $(f_1, 0, f_3)$ after querying $f_2$ and to $(1, f_2, f_3)$ after querying $f_1$; a second query leads to $(1, 0, f_3)$, followed by Class.]



Feature   Values
Color     white, brown,
Bruise    yes, no
Odor      choc, fruity, none
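The query/report setting can be pictured as a small episode object. This is a hedged sketch rather than the RLDT algorithm itself: the class name `Episode`, the reward magnitudes, and the class label "edible" are assumptions made for illustration, since the slide gives only the signs of the rewards and does not name the labels.

```python
from dataclasses import dataclass, field

# Illustrative reward values; only their signs come from the slide.
QUERY_COST = -1.0
CORRECT_REWARD = 10.0
WRONG_PENALTY = -10.0


@dataclass
class Episode:
    sample: dict                                 # full feature -> value map, hidden from the agent
    label: str                                   # true class of the sample
    state: dict = field(default_factory=dict)    # feature-value pairs revealed so far
    total_reward: float = 0.0
    done: bool = False

    def query(self, feature: str) -> str:
        """Query action: reveal one more feature value at a small cost."""
        self.state[feature] = self.sample[feature]
        self.total_reward += QUERY_COST
        return self.state[feature]

    def report(self, guess: str) -> float:
        """Report action: pick a classification label and end the episode."""
        self.total_reward += CORRECT_REWARD if guess == self.label else WRONG_PENALTY
        self.done = True
        return self.total_reward


# Hypothetical sample built from the feature table; the label is made up.
episode = Episode(sample={"color": "white", "bruise": "no", "odor": "none"},
                  label="edible")
odor = episode.query("odor")
final_reward = episode.report("edible")
```

Because each query costs reward, a policy that maximizes the total is pushed toward classifying correctly with as few queries as possible, which corresponds to the two goals listed for the decision tree.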


Combined Processes




Experimentation

Robotic Arm
[Figure: robotic arm with labels Motor 1, Motor 2, Motor 3, X0, Y0, L1, L2, L3, T1, T3, x, y]
Black Box (angle1, length2, angle3): motor1, motor2, motor3 → x, y

Trigonometric Functions
Black Box: angle → sin, cos, tan

Logic Operators
Black Box: bool1, bool2 → and, or, xor

[Figure: signal plots against Time, including Stream 1 and Stream 2]
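The logic-operator black box can be imitated with a few lines for experimentation. This is a sketch under assumptions: signals are encoded as 0.00 for False and 5.00 for True, matching the bin contents reported in the results, while the function name and the 2.5 decision threshold are hypothetical.

```python
TRUE_LEVEL = 5.0    # signal level reported for True in the results tables
FALSE_LEVEL = 0.0   # signal level reported for False in the results tables


def logic_black_box(bool1: float, bool2: float, threshold: float = 2.5) -> dict:
    """Return the and/or/xor outputs as signal levels for two input signals."""
    a = bool1 > threshold   # assumed decoding of the input level
    b = bool2 > threshold

    def level(flag: bool) -> float:
        return TRUE_LEVEL if flag else FALSE_LEVEL

    return {"and": level(a and b), "or": level(a or b), "xor": level(a != b)}


# Drive the box with every input combination, as a signal generator might.
for x1 in (FALSE_LEVEL, TRUE_LEVEL):
    for x2 in (FALSE_LEVEL, TRUE_LEVEL):
        print(x1, x2, logic_black_box(x1, x2))
```

The learner never sees this code; it only observes the input and output streams, which is what makes the box "black".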


Results – Logic Operations

Black Box: bool1, bool2 → and, or, xor

Bool1
ID    Name            Content
8                     [-∞ |∞| 0.00]
41    False           [0.00 (0.00) |0.00| (0.00) 0.00]
45    Stay False      (41; 41)
164   True            [0.00 (5.00) |5.00| (5.00) 5.00]
165                   [5.00 |∞| ∞]
166   Stay True       (164; 164)
167   True to False   (164; 41)
168   False to True   (41; 164)

Exclusive Or
ID    Name            Content
14                    [-∞ |∞| 0.00] (∞)
28    False           [0.00 (0.00) |0.00| (0.00) 0.56]
35                    [0.56 |∞| 5.00] (∞)
36    True            [5.00 (5.00) |5.00| (5.00) ∞]
44    Stay True       (36; 36)
50    True to False   (36; 28)
56    Stay False      (28; 28)
65    False to True   (28; 36)


Results – Logic Operations

[Figure: Errors (0% to 30%) versus Passes (0 to 6000) for And, Or, Xor]

[Figure: learned decision tree for the logic black box. Root: Bool2; inner nodes: Bool1; labeled branches: (168) False to True, (167) True to False; labeled leaves: (28) False : 85%, (36) True : 15% and (28) False : 14%, (36) True : 86%]



Results – Trig Functions

[Figure: two panels, after 5 Passes and after 35 Passes, showing sin, cos, tan and Predicted Sin, Predicted Cos, Predicted Tan against Angle [deg] (0 to 200), with values from -3 to 3]


Results – Trig Functions

[Figure: MSE (0.000 to 0.500) versus Passes (0 to 40) for Sin and Cos. *Tan not shown]



Results – Robotic Arm

Black Box (angle1, length2, angle3): motor1, motor2, motor3 → x, y

[Figure: robotic arm with labels Motor 1, Motor 2, Motor 3, X0, Y0, L1, L2, L3, T1, T3, x, y]

§ Input Vocabulary
Learned
§ Output Vocabulary
§ Policy Generation
Failed

[Figure: Expected Result]


Future

§ Expanded Knowledge Production
§ Device Safety
§ Closed-Loop Control


Conclusions (1/2)

§ The unique values of the inputs and outputs of a system can be identified and labeled while streaming, assuming a Gaussian distribution.

§ Complex knowledge can be developed out of the simple concepts of simultaneity and sequentiality.

§ Simple (low-level) knowledge can be combined to form more complex (higher-level) knowledge, and be tracked.

§ A knowledge-based decision tree can be generated using reinforcement learning.

Conclusions (2/2)

§ Testing shows that systems with binary, categorical, and even continuous data can be learned.

§ Testing shows that only deterministic systems can currently be learned.

§ Dynamic systems are theoretically possible with adaptation of the learning process, but this is not yet tested.


References

§ Abhinav Garlapati, Aditi Raghunathan, Vaishnavh Nagarajan, and Balaraman Ravindran. A Reinforcement Learning Approach to Online Learning of Decision Trees. Technical report, Department of Computer Science, Indian Institute of Technology, Madras, 2015.

§ Simon Kirby. Language evolution without natural selection: From vocabulary to syntax in a population of learners. Edinburgh Occasional Papers in Linguistics, Technical report, 1998.

§ Kevin L Moore. Iterative Learning Control. Iterative Learning Control for Deterministic Systems, pages 425–488, 1993.

§ Derrick H. Nguyen and Bernard Widrow. Neural networks for self-learning control systems, 1991.

§ Youqing Wang, Furong Gao, and Francis J. Doyle III. Survey on iterative learning control, repetitive control, and run-to-run control. Journal of Process Control, 19:1589–1600, 2009.


THANK YOU
