additional material: mahalanobis distance

77
S N F EURO UZZY Prof. Dr. Rudolf Kruse 1 Additional Material: Mahalanobis Distance

Upload: tudor

Post on 05-Jan-2016

72 views

Category:

Documents


3 download

DESCRIPTION

Additional Material: Mahalanobis Distance. Interpretation of a Covariance Matrix. A univariate normal distribution has the density function A multivariate normal distribution has the density function. Variance and Standard Deviation. - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: Additional Material: Mahalanobis Distance

SNFEURO

UZZYProf. Dr. Rudolf Kruse 1

Additional Material:Mahalanobis Distance

Page 2: Additional Material: Mahalanobis Distance

SNFEURO

UZZYProf. Dr. Rudolf Kruse 2

Interpretation of a Covariance Matrix

A univariate normal distribution has the density function

A multivariate normal distribution has the density function

Page 3: Additional Material: Mahalanobis Distance

SNFEURO

UZZYProf. Dr. Rudolf Kruse 3

Variance and Standard Deviation

Univariate Normal/Gaussian DistributionThe variance/standard deviation provides information about the height of the mode and the width of the curve.

Page 4: Additional Material: Mahalanobis Distance

SNFEURO

UZZYProf. Dr. Rudolf Kruse 4

Interpretation of a Covariance Matrix

The variance/standard deviation relates the spread of the distribution to the spread of a standard normal distribution

The covariance matrix relates the spread of the distribution to the spread of a multivariate standard normal distribution

Example: bivariate normal distribution

Question: Is there a multivariate analog of standard deviation?

Page 5: Additional Material: Mahalanobis Distance

SNFEURO

UZZYProf. Dr. Rudolf Kruse 5

Eigenvalue Decomposition

Yields an analog of standard deviation.Let S be a symmetric, positive definite matrix (e.g. a covariance

matrix).

Page 6: Additional Material: Mahalanobis Distance

SNFEURO

UZZYProf. Dr. Rudolf Kruse 6

Eigenvalue Decomposition

Special Case: Two Dimensions

Page 7: Additional Material: Mahalanobis Distance

SNFEURO

UZZYProf. Dr. Rudolf Kruse 7

Eigenvalue Decomposition

Page 8: Additional Material: Mahalanobis Distance

SNFEURO

UZZYProf. Dr. Rudolf Kruse 8

Eigenvalue Decomposition

Page 9: Additional Material: Mahalanobis Distance

SNFEURO

UZZYProf. Dr. Rudolf Kruse 9

Eigenvalue Decomposition

Special Case: Two Dimensions

Page 10: Additional Material: Mahalanobis Distance

SNFEURO

UZZYProf. Dr. Rudolf Kruse 10

Cluster-Specific Distance Functions

The similarity of a data point to a prototype depends on their distance. If the cluster prototype is a simple cluster center, a general distance

measure can be defined on the data space.In this case the Euclidean distance is most often used due to its rotation invariance. It leads to (hyper-)spherical clusters.

However, more flexible clustering approaches (with size and shape parameters) use cluster-specific distance functions.The most common approach is to use a Mahalanobis distance with a cluster-specific covariance matrix.

The covariance matrix comprises shape and size parameters.The Euclidean distance is a special case that results for

Page 11: Additional Material: Mahalanobis Distance

SNFEURO

UZZYProf. Dr. Rudolf Kruse 11

Additional Material:

Neuro-Fuzzy Systems

Page 12: Additional Material: Mahalanobis Distance

SNFEURO

UZZYProf. Dr. Rudolf Kruse 12

Beispiel : Automatik-Getriebe

Aufgabe: Verbesserung des VWAutomatik-Getriebes - keine zusätzlichen Sensoren - individuelle Anpassung des Schaltverhaltens

Idee (1995):Das Fahrzeug “beobachtet” und klassifiziert den Fahrer nach

Sportlichkeit - ruhig, normal, sportlich Bestimmung eines Sport-Faktors aus [0,

1] - nervös Beruhigung des Fahrers

Testfahrzeug: - verschiedene Fahrer, Klassifikation durch Experten (Mitfahrer) - gleichzeitige Messungen:Geschwindigkeit,Position,Geschwindigkeit des Gaspedals,Winkel des Lenkrades, ... (14 Attribute).

Page 13: Additional Material: Mahalanobis Distance

SNFEURO

UZZYProf. Dr. Rudolf Kruse 13

Modellierung unscharfer Informationen mit Fuzzy-Mengen

Zugehörigkeitsgrad

negativgroß

negativmittel

negativklein

ungefährnull

positivklein

positivmittel

positivgroß

1

138

ungefähr 13

6

etwa zwischen6 und 8

fast genau 2

2

Page 14: Additional Material: Mahalanobis Distance

SNFEURO

UZZYProf. Dr. Rudolf Kruse 14

Example:Continously Adapting Gear Shift Schedule in VW New Beetle

classification of driver / driving situationby fuzzy logic

accelerator pedal

filtered speed ofaccelerator pedal

number ofchanges in pedal direction

sport factor [t-1]

gear shiftcomputation

rulebase

sportfactor [t]

determinationof speed limitsfor shiftinginto higher orlower geardepending onsport factor

gearselection

fuzzification inferencemachine

defuzzifi-cation

interpolation

Page 15: Additional Material: Mahalanobis Distance

SNFEURO

UZZYProf. Dr. Rudolf Kruse 15

1 1 1

1 1 1

1

Eingabewerte:Stellwert:

If X is positive small and Y is positive small then Z is positive small

If X is positive big and Y is positive small then Z is positive big

defuzzifizierter Wert

x X

x X y

Y

Y

y

x und yz

Z

Z

Z

Page 16: Additional Material: Mahalanobis Distance

SNFEURO

UZZYProf. Dr. Rudolf Kruse 16

Fuzzy-Regler mit 7 Regeln

Optimiertes Programm

DigimataufROMByte702

RAMByte24

AG 4

Laufzeit 80 ms,12 mal pro Sekunde wird ein neuer Sportfaktor bestimmt

In Serie im VW Konzern

Erlernen von Regelsystemen mit Hilfe von Künstlichen Neuronalen Netzen, Optimierung mit evolutionären Algorithmen

Page 17: Additional Material: Mahalanobis Distance

SNFEURO

UZZYProf. Dr. Rudolf Kruse 17

Beispiel : Fuzzy Datenbank

TOPMANAGEMENT

MANAGEMENT

NACH-FOLGER TALENTBANK

Nachfolger für Top-Management Positionen

Page 18: Additional Material: Mahalanobis Distance

SNFEURO

UZZYProf. Dr. Rudolf Kruse 18

Page 19: Additional Material: Mahalanobis Distance

SNFEURO

UZZYProf. Dr. Rudolf Kruse 19

Beispiel : Automatisiertes sensor-basiertes Landen

Page 20: Additional Material: Mahalanobis Distance

SNFEURO

UZZYProf. Dr. Rudolf Kruse 20

Neuro-Fuzzy Systems

Building a fuzzy system requires

prior knowledge (fuzzy rules, fuzzy sets)

manual tuning: time consuming and error-prone

Therefore: Support this process by learning

learning fuzzy rules (structure learning)

learning fuzzy set (parameter learning)

Approaches from Neural Networks can be used

Page 21: Additional Material: Mahalanobis Distance

SNFEURO

UZZYProf. Dr. Rudolf Kruse 21

Page 22: Additional Material: Mahalanobis Distance

SNFEURO

UZZYProf. Dr. Rudolf Kruse 22

Example: Prognosis of the Daily Proportional Changes of the DAX at the Frankfurter Stock Exchange (Siemens)

Database: time series from 1986 - 1997

DAX Composite DAX

German 3 month interest rates Return Germany

Morgan Stanley index Germany Dow Jones industrial index

DM / US-$ US treasury bonds

Gold price Nikkei index Japan

Morgan Stanley index Europe Price earning ratio

Page 23: Additional Material: Mahalanobis Distance

SNFEURO

UZZYProf. Dr. Rudolf Kruse 23

Fuzzy Rules in Finance Trend Rule

IF DAX = decreasing AND US-$ = decreasingTHEN DAX prediction = decreaseWITH high certainty

Turning Point RuleIF DAX = decreasing AND US-$ = increasingTHEN DAX prediction = increaseWITH low certainty

Delay RuleIF DAX = stable AND US-$ = decreasingTHEN DAX prediction = decreaseWITH very high certainty

In generalIF x1 is 1 AND x2 is 2

THEN y = WITH weight k

Page 24: Additional Material: Mahalanobis Distance

SNFEURO

UZZYProf. Dr. Rudolf Kruse 24

Classical Probabilistic Expert Opinion Pooling Method

DM analyzes each source (human expert, data + forecasting model) in terms of (1) Statistical accuracy, and (2) Informativeness by asking the source to asses quantities (quantile assessment)

DM obtains a “weight” for each source

DM “eliminates” bad sources

DM determines the weighted sum of source outputs

Determination of “Return of Invest”

Page 25: Additional Material: Mahalanobis Distance

SNFEURO

UZZYProf. Dr. Rudolf Kruse 25

E experts, R quantiles for N quantities

each expert has to asses R·N values stat. Accuracy:

information score:

weight for expert e:

outputt=

roi =

E

e

etew

1output

T

ttty

1

DMoutputsign

Ee eeee

eeee

cIc

cIcw

1 id

id

N

i riri

rR

rroiRi vv

ppvv

NI

1 1,,

11

11,1, lnln

1

R

i

iiR p

sspsIpsINC

0

2 ln, ,,21

Page 26: Additional Material: Mahalanobis Distance

SNFEURO

UZZYProf. Dr. Rudolf Kruse 26

Formal Analysis

Sources of informationR1 rule set given by expert 1R2 rule set given by expert 2D data set (time series)

Operator schemafuse (R1, R2) fuse two rule setsinduce(D) induce a rule set from Drevise(R, D) revise a rule set R by D

Page 27: Additional Material: Mahalanobis Distance

SNFEURO

UZZYProf. Dr. Rudolf Kruse 27

Formal Analysis

Strategies:

fuse(fuse (R1, R2), induce(D))

revise(fuse(R1, R2), D) fuse(revise(R1, D), revise(R2, D))

Technique: Neuro-Fuzzy SystemsNauck, Klawonn, Kruse, Foundations of Neuro-Fuzzy

Systems, Wiley 97SENN (commercial neural network environment, Siemens)

Page 28: Additional Material: Mahalanobis Distance

SNFEURO

UZZYProf. Dr. Rudolf Kruse 28

Neuro-Fuzzy Architecture

Page 29: Additional Material: Mahalanobis Distance

SNFEURO

UZZYProf. Dr. Rudolf Kruse 29

From Rules to Neural Networks

1. Evaluation of membership degrees

2. Evaluation of rules (rule activity)

3. Accumulation of rule inputs and normalization

NF: IRn IR,

r

l r

j jj

lll

xk

xkwx

1

1

lD

j ijsc xx

1

)(,

l: IRn [0,1]r,

Page 30: Additional Material: Mahalanobis Distance

SNFEURO

UZZYProf. Dr. Rudolf Kruse 30

The Semantics-Preserving Learning Algorithm

Reduction of the dimension of the weight space

1. Membership functions of different inputs share their parameters, e.g.

2. Membership functions of the same input variable are not allowed to pass each other, they must keep their original order, e.g.

Benefits: the optimized rule base can still be interpreted the number of free parameters is reduced

stablecdax

stabledax

increasingstabledecreasing

Page 31: Additional Material: Mahalanobis Distance

SNFEURO

UZZYProf. Dr. Rudolf Kruse 31

Return-on-Investment Curves of the Different Models

Validation data from March 01, 1994 until April 1997

Page 32: Additional Material: Mahalanobis Distance

SNFEURO

UZZYProf. Dr. Rudolf Kruse 32

Neuro-Fuzzy Systems in Data Analysis

Neuro-Fuzzy System:System of linguistic rules (fuzzy rules).Not rules in a logical sense, but function

approximation.Fuzzy rule = vague prototype / sample.

Neuro-Fuzzy-System:Adding a learning algorithm inspired by neural

networks.Feature: local adaptation of parameters.

Page 33: Additional Material: Mahalanobis Distance

SNFEURO

UZZYProf. Dr. Rudolf Kruse 33

A Neuro-Fuzzy System

is a fuzzy system trained by heuristic learning techniques derived from neural networks

can be viewed as a 3-layer neural network with fuzzy weights and special activation functions

is always interpretable as a fuzzy system

uses constraint learning procedures

is a function approximator (classifier, controller)

Page 34: Additional Material: Mahalanobis Distance

SNFEURO

UZZYProf. Dr. Rudolf Kruse 34

Learning Fuzzy Rules

Cluster-oriented approaches=> find clusters in data, each cluster is a rule

Hyperbox-oriented approaches=> find clusters in the form of hyperboxes

Structure-oriented approaches=> used predefined fuzzy sets to structure the

data space, pick rules from grid cells

Page 35: Additional Material: Mahalanobis Distance

SNFEURO

UZZYProf. Dr. Rudolf Kruse 35

Hyperbox-Oriented Rule Learning

y

x

Search for hyperboxes in the data space

Create fuzzy rules by projecting the hyperboxes

Fuzzy rules and fuzzy sets are created at the same time

Usually very fast

Page 36: Additional Material: Mahalanobis Distance

SNFEURO

UZZYProf. Dr. Rudolf Kruse 36

Hyperbox-Oriented Rule Learning

Detect hyperboxes in the data, example: XOR function Advantage over fuzzy cluster anlysis:

No loss of information when hyperboxes are represented as fuzzy rules

Not all variables need to be used, don‘t care variables can be discovered

Disadvantage: each fuzzy rules uses individual fuzzy sets, i.e. the rule base is complex.

y

x

y

x

y

x

y

x

Page 37: Additional Material: Mahalanobis Distance

SNFEURO

UZZYProf. Dr. Rudolf Kruse 37

Structure-Oriented Rule Learning

small medium large

x

y

smal

lm

ediu

mla

rge

Provide initial fuzzy sets for all variables.

The data space is partitioned by a fuzzy grid

Detect all grid cells that contain data (approach by Wang/Mendel 1992)

Compute best consequents and select best rules (extension by Nauck/Kruse 1995, NEFCLASS model)

Page 38: Additional Material: Mahalanobis Distance

SNFEURO

UZZYProf. Dr. Rudolf Kruse 38

Structure-Oriented Rule Learning

Simple: Rule base available after two cycles through the training data1. Cycle: discover all antecedents

2. Cycle: determine best consequents

Missing values can be handled

Numeric and symbolic attributes can be processed at the same time (mixed fuzzy rules)

Advantage: All rules share the same fuzzy sets

Disadvantage: Fuzzy sets must be given

Page 39: Additional Material: Mahalanobis Distance

SNFEURO

UZZYProf. Dr. Rudolf Kruse 39

Learning Fuzzy Sets

Gradient descent proceduresonly applicable, if differentiation is possible, e.g. for Sugeno-type fuzzy systems.

Special heuristic procedures that do not use gradient information.

The learning algorithms are based on the idea of backpropagation.

Page 40: Additional Material: Mahalanobis Distance

SNFEURO

UZZYProf. Dr. Rudolf Kruse 40

Learning Fuzzy Sets: Constraints

Mandatory constraints: Fuzzy sets must stay normal and convex Fuzzy sets must not exchange their relative positions (they

must not „pass“ each other) Fuzzy sets must always overlap

Optional constraints Fuzzy sets must stay symmetric Degrees of membership must add up to 1.0

The learning algorithm must enforce these constraints.

Page 41: Additional Material: Mahalanobis Distance

SNFEURO

UZZYProf. Dr. Rudolf Kruse 41

Example: Medical Diagnosis

Results from patients tested for breast cancer (Wisconsin Breast Cancer Data).

Decision support: Do the data indicate a malignant or a benign case?

A surgeon must be able to check the classification for plausibility.

We are looking for a simple and interpretable classifier: knowledge discovery.

Page 42: Additional Material: Mahalanobis Distance

SNFEURO

UZZYProf. Dr. Rudolf Kruse 42

Example: WBC Data Set

699 cases (16 cases have missing values).

2 classes: benign (458), malignant (241).

9 attributes with values from {1, ... , 10}(ordinal scale, but usually interpreted as a numerical scale).

Experiment: x3 and x6 are interpreted as nominal attributes.

x3 and x6 are usually seen as „important“ attributes.

Page 43: Additional Material: Mahalanobis Distance

SNFEURO

UZZYProf. Dr. Rudolf Kruse 43

Applying NEFCLASS-J

Tool for developing Neuro-Fuzzy Classifiers

Written in JAVA

Free version for research available

Project started at Neuro-Fuzzy Group of University of Magdeburg,

Germany

Page 44: Additional Material: Mahalanobis Distance

SNFEURO

UZZYProf. Dr. Rudolf Kruse 44

NEFCLASS: Neuro-Fuzzy Classifier

Input variables (attributes)

Fuzzy rules

Output variables (class labels)

Fuzzy sets (antecedents)

Unweighted connections

Page 45: Additional Material: Mahalanobis Distance

SNFEURO

UZZYProf. Dr. Rudolf Kruse 45

NEFCLASS: Features

Automatic induction of a fuzzy rule base from data

Training of several forms of fuzzy sets

Processing of numeric and symbolic attributes

Treatment of missing values (no imputation)

Automatic pruning strategies

Fusion of expert knowledge and knowledge obtained

from data

Page 46: Additional Material: Mahalanobis Distance

SNFEURO

UZZYProf. Dr. Rudolf Kruse 46

Representation of Fuzzy Rules

x y

R1 R2

c1 c2

large

smalllarge

Example: 2 Rules

R1: if x is large and y is small, then class is c1.

R2: if x is large and y is large, then class is c2.

The connections x R1 and x R2

are linked.

The fuzzy set large is a shared weight.

That means the term large has always the

same meaning in both rules.

Page 47: Additional Material: Mahalanobis Distance

SNFEURO

UZZYProf. Dr. Rudolf Kruse 47

1. Training Step: Initialisation

small medium large

x

y

small

medium

large

Specify initial fuzzy partitions for all input variables

x y

c1 c2

Page 48: Additional Material: Mahalanobis Distance

SNFEURO

UZZYProf. Dr. Rudolf Kruse 48

2. Training Step: Rule Base

Algorithm:

for (all patterns p) dofind antecedent A,such that A( p) is maximal;if (A L) then add A to L;

end;

for (all antecedents A L) dofind best consequent C for A;create rule base candidate R = (A,C);Determine the performance of R;Add R to B;

end;

Select a rule base from B;

Variations:

Fuzzy rule bases can also be created by using prior knowledge, fuzzy cluster analysis, fuzzy decision trees, genetic algorithms, ...

Page 49: Additional Material: Mahalanobis Distance

SNFEURO

UZZYProf. Dr. Rudolf Kruse 49

Selection of a Rule Base

• Order rules by performance.

• Either selectthe best r rules orthe best r/m rules per class.

• r is either given or is determined automatically such that all patterns are covered.

Page 50: Additional Material: Mahalanobis Distance

SNFEURO

UZZYProf. Dr. Rudolf Kruse 50

Rule Base Induction

NEFCLASS uses a modified Wang-Mendel procedure

x y

c1 c2

R1 R3R2

small medium large

x

y

small

medium

large

Page 51: Additional Material: Mahalanobis Distance

SNFEURO

UZZYProf. Dr. Rudolf Kruse 51

Computing the Error Signal

10,1 )( withcon rRrrr EE

Rule Error:

output) actual : output, correct :(

and

with

ot

ed

otdddE

d

da

jjj

2

max)(,1,0:

,)(1)sgn(

Fuzzy Error ( jth output):

x y

c1 c2

R1 R3R2

Error Signal

Page 52: Additional Material: Mahalanobis Distance

SNFEURO

UZZYProf. Dr. Rudolf Kruse 52

3. Training Step: Fuzzy Sets

bbcfc

babfa

bxacfb

x

xf

E

E

E

E

)sgn(

)(1

0)(

otherwise

if

Example:triangularmembershipfunction.

Parameterupdates for anantecedentfuzzy set.

otherwise

if

if

0

],[

),[

)(],1,0[: ,,,, cbxbc

xc

baxab

ax

xcbacba

Page 53: Additional Material: Mahalanobis Distance

SNFEURO

UZZYProf. Dr. Rudolf Kruse 53

Training of Fuzzy Sets

small medium large

x

y

small

medium

large

0.85enlarge

0.30

reduce

x

(x)

0.55

initial fuzzy set

Heuristics: a fuzzy set is moved away from x (towards x) and its support is reduced (enlarged), in order toreduce (enlarge) the degree of membership of x.

Page 54: Additional Material: Mahalanobis Distance

SNFEURO

UZZYProf. Dr. Rudolf Kruse 54

Training of Fuzzy Sets

Variations:

Adaptive learning rate

Online-/Batch Learning

optimistic learning(n step look ahead)

Observing the error on a validation set

Algorithm:

repeatfor (all patterns) do

accumulate parameter updates;accumulate error;

end;modify parameters;

until (no change in error);

local minimum

Page 55: Additional Material: Mahalanobis Distance

SNFEURO

UZZYProf. Dr. Rudolf Kruse 55

Constraints for Training Fuzzy Sets

Valid parameter values

Non-empty intersection of adjacent fuzzy sets

Keep relative positions

Maintain symmetry

Complete coverage (degrees of membership add up to 1 for each element)

Correcting a partition after modifying the parameters

Page 56: Additional Material: Mahalanobis Distance

SNFEURO

UZZYProf. Dr. Rudolf Kruse 56

4. Training Step: Pruning

Goal: remove variables, rules and fuzzy sets, in order toimprove interpretability and generalisation.

Page 57: Additional Material: Mahalanobis Distance

SNFEURO

UZZYProf. Dr. Rudolf Kruse 57

Pruning

Algorithm:

repeatselect pruning method;

repeatexecute pruning step;train fuzzy sets;

if (no improvement)then undo step;

until (no improvement);

until (no further method);

Pruning Methods:

1. Remove variables(use correlations, information gain etc.)

2. Remove rules(use rule performance)

3. Remove terms(use degree of fulfilment)

4. Remove fuzzy sets(use fuzziness)

Page 58: Additional Material: Mahalanobis Distance

SNFEURO

UZZYProf. Dr. Rudolf Kruse 58

WBC Learning Result: Fuzzy Rules

R1: if uniformity of cell size is small and bare nuclei is fuzzy0 then benign

R2: if uniformity of cell size is large then malignant

Page 59: Additional Material: Mahalanobis Distance

SNFEURO

UZZYProf. Dr. Rudolf Kruse 59

WBC Learning Result: Classification Performance

Predicted Class

malign benign notclassified

sum

malign 228 (32.62%) 13 (1.86%) 0 (0%) 241 (34.99%)

benign 15 (2.15%) 443 (63.38%) 0 (0%) 458 (65.01%)

sum 243 (34.76%) 456 (65.24%) 0 (0%) 699 (100.00%)

NEFCLASS-J: 95.42%

Discriminant Analysis: 96.05%

C 4.5: 95.10%

NEFCLASS-J (numeric): 94.14%

Multilayer Perceptron: 94.82%

C 4.5 Rules: 95.40%

Estimated Performance on Unseen Data (Cross Validation)

Page 60: Additional Material: Mahalanobis Distance

SNFEURO

UZZYProf. Dr. Rudolf Kruse 60

WBC Learning Result: Fuzzy Sets

0.01.0 2.8 4.6 6.4 8.2 10.0

0.5

1.0

0.01.0 2.8 4.6 6.4 8.2 10.0

0.5

1.0

sm lguniformity of cell size

bare nuclei

Page 61: Additional Material: Mahalanobis Distance

SNFEURO

UZZYProf. Dr. Rudolf Kruse 61

NEFCLASS-J

Page 62: Additional Material: Mahalanobis Distance

SNFEURO

UZZYProf. Dr. Rudolf Kruse 62

Resources

Detlef Nauck, Frank Klawonn & Rudolf Kruse:

Foundations of Neuro-Fuzzy Systems

Wiley, Chichester, 1997, ISBN: 0-471-97151-0

Neuro-Fuzzy Software (NEFCLASS, NEFCON, NEFPROX):

http://www.neuro-fuzzy.de

Beta-Version of NEFCLASS-J:

http://www.neuro-fuzzy.de/nefclass/nefclassj

Page 63: Additional Material: Mahalanobis Distance

SNFEURO

UZZYProf. Dr. Rudolf Kruse 63

Download NEFCLASS-J

Download the free version of NEFCLASS-J athttp://fuzzy.cs.uni-magdeburg.de

Page 64: Additional Material: Mahalanobis Distance

SNFEURO

UZZYProf. Dr. Rudolf Kruse 64

Conclusions

Neuro-Fuzzy-Systems can be useful for knowledge discovery.

Interpretability enables plausibility checks and improves acceptance.

(Neuro-)Fuzzy systems exploit tolerance for sub-optimal solutions.

Neuro-fuzzy learning algorithms must observe constraints in order not to jeopardise the semantics of the model.

Not an automatic model creator, the user must work with the tool.

Simple learning techniques support explorative data analysis.

Page 65: Additional Material: Mahalanobis Distance

SNFEURO

UZZYProf. Dr. Rudolf Kruse 65

Information Mining

Information mining is the non-trivial process of identifying valid, novel, potentially useful, and understandable information and patterns in heterogeneous information sources.

Information sources are data bases, expert background knowledge, textual description, images, sounds, ...

Page 66: Additional Material: Mahalanobis Distance

SNFEURO

UZZYProf. Dr. Rudolf Kruse 66

Information Mining

Problem Understanding

Information Understanding

Information Preparation

Modeling Evaluation Deployment

Determine Problem Objectives Assess Situations Determine Information Mining Goals Produce Project Plan

Collect Initial Information Describe Information Explore Information Verify Information Quality

Select Infor-mation Clean Infor-mation Construct In-formation Integrate In-formation Format Infor-mation

Select Modeling Technique Generate Test Design Build Model Assess Model

Evaluate Results Review Process Determine Next Steps

Plan Deployment Plan Moni-toring and Maintenance Produce Final Results Review Project

Page 67: Additional Material: Mahalanobis Distance

SNFEURO

UZZYProf. Dr. Rudolf Kruse 67

Example: Line Filtering

Extraction of edge segments (Burns’ operator) Production net:

edges lines long lines parallel lines runways

Page 68: Additional Material: Mahalanobis Distance

SNFEURO

UZZYProf. Dr. Rudolf Kruse 68

Example: Line Filtering

Problems

extremely many lines due to distorted images long execution times of production net

Page 69: Additional Material: Mahalanobis Distance

SNFEURO

UZZYProf. Dr. Rudolf Kruse 69

Example: Line Filtering

Only few lines used for runway assembly Approach:

Extract textural features of lines Identify and discard superfluous lines

dleft window

right window

gradient

d

Page 70: Additional Material: Mahalanobis Distance

SNFEURO

UZZYProf. Dr. Rudolf Kruse 70

Example: Line Filtering

Several classifiers:

minimum distance, k-nearest neighbor, decision trees, NEFCLASS Problems: classes are overlapping and extremely unbalanced Result above with modified NEFCLASS:

all lines for runway construction found reduction to 8.7% of edge segments

Page 71: Additional Material: Mahalanobis Distance

SNFEURO

UZZYProf. Dr. Rudolf Kruse 71

Surface Quality Control: the 2 Approaches

The Proposed Approach

Our Approach is baseded on the digitization of the exterior body panel surface with an optical measuring system.

We characterize the form deviation by mathematical properties that are close to the subjective properties that the experts used in their linguistic description.

Today’s Approach

The current surface quality control is done manually an experienced worker treats the exterior surfaces with a grindstone. The experts classify surface form deviations by means of linguistic descriptions.

CumbersomeCumbersome – – SubjectiveSubjective - - Error ProneError Prone Time Time ConsumingConsuming

Page 72: Additional Material: Mahalanobis Distance

SNFEURO

UZZYProf. Dr. Rudolf Kruse 72

Topometric 3-D measuring system

Triangulation and Gratings Projection

Miniaturized Projection Technique (Grey Code Phase shift)

High Point Density Fast Data Collection Measurement Accuracy Contact less and Non-destructive

0100

(x,y)(x,y)

b

φnP(x,y)

z(x,y)

Pixelcoding

y

x

z

Page 73: Additional Material: Mahalanobis Distance

SNFEURO

UZZYProf. Dr. Rudolf Kruse 73

Data Processing

3-D Data Acquisition

Detection of Form Deviation

Features AnalysisPost-Processing

• Approximation by a Polynomial Surface

• Difference • Colour-Coded Visualization

• 3-D-Point Cloud

z(x,y)

z(x,y)

z(x,y)˜

Dz(x,y)

Form Deviation

• Feature Calculation

• Classification (Data-Mining)

z(x,y)˜

Page 74: Additional Material: Mahalanobis Distance

SNFEURO

UZZYProf. Dr. Rudolf Kruse 74

Color Coded VisualizationResult of Grinding

Page 75: Additional Material: Mahalanobis Distance

SNFEURO

UZZYProf. Dr. Rudolf Kruse 75

3D Visualization of Local Surface Defects

Uneven Surface (several sink marks in series or adjoined)

Press Mark(local smoothing of (micro-)surface)

Sink Mark (slight flat based depression inward)

Waviness(several heavier wrinklings in series)

Page 76: Additional Material: Mahalanobis Distance

SNFEURO

UZZYProf. Dr. Rudolf Kruse 76

Data Characteristics We analysed 9 master pieces with a total number of 99 defects

For each defect we calculated 42 features

0 10 20 30 40 50

uneven surface

press mark

sink mark

flat area

draw line

w aviness

line

uneven radius

The types are rather unbalanced

We discarded the rare classes

We discarded some of the extremely correlated features (31 features left)

We ranked the 31 features by importance

We use stratified 4-fold cross validation for the experiment.

Page 77: Additional Material: Mahalanobis Distance

SNFEURO

UZZYProf. Dr. Rudolf Kruse 77

Application and Results

NBC DTree NN NEFCLASS DC

Train Set 89.0% 94.7% 90% 81.6% 46.8%

Test Set 75.6% 75.6% 85.5% 79.9% 46.8%

Classification Accuracy

The Rule Base for NEFCLASS