SOFSEM 2004
Knowledge acquisition and processing: new methods for neuro-fuzzy systems
Danuta Rutkowska
Department of Computer Engineering
Technical University of Częstochowa, Poland
E-mail: [email protected]
SOFSEM 2004
Knowledge Acquisition and Inference in the Framework of Soft Computing and Computing with Words
Cognitive Technologies
Soft Computing, Computing with Words, ...
• Soft computing
• Computing with words
• Perception-based systems
• Computational Intelligence
• Artificial Intelligence
• Cognitive sciences
• Neural networks
• Fuzzy systems
• Evolutionary algorithms
• Intelligent systems
Soft computing techniques
• Fuzzy logic
• Neuro-computing
• Evolutionary algorithms
• Rough sets
• Uncertain variables
• Probabilistic techniques
Cognition
The word "cognition" comes from the Latin word "cognitio", which means "knowledge".
Cognitive sciences concern thinking, perception, reasoning, creation of meaning, and other functions of the human mind.
The principal aim of soft computing is to exploit the tolerance of uncertainty and vagueness in the area of cognitive reasoning.
[Nauck D., Kruse R.: NEFCLASS-J – A Java-Based Soft Computing Tool. In: B. Azvine et al. (Eds.), Intelligent Systems and Soft Computing, LNAI 1804, Springer-Verlag, Heidelberg, New York (2000), pp. 139-160].
Soft computing and cognition
Artificial Intelligence and cognition
The aim of artificial intelligence is to develop paradigms or algorithms that allow machines to perform tasks that involve cognition when performed by humans.
[A.P. Sage (ed.), Concise Encyclopedia of Information Processing in Systems and Organization, Pergamon Press, New York, 1990]
Perception and fuzzy systems
The systems that incorporate perceptions expressed by words are fuzzy systems, introduced by Prof. L.A. Zadeh.
Perception is very important in human cognition.
Perception-based systems
Fuzzy systems are rule-based systems (knowledge-based systems) that can be viewed as perception-based systems.
The rule base of a fuzzy system is composed of fuzzy IF-THEN rules that are similar to the rules used by humans in their reasoning.
Learning by examples
Learning by examples is one of the simplest cognitive capabilities of a young child.
Artificial neural networks with an inductive, supervised learning algorithm imitate this cognitive behaviour.
Machine learning
Machine learning research has the potential to make a profound contribution to the theory and practice of expert systems, as well as to other areas of artificial intelligence. Its application to the problem of deriving rule sets from examples is already helping to circumvent the knowledge acquisition bottleneck.
[P. Jackson, Introduction to Expert Systems, Addison Wesley, 1999, Chapter 20, p. 399]
Inductive learning
The most common form of supervised learning task is called induction. An inductive learning program is one which is capable of learning from examples by a process of generalization.
[P. Jackson, Introduction to Expert Systems, Addison Wesley, 1999, Chapter 20, p. 381]
Neural network (MLP)
Model of an artificial neuron
RBF network
Gaussian function
Normalized RBF network
General neuro-fuzzy architecture
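Fuzzy reasoning for k-th rule
k-th rule (antecedent "x is A^k", consequent "y is B^k"):
$R^k$: IF $\mathbf{x}$ is $A^k$ THEN $y$ is $B^k$, $k = 1, \ldots, N$
input variable: $\mathbf{x} = [x_1, \ldots, x_n]^T \in X \subset \mathbb{R}^n$
output variable: $y \in Y \subset \mathbb{R}$
$A^k = A_1^k \times \cdots \times A_n^k$
fuzzification (input fuzzy set $A'$ for the input value $\bar{\mathbf{x}} = [\bar{x}_1, \ldots, \bar{x}_n]^T \in X$):
$\mu_{A'}(\mathbf{x}) = 1$ if $\mathbf{x} = \bar{\mathbf{x}}$, and $\mu_{A'}(\mathbf{x}) = 0$ if $\mathbf{x} \neq \bar{\mathbf{x}}$
k-th output fuzzy set $\bar{B}^k$, determined by the fuzzy relation $A^k \to B^k$:
$\mu_{\bar{B}^k}(y) = \mu_{A^k \to B^k}(\bar{\mathbf{x}}, y)$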
Aggregation and defuzzification
aggregation for Mamdani approach (S-norm): $\mu_{B'}(y) = S_{k=1}^{N}\, \mu_{\bar{B}^k}(y)$
aggregation for logical approach (T-norm): $\mu_{B'}(y) = T_{k=1}^{N}\, \mu_{\bar{B}^k}(y)$
where $B'$ is the output fuzzy set for all N rules
defuzzification (output value): $\bar{y} = \dfrac{\sum_{k=1}^{N} \bar{y}^k\, \mu_{B'}(\bar{y}^k)}{\sum_{k=1}^{N} \mu_{B'}(\bar{y}^k)}$
where $\bar{y}^k$ is the centre of the consequent fuzzy set $B^k$
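The aggregation and defuzzification steps can be sketched in a few lines. This is a minimal illustration, not the author's implementation: the two-rule base, the Gaussian membership functions, the choice of min/max as T-norm/S-norm, and the Kleene-Dienes implication for the logical approach are all assumptions made for the example.

```python
import math

def gaussian(x, centre, width):
    """Gaussian membership function (assumed shape for this sketch)."""
    return math.exp(-((x - centre) / width) ** 2)

# Hypothetical rule base with one input: "IF x is A^k THEN y is B^k".
# Each fuzzy set is stored as (centre, width).
RULES = [
    {"A": (0.0, 1.0), "B": (-1.0, 0.5)},
    {"A": (2.0, 1.0), "B": (1.0, 0.5)},
]

def output_membership(x_bar, y, approach="mamdani"):
    """Membership of y in the aggregated output fuzzy set B'."""
    values = []
    for rule in RULES:
        a = gaussian(x_bar, *rule["A"])  # antecedent degree (singleton fuzzification)
        b = gaussian(y, *rule["B"])      # consequent degree at y
        if approach == "mamdani":
            values.append(min(a, b))        # T-norm in place of an implication
        else:
            values.append(max(1.0 - a, b))  # Kleene-Dienes implication
    # Mamdani approach: S-norm (max); logical approach: T-norm (min).
    return max(values) if approach == "mamdani" else min(values)

def defuzzify(x_bar, approach="mamdani"):
    """Weighted average of the consequent centres."""
    centres = [rule["B"][0] for rule in RULES]
    weights = [output_membership(x_bar, c, approach) for c in centres]
    return sum(c * w for c, w in zip(centres, weights)) / sum(weights)

print(defuzzify(0.0), defuzzify(2.0), defuzzify(0.0, "logical"))
```

An input close to the first antecedent centre pulls the output towards the first consequent centre, and the same holds for the second rule.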
Fuzzy implications: Mamdani, logical
[Figure: Mamdani approach; logical approach]
An example of a neuro-fuzzy network
More general form of this network
Another example of the NF network
T-norm
A triangular norm T is a function of two arguments T: [0,1]×[0,1]→[0,1] which satisfies the following conditions for a, b, c, d ∈ [0,1]:
Monotonicity: T(a,b) ≤ T(c,d) for a ≤ c, b ≤ d
Commutativity: T(a,b) = T(b,a)
Associativity: T(T(a,b),c) = T(a,T(b,c))
Boundary conditions: T(a,0) = 0; T(a,1) = a
T-conorm (S-norm)
A T-conorm (S-norm) is a function of two arguments S: [0,1]×[0,1]→[0,1] which satisfies the following conditions for a, b, c, d ∈ [0,1]:
Monotonicity: S(a,b) ≤ S(c,d) for a ≤ c, b ≤ d
Commutativity: S(a,b) = S(b,a)
Associativity: S(S(a,b),c) = S(a,S(b,c))
Boundary conditions: S(a,0) = a; S(a,1) = 1
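The T-norm and S-norm conditions can be checked numerically on a grid. The sketch below uses three well-known T-norms (minimum, product, Łukasiewicz) and builds the corresponding S-norms by duality; the function names and the grid resolution are choices made for this example only.

```python
from itertools import product

T_NORMS = {
    "minimum": lambda a, b: min(a, b),
    "product": lambda a, b: a * b,
    "Lukasiewicz": lambda a, b: max(0.0, a + b - 1.0),
}

def dual_s_norm(t_norm):
    """S-norm obtained by duality: S(a, b) = 1 - T(1 - a, 1 - b)."""
    return lambda a, b: 1.0 - t_norm(1.0 - a, 1.0 - b)

def satisfies_conditions(op, neutral, absorbing, steps=5, tol=1e-9):
    """Check monotonicity, commutativity, associativity and boundary
    conditions on a grid. For a T-norm use neutral=1, absorbing=0;
    for an S-norm use neutral=0, absorbing=1."""
    grid = [i / steps for i in range(steps + 1)]
    for a, b, c, d in product(grid, repeat=4):
        if a <= c and b <= d and op(a, b) > op(c, d) + tol:
            return False  # monotonicity violated
        if abs(op(a, b) - op(b, a)) > tol:
            return False  # commutativity violated
        if abs(op(op(a, b), c) - op(a, op(b, c))) > tol:
            return False  # associativity violated
    return all(
        abs(op(a, neutral) - a) <= tol
        and abs(op(a, absorbing) - absorbing) <= tol
        for a in grid
    )

for name, t in T_NORMS.items():
    print(name,
          satisfies_conditions(t, 1.0, 0.0),
          satisfies_conditions(dual_s_norm(t), 0.0, 1.0))
```

A grid check is only a sanity test and not a proof, but it catches operators such as the arithmetic mean, which fails the boundary conditions.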
Neuro-fuzzy inference systems (NFIS)
Approaches to design NFIS: Mamdani, logical, Takagi-Sugeno
Fuzzy-logic inference system
[Block diagram: x → fuzzifier → fuzzy inference engine → defuzzifier → y; the fuzzy inference engine uses the fuzzy rule base (IF ... THEN ...)]
Fuzzy-logic inference system: fuzzifier
Fuzzy-logic inference system: fuzzy rule base
Fuzzy-logic inference system: fuzzy inference engine
Fuzzy-logic inference system: defuzzifier
General architecture of Neuro-Fuzzy Inference System (NFIS)
[Figure: four-layer network (layers I-IV) with inputs $x_1, \ldots, x_n$; implication nodes $I_{k,r}(\bar{\mathbf{x}}, \bar{y}^r)$ for $k, r = 1, \ldots, N$; aggregation nodes $\mathrm{agr}_r(\bar{\mathbf{x}}, \bar{y}^r)$ for $r = 1, \ldots, N$; consequent centres $\bar{y}^1, \ldots, \bar{y}^N$; output $\bar{y}$]
Flexible neuro-fuzzy system: Mamdani approach
[Figure: implications and aggregations of rules, with examples]
Definition: Fuzzy implication
A fuzzy implication is a function I: [0,1]²→[0,1] satisfying the following conditions:
(I1) if a1 ≤ a3 then I(a1,a2) ≥ I(a3,a2), for all a1, a2, a3 ∈ [0,1]
(I2) if a2 ≤ a3 then I(a1,a2) ≤ I(a1,a3), for all a1, a2, a3 ∈ [0,1]
(I3) I(0,a2) = 1, for all a2 ∈ [0,1] (falsity implies anything)
(I4) I(a1,1) = 1, for all a1 ∈ [0,1] (anything implies tautology)
(I5) I(1,0) = 0 (booleanity)
Fuzzy implications
NAME | IMPLICATION I(a,b)
Kleene-Dienes | $\max\{1-a,\, b\}$
Łukasiewicz | $\min\{1,\, 1-a+b\}$
Reichenbach | $1 - a + ab$
Fodor | $1$ if $a \le b$; $\max\{1-a,\, b\}$ if $a > b$
Sharp | $1$ if $a \le b$; $0$ if $a > b$
Goguen | $1$ if $a = 0$; $\min\{1,\, b/a\}$ if $a > 0$
Gödel | $1$ if $a \le b$; $b$ if $a > b$
Yager | $1$ if $a = 0$; $b^a$ if $a > 0$
Zadeh | $\max\{\min\{a, b\},\, 1-a\}$
Willmott | $\min\{\max\{1-a,\, b\},\, \max\{a,\, 1-b,\, \min\{1-a,\, b\}\}\}$
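Conditions (I1)-(I5) can be tested numerically for some of the implications named in the table. The sketch below writes five of them in their standard textbook forms (an assumption of this example) and checks the conditions on a grid; the helper name `is_fuzzy_implication` is hypothetical.

```python
def kleene_dienes(a, b):
    return max(1.0 - a, b)

def lukasiewicz(a, b):
    return min(1.0, 1.0 - a + b)

def reichenbach(a, b):
    return 1.0 - a + a * b

def goguen(a, b):
    return 1.0 if a == 0.0 else min(1.0, b / a)

def goedel(a, b):
    return 1.0 if a <= b else b

IMPLICATIONS = [kleene_dienes, lukasiewicz, reichenbach, goguen, goedel]

def is_fuzzy_implication(imp, steps=10, tol=1e-9):
    """Grid check of conditions (I1)-(I5); a sanity test, not a proof."""
    grid = [i / steps for i in range(steps + 1)]
    for a1 in grid:
        for a2 in grid:
            for a3 in grid:
                # (I1): non-increasing in the first argument
                if a1 <= a3 and imp(a1, a2) < imp(a3, a2) - tol:
                    return False
                # (I2): non-decreasing in the second argument
                if a2 <= a3 and imp(a1, a2) > imp(a1, a3) + tol:
                    return False
    for v in grid:
        # (I3): I(0, v) = 1 and (I4): I(v, 1) = 1
        if abs(imp(0.0, v) - 1.0) > tol or abs(imp(v, 1.0) - 1.0) > tol:
            return False
    # (I5): I(1, 0) = 0
    return abs(imp(1.0, 0.0)) <= tol

for imp in IMPLICATIONS:
    print(imp.__name__, is_fuzzy_implication(imp))
```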
Flexible neuro-fuzzy system: Logical approach
[Figure: implications and aggregations of rules, with examples]
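Flexible neuro-fuzzy system: AND-type compromise NFIS
$I(a,b) = (1-\lambda)\, T\{a,b\} + \lambda\, S\{1-a,\, b\}$
$I(a,b) = (1-\lambda)\min\{a,b\} + \lambda\max\{1-a,\, b\}$, $\lambda \in [0,1]$
System type:
$\lambda = 0$: Mamdani type
$\lambda = 1$: logical type
$\lambda \in (0,1)$: compromise (Mamdani and logical)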
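The AND-type compromise can be illustrated as a convex combination of a Mamdani-type term T(a,b) and a logical-type term S(1-a,b). This is a minimal sketch: the parameter name `lam` and the default choice of min/max are assumptions of the example.

```python
def compromise_implication(a, b, lam, t_norm=min, s_norm=max):
    """Blend of a Mamdani-type term T(a, b) and a logical-type term
    S(1 - a, b); lam = 0 gives the Mamdani type, lam = 1 the logical type."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return (1.0 - lam) * t_norm(a, b) + lam * s_norm(1.0 - a, b)

for lam in (0.0, 0.5, 1.0):
    print(lam, compromise_implication(0.3, 0.8, lam))
```

Because the blend is differentiable in `lam`, the parameter can in principle be tuned by the same gradient-based learning that adjusts the membership functions.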
Flexible neuro-fuzzy system: OR-type compromise NFIS
System type by parameter value:
0: Mamdani type
(0, 0.5): "more Mamdani"
0.5: undefined
(0.5, 1): "more logical"
1: logical type
Flexible neuro-fuzzy system
L. Rutkowski and K. Cpałka, "Flexible Neuro-Fuzzy Systems", IEEE Trans. Neural Networks, vol. 14, pp. 554-574, May 2003
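Flexible neuro-fuzzy system: Soft NFIS (1/2)
$\tilde{T}\{\mathbf{a}; \alpha\} = (1-\alpha)\,\frac{1}{n}\sum_{i=1}^{n} a_i + \alpha\, T\{\mathbf{a}\}$
$\tilde{S}\{\mathbf{a}; \alpha\} = (1-\alpha)\,\frac{1}{n}\sum_{i=1}^{n} a_i + \alpha\, S\{\mathbf{a}\}$
$\alpha \in [0,1]$
$\tilde{I}(a,b; \beta) = (1-\beta)\,\frac{1}{2}(a+b) + \beta\, T\{a,b\}$
$\tilde{I}(a,b; \beta) = (1-\beta)\,\frac{1}{2}(1-a+b) + \beta\, S\{1-a,\, b\}$
$\beta \in [0,1]$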
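A soft triangular norm can be sketched as an interpolation between the arithmetic mean of the arguments and the triangular norm itself. The sketch assumes this form follows the cited Rutkowski and Cpałka paper; the parameter name `alpha` and the default min/max operators are choices made for the example.

```python
def soft_t_norm(values, alpha, t_norm=min):
    """(1 - alpha) * arithmetic mean + alpha * T-norm of the arguments."""
    mean = sum(values) / len(values)
    return (1.0 - alpha) * mean + alpha * t_norm(values)

def soft_s_norm(values, alpha, s_norm=max):
    """(1 - alpha) * arithmetic mean + alpha * S-norm of the arguments."""
    mean = sum(values) / len(values)
    return (1.0 - alpha) * mean + alpha * s_norm(values)

a = [0.2, 0.6]
print(soft_t_norm(a, 0.0), soft_t_norm(a, 1.0))  # mean, then min
print(soft_s_norm(a, 0.0), soft_s_norm(a, 1.0))  # mean, then max
```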
Flexible neuro-fuzzy system: Soft NFIS (2/2)
Flexible neuro-fuzzy system: NFIS realized by parameterised families of triangular norms (1/2)
The Dombi triangular norms, $p \in (0, \infty)$
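For illustration, the Dombi family can be written in its standard textbook form, which may differ in notation from the slide. The sketch below defines the Dombi T-norm for p > 0 and obtains the S-norm by duality; the function names are hypothetical.

```python
def dombi_t_norm(a, b, p):
    """Dombi T-norm for a, b in [0, 1] and p > 0 (textbook form)."""
    if p <= 0.0:
        raise ValueError("p must be positive")
    if a == 0.0 or b == 0.0:
        return 0.0
    t = ((1.0 / a - 1.0) ** p + (1.0 / b - 1.0) ** p) ** (1.0 / p)
    return 1.0 / (1.0 + t)

def dombi_s_norm(a, b, p):
    """Dombi S-norm via duality: S(a, b) = 1 - T(1 - a, 1 - b)."""
    return 1.0 - dombi_t_norm(1.0 - a, 1.0 - b, p)

print(dombi_t_norm(0.5, 0.5, 1.0))   # 1/3 for p = 1
print(dombi_t_norm(0.3, 0.7, 50.0))  # close to min(0.3, 0.7) for large p
```

The single parameter p changes the shape of the operator continuously, which is what makes such families convenient for learning.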
Flexible neuro-fuzzy system: NFIS realized by parameterised families of triangular norms (2/2)
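Flexible neuro-fuzzy system: NFIS realized by triangular norms with weighted arguments (1/2)
$T^*\{a_1, a_2; w_1, w_2\} = T\{1 - w_1(1 - a_1),\; 1 - w_2(1 - a_2)\}$
$S^*\{a_1, a_2; w_1, w_2\} = S\{w_1 a_1,\; w_2 a_2\}$
$w_1, w_2 \in [0,1]$
Special cases:
$T^*\{a_1, a_2; 0, w_2\} = T\{1,\; 1 - w_2(1 - a_2)\} = 1 - w_2(1 - a_2)$
$T^*\{a_1, a_2; w_1, 0\} = T\{1 - w_1(1 - a_1),\; 1\} = 1 - w_1(1 - a_1)$
$S^*\{a_1, a_2; 0, w_2\} = S\{0,\; w_2 a_2\} = w_2 a_2$
$S^*\{a_1, a_2; w_1, 0\} = S\{w_1 a_1,\; 0\} = w_1 a_1$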
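Triangular norms with weighted arguments can be sketched as follows: a weight of 0 removes an argument from the operation and a weight of 1 leaves it unchanged. The min/max defaults and the function names are assumptions of this example.

```python
def weighted_t_norm(a1, a2, w1, w2, t_norm=min):
    """T-norm with weighted arguments: a weight of 0 maps the argument
    to 1, the neutral element of a T-norm."""
    return t_norm(1.0 - w1 * (1.0 - a1), 1.0 - w2 * (1.0 - a2))

def weighted_s_norm(a1, a2, w1, w2, s_norm=max):
    """S-norm with weighted arguments: a weight of 0 maps the argument
    to 0, the neutral element of an S-norm."""
    return s_norm(w1 * a1, w2 * a2)

# With w1 = 0 the first argument has no influence on the result.
print(weighted_t_norm(0.3, 0.9, 0.0, 1.0))
print(weighted_s_norm(0.3, 0.9, 0.0, 1.0))
```

In a rule antecedent this lets learning reduce the influence of an input by driving its weight towards 0.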
Flexible neuro-fuzzy system: NFIS realized by triangular norms with weighted arguments (2/2)
Flexible neuro-fuzzy system: Glass Identification– experimental results
No. | Initial values of flexibility parameters | Final values after learning | RMSE / mistakes [%] (learning sequence) | RMSE / mistakes [%] (testing sequence)
i | 0.5 | 1.0000 | 0.2395 / 6.66% | 0.2553 / 7.81%
ii | 0.5 | 1.0000 | 0.2392 / 7.33% | 0.2483 / 7.81%
iii | 0 | | 0.2845 / 10.00% | 0.2196 / 7.81%
iv | 0.5, 10, 10, 10, 1, 1, 1 | 1.0000, 9.9953, 9.9998, 9.9999, 0.9576, 0.9931, 0.8482 | 0.1856 / 3.33% | 0.2191 / 6.25%
v | 0.5, 10, 10, 10, 1, 1, 1, 1, 1 | 1.0000, 9.9601, 9.9997, 9.9836, 0.9213, 0.9939, 0.8456 (weights: next slide) | 0.1784 / 2.00% | 0.2596 / 6.25%
Legible flexibility parameter labels for rows iv and v: superscripts I and agr, $p$, $p^{I}$, $p^{agr}$; row v additionally $w$, $w^{agr}$.
Flexible neuro-fuzzy system: Glass Identification – weights representation
Weights representation in the Glass Identification problem (dark areas correspond to low values and vice versa)
[Figure: weight maps labelled $w^{agr}$ and $w^{k}$, with index ranges $i = 1, \ldots, 9$ and $1, \ldots, 2$]
Flexible neuro-fuzzy system: Glass Identification – comparison table
Method | Testing Acc. [%]
Dong and Kothari (IG) | 92.86
Dong and Kothari (IG+LA) | 93.09
Dong and Kothari (GR) | 92.86
Dong and Kothari (GR+LA) | 93.10
our result | 93.75
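Neuro-fuzzy relational system
Neuro-fuzzy relational system with fuzzy matrix R
[Figure: relational network with inputs $x_1, \ldots, x_N$; antecedent fuzzy sets $A^1, \ldots, A^K$; relation elements $r_{k,m}$ ($k = 1, \ldots, K$; $m = 1, \ldots, M$); T-norm (T) and S-norm (S) nodes; consequent centres $b_1, \ldots, b_M$; division node (div); output $y$]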
Neuro-fuzzy connectionist system (basic architecture)
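[Figure: three-layer network (L1, L2, L3) with inputs $x_1, \ldots, x_N$; membership nodes $A_i^k$ ($i = 1, \ldots, N$; $k = 1, \ldots, K$); centres $\bar{y}^1, \ldots, \bar{y}^K$; division node (div); output $y$]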
Rule generation
The neuro-fuzzy networks reflect fuzzy IF-THEN rules.
The network architectures are created based on the rules.
How to get the rules?
Basic questions:
• How many rules?
• What kind of membership functions (Gaussian, triangular, trapezoidal, etc.)?
• How to determine the parameter values of the membership functions (centers, widths)?
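The membership function shapes named in the questions above can be written down directly. A minimal sketch, with hypothetical parameter names (centre and width for the Gaussian, corner points for the triangular and trapezoidal shapes):

```python
import math

def gaussian(x, centre, width):
    """Gaussian membership function with the given centre and width."""
    return math.exp(-((x - centre) / width) ** 2)

def triangular(x, a, b, c):
    """Triangular membership function with corners a < b < c."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

def trapezoidal(x, a, b, c, d):
    """Trapezoidal membership function with corners a < b <= c < d."""
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)

print(gaussian(1.0, 1.0, 0.5))
print(triangular(0.5, 0.0, 1.0, 2.0))
print(trapezoidal(2.5, 0.0, 1.0, 2.0, 3.0))
```

Determining the rules then amounts to choosing how many such functions to use per input and where to place their parameters.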
Many methods
There are many methods of rule generation.
However, most of the rules obtained by these methods, when applied in neuro-fuzzy systems for classification, result in some misclassifications.
Perception-based approach
This method generates fuzzy IF-THEN rules from a data set by use of fuzzy granulation.
The neuro-fuzzy systems which utilize these rules perform without misclassifications.
Multi-stage classification
The perception-based approach makes it possible to generate fuzzy rules and to perform a multi-stage classification without misclassifications.
This method will be illustrated on the IRIS example.
IRIS data set:
150 data items that contain measurements of iris flowers from three species of iris: Setosa, Versicolor, and Virginica; 50 data items for each of the iris species.
The data include information about four features of the iris flowers: sepal length, sepal width, petal length, petal width.
Ranges of the measurements of iris flowers (in centimeters)
Sepal length 4.3 – 7.9
Sepal width 2.0 – 4.4
Petal length 1.0 – 6.9
Petal width 0.1 – 2.5
Ranges within the classes
Feature | Setosa | Versicolor | Virginica
Sepal length | 4.3 – 5.8 | 4.9 – 7.0 | 4.9 – 7.9
Sepal width | 2.3 – 4.4 | 2.0 – 3.4 | 2.2 – 3.8
Petal length | 1.0 – 1.9 | 3.0 – 5.1 | 4.5 – 6.9
Petal width | 0.1 – 0.6 | 1.0 – 1.8 | 1.4 – 2.5
Granulated ranges of sepal length
4.3 – 4.9 | Setosa
4.9 – 5.8 | Setosa, Versicolor, Virginica
5.8 – 7.0 | Versicolor, Virginica
7.0 – 7.9 | Virginica
Granulated ranges of sepal width
2.0 – 2.2 | Versicolor
2.2 – 2.3 | Versicolor, Virginica
2.3 – 3.4 | Setosa, Versicolor, Virginica
3.4 – 3.8 | Setosa, Virginica
3.8 – 4.4 | Setosa
Granulated ranges of petal length
1.0 – 1.9 | Setosa
3.0 – 4.5 | Versicolor
4.5 – 5.1 | Versicolor, Virginica
5.1 – 6.9 | Virginica
Granulated ranges of petal width
0.1 – 0.6 | Setosa
1.0 – 1.4 | Versicolor
1.4 – 1.8 | Versicolor, Virginica
1.8 – 2.5 | Virginica
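The granulated ranges can be derived mechanically from the per-class ranges: the class boundaries split the axis into intervals, and each interval is labelled with the classes whose range covers it. The sketch below is one possible way to do this; the function name `granulate` is hypothetical, and the input values are the per-class ranges given on the slides.

```python
def granulate(class_ranges):
    """Split a feature axis at all class boundaries and list, for each
    interval, the classes whose range covers it. Intervals covered by
    no class are skipped."""
    points = sorted({p for lo, hi in class_ranges.values() for p in (lo, hi)})
    granules = []
    for lo, hi in zip(points, points[1:]):
        members = [name for name, (c_lo, c_hi) in class_ranges.items()
                   if c_lo <= lo and hi <= c_hi]
        if members:
            granules.append(((lo, hi), members))
    return granules

sepal_length = {"Setosa": (4.3, 5.8), "Versicolor": (4.9, 7.0), "Virginica": (4.9, 7.9)}
petal_length = {"Setosa": (1.0, 1.9), "Versicolor": (3.0, 5.1), "Virginica": (4.5, 6.9)}

for (lo, hi), members in granulate(sepal_length):
    print(f"{lo} - {hi}: {', '.join(members)}")
```

For petal length the gap between 1.9 and 3.0 contains no data from any class, so no granule is produced for it.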
Linguistic labels for sepal length
4.3 – 4.9 short sepal A11
4.9 – 5.8 medium long sepal A12
5.8 – 7.0 long sepal A13
7.0 – 7.9 very long sepal A14
Linguistic labels for sepal width
2.0 – 2.2 very narrow sepal A21
2.2 – 2.3 narrow sepal A22
2.3 – 3.4 medium wide sepal A23
3.4 – 3.8 wide sepal A24
3.8 – 4.4 very wide sepal A25
Linguistic labels for petal length
1.0 – 1.9 very short petal A31
3.0 – 4.5 medium long petal A32
4.5 – 5.1 long petal A33
5.1 – 6.9 very long petal A34
Linguistic labels for petal width
0.1 – 0.6 very narrow petal A41
1.0 – 1.4 medium wide petal A42
1.4 – 1.8 wide petal A43
1.8 – 2.5 very wide petal A44
Rule 1
IF sepal is (short or medium long) and (medium wide or wide or very wide) and petal is very short and very narrow THEN Setosa
IF x1 is $A_1^1$ and x2 is $A_2^1$ and x3 is $A_3^1$ and x4 is $A_4^1$ THEN Setosa
$A_1^1 = A_{11} \cup A_{12}$
$A_2^1 = A_{23} \cup A_{24} \cup A_{25}$
$A_3^1 = A_{31}$
$A_4^1 = A_{41}$
Rule 2
IF sepal is (medium long or long) and (very narrow or narrow or medium wide) and petal is (medium long or long) and (medium wide or wide) THEN Versicolor
IF x1 is $A_1^2$ and x2 is $A_2^2$ and x3 is $A_3^2$ and x4 is $A_4^2$ THEN Versicolor
$A_1^2 = A_{12} \cup A_{13}$
$A_2^2 = A_{21} \cup A_{22} \cup A_{23}$
$A_3^2 = A_{32} \cup A_{33}$
$A_4^2 = A_{42} \cup A_{43}$
Rule 3
IF sepal is (medium long or long or very long) and (narrow or medium wide or wide) and petal is (long or very long) and (wide or very wide) THEN Virginica
IF x1 is $A_1^3$ and x2 is $A_2^3$ and x3 is $A_3^3$ and x4 is $A_4^3$ THEN Virginica
$A_1^3 = A_{12} \cup A_{13} \cup A_{14}$
$A_2^3 = A_{22} \cup A_{23} \cup A_{24}$
$A_3^3 = A_{33} \cup A_{34}$
$A_4^3 = A_{43} \cup A_{44}$
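The three rules can be illustrated with a simplified, crisp version of the first classification stage. This sketch replaces the fuzzy sets of the NF network with plain intervals (an assumption made only for illustration): each rule fires when all four measurements fall inside the ranges covered by its linguistic labels, and the answer is "I do not know" when more than one rule fires.

```python
# Ranges covered by the antecedents of Rules 1-3, in the order
# (sepal length, sepal width, petal length, petal width), in centimeters.
RULE_RANGES = {
    "Setosa":     [(4.3, 5.8), (2.3, 4.4), (1.0, 1.9), (0.1, 0.6)],
    "Versicolor": [(4.9, 7.0), (2.0, 3.4), (3.0, 5.1), (1.0, 1.8)],
    "Virginica":  [(4.9, 7.9), (2.2, 3.8), (4.5, 6.9), (1.4, 2.5)],
}

def classify(x):
    """Return the class whose rule fires alone, or an 'I do not know'
    answer listing the candidate classes when several rules fire."""
    fired = [name for name, ranges in RULE_RANGES.items()
             if all(lo <= value <= hi for value, (lo, hi) in zip(x, ranges))]
    if len(fired) == 1:
        return fired[0]
    if fired:
        return "I do not know: " + " or ".join(fired)
    return "no rule fires"

print(classify((5.0, 3.5, 1.4, 0.2)))
print(classify((6.0, 2.8, 4.0, 1.2)))
print(classify((6.0, 2.8, 4.8, 1.6)))
```

Vectors that receive the "I do not know" answer are the ones passed on to the next stage, where new rules are built from their narrower ranges.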
NF network for the iris classification
Results of the 1st stage classification
50 data vectors correctly classified to Setosa
32 data vectors correctly classified to Versicolor
42 data vectors correctly classified to Virginica
26 data vectors: "I do not know" decision (Versicolor or Virginica)
These data vectors participate in the 2nd stage of the classification.
2nd stage classification
Two fuzzy IF-THEN rules are formulated, based on the granulated ranges obtained for the data vectors with the "I do not know" decision in the 1st stage.
The NF network in the 2nd stage is reduced to the components associated with the Versicolor and Virginica classes.
Results of the 2nd stage classification
12 data vectors correctly classified to Versicolor
1 data vector correctly classified to Virginica
13 data vectors: "I do not know" decision (Versicolor or Virginica)
These data vectors participate in the 3rd stage of the classification. Two new rules are created.
Results of the 3rd stage classification
4 data vectors correctly classified to Versicolor
5 data vectors correctly classified to Virginica
4 data vectors: "I do not know" decision (Versicolor or Virginica)
These data vectors participate in the 4th stage of the classification. Two new rules are created.
Results of the 4th stage classification
2 data vectors correctly classified to Versicolor
2 data vectors correctly classified to Virginica
All data vectors correctly classified after 4 stages of the classification.
No misclassifications!
[Figure: IRIS data, P1 (sepal length) vs. P2 (sepal width); series: Setosa, Versicolor, Virginica]
[Figure: IRIS data, P1 (sepal length) vs. P3 (petal length); series: Setosa, Versicolor, Virginica]
[Figure: IRIS data, P2 (sepal width) vs. P4 (petal width); series: Setosa, Versicolor, Virginica]
[Figure: IRIS data, P3 (petal length) vs. P4 (petal width); series: Setosa, Versicolor, Virginica]
Diagnosis of a tumor of mucous membrane of uterus
Attributes:
• period of time after menopause
• BMI (Body Mass Index)
• LH (luteinizing hormone)
• FSH (follicle-stimulating hormone)
• PRL (prolactin)
• E1 (estrone)
• E2 (estradiol)
• Aromatase
• estrogenic receptor
Diagnosis: negative (class 0), positive (class 1)
Data:
52 records of positive diagnosis
13 records of negative diagnosis
9 attributes
Ranges of the attribute values
a1 | 0.5 – 34
a2 | 20 – 46
a3 | 0.5 – 120.3
a4 | 1.36 – 155.4
a5 | 2.4 – 128.1
a6 | 156 – 542
a7 | 0.04 – 1.48
a8 | 2.28 – 11.85
a9 | 0.72 – 3.85
Ranges within the classes
Attribute | Class 0 | Class 1
a1 | 0.5 – 20 | 0.5 – 34
a2 | 20 – 46 | 20 – 45
a3 | 1.2 – 53.9 | 0.5 – 120.3
a4 | 1.63 – 88.2 | 1.36 – 155.4
a5 | 3.4 – 128.1 | 2.4 – 76.6
a6 | 170 – 412 | 156 – 542
a7 | 0.04 – 0.27 | 0.05 – 1.48
a8 | 2.28 – 10.51 | 3 – 11.85
a9 | 0.72 – 1.05 | 0.91 – 3.85
Rules for the medical diagnosis
IF $x_1$ is $A_1^k$ and ... and $x_9$ is $A_9^k$ THEN Class $k$, $k = 0, 1$
NF network for the medical diagnosis
[Figure: network with inputs $x_1, \ldots, x_9$ (Attribute 1, ..., Attribute 9); membership nodes $A_i^0$ and $A_i^1$ for $i = 1, \ldots, 9$; two product (Π) nodes; outputs Class 0 and Class 1]
Results: correct diagnosis
3 cases with the "I do not know" response after the first stage of classification.
The "I do not know" answers, which mean a positive or a negative diagnosis, refer to the cases that are difficult to recognize because they belong to overlapping regions.
62 correct diagnoses for all 65 input vectors (95.4% correct decisions, 4.6% "I do not know").
Conclusions (perception-based classification)
The perception-based approach makes it possible to generate fuzzy IF-THEN rules in the same way as humans do, and to perform the multi-stage classification without misclassifications.
Final conclusions
Neuro-fuzzy systems are soft computing methods utilizing artificial neural networks and fuzzy systems.
Various connectionist architectures of neuro-fuzzy systems can be constructed.
The knowledge acquisition concerns fuzzy IF-THEN rules, and is performed by a learning process.
The systems realize an inference (fuzzy reasoning) based on these rules.