fuzzy trees
8/8/2019 Fuzzy Trees
1/15
The use of Fuzzy Decision Tree Analysis in Monitoring a Minimum Wage
Malcolm Beynon
and
Keith Whitfield
Cardiff Business School, Cardiff University, Wales, UK.
Address for correspondence: Dr Malcolm Beynon,
Cardiff Business School,
Colum Drive,Cardiff, CF10 3EU,
Wales, U.K.
Telephone: +44 (0)29 2087 5747,
Fax +44 (0)29 2087 4419
E-mail: [email protected]
Abstract
Effective monitoring of a minimum wage requires that establishments potentially paying low
wages are effectively identified. This paper investigates the identification of establishments paying low wages prior to the introduction of the British National Minimum Wage in 1999,
through the utilization of fuzzy decision trees. Incorporating a fuzzy aspect within this
problem (using membership functions) enables the judgements to be made with linguistic
scales. An intelligent technique for constructing the required membership functions is
introduced, which greatly reduces the necessity of any expert opinion within their
construction. The Parzen windows method of estimating a probability distribution and the
FUSINTER method of continuous variable discretisation are incorporated in this technique.
An illustration of the utilization of the constructed fuzzy if-then rules is included.
JEL Classification. C14, C15, C44, J31
Keywords. FUSINTER, Fuzzy decision trees, Labour economics, Low pay, Membership
functions, Parzen windows.
1 Introduction
In April 1999, the UK government introduced a National Minimum Wage (NMW) of £3.60
per hour for workers over the age of 21. Enforcing such a regulation is a major task. The
method chosen was targeted monitoring, whereby workplaces are investigated according to
their probability of employing workers on low pay. Such a procedure needs to be based on an
appropriate model for identifying potentially low-paying workplaces. Fuzzy decision tree analysis seems highly appropriate for such a task.
Inductive decision trees were first introduced in the early 1960s with the Concept Learning System
framework (Hunt, 1962). Since then they have continued to be developed and applied. The
structure of a decision tree starts with a root decision node, from which all branches
originate. A branch is a series of nodes where decisions are made at each node enabling
progression through (down) the tree. A progression stops at a leaf node, where a decision
classification is given, based on the rule associated with the full branch from the root node to
the individual leaf node.
As with many data analysis techniques (e.g., traditional regression models), decision trees
have been developed within a fuzzy environment. For example, the well known decision tree
method ID3 (Quinlan, 1986) was developed to include fuzzy entropy measures (see Cios and
Sztandera (1992) and Weber (1992)). The fuzzy decision tree method used in this paper was
introduced by Yuan and Shaw (1995), to take account of cognitive uncertainty, i.e. vagueness
and ambiguity. One reason for the utilization of fuzzy set theory is its simplicity and
similarity to human reasoning (Hong and Chen 1999). This similarity includes the use of
linguistic terms through the utilization of certain membership functions.
The membership function converts crisp numerical values into levels over a set of linguistic
terms. Central to any method within a fuzzy environment is the defining of the required membership functions. This area has itself been the subject of research studies (see Hong and
Chen (1999), Sancho-Royo and Verdegay (1999) and Kahraman et al. (2000)), with many
studies using opinions of experts to construct the necessary functions, e.g. see Tarrazo and
Gutierrez (2000). In this paper an intelligent technique for constructing the membership
functions is introduced which takes into account the information of the individual continuous
values in the original data used to construct the fuzzy decision tree.
The main aim of this paper is to illustrate how fuzzy decision tree analysis can be used to
help monitor a minimum wage. It uses data derived from a survey of British workplaces
(WERS98) which was undertaken just before the introduction of the NMW and which
contains information on low pay.
The rest of the paper is structured as follows. In section 2 a description of the problem
considered is given. In section 3 an intelligent method of membership function construction
is introduced. In section 4 a brief description of the fuzzy decision tree method is given. In
section 5 the construction of the fuzzy decision tree for this problem is exposited.
2 Problem description and data set
In this paper the proportion of employees paid less than £3.50 per hour is defined as the
decision attribute %pay (see footnote 1). In WERS98, over two thirds of establishments reported a zero level
of low-paid employees. In the study of McNabb and Whitfield (2000) certain intervals
(classes) of %pay values were considered. Similarly here, three classes are used to offer an
initial partitioning of the decision attribute %pay. These are: zero percent (zero - Z),
between 0 and 10 percent (low - L) and above 10 percent (high - H).
Since a full analysis of this problem is not the purpose of this paper, a subset of the whole data
set is used, i.e. details on 100 establishments are used to construct the fuzzy
decision tree (see later). Furthermore, only a subset of the condition attributes (characteristics) of
the establishments is used: six condition attributes, introduced and described in
Table 1.
Attribute   Description
age         Age of the organisation (years)
emps        Number of employees in establishment
%yng        Percentage of employees < 20 years old
%old        Percentage of employees > 51 years old
%fem        Percentage of female employees
%prt        Percentage of part-time employees
Table 1: Description of condition attributes.
For a full description of the data (condition and decision attributes) the reader is directed to
the study by McNabb and Whitfield (2000).
Currently, a model is used by the Inland Revenue that aims to identify those sectors or
geographical areas where non-compliance is likely to be most prevalent (Low Pay
Commission, 2000). The ability to successfully identify (predict) those establishments with a
high percentage of low-paid employees is an important factor. That is, the limited resources
available (to the Inland Revenue) to inspect establishments require efficient ways of targeting those
establishments more likely to pay low wages. This efficiency includes using external
characteristics of the establishment which are quick (free) to acquire. Many of these
characteristics will be approximations (e.g. percentage of young or female employees). One further factor is that WERS98 includes data about low pay based on answers from the managers
most responsible for personnel matters, i.e. their answers may not be accurate facts but more
an immediate judgement. Consequently, a fuzzy approach would go some way towards
addressing these issues and, within a decision tree setting, the resultant (readable) rules do not
require particular expertise in specific analysis techniques.

1 With a one year difference between the WERS98 data and the NMW in 1999, the level of £3.50 takes into
account inflation, i.e. to the £3.60 level.
3 Construction of membership functions
As described in section 1, certain membership functions are used to convert a crisp numerical
value into levels over a set of linguistic terms. In this section an intelligent technique is introduced for constructing the required membership functions, used in the subsequent fuzzy
decision tree method. This intelligent technique is made up of three parts, namely:
a) Discretisation of the data set to provisionally intervalise the values of the continuous condition attributes.
b) Construction of estimated distributions to offer a functional form for the spread of the values in an identified interval.
c) Definition of the membership function from the constructed estimated distribution.
Each of these parts is described here, using the NMW problem and data set
described in section 2.
3.1 Discretisation of Data Set
This section is concerned with Continuous Variable Discretisation (CVD). Research within
CVD has suggested several alternatives, based on whether the discretisation is supervised
(utilising the decision class) or unsupervised (considering only the group of continuous (condition
attribute) variables in question). CVD methods can further be separated into local
methods, which operate on a single variable at a time, and global methods, which discretise a
group of objects at the same time.
In this paper the supervised CVD method FUSINTER is used (Zighed et al., 1998). One reason for
using FUSINTER is that it derives the appropriate number of intervals from the
distribution of the data, hence removing the need for an expert opinion here. FUSINTER is a
bottom-up algorithm (merging sub-intervals rather than introducing new interval boundary
values) whose objective is to partition a condition attribute subject to the optimising of a
certain entropy measure. The method partitions only one attribute at a time. One advantage of
this method is its ability to avoid very thin partitioning, i.e. intervals which include a very
small number of objects (see footnote 2).
2 For a detailed discussion of the FUSINTER algorithm see Zighed et al. (1998). In this paper the quadratic
entropy method is used, including the default values α = 0.975 and λ = 1.
Since FUSINTER is a supervised technique, the actual value of the decision attribute
(%pay) is employed to enable the discretisation of each of the six continuous condition
attributes (given in Table 1) to take place, see Table 2. Here, as with the decision attribute, it
is a provisional discretisation aiming to intelligently group the condition attributes before
further analysis. The decision classes (Z, L and H) defined in section 2 for %pay are used to
provisionally discretise the six condition attributes.
Attribute   Interval 1        Interval 2            Interval 3
age         [0, 7.5), 15      [7.5, 19.0), 41       [19.0, ∞), 44
emps        [0, 30.5), 21     [30.5, 98.5), 35      [98.5, ∞), 44
%yng        [0, 0.035), 46    [0.035, 0.120), 24    [0.120, 1], 30
%old        [0, 0.045), 13    [0.045, 0.135), 44    [0.135, 1], 43
%fem        [0, 0.455), 35    [0.455, 0.575), 15    [0.575, 1], 50
%prt        [0, 0.325), 53    [0.325, 1], 47
Table 2: Intervals from FUSINTER discretisation.
Table 2 shows that the six condition attributes are each partitioned into 2 or 3
intervals. Also given is the number of objects in each interval (the figure after each interval), which clearly shows the
avoidance of particularly small intervals (i.e. thin partitioning).
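As a small illustration of how the provisional intervals are used, the sketch below looks up which Table 2 interval a crisp value falls into. The cut points are copied from Table 2; the function name `interval_index` and the dictionary layout are assumptions made for this example only, and the code does not implement FUSINTER itself.

```python
from bisect import bisect_right

# Interior cut points copied from Table 2 (each interval is closed on the left).
CUT_POINTS = {
    "age": [7.5, 19.0],
    "emps": [30.5, 98.5],
    "%yng": [0.035, 0.120],
    "%old": [0.045, 0.135],
    "%fem": [0.455, 0.575],
    "%prt": [0.325],
}


def interval_index(attribute, value):
    """Return the 1-based Table 2 interval that contains `value`."""
    return bisect_right(CUT_POINTS[attribute], value) + 1


# An establishment aged 15 years with 1% part-time employees.
age_interval = interval_index("age", 15)      # second interval, [7.5, 19.0)
prt_interval = interval_index("%prt", 0.01)   # first interval, [0, 0.325)
```

Using `bisect_right` means a value equal to a cut point is placed in the interval that starts at that cut point, which matches the half-open intervals of Table 2.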
3.2 Construction of estimated distributions
The method of Parzen windows (Parzen, 1962) constructs a probability density function (pdf)
based on the values in the domain of the interval. In its general form (assuming each value $x_i$ is represented by a zero mean, unit variance, univariate density function, see Thompson and
Tapia (1990)), the estimated pdf is given by:

$$\mathrm{pdf}(x) = \frac{1}{m} \sum_{i=1}^{m} \frac{1}{h_m} \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{1}{2}\left(\frac{x - x_i}{h_m}\right)^2\right),$$

where m is the number of values in the interval and $h_m$ is the window width. Duda and Hart
(1973, p. 89) consider the problem of constructing $h_m$. They give $h_m = h_1/\sqrt{m}$, where $h_1$ is a
parameter to define. In this paper $h_1$ is the range of the individual values in the interval under
consideration. Defining $I_j$ to be the jth interval, then $h_1 = \max(I_j) - \min(I_j)$, where $\min(I_j)$ and $\max(I_j)$ signify the smallest and largest of the values in the jth interval respectively.
Hence, the associated pdf (i.e. $\mathrm{pdf}_j(x)$) for the jth interval is given by:

$$\mathrm{pdf}_j(x) = \frac{1}{\sqrt{2\pi m_j}\,(\max(I_j) - \min(I_j))} \sum_{i=1}^{m_j} \exp\left(-\frac{1}{2}\left(\frac{\sqrt{m_j}\,(x - x_i)}{\max(I_j) - \min(I_j)}\right)^2\right) \qquad (1)$$

where $m_j$ is the number of values in $I_j$. The $\mathrm{pdf}_j(x)$ function is the mean of the univariate
density functions centred at each of the values in the jth interval.
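Equation (1) can be evaluated directly. The following is a minimal sketch, assuming Gaussian kernels with window width h_1 divided by the square root of m as described above; the function name `parzen_pdf`, the optional `h1` override and the sample values are assumptions for illustration and are not taken from WERS98.

```python
import math


def parzen_pdf(x, values, h1=None):
    """Evaluate the Parzen window estimate of equation (1) at x.

    `values` are the crisp values that fall in one interval. By default h1 is
    their range; passing h1 explicitly covers intervals of zero width.
    """
    m = len(values)
    if h1 is None:
        h1 = max(values) - min(values)   # range of the values in the interval
    h = h1 / math.sqrt(m)                # window width h_m = h_1 / sqrt(m)
    norm = 1.0 / (m * h * math.sqrt(2.0 * math.pi))
    return norm * sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in values)


# Hypothetical values for a single interval of a proportion-type attribute.
sample = [0.04, 0.05, 0.07, 0.08, 0.11]
density_at_centre = parzen_pdf(0.07, sample)
```

Because each kernel integrates to one and the estimate is their mean, the result integrates to one over the real line, which is why the feasible domain of each attribute has to be checked separately.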
Using the original data values of the condition attributes and the intervals defined in Table 2,
the associated estimated distributions (i.e. pdfs) can be constructed, see Figure 1.
Figure 1: Estimated distributions of condition attributes (panels: age, emps, %yng, %old, %fem, %prt; the curves in each panel are labelled by interval number).
In Figure 1, each set of estimated distributions is shown over the domain of the intervals
given in Table 2. It is noted that the constructed $\mathrm{pdf}_j(x)$ functions have a domain over $(-\infty, \infty)$, but here a check is made on the feasible domain for each attribute, e.g. %yng is a percentage hence has a feasible domain [0, 100], given as a proportion with [0, 1] domain in Figure 1. The labels 1, 2 and 3 identify the estimated distributions with the intervals given
in Table 2.
A similar set of estimated distributions can be constructed for the decision attribute (%pay),
as shown in Figure 2.
Figure 2: Estimated distributions of decision attribute (%pay; curves labelled 1, 2 and 3).
In Figure 2, the three associated pdfs are shown. Of special note is the pdf with label 1,
relating to the %pay = Z class. Since it represents those establishments with a zero
percentage of low-paid workers, it would have zero interval width, hence equation (1)
cannot be used. In this case an interval width $h_1 = 0.05$ is used, enabling a pdf to be constructed.
The reasoning for this is that by allowing a pdf to exist for a relatively crisp value, a level of
fuzziness is included. That is, within a workplace the manager answering the questions may answer with a zero level of low pay while aware of a very small proportion existing.
3.3 Definition of the membership functions
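This section is concerned with the construction of the required membership functions. Within
related studies, a number of different types of membership functions have been investigated.
These include triangular and trapezoidal functions, and also whether they should be linear/non-linear and possibly piecewise (see Hu and Fang (1998), Medasani et al. (1998) and
Roa-Sepulveda and Herrera (2000)). Here, linear trapezoidal membership functions are
utilised. For each interval, i.e. membership function, the general functional form is fixed by
four defining values $[\alpha_{j,1}, \alpha_{j,2}, \alpha_{j,3}, \alpha_{j,4}]$, where $\alpha_{j,1}$ and $\alpha_{j,4}$ are the lower and upper limits of the central $z_{>0}$ area of $\mathrm{pdf}_j$ (outside which the membership value is 0), and $\alpha_{j,2}$ and $\alpha_{j,3}$ are the lower and upper limits of its central $z_{=1}$ area (within which the membership value is 1).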
Using the estimated distributions given in section 3.2, and with $z_{=1} = 0.1$ and $z_{>0} = 0.97$, the
defining values for each membership function can be found. For the case $z_{>0} = 0.97$, this
implies that the associated membership function has a value greater than zero for the central
97% area of the pdf for this interval, hence possibly removing the influence of any particular
outliers in the data. If comparing to a possibility distribution, the $z_{=1}$ and $z_{>0}$ values define the
necessity and possibility measures for the membership functions (Bandemer and Gottwald,
1995). These defining values enable the membership functions to be constructed, as given in
Figure 3.
Figure 3: Sets of membership functions for condition attributes (panels: age, emps, %yng, %old, %fem, %prt; vertical axis from 0 to 1). The defining values marked in each panel are:

Attribute   L (falls from 1 to 0 between)   M (defining values)               H (rises from 0 to 1 between)
age         4.62 and 9.81                   [5.71, 10.27, 10.99, 17.87]       2.93 and 79.98
emps        22.78 and 35.45                 [17.58, 48.78, 54.93, 104.88]     14.06 and 360.94
%yng        0.017 and 0.037                 [0.019, 0.058, 0.065, 0.127]      0.020 and 0.262
%old        0.025 and 0.057                 [0.032, 0.081, 0.090, 0.141]      0.027 and 0.260
%fem        0.227 and 0.523                 [0.442, 0.519, 0.529, 0.604]      0.500 and 0.700
%prt        0.127 and 0.348                 (no M term)                       0.215 and 0.592
From Figure 3, the membership functions are shown, e.g. for the 2nd interval of the %yng
attribute, the defining values are [0.019, 0.058, 0.065, 0.127]. This membership function is
labelled M, representing the linguistic term medium. Further labels are also given to its
neighbouring intervals in Figure 3, i.e. L - low and H - high. In summary, for the %yng
attribute a linguistic scale of low, medium and high has been constructed, with the only
requirement from an expert being the choice of the $z_{=1}$ and $z_{>0}$ values. This follows
also for age, emps, %old and %fem, with the attribute %prt having linguistic scales L - low
and H - high only.
A similar set of fuzzy membership functions can be constructed for the decision attribute
using the estimated distribution given in Figure 2, see Figure 4.
Figure 4: Membership functions for decision attribute (%pay; linguistic terms Z, L and H; vertical axis from 0 to 1).
In Figure 4, the membership functions for the decision attribute are given. In this case the
associated linguistic terms are Z - zero, L - low and H - high.
To further illustrate the construction of the fuzzy numbers from the original data, the details
of an establishment are given in Table 3 along with the subsequent fuzzy values.
Attribute   Crisp value   Fuzzy value
age         15            [0, 0.417, 0.157]
emps        91            [0, 0.278, 0.222]
%yng        0.05          [0, 0.793, 0.126]
%old        0.16          [0, 0, 0.571]
%fem        0.12          [1, 0, 0]
%prt        0.01          [1, 0]
%pay        0.02          [0, 0.599, 0.007]
Table 3: Original and Fuzzy attribute values.
Using the membership functions previously defined, the resultant fuzzified values given in
Table 3 can also be written:
{0, 0.417, 0.157; 0, 0.278, 0.222; 0, 0.793, 0.126; 0, 0, 0.571; 1, 0, 0; 1, 0; 0, 0.599, 0.007}
where the semi-colons separate the sets of fuzzy values for each attribute (condition and
decision attributes included).4
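The fuzzification step can be sketched in a few lines. The example below assumes the standard linear trapezoidal form (0 outside the outer defining values, 1 between the inner ones, linear in between) and uses the defining values quoted in the text for %yng = M; the function name `trapezoid` and its argument names are assumptions for illustration.

```python
def trapezoid(x, a1, a2, a3, a4):
    """Linear trapezoidal membership: 0 outside (a1, a4), 1 on [a2, a3]."""
    if a2 <= x <= a3:
        return 1.0
    if x <= a1 or x >= a4:
        return 0.0
    if x < a2:
        return (x - a1) / (a2 - a1)   # rising edge
    return (a4 - x) / (a4 - a3)       # falling edge


# Defining values quoted in the text for the M term of %yng.
YNG_MEDIUM = (0.019, 0.058, 0.065, 0.127)

# Membership of the Table 3 establishment (%yng = 0.05) in the M term.
mu_medium = trapezoid(0.05, *YNG_MEDIUM)
```

With these rounded defining values the membership comes out at about 0.795, close to the 0.793 reported in Table 3; the small gap is presumably due to rounding of the printed defining values.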
4 Summary of fuzzy decision tree method
In this section a brief description is given of the functions used in the fuzzy decision tree method
introduced by Yuan and Shaw (1995). A fuzzy set A in a universe of discourse
U is characterized by a membership function $\mu_A$ which takes values in the interval [0, 1]. For
all $u \in U$, the intersection $A \cap B$ of two fuzzy sets is given by $\mu_{A \cap B}(u) = \min(\mu_A(u), \mu_B(u))$.
A membership function $\mu(x)$ of a fuzzy variable Y defined on X can be viewed as a
possibility distribution of Y on X, i.e. $\pi(x) = \mu(x)$, for all $x \in X$. The possibilistic measure of ambiguity, $E_\alpha(Y)$, is defined as:
4 The same method of illustrating fuzzy values as used in Wang et al. (2000).
$$E_\alpha(Y) = g(\pi) = \sum_{i=1}^{n} (\pi^*_i - \pi^*_{i+1}) \ln[i],$$

where $\pi^* = \{\pi^*_1, \pi^*_2, \ldots, \pi^*_n\}$ is the permutation of the possibility distribution
$\pi = \{\pi(x_1), \pi(x_2), \ldots, \pi(x_n)\}$ (see footnote 5),
sorted so that $\pi^*_i \geq \pi^*_{i+1}$ for i = 1, .., n, and $\pi^*_{n+1} = 0$, see Zadeh
(1978) and Higashi and Klir (1983). The ambiguity of attribute A is then:

$$E_\alpha(A) = \frac{1}{m} \sum_{i=1}^{m} E_\alpha(A(u_i)),$$

where $E_\alpha(A(u_i)) = g\left(\mu_{T_s}(u_i) / \max_{1 \leq j \leq s}(\mu_{T_j}(u_i))\right)$, with $T_j$ the linguistic scales used within an
attribute for m cases. Ambiguity exists when there is overlapping between the linguistic terms of an attribute or
between classes.
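The ambiguity measure is easy to compute once the possibility values are normalised and sorted. This is a minimal sketch under the definitions above; the function names are assumptions for illustration, and returning 0 for an all-zero distribution is an assumption of this sketch rather than something stated in the paper.

```python
import math


def ambiguity(pi):
    """g(pi): sum over i of (pi*_i - pi*_{i+1}) * ln(i), pi* sorted descending."""
    top = max(pi)
    if top == 0:
        return 0.0  # assumption: treat an all-zero distribution as unambiguous
    # Normalise by the largest value, sort, and append pi*_{n+1} = 0.
    ranked = sorted((p / top for p in pi), reverse=True) + [0.0]
    return sum((ranked[i - 1] - ranked[i]) * math.log(i)
               for i in range(1, len(pi) + 1))


def attribute_ambiguity(rows):
    """Mean ambiguity of one attribute over the m cases (one row per case)."""
    return sum(ambiguity(row) for row in rows) / len(rows)


# Fuzzy values of age and %fem for the establishment in Table 3.
g_age = ambiguity([0, 0.417, 0.157])   # two overlapping terms, so g > 0
g_fem = ambiguity([1, 0, 0])           # a single term, so g = 0
```

The measure is 0 when only one linguistic term has non-zero membership and reaches ln(n) when all n terms are equally possible.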
The fuzzy subsethood S(A, B) measures the degree to which A is a subset of B (see Kosko,
1986) and is given by (see footnote 6):

$$S(A, B) = \frac{\sum_{u \in U} \min(\mu_A(u), \mu_B(u))}{\sum_{u \in U} \mu_A(u)}.$$
Given fuzzy evidence E, the possibility of classifying an object to class $C_i$ can be defined as:

$$\pi(C_i \mid E) = \frac{S(E, C_i)}{\max_j S(E, C_j)},$$

where $S(E, C_i)$ represents the degree of truth for the classification rule, i.e. if E then $C_i$.
Knowing a single piece of evidence (i.e., a fuzzy value from an attribute) the classification
ambiguity based on this fuzzy evidence is defined as:

$$G(E) = g(\pi(C \mid E)).$$
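The subsethood and classification ambiguity definitions can be sketched as follows. Membership values are held as plain lists over the same objects; the function names are assumptions for illustration, and the only data used are the three subsethood values reported in section 5 for the branch %yng = L.

```python
import math


def ambiguity(pi):
    """g(pi) with pi normalised by its largest value and sorted descending."""
    top = max(pi)
    if top == 0:
        return 0.0  # assumption: an all-zero distribution is unambiguous
    ranked = sorted((p / top for p in pi), reverse=True) + [0.0]
    return sum((ranked[i - 1] - ranked[i]) * math.log(i)
               for i in range(1, len(pi) + 1))


def subsethood(mu_a, mu_b):
    """S(A, B): sum of min(mu_A, mu_B) over the objects, divided by sum of mu_A."""
    return sum(min(a, b) for a, b in zip(mu_a, mu_b)) / sum(mu_a)


def classification_ambiguity(evidence, classes):
    """G(E) = g(pi(C | E)), with pi(C_i | E) = S(E, C_i) / max_j S(E, C_j)."""
    degrees = [subsethood(evidence, c) for c in classes]
    return ambiguity(degrees)  # ambiguity() performs the division by the maximum


# Subsethood values reported for the branch %yng = L (classes Z, L, H).
g_yng_low = ambiguity([0.8666, 0.0569, 0.0834])
```

A branch whose evidence points almost entirely at one class, as here, gives a classification ambiguity close to 0.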
The classification ambiguity with fuzzy partitioning $P = \{E_1, \ldots, E_k\}$ on the fuzzy evidence
F, denoted as G(P | F), is the weighted average of classification ambiguity with each subset of the partition:

$$G(P \mid F) = \sum_{i=1}^{k} w(E_i \mid F)\, G(E_i \cap F),$$

where $G(E_i \cap F)$ is the classification ambiguity with fuzzy evidence $E_i \cap F$, and $w(E_i \mid F)$ is the weight which represents the relative size of subset $E_i \cap F$ in F:
5 That is, the values $\{\pi(x_1), \pi(x_2), \ldots, \pi(x_n)\}$ are normalised based on the largest value.
6 To calculate S(A, B), A and B should be defined on the same universe of discourse. In this case all attributes are
over the same set of objects (workplaces).
$$w(E_i \mid F) = \frac{\sum_{u \in U} \min(\mu_{E_i}(u), \mu_F(u))}{\sum_{j=1}^{k} \sum_{u \in U} \min(\mu_{E_j}(u), \mu_F(u))}.$$
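Putting the pieces together, G(P | F) is a weighted average over the terms of the candidate attribute. The sketch below assumes memberships are lists over the same objects and that the evidence F overlaps at least one term; all names and the four-object data are hypothetical and serve only to show the two extreme outcomes.

```python
import math


def ambiguity(pi):
    """g(pi) with pi normalised by its largest value and sorted descending."""
    top = max(pi)
    if top == 0:
        return 0.0  # assumption: an all-zero distribution is unambiguous
    ranked = sorted((p / top for p in pi), reverse=True) + [0.0]
    return sum((ranked[i - 1] - ranked[i]) * math.log(i)
               for i in range(1, len(pi) + 1))


def subsethood(mu_a, mu_b):
    """S(A, B) as defined in this section."""
    return sum(min(a, b) for a, b in zip(mu_a, mu_b)) / sum(mu_a)


def classification_ambiguity(evidence, classes):
    """G(E) = g(pi(C | E))."""
    return ambiguity([subsethood(evidence, c) for c in classes])


def partition_ambiguity(partition, focus, classes):
    """G(P | F): weighted average of G(E_i intersect F) with weights w(E_i | F)."""
    joined = [[min(e, f) for e, f in zip(term, focus)] for term in partition]
    sizes = [sum(j) for j in joined]
    total = sum(sizes)  # assumed non-zero: F overlaps at least one term
    # Terms with no overlap have zero weight and are skipped.
    return sum(size / total * classification_ambiguity(j, classes)
               for j, size in zip(joined, sizes) if size > 0)


# Hypothetical memberships over four objects.
focus = [1.0, 1.0, 1.0, 1.0]
partition = [[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]]
aligned_classes = [[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]]
mixed_classes = [[1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0]]

g_aligned = partition_ambiguity(partition, focus, aligned_classes)
g_mixed = partition_ambiguity(partition, focus, mixed_classes)
```

When the partition lines up with the classes the ambiguity is 0; when each term is split evenly between two classes it reaches ln 2, so an attribute of the first kind would be preferred.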
The fuzzy decision tree method considered here utilizes these functions. In summary, attributes are assigned to nodes based on the lowest level of ambiguity. A node becomes a
leaf node if the level of subsethood (based on the conjunction (intersection) of the branches
from the root) is higher than some truth value β assigned to the whole of the decision tree. The classification from the leaf node is to the decision class with the largest subsethood
value. For a full description of this method see Yuan and Shaw (1995) and Wang et al.
(2000).
5 Fuzzy decision tree construction
Utilizing the definitions given in section 4, in this section the fuzzy decision tree method is illustrated, using the fuzzy values for the low pay problem described in section 2. A truth
level of β = 0.6 is used throughout. The final fuzzy decision tree is given in Figure 5 and can be used as reference while its construction is described below.
To find the root node attribute, the classification ambiguity values are found for each attribute. They
are: G(age) = 0.6607, G(emps) = 0.4568, G(%yng) = 0.4195, G(%old) = 0.7244, G(%fem) =
0.5189 and G(%prt) = 0.4546. Since G(%yng) is the lowest of these values, %yng is chosen as
the root node attribute. The subsethood values of each of the branches from %yng to the classes of
the decision attribute (%pay) are then calculated. For the branch (%yng = L) they are: S(%yng =
L, %pay = Z) = 0.8666, S(%yng = L, %pay = L) = 0.0569 and S(%yng = L, %pay = H) =
0.0834. The largest of these values (0.8666) is above the required truth level (β = 0.6), hence this branch ends in a leaf node from which a rule can be constructed.
Similar considerations are given to the branches (%yng = M) and (%yng = H). In these cases
the largest subsethood values are S(%yng = M, %pay = L) = 0.4246 and S(%yng = H, %pay
= H) = 0.5431 respectively. Since both of these largest subsethood values are less than the
acceptable truth level, these branches require further partitioning, with different
attributes needing to be considered. For the (%yng = M) branch we first calculate the
classification ambiguity G(%yng = M) = 0.7820, then compare this with the
classification ambiguity with fuzzy partitioning for each of the other attributes from
this branch, e.g. G(age | %yng = M) = 0.6598. An inspection of the possible values shows G(%prt | %yng = M) = 0.4856 is the least, hence %prt is the chosen attribute for the decision
node at this branch. Similarly, G(%yng = H) = 0.3831, and %fem, with G(%fem | %yng = H) =
0.2725, is the chosen attribute for this branch.
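The selection steps described so far reduce to a minimum over ambiguity values and a threshold test on the largest subsethood value. The sketch below replays them with the figures quoted in the text; the helper name `leaf_or_split` and the data layout are assumptions for illustration. For the %yng = M branch only the largest subsethood value is reported, so only that value is supplied.

```python
BETA = 0.6  # truth level used throughout section 5

# Classification ambiguity of each candidate root attribute, as quoted in the text.
ROOT_AMBIGUITY = {
    "age": 0.6607, "emps": 0.4568, "%yng": 0.4195,
    "%old": 0.7244, "%fem": 0.5189, "%prt": 0.4546,
}

# The root is the attribute with the lowest classification ambiguity.
root = min(ROOT_AMBIGUITY, key=ROOT_AMBIGUITY.get)


def leaf_or_split(subsethoods, beta=BETA):
    """Return ('leaf' or 'split', class with largest subsethood, that value)."""
    label, truth = max(subsethoods.items(), key=lambda item: item[1])
    kind = "leaf" if truth > beta else "split"
    return kind, label, truth


# Branch %yng = L: all three subsethood values are given in the text.
branch_low = leaf_or_split({"Z": 0.8666, "L": 0.0569, "H": 0.0834})

# Branch %yng = M: only the largest subsethood value is reported.
branch_medium = leaf_or_split({"L": 0.4246})

# A further split is worthwhile when it lowers the ambiguity of the branch.
split_reduces_ambiguity = 0.4856 < 0.7820  # G(%prt | %yng = M) vs G(%yng = M)
```

The first branch becomes a leaf classifying to %pay = Z, while the second falls below the truth level and is partitioned further on %prt.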
The branches from the decision node (%prt | %yng = M) are next considered. Firstly, the
largest subsethood values for each subsequent branch are S(%yng = M and %prt = L,
%pay = L) = 0.6101 and S(%yng = M and %prt = H, %pay = H) = 0.4970. Of these values,
only S(%yng = M and %prt = L, %pay = L) is above the truth level, hence this branch ends in a leaf
node; the other branch requires possible further partitioning with attributes. For the decision
node (%fem | %yng = H) the largest subsethood values for each branch are S(%yng
= H and %fem = L, %pay = L) = 0.5860, S(%yng = H and %fem = M, %pay = H) = 0.6482
and S(%yng = H and %fem = H, %pay = H) = 0.6597. Hence only branch (%yng = H and
%fem = L) requires further possible partitioning by attributes.
This process is continued until only leaf nodes are at the end of each branch, or no further
augmentation of attributes to nodes can be made (see footnote 7). The final results of the fuzzy decision tree
method are illustrated in Figure 5.
Figure 5: Fuzzy decision tree (root node %yng; each leaf node gives a %pay class and its degree of truth).
In Figure 5, the fuzzy decision tree is shown for the NMW problem considered. There are
11 fuzzy rules (leaf nodes), described by the larger rectangular boxes. Each rule is described by the downward progression from the root to a leaf node. In each non-leaf
node (excluding the root) there are two parts: above the dashed line in the rectangular box is
the particular condition attribute linguistic term to be satisfied; below
the dashed line is the next condition attribute to consider.
At a leaf node, above the dashed line is the final condition attribute linguistic term to be
satisfied and below the dashed line is the class of the decision attribute %pay the rule classifies
to, along with the degree of truth in the classification. For example, one rule is given in Figure
6 along with a wording of the rule.

7 This may be based on no improvement (reduction) of the classification ambiguity value of a branch, or no
further attributes able to be augmented.
Root -> %yng -> (%yng = M) -> %prt -> (%prt = L) -> %pay = L, 61.0%

If %yng = M and %prt = L then %pay = L with degree of truth 61.0%. That is, when the fuzzy value of the
membership function for %yng = M is the largest for that attribute, and similarly for the
%prt = L condition attribute.
Figure 6: Description of a fuzzy decision rule.
To illustrate this decision tree, the establishment given in Table 3 is classified. The fuzzy
values for the establishment are:
{0, 0.417, 0.157; 0, 0.278, 0.222; 0, 0.793, 0.126; 0, 0, 0.571; 1, 0, 0; 1, 0; 0, 0.599, 0.007}
Taking the largest value for each attribute, the dominant linguistic terms are age = M (since the largest value is 0.417), emps = M, %yng = M, %old = H, %fem = L, %prt = L and %pay = L. Using this
information, the fuzzy rule given in Figure 6 is the rule which classifies this
establishment. An inspection of the result shows the correct classification was given, although
the degree of truth (61.0%) is an indication of the fuzzy nature of this analysis.
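The classification of this establishment can be replayed in code. The sketch takes the fuzzy values from Table 3, picks the dominant linguistic term of each condition attribute, and checks the rule of Figure 6; the data structures and function names are assumptions for illustration only.

```python
# Linguistic terms of each condition attribute, in the order of the fuzzy values.
TERMS = {"age": "LMH", "emps": "LMH", "%yng": "LMH",
         "%old": "LMH", "%fem": "LMH", "%prt": "LH"}

# Fuzzy values of the condition attributes from Table 3.
ESTABLISHMENT = {
    "age": [0, 0.417, 0.157],
    "emps": [0, 0.278, 0.222],
    "%yng": [0, 0.793, 0.126],
    "%old": [0, 0, 0.571],
    "%fem": [1, 0, 0],
    "%prt": [1, 0],
}

# Rule of Figure 6: if %yng = M and %prt = L then %pay = L (degree of truth 61.0%).
RULE = {"if": {"%yng": "M", "%prt": "L"}, "then": "L", "truth": 0.610}


def dominant_term(attribute, fuzzy_values):
    """Linguistic term with the largest membership value for this attribute."""
    best = max(range(len(fuzzy_values)), key=fuzzy_values.__getitem__)
    return TERMS[attribute][best]


def rule_fires(rule, establishment):
    """True when every condition of the rule matches the dominant term."""
    return all(dominant_term(attr, establishment[attr]) == term
               for attr, term in rule["if"].items())


dominant = {attr: dominant_term(attr, values)
            for attr, values in ESTABLISHMENT.items()}
predicted = RULE["then"] if rule_fires(RULE, ESTABLISHMENT) else None
```

The %prt values [1, 0] give L as the dominant term, which is what the rule of Figure 6 requires, and the predicted class L agrees with the dominant term of the %pay values [0, 0.599, 0.007] in Table 3.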
6 Conclusions
This paper has illustrated the use of a fuzzy decision tree approach to
identifying establishments that pay low wages. Through the use of Parzen windows and
FUSINTER, the required membership functions are intelligently constructed, with no need for an expert opinion within many parts of the analysis.
The results of the fuzzy decision tree are fuzzy classification rules, each with an associated
degree of truth in its classification. These rules are relatively simple to read and apply, i.e.
a person may calculate the specific fuzzy values from crisp data or simply use the low (L),
medium (M) and high (H) labels as simple linguistic terms. This removes the need for any
further analysis to be undertaken, apart from making the linguistic judgements.
References
Bandemer, H. and Gottwald, S. (1995). Fuzzy Sets, Fuzzy Logic, Fuzzy Methods. Wiley,
New York.
Cios, K. J. and Sztandera, L. M. (1992). Continuous ID3 algorithm with fuzzy entropy
measure. Proceedings IEEE International Conference on Fuzzy Systems, San Diego, CA,
469-476.
Duda, R. O. and Hart, P. E. (1973). Pattern Classification and Scene Analysis. Wiley, New
York.
Higashi, M. and Klir, G. J. (1983). Measure of uncertainty and information based on
possibility distributions. International Journal of General Systems, 9: 43-58.
Hong, T-P. and Chen, J-B. (1999). Finding relevant attributes and membership functions.
Fuzzy Sets and Systems, 103: 389-404.
Hu, C-F. and Fang, S-C. (1998). Solving fuzzy inequalities with concave membership functions. Fuzzy Sets and Systems, 99: 233-240.
Hunt, E. B. (1962). Concept Learning: An Information Processing Problem. Wiley, New
York.
Kahraman, C., Tolga, E. and Ulukan, Z. (2000). Justification of manufacturing technologies
using fuzzy benefit/cost ratio analysis. International Journal of Production Economics, 66:
45-52.
Kosko, B. (1986). Fuzzy entropy and conditioning. Information Sciences, 30: 165-174.
Low Pay Commission (2000). The National Minimum Wage: The Story So Far: Second
Report of the Low Pay Commission. Cm 4571, London: HMSO.
McNabb, R. and Whitfield, K. (2000). Worth So Appallingly Little: A Workplace-Level
Analysis of Low Pay. British Journal of Industrial Relations, 38(4): 585-609.
Medasani, S., Kim, J. and Krishnapuram, R. (1998). An overview of membership function
generation techniques for pattern recognition. International Journal of Approximate
Reasoning, 19: 391-417.
Parzen, E. (1962). On estimation of a probability density function and mode. Annals of
Mathematical Statistics, 33: 1065-1076.
Quinlan, J. R. (1986). Induction of decision trees. Machine Learning, 1(1): 81-106.
Roa-Sepulveda, C. A. and Herrera, M. (2000). A solution to the economic dispatch problem
using decision trees. Electric Power Systems Research, 56: 255-259.
Sancho-Royo, A. and Verdegay, J. L. (1999). Methods for the Construction of Membership
Functions. International Journal of Intelligent Systems, 14: 1213-1230.
Tarrazo, M. and Gutierrez, L. (2000). Economic expectation, fuzzy sets and financial
planning. European Journal of Operational Research, 126: 89-105.
Thompson, J. R. and Tapia, R. A. (1990). Nonparametric Function Estimation, Modeling,
and Simulation. Society for Industrial and Applied Mathematics, Philadelphia.
Wang, X., Chen, B., Qian, G. and Ye, F. (2000). On the optimization of fuzzy decision
trees. Fuzzy Sets and Systems, 112: 117-125.
Weber, R. (1992). Fuzzy-ID3: a class of methods for automatic knowledge acquisition.
Proceedings of 2nd International Conference on Fuzzy Logic and Neural Networks, Iizuka,
Japan, 265-268.
Yuan, Y. and Shaw, M. J. (1995). Induction of fuzzy decision trees. Fuzzy Sets and
Systems, 69: 125-139.
Zadeh, L. A. (1978). Fuzzy Sets as a basis for a theory of possibility. Fuzzy Sets and
Systems, 1: 3-28.
Zighed, D. A., Rabaseda, S. and Rakotomalala, R. (1998). FUSINTER: A method for
discretisation of continuous attributes. International Journal of Uncertainty, Fuzziness and
Knowledge-Based Systems, 6(3): 307-326.