roughset & it’s variants

43
ROUGH SET & IT’S VARIANTS:VARIABLE PRICISION ROUGH SET AND FUZZY ROUGH SET APPROACHES presented by- Rajdeep Chatterjee PLP, MIU ISI Kolkata

Upload: rajdeep-chatterjee

Post on 12-Aug-2015

169 views

Category:

Engineering


1 download

TRANSCRIPT

Page 1: Roughset & it’s variants

ROUGH SET & IT’S VARIANTS: VARIABLE

PRICISION ROUGH SET AND FUZZY ROUGH

SET APPROACHES

presented by-

Rajdeep Chatterjee

PLP, MIU ISI Kolkata

Page 2: Roughset & it’s variants

Overview

Introduction to Rough Set

Information/Decision Systems

Indiscernibility

Set Approximations of Rough Set

Reducts and Core

Dependency of Attributes

Variable Precision rough Set (VPRS)

Set Approximations (VPRS)

Fuzzy Rough Set (FRS)

Set Approximations and Dependency (FRS)

Observations

R Chatterjee, PLP, MIU ISI Kolkata

Page 3: Roughset & it’s variants

Introduction

Often, information on the surrounding

world is

◦ Imprecise

◦ Incomplete

◦ uncertain.

We should be able to process uncertain

and/or incomplete information.

R Chatterjee, PLP, MIU ISI Kolkata

Page 4: Roughset & it’s variants

Introduction

“Rough set theory” was developed by

Zdzislaw Pawlak in the early 1980’s.

Representative Publications:

◦ Z. Pawlak, “Rough Sets”, International Journal of

Computer and Information Sciences, Vol.11, 341-

356 (1982).

◦ Z. Pawlak, Rough Sets - Theoretical Aspect of

Reasoning about Data, Kluwer Academic

Pubilishers (1991).

R Chatterjee, PLP, MIU ISI Kolkata

Page 5: Roughset & it’s variants

Information Systems

Age LEMS

X1 16-30 50

X2 16-30 0

X3 31-45 1-25

X4 31-45 1-25

X5 46-60 26-49

X6 16-30 26-49

X7 46-60 26-49

IS is a pair (U, A)

U is a non-empty

finite set of objects.

A is a non-empty

finite set of

attributes such that

for every

is called the value

set of a.aV

R Chatterjee, PLP, MIU ISI Kolkata

Page 6: Roughset & it’s variants

Decision Systems

Age LEMS Walk

X1 16-30 50 Yes

X2 16-30 0 No

X3 31-45 1-25 No

X4 31-45 1-25 Yes

X5 46-60 26-49 No

X6 16-30 26-49 Yes

X7 46-60 26-49 No

DS:

is the decision

attribute (instead of

one we can consider

more decision

attributes).

The elements of A

are called the

condition attributes.

}){,( dAUT

Ad

Condition attributes

Decision attribute

R Chatterjee, PLP, MIU ISI Kolkata

Page 7: Roughset & it’s variants

Indiscernibility

The equivalence relation

A binary relation which is

reflexive (xRx for any object x) ,

symmetric (if xRy then yRx), and

transitive (if xRy and yRz then xRz).

The equivalence class of an element

consists of all objects

such that xRy.

XXR

Rx][

Xx

Xy

R Chatterjee, PLP, MIU ISI Kolkata

Page 8: Roughset & it’s variants

Indiscernibility

Let IS = (U, A) be an information system, then with any there is an associated equivalence relation:

where is called the B-indiscernibility relation.

If then objects x and x’ are indiscernible from each other by attributes from B.

The equivalence classes of the B-indiscernibilityrelation are denoted by

AB

)}'()(,|)',{()( 2 xaxaBaUxxBINDIS

)(BINDIS

),()',( BINDxx IS

Rx][

R Chatterjee, PLP, MIU ISI Kolkata

Page 9: Roughset & it’s variants

An Example of Indiscernibility

Age LEMS Walk

X1 16-30 50 Yes

X2 16-30 0 No

X3 31-45 1-25 No

X4 31-45 1-25 Yes

X5 46-60 26-49 No

X6 16-30 26-49 Yes

X7 46-60 26-49 No

The non-empty subsets of the condition attributes are {Age}, {LEMS}, and {Age, LEMS}.

IND({Age}) = {{x1,x2,x6}, {x3,x4}, {x5,x7}}

IND({LEMS}) = {{x1}, {x2}, {x3,x4}, {x5,x6,x7}}

IND({Age,LEMS}) = {{x1}, {x2}, {x3,x4}, {x5,x7}, {x6}}.

R Chatterjee, PLP, MIU ISI Kolkata

Page 10: Roughset & it’s variants

Set Approximation

Let T = (U,A) and let and

We can approximate X using only the

information contained in B by

constructing the B-lower and B-upper

approximations of X, denoted and

respectively, where

AB .UX

XB

XB

},][|{ XxxXB B

}.][|{ XxxXB B

R Chatterjee, PLP, MIU ISI Kolkata

Page 11: Roughset & it’s variants

Set Approximation

B-boundary region of X,

consists of those objects that we cannot

decisively classify into X in B.

B-outside region of X,

consists of those objects that can be with

certainty classified as not belonging to X.

A set is said to be “rough” if its boundary

region is non-empty, otherwise the set is crisp.

,)( XBXBXBNB

,XBU

R Chatterjee, PLP, MIU ISI Kolkata

Page 12: Roughset & it’s variants

An Example of Set Approximation

Let W = {x | Walk(x) = yes}.

The decision class, Walk, is rough since the boundary region is not empty

}.7,5,2{

},4,3{)(

},6,4,3,1{

},6,1{

xxxWAU

xxWBN

xxxxWA

xxWA

A

Age LEMS Walk

X1 16-30 50 Yes

X2 16-30 0 No

X3 31-45 1-25 No

X4 31-45 1-25 Yes

X5 46-60 26-49 No

X6 16-30 26-49 Yes

X7 46-60 26-49 No

R Chatterjee, PLP, MIU ISI Kolkata

Page 13: Roughset & it’s variants

An Pictorial Depiction of Set Approximation

yes

yes/no

no

{{x1},{x6}}

{{x3,x4}}

{{x2}, {x5,x7}}

WA

AW

R Chatterjee, PLP, MIU ISI Kolkata

Page 14: Roughset & it’s variants

Lower & Upper Approximations

}:/{ XYRUYXR

}:/{ XYRUYXR

Lower Approximation:

Upper Approximation:

R Chatterjee, PLP, MIU ISI Kolkata

Page 15: Roughset & it’s variants

Properties of Approximations

1.

2.

3.

4.

5.

6. YX

YX

YBXBYXB

YBXBYXB

UUBUBBB

XBXXB

)()()(

)()()(

)()( ,)()(

)(

)()( YBXB

)()( YBXB

implies

implies

R Chatterjee, PLP, MIU ISI Kolkata

Page 16: Roughset & it’s variants

Properties of Approximations

7.

8.

9.

10.

11.

12. )())(())((

)())(())((

)()(

)()(

)()()(

)()()(

XBXBBXBB

XBXBBXBB

XBXB

XBXB

YBXBYXB

YBXBYXB

Where, -X denotes U - X.R Chatterjee, PLP, MIU ISI Kolkata

Page 17: Roughset & it’s variants

Four Basic Classes of Rough Sets

X is roughly B-definable, iff and

X is internally B-undefinable, iffand

X is externally B-undefinable, iff

and

X is totally B-undefinable, iff

and

)(XB

,)( UXB

)(XB

,)( UXB

)(XB

,)( UXB

)(XB

.)( UXB

R Chatterjee, PLP, MIU ISI Kolkata

Page 18: Roughset & it’s variants

Accuracy of Approximation

where |X| denotes the cardinality of

Obviously

If X is crisp with respect to B.

If X is rough with respect to B.

|)(|

|)(|)(

XB

XBXB

.X

.10 B

,1)( XB

,1)( XB

R Chatterjee, PLP, MIU ISI Kolkata

Page 19: Roughset & it’s variants

Issues in the Decision Table

The same or indiscernible objects may be

represented several times.

Some of the attributes may be

superfluous (redundant).

That is, their removal cannot worsen the

classification.

R Chatterjee, PLP, MIU ISI Kolkata

Page 20: Roughset & it’s variants

Reduct

Keep only those attributes that preserve

the indiscernibility relation and,

consequently, set approximation.

There are usually several such subsets of

attributes and those which are minimal

are called reducts.

R Chatterjee, PLP, MIU ISI Kolkata

Page 21: Roughset & it’s variants

Dispensable & Indispensable

AttributesLet Attribute c is dispensable in T

if , otherwise

attribute c is indispensable in T.

.Cc

)()( }){( DPOSDPOS cCC

XCDPOSDUX

C /

)(

The C-positive region of D:

R Chatterjee, PLP, MIU ISI Kolkata

Page 22: Roughset & it’s variants

Independent

T = (U, C, D) is independent

if all are indispensable in TCc

R Chatterjee, PLP, MIU ISI Kolkata

Page 23: Roughset & it’s variants

Reduct & Core

The set of attributes is called a

reduct of C, if T’ = (U, R, D) is independent

and

The set of all the condition attributes

indispensable in T is denoted by CORE(C).

where RED(C) is the set of all reducts of C.

CR

).()( DPOSDPOS CR

)()( CREDCCORE

R Chatterjee, PLP, MIU ISI Kolkata

Page 24: Roughset & it’s variants

Discernibility Matrix

Let T = (U, C, D) be a decision table, with

By a discernibility matrix of T, denoted M(T), we will mean matrix defined as:

for i, j = 1,2,…,n

is the set of all the condition attributes that classify objects ui and uj into different classes

}.,...,,{ 21 nuuuU

ijc

R Chatterjee, PLP, MIU ISI Kolkata

Page 25: Roughset & it’s variants

Discernibility Function

A discernibility function for an information

system IS is a boolean function om m boolean

variables (corresponding to the attributes

a1,a2,…,am) defined as follows.

Where .The set of all prime

implicants of determines the set of all reduct

of IS.

R Chatterjee, PLP, MIU ISI Kolkata

Page 26: Roughset & it’s variants

Examples of Discernibility Matrix

a b c d

u1 a0 b1 c1 Y

u2 a1 b1 c0 n

u3 a0 b2 c1 n

u4 a1 b1 c1 Y

In order to discern equivalence

classes of the decision attribute d,

to preserve conditions described

by the discernibility matrix for

this table

u1 u2 u3

u2

u3

u4

a,c

b

c a,b

C = {a, b, c}

D = {d}

Reduct = {b, c}

cb

bacbca

)()(

R Chatterjee, PLP, MIU ISI Kolkata

Page 27: Roughset & it’s variants

Dependency of Attributes

Set of attribute D depends totally on a set

of attributes C, denoted if all

values of attributes from D are uniquely

determined by values of attributes from C.

,DC

R Chatterjee, PLP, MIU ISI Kolkata

Page 28: Roughset & it’s variants

Dependency of Attributes

Let D and C be subsets of A. We will say

that D depends on C in a degree k

denoted by if

where called C-

positive region of D.

),10( k ,DC k

||

|)(|),(

U

DPOSDCk C

),()(/

XCDPOSDUX

C

R Chatterjee, PLP, MIU ISI Kolkata

Page 29: Roughset & it’s variants

Dependency of Attributes

Obviously

If k = 1 we say that D depends totally on

C.

If k < 1 we say that D depends partially

(in a degree k) on C.

.||

|)(|),(

/

DUX U

XCDCk

R Chatterjee, PLP, MIU ISI Kolkata

Page 30: Roughset & it’s variants

Variable Precision Rough Set

A generalized model of rough sets called

variable precision model (VPRS) aimed at

modeling classification problems involving

uncertain or imprecise information, is

presented by Wojceich Ziarko in 1993.

This extended rough set model able to

allow some degree of misclassification in

the largely correct classification.

R Chatterjee, PLP, MIU ISI Kolkata

Page 31: Roughset & it’s variants

Variable Precision Rough Set

c(X,Y) of the relative degree of misclassification of theset X with respect to set Y defined as

where card denotes set cardinality.

The quantity c(X,Y) will be referred to as the relativeclassification error.

The actual number of misclassified elements is givenby the product c(X,Y)*card(X) which is referred to asan absolute classification error.

R Chatterjee, PLP, MIU ISI Kolkata

Page 32: Roughset & it’s variants

-majority (VPRS)

is known as admissible classificationerror must be within the range .

More than *100 elements of X should becommon with Y. then it is called –majorityrelations.

Let X1={x1,x2,x3,x4}

X2={x1,x2,x5}

Y={x1,x2,x3,x8}

XY

XY 1

25.0 XY 2

33.0

R Chatterjee, PLP, MIU ISI Kolkata

Page 33: Roughset & it’s variants

Set Approximations in VPRS

Let A=(U,R) which consists of a non-

empty, finite universe U and of the

equivalence relation R on U. The

equivalence relation R, referred to as an

indiscernibility relation, corresponds to a

partitioning of the universe U into a

collection of equivalence classes or

elementary sets R*={E1,E2,…,En}

R Chatterjee, PLP, MIU ISI Kolkata

Page 34: Roughset & it’s variants

Lower & Upper Approximation (VPRS)

Lower approximation:

or, equivalently,

Upper approximation:

EXR

R

R

R Chatterjee, PLP, MIU ISI Kolkata

Page 35: Roughset & it’s variants

Boundary & Negative Region (VPRS)

Boundary region:

Negative region:

R Chatterjee, PLP, MIU ISI Kolkata

Page 36: Roughset & it’s variants

Theoretical aspect of Approximation

(VPRS)

The lower approximation of the set X canbe interpreted as the collection of allthose elements of U which can beclassified into X with the classificationerror not greater than b.

The negative region of the set X is thecollection of all those elements of Uwhich can be classified into thecomplement of X, -X with theclassification error not greater than b.

R Chatterjee, PLP, MIU ISI Kolkata

Page 37: Roughset & it’s variants

Theoretical aspect of Approximation

(VPRS)

The boundary region of the set X cannot

be classified either into X or –X with the

classification error not greater than b.

The upper approximation of the set X

includes all those elements of U which

cannot be classified into -X with the error

not greater than b.

R Chatterjee, PLP, MIU ISI Kolkata

Page 38: Roughset & it’s variants

Fuzzy-Rough Sets (FRS)

One particular use of RST is that of attribute reduction

in datasets. Given dataset with discretized attribute

values, it is possible to find a subset of the original

attributes that are the most informative (termed as

Reduct).

However, most often the case that the values of

attributes may be real-valued and cannot be handled by

traditional rough set.

Some discretization is possible which in turn gives you

loss of information.

To deal with vagueness and noisy data in the dataset,

Fuzzy Rough Set was introduced by Richard Jensen.

R Chatterjee, PLP, MIU ISI Kolkata

Page 39: Roughset & it’s variants

Fuzzification for Conditional Features

a b c q

1 -0.4 -0.3 -0.5 No

2 -0.4 0.2 -0.1 Yes

3 -0.3 -0.4 -0.3 No

4 0.3 -0.3 0 Yes

5 0.2 -0.3 0 Yes

6 0.2 0 0 no

Fuzzy-rough set is defined by two

fuzzy sets: fuzzy lower and upper

approximations, obtained by

extending the corresponding

crisp rough set notions.

In crisp case, elements that

belong to the lower

approximation (i.e., have

membership of 1) are said to

belong to the approximated set

with absolute certainty. In fuzzy-

rough case, elements may have a

membership in the range [0,1],

allowing greater flexibility in

handling uncertainty.

R Chatterjee, PLP, MIU ISI Kolkata

Page 40: Roughset & it’s variants

Membership values from MFs of

linguistic labels

a b c q

Na Za Nb Zb Nc Zc {1,3,6} {2,4,5}

1 0.8 0.2 0.6 0.4 1.0 0.0 1.0 0.0

2 0.8 0.2 0.0 0.6 0.2 0.8 0.0 1.0

3 0.6 0.4 0.8 0.2 0.6 0.4 1.0 0.0

4 0.0 0.4 0.6 0.4 0.0 1.0 0.0 1.0

5 0.0 0.6 0.6 0.4 0.0 1.0 0.0 1.0

6 0.0 0.6 0.0 1.0 0.0 1.0 1.0 0.0

R Chatterjee, PLP, MIU ISI Kolkata

Page 41: Roughset & it’s variants

Lower & Upper Approximations (FRS)

Lower approximation:

Upper approximation:

sup/ PUF

infUy

supUy

sup/ PUF

P

P

R Chatterjee, PLP, MIU ISI Kolkata

Page 42: Roughset & it’s variants

Positive Region & Dependency Measure (FRS)

The membership of an object , belonging to the fuzzy

positive region can be defined by

Using the definition of the fuzzy positive region, the new

dependency function can be defined as follows:

P

sup/ QUX

R Chatterjee, PLP, MIU ISI Kolkata

Page 43: Roughset & it’s variants

Observations

Evaluation of importance of particular attributes andelimination of redundant attributes from the decisiontable.

Construction of a minimal subset of independentattributes ensuring the same quality of classification asthe whole set, i.e. reducts of the set of attributes.

Intersection of these reducts giving a core of attributes,which cannot be eliminated without disturbing theability of approximating the classification and

Generation of logical rules from the reduced decisiontable.

R Chatterjee, PLP, MIU ISI Kolkata