ROUGH SETS AND CHALLENGES IN DATA MINING
Andrzej Skowron, Warsaw University
RSCTC 2004, Uppsala, June 2, 2004
AGENDA
• Examples of domains and problems
• Complex phenomena modeling
• General issues
• Rough sets in complex phenomena modeling
AGENDA (continued)
• Hierarchical (layered) learning of complex concepts
• Learning from sparse data
• Complex decision and condition values
• Learning representation of concurrent processes
• Learning of cooperation protocols
BIOINFORMATICS
• Microarray data analysis
• Microarray data analysis extended by knowledge bases
• Modeling of gene-protein networks
• The protein folding problem
• Membrane computing
EXAMPLES
– UAV
– SOJOURNER, MARS ROVERS
– ROBO-CUP
– WEB MINING
– OBSTACLE AVOIDANCE
– collaborative systems for logistic problem solving
– robo-cup-rescue simulation league
– drugs or new material discovery
– sensor fusion in robotics
–...
WITAS PROJECT
www.ida.liu.se/ext/witas/
CONTROL OF AUV
[Figure: universe U of possible situations, with the dangerous situations and the training cases marked]
IDENTIFICATION
[Figure: sensors y1, y2, y3; perception term t; granule defined by t; inclusion degree testing]
WEB MINING
[Figure labels: Web; user query; documents relevant for the user; query to the Web close to the user query; knowledge base (KB); inference engine]
Gell-Mann, a Nobel Prize-winning theoretical physicist and a pioneer in the science of complexity, here examines that important concept, focusing on complex adaptive systems. Such systems are capable of learning and are able to adapt or evolve successfully. The intricate processes used by a child to learn a language, for example, constitute a complex adaptive system, as do the processes used by bacteria to develop resistance to drugs.
MODELING ISSUES
[Figure: a crisp global model R contrasted with a soft network of interacting local models M1, M2, M3, ..., Mi, ...]
THE MATHEMATICS OF LEARNING: DEALING WITH DATA
T. Poggio, S. Smale, Notices AMS, Vol. 50, May 2003
• The problem of understanding intelligence is said to be the greatest problem in science today and "the" problem for this century, as deciphering the genetic code was for the second half of the last one.
• Arguably, the problem of learning represents a gateway to understanding intelligence in brains and machines, to discovering how the human brain works and to making intelligent machines that learn from experience and improve their competence...
Information granulation involves partitioning a class of objects (points) into granules, with a granule being a clump of objects (points) which are drawn together by indistinguishability, similarity or functionality.
L. A. Zadeh
• computing with words
• from measurements to perception
• toward computational theory of perception
INFORMATION GRANULATION
• Information granularity is a concomitant of the bounded ability of sensory organs, and ultimately the brain, to resolve detail and store information.
• Human perceptions are, for the most part, intrinsically imprecise.
• Boundaries of perceived classes are fuzzy.
• The values of perceived attributes are granular.
• Information granulation may be viewed as a human way of achieving data compression.
• It plays a key role in implementation of the strategy of divide-and-conquer in human problem-solving.
COMPUTING WITH WORDS AND PERCEPTIONS: principal rationales (L. Zadeh)
• Available information is not precise enough to justify the use of numbers.
• There is a tolerance for imprecision which can be exploited to achieve tractability, robustness and low solution cost.
• The expressive power of words is higher than the expressive power of numbers.
VAGUE (SOFT) CONCEPT MODELING
• Fuzzy Sets (L. Zadeh, 1965)
• Rough Sets (Z. Pawlak, 1982)
• Recently (last 8-10 years):
  • Rough Mereology
  • Granular Computing
  • Rough-Neural Computing
  • Hierarchical Learning (Layered Learning)
  • Computing with Words and Perceptions
ROUGH SETS AND INFORMATION GRANULATION
• Modeling and discovery of approximation spaces: from basic to networks of approximation spaces
  – Basic case
  – Tolerance approximation spaces
  – Approximation spaces and induction
  – Rough inclusion and rough mereology
  – Networks of approximation spaces
• Hierarchical classifiers
• Modeling of spatio-temporal reasoning by Approximate Reasoning Networks
• Rough sets and concurrent systems
• Rough sets and complex condition and decision values
• Rough-neural computing
MAIN SCHEME IN MULTIAGENT FRAMEWORK
[Figure: DIALOG between agents ag1 and ag2; reasoning in L1 by ag1, reasoning in L2 by ag2]
Rough set view on the DIALOG results for ag1: knowledge about concepts of ag2 making it possible to approximate these concepts by ag1
A. Skowron: Approximate reasoning in distributed environments by agents, in: J. Liu and N. Zhong (eds.), Intelligent Technologies for Information Analysis, Springer, 2004.
BASIC CASE
Approximations based on concept description by means of decision tables
Z. Pawlak, Rough Sets, Int. J. Computer Information Sci. 11, 1982, 341-356.
DECISION SYSTEMS
$T = (U, A, d)$, $d \notin A$
      Age    LEMS   Walk
x1    16-30  50     yes
x2    16-30  0      no
x3    31-45  1-25   no
x4    31-45  1-25   yes
x5    46-60  26-49  no
x6    16-30  26-49  yes
x7    46-60  26-49  no
A = {Age, LEMS}: condition attributes
d = Walk: decision attribute, $d : U \to V_d$
inconsistency: x3 and x4 agree on Age and LEMS but differ on Walk
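The inconsistency marked on this slide can be checked mechanically. The following is a minimal sketch, assuming the table is stored as a Python dictionary; the names `TABLE`, `CONDITIONS`, `DECISION` and `inconsistent_groups` are illustrative and do not come from the talk.

```python
from collections import defaultdict

# Decision table from the slide: condition attributes Age and LEMS, decision Walk.
TABLE = {
    "x1": {"Age": "16-30", "LEMS": "50", "Walk": "yes"},
    "x2": {"Age": "16-30", "LEMS": "0", "Walk": "no"},
    "x3": {"Age": "31-45", "LEMS": "1-25", "Walk": "no"},
    "x4": {"Age": "31-45", "LEMS": "1-25", "Walk": "yes"},
    "x5": {"Age": "46-60", "LEMS": "26-49", "Walk": "no"},
    "x6": {"Age": "16-30", "LEMS": "26-49", "Walk": "yes"},
    "x7": {"Age": "46-60", "LEMS": "26-49", "Walk": "no"},
}
CONDITIONS = ("Age", "LEMS")
DECISION = "Walk"


def inconsistent_groups(table, conditions, decision):
    """Return groups of objects with equal condition values but different decisions."""
    groups = defaultdict(list)
    for name, row in table.items():
        groups[tuple(row[a] for a in conditions)].append(name)
    return [
        sorted(objs)
        for objs in groups.values()
        if len({table[o][decision] for o in objs}) > 1
    ]


print(inconsistent_groups(TABLE, CONDITIONS, DECISION))
```

Objects x5 and x7 also share their condition values, but they carry the same decision, so they do not make the table inconsistent.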
INDISCERNIBILITY
$IS = (U, A)$, $B \subseteq A$
Information about $x$: $Inf_B(x) = \{(a, a(x)) : a \in B\}$
Two types of indiscernibility:
Equivalence: $x\ IND(B)\ y$ iff $Inf_B(x) = Inf_B(y)$
Tolerance (similarity): $x\ IND_{\tau}(B)\ y$ iff $Inf_B(x)\ \tau\ Inf_B(y)$
[Figure: universe U partitioned into U/B for a subset B of attributes, with a set X and its approximations $\underline{B}X$ and $\overline{B}X$]
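As an illustration of the equivalence case, the sketch below computes the information signature $Inf_B(x)$ and the partition U/B for the Age/LEMS table of the DECISION SYSTEMS slide. The function names `inf` and `partition` are assumptions made for this example only.

```python
from collections import defaultdict

# Condition part of the table from the DECISION SYSTEMS slide.
ROWS = {
    "x1": {"Age": "16-30", "LEMS": "50"},
    "x2": {"Age": "16-30", "LEMS": "0"},
    "x3": {"Age": "31-45", "LEMS": "1-25"},
    "x4": {"Age": "31-45", "LEMS": "1-25"},
    "x5": {"Age": "46-60", "LEMS": "26-49"},
    "x6": {"Age": "16-30", "LEMS": "26-49"},
    "x7": {"Age": "46-60", "LEMS": "26-49"},
}


def inf(x, B, rows):
    """Information signature Inf_B(x) = {(a, a(x)) : a in B}."""
    return frozenset((a, rows[x][a]) for a in B)


def partition(B, rows):
    """Equivalence classes of IND(B): objects with equal signatures share a class."""
    classes = defaultdict(set)
    for x in rows:
        classes[inf(x, B, rows)].add(x)
    return list(classes.values())


for B in (("Age", "LEMS"), ("Age",)):
    print(B, sorted(sorted(c) for c in partition(B, ROWS)))
```

Dropping LEMS from B merges x1, x2 and x6 into one class, which shows how a smaller attribute set yields coarser granules.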
LOWER & UPPER APPROXIMATIONS
$\underline{B}X = \bigcup \{Y \in U/B : Y \subseteq X\}$
$\overline{B}X = \bigcup \{Y \in U/B : Y \cap X \neq \emptyset\}$
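A small worked instance of the two formulas, assuming B = {Age, LEMS} and X = the objects with Walk = yes in the table of the DECISION SYSTEMS slide. The partition is written out by hand, and the helper names `lower` and `upper` are illustrative.

```python
# U/B for B = {Age, LEMS}, taken from the table on the DECISION SYSTEMS slide.
U_B = [{"x1"}, {"x2"}, {"x3", "x4"}, {"x5", "x7"}, {"x6"}]
# X = objects with Walk = yes.
X = {"x1", "x4", "x6"}


def lower(classes, X):
    """Union of the classes entirely contained in X."""
    return set().union(*[Y for Y in classes if Y <= X])


def upper(classes, X):
    """Union of the classes that intersect X."""
    return set().union(*[Y for Y in classes if Y & X])


print(sorted(lower(U_B, X)))
print(sorted(upper(U_B, X)))
print(sorted(upper(U_B, X) - lower(U_B, X)))  # boundary region
```

The boundary region consists of exactly the inconsistent pair x3, x4: their class meets X without being contained in it.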
INDISCERNIBILITY FUNCTION
$x \mapsto u = Inf_A(x)$: information signature of $x$
$N(x) = (Inf_A)^{-1}(u)$: neighborhood of $x$
$x\ IND(A)\ y$ iff $Inf_A(x) = Inf_A(y)$
SETS, ROUGH SETS AND FUZZY SETS
• Characteristic function $\mu_X$ of a set $X \subseteq U$
$\mu_X : U \to \{0, 1\}$, $\mu_X(x) = \begin{cases} 1 & \text{if } x \in X \\ 0 & \text{otherwise} \end{cases}$
• Rough membership function $\mu_X^B$ of a set $X \subseteq U$
$\mu_X^B = 1$ on $\underline{B}X$
$0 < \mu_X^B < 1$ on $\overline{B}X - \underline{B}X$
$\mu_X^B = 0$ on $U - \overline{B}X$
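The slide shows only the three regions. The sketch below assumes the standard definition $\mu_X^B(x) = |[x]_B \cap X| / |[x]_B|$, which is not written on the slide, and evaluates it on the partition and the set X = {Walk = yes} from the DECISION SYSTEMS table. The function name `rough_membership` is illustrative.

```python
from fractions import Fraction

# U/B for B = {Age, LEMS} and X = {Walk = yes}, from the DECISION SYSTEMS table.
U_B = [{"x1"}, {"x2"}, {"x3", "x4"}, {"x5", "x7"}, {"x6"}]
X = {"x1", "x4", "x6"}


def rough_membership(x, classes, X):
    """Assumed standard definition: |[x]_B & X| / |[x]_B|."""
    block = next(Y for Y in classes if x in Y)  # the class [x]_B
    return Fraction(len(block & X), len(block))


for x in ("x1", "x3", "x2"):
    print(x, rough_membership(x, U_B, X))
```

Under this definition x1 lies in the lower approximation (value 1), x3 in the boundary (value strictly between 0 and 1), and x2 outside the upper approximation (value 0), matching the three regions listed above.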
SETS, ROUGH SETS AND FUZZY SETS (continued)
• Fuzzy sets (L. Zadeh, 1965)
$\mu_X : U \to [0, 1]$
• $\mu_X(x)$: degree of membership of $x$ in $X$
[Figure: membership function plotted against age, with marks at membership 1 and age 25]
GENERALIZED APPROXIMATION SPACES
A. Skowron, J. Stepaniuk, Generalized Approximation Spaces, in: T.Y. Lin and A.M. Wildberger (eds.), Soft Computing, Simulation Councils, Inc., San Diego, 18-21, 1994
$AS = (U, N, \nu)$
$N : U \to P(U)$: neighborhood function
$\nu : P(U) \times P(U) \to [0, 1]$: rough inclusion (partial function)
$x \to Inf(x) \to N(x) = Inf^{-1}(Inf(x))$: neighborhood of $x$
APPROXIMATION SPACE
$AS = (U, N, \nu)$
$LOW(AS, X) = \{x \in U : \nu(N(x), X) = 1\}$
$UPP(AS, X) = \{x \in U : \nu(N(x), X) > 0\}$
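EXAMPLE OF ROUGH INCLUSION
$\nu_{st}(X, Y) = \begin{cases} \frac{|X \cap Y|}{|X|} & \text{if } X \neq \emptyset \\ 1 & \text{if } X = \emptyset \end{cases}$
For $C \subseteq U$, $x \in U$:
$\nu_{st}(N(x), C) = 1$ iff $N(x) \subseteq C$
$\nu_{st}(N(x), C) = 0$ iff $N(x) \cap C = \emptyset$
$\nu_{st}(N(x), C) \neq 0$ iff $N(x) \cap C \neq \emptyset$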
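The definitions of $LOW$, $UPP$ and $\nu_{st}$ can be tried on a toy approximation space. The sketch below assumes U = {0, ..., 7} and a tolerance neighborhood N(x) = {y : |x - y| <= 1}; both are made up for illustration and do not come from the talk.

```python
from fractions import Fraction

U = set(range(8))  # hypothetical universe


def N(x):
    """Hypothetical tolerance neighborhood: points at distance at most 1."""
    return {y for y in U if abs(x - y) <= 1}


def nu_st(X, Y):
    """Standard rough inclusion: |X & Y| / |X|, and 1 when X is empty."""
    if not X:
        return Fraction(1)
    return Fraction(len(X & Y), len(X))


def low(U, N, nu, X):
    """LOW(AS, X): objects whose neighborhood is fully included in X."""
    return {x for x in U if nu(N(x), X) == 1}


def upp(U, N, nu, X):
    """UPP(AS, X): objects whose neighborhood is included in X to a positive degree."""
    return {x for x in U if nu(N(x), X) > 0}


X = {2, 3, 4, 5}
print(sorted(low(U, N, nu_st, X)))
print(sorted(upp(U, N, nu_st, X)))
```

Because the neighborhoods overlap instead of forming a partition, the lower approximation {3, 4} is strictly smaller than X and the upper approximation {1, ..., 6} strictly larger, which is the typical behavior of tolerance approximation spaces.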
ROUGH MEREOLOGY
MEREOLOGY
St. LEŚNIEWSKI (1916)
x is_a_part_of y
ROUGH MEREOLOGY
L. Polkowski and A. Skowron (1994-...)
x is_a_part_of y in a degree
L. Polkowski, A. Skowron, Rough mereology, ISMIS'94, LNAI 869, Springer, 1994, 85-94
ROUGH SETS AND INDUCTIVE REASONING
Synak and Skowron, RSCTC 2004
INDUCTIVE REASONING
[Figure: universe U*, sample U inside U*, and concept C; information about C is available only on the subset U (sample) of U*]
How can we estimate the inclusion of a neighborhood into C if we do not know C outside U?
NEW FAMILIES OF NEIGHBORHOODS AND ROUGH INCLUSION VALUES ESTIMATION
1. Find a family of patterns for C, i.e., patterns included to a high degree in C on U, and another family of patterns against C, i.e., patterns included to a high degree in the complement of C, that with "a high chance" also have this property on U*.
2. Compute the degrees of inclusion of P into such patterns.
3. Resolve conflicts.
[Figure: a pattern P in relation to C, on U and on U*]
ESTIMATION OF ROUGH INCLUSION VALUES
Problem
Input: $\alpha$, $\beta$, $\nu_U(\alpha_U, \beta_U)$, some properties of $\alpha_{U^*}$, $\beta_{U^*}$
Output: $\nu_{U^*}(\alpha_{U^*}, \beta_{U^*})$
$\alpha \to \alpha_U \subseteq \alpha_{U^*}$
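CLASSIFIERS
$\alpha_1 \to d = 1$, $\alpha_2 \to d = 1$, $\alpha_3 \to d = 1$
$\beta_1 \to d = 2$, $\beta_2 \to d = 2$, $\beta_3 \to d = 2$, $\beta_4 \to d = 2$
$\gamma_1 \to d = 3$, $\gamma_2 \to d = 3$
$G_1 = (\alpha_1, \alpha_2, \alpha_3)$, $G_2 = (\beta_1, \beta_2, \beta_3, \beta_4)$, $G_3 = (\gamma_1, \gamma_2)$
[Figure: input granule $e$ passes through Match, giving the matching granule $((\varepsilon_1, \varepsilon_2, \varepsilon_3), (\varepsilon_4, \varepsilon_5, \varepsilon_6, \varepsilon_7), (\varepsilon_8, \varepsilon_9))$, and then through Conflict_res]
$Conflict\_res(Match(e, G_1, G_2, G_3))$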
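The composition Conflict_res(Match(e, G1, G2, G3)) can be sketched as follows. The slide does not say how matching degrees are computed or how conflicts are resolved, so everything specific here is an assumption: patterns are Boolean predicates, each degree is 0 or 1, and the conflict is resolved by the largest mean degree per rule family. The patterns themselves are hypothetical.

```python
def match(e, *granules):
    """Matching granule: one tuple of degrees per rule family (assumed 0/1 degrees)."""
    return tuple(
        tuple(1.0 if pattern(e) else 0.0 for pattern in G) for G in granules
    )


def conflict_res(matching, decisions):
    """Assumed strategy: pick the decision whose family has the largest mean degree."""
    scores = [sum(eps) / len(eps) for eps in matching]
    best = max(range(len(scores)), key=scores.__getitem__)
    return decisions[best]


# Hypothetical patterns: G1 supports d = 1, G2 supports d = 2, G3 supports d = 3.
G1 = (lambda e: e["lems"] >= 50, lambda e: e["age"] < 31, lambda e: e["lems"] > 25)
G2 = (lambda e: e["lems"] == 0, lambda e: e["age"] > 45,
      lambda e: e["lems"] < 26, lambda e: e["age"] > 30)
G3 = (lambda e: e["age"] < 31, lambda e: e["lems"] < 10)

e = {"age": 20, "lems": 50}  # input granule
matching = match(e, G1, G2, G3)
print(matching)
print(conflict_res(matching, decisions=(1, 2, 3)))
```

The matching granule has the same 3 + 4 + 2 shape as the one on the slide. Other strategies, such as weighting rules by their support, fit the same two-stage interface.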
SECOND APPROACH: k-nn
Estimate values of $\nu$ on neighborhoods $I(x)$, $I(y)$ for $x \in U^*$, $y \in U$: $\nu(I(x), I(y))$
$cl(X, Y) = \min(\nu(X, Y), \nu(Y, X))$
For $x \in U^*$, find $k$ neighborhoods with centers from $U$ closest to $I(x)$
Resolve conflict
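A toy version of the k-nn scheme, under several assumptions that are not in the talk: U* is the set {0, ..., 11}, neighborhoods are intervals of radius 2, $\nu$ is the standard rough inclusion, and conflicts are resolved by majority vote. In a real setting $\nu$ on U* is unknown and has to be estimated; the sketch treats U* as known only to stay runnable.

```python
from collections import Counter
from fractions import Fraction

U_STAR = set(range(12))  # hypothetical extended universe U*
# Hypothetical sample U with known membership in concept C.
SAMPLE = {1: "C", 2: "C", 3: "C", 8: "not C", 9: "not C", 10: "not C"}


def I(x, radius=2):
    """Assumed neighborhood: points of U* within the given radius of x."""
    return {y for y in U_STAR if abs(x - y) <= radius}


def nu(X, Y):
    """Standard rough inclusion, used here in place of an estimated one."""
    return Fraction(len(X & Y), len(X)) if X else Fraction(1)


def cl(X, Y):
    """Closeness cl(X, Y) = min(nu(X, Y), nu(Y, X))."""
    return min(nu(X, Y), nu(Y, X))


def classify(x, k=3):
    """Take the k sample centers closest to I(x), then resolve by majority vote."""
    centers = sorted(SAMPLE, key=lambda y: cl(I(x), I(y)), reverse=True)[:k]
    votes = Counter(SAMPLE[y] for y in centers)
    return votes.most_common(1)[0][0]


print(cl(I(4), I(3)))
print(classify(4), classify(7))
```

Taking the minimum of the two inclusion degrees makes closeness symmetric, so a small neighborhood fully contained in a large one is not counted as close to it.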
NETWORKS OF APPROXIMATION SPACES
• Hierarchical classifiers
• Approximate Reasoning Networks (ARN)
• Concurrent Systems of Local Models
LAYERED LEARNING
a new learning approach for teams of autonomous agents acting in real-time, noisy, collaborative and adversarial environments
Given a hierarchical task decomposition, layered learning allows for learning at every level of the hierarchy, with learning at each level directly affecting learning at the next higher level.
"Layered learning in multiagent systems: A winning approach to robotic soccer", P. Stone, 2000
NEUROSCIENCE
T. Poggio, S. Smale, Notices AMS, Vol. 50, May 2003
• Organization of cortex, for instance visual cortex, is strongly hierarchical.
• Hierarchical learning systems show superior performance in several engineering applications.
• This is just one of several possible connections, still to be characterized, between learning theory and the ultimate problem in natural science: the organization and the principles of higher brain functions.
ROUGH SETS AND INFORMATION GRANULATION: TOWARDS APPROXIMATE REASONING IN DISTRIBUTED SYSTEMS
KNOWLEDGE BASES FOR SPATIO-TEMPORAL REASONING CONSTRUCTED FROM EXPERIMENTAL DATA AND DOMAIN KNOWLEDGE
KNOWLEDGE BASES CONSTRUCTED FROM PATTERNS AND THEIR PROPERTIES USED FOR APPROXIMATE REASONING
APPROXIMATE REASONING SCHEMES (AR-SCHEMES)
TOWARD HIERARCHICAL LEARNING AND PERCEPTION SCHEMES
GRAMMAR SYSTEMS (GS)
• PARAMETERIZED PRODUCTIONS THROUGH LOCAL DECOMPOSITION
• AR-SCHEMES: DERIVATIONS
See papers on rough mereology and granular computing (L. Polkowski, A.