baigiang csdl phantan final
TRANSCRIPT
-
8/4/2019 Baigiang Csdl Phantan Final
1/193
HC VIN CNG NGH BU CHNH VIN THNG
TP. H CH MINH 2010
BI GING
C S D LIU PHN TN
(Dng cho sinh vin h chnh qui)
LU HNH NI B
Bin son: TS. TN HNH
-
8/4/2019 Baigiang Csdl Phantan Final
2/193
MC LC
C s d liu phn tn - 2010
MC LC
Chng 1 I CNG V C S D LIU PHN TN 1
1.1 Gii thiu......................................................................................................... 1
1.2 nh ngha v c s d liu phn tn ............................................................... 3
1.3 Cc im c trng ca c s d liu phn tn so vi c s d liu tp trung.. 3
1.4 Ti sao cn c c s d liu phn tn? ............................................................. 5
1.4.1 L do t chc v kinh t........................................................................................ 5
1.4.2 L do kt ni cc c s d liu hin c ................................................................. 5
1.4.3 L do tng trng t chc..................................................................................... 5
1.1.4 L do ti truyn thng........................................................................................... 6
1.4.5 nh gi v hiu sut ............................................................................................ 6
1.4.6 tin cy v tnh hiu qu.................................................................................... 6
1.4.7 So snh u v nhc im ca vic phn tn d liu ............................................ 6
Chng 2 KIN TRC H QUN TR C S D LIU PHN TN 8
2.1 Kin trc tham kho cho c s d liu phn tn............................................... 82.2 Cc thnh phn ca h qun tr c s d liu phn tn .................................. 12
2.3 Kin trc ca h qun tr c s d liu phn tn............................................ 15
2.3.1 Cc h khch/ch (Client/Server)........................................................................ 15
2.3.2 H qun tr c s d liu phn tn....................................................................... 15
2.3.3 Kin trc ca h qun tr c s d liu phn tn.................................................. 17
CHNG 3 THIT K C S D LIU PHN TN 20
3.1 Cc vn v thit k c s d liu phn tn................................................. 21
3.1.1 Cc l do phn mnh........................................................................................... 21
3.1.2 Cc kiu phn mnh............................................................................................ 22
3.2 Thit k phn mnh........................................................................................ 27
-
8/4/2019 Baigiang Csdl Phantan Final
3/193
MC LC
C s d liu phn tn - 2010
3.2.1 Cc mc tiu ca vic thit k phn tn d liu................................................... 29
3.2.2 Cc tip cn thit k s phn tn d liu......................................................... 30
3.2.3 Thit k s phn mnh d liu ............................................................................ 31
3.3 S cp pht cc phn mnh ............................................................................ 613.3.1 Bi ton cp pht................................................................................................. 61
3.3.2 Yu cu v thng tin ........................................................................................... 61
3.3.3. M hnh cp pht .............................................................................................. 63
Chng 4 S TRONG SUT PHN TN 67
4.1 S trong sut phn tn ca ng dng ch c................................................. 67
4.2 S trong sut phn tn i vi cc ng dng cp nht.................................... 73
4.3 Cc nguyn tc truy xut c s d liu phn tn ............................................ 79
Chng 5 TI U HA TRUY VN PHN TN 83
5.1. Biu thc chun tc ca truy vn................................................................... 84
5.1.1. Truy vn............................................................................................................. 84
5.1.2. Biu thc chun tc ca truy vn........................................................................ 85
5.2. Ti u ha truy vn trong c s d liu tp trung.......................................... 86
5.2.1. Bc 1- Kim tra ng php (syntax Checking) .................................................. 86
5.2.2. Bc 2- Kim tra s hp l (Validation) ............................................................ 87
5.2.3. Bc 3 Dch truy vn (Translation)................................................................. 88
5.2.4. Bc 4- Ti u ha biu thc i s quan h (relational AlgebraOptimization)............................................................................................................... 89
5.2.5. Bc 5- Chn la chin lc truy xut (strategy selection)................................ 90
5.2.6. Bc 6- To sinh m (code Generation)............................................................. 90
5.3. Ti u ha truy vn trong c s d liu phn tn .......................................... 90
5.3.1. Bc 1- Phn r truy vn (Query Decomposition).............................................. 92
5.3.2. Bc 2 Cc b ha d liu............................................................................... 111
5.3.3 Bc 3 Ti u ho truy vn ton cc................................................................. 118
-
8/4/2019 Baigiang Csdl Phantan Final
4/193
MC LC
C s d liu phn tn - 2010
5.3.4 Bc 4 Ti u ho truy vn cc b ................................................................... 120
Chng 6 GIAO TC PHN TN 121
6.1 nh ngha giao tc .................................................................................... 123
6.1.1 Tnh hung kt thc giao tc ............................................................................. 1256.1.2 c trng ho cc giao tc .............................................................................. 126
6.1.3. Hnh thc ho khi nim giao tc..................................................................... 127
6.2 Cc tnh cht ca giao tc............................................................................. 131
6.2.1 Tnh nguyn t.................................................................................................. 131
6.2.2 Tnh nht qun .................................................................................................. 131
6.2.3 Tnh bit lp...................................................................................................... 132
6.2.4 Tnh bn vng................................................................................................... 136
6.3 Cc loi giao tc........................................................................................... 136
6.3.1 Giao tc phng .................................................................................................. 136
6.3.2 Giao tc lng..................................................................................................... 136
6.4 iu khin ng thi phn tn .................................................................... 137
6.4.1 L thuyt kh tun t ........................................................................................ 137
6.4.2 Phn loi cc c ch iu khin ng thi......................................................... 144
6.4.3 Cc thut ton iu khin ng thi bng kha cht ......................................... 146
6.4.4 Nghi thc 2PL tp quyn.................................................................................. 158
6.4.5 Thut ton 2PL bn chnh ................................................................................. 164
6.4.6 Thut ton 2PL phn quyn............................................................................... 164
CHNG 7 CC H C S D LIU 166
7.1 C s d liu song song ............................................................................... 1667.1.1 Gii thiu.......................................................................................................... 166
7.1.2 Kin trc h c s d liu song song................................................................ 167
7.1.3 Li ch ca h c s d liu song song.............................................................. 168
7.2 H c s d liu mobile ............................................................................... 168
-
8/4/2019 Baigiang Csdl Phantan Final
5/193
MC LC
C s d liu phn tn - 2010
7.2.1 Gii thiu.......................................................................................................... 168
7.2.2 nh ngha h c s d liu mobile................................................................... 169
7.2.3 Cc kiu di ng ............................................................................................... 170
7.2.4 Kin trc h c s d liu mobile...................................................................... 171TI LIU THAM KHO 176
MT S THI THAM KHO 177
S 1 ............................................................................................................ 177
S 2 ............................................................................................................ 179
S 3 ............................................................................................................ 181
S 4 ............................................................................................................ 182
TI.............................................................................................................. 184
-
8/4/2019 Baigiang Csdl Phantan Final
6/193
MC LC
C s d liu phn tn - 2010
MC LC HNH
Hnh 1.1 C s d liu phn tn ca ngn hng c ba chi nhnh
Hnh 1.2 Mi lin h gia mng my tnh, c s d liu phn tn v ngdng phn tn
Hnh 2.1 Kin trc tham kho cho mt c s d liu
Hnh 2.2 Cc phn mnh v cc nh vt l i vi mt quan h ton cc
Hnh 2.3 Cc thnh phn ca h qun tr c s d liu
Hnh 2.4 Cc kiu truy xut n c s d liu phn tn
Hnh 2.5 Kin trc my khch/ch
Hnh 2.6 Kin trc tham kho c s d liu phn tn
Hnh 2.7 Kin trc ca h qun tr c s d liu phn tn
Hnh 3.1 Cy phn mnh ca quan h EMP
Hnh 3.2 S phn mnh ca quan h DEPT
Hnh 3.3 Cc th kt ni
Hnh 3.4 Ma trn s dng
Hnh 3.5 Ma trn i lc
Hnh 3.6 Gom nhm cc thuc tnh
Hnh 3.7 ma trn thuc tnh tHnh 3.8 S phn mnh hn hp ca quan h R(A1, A2, A3, A4, A5)
Hnh 4.1a S trong sut phn tn
Hnh 4.1b S trong sut v tr
Hnh 4.1c S trong sut nh x cc b
Hnh 4.2 Mt ng dng trn c s d liu phn tn khng ng nht v khngtrong sut
Hnh 4.3 Cy con cp nht ca thuc tnh DEPTNUM trong cy phn mnhca quan h EMP
Hnh 4.4a Cy phn mnh khc ca quan h EMP
Hnh 4.4b H qu ca vic cp nht DEPTNUM ca EMPNUM=100
Hnh 5.1 S ti u ha truy vn trong c s d liu tp trung
Hnh 5.2 S ti u ha truy vn trong c s d liu phn tn
-
8/4/2019 Baigiang Csdl Phantan Final
7/193
MC LC
C s d liu phn tn - 2010
Hnh 5.3 S ti u ha truy vn trong c s d liu phn tn
Hnh 6.1 M hnh giao tc
Hnh 6.2. Biu din dng DAG cho mt giao tc
Hnh 6.3 Biu din DAG ca mt lch y
Hnh 6.4 Mt lch y
Hnh 6.5 Tin t ca lch y ca hnh 6.4
Hnh 6.6 Phn loi cc thut ton iu khin ng thi
Hnh 6.7 Ma trn tng thch ca cc th thc kha
Hnh 6.8 Cc nh ngha chun b cho cc thut ton sp ti
Hnh 6.9 Biu kha 2PL
Hnh 6.10 Biu kha hai pha nghim ngt
Hnh 6.11 Bng tng thch c th thc kha dng chung c th tHnh 6.12. Cu trc truyn giao ca 2PL tp quyn
Hnh 6.13 Cu trc truyn giao ca 2PL phn quyn
Hnh 7.1 Kin trc chia s b nh ca h c s d liu song song
Hnh 7.2 Kin trc chia s a cng ca h c s d liu song song
Hnh 7.3 Kin trc khng chia s ti nguyn ca h c s d liu song song
Hnh 7.4 Khng gian thng tin c kt ni y
Hnh 7.5 Tnh di ng c nhn
Hnh 7.6 Kin trc ca h c s d liu mobile
Hnh 7.7 Cc kiu nhn bn
-
8/4/2019 Baigiang Csdl Phantan Final
8/193
Chng 1 i cng v c s d liu phn tn 1
C s d liu phn tn - 2010
Chng 1 I CNG V C S D LIU PHN TN
MC TIUChng ny trnh by cc khi nim ca c s d liu phn tn:nh
ngha c s d liu phn tn, l do cn c c s d liu phn tn, cctnh cht c trng ca n. T so snh c s d liu phn tn vi
c s d liu tp trung qua rt ra nhng l do pht trin mt h
thng da trn c s d liu phn tn. Gii thiu v c s d liu
phn tn
1.1 Gii thiu
Trong nhng nm gn y, c s d liu phn tn tr thnh mt lnh vc x l
thng tin quan trng v chng ta d dng nhn ra tm quan trng ca n ngy cngln mnh. Chng ta c l do v t chc cng nh v k thut pht trin theo xu
hng ny: c s d liu phn tn khc phc c mt s hn ch ca c s d liu
tp trung nh qu ti server, nghn c chai khi truy xut, tnh sn sang/ tin cy
v kh nng chu li thp. Hn na c s d liu phn tn ph hp hn vi cc t
chc d liu phi tp trung cng nh vi cc ng dng phn tn.
Chng ta c th xem c s d liu phn tn l mt tp hp d liu (c s d liu)
ca mt h thng thng tin nhng c phn b trn nhiu a im (site)ca mt
mng my tnh (intranet). Khi nim ny nhn mnh n hai kha cnh quan trng
ca c s d liu phn tn l :
1. S phn tn: d liu khng lu tr trn cng mt a im v th chng ta
c th phn bit n vi c s d liu tp trung.
2. Mi tng quan lun l (logical correlation): Cc d liu c mt s thuc
tnh rng buc vi nhau t cc c s d liu cc b m c lu tr ti cc aim khc nhau trn mng.
V d : Xt mt ngn hng c ba chi nhnh nm ba ni khc nhau (hnh 1.1). Ti
mi nhnh, mt h thng my tnh iu khin cc trm thu hay rt tin v qun l
c s d liu v ti khon. Mi h thng ny c c s d liu ti khon cc b to
-
8/4/2019 Baigiang Csdl Phantan Final
9/193
Chng 1 i cng v c s d liu phn tn 2
C s d liu phn tn - 2010
thnh mt site ca c s d liu phn tn. Cc h thng my tnh ny c kt ni
bi mt mng truyn thng. Vi nhng hot ng thng thng, cc yu cu t cc
trm ch cn truy xut n c s d liu ti chi nhnh ca chng. V th ng dng
ny c gi l ng dng cc b.
V d trn lm ny sinh hai cu hi sau:
1) Mi chi nhnh ch lu tr c s d liu cc b c p ng cc ng dng
cha?
2) C s d liu phn tn c phi l mt tp hp cc c s d liu cc b?
tr li cc cu hi ny chng ta tm hiu xem vic x l trn c s d liu cc
b khc g trn c s d liu phn tn. V mt k thut, chng ta thy cn c cc
ng dng m truy xut d liu t nhiu nhnh. Cc ng dng ny c gi l
ng dng ton cc hay ng dng phn tn.
Mt ng dng ton cc thng thng trong v d trn l vic chuyn tin t mt ti
khon ny n ti khon khc. ng dng ny yu cu cp nht c s d liu c
hai nhnh.
DB1
T
T
T
DB2
T
T
T
DB3
T
T
T
CommunicationNetwork
Hnh 2.1 C s d liu phn tn ca ngn hng c ba chi nhnh
-
8/4/2019 Baigiang Csdl Phantan Final
10/193
Chng 1 i cng v c s d liu phn tn 3
C s d liu phn tn - 2010
Hn na ng dng ton cc gip cho ngi s dng khng phn bit c d liu
cc b hay t xa. l tnh trong sut d liu trong c s d liu phn tn. V
ng nhin khi ng dng ton cc truy cp d liu cc b s nhanh hn ng dng
t xa iu ny ni ln s nhn bn d liu cc ni cng lm tng tc x l
chng trnh.
1.2 nh ngha v c s d liu phn tn
Mt c s d liu phn tn l tp hp d liu quan h ln nhau mt cch lun l
trn cng mt h thng nhng c tri rng trn nhiu sites ca mt mng my
tnh.
Mi site c quyn t qun c s liu cc b ca mnh v thc thi cc ng dng
cc b. Mi v tr cng phi tham gia vo vic thc thi t nht mt ng dng toncc m yu cu truy xut d liu ti nhiu v tr qua mng.
Hnh nh ca c s d liu phn tn (hnh 1.2) minh ha mi quan h ca c s d
liu phn tn vi mi trng kt ni mng my tnh v cc ng dng phn tn.
lm r cc im c trng ca c s d liu phn tn hy so snh n vi c s
d liu tp trung trong phn sau.
Hnh 1.2 Mi lin h gia mng my tnh, c s d liu phn tn v ng dng phntn
1.3 Cc im c trng ca c s d liu phn tn so vi c s d liu tptrung
C s d liu phn tn khng n gin l vic phn tn cc c s d liu tp trung
bi v n cho php thit k cc h thng c cc tnh cht khc vi h thng tp trung
truyn thng. V th nn xem li cc tnh cht c trng ca c s d liu tp trung
Mng v truyn thng
C s d liu phn tn
ng dng phn tn
Tng lu tr d liu
Tng kt ni
Tng ng dng
-
8/4/2019 Baigiang Csdl Phantan Final
11/193
Chng 1 i cng v c s d liu phn tn 4
C s d liu phn tn - 2010
truyn thng v so snh n vi cc tnh cht ca c s d liu phn tn. Cc tnh
cht c trng ca c s d liu tp trung l iu khin tp trung, c lp d liu,
chun ha loi b s d tha d liu, cc cu trc lu tr vt l phc tp p ng
cho vic truy xut hiu qu, ton vn, phc hi, iu khin ng thi v an ton.
Di y l bng so snh cc tnh cht c trng ca c s d liu tp trung v c
s d liu phn tn:
Tnh cht c trng C s d liu tp trung C s d liu phn tn
iu khin tp trung - Kh nng cung cp s iu
khin tp trung trn cc ti
nguyn thng tin.
- Cn c ngi qun tr c
s d liu.
- Cu trc iu khin phn
cp: qun tr c s d liu
ton cc v qun tr c s d
liu cc b phn tn.
c lp d liu - T chc d liu trong sut
vi cc lp trnh vin. Cc
chng trnh c vit c
ci nhn quan nim v d
liu.
- Li im: cc chng trnh
khng b nh hng bi s
thay i t chc vt l ca
d liu
- Ngoi tnh cht c d liu
nh trong c s d liu tp
trung, cn c tnh cht trong
sut phn tn ngha l cc
chng trnh c vit nh c
s d liu khng h c phn
tn.
S d tha d liu Gim thiu s d tha d
liu do:
- Tnh nht qun d liu.
- Tit kim dung lng nh.
- Gim thiu s d tha d
liu m bo tnh nht qun.
- Nhng li nhn bn d liu
n cc a im m cc ngdng cn n, gip cho vic
thc thi cc ng dng khng
dng nu c mt a im b
hng. T vn qun l
-
8/4/2019 Baigiang Csdl Phantan Final
12/193
Chng 1 i cng v c s d liu phn tn 5
C s d liu phn tn - 2010
nht qun d liu s phc tp
hn.
Cc cu trc vt l phc
tp v truy xut hiu qu
Cc cu trc vt l phc tp
gip cho vic truy xut d
liu c hiu qu.
Cc cu trc vt l phc tp
gip lin lc d liu trong c
s d liu phn tn .
Tnh ton vn, phc hi,
ng thi
Da vo giao tc. Da vo giao tc phn tn.
T bng so snh trn, chng ta thy vic chn la c s d liu phn tn s thch
hp hn i vi cc ng dng pht trin trong mt h thng mng din rng do
gim c chi ph truyn thng truy xut d liu.
1.4 Ti sao cn c c s d liu phn tn?
1.4.1 L do t chc v kinh t
Nhiu t chc c c cu t chc phi tp trung nn gii php c s d liu phn tn
thch hp hn. Nhng nm gn y do s pht trin mnh m ca cng ngh my
tnh cng vi s pht trin rng ri ca cc t chc kinh t trn th gii nn vic lu
tr thng tin trn c s d liu tp trung cn xem xt li v mt hiu qu.
1.4.2 L do kt ni cc c s d liu hin c
C s d liu phn tn l gii php t nhin khi t chc c sn cc c s d liu
v cn m rng n cho cc ng dng ph qut hn. Trong trng hp ny c s d
liu phn tn c xy dng theo phng php t di ln, da trn cc c s d
liu cc b c sn. Qu trnh ny c th yu cu cu trc li c s d liu cc b tuy
nhin cng vic ny li n gin hn xy dng mt c s d liu tp trung hon
ton mi.
1.4.3 L do tng trng t chc
Nu mt t chc pht trin bng cch thm vo nhng n v t chc t qun nh
chi nhnh, kho bi th cch tip cn theo c s d liu phn tn h tr cho vic tng
trng c s d liu vi mc nh hng nh nht. Trong khi cch tip cn
theo c s d liu tp trung th ngay t u phi quan tm n s pht trin ca n
-
8/4/2019 Baigiang Csdl Phantan Final
13/193
Chng 1 i cng v c s d liu phn tn 6
C s d liu phn tn - 2010
trong tng lai m vic ny th kh d on v tn km, nu khng d liu trc th
s gy ra hu qu nghim trng khng ch cho nhng ng dng mi m cn cho c
h thng c sn.
1.1.4 L do ti truyn thngVi mt h c s d liu phn tn v mt a l th cc ng dng truy cp s gim
chi ph truyn thng so vi c s d liu tp trung.
1.4.5 nh gi v hiu sut
S tn ti ca cc b x l t qun nng hiu sut ln nh mc x l song song.
C s d liu phn tn c u th l phn tn d liu ti cc a im nn cc ng
dng c th chy ring r trn tng a im v s giao tip gia cc b x l l nh
nht.
1.4.6 tin cy v tnh hiu qu
Mc du vic phn tn d liu lm tng vic d tha d liu trn ton h thng
nhng li cho chng ta tin cy v tnh hiu qu cao hn trong c s d liu tp
trung. Tuy nhin t c mc tiu trn khng phi d dng m i hi cc k
thut kh phc tp. S hng hc trong c s d liu phn tn c th xy ra thng
hn trong c s d liu tp trung v s a im tng ln nhng khng bao gi nh
hng ln ton h thng bi th nn n c tin cy v tnh hiu qu cao hn c s
d liu tp trung.
1.4.7 So snh u v nhc im ca vic phn tn d liu
u im
Chia s d liu v iu khin phn tn: Ngi s dng ti mt v tr ny c
th truy xut d liu (c php) v tr khc. Hn na vic qun tr c s
d liu c th tc phn tn v thc hin t qun ti mi v tr.
tin cy v tnh sn sng: Nu mt v tr b hng th cc v tr cn li
trong h thng c s d liu phn tn vn tip tc hot ng. Nu d liu
c nhn bn mt s v tr th mt giao dch cn truy xut mt mc d
-
8/4/2019 Baigiang Csdl Phantan Final
14/193
Chng 1 i cng v c s d liu phn tn 7
C s d liu phn tn - 2010
liu c th tm thy bt k v tr no trong s v tr . Nh th s c ti
mt v tr khng nh hng n h thng.
Tng tc x l truy vn: Nu mt truy vn cn d liu mt s v tr th
c th chia cu truy vn thnh cc cu truy vn con ri thc thi n songsong ti cc v tr.
Nhc im
Chi ph pht trin phn mm: Vic pht trin mt h thng c s d liu
phn tn kh phc tp v th cn chi ph ln.
Kh pht hin li: Vic pht hin li v m bo tnh ng n ca cc
thut ton song song s rt kh khn.
Chi ph x l tng: S trao i cc thng bo v x l phi hp gia cc v
tr s tng chi ph x l hn trong cc h thng tp trung.
-
8/4/2019 Baigiang Csdl Phantan Final
15/193
Chng 2 Kin trc h qun tr c s d liu phn tn 8
C s d liu phn tn - 2010
Chng 2 KIN TRC H QUN TR C S D LIU
PHN TN
MC TIU
Chng ny trnh by khi qut v:
- Kin trc tham kho ca c s d liu phn tn nhm cung cp mt
tng quan v vic phn tn v cp pht d liu quan h n cc sites
cc b.
- Cc thnh phn ca h qun tr c s d liu phn tn v mi quan
h tng tc gia chng.
- Kin trc ca h qun tr c s d liu phn tn.
2.1 Kin trc tham kho cho c s d liu phn tn
Hnh 2.1 m t kin trc tham kho cho c s d liu phn tn. Kin trc tham kho
ny khng p dng cho mi c s d liu phn tn. Tuy nhin cc mc ca n gip
cho ta hiu t chc mt c s d liu phn tn bt k. V th chng ta s phn tch
v tm hiu tt c cc thnh phn trong kin trc ny.
Mi quan h ton cc c th c chia thnh cc thnh phn khng trng nhauc gi l cc phn mnh. C nhiu cch phn mnh m chng ta s bn n
sau. nh x t cc quan h ton cc n cc phn mnh c nh ngha trong lc
phn mnh. Php nh x ny l mt-nhiu ngha l c mt s phn mnh tng
ng vi mt quan h ton cc nhng ch c mt quan h ton cc ng vi mt phn
mnh. Cc phn mnh c ch nh bi tn quan h ton cc vi mt ch mc (ch
mc phn mnh) v d Ri ch phn mnh th i ca quan h ton cc R.
-
8/4/2019 Baigiang Csdl Phantan Final
16/193
Chng 2 Kin trc h qun tr c s d liu phn tn 9
C s d liu phn tn - 2010
Hnh 2.1 Kin trc tham kho cho mt c s d liu
Cc phn mnh l cc thnh phn ca cc quan h ton cc m c lu tr vt l
ti mt hay mt s a im. Lc cp pht (allocation scheme) xc nh v tr
ca mt phn mnh. Kiu nh x nh ngha trong lc cc b xc nh c s d
liu phn tn c d tha hay khng. Trong trng hp nh x l mt-nhiu th n
d tha, ngc li nu nh x c kiu mt-mt th n khng d tha. Tt c ccphn mnh tng ng vi cng mt quan h ton cc R v c lu tr ti a im
j to thnh nh vt l ca quan h R ti a im j. V th c mt nh x mt-mt
gia mt nh vt l v mt cp (quan h ton cc, a im); cc nh vt l c th
c ch ra bi tn quan h ton cc v ch s a im. phn bit cc mnh,
chng ta s s dng mt ch s m; v d, Rj ch nh vt l ca quan h ton cc R
ti a im j.
Mt v d ca mi quan h gia cc kiu i tng nh ngha trn c minh
ha hnh 2.2. Mt lc quan h R c phn thnh bn mnh R1, R2, R3 v R4.
Bn phn mnh ny c lu tr d tha ti ba a im trn mng my tnh, v th
to ra ba nh vt l R1, R2, R3.
Localmappingschema 1
Localmappingschema 2
DBMS ofSite 1
DBMS ofsite 2
Localdatabaseat site 1
Localdatabaseat site 2
Globalschema
Franmentationschema
Allocationschema
Siteindependentschema
(Other site)::
-
8/4/2019 Baigiang Csdl Phantan Final
17/193
Chng 2 Kin trc h qun tr c s d liu phn tn 10
C s d liu phn tn - 2010
lm r k thut ny, hy xt mt bn sao ca mt phn mnh ti mt a im v
ch thch n bng cch s dng tn ca quan h ton cc v hai ch s (ch s phn
mnh v ch s a im). V d, trong hnh 2.2, ch thch R32 ch mt bn sao ca
phn mnh R2 lu tr ti a im 3.
Hnh 2.2 Cc phn mnh v cc nh vt l i vi mt quan h ton cc
Cui cng s thy hai nh vt l bt k c th c phn bit. Trong trng hp
ny ta s ni mt nh vt l l mt bn sao ca mt nh vt l khc. V d trong
hnh 2.2, R1 l mt bn sao ca R2.
Kin trc tham kho hnh 2.1 m t mi quan h gia cc i tng ti ba mc
trn cng ca kin trc ny. Ba mc ny c lp v tr, v th chng khng ph
thuc vo m hnh d liu ca cc h qun tr c s d liu cc b. mc thphn, cn nh x cc nh vt l n cc i tng c thao tc bi cc h qun tr
c s d liu cc b. nh x ny c gi l lc nh x cc b v n ph thuc
vo kiu ca h qun tr c s d liu cc b; v th trong h thng khng ng
nht, c nhiu kiu nh x cc b ti cc v tr khc nhau.
(R23)
(R33)
(R12)
(R22)
(R11)
(R21)
R1 (Site 1)
R2 (site 2)
R3 (Site 3)
R1
R2
R3
R4
R
nh vt lLc phn mnhLc ton cc
-
8/4/2019 Baigiang Csdl Phantan Final
18/193
Chng 2 Kin trc h qun tr c s d liu phn tn 11
C s d liu phn tn - 2010
Kin trc ny cung cp mt m hnh quan nim tng qut hiu c c s d
liu phn tn. Ba i tng quan trng nht ca kin trc ny l s tch bit gia s
phn mnh d liu v s cc b ha d liu, iu khin d tha d liu v tnh c
lp cc h qun tr c s d liu cc b.
- S tch bit quan nim phn mnh d liu v quan nim nh v d liu : S tch
bit ny gip ta phn bit hai mc khc nhau ca s trong sut phn tn c gi l
s trong sut phn tn v s trong sut nh v. S trong sut phn tn l mc
cao nht ca s trong sut v bao gm cc yu t m ngi s dng v cc lp trnh
vin lm vic trn cc quan h ton cc. S trong sut nh v l mc thp hn v
yu cu ngi s dng v cc lp trnh vin lm vic trn cc phn mnh thay v
trn cc quan h ton cc. Tuy nhin h khng cn bit cc phn mnh ny lu tr u. S tch bit hai quan nim phn mnh v nh v rt ph hp trong thit k
c s d liu phn tn v s xc nh cc thnh phn thch hp ca d liu c
nhn bit t bi ton nh v ti u.
-iu khin tng minh s d tha d liu: Kin trc tham kho cung cp mt s
iu khin tng minh cho s d tha d liu ti mc phn mnh. V d trong hnh
2.2 hai nh vt l R22 v R32 trng lp ngha l chng cha chung d liu. nh
ngha cc phn mnh mt cch tch bit khi xy dng cc khi vt l cho phptham kho tng minh n tng phn trng lp ny tc phn mnh nhn bn R2.
iu khin s d tha d liu rt hu dng trong mt s kha cnh qun tr c s d
liu phn tn.
- Tnh c lp ti cc h qun tr c s d liu cc b: Tnh cht ny gi l s trong
sut nh x cc b, n cho php nghin cu mt s vn qun tr c s d liu m
khng quan tm n m hnh d liu c th ti cc h qun tr c s d liu cc b.
Mt kiu trong sut khc lin quan cht ch ti s trong sut nh v l s trong
sut nhn bn. S trong sut nhn bn c ngha l ngi s dng khng nhn thy
c s nhn bn ca cc phn mnh.
-
8/4/2019 Baigiang Csdl Phantan Final
19/193
Chng 2 Kin trc h qun tr c s d liu phn tn 12
C s d liu phn tn - 2010
2.2 Cc thnh phn ca h qun tr c s d liu phn tn
H qun tr c s d liu phn tn h tr vic to v duy tr c s d liu phn tn.
Cc h qun tr c s d liu phn tn hin nay c pht trin bi cc nh sn xut
cc h qun tr c s d liu tp trung. Chng cha cc thnh phn b sung m rngcc kh nng ca cc h qun tr c s d liu tp trung nh h tr s truyn thng
v s cng tc gia cc h qun tr c s d liu trn cc a im khc nhau qua
mng my tnh. Cc thnh phn c bn cn thit cho vic xy dng mt c s d
liu phn tn l :
1. Thnh phn qun tr c s d liu (DB Database Management)
2. Thnh phn truyn d liu (DC Data Communication)
3. T in d liu (DD (Data Dictionnary) m rng biu din thng tin vs phn tn d liu trn mng.
4. Thnh phn c s d liu phn tn (DDB Distributed Database)
Hnh 2.3 Cc thnh phn ca h qun tr c s d liu
Cc thnh phn ny c minh ha hnh 2.3 i vi hai a im trn mng.
Cc dch v c h tr cho h thng trn thng thng l:
DB DC
DDB
DDBDCDB
Localdatabase 1
Localdatabase 2
Site 1Site 2
T T T T
T T T T
-
8/4/2019 Baigiang Csdl Phantan Final
20/193
Chng 2 Kin trc h qun tr c s d liu phn tn 13
C s d liu phn tn - 2010
- Dch v truy xut c s d liu t xa: tnh cht ny l mt tnh cht quan trng
nht v c cung cp bi tt c cc h thng c thnh phn c s d liu phn tn.
- Mc trong sut ca s phn tn: tnh cht ny c h tr bi cc h thng
khc nhau v l s cn bng cc yu t t c s kt hp tt nht gia strong sut phn tn v hiu sut.
- H tr vic qun tr v iu khin c s d liu: tnh cht ny bao gm cc cng
c gim st c s d liu, ly thng tin v vic s dng c s d liu, cung cp
mt ci nhn ton cc v cc file d liu lu tr trn cc v tr khc nhau.
- H tr cho vic iu khin ng thi v phc hi cc giao tc phn tn.
(a) Truy xut t xa qua cc nguyn thy ca h qun tr c s d liu
(b) truy xut t xa qua chng trnh b tr
Hnh 2.4 Cc kiu truy xut n c s d liu phn tn
Site 1
DATABASEACCESSPRIMITIVEApplication
Program
Site 2
Site 1
Database 2
DBMS
REQUEST FOREXECUTION OFAUXILIARYPROGRAM
AplicationProgram
DBMS 1
DATABASEACCESSPRIMITIVESAND RESULTS
Database 2GLOBAL
Auxiliaryprogram
DBMS 2
Site 1
Site 2
-
8/4/2019 Baigiang Csdl Phantan Final
21/193
Chng 2 Kin trc h qun tr c s d liu phn tn 14
C s d liu phn tn - 2010
Vic truy xut n mt c s d liu t xa bi mt ng dng c th c thc hin
bi mt hai cch c bn minh ha hnh 2.4. Hnh 2.4a minh ha mt ng dng
a ra mt yu cu tham kho d liu t xa. Yu cu ny c nh tuyn bi h
qun tr c s d liu phn tn n v tr m d liu c lu tr, sau yu cu
c thc thi ti v tr v tr kt qu v. Trong cch ny, n v c bn lin lc
gia cc h thng l cc nghi thctruy xut c s d liu v kt qu nhn v cng
t nghi thc ny. Nu cc tip cn ny c s dng cho vic truy xut t xa, s
trong sut phn tn c th c thc hin bng cch cung cp cc tn file ton cc;
cc nghi thc s t ng nh v cc v tr t xa thch hp.
Hnh 2.4b minh ha mt tip cn khc, ng dng yu cu s thc thi ca mt
chng trnh b tr (auxiliary program) ti v tr t xa. Chng trnh b tr ny truyxut c s d liu t xa v tr kt qu cho ng dng yu cu.
Li ch ca cch tip cn th nht l cung cp s trong sut phn tn nhiu hn
trong khi cch tip cn th hai c th linh ng hn nu nhiu truy xut c s d
liu c yu cu v ng dng b tr c th thc hin tt c cc truy xut yu cu
v ch gi kt qu v.
Mt thuc tnh quan trng ca h qun tr c s d liu phn tn trong h thng l
chng cng loi hay khc loi. Cc h qun tr c s d liu phn tn khc loi phi
thm vn thng dch gia cc m hnh d liu khc nhau, cc cu trc d liu
khc nhau. y l mt vn rt kh gii quyt, nn n c khc phc bng cch
h tr s truyn thng gia cc thnh phn truyn thng d liu (data
communication component DC) khc nhau. Bi ton ny cng c cng ty
Microsoft gii quyt bng cc thnh phn truy xut d liu (Microsoft Data Access
Components (MDAC)). Cho nn mt h thng bao gm cc h qun tr c s d
liu cc b khc nhau s thch hp hn cho vic pht trin h thng thng tin mt
cch linh ng v t tr.
-
8/4/2019 Baigiang Csdl Phantan Final
22/193
Chng 2 Kin trc h qun tr c s d liu phn tn 15
C s d liu phn tn - 2010
2.3 Kin trc ca h qun tr c s d liu phn tn
Phn ny s xem xt chi tit cc kin trc h thng l h khch/ch (client/server),
cc h c s d liu phn tn v cc phc h c s d liu.
2.3.1 Cc h khch/ch (Client/Server)
Trong h khch/ch, ta phn bit chc nng cn c cung cp v chia nhng chc
nng ny thnh hai lp: chc nng ch, chc nng khch. N cung cp mt kin
trc hai mc, to d dng cho vic qun l mc phc tp ca cc h qun tr c
s d liu hin i v phc tp ca vic phn tn d liu.
V th, c th nghin cu nhng khc bit v chc nng khch v chc nng ch.
iu u tin phi ch l ch thc hin phn ln cc cng vic qun l d liu.
iu ny c ngha l mi vic x l v ti u ha vn tin, qun l giao tc v qun
l thit b lu tr u c thc hin ti my ch. Khch, ngoi giao din v ng
dng, s c mt module qun tr c s d liu, khch chu trch nhim qun l d
liu c gi n v i khi c vic qun l cc kho cht giao tc.
Kin trc khch/ch c biu din trong hnh 2.5. Kin trc ny rt thng dng
trong cc h thng quan h, vic giao tip gia khch v ch thng qua cc cu
lnh SQL. Ni cch khc, khch s chuyn cc cu lnh SQL cho my ch mkhng tm hiu v ti u ho chng. My ch thc hin hu ht cc cng vic v tr
quan h kt qu v cho khch.
2.3.2 H qun tr c s d liu phn tn
Chnh ta bt u m t kin trc ny bng cch xem xt hnh nh t chc d liu.
Trc tin ta ch rng t chc d liu vt l trn mi my c th khc nhau. V
th cn c mt nh ngha ni ti ring ti mi v tr c gi l lc ni ti cc
b LIS (local internal schema). Hnh nh ca m hnh d liu ton cc c m t
bng lc quan nim ton cc GCS (global conceptual schema). x l hin
tng nhn bn v phn mnh, cn phi m t vic t chc logic ca d liu ti mi
v tr, v th cn c mt tng th ba c gi l lc quan nim cc b LCS
-
8/4/2019 Baigiang Csdl Phantan Final
23/193
Chng 2 Kin trc h qun tr c s d liu phn tn 16
C s d liu phn tn - 2010
(local conceptual schema). Do vy trong m hnh kin trc ny, lc quan nim
ton cc l hp ca cc quan nim cc b. Cui cng cc ng dng v truy xut c
s d liu c h tr qua lc ngoi ES (external schema). M hnh kin trc
ny c trnh by hnh 2.6.
Client
H
Giao tip
ngi dng
Chng trnh
ng dng...
iu Client DBMS
Hnh Phn mm truyn thng
Truy vn Kt qu
SQL quan h
Server Phn mm truyn thng
H B iu khin ng ngha d liu
B ti u ha cu hi
B qun l giao tc
iu B phc hi
B x l h tr run time
Hnh
Hnh 2.5 Kin trc my khch/ch
CSDL
-
8/4/2019 Baigiang Csdl Phantan Final
24/193
Chng 2 Kin trc h qun tr c s d liu phn tn 17
C s d liu phn tn - 2010
2.3.3 Kin trc ca h qun tr c s d liu phn tn
Chnh ta bt u m t kin trc ny bng cch xem xt hnh nh t chc d liu.
Trc tin ta ch rng t chc d liu vt l trn mi my c th khc nhau. V
th cn c mt nh ngha ni ti ring ti mi v tr c gi l lc ni ti ccb LIS (local internal schema). Hnh nh ca m hnh d liu ton cc c m t
bng lc quan nim ton cc GCS (global conceptual schema). x l hin
tng nhn bn v phn mnh, cn phi m t vic t chc logic ca d liu ti mi
v tr, v th cn c mt tng th ba c gi l lc quan nim cc b LCS
(local conceptual schema). Do vy trong m hnh kin trc ny, lc quan nim
ton cc l hp ca cc quan nim cc b. Cui cng cc ng dng v truy xut c
s d liu c h tr qua lc ngoi ES (external schema). M hnh kin trc
ny c trnh by hnh 2.6.
ES 1 ES 2 ES n ES : External Schema
GCS GCS : Global Conceptual schema
LCS 1 LCS 2 LCS n Local Conceptual Schema
LiS 1 LiS 2 LiS 3 Local internal Schema
Hnh 2.6 Kin trc tham kho c s d liu phn tn
Cc thnh phn c th ca mt h qun tr c s d liu phn tn gm hai thnh
phn chnh (minh ha hnh 2.7): b phn giao tip ngi s dng (user processor)
v b phn x l d liu (data processor).
Cc thnh phn ca b phn giao tip ngi s dng gm:
- B phn giao tip (user interface handler): chu trch nhim dch cc cu
lnh ngi s dng v nh dng d liu kt qu chuyn cho ngi s
dng.
-
8/4/2019 Baigiang Csdl Phantan Final
25/193
Chng 2 Kin trc h qun tr c s d liu phn tn 18
C s d liu phn tn - 2010
- B phn kim sot d liu ng ngha (semantic data controller): s dng
cc rng buc ton vn v thng tin quyn hn, c nh ngha nh thnh
phn ca lc quan nim ton cc kim tra xem cc cu truy vn c
th x l c hay khng.
- B phn phn r v ti u ho vn tin ton cc (global query optimizer and
decomposer): xc nh nh mt chin lc hot ng nhm gim thiu chi
ph, phin dch cc cu vn tin ton cc thnh cc cu vn tin cc b bng
cch s dng cc lc quan nim ton cc, lc quan nim cc b v
th mc ton cc. B phn ti u vn tin ton cc cn chu trch nhim to
ra mt chin lc thc thi tt nht cho php ni phn tn.
- B phn gim st hot ng phn tn (distributed execution monitor): iu
phi vic thc hin phn tn cc yu cu ngi s dng v cng c gi l
b qun l giao tc phn tn (distributed transaction manager).
Thnh phn ch yu th hai ca h qun tr c s d liu phn tn l b x l
d liu (data processor), bao gm cc thnh phn:
- B phn ti u ho vn tin cc b (local query optimizer): thng hot
ng nh b chn ng truy xut, chu trch nhim chn ra mt ng truy
xut thch hp nht truy xut cc mc d liu.
- B phn khi phc cc b (local recovery manager) bo m cho cc c s
d liu cc b vn duy tr c tnh nht qun ngay c khi c s c xy ra.
- B phn h tr lc thc thi (run-time support processor): truy xut c s d liu
ty thuc vo cc lnh trong lch biu do b phn ti u vn tin to ra. N chnh
l b giao tip vi h iu hnh v cha b qun l vng m c s d liu, chutrch nhim qun l vng m v qun l vic truy xut d liu.
-
8/4/2019 Baigiang Csdl Phantan Final
26/193
Chng 2 Kin trc h qun tr c s d liu phn tn 19
C s d liu phn tn - 2010
Userp ng Yu cu ca User
ca h thng
B x l User
B iu khingiao tip User
ESb iu khinng nghi d liu
GCSB ti u haCu hi ton cc
GD / DB gim stThc thi ton cc T in d liu
Hnh 2.7 Kin trc ca h qun tr c s d liu phn tn
B x l Cu hicc b
B qun l phchi cc b
B x l h trthi ian ch
LCS
nht k hthng
LiS
B x l d liu
-
8/4/2019 Baigiang Csdl Phantan Final
27/193
Chng 3 Thit k c s d liu pht tn 20
C s d liu phn tn - 2010
CHNG 3 THIT K C S D LIU PHN TN
MC TIU
Chng ny cp n hai vn chnh sau:
1. Mt s tiu chun thit k v cch thc phn tn d liu mt cch hp l.
2. Nn tng ton hc h tr cho nh thit k xc nh s phn tn d liu.
Chng ny chia lm ba phn:
Phn th nht gii thiu m hnh thit k c s d liu phn tn vi hai tip cn ttrn xung v t di ln.
Phn th hai trnh by s thit k phn mnh ngang, phn mnh dc v phn mnhhn hp.
Phn th ba trnh by s cp pht cc phn mnh. Vn ny nhm n s nh x ccphn mnh n cc nh vt l.
Trong c s d liu phn tn, chng ta bit cc quan h trong lc c s d liu
thng c phn ra thnh cc mnh nh hn nhng cha a ra l do hoc chi tit
no v qu trnh ny. Phn ny cp n cc chi tit .
Cc cu hi sau y s bao qut ton b vn :
Ti sao cn phi phn mnh?
Lm th no thc hin phn mnh?
Phn mnh nn thc hin n mc no?
C cch g kim tra tnh ng n ca phn mnh hay khng?
Chng ta s cp pht nh th no?
Nhng thng tin no cn thit cho vic cp pht?
-
8/4/2019 Baigiang Csdl Phantan Final
28/193
Chng 3 Thit k c s d liu pht tn 21
C s d liu phn tn - 2010
3.1 Cc vn v thit k c s d liu phn tn
3.1.1 Cc l do phn mnh
i vi phn mnh, iu quan trng l c c mt n v phn mnh thch hp. Mt
quan h khng phi l mt n v p ng c yu cu .
Trc tin, khung nhn ca cc ng dng thng ch l tp con ca quan h. V th n
v truy xut khng phi l ton b quan h m ch l cc tp con ca quan h. Kt qu
l xem tp con ca cc quan h l n v phn tn s l iu thch hp nht.
Th hai l nu cc ng dng c cc khung nhn c nh ngha trn mt quan h cho
trc li nm ti nhng v tr khc nhau th c hai cch chn la vi n v phn tn lton b quan h. Hoc quan h khng c nhn bn m c lu mt v tr hoc
quan h c nhn bn cho tt c hoc mt s v tr c chy ng dng. Chn la u
gy ra mt s lng ln cc truy xut khng cn thit n d liu xa. Cn chn la
sau thc hin nhn bn khng cn thit, gy ra nhiu vn khi cp nht v c th gy
ra lng ph khng gian lu tr. V th, vic phn r mt quan h thnh nhiu mnh, mi
mnh c x l nh mt n v, s cho php thc hin nhiu giao tc ng thi.
Ngoi ra, vic phn mnh cc quan h s cho php thc hin song song mt cu vn tinbng cch chia n thnh mt tp cc cu vn tin con hot tc trn tng mnh. Do
vic phn mnh s lm tng mc hot ng song hnh v nh th lm tng lu
lng hot ng ca h thng.
Tuy nhin cng cn ch r nhng khim khuyt ca vic phn mnh:
- Nu ng dng cn phi truy xut d liu t hai mnh ri ni hoc hp chng li th
chi ph rt cao.
- Vn th hai lin quan n tnh ton vn d liu: do kt qu ca vic phn mnh,
cc thuc tnh tham gia vo mt ph thuc hm c th b phn r vo cc mnh khc
-
8/4/2019 Baigiang Csdl Phantan Final
29/193
Chng 3 Thit k c s d liu pht tn 22
C s d liu phn tn - 2010
nhau v c cp pht cho nhng v tr khc nhau. Trong trng hp ny vic kim tra
cc ph thuc hm cng phi thc hin truy tm d liu nhiu v tr.
3.1.2 Cc kiu phn mnhC hai kiu phn mnh khc nhau l phn mnh theo chiu dc, phn mnh theo chiu
ngang v phn mnh hn hp s c trnh by chi tit phn sau.
3.1.2.1 Phn mnh ngang
Phn mnh ngang chia mt quan h theo cc b. V vy mi mnh l mt tp con ca
quan h. C hai loi phn mnh ngang: phn mnh ngang nguyn thy v phn mnh
ngang dn xut.
Phn mnh ngang nguyn thy (primary horizontal fragmentation) l s phn mnh
mt quan h da trn mt v t c nh ngha trn mt quan h.
Phn mnh ngang dn xut (derived horizontal fragmentation) l phn r mt quan h
da vo cc v t c nh ngha trn mt quan h khc.
Trc khi trnh by thut ton hnh thc cho phn mnh ngang, chng ta s tho lun
mt cch trc quan v qu trnh phn mnh.
Cho quan h R, cc mnh ngang Ri l :
Ri = Fi(R)
Trong Fi l cng thc chn c c mnh Ri. Fi c dng chun hi.
V d: xt lc quan h ton cc :
SUPPLIER(SNUM,NAME,CITY)
Chng ta c th c hai phn mnh ngang sau:
-
8/4/2019 Baigiang Csdl Phantan Final
30/193
Chng 3 Thit k c s d liu pht tn 23
C s d liu phn tn - 2010
SUPPLIER1 = CITY= SF(SUPPLIER)
SUPPLIER1 = CITY= LA(SUPPLIER)
- S phn mnh trn tha iu kin y nu SF v LA ch l cc gi tr c th
c ca thuc tnh CITY; ngc li chng ta s khng bit nhng mnh no vi cc gi
tr CITY khc.
- iu kin ti thit c kim tra d dng v chng ta lun lun c th ti thit li
quan h ton cc SUPPLIER bng php ton hi:
SUPPLIER = SUPPLIER1 U SUPPLIER2
- iu kin tch bit cng c kim tra mt cch r rng.
3.1.2.2 Phn mnh ngang dn xut
Trong mt s trng hp s phn mnh ngang c dn ra t mt phn mnh ngang
ca mt quan h khc.
V d: Mt quan h ton cc
SUPPLY(SNUM, PNUM, DEPTNUM, QUAN)
Vi SNUM l m s ngi cung cp. Chng ta mun phn chia quan h ny sao cho
mt mnh cha cc b cho nhng ngi cung cp mt thnh ph cho trc. Tuy
nhin thnh ph khng phi l thuc tnh ca quan h SUPPLY m l thuc tnh ca
quan h SUPPLIER. V th chng ta cn thc hin php na kt xc nh cc b ca
SUPPLY tng ng vi nhng ngi cung cp trong mt thnh ph cho trc. Sphn mnh dn xut ca SUPPLY c th c nh ngha nh sau:
SUPPLY1 = SUPPLY < SUPPLIER1
SUPPLY2 = SUPPLY < SUPPLIER2
-
8/4/2019 Baigiang Csdl Phantan Final
31/193
Chng 3 Thit k c s d liu pht tn 24
C s d liu phn tn - 2010
Vi < l php ton na kt (Semi Join)
- Tnh ti thit quan h ton cc SUPPLY c th c th hin qua php hi.
- Tnh y ca s phn mnh trn yu cu khng c c m s ngi cung cp
c trong quan h SUPPLY nhng li khng tn ti trong quan h SUPPLIER. y
chnh l mt rng buc ton vn v kho ngoi trong c s d liu.
- iu kin tch bit cng c tho nu mt b trong quan h SUPPLY khng tng
ng vi hai b ca quan h SUPPLIER m thuc v hai mnh khc nhau. Trng hp
ny cng d kim tra v m s ngi cung cp l kho duy nht ca quan h
SUPPLIER; tuy nhin, trong trng hp tng qut th kh c th chng minh iu kinny hn.
3.1.2.3 Phn mnh dc
S phn mnh dc ca mt quan h ton cc l vic chia cc thuc tnh vo hai (nhiu)
nhm; cc mnh nhn c t php chiu quan h ton cc trn mi nhm. S phn
mnh ny s ng n nu mi thuc tnh c nh x vo t nht vo mt thuc tnh
ca cc phn mnh; hn na, n phi c kh nng ti thit li quan h nguyn thybng cch kt ni cc phn mnh li vi nhau. Ch , mi phn mnh dc ca mt
quan h ton cc phi cha kha ca quan h ton cc. C hai kiu phn mnh dc :
phn mnh dc d tha v phn mnh dc khng d tha.
Phn mnh dc d tha (redundant fragmentation):
Phn mnh dc d tha l cc phn mnh dc cha mt hoc nhiu thuc tnh chung
khng kha.
V d : Xt quan h ton cc nhn vin (EMP) m t m nhn vin (EMPNUM), tn
nhn vin (NAME), lng (SAL), thu thu nhp (TAX), nh qun l (MNRNUM) v
phng ban h lm vic (DEPTNUM) nh sau :
-
8/4/2019 Baigiang Csdl Phantan Final
32/193
Chng 3 Thit k c s d liu pht tn 25
C s d liu phn tn - 2010
EMP(EMPNUM, NAME, SAL, TAX, MNRNUM, DEPTNUM)
Quan h ton cc ny c phn mnh dc d tha (thuc tnh NAME ) nh sau :
EMP1 = EMPNUM, NAME, MGRNUM, DETPNUM (EMP)
EMP2 = EMPNUM, NAME ,SAL, TAX (EMP)
Phn mnh dc khng d tha (non-redundant fragmentation):
Phn mnh dc khng d tha l cc phn mnh dc khng cha thuc tnh chung
khng kha no c.
V d : Quan h ton cc :
EMP(EMPNUM, NAME, SAL, TAX, MNRNUM, DEPTNUM)
Quan h ton cc ny c phn mnh dc khng d tha (thuc tnh ) nh sau :
EMP1 = EMPNUM, NAME, MGRNUM, DETPNUM (EMP)
EMP2 = EMPNUM, SAL, TAX (EMP)
Phn mnh ny phn nh lng v thu ca cc nhn vin c quan l ring. Vic ti
thit li quan h EMP c th nhn c t :
EMP = EMP1 JNN EMP2
(vi JNN l php kt ni t nhin hai quan h)
V EMPNUM l kha ca quan h EMP. Ni chung, vic cha kha ca quan h ton
cc vo mi mnh l cch tt nht bo m cho tnh ti thit.
T chng ta thy s phn mnh cng tha tnh y v tnh tch bit.
V d : Xt quan h ton cc
-
8/4/2019 Baigiang Csdl Phantan Final
33/193
Chng 3 Thit k c s d liu pht tn 26
C s d liu phn tn - 2010
EMP(EMPNUM, NAME, SAL, TAX, MNRNUM, DEPTNUM)
Mt s phn mnh dc ca quan h ny c nh ngha:
EMP1 = EMPNUM, NAME, MGRNUM, DETPNUM (EMP)
EMP2 = EMPNUM, SAL, TAX (EMP)
Phn mnh ny phn nh lng v thu ca cc nhn vin c quan l ring. Vic ti
thit li quan h EMP c th nhn c t :
EMP = EMP1 JNN EMP2
(vi JNN l php kt ni t nhin hai quan h)
V EMPNUM l kha ca quan h EMP. Ni chung, vic cha kha ca quan h ton
cc vo mi mnh l cch tt nht bo m cho tnh ti thit.
T chng ta thy s phn mnh cng tha tnh y v tnh tch bit.
3.1.2.4 S phn mnh hn hp
Cc phn mnh nhn c bi cc php phn mnh trn l cc quan h, v th chng ta
c th p dng cc php ton phn mnh mt cch qui. Vic ti thit quan h thc
hin c bng cch p dng cc lut ti thit theo th t ngc. Cc biu thc m
nh ngha cc phn mnh trong trng hp ny s phc tp hn.
V d: Xt quan h ton cc:
EMP(EMPNUM, NAME, SAL, TAX, MNRNUM, DEPTNUM)
Di y l mt s phn mnh hn hp bng cch p dng s phn mnh dc ri sau
p dng phn mnh ngang trn DEPTNUM:
-
8/4/2019 Baigiang Csdl Phantan Final
34/193
Chng 3 Thit k c s d liu pht tn 27
C s d liu phn tn - 2010
Hnh 3.1 Cy phn mnh ca quan h EMP
EMP1 = deptnum
-
8/4/2019 Baigiang Csdl Phantan Final
35/193
Chng 3 Thit k c s d liu pht tn 28
C s d liu phn tn - 2010
Thit k lc quan nim: lc c s d liu m t cc thng tin v mi quan h
ca chng cn lu tr;
Thit k c s d liu vt l ngha l nh x lc quan nim n cc vng lu trvt l v xc nh cc phng php truy xut thch hp.
Trong c s d liu phn tn hai vn ny tr thnh vn thit k lc ph qut
v thit k cc c s d liu cc b ti mi site. S phn tn c s d liu cng thm
vo cc vn trn hai vn mi:
Thit k s phn tn, ngha l xc nh cc quan h ph qut c phn mnh ngang,
dc hay hn hp nh th no?
Thit k s cp pht cc phn mnh, ngha l xc nh cc phn mnh c nh x n
cc nh vt l nh th no, k c vic xc nh s nhn bn d liu.
Hai vn mi ny c trng y cho s thit k phn tn d liu. S thit k phn
mnh l mt tiu chun lun l trong khi s thit k cp pht nhm n vic sp t d
liu vt l ti cc sites.
Mc du vic thit k cc chng trnh ng dng c xy dng sau khi thit k lc
, s hiu bit v cc yu cu ca cc chng trnh ng dng cng quyt nh n s
thit k lc v cc lc phi h tr cc ng dng mt cch hiu qu. Cc yu
cu ca ng dng nh sau:
Site m ng dng c a ra (cn c gi l site gc ca ng dng)
Tn s hot ng ca ng dng ( ngha l s lng yu cu hot ng trong mt n v
thi gian); trong trng hp tng qut cc ng dng c th c a ra t nhiu sites,
chng ta cn bit tn s hot ng ca mi ng dng ti mi site.
-
8/4/2019 Baigiang Csdl Phantan Final
36/193
Chng 3 Thit k c s d liu pht tn 29
C s d liu phn tn - 2010
S lng, kiu v s thng k phn tn ca cc truy xut c to bi cc ng dng
n mi i tng d liu c yu cu.
3.2.1 Cc mc tiu ca vic thit k phn tn d liuS truy xut cc b: mc tiu ca s phn tn d liu l cc ng dng truy xut d
liu cc b cng nhiu cng tt, gim bt cc truy xut d liu t xa.
Vic thit k s phn tn d liu ti a ho truy xut cc b c th c thc hin
bng cch thm s lng cc tham kho cc b v cc tham kho t xa tng ng cho
mi phn mnh d tuyn v mi cp pht phn mnh, t chn ra gii php tt nht.
Tnh sn sng v kh tn ca cc d liu phn tn: trong chng 1, chng ta ch ra
tnh sn sng v kh tn ( tin cy) l cc im mnh ca c s d liu phn tn so vi
c s d liu tp trung. Mc sn sng cao i vi cc ng dng ch c c thc
hin bng cch lu tr nhiu bn sao ca cng mt thng tin; h thng phi c kh
nng chuyn n bn sao c chn thch hp khi mt bn sao khng c truy xut
bnh thng.
kh tn cng c thc hin bng cch lu tr nhiu bn sao, khi n c kh nngphc hi khi c s ph hu mt s bn sao.
S phn b ti: s phn tn b trn cc sites l mt tnh cht quan trng ca cc h
thng my tnh phn tn. S phn b ti tn dng sc mnh ca vic s dng cc
my tnh, v cc i ho mc x l song song cc lnh thc thi ca cc ng dng.
V s phn b ti c th nh hng xu n s truy xut cc b nn cn xem xt cn
bng hai mc tiu ny.
Chi ph lu tr: s phn tn c s d liu phn nh chi ph ca s lu tr ti cc sites
khc nhau. Tuy nhin chi ph lu tr d liu khng ng k so vi chi ph xut nhp,
chi ph truyn thng ca cc ng dng. Nhng gii hn ca b lu tr phi c xem
xt k.
-
8/4/2019 Baigiang Csdl Phantan Final
37/193
Chng 3 Thit k c s d liu pht tn 30
C s d liu phn tn - 2010
3.2.2 Cc tip cn thit k s phn tn d liu
C hai cch tip cn cho s thit k c s d liu: tip cn t trn - xung v tip cn
t di - ln.
Trong cch tip cn t trn xung, chng ta bt u thit k lc ph qut, thit k
s phn mnh c s d liu v sau cng cp pht cc mnh n cc sites, to cc nh
vt l ca chng.
Cch tip cn ny thch hp nht i vi cc h thng c pht trin t u v n cho
php thit k mt cch hp l. Trong chng ny chng ta khng cp n cch thit
k lc ph qut v lc vt l v n khng ring bit g i vi c s d liu
phn tn m tp trung vo s thit k phn mnh v cp pht cc phn mnh.
Khi c s d liu phn tn c pht trin nh l s t hp cc c s d liu sn c th
n li khng d dng i vi phng php tip cn t trn -xung. Trong trng hp
ny lc ph qut thng c to ra t s tho hip gia cc m t d liu sn c.
T cch tip cn t di-ln c th c s dng thit k s phn tn d liu.
Cch thit k t di ln yu cu:
Chn mt m hnh c s d liu chung m t lc ph qut ca c s d liu.
Chuyn dch mi lc cc b vo trong m hnh d liu chung.
T hp li lc cc b vo trong lc ph qut chung.
Ba vn ny khng ring bit g i vi c s d liu phn tn m n hin din ngay
trong cc h thng tp trung. Bi th phng php thit k t di ln khng c
cp y. Tuy nhin ba vn ny rt quan trng trong cc h thng c s d liu
phn tn khng ng nht.
-
8/4/2019 Baigiang Csdl Phantan Final
38/193
Chng 3 Thit k c s d liu pht tn 31
C s d liu phn tn - 2010
3.2.3 Thit k s phn mnh d liu
Thit k phn mnh l vn u tin phi gii quyt trong phng php thit k phn
tn d liu t trn xung. Mc ch ca vic thit k phn mnh l xc nh cc phn
mnh khng chng cho ln nhau. l cc n v logic ca s cp pht.
Thit k phn mnh l nhm cc b (trong trng hp phn mnh ngang) hoc nhm
cc thuc tnh (theo phn mnh dc) m c nhng tnh cht ging nhau t quan im
cp pht chng. Mi mt nhm cc b hay cc thuc tnh c cng tnh cht s to nn
mt phn mnh.
V d 1: Xt s phn mnh ngang cho quan h ph qut EMP. Gi s rng cc ng
dng quan trng ca c s d liu phn tn ny yu cu thng tin t quan h EMP v
cc nhn vin l thnh vin ca cc d n. Mi phng ban l mt site ca c s d liu
phn tn.
Cc ng dng c th c gi t bt k phng ban no; tuy nhin khi chng c gi
t mt phng ban th n s u tin tm cc b nhn vin trong phng ban trc vi
xc sut cao hn nhng nhn vin ca phng ban khc. Trong trng hp ny cc
nhn vin c phn mnh ngang theo tnh cht lm vic cng mt phng ban.
Mt v d n gin v s phn mnh dc ca quan h EMP nh sau: gi s cc thuc
tnh SAL v TAX ch c s dng bi cc ng dng qun tr th cc thuc tnh ny s
nm trong phn mnh dc thch hp.
3.2.3.1 S phn mnh nguyn thy
Nhc li s phn mnh nguyn thy c nh ngha bng cch s dng php chn la
trn quan h ton cc. Tnh ng n ca s phn mnh nguyn thy i hi mi b
trong quan h ton cc ch nm trong mt v ch mt phn mnh. V th xc nh mt
phn mnh nguyn thy ca mt quan h ton cc yu cu xc nh mt tp cc v t
-
8/4/2019 Baigiang Csdl Phantan Final
39/193
Chng 3 Thit k c s d liu pht tn 32
C s d liu phn tn - 2010
chn y v ri nhau. Tnh cht m chng ta yu cu cho mi phn mnh phi c
tham kho ng nht bi tt c cc ng dng.
Cho R l quan h ton cc m chng ta phn mnh ngang nguyn thy. Chng ta ara mt s nh ngha sau:
1. Mt v t n gin l v t c kiu:
Thuc tnh = gi tr
2. Mt v t s cp y cho mt tp cc v t n gin P l chun hi ca tt c cc v t
xut hin trong P :
y = p*i
Vi p*i = pi hoc p*i= not pi v y false.
3. Mt phn mnh l mt tp cc b tng ng vi mt v t s cp
4. Mt v t n gin pi l thch hp i vi mt tp cc v t n gin P nu tn ti t
nht hai v t s cp m biu thc ca n ch khc nhau do v t pi (xut hin dngthng thung v dng ph nh ca n) m cc phn mnh tng ng c tham kho
n bi t nht mt ng dng.
V d 2: Xt s phn mnh ngang v d 1. Gi s c mt s ng dng quan trng yu
cu cc thng tin v cc nhn vin tham gia vo cc d n; li c mt s ng dng
quan trng khc khng ch yu cu thng tin trn m cn cn thng tin v ngh nghip.
Hai v t n gin cho v d ny l DEPT =1 v JOB =P. Cc v t s cp cho hai vt ny l:
DEPT = 1 AND JOB = P
DEPT = 1 AND JOB p
-
8/4/2019 Baigiang Csdl Phantan Final
40/193
Chng 3 Thit k c s d liu pht tn 33
C s d liu phn tn - 2010
DEPT 1 AND JOB = p
DEPT 1 AND JOB p
Tt c cc v t n gin trn l thch hp, trong khi, v d, SAL > 50 khng l mt v
t thch hp; Ni chung, nu chng ta phn mnh vi bt k v t s cp no trn
cng vi v t n gin SAL>50 hoc SAL 5 }l y nhng khng cc tiu v SAL >
khng thch hp.
-
8/4/2019 Baigiang Csdl Phantan Final
41/193
Chng 3 Thit k c s d liu pht tn 34
C s d liu phn tn - 2010
S phn mnh c th thc hin nh sau:
Nguyn tc: Xt mt v t p1 phn chia cc b ca R vo hai phn m chng c
tham kho khc nhau bi t nht mt ng dng. Cho P = p1.
Phng php: Xt mt v t n gin mi pi phn chia t nht mt phn mnh ca P
thnh hai phn m c tham kho khc nhau bi t nht bi mt ng dng. t P = P
U pi. Xo cc v t khng thch hp khi P. Lp li bc ny cho n khi tp ca cc
phn mnh c s l y .
V d 4: Ly li cc v d lm v d minh ha cho phng php trn. Xt v t u
tin SAL > 50; gi s lng trung bnh ca cc lp trnh vin ln hn 50, v t ny xcnh hai tp nhn vin m c tham kho mt cch khc nhau bi cc ng dng. Ta
c P1 = { SAL > 50}
Chng ta xt DEPT =2; v t ny l thch hp v c thm vo tp P1, ta c P2 =
{SAL > 50, DEPT = 1}
Cui cng, xt JOB = P. V t ny cng thch hp v thm n vo P2, ta c P3 = {
SAL>50, DEPT = 1, JOB =P}. Chng ta khm ph ra SAL>50 khng thch hp
trong P3. V th chng ta nhn c tp cui cng l P4 = { DEPT =1 , JOB = P} y
v cc tiu.
Xt mt v d tng qut: C s d liu gm c cc quan h EMP, DEPT, SUPPLIER,
SUPPLY.
Gi s c s d liu phn tn ca cng ty California c ba sites ti San Francisco
(site 1), Fresno (site 2), v Los Angeles (site 3); Fresno nm gia San Francisco v Los
Angeles. C tt c 30 phng ban c nhm li nh sau: 10 phng ban u tin gn
San Francisco, cc phng ban t 11 n 20 gn Fresno v cc phng ban trn 20 th
gn Los Angeles. Tt c cc nh cung cp San Francisco hoc Los Angeles. Ngoi
-
8/4/2019 Baigiang Csdl Phantan Final
42/193
Chng 3 Thit k c s d liu pht tn 35
C s d liu phn tn - 2010
ra cng ty cng c chia theo khi nim min: San Francisco min Bc, Los
Angeles min nam cn Fresno nm gia hai min nn mt s phng ban nm gn
Fresno s ri vo min bc hoc min nam.
Chng ta thit k s phn mnh ca SUPPLIER v DEPT vi s phn mnh ngang
nguyn thy.
Cc nh cung cp trong quan h SUPPLIER(SNUM, NAME, CITTY) c gi tr ca
thuc tnh CITY l SF hoc l LA. Gi s c mt ng dng quan trng yu cu
cho bit tn nh cung cp (NAME) khi nhp m s nh cung cp (SNUM). Cu lnh
SQL cho ng dng nh sau:
Select NAME
from SUPPLIER
where SNUM = $X
ng dng c gi ti bt k site no; nu n c gi ti site 1, n s tham kho n
SUPPLIERS c CITY = SF vi xc sut 80%; nu c gi t site 2, n s tham
kho n SUPPLIERS ca SF v LA vi xc sut bng nhau; nu n c gi tsite 3, n s tham kho n SUPPLIERS ca LA vi xc sut 80%. iu ny dn
n l cc phng ban s lin h n cc nh cung cp gn .
Chng ta a cc v t sau:
p1 : CITY = SF
P2: CITY = LA
Tp {p1, p2} l y v cc tiu.
Mc du n gin, v d ny minh ha hai tnh cht quan trng sau:
-
8/4/2019 Baigiang Csdl Phantan Final
43/193
Chng 3 Thit k c s d liu pht tn 36
C s d liu phn tn - 2010
- Cc v t thch hp m t cho phn mnh ny khng th c suy ra bng cch phn
tch m lnh ca ng dng.
- Quan h mt thit gia cc v t gim i s lng phn mnh. Trong trng hp nychng ta nn xem xt nhng v t tng ng vi cc v t s cp sau:
y1: (CITY = SF) AND (CITY = LA)
y2: (CITY = SF) AND NOT(CITY = LA)
y3: NOT(CITY = SF) AND (CITY = LA)
y4: NOT(CITY = SF) AND NOT(CITY = LA)
Nhng chng ta bit rng:
(CITY = LA) NOT (CITY = SF)
v (CITY = SF) NOT (CITY = LA)
v v th chng ta suy ra y1 v y4 mu thun ln nhau v y2 v y3 s n gin thnh hai
v t p1 v p2.
By gi chng ta hy xt quan h ph qut sau:
DEPT(DEPTNUM, NAME, AREA, MGRNUM)
Chng ta s tp trung vo cc ng dng quan trng sau:
Cc ng dng qun tr ch c gi t site 1 v site 3; cc ng dng qun tr v cc
phng ban min bc c gi ti site 1 v cc ng dng qun tr v cc phng ban
min nam c gi ti site 3.
Cc ng dng v cng vic c qun l ti mi phng ban; chng c th c gi t
bt k phng ban no nhng chng phi tham kho cc b ca phng ban gn site ca
n nht vi xc xut cao hn cc b nhng lu nhng ni khc.
Chng ta a ra cc v t sau:
-
8/4/2019 Baigiang Csdl Phantan Final
44/193
Chng 3 Thit k c s d liu pht tn 37
C s d liu phn tn - 2010
p1: DEPTNUM 20
l sai 5 (Hnh 3.2 ); v th s phn mnh gim cn 4 phn mnh:
y1: DEPTNUM
-
8/4/2019 Baigiang Csdl Phantan Final
45/193
Chng 3 Thit k c s d liu pht tn 38
C s d liu phn tn - 2010
Input : R: quan h; Pr: tpcc v t n gin;
Output: Pr: tp cc v t cc tiu v y ;
Declare
F: tp cc mnh hi s cp;Begin
Pr=; F = ;
For each v t p Pr if p phn hoch R theo Quy tc 1 then
Begin
Pr: = Pr p;
Pr: = Pr p;
F: = F p; {fi l mnh hi s cp theo pi }End; {Chng ta chuyn cc v t c phn mnh R vo Pr}
Repeat
For each p Pr if p phn hoch mt mnh fk ca Prtheo quy tc 1 then
Begin
Pr: = Pr p;
Pr: = Pr p;
F: = F p;
End;
Until Pr y {Khng cn p no phn mnh fk ca Pr}
For each p Pr, ifp m pp then
Begin
Pr:= Pr-p;
F:= F - f;
End;
End. {COM_MIN}
Thut ton bt du bng cch tm mt v t c lin i v phn hoch quan h cho.
Vng lp Repeat-until thm cc v t c phn hoch cc mnh vo tp ny, bo m
tnh y ca Pr. on cui kim tra tnh cc tiu ca Pr.
-
8/4/2019 Baigiang Csdl Phantan Final
46/193
Chng 3 Thit k c s d liu pht tn 39
C s d liu phn tn - 2010
Sau khi c c tp cc v t cc tiu v y , ta c th tin hnh vic phn mnh
ngang nguyn thy da vo thuc ton PHORIZONTAL nh sau:
Thut ton PHORIZONTALInput: R: quan h; Pr: tp cc v t n gin;
Output: M: tp cc v t hi s cp;
BeginPr:= COM_MIN(R, Pr);
Xc nh tp M cc v t hi s cp;
Xc nh tp I cc php ko theo gia cc piPr;
For each mi M do
BeginIF mi mu thun vi I then
M:= M-mi
End;
End. {PHORIZONTAL}
ng ta c tp Pr l cc tiu v y .
3.2.3.2 S phn mnh ngang dn xut
S phn mnh dn xut ngang ca mt quan h ton cc R khng da trn cc thuc
tnh ca n m c dn ra t s phn mnh ngang ca mt quan h khc. S phn
mnh dn xut ngang c s dng thun li cho vic kt ni cc mnh.
Mt kt ni phn tn l mt kt ni gia cc quan h phn mnh ngang. Khi mt ng
dng yu cu mt kt ni gia hai quan h ton cc R v S, tt c cc b ca R v S
cn c so snh; v th, cn phi so snh tt c cc phn mnh Ri ca R vi cc phn
mnh Sj ca S. Tuy nhin, i khi chng ta c th gim mt s kt ni cc b rng gia
cc phn mnh. iu ny xy ra khi cc gi tr ca thuc tnh kt ni trong Ri v Sj ri
nhau.
-
8/4/2019 Baigiang Csdl Phantan Final
47/193
Chng 3 Thit k c s d liu pht tn 40
C s d liu phn tn - 2010
Kt ni phn tn c biu din mt cch hiu qu bng cch dng th kt ni.
th kt ni G ca kt ni phn tn R v S l mt th (N,E) vi cc nt N th hin cc
phn mnh ca R v S v cc cnh v hng E biu din cc kt ni khng rng gia
cc phn mnh. n gin, chng ta khng cha trong cc phn mnh no ca R v
S m c kt ni rng. th kt ni c minh ha hnh 3.3.
Chng ta ni mt th kt ni l hon ton khi n cha tt c cc cnh c th c gia
cc phn mnh ca R v S. N c rt gn khi mt mt s cnh. C hai kiu th rt
gn:
1. th kt ni c phn hochnu th gm hai hay nhiu th con ri nhau.
2. th kt ni n gin nu n c phn hochv mi th con c mt cnh.
Xc nh mt kt ni ca mt th kt ni n gin l rt quan trng trong thit k c
s d liu. Mt cp phn mnh m c kt ni bi mt cnh trong mt th n
gin th c mt tp gi tr chung ng ca thuc tnh kt ni. V th, nu c th xc nh
s phn mnh v s nh v ca hai quan h R v S sao cho th kt ni l n gin
v cc cp phn mnh tng ng c lu tr ti mt a im th kt ni ny c thc biu din phn tn bng cch kt ni cc b cc cp phn mnh v sau suy ra
cc kt qu ca nhng kt ni qua cc a im.
Hnh 3.3 Cc th kt ni
-
8/4/2019 Baigiang Csdl Phantan Final
48/193
Chng 3 Thit k c s d liu pht tn 41
C s d liu phn tn - 2010
Ti y c th a ra mt nh ngha hnh thc ca s phn mnh dn xut ngang. Cho
mt quan h ton cc R, cc phn mnh Ri ca n c dn xut t s phn mnh R v
S qua php na kt ni
-
8/4/2019 Baigiang Csdl Phantan Final
49/193
Chng 3 Thit k c s d liu pht tn 42
C s d liu phn tn - 2010
SUPPLY(SNUM, PNUM, DEPTNUM, QUAN)
SUPPLY1 = SUPPLY < DEPT1
SUPPLY2 = SUPPLY < DEPT2
Vi < l php ton na kt (Semi Join)
Kim tra tnh ng n
By gi chng ta cn phi kim tra tnh ng ca phn mnh ngang.
a. Tnh y
+ Phn mnh ngang nguyn thu: Vi iu kin cc v t chn l y , phn mnh
thu cng c m bo l y , bi v c s ca thut ton phn mnh l tp cc v
t cc tiu v y Pr, nn tnh y c bo m vi iu kin khng c sai st
xy ra.
+ Phn mnh ngang dn xut: C khc cht t, kh khn chnh y l do v t nh
ngha phn mnh c lin quan n hai quan h. Trc tin chng ta hy nh ngha quitc y mt cch hnh thc.
R l quan h thnh vin ca mt ng ni m ch nhn l quan h S. Gi A l thuc
tnh ni gia R v S, th th vi mi b t ca R, phi c mt b t ca S sao cho
t.A=t.A
Quy tc ny c gi l rng buc ton vn hay ton vn tham chiu, bo m rngmi b trong cc mnh ca quan h thnh vin u nm trong quan h ch nhn.
b. Tnh ti thit
-
8/4/2019 Baigiang Csdl Phantan Final
50/193
Chng 3 Thit k c s d liu pht tn 43
C s d liu phn tn - 2010
Ti thit mt quan h ton cc t cc mnh c thc hin bng ton t hp trong c
phn mnh ngang nguyn thy ln dn xut, V th mt quan h R vi phn mnh
Fr={R1, R2,,Rm} chng ta c
R = Ri , Ri FR
c. Tnh tch bit
Vi phn mnh nguyn thu tnh tch ri s c bo m min l cc v t hi s cp
xc nh phn mnh c tnh loi tr tng h (mutually exclusive). Vi phn mnh dn
xut tnh tch ri c th bo m nu th ni thuc loi n gin.
3.2.3.3 S phn mnh dc
Xc nh s phn mnh dc ca mt quan h ton cc i hi nhm li cc thuc tnh
m c tham kho cng kiu bi cc ng dng.
iu kin ng n ca s phn mnh dc yu cu mi thuc tnh ca R ph thuc vo
t nht mt tp thuc tnh v mi tp thuc tnh phi cha thuc tnh kho ca R.
Mc ch ca s phn mnh dc l xc nh cc mnh Ri no m nhiu ng dng cth thc thi trn chng. Xt mt quan h ton cc R c phn hoch dc thnh R1 v
R2. Mt ng dng s c li qua vic phn hoch ny nu n c th c thc thi bng
cch ch s dng R1 hoc R2. Tuy nhin nu ng dng yu cu c hai phn mnh th
vic phn hoch ny khng c li v phi thc hin php kt ni xy dng li R.
Vic xc nh mt phn mnh dc cho mt quan h ton cc khng d dng khi s
lng t hp cc thuc tnh ln. V th cch tip cn heuristic c u th hn. Chng tas m t vn tt hai cch tip cn ny:
1. Tip cn phn r: Mt quan h ton cc s c ln lt phn r vo thnh cc
phn mnh.
-
8/4/2019 Baigiang Csdl Phantan Final
51/193
Chng 3 Thit k c s d liu pht tn 44
C s d liu phn tn - 2010
2. Tip cn gom nhm: Cc thuc tnh c gom nhm ln lt to thnh cc phn
mnh.
C hai tip cn ny ging nhau im: Chng tip din bng cch to ra mt chn latt nht ti mi vng lp. Trong c hai trng hp, cc cng thc chn la c dng
xc nh kh nng tt nht cho vic phn r hay gom nhm. Mt s dng quay lui
(backtracking) c th c to ra chuyn mt s thuc tnh t tp ny n tp khc
cho n khi t dc phn mnh cui cng.
V d: Xt quan h ton cc:
EMP(EMPNUM, NAME, SAL, TAX, MGRNUM, DEPTNUM)
Gi s cc ng dng s dng quan h DEPTNUM nh sau:
Cc ng dng qun tr tp trung site 3 yu cu thng tin NAME, SAL, TAX ca cc
nhn vin.
Cc ng dng v cng vic c qun l ti cc phng ban yu cu thng tin v
NAME, MGRNUM v DEPTNUM ca cc nhn vin; cc ng dng ny c th cgi ti tt c cc sites v tham kho cc b nhn vin trong cng mt nhm ca cc
phng ban vi xc sut 80%.
V th s phn mnh dc ca EMP thnh hai mnh vi cc thuc tnh qun tr v cc
thuc tnh m t cng vic l kh t nhin. T chn ta c c hai phn mnh
dc nh sau:
EMP1(EMPNUM,NAME,TAX,SAL)
EMP2(EMPNUM, NAME, MGRNUM,DEPT)
-
8/4/2019 Baigiang Csdl Phantan Final
52/193
Chng 3 Thit k c s d liu pht tn 45
C s d liu phn tn - 2010
Trong hai phn mnh ny chng ta quyt nh thuc tnh NAME c hai phn
mnh nhm tng hiu sut chng trnh (khi php thc hin php kt) v hn na l
tn ca cc nhn vien th thng l khng thay i.
Phng php phn mnh dc
Mt phn mnh dc cho mt quan h R sinh ra cc mnh R1, R2,..,Rr, mi mnh cha
mt tp con thuc tnh ca R v c kho ca R. Mc ch ca phn mnh dc l phn
hoch mt quan h thnh mt tp cc quan h nh hn nhiu ng dng ch cn chy
trn mt mnh. Mt phn mnh ti ul phn mnh sinh ra mt lc phn mnh
cho php gim ti a thi gian thc thi cc ng dng chy trn mnh .
Phn mnh dc tt nhin l phc tp hn so vi phn mnh ngang. iu ny l do tng
s chn la c th ca mt phn hoch dc rt ln.
V vy c c cc li gii ti u cho bi ton phn hoch dc thc s rt kh khn.
V th li phi dng cc phng php khm ph (heuristic). Chng ta a ra hai loi
heuristic cho phn mnh dc cc quan h ton cc.
- Nhm thuc tnh: Bt u bng cch gn mi thuc tnh cho mt mnh, v ti mi
bc, ni mt s mnh li cho n khi tha mt tiu chun no . K thut ny c
c xut ln u cho cc CSDL tp trung v v sau c dng cho cc CSDL phn
tn.
- Tch mnh: Bt u bng mt quan h v quyt nh cch phn mnh c li da trn
hnh vi truy xut ca cc ng dng trn cc thuc tnh.
Bi v phn hoch dc t vo mt mnh cc thuc tnh thng c truy xut chung
vi nhau, chng ta cn c mt gi tr o no nh ngha chnh xc hn v khi
nim chung vi nhau. S o ny gi l t lc hay lc ht (affinity) ca thuc tnh,
ch ra mc lin i gia cc thuc tnh.
-
8/4/2019 Baigiang Csdl Phantan Final
53/193
Chng 3 Thit k c s d liu pht tn 46
C s d liu phn tn - 2010
Yu cu d liu chnh c lin quan n cc ng dng l tn s truy xut ca chng.
gi Q={q1, q2,,qq} l tp cc vn tin ca ngi dng (cc ng dng) s chy trn
quan h R(A1, A2,,An). Th th vi mi cu vn tin qi v mi thuc tnh Aj, chng ta
s a ra mt gi tr s dng thuc tnh, k hiu use(qi, Aj) c nh ngha nh sau:
1 : nu thuc tnh Aj c vn tin qi tham chiu
use(qi, Aj)= 0 : trong trng hp ngc li
Cc vct use(qi, ) cho mi ng dng rt d nh ngha nu nh thit k bit c cc
ng dng s chy trn CSDL.
V d:
Xt quan h d n:
Project(JNO, JNAME, BUDJET, LOC)
Gi s rng cc ng dng sau y chy trn cc quan h . Trong mi trng hp
chng ta cng c t bng SQL.
q1: Tm ngn sch ca mt d n, cho bit m ca d n
SELECT Budjet
FROM PROJECT
WHERE JNO =gi tr
q2: Tm tn v ngn sch ca tt c mi d n
SELECT JNAME, Budjet
FROM PROJECT
q3: Tm tn ca cc d n c thc hin ti mt thnh ph cho
SELECT JNAME
FROM PROJECT
WHERE LOC=gi tr
-
8/4/2019 Baigiang Csdl Phantan Final
54/193
Chng 3 Thit k c s d liu pht tn 47
C s d liu phn tn - 2010
q4: Tm tng ngn sch d n ca mi thnh ph
SELECT SUM (Budjet)
FROM PROJECT
WHERE LOC=gi tr
Da theo bn ng dng ny, chng ta c th nh ngha ra cc gi tr s dng thuc
tnh. cho tin v mt k php, chng ta gi A1=JNO, A2=JNAME, A3=Budjet,
A4=LOC. Gi tr s dng c nh ngha di dng ma trn s dng (hnh 3.4), trong
mc (i,j) biu th use(qi , Aj ).
Hnh 3.4 Ma trn s dng
T lc ca cc thuc tnh
Gi tr s dng thuc tnh khng lm c s cho vic tch v phn mnh. iu
ny l do chng khng biu th cho ln ca tn s ng dng. S o lc ht (affinity)
ca cc thuc tnh aff(Ai, Aj), biu th cho cu ni (bond) gia hai thuc tnh ca mt
quan h theo cch chng c cc ng dng truy xut, s l mt i lng cn thit
cho bi ton phn mnh.
Xy dng cng thc o lc ht ca hai thuc tnh Ai, Aj.
Gi k l s cc mnh ca R c phn mnh. Tc l R = R1.Rk.
A1 A2 A3 A4
q1 1 0 1 0
q2 0 1 1 0
q3 0 1 0 1
-
8/4/2019 Baigiang Csdl Phantan Final
55/193
Chng 3 Thit k c s d liu pht tn 48
C s d liu phn tn - 2010
Q= {q1, q2,,qm} l tp cc cu vn tin (tc l tp cc ng dng chy trn quan h R).
t Q(A, B) l tp cc ng dng q ca Q m use(q, A).use(q, B) = 1.
Ni cch khc:
Q(A, B) = {qQ: use(q, A) =use(q, B) = 1}
v d da vo ma trn trn ta thy Q(A1,A1) = {q1}, Q(A2,A2 ) = {q2, q3}, Q(A3,A3 ) =
{q1,q2, q4}, Q(A4,A4 ) = {q3, q4}, Q(A1,A2 ) = rng, Q(A1,A3 ) = {q1}, Q(A2,A3 ) =
{q2},..
S o lc ht gia hai thuc tnh Ai, Aj c nh ngha l:aff(Ai, Aj)= refl (qk)accl(qk)
qkQ(Ai, Aj) l Rl
Hoc:
aff(Ai, Aj)= refl (qk)accl(qk)Use(qk, Ai)=1 Use(qk, Aj)=1 Rl
Trong refl (qk) l s truy xut n cc thuc tnh (Ai, Aj) cho mi ng dng qkti v
tr Rl v accl(qk) l s o tn s truy xut ng dng qkn cc thuc tnh Ai, Aj ti v trl. Chng ta cn lu rng trong cng thc tnh aff (Ai, Aj) ch xut hin cc ng dng
q m c Ai v Aj u s dng.
Kt qu ca tnh ton ny l mt ma trn i xng n x n, mi phn t ca n l mt s
o c nh ngha trn. Chng ta gi n l ma trn lc t ( lc ht hoc i lc)
thuc tnh (AA) (attribute affinity matrix).
V d: Chng ta hy tip tc vi v d 11. cho dn gin chng ta hy gi s rngrefl (qk) =1 cho tt c qkv Rl. Nu tn s ng dng l:
Acc1(q1) = 15 Acc2(q1) = 20 Acc3(q1) = 10
Acc1(q2) = 5 Acc2(q2) = 0 Acc3(q2) = 0
-
8/4/2019 Baigiang Csdl Phantan Final
56/193
Chng 3 Thit k c s d liu pht tn 49
C s d liu phn tn - 2010
Acc1(q3) = 25 Acc2(q3) = 25 Acc3(q3) = 25
Acc1(q4) = 3 Acc2(q4) = 0 Acc3(q1) = 0
S o lc ht gia hai thuc tnh A1 v A3 l:
Aff(A1, A3) = 1k=1
3t=1acct(qk) = acc1(q1)+acc2(q1)+acc3(q1) = 45
Tng t tnh cho cc cp cn li ta c ma trn i lc (hnh 3.5) sau:
A1 A2 A3 A4
A1 45 0 45 0
A2 0 80 5 75
A3 45 5 53 3
A4 0 75 3 78
Hnh 3.5 Ma trn i lc
Thut ton nng lng ni BEA (Bond Energy Algorithm)
n y ta c th phn R lm cc mnh ca cc nhm thuc tnh da vo s lin i
(lc ht) gia cc thuc tnh, v d t lc ca A1, A3 l 45, ca A2, A4 l 75, cn ca
A1, A2 l 0, ca A3, A4 l 3 Tuy nhin, phng php tuyn tnh s dng trc tip t
ma trn ny t c mi ngi quan tm v s dng. Sau y chng ta xt mt phng
php dng thut ton nng lng ni BEA ca Hoffer and Severance, 1975 v
Navathe., 1984.
1. N c thit k c bit xc nh cc nhm gm cc mc tng t, khc vi
mt sp xp th t tuyn tnh ca cc mc.
-
8/4/2019 Baigiang Csdl Phantan Final
57/193
Chng 3 Thit k c s d liu pht tn 50
C s d liu phn tn - 2010
2. Cc kt qu t nhm khng b nh hng bi th t a cc mc vo thut ton.
3. Thi gian tnh ton ca thut ton c th chp nhn c l O(n2), vi n l s lng
thuc tnh.
4. Mi lin h qua li gia cc nhm thuc tnh t c th xc nh c.
Thut ton BEA nhn nguyn liu l mt ma trn i lc thuc tnh (AA), hon v cc
hng v ct ri sinh ra mt ma trn i lc t (CA) (Clustered affinity matrix). Hon v
c thc hin sao cho s o i lc chung AM (Global Affinity Measure) l ln nht.
Trong AM l i lng:
AM=ni=1nj=1 aff(Ai, Aj)[aff(Ai, Aj-1)+aff(Ai, Aj+1)+aff(Ai-1, Aj)+ aff(Ai+1, Aj)]
Vi aff(A0, Aj)=aff(Ai, A0)=aff(An+1, Aj)=aff(Ai, An+1)=0 cho i,j
Tp cc iu kin cui cng cp n nhng trng hp mt thuc tnh c t vo
CA v bn tri ca thuc tnh tn tri hoc v bn phi ca thuc tnh tn phi
trong cc hon v ct, v bn trn hng trn cng v bn di hng cui cng trong cc
hon v hng. Trong nhng trng hp ny, chng ta cho 0 l gi tr lc ht aff giathuc tnh ang c xt v cc ln cn bn tri hoc bn phi (trn cng hoc di
y ) ca n hin cha c trong CA.
Hm cc i ho ch xt nhng ln cn gn nht, v th n nhm cc gi tr ln vi cc
gi tr ln , gi tr nh vi gi tr nh. V ma trn lc ht thuc tnh AA c tch cht i
xng nn hm s va c xy dng trn thu li thnh:
AM=ni=1nj=1 aff(Ai, Aj)[aff(Ai, Aj-1)+aff(Ai, Aj+1)]
Qu trnh sinh ra ma trn t lc (CA) c thc hin qua ba bc:
Bc 1: Khi gn:
-
8/4/2019 Baigiang Csdl Phantan Final
58/193
Chng 3 Thit k c s d liu pht tn 51
C s d liu phn tn - 2010
t v c nh mt trong cc ct ca AA vo trong CA. v d ct 1, 2 c chn
trong thut ton ny.
Bc 2: Thc hin lp
Ly ln lt mt trong n-i ct cn li (trong i l s ct c t vo CA) v th
t chng vo trong i+1 v tr cn li trong ma trn CA. Chn ni t sao cho cho i
lc chung AM ln nht. Tip tc lp n khi khng cn ct no dt.
Bc 3: Sp th t hng
Mt khi th t ct c xc nh, cc hng cng c t li cc v tr tng
i ca chng ph hp vi cc v tr tng i ca ct.
Thut ton BEAInput: AA - ma trn i lc thuc tnh;
Output: CA - ma trn i lc t sau khi sp xp li cc hng cc ct;
Begin
{Khi gn: cn nh rng l mt ma trn n x n}
CA(, 1)AA(, 1)CA(, 2)AA(, 2)
Index:=3
while index
-
8/4/2019 Baigiang Csdl Phantan Final
59/193
Chng 3 Thit k c s d liu pht tn 52
C s d liu phn tn - 2010
end-while
Sp th t cc hng theo th t tng i ca ct.
end. {BEA}
hiu r thut ton chng ta cn bit cont(*,*,*). Cn nhc li s o i lc chung
AM c nh ngha l:
AM=ni=1nj=1 aff(Ai, Aj)[aff(Ai, Aj-1)+aff(Ai, Aj+1)]
V c th vit li:
AM = ni=1nj=1 [aff(Ai, Aj) aff(Ai, Aj-1)+aff(Ai, Aj) aff(Ai, Aj+1)]
= nj=1[n
i=1 aff(Ai, Aj) aff(Ai, Aj-1)+ n
i=1 aff(Ai, Aj) aff(Ai, Aj+1)]
Ta nh ngha cu ni (Bond) gia hai thuc tnh Ax, v Ay l:
Bond(Ax, Ay )=nz=1aff(Az, Ax)aff(Az, Ay)
Th th c th vit li AM l:
AM = nj=1[ Bond(Ai, Aj-1)+Bond(Ai, Aj+1)]
By gi xt n thuc tnh sau:
A1 A2 Ai-1 AiAj Aj+1 An
Vi A1 A2 Ai-1 thuc nhm AM v AiAj Aj+1 An thuc nhm AM
Khi s o lc ht chung cho nhng thuc tnh ny c th vit li:
AMold = AM + AM+ bond(Ai-1, Ai) + bond(Ai, Aj) + bond(Aj, Ai)+
bond(bond(Aj+1, Aj) = n
l=1[ bond(Al, Al-1)+bond(Ai, Al+1)] + n
l=i+1[bond(Al, Al-
1)+bond(Ai, Al+1)] + 2bond(Ai, Al))
-
8/4/2019 Baigiang Csdl Phantan Final
60/193
Chng 3 Thit k c s d liu pht tn 53
C s d liu phn tn - 2010
By gi xt n vic t mt thuc tnh mi Akgia cc thuc tnh Ai v Aj trong ma
trn lc ht t. S o lc ht chung mi c th c vit tng t nh:
AMnew = AM + AM+ bond(Ai, Ak) + bond(Ak, Ai) + bond(Ak, Aj)+ bond(Aj, Ak) =AM + AM+ 2bond(Ai, Ak) + 2bond(Ak, Aj)
V th ng gp thc (net contribution) cho s o i lc chung khi t thuc tnh Ak
gia Ai v Aj l:
Cont(Ai, Ak, Aj) = AMnew - AMold = 2Bond(Ai, Ak)+ 2Bond(Ak, Aj ) - 2Bond(Ai, Aj )
Bond(A0, Ak)=0. Nu thuc tnh Akt bn phi thuc tnh tn bn phi v cha c
thuc tnh no c t ct k+1 ca ma trn CA nn bond(Ak, Ak+1)=0.
V d: Ta xt ma trn c cho trong v d 12 v tnh ton phn ng gp khi di
chuyn thuc tnh A4 vo gia cc thuc tnh A1 v A2, c cho bng cng thc:
Cont(A1, A4, A2)= 2bond(A1, A4)+ 2bond(A4, A2)-2bond(A1, A2)
Tnh mi s hng chng ta c:
Bond(A1, A4) = 4
z=1aff(Az, A1)aff(Az, A4) = aff(A1,A1) aff(A1,A4) +aff(A2,A1)
aff(A2,A4) + aff(A1,A3) aff(A3,A4) + aff(A1,A4) aff(A4,A4)
= 45*0 +0*75+ 45*3+0*78 = 135
Bond(A4, A2)= 11865
Bond(A1,A2) = 225
V th cont(A1, A4) = 2*135+2*11865+2*225 = 23550
V d: Chng ta hy xt qu trnh gom t cc thuc tnh ca quan h D n v dng ma
trn i lc thuc tnh AA.
-
8/4/2019 Baigiang Csdl Phantan Final
61/193
Chng 3 Thit k c s d liu pht tn 54
C s d liu phn tn - 2010
Bc khi u chng ta chp cc ct 1 v 2 ca ma trn AA vo ma trn CA v bt
u thc hin t ct th ba. C 3 ni c th t c ct 3 l: (3-1-2), (1, 3, 2) v (1, 2,
3). Chng ta hy tnh ng gp s i lc chung ca mi kh nng ny.
th t (0-3-1):
cont(A0, A3, A1) = 2bond(A0, A3)+ 2bond(A3, A1) - 2bond(A0, A1)
bond(A0, A3) = bond(A0, A1)=0
bond(A3, A1) = 45*48+5*0+53*45+3*0=4410
cont(A0, A3, A1) = 8820
th t (1-3-2)
cont (A1, A3, A2)=10150
th t (2-3-4)
cont (A2, A3, A4)=1780
Bi v ng gp ca th t (1-2-3) l ln nht, chng ta t A3 vo bn phi ca A1.
Tnh ton tng t cho A4 ch ra rng cn phi t n vo bn phi ca A2. Cui cng
cc hng c t chc vi cng th t nh cc ct v cc hng c trnh by tronghnh (3.6) sau:
A1 A2 A1 A3 A2
A1 45 0 A1 45 45 0
A2 0 80 A2 0 5 80
A3 45 5 A3 45 53 5
A4 0 75 A4 0 3 75
-
8/4/2019 Baigiang Csdl Phantan Final
62/193
Chng 3 Thit k c s d liu pht tn 55
C s d liu phn tn - 2010
A1 A3 A2 A4 A1 A3 A2 A4
A1 45 45 0 0 A1 45 45 0 0
A2 0 5 80 75 A3 45 53 5 3
A3 45 53 5 3 A2 0 5 80 75
A4 0 3 75 78 A4 0 3 75 78
Hnh 3.6 Gom nhm cc thuc tnh
Trong hnh trn chng ta thy qu trnh to ra hai t: mt gc trn tri cha cc gitr i lc nh, cn t kia di gc phi cha cc gi tr i lc cao. Qu trnh phn t
ny ch ra cch thc tch cc thuc tnh ca Project. Tuy nhin, ni chung th ranh ri
cc phn tch khng hon ton r rng. Khi ma trn CA ln, thng s c nhiu t hn
c to ra v nhiu phn hoch c chn hn. Do vy cn phi tip cn bi ton mt
cch c h thng hn.
Thut ton phn hoch
Mc ch ca hnh ng tch thuc tnh l tm ra cc tp thuc tnh c truy xut
cng nhau hoc hu nh l cc tp ng dng ring bit. Xt ma trn thuc tnh t (i
lc t) (hnh 3.7):
Nu mt im nm trn ng cho c c nh, hai tp thuc tnh ny c xc
nh. Mt tp {A1, A2,..., Ai} nm ti gc trn tri v tp th hai {Ai+1, Ai+2,..., An}
nm ti gc bn phi v bn di im ny. Chng ta gi 2 tp ln lt l TA, BA. Tpng dng Q={q1, q2,...,qq} v nh ngha tp ng dng ch truy xut TA, ch truy xut
BA hoc c hai, nhng tp ny c nh ngha nh sau:
-
8/4/2019 Baigiang Csdl Phantan Final
63/193
Chng 3 Thit k c s d liu pht tn 56
C s d liu phn tn - 2010
AQ(qi) = {Aj |use(qi, Aj)=1}
TQ = {qi | AQ(qi) TA}
A1 A2 A3 ... Ai Ai+1 ... An
A1
A1
:
Ai
Ai+1
:
:
An
Hnh 3.7 Ma trn thuc tnh t (i lc t)
AQ(qi) = {Aj |use(qi, Aj)=1}
TQ = {qi | AQ(qi) TA}
BQ = {qi | AQ(qi) BA}
OQ = Q - {TQ BQ}
y ny sinh bi ton ti u ho. Nu c n thuc tnh trong quan h th s c
n-1 v tr kh hu c th l im phn chia trn ng cho chnh ca ma trn thuc
TA
B A
-
8/4/2019 Baigiang Csdl Phantan Final
64/193
Chng 3 Thit k c s d liu pht tn 57
C s d liu phn tn - 2010
tnh t cho quan h . V tr tt nht phn chia l v tr sinh ra tp TQ v BQ sao
cho tng cc truy xut ch mt mnh l ln nht cn tng truy xut c hai mnh l nh
nht. V th chng ta nh ngha cc phng trnh chi ph nh sau:
CQ = refj(qi)accj(qi)qiQ Sj
CTQ = refj(qi)accj(qi)qiTQ Sj
CBQ= refj(qi)acc(qi)qiBQ Sj
COQ= refj(qi)acc(qi)qiOQ Sj
Mi phng trnh trn m tng s truy xut n cc thuc tnh bi cc ng
dng trong cc lp tng ng ca chng. Da trn s liu ny, bi ton ti u ho
c nh ngha l bi ton tm im x (1 x n) sao cho biu thc:
Z=CTQ+CBQ-COQ2
ln nht. c trng quan trng ca biu thc ny l n nh ngha hai mnh sao cho
gi tr ca CTQ v CBQ cng gn bng nhau cng tt. iu ny cho php cn bng ti
trng x l khi cc mnh c phn tn n cc v tr khc nhau. Thut ton phnhoch c phc tp tuyn tnh theo s thuc tnh ca quan h, ngha l O(n).
Thut ton PARTITIONInput: CA: ma trn i lc t; R: quan h; ref: ma trn s dng thuc tnh;
acc: ma trn tn s truy xut;
Output: F: tp cc mnh;
Begin
{xc nh gi tr z cho ct th nht}{cc ch mc trong phng trnh chi ph ch ra im tch}
tnh CTQn-1
tnh CBQn-1
tnh COQn-1
-
8/4/2019 Baigiang Csdl Phantan Final
65/193
Chng 3 Thit k c s d liu pht tn 58
C s d liu phn tn - 2010
best CTQn-1*CBQn-1 (COQn-1)2
do {xc nh cch phn hoch tt nht}
begin
for i from n-2 to 1 by -1 dobegin
tnh CTQi
tnh CBQi
tnh COQi
z CTQi*CBQi (COQi)2
if z > best then
beginbest z
ghi nhn im tch bn vo trong hnh ng x dch
end-if
end-for
gi SHIFT(CA)
end-begin
until khng th thc hin SHIFT c naXy dng li ma trn theo v tr x dch
R1 TA(R) K {K l tp thuc tnh kho chnh ca R}
R2 BA(R) K
F {R1, R2}
End. {partition}
p dng cho ma trn CA t quan h Project, kt qu l nh ngha cc mnh
FProj={Project1, Project2}
Trong :
Project1={A1, A3} v Project2= {A1, A2, A4}. V th
Project1={JNO, Budjet}
-
8/4/2019 Baigiang Csdl Phantan Final
66/193
Chng 3 Thit k c s d liu pht tn 59
C s d liu phn tn - 2010
Project2={JNO, JNAME, LOC}
JNO l thuc tnh kho ca Project
Kim tra tnh ng n:
Tnh y : c bo m bng thut ton PARTITION v mi thuc tnh ca
quan h ton cc c a vo mt trong cc mnh.
Tnh ti thit c: i vi quan h R c phn mnh dc FR={R1, R2,...., Rr} v cc
thuc tnh kho K
R= K Ri , RiFR
Do vy nu iu kin mi Ri l y php ton ni s ti thit li ng R. Mt
im quan trng l mi mnh Ri phi cha cc thuc tnh kho ca R.
3.2.4 S phn mnh hn hp
Cui cng chng ta xt s phn mnh hn hp. Cch thc d nht thc hin s phn
mnh hn hp l:
1. p dng s phn mnh ngang i vi cc phn mnh dc.
2. p dng s phn mnh dc i vi cc phn mnh ngang.
Mc du cc php ton trn c th lp li mt cch qui, nhng trn thc tin s phn
mnh khng nn qu hai cp.
Hnh 3.8 th hin th t ca s phn mnh nh sau:
S phn mnh ngang c p dng ngay trn mt phn mnh dc.
S phn mnh dc c p dng ngay trn mt phn mnh ngang.
-
8/4/2019 Baigiang Csdl Phantan Final
67/193
Chng 3 Thit k c s d liu pht tn 60
C s d liu phn tn - 2010
A1 A2 A3 A4 A5
S phn mnh dc ri sau phn mnh ngang
A1 A2 A3 A4 A5
S phn mnh ngang sau phn mnh dc
Hnh 3.8 S phn mnh hn hp ca quan h R(A1, A2, A3, A4, A5)
V d tng qut
Xt li quan h ph qut EMP c phn mnh dc thnh EMP1 v EMP2. Gi s cc
ng dng v cng vic c iu hnh ti cc phng ban m s dng phn mnh EMP2
tham kho n xc sut 80% cc b ca cc phng ban ln cn vi site m cc ng
dng c gi. V th EMP2 c th c phn mnh ngang tip tc theo nhm cc
phng ban.
-
8/4/2019 Baigiang Csdl Phantan Final
68/193
Chng 3 Thit k c s d liu pht tn 61
C s d liu phn tn - 2010
3.3 S cp pht cc phn mnh
3.3.1 Bi ton cp pht
Gi s c mt tp cc mnh F={F1, F2, ...,Fn} v mt mng bao gm cc v tr
S={S1, S2, ...,Sm} trn c mt tp cc ng dng Q={q1, q2, ...,qq} ang chy.
Bi ton cp pht l tm mt phn phi ti u ca F cho S.
Tnh ti u c th c nh ngha ng vi hai s o:
- Chi ph nh nht: Hm chi ph c chi lu mnh Fi vo v tr Sj, chi ph vn tin mnh
Fi vo v tr Sj, chi ph cp nht Fi ti tt c mi v tr c cha n v chi ph tryn d
liu. V th bi ton cp pht c gng tm mt lc cp pht vi hm chi ph t hp
nh nht.
- Hiu nng: Chin lc cp pht c thit k nhm duy tr mt hiu qu ln l h
thp thi gian p ng v tng ti a lu lng h thng ti mi v tr.
Ni chung bi ton cp pht tng qut l mt bi ton phc tp v c phc tp l
NP-y (NP-complete). V th cc nghin cu c dnh cho vic tm ra cc
thut gii heuristec tt c li gii gn ti u.
3.3.2 Yu cu v thng tin
giai on cp pht, chng ta cn cc thng tin nh lng v CSDL, v cc ng
dng chy trn , v cu trc mng, kh nng x l v gii hn lu tr ca mi v tr
trn mng.
Thng tin v CSDL
tuyn ca mt mnh Fj ng vi cu vn tin qi. y l s lng cc b ca Fj cn
c truy xut x l qi. Gi tr ny k hiu l seli(Fj)
Kch thc ca mt mnh Fj c cho bi
-
8/4/2019 Baigiang Csdl Phantan Final
69/193
Chng 3 Thit k c s d liu pht tn 62
C s d liu phn tn - 2010
Size (Fj) = card (Fj)* length(Fj)
Trong : Length(Fj) l chiu di (tnh theo byte) ca mt b trong mnh Fj.
Thng tin v ng dng
Hai s liu quan trng l s truy xut c do cu vn tin qi thc hin trn mnh Fj trong
mi ln chy ca n (k hiu l RRij), v tng ng l cc truy xut cp nht (URij). v
d chng c th m s truy xut khi cn phi thc hin theo yu cu vn tin.
Chng ta nh ngha hai ma trn UM v RM vi cc phn t tng ng uij v rij c
c t tng ng nh sau:
1 nu vn tin qi c cp nht mnh Fj
uij= 0 trong trng hp ngc li
1 nu vn tin qi c cp nht mnh Fj
rij = 0 trong trng hp ngc li
Mt vct O gm cc gi tr o(i) cng c nh ngha, vi o(i) c t v tr a ra cu
vn tin qi .
Thng tin v v tr
Vi mi v tr (trm) chng ta cn bit v kh nng lu tr v x l ca n. Hin nhin
l nhng gi tr ny c th tnh c bng cc hm thch hp hoc bng phng php
nh gi n gin.
+ Chi ph n v tnh lu d liu ti v tr Sks c k hiu l USCk.
+ c t s o chi ph LPCk, l chi ph x l mt n v cng vic ti v tr Sk. n v
cng vic cn phi ging vi n v ca RR v UR.
-
8/4/2019 Baigiang Csdl Phantan Final
70/193
Chng 3 Thit k c s d liu pht tn 63
C s d liu phn tn - 2010
Thng tin v mng
Chng ta gi s tn ti mt mng n gin, gij biu th cho chi ph truyn mi b gia
hai v tr Si v Sj. c th tnh c s lng thng bo, chng ta dng fsize lm kchthc (tnh theo byte) ca mt b d liu.
3.3.3. M hnh cp pht
M hnh cp pht c mc tiu lm gim thiu tng chi ph x l v lu tr d liu
trong khi vn c gng p ng c cc i hi v thi gian p ng. M hnh ca
chng ta c hnh thi nh sau:
Min (Total Cost)
ng vi rng buc thi gian p ng, rng buc lu tr, rng buc x l.
Bin quyt nh xij c nh ngha l
1 nu mnh Fi c lu ti v tr Sj
xij= 0 trong trng hp ngc li
Tng chi ph
Hm tng chi ph c hai thnh phn: phn x l vn tin v phn lu tr. V th n c
th c biu din l:
TOC= QPCi + STCjk qi Q SkS FjF
vi QPCi l chi ph x l cu vn tin ng dng qi, v STCjk l chi ph lu mnh Fj ti v
tr Sk.
Chng ta hy xt chi ph lu tr trc. N c cho bi
STCjk= USCk* size(Fj) *xjk
-
8/4/2019 Baigiang Csdl Phantan Final
71/193
Chng 3 Thit k c s d liu pht tn 64
C s d liu phn tn - 2010
Chi ph x l vn tin kh xc nh hn. Hu ht cc m hnh cho bi ton cp pht tp
tin FAP tch n thnh hai phn: Chi ph x l ch c v chi ph x l ch cp nht.
y chng ti chn mt hng tip cn khc trong m hnh cho bi ton DAP v
xc nh n nh l chi ph x l vn tin bao gm chi ph x l l PC v chi ph truyn
l TC. V th chi ph x l vn tin QPC cho ng dng qi l
QPCi=PCi+TCi
Thnh phn x l PC gm c ba h s chi ph, chi ph truy xut AC, chi ph duy tr
ton vn IE v chi ph iu khin ng thi CC:
PCi=ACi+IEi+CCi
M t chi tit cho mi h s chi ph ph thuc vo thut ton c dng hon tt
cc tc v . Tuy nhin minh ho chng ti s m t chi tit v AC:
ACi= (uij*URij+rij*RRij)* xjk*LPCkSkS FjF
Hai s hng u trong cng thc trn tnh s truy xut ca vn tin qi n mnh Fj. Ch
rng (URij+RRij) l tng s cc truy xut
c v cp nht. Chng ta gi thit rng ccchi ph x l chng l nh nhau. K hiu tng cho bit tng s cc truy xut cho tt c
mi mnh c qi tham chiu. Nhn vi LPCkcho ra chi ph ca truy xut ny ti v tr
Sk. Chng ta li dng xjk ch chn cc gi tr chi ph cho cc v tr c lu cc mnh.
Mt vn rt quan trng cn cp y. Hm chi ph truy xut gi s rng vic x
l mt cu vn tin c bao gm c vic phn r n thnh mt tp cc vn tin con hot
tc trn mt mnh c lu ti v tr , theo sau l truyn kt qu tr li v v tr
a ra vn tin.
H s chi ph duy tr tnh ton vn c th c m t r