Introduction to KRISTAL-IRMS (kristalinfo)


Upload: aimee

Post on 11-Jan-2016


DESCRIPTION

Introduction to KRISTAL-IRMS. http://www.kristalinfo.com. 21 September 2006. Kim Jin-suk, System Development Team, Knowledge Information Center, Korea Institute of Science and Technology Information (KISTI). Information Retrieval. Static Text Collection. Inverted File (Index). Boolean Retrieval. (1). A ladybug has beautiful wings …. 1, 5. (Ladybug). (2). Bugs hide from enemy as ….

TRANSCRIPT

  • KRISTAL-IRMS, http://www.kristalinfo.com (KISTI), 2006. 9. 21.

    R&D

  • Information Retrieval

    Static Text Collection, documents (1) to (5): "A ladybug has beautiful wings …", "Bugs hide from enemy as …", "enemy of aphids is wasps that …", "Ladybug as enemy agriculture …", "Night heron has short legs and …"

    Inverted File (Index): ladybug → 1, 5; enemy → 2, 3, 5; ...

    Boolean Retrieval: (Ladybug) → 1, 5; (enemy) → 2, 3, 5; (ladybug & enemy) → 5; (ladybug | enemy) → 5, 1, 2, 3

    However, some documents are modified, new documents are created, and some documents are deleted. DB + IR → IRMS
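
The inverted file and Boolean operators on this slide can be sketched in a few lines of Python. This is a minimal illustration, not KRISTAL code: the function names are hypothetical, and the document IDs are assumptions chosen so that the postings agree with the ones on the slide (ladybug in 1 and 5, enemy in 2, 3 and 5).

```python
from collections import defaultdict

# Illustrative collection. The ID assignment is an assumption made so that
# the postings match the slide: ladybug -> 1, 5 and enemy -> 2, 3, 5.
DOCS = {
    1: "A ladybug has beautiful wings",
    2: "Bugs hide from enemy as",
    3: "enemy of aphids is wasps that",
    4: "Night heron has short legs and",
    5: "Ladybug as enemy agriculture",
}


def build_inverted_file(docs):
    """Map each lowercased term to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def lookup(index, term):
    """Return the posting set for a term, or an empty set if it is unknown."""
    return index.get(term.lower(), set())


index = build_inverted_file(DOCS)
ladybug = lookup(index, "Ladybug")
enemy = lookup(index, "enemy")

print(sorted(ladybug))          # [1, 5]
print(sorted(enemy))            # [2, 3, 5]
print(sorted(ladybug & enemy))  # AND: [5]
print(sorted(ladybug | enemy))  # OR:  [1, 2, 3, 5]
```

Because the index is built once from a fixed collection, any modified, new, or deleted document forces a rebuild in this sketch, which is the limitation the slide raises for static collections.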


  • 1. KRISTAL-IRMS


  • KRISTAL-IRMS? Database management system (DBMS), information retrieval management system (IRMS). KRISTAL / DBMS

  • /

    DBMS-IRS; DBMS-IR; DBMS, IRS. DBMS-IRS; DBMS, IR; Oracle Cartridge, IBM DB2 Text Extender; SQL; DBMS, IR; Odysseus (KAIST). IR; KRISTAL IRMS; IR, DB. DB-IR; DB-IR, 2005

  • KISTI feedback

  • KRISTAL-IRMS (1/2)

    KRISTAL Service (telnet): Korea Research Information in Science & Technology Access Line

    KRISTAL-II (KNL, ROSE, FIRE, DAdmin): Knowledge Retrieval In Science & Technology Affiliated Literatures

  • KRISTAL-IRMS (2/2)

    KRISTAL-I: 1991. 5 - 1996. 2 (BASIS+)
    KRISTAL-II: 1996. 03
    KRISTAL-2000: 2000. 03
    KRISTAL-2002: 2002. 10
    KRISTAL-IRMS: 2006. 01

  • KRISTAL-IRMS


  • 2. KRISTAL


  • (1/8) 1: (AK), (RAK), (YO); (YI), (LI)

  • (2/8) Unigram index stored in a B+tree, over documents 1 and 2. Postings (document:position): 2:02, 1:11, 2:20, 1:01, 1:17, 2:03, 1:18, 2:04, 2:19, 1:19, 2:05, 1:12, 1:15, 1:21

  • (3/8) Unigram index stored in a B+tree, with the same postings as (2/8). Document 1: positions (17), (18), (19), each directly followed by the next. Document 2: positions (03), (04), (05), each directly followed by the next.
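
The "directly followed by" check on this slide can be sketched as a positional unigram index in which a phrase matches where its characters sit at consecutive positions of the same document. This is a minimal sketch under stated assumptions: the two sample documents and the function names are hypothetical, and a plain Python dict stands in for the B+tree named on the slide.

```python
from collections import defaultdict

# Hypothetical two-document collection (not taken from the slides).
DOCS = {
    1: "정보검색 시스템",  # "information retrieval system"
    2: "검색 정보",        # "retrieval information"
}


def build_unigram_index(docs):
    """Map each character (unigram) to its (doc_id, position) postings."""
    index = defaultdict(list)
    for doc_id, text in docs.items():
        for position, char in enumerate(text):
            if not char.isspace():  # spaces keep their position but are not indexed
                index[char].append((doc_id, position))
    return index


def phrase_search(index, phrase):
    """Return (doc_id, start) pairs where the characters of phrase are adjacent."""
    if not phrase:
        return []
    matches = set(index.get(phrase[0], []))
    for offset, char in enumerate(phrase[1:], start=1):
        postings = set(index.get(char, []))
        # Keep a start only if this character directly follows at start + offset.
        matches = {(d, p) for (d, p) in matches if (d, p + offset) in postings}
    return sorted(matches)


unigram_index = build_unigram_index(DOCS)
print(phrase_search(unigram_index, "검색"))    # [(1, 2), (2, 0)]
print(phrase_search(unigram_index, "정보검"))  # [(1, 0)]
```

Every character of the query costs one posting-list lookup, so a three-character query reads three lists here.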


  • (4/8) B+tree; posting 11:13; DB

  • (5/8) B+tree; postings 2:03, 2:04, 2:05

  • (6/8) Bigram. Unigram postings 2:03, 2:04, 2:05; bigram postings 2:03, 2:04.

    Unigram = 3: 593,579; 75,051; 305,013. CPU = 593,579.
    Bigram = 2: 4649; 420. CPU = 4649.
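
The contrast between three unigram lookups and two bigram lookups can be illustrated with a toy index. This sketch does not reproduce the slide's figures: the four sample strings, the query, and all function names are assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical collection (not taken from the slides).
DOCS = {1: "정보검색", 2: "정보관리", 3: "검색엔진", 4: "색인"}


def build_ngram_index(docs, n):
    """Map each character n-gram to its (doc_id, start_position) postings."""
    index = defaultdict(list)
    for doc_id, text in docs.items():
        for pos in range(len(text) - n + 1):
            index[text[pos:pos + n]].append((doc_id, pos))
    return index


def query_grams(query, n):
    """Split a query into its overlapping n-grams."""
    return [query[i:i + n] for i in range(len(query) - n + 1)]


def postings_read(index, query, n):
    """Count the postings that must be read to answer the query."""
    return sum(len(index.get(g, [])) for g in query_grams(query, n))


def search(index, query, n):
    """Return (doc_id, start) pairs where the query's n-grams line up."""
    grams = query_grams(query, n)
    if not grams:
        return []
    hits = set(index.get(grams[0], []))
    for offset, gram in enumerate(grams[1:], start=1):
        later = set(index.get(gram, []))
        hits = {(d, p) for (d, p) in hits if (d, p + offset) in later}
    return sorted(hits)


unigram_idx = build_ngram_index(DOCS, 1)
bigram_idx = build_ngram_index(DOCS, 2)
query = "정보검"  # 3 unigrams, 2 bigrams

print(search(unigram_idx, query, 1), postings_read(unigram_idx, query, 1))  # [(1, 0)] 6
print(search(bigram_idx, query, 2), postings_read(bigram_idx, query, 2))    # [(1, 0)] 3
```

Both indexes return the same match, but the bigram index reads fewer and shorter posting lists, at the price of a larger index with more distinct keys.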


  • (7/8) 3: postings 3:01 to 3:11; Bigram

  • (8/8): KRISTAL (Unigram), DB, Bigram (IRMS)

  • 3. KRISTAL 3.1


  • KRISTAL-2002 2.0/2.1: DB. KRISTAL 3.1 (Postings Segmentation): DB. 3.1 DB: DB7071 400

    [Chart: embedded spreadsheet data for five series, (1) to (5), with the first column running from 500 to 20000 in steps of 500. Surviving labels: (ABK/ABE/TIK), KCHAR/IDX, KCHAR/NOIDX. Series averages over this range: (1) 2.455, (2) 2.779, (3) 3.153, (4) 0.238, (5) 0.213. A second sheet compares versions 2.0.9 and 3.1.1; its row and column labels did not survive extraction.]

  • KRISTAL: Bi-gram; KConverter; XML; SYS.CDATE/SYS.UDATE; Wild Card; LIKE; 2.0/2.1, 3.1 DB; Regular Expression; XML (KConverter); XML (XPath)

  • 4. //


  • Tightly coupled DBMS and IRS (IRMS); XML; concurrency control; recovery; coarse-grained transaction

  • : , DB: ( ) DB: , DB:


  • int   Char[5]   bool   string
    1     KISTI     T
    2     KSC       F      KISTI Supercomputing Center
    3     CCBB      T
    4     RNBD      T

  • KSTRING
    KCHAR[N]: N
    KINT, KUNIT, KFLOAT
    KBOOL: TRUE, FALSE


  • (HANJA2HANGUL: False)

  • Memory DB, Summary DB


  • DB ( )


  • (TITLE: ) (TITLE: & )

  • (0 to 1.0): 0.5 = 50%

  • 5.


  • XML XML XML KRISTAL


  • DB, Summary DB

  • DB (XML, CSV); XML, CSV; DB

  • Section Operation, Index Operation

  • http://www.yeskisti.net
    http://techtrend.kisti.re.kr
    http://www.nktech.net
    http://www.ccbb.re.kr
    http://society.kisti.re.kr
    http://acoms1.kisti.re.kr:8080/kistiacoms/acoms_new
    http://next10.yeskisti.net
    http://www.kosen21.org
    http://www.mctnet.org

    DB:
    http://www.history.go.kr
    http://sjw.history.go.kr
    http://www.koreanhistory.or.kr
    http://nmh.gsnu.ac.kr
    http://www.minchu.or.kr
    http://e-kyujanggak.snu.ac.kr
    http://seongnam.grandculture.net
    http://cheongju.grandculture.net


  • KRISTAL: http://www.kristalinfo.com