Data.bnf.fr as a sandbox for FRBRization - SWIB · swib.org/swib18/slides/2_lapotre_data-bnf-fr.pdf ·...
TRANSCRIPT
![Page 1: Data.bnf.fr as a sandbox for FRBRization - SWIBswib.org/swib18/slides/2_lapotre_data-bnf-fr.pdf · a homogenic corpus of documents →the XXth century authors. an exhaustive collection](https://reader036.vdocuments.site/reader036/viewer/2022081405/5f0a9bad7e708231d42c77d8/html5/thumbnails/1.jpg)
Data.bnf.fr as a sandbox for FRBRization
Automated work creation in data.bnf.fr
Five entities...
The interface
The data
“Old works” at the BnF: a handcrafted artefact...
https://catalogue.bnf.fr/ark:/12148/cb14473195c
Validity control = persistence guarantee
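The ARK shown above ends in a check character (the final `c`). Whether this is what the slide means by "validity control" is an assumption. As a minimal sketch, assuming a NOID-style scheme (a position-weighted sum over a 29-character alphabet, computed over the name part only), the final character of this identifier can be reproduced as follows. The function names are hypothetical and do not come from the slides.

```python
# Betanumeric alphabet used by NOID-style check characters (29 symbols).
# Assumption: the check is computed over the ARK name only (no NAAN prefix).
ALPHABET = "0123456789bcdfghjkmnpqrstvwxz"


def check_char(name: str) -> str:
    """Return the check character for an ARK name given without its last character."""
    total = sum(ALPHABET.index(ch) * pos for pos, ch in enumerate(name, start=1))
    return ALPHABET[total % len(ALPHABET)]


def is_valid_ark_name(name_with_check: str) -> bool:
    """True if the last character matches the check computed from the rest."""
    if len(name_with_check) < 2:
        return False
    body, check = name_with_check[:-1], name_with_check[-1]
    try:
        return check_char(body) == check
    except ValueError:  # a character outside the alphabet
        return False


print(is_valid_ark_name("cb14473195c"))
```

A check of this kind only catches typing or transcription errors in an identifier; it says nothing about whether the record behind it has been validated by a cataloguer.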
Where to start?
We need...
● a homogeneous corpus of documents → 20th-century authors.
● an exhaustive collection of records from the legal deposit.
● a highly configurable robot which likes every kind of metadata…
DATABOT!
… and to keep it simple: no “aggregate” records!
[Diagram: Author 1, Author 2 and Author 3, with Title 1 to Title 4 and Subtitle 1]
Then, from the title clusters, generate the two faces...
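The slides do not describe how the title clusters are computed. As a minimal sketch, assuming records are grouped by author and by a normalized form of the title, clustering could look like the code below. The field names (`id`, `author_id`, `title`) and the sample records are hypothetical, and exact matching on a normalized key is a deliberately simple stand-in for whatever the actual robot does.

```python
import re
import unicodedata
from collections import defaultdict


def normalize_title(title: str) -> str:
    """Lowercase, strip accents and punctuation, collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", title)
    no_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    cleaned = re.sub(r"[^\w\s]", " ", no_accents.lower())
    return " ".join(cleaned.split())


def cluster_records(records):
    """Group record ids by (author id, normalized title)."""
    clusters = defaultdict(list)
    for rec in records:
        key = (rec["author_id"], normalize_title(rec["title"]))
        clusters[key].append(rec["id"])
    return dict(clusters)


# Hypothetical sample records, not taken from the BnF catalogue.
records = [
    {"id": "r1", "author_id": "author1", "title": "Le Jardin d'été"},
    {"id": "r2", "author_id": "author1", "title": "Le jardin d'ete."},
    {"id": "r3", "author_id": "author1", "title": "Le Pont"},
    {"id": "r4", "author_id": "author2", "title": "Le pont"},
]
clusters = cluster_records(records)
for key, ids in clusters.items():
    print(key, ids)
```

Keeping the author in the key means that identical titles by different authors never merge, which fits the slide's choice to start from authors and to leave out aggregate records.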
The interface...
...The data
...Calendar Information
● First half of 2019:
  ○ Uploading computed works to the data.bnf.fr interface
  ○ Validation process
● Second half of 2019:
  ○ Uploading computed and validated works to the catalog
  ○ Attribution of permanent URIs
Concomitantly...
Evaluating the quality of the Main Catalog metadata:
o date: content and consistency
o title: content and structure
o author: homonyms and function codes
o language
Curation of the metadata in order to improve clustering performance
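The quality checks listed above can be pictured as simple per-record rules. The following is a sketch only: the field names (`date`, `author_birth_year`, `title`, `function_code`, `language`) and the rules themselves are assumptions made for illustration, not the criteria actually used at the BnF.

```python
import re


def quality_issues(record: dict) -> list:
    """Return a list of human-readable problems found in one record."""
    issues = []

    # Date: content (a four-digit year) and consistency with the author's life.
    date = record.get("date", "")
    birth = record.get("author_birth_year")
    if not re.fullmatch(r"\d{4}", date):
        issues.append("date: not a four-digit year")
    elif birth is not None and int(date) < birth:
        issues.append("date: earlier than author's birth year")

    # Title: must carry some content.
    if not record.get("title", "").strip():
        issues.append("title: empty")

    # Author: a function code is needed to tell authors from other contributors.
    if not record.get("function_code"):
        issues.append("author: missing function code")

    # Language: must be present.
    if not record.get("language"):
        issues.append("language: missing")

    return issues


# Hypothetical record with several problems.
print(quality_issues({"date": "19..", "title": " ", "language": ""}))
```

Records flagged this way could be curated before clustering, so that incomplete dates or missing function codes do not split or merge clusters wrongly.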
After the works’ integration into the Main Catalog...
• Side projects
  o Non-textual works
  o Foreign works
  o Pre-1900 works
  o Expressions
• “Benchmarking”
  o Linking to the works computed by ABES to check the validity of newly created works at the BnF
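The slides do not say how the comparison with the ABES works would be carried out. One possible approach, sketched here under the assumption that both sets of works can be described by shared keys for their manifestations (for example ISBNs), is to look for the external work that overlaps most with each local work. All names and sample data below are hypothetical.

```python
def jaccard(a: set, b: set) -> float:
    """Size of the intersection divided by the size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def best_matches(local_works: dict, external_works: dict) -> dict:
    """For each local work, return (external id, score) of the best overlap.

    The external id is None when no external work shares any key.
    """
    result = {}
    for local_id, local_keys in local_works.items():
        best_id, best_score = None, 0.0
        for ext_id, ext_keys in external_works.items():
            score = jaccard(local_keys, ext_keys)
            if score > best_score:
                best_id, best_score = ext_id, score
        result[local_id] = (best_id, best_score)
    return result


# Hypothetical works, each described by a set of shared manifestation keys.
local = {"w1": {"a", "b", "c"}, "w2": {"x"}}
external = {"e1": {"a", "b"}, "e2": {"y"}}
print(best_matches(local, external))
```

Local works with a low best score, or with no match at all, would be the natural candidates for manual review.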
Thank you for your attention!