Post on 15-Jul-2015
Bibliographic Data Spring Cleaning with Sierra DNA
Becky Yoose
Discovery and Integrated Systems Librarian
Grinnell College
http://www.flickr.com/photos/alisonlongrigg/3641086760/
Table structure
https://secure.flickr.com/photos/37prime/750293493/
Bib
• Fixed fields
• Title? (sometimes)
• Author? (sometimes)
Generic Record
• Leader
• Control Fields
• Variable Fields
p07 p08 p09 p10
1 9 8 5
1 9 8 5
1 9 8 5
1 9 8 4
1 9 8 5
1 9 8 4
1 9 8 4
1 8 9 4
1 9 5 6
1 8 9 1
To make your life more interesting, the control_field table...
EXPLAIN ANALYZE
SELECT *
FROM sierra_view.control_field
WHERE p07='2' AND p08='0' AND p09='0' AND p10='1'
LIMIT 10
Total runtime: 123.627 ms
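Because each character position sits in its own column, a query on a four-character value needs one condition per position. The following is a minimal sketch, not part of the original slides, of a helper that builds such a WHERE clause and reassembles a row's positions into one string. The function names are hypothetical, and reading p07 through p10 as the four characters of a year is an assumption based on the sample values and the query above.

```python
def year_to_where(year: str, first_pos: int = 7) -> str:
    """Build one equality test per character position (hypothetical helper)."""
    if len(year) != 4 or not (year.isascii() and year.isdigit()):
        raise ValueError("expected a four-digit year")
    # Column names follow the pNN pattern seen in the slides: p07, p08, ...
    return " AND ".join(
        f"p{first_pos + i:02d}='{digit}'" for i, digit in enumerate(year)
    )


def row_to_year(p07: str, p08: str, p09: str, p10: str) -> str:
    """Concatenate the four single-character columns back into one string."""
    return f"{p07}{p08}{p09}{p10}"


where = year_to_where("2001")
first_sample = row_to_year("1", "9", "8", "5")  # first sample row shown above
```

Here `where` holds the same four conditions used in the query above, and `first_sample` is the first sample row joined back into a single value.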
Column Data Type Comment
record_id bigint Foreign key to record.
record_type_code char Record type code.
record_num int Record number.
varfield_id bigint Foreign key to varfield.
field_type_code char III field type tag.
marc_tag varchar MARC tag.
marc_ind1 char First MARC indicator.
marc_ind2 char Second MARC indicator.
occ_num int The occurrence number of the field among other fields with the same tag. Used when a record contains more than one field of the same type.
display_order int Integer to manage the display order of a list.
tag char Subfield tag.
content varchar Content of the subfield.
subfield_view
Column Data Type Comment
record_id bigint Foreign key to record_metadata
index_tag varchar The itag of an index string (e.g., 'a'=author, 't'=title, 'd'=subject, etc.) for an entry.
varfield_type_code varchar The tag of the variable-length field to index.
index_entry varchar The index entry string.
insert_title varchar A normalized form of the title used to sort index entries.
original_content varchar The non-normalized version of the index entry string.
parent_record_id bigint The system-generated ID of the parent of the phrase entry's source record.
phrase_entry (selected fields)
Example: Typo of the day
phrase_entry + regular expressions
Example of a one-off word surrounded by spaces
SELECT index_entry, record_key
FROM sierra_view.phrase_entry
WHERE
index_tag='t' AND
varfield_type_code='t' AND
type3='' AND
index_entry ~* '(^|\s)fom\s' AND
index_entry !~* '(^|\s)fom\ssic\s'
woodhouse 1615 a plaine almanacke or prognostication for the yeare of our lord god 1615 being the third fom leape yeare conta b1719695
countrey messenger or the faithfull foot post communicating his vveekly intelligence fom the severall parts of the kingdome a b1819566
multiple choices after school findings fom the extended service schools initiative b1439991
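The two regular-expression conditions in the query can be tried out away from the database. This is a rough Python sketch, not the code from the talk: it assumes that `re.IGNORECASE` is a fair stand-in for PostgreSQL's case-insensitive `~*` operator, and both the `is_candidate` helper and the second and third sample strings are made up for illustration.

```python
import re

# Same patterns as the SQL query above.
TYPO = re.compile(r"(^|\s)fom\s", re.IGNORECASE)
SIC = re.compile(r"(^|\s)fom\ssic\s", re.IGNORECASE)


def is_candidate(index_entry: str) -> bool:
    """True when 'fom' stands alone as a word and is not followed by 'sic'."""
    return bool(TYPO.search(index_entry)) and not SIC.search(index_entry)


samples = [
    # Taken from the query results above.
    "multiple choices after school findings fom the extended service schools initiative",
    # Hypothetical entries: a transcribed typo marked with sic, and a real word.
    "letters fom sic a traveller",
    "fomenting revolt",
]

flagged = [entry for entry in samples if is_candidate(entry)]
```

Only the first sample ends up in `flagged`. Both patterns require whitespace after the word, so an entry that ends in "fom" would be missed by this sketch, just as it would by the query.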
Which one?
http://www.flickr.com/photos/visionsbyvicky/3369136077/
$demo
https://github.com/GrinnellCollegeLibraries/typooftheday
Possibilities
• Authority headings
• Subfield misbehavior
• Series statements
• Others...
Thanks! Questions?
yoosebec@grinnell.edu @yo_bj