mary o'neill - democratizing the dictionary: the challenges and opportunities presented by...

the challenges and opportunities presented by crowdsourcing content

Upload: scottish-language-dictionaries

Post on 13-Apr-2017



In 2012, Collins launched its online dictionary site, CollinsDictionary.com, and offered registered users the opportunity to contribute in two ways:

• submit a word to the English Dictionary

• comment on a word (in any Collins dictionary online)

to solicit suggestions of new words that could become dictionary entries

to solicit comments on our current entries – or word submissions – that might inform how we revise or compile the entries

to present the user with a more inclusive experience and ultimately make the dictionary-making process more democratic

• sheer number of submissions: over 17,300 to end of March 2016, a (rough) average of 375 submissions every month

• multiple submissions by single users: monopoly, not democracy!

Quality control: submissions that are of no real benefit to Collins, such as

• words already suggested by other users

• words already included in the dictionary (curate = “to act as curator of”) or covered by existing content (toxic assets)

• users’ coinages (chatversation = “a conversation you had over a chat or messenger application”; blumpert = “that one old man who always gets drunk at the bar”)

• misuses of the facility, eg translations of English words
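Two of the quality-control problems above, words already suggested by other users and words already in the dictionary, lend themselves to an automatic first pass. The following is a minimal sketch of that idea only, not a description of the Collins system: the `Submission` record, the `normalize` helper, and the `triage` function are hypothetical names introduced for illustration, and the dictionary is assumed to be available as a plain list of headwords. Coinages and misuses of the facility are left out because they need a human judgement.

```python
from dataclasses import dataclass


@dataclass
class Submission:
    # Hypothetical record; the real submission format is not described in the talk.
    user: str
    word: str


def normalize(word: str) -> str:
    """Lower-case and collapse whitespace so trivial variants compare equal."""
    return " ".join(word.lower().split())


def triage(submissions, dictionary_headwords):
    """Split submissions into (to_review, flagged).

    flagged holds (submission, reason) pairs for words that are already in
    the dictionary or were already suggested earlier in the same batch.
    """
    known = {normalize(w) for w in dictionary_headwords}
    seen = set()
    to_review, flagged = [], []
    for sub in submissions:
        key = normalize(sub.word)
        if key in known:
            flagged.append((sub, "already in dictionary"))
        elif key in seen:
            flagged.append((sub, "already suggested"))
        else:
            seen.add(key)
            to_review.append(sub)
    return to_review, flagged
```

Calling `triage(batch, headwords)` on a day's submissions would leave a moderator with a shorter list to read, plus a list of flagged items and the reason each was flagged.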

The challenges of opening up to comments

How does a reputable dictionary ‘open up’ to the public and maintain its authority, especially given such challenges?

The challenges of managing user comments are similar to those of managing word submissions:

• number of submissions: 9,052 comments recorded to end of March 2016, a (rough) average of 196 every month

• comments that are of no benefit to Collins, for example

o humorous comments

o other misuses of the facility, such as for learning English vocabulary

The daily moderation stage allows a submitted word or comment to appear (or not) on the CollinsDictionary.com website.

A word that has been ‘approved’ by a moderator will appear on the word submissions page with the status ‘pending investigation’.

At a subsequent stage, a lexicographer carries out some basic research and assigns a new status to the submitted word: candidate, under review, or rejected.

Candidate = selected as a potential dictionary entry

Under review = a possible entry, but still further investigation needed

kitchen-sinking = “the practice of including all possible bad news in a single press release, usually for strategic purposes”

Rejected = not selected as a potential dictionary entry (because already in the dictionary, not enough evidence, etc)
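The statuses described above can be read as a small state machine. The sketch below is one way to model it, under stated assumptions: the four status names come from the talk, but the `Status` enum, the `ALLOWED` table, and the `assign_status` function are hypothetical, and the rule that an ‘under review’ word can later become a candidate or be rejected is an assumption rather than something the talk states.

```python
from enum import Enum


class Status(Enum):
    PENDING_INVESTIGATION = "pending investigation"
    CANDIDATE = "candidate"
    UNDER_REVIEW = "under review"
    REJECTED = "rejected"


# Assumed transitions: a lexicographer moves a pending word to one of the
# three statuses; an 'under review' word can later be resolved either way.
ALLOWED = {
    Status.PENDING_INVESTIGATION: {
        Status.CANDIDATE,
        Status.UNDER_REVIEW,
        Status.REJECTED,
    },
    Status.UNDER_REVIEW: {Status.CANDIDATE, Status.REJECTED},
    Status.CANDIDATE: set(),
    Status.REJECTED: set(),
}


def assign_status(current: Status, new: Status) -> Status:
    """Return the new status, or raise ValueError if the move is not allowed."""
    if new not in ALLOWED[current]:
        raise ValueError(f"cannot move from {current.value!r} to {new.value!r}")
    return new
```

Under these assumptions, `assign_status(Status.PENDING_INVESTIGATION, Status.UNDER_REVIEW)` succeeds, while trying to move a rejected word straight to candidate raises an error. A system that revisits rejected words when new evidence appears would add that transition to the table.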

By carrying out daily moderation and a regular review we can:

• manage what is visible on the site

• limit the number of submissions not yet investigated

• unearth the best candidates for future dictionary entries

• reassure the user that his or her submission is being fully considered and so encourage repeat visits

• keep the system ‘alive’ in the eyes of the user

Suggestions from the public are not immediately included in the dictionary proper. When it is time for a dictionary update, further research into candidate words and senses selected for inclusion is undertaken by experienced lexicographers at the compilation stage.

These lexicographers will also compile the finished entry.

We review, research, and compile crowdsourced content with the same rigour that we apply to words brought to our attention through in-house sources.

By carrying out regular dictionary updates sourced from user-generated content we can:

• keep the online dictionary ‘alive’ in the eyes of the user

• keep firm editorial control of what is included in the dictionaries themselves, and so retain authority

• simultaneously reassure the user that their contribution has been valued, and make him or her feel involved in the dictionary-creation process

• Contributions from users with different areas of knowledge

o kalonji = “the plant Nigella sativa whose seeds are used as a spice”

o clapotis = “reflection of a travelling surface wave”

o stegananalysis = “the analysis of cover material to identify presence of hidden information in the field of steganography”

• Contributions from users in different English-speaking territories

o cruel = “to spoil” (Australia)

o thumbsuck = “a guess or estimate” (South Africa)

o wave election = “when one political party makes major gains in the US House and US Senate” (US)

o Mollywood = “Malayalam cinema or cinema of Kerala” (India)

• Cutting-edge vocabulary

o Zika = “virus transmitted by mosquito bite”

o gene editing = “type of genetic engineering in which DNA is inserted or removed from a genome using artificially engineered nucleases”

o Brexit = “the withdrawal of the UK from the EU”

o outer = “person who believes the UK should leave the EU”

o timeshift = “to record a TV or radio programme to a storage medium to be viewed or listened to after the live broadcasting”

o air five = “a variation of "high five" where the hands of the participants never touch”

o Trumpism = “the policies of Donald Trump”

a source of neologisms for all new titles and editions of dictionaries, including monolingual dictionaries and the ‘English side’ of bilingual dictionaries

of the 800+ neologisms added to the latest edition of Collins English Dictionary (2014), nearly 600 were words and senses sourced from user-generated content

bioplay n a play based on the life of a famous person, esp one giving a popular treatment

impactful adj having a powerful effect or making a strong impression

posterize vb (tr) to humiliate (a sporting opponent) by performing a dramatic feat against them

a source of regular updates to the online English Dictionary

Brexit n the potential withdrawal of the United Kingdom from the European Union

a source of informed comment on entries and word suggestions

Identify and develop:

• refinements to the system (eg more fine-grained categories)

• more efficient methods of monitoring and revisiting words for which there was not enough evidence before

• plans to more fully and effectively utilize submissions

• ways to incentivize the best users to continue contributing

• ways to encourage the kind of submissions that are of real benefit to Collins

o words overlooked by editors or missed by other monitoring systems

o cutting-edge vocabulary

o new vocabulary from English-speaking territories outside the UK

o constructive comments

• strategies to extend the resource to non-English submissions

Meanwhile, we continue to:

• make improvements to the system

• moderate and monitor submissions

• regularly update online dictionary content

• review, research, and compile crowdsourced neologisms with the lexicographical rigour expected from a reputable dictionary publisher

• yes, if it is resourced and managed well

• yes, if it delivers the kind of content you want

• yes, if you do not rely solely on user-generated content

• yes, if the trend for sharing content continues – CollinsDictionary.com had 33,466 registered users at 31 March 2016

www.collinsdictionary.com
