Neurocognitive Approach to Creativity in the Domain of Word-invention
Maciej Pilichowski (1), Włodzisław Duch (2)
(1) Faculty of Mathematics and Computer Science, (2) Department of Informatics,
Nicolaus Copernicus University, Toruń, Poland
Contact: [email protected], Google: W.Duch
Introduction
Creativity: “the capacity to create a solution that is both novel and appropriate”.
Creative brains:
are well trained in a given domain,
have great imagination,
combine basic primitives faster,
recognize interesting combinations of these primitives through emotional and associative filtering.
Computational creativity
To understand the creative use of words, go to the lower level:
construct words from combinations of phonemes, paying attention to morphemes, inflection, etc.
Creativity = space + imagination (fluctuations) + filtering (competition)
Space: neural tissue providing space for an infinite number of activation patterns.
Imagination: many chains of phonemes activate, in parallel, representations of both words and non-words, depending on the strength of synaptic connections.
Filtering: associations, emotions, phonological/semantic density.
General idea
Start from keywords that prime phonological representations in the auditory cortex; spread the activation to strongly related concepts.
Use inhibition in a winner-takes-most scheme to avoid false associations.
Find fragments that are highly probable and estimate their phonological probability.
Combine them, search for good morphemes, and estimate the semantic probability.
Autoassociative networks
Simplest networks:
binary correlation matrix,
probabilistic, p(a_i, b_j | w).
Major issue: representation of symbols, morphemes, phonology …
[Figure: schematic of the weight matrix W, built from 3×3 blocks of x and 0 entries; some blocks are diagonal, others are full.]
Objective
Invention of new words that capture some characteristics of objects or processes.
For example: industrial or software products, the activity of companies, the main topic of web pages.
Understanding creative processes in the brain requires network simulations, but here only a formal, probabilistic model is considered.
Data
The linguistic source for the Mambo algorithm is the Google Web 1T 5-gram dictionary.
Spell-checking is based on the LRAGR and SCOWL dictionaries.
To avoid over-representation of the most common words, a logarithmic scale of word occurrences is used.
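The effect of logarithmic scaling can be illustrated with a small sketch. The counts below are invented for illustration, and the exact form of the scaling, here log(1 + count), is an assumption; the slides only state that a logarithmic scale was used.

```python
import math

# Hypothetical raw occurrence counts (illustrative values, not from Web 1T).
raw_counts = {"the": 1_000_000, "book": 10_000, "kindle": 100}

def log_weight(count: int) -> float:
    """Compress a raw count; log(1 + count) is an assumed form of the scaling."""
    return math.log1p(count)

weights = {word: log_weight(c) for word, c in raw_counts.items()}

# The raw ratio between "the" and "kindle" is 10,000 : 1;
# after scaling it shrinks to roughly 3 : 1.
ratio_raw = raw_counts["the"] / raw_counts["kindle"]
ratio_log = weights["the"] / weights["kindle"]
print(ratio_raw, round(ratio_log, 2))
```

Frequent words still rank higher after scaling, but they no longer dominate the statistics by several orders of magnitude.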
Word representation
As letters (“the” → “t”, “h”, “e”): not good for phonological filters, since the resulting words may not be easy to pronounce.
As phonemes (“the” → “ð”, “ə”): not easy, because most dictionaries do not contain phonological transcriptions.
As a semi-letter form (“the” → “th”, “e”): for English only.
As a mixed form of any of the above.
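A minimal sketch of the semi-letter form, assuming it keeps common English letter groups together as single symbols. The list of groups below is hypothetical; the slides give only the example “the” → “th”, “e”.

```python
# Assumed (hypothetical) list of English letter groups treated as one symbol.
DIGRAPHS = ("th", "sh", "ch", "ph", "wh", "ck", "qu")

def semi_letters(word: str) -> list[str]:
    """Split a word into symbols, greedily keeping known digraphs together."""
    symbols = []
    i = 0
    while i < len(word):
        pair = word[i:i + 2]
        if pair in DIGRAPHS:
            symbols.append(pair)
            i += 2
        else:
            symbols.append(word[i])
            i += 1
    return symbols

print(semi_letters("the"))      # ['th', 'e']
print(semi_letters("pocket"))   # ['p', 'o', 'ck', 'e', 't']
```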
Semantics
“Light”: does it mean “small weight” or “daylight”?
Enforcing the required association is crucial, e.g. pairing “possibilities” with “great” (positive association)
rather than with “problems” (negative association).
When the situation is ambiguous and the algorithm cannot evaluate it, the user has to select the proper set of synonyms (synset).
Similarities
Real world: “borrow” coexists with similar words such as “sorrow”, “barrow”, or “burrow”.
Artificial system: candidates too similar to existing words are rejected, to avoid transitions like
“borrow” → “borr” or “borrow” → “borrom”.
Genuineness
Examples of compound words: “bodyguard”, “brainstorm”, or “airmail”.
Compounds are forbidden to avoid hijacking of words: the priming word “jet” + “●●●mail” from the dictionary → “jetmail”.
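One simple way to detect such candidates is to test whether a word splits into two dictionary words. This is only an illustrative filter, not necessarily the check used in Mambo; the function name, the toy dictionary, and the minimum part length `min_part` are all assumptions.

```python
def is_compound(candidate: str, dictionary: set[str], min_part: int = 3) -> bool:
    """Return True if the candidate splits into two dictionary words."""
    for cut in range(min_part, len(candidate) - min_part + 1):
        if candidate[:cut] in dictionary and candidate[cut:] in dictionary:
            return True
    return False

words = {"jet", "mail", "air", "body", "guard"}  # toy dictionary (assumption)
print(is_compound("jetmail", words))    # True: "jet" + "mail"
print(is_compound("librazone", words))  # False
```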
ngrams
Function ng(w) returns a sequence of strings (ngrams):

$$ng(w) = \left( w[0 : N_{ng}],\; w[S_{ng} : S_{ng} + N_{ng}],\; w[2S_{ng} : 2S_{ng} + N_{ng}],\; \ldots,\; w[nS_{ng} : nS_{ng} + N_{ng}] \right)$$

where w[i:j] represents the string of symbols at positions i to j in the word w, and n·Sng = |w|-Nng-1.
In most cases: Nng=2, Sng=1.
Example: “world” → “wor”, “orl”, “rld”.
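The definition can be sketched in Python as follows. The parameter names `n_ng` and `s_ng` mirror Nng and Sng; note that w[i:j] in the definition includes position j, whereas a Python slice excludes its end index.

```python
def ng(w: str, n_ng: int = 2, s_ng: int = 1) -> list[str]:
    """Split w into overlapping ngrams w[k*S : k*S + N], bounds inclusive."""
    last_start = len(w) - n_ng - 1          # n * S_ng in the definition
    return [w[start:start + n_ng + 1]       # +1: Python slices exclude the end
            for start in range(0, last_start + 1, s_ng)]

print(ng("world"))  # ['wor', 'orl', 'rld']
```

With Nng=2 each ngram therefore holds three symbols, and a word shorter than three symbols yields an empty sequence in this sketch.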
Word rank
The word rank function Q'(w):

$$Q'(w) = \prod_{i=0}^{|ng(T(w))|-1} q\big(ng(T(w))[i]\big)$$

q is a dictionary function, T(w) is a composition of transformations of the word w, and ng is a function partitioning the symbols in w into overlapping ngrams.
The total word rank function is a product over models:

$$Q(w) = \prod_{k=1}^{\#\text{models}} Q'_k(w)^{W_k}$$
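A toy computation of both rank functions, under several assumptions: Q'(w) is taken as a product of q over ngrams, each model is represented as a (transformation, weight W_k) pair with the weight used as an exponent, and the q values and the floor for unseen ngrams are invented.

```python
import math

def ng(w, n_ng=2, s_ng=1):
    """Overlapping ngrams with inclusive bounds, as defined for ng(w)."""
    return [w[s:s + n_ng + 1] for s in range(0, len(w) - n_ng, s_ng)]

# Toy dictionary function q: ngram -> score in (0, 1]. Values are invented.
Q_TABLE = {"wor": 0.5, "orl": 0.2, "rld": 0.1}
UNSEEN = 0.01  # assumed floor for ngrams missing from the table

def q(ngram):
    return Q_TABLE.get(ngram, UNSEEN)

def word_rank(w, transform=lambda s: s):
    """Q'(w): product of q over the ngrams of the transformed word T(w)."""
    return math.prod(q(g) for g in ng(transform(w)))

def total_rank(w, models):
    """Q(w): product over (transform, weight) models of Q'_k(w) ** W_k."""
    return math.prod(word_rank(w, t) ** weight for t, weight in models)

models = [(lambda s: s, 1.0), (lambda s: s[::-1], 0.5)]  # neutral, mirror
print(word_rank("world"))
print(total_rank("world", models))
```

In practice a sum of logarithms would be numerically safer than a product of many small scores; the product is kept here to stay close to the formulas.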
Transformations
Transformation examples for Nng=2, Sng=1:
neutral transformation: w → w
world → world
cyclic transformation: w → w•w[0:Nng-1]
world → worldwo
mirror transformation: w → w[|w|-1]•w[|w|-2]•...•w[0]
world → dlrow
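The three transformations can be sketched as plain string functions. The function names are illustrative; as in the ngram definition, w[0:Nng-1] includes both end positions, so the cyclic case appends the first Nng symbols.

```python
def neutral(w: str) -> str:
    """Leave the word unchanged."""
    return w

def cyclic(w: str, n_ng: int = 2) -> str:
    """Append the first N_ng symbols, i.e. w[0:N_ng-1] with inclusive bounds."""
    return w + w[:n_ng]

def mirror(w: str) -> str:
    """Reverse the order of symbols."""
    return w[::-1]

print(neutral("world"), cyclic("world"), mirror("world"))
# world worldwo dlrow
```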
Data flow
[Figure: data-flow diagram with the nodes: topic, WordNet associations, dictionary, priming set, word representation, probability matrix, word rank, similarity, associations, results.]
Amazon’s Kindle: the core priming set
acquir, collect, gather
air, light$, lighter, lightest, paper, pocket, portable
anyplace, anytime, anywhere, cable, detach, global, globe, go$, went, gone, going, goes, goer, journey, move, moving, network, remote, road$, roads$, travel, wire, world
book, data, informati, knowledge, librar, memor, news, word$, words$
comfort, easi, easy, gentl, human, natural, personal
computer, electronic
discover, educat, learn, read$, reads, reading, explor
The exclusion list: aird, airin, airs, bookie, collectic, collectiv, globali, globed, papere, papering, pocketf, travelog.
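The slides do not define the notation of the priming set. The sketch below assumes that an entry ending in “$” must match a whole word, that any other entry is a prefix, and that exclusion entries are prefixes of words to be dropped. These semantics and the function name are assumptions, inferred only from the shape of the lists.

```python
def matches_priming(word, priming, exclusions):
    """Check a dictionary word against priming entries (assumed semantics)."""
    if any(word.startswith(ex) for ex in exclusions):
        return False
    for entry in priming:
        if entry.endswith("$"):
            if word == entry[:-1]:      # "$" assumed to mean end of word
                return True
        elif word.startswith(entry):    # otherwise assumed to be a prefix
            return True
    return False

priming = ["air", "light$", "book", "librar"]
exclusions = ["airs", "bookie"]
for w in ["library", "lightning", "light", "bookies", "airport"]:
    print(w, matches_priming(w, priming, exclusions))
```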
Results
Created word    Google word count    No. domains
librazone       968                  1
inforizine      -                    -
librable        188                  -
bookists        216                  -
inforld         30                   -
newsests        3                    -
memorld         78                   1
goinews         31                   -
infooks         81,200               7
More results
Created word    Google word count    No. domains
libravel        972                  -
rearnews        8                    -
informated      18,900,000           8
booktion        49                   -
inforion        7,850                61
newravel        7                    -
datnews         51,500               20
infonews        1,380,000            20
lighbooks       1                    -
journics        763                  1
Mambo system: the core priming set
articula, name
create, creating, creativ, generat, conceiv, build, make, construct, cook, formula, prepar, produc
explor, discov, new$, newer$, newest$, newly$, imagin
mean$, meanin, associat, idea$, ideas, cognitiv, think, thought, semant, connect, art$, artist, brain, mind, cogit
system$, systems$, program, automat, computer, artifici
wit$, wits$, witty$, smart, intell
word, letter, languag
The exclusion word: cookie.
Results for Mambo replacement
Created word    Google word count    No. domains
semaker         903                  9
braingene       45                   -
assocink        3                    -
thinguage       4,630                -
systemake       4                    -
newthink        8,960                46
thinknew        3,300                43
assocnew        58                   -
artistnew       1,590                1
semantion       693                  6
Computational efficiency
No priming dictionary, Nng=2, Sng=1, 100 best words, English language, requires:
Word length    Naive algorithm               Optimized alg.    Increase [%]
3              40,122                        6,682             -
4              1,083,321                     22,968            243.73
5              29,249,694                    39,559            72.24
6              789,741,765                   39,111            -1.13
7              21,323,027,682                74,616            90.78
8              575,721,747,441               95,890            28.51
9              15,544,487,180,934            19,798            -79.35
10             419,701,153,885,245           47,569            140.27
11             11,331,931,154,901,642        147,176           209.39
12             305,962,141,182,344,361       104,371           -29.08
13             huge numbers continue         132,095           26.56
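The “increase [%]” column appears to be the relative change of the optimized-algorithm figure from one word length to the next; this reading is an inference, checked below against the first rows of the table.

```python
# Optimized-algorithm figures for word lengths 3..6, taken from the table.
optimized = {3: 6_682, 4: 22_968, 5: 39_559, 6: 39_111}

def increase_percent(prev: int, curr: int) -> float:
    """Relative change from prev to curr, in percent."""
    return (curr - prev) / prev * 100

for length in (4, 5, 6):
    pct = increase_percent(optimized[length - 1], optimized[length])
    print(length, round(pct, 2))
```

The naive column, by contrast, grows by a factor of about 27 per added symbol, while the optimized figures stay within the same few orders of magnitude.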