buzz word cloud


Upload: dirk-nachbar

Post on 20-Jul-2015


Category:

Technology



TRANSCRIPT

Page 1: Buzz word cloud

BUZZ WORD CLOUD

Page 2: Buzz word cloud

Word clouds

... Have been around for a while

... Visualise a word frequency table by varying font size
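
As a rough illustration of the idea, a word frequency table can be mapped to font sizes along these lines. This is only a sketch: the function name `font_sizes`, the point range, and the linear scaling are assumptions made for the example, not something the slides specify.

```python
from collections import Counter

def font_sizes(text, min_pt=10, max_pt=60):
    """Map each word to a font size that grows linearly with its count.

    min_pt and max_pt are illustrative defaults, not values from the slides.
    """
    counts = Counter(text.lower().split())
    if not counts:
        return {}
    lo, hi = min(counts.values()), max(counts.values())
    span = (hi - lo) or 1  # avoid division by zero when all counts are equal
    return {
        word: min_pt + (count - lo) * (max_pt - min_pt) / span
        for word, count in counts.items()
    }

sizes = font_sizes("cloud data cloud big data cloud")
for word, size in sorted(sizes.items(), key=lambda item: -item[1]):
    print(word, size)
```

The most frequent word gets the largest size and the rarest gets the smallest; real word cloud tools may use a different scaling.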

Page 3: Buzz word cloud

Improvements

Use posts from blogs/fora and treat each post independently

Remove posts which replicate content (similarity > 50%)

Pick up collocations (frequent bigrams)

Remove links
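
The clean-up part of these improvements (dropping links before counting words) might look roughly like the sketch below. The pattern `URL_RE` and the helper `clean_post` are hypothetical names, and the link pattern is a simple assumption that will not catch every kind of URL.

```python
import re
import string

# Assumed link pattern: anything starting with http://, https:// or www.
URL_RE = re.compile(r"(?:https?://|www\.)\S+")

# Translation table that turns every ASCII punctuation character into a space.
PUNCT_TABLE = str.maketrans(string.punctuation, " " * len(string.punctuation))

def clean_post(post):
    """Lower-case a post, drop links, strip punctuation, and split into words."""
    text = URL_RE.sub(" ", post.lower())
    text = text.translate(PUNCT_TABLE)
    return text.split()

words = clean_post("Big Data is HERE: see http://example.com/x, now!")
print(words)
```

Links are removed before punctuation so that the dots and slashes inside a URL do not leave stray fragments behind.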

Page 4: Buzz word cloud

Process

Process in Python:

Import

Create many lists of strings

Make lower case

From list of strings create list of lists

Remove punctuation and links

For each list create word frequency

Deduplicate similar lists (score > 50%)

Generate one list

Find collocations, assign ~

Remove top 500 most frequent words

Create frequency table

Remove top 500 most frequent words

Output top 150 from frequency table

Create word cloud in Wordle http://www.wordle.net/advanced
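
The final steps of the process (frequency table, removal of very frequent words, top 150 output) could be sketched as follows. Several things here are assumptions: the names `top_terms` and `to_weighted_lines` are hypothetical, the list of frequent words is supplied by the caller because the slides do not say where the top 500 come from, and the `word:count` output format is assumed to be what the Wordle advanced page accepts.

```python
from collections import Counter

def top_terms(posts_words, common_words, n=150):
    """Count words across all posts, skip common words, return the n most frequent."""
    counts = Counter()
    for words in posts_words:
        counts.update(w for w in words if w not in common_words)
    return counts.most_common(n)

def to_weighted_lines(terms):
    """Format (word, count) pairs as one 'word:count' line each (assumed format)."""
    return "\n".join(f"{word}:{count}" for word, count in terms)

posts = [["cloud", "is", "the", "buzz"], ["the", "cloud", "buzz", "cloud"]]
common = {"the", "is"}  # stand-in for a real list of 500 frequent words
terms = top_terms(posts, common, n=150)
print(to_weighted_lines(terms))
```

Keeping the common-word list as a parameter makes it easy to swap in whichever list of frequent words is actually used.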

Page 5: Buzz word cloud

Deduplication algorithm

For each word frequency of posts:

Compare to the word frequencies already processed

Total = 0, unique = 0

For each word in word frequency 1: unique + 1

If the word is also in word frequency 2: total + min(frequencies)/max(frequencies)

For each word in word frequency 2 not in word frequency 1: unique + 1

Score = total/unique
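
One possible reading of this scoring in Python is sketched below. It assumes that "word frequency 1" is the post being checked and "word frequency 2" is a post already kept, that the loop runs over words, and that a post is dropped when its score against any kept post exceeds 50%. The names `similarity` and `deduplicate` are hypothetical.

```python
from collections import Counter

def similarity(freq1, freq2):
    """Score two word-frequency tables between 0 (disjoint) and 1 (identical)."""
    total = 0.0
    unique = 0
    for word, count1 in freq1.items():
        unique += 1
        if word in freq2:
            count2 = freq2[word]
            # Shared words contribute the ratio of the smaller to the larger count.
            total += min(count1, count2) / max(count1, count2)
    for word in freq2:
        if word not in freq1:
            unique += 1
    return total / unique if unique else 0.0

def deduplicate(posts_words, threshold=0.5):
    """Keep a post only if it scores at or below the threshold against every kept post."""
    kept = []
    for words in posts_words:
        freq = Counter(words)
        if all(similarity(freq, other) <= threshold for other in kept):
            kept.append(freq)
    return kept

a = ["cloud", "cloud", "data"]
b = ["cloud", "data", "big"]
print(similarity(Counter(a), Counter(b)))
print(len(deduplicate([a, a, b])))
```

In the example, the second copy of `a` scores 1.0 against the first and is dropped, while `b` scores exactly 0.5 and is kept because the rule is strictly "greater than 50%".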

Page 6: Buzz word cloud

Social media example

old

new

• Social media is a collocation

• Needs to be developed further
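
A minimal sketch of how a collocation such as "social media" could be picked up as a frequent bigram and joined with ~ is given below. The function name `join_collocations` and the `min_count` cut-off are assumptions; the slides only say that frequent bigrams are used, and raw counts stand in here for any more refined collocation measure.

```python
from collections import Counter

def join_collocations(words, min_count=2):
    """Replace frequent adjacent word pairs with a single 'first~second' token."""
    bigrams = Counter(zip(words, words[1:]))
    frequent = {pair for pair, count in bigrams.items() if count >= min_count}
    joined = []
    i = 0
    while i < len(words):
        pair = tuple(words[i:i + 2])
        if pair in frequent:
            joined.append("~".join(pair))
            i += 2  # both words of the pair are consumed
        else:
            joined.append(words[i])
            i += 1
    return joined

words = "social media is big and social media grows".split()
print(join_collocations(words))
```

After joining, "social~media" is counted as a single entry in the frequency table instead of as two separate words.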

Page 7: Buzz word cloud

BUZZ WORD CLOUD

Thanks

Where I work: Targetbase Claydon Heeley