YEARS OF WORLD WAR-RIORS

Supervisor Professor Qu Huamin

AUGUST 5, 2014

UROP 1000 REPORT

By Akanksha Gupta | 20154675 | HKUST

UROP 1000 VISUAL ANALYSIS OF DATA Page | 1

TABLE OF CONTENTS

ABSTRACT

VISUALISATION TECHNIQUES

(A) TAG CLOUD (WORDLE)

(B) WORD TREE (IBM MANY EYES)

DESIGN CONCEPTUALIZATION

(A) TIMELINE

(B) STORYBOARD

CONCLUSIVE COMPILATION

DATA SOURCES USED FOR VISUAL ANALYSIS

BIBLIOGRAPHY

ACKNOWLEDGEMENTS

ABSTRACT

This project investigates data visualization with interactive Java and Flash applets: word trees built with IBM Many Eyes and tag clouds built with Wordle. Two data sets were used: first, a collection of personal notes and online timelines, and second, two course books for HUMA 2589. The aim of the project is to organize the data for easier management and to make the history of the two world wars, revolving around Hitler, Stalin and Mussolini, easier to learn and more analytical for students as well as interested historians. I also discuss the visualization software I used, how it works and its basic algorithms, along with an analysis of the visualized data. To go straight to the results, please visit: https://mypresentation4urop.wordpress.com/.

VISUALISATION TECHNIQUES

There are many techniques for making data interactive, such as pie charts, histograms, tree maps, graphs, matrices, networks, scatterplots, phrase nets etc. The aim, however, was to use data from two textbooks and online sources as a visual guide for students of the HUMA 2589 course. Broadly, the data used were:

(a) Large & unstructured: text analysis should exclude relationships, trends & value comparisons.

(b) Intensive & repetitive: data needs to be trimmed; key words need to be emphasized.

(c) Chronological: the display needs to focus on data sequence, whether year-wise or chapter-wise.

Thus, two techniques matched the above requirements suitably: WORD TREE and TAG CLOUD.

(A) Tag Cloud (Wordle)

Introduction: A tag cloud arranges the words of a passage in a seemingly random layout, making maximum use of the typographical space and varying the font size of each word to represent its frequency and importance.

Creation: Each word $i$ is assigned a weight $t_i$ according to the frequency of its occurrence. Once the number of display words and the weights of the most and least frequent words ($t_{max}$, $t_{min}$) are determined, the maximum display font size ($f_{max}$) is calculated, keeping the minimum size at 1. The font size $S_i$ of each word then follows as:

$$S_i = f_{max} \cdot \frac{t_i - t_{min}}{t_{max} - t_{min}}$$
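The font-size rule above can be sketched in a few lines of Python. This is only an illustration, not Wordle's actual code: the function name `font_sizes`, the parameters `max_words` and `f_max`, and the simple regular-expression tokenizer are assumptions made for the example. Rounding and clamping the result to 1 is one possible reading of "keeping the minimum size at 1".

```python
import re
from collections import Counter


def font_sizes(text, max_words=50, f_max=48):
    """Map each of the most frequent words to a font size (hypothetical helper)."""
    # Weight t_i = number of occurrences of word i (case-insensitive).
    words = re.findall(r"[A-Za-z']+", text.lower())
    weights = dict(Counter(words).most_common(max_words))
    if not weights:
        return {}
    t_max, t_min = max(weights.values()), min(weights.values())
    span = t_max - t_min
    sizes = {}
    for word, t in weights.items():
        # S_i = f_max * (t_i - t_min) / (t_max - t_min)
        scaled = f_max * (t - t_min) / span if span else f_max
        # Assumption: clamp so that the least frequent word gets size 1.
        sizes[word] = max(1, round(scaled))
    return sizes


print(font_sizes("war war war peace peace treaty"))
# {'war': 48, 'peace': 24, 'treaty': 1}
```

With three occurrences of "war", two of "peace" and one of "treaty", the most frequent word receives the full `f_max` and the least frequent word falls back to the minimum size.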

Advantages: Wordle is the easiest way of producing visually aesthetic static tag clouds in any non-CJK (Chinese, Japanese, Korean) language, using the static Java method UnicodeBlock.of(int codePoint). Unlike Togul et al., Wordle counts the words by itself, so we need not input the weight of each word individually. It allots every word a hierarchical bounding box, which is divided recursively into progressively smaller boxes. The largest word is placed first. Each subsequent, smaller word is moved away along a spiral from the words already placed until it no longer intersects (collides with) them. To check for collisions, a caching technique is used along with spatial indexing. Caching helps ensure that the bounding boxes of the words don't overlap. Spatial indexing ensures that all the words remain within a rectangular playing field using a 'region quadtree', which recursively divides the 2D field into four smaller rectangles. When a word is placed on the field, Wordle searches for the smallest quadtree node that entirely contains the word, and adds the word to that node. This makes efficient use of space to fit in all the words without collision. Wordle lets the user change the layout, word count, color scheme, font style etc., which allows the cloud to be manipulated.
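The placement steps described above can be sketched as follows. This is a simplified illustration under several assumptions, not Wordle's implementation: each word is reduced to a single axis-aligned rectangle instead of a hierarchical bounding box, every word spirals out from the origin, collisions are checked against all placed words without caching, and the names `Rect`, `place_words` and `smallest_enclosing_node` are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    x: float
    y: float
    w: float
    h: float

    def intersects(self, other):
        # Axis-aligned boxes collide only if they overlap on both axes.
        return (self.x < other.x + other.w and other.x < self.x + self.w and
                self.y < other.y + other.h and other.y < self.y + self.h)

    def contains(self, other):
        return (self.x <= other.x and self.y <= other.y and
                other.x + other.w <= self.x + self.w and
                other.y + other.h <= self.y + self.h)


def place_words(sizes, step=0.5, max_steps=20000):
    """Place (width, height) boxes, largest first, along an outward spiral."""
    placed = []
    for w, h in sizes:
        for k in range(max_steps):
            theta = step * k
            radius = step * theta  # Archimedean spiral: radius grows with angle
            cx, cy = radius * math.cos(theta), radius * math.sin(theta)
            candidate = Rect(cx - w / 2, cy - h / 2, w, h)
            if not any(candidate.intersects(p) for p in placed):
                placed.append(candidate)
                break
        else:
            raise RuntimeError("no free position found")
    return placed


def smallest_enclosing_node(field, word, max_depth=8):
    """Descend a region quadtree to the smallest node that fully contains word."""
    node = field
    for _ in range(max_depth):
        hw, hh = node.w / 2, node.h / 2
        quadrants = [Rect(node.x + dx, node.y + dy, hw, hh)
                     for dx in (0, hw) for dy in (0, hh)]
        child = next((q for q in quadrants if q.contains(word)), None)
        if child is None:
            return node  # the word straddles a boundary, so it stays here
        node = child
    return node


boxes = place_words([(10, 4), (6, 3), (6, 3), (4, 2)])
node = smallest_enclosing_node(Rect(0, 0, 100, 100), Rect(10, 10, 5, 5))
```

In a fuller version, a collision test would only need to look at the words stored in the quadtree nodes that overlap the candidate, which is where the saving over the all-pairs check in `place_words` would come from.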

Usage: Tag clouds are used to get the gist of stories, find keywords in articles, check for repetition in essays, learn vocabulary, make posters etc. I used them as a storyboard to get a chapter-wise summary and to emphasize the keywords. I also used them to aid the <search> in the word tree timeline.

Overcoming Disadvantages: A tag cloud may mislead a person who has no prior knowledge of the topic by emphasizing less important words and misconveying the gist. This happens if the writer of the passage has used the keywords too rarely, or is not a skilled writer and so is prone to repetition. To avoid this, for the first two series of clouds, I used the latest editions of history books by accomplished authors, and for the second cloud, I used Wordle's tools to delete words like “Russia”, “Hitler”, “Italy” etc., as they had been used too many times. There is scope for improvement in future:

Color scheme: Wordle should have meaningful color schemes, because some colors (red, green etc.) grab more attention than others (blue, yellow etc.).

Word Form: words like “words”, “Word” & “Words” should share a single representation (“word”).

Cloud shape: Wordle should let users set the shape of the cloud.
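The "Word Form" suggestion could be prototyped before the text is handed to a cloud generator. The sketch below is an assumption-laden illustration, not a Wordle feature: `normalise` and `count_word_forms` are hypothetical names, and the plural rule is deliberately naive (it would mangle a word such as "crisis"), so a real stemmer or lemmatizer would be needed for serious use.

```python
import re
from collections import Counter


def normalise(word):
    """Fold case and strip a simple plural 's' (naive, illustrative rule)."""
    w = word.lower()
    if len(w) > 3 and w.endswith("s") and not w.endswith("ss"):
        w = w[:-1]
    return w


def count_word_forms(text):
    """Count words after merging case and simple plural variants."""
    return Counter(normalise(w) for w in re.findall(r"[A-Za-z]+", text))


print(count_word_forms("words Word Words word"))
# Counter({'word': 4})
```

All four variants are counted under the single form "word", so the cloud would show one large entry instead of several smaller ones.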

As an example, I have generated a Wordle tag cloud for this entire report.

(B) Word Tree (IBM Many Eyes):

Introduction: A word tree is a tool that lets the user search for selected, relevant data. The data that is entered is pruned into a visual concordance, i.e. it branches out from a search word.

Creation: A word tree begins as a blank slate, i.e. the stored data is not visualised until the user gives the command <search word>. Once the search command is issued, the tree looks for all the data either before or after the word. It then runs a finite word loop which crops the data at the terminating point or line break. Finally, it displays the branches of the word, with the font size of each branch proportional to its frequency (i.e. the number of its sub-branches).
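The creation steps above can be sketched as a small data structure. This is not the IBM Many Eyes implementation: the function name `build_word_tree`, the nested-dictionary layout and the tokenizer are assumptions for illustration, and only the text following the search word is collected, with each line break ending a branch.

```python
import re


def build_word_tree(text, search_word):
    """Return a nested dict {"count": n, "children": {word: subtree}}."""
    root = {"count": 0, "children": {}}
    target = search_word.lower()
    for line in text.splitlines():  # a line break terminates every branch
        tokens = re.findall(r"[\w']+", line)
        for i, token in enumerate(tokens):
            if token.lower() != target:
                continue
            root["count"] += 1
            node = root
            # Walk the words after the match, sharing common prefixes.
            for following in tokens[i + 1:]:
                child = node["children"].setdefault(
                    following.lower(), {"count": 0, "children": {}})
                child["count"] += 1
                node = child
    return root


notes = "Hitler rose to power\nHitler rose quickly\nStalin feared Hitler"
tree = build_word_tree(notes, "hitler")
print(tree["count"], tree["children"]["rose"]["count"])
# 3 2
```

The `count` stored on each node is the frequency that a renderer could map to the branch's font size.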

Advantages: IBM Many Eyes is one of the best tools for creating word trees, as it allows easy manipulation & navigation of data. For instance, one can zoom into a particular branch, view it from the front or the end, and go backward and forward through the search.

Reason: Often while studying online, I press “CTRL + F” to find relevant data. This gave me an idea: what if I could save time and get all the important data at once, without having to scroll through a mountain of text? So, I decided to experiment with word trees, especially to study the HUMA 2589 history books.

Usage: Word trees are popularly used to analyse political speeches and newspaper or research articles, to teach students etc. I used one for a chronological comparison of three countries and their leaders during the world wars, for both learning & teaching purposes.

Overcoming Disadvantages: A user with no prior knowledge of the data will not know which search word would offer meaningful results in an IBM Many Eyes word tree. So, I made a tag cloud from the data of the word tree, because the tag cloud gives an idea of which search word(s) would yield the necessary results. There is the following scope for improvement:

Search: should also generate a cloud with the tree, such that the cloud words act as search words.

Pruning: should treat periods (.) & line-breaks (“\n”) as the end (break;) of a branch (word loop).
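The pruning rule proposed above could look like the following sketch. The helper name `prune_branch` is hypothetical and the rule is intentionally simple: it cuts at the first period or line break, so a period inside an abbreviation or a decimal number would also end the branch.

```python
import re


def prune_branch(context):
    """Keep only the text up to the first period or line break."""
    return re.split(r"[.\n]", context, maxsplit=1)[0].strip()


print(prune_branch("rose to power. Later he fell\nend"))
# rose to power
```

Applied to the text that follows a search word, this keeps each branch within a single sentence instead of letting it run on into the next one.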

I have generated a word tree for the entire report (also see the example in the previous topic).

DESIGN CONCEPTUALIZATION

(A) Timeline

I merged the notes obtained from the HUMA textbooks and online sources to create a database for a tag cloud that shows the years (e.g. the circled ones) one should enter in the word tree below (which I also created) for optimum results:

After typing one of the years from the Wordle cloud:

After clicking on a branch:

(B) Storyboard

The sequence of chapter clouds is accompanied by its interpretation (words in bold are taken from the cloud):

CASE I (with data analysis)

Introduction
Russia's political climate was revolutionary (aided by millions of peasants) during the reign of Tsar Nicholas due to the economic crisis. This led to the creation of the Duma (elected parliament). The regime was still perceived as repressive and autocratic.

1917: Revolutions of Feb & Oct
In Petrograd in 1917, there were revolutions in Feb and Oct >> Bolsheviks seized power from the bourgeoisie via a coup >> dual power b/w soviets & provisional gov. + Kornilov-Kerensky putsch, democratic western influence, constituent assembly & coalitions, and Marxism.

The Civil War
The Bolsheviks' Red Army (workers; proletariat support) won the 1918 Civil War (post Brest-Litovsk peace treaty) >> economic standstill, war communism and the Bolsheviks in power. Lenin headed them like a dictator & believed in national self-determination (anti-capitalist).

NEP & the Future Of Revolution
Kronstadt & Tambov Uprisings - Bolsheviks & workers part ways. Communists & revolutionaries fail. NEP: agriculture (requisitioning replaced with tax in kind), private trade revived, currency stabilised, industrial development. Moscow, the capital, revived. Power struggle after Lenin's death.

Stalin's Revolution
Initially, leadership divided b/w Stalin & the Right. Later, Stalin's five-year plan -- war-time policies (execution by GPU on suspicion of anti-soviet conspiracy or party opposition, industrialisation, collectivisation). Trotsky (exiled) opposed foreign policies; bourgeois intelligentsia & kulaks killed.

Ending the Revolution
Brinton's analogy of the Russian Revolution to a fever virus >> 3 stages from rev. to post rev. Rev. victory promised a return to normalcy & convalescence, but would relapse into revolution again. Stalin's regime -- total employment, NEP, collectivisation; but also Great Purges until the very end.

CASE II (very brief analysis)

Fascism definition; Italian movement origin; political, radical; anti-conservatism, socialism & authoritarianism; militarised movement

Cultural Transformation of the Fin de siècle (19th century): new European society - growth, expansion, modern doctrines. Racism, violence, rationalism, anti-Semitism, Marxism

Radical & Authoritarian Nationalism in the Late 19th Century: Italy, France, Germany - new revolutionary, radical, anti-liberal movements; nationalism within socialism

Impact of WW1: militarisation, dictatorships, genocides, political fragility, Germany defeated, onset of revolutions and rise in nationalism & violence.

Rise of Italian Nationalism: fascismo, radical political revolutionary mvmt; violent, nationalistic, socialist; rise of Mussolini - march on Rome

Nonfascist Authoritarianism in S. & E. Europe: Spain, Romania, Bulgaria, Greece, Hungary, Portugal, Poland: militarist authoritarian regime

German National Socialism: Nazi Party (NSDAP) under Hitler rose to power after the failure of the Weimar Republic. Rise of Hitler, Third Reich, econ. growth, social revolution.

Transformation of Italian Fascism: Mussolini dictator; growing economic autarchy; Hitler's anti-Semitic influence; invasions into Ethiopia, Bologna etc.

4 Major Variants, Fascism: German & Austrian, Spanish, Hungarian, Romanian; authoritarian, socialist, nationalist, political radicalist

Minor Mvmts: French, Greek, Portuguese, British, German -- protofascist, socialist, catholic -- against authoritarian regime

Fascism Outside Europe: Japan (militarist imperialism & nationalism), Argentina & other Latin / South American countries (Peronism)

WW2: Destruction Of Fascism: Germany aggressor; France, Britain, Italy, USSR - unions, military alliances -- administration nonfascist

Interpretations of Fascism: different approach by Germany, Italy etc.; political movement against all (capitalism, totalitarianism, authoritarianism etc.)

Generic Fascism? Nationalist yet socialist political mvmts of small radical groups all over the country. First Mussolini, then Hitler inspired too.

Fascism & Modernization: economic growth, Nazism, racism, socialist nationalist revolutionary mvmts, modern technology in industry & military

Retrodictive fascist theory: Italy, Germany, Austria, Hungary, Romania, Spain, Croatia; crisis (democratic), intense nationalism, racism yet secularization, militarism

CONCLUSIVE COMPILATION

I believe visualization helps people perceive & analyze massive, elaborate databases efficiently, and there is always scope for further improvement in the existing techniques and a need for newer ones. I have placed all my designs together at https://mypresentation4urop.wordpress.com/ for easy access and organisation. For the timeline design, I have embedded the word tree (image) and the Wordle cloud on the webpage. For the storyboard design, I have embedded PDF links to the SmartArt-sequenced tag clouds. I have also uploaded an audio-visual presentation of my journey through the project up to its completion.

DATA SOURCES USED FOR VISUAL ANALYSIS

Adolf Hitler Timeline. (n.d.). Retrieved from World History Project: http://worldhistoryproject.org/topics/adolf-hitler

Benito Mussolini. (n.d.). Retrieved from Skepticism.org: http://skepticism.org/timelines/tag/people/benito-mussolini/order:year/criteria:4/tmpl_suffix:_table/

Benito Mussolini (1883-1945). (n.d.). Retrieved from The History Mole: http://www.historymole.com/cgi-bin/main/results.pl?theme=10025525

Fitzpatrick, S. (1979). The Russian Revolution (3 ed.). (M. Cotton, Ed.) SPI Publisher Services, Pondicherry, India.

Hitler's rise and fall: Timeline. (2005, April). Retrieved from OpenLearn - Open University: http://www.open.edu/openlearn/history-the-arts/history/hitlers-rise-and-fall-timeline

Payne, S. G. (1995). A History of Fascism, 1914-45.

Rosenberg, J. (n.d.). Adolf Hitler. Retrieved from About.com: http://history1900s.about.com/library/holocaust/blhitler.htm

Rosenberg, J. (n.d.). Russian Revolution Timeline. Retrieved from About.com: http://history1900s.about.com/od/Russian-Revolution/a/Russian-Revolution-Timeline.htm

Russian Revolution Timeline. (n.d.). Retrieved from SoftSchools: http://www.softschools.com/timelines/russian_revolution_timeline/70/

World War II in Europe Timeline. (n.d.). Retrieved from The History Place: http://www.historyplace.com/worldwar2/timeline/ww2time.htm

BIBLIOGRAPHY

Feinberg, J. (2010). Chapter Three, Wordle. In J. Steele, & N. Iliinsky, Beautiful Visualization (p. 418). O'Reilly. Retrieved from http://static.mrfeinberg.com/bv_ch03.pdf

Feinberg, J. (2013). Retrieved from Wordle, Beautiful Word Clouds: http://www.wordle.net/

Gupta, A. (2014, July). Retrieved from Years Of World War-riors: https://mypresentation4urop.wordpress.com/

Gupta, A. (2014, July). Years Of World War-riors. Retrieved from IBM Many Eyes: http://www-958.ibm.com/software/data/cognos/manyeyes/visualizations/life-lies-of-4-world-war-riors

Henderson, S., & Evergreen, S. (2014, January 13). Word Tree. Retrieved from BetterEvaluation: http://betterevaluation.org/evaluation-options/wordtree

IBM research & corpus software groups. (n.d.). Word Tree. Retrieved from IBM Many Eyes: http://www-958.ibm.com/software/data/cognos/manyeyes/page/Word_Tree.html

Qu, H., Cui, W., Wu, Y., Liu, S., Wei, F., & Zhou, M. X. (2010). Context-Preserving, Dynamic Word Cloud Visualization. Pacific Visualization. Retrieved from http://www.cse.ust.hk/~huamin/cui_cga10.pdf

(n.d.). Visualization of Student Activity Patterns within Intelligent Tutoring Systems. Retrieved from http://centerforknowledgecommunication.com/newPubs/PatternsITSJan23CC.pdf

ACKNOWLEDGEMENTS

Most of the conceptual understanding and the choice of visualization software came from the suggestions of Professor Qu Huamin and PhD students SHI Conglei and XU Panpan.