how and why study big cultural data v2
DESCRIPTION
Visualizing large image and video collections: techniques, examples, theory. More info: softwarestudies.com
TRANSCRIPT
![Page 1: How and why study big cultural data v2](https://reader035.vdocuments.site/reader035/viewer/2022062418/55617e7bd8b42a98268b5224/html5/thumbnails/1.jpg)
softwarestudies.com
How and why study big visual cultural data
Dr. Lev Manovich
Professor, CUNY Graduate Center
Fall 2012 version
![Page 2: How and why study big cultural data v2](https://reader035.vdocuments.site/reader035/viewer/2022062418/55617e7bd8b42a98268b5224/html5/thumbnails/2.jpg)
![Page 3: How and why study big cultural data v2](https://reader035.vdocuments.site/reader035/viewer/2022062418/55617e7bd8b42a98268b5224/html5/thumbnails/3.jpg)
Software Studies Initiative - 2007
NEH Office for Digital Humanities - 2008
NEH Humanities High Performance Computing - 2008
NEH/NSF Digging Into Data competition - 2009
Computational Social Science - 2009
Culturomics and Google Ngram Viewer - 2010
New York Times: “The next big idea in language, history and the arts? Data.” - 2010
![Page 4: How and why study big cultural data v2](https://reader035.vdocuments.site/reader035/viewer/2022062418/55617e7bd8b42a98268b5224/html5/thumbnails/4.jpg)
How can we take advantage of unprecedented amounts of cultural data available on the web and digitized cultural heritage to begin analyzing cultural processes in new ways?
How can computational analysis of massive cultural datasets and real-time flows help us develop theories and methods in the humanities adequate for the scale and speed of 21st-century global networked digital culture?
![Page 5: How and why study big cultural data v2](https://reader035.vdocuments.site/reader035/viewer/2022062418/55617e7bd8b42a98268b5224/html5/thumbnails/5.jpg)
NEH/NSF Digging into Data competition (2009):
“How does the notion of scale affect humanities and social science research? Now that scholars have access to huge repositories of digitized data—far more than they could read in a lifetime—what does that mean for research?”
![Page 6: How and why study big cultural data v2](https://reader035.vdocuments.site/reader035/viewer/2022062418/55617e7bd8b42a98268b5224/html5/thumbnails/6.jpg)
Why study big cultural data?
![Page 7: How and why study big cultural data v2](https://reader035.vdocuments.site/reader035/viewer/2022062418/55617e7bd8b42a98268b5224/html5/thumbnails/7.jpg)
1 study societies through the social media traces (social computing)
2 more inclusive understanding of cultural history and present (using much larger samples)
3 detect large scale cultural patterns
![Page 8: How and why study big cultural data v2](https://reader035.vdocuments.site/reader035/viewer/2022062418/55617e7bd8b42a98268b5224/html5/thumbnails/8.jpg)
4 generate multiple maps of the same cultural data sets (multiple “landscapes”)
5 the best way to follow global professionally produced digital culture; understand newly developed cultural fields (“X” design)
6 map cultural variability and diversity
![Page 9: How and why study big cultural data v2](https://reader035.vdocuments.site/reader035/viewer/2022062418/55617e7bd8b42a98268b5224/html5/thumbnails/9.jpg)
![Page 10: How and why study big cultural data v2](https://reader035.vdocuments.site/reader035/viewer/2022062418/55617e7bd8b42a98268b5224/html5/thumbnails/10.jpg)
Example - graph from Ted Underwood, “The Differentiation of Literary and nonliterary diction, 1700-1900.” Data: 3,724 18th century volumes, using 10,000 most frequent words (excluding proper nouns).
![Page 11: How and why study big cultural data v2](https://reader035.vdocuments.site/reader035/viewer/2022062418/55617e7bd8b42a98268b5224/html5/thumbnails/11.jpg)
modern (19th-20th centuries) social and cultural theory: describe what is similar (classes, structures, types) / statistics (reduction)
computational humanities and social science should focus on describing what is different / variability / diversity
“from data to knowledge” is wrong. In the study of culture, we need to go from our (incomplete, biased) knowledge to actual cultural data
![Page 12: How and why study big cultural data v2](https://reader035.vdocuments.site/reader035/viewer/2022062418/55617e7bd8b42a98268b5224/html5/thumbnails/12.jpg)
“We are no longer interested in the conformity of an individual to an ideal type; we are now interested in the relation of an individual to the other individuals with which it interacts... Relations will be more important than categories; functions, which are variable, will be more important than purposes; transitions will be more important than boundaries; sequences will be more important than hierarchies.”
Louis Menand on Darwin, 2001.
![Page 13: How and why study big cultural data v2](https://reader035.vdocuments.site/reader035/viewer/2022062418/55617e7bd8b42a98268b5224/html5/thumbnails/13.jpg)
Visualization: Thinking without “large” categories
![Page 14: How and why study big cultural data v2](https://reader035.vdocuments.site/reader035/viewer/2022062418/55617e7bd8b42a98268b5224/html5/thumbnails/14.jpg)
Manuel De Landa: “The ontological status of assemblages, large and small, is always that of unique, singular individuals.”
“Unlike taxonomic essentialism in which genus, species and individuals are separate ontological categories, the ontology of assemblages is flat since it contains nothing but differently scaled individual singularities.”
source: A New Philosophy of Society.
![Page 15: How and why study big cultural data v2](https://reader035.vdocuments.site/reader035/viewer/2022062418/55617e7bd8b42a98268b5224/html5/thumbnails/15.jpg)
Bruno Latour: “The ‘whole’ is now nothing more than a provisional visualization which can be modified and reversed at will, by moving back to the individual components, and then looking for yet other tools to regroup the same elements into alternative assemblages.”
source: “Tarde’s idea of quantification.” In The Social After Gabriel Tarde: Debates and Assessments.
![Page 16: How and why study big cultural data v2](https://reader035.vdocuments.site/reader035/viewer/2022062418/55617e7bd8b42a98268b5224/html5/thumbnails/16.jpg)
How to study big cultural visual data in practice?
How to explore massive visual collections (exploratory media analysis)?
Which data analysis and visualization techniques are appropriate for non-technical users? How to democratize data analysis?
![Page 17: How and why study big cultural data v2](https://reader035.vdocuments.site/reader035/viewer/2022062418/55617e7bd8b42a98268b5224/html5/thumbnails/17.jpg)
Our methodology: media visualization
display complete collection sorted using metadata and/or extracted features
![Page 18: How and why study big cultural data v2](https://reader035.vdocuments.site/reader035/viewer/2022062418/55617e7bd8b42a98268b5224/html5/thumbnails/18.jpg)
infovis: data into pictures
mediavis: pictures into pictures
![Page 19: How and why study big cultural data v2](https://reader035.vdocuments.site/reader035/viewer/2022062418/55617e7bd8b42a98268b5224/html5/thumbnails/19.jpg)
left: scatter plot
right: media visualization (image plot) of the same data
![Page 20: How and why study big cultural data v2](https://reader035.vdocuments.site/reader035/viewer/2022062418/55617e7bd8b42a98268b5224/html5/thumbnails/20.jpg)
our media visualization software on 287 megapixel display (image: 1 million manga pages)
![Page 21: How and why study big cultural data v2](https://reader035.vdocuments.site/reader035/viewer/2022062418/55617e7bd8b42a98268b5224/html5/thumbnails/21.jpg)
our media visualization software on newer display wall with thin bezels (data: 4535 Time magazine covers)
![Page 22: How and why study big cultural data v2](https://reader035.vdocuments.site/reader035/viewer/2022062418/55617e7bd8b42a98268b5224/html5/thumbnails/22.jpg)
mediavis - related research:
M. Worring, G.P. Nguyen. Interactive access to large image collections using similarity-based visualization. Journal of Visual Languages and Computing 19 (2008) (submitted 2005).
Gerald Schaefer. Interactive Browsing of Image Repositories. ICVG 2012.
Jing et al., Google Inc. Google Image Swirl: A Large-Scale Content-Based Image Visualization System. WWW 2012.
![Page 23: How and why study big cultural data v2](https://reader035.vdocuments.site/reader035/viewer/2022062418/55617e7bd8b42a98268b5224/html5/thumbnails/23.jpg)
mediavis vs. normal computer science approach:
borrow techniques from media art, digital art, information visualization / for non-technical users
explore the possibilities of simplest techniques by using them with media collections from every area of humanities
use mediavis to challenge existing concepts and assumptions of humanities
![Page 24: How and why study big cultural data v2](https://reader035.vdocuments.site/reader035/viewer/2022062418/55617e7bd8b42a98268b5224/html5/thumbnails/24.jpg)
![Page 25: How and why study big cultural data v2](https://reader035.vdocuments.site/reader035/viewer/2022062418/55617e7bd8b42a98268b5224/html5/thumbnails/25.jpg)
Basic media visualization techniques:
1 montage: sort images using metadata
2 slice: sample images and arrange using metadata
3 image plot: automatically measure image properties (features) and organize in 2D using these measurements and metadata
![Page 26: How and why study big cultural data v2](https://reader035.vdocuments.site/reader035/viewer/2022062418/55617e7bd8b42a98268b5224/html5/thumbnails/26.jpg)
1 montage: sort images using metadata
4535 Time covers, 1923-2009
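A minimal sketch of the layout step behind a montage, assuming each image is described by a record with a file name and a date. The function name `montage_layout`, the record fields, the tile size, and the sample records are illustrative assumptions, not part of the presented software. The sketch only computes where each thumbnail goes in a row-major grid; pasting the actual pixels would be done with an image library.

```python
from datetime import date


def montage_layout(items, columns, tile_w, tile_h):
    """Return (name, x, y) placements for a row-major grid sorted by date."""
    ordered = sorted(items, key=lambda item: item["date"])
    placements = []
    for index, item in enumerate(ordered):
        row, col = divmod(index, columns)
        placements.append((item["name"], col * tile_w, row * tile_h))
    return placements


# Hypothetical sample records (file names and dates are made up).
covers = [
    {"name": "cover_c.jpg", "date": date(1950, 6, 5)},
    {"name": "cover_a.jpg", "date": date(1923, 3, 3)},
    {"name": "cover_b.jpg", "date": date(1930, 1, 6)},
]
layout = montage_layout(covers, columns=2, tile_w=100, tile_h=130)
```

Sorting on a different metadata field (for example, publication region) only requires changing the sort key.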
![Page 27: How and why study big cultural data v2](https://reader035.vdocuments.site/reader035/viewer/2022062418/55617e7bd8b42a98268b5224/html5/thumbnails/27.jpg)
1 montage close up: Time magazine covers, 1920s
![Page 28: How and why study big cultural data v2](https://reader035.vdocuments.site/reader035/viewer/2022062418/55617e7bd8b42a98268b5224/html5/thumbnails/28.jpg)
1 montage close up: Time magazine covers, 1990s-2000s
![Page 29: How and why study big cultural data v2](https://reader035.vdocuments.site/reader035/viewer/2022062418/55617e7bd8b42a98268b5224/html5/thumbnails/29.jpg)
2 slice: sample images and arrange using metadata
4535 Time covers, 1923-2009. Each line is a vertical slice through the center of an image.
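The slice technique can be sketched as follows, assuming every image is a list of pixel rows of equal height and that each image comes with a sortable date. The names `center_column` and `slice_visualization` and the tiny sample images are hypothetical; a real pipeline would read pixel data with an image library.

```python
def center_column(image):
    """Return the one-pixel-wide vertical slice through the image center."""
    width = len(image[0])
    return [row[width // 2] for row in image]


def slice_visualization(dated_images):
    """Place center columns side by side, ordered by date.

    `dated_images` is a list of (date, image) pairs; all images are
    assumed to have the same height.
    """
    ordered = sorted(dated_images, key=lambda pair: pair[0])
    columns = [center_column(image) for _, image in ordered]
    # Transpose so that each slice becomes one column of the result.
    return [list(row) for row in zip(*columns)]


# Two made-up 2x3 "images" with numeric pixel values.
image_1923 = [[1, 2, 3], [4, 5, 6]]
image_1950 = [[7, 8, 9], [10, 11, 12]]
result = slice_visualization([(1950, image_1950), (1923, image_1923)])
```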
![Page 30: How and why study big cultural data v2](https://reader035.vdocuments.site/reader035/viewer/2022062418/55617e7bd8b42a98268b5224/html5/thumbnails/30.jpg)
Time covers slice close-up
![Page 31: How and why study big cultural data v2](https://reader035.vdocuments.site/reader035/viewer/2022062418/55617e7bd8b42a98268b5224/html5/thumbnails/31.jpg)
3 image plot: organize images using features and (optionally) metadata
Image plots of 4535 Time covers, 1923-2009. X-axis = date; Y-axis = saturation mean.
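As a rough illustration of how the coordinates of such an image plot could be derived, the sketch below computes the mean HSV saturation of an image and pairs it with a date. It assumes pixels are (r, g, b) tuples in the range 0..255; the function names, record fields, and sample data are assumptions made for this example, and the presented software may define saturation differently.

```python
import colorsys


def saturation_mean(pixels):
    """Mean HSV saturation (0..1) over (r, g, b) pixels in 0..255."""
    total = 0.0
    count = 0
    for r, g, b in pixels:
        _, s, _ = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
        total += s
        count += 1
    return total / count


def image_plot_points(collection):
    """Return (x, y, name) per image: x = year (metadata), y = saturation mean (feature)."""
    return [
        (item["year"], saturation_mean(item["pixels"]), item["name"])
        for item in collection
    ]


# Hypothetical two-pixel image: one pure red pixel, one gray pixel.
sample = [{"name": "cover.jpg", "year": 1965,
           "pixels": [(255, 0, 0), (128, 128, 128)]}]
points = image_plot_points(sample)
```

Each thumbnail is then drawn at its (x, y) position instead of a dot, which is what distinguishes an image plot from a scatter plot.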
![Page 32: How and why study big cultural data v2](https://reader035.vdocuments.site/reader035/viewer/2022062418/55617e7bd8b42a98268b5224/html5/thumbnails/32.jpg)
Time covers image plot close-up
![Page 33: How and why study big cultural data v2](https://reader035.vdocuments.site/reader035/viewer/2022062418/55617e7bd8b42a98268b5224/html5/thumbnails/33.jpg)
Comparing a number of image sets with image plots
Selected paintings by six impressionist artists. X-axis = mean saturation. Y-axis = median hue. Megan O’Rourke, 2012.
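One possible way to prepare several image sets for comparison on the same axes is sketched below: compute mean saturation and median hue per painting, then group the points by artist. The artist labels, field names, and pixel data are hypothetical. Hue is a circular quantity, so a plain median is a simplification, and gray pixels are reported by `colorsys` with hue 0.

```python
import colorsys
from statistics import mean, median


def saturation_and_hue(pixels):
    """Return (mean saturation, median hue), both in 0..1, for (r, g, b) pixels in 0..255."""
    hsv = [colorsys.rgb_to_hsv(r / 255, g / 255, b / 255) for r, g, b in pixels]
    return mean(s for _, s, _ in hsv), median(h for h, _, _ in hsv)


def points_by_artist(paintings):
    """Group (x, y) plot coordinates by artist so the sets can be compared."""
    groups = {}
    for painting in paintings:
        x, y = saturation_and_hue(painting["pixels"])
        groups.setdefault(painting["artist"], []).append((x, y))
    return groups


# Made-up data: one three-pixel "painting" per hypothetical artist label.
paintings = [
    {"artist": "artist_a", "pixels": [(255, 0, 0), (0, 255, 0), (0, 0, 255)]},
    {"artist": "artist_b", "pixels": [(128, 128, 128), (128, 128, 128), (128, 128, 128)]},
]
groups = points_by_artist(paintings)
```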
![Page 34: How and why study big cultural data v2](https://reader035.vdocuments.site/reader035/viewer/2022062418/55617e7bd8b42a98268b5224/html5/thumbnails/34.jpg)
![Page 35: How and why study big cultural data v2](https://reader035.vdocuments.site/reader035/viewer/2022062418/55617e7bd8b42a98268b5224/html5/thumbnails/35.jpg)
visualizing video collections:
use media visualization with a set of keyframes
automatic selection of key frames (for example, using free shot detection software)
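The slide does not name the shot detection software it refers to. As a naive stand-in, the sketch below marks a new shot wherever the mean absolute difference between consecutive grayscale frames exceeds a threshold, and keeps the first frame of each shot as its keyframe. The function names, the threshold value, and the sample frames are assumptions; real shot detectors are considerably more robust.

```python
def mean_abs_difference(frame_a, frame_b):
    """Mean absolute difference between two equal-sized grayscale frames (flat lists)."""
    return sum(abs(a - b) for a, b in zip(frame_a, frame_b)) / len(frame_a)


def first_frames_of_shots(frames, threshold=30.0):
    """Return indices of the first frame of each detected shot."""
    if not frames:
        return []
    keyframes = [0]
    for i in range(1, len(frames)):
        # A large jump between neighboring frames is treated as a cut.
        if mean_abs_difference(frames[i - 1], frames[i]) > threshold:
            keyframes.append(i)
    return keyframes


# Five made-up two-pixel frames with cuts before frame 2 and frame 4.
frames = [[10, 10], [12, 11], [200, 210], [201, 209], [50, 40]]
keyframe_indices = first_frames_of_shots(frames)
```

The selected keyframes can then be passed to any of the media visualization techniques in place of a still image collection.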
![Page 36: How and why study big cultural data v2](https://reader035.vdocuments.site/reader035/viewer/2022062418/55617e7bd8b42a98268b5224/html5/thumbnails/36.jpg)
Kingdom Hearts video game: 62.5 hr. of game play, 29 sessions over 20 days.
montage: 1 frame per 3 sec (22500 frames in total)
![Page 37: How and why study big cultural data v2](https://reader035.vdocuments.site/reader035/viewer/2022062418/55617e7bd8b42a98268b5224/html5/thumbnails/37.jpg)
![Page 38: How and why study big cultural data v2](https://reader035.vdocuments.site/reader035/viewer/2022062418/55617e7bd8b42a98268b5224/html5/thumbnails/38.jpg)
![Page 39: How and why study big cultural data v2](https://reader035.vdocuments.site/reader035/viewer/2022062418/55617e7bd8b42a98268b5224/html5/thumbnails/39.jpg)
11th Year (Dziga Vertov, 1928): first frame of every shot
![Page 40: How and why study big cultural data v2](https://reader035.vdocuments.site/reader035/viewer/2022062418/55617e7bd8b42a98268b5224/html5/thumbnails/40.jpg)
11th Year (Dziga Vertov, 1928): comparing first and last frame in every shot (close-ups from the larger visualization)
![Page 41: How and why study big cultural data v2](https://reader035.vdocuments.site/reader035/viewer/2022062418/55617e7bd8b42a98268b5224/html5/thumbnails/41.jpg)
Why use numbers?
Using numbers to describe cultural artifacts allows replacing discrete categories (words) with continuous descriptions (curves)
![Page 42: How and why study big cultural data v2](https://reader035.vdocuments.site/reader035/viewer/2022062418/55617e7bd8b42a98268b5224/html5/thumbnails/42.jpg)
1 from timelines to graphs
2 better represent analog attributes of cultural artifacts
3 map cultural landscapes (fuzzy / overlapping / hard clusters?)
4 visualize cultural variability
5 discover new groupings
![Page 43: How and why study big cultural data v2](https://reader035.vdocuments.site/reader035/viewer/2022062418/55617e7bd8b42a98268b5224/html5/thumbnails/43.jpg)
1 from timelines to curves
Mark Rothko, 393 paintings (1927-1970). X - year. Y - brightness mean. Hao Wang and Mayra Vasquez.
![Page 44: How and why study big cultural data v2](https://reader035.vdocuments.site/reader035/viewer/2022062418/55617e7bd8b42a98268b5224/html5/thumbnails/44.jpg)
2 better represent analog attributes of cultural artifacts
Next slide: close-up of a visualization showing average amount of visual change (bar graph) in every shot in Vertov’s 11th year. Images above the bar: first frame of every shot.
To measure visual change per shot:
1) calculate brightness mean of the difference image between each two frames in the shot
2) add all means
3) divide by number of frames in the shot
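The three steps can be sketched in code under a few stated assumptions: frames are flat lists of grayscale brightness values, "each two frames" means consecutive frames, and the difference image is the per-pixel absolute difference. The function names are hypothetical. The sum is divided by the number of frames, as the slide states, rather than by the number of frame pairs.

```python
def brightness_mean_of_difference(frame_a, frame_b):
    """Step 1: mean brightness of the absolute difference image of two frames."""
    return sum(abs(a - b) for a, b in zip(frame_a, frame_b)) / len(frame_a)


def visual_change(shot):
    """Average amount of visual change in a shot (a list of frames)."""
    if not shot:
        raise ValueError("a shot needs at least one frame")
    means = [
        brightness_mean_of_difference(shot[i], shot[i + 1])
        for i in range(len(shot) - 1)
    ]
    # Step 2: add all means. Step 3: divide by the number of frames in the shot.
    return sum(means) / len(shot)


# Made-up shot of three two-pixel frames: one change, then a static frame.
change = visual_change([[0, 0], [10, 20], [10, 20]])
```

A shot with a single frame yields 0, since there is no pair of frames to compare.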
![Page 45: How and why study big cultural data v2](https://reader035.vdocuments.site/reader035/viewer/2022062418/55617e7bd8b42a98268b5224/html5/thumbnails/45.jpg)
![Page 46: How and why study big cultural data v2](https://reader035.vdocuments.site/reader035/viewer/2022062418/55617e7bd8b42a98268b5224/html5/thumbnails/46.jpg)
3 the maps of cultural landscapes reveal fuzzy and overlapping clusters - rather than discrete categories with hard boundaries
![Page 47: How and why study big cultural data v2](https://reader035.vdocuments.site/reader035/viewer/2022062418/55617e7bd8b42a98268b5224/html5/thumbnails/47.jpg)
4 visualize the space of variations
600 variations of Google Logo, 1998-2009
![Page 48: How and why study big cultural data v2](https://reader035.vdocuments.site/reader035/viewer/2022062418/55617e7bd8b42a98268b5224/html5/thumbnails/48.jpg)
![Page 49: How and why study big cultural data v2](https://reader035.vdocuments.site/reader035/viewer/2022062418/55617e7bd8b42a98268b5224/html5/thumbnails/49.jpg)
Studying massive data sets challenges our existing theoretical concepts and assumptions
example: what is “style”?
![Page 50: How and why study big cultural data v2](https://reader035.vdocuments.site/reader035/viewer/2022062418/55617e7bd8b42a98268b5224/html5/thumbnails/50.jpg)
image plot of one million manga pages
x - standard deviation
y - entropy
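The two features on these axes could be computed along the following lines, assuming each page is available as a flat list of grayscale values in 0..255. The slides do not say which variants were used; this sketch assumes the population standard deviation and the Shannon entropy (in bits) of the gray-level histogram, and the function names are illustrative.

```python
import math
from collections import Counter
from statistics import pstdev


def gray_entropy(pixels):
    """Shannon entropy in bits of the distribution of gray levels."""
    counts = Counter(pixels)
    total = len(pixels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def page_features(pixels):
    """Return (standard deviation, entropy) for one grayscale page."""
    return pstdev(pixels), gray_entropy(pixels)


# Made-up four-pixel page: half black, half white.
std_dev, entropy = page_features([0, 0, 255, 255])
```

Under these definitions, a page with only a few flat gray levels has low entropy, while a page with many different gray values (textures, gradients) has high entropy.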
![Page 51: How and why study big cultural data v2](https://reader035.vdocuments.site/reader035/viewer/2022062418/55617e7bd8b42a98268b5224/html5/thumbnails/51.jpg)
![Page 52: How and why study big cultural data v2](https://reader035.vdocuments.site/reader035/viewer/2022062418/55617e7bd8b42a98268b5224/html5/thumbnails/52.jpg)
distribution of one million manga pages
x - standard deviation
y - entropy
![Page 53: How and why study big cultural data v2](https://reader035.vdocuments.site/reader035/viewer/2022062418/55617e7bd8b42a98268b5224/html5/thumbnails/53.jpg)
single short manga series < 1000 pages
![Page 54: How and why study big cultural data v2](https://reader035.vdocuments.site/reader035/viewer/2022062418/55617e7bd8b42a98268b5224/html5/thumbnails/54.jpg)
776 Vincent van Gogh paintings. X - year/month. Y - brightness mean.
![Page 55: How and why study big cultural data v2](https://reader035.vdocuments.site/reader035/viewer/2022062418/55617e7bd8b42a98268b5224/html5/thumbnails/55.jpg)
Current / recent projects at softwarestudies.com:
6000+ paintings of French Impressionists
7000 year old stone arrowheads (with UCSD anthropologist)
![Page 56: How and why study big cultural data v2](https://reader035.vdocuments.site/reader035/viewer/2022062418/55617e7bd8b42a98268b5224/html5/thumbnails/56.jpg)
samples from 4.7 million newspaper pages collection from Library of Congress (UCSD undergraduate students)
virtual world / game analytics (funded by NSF Eager, with UCSD Experimental Games Lab)
comparing Art Now & Graphic design Flickr groups (340,000 images) (with CS collaborator from Lawrence Berkeley National Laboratory)
![Page 57: How and why study big cultural data v2](https://reader035.vdocuments.site/reader035/viewer/2022062418/55617e7bd8b42a98268b5224/html5/thumbnails/57.jpg)
Big project supported by Mellon Foundation Grant, 2012-2015
- tools and workflows for working with image and video collections using SEASR / MEANDRE digital humanities workflow platform
- applications:
1) 1+ million images + millions of metadata records from deviantArt (the largest social network for user-created art - 20 M users, 240 M artworks).
2) 1+ million manga pages.
3) thousands of hours of TV political news and online video
![Page 58: How and why study big cultural data v2](https://reader035.vdocuments.site/reader035/viewer/2022062418/55617e7bd8b42a98268b5224/html5/thumbnails/58.jpg)
Postscript:
digital humanities (working with digitized collections of historical artifacts) vs. computational humanities (using social web data)
![Page 59: How and why study big cultural data v2](https://reader035.vdocuments.site/reader035/viewer/2022062418/55617e7bd8b42a98268b5224/html5/thumbnails/59.jpg)
“The capacity to collect and analyze massive amounts of data has transformed such fields as biology and physics. But the emergence of a data-driven 'computational social science' has been much slower. Leading journals in economics, sociology, and political science show little evidence of this field. But computational social science is occurring in Internet companies such as Google and Yahoo, and in government agencies such as the U.S. National Security Agency.”
“Computational Social Science.” Science, vol. 323, 6 February 2009.
![Page 60: How and why study big cultural data v2](https://reader035.vdocuments.site/reader035/viewer/2022062418/55617e7bd8b42a98268b5224/html5/thumbnails/60.jpg)
Massive amounts of cultural content and online conversations, opinions, and cultural activities (general and specialized social media networks; personal and professional web sites). This data offers us unprecedented opportunities to understand cultural processes and their dynamics, and to develop new concepts and models which can also be used to better understand the past.
Currently only analyzed by Google, Facebook, YouTube, Bluefin Labs, Echo Nest, and other companies, and by computer scientists working in “social computing” - not yet by humanists.
![Page 61: How and why study big cultural data v2](https://reader035.vdocuments.site/reader035/viewer/2022062418/55617e7bd8b42a98268b5224/html5/thumbnails/61.jpg)
![Page 62: How and why study big cultural data v2](https://reader035.vdocuments.site/reader035/viewer/2022062418/55617e7bd8b42a98268b5224/html5/thumbnails/62.jpg)
Our free open source software tools for analyzing and visualizing large image and video collections, publications and projects:
softwarestudies.com
The tools run on Mac, PC, Unix.
All media visualizations in this presentation were created by members of Software