SkyTree Visualization Fireside Chat
Is Big Data Visualization Possible?
Tamara Munzner
Department of Computer Science
University of British Columbia
Google Hangout on Air, October 1, 2014
http://www.cs.ubc.ca/~tmm/talks.html#skytree14
About me: Geometry Center 1991-1995
http://geomview.org/
http://youtu.be/-gLNlC_hQ3M
• geometry and topology vis
– 3D, 4D, non-Euclidean
http://youtu.be/sKqt6e7EcCs
http://youtu.be/x7d13SgqUXg
Geomview
The Shape of Space
Outside In
http://youtu.be/6j4T7l49H3Y http://www.crcpress.com/product/isbn/9781568814537
About me: Stanford 1995-2000
• infovis: network vis
– 3D hyperbolic trees/networks
– computational linguistics network
H3
http://youtu.be/fhbQy_NCwWI
Constellation
http://youtu.be/7sJC3QVpSkQ
About me: UBC 2002-
technique-driven work
problem-driven work
evaluation
theoretical foundations
When to use visualization
• human in the loop needs the details
– doesn't know exactly what questions to ask in advance
– long-term analysis
– automation stepping stone, refining, trust-building
– presentation
• external representation: perception vs cognition
• intended task, measurable definitions of effectiveness
Computer-based visualization systems provide visual representations of datasets designed to help people carry out tasks more effectively.
more at: Visualization Analysis and Design, Chapter 1. Munzner. AK Peters, 2014, to appear.
Visualization is suitable when there is a need to augment human capabilities rather than replace people with computational decision-making methods.
Why show data to people?
• summaries lose information
– confirm expected and find unexpected patterns
– assess validity of statistical model
Identical statistics:
x mean 9
x variance 10
y mean 8
y variance 4
x/y correlation 1
Anscombe’s Quartet
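A minimal sketch of the point this slide makes, assuming the standard published values of Anscombe's four datasets. It computes the summary statistics for each dataset using only the Python standard library. The names `QUARTET`, `pearson`, `summarize`, and `SUMMARIES` are illustrative and not from the talk. The slide's figures appear to be rounded to whole numbers; this sketch uses the sample variance (dividing by n-1), so the x variance comes out as 11, whereas the population variance would give 10.

```python
from math import sqrt
from statistics import mean, variance

# Anscombe's four datasets; the first three share the same x values.
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def summarize(xs, ys):
    """Summary statistics of the kind listed on the slide."""
    return {
        "x_mean": mean(xs),
        "x_var": variance(xs),  # sample variance (n - 1 denominator)
        "y_mean": mean(ys),
        "y_var": variance(ys),
        "corr": pearson(xs, ys),
    }


SUMMARIES = {name: summarize(xs, ys) for name, (xs, ys) in QUARTET.items()}

if __name__ == "__main__":
    for name, s in SUMMARIES.items():
        print(name, {key: round(value, 2) for key, value in s.items()})
```

The four summaries agree to about two decimal places, yet scatterplots of the four datasets look nothing alike, which is the argument for showing the data itself.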
Technique-driven work: Networks
• scaling up networks
– multilevel networks, 10K-100K nodes
• topologically aware decomposition, layout, browsing
– trees, millions of nodes
• guaranteed visibility of semantically meaningful marks
Figure 7.25: GrouseFlocks uses containment to show graph hierarchy structure. (a) Original graph. (b) Several alternative hierarchies built from the same graph. The hierarchy alone is shown in the top row. The bottom row combines the graph encoded with connection with a visual representation of the hierarchy using containment. From [Archambault et al. 08], Figure 3.
TreeJuxtaposer, PRISAD
http://youtu.be/GdaPj8a9QEo
http://youtu.be/AWXAe8zvkt8
TopoLayout, Smashing Peacocks Further, Grouse, GrouseFlocks, TugGraph
http://youtu.be/fq8EIAOutvs
http://youtu.be/t1Xbt6XOWp8
Technique-driven work: Dimensionality reduction
• closest overlap between vis and ML
– Glimmer: MDS on the GPU
– Glint: DR for costly distances
– QSNE: sparse documents
• high quality for millions of items
QSNE
Glimmer
http://youtu.be/PLaBAPM6qLI
Glint
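For readers unfamiliar with multidimensional scaling (MDS), here is a minimal sketch of the underlying idea: place items in 2D so that their pairwise layout distances match given high-dimensional distances, by gradient descent on a squared-error stress. This is not Glimmer, Glint, or QSNE; it is a plain O(n²)-per-iteration illustration with none of their multilevel or GPU machinery. The names `pairwise`, `stress`, and `mds`, the step size, and the iteration count are all assumptions made for this example.

```python
import math
import random


def pairwise(points):
    """Matrix of Euclidean distances between all pairs of points."""
    n = len(points)
    return [[math.dist(points[i], points[j]) for j in range(n)] for i in range(n)]


def stress(target, layout):
    """Sum of squared differences between target and layout distances."""
    n = len(layout)
    return sum(
        (target[i][j] - math.dist(layout[i], layout[j])) ** 2
        for i in range(n)
        for j in range(i + 1, n)
    )


def mds(target, dims=2, iters=300, lr=0.01, seed=0):
    """Gradient descent on stress, starting from a seeded random layout."""
    rng = random.Random(seed)
    n = len(target)
    layout = [[rng.uniform(-1, 1) for _ in range(dims)] for _ in range(n)]
    for _ in range(iters):
        grads = [[0.0] * dims for _ in range(n)]
        for i in range(n):
            for j in range(n):
                if i == j:
                    continue
                d = math.dist(layout[i], layout[j]) or 1e-9  # avoid dividing by zero
                coeff = 2 * (d - target[i][j]) / d
                for k in range(dims):
                    grads[i][k] += coeff * (layout[i][k] - layout[j][k])
        for i in range(n):
            for k in range(dims):
                layout[i][k] -= lr * grads[i][k]
    return layout


# Toy input: the eight corners of a unit cube in 3D.
HD_POINTS = [[x, y, z] for x in (0, 1) for y in (0, 1) for z in (0, 1)]
TARGET = pairwise(HD_POINTS)

if __name__ == "__main__":
    before = stress(TARGET, mds(TARGET, iters=0))
    after = stress(TARGET, mds(TARGET, iters=300))
    print(f"stress before: {before:.3f}, after: {after:.3f}")
```

Every iteration touches all pairs of points, which is the cost that makes naive approaches impractical for the dataset sizes mentioned on this slide.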
MulteeSum
Problem-driven work: Genomics
[Figure: genome comparison screenshot. Source: Human; destination: Lizard. Controls shown: saturation, go to, orientation (match, inversion), invert.]
MizBee http://youtu.be/86p7brwuz2g
http://youtu.be/AHDnv_qMXxQ Variant View http://youtu.be/76HhG1FQngI Cerebral
Problem-driven work: Many domains
Overview: investigative journalism
RelEx: in-car overlay networks
LiveRAC: system management time-series
Vismon: fisheries management http://youtu.be/h0kHoS4VYmk
http://youtu.be/89lsQXc6Ao4
http://youtu.be/ld0c3H0VSkw
http://vimeo.com/71483614
More info
http://www.cs.ubc.ca/group/infovis/
http://www.cs.ubc.ca/~tmm/talks.html#skytree14
Overview design evolution
v1
v3
v4
• how to find the needle in the haystack?
• how to convince that the haystack has no needles?
Overview origin story: WikiLeaks meets Glimmer
• WikiLeaks: hacker-journalist Jonathan Stray analyzing Iraq warlogs
– conjecture that existing label classification falls short of showing all meaningful structure in data
• friendly action, criminal incident, ...
– had some NLP, needed better vis tools
• Glimmer: multilevel dimensionality reduction algorithm
– scalability to 30K documents and terms
[Glimmer: Multilevel MDS on the GPU. Ingram, Munzner, Olano. IEEE TVCG 15(2):249-261, 2009.]
[Figure: three chained tasks.
Task 1: derive 2D data from high-dimensional (HD) data (why: produce).
Task 2: discover, explore, and identify clusters and points in a scatterplot of the 2D data (how: encode, navigate, select).
Task 3: produce annotations, that is, labels for the clusters (example label: "wombat").]
Visual dimensionality reduction for document datasets
• more on visual DR: hour-long talk Dimensionality Reduction from Several Angles, http://www.cs.ubc.ca/~tmm/talks.html#linz14
What/Why/How interplay
• why: understand clusters
• what: derive data of full cluster hierarchy
– explore space of possible clusterings
• how: show cluster hierarchy
– arrange space: node-link
• how: support tagging clusters/docs
– following or cross-cutting hierarchy!
• simple annotation
• progress tracking
• user-defined semantics
[Figure: framework icons. Dataset Types: Tables, Networks (nodes as items, links), Trees. Targets for Network Data: Topology, Paths. Arrange Networks and Trees: Node-link Diagrams (connections and marks). Produce: Annotate (tag).]
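The tagging bullets on this slide can be illustrated with a small data-structure sketch: a cluster tree whose nodes hold documents, plus a tag store in which a tag may follow the hierarchy (tag a whole cluster) or cut across it (tag arbitrary documents). This is a hypothetical illustration, not Overview's actual implementation; `Cluster`, `TagStore`, and every method name are invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """A node in a cluster hierarchy (hypothetical structure)."""
    name: str
    docs: list = field(default_factory=list)      # doc ids held directly at this node
    children: list = field(default_factory=list)  # child Cluster nodes

    def all_docs(self):
        """Doc ids at this node and in every descendant cluster."""
        found = list(self.docs)
        for child in self.children:
            found.extend(child.all_docs())
        return found


class TagStore:
    """User-defined tags; the meaning of each tag is up to the user."""

    def __init__(self):
        self.tags = {}  # tag name -> set of doc ids

    def tag_cluster(self, tag, cluster):
        """Tag that follows the hierarchy: covers a whole subtree."""
        self.tags.setdefault(tag, set()).update(cluster.all_docs())

    def tag_docs(self, tag, doc_ids):
        """Tag that cuts across the hierarchy: any set of documents."""
        self.tags.setdefault(tag, set()).update(doc_ids)

    def progress(self, root):
        """Fraction of documents under root carrying at least one tag."""
        total = set(root.all_docs())
        tagged = set().union(*self.tags.values()) & total
        return len(tagged) / len(total) if total else 0.0


root = Cluster("all", children=[
    Cluster("budget", docs=[1, 2, 3]),
    Cluster("travel", docs=[4, 5], children=[Cluster("flights", docs=[6])]),
])
store = TagStore()
store.tag_cluster("reviewed", root.children[0])  # follows the hierarchy
store.tag_docs("suspicious", [2, 6])             # cuts across two clusters

if __name__ == "__main__":
    print(store.tags)
    print(f"progress: {store.progress(root):.0%}")
```

Keeping tags as plain sets of document ids, separate from the tree, is what lets one mechanism serve simple annotation, progress tracking, and user-defined semantics.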
How: Idiom design decisions
Juxtapose and Coordinate Views
Share Encoding: Same/Different
Share Data: All/Subset/None
Linked Highlighting
Why?
How?
What?
• facet: juxtapose linked views
– linked color coding
• cluster hierarchy tree
• DR scatterplot
• tags
– reading text/keywords
• cluster list
• doc reader
Identity Channels: Categorical Attributes
Spatial region
Color hue
Motion
Shape
Overview video (version 1)
http://www.cs.ubc.ca/labs/imager/tr/2012/modiscotag/
Path to adoption
• version 1
– fast cluster hierarchy construction for sparse data
– research prototype by PhD student
– positive initial assessment from AP Caracas bureau chief
• barrier to adoption: difficult install/load process
• version 2
– web deployment, DocumentCloud integration, usability
• many months of engineering
– Knight Foundation funding to the rescue!
• published story by unaffiliated reporter: police corruption in Tulsa
Timeline: 2011, 2012; versions v1, v2
Path to adoption
• even more rounds of what/why/how interplay
– which views needed? what should they show? how should they show it?
– usability and utility
• version 3
– published story: VP candidate Ryan asked for federal help even as he championed cuts
– published story: gun control debate
• version 4
– followup investigation: government corruption in Texas
– published story: police misconduct in New York (Pulitzer prize finalist!)
Timeline: 2011, 2012, 2013, 2014; versions v1, v2, v3, v4
Overview video v4
• versions 3 and 4
– no DR scatterplot
– tree arrangement emphasizing nodes not links
– combined doc/cluster viewer
http://vimeo.com/71483614
Why: Task abstractions
• what’s in this collection? (of leaked docs)
– generate hypothesis
– summarize clusters
– explore clusters
• locate evidence (within FOIA dump)
– verify hypothesis
– identify clusters/documents
– locate clusters/documents
• prove non-existence of evidence
– even harder!
– exhaustive reading vs filtering out irrelevant
Search:
Target known, location known: Lookup
Target known, location unknown: Locate
Target unknown, location known: Browse
Target unknown, location unknown: Explore
[A Multi-Level Typology of Abstract Visualization Tasks. Brehmer and Munzner. IEEE TVCG 19(12):2376-2385, 2013 (Proc. InfoVis 2013). ]
Query: Identify, Compare, Summarise
Discover
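The four search tasks named on this slide follow from two yes/no questions: is the target known, and is its location known. A minimal sketch of that classification follows; the function name `search_task` and the lowercase return strings are illustrative choices, not taken from the cited paper.

```python
def search_task(target_known: bool, location_known: bool) -> str:
    """Classify a search task by whether the target and its location are known."""
    if target_known:
        return "lookup" if location_known else "locate"
    return "browse" if location_known else "explore"


if __name__ == "__main__":
    for target_known in (True, False):
        for location_known in (True, False):
            task = search_task(target_known, location_known)
            print(f"target known={target_known}, location known={location_known}: {task}")
```

In these terms, "what's in this collection?" is an explore task, since neither the target nor its location is known in advance, while "locate evidence" has a known target whose location is unknown.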
Now what?
• continuing adoption
– food stamp distribution delays in North Carolina
– Surprise! Many credit card agreements allow repossession
– The brilliance of Louis C.K.'s emails: He writes like a politician
– Private memo reveals winding tale involving John McCain, the NRA, and... condors
• continuing development
– Knight Foundation funds v5: named entity recognition, plugin API
• InfoVis14 paper
Overview: The Design, Adoption, and Analysis of a Visual Document Mining Tool For Investigative Journalists. Brehmer, Ingram, Stray, and Munzner.
https://www.overviewproject.org/
http://overview.ap.org/
http://www.cs.ubc.ca/labs/imager/tr/2014/Overview/
Algorithm: Spinoff series
• dimensionality reduction for huge text collections
– great algorithm problem in its own right!
– QSNE: fast and high-quality DR for millions of documents
• key feature: handle sparseness appropriately
[Dimensionality Reduction for Documents with Nearest Neighbor Queries. Ingram and Munzner. Neurocomputing (Special Issue on Visual Analytics using Multidimensional Projections), to appear 2014.]
http://www.cs.ubc.ca/labs/imager/tr/2014/QSNE/
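To illustrate why sparseness matters for document collections, here is a hedged sketch of nearest-neighbor search over sparse term vectors using an inverted index. Only documents that share at least one term with the query are ever scored. This is not the QSNE algorithm; it shows only the general idea suggested by the paper's title, and the names `normalize`, `build_index`, and `nearest` and the toy documents are assumptions made for the example.

```python
import math
from collections import defaultdict


def normalize(doc):
    """Scale a sparse term->weight dict to unit length."""
    norm = math.sqrt(sum(w * w for w in doc.values()))
    return {term: w / norm for term, w in doc.items()}


def build_index(docs):
    """Inverted index: term -> list of (doc id, weight)."""
    index = defaultdict(list)
    for doc_id, doc in enumerate(docs):
        for term, weight in doc.items():
            index[term].append((doc_id, weight))
    return index


def nearest(query_id, docs, index, k=2):
    """Top-k docs by cosine similarity, scoring only docs that share a term."""
    scores = defaultdict(float)
    for term, weight in docs[query_id].items():
        for other_id, other_weight in index[term]:
            if other_id != query_id:
                scores[other_id] += weight * other_weight
    return sorted(scores, key=lambda d: (-scores[d], d))[:k]


# Toy corpus of term counts (hypothetical documents).
DOCS = [normalize(d) for d in (
    {"police": 2, "report": 1},
    {"police": 1, "misconduct": 1},
    {"budget": 3, "report": 1},
    {"condor": 1},
)]
INDEX = build_index(DOCS)

if __name__ == "__main__":
    for doc_id in range(len(DOCS)):
        print(doc_id, nearest(doc_id, DOCS, INDEX))
```

Because the vectors are unit length, the accumulated dot product is the cosine similarity, and a document with no shared terms never appears as a candidate.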