visualizating the non-visual: spatial analysis and interaction with information from text documents...

23
Visualizating the Non-Visual: Spatial Analysis and Interaction with Information from Text Documents J.A. Wise, J.J. Thomas, K. Pennock, D. Lantrip, M. Pottier, A. Schur, and V. Crow Bongshin Lee April 4, 2001

Post on 20-Dec-2015

220 views

Category:

Documents


1 download

TRANSCRIPT

Page 1: Visualizating the Non-Visual: Spatial Analysis and Interaction with Information from Text Documents J.A. Wise, J.J. Thomas, K. Pennock, D. Lantrip, M

Visualizating the Non-Visual:Spatial Analysis and Interaction with

Information from Text Documents

J.A. Wise, J.J. Thomas, K. Pennock,

D. Lantrip, M. Pottier, A. Schur, and V. Crow

Bongshin LeeApril 4, 2001

Page 2: Visualizating the Non-Visual: Spatial Analysis and Interaction with Information from Text Documents J.A. Wise, J.J. Thomas, K. Pennock, D. Lantrip, M

April 4, 2001 CMSC838b Information Visualization 2

Paper Outline

• Introduction

• Visualizing text

• Visualization transformations: from text to pictures

• Examples from the MVAB Project

• Conclusions and directions for future research and development

Page 3: Visualizating the Non-Visual: Spatial Analysis and Interaction with Information from Text Documents J.A. Wise, J.J. Thomas, K. Pennock, D. Lantrip, M

April 4, 2001 CMSC838b Information Visualization 3

Introduction

• Current Visualization approaches– For visualizing mostly structured and/or hierarchical

information

• Some research in information retrieval– Utilized graph theory or figural display– Information returned is documents in text form

• Users still have to read• Causes a severe upper limit

• Open Source digital information– Available text overwhelms the traditional reading

methods of inspection, sift and synthesis

Page 4: Visualizating the Non-Visual: Spatial Analysis and Interaction with Information from Text Documents J.A. Wise, J.J. Thomas, K. Pennock, D. Lantrip, M

April 4, 2001 CMSC838b Information Visualization 4

Visualizing text

• True text visualizations– Must represent textual content and meaning without the

user having to read it

– Result from content abstraction and spatialization of the text document

• Use primarily preattentive, parallel processing powers of visual perception

• Goal is to spatially transform text information into a new visual representation

Page 5: Visualizating the Non-Visual: Spatial Analysis and Interaction with Information from Text Documents J.A. Wise, J.J. Thomas, K. Pennock, D. Lantrip, M

April 4, 2001 CMSC838b Information Visualization 5

Visualization transformations: from text to pictures

• Four important technical considerations – Clear definition of text

• what comprises text

• how it can be distinguished from other symbolic representations

– Way to transform raw text into a different visual form

– Foundation for meaningful visualization• Suitable mathematical procedures and analytical measures

– A database management system

Page 6: Visualizating the Non-Visual: Spatial Analysis and Interaction with Information from Text Documents J.A. Wise, J.J. Thomas, K. Pennock, D. Lantrip, M

April 4, 2001 CMSC838b Information Visualization 6

Processing Text

• Requirements of text processing engine– Identification and extraction of text features

• Frequency-based measures on words

• Higher order statistics taken on the words

• Semantic in nature

– Efficient and flexible representation of documents in terms of these text features

– Support for information retrieval and visualization• Pre-process, indexing

Page 7: Visualizating the Non-Visual: Spatial Analysis and Interaction with Information from Text Documents J.A. Wise, J.J. Thomas, K. Pennock, D. Lantrip, M

April 4, 2001 CMSC838b Information Visualization 7

Visualizing output from text processing

• Representing the document– a vector in high dimensional feature space

• Comparisons, filters, and transformations can be applied

• Clustering using the normalized document vectors• Projection

– Principal Components Analysis– Multi-Dimensional Scaling– Exponential order of complexity

• Clustering in the high-dimensional feature space• Visualize the cluster centroids

Page 8: Visualizating the Non-Visual: Spatial Analysis and Interaction with Information from Text Documents J.A. Wise, J.J. Thomas, K. Pennock, D. Lantrip, M

April 4, 2001 CMSC838b Information Visualization 8

Managing the representation

• Two basic classes of data– Raw text files

• Static in nature, Simple in structure• Easy to manage

– Visual forms of the text• Extensive and dynamic

• Object-Oriented Database– Flexibility of data representation– Power of inheritance– Ease of data access

Page 9: Visualizating the Non-Visual: Spatial Analysis and Interaction with Information from Text Documents J.A. Wise, J.J. Thomas, K. Pennock, D. Lantrip, M

April 4, 2001 CMSC838b Information Visualization 9

Interface design for text visualization

• Backdrop– Central display resource

• Workshop– Grid having resizable windows to hold multiple

views

• Chronicle– Area where views are placed and linked to form

a visual story

Page 10: Visualizating the Non-Visual: Spatial Analysis and Interaction with Information from Text Documents J.A. Wise, J.J. Thomas, K. Pennock, D. Lantrip, M

April 4, 2001 CMSC838b Information Visualization 10

Examplesfrom the MVAB Project

• MVAB– Multidimensional Visualization and Advanced

Browsing Project– Visualization and analysis of textual information– Showcased in SPIRE

• SPIRE– Spatial Paradigm for Information Retrieval and

Exploration

• Starfields and Topographical maps metaphors– Galaxies and Themescapes

Page 11: Visualizating the Non-Visual: Spatial Analysis and Interaction with Information from Text Documents J.A. Wise, J.J. Thomas, K. Pennock, D. Lantrip, M

April 4, 2001 CMSC838b Information Visualization 11

Galaxies

• Displays cluster and document interrelatedness

• 2D scatterplot of ‘docupoints’

• Simple point and click exploration

• Sophisticated tools– Facilitate more in-depth analysis– Ex) temporal slicer

Page 12: Visualizating the Non-Visual: Spatial Analysis and Interaction with Information from Text Documents J.A. Wise, J.J. Thomas, K. Pennock, D. Lantrip, M

April 4, 2001 CMSC838b Information Visualization 12

Galaxies Screen Shot

Page 13: Visualizating the Non-Visual: Spatial Analysis and Interaction with Information from Text Documents J.A. Wise, J.J. Thomas, K. Pennock, D. Lantrip, M

April 4, 2001 CMSC838b Information Visualization 13

ThemeScapes

• Abstract, 3D landscapes of information

• Convey relevant information about topic or themes without the cognitive load

• Spatial relationships reveal the intricate interconnection of thems

Page 14: Visualizating the Non-Visual: Spatial Analysis and Interaction with Information from Text Documents J.A. Wise, J.J. Thomas, K. Pennock, D. Lantrip, M

April 4, 2001 CMSC838b Information Visualization 14

ThemScapes - Advantages

• Displays much of the complex content of the document database

• Utilizes innate human abilities for pattern recognition and spatial reasoning

• Communicative invariance across levels of textual scale

• Promote analysis

Page 15: Visualizating the Non-Visual: Spatial Analysis and Interaction with Information from Text Documents J.A. Wise, J.J. Thomas, K. Pennock, D. Lantrip, M

April 4, 2001 CMSC838b Information Visualization 15

ThemeScapes Screen Shot

Page 16: Visualizating the Non-Visual: Spatial Analysis and Interaction with Information from Text Documents J.A. Wise, J.J. Thomas, K. Pennock, D. Lantrip, M

April 4, 2001 CMSC838b Information Visualization 16

Conclusions

• Text visualizations can overcome much of the user limitations– Enhanced insight and time savings (35 mins vs 2

weeks)– Creative with the tool

• Querying and analytical manipulation come together in a single visualization– Permits a different kinds of querying

• Text visualizations will have to access and utilize the cognitive and visual processes

Page 17: Visualizating the Non-Visual: Spatial Analysis and Interaction with Information from Text Documents J.A. Wise, J.J. Thomas, K. Pennock, D. Lantrip, M

April 4, 2001 CMSC838b Information Visualization 17

Directions for Future R & D

• Visual Data Analysis

• Elaborate the visual metaphors

• Addition of sensory modalities– Virtual interaction

Page 18: Visualizating the Non-Visual: Spatial Analysis and Interaction with Information from Text Documents J.A. Wise, J.J. Thomas, K. Pennock, D. Lantrip, M

April 4, 2001 CMSC838b Information Visualization 18

My Favorite Sentence

The bottleneck in the human processing and understanding of information in large amounts of text can be overcome if the text is specialized in a manner that takes advantage of common powers of perception.

Page 19: Visualizating the Non-Visual: Spatial Analysis and Interaction with Information from Text Documents J.A. Wise, J.J. Thomas, K. Pennock, D. Lantrip, M

April 4, 2001 CMSC838b Information Visualization 19

Contributions

• Explorations of new visualizations

• Discussion of the process for mapping Raw Data Document collections into visualizations

Page 20: Visualizating the Non-Visual: Spatial Analysis and Interaction with Information from Text Documents J.A. Wise, J.J. Thomas, K. Pennock, D. Lantrip, M

April 4, 2001 CMSC838b Information Visualization 20

Notes on the Reference

• Designing Interaction: Psychology at the Human Computer Interaction

• Interfaces Issues and Interaction Strategies for Information Retrieval Systems

• Clustering and Dimensionality Reduction in SPIRE

Page 21: Visualizating the Non-Visual: Spatial Analysis and Interaction with Information from Text Documents J.A. Wise, J.J. Thomas, K. Pennock, D. Lantrip, M

April 4, 2001 CMSC838b Information Visualization 21

Critique – Strengths and Weaknesses

• Strengths– Provide natural visual metaphors– Enable the users to see the relationships

between documents with minimal required reading

• Weaknesses– No validation of some conclusions

Page 22: Visualizating the Non-Visual: Spatial Analysis and Interaction with Information from Text Documents J.A. Wise, J.J. Thomas, K. Pennock, D. Lantrip, M

April 4, 2001 CMSC838b Information Visualization 22

What has happened to this topic?

• 1996 R&D 100 Award• OCSB

– On-line Citation Searching and Browsing in UMD

• "ThemeScape" is now a trademarked term of Cartia, Inc.

• WebThemeTM – an interactive tool that provides a visual display of the

common themes in collections of web-based documents

Page 23: Visualizating the Non-Visual: Spatial Analysis and Interaction with Information from Text Documents J.A. Wise, J.J. Thomas, K. Pennock, D. Lantrip, M

April 4, 2001 CMSC838b Information Visualization 23

WebTheme Screen Shot