a geovisual analytic technique for exploratory analysis of online discourses

Upload: diego-velasquez-rios

Post on 03-Jun-2018

  • 8/12/2019 A Geovisual Analytic Technique for Exploratory Analysis of Online Discourses

    1/57

    A GEOVISUAL ANALYTIC TECHNIQUE FOR EXPLORATORY

    ANALYSIS OF ONLINE DISCOURSES

    _______________

    A Thesis

    Presented to the

    Faculty of

    San Diego State University

    _______________

    In Partial Fulfillment

    of the Requirements for the Degree

    Master of Science

    in

    Computer Science

    _______________

    by

    Kanwar Gurnawaz Singh Buttar

    Fall 2010

    Copyright 2010

    by

    Kanwar Gurnawaz Singh Buttar

    All Rights Reserved

    DEDICATION

    I would like to dedicate this thesis to my parents, who have been an everlasting source of inspiration in my life. I would not have been able to achieve what I have without

    their support.

    ABSTRACT OF THE THESIS

    A Geovisual Analytic Technique for Exploratory Analysis of Online Discourses

    by Kanwar Gurnawaz Singh Buttar

    Master of Science in Computer Science

    San Diego State University, 2010

    This thesis project focuses on designing and programming an implementation of a
    geovisual analytic technique to evaluate online participatory decision making. The
    4D (spatio-temporal) geovisualization technique called a Grapevine was developed by
    researchers at the University of Washington to evaluate the quality and scale of
    participatory decision interactions during an online discussion about improving
    transportation in the central Puget Sound region. The 4D aspect of the technique derives
    from its representation of location (latitude, longitude), type of discourse interaction,
    and time of its occurrence. The theory behind the grapevine comes from two National
    Research Council (NRC) publications that synthesized research on how the
    analytic-deliberative process can improve decision making about risks to public health,
    public safety, and the environment. The grapevine technique can be used to distil and
    cluster specific types of on-line discourse events, rank the quality of on-line
    participation, and represent spatial trends in on-line discourses. This work is about the
    automation of grapevine functionalities, including robust database queries, in a desktop
    Geographic Information System (GIS) environment based on Environmental Systems Research
    Institute (ESRI) ArcGIS software.

    TABLE OF CONTENTS

    ABSTRACT
    LIST OF FIGURES
    ACKNOWLEDGEMENTS

    CHAPTER

    1 INTRODUCTION
    2 BACKGROUND
        2.1 LIT Web Portal
        2.2 Grapevine: A Geovisual Analytic Technique
    3 DESIGN
        3.1 Grapevine Features Automated
        3.2 ArcGIS Desktop
            3.2.1 ArcScene
            3.2.2 Geoprocessing
                3.2.2.1 Geoprocessing Framework
                3.2.2.2 Scripting
        3.3 LIT Discourse Database
        3.4 PYODBC
    4 USER MANUAL
        4.1 Posts
        4.2 Replies on a Post
        4.3 Post-Reply Relationship
        4.4 Concerns
        4.5 Comments on a Concern
        4.6 Concern-Comment Relationship
        4.7 Votes on Posts
        4.8 Post-Vote Relationship
        4.9 Votes on Concerns
        4.10 Concern-Vote Relationship
        4.11 Votes on Post-Replies
        4.12 Vote-Post-Reply Relationship
        4.13 Votes on Comments to a Concern
        4.14 Vote-Concern Comment Relationship
        4.15 Visual Cue 1: A Coiling Stem
        4.16 Visual Cue 2: Lots of Nodes
        4.17 Visual Cue 3: Lots of Buds
        4.18 Visual Cue 4: An Open Proliferation of Shoots and Leaves
        4.19 Visual Cue 5: An Open Proliferation of Tendrils
        4.20 Discourse Contributions by Zip Code
    5 CONCLUSION
    BIBLIOGRAPHY

    LIST OF FIGURES

    Figure 3.1. ArcGIS desktop architecture.
    Figure 3.2. Typical geoprocessing tool.
    Figure 3.3. ArcToolBox.
    Figure 3.4. LIT discourse database.
    Figure 3.5. Relationships between tables.
    Figure 4.1. Grapevine analysis toolbox.
    Figure 4.2. Posts tool dialog box.
    Figure 4.3. Post nodes shapefile (in red).
    Figure 4.4. Post node details.
    Figure 4.5. Replies on posts (in blue).
    Figure 4.6. Post-reply relationship.
    Figure 4.7. Concerns (in red).
    Figure 4.8. Comments on a concern (in blue).
    Figure 4.9. Concern-comment relationship.
    Figure 4.10. Votes on posts (in ultramarine).
    Figure 4.11. Post-vote relationship.
    Figure 4.12. Votes on concerns (in ultramarine).
    Figure 4.13. Concern-vote relationship.
    Figure 4.14. Votes on post-replies (in ultramarine).
    Figure 4.15. Vote-post-reply relationship.
    Figure 4.16. Votes on concern-comments.
    Figure 4.17. Votes-concern comments relationship.
    Figure 4.18. Visual cue 1 dialog box.
    Figure 4.19. Visual cue 1 values.
    Figure 4.20. Cue1.png.
    Figure 4.21. Visual cue 2 values.
    Figure 4.22. Cue2.png.
    Figure 4.23. Visual cue 3 values.
    Figure 4.24. Cue3.png.
    Figure 4.25. Visual cue 4 values.
    Figure 4.26. Cue4.png.
    Figure 4.27. Visual cue 5 values.
    Figure 4.28. Cue5.png.
    Figure 4.29. Discourse contributions by zip code dialog box.
    Figure 4.30. Discourse contribution output.
    Figure 4.31. Discourse contribution line graph.

    ACKNOWLEDGEMENTS

    I would like to thank Dr. Piotr Jankowski and Dr. Robert Aguirre for giving me the
    opportunity to work on this project and for providing constant support and motivation. I
    am also thankful to Dr. Carl Eckberg, whose experienced advice has always guided me to
    the right path. Special thanks to Dr. Ming-Hsiang Tsou for becoming part of the thesis
    committee.

    I would also like to thank my parents for the sacrifices they made to help me to attend

    graduate school.

    CHAPTER 1

    INTRODUCTION

    Geovisualization, also known as geographic visualization, refers to a set of tools and
    techniques for analyzing geospatial data through interactive visualization. It
    encompasses cartographic techniques and practices that support data exploration and
    decision making, and it has clear advantages over traditional static maps, which offer
    limited exploratory capability. Geovisualization and Geographic Information Systems
    (GIS) make maps more interactive by letting users explore different map layers and
    change their visual appearance.

    The major challenge for a geovisual analyst is the ever-increasing amount of
    multi-dimensional, multi-source, time-varying geospatial digital information. Analysts
    have to evaluate, analyze, and make decisions based on these information streams. Most of
    the time this kind of analysis has to be done in time-critical situations and demands
    efficient, integrated, and interactive tools that assist the user in exploring,
    presenting, and visually communicating large information spaces. All these factors have
    contributed to the idea of Geovisual Analytics, an emerging interdisciplinary field that
    integrates perspectives from Visual Analytics (grounded in Information and Scientific
    Visualization) and Geographic Information Science (including work in geovisualization,
    geospatial semantics and knowledge management, geocomputation, and spatial analysis) [1].
    Geovisual Analytics tools aid analysts in identifying relevant geospatial information,
    data, and knowledge through computer-based visual interfaces. Geovisual Analytics thus
    helps in recognizing useful information in enormous datasets that would otherwise be
    difficult to find using traditional methods. Geovisual analysis requires tools that are
    highly interactive and support exploration [2]. The purpose of a Geovisual Analytic
    technique is to disseminate results to decision makers, who need a concise communication
    of the interpretations made by an analyst. Geovisual Analytics tools combine
    visualization of data and analysis of observed patterns with human interpretation and
    domain knowledge [3].

    Geovisual Analytics tools can be either web based or desktop based. Environmental
    Systems Research Institute (ESRI) is a well-known software development and services
    company providing Geographic Information System (GIS) software and geodatabase
    management applications. ESRI offers a number of tools for management and analysis of
    spatial data, which can be customized to suit a particular application. ESRI's GIS
    software, ArcGIS, is a well-known suite of software products that allows users to author,
    analyze, map, manage, share, and publish geographic information. It provides a robust set
    of GIS capabilities suitable for many applications.

    This project describes the design, programming, and implementation of a 4D
    (spatio-temporal) Geovisual Analytic technique, known as the grapevine, to evaluate
    online participatory decision making. The 4D aspect of the technique derives from its
    representation of location (latitude, longitude), type of discourse interaction, and time
    of its occurrence. The grapevine is an organic-looking, geo-referenced 4D structure of
    participatory interaction used to display fine-grained emergent patterns among hundreds
    of Human-Computer-Human Interaction (HCHI) events distributed in space and time [4]. The
    grapevine technique can be used to distil and cluster specific types of on-line discourse
    events, rank the quality of on-line participation, and represent spatial trends in
    on-line discourses. It can be used to evaluate any participatory interaction,
    irrespective of the number of people and the duration of the discourse. As a case study
    in this project, the interaction data for the grapevine technique comes from a large
    online field experiment called the Let's Improve Transportation (LIT) challenge,
    conducted in late 2007. Around 200 community participants from three counties in the
    central Puget Sound region of Washington State spent over a month discussing the best
    transportation improvement package for the region. This thesis report describes the
    automation of grapevine functionalities, including robust database queries, in a desktop
    GIS environment based on Environmental Systems Research Institute (ESRI) ArcGIS software.
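The 4D representation described above can be pictured as a small record type. The following is only an illustrative sketch: the class name, field names, and the choice of elapsed hours as the vertical axis are assumptions made for this example and are not taken from the LIT database or the thesis scripts. The coordinates and dates in the usage note are made up.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Tuple


@dataclass(frozen=True)
class DiscourseEvent:
    """One logged interaction placed in space and time (hypothetical schema)."""
    latitude: float       # location of the participant
    longitude: float
    event_type: str       # type of discourse interaction, e.g. "post" or "vote"
    timestamp: datetime   # time of occurrence

    def as_xyz(self, origin: datetime) -> Tuple[float, float, float]:
        """Return (x, y, z), using hours elapsed since `origin` as the z value.

        Measuring z in hours is an assumption of this sketch.
        """
        hours = (self.timestamp - origin).total_seconds() / 3600.0
        return (self.longitude, self.latitude, hours)
```

For instance, `DiscourseEvent(47.6, -122.3, "post", datetime(2007, 10, 2, 12, 0)).as_xyz(datetime(2007, 10, 1, 12, 0))` places the event 24 hours above the chosen origin.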

    The report is organized as follows. Chapter 2 describes the background and details of
    the LIT online experiment and the grapevine as a geovisual analytic technique for
    evaluating the interaction data. Chapter 3 describes the design and programming elements
    used to implement and automate the grapevine functionalities. Chapter 4 presents the
    user manual, which lists the automated grapevine functionalities and explains how an
    analyst can use them. Chapter 5 concludes with possible future enhancements of the
    automated grapevine analytic tool.

    CHAPTER 2

    BACKGROUND

    In a democratic society, participatory decision making is an established approach to
    allocating resources and making choices affecting people's lives. People can express
    their views on political, economic, and environmental management decisions. Public
    participation can happen at any level of decision making, including economic, political,
    management, cultural, or social. It can be an effective way of influencing government
    decisions in a representational manner.

    2.1 LIT WEB PORTAL

    Many factors, such as local, state, and federal laws, govern transportation department
    decisions to improve regional transportation. Transportation agencies across the United
    States do involve the public in their decision-making process to help prioritize
    projects such as highway expansions and new light rail lines and to determine possible
    sources of funding. The problem is that local government transportation agencies engage
    the public at a very late stage: they produce a list of transportation projects and the
    funding sources allocated to pay for them, and then ask people whether the list is
    acceptable [5]. Involvement at such a late stage of the decision process severely limits
    the public's ability to influence the selection of projects and hence the choices they
    are asked to vote on. To overcome this, a research project known as Participatory
    Geographic Information System for Transportation (PGIST) was carried out as an effort to
    involve the public throughout the transportation improvement process. The experiment was
    conducted online using a website called Let's Improve Transportation (LIT), developed by
    a research team based at the University of Washington. LIT combines web mapping and
    online deliberation capabilities with a structured five-step decision-making process
    designed to enable large groups (200+) of participants to asynchronously collaborate in
    the construction, evaluation, and selection of their own transportation improvement
    program [5]. The experiment was held in late 2007 over a 28-day period, during which
    residents of the Seattle area participated using online deliberation tools located at
    http://www.letsimprovetransportation.org. Registered participants were engaged in a
    hypothetical situation in which they were asked to give Seattle-area policy makers their
    recommendation regarding a regional transportation ballot measure. The ultimate goal was
    to identify which package of projects and funding options the participants could
    collectively recommend.

    The LIT website's flexible workflow architecture enables it to be reconfigured to suit
    other, similar decision problems. The LIT web portal used for this experiment is
    composed of five progressive and overlapping steps. These five steps and their
    respective sub-steps in the decision process are as follows [5]:

    1. Discuss concerns
        a. Map your daily travel
        b. Brainstorm concerns
        c. Review summaries

    2. Assess transportation improvement factors
        a. Review factors
        b. Weigh factors

    3. Create transportation packages
        a. Discuss projects
        b. Discuss funding options
        c. Create your own package

    4. Select a package for recommendation
        a. Discuss candidate packages
        b. Vote on package recommendation

    5. Prepare group report
        a. Review draft report
        b. Vote on report endorsement

    The objective of each of the above steps is either deliberative or analytic. In step 1,
    participants enter details about their daily travel path and raise their concerns or
    views about improving transportation in the central Puget Sound region. A moderator then
    groups these concerns or views into a set of common themes or summaries. Participants
    review and vote on these summaries to make sure the themes represent their original
    concerns. In step 2, participants review and weigh different factors that can be used as
    multiple criteria for creating the best transportation improvement package. In step 3,
    participants are given a chance to create their own transportation improvement package.
    They can discuss funding options for any project chosen from a large spatial inventory
    of proposed projects from all over the central Puget Sound region. After that, an offline
    analysis is done to determine six diverse packages. In step 4, participants deliberate on
    these six packages and then vote to decide which of them is most preferable. In the
    concluding step 5, the final report containing the outcome of the deliberation process
    and the final package recommendation is prepared. This report is reviewed and endorsed
    by the participants and handed over to the concerned authorities.

    The design of the LIT online experiment is an initiative to improve the
    analytic-deliberative decision-making process. Evaluating the quality and scale of public
    participation in an analytic-deliberative process means deciphering each client-server
    interaction event as a proxy for analytic or deliberative HCHI activity [4]. Thus, in the
    context of the LIT web portal, an event is an interaction between computers, and such an
    interaction stands for the occurrence of HCHI between people in real geographic space
    and time [4]. The four major analytic and deliberative HCHI activities, each of which
    involves sending a message, are as follows:

    1. Type your concern
    2. Type your comment on someone else's concern
    3. Type your post
    4. Type your reply to someone else's post

    All events other than the four above and voting are only analytic. Voting to agree or
    disagree with a message is considered both analytic and deliberative. In the LIT web
    portal, voting can be done on concerns, concern comments, posts, or replies to posts. All
    these activities of the LIT web portal are logged in a Microsoft Access event database.
    The analysts applied the grapevine technique to the event database using 3D GIS
    software. The primary objective of the analysis was to determine the most productive
    participatory interactions using five visual criteria that resemble features of a
    natural grapevine, as discussed later in this chapter.
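The classification rule stated above (the four message activities and voting are analytic and deliberative, every other event is analytic only) can be sketched as a small lookup. The event-type labels below are hypothetical names chosen for this illustration; they are not the identifiers used in the LIT event database.

```python
# Hypothetical labels for the four message-sending activities.
MESSAGE_EVENTS = {"concern", "concern_comment", "post", "post_reply"}


def classify_event(event_type: str) -> set:
    """Return the kinds of HCHI activity that an event type stands for."""
    if event_type in MESSAGE_EVENTS or event_type == "vote":
        # Sending a message or voting counts as analytic and deliberative.
        return {"analytic", "deliberative"}
    # Every other logged event is treated as analytic only.
    return {"analytic"}
```

Under these assumed labels, `classify_event("vote")` yields both kinds, while any other logged event, for example a hypothetical `"map_pan"`, yields only `"analytic"`.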

    2.2 GRAPEVINE: A GEOVISUAL ANALYTIC TECHNIQUE

    The grapevine is an organic-looking, geo-referenced 4D structure of participatory
    interaction used to display fine-grained emergent patterns among hundreds of thousands
    of HCHI events distributed in space and time [4]. The grapevine technique can be used as
    an evaluation tool for any participatory interaction, irrespective of the number of
    people and the duration of their involvement. Using the grapevine technique, analysts
    can quickly and reliably recognize the most productive daily clusters of HCHI activity.

    In our case study of determining the best transportation improvement package for the
    central Puget Sound region, the grapevine showed the growth of participant deliberation
    coiling up through time. All these grapevine features were processed and displayed in
    ArcScene, ESRI's 3D GIS product. In the context of the LIT experiment, the nomenclature
    of the different grapevine features is defined as follows [4]:

    1. Mainstem

    A mainstem grows from one node to another. A node on the mainstem represents a message
    (post or concern) written by a user on the LIT web portal. The mainstem is considered
    most productive when it turns back and forth because of rapid message turn-taking among
    participants at different locations. The mainstem is unproductive when it grows straight
    up with little twisting because of a lack of rapid message exchange or a lack of
    geographic diversity.

    2. Node and Internode

    A node represents a message added along the mainstem from a particular location and
    point in time. Nodes can generate buds if there is a reply. Many large nodes with short
    internodes are the most productive because participants are rapidly posting messages and
    voting to agree or disagree. On the other hand, few or small nodes are unproductive
    because participants are not posting messages or voting on each other's messages.

    3. Bud

    A bud represents a message to which at least one other participant replied with a
    message of their own. Buds generate shoots and leaves. Many large buds indicate high
    productivity because many participants are replying to each other's messages. Small buds
    are unproductive because participants are not replying to each other's messages.

    4. Tendril

    A tendril represents a vote to agree or disagree with the message in a node, bud, or
    leaf. A tendril grows from a node, bud, or leaf to the specific time and location of the
    voting participant. The productive pattern is a node with many tendrils branching out in
    all directions at a relatively low angle, indicating rapid and geographically diverse
    voting responses. The unproductive pattern is a node with a few short tendrils branching
    out at a relatively high angle, indicating voting responses that are delayed and lack
    geographic diversity.

    5. Shoot

    A reply to a bud is a shoot. A shoot grows from a bud and ends in a leaf at the time and
    location of the responding participant. Shoots branching out in all directions at a
    relatively low angle to the bud are highly productive, whereas shoots branching out in
    only a few directions at a high angle relative to the bud are unproductive.

    6. Leaf

    A leaf is a message sent as a reply. A leaf is generated from a bud and exists at the
    end of a shoot. Many large leaves indicate high productivity, because participants voted
    to agree or disagree with a reply. Few or small leaves indicate low productivity,
    because few participants voted to agree or disagree with a reply.

    Analysts use the following five visual cues as multiple criteria for determining the
    most productive clusters of the grapevine, which are then used for further analysis:

    1. Visual Cue 1: A Coiling Stem

    The grapevine's mainstem grows from one node (post or concern) to the next. A node
    represents a post or a concern raised by a user on the LIT web portal. The location
    (latitude, longitude) of the user is determined by a self-reported home zip code. The
    third dimension is the time coordinate (Pacific Standard Time) logged by the LIT web
    portal whenever a user writes a post or concern. The mainstem grows towards the
    time-space coordinate whenever a new node is generated. The mainstem will be highly
    coiled if it rapidly twists and turns back and forth with a dense collection of nodes.
    This will be the case when many participants interact from many different locations. On
    the other hand, the mainstem will be relatively straight and barren, with a few nodes
    separated by time, if few people are interacting. In terms of a mathematical formula,
    the visual cue 1 value for all nodes from i to j that lie within a certain span of time
    can be calculated as follows:

    SUM of ABS of (from Node i to j) [(LAT i, LONG i) - (LAT j, LONG j)]

    To determine the visual cue 1 value of a segment of the grapevine, analysts calculate
    the sum of the absolute values of all the differences in latitude and longitude between
    each node on the mainstem. The further apart the nodes are in real geographic space, the
    higher the value, all things being equal. However, if there are a lot of nodes, the cue
    value will also be higher. If there are only two nodes and they have exactly the same
    location (latitude, longitude), the result is 0.
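The calculation described above can be sketched in a few lines. This is one possible reading, not the thesis's script: it assumes the nodes are supplied in time order, that differences are taken between consecutive nodes, and that the latitude and longitude differences are added together. The function name `visual_cue_1` is hypothetical.

```python
from typing import List, Tuple


def visual_cue_1(nodes: List[Tuple[float, float]]) -> float:
    """Sum the absolute latitude and longitude differences along the mainstem.

    `nodes` holds (latitude, longitude) pairs, assumed to be in time order.
    """
    total = 0.0
    # Pair each node with the next one on the mainstem.
    for (lat_a, lon_a), (lat_b, lon_b) in zip(nodes, nodes[1:]):
        total += abs(lat_a - lat_b) + abs(lon_a - lon_b)
    return total
```

With two nodes at the same location, such as `visual_cue_1([(47.6, -122.3), (47.6, -122.3)])`, the result is 0, in line with the case described in the text. Adding more nodes or spreading them further apart raises the value.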

    2. Visual Cue 2: Many NodesThe grapevine will be highly productive if there are a lot of nodes along the

    mainstem. This is the case when there is a high voting activity concerning posts or concerns.

    The size of the node is determined based on the number of votes it has received. To

    determine the visual cue 2 value for a segment of grapevine, analysts compute the total

    number of nodes generated and then compute the average size of node, which represents

    the number of votes that the given node received. The larger the node symbol the higher the

    cue value. If there are only two nodes and they each got one vote, the result is 3 (2 (number
    of nodes) + 1 (average number of votes)). The mathematical formula for all nodes from i to j

    that lie within a certain span of time is as follows:

    SUM [Node i to Node j] + AVE No. of VOTES [Node i to Node j]

    3. Visual Cue 3: Many Buds

    Participants can also reply to a post or a concern. In terms of the grapevine, we can say
    that the node (post or concern) has developed into a bud, giving rise to a reply. All buds

    develop from nodes but not all nodes generate buds. Thus buds are special nodes on the

    mainstem that actually got a reply rather than just a vote. The size of a bud (post or concern)

    is determined by the number of replies (reply to a post or concern comment) it got. So if

    there are only two buds and they each got one reply, the result is 3 (2 (number of buds) + 1

    (average number of replies)). The mathematical formula for all buds from i to j that lie within

    a certain span of time is as follows:

    SUM [Bud i to Bud j] + AVE No. of REPLIES [Bud i to Bud j]
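Visual cues 2 and 3 share the same shape: a count of items plus the average of a per-item tally (votes per node, or replies per bud). The sketch below illustrates that shared calculation under the assumption that the tallies for a time span are already available as a list of integers; the function name `count_plus_average` and the choice to return 0 for an empty span are assumptions, not part of the original tools.

```python
def count_plus_average(tallies):
    """Return the number of items plus the average tally per item.

    For visual cue 2, `tallies` holds the votes received by each node;
    for visual cue 3, it holds the replies received by each bud.
    An empty span is scored as 0 in this sketch.
    """
    if not tallies:
        return 0.0
    return len(tallies) + sum(tallies) / float(len(tallies))


# Worked example from the text: two nodes (or buds) with one vote (or reply) each.
example_value = count_plus_average([1, 1])  # 2 items + average of 1 = 3.0
```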


    4. Visual Cue 4: An Open Proliferation of Shoots and Leaves

    As discussed in visual cue 3, a node develops into a bud when other participants
    reply. A highly productive bud is one with an open pattern of shoots branching off at
    a low angle to the bud and extending out in all directions. It implies that participants from many

    different locations replied to the post or concern. In order to determine this visual cue value,

    analysts calculate the ratio of total spatial differences divided by total temporal delays. The

    mathematical formula is as follows:

    ABS((Latitude of bud - Latitude of leaf) + (Longitude of bud - Longitude of leaf)) /
    ABS(Time of bud creation - Time of leaf creation)

    The absolute value of the sum of differences in participant latitude and longitude

    locations is divided by the absolute value of the sum of differences in time between the

    message and the reply. The greater the differences in participant locations and the more

    rapid the replies, the higher the cue value.
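The ratio above can be sketched for a single bud and leaf as follows. This is an illustrative sketch only: the tuple layout (latitude, longitude, time), the use of minutes as the time unit, the function name `proliferation_value`, and the decision to return `None` when the two events share a timestamp are all assumptions rather than details given in the text.

```python
def proliferation_value(origin, response):
    """Spatial difference divided by temporal delay for one origin/response pair.

    `origin` and `response` are (latitude, longitude, time_in_minutes) tuples,
    for example a bud and one of its leaves.
    """
    # Absolute value of the summed latitude and longitude differences,
    # following the formula as written in the text.
    spatial = abs((origin[0] - response[0]) + (origin[1] - response[1]))
    delay = abs(origin[2] - response[2])
    if delay == 0:
        return None  # ratio is undefined for simultaneous events (assumption)
    return spatial / float(delay)
```

Because the formula sums the signed differences before taking the absolute value, a latitude difference and a longitude difference of opposite sign partially cancel; an analyst who wants them to accumulate instead would take the absolute value of each difference separately.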

    5. Visual Cue 5: An Open Proliferation of Tendrils

    Visual cue 5 is similar to visual cue 4, the only difference being that analysts deal
    with nodes and tendrils rather than buds and leaves. Whenever a participant votes on a post
    or concern, a tendril grows up and out from a node. The LIT web portal is designed so
    that a participant can also vote on a reply to a post or on a comment to a concern, thus
    generating an additional set of tendrils off of leaves. A node is known as a dead node if it
    gets no reply or vote. Tendrils that branch off at a low angle indicate a relatively rapid voting

    response. The mathematical formula is as follows:

    ABS((Latitude of node - Latitude of tendril) + (Longitude of node - Longitude of tendril)) /
    ABS(Time of node creation - Time of tendril creation)

    The absolute value of the sum of differences in participant latitude and longitude

    locations is divided by the absolute value of the sum of differences in time between the

    message and the vote. The greater the differences are between participant locations and the

    more rapid the voting, the higher the cue value.

    The above discussed visual cues can help analysts in distinguishing fine-grained

    clusters of activity wherever they emerge. The grapevine technique can be treated as a useful

    extension of general purpose statistical techniques especially during exploratory data analysis

    [4].


    CHAPTER 3

    DESIGN

    This chapter describes the system architecture and the programming elements used for
    implementing and automating grapevine functionalities. The software tools, programming
    language, and technology used for the design and development of all the modules and
    components of the grapevine geovisual analytic technique are covered in this chapter. The
    visual representation of the grapevine and its subsequent analysis are carried out in the ESRI
    ArcGIS Desktop environment. The input discourse database obtained from the LIT web
    portal is stored in Microsoft Access.

    3.1 GRAPEVINE FEATURES AUTOMATED

    In the last chapter, I discussed grapevine as a possible geovisual analytic technique

    for analyzing the outcomes of the online LIT experiment. Two categories of functions were

    identified as the candidates for automation:

    1. Creation of the grapevine structure, given the proper input.

    Grapevine is a space-time structure. To visualize interactions in space and time as a
    grapevine, the data needs to be subset by deliberative event types. A deliberative event
    type is a message or a response to a message. Visualization of a deliberative interaction event
    in space and time requires two types of information events:

    • The location and time a client created a message.
    • The location and time a client responded to that message, with a vote or response.

    The client-server interaction event database is in Microsoft Access and will be
    discussed in detail later in this chapter.

    A grapevine structure is comprised of three main elements:

    • Mainstem: Each node on the mainstem represents the location and time when a client
      created a concern or a post. To create the mainstem of a grapevine, subset post or
      concern event types.
    • Leaves: Leaves are replies to messages along the mainstem. A reply can be made to
      both a post and a concern. To create the leaves of a grapevine, subset reply or
      concern-comment event types.
    • Tendrils: Tendrils are votes on messages. A vote can be made on a post, on a
      concern, on a reply to a post, or on a comment to a concern (concern-comment).

    2. Ranking subsets of grapevine.

    The second category of functionality, which is more advanced, is to help an analyst
    rank portions or segments of a grapevine structure. The analyst can subset and compare
    sections of the grapevine using a specified length of time (expressed in some number of days).
    For example, a researcher may want to compare the quality of deliberative activity by
    day, computing each of the five visual cues (discussed in the previous chapter) as
    multiple-criteria evaluations. Thus, for each day, beginning at 0000 hours and ending at 2359 hours, a
    certain number of grapevine features will have to be sub-selected and their attributes used for
    calculating each of the five visual cues.
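The day-by-day sub-selection described above can be sketched as a grouping step followed by a sort. This is a minimal sketch under stated assumptions: each feature is a dictionary with a `minutes` key holding elapsed minutes since the start of the experiment, and `cue` is any function that scores a list of features. The names `group_by_day` and `rank_days` are hypothetical, and the sketch ranks by a single cue only; how the five cues are combined is not shown here.

```python
from collections import defaultdict

MINUTES_PER_DAY = 24 * 60


def group_by_day(features):
    """Group features into days (0 = first day) by their elapsed minutes."""
    days = defaultdict(list)
    for feature in features:
        day = int(feature["minutes"] // MINUTES_PER_DAY)
        days[day].append(feature)
    return dict(days)


def rank_days(features, cue):
    """Return (day, cue value) pairs ordered from highest to lowest value."""
    scored = [(day, cue(items)) for day, items in group_by_day(features).items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

For instance, `rank_days(features, len)` orders the days by how many features each one contains; a cue function such as a coiling-stem score could be passed in the same way.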

    3.2 ARCGIS DESKTOP

    A number of GIS tools are available on the market. In this project, all the grapevine

    functionalities have been automated in a desktop GIS environment based on Environmental

    Systems Research Institute (ESRI) ArcGIS software. ArcGIS Desktop is the primary

    platform for GIS professionals to compile, analyze, and manage geographic information. It is

    a family of three products:

    • ArcMap
    • ArcCatalog
    • ArcToolbox

    All three products share the core components represented by ArcObjects, user

    interface, and development environment. Each product provides a specialized GIS

    functionality where ArcMap offers a visualization and mapping environment, ArcCatalog

    provides data management capabilities, and ArcToolbox offers analytical functions. ArcGIS

    Desktop also comes with optional extensions that offer specialized tools and additional

    capabilities to enhance the system. Developers can build customized GIS applications and

    extend the capabilities of ArcGIS Desktop using .NET, Visual C++, and Visual Basic. Visual
    Studio is the primary development environment for ArcGIS Desktop applications, and
    Python is the primary scripting language. Figure 3.1 shows the architecture of ArcGIS
    Desktop software, illustrating the division of participating software components into logical
    layers and physical tiers [6].

    Figure 3.1. ArcGIS desktop architecture. Source: ArcGIS Resource Center. Rich client:

    ArcGIS desktop, 2010. http://resources.arcgis.com/content/enterprisegis/10.0/

    rich_client_desktop, accessed Oct. 2010.

    3.2.1 ArcScene

    ArcGIS Desktop comes with a number of extensions to add more capabilities for

    performing extended tasks such as raster geoprocessing, three-dimensional analysis, and map


    publishing. ArcGIS 3D Analyst is one such powerful extension providing tools for three-

    dimensional (3D) visualization, analysis, and surface generation. ArcGIS 3D Analyst helps

    users in the following ways [7]:

    • Creating three-dimensional views directly using GIS data.
    • Analyzing three-dimensional data using cut/fill, line-of-sight, and terrain modeling.
    • Viewing data from a global-to-local perspective.
    • Navigating through multiresolution terrain data seamlessly.
    • Doing spatial analysis in two or three dimensions.
    • Visualizing modeling or analyzing results in three dimensions.
    • Using three-dimensional models and symbols for realism.
    • Exporting visualizations into videos.

    ArcGlobe and ArcScene are the two applications provided by the ArcGIS 3D Analyst
    extension. This project uses ArcScene for the visual representation of the grapevine structure.

    ArcScene helps analysts in managing 3D GIS data effectively, performing 3D analysis,

    creating 3D features, and displaying layers with 3D viewing properties [8]. ArcScene also

    provides a tool for converting two-dimensional (2D) GIS data into 3D features. ArcScene

    helps in creating realistic scenes where users can navigate and interact with GIS data.

    3.2.2 Geoprocessing

    Geoprocessing is a fundamental part of ArcGIS. Essential GIS tools such as data

    analysis, data management and data conversion are provided by geoprocessing.

    Geoprocessing tools (operators) operate on the data in ArcGIS (tables, feature classes,

    rasters, and so on) and perform tasks that are necessary for manipulating and analyzing

    geographic information across a wide range of disciplines. A geoprocessing tool is executed

    by the geoprocessor object. GeoprocessorClass is a main class that simplifies the task of

    executing geoprocessing tools and acts as a single access point for the execution of any

    geoprocessing tool in ArcGIS, including extensions [9]. The geoprocessor contains properties

    and methods that make it possible to execute tools, set global environment settings, and

    examine the resulting messages [9].

    GIS tasks can be automated using geoprocessing, as it provides a mechanism to

    combine a series of geoprocessing tools in a sequence of operations using models and scripts.


    Geoprocessing is based on a framework of data transformation as shown in Figure 3.2 [10].

    A geoprocessing tool takes ArcGIS datasets as inputs (feature classes, tables, and rasters),

    applies an operation against the data, and creates a newly derived dataset as output. There are

    hundreds of such geoprocessing tools. Geoprocessing helps in automating work to solve

    complex problems by chaining together sequences of tools, feeding the output of one tool

    into another.

    Figure 3.2. Typical geoprocessing tool. Source: ArcGIS 9.2 Desktop Help. What is
    geoprocessing? Esri website, 2010.
    http://webhelp.esri.com/arcgisdesktop/9.2/index.cfm?TopicName=What_is_geoprocessing?,
    accessed Oct. 2010.

    3.2.2.1 GEOPROCESSING FRAMEWORK

    The Geoprocessing framework is a small collection of built-in user interfaces for

    organizing and managing existing tools and creating new tools [11]. The basic components of

    the framework are as follows [11]:

    • The ArcToolbox (see Figure 3.3) window for navigating the collection of
      geoprocessing tools and opening them for execution.
    • The tool dialog box for interactively filling out tool parameters and executing the
      tool.
    • The Command Line window for typing in a tool name followed by its parameters and
      executing the tool.
    • The ModelBuilder window for chaining together sequences of tools.
    • Methods for creating scripts and adding them to the ArcToolbox.

    ArcToolbox is the primary entry point into the geoprocessing framework [12]. The

    tools in ArcToolbox are organized into toolboxes and toolsets providing a rich set of

    functionality across a wide range of disciplines. The ArcToolbox window is a tree-view user

    interface and can be viewed in ArcScene by clicking the Show/Hide ArcToolbox window

    button on the standard toolbar. Developers can create their own tools by organizing them into

    new toolsets and toolboxes and sharing them with any ArcGIS user. The geoprocessing tools
    can either be run from a command line or via scripting, or can be chained together via
    ModelBuilder.

    Figure 3.3. ArcToolbox. Source: ArcGIS 9.2 Desktop Help. A whirlwind tour of
    geoprocessing. Esri website, 2010.
    http://webhelp.esri.com/arcgisdesktop/9.2/index.cfm?TopicName=A_whirlwind_tour_of_geoprocessing,
    accessed Oct. 2010.

    3.2.2.2 SCRIPTING

    Geoprocessing tasks can be time intensive since they are often performed on a

    number of different datasets or on large datasets with numerous records. Scripting is an

    efficient method of automating geoprocessing tasks. A program that uses a scripting

    language is called a script. In the geoprocessing framework, scripts can be used to create new
    tools. The advantages of using scripting languages, such as Perl and Python, in the geoprocessing
    framework are as follows [11]:

    • Scripting languages have been extended with third-party libraries for things such as
      advanced math and statistics, web automation, database queries, and advanced system
      utilities.
    • Low-level geoprocessing functions, such as cursors and functions to access the
      properties of ArcGIS data, are available in scripting.
    • Scripts are well suited to wrapping other software, that is, gluing applications together.
    • Scripts can be executed outside ArcGIS from the operating system prompt, provided
      ArcGIS is installed on the machine.
    • Scripting languages have a short learning curve, making project development
      easier and more fun.

    Python is supported as a geoprocessing scripting language in ArcGIS 9.3. For this
    project, Python is used as the scripting language to create new geoprocessing tools. Python
    has several advantages. It is a powerful open-source programming language that allows for fast,
    efficient development as well as connectivity to a vast number of open-source programs
    written in Python, C, or C++. Python scripts are cross-platform, and the available Python
    libraries include tools to connect to commonly used relational databases.

    The two functionalities of grapevine discussed earlier in this chapter have been
    automated by creating a new toolbox, Grapevine Analysis Tools, containing a number of
    Python scripts as tools. These tools automate the visual representation of the grapevine in
    ArcScene, the calculation of the five visual cues discussed in Chapter 2, and the ranking of
    each day of the LIT experiment based on the visual cue values. All of these tools take the LIT
    discourse database, stored in MS Access format and discussed later in this chapter, as input.
    The grapevine analysis tools are discussed in detail in Chapter 4.

    3.3 LIT DISCOURSE DATABASE

    The discourse database is a relational database in MS Access format obtained from the
    LIT web portal. This database is an input to each of the grapevine analysis tools discussed in
    Chapter 4. The database obtained from the LIT web portal contains data regarding people's input
    and feedback on improving transportation in the central Puget Sound region of Washington
    State. The grapevine geoprocessing tools require the database to be in a specific format for
    successfully querying the relevant data. The raw data obtained from the LIT web portal was
    processed and organized into a specific format that is compatible with all the grapevine
    analysis tools. Dr. Aguirre from the University of Washington in Seattle was instrumental in
    organizing the database in the standard format compatible with the grapevine tools. The
    database (see Figure 3.4) has eight tables as follows:


    Figure 3.4. LIT discourse database.

    • CONTENT_Post: This table contains data about all the posts generated over the month-long LIT
      experiment. The data from this table is queried to draw a node on the mainstem.
    • CONTENT_Concern: This table contains data about all the concerns raised in the LIT experiment.
      The data from this table is queried to draw a node on the mainstem.
    • CONTENT_Post_Reply: This table contains data about all the replies made on the corresponding
      posts. The data from this table is queried to draw a leaf grown by a bud.
    • CONTENT_Concern_Comments: This table contains data about all the comments made on the
      corresponding concerns. The data from this table is queried to draw a leaf grown by a bud.
    • CONTENT_Post_Vote: This table contains data about all the votes made on the corresponding
      posts. The data from this table is queried to draw a tendril grown by a node.
    • CONTENT_Concern_Votes: This table contains data about all the votes made on the corresponding
      concerns. The data from this table is queried to draw a tendril grown by a node.
    • CONTENT_Post_Reply_Vote: This table contains data about all the votes made on the
      corresponding replies to a post. The data from this table is queried to draw a tendril grown by a leaf.
    • CONTENT_Concern_Comment_Votes: This table contains data about all the votes made on the
      corresponding comments to a concern. The data from this table is queried to draw a tendril
      grown by a leaf.

    The relationship between all the above eight tables is shown in Figure 3.5. There is a

    one-to-many relationship between each pair of tables described as follows:

    Figure 3.5. Relationships between tables.

    • All the concern comments are associated with their respective concerns by
      creating a relationship between the pgist_cvo_concerns_id column in the
      CONTENT_Concern table (the primary key) and the concern_id column in the
      CONTENT_Concern_Comments table (the foreign key).
    • All the concern votes are associated with their respective concerns by creating a
      relationship between the pgist_cvo_concerns_id column in the CONTENT_Concern
      table (the primary key) and the pgist_cvo_concerns_id column in the
      CONTENT_Concern_Votes table (the foreign key).
    • All the concern comment votes are associated with their respective concern
      comments by creating a relationship between the Pgist_cvo_concern_comments_id
      column in the CONTENT_Concern_Comments table (the primary key) and the
      Pgist_cvo_concern_comments_id column in the
      CONTENT_Concern_Comment_Votes table (the foreign key).
    • All the post replies are associated with their respective posts by creating a
      relationship between the Pgist_discussion_post_id column in the CONTENT_Post
      table (the primary key) and the parent_id column in the CONTENT_Post_Reply
      table (the foreign key).
    • All the post votes are associated with their respective posts by creating a
      relationship between the Pgist_discussion_post_id column in the CONTENT_Post
      table (the primary key) and the Pgist_discussion_post_id column in the
      CONTENT_Post_Vote table (the foreign key).
    • All the post reply votes are associated with their respective post replies by
      creating a relationship between the Pgist_discussion_reply_id column in the
      CONTENT_Post_Reply table (the primary key) and the
      Pgist_discussion_reply_id column in the CONTENT_Post_Reply_Vote table (the
      foreign key).
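To illustrate how such a one-to-many relationship is used in a query, the sketch below joins posts to their replies through the key columns named above. It is only an illustration: an in-memory SQLite database stands in for the MS Access file so that the example is self-contained, the demo tables contain only the key columns, and the helper names `build_demo_db` and `replies_per_post` are hypothetical.

```python
import sqlite3


def build_demo_db():
    """Create a tiny stand-in database containing only the key columns."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE CONTENT_Post (Pgist_discussion_post_id INTEGER PRIMARY KEY)"
    )
    conn.execute(
        "CREATE TABLE CONTENT_Post_Reply ("
        "Pgist_discussion_reply_id INTEGER PRIMARY KEY, "
        "parent_id INTEGER REFERENCES CONTENT_Post(Pgist_discussion_post_id))"
    )
    # Two posts; only post 1 has replies (made-up demo rows).
    conn.executemany("INSERT INTO CONTENT_Post VALUES (?)", [(1,), (2,)])
    conn.executemany("INSERT INTO CONTENT_Post_Reply VALUES (?, ?)", [(10, 1), (11, 1)])
    return conn


def replies_per_post(conn):
    """Count the replies attached to each post via the parent_id foreign key."""
    cursor = conn.cursor()
    cursor.execute(
        "SELECT p.Pgist_discussion_post_id, COUNT(r.parent_id) "
        "FROM CONTENT_Post AS p "
        "LEFT JOIN CONTENT_Post_Reply AS r "
        "ON r.parent_id = p.Pgist_discussion_post_id "
        "GROUP BY p.Pgist_discussion_post_id "
        "ORDER BY p.Pgist_discussion_post_id"
    )
    return cursor.fetchall()
```

The LEFT JOIN keeps posts that received no reply, which corresponds to nodes that never developed into buds.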

    3.4 PYODBC

    PYODBC (Python Open Database Connectivity) is a Python module that allows using
    Open Database Connectivity (ODBC) to connect to almost any database from Windows,
    Linux, OS/X, and other operating systems. It implements the Python Database API
    Specification v2.0, but additional features have been added to simplify database programming
    even more [13]. PYODBC is licensed under the MIT license and is free for commercial and
    personal use [13].

    PYODBC is a critical part of this project, since robust SQL queries are required for
    all the grapevine analysis operations. This project uses PYODBC version 2.1.7 for
    Python to communicate with the MS Access discourse database. ArcGIS offers two
    choices of relational database: the personal geodatabase that comes with ArcGIS,
    and an external relational database such as MS Access. Each database has
    its own pros and cons. For this project, an external relational database in MS Access format was
    chosen for the following reasons:

    • The data from the LIT web portal is already in MS Access format.
    • SQL queries are easily issued from Python to perform database operations. They are
      much faster than the same queries made against the ArcGIS geodatabase, and they
      can be more complex. This is because SQL queries in ArcGIS go through several
      layers (geoprocessor object calls to the geodatabase and to the underlying database engine),
      which slows down the operations.
    • The geodatabase limits the types of SQL queries that are possible. For example, there
      is only limited functionality for SQL WHERE clauses, and there is no ORDER BY
      clause.
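A connection from Python to an Access file through PYODBC might look like the sketch below. It is an assumption-laden illustration rather than the project's code: the helper names are hypothetical, the file path is supplied by the caller, and the driver name shown is the one commonly registered for .mdb files on Windows, which may differ on a given machine. The `pyodbc` import is deferred so that the connection-string helper can be used without the module installed.

```python
def access_connection_string(mdb_path, driver="Microsoft Access Driver (*.mdb)"):
    """Build an ODBC connection string for an MS Access database file."""
    return "DRIVER={%s};DBQ=%s" % (driver, mdb_path)


def fetch_all_posts(mdb_path):
    """Return every row of the CONTENT_Post table (requires pyodbc and an
    installed Access ODBC driver)."""
    import pyodbc  # imported here so the helper above works without pyodbc

    connection = pyodbc.connect(access_connection_string(mdb_path))
    try:
        cursor = connection.cursor()
        cursor.execute("SELECT * FROM CONTENT_Post")
        return cursor.fetchall()
    finally:
        connection.close()
```

Because the query string is plain SQL passed through ODBC, clauses such as WHERE and ORDER BY can be added without going through the geoprocessor layers mentioned above.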


    CHAPTER 4

    USER MANUAL

    This chapter serves as the user's manual for the Grapevine Analysis Tools. The
    user manual discusses the list of automated grapevine functionalities and the way these can
    be used by an analyst. Screenshots are included to aid understanding.

    All the grapevine analysis tools implemented have been broadly classified into three

    categories as follows:

    1. Visual representation of Grapevine, which includes the following:
       • Posts
       • Concerns
       • Replies on the posts
       • Comments on the concerns
       • Votes on the posts
       • Votes on the concerns
       • Votes on the replies to the posts
       • Votes on the comments to the concerns
    2. Rank ordering of discourse activities for each of the following visual cues:
       • Visual Cue 1: A Coiling Stem
       • Visual Cue 2: Lots of Nodes
       • Visual Cue 3: Lots of Buds
       • Visual Cue 4: An Open Proliferation of Shoots and Leaves
       • Visual Cue 5: An Open Proliferation of Tendrils
    3. Discourse contributions made by participants by zip code.

    An ArcToolbox with the name Grapevine Analysis Tools was created (see
    Figure 4.1), containing a number of Python tool scripts for each of the above functionalities.
    The visual representation of the grapevine structure is displayed against the backdrop of the map of
    Washington State, since the LIT experiment was conducted in the central Puget Sound region
    of Washington. The Washington State zip code shapefile was obtained from the US Census Bureau
    website [14] and imported into the ArcScene environment.

    Figure 4.1. Grapevine analysis toolbox.

    4.1 POSTS

    This tool displays all the post nodes in the ArcScene environment. The
    output of this tool is a three-dimensional (3D) point shapefile, with each point representing a
    post node. The three dimensions of a post node are the latitude, longitude, and time at which the
    post was created. On clicking the Posts tool in the ArcToolbox, a dialog box (see Figure 4.2)
    pops up asking for the input database directory location and the desired output shapefile
    directory location. The input database is the LIT discourse database (discussed in Chapter 3).
    The output shapefile can be given any name the analyst chooses for the
    output post nodes shapefile. After providing both inputs, click the OK button to run
    the tool. Once the execution completes, add the post nodes shapefile (see Figure 4.3) to the
    ArcScene environment for visualization. Each node in the shapefile carries its own
    attribute information. The Identify tool of the ArcScene environment can be used to
    display the attribute values of a node by clicking on it. Sample information
    revealed by the Identify tool is shown in Figure 4.4.

    Figure 4.2. Posts tool dialog box.

    POINT_X and POINT_Y represent the longitude and latitude of the node. POINT_Z

    represents the elapsed time (in minutes) since the first day of the experiment, and the Time

    field gives information about the elapsed time in a user-friendly manner. Msg_Title

    represents the post title. Message is the content of the post. Area and ZIP give the

    information about the participant location. Step and sub-step represent one of five major

    steps comprising the LIT experiment (discussed in Chapter 2) and its subsequent activity.
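The relationship between the numeric POINT_Z value and a readable Time value can be illustrated with a small conversion helper. The exact format of the Time field is not specified here, so the "day N, HH:MM" layout below is an assumed format chosen for illustration, and the function name `describe_elapsed` is hypothetical.

```python
def describe_elapsed(minutes):
    """Convert elapsed minutes since the start of the experiment into a
    readable 'day N, HH:MM' string (day 1 is the first day)."""
    days, remainder = divmod(int(minutes), 24 * 60)
    hours, mins = divmod(remainder, 60)
    return "day %d, %02d:%02d" % (days + 1, hours, mins)


# 1505 minutes = one full day (1440 minutes) plus 1 hour and 5 minutes.
example_label = describe_elapsed(1505)  # "day 2, 01:05"
```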


    Figure 4.3. Post nodes shapefile (in red).

    Figure 4.4. Post node details.

    4.2 REPLIES ON A POST

    This tool displays all the nodes representing replies to their respective posts. The tool is run
    in the same manner as the Posts tool. The output point shapefile and node data (obtained from
    the Identify tool of ArcScene) are shown in Figure 4.5. The nodes in red are posts and the
    nodes in blue are their corresponding replies.

    4.3 POST-REPLY RELATIONSHIP

    This tool is used to see the visual links (see Figure 4.6) between each reply and its
    corresponding post. The tool is run in the same manner as the Posts tool, requiring the user to enter
    the LIT discourse database and the desired output shapefile name as inputs. The output shapefile is a
    polyline shapefile, with each polyline having a post node and a reply node as its two end-points.

    Figure 4.5. Replies on posts (in blue).

    Figure 4.6. Post-reply relationship.
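The pairing of end-points behind such a polyline output can be sketched independently of ArcGIS. In the sketch below, each parent (for example a post) is keyed by its identifier and each child (for example a reply) carries the identifier of its parent; the data layout, the function name `link_segments`, and the choice to skip children whose parent is missing are assumptions made for illustration. Writing the segments to a shapefile is not shown.

```python
def link_segments(parents, children):
    """Pair each child point with its parent point.

    `parents` maps an identifier to an (x, y, z) point.
    `children` is a list of (parent_id, (x, y, z)) tuples.
    Returns a list of (start_point, end_point) segments.
    """
    segments = []
    for parent_id, child_point in children:
        parent_point = parents.get(parent_id)
        if parent_point is None:
            continue  # child without a known parent is skipped in this sketch
        segments.append((parent_point, child_point))
    return segments
```

The same pairing applies to the other relationship tools in this chapter, with concerns, comments, and votes taking the parent and child roles.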

    4.4 CONCERNS

    This tool displays all the concern nodes in the ArcScene environment. It is run in the same
    manner as the Posts tool discussed in Section 4.1. The output point shapefile and node data (obtained
    from the Identify tool of ArcScene) are shown in Figure 4.7.

    4.5 COMMENTS ON A CONCERN

    This tool displays all the nodes representing comments directed at their respective
    concerns. The output point shapefile and node data (obtained from the Identify tool of
    ArcScene) are shown in Figure 4.8. The nodes in red are concerns and the nodes in blue
    are their corresponding comments.


    Figure 4.7. Concerns (in red).

    Figure 4.8. Comments on a concern (in blue).

    4.6 CONCERN-COMMENT RELATIONSHIP

    This tool is used to see the visual links (see Figure 4.9) between each concern and its
    corresponding comment. The tool is run in the same manner as the Posts tool and requires the user to
    provide the LIT discourse database and the desired output shapefile name as inputs. The output
    shapefile is a polyline shapefile, with each polyline having a concern node and a comment node
    as its two end-points.

    4.7 VOTES ON POSTS

    This tool displays all the nodes representing votes for the respective posts. The user
    votes by either agreeing or disagreeing with a post. Both voting outcomes are tallied when
    counting the total number of votes for a post. The tool is run in the same manner as the Posts tool.
    The output point shapefile is shown in Figure 4.10. The nodes in red are posts and the
    nodes in ultramarine are their corresponding votes.


    Figure 4.9. Concern-comment relationship.

    Figure 4.10. Votes on posts (in ultramarine).

    4.8 POST-VOTE RELATIONSHIP

    This tool is used to display the visual links (see Figure 4.11) between each post and
    its corresponding vote. The tool is run in the same manner as the Posts tool and requires the user to
    provide the LIT discourse database and the desired output shapefile name as inputs. The output
    shapefile is a polyline shapefile, with each polyline having a post node and a vote node as its two
    end-points.

    Figure 4.11. Post-vote relationship.


    4.9 VOTES ON CONCERNS

    This tool displays all the nodes representing the votes for their respective concerns.
    The user votes by either agreeing or disagreeing with a concern. Both outcomes are tallied
    when counting the total number of votes for a concern. The tool is run in the same manner as the
    Posts tool. The output point shapefile is shown in Figure 4.12. The nodes in red are
    concerns and the nodes in ultramarine are their corresponding votes.

    Figure 4.12. Votes on concerns (in ultramarine).

    4.10 CONCERN-VOTE RELATIONSHIP

    This tool is used to see the visual links (see Figure 4.13) between each concern and its
    corresponding vote. The tool is run in the same manner as the Posts tool and requires the user to
    provide the LIT discourse database and the desired output shapefile name as inputs. The output
    shapefile is a polyline shapefile, with each polyline having a concern node and a vote node as its
    two end-points.

    Figure 4.13. Concern-vote relationship.


    4.11 VOTES ON POST-REPLIES

    A user can also vote on a reply to a post. This tool displays all the nodes
    representing the votes for their respective post-replies. The user votes by either agreeing or
    disagreeing with a message. Both outcomes are tallied when counting the total number of
    votes for a post-reply. The tool is run in the same manner as the Posts tool. The output point
    shapefile is shown in Figure 4.14. The nodes in blue are post-replies and the nodes in
    ultramarine are their corresponding votes.

    Figure 4.14. Votes on post-replies (in ultramarine).

    4.12 VOTE-POST-REPLY RELATIONSHIP

    This tool shows the visual links (see Figure 4.15) between each post-reply and its
    corresponding vote. The tool is run in the same manner as the Posts tool and requires the
    user to provide the LIT discourse database and the desired output shapefile name as inputs.
    The output is a polyline shapefile in which each polyline has a post-reply node and a vote
    node as its two end-points.

    Figure 4.15. Vote-post-reply relationship.


    4.13 VOTES ON COMMENTS TO A CONCERN

    A user can also vote on the comments made on a concern. This tool displays the nodes
    representing the votes cast on each concern-comment. A user votes by either agreeing or
    disagreeing with a message, and both outcomes are counted toward the total number of votes
    for that concern-comment. The tool is run in the same manner as the Posts tool. The output
    point shapefile is shown in Figure 4.16: the blue nodes are concern-comments and the
    ultramarine nodes are their corresponding votes.

    Figure 4.16. Votes on concern-comments.

    4.14 VOTE-CONCERN COMMENT RELATIONSHIP

    This tool shows the visual links (see Figure 4.17) between each concern-comment and its
    corresponding vote. The tool is run in the same manner as the Posts tool: the user provides
    the LIT discourse database and the desired output shapefile name as inputs. The output is a
    polyline shapefile in which each polyline has a concern-comment node and a vote node as its
    two end-points.

    Figure 4.17. Votes-concern comments relationship.


    4.15 VISUAL CUE 1: A COILING STEM

    This tool performs the calculations for visual cue 1 (discussed in Chapter 2), which
    analyzes the pattern of the grapevine structure by assessing the extent to which the main
    stem of the grapevine tends to coil. The tool ranks each day of the LIT experiment by its
    cue value and produces the following two outputs:

    • LIT experiment days sorted by their rank.
    • A bar graph showing the cue value for each day over the entire duration of the
      experiment.

    When the tool is run, the dialog box shown in Figure 4.18 pops up. It asks for the
    following two inputs:

    • Input Database: The LIT discourse database directory address.
    • Output Graphs: The directory workspace in which the bar graph will be created.

    Figure 4.18. Visual cue 1 dialog box.

    Click the OK button to run the tool. The output obtained is shown in Figure 4.19. The
    Output window also reports the location of the bar graph (in PNG format) for visual cue 1.
    Browse to that location and open the file Cue1.png, shown in Figure 4.20. The x-axis
    represents the days of the LIT experiment and the y-axis represents the cue values.
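
    The ranking step can be illustrated with a short Python sketch. It takes the per-day cue
    values as given, because the cue formula from Chapter 2 is not reproduced here, and it
    assumes that a higher cue value earns a better rank; both the sample values and the
    function rank_days are hypothetical. The plotting step that produces the PNG bar graph is
    omitted.

```python
def rank_days(cue_by_day):
    """Return (rank, day, cue_value) tuples, highest cue value first.

    Ties are broken in favour of the earlier day.
    """
    ordered = sorted(cue_by_day.items(), key=lambda item: (-item[1], item[0]))
    return [(rank, day, value) for rank, (day, value) in enumerate(ordered, start=1)]


# Hypothetical cue values keyed by experiment day.
cue_values = {1: 0.40, 2: 0.75, 3: 0.10, 4: 0.75}

ranking = rank_days(cue_values)
for rank, day, value in ranking:
    print(f"rank {rank}: day {day} (cue = {value:.2f})")
```

    If a lower cue value were the better outcome for a given cue, only the sign in the sort key
    would change.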

    4.16 VISUAL CUE 2: LOTS OF NODES

    This tool performs the calculations for visual cue 2 (discussed in Chapter 2), which
    analyzes the level of participation based on the number of nodes in the grapevine structure
    and the votes received by them. The tool produces output in the same format as the one
    illustrated in Section 4.15. The cue values for each day of the LIT experiment and the bar
    graph are shown in Figure 4.21 and Figure 4.22, respectively.

    Figure 4.19. Visual cue 1 values.

    4.17 VISUAL CUE 3: LOTS OF BUDS

    This tool performs the calculations for visual cue 3 (discussed in Chapter 2), which
    analyzes the level of participation based on the number of buds in the grapevine structure
    and the replies received by them. The tool is run, and produces output, in the same manner
    as the one illustrated in Section 4.15. The cue values for each day of the LIT experiment
    and the bar graph are shown in Figure 4.23 and Figure 4.24, respectively.

    Figure 4.20. Cue1.png.

    4.18 VISUAL CUE 4: AN OPEN PROLIFERATION OF SHOOTS AND LEAVES

    This tool performs the calculations for visual cue 4 (discussed in Chapter 2). Its results
    facilitate an assessment of user participation based on the proliferation of the shoots and
    leaves, which indicates how far apart, geographically, the users generating replies to
    existing comments and posts are located. The tool produces output in the same format as the
    one illustrated in Section 4.15. The cue values for each day of the LIT experiment and the
    bar graph are shown in Figure 4.25 and Figure 4.26, respectively.
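
    To make the notion of "how far apart" concrete, the sketch below computes the great-circle
    (haversine) distance between two participant locations. This is not the cue 4 formula,
    which is defined in Chapter 2 and not reproduced here; it is only one possible way to
    measure geographic separation, and the coordinates and the function great_circle_km are
    assumptions made for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Hypothetical locations of a comment's author and of a user replying to it.
author = (47.61, -122.33)
replier = (47.66, -122.31)
print(round(great_circle_km(*author, *replier), 2))
```

    If the node coordinates are already stored in a projected coordinate system, a plain
    Euclidean distance would be the simpler choice.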


    Figure 4.21. Visual cue 2 values.

    4.19 VISUAL CUE 5: AN OPEN PROLIFERATION OF TENDRILS

    This tool performs the calculations for visual cue 5 (discussed in Chapter 2). Its results
    provide an assessment of the pattern of user participation based on the proliferation of
    the tendrils, which indicates how far apart, geographically, the users voting on existing
    posts and concerns are located. The tool is run, and produces output, in the same manner as
    the one illustrated in Section 4.15. The cue values for each day of the LIT experiment and
    the bar graph are shown in Figure 4.27 and Figure 4.28, respectively.

    Figure 4.22. Cue2.png.

    4.20 DISCOURSE CONTRIBUTIONS BY ZIP CODE

    This tool counts the discourse contributions (posts, replies, votes) made by the
    participants, by zip code, for the entire duration of the discussion and, for each selected
    zip code, displays a line graph showing the productivity of the participants from that zip
    code. When this tool is clicked in ArcToolbox, the dialog box shown in Figure 4.29 pops up.
    The tool asks for the following three inputs:

    • Input Database: The LIT discourse database directory address.
    • Output Graph: The directory workspace in which the line graph will be created.
    • Zip Code: The five-digit zip code for which the number of discourse contributions will
      be calculated.

    Upon clicking the OK button, the tool produces the output shown in Figure 4.30. The
    corresponding output line graph is shown in Figure 4.31.
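
    The counting behind the line graph can be sketched in Python as follows. The sketch
    assumes that contributions are counted per experiment day, which this section does not
    state explicitly; the record layout, the sample zip codes, and the function daily_counts
    are likewise hypothetical. Plotting the resulting series is omitted.

```python
from collections import Counter

# Hypothetical contribution records: (day, zip_code, kind).
contributions = [
    (1, "98101", "post"),
    (1, "98101", "vote"),
    (2, "98101", "reply"),
    (2, "98105", "post"),
    (4, "98101", "vote"),
]


def daily_counts(records, zip_code, first_day, last_day):
    """Return [(day, count)] for one zip code, including days with no activity."""
    counts = Counter(day for day, zc, _kind in records if zc == zip_code)
    return [(day, counts[day]) for day in range(first_day, last_day + 1)]


series = daily_counts(contributions, "98101", 1, 4)
print(series)
```

    Days without any contribution are kept with a count of zero so that the line graph does
    not skip them.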


    Figure 4.23. Visual cue 3 values.


    Figure 4.24. Cue3.png.


    Figure 4.25. Visual cue 4 values.


    Figure 4.26. Cue4.png.


    Figure 4.27. Visual cue 5 values.


    Figure 4.29. Discourse contributions by zip code dialog box.


    Figure 4.30. Discourse contribution output.


    Figure 4.31. Discourse contribution line graph.


    CHAPTER 5

    CONCLUSION

    The Grapevine Analysis Tools enhance the use and understanding of geovisual analysis and
    are extremely useful during exploratory data analysis. These tools enable the analyst to
    take a more active role in the discovery process by distinguishing fine-grained clusters of
    activity wherever they emerge. Anyone with a large amount of client-server interaction data
    that can be brought into a standard database format (discussed in Chapter 3) can benefit
    from these tools by visually exploring patterns and making judgments based on the results.

    Future enhancements for Grapevine Analysis Tools might include the following:

    • The size of the nodes/buds can be made proportional to the number of replies and votes
      they receive. This will help analysts visually identify the nodes/buds that receive a
      greater number of replies and votes.
    • Currently the tool displays the grapevine (main stem, leaves, and tendrils) for the
      entire duration of the experiment. A tool can be made to display segments or portions of
      the grapevine within a chosen range of time, for instance, all of the posts or replies
      with a time stamp equal to a certain day.
    • The five visual cues rank each day of the LIT experiment based on the cue values. A tool
      can be made to rank each step and sub-step of the LIT experiment based on the cue values.
    • A software module can be developed to parse the interaction event logs generated on the
      LIT web portal and collect the relevant data into an MS Access database in a format that
      is compatible with the Grapevine Analysis Tools.
    • A graphical query tool can be developed that allows the analyst to select any type and
      number of grapevine elements from the map and returns descriptive statistics and cue
      values for the selected elements.
    • The performance of the analysis tools can be optimized by using SQL joins to query the
      data from the database.
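
    The last enhancement can be illustrated with a small, self-contained Python sketch in
    which a single joined query returns every post-vote pair in one round trip. An in-memory
    SQLite database stands in for the MS Access database, and the table and column names are
    hypothetical; the actual LIT schema is described in Chapter 3 and is not reproduced here.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for the MS Access database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (post_id INTEGER PRIMARY KEY, author TEXT);
    CREATE TABLE votes (vote_id INTEGER PRIMARY KEY, post_id INTEGER, voter TEXT);
    INSERT INTO posts VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO votes VALUES (10, 1, 'carol'), (11, 1, 'dave'), (12, 2, 'erin');
""")

# One joined query returns every post-vote pair, instead of issuing a
# separate vote lookup for each post.
rows = conn.execute("""
    SELECT p.post_id, p.author, v.vote_id, v.voter
    FROM posts AS p
    INNER JOIN votes AS v ON v.post_id = p.post_id
    ORDER BY p.post_id, v.vote_id
""").fetchall()

for row in rows:
    print(row)
conn.close()
```

    Each returned row already pairs a post with one of its votes, which is the information a
    relationship tool needs to draw one polyline.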

