a multi-facetted visual analytics tool for exploratory ... · benefit from visual analytics...

17
TECHNOLOGY REPORT published: 23 August 2016 doi: 10.3389/fninf.2016.00036 Frontiers in Neuroinformatics | www.frontiersin.org 1 August 2016 | Volume 10 | Article 36 Edited by: Arjen Van Ooyen, Vrije Universiteit Amsterdam, Netherlands Reviewed by: Hidetoshi Ikeno, University of Hyogo, Japan Xi-Nian Zuo, Chinese Academy of Sciences, China *Correspondence: Diego A. Angulo [email protected] Received: 19 April 2016 Accepted: 03 August 2016 Published: 23 August 2016 Citation: Angulo DA, Schneider C, Oliver JH, Charpak N and Hernandez JT (2016) A Multi-facetted Visual Analytics Tool for Exploratory Analysis of Human Brain and Function Datasets. Front. Neuroinform. 10:36. doi: 10.3389/fninf.2016.00036 A Multi-facetted Visual Analytics Tool for Exploratory Analysis of Human Brain and Function Datasets Diego A. Angulo 1 *, Cyril Schneider 2 , James H. Oliver 3 , Nathalie Charpak 4 and Jose T. Hernandez 1 1 IMAGINE, Systems and Computing Engineering, Universidad de los Andes, Bogota, Colombia, 2 LCNS, CHUL, Quebec, QC, Canada, 3 VRAC, Iowa State University, Ames, IA, USA, 4 Kangaroo Foundation, Bogota, Colombia Brain research typically requires large amounts of data from different sources, and often of different nature. The use of different software tools adapted to the nature of each data source can make research work cumbersome and time consuming. It follows that data is not often used to its fullest potential thus limiting exploratory analysis. This paper presents an ancillary software tool called BRAVIZ that integrates interactive visualization with real-time statistical analyses, facilitating access to multi-facetted neuroscience data and automating many cumbersome and error-prone tasks required to explore such data. Rather than relying on abstract numerical indicators, BRAVIZ emphasizes brain images as the main object of the analysis process of individuals or groups. BRAVIZ facilitates exploration of trends or relationships to gain an integrated view of the phenomena studied, thus motivating discovery of new hypotheses. A case study is presented that incorporates brain structure and function outcomes together with different types of clinical data. Keywords: exploratory analysis, visual analytics, fMRI, MRI, tractography, cohort studies, python 1. MOTIVATION An important challenge in brain research, in both normal and pathological conditions, is to better understand the extent to which the physical structure of the brain influences its functioning. The most common research procedure is characterized by experiments aimed at collecting data directed toward testing a former hypothesis. This confirmatory-like methodology imposes limitations on the way data is used, and it is typically used only once which is unfortunate since data acquisition is generally time and resource intensive. In the last two decades brain research data has increasingly been gathered in a more open fashion and many databases are now available to the public (Milham, 2012). In parallel, data collection, storage, and sharing, have been improved at both technical and policy levels, have advanced (Eckersley et al., 2003), together with technologies used to consolidate, search and access the data (Van Horn and Toga, 2009; Wood et al., 2014). This allows massive amounts of data to be consolidated into databases and searched in efficient ways. In this way questions can be explored and large data pools can be mined for interesting relationships. The resting state functional magnetic resonance imaging (rfMRI) community is exemplary with large efforts such as the Consortium for Reliability and Reproducibility (CoRR) (Zuo and Xing, 2014) and the Human Connectome Project (Marcus et al., 2013; Hodge et al., 2016), which allow researchers to share data and enables the exploration of integrated datasets containing data from thousands of subjects.

Upload: others

Post on 20-May-2020

10 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: A Multi-facetted Visual Analytics Tool for Exploratory ... · benefit from visual analytics techniques. Indeed, brain function-related datasets are a combination of spatial (brain

TECHNOLOGY REPORTpublished: 23 August 2016

doi: 10.3389/fninf.2016.00036

Frontiers in Neuroinformatics | www.frontiersin.org 1 August 2016 | Volume 10 | Article 36

Edited by:

Arjen Van Ooyen,

Vrije Universiteit Amsterdam,

Netherlands

Reviewed by:

Hidetoshi Ikeno,

University of Hyogo, Japan

Xi-Nian Zuo,

Chinese Academy of Sciences, China

*Correspondence:

Diego A. Angulo

[email protected]

Received: 19 April 2016

Accepted: 03 August 2016

Published: 23 August 2016

Citation:

Angulo DA, Schneider C, Oliver JH,

Charpak N and Hernandez JT (2016)

A Multi-facetted Visual Analytics Tool

for Exploratory Analysis of Human

Brain and Function Datasets.

Front. Neuroinform. 10:36.

doi: 10.3389/fninf.2016.00036

A Multi-facetted Visual Analytics Toolfor Exploratory Analysis of HumanBrain and Function DatasetsDiego A. Angulo 1*, Cyril Schneider 2, James H. Oliver 3, Nathalie Charpak 4 and

Jose T. Hernandez 1

1 IMAGINE, Systems and Computing Engineering, Universidad de los Andes, Bogota, Colombia, 2 LCNS, CHUL, Quebec,

QC, Canada, 3 VRAC, Iowa State University, Ames, IA, USA, 4 Kangaroo Foundation, Bogota, Colombia

Brain research typically requires large amounts of data from different sources, and often

of different nature. The use of different software tools adapted to the nature of each

data source can make research work cumbersome and time consuming. It follows that

data is not often used to its fullest potential thus limiting exploratory analysis. This paper

presents an ancillary software tool called BRAVIZ that integrates interactive visualization

with real-time statistical analyses, facilitating access to multi-facetted neuroscience data

and automating many cumbersome and error-prone tasks required to explore such data.

Rather than relying on abstract numerical indicators, BRAVIZ emphasizes brain images

as the main object of the analysis process of individuals or groups. BRAVIZ facilitates

exploration of trends or relationships to gain an integrated view of the phenomena

studied, thus motivating discovery of new hypotheses. A case study is presented that

incorporates brain structure and function outcomes together with different types of

clinical data.

Keywords: exploratory analysis, visual analytics, fMRI, MRI, tractography, cohort studies, python

1. MOTIVATION

An important challenge in brain research, in both normal and pathological conditions, is to betterunderstand the extent to which the physical structure of the brain influences its functioning. Themost common research procedure is characterized by experiments aimed at collecting data directedtoward testing a former hypothesis. This confirmatory-like methodology imposes limitations onthe way data is used, and it is typically used only once which is unfortunate since data acquisitionis generally time and resource intensive.

In the last two decades brain research data has increasingly been gathered in a more openfashion and many databases are now available to the public (Milham, 2012). In parallel, datacollection, storage, and sharing, have been improved at both technical and policy levels, haveadvanced (Eckersley et al., 2003), together with technologies used to consolidate, search and accessthe data (Van Horn and Toga, 2009; Wood et al., 2014). This allows massive amounts of data to beconsolidated into databases and searched in efficient ways. In this way questions can be exploredand large data pools can be mined for interesting relationships.

The resting state functional magnetic resonance imaging (rfMRI) community is exemplary withlarge efforts such as the Consortium for Reliability and Reproducibility (CoRR) (Zuo and Xing,2014) and the Human Connectome Project (Marcus et al., 2013; Hodge et al., 2016), which allowresearchers to share data and enables the exploration of integrated datasets containing data fromthousands of subjects.

Page 2: A Multi-facetted Visual Analytics Tool for Exploratory ... · benefit from visual analytics techniques. Indeed, brain function-related datasets are a combination of spatial (brain

Angulo et al. BRAVIZ: Visual Analysis of Brain Data

This has led to rapid changes in the way research canbe conducted, i.e., a shift from hypothesis-driven researchinto data-driven research, where data is available first andresearch questions and hypotheses are formulated on thebasis of exploration of data. The methodology used in data-driven research differs significantly from that for hypothesis-driven research. Data-driven research seeks to find and extractmeaningful insights from data using exploratory methods(Tukey, 1980) and has already proven effective in economics,terrorism prevention, and business intelligence domains whichare characterized by large and heterogeneous data sets (Cookand Thomas, 2005). Exploratory research involves iteratingthrough data several times, looking at it from different pointsof view, transforming data, searching for interesting subjects andmeasurements, gathering details and performing group analyses.These analysis tasks are carried out multiple times, and often indifferent order, as researchers learn more about the data. It istherefore helpful to provide tools to make annotations and savefindings, so that explorations can be continued later as an integralpart of the ongoing process of discovery.

During this process several data patterns may likely leadto unexpected insights. Unfortunately, it is also likely thatthese patterns are caused by the unique noise structure ofthe current data and therefore cannot be generalized to theglobal population. Automatic data-mining algorithms can findthousands of possible relations, but true findings need to bebacked up by science and current knowledge. Therefore, domainexperts must be involved in interpretation of insights. Moreover,insights that integrate data from different domains requireexperts from all these domains.

Visual analytics (Keim et al., 2008) has emerged as a disciplinethat seeks to integrate statistics, machine learning, data miningand interactive data visualization with the objective of optimizingthe use of data available for exploratory research. The analystis acknowledged as the most important actor, and all toolsare designed to support exploration and provide timely accessto the required data as well as to informatics and statisticsfunctions. Another principle in visual analytics is that analystsshould focus on data and not on operational details of thetools. Therefore, tools should provide the data and functionalityto complete the task, while keeping non-relevant details andcomplex functionality hidden.

Exploratory brain research is a domain that could certainlybenefit from visual analytics techniques. Indeed, brain function-related datasets are a combination of spatial (brain imaging)and non-spatial (clinical) measurements that could be analyzedtogether to better understand the link between brain structureand function as they relate to human health and behavior.Examples of spatial measurements include brain anatomyacquired by means of magnetic resonance imaging (MRI),neural pathways trajectory acquired by diffusion weightedimaging (DWI), patterns of cerebral activation in specific tasksand acquired by functional MRI (fMRI), or brain networksfunctionality and corticomotor function tested by transcranialmagnetic stimulation (TMS) of motor and non-motor areasof brain. Specialized tools can process these neuroimagingmodalities to model brain structure, build pathways, produce

statistical maps of activation patterns, and neural connections.Brain researchers need to correlate these neurophysiologicalmeasurements and models to data of a different nature, suchas neuropsychological performance, behaviors, and other clinicaldata.

However, the tools currently used in brain research aregenerally specific to the type or domain of data analyzed and theyare optimized to support linear work flows. It follows that expertsmust often switch between tools to integrate and analyze datafrom different domains. In the worst cases, they may even haveto move to a different computer. This process is time-consumingand repetitive. It requires the analyst to focus attention onthe “how” rather than the “what,” and thus makes exploratoryanalysis challenging.

2. INTRODUCTION

This paper introduces BRAVIZ, a software tool based on visualanalytics and aimed at supporting exploratory analysis in brainresearch. More specifically, it focuses on datasets that includeMRI derived measurements and models as well as TMS andclinical outcomes. BRAVIZ is comprised of several applicationsthat integrate interactive visualizations, links to detailed meta-data, creation of new variables as well as statistical models andanalyses, all of which are designed to support and facilitate fullexploratory analyses. It is implemented on python and availableunder an open source license.

3. RELATED WORK ON NEUROIMAGINGAND EXPLORATORY ANALYSIS

Several data processing and visualization tools are availableto support research on neuroimaging. For example, Freesurfer(Fischl, 2012), FSL (Jenkinson et al., 2012), and SPM (Fristonet al., 2006) segment, register and perform statistical testingof brain image data. 3D Slicer (Fedorov et al., 2012), BrainVisa (Cointepas et al., 2001), and ITKSnap (Yushkevich et al.,2006) are commonly used to integrate data from different imagemodalities (structural, diffusion-weighted, functional amongothers). They have all proven to be efficient at processing bulkimages in a pipeline, and visualizing data from a single subject,but they fall short when several iterations through the data arerequired. The interfaces proposed for statistical testing requireextensive configuration, which is appropriate for testing specifichypotheses, but becomes cumbersome when several possibilitiesare to be explored. Efficient mechanisms for restricting analysisto only a subset of subjects (e.g., with common clinical orlesion characteristics) or going back to a subject’s details aremissing. Complementary data loaded from tables can be used, butchanging variables often means creating new tables and makingthem fit the required format.

Non-spatial information visualization tools like GGobi (Cookand Swayne, 2007) and Tableau (Hanrahan, 2003) can beused for interactive exploratory analysis. They support datatransformations, model fitting, and interactive visualization.They also enable detection of outliers (important in data-driven

Frontiers in Neuroinformatics | www.frontiersin.org 2 August 2016 | Volume 10 | Article 36

Page 3: A Multi-facetted Visual Analytics Tool for Exploratory ... · benefit from visual analytics techniques. Indeed, brain function-related datasets are a combination of spatial (brain

Angulo et al. BRAVIZ: Visual Analysis of Brain Data

research whereas usually unnecessary in hypothesis-drivenresearch), provide additional details, determine subsets ofsubjects, and visualize patterns and trends in different ways (e.g.,parallel coordinates, scatter plots, histograms, etc.). However,these tools do not integrate well with spatial data. Scalar dataderived from original images can be added but there is no easyway to link back to the original data or to explore spatial featuresthat cannot be encoded into numerical variables.

Recently, there has been an increased interest in resting statefMRI (rfMRI) and the connectivity networks that are inferredfrom it (Biswal et al., 2010; Rubinov and Sporns, 2010). Analyzingthis data requires the mixed use of spatial tools performingvoxel wise and cluster analyzes with graph oriented tools andstatistical analysis tools. The Connectome Computation System(CCS) (Xu et al., 2015) integrates data pre-processing tools,with connectome generation and finally mining and visualizationof automatic data. This kind of integrated environment allowsresearchers to efficiently explore data and pursue multiplehypotheses. BRAVIZ focuses on the last layer of this process butis focused on other neuro-image modalities.

INVIZIAN (Bowman et al., 2011, 2012a,b) uses Ggobi optionsto explore, in an abstract 3D space, the relations between scalarvalues and anatomical features of large brain datasets. It providesan environment for exploratory research involving data fromseveral databases for hypotheses generation. However, it worksonly with automatic feature extraction from structural MRI. LikeBRAVIZ, INVIZIAN focuses on easing the users’ visual patternsearch, however BRAVIZ targets a broader range of spatial data,and focuses on data generated by users rather than on machinelearning.

A visual analysis tool for high dimensional genetic and clinicaldata is presented in Hinterberg et al. (2014). The user exploresthe data by quickly iterating through several models that relategenetic data to clinical outcomes. Models are displayed as treesand linked with distributions of the selected parameters. Toolsare provided to automatically find the most relevant parametersto reduce the size of the search space. This tool is optimizedfor a single type of data, a single type of model, and a specificworkflow. In contrast, BRAVIZ proposes to support multipletasks, data types and workflows by providing a set of applicationsto be used independently or combined for complex exploratoryanalyses, as for example to interactively probe the multifacetedrelationships between spatial and non spatial outcomes atthe level of individuals, subsets of subjects or group ofsubjects.

Even though multiple tools exist for analysis and explorationof spatial and non-spatial data, the integration of multiplekinds/levels of analyses and data in a single environment remainschallenging. BRAVIZ provides a unique unified environment foranalyzing spatial and non-spatial data interactively, and in thisway, it supports data-driven research.

4. BRAVIZ ARCHITECTURE

Figure 1 shows the BRAVIZ architecture which is based onthe Model-View-Controller pattern, where the bottom layer(Project reader) together with the data repository constitute

FIGURE 1 | The BRAVIZ software architecture, the main library is

shaded.

the model, the BRAVIZ Library (shaded region) makes upthe controller, and each application represents a differentview.

Instead of a large monolithic application, BRAVIZ takesa distributed approach and thus is comprised of a set ofapplications tailored toward specific analysis tasks and datatypes. New BRAVIZ applications are implemented based onthe common library, freeing developers from thinking abouttechnical details related to data manipulation. All applicationsuse the same model, which also provides a channel for datasharing. Users can create custom samples, new variables andcustom geometric structures and store them in the database,where they can be read by any other BRAVIZ application. Anadditional communication mechanism is provided that enablesapplications to exchange data in real-time. In this way individualapplications can be combined to solve more complex tasks.Section 4.2 will describe current applications.

The library provides tools for loading spatial data (andtransforming it into an appropriate coordinate system), formanipulating tabular data, for creating spatial and non-spatialinteractive data visualizations, and for interacting with otherapplications and between different users. The types of spatial-data supported by the current implementation are: structuralMRI, diffusion MRI, functional MRI, label maps, tractographyreconstructions, structure segmentation models, and Freesurfercortex reconstructions. Non-spatial data can be any numericor categorical variable, including TMS, clinical, and socio-economic data. More details on the Braviz library will be givenin Section 4.4.

Frontiers in Neuroinformatics | www.frontiersin.org 3 August 2016 | Volume 10 | Article 36

Page 4: A Multi-facetted Visual Analytics Tool for Exploratory ... · benefit from visual analytics techniques. Indeed, brain function-related datasets are a combination of spatial (brain

Angulo et al. BRAVIZ: Visual Analysis of Brain Data

4.1. Design and Development MethodologyA “user centered” approach (Wassink et al., 2009) was takento design and implement BRAVIZ. The authors worked closelywith brain researchers of several specialties, visited several labsand hospitals, and learned as much as possible about researchworkflows and the bottlenecks they contain. Prototypes wereimplemented in several iterations and shared with differentdomain experts, whose feedback motivated the design of the nextgeneration. The initial stages of design focused on identifyingthe obstacles that affect exploratory analysis and communicationbetween experts, and examined ways of mitigating them. Theteam of experts was composed of radiologists, psychiatrists,physicians, neurophysiologists, physical therapists, pediatricians,statisticians, engineers, and economists.

From analyzing the visualization options in currentneuroimaging tools, as well as how domain experts usedthem, the BRAVIZ team learned what was expected from imageviewers, i.e., which features were important to implement andwhich were seldom used. SPM (Friston et al., 2006), Osirix(Rosset et al., 2004), and 3D-Slicer (Fedorov et al., 2012) werethe reference at this stage. For example, researchers neededto visualize several types of data in the same space and tobe able to compare brain images between two subjects. Alsoobvious was the need to integrate spatial visualizations withnon-spatial data for the same subject in order to be able tounderstand relationships. Another common issue discovered inthis exchange with expert users was that navigation from onesubject to another typically requires re-starting the visualizationapplication which is cumbersome.

In addition, there was the need for performing statisticalanalysis in real time, working with different groups of subjectsin the same space, creating and importing new data on the fly,performing group analyzes, identifying outliers, and detectingdata quality issues.

4.2. Applications SetThe current set of BRAVIZ applications can be divided in threecategories. One set of applications measures or creates newdescriptors from geometric data, one displays geometric datausing other variables as context, and the third explores numericaland categorical data. This supports different stages of theexploratory analysis process, respectively, data transformation,visualization, and subjects-to-group analyses. These applicationsare accessible from the main menu (Figure 2) by clicking onthe corresponding buttons. The bottom row provides access toutilities to manage samples, variables and scenarios, as well asimporting and exporting data.

4.2.1. Descriptors for Geometric DataThe Region of Interest (ROI) Builder application (Figure 3C)provides an interface for helping experts position a sphericalROI within the brain of each subject. These spheres are placedand sized with respect to images (from any modality) or cortexreconstructions. It is also possible to preview the fibers thatcross the ROI, and to evaluate mean value inside the sphere ofany scalar image, for example mean FA (fractional anisotropyassociated to the integrity of myelin covering axons) or meant-value associated with an fMRI test (representing to what degree

a region is involved in a given task). Linear and non-linearregistration maps can be used to approximate the position andsize of the sphere in other subjects. The expected workflowis positioning the ROI in one subject, extrapolating it to asample and making fine adjustments in position and size. Theinterface is optimized to support this (buttons, hotkeys, andvisualization). Several ROIs can be used to select specific ormore complex bundles. All data generated in the application(ROIs, bundles, and scalars) can be used on any other BRAVIZapplication.

In addition, the current BRAVIZ implementation includes anapplication for making linear measurements (lengths) of brainstructures, and an application (Logic Bundles) for defining fiberbundles by combining ROIs and segmented structures throughlogical operations. For example, a bundle may be defines as thefibers that go through structure A or ROI B but that don’t crossstructure C.

4.2.2. Geometric Data VisualizationThe Subject Overview application is shown in Figure 4. This toolprovides access to several kinds of spatial data in a unique 3Drenderer and eases the rapid navigation of the data from onesubject to another. Geometrical features such as MRI volumesor mean fractional anisotropy of DWI in relation to a chosenstructure can be captured directly from this application andadded to the database as a new variable. The application alsoshows the values per subject of selected variables as well as users’annotations. This ensures an integrated overview of each case andfollow-up between different users depending on the data availableto BRAVIZ.

A small multiples display (Tufte and Graves-Morris, 1983)with views of several subjects is useful for finding trends acrossa sample or for quality control to detect contaminated images.Subjects are ordered from left to right according to a chosennumerical variable and each row corresponds to a nominalvariable. In the example presented in Figure 5 the three rows arechildren born at term, preterm in incubators (preterm controls)and preterms with Kangaroo Mother Care intervention (see casestudies in Section 5) and subjects are sorted from left to rightaccording to the score on an IQ test. The bar-plot at the rightshows the distribution of this test amongst the three groups.This plot is linked with the 3D views, thus clicking on a bar willbring the corresponding subject into view. If more details of anysubject are required, the user may right click on that subjectsimage to load the corresponding subject on other BRAVIZapplications.

The fMRI Explorer application from Figure 3D displaysraw BOLD signals and contrast designs associated withfMRI experiments. It can also be used to compare signalsat different locations or from different subjects. The CheckRegistration application (Figure 3A) allows the user to visualizesimultaneously two images, from different modalities or differentsubjects and in a given coordinates system to assess the quality ofregistration.

4.2.3. Numerical and Categorical Data ExplorationThe ANOVA application (see Figure 6) provides access tostatistical models implemented in R (Team, 2012). In order to

Frontiers in Neuroinformatics | www.frontiersin.org 4 August 2016 | Volume 10 | Article 36

Page 5: A Multi-facetted Visual Analytics Tool for Exploratory ... · benefit from visual analytics techniques. Indeed, brain function-related datasets are a combination of spatial (brain

Angulo et al. BRAVIZ: Visual Analysis of Brain Data

FIGURE 2 | BRAVIZ main menu, each button provides access to an application.

Frontiers in Neuroinformatics | www.frontiersin.org 5 August 2016 | Volume 10 | Article 36

Page 6: A Multi-facetted Visual Analytics Tool for Exploratory ... · benefit from visual analytics techniques. Indeed, brain function-related datasets are a combination of spatial (brain

Angulo et al. BRAVIZ: Visual Analysis of Brain Data

FIGURE 3 | Examples of BRAVIZ applications. (A) Check Registration, (B) Correlations, (C) ROI Builder, (D) FMRI Explorer.

fit a model the user has to use the left side panel to selectthe outcome variable, the regressors and interaction terms andthe sample. Variables are selected from the database, whichadditionally stores the type of each variable, its description andin the case of nominal variables the labels for each level. Thevariable selection dialog shows this meta-data for each variableand allows users to modify it. It also displays an overview plotof the variable which allows the user to infer the distributionof values. This plot can be configured to show the relationshipbetween two variables (outcome vs. regressor) so that the usermay do a preliminary visual assessment of relations betweenvariables. Finally, the dialog provides mechanisms to search thedatabase.When all parameters are set the user clicks theCalculateANOVA button, the system will fetch the variable values for the

selected sample from the database and fit the model using theCAR package of R.

After fitting the model, the main plot will show diagnostics(distribution of residuals and a scatter plot of residuals vs.fitted values) for the validation of the ANOVA hypotheses(noise normally distributed and with constant variance) andthe table at the bottom right shows the resulting statistics.The application can also show box-plots and scatter-plots thatprovide additional insights on the relation between regressors(nominal and numeric variables and interaction terms) andoutcomes. Individual points in these plots can be identifiedby positioning the mouse over them. Additionally, by rightclicking the mouse, the associated individual subject data canbe loaded into other BRAVIZ applications to get supplemental

Frontiers in Neuroinformatics | www.frontiersin.org 6 August 2016 | Volume 10 | Article 36

Page 7: A Multi-facetted Visual Analytics Tool for Exploratory ... · benefit from visual analytics techniques. Indeed, brain function-related datasets are a combination of spatial (brain

Angulo et al. BRAVIZ: Visual Analysis of Brain Data

FIGURE 4 | Main interface of the subject overview application: at the bottom of the render some variables about the subject provide context.

FIGURE 5 | Quality control on the corpus callosum fibers from the whole sample.

Frontiers in Neuroinformatics | www.frontiersin.org 7 August 2016 | Volume 10 | Article 36

Page 8: A Multi-facetted Visual Analytics Tool for Exploratory ... · benefit from visual analytics techniques. Indeed, brain function-related datasets are a combination of spatial (brain

Angulo et al. BRAVIZ: Visual Analysis of Brain Data

FIGURE 6 | An application for performing ANOVA analyses using the data in the BRAVIZ database.

information. This can be especially useful for instance to interpretoutliers.

Figure 7 shows an alternative analysis application (LinearModel) which fits standard linear models and shows theeffect of each regressor (numerical variables, dummy variablesrepresenting levels of nominal variables, and interaction terms).The application provides a similar interface and functions inthe same way as the ANOVA application. The example fromthe figure shows a linear regression between the average lengthof fibers going through the mid-anterior section of the corpuscallosum and the latency in Transcraneal Magnetic Stimulationtest (see case studies in Section 5).

The Correlations application shown in Figure 3B displays alist of variables, a correlation matrix of selected variables, andscatter plots of selected correlations. Points in the plot can bequeried and right clicked. Additionally, they can be temporarilyeliminated from the analysis to see the impact they have on thecorrelation that is currently tested.

The Parallel Coordinates display from Figure 8 providesanother way of analyzing relations involving several variables

. This functionality is exemplified in the second case studypresented thereafter (see Section 5). Each vertical axis representsa variable, and each line is a subject. Users can interactively applyfilters to each axis, in order to see how changes in one variableaffect other values, and to understand relations involving severalvariables. In the figure, gray lines represent subjects excluded bythe filters while color lines are those subjects thatmatch the filters.

4.3. Global FeaturesIn addition to the unique features of each BRAVIZ application,the integrated platform provides features and combinations oftools (designed to be used together) in order to specifically favorthe complex process of exploratory analyzes.

4.3.1. Linking Clinical Outcomes and Brain DataSubjects in BRAVIZ are an integral part of the process, andtherefore all types of data per subject is always accessible. Forexample, the Subject Overview application displays a set ofclinical variables and annotations together with spatial data. Itfollows that users of BRAVIZ can make a context-dependent

Frontiers in Neuroinformatics | www.frontiersin.org 8 August 2016 | Volume 10 | Article 36

Page 9: A Multi-facetted Visual Analytics Tool for Exploratory ... · benefit from visual analytics techniques. Indeed, brain function-related datasets are a combination of spatial (brain

Angulo et al. BRAVIZ: Visual Analysis of Brain Data

FIGURE 7 | (A) Scatter plot and linear fit between the Ccma fibers and interhemispheric transfer timer, (B) same plot after removing four outliers, (C) corpus callosum

of the circled outlier of plot (A).

FIGURE 8 | Parallel coordinates display configured for searching possible PVL cases, lines that don’t match the filters on each axis are shown in gray.

reading of images and of other spatial data. Such informationwould only be displayed elsewhere by other tools.

Several scalar measurements can be derived from spatialdata within the tool itself. For example a group of segmentedstructures can be selected and their combined volume added asa new variable. These values can afterwards be used togetherwith clinical variables for statistical analysis, but a link is keptbetween the initial data and the environment in which it wasgenerated. Continuing with this example, if the researcher findsan outlier, she/he can right click on it and from the context menuopen the application in which the odd value was generated. Theapplication will load with the configuration it had at that time, butfocused on the subject of interest. In this way the extreme valuecan be analyzed to determine if it is caused by a particular artifactof the subject, such as a segmentation error, a problem with theimage, or a sign of an actual pathology.

4.3.2. Comparing SubjectsA common task in exploratory research is analyzing similaritiesand differences amongst a group of subjects. In most existingvisualization tools for spatial data, users have to select the subject

of interest at the onset of analysis, and then configure all thevisualization options and load the necessary files. BRAVIZ takesa different approach, and always allows the subjects of interestto change in the middle of the analysis while maintaining theconfiguration of the application. In this way it is easy to lookat different subjects from the same point of view, which makescomparing data very efficient.

4.3.3. Working with SamplesOften some properties of the data apply only to a specificgroup/sample of subjects. In BRAVIZ samples are a centralcomponent of every analysis. Samples can indeed be defined byusing filters on variable values, manually adding and removingspecific subjects, taking random subsets, or combining samplesthrough set operations (union, intersection, and difference).These samples can then be used for iterative analyses betweendifferent groups or can be modified (e.g., withdrawal of outiers),and the results of the modification visualized in real time. Thiscontrasts with hypothesis-driven research where the sample mustbe set at the onset of analyses.

Frontiers in Neuroinformatics | www.frontiersin.org 9 August 2016 | Volume 10 | Article 36

Page 10: A Multi-facetted Visual Analytics Tool for Exploratory ... · benefit from visual analytics techniques. Indeed, brain function-related datasets are a combination of spatial (brain

Angulo et al. BRAVIZ: Visual Analysis of Brain Data

4.3.4. Working with Incomplete DataIt is not uncommon for some values or geometric data to bemissing in some subjects. Applications that show geometric datadetect missing data, hide the corresponding object from thescene, and display warning in the status bar. Group analysis toolsdisplay the number of missing points, and ignore them whenperforming calculations. Also, when building samples, the usingfilters the interface provides a checkbox that allows the user toinclude or exclude subjects with a missing value for the specificvariable studied.

4.3.5. Supporting Long WorkflowsAnalyzing a complex dataset requires a significant amount oftime, and will likely be split in multiple sessions. BRAVIZapplications allow saving analyzes and restoring themusing custom names and descriptions, and attaching textualannotations to subjects, variables, and geometric objects (forexample ROIs), and thus favors re-use of previous explorations.A log of each analysis session is kept, which can be reviewed andannotated using a web interface. These features allow other usersto understand the meaning of variables, geometric structures andscenarios created by colleagues thereby favoring collaboration.

4.3.6. Integrating ToolsIn addition to the common database where all tools readand store variables and geometric objects, BRAVIZ includes areal-time communication mechanism. All applications can becoordinated to focus on the same subject (see Figure 9), towork with the same sample, or use the same set of variables.Applications may be running on different screens or evendifferent devices; allowing users to get different perspectives ofthe data at the same time. Of note, BRAVIZ applications have theoption to block specific parameters to keep the application fromchanging if the user prefers it. Also, for some actions that arelikely to produce important changes, BRAVIZ asks the user forconfirmation, and provide the option to always accept or alwaysignore the specific change.

Multiple users can work with the same data base thus canshare data and thoughts. In this way one user may create anew measurement and make it available to other users. Real-time communication is implemented through TCP, therefore itwould be possible to link together multiple workstations, but theresearch team has yet to explore whether this would provide aconvenient user experience.

4.4. The BRAVIZ LibraryThe Braviz library (shaded region of Figure 1) is divided inthree modules that provide several common features to ease thedevelopment of applications. In addition, it abstracts access todata so that applications can be ported to different datasets. Ofnote, BRAVIZ library can also be used in python applications andscripts, including interactive work in a terminal or an iPythonnotebook.

4.4.1. Read and FilterThe Read and Filter module is in charge of reading theconfiguration file and instantiating the appropriate project reader

class. In addition, it holds several utility functions for convertingimage data between formats, applying affine transformations,filtering tractography, and deriving scalar measures from scalardata. This module also abstracts reading and writing data fromthe BRAVIZ database. Currently this database is implementedin SQLite (Hipp et al., 2015) and it holds all non-spatial data,samples, scenarios, and other objects created by users.

4.4.2. VisualizationThe Visualization module contains functions and widgets thatcan be used to create spatial and non-spatial visualizations.Spatial visualizations are implemented with VTK (Schroederet al., 1996) , but several classes are available to streamlinedevelopment. For example, managers are available fortractography, fMRI, and segmentation data. These are high-level classes that connect to the Read and Filter module,load the appropriate data, and manage the VTK visualizationpipeline ending in a specified renderer. The user only needsto provide the current subject, current coordinate system, andvisualization parameters. Of note, these classes expose methodsas change_subject and change_coordinates which convenientlyallows switching to a different subject or a different coordinatesystem.

Non-spatial visualizations use Matplotlib (Hunter, 2007) andSeaborn (Waskom et al., 2014), or D3 (Bostock et al., 2011) incase of web based applications. Thismodule provides functions tocreate common visualizations by providing an input data-frameand other configuration parameters. All BRAVIZ visualizationsexhibit consistent behavior, i.e., they allow the user to query thesubject id of a data point by hovering over it, create a contextmenu by right clicking on a point, and allow for highlightingpoints on the display.

4.4.3. InteractionThe interaction module includes common QT widgets aswell as utilities to create web based applications and tohandle communications between applications. It also handlesconnection to external tools. For example, it contains functionsthat use the R system to perform statistical analysis.

4.5. Importing Data into BRAVIZAs mentioned above, making BRAVIZ generalizable to supportdifferent data-sets and data-formats was a primary designgoal. In order to use BRAVIZ on an existing data-set, spatialdata should be pre-processed using standard neuroimage tools.Instead of copying spatial data, BRAVIZ access it using specificProjectReader classes. Finally several options are available toimport non-spatial data.

4.5.1. Pre-processingWhile BRAVIZ integrates some basic image processingalgorithms, the focus remains on interactive visualization. In thisend, data should be pre-processed using third party tools beforestarting a BRAVIZ project. As a minimum, BRAVIZ expectsregistration matrices linking the different coordinate systemspresent in images. Running the FreeSurfer recon_all pipeline

Frontiers in Neuroinformatics | www.frontiersin.org 10 August 2016 | Volume 10 | Article 36

Page 11: A Multi-facetted Visual Analytics Tool for Exploratory ... · benefit from visual analytics techniques. Indeed, brain function-related datasets are a combination of spatial (brain

Angulo et al. BRAVIZ: Visual Analysis of Brain Data

FIGURE 9 | An example of BRAVIZ running on a large display in a collaborative setting.

will provide BRAVIZ segmentation, cortical parcellation andTalairach registration, which are basic for most visualizations.

In addition, BRAVIZ can read SPM statistical maps and warpfields as well as tractography output (currently in VTK format),Tracula bundles and any type of scalar map, for example thosederived from the tensor model.

Pipeline systems such as LONI pipeline (Dinov et al., 2009),NiPype (Gorgolewski et al., 2011), or the CCS (Xu et al., 2015) areideally suited to automate these necessary pre-processing steps.

4.5.2. Reading Spatial DataBRAVIZ accesses geometric data through the projectreader interface (see Figure 1), which can have differentimplementations for different projects. All the upper layers,including applications, will access data through ProjectReaderobjects by specifying the type of data, the subject id, thecoordinate space, and additional parameters appropriate foreach data type (e.g., scalars for tractography or contrast forfMRI maps). This object is also responsible of returning indicesof available data. Internally the class must be able to load theappropriate data, apply the correct transformations, and finallyreturn the requested data as a python object.

For example, in the case study presented below, all project’sdata was stored on hard disk, using NIFTI format for images,

and VTK format for tractography and segmented structures. Fileswere stored in a layout such that the full path for each file canbe determined based on subject id. Pre-calculated transformationmatrices and warp fields were located and internally applied onload.

New reader classes can be implemented to read data fromdifferent sources and different formats (e.g., reading DICOMobjects from a PACS). The nibabel and vtk python librariesprovide functions to load data stored in several formats, makingthe implementation of these classes feasible in a short amount oftime.

It is noteworthy that setting up BRAVIZ to work with a newdataset requires expertise from engineers. Recall that BRAVIZis targeted at interdisciplinary teams and one of the goals isallowing the members of these teams to work together efficiently.Additional details of how BRAVIZ handles spatial and non-spatial data are described in the project’s documentation.

4.5.3. Importing Non-spatial DataBy means of integrated tools clinical data can be imported intothe system from excel tables or SPSS files using integrated tools.This data is copied into a database, and can be accessed from allapplications. The python terminal or iPython notebooks (Prezand Granger, 2007) can also be used to interactively read data

Frontiers in Neuroinformatics | www.frontiersin.org 11 August 2016 | Volume 10 | Article 36

Page 12: A Multi-facetted Visual Analytics Tool for Exploratory ... · benefit from visual analytics techniques. Indeed, brain function-related datasets are a combination of spatial (brain

Angulo et al. BRAVIZ: Visual Analysis of Brain Data

from other sources (for example scraping web sites or by parsingDICOM headers), transform it into a pandas dataframe, and saveit into the BRAVIZ database using a high level API.

Of note, additional data can be added at any time and non-spatial data can also be exported to an spreadsheet using agraphical interfaces. It follows that the user is able to take somevariables of interest out of BRAVIZ, perform calculation on anexternal tool, and import the results back into BRAVIZ.

5. CASE STUDIES

The functionality provided by BRAVIZ can be used both foruntargeted, intuitive and adaptive discovery of relationships inorder to improve understanding of data; as well as for structureddissection of data. In Section 5.1 an example of adaptive dataexploration is presented. Section 5.2 illustrates pre-structured,targeted exploratory analysis to identify potential candidates fora sub-analysis of leukomalacia in a cohort of adolescents followedsince birth.

In these case studies, MRI, fMRI and diffusion data wereacquired, anonymized, and stored in DICOM format, then pre-processed using dcm2nii (Rorden and Brett, 2000), fsl (Jenkinsonet al., 2012), freesurfer (Fischl, 2012), camino (Cook et al., 2006),and spm (Friston et al., 2006). A custom BRAVIZ project readerwas implemented to read the output of this process. Clinical datawas primarily consolidated in an SPSS database then importedin BRAVIZ. These tasks were performed by the engineeringmembers of the group. Afterwards, researchers were free toexplore and analyze data with BRAVIZ on their own computers.The first case study was carried out by a neurophysiologist,while the second one was completed by a pediatrician and aneuro-radiologist.

5.1. Adaptive Exploration of Anatomicaland Functional Data in PrematurityThe collection of the data used in the case studies was approvedby the ethical committee from the school of medicine ofUniversidad Javeriana (Bogotá, Colombia) as well as the ethicalcommittee of Fundación Santafé de Bogotá (Bogotá, Colombia),all participants and their parents provided written informedconsent. This section describes an adaptive inference process inwhich insights inspire subsequent exploratory steps, and, morespecifically, highlights the flexibility that BRAVIZ affords.

5.1.1. Rationale and Initial HypothesesData comes from the seminal demonstration (Schneider et al.,2012) of the influence of a very premature birth (<33 weeksof gestational age) on brain function in a sample of 15-yearold adolescents who were compared to their term peers. Thisstudy also showed the positive impact of an early care protocol(Kangaroo Mother Care) on brain functions, but this is beyondthe present topic. The authors used the noninvasive and painlessTMS of the primary motor cortex (referred to as the motorbrain) and showed that the control of hand by brain wassuboptimal in the preterm group. Especially, the excitability ofthe motor brain was lower and the time required for the transferof nervous signal from one cerebral hemisphere to the other

(interhemispheric transfer) was longer. These abnormalities wererelated to the deleterious after-effects of a premature birth thatcould still be detected 15 years later in brain (Schneider et al.,2012). Indeed, the interruption of the in utero maturation of thecorpus callosum, a structure comprised of large-diametered fast-conducting myelinated fibers and connecting the two cerebralhemispheres, interfered with the normal installation of circuitsin each hemisphere and of the functional lateralization betweenhemispheres, thus altering the sensorimotor and cognitive brainfunction (Schneider et al., 2008; Flamand et al., 2012; Schneideret al., 2012).

However, the study had not yet established any link betweenMRI-DTImeasurements of the corpus callosum thinning and thelower clinical performances in the preterm group as comparedto the peers term. Two reasons can explain this limitation:the difficulty to statistically analyze, visualize, explore, andunderstand the link between outcomes of different nature(severely time-consuming with the use of different softwares anddata transformations likely providing biases or mistakes); therequired involvement of experts dedicated to the interpretationof each outcome.

The working hypothesis was that BRAVIZs user-centeredapproach and applications/modules with real-time statisticalanalyses (color coding for instantaneous detection of acorrelation for example), and access to multifacettedneuroscience and clinical data (spatial and non-spatial interactivedata visualizations with common operations available betweenapplications) should favor multi-tasking along with adaptationto user’s expertise leading to easy detection of outliers andemergence of unsuspected relationships between target variables.

5.1.2. Contribution of BRAVIZ to Scientific Discovery

(Ancillary Hypotheses)All data had been already imported in BRAVIZ by theengineer before onset of work and this substantially fastenedthe access to spatial and non-spatial data. BRAVIZ was used toexplore the potential links between clinical outcomes and lowerexcitability of the motor brain, interhemispheric dysfunction,and volume/numbers of fibers of the corpus callosum. Theapplications were used following the non-linear procedure thatBRAVIZ uniquely supports. ANOVA results with scatter plots,standard deviations and superimposition of individual valueshelped immediately detect four outliers in the 15-year pretermadolescents sample. This should have taken far more time withother applications not processed by BRAVIZ in real time and inparallel. The Correlation Viewer allowed to remove these outlierswith a simple click and to see instantaneously that they affectedANOVA results, which would not have been possible to detect ifall applications had not be launched in the sameworkspace. Thus,the combination of ANOVA, Correlation Viewer, and ParallelCoordinates (MRI for structure volume, DTI for fibers) showedthat, without the outliers (studied below), the medial-anteriorpart of the corpus callosum (maCC connecting the motor areasbetween hemispheres) was smaller in volume in the pretermgroup (17, 228 ± 5100 mm3) than in the term (22, 352 ± 3900mm3; p = 0.02) and exhibited fewer callosal fibers (1806 ±

573) than in the term (2600 ± 570; p = 0.008). Despite this

Frontiers in Neuroinformatics | www.frontiersin.org 12 August 2016 | Volume 10 | Article 36

Page 13: A Multi-facetted Visual Analytics Tool for Exploratory ... · benefit from visual analytics techniques. Indeed, brain function-related datasets are a combination of spatial (brain

Angulo et al. BRAVIZ: Visual Analysis of Brain Data

thinning of corpus callosum, no link was detected with theinterhemispheric time of transfer between motor areas that wasshown to be significantly longer in the preterm group (Schneideret al., 2012). However, the correlation matrix and scatter plotsbrought to attention that, by excluding the four outliers, astrongly red-colored (significant) link appeared between theincreasing interhemispheric transfer time and the longer lengthof the maCC fibers: Figure 7B shows this strong correlation forthe preterm group (orange circles) when the outliers are removed(as a contrast to Figure 7A), the longer the maCC fibers and thelonger the interhemispheric transfer time. This correlation wasabsent in the term group (green circles).

The real time access to clinical characteristics by right clickingthe dots in BRAVIZs scatter plots eased to understand thatoutliers in Figure 7B were participants with either cerebral palsyor Xfragile pathology. The parallel access DTI data and scatterplots in the same workspace favored the immediate detectionthat the circled outlier in Figure 7A was a participant with adramatic thinning of the maCC (Figure 7C). This explains that,as compared to other participants with shorter length of maCCfibers, the interhemispheric transfer time was longer (holes incorpus callosum likely leading to temporal dispersion of currentbetween hemispheres). We already knew that the preterm grouphad longer interhemispheric transfer time (Schneider et al., 2012)but BRAVIZ guided the analysis to rapidly visualize and testthat this was related to the length of fibers in the pretermgroup, even if length was not different than in the term group.This contributes to raise the very new hypothesis that theunderlying problem in our preterm sample may not not bethe structural defect (excluding the outliers) but more likelythe synaptic arrangement of connections between hemispheres,those with longer interhemispheric transfer time presenting withless efficient transcallosal networks.

Precisely, not all participants in the preterm group presentedwith lengthened interhemipsheric transfer time but five ofthem were over 20ms when 8 and 14ms are usually expectedfor men and women, respectively (Schneider et al., 2012).BRAVIZs parallel coordinates of the different variables wasuseful once again to efficiently identify that four out of thesefive participants over 20ms had a transient score at the infantneurological international battery (Infanib) at 6 months of age.This intuitively led to launch BRAVIZs Correlation Viewer andto highlight a strong link between Infanib at 6 months ofage and our neurophysiological measures known as differentin the preterm group (Schneider et al., 2012) such as thelower excitability of the motor brain and higher occurrenceof ipsilateral muscle responses to TMS of the motor brain(which is abnormal given the typical crossed organization ofmotor systems for hand function). In that vein, BRAVIZsapplications contributed to highlight that the four participantsabove 20ms thus who significantly drove the correlation inFigure 7A had a dysfunctional organization of the motor brain.One another ancillary hypothesis could be that Infanib testingat 6 months of age may be predictive of long-term impairmentof the motor brain function, including a poor efficiency of theinterhemispheric connections (time over 20ms at 15 years of age,see Figure 7A).

Merging the neurophysiological expertise with the potentialof BRAVIZ for data-driven analysis led to one last interestingnew finding in this case study. This concerns the visuomotorcontrol of movement which involved the corpus callosumfunction (Schneider et al., 2008) and which is assessed by thestandardized score of the visuomotor integration test (VMIStS). This score was found to be significantly lower in thepreterm group (95.5 ± 7) than in the term (108.5 ± 3.5;p = 0.0001) at 15 years of age. The Correlation Viewer(color codes highlighting the significant links and excludingthe outliers determined above) used together with the parallelcoordinates application (see all studied variables in real timein the same workspace) detected that the lower (lesser) VMIStS scores of the preterm group could be explained by a lowerexcitability of the motor brain (higher motor threshold, i.e.,higher intensity of TMS to generate a response in the muscle,RMTd in Figure 10, the excluded outliers are represented byempty circles) and higher occurence of abnormal ipsilateralresponses (Ipsifreqnd in Figure 11). By extension, BRAVIZafforded applications and context to detect that the impairementof visuomotor control of movement in our sample of 15-year adolescents born prematurely was related to dysfunctionof the motor brain and pathways, in terms of maladaptivesynaptic organization (rather than anatomical issues), andthat could have been predicted by the transient Infanib at6 months of age. Overall, these new hypotheses driven byBRAVIZ in our sample of preterm adolescent may impact earlyrehabilitation when synaptic issues are more sensitive thananatomical to training-induced mechanisms of adaptation andmotor learning.

5.1.3. ConclusionsBRAVIZ adapted and intuitively guided the exploratoryanalysis of these multi-kind and multifacetted outcomes. Withdifferent experts gathered (i.e., neonatologist, neurophysiologist,psychologist, physical therapist) this new tool helped unmaskimportant relationships between variables of brain functionand uncover new hypotheses in prematurity that go beyondprevious works in preterm children (Schneider et al., 2008, 2012;Flamand et al., 2012). BRAVIZ-related analysis of experimental

FIGURE 10 | VMI vs. M1 excitability excluding outliers (empty circles).

Frontiers in Neuroinformatics | www.frontiersin.org 13 August 2016 | Volume 10 | Article 36

Page 14: A Multi-facetted Visual Analytics Tool for Exploratory ... · benefit from visual analytics techniques. Indeed, brain function-related datasets are a combination of spatial (brain

Angulo et al. BRAVIZ: Visual Analysis of Brain Data

FIGURE 11 | VMI vs. ipsilateral corticospinal responses excluding

outliers (empty circles).

data in prematurity pointed out for the first time that thelong-term impairments (clinical data) in prematurity maynot be the smaller volume of some structures but the lesserefficacy of synapses between neurons and of networks betweenstructures (such as the corpus callosum between hemispheres).On that vein, a specific focus, in terms of quality and intensityof intervention for synaptic efficiency, should be made onthose individuals presenting characteristics that BRAVIZ uniqueexploratory analysis has identified at risk for the integrity of brainfunction.

5.2. Pre-structured, Targeted Search, andCharacterization of Leukomalacia PatientsPeriventricular Leukomalacia (PVL) is characterized by a lesionin the periventricular germinal matrix which results in a loss ofwhite matter ventricular dilation. Preterm babies, especially thoseborn prior to 33 weeks of gestation (due to non-invaginationyet of highly vascularized germinal matrix) present with asignificant risk of suffering from this condition and from its long-term physiological consequences, including motor and visionimpairments, cerebral palsy and epilepsy.

Of note, the imaging protocol of the study did not includeFLAIR images which are most appropriate for PVL detection.BRAVIZ and its applications thus offered the alternative toidentify in the database for subjects whomight possibly suffer thisPVL condition. Such exploration targeted a better understandingof the samples health issues rather than directly testing ascientific hypothesis. However, incidental results may result fromBRAVIZs data-driven analysis.

A parallel coordinates display was used to visualize theventricles volume and the total white matter (myelinated fiberscomprising brain networks). Figure 8 shows for example thatthe dataset included motor and visual testing, intelligenceassessment, a classical neurological evaluation to discriminatebetween normal and abnormal neurological status and some riskfactors, such as weight and gestational age (Ballard) at birth. Datacould be filtered by axis dragging in order to keep only subjectswith high ventricular volume and low white matter volume(shown in color at the figure, while excluded are shown in gray),

i.e., with characteristics related to PVL condition. Lines (oneline per participant) with decreasing slope from first segment(ventricules volume) to second segment (white matter) mayreflect ventricular dilation (first segment) as a result of the lossof white matter, in relation to neurological status (abnormal inred and normal in blue).

The line highlighted in Figure 8 showed ventricules dilatation,low white matter volume, low scores for motor and visualcomponents of visuomotor integration testing (VMI) and lowbirthweight (pesnacer). The context menu that appeared by rightclicking on the line enabled to switch to the individual’s brainfacts, images and statistics in other BRAVIZ applications andto get additional details for better understanding the wholeclinical profile of such an abnormal condition at 15 years ofage. For example, the Subject Overview application denoted thatthe participant was born by C-Section in emergency after atwin pregnancy; the mother had HELPP syndrome (malignanthypertension) and died after giving birth; the abnormal statusduring the first year of life transformed in a diagnosis of cerebralpalsy at the end of the first year (spastic diplegia and right armhemiparesis). Altogether, the clinical profile and the symptomstogether with the T1 image of ventricles dilatation (Figure 12)could be associated with PVL.

The 15-year subject data associated with the top line inFigure 8 (highest ventricles volume) showed in Subject Overviewa very premature birth at 31 weeks of gestational age and givenby C-section in emergency because of profuse bleeding (placentapreviae), signs of leucoencephalopathy and an increased volumeof the left lateral ventricle in CT scan, epilepsy not yetcontrolled, severe bilateral hearing loss, and abnormal bilateralvision, but a normal psychomotor development. BRAVIZscoordinated applications looking at indicator variables then atcandidates’ detailed profiles detected two cases with possibleexistence of PVL out of our sample of participants, althoughthe original data had not been collected with the view ofevaluating such white matter disorders. This witnessed theusefulness of the Subject Overview application to providesufficient information to better understand each case and itslife trajectory (variables, annotations) with no need to gatheradditional files.

6. DISCUSSION

The cases studies illustrate how BRAVIZ can be used to improveunderstanding of a real data-set. A direct access to data makesresearchers more efficient in the exploratory analysis and therecovery of data relevant to their questions. By combining severaltools, users can perform group level analyses without loosingtrack of individual subjects, while getting additional details ona participant via a simple click. Sharing samples, variables,visualizations, and subjects of interest also provides an efficientchannel for experts to communicate and share ideas, thoughts,advice, and opinions with each other.

Direct access to data helps save time and more interestingly,enables researchers to efficiently pursue and address in real timeall questions coming to mind. If testing an idea, hypothesisof relationships or finding an answer requires a cumbersome

Frontiers in Neuroinformatics | www.frontiersin.org 14 August 2016 | Volume 10 | Article 36

Page 15: A Multi-facetted Visual Analytics Tool for Exploratory ... · benefit from visual analytics techniques. Indeed, brain function-related datasets are a combination of spatial (brain

Angulo et al. BRAVIZ: Visual Analysis of Brain Data

FIGURE 12 | Details of a subject with possible PVL, the left panel shows values for several variables as well as the full clinical history.

procedure and gathering data from several sources, it will likelynot be explored. This exploration would take non intuitivemanipulation of different softwares and far more time withelevated risks of biases and mistakes in data transformation, etc.With BRAVIZ, the number of questions that can be addressedare increasing and so are the chances of finding incidental butinteresting results.

Combining clinical and spatial data with the unstructuredclinical history allows researchers to reconstruct a full picture ofeach subject. This is valuable for making accurate interpretationsof results, for understanding deeper and more accuratelythe dataset and its outliers. Participants in BRAVIZ arealways linked to all their associated data, therefore theyare always analyzed with respect to their whole portrait.Subjects with specific conditions can be analyzed in isolation,or attempts can be made to generalize the properties ofa peculiar subject. Back and forth analysis of the data aresupported by BRAVIZ with iterations to isolate a case or togeneralize to the group .In fact, samples of participants ordata in BRAVIZ are recognized as an essential componentof the analysis. They can be created in several ways, savedinto the database for future use or used instantaneously instatistical models or group data visualizations. Researchers canthus iterate through several samples, several models, several

measurements, and several views, thus gaining understandingof the data-set and raising questions and hypotheses for futureprojects.

A log of each analysis session is automatically kept and madeavailable through a web interface with all actions performed withthe different applications used. Researchers can enhance this logby labeling the most important steps they made during a sessionand providing annotations. The status of an application can bereloaded at any point with a simple click of a button so thatresearchers are able to revisit visualization and explore differentanalysis paths, thus making possible to build on top of what wasfound with previous sessions.

In contrast to current tools, BRAVIZ enables the visualizationof data from different participants and the between-participantcomparisons with most associated technical issues hidden to theuser. The traditional separation of workspace, with one windowfor medical image viewer, and a second for a spreadsheet ofclinical data, the researchers manually (and repeatedly) searchfor correlations between both kinds of data. BRAVIZ fixed thisissue of storage of individual files tomanage in different softwares(access, format, and transform data are automated and hiddenactions) and harnesses and proceeds data in the same workspace.It follows that users focus on the task at hand, and not ontechnical details.

Frontiers in Neuroinformatics | www.frontiersin.org 15 August 2016 | Volume 10 | Article 36

Page 16: A Multi-facetted Visual Analytics Tool for Exploratory ... · benefit from visual analytics techniques. Indeed, brain function-related datasets are a combination of spatial (brain

Angulo et al. BRAVIZ: Visual Analysis of Brain Data

7. CONCLUSIONS AND FUTURE WORK

BRAVIZ is a tool that facilitates interactive data visualizationand exploratory analysis of datasets that combine clinical andneuro-image data. It is implemented as a set of applications thatcan be used at different points of exploratory analysis, possibly bydifferent members of an interdisciplinary research team. Someapplications are designed to extract scalar measurements fromimage data, others provide detailed views of particular subjects ordata types and others perform group analyses. Custom samples ofsubjects can be defined and used across all applications to focuson the analysis. When several applications are running together,all of them can keep focus on the same subject (Figure 9) whichsupports and guide the users in their understanding of how themulti-dimensional views of the same individual’s data interacttogether.

The architecture also streamlines implementation ofadditional applications by encapsulating all low level datamanipulation in a library. For example, BRAVIZ could easilybe extended to support group based analysis like second levelfMRI or VBM (Voxel Based Morphometry). In addition, severalresearchers have expressed interest in integrating other kinds ofdata, for example EEG signals or connectome graphs derivedfrom DWI or fMRI. BRAVIZ is distributed under a LGPLlicense and the source code is available through the project webpage http://diego0020.github.io/braviz. Additional testing withvarious datasets and by researchers of multiple backgrounds isrequired. Any assistance will be provided to anyone interested insetting up the environment and using it.

Braviz has been tested on a larger project from theKangaroo Foundation which includes about 450 participants,complex event related fMRI paradigms, larger numbers ofclinical variables, and more anatomical imaging modalities. Acommon concern is whether this approach will work withlarger datasets, with thousands or even millions of participants.Data visualization can surprise us, but generally does notscale well; while data modeling typically does scale but cannotsurprise us (Wickham and Grolemund, 2016). Therefore, onemust iterate between visualization and modeling. One approachis to start exploring with a smaller subsample, and whenan interesting trend is found, to try to see if generalizationworks for the full sample. Likewise, visualizations on a smallersubsample can be used to further understand and explore resultsobtained elsewhere. To this end, BRAVIZ provides the subsamplemechanism.

Currently there is a trend toward moving data storage andprocessing to dedicated servers and providing access to users viaweb browsers. This allows researchers to work from any place, or

even start a session in one place and finish somewhere else. Logs,variables, samples, and other analysis artifacts could be kept in acentralized location to ease data downloading from a single place.In addition, if multiple experts used the same back-end, sharingwould become trivial. The back-end could be implemented viaa set of dedicated servers, or it could run in a scalable cloud.Powerful data processing algorithms could also be provided toleverage high performance computing infrastructure. BRAVIZought to be accessible through a web interface in the future, giventhe potential generalization of its applicability.

The evergrowing interest in collecting data openly andmaking it available to the general community makes visualexploratory analysis tools like BRAVIZ become increasinglyimportant. BRAVIZ has the ability to rapidly iterate andimprove thus its future versions could contribute to increase thescientific productivity, optimally use the collected data, integratethe analysis between different variables and different users,and eventually speed-up knowledge transfer. The case studiessuggested that BRAVIZ could represent a clinically relevanttool to understand brain function but also in the future toassess and help in decision making for care and therapy overlifespan.

AUTHOR CONTRIBUTIONS

DA designed and implemented the BRAVIZ software, under thesupervision of JH and JO. All three drafted the technical portionsof the manuscript. CS and NC contribute with final user pointof view (pediatrician and neurophysiological) along the designand prototyping process, they conduct and wrote cases studies.All authors revised and contributed to the final version of thismanuscript.

FUNDING

This work was developed as part of a PhD funded bythe Departamento Administrativo de Ciencia, Tecnología eInnovación, Colombia (Colciencias), Grant number 528, nationalPhDs.

ACKNOWLEDGMENTS

This work was made possible by the Kangaroo Foundation,that generously provided a rich dataset, and whose experts werealways willing to test prototypes and provide valuable feedback.The project also received important support from Colciencias,Universidad de los Andes, Iowa State University and UniversitLaval-CHU of Qubec.

REFERENCES

Biswal, B. B., Mennes, M., Zuo, X.-N., Gohel, S., Kelly, C., Smith, S. M.,

et al. (2010). Toward discovery science of human brain function.

Proc. Natl. Acad. Sci. U.S.A. 107, 4734–4739. doi: 10.1073/pnas.09118

55107

Bostock, M., Ogievetsky, V., and Heer, J. (2011). D3 data-driven documents. IEEE

Trans. Vis. Comput. Graph. 17, 2301–2309. doi: 10.1109/TVCG.2011.185

Bowman, I., Joshi, S., and Van Horn, J. (2011). “Query-based coordinated multiple

views with feature similarity space for visual analysis of MRI repositories,”

in 2011 IEEE Conference on Visual Analytics Science and Technology (VAST)

(Providence, RI), 267–268.

Bowman, I., Joshi, S. H., Greer, V., and Darrell Van Horn, J. (2012a). “Feature-

similarity visualization of MRI cortical surface data,” in IEEE Conference

on Visual Analytics Science and Technology (VAST) (Seattle, WA: IEEE),

211–212.

Frontiers in Neuroinformatics | www.frontiersin.org 16 August 2016 | Volume 10 | Article 36

Page 17: A Multi-facetted Visual Analytics Tool for Exploratory ... · benefit from visual analytics techniques. Indeed, brain function-related datasets are a combination of spatial (brain

Angulo et al. BRAVIZ: Visual Analysis of Brain Data

Bowman, I., Joshi, S. H., and VanHorn, J. D. (2012b). Visual systems for interactive

exploration and mining of large-scale neuroimaging data archives. Front.

Neuroinform. 6:11. doi: 10.3389/fninf.2012.00011

Cointepas, Y., Mangin, J.-F., Garnero, L., Poline, J.-B., and Benali, H. (2001).

BrainVISA: software platform for visualization and analysis of multi-modality

brain data. Neuroimage 13:98. doi: 10.1016/S1053-8119(01)91441-7

Cook, D., and Swayne, D. F. (2007). Interactive and Dynamic Graphics for Data

Analysis: with R and GGobi. New York, NY: Springer.

Cook, K. A., and Thomas, J. J. (2005). Illuminating the Path: The Research and

Development Agenda for Visual Analytics. Technical Report, Pacific Northwest

National Laboratory (PNNL), Richland, WA.

Cook, P. A., Bai, Y., Nedjati-Gilani, S., Seunarine, K. K., Hall, M. G., Parker,

G. J., et al. (2006). “Camino: open-source diffusion-MRI reconstruction and

processing,” in 14th Scientific Meeting of the International Society for Magnetic

Resonance in Medicine, Vol. 2759 (Seattle, WA).

Dinov, I. D., VanHorn, J. D., Lozev, K.M., Magsipoc, R., Petrosyan, P., Liu, Z., et al.

(2009). Efficient, distributed and interactive neuroimaging data analysis using

the LONI pipeline. Front. Neuroinform. 3:22. doi: 10.3389/neuro.11.022.2009

Eckersley, P., Egan, G. F., De Schutter, E., Yiyuan, T., Novak, M., Sebesta, V., et al.

(2003). Neuroscience data and tool sharing. Neuroinformatics 1, 149–165. doi:

10.1007/s12021-003-0002-1

Fedorov, A., Beichel, R., Kalpathy-Cramer, J., Finet, J., Fillion-Robin, J.-C.,

Pujol, S., et al. (2012). 3d Slicer as an image computing platform for the

quantitative imaging network. Magnet. Reson. Imaging 30, 1323–1341. doi:

10.1016/j.mri.2012.05.001

Fischl, B. (2012). FreeSurfer. Neuroimage 62, 774–81. doi: 10.1016/j.neuroimage.

2012.01.021

Flamand, V. H., Nadeau, L., and Schneider, C. (2012). Brain motor excitability

and visuomotor coordination in 8-year-old children born very preterm. Clin.

Neurophysiol. 123, 1191–1199. doi: 10.1016/j.clinph.2011.09.017

Friston, K. J., Ashburner, J. T., Kiebel, S., Nichols, T., and Penny, W. D., (eds.)

(2006). Statistical Parametric Mapping: The Analysis of Functional Brain Images,

1st Edn. Cambridge, MA: Academic Press Inc.

Gorgolewski, K., Burns, C. D., Madison, C., Clark, D., Halchenko, Y. O.,

Waskom, M. L., et al. (2011). Nipype: a flexible, lightweight and extensible

neuroimaging data processing framework in Python. Front. Neuroinform. 5:13.

doi: 10.3389/fninf.2011.00013

Hanrahan, P. (2003). Tableau Software White Paper-Visual Thinking for Business

Intelligence. Seattle, WA: Tableau Software.

Hinterberg, M. A., Kao, D. P., Bristow, M. R., Hunter, L. E., Port, J. D., and

Görg, C. (2014). “PEAX: interactive visual analysis and exploration of complex

clinical phenotype and gene expression association,” in Pacific Symposium on

Biocomputing, Vol. 20 (Kohala, HI: World Scientific), 419–430.

Hipp, D. R., Kennedy, D., and Mistachkin, J. (2015). SQLite (Version 3.8.10.2)

[Computer software]. North Carolina: SQLite Development Team.

Hodge, M. R., Horton, W., Brown, T., Herrick, R., Olsen, T., Hileman, M. E., et al.

(2016). ConnectomeDB—sharing human brain connectivity data. Neuroimage

124, 1102–1107. doi: 10.1016/j.neuroimage.2015.04.046

Hunter, J. D. (2007). Matplotlib: a 2d graphics environment. Comput. Sci. Eng. 9,

90–95. doi: 10.1109/MCSE.2007.55

Jenkinson, M., Beckmann, C. F., Behrens, T. E. J., Woolrich, M. W.,

and Smith, S. M. (2012). FSL. Neuroimage 62, 782–790. doi:

10.1016/j.neuroimage.2011.09.015

Keim, D., Andrienko, G., Fekete, J.-D., Grg, C., Kohlhammer, J., and Melanon, G.

(2008). “Visual analytics: definition, process, and challenges,” in Information

Visualization, number 4950 in Lecture Notes in Computer Science, eds A.

Kerren, J. T. Stasko, J.-D. Fekete, and C. North (Heidelberg: Springer), 154–175.

Marcus, D. S., Harms, M. P., Snyder, A. Z., Jenkinson, M., Wilson, J. A.,

Glasser, M. F., et al. (2013). Human connectome project informatics: quality

control, database services, and data visualization.Neuroimage 80, 202–219. doi:

10.1016/j.neuroimage.2013.05.077

Milham, M. P. (2012). Open neuroscience solutions for the connectome-wide

association era. Neuron 73, 214–218. doi: 10.1016/j.neuron.2011.11.004

Prez, F., and Granger, B. E. (2007). IPython: a system for interactive scientific

computing. Comput. Sci. Eng. 9, 21–29. doi: 10.1109/MCSE.2007.53

Rorden, C., and Brett, M. (2000). Stereotaxic display of brain lesions. Behav.

Neurol. 12, 191–200. doi: 10.1155/2000/421719

Rosset, A., Spadola, L., and Ratib, O. (2004). OsiriX: an open-source software for

navigating in multidimensional DICOM images. J. Digit. Imaging 17, 205–216.

doi: 10.1007/s10278-004-1014-6

Rubinov, M., and Sporns, O. (2010). Complex network measures of brain

connectivity: uses and interpretations. Neuroimage 52, 1059–1069. doi:

10.1016/j.neuroimage.2009.10.003

Schneider, C., Charpak, N., Ruiz-Peláez, J. G., and Tessier, R. (2012). Cerebral

motor function in very premature-at-birth adolescents: a brain stimulation

exploration of kangaroo mother care effects. Acta Paediatr. 101, 1045–1053.

doi: 10.1111/j.1651-2227.2012.02770.x

Schneider, C., Nadeau, L., Bard, C., Lambert, J., Majnemer, A., Malouin, F.,

et al. (2008). Visuo-motor coordination in 8-year-old children born pre-term

before and after 28 weeks of gestation. Dev. Neurorehabil. 11, 215–224. doi:

10.1080/17518420801887547

Schroeder, W. J., Martin, K. M., and Lorensen, W. E. (1996). “The design and

implementation of an object-oriented toolkit for 3d graphics and visualization,”

in Proceedings of the 7th Conference on Visualization′96 (Los Alamitos, CA:

IEEE Computer Society Press), 93.

Team, R. C. (2012). R: A Language and Environment for Statistical Computing.

R Foundation for Statistical Computing. Vienna: R Foundation for Statistical

Computing.

Tufte, E. R., and Graves-Morris, P. R. (1983). The Visual Display of Quantitative

Information, Vol. 31. Cheshire, CT: Graphics Press.

Tukey, J. W. (1980). We need both exploratory and confirmatory. Am. Stat. 34,

23–25.

Van Horn, J. D., and Toga, A. W. (2009). Is it time to re-prioritize

neuroimaging databases and digital repositories? Neuroimage 47, 1720–1734.

doi: 10.1016/j.neuroimage.2009.03.086

Waskom, M., Botvinnik, O., Hobson, P., Cole, J. B., Halchenko, Y., Hoyer, S., et al.

(2014). seaborn: v0.5.0 (November 2014). doi: 10.5281/zenodo.12710

Wassink, I., Kulyk, O., van Dijk, B., van der Veer, G., and van der Vet, P. (2009).

“Applying a user-centered approach to interactive visualisation design,” in

Trends in Interactive Visualization, eds R. Liere, T. Adriaansen, and E. Zudilova-

Seinstra (London, UK: Springer), 175–199.

Wickham, H., and Grolemund, G. (2016). R for Data Science. Sebastopol, CA:

O’Reilly Media.

Wood, D., King, M., Landis, D., Courtney, W., Wang, R., Kelly, R., et al.

(2014). Harnessing modern web application technology to create intuitive and

efficient data visualization and sharing tools. Front. Neuroinform. 8:71. doi:

10.3389/fninf.2014.00071

Xu, T., Yang, Z., Jiang, L., Xing, X.-X., and Zuo, X.-N. (2015). A connectome

computation system for discovery science of brain. Sci. Bull. 60, 86–95. doi:

10.1007/s11434-014-0698-3

Yushkevich, P. A., Piven, J., Hazlett, H. C., Smith, R. G., Ho, S., Gee, J. C., et al.

(2006). User-guided 3d active contour segmentation of anatomical structures:

significantly improved efficiency and reliability. Neuroimage 31, 1116–1128.

doi: 10.1016/j.neuroimage.2006.01.015

Zuo, X.-N., and Xing, X.-X. (2014). Test-retest reliabilities of resting-state

FMRI measurements in human brain functional connectomics: a systems

neuroscience perspective. Neurosci. Biobehav. Rev. 45, 100–118. doi:

10.1016/j.neubiorev.2014.05.009

Conflict of Interest Statement: The authors declare that the research was

conducted in the absence of any commercial or financial relationships that could

be construed as a potential conflict of interest.

Copyright © 2016 Angulo, Schneider, Oliver, Charpak and Hernandez. This is an

open-access article distributed under the terms of the Creative Commons Attribution

License (CC BY). The use, distribution or reproduction in other forums is permitted,

provided the original author(s) or licensor are credited and that the original

publication in this journal is cited, in accordance with accepted academic practice.

No use, distribution or reproduction is permitted which does not comply with these

terms.

Frontiers in Neuroinformatics | www.frontiersin.org 17 August 2016 | Volume 10 | Article 36