Analyzing User Interactions with Biomedical Ontologies: A Visual Perspective

Post on 23-Jan-2018

Analyzing User Interactions with Biomedical Ontologies: A Visual Perspective

Tuesday Talk, 14th November 2017

Maulik Kamdar, Simon Walk, Tania Tudorache, Mark Musen

Stanford Center for Biomedical Informatics Research
maulikrk@stanford.edu

Uses: Knowledge management, decision support, semantic search, data annotation, data integration, reasoning …

http://bioportal.bioontology.org/


What this talk is about…

•  BiOnIC - Catalog of User Interactions with Biomedical Ontologies

•  VisIOn application with embedded visualizations.

Analysis of BioPortal Web UI exploration and API querying strategies, and correlation with usage.

Benefits of analyzing user interactions

•  Ontology Engineers:
o  Identify exploration and querying patterns
o  Understand ontology usage and reuse
o  Prune unwanted classes and relations

•  Ontology Repository Maintainers:
o  Categorize user behaviors
o  Develop intelligent interfaces
o  Provide targeted recommendations

•  Biomedical Researchers:
o  Identify temporal research trends
o  Identify frequently accessed classes

BiOnIC: A Catalog of User Interactions with Biomedical Ontologies

http://onto-apps.stanford.edu/bionic/datasets

BiOnIC datasets creation

[Figure: NCBO API Usage. API requests per month, October 2013 to October 2016; y-axis ticks at 2M, 8M and 32M.]

[Figure: NCBO Website Traffic. Occurrences per month, January 2009 to January 2016; y-axis ticks at 0, 100K and 200K; series: Page Requests, Unique IP Addresses.]

Pipeline: Filtering Access Logs -> Filtering Ontologies -> Computing Class Counts -> Computing Sequences -> Anonymizing Data

BiOnIC datasets creation

•  Removing robot/invalid requests

•  Normalizing ontology identifiers and class IRIs
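The two log-filtering steps above could be sketched roughly as follows. This is only an illustration, not the authors' pipeline: the record fields (`user_agent`, `status`, `ontology`, `class_iri`), the robot keywords, and the normalization rules (uppercasing the ontology acronym, percent-decoding the class IRI) are all assumptions made for the example.

```python
from urllib.parse import unquote

# Hypothetical substrings that mark automated clients; a real list would be longer.
ROBOT_MARKERS = ("bot", "crawler", "spider")


def is_valid_request(record):
    """Keep only successful requests from clients that do not look like robots."""
    agent = record.get("user_agent", "").lower()
    if any(marker in agent for marker in ROBOT_MARKERS):
        return False
    # Assumption: a request counts as valid only if it returned HTTP 200.
    return record.get("status") == 200


def normalize(record):
    """Return (ontology_id, class_iri) in one canonical form (assumed rules)."""
    ontology_id = record["ontology"].strip().upper()
    class_iri = unquote(record["class_iri"].strip())
    return ontology_id, class_iri


def filter_access_log(records):
    """Drop robot/invalid requests, then normalize what remains."""
    return [normalize(r) for r in records if is_valid_request(r)]
```

Keeping the validity check and the normalization in separate functions makes it easy to swap in a different robot list or IRI convention without touching the rest.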

BiOnIC datasets creation

•  January 2015 version.

•  Ontologies should have classes that are reused by others OR reuse classes from other ontologies.

•  Ontologies should have minimum of 10 unique users via Web UI and API
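The inclusion criteria above can be expressed as a small predicate. This is a sketch under stated assumptions: the dictionary keys are hypothetical, and "minimum of 10 unique users via Web UI and API" is read here as at least 10 unique users on each interface, which is one possible interpretation of the slide.

```python
MIN_UNIQUE_USERS = 10  # threshold stated on the slide


def keep_ontology(stats):
    """Decide whether an ontology qualifies for the catalog.

    `stats` is a dict with hypothetical keys describing one ontology.
    """
    # Criterion 1: the ontology takes part in reuse, in either direction.
    involved_in_reuse = (
        stats["classes_reused_by_others"] > 0
        or stats["classes_reused_from_others"] > 0
    )
    # Criterion 2 (assumed reading): enough unique users on both interfaces.
    enough_users = (
        stats["unique_webui_users"] >= MIN_UNIQUE_USERS
        and stats["unique_api_users"] >= MIN_UNIQUE_USERS
    )
    return involved_in_reuse and enough_users
```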

BiOnIC datasets creation

Class Statistics Datasets

For each class in each ontology:

•  Access Attributes:
o  Total IP Requests (Web UI/API)
o  Unique IP Requests (Web UI/API)

•  Reuse Attributes:
o  Number of ontologies reusing a class

•  Structural Attributes:
o  Number of parent/child/sibling classes
o  Depth from ontology root
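The structural attributes listed above can be derived from a plain parent map. The sketch below is illustrative only: the input format is hypothetical, the output keys borrow the attribute names that appear on the schema slide, and depth is taken as the shortest distance from the root, which is an assumption since a class with several parents may be reachable at different depths.

```python
from collections import defaultdict, deque


def structural_attributes(parents, root):
    """Compute per-class structural counts.

    `parents` maps each class to a list of its direct parent classes.
    """
    children = defaultdict(set)
    for cls, direct_parents in parents.items():
        for parent in direct_parents:
            children[parent].add(cls)

    # Breadth-first search from the root yields the shortest depth of each class.
    depth = {root: 0}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for child in children[node]:
            if child not in depth:
                depth[child] = depth[node] + 1
                queue.append(child)

    attrs = {}
    for cls in depth:
        direct_parents = parents.get(cls, [])
        # Siblings are the other children of any direct parent.
        siblings = set()
        for parent in direct_parents:
            siblings |= children[parent]
        siblings.discard(cls)
        attrs[cls] = {
            "directParents": len(direct_parents),
            "directChildren": len(children[cls]),
            "siblings": len(siblings),
            "classDepth": depth[cls],
        }
    return attrs
```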

BiOnIC datasets creation

User Interaction Sequences Datasets

[Figure: two example ontologies laid out by class depth. Ontology 1 contains classes 1, 2a, 2b, 3a, 3b, 3c, 4a, 4b, 4c; Ontology 2 contains classes 1’, 2a’, 2b’, 2c’, 3a’, 3b’, 3c’. Example user interaction sequences: 2a 3a 4a 3a 1 (Ontology 1) and 2a’ 1’ 2b’ 3b’ (Ontology 2).]
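One way to derive interaction sequences like those in the figure is to group class requests by user and ontology and order them by time. This is a minimal sketch; the event tuple layout is an assumption and the slide does not state how the actual sequences were computed.

```python
from collections import defaultdict


def build_sequences(events):
    """Group class requests into one time-ordered sequence per (user, ontology).

    `events` is an iterable of (user_id, ontology_id, timestamp, class_id)
    tuples; this layout is hypothetical.
    """
    grouped = defaultdict(list)
    for user, ontology, timestamp, cls in events:
        grouped[(user, ontology)].append((timestamp, cls))
    # Sorting the (timestamp, class) pairs orders each sequence chronologically.
    return {
        key: [cls for _, cls in sorted(visits)]
        for key, visits in grouped.items()
    }
```

Keying on the (user, ontology) pair keeps a user's navigation in one ontology from being interleaved with their requests to another.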

BiOnIC datasets creation

Anonymization Steps

•  IP addresses anonymized using unique SHA-224 hash-encoded user identifiers generated from “user_<Random String>_<Random_Integer>”.

•  e.g. 39fd4e6d569a034973g61bb392a694d4eabe1ef98c43ee68ca2fc86

•  Absolute timestamps converted to relative timestamps, with respect to first interaction with BioPortal repository.

•  e.g. 0, 2757, 2786, 3586, 3618, 3803, 3959, 4047, 5111 (s), …
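The two anonymization steps could look roughly like this in Python. It is a sketch, not the authors' code: the length of the random string, the range of the random integer, and the event tuple layout are assumptions. Only the SHA-224 hashing of a `user_<Random String>_<Random_Integer>` token and the conversion to relative timestamps come from the slide.

```python
import hashlib
import secrets
import string


def make_user_id():
    """Hash a random 'user_<string>_<integer>' token with SHA-224."""
    # Assumed sizes: 12 random letters and an integer below one billion.
    rand_str = "".join(secrets.choice(string.ascii_letters) for _ in range(12))
    rand_int = secrets.randbelow(10**9)
    token = f"user_{rand_str}_{rand_int}"
    return hashlib.sha224(token.encode("utf-8")).hexdigest()


def anonymize(events):
    """Replace IPs with hashed identifiers and make timestamps relative.

    `events` is a list of (ip, unix_timestamp, class_id) tuples (hypothetical).
    """
    ids = {}
    first_seen = {}
    anonymized = []
    for ip, timestamp, cls in sorted(events, key=lambda e: e[1]):
        if ip not in ids:
            ids[ip] = make_user_id()
            first_seen[ip] = timestamp
        # Seconds elapsed since this user's first recorded interaction.
        anonymized.append((ids[ip], timestamp - first_seen[ip], cls))
    return anonymized
```

Because the identifier is derived from a random token rather than from the IP address itself, it cannot be reversed by hashing candidate IP addresses.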

BiOnIC schema to model statistics and sequences data

[Schema diagram. Classes and attributes shown:]

•  bionic:StatDataset
•  bionic:SeqDataset: accessType
•  dcat:Dataset
•  bionic:Ontology: skos:prefLabel, totalClasses, maxDepth
•  owl:Class: skos:prefLabel
•  bionic:ClassInfo: siblings, directParents, directChildren, classDepth
•  bionic:CountStat
•  bionic:ReuseCount: reuseType, reusingOntologies
•  bionic:RequestCount: accessType, year, totalUsers, uniqueUsers
•  bionic:Sequence: accessType, totalTime, uniqueClasses
•  bionic:SeqEntity: relativeTimestamp
•  prov:Agent
•  skos:Collection, skos:Concept

Relations shown: countStat, begin, end, nextEntity, class, requests, skos:member, sequence, classInfo, ontology, subClassOf.

SKOS, PROV and DCAT standards are reused in the BiOnIC schema.

BiOnIC datasets

http://onto-apps.stanford.edu/bionic/datasets

http://www.rdfhdt.org/

SPARQL Server

BioPortal SPARQL Endpoint

1.  How many agents click on the subclasses after exploring the “Protein Transmembrane Activity” class in Gene Ontology?

2.  What is the average time spent by agents on classes similar to “Cell Growth” class in Gene Ontology?
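To make the logic of the first question concrete, here is a plain-Python sketch over in-memory sequences. It does not query the SPARQL server mentioned above; the data structures are hypothetical, and "after exploring" is interpreted as "in the very next step", which is an assumption.

```python
def agents_moving_to_subclass(sequences, subclasses_of, target):
    """Count agents who visit `target` and then, as the next step, one of its
    direct subclasses.

    `sequences` maps an agent id to its ordered list of visited classes;
    `subclasses_of` maps a class to the set of its direct subclasses.
    """
    children = subclasses_of.get(target, set())
    count = 0
    for visited in sequences.values():
        # Look at every consecutive pair of classes in the sequence.
        if any(a == target and b in children for a, b in zip(visited, visited[1:])):
            count += 1
    return count
```

Each agent is counted at most once, however many times the pattern occurs in that agent's sequence.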

VisIOn (Visualizing Ontology Interactions) Web Application

http://onto-apps.stanford.edu/vision

Visualizing Ontology Structure with Count Data

[Figure panels: Force-directed Network, Indented Tree]

PolygOnto visualizations of interaction sequences

[Figure panels: Ontology Structure, User Sequence, Single Click]

Kamdar, et al. Visualizing Request and Reuse Data across Biomedical Ontologies. Journal of Web Semantics, (under review)

Characteristics of the BiOnIC Catalog

•  Web UI Access: 5.4M class requests, 1M unique agents

•  API Access: 67.2M class requests, 205K unique agents

•  255 biomedical ontologies

•  Do BioPortal Web UI exploration and API querying strategies correlate with each other?

•  Do BioPortal Web UI exploration and API querying strategies inform ontology usage?

– Ontology usage, in this context, means reuse in other ontologies, and usage in data annotation.

Interface influences in browsing and querying

[Figure: scatter plot of Number of Unique API Users (log scale) against Number of Unique Web UI Users (log scale), with axis ticks from 1 to 1000. Labeled classes: Female Reproductive System, Dermis, Dysmorphic Syndrome, Night blindness.]

Certain classes browsed or queried significantly more.

Exploration and Querying behavioral patterns

•  Certain classes in the lower levels of the ontological hierarchy are rarely browsed and queried – this may be an artifact of the indented tree visualization.

•  More triangular polygons (1 parent -> 2 children classes, or 2 parents -> 1 child class) observed in Web UI Access polygon due to indented tree visualization.

300,000+ classes, 200,000+ users

Do BioPortal Web UI exploration and API querying strategies correlate with each other?

•  Small proportions of ontological content explored or queried for several ontologies, and minimal Spearman Correlation and Jaccard Similarity observed between consumption strategies.

•  Varying consumption strategies (triangular polygons, blind querying) across the interfaces – these strategies are generally uniform across ontologies, with a few exceptions (ChEBI).

•  Classes in the lower layers of the ontological hierarchy are rarely browsed or queried using the Web UI or API.
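For readers unfamiliar with the two measures named above, the following sketch computes a Spearman rank correlation between per-class request counts and a Jaccard similarity between sets of accessed classes. It is a generic textbook implementation in plain Python, not the analysis code behind the slides, and the inputs (per-class Web UI and API counts, sets of accessed classes) are assumed for illustration.

```python
def average_ranks(values):
    """Rank values starting at 1, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors.

    Undefined (division by zero) if either input is constant.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


def jaccard(a, b):
    """Size of the intersection divided by size of the union."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0
```

Here `spearman` would take the Web UI and API request counts for the same list of classes, while `jaccard` would compare the set of classes accessed through each interface.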

Ontology usage: Reuse in other ontologies

Distinct class sets reused in other ontologies

Ontology usage: GWAS Catalog and PubChem annotations

Classes in the higher layers of the ontological hierarchy are very abstract and are never used for data annotations.

Common consensus across 12 different data sources in Linked Open Data cloud, where ChEBI ontology is used for data integration.

Do BioPortal Web UI exploration and API querying strategies inform ontology usage?

•  Minimal correlation and similarities between exploration and querying strategies, and usage (reuse and data annotations)

•  Ontology reuse occurs from classes located in the higher layers of the ontological hierarchy for semantic interoperability between ontologies.

•  However, these classes are very abstract and not very relevant for data annotation or data integration, where classes in the lower layers of hierarchy are highly used.

–  These classes are rarely explored or queried, we need better interfaces!

Novel research directions may be enabled through the BiOnIC and VisIOn resources

•  Categorize user browsing behaviors by incorporating the structural features of the ontology classes.

•  Develop personalized user interfaces for ontology navigation, with predictions of the next class that a user is likely to access, or targeted recommendations based on user type (recurrent neural networks, collaborative filtering, …).

•  Develop advanced methods for ontology summarization and modularization, using BiOnIC datasets as features.

Acknowledgments

Musen Lab, Stanford

BMI PhD Program, Stanford

US NIH Grants U54-HG004028, GM086587

maulikrk@stanford.edu
http://onto-apps.stanford.edu/bionic
http://onto-apps.stanford.edu/vision
