Exploring Information Seeking and Searching Intentions: An Overview of Recent Research at Rutgers University
Nicholas J. Belkin, School of Communication & Information
Rutgers University, New Brunswick
-
Information Seeking and Searching Situation
• A person, facing a problematic situation with respect to some task or goal, decides that interacting with information could help to achieve the goal or accomplish the task.
• That person makes a decision about how best to carry out that interaction. This is the Seeking decision (Wilson, 1999).
• When the decision is made to interact with information through some system, Searching commences.
-
The Goal(s) of Information Retrieval
• To support the person(s) in achieving the goal or task that motivated them to engage in information seeking and searching
• To do this by helping the person(s) to resolve their problematic situation
• To do this by supporting effective interaction with the IR system and the information objects within that system
• To do this by responding appropriately to the information searching intentions of the person(s) during the course of an information searching session
-
Moving from System-Centered to Person-Centered Information Retrieval
• Recognize that information retrieval is inherently interactive
• Recognize that the information retrieval situation is inherently dynamic
• Recognize that people engage in information seeking and searching sessions
• Make the person in the IR system the central actor
• Make interaction with information objects the central process
-
A Model of Interaction with Information (Belkin, 1996)
-
Taking Account of the Interactive Nature of IR
• A research program at the Rutgers University Department of Library and Information Science
• Personalization of the Digital Library Experience (POoDLE): IMLS
• Automatic Identification of Information Searcher Intentions During an Information Seeking Session: Google
• Characterizing and Evaluating Whole Session Interactive Information Retrieval (CHEWS-IIR): NSF (in progress, described today)
-
General Pattern of our Studies
• Construct work tasks of different types, with associated information searching tasks
• Have participants conduct a search for one work task
  • Log behaviors
  • Record search session
• Play back the information search session for participant annotation
• Iterate for next work task, to final work task
• Exit interview
-
Work Tasks and Information Search Tasks
• Journalism domain
• Any topic
• Several well-defined types of work tasks, e.g.
  • Advance obituary; Copy editing; Prepare for interview; Story pitch; Prepare story
• Constructed work and search tasks differ on values of specific facets
• Faceted classification of task (Li & Belkin, 2008)
-
Li & Belkin (2008) Facet Analysis of Task (modified)
• Source of Task
  • Self, Group, Assigned
• Task Doer
  • Individual, Group
• Time
  • Frequency
  • Length
  • Stage
• Product
  • Physical, Intellectual, Decision, Factual
• Process
  • One-time, Multiple
• Items
  • Named or Not
  • Whole or Part
• Goal
  • Quality: Specific, Amorphous, Mixed
  • Quantity: Single or multiple goals
• Common attributes of task, e.g.
  • Objective/Subjective task complexity, Urgency, Salience, Difficulty, …
-
Example Task and Classification
Assignment 1. Copy Editing (CPE)
Your Assignment: You are a copy editor at a newspaper and you have only 20 minutes to check the accuracy of six italicized statements in the excerpt of a piece of news story below.
Your Task: Please find and save an authoritative page that either confirms or disconfirms each statement.
Product: Fact; Items: Named/Part; Goal: Specific
-
Example Task and Classification
Assignment 2. Story Pitch (STP)
Your Assignment: You are planning to pitch a science story to your editor and need to identify interesting facts about the coelacanth (“see-la-kanth”), a fish that dates from the time of dinosaurs and was thought to be extinct.
Your Task: Find and save web pages that contain the six most interesting facts about coelacanths and/or research about their preservation.
Product: Fact; Items: Not Named/Part; Goal: Specific
-
Example Task and Classification
Assignment 3. Relationships (REL)
Your Assignment: You are writing an article about coelacanths and conservation efforts. You have found an interesting article about coelacanths but in order to develop your article you need to be able to explain the relationship between key facts you have learned.
Your Task: In the following there are five italicized passages, find an authoritative web page that explains the relationship between two of the italicized facts.
Product: Intellectual; Items: Named/Part; Goal: Mixed (Specific + Amorphous)
-
Example Task and Classification
Assignment 4. Interview Preparation (INT)
Your Assignment: You are writing an article that profiles a scientist and their research work. You are preparing to interview Mark Erdmann, a marine biologist, about coelacanths and conservation programs.
Your Task: Identify and save authoritative web pages for the following:
• Identify two (living) people who likely can provide some personal stories about Dr. Erdmann and his work.
• Find the three most interesting facts about Dr. Erdmann’s research.
• Find an interesting potential impact of Dr. Erdmann’s work.
Product: Intellectual; Items: Not Named/Whole; Goal: Amorphous
-
Participants and Procedure
• Journalism undergraduate university students
• Entry questionnaire: demographics
• Searches for two (of four) tasks conducted in lab with eye tracker (20 minutes each)
• Pre-search questionnaire (when presented with task description)
  • Familiarity with task, topic
  • Expected difficulty
• Search conducted on the Web, any search system, through Coagmento
• Post-search questionnaire
  • Experienced difficulty
  • Confidence in task success
• Play back search for annotation, by Query Segment (QS)
  • A QS is query n and all that happens up to and including query n+1 (or the end of the session)
• Exit interview
  • Comparison of two tasks and two search sessions
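The Query Segment definition above can be sketched in a few lines of Python. This is only an illustration: the event format, a list of `(kind, payload)` tuples with `kind == "query"` marking a submitted query, is a hypothetical stand-in for whatever the logging tool actually records.

```python
def query_segments(events):
    """Split one session log into query segments.

    Segment n starts at query n and runs up to and including
    query n+1, or to the end of the session for the last query.
    `events` is a hypothetical list of (kind, payload) tuples.
    """
    # Positions of all query events in the session
    starts = [i for i, (kind, _) in enumerate(events) if kind == "query"]
    segments = []
    for n, start in enumerate(starts):
        if n + 1 < len(starts):
            end = starts[n + 1] + 1  # include the next query itself
        else:
            end = len(events)        # last segment runs to session end
        segments.append(events[start:end])
    return segments


session = [
    ("query", "coelacanths"),
    ("click", "page-1"),
    ("save", "page-1"),
    ("query", "coelacanths conservation"),
    ("click", "page-2"),
]
for segment in query_segments(session):
    print(segment)
```

Because each segment includes the following query, adjacent segments share that query event, which matches the annotation step where participants are asked about query n+1 at the end of QS n.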
-
Annotation
• Play back QS n
• What were you intending to accomplish during this period?
  • Choice of intentions, can be multiple
  • For each intention: Was this intention satisfied? If no, why not?
    • [text entry]
• What were you hoping to accomplish with [query n+1]?
  • [text entry]
• Play back QS n+1
-
Xie’s (2002) Interactive [Search] Intentions
• Identify search information (Something to start; Something more to search)
• Learn (Domain knowledge; Database content)
• Find (Known item; Specific information; Sharing named characteristic; Without predefined criteria)
• Keep record
• Access item or set of items
• Evaluate (Correctness; Usefulness; Best; Specificity; Duplication)
• Obtain (Specific information; Part of item; Whole item)
-
Data Analyses (So Far)
• Querying behavior and search intentions
  • Relationships between query reformulation “types” and search intentions
  • Effect of intention satisfaction on query reformulation type
  • Classification of reasons for query reformulation
• Intentions and search behaviors
  • Are the Xie search intentions necessary and sufficient?
  • Sequences of search intentions
  • Prediction of search intention based on search behavior
-
Query Reformulation Types (Liu et al. 2010, modified)
Type | Definition | Example
Generalization | At least one term in common in two queries; second query contains fewer terms than first query | world economic impact on global warming on Arctic region → global warming on Arctic region
Specialization | At least one term in common in two queries; second query contains more terms than first query | impact Dr. Erdmann → impact Dr. Mark Erdmann
Word Substitution | At least one term in common in two queries; second query has the same length as first query, but contains some terms not in the first query | Igor Semiletov research → igor semiletov methane
Repeat | Exactly the same term(s) repeated from any previous queries within the session | Coelacanths (1st query) → Coelacanths (5th query)
New | No common terms in two queries | where is madagascar → coelacanths live young
Spelling Correction | The second query corrects misspelling of the previous query | methane clarites artic economic immpact → methane clarites arctic economic impact
Stem Identical | Two queries with the same morphological root | methane km → methane kilometers
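The term-overlap definitions above lend themselves to a small rule-based classifier. The sketch below is illustrative only and is not the coding procedure used in the study: it covers the five types that can be decided from whitespace-separated terms alone, and it leaves out Spelling Correction and Stem Identical, which would need a dictionary or a stemmer. The function name and the order in which rules are checked are assumptions.

```python
def classify_reformulation(previous_queries, new_query):
    """Assign a reformulation type to `new_query`.

    `previous_queries` holds the earlier queries of the session in
    order; the last one is the query being reformulated.
    Spelling Correction and Stem Identical are not handled here.
    """
    if not previous_queries:
        raise ValueError("a reformulation needs at least one earlier query")
    new_terms = new_query.lower().split()
    history = [q.lower().split() for q in previous_queries]

    # Repeat: exactly the same terms as any earlier query in the session
    if any(sorted(new_terms) == sorted(old) for old in history):
        return "Repeat"

    prev_terms = history[-1]
    if not set(prev_terms) & set(new_terms):
        return "New"  # no terms in common with the previous query

    # At least one common term: compare query lengths
    if len(new_terms) < len(prev_terms):
        return "Generalization"
    if len(new_terms) > len(prev_terms):
        return "Specialization"
    return "Word Substitution"


print(classify_reformulation(["impact Dr. Erdmann"], "impact Dr. Mark Erdmann"))
```

Checking Repeat first is a design choice: a repeated query also shares terms with earlier queries, so it would otherwise be absorbed by one of the overlap-based types.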
-
Query Analyses
• Data for 24 participants, 48 search sessions
• 434 queries
• 383 query reformulations, therefore 383 instances of reasons for query reformulation
• 1824 search intentions
  • Median per QS 4, range 1-16
  • 1575 satisfied, 249 unsatisfied
-
Total counts for each intention
-
Query Reformulations and Search Intentions
• RQ1: What types of reformulations are used following any search intention?
• RQ2: What types of reformulations are used when an intention is either satisfied or not satisfied?
• RQ3: What are the subsequent intentions of reformulations?
Rha, E. Y., Belkin, N. J., Mitsui, M. & Shah, C. (2016) Exploring the relationships between search intentions and query reformulations. In: Proceedings of the 79th Annual Meeting of the Association for Information Science and Technology (9 pp.). Silver Spring, MD: Association for Information Science and Technology.
-
Frequency of satisfied and unsatisfied intentions leading to each reformulation type
-
Most frequent intentions, most frequent following reformulations, and most frequent subsequent intentions
Previous intention | Satisfied | Most frequent reformulation | Subsequent intention(s) | Second most frequent reformulation | Subsequent intention(s)
Find specific | Y | Specialization | Find specific | Generalization | Find specific
Find specific | N | Specialization | Find specific | Generalization | Find specific
Obtain specific | Y | Specialization | Find specific | Generalization | Obtain specific
Obtain specific | N | Specialization | Obtain specific | Generalization | Find specific
Identify more | Y | Repeat | Identify more | Specialization | Identify more
Identify more | N | Specialization | Learn domain | Repeat | Identify more
Learn domain | Y | Specialization | Find specific | Generalization | Identify more
Learn domain | N | Specialization | Learn domain | Generalization | Learn domain, Learn database
Identify start | Y | Specialization | Find specific | Repeat | Identify more
Identify start | N | Specialization | Find specific, Obtain specific | Generalization | Identify start, Find known
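A table of this kind can be derived from annotated transitions by simple counting. The following is a minimal sketch under stated assumptions: the record layout (one dictionary per previous intention and reformulation pair, with keys `intention`, `satisfied` and `reformulation`) is hypothetical, and the sample records are invented toy data, not study results.

```python
from collections import Counter


def reformulation_ranking(records, intention, satisfied):
    """Rank reformulation types that follow a given previous intention.

    Returns (reformulation, count) pairs, most frequent first, for
    records matching the intention and its satisfaction status.
    """
    counts = Counter(
        r["reformulation"]
        for r in records
        if r["intention"] == intention and r["satisfied"] == satisfied
    )
    return counts.most_common()


# Invented toy records, for illustration only
toy_records = [
    {"intention": "find specific", "satisfied": True, "reformulation": "Specialization"},
    {"intention": "find specific", "satisfied": True, "reformulation": "Specialization"},
    {"intention": "find specific", "satisfied": True, "reformulation": "Generalization"},
    {"intention": "find specific", "satisfied": False, "reformulation": "Repeat"},
]
print(reformulation_ranking(toy_records, "find specific", True))
```

The first two entries of the ranking correspond to the "most frequent" and "second most frequent" reformulation columns; the subsequent-intention columns could be counted the same way with an extra key.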
-
[Figure: flow diagram with three columns, FIRST INTENTION → QUERY REFORMULATION → SUBSEQUENT INTENTION. Intentions shown: find specific, obtain specific, identify more, learn domain, identify start, evaluate correctness, keep link, find common, evaluate usefulness, access item, learn database, access common, evaluate specificity, find known, evaluate best, access area, obtain part, obtain whole, find without, evaluate duplication. Reformulations shown: generalization, specialization, repeat, word substitution, new, spelling correction, stem identical.]
-
Reformulation & Intentions Discussion 1
• RQ1: What types of reformulations are used following any search intention?
  • Specialization is the most common reformulation following 12 of the 20 intentions, then Repeat, then Generalization
  • Intentions have different patterns of subsequent reformulations
• RQ2: What types of reformulations are used when an intention is either satisfied or not satisfied?
  • Inconclusive; too few unsatisfied
• RQ3: What are the subsequent intentions of reformulations?
  • Inconclusive but promising; each subsequent intention has a different pattern of precursor reformulations, despite the domination of Specialization
-
Reformulation & Intentions Discussion 2
• Despite the nature of the work and search tasks, participants had no difficulty identifying different intentions associated with different query segments
• Given the different nature of the various intentions, this suggests that search support techniques other than query reformulation could be useful in supporting effective interaction
• The degree of satisfaction of intentions may be due either to low expectations or to inventive use of reformulation
-
Reasons for Query Reformulation
• People reformulate queries, but we don’t know what they are trying to accomplish by doing this
  • RQ1: What are reasons for query reformulation?
• People reformulate queries, but we don’t know how reformulation types relate to reasons for reformulation
  • RQ2: How are types of query reformulation related to users’ reasons for query reformulations?
• People attempt to accomplish different search intentions, but we don’t know how they go about doing that through query reformulation
  • RQ3: How do previous interactive search intentions relate to the reasons for the query reformulations that follow?
Rha, E. Y., Wei, S. & Belkin, N. J. (2017) An exploration of reasons for query reformulation. In: Proceedings of the 80th Annual Meeting of the Association for Information Science and Technology (11 pp.). Silver Spring, MD: Association for Information Science and Technology.
-
Procedure for Addressing RQs
• Open coding of 383 texts written in response to the question: Please explain why you entered this new query, and what you were hoping to accomplish by doing so
• Identification of common structure of reasons, and common elements in that structure
• Development of a faceted classification based on structure and elements
• Analysis of types of reasons in relationship to types of reformulations and types of search intentions
-
Reasons and Coding Examples
Reason | Open coding
“Trying to find information that is truthful, and more specific to the subject.” | Find truthful information; Find specific information
“Clarify my original search” | Clarify original search
“Looked up for any recent news regarding Arctic oil and gas to see if I could bolster my argument with any recent facts that were perhaps in the news.” | Look for recent news; Bolster my argument
“I entered this new query because I felt I did not use the right word in my first query referring to people the scientist would have had relations with to provide the answer to the first question of the assignment.” | Use right word
“I used a more general phrase to get more background information on the topic and hopefully find authoritative sources that supported the facts.” | Get background information; Find authoritative sources
-
Examples of Normalization of Open Coding
Open coding | Final combination
Find truthful information | Find-accurate-information
Clarify original search | Clarify-previous-search
Look for recent news | Find-up-to-date-publication
Bolster my argument | Verify-specific-knowledge
Use right word | Correct-previous-query
Get background information | Obtain-background-information
Find authoritative sources | Find-credible-source
-
Faceted Classification Based on Reason Structure
Facet | Sub-facet | Values
Process | Operational | Find; Obtain; Access; Expand; Combine; Correct; Change; Narrow down; Start
Process | Interpretive | Evaluate; Verify; Focus on; Learn; Clarify; Use; Understand
Aspect | Depth | General; Specific; Background; Basic; Detailed
Aspect | Time | New; Previous; Up-To-Date
Aspect | Quality | Interesting; Accurate; Credible; Better; Useful
Aspect | Quantity | Multiple; Single
Aspect | Relationship | Similar; Different; Relevant; More
Entity | Content | Knowledge; Information; Topic; Definition; Fact; Domain
Entity | Resource | Source; Website; Publication
Entity | Search | Search result; Query; Search
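The facet table can be read as a lookup structure for the normalized reason combinations. The sketch below assumes that a combination follows a Process-Aspect-Entity pattern joined by hyphens, which is inferred from examples such as Find-up-to-date-publication rather than stated on the slides; the function names are hypothetical, and multi-word Process or Entity values such as "Narrow down" or "Search result" are not handled.

```python
PROCESS = {
    "Operational": {"find", "obtain", "access", "expand", "combine",
                    "correct", "change", "narrow down", "start"},
    "Interpretive": {"evaluate", "verify", "focus on", "learn",
                     "clarify", "use", "understand"},
}
ASPECT = {
    "Depth": {"general", "specific", "background", "basic", "detailed"},
    "Time": {"new", "previous", "up-to-date"},
    "Quality": {"interesting", "accurate", "credible", "better", "useful"},
    "Quantity": {"multiple", "single"},
    "Relationship": {"similar", "different", "relevant", "more"},
}
ENTITY = {
    "Content": {"knowledge", "information", "topic", "definition",
                "fact", "domain"},
    "Resource": {"source", "website", "publication"},
    "Search": {"search result", "query", "search"},
}


def sub_facet(table, value):
    """Return the sub-facet whose value set contains `value`, else None."""
    for name, values in table.items():
        if value.lower() in values:
            return name
    return None


def parse_reason(combination):
    """Split an assumed Process-Aspect-Entity combination into facets."""
    parts = combination.split("-")
    if len(parts) < 3:
        raise ValueError("expected at least three hyphen-separated parts")
    process, entity = parts[0], parts[-1]
    aspect = "-".join(parts[1:-1])  # keeps values like 'up-to-date' whole
    return {
        "process": (sub_facet(PROCESS, process), process),
        "aspect": (sub_facet(ASPECT, aspect), aspect),
        "entity": (sub_facet(ENTITY, entity), entity),
    }


print(parse_reason("Find-up-to-date-publication"))
```

Taking the first and last parts as Process and Entity, and everything in between as the Aspect, is what lets a hyphenated Aspect value survive the split.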
-
Distribution of Reasons for Reformulation
-
Mapping Reasons to Search Intentions
Xie’s (2002) search intention | Reason combination
Find specific information | Find-specific-information; Find-specific-publication; Find-specific-source; Find-specific-website
Identify more to search | Find-more-information
Evaluate correctness | Verify-specific-fact; Verify-specific-information
Obtain specific information | Obtain-specific-information
Find items without pre-defined criteria | Find-interesting-fact; Find-different-information
Learn domain knowledge | Learn-specific-topic
-
Relationship of Reasons to Reformulations
-
Reasons and Intentions Discussion 1
• RQ1: What are reasons for query reformulation?
  • A faceted classification scheme provides ways to characterize reasons at different levels of granularity, but there are many possible combinations
  • Many of the reasons (but not all) map to Xie’s (2002) interactive search intentions
• RQ2: How are types of query reformulation related to users’ reasons for query reformulations?
  • Participants used different query reformulation types to accomplish the same reasons, and
  • The same reformulation types were used to accomplish multiple reasons
-
Reasons and Intentions Discussion 2
• RQ3: How do previous interactive search intentions relate to the reasons for the query reformulations that follow?
  • Inconclusive; dominance of find-specific-information as a reason, and lack of unsuccessful intentions, did not allow meaningful analysis
• Overall conclusion: People, during the course of an information search session, attempt to do more than just “make a better query”; it seems clear that many of the reasons for query reformulation would be better achieved through other means.
-
Search Intentions and Search Behaviors
Given that people attempt to accomplish different intentions during the course of an information search session, can a system identify what those intentions are, without intervention?
• RQ1: How is a user’s Web search behavior associated with his or her information seeking intentions in the same query segment?
• RQ2: How is a user’s Web search behavior in the current query segment associated with his or her information seeking intentions in the subsequent query segment?
-
Procedure, Data and Methods
• Procedure as previously described, but with data for 40 participants
• 80 search sessions, 693 query segments
• Observed search behaviors treated as groups
• Two different analyses, using two slightly different behavior groups
  • Identifying intentions as a binary classification problem: logistic regression
  • Identifying and predicting intentions through significantly different correlations of behaviors when intention is present
-
Observed Behaviors, per Query Segment
• Saved item (binary)
• Number of saved items
• Dwell times on content pages
• Dwell times on SERP viewports
• Query length
• Query reformulation type
• Number of clicks
• Number of sources visited
• Number of pages viewed
• Dwell times are:
  • total dwell time
  • total dwell time until a page is saved
  • total open time
  • total open time until a page is saved
  • first dwell time
  • mean of all dwell times
-
Behavioral Groups for Binary Classification (tested singly and in combinations)
• Saving features
  • Saved item (binary)
  • Number of saved items
• Content page features
  • Dwell times
  • Number of content pages, by types: saved, not saved, unsaved, total
• SERP (i.e. viewport on SERP) features
  • Dwell times
• Query features
  • Query length
  • Query reformulation type
-
Measuring Performance
• Measures (TP = True Positive; FP = False Positive; TN = True Negative; FN = False Negative)
  • Accuracy: ACC = (TP + TN) / (TP + TN + FP + FN)
  • Precision for intention present: P1 = TP / (TP + FP)
  • Precision for intention absent: P0 = TN / (TN + FN)
• Baselines
  • Stratified sampling of positive/negative labels proportional to their distribution in training data
  • Assigning the most frequent label in the training data
• Tests for Identification
  • Improvement over the better of the two baselines, Kolmogorov-Smirnov
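The three measures and the two baselines can be written out directly from their definitions. This is a minimal sketch, not the study's evaluation code: labels are assumed to be 1 (intention present) or 0 (intention absent), the function names are hypothetical, and a precision is returned as `None` when its denominator is zero.

```python
import random
from collections import Counter


def evaluate(y_true, y_pred):
    """Compute ACC, P1 and P0 from binary true and predicted labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "ACC": (tp + tn) / (tp + tn + fp + fn),
        "P1": tp / (tp + fp) if tp + fp else None,  # intention present
        "P0": tn / (tn + fn) if tn + fn else None,  # intention absent
    }


def most_frequent_baseline(train_labels, n):
    """Predict the most frequent training label for all n test items."""
    label = Counter(train_labels).most_common(1)[0][0]
    return [label] * n


def stratified_baseline(train_labels, n, rng):
    """Sample labels in proportion to their share of the training data."""
    p_positive = sum(train_labels) / len(train_labels)
    return [1 if rng.random() < p_positive else 0 for _ in range(n)]


train = [0, 0, 0, 1, 0, 1]
test_true = [1, 1, 0, 0, 0]
print(evaluate(test_true, most_frequent_baseline(train, len(test_true))))
print(evaluate(test_true, stratified_baseline(train, len(test_true), random.Random(0))))
```

With a rare intention, the most-frequent-label baseline predicts "absent" everywhere, so P1 is undefined for it while P0 and ACC can still be high. That may be one reason a classifier's gains show up mainly in precision for the intention being present.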
-
Results for Identification by Classification (1)
-
Results for Identification by Classification (2)
• Accuracy
  • Significant (p < .01) but not large improvement in ACC over the better baseline for all intentions but one. For most intentions, using all feature groups was best.
• Precision present
  • Significant (p < .01) and meaningful improvement in Ppres (P1) for all intentions. For most intentions, one feature group, or a combination of two, performed best, rather than combining all.
• Precision absent
  • Slight improvements, most non-significant, over the best baseline. Scores were uniformly fairly high for both baselines.
-
Classification Discussion
• Doing better than random with a very simple classifier for two out of three measures
• Doing very well in positive identification, likely because it’s a conservative algorithm
  • Identifying fewer intentions, with more certainty, is probably a win given the problem
• Negative identification may be uninteresting, given the problem
• Interesting start on the problem; next steps are:
  • More and different features
  • Prediction, rather than just identification
-
Behavioral Groups for Prediction
• Overall search behavior
  • Query length
  • Number of sources visited
  • Number of pages viewed
• Dwell time features
  • Mean dwell time on each SERP viewport
  • Mean dwell time on content pages
• Usefulness judgment
  • Saved item (binary)
  • Number of saved items
-
Measuring Strength of Relationship
• RQ1: How is a user’s Web search behavior associated with his or her information seeking intentions in the same query segment?
  • Mean value of each search behavior for all query segments
  • Mean value of each search behavior for query segments with a given intention
  • Degree of difference between the two indicates strength of relationship
• RQ2: How is a user’s Web search behavior in the current query segment associated with his or her information seeking intentions in the subsequent query segment?
  • Mean value of each search behavior for all query segments
  • Mean value of each search behavior for query segments preceding a query segment with a given intention
  • Degree of difference between the two indicates strength of relationship
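The deviation-from-mean comparison for both research questions can be sketched as one function with a lag parameter. The data layout is an assumption: `segments` is the ordered list of query segments of a single session, each a dictionary holding numeric behavior values and a set of annotated intentions. The values in the example are invented, and significance testing is not included.

```python
from statistics import mean


def behavior_deviation(segments, behavior, intention, lag=0):
    """Difference between a conditional mean and the overall mean.

    lag=0 compares segments that carry the intention (RQ1);
    lag=1 compares segments that precede a segment carrying the
    intention (RQ2). Returns None if no segment qualifies.
    """
    overall = mean(s[behavior] for s in segments)
    selected = [
        segments[i][behavior]
        for i in range(len(segments) - lag)
        if intention in segments[i + lag]["intentions"]
    ]
    if not selected:
        return None
    return mean(selected) - overall


# Invented example session with three query segments
session = [
    {"clicks": 2, "intentions": {"find specific"}},
    {"clicks": 3, "intentions": {"learn domain"}},
    {"clicks": 7, "intentions": {"find specific"}},
]
print(behavior_deviation(session, "clicks", "find specific"))         # same segment
print(behavior_deviation(session, "clicks", "find specific", lag=1))  # preceding segment
```

A positive value means the behavior is above its overall mean when the intention is present (or about to occur); a negative value means it is below. For data spanning several sessions, the lagged pairs would have to be formed within each session so that a segment is never paired with the first segment of the next session.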
-
Methods
• Correlation analysis for each behavior-intention pair
• Done for all current, and all subsequent, pairs
• Behaviors distributed non-normally
• Mann-Whitney tests for significant differences
-
Results for Identification and Prediction by Deviation from Mean
• In general, different behaviors, and patterns of behaviors, are associated with different intentions in the current query segment. There are many significant associations of this kind.
• In general, different behaviors, and patterns of behaviors, in the current query segment are associated with different intentions in the subsequent query segment. There are fewer significant associations than for the current intention, but still some for almost all subsequent intentions.
• Next two slides show these results for (1) identification and (2) prediction. Black is significantly above the mean; grey is significantly below the mean.
-
Next Steps
• Add analysis of eye fixation behaviors to the identification and prediction models
  • Based on dissertation work by Michael Cole, and related to results reported in Cole, M. J., Hendahewa, C., Belkin, N. J. & Shah, C. (2015) User activity patterns during information search. ACM Transactions on Information Systems, 33(1): Article No. 1 (39 p.)
• Carry out analyses with respect to task types and facet values
  • Substantial evidence that task type influences search behaviors significantly
  • Strong suspicion that task type influences patterns of intentions
• Carry out in situ study of search behaviors and search intentions
  • Thirty “professional” participants, searches logged and annotated by intentions, for one week
-
Thanks for Your Attention
• Acknowledgements due to all of the members of the POoDLE and CHEWS-IIR projects, and to our funders
• Work reported here was supported through the National Science Foundation, grant #IIS-1423239.
• Work reported here was supported by a Google Faculty Research Award to N. J. Belkin & C. Shah
• Some work reported here was supported by IMLS grant LG#06-07-0105-07