Extraction of Adaptation Knowledge from Internet Communities *
Presentation of the paper "Extraction of Adaptation Knowledge from Internet Communities"
Norman Ihle, Alexandre Hanft, and Klaus-Dieter Althoff
University of Hildesheim
Institute for Computer Science
Intelligent Information Systems Lab
[lastname]@iis.uni-hildesheim.de
* This is an extended version of the paper presented at the Workshop “WebCBR: Reasoning from Experiences on the Web” at ICCBR’09
FGWM’09 Fachgruppentreffen Wissensmanagement at LWA09, TU Darmstadt, 2009-09-22
Extraction of Adaptation Knowledge from Internet Communities
FGWM @ LWA‘2009 | 2009-09-22 2
Outline
Motivation
CookIIS
CommunityCook
A system for model-based knowledge extraction from Internet communities
Evaluation
Conclusion & Outlook
Motivation
Adaptation is the “reasoning” in CBR [Kolodner 1997]
Most CBR systems avoid adaptation [Schmidt et al. 2003; Minor 2006]
Adaptation Knowledge Acquisition (AKA) is cost-intensive and time-consuming
− Experts are hardly available
− Small number of research papers and systems
− Most systems focus on the case base as the source of knowledge
The Internet is a large source of knowledge, especially user-generated content
The Cooking Domain
The cooking domain is well suited for adaptation, because:
The context can be described easily:
1. all ingredients can be listed with exact amount and quality
2. ingredients can be obtained in standardized quantities and in comparable quality
3. kitchen machines and tools are available in a standardized manner
4. (in case of a failure) the preparation of a meal can start over again from the same initial situation (except that we have more experience in cooking after each failure)
Cooking is about creativity and variation
CookIIS
System for the retrieval and adaptation of recipes in the cooking domain: http://cookiis.iis.uni-hildesheim.de
Competes in the Computer Cooking Contest
Given recipes
Different tasks and requirements
− Identification of negations, type of meal and origin of the dish
− Handling of certain diets
− Creation of a three-course menu
Developed using the empolis:Information Access Suite (e:IAS)
CookIIS knowledge model
Most important component: modelled ingredients
11 different classes, about 1000 concepts
Modelled in English and German with synonyms
Concepts organised in taxonomies
Combined similarity
Other components: tools, origins, methods, etc.
Overall about 2000 modelled concepts
Rules for the recognition of the origin of the dish
Rules for the recognition of the type of meal
Adaptation in CookIIS
Model-based approach:
Replace unwanted ingredients with similar ones
Similarity is mainly based on taxonomies and uses a set function offered by the e:IAS Rule Engine:
− Parent and child concepts are retrieved as well as sibling concepts
− Too many similar ingredients are retrieved
In many cases the approach is not appropriate
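The taxonomy-based retrieval described on this slide can be illustrated with a minimal Python sketch. It is not the e:IAS Rule Engine; the toy taxonomy `PARENT` and the function names are assumptions made up for illustration. It shows why the candidate list grows quickly: parent, siblings and children are all returned.

```python
# Hypothetical toy taxonomy (child concept -> parent concept); not the CookIIS model.
PARENT = {
    "cream": "dairy",
    "milk": "dairy",
    "yoghurt": "dairy",
    "double cream": "cream",
    "sour cream": "cream",
}


def children(concept):
    """Direct child concepts of `concept`, sorted by name."""
    return sorted(c for c, p in PARENT.items() if p == concept)


def replacement_candidates(unwanted):
    """Collect parent, sibling and child concepts of the unwanted ingredient."""
    candidates = []
    parent = PARENT.get(unwanted)
    if parent is not None:
        candidates.append(parent)
        # Siblings share the same parent concept.
        candidates.extend(s for s in children(parent) if s != unwanted)
    candidates.extend(children(unwanted))
    return candidates


print(replacement_candidates("cream"))
```

Even in this tiny taxonomy a single unwanted ingredient yields five candidates, which mirrors the problem named above.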
Cooking Communities
A number of Internet communities deal with cooking knowledge
Users upload recipes and discuss them inside comments
They express approval, criticism and what they changed for their own variation of the recipe (their personal adaptation)
If they vary the recipe, they name ingredients
Idea: using the CookIIS knowledge model to extract those ingredients
CommunityCook: Classification Idea
Comments can be classified according to extracted ingredients into three categories:
NEW: all ingredients that are discussed, but are not part of the recipe
− Add some ingredient
OLD: all ingredients that are discussed and are part of the recipe
− “more” / “less” of an ingredient, explanation for an ingredient
OLDANDNEW: some ingredients that are discussed are part of the recipe and others are not
− Replacement of ingredients == adaptation
− Specialisation (“for cheese I took parmesan”)
The latter category can be an adaptation suggestion, especially if the ingredients are of the same class of the knowledge model
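The three categories can be expressed as a small set comparison. The following Python sketch is one possible reading of the slide, not the CommunityCook implementation; the function name and the handling of comments without any recognised ingredient (returning `None`) are assumptions.

```python
def classify_comment(comment_ingredients, recipe_ingredients):
    """Classify a comment by comparing its ingredients with the recipe's ingredients."""
    mentioned = set(comment_ingredients)
    recipe = set(recipe_ingredients)
    if not mentioned:
        return None  # assumption: comments without ingredients stay unclassified
    in_recipe = mentioned & recipe
    not_in_recipe = mentioned - recipe
    if in_recipe and not_in_recipe:
        return "OLDANDNEW"  # candidate for replacement or specialisation
    if in_recipe:
        return "OLD"
    return "NEW"


recipe = ["chicken", "cream", "onion"]
print(classify_comment(["cream", "yoghurt"], recipe))  # one old, one new ingredient
```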
CommunityCook: Crawling
Crawling of a large German cooking community:
About 76,000 recipes with 286,000 related comments
HTML source code
Extraction of necessary data by building filters with the help of an open source tool (HTMLParser)
Recipe title
Single ingredients (amount, measurement, name)
Comments
Statistics
Saved into a database
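As a rough illustration of the filter-and-store step, the sketch below uses Python's standard `html.parser` and `sqlite3` modules instead of the HTMLParser tool named on the slide. The page markup, the CSS class names and the table layout are all hypothetical; the real community pages and database schema are not shown in the slides.

```python
import sqlite3
from html.parser import HTMLParser

# Hypothetical page markup; the real community pages will look different.
PAGE = """
<h1 class="title">Chicken curry</h1>
<div class="comment">I used yoghurt instead of cream.</div>
<div class="comment">Very tasty!</div>
"""


class RecipeFilter(HTMLParser):
    """Collects the recipe title and the comment texts from one page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.comments = []
        self._target = None  # field that the current text node belongs to

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if tag == "h1" and css_class == "title":
            self._target = "title"
        elif tag == "div" and css_class == "comment":
            self._target = "comment"

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        if self._target == "title":
            self.title = text
        elif self._target == "comment":
            self.comments.append(text)

    def handle_endtag(self, tag):
        self._target = None


page_filter = RecipeFilter()
page_filter.feed(PAGE)

# Store the extracted comments in an in-memory database (assumed schema).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE comment (recipe_title TEXT, body TEXT)")
db.executemany(
    "INSERT INTO comment VALUES (?, ?)",
    [(page_filter.title, c) for c in page_filter.comments],
)
rows = db.execute("SELECT recipe_title, body FROM comment").fetchall()
print(rows)
```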
CommunityCook: Text Mining case bases
Configuring e:IAS with two case bases: recipe, comment
Case representation is based on the modelled ingredients of the CookIIS knowledge model
Use e:IAS TextMiner to fill cases with concepts from text
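Filling cases with concepts from text can be pictured as a synonym lookup. The Python sketch below is a simplified stand-in for the e:IAS TextMiner, whose actual behaviour is not described in the slides; the `SYNONYMS` table is a made-up excerpt, and only single-word surface forms are matched.

```python
import re

# Hypothetical excerpt of a knowledge model: surface form -> concept.
SYNONYMS = {
    "cream": "cream",
    "sahne": "cream",
    "yoghurt": "yoghurt",
    "yogurt": "yoghurt",
    "joghurt": "yoghurt",
    "chicken": "chicken",
}


def extract_concepts(text):
    """Return the modelled concepts mentioned in `text`, in order of first mention."""
    found = []
    for token in re.findall(r"\w+", text.lower()):
        concept = SYNONYMS.get(token)
        if concept is not None and concept not in found:
            found.append(concept)
    return found


print(extract_concepts("I took yogurt instead of the cream"))
```

Because English and German synonyms map to the same concept, a recipe case and a comment case become directly comparable.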
CommunityCook: perform Classification
In the next step we retrieved one recipe and all comments relating to that recipe
Each comment was classified into one of the three categories
Additionally, we tried to find phrases in the text that support and specify the classification
We assigned a score to determine the confidence of the classification
If a pair of ingredients of the same class is found, we also analyse whether one concept is the parent concept of the other
− If so: no adaptation, but specialisation
About 35,000 comments were classified as OLDANDNEW, 16,000 of them with the subcategory “adaptation”
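The distinction between adaptation and specialisation can be sketched as a check on class membership and parent relation. The Python code below is an illustrative assumption, not the original rules; the `MODEL` excerpt and the function name are hypothetical, and the phrase matching and confidence scoring mentioned above are left out because the slides give no details.

```python
# Hypothetical model excerpt: concept -> (class, parent concept or None).
MODEL = {
    "cheese": ("dairy", None),
    "parmesan": ("dairy", "cheese"),
    "cream": ("dairy", None),
    "yoghurt": ("dairy", None),
    "chicken": ("meat", None),
}


def subcategory(old, new):
    """Subcategory for an OLDANDNEW pair of one recipe and one new ingredient."""
    old_class, old_parent = MODEL[old]
    new_class, new_parent = MODEL[new]
    if old_class != new_class:
        return None  # different classes: no suggestion derived in this sketch
    if old_parent == new or new_parent == old:
        return "specialisation"  # one concept is the parent of the other
    return "adaptation"


print(subcategory("cheese", "parmesan"))
print(subcategory("cream", "yoghurt"))
```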
CommunityCook: Aggregation
One way:
Aggregation of all classified comments belonging to the recipe
We counted the number of same classifications per recipe and aggregated the score by calculating the average and assigning a bonus for every classification
Second way:
We also aggregated all classified ingredients without regard to the recipe (statistical)
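Both aggregation ways can be sketched with one grouping function. This Python example is an assumption about how the described counting, averaging and bonus could fit together; the bonus value `BONUS`, the tuple layout and the function name are invented for illustration, since the slides do not state them.

```python
from collections import defaultdict

BONUS = 0.05  # assumed bonus per supporting comment; the slides give no value


def aggregate(classified, per_recipe=True):
    """Group (recipe_id, old, new, score) tuples and combine their scores.

    Returns {key: (count, average score + bonus per classification)}.
    """
    scores = defaultdict(list)
    for recipe_id, old, new, score in classified:
        # First way: key includes the recipe; second way: ingredients only.
        key = (recipe_id, old, new) if per_recipe else (old, new)
        scores[key].append(score)
    return {
        key: (len(vals), sum(vals) / len(vals) + BONUS * len(vals))
        for key, vals in scores.items()
    }


comments = [
    (1, "cream", "yoghurt", 0.5),
    (1, "cream", "yoghurt", 0.5),
    (2, "cream", "yoghurt", 0.8),
]
print(aggregate(comments))                    # per recipe
print(aggregate(comments, per_recipe=False))  # independent of the recipe
```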
CommunityCook: Realization
Transformation of data:
Building of adaptation suggestions as database rows so that they can be retrieved easily
With regard to the recipe and without (“independent”)
CommunityCook: Integration into CookIIS
6200 different adaptation suggestions are available for 570 different ingredients
Using the two most common adaptation suggestions per ingredient (without regard to the recipe) to create adaptation suggestions
Integration into CookIIS workflows:
− If no adaptation suggestion is created with community data, the model-based adaptation is used
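The fallback behaviour of the workflow can be sketched as follows. This Python example is a simplified assumption, not the CookIIS workflow itself: the data structure `community_counts`, the counts in it and the stand-in `model_based` function are hypothetical.

```python
from collections import Counter


def suggest(unwanted, community_counts, model_based, top_n=2):
    """Prefer the most common community replacements, else fall back to the model."""
    counts = community_counts.get(unwanted)
    if counts:
        return [ingredient for ingredient, _ in counts.most_common(top_n)]
    return model_based(unwanted)


# Hypothetical aggregated community data: ingredient -> replacement counts.
community = {"cream": Counter({"yoghurt": 12, "milk": 7, "creme fraiche": 3})}


def model_based(ingredient):
    """Stand-in for the taxonomy-based adaptation of CookIIS."""
    return ["(taxonomy neighbours of " + ingredient + ")"]


print(suggest("cream", community, model_based))  # community data available
print(suggest("tofu", community, model_based))   # falls back to the model
```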
CommunityCook: Realization
Query: Chicken, but no cream
Starting Evaluation
First look:
The class “supplement” of the knowledge model
− Too many different kinds of ingredients are in this class, so that the adaptation suggestions are not adequate
The class “basic” of the knowledge model
− Basic ingredients like flour or egg are just hard to replace
Both classes are not in the review and not used in CookIIS
Two different evaluations:
One evaluation to review the classification scheme
− Do the classified ingredients represent what was expressed in the original comment?
One evaluation to review the extracted knowledge
− Are the created adaptation suggestions applicable?
Evaluation
Evaluation of the extracted knowledge:
Expert survey
− Real chefs review the adaptation suggestions for recipes
− Questionnaire with recipe and adaptation suggestions
− One adaptation suggestion that was extracted from comments belonging to that recipe (“dependent”)
− Two adaptation suggestions without regard to the recipe (“independent”), each with two ingredients as replacement suggestion (as in CookIIS)
50 Questionnaires with 50 dependent and 100 pairs of independent ingredients
Evaluation: overall Applicability
Evaluation 1st vs. 2nd suggestion
Only 11 of the 100 independent adaptation suggestions included no ingredient that can be used as substitution
Evaluation: Quality
Future work
Further improvements on the knowledge model
Usage of adaptation suggestions that were extracted from recipes similar to the current one
Adding some semantic analysis to improve accuracy
Usage of comments with other classifications for building variations of recipes
Check the applicability in other domains
Related work
Systems that give preparation advice for meals:
CHEF [Hammond 1986]
JULIA [Hinrichs 1992]
Adaptation knowledge acquisition:
DIAL [Leake et al. 1996]
CABAMAKA [d'Aquin et al. 2007]
IAKA [Cordier et al. 2008]
Using the web as knowledge source in CBR:
SEASALT [Bach et al. 2007]
EDIR [Plaza 2008]
Conclusion
Adaptation knowledge is hard to acquire
The World Wide Web is a large source of knowledge
CommunityCook is a system that extracts adaptation knowledge from web communities in the domain of cooking
− It uses an existing knowledge model
The evaluation shows the applicability of the extracted knowledge
Thank you for your attention!
Questions?
Literature
[Bach et al., 2007] Kerstin Bach, Meike Reichle, and Klaus-Dieter Althoff. A domain independent system architecture for sharing experience. In Alexander Hinneburg, editor, Proceedings of LWA 2007, Workshop Wissens- und Erfahrungsmanagement, pages 296–303, Sep. 2007.
[Cordier et al., 2008] Amélie Cordier, Béatrice Fuchs, Léonardo Lana de Carvalho, Jean Lieber, and Alain Mille. Opportunistic acquisition of adaptation knowledge and cases - the IAKA approach. In Althoff et al. [2008], pages 150–164.
[d’Aquin et al., 2007] Mathieu d’Aquin, Fadi Badra, Sandrine Lafrogne, Jean Lieber, Amedeo Napoli, and Laszlo Szathmary. Case base mining for adaptation knowledge acquisition. In Manuela M. Veloso, editor, IJCAI, pages 750–755. Morgan Kaufmann, 2007.
[Hammond, 1986] Kristian J. Hammond. Chef: A model of case-based planning. In American Association for Artificial Intelligence, AAAI-86, Philadelphia, pages 267–271, 1986.
[Hinrichs, 1992] Thomas R. Hinrichs. Problem solving in open worlds. Lawrence Erlbaum, 1992.
[Leake et al., 1996] D. Leake, A. Kinley, and D. Wilson. Acquiring case-adaptation knowledge: A hybrid approach. In Proceedings of the Thirteenth National Conference on Artificial Intelligence, pages 684–689. AAAI Press, 1996.
[Minor, 2006] M. Minor. Erfahrungsmanagement mit fallbasierten Assistenzsystemen. Dissertation, Humboldt-Universität zu Berlin, 2006.
[Plaza, 2008] Enric Plaza. Semantics and experience in the future web. In Althoff et al. [2008], pages 44–58. Invited talk.
[Schmidt et al., 2003] R. Schmidt, O. Vorobieva, and L. Gierl. Case-based adaptation problems in medicine. In U. Reimer, editor, Proceedings of WM2003: Professionelles Wissensmanagement – Erfahrungen und Visionen. Köllen-Verlag, 2003.