Lecture Notes in Computer Science 9103
Commenced Publication in 1973
Founding and Former Series Editors: Gerhard Goos, Juris Hartmanis, and Jan van Leeuwen
Editorial Board
David Hutchison, Lancaster University, Lancaster, UK
Takeo Kanade, Carnegie Mellon University, Pittsburgh, PA, USA
Josef Kittler, University of Surrey, Guildford, UK
Jon M. Kleinberg, Cornell University, Ithaca, NY, USA
Friedemann Mattern, ETH Zurich, Zürich, Switzerland
John C. Mitchell, Stanford University, Stanford, CA, USA
Moni Naor, Weizmann Institute of Science, Rehovot, Israel
C. Pandu Rangan, Indian Institute of Technology, Madras, India
Bernhard Steffen, TU Dortmund University, Dortmund, Germany
Demetri Terzopoulos, University of California, Los Angeles, CA, USA
Doug Tygar, University of California, Berkeley, CA, USA
Gerhard Weikum, Max Planck Institute for Informatics, Saarbrücken, Germany
More information about this series at http://www.springer.com/series/7409
Chris Biemann • Siegfried Handschuh • André Freitas • Farid Meziane • Elisabeth Métais (Eds.)
Natural Language Processing and Information Systems
20th International Conference on Applications of Natural Language to Information Systems, NLDB 2015
Passau, Germany, June 17–19, 2015
Proceedings
Editors
Chris Biemann, Technische Universität Darmstadt, Darmstadt, Germany
Siegfried Handschuh, Universität Passau, Passau, Germany
André Freitas, Universität Passau, Passau, Germany
Farid Meziane, University of Salford, Salford, UK
Elisabeth Métais, Conservatoire National des Arts et Métiers, Paris, France
ISSN 0302-9743
ISSN 1611-3349 (electronic)
Lecture Notes in Computer Science
ISBN 978-3-319-19580-3
ISBN 978-3-319-19581-0 (eBook)
DOI 10.1007/978-3-319-19581-0
Library of Congress Control Number: 2015939997
LNCS Sublibrary: SL3 – Information Systems and Applications, incl. Internet/Web, and HCI
Springer Cham Heidelberg New York Dordrecht London
© Springer International Publishing Switzerland 2015
This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made.
Printed on acid-free paper
Springer International Publishing AG Switzerland is part of Springer Science+Business Media (www.springer.com)
Preface
We are living in a fascinating period, when the applications of natural language processing (NLP) are going mainstream. The maturity of the field has reached a stage in which its solidity can be witnessed in the adoption and uptake of its technologies outside the core NLP academic community. The success and visibility of software systems such as IBM Watson, Siri, and Google Knowledge Graph are driving increased interest in the field and the adoption of NLP in diverse areas.
The past two decades in which NLDB has been active were fundamental for the establishment of the foundations and the further maturation of this field, and NLDB can claim its share of the contributions. During this time, NLDB has been one of the main conferences for applications of NLP, in which researchers could find a sweet spot between rigorous scientific contributions and openness to new ideas and perspectives. Another notable characteristic that has been consistently present at NLDB is multidisciplinarity: NLDB has welcomed contributions applying NLP to different areas, and has helped to bridge NLP to different communities and application fields.
This year’s NLDB featured a special track on natural language and its connection to semantic and cognitive computing. Semantic computing aims at connecting the user’s information need with the meaning of content in a multidisciplinary fashion. Cognitive computing systems naturally interact with people and learn over time.
The NLDB 2015 program spanned a wide range of topics that all revolve around the use of natural language to access information. Three invited speakers covered topics as diverse as distributional semantics, computational humanities, and linked open data. Sessions comprised the following topics: unsupervised and semi-supervised machine learning, information extraction, event extraction and named entity recognition, multilingual alignment and translation, sentiment detection and user-generated content processing, indexing and the lexicon, query processing, question answering, speech processing, and dialog systems. Last, but not least, the special session on semantic and cognitive computing attracted position papers and featured a sponsored talk by IBM.
NLDB 2015 had a higher number of submissions compared with previous years. Out of 100 submissions, 18 papers were accepted as full papers (18 % acceptance rate), 15 as short papers (33 % cumulative acceptance rate), and 14 as posters and demos (47 % cumulative acceptance rate). A lot of work was involved in the careful selection of the papers and we would like to thank the Program Committee and reviewers for their hard work and dedication. We would also like to thank the invited speakers for their inspiring contributions to the program.
NLDB 2015 was held in the picturesque town of Passau. Dating back to Roman times, Passau is located at the German-Austrian border and is notable for being at the confluence of three rivers (the Danube, the Inn, and the Ilz). Its location at the heart of Europe and its proximity to different borders made Passau a natural meeting point for different cultures and influences, a fact that is embodied in its architecture, which harmoniously blends German, Austrian, and Italian styles. Despite its openness, Passau is deeply connected to its German and Bavarian traditions and symbols and offers a great entry point to German culture.
The conference was generously supported financially by the Insight Centre for Data Analytics, the SSIX EU Project, and IBM (Platinum Sponsors), by the University of Passau, which hosted the conference, and by 3rdPlace (Silver Sponsor). Adamantios Koumpis, Stephanie Pauli, Elfried Kronawitter, and Ulrike Holzapfel were fundamental to the organization of the conference, supporting the sponsorship and the local coordination of the event.
On its 20th anniversary, we would like to celebrate the effort behind the construction of the NLDB community, expressing our gratitude to the authors, Program Committee members, and organizers of the past editions of NLDB.
April 2015
Chris Biemann
Siegfried Handschuh
André Freitas
Farid Meziane
Elisabeth Métais
Organization
Organizing Committee
Program Chair
Chris Biemann TU Darmstadt, Germany
Conference Chairs
Siegfried Handschuh University of Passau, Germany
Elisabeth Métais CNAM, France
Farid Meziane University of Salford, UK
Organization Chair
André Freitas University of Passau, Germany
Tutorials and Workshop Chairs
Christin Seifert University of Passau, Germany
André Freitas University of Passau, Germany
Sponsorship Chair
Adamantios Koumpis
Senior Program Committee
Gerard de Melo
Valia Kordoni
Mathieu Lafourcade
Johannes Leveling
Els Lefever
Simone Paolo Ponzetto
Mathieu Roche
Maguelonne Teisseire
Christina Unger
Torsten Zesch
Michael Zock
Program Committee
Hidir Aras
Imran Sarwar Bajwa
Pierpaolo Basile
Nicolas Béchet
Johan Bos
Gosse Bouma
Mihaela Bornea
Yacine Rezgui
Sandra Bringay
Cornelia Caragea
Diego Ceccarelli
Christian Kop
Philipp Cimiano
Christian Chiarcos
Kostadin Cholakov
Ernesto William De Luca
Kees van Deemter
Bart Desmet
Olivier Ferret
Antske Fokkens
Vladimir Fomichov
Thierry Fontenelle
Debasis Ganguly
Yaakov Hacohen-Kerner
Sebastian Hellmann
Michael Herweg
Helmut Horacek
Dino Ienco
Ashwin Ittoo
Paul Johannesson
Richard Johansson
Epaminondas Kapetanios
Saurabh Kataria
Sophia Katrenko
Zoubida Kedad
Eric Kergosien
Valia Kordoni
Leila Kosseim
Jochen Leidner
Deryle W. Lonsdale
Cédric Lopez
John McCrae
Marie-Jean Meurs
Luisa Mich
Claudiu Mihaila
Shamima Mithun
Andres Montoyo
Andrea Moro
Rafael Muñoz
Guenter Neumann
Jan Odijk
Alexander Panchenko
Heiko Paulheim
Davide Picca
Pascal Poncelet
Violaine Prince
Gábor Prószéky
Behrang Qasemizadeh
Shaolin Qu
Reinhard Rapp
Martin Riedl
André Freitas
Eric Ringger
Mathieu Roche
Mike Rosner
Paolo Rosso
Patrick Saint Dizier
Bahar Sateli
Roman Schneider
Khaled Shaalan
Max Silberztein
Vijayan Sugumaran
Krishnaprasad Thirunarayan
Juan Trujillo
Dan Tufis
L. Alfonso Ureña-López
Sunil Vadera
Panos Vassiliadis
Andreas Vlachos
Joachim Wagner
Tonio Wandmacher
Feiyu Xu
Wlodek Zadrozny
Fabio Massimo Zanzotto
Erqiang Zhou
Fig. 1. NLDB 2015 sponsors.
Sponsors
The NLDB organizers are grateful for the NLDB sponsors and their generous contributions:
Insight Centre for Data Analytics, Ireland (Platinum), SSIX EU Project (Platinum), IBM (Platinum), Lionbridge (Gold), and 3rdPlace (Silver).
Contents
Information Extraction
Improving Supervised Classification Using Information Extraction . . . 3
Mian Du, Matthew Pierce, Lidia Pivovarova, and Roman Yangarber
Supervised Machine Learning Techniques to Detect TimeML Events in French and English . . . 19
Béatrice Arnulphy, Vincent Claveau, Xavier Tannier, and Anne Vilnat
Distributional Semantics
In Defense of Word Embedding for Generic Text Representation . . . 35
Guy Lev, Benjamin Klein, and Lior Wolf
Using Distributed Word Representations and mRMR Discriminant Analysis for Multilingual Text Summarization . . . 51
Houda Oufaida, Philippe Blache, and Omar Nouali
Combining Pattern-Based and Distributional Similarity for Graph-Based Noun Categorization . . . 64
Michael Wiegand, Benjamin Roth, and Dietrich Klakow
Acquiring a Large Scale Polarity Lexicon Through Unsupervised Distributional Methods . . . 73
Giuseppe Castellucci, Danilo Croce, and Roberto Basili
Querying and Question Answering Systems
Query Refinement Using Conversational Context: A Method and an Evaluation Resource . . . 89
Maryam Habibi and Andrei Popescu-Belis
Applying Semantic Parsing to Question Answering Over Linked Data: Addressing the Lexical Gap . . . 103
Sherzod Hakimov, Christina Unger, Sebastian Walter, and Philipp Cimiano
Pragmatic Query Answering: Results from a Quantitative Evaluation . . . 110
Jon Scott Stevens, Anton Benz, Sebastian Reuße, Ralf Klabunde, and Lisa Raithel
What was the Query? Generating Queries for Document Sets with Applications in Cluster Labeling . . . 124
Matthias Hagen, Maximilian Michel, and Benno Stein
Context-Aware NLP
Using Context-Aware and Semantic Similarity Based Model to Enrich Ontology Concepts . . . 137
Zenun Kastrati, Sule Yildirim Yayilgan, and Ali Shariq Imran
NADIA: A Simplified Approach Towards the Development of Natural Dialogue Systems . . . 144
Markus M. Berg
Cognitive and Semantic Computing
How to Talk to a Cognitive Computer . . . 153
Csaba Veres
Comparing Recursive Autoencoder and Convolutional Network for Phrase-Level Sentiment Polarity Classification . . . 160
Johannes Jurgovsky and Michael Granitzer
The Interplay of Language Processing, Reasoning and Decision-Making in Cognitive Computing . . . 167
Sergei Nirenburg and Marjorie McShane
Towards Benevolent Sales Assistants in Retailing Scenarios . . . 180
Sabine Janzen and Wolfgang Maass
Sentiment and Opinion Analysis
A Rule-Based Approach to Implicit Emotion Detection in Text . . . 197
Orizu Udochukwu and Yulan He
Deciphering Review Comments: Identifying Suggestions, Appreciations and Complaints . . . 204
Sachin Pawar, Nitin Ramrakhiyani, Girish K. Palshikar, and Swapnil Hingmire
Associating Intent with Sentiment in Weblogs . . . 212
Mark Kröll and Markus Strohmaier
PSO-ASent: Feature Selection Using Particle Swarm Optimization for Aspect Based Sentiment Analysis . . . 220
Deepak Kumar Gupta, Kandula Srikanth Reddy, Shweta, and Asif Ekbal
Improving Spanish Polarity Classification Combining Different Linguistic Resources . . . 234
Eugenio Martínez-Cámara, Fermín L. Cruz, M. Dolores Molina-González, M. Teresa Martín-Valdivia, F. Javier Ortega, and L. Alfonso Ureña-López
Information Extraction and Social Media
Tree-Structured Named Entities Extraction from Competing Speech Transcriptions . . . 249
Davy Weissenbacher and Christian Raymond
Interactive Learning with TREE: Teachable Relation and Event Extraction System . . . 261
Maya Tydykov, Mingzhi Zeng, Anatole Gershman, and Robert Frederking
Identification and Ranking of Event-Specific Entity-Centric Informative Content from Twitter . . . 275
Debanjan Mahata, John R. Talburt, and Vivek Kumar Singh
Automatic Classification and PLS-PM Modeling for Profiling Reputation of Corporate Entities on Twitter . . . 282
Jean-Valère Cossu, Eric Sanjuan, Juan-Manuel Torres-Moreno, and Marc El-Bèze
NLP and Usability
An Adaptable and Personalised E-Learning System Based on Free Web Resources . . . 293
Eiman Aeiad and Farid Meziane
A Controlled Natural Language for Business Intelligence Monitoring . . . 300
Christian Colombo, Jean-Paul Grech, and Gordon J. Pace
Text Summarization and Speech Synthesis for the Automated Generation of Personalized Audio Presentations . . . 307
Séamus Lawless, Peter Lavin, Mostafa Bayomi, João P. Cabral, and M. Rami Ghorab
Text Classification and Extraction
Unsupervised Classification of Translated Texts . . . 323
Sergiu Nisioi
A Language-Independent Method for Detection and Correction of Alignment Errors in Parallel Corpora . . . 335
Katarzyna Niżałowska and Urszula Markowska-Kaczmar
High-Precision Person Name Extraction from Turkish Texts Using Wikipedia . . . 347
Dilek Küçük and Doğan Küçük
A Hybrid Approach for Extracting Arabic Persons’ Names and Resolving Their Ambiguity from Twitter . . . 355
Omnia H. Zayed and Samhaa R. El-Beltagy
Extracting Relations from Unstructured Text Sources for Music Recommendation . . . 369
Mohamed Sordo, Sergio Oramas, and Luis Espinosa-Anke
Posters and Demonstrations
Simulating Misreading . . . 385
Armin Hoenen
(German) Language Processing for Lucene . . . 390
Bastian Entrup
Optimized Uyghur Segmentation for Statistical Machine Translation . . . 395
Chenggang Mi, Yating Yang, Rui Dong, Xi Zhou, Lei Wang, Xiao Li, Tonghai Jiang, and Turghun Osman
Management and Publishing of Multimedia Dictionary of the Czech Sign Language . . . 399
Adam Rambousek and Aleš Horák
Automatic Detection of Modality with ITGETARUNS . . . 404
Rodolfo Delmonte
Gathering Knowledge for Question Answering Beyond Named Entities . . . 412
Piotr Przybyła
MaNER: A MedicAl Named Entity Recogniser . . . 418
Isabel Moreno, Paloma Moreda, and M.T. Romá-Ferri
Upper Bound for Cross-Lingual Concept Mapping with External Translation Resources . . . 424
Mamoun Abu Helou and Matteo Palmonari
Generating Logical Representations for Natural Language Requirements Using Syntactic Dependencies and Norm Analysis Patterns . . . 432
Richa Sharma and K.K. Biswas
Random Indexing Revisited . . . 437
Behrang QasemiZadeh
On Developing Extraction Rules for Mining Informal Scientific References from Altmetric Data Sources . . . 443
Waqas Khawaja, Michael Taylor, and Brian Davis
Lemonade: A Web Assistant for Creating and Debugging Ontology Lexica . . . 448
Mariano Rico and Christina Unger
A Comparative Study on Twitter Sentiment Analysis: Which Features are Good? . . . 453
Fajri Koto and Mirna Adriani
Author Index . . . 459