SemanticMining: Major Events July - December 2005

  • Workshop on Foundations of Clinical Terminologies and Classifications
      • Location: Timișoara, Romania, April 8, 2006
      • Collocation: EFMI Special Topic Conference
      • Endorsement: EFMI, GMDS, SemanticMining
      • Local Organizer: George Mihalas
      • Scientific Chairs: Stefan Schulz, Jeremy Rogers. SPC to be defined
      • Invited Speakers: Olivier Bodenreider, N.N.
      • Submission Deadline: Jan 15
      • Bursaries by Semantic Mining

  • WP 20: Research Activity Multilingual Medical Dictionary

  • WP20 Partners contributing to activities in 2005 Jul-Dec
      • UKLFR: Freiburg University Hospital, Medical Informatics
      • JENA: University of Jena, Computational Linguistics
      • LIU IMT: Linköping Univ., Medical Informatics
      • LIU IDA: Linköping Univ., Computer Science
      • UGOT: Göteborg University
      • SU: Sahlgrenska Hospital, Göteborg
      • DIM: Geneva Univ. Hospital, Med. Informatics
      • OU: Open University
      • INSERM: Paris, Med. Informatics

  • Revised Subtasks of WP20
      • Development and Implementation of Lexical Acquisition Methodologies
      • Population of Medical Subword Lexicon
      • Specification and Acquisition of Multilingual Lexical Resources
      • Specification and Acquisition of Multilingual Corpora

  • WP 20 achievements
      • Milestone 4 reached: interchange format for a multi-lingual medical dictionary, multi-lingual medical dictionary in at least five languages with more than 20,000 entries
      • WP20 workshop in September (Paris)
      • Specification of Common Link Format
      • Ongoing enhancement of the MorphoSaurus Lexicon
      • Enhancement of the MorphoEditWeb lexicon editor
      • Use of standardized IR performance measurements to assess the progress in the lexicon
      • Research on subword cognate acquisition
      • Ongoing feasibility studies and prototypical implementations in collaboration with German partners (industry, government)
      • Implementation of corpus exchange guideline

  • WP 20 achievements (continued)
      • Use of NLG techniques for the automated generation of procedure descriptions out of a formalized procedure ontology
      • Automatic annotation of medical corpus with morphosyntactic and semantic information
      • Restructuring of LEXIN lexicon (20,000 entries) into a medically-oriented resource
      • Compound decomposition, named entity recognition, and acronym decomposition for Swedish
      • Adaptation of ITools for French terminology extraction
      • Continuation of preparation of parallel French-English medical corpus
      • Semi-automatic enrichment of French subword lexicon by exploiting existing subword lexicons in other languages

  • WP 20 Cross-WP activities
      • WP21 and WP26: Use of Morphosaurus indexing for content retrieval in EHR archetype descriptions
      • WP21: Joint proposal on biomedical processes within biomedical ontologies
      • WP22: Translation of SNOMED CT into Swedish
      • WP22: Use of SNOMED Microglossary for a bilingual English-Russian lexicon
      • WP24: Elaboration of a white paper on token and sentence segmentation
      • WP23 (planned): Acquisition of Multilingual Terms from Lab Medicine

  • WP20 Strategic Planning
      • Using the common interchange and link formats, multilingual linking/merging of lexicons plus semantic classes from subword lexicons
      • Corpora pool as new deliverable for 2006
      • Extension of the parallel corpora alignment for new language pairs
      • Ongoing population and cleansing of subword lexicons, extension of the corpus-based validation of subword lexicons to new language pairs
      • Cross-validation of lexicons (e.g. removing duplicates)
      • IR studies with new language pairs
      • Experimental addition of Italian and Russian to the subword lexicon using automated
      • Workshop at LREC (accepted)
      • Tutorial at MIE

  • WP 20 Mobility
      • Rahil Qamar: from UOM to UKLFR
      • Vincent Claveau: from INSERM to UKLFR
      • Mikael Nyström: from LIU to UKLFR
      • Louise Deléger: from INSERM to LIU
      • Caspar Hasenclever: from UKLFR to JENA
      • Michael Poprat: from JENA to UKLFR
      • Joachim Wermter: from JENA to UKLFR
      • Harald Kirsch: from EBI to UKLFR

  • WP 20 Joint Publications
      • R.H. Baud, M. Nyström, L. Borin, R. Evans, S. Schulz, P. Zweigenbaum: Interchanging Lexical Information for a Multilingual Dictionary. AMIA 2005 Annual Symposium. Accepted for publication.
      • K. Markó, S. Schulz, U. Hahn: Automatic Lexicon Acquisition for a Medical Cross-Language Information Retrieval System. Proceedings of the XIX International Congress of the European Federation for Medical Informatics (MIE '05), Geneva, Switzerland. 2005: 829-834.
      • K. Markó, S. Schulz, O. Medelyan, U. Hahn: Bootstrapping Dictionaries for Cross-Language Information Retrieval. Proceedings of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '05), Salvador, Brazil. 2005: 528-535.
      • K. Markó, S. Schulz, U. Hahn: MorphoSaurus - Design and Evaluation of an Interlingua-based, Cross-language Document Retrieval Engine for the Medical Domain. Methods of Information in Medicine. 4/2005(44): 537-545.
      • K. Markó, S. Schulz, U. Hahn: Unsupervised Multilingual Word Sense Disambiguation via an Interlingua. Proceedings of the 20th National Conference on Artificial Intelligence (AAAI '05), Pittsburgh, Pennsylvania. 2005: 1075-1080.
      • M. Poprat, U. Hahn: Enough is Enough: Estimating Upper Bounds of the Size of Training Corpora for Unsupervised PP Attachment Disambiguation. Proceedings of the Fifth International Conference on Recent Advances in Natural Language Processing (RANLP-2005).
      • K. Markó, S. Schulz, U. Hahn: Multilingual Lexical Acquisition by Bootstrapping Cognate Seed Lexicons. Proceedings of the Fifth International Conference on Recent Advances in Natural Language Processing (RANLP-2005).
      • U. Hahn, P. Daumke, S. Schulz, K. Markó: Cross-Language Mining for Acronyms and their Completions from the Web. Proceedings of the 8th International Conference on Discovery Science (DS '05), Singapore. 2005.
      • S. Schulz, K. Markó, R. L. de Andrade, E. Pacheco, P. Nohama, U. Hahn, M. Romacker: The Morphosaurus Medical Subword Lexicon. Lexicographic and Semantic Aspects. Proceedings of the 3rd Workshop em Tecnologia da Informação e da Linguagem Humana (TIL '05), São Leopoldo, Brasil. 2005.
      • Mikael Nyström, Magnus Merkel, Lars Ahrenberg, Michael Petterstedt, Håkan Petersson & Hans Åhlfeldt: Generering av ett medicinskt engelskt-svenskt lexikon med hjälp av interaktiv ordlänkning. Accepted for Svenska Läkaresällskapets riksstämma (Annual Meeting of the Swedish Society of Medicine).
      • K. Markó, P. Daumke, S. Schulz, U. Hahn: Automatische Generierung einer sprachübergreifenden Akronymdatenbank. 50. Jahrestagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie (gmds), Freiburg, 11. - 15. September 2005 (Annual Meeting of the German Society of Medical Informatics, Biometry and Epidemiology).
      • M. Poprat, K. Markó, U. Hahn: Automatische Klassifikation medizinischer Dokumente nach Sprache und Zielgruppe für Text-Retrieval-Systeme. 50. Jahrestagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie (gmds), Freiburg, 11. - 15. September 2005 (Annual Meeting of the German Society of Medical Informatics, Biometry and Epidemiology).
      • Website: updated in August 2005