Creating a medical English-Swedish dictionary using interactive word alignment

Publisher
BioMed Central
Publication Date
Oct 12, 2006
Source
PMC
Disciplines
  • Mathematics
  • Medicine
License
Unknown

Abstract

BMC Medical Informatics and Decision Making (BioMed Central)
Research article, Open Access

Authors: Mikael Nyström* [1], Magnus Merkel [2], Lars Ahrenberg [2], Pierre Zweigenbaum [3,4,5], Håkan Petersson [1] and Hans Åhlfeldt [1]

Addresses:
[1] Department of Biomedical Engineering, Linköpings universitet, SE-58185 Linköping, Sweden
[2] Department of Computer and Information Science, Linköpings universitet, SE-58183 Linköping, Sweden
[3] Assistance Publique-Hôpitaux de Paris, F-75683 Paris Cedex 14, France
[4] Inserm, U729, F-75270 Paris Cedex 06, France
[5] Inalco, CRIM, F-75343 Paris Cedex 07, France

Email: Mikael Nyström* - [email protected]; Magnus Merkel - [email protected]; Lars Ahrenberg - [email protected]; Pierre Zweigenbaum - [email protected]; Håkan Petersson - [email protected]; Hans Åhlfeldt - [email protected]
* Corresponding author

Background: This paper reports on a parallel collection of rubrics from the medical terminology systems ICD-10, ICF, MeSH, NCSP and KSH97-P and its use for semi-automatic creation of an English-Swedish dictionary of medical terminology. The methods presented are relevant for many West European language pairs other than English-Swedish.

Methods: The medical terminology systems were collected in electronic format in both English and Swedish, and the rubrics were extracted as parallel language pairs. First, interactive word alignment was used to create training data from a sample. The training data were then used in automatic word alignment to generate candidate term pairs. The last step was manual verification of the term pair candidates.

Results: A dictionary of 31,000 verified entries has been created in less than three man-weeks, with considerably less time and effort than a manual approach would need, and without compromising quality. As a side effect of our work we found 40 different translation problems in the terminology systems a
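
The candidate-generation step described under Methods can be illustrated with a small sketch. This is not the authors' alignment system: it swaps in a plain Dice co-occurrence score over parallel rubrics, and the function name `candidate_term_pairs`, the `min_score` threshold and the four toy rubric pairs are all assumptions made for illustration only. The output is a ranked list of word pairs that a human reviewer would still need to verify.

```python
from collections import Counter
from itertools import product


def candidate_term_pairs(parallel_rubrics, min_score=0.9):
    """Rank (english, swedish) word pairs by Dice co-occurrence.

    parallel_rubrics: iterable of (english_rubric, swedish_rubric) strings.
    Returns a list of (english_word, swedish_word, score) tuples,
    highest score first. Hypothetical helper, not the paper's method.
    """
    en_count = Counter()    # number of rubrics containing each English word
    sv_count = Counter()    # number of rubrics containing each Swedish word
    pair_count = Counter()  # number of rubric pairs containing both words

    for en, sv in parallel_rubrics:
        en_words = set(en.lower().split())
        sv_words = set(sv.lower().split())
        en_count.update(en_words)
        sv_count.update(sv_words)
        pair_count.update(product(en_words, sv_words))

    candidates = []
    for (e, s), n in pair_count.items():
        # Dice coefficient: 2 * joint count / (sum of individual counts)
        dice = 2 * n / (en_count[e] + sv_count[s])
        if dice >= min_score:
            candidates.append((e, s, dice))

    # Sort by descending score, then alphabetically for stable output
    return sorted(candidates, key=lambda c: (-c[2], c[0], c[1]))


# Toy parallel rubrics (illustrative only, not taken from the paper's data)
rubrics = [
    ("acute bronchitis", "akut bronkit"),
    ("chronic bronchitis", "kronisk bronkit"),
    ("acute sinusitis", "akut sinuit"),
    ("chronic sinusitis", "kronisk sinuit"),
]

for english, swedish, score in candidate_term_pairs(rubrics):
    print(f"{english}\t{swedish}\t{score:.2f}")
```

A pure co-occurrence score like this needs no training data, which is also its weakness: the interactive alignment step in the abstract exists precisely to supply training data that a statistic of this kind cannot learn from on its own.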
