
Cataloguing Spanish Medical Reports with UMLS Terms

Published: 16 October 2019

Abstract

UMLS (Unified Medical Language System) is one of the most comprehensive terminological resources for the medical domain. Tools that help catalogue medical reports with UMLS terms are therefore highly relevant, especially when the reports are written as unstructured free text. Tools such as MetaMap can annotate clinical texts with UMLS terms automatically. However, these tools typically work only on reports written in English, which seriously limits their applicability to other languages. In this paper, we describe an approach that mitigates this shortcoming by pipelining state-of-the-art machine translation services with automatic mapping tools. We demonstrate the feasibility of the approach by combining Google Translate with MetaMap and using the resulting pipeline to catalogue, with UMLS, a representative set of chest X-ray reports written in Spanish, corresponding to images taken from the Indiana Chest X-ray radiology corpus. The resulting cataloguing does not differ significantly in quality from that obtained by applying MetaMap directly to a similar set of reports written in English and selected from the same corpus.
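The translate-then-annotate pipeline described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: `catalogue_report`, `toy_translate`, and `toy_annotate` are hypothetical names, the two toy functions are stand-ins for Google Translate and MetaMap (neither service is actually called), and the concept identifiers are placeholders rather than real UMLS CUIs.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Concept:
    """A concept attached to a report (identifier plus preferred name)."""
    concept_id: str  # placeholder value here, NOT a real UMLS CUI
    name: str


# Any translation service and any English-only annotator can be plugged in.
Translator = Callable[[str], str]
Annotator = Callable[[str], List[Concept]]


def catalogue_report(report_es: str,
                     translate: Translator,
                     annotate: Annotator) -> List[Concept]:
    """Translate a Spanish report to English, then annotate the English text."""
    report_en = translate(report_es)
    return annotate(report_en)


# Toy stand-in for a machine translation service (assumption: phrase lookup).
TOY_PHRASES: Dict[str, str] = {
    "derrame pleural": "pleural effusion",
    "cardiomegalia": "cardiomegaly",
}

# Toy stand-in for an annotator's lexicon (assumption: placeholder identifiers).
TOY_LEXICON: Dict[str, Concept] = {
    "pleural effusion": Concept("PLACEHOLDER-1", "Pleural effusion"),
    "cardiomegaly": Concept("PLACEHOLDER-2", "Cardiomegaly"),
}


def toy_translate(text: str) -> str:
    """Replace known Spanish phrases with English ones; leave the rest as-is."""
    out = text.lower()
    for spanish, english in TOY_PHRASES.items():
        out = out.replace(spanish, english)
    return out


def toy_annotate(text: str) -> List[Concept]:
    """Return every lexicon concept whose term occurs in the text."""
    lowered = text.lower()
    return [concept for term, concept in TOY_LEXICON.items() if term in lowered]


if __name__ == "__main__":
    report = "Cardiomegalia leve. Derrame pleural izquierdo."
    for concept in catalogue_report(report, toy_translate, toy_annotate):
        print(concept.concept_id, concept.name)
```

Passing the translator and the annotator as plain callables keeps the two stages independent, so a real translation client or a real MetaMap wrapper could replace either stand-in without changing `catalogue_report`.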



    Published In

    TEEM'19: Proceedings of the Seventh International Conference on Technological Ecosystems for Enhancing Multiculturality
    October 2019
    1085 pages
    ISBN:9781450371919
    DOI:10.1145/3362789

    In-Cooperation

    • University of Salamanca

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. Clinical Reports
    2. Google Translate
    3. Medical knowledge
    4. MetaMap
    5. UMLS

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    TEEM'19

    Acceptance Rates

    Overall Acceptance Rate 496 of 705 submissions, 70%
