Trends in COVID-19 Publications: Streamlining Research Using NLP and LDA

Authors
  • Gupta, Akash (1)
  • Aeron, Shrey (2)
  • Agrawal, Anjali (3)
  • Gupta, Himanshu (4)
  • (1) Department of Engineering, University of Cambridge, Cambridge, United Kingdom
  • (2) Electrical Engineering and Computer Science, University of California, Berkeley, Berkeley, CA, United States
  • (3) Harmony School of Innovation – Sugar Land (High School), Sugar Land, TX, United States
  • (4) Valley Health System, Ridgewood, NJ, United States
Type
Published Article
Journal
Frontiers in Digital Health
Publisher
Frontiers Media S.A.
Publication Date
Jul 06, 2021
Volume
3
Identifiers
DOI: 10.3389/fdgth.2021.686720
Source
Frontiers
Disciplines
  • Digital Health
  • Original Research
License
Green

Abstract

Background: Research publications related to the novel coronavirus disease COVID-19 are rapidly increasing. However, current online literature hubs, even with artificial intelligence, are limited in identifying the complexity of COVID-19 research topics. We developed a comprehensive Latent Dirichlet Allocation (LDA) model with 25 topics using natural language processing (NLP) techniques on PubMed® research articles about “COVID.” We propose a novel methodology to develop and visualise temporal trends, and to improve existing online literature hubs. Our temporal analysis shows, for example, that the prominence of “Mental Health” and “Socioeconomic Impact” increased, that of “Genome Sequence” decreased, and that of “Epidemiology” remained relatively constant. Applying our methodology to LitCovid, a literature hub from the National Center for Biotechnology Information, we improved the breadth and depth of research topics by subdividing its pre-existing categories. Our topic model demonstrates that research on “masks” and “Personal Protective Equipment (PPE)” is skewed toward clinical applications, with a lack of population-based epidemiological research.
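The abstract does not say how topic prominence over time was computed. As a rough illustration only, the sketch below assumes that each article has a publication month and a vector of topic weights from an already fitted LDA model, and it averages those weights per month. The function name `topic_prominence_by_month`, the month strings, and the toy weights are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def topic_prominence_by_month(docs):
    """Average topic weight per month.

    docs: iterable of (month, weights) pairs, where `weights` maps a topic
    label to that document's topic probability (assumed to come from a
    fitted LDA model). A topic missing from a document counts as 0.
    """
    totals = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(int)
    for month, weights in docs:
        counts[month] += 1
        for topic, weight in weights.items():
            totals[month][topic] += weight
    # Divide each summed weight by the number of documents in that month.
    return {
        month: {topic: total / counts[month] for topic, total in topics.items()}
        for month, topics in sorted(totals.items())
    }


# Hypothetical toy data: four documents across two months.
docs = [
    ("2020-03", {"Genome Sequence": 0.7, "Mental Health": 0.1, "Epidemiology": 0.2}),
    ("2020-03", {"Genome Sequence": 0.5, "Mental Health": 0.1, "Epidemiology": 0.4}),
    ("2020-09", {"Genome Sequence": 0.2, "Mental Health": 0.5, "Epidemiology": 0.3}),
    ("2020-09", {"Genome Sequence": 0.2, "Mental Health": 0.5, "Epidemiology": 0.3}),
]

trends = topic_prominence_by_month(docs)
for month, shares in trends.items():
    print(month, {topic: round(share, 2) for topic, share in shares.items()})
```

In this toy data the average weight of "Genome Sequence" falls between the two months, that of "Mental Health" rises, and that of "Epidemiology" stays the same, which mirrors the kind of trend the abstract describes. Plotting each topic's monthly average as a line would be one simple way to visualise it.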
