Harnessing Google Health Trends API Data for Epidemiologic Research

Authors
  • Neumann, Krista
  • Mason, Susan M
  • Farkas, Kriszta
  • Santaularia, N Jeanie
  • Ahern, Jennifer
  • Riddell, Corinne A
Publication Date
Feb 24, 2023
Source
eScholarship - University of California
License
Unknown

Abstract

Interest in using internet search data, such as those from the Google Health Trends Application Programming Interface (GHT-API), to measure epidemiologically relevant exposures or health outcomes is growing because these data are accessible and timely. Researchers enter search term(s), a geography, and a time period, and the GHT-API returns a scaled probability of that search term, given all searches within the specified geography and time period. In this study, we detailed a method for using these data to measure a construct of interest in 5 iterative steps: first, identify phrases the target population may use to search for the construct of interest; second, refine candidate search phrases with incognito Google searches to improve sensitivity and specificity; third, craft the GHT-API search term(s) by combining the refined phrases; fourth, test search volume and choose geographic and temporal scales; and fifth, retrieve and average multiple samples to stabilize estimates and address missingness. An optional sixth step accounts for changes in total search volume by normalizing. We present a case study examining weekly state-level child abuse searches in the United States during the coronavirus disease 2019 pandemic (January 2018 to August 2020) as an application of this method and describe limitations.
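
The fifth step, averaging repeated samples, can be illustrated with a minimal sketch. It does not call the GHT-API. It assumes each sample has already been retrieved as a list of weekly values for one geography, and that a missing value is represented as `None`. The function name `average_samples`, the data layout, and the numbers are hypothetical and are not taken from the study.

```python
from statistics import mean
from typing import Optional, Sequence


def average_samples(
    samples: Sequence[Sequence[Optional[float]]],
) -> list[Optional[float]]:
    """Average repeated samples period by period, skipping missing values.

    Each inner sequence is one retrieved sample covering the same time
    periods. None marks a missing value (an assumption of this sketch).
    A period stays None only if it is missing in every sample.
    """
    if not samples:
        return []
    n_periods = len(samples[0])
    if any(len(s) != n_periods for s in samples):
        raise ValueError("all samples must cover the same time periods")

    averaged: list[Optional[float]] = []
    for values in zip(*samples):  # one tuple per time period
        observed = [v for v in values if v is not None]
        averaged.append(mean(observed) if observed else None)
    return averaged


# Hypothetical values: three samples of the same three weeks.
samples = [
    [10.0, None, 30.0],
    [14.0, 20.0, None],
    [12.0, None, None],
]
result = average_samples(samples)
```

Averaging over only the observed values means that a week missing from some samples can still receive an estimate from the others. That is one simple way to reduce missingness, and the study's own procedure may differ.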