Separating Algorithms from Questions and Causal Inference with Unmeasured Exposures: An Application to Birth Cohort Studies of Early BMI Rebound.

  • Aris, Izzuddin M (1)
  • Sarvet, Aaron L (2)
  • Stensrud, Mats J (3)
  • Neugebauer, Romain (4)
  • Li, Ling-Jun (5)
  • Hivert, Marie-France (1, 6)
  • Oken, Emily (1, 7)
  • Young, Jessica G (1, 2)
  • 1 Division of Chronic Disease Research Across the Lifecourse, Department of Population Medicine, Harvard Medical School and Harvard Pilgrim Health Care Institute, Boston, Massachusetts, USA.
  • 2 Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, USA.
  • 3 Swiss Federal Institute of Technology Lausanne, Lausanne, Switzerland.
  • 4 Kaiser Permanente Northern California Division of Research, Oakland, California, USA.
  • 5 Department of Obstetrics and Gynecology, Yong Loo Lin School of Medicine, National University of Singapore, Singapore.
  • 6 Diabetes Unit, Massachusetts General Hospital, Boston, Massachusetts, USA.
  • 7 Department of Nutrition, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, USA.
Published Article
American Journal of Epidemiology
Publication Date
Feb 10, 2021
DOI: 10.1093/aje/kwab029
PMID: 33565574


Observational studies reporting adjusted associations between childhood body mass index (BMI) rebound and subsequent cardiometabolic outcomes have often not given explicit attention to causal inference, including the definition of a target causal effect and the assumptions needed for unbiased estimation of that effect. Using data from 649 children in a Boston, Massachusetts-area cohort recruited in 1999-2002, we considered effects on adolescent cardiometabolic outcomes of stochastic interventions on a chosen subset of modifiable, yet unmeasured, exposures expected to be associated with early (before age 4 years) BMI rebound (a proxy). We considered assumptions under which these effects may be identified with available data. This leads to an analysis in which the proxy, rather than the exposure itself, acts as the exposure in the algorithm. We applied Targeted Maximum Likelihood Estimation, a doubly robust approach that naturally incorporates machine learning for nuisance parameters (e.g., the propensity score). For fat-mass index, we estimated a protective effect of an intervention that assigns modifiable exposures according to their distribution in the observational study among those without (vs. with) early BMI rebound (-1.39 kg/m²; 95% CI: -1.63, -0.72), but weaker or no effects for other cardiometabolic outcomes. Our results clarify distinctions between algorithms and causal questions, encouraging explicit thinking in causal inference with complex exposures.

© The Author(s) 2021. Published by Oxford University Press on behalf of the Johns Hopkins Bloomberg School of Public Health. All rights reserved.
