Abstract

We present a suite of experiments with a hierarchy of biogeochemical models of increasing complexity, coupled to an offline global ocean circulation model based on the “transport matrix method”. Biogeochemical model structures range from simple nutrient models to more complex nutrient-phytoplankton-zooplankton-detritus-DOP models. Model skill is assessed with several misfit functions with respect to observed phosphate and oxygen distributions. The metrics generally agree with one another; the exception is a cost function based on the relative model-data misfit. We show that changes in the parameters and/or structure of the models, especially those that alter particle export or the remineralization profile, affect subsurface and mesopelagic phosphate and oxygen, particularly in upwelling regions. Visual inspection of simulated biogeochemical tracer distributions and the evaluation of different cost functions both suggest that increasing the complexity of untuned, unoptimized models, simulated with parameters commonly used in large-scale model studies, does not necessarily improve performance. Instead, variations in individual model parameters may be of equal, if not greater, importance.