Fitting of magnetic resonance spectroscopy (MRS) data plays an important role in the quantification of metabolite concentrations. Many different spectral fitting packages are used by the MRS community. A fitting challenge was set up to allow comparison of fitting methods on the basis of performance and robustness. Twenty-eight synthetic datasets were generated. Short-echo-time PRESS spectra were simulated using ideal pulses for the common metabolites at mostly near-normal brain concentrations. Macromolecular contributions were also included. The challenges comprised variations in signal-to-noise ratio (SNR); lineshape type and width; concentrations of γ-aminobutyric acid, glutathione, and macromolecules; and the inclusion of artifacts and lipid signals to mimic tumor spectra. Twenty-six submissions were evaluated. Visually, most fitting packages performed well, with mostly noise-like residuals. However, striking differences in fit performance were found, with bias problems also evident for well-known packages. In addition, error bounds were often not appropriately estimated, and the confidence limits deduced from them were misleading. Soft constraints, as used in LCModel, were found to substantially influence the fitting results and their dependence on SNR. Substantial differences were found in the accuracy and precision of the fit results obtained by the different fitting packages. © 2021 International Society for Magnetic Resonance in Medicine.