Abstract We present a methodology that uses the data stored in R&D databases to support the design of new products. Models relating formulations to properties are built from these data and embedded in an optimization problem that represents the design task, together with constraints that guard against inaccurate predictions. The methodology is applied to obtain optimal brake fluid formulations that satisfy strict technical standards and market specifications. Mixture models are built with Principal Components Regression (PCR) and Partial Least Squares Regression (PLS), which are suited to systems with incomplete or redundant information, since the data are not guaranteed to be informative with respect to the models. The models are integrated into the design optimization problem, which is solved with Mixed Integer Linear Programming (MILP) techniques. Additional constraints restrict the solution to the region where the information is most reliable, avoiding inaccurate extrapolations that would otherwise require an excessive number of experiments to confirm predictions. The results obtained with this methodology showed good agreement with validation experiments. The models and optimization tools developed in this work are currently in use at the Oxiteno S.A. R&D laboratories and have reduced the time and effort needed to develop new formulations.
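
As an illustration only, a design problem of the kind summarized above could be written as the following mixed-integer linear program. The notation is assumed here and is not taken from the paper: $x_i$ is the fraction of component $i$, $y_i$ is a binary variable indicating whether component $i$ is used, $c_i$ is a hypothetical cost coefficient, $b_{k0}$ and $b_{ki}$ are the regression coefficients of the PCR or PLS model for property $k$, $P_{ia}$ are the PCR loadings (a PLS model would use its own weight matrix), $\bar{x}_i$ is the mean composition in the database, and $t_a^{\min}$, $t_a^{\max}$ are bounds on the scores observed in the data. The cost objective is also an assumption; the abstract does not state which objective is used.

```latex
\begin{aligned}
\min_{x,\,y}\quad & \sum_{i=1}^{n} c_i\, x_i \\
\text{s.t.}\quad
& \sum_{i=1}^{n} x_i = 1
  && \text{(mixture constraint)} \\
& p_k^{L} \le b_{k0} + \sum_{i=1}^{n} b_{ki}\, x_i \le p_k^{U},
  && k = 1,\dots,K \quad \text{(property specifications)} \\
& x_i^{L}\, y_i \le x_i \le x_i^{U}\, y_i,
  && i = 1,\dots,n \quad \text{(component selection)} \\
& t_a^{\min} \le \sum_{i=1}^{n} P_{ia}\,\bigl(x_i - \bar{x}_i\bigr) \le t_a^{\max},
  && a = 1,\dots,A \quad \text{(stay within the modeled region)} \\
& x_i \ge 0,\quad y_i \in \{0,1\},
  && i = 1,\dots,n
\end{aligned}
```

Every constraint is linear in $x$ and $y$, so the problem stays within the MILP class. The score bounds play the role of the constraints that keep the solution where the data are reliable: a candidate formulation whose scores fall outside the range seen in the database would be an extrapolation of the model.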