Background
Enterobacter sakazakii is an emergent pathogen associated with the ingestion of infant formula, and its accurate identification is important in both industrial and clinical settings. Bacterial species can be difficult to characterise accurately from complex biochemical datasets, and computer algorithms can potentially simplify the process.

Results
Artificial neural networks were applied to biochemical and 16S rDNA data derived from 282 strains of Enterobacteriaceae, including 189 E. sakazakii isolates, to identify key characteristics that could improve the identification of E. sakazakii. The resulting models achieved a predictive performance on blind (validation) data of 99.3% correct discrimination between E. sakazakii and closely related species for both phenotypic and genotypic data. Three main regions of the partial rDNA sequence were found to be key in discriminating the species. Comparison between E. sakazakii and other strains that were also constitutively positive for expression of the enzyme α-glucosidase gave a predictive performance of 98.7% for 16S rDNA sequence data and 100% for phenotypic data.

Conclusion
The computationally based methods developed here show a remarkable ability to reduce data dimensionality and complexity and to eliminate noise from the system, facilitating the speed and reliability of a potential strain identification system. Furthermore, the approaches described provide valuable information on the population structure and distribution of individual species, thus providing the foundations for novel assays and diagnostic tests for rapid identification of pathogens.
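
The abstract does not state how the 16S rDNA sequences were encoded for the networks or how the blind-validation score was computed. The following is a minimal sketch under stated assumptions: a one-hot encoding of aligned bases as network input, and a plain percent-correct score over held-out strains. The function names `one_hot_encode` and `percent_correct`, and the encoding itself, are illustrative assumptions rather than the authors' method.

```python
from typing import List, Sequence

BASES = "ACGT"  # assumed alphabet; gaps and ambiguity codes map to all zeros


def one_hot_encode(aligned_sequence: str) -> List[int]:
    """Turn an aligned DNA sequence into a flat 0/1 vector, four values per position."""
    vector: List[int] = []
    for base in aligned_sequence.upper():
        # One slot per known base; unknown symbols ('N', '-') yield four zeros.
        vector.extend(1 if base == known else 0 for known in BASES)
    return vector


def percent_correct(predicted: Sequence[str], actual: Sequence[str]) -> float:
    """Percentage of held-out strains whose predicted label matches the true label."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("label lists must be non-empty and of equal length")
    hits = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * hits / len(actual)
```

With an encoding of this kind, each alignment position maps to a fixed group of four inputs, so input weights or sensitivities can be traced back to sequence regions, which is one plausible way that discriminating regions such as those reported above could be located.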