Abstract Straightforward library search methods, which aim to identify (organic) compounds by comparing analytical data for continuous variables, are considered with respect to a definition of the similarity of data. In this context, the main objective of such a search method is simply the retrieval of the reference data of the unknown compound. The proposed similarity index has the form of a significance probability (P value), a quantity originating from the general theory of hypothesis testing, and can be calculated from a statistical model of the reproducibility of the quantities used for comparison. The index is defined in general terms and is intended to be applicable to library search methods for different types, or combinations, of analytical data. It is primarily designed for situations where the use of very large databases suffers from the generally low (interlaboratory) reproducibility of the data.
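As an illustration of how a similarity index in the form of a P value might be computed, the following is a minimal sketch and not the statistical model of this work. It assumes that each compared quantity carries an independent, normally distributed reproducibility error with a known standard deviation, and that the unknown and the reference are measured with the same standard deviation. Under those assumptions the scaled squared differences follow a chi-square distribution. The names `chi2_sf` and `similarity_p_value` are hypothetical.

```python
import math


def chi2_sf(x, k):
    """Upper-tail probability of a chi-square variable with k degrees of freedom.

    Starts from the closed forms for k = 1 and k = 2 and applies the
    recurrence Q(a + 1, h) = Q(a, h) + h**a * exp(-h) / Gamma(a + 1),
    where h = x / 2 and a = k / 2.
    """
    if k < 1:
        raise ValueError("k must be a positive integer")
    if x <= 0:
        return 1.0
    h = x / 2.0
    if k % 2 == 1:
        q, a = math.erfc(math.sqrt(h)), 0.5   # k = 1
    else:
        q, a = math.exp(-h), 1.0              # k = 2
    while a < k / 2.0 - 1e-9:
        q += h ** a * math.exp(-h) / math.gamma(a + 1.0)
        a += 1.0
    return q


def similarity_p_value(unknown, reference, sigma):
    """P value for the hypothesis that both data sets stem from one compound.

    Assumption (illustrative only): independent Gaussian errors with
    standard deviation sigma[i] on both measurements, so each difference
    has standard deviation sigma[i] * sqrt(2).
    """
    if not (len(unknown) == len(reference) == len(sigma)):
        raise ValueError("all inputs must have the same length")
    stat = sum(
        ((u - r) / (s * math.sqrt(2.0))) ** 2
        for u, r, s in zip(unknown, reference, sigma)
    )
    return chi2_sf(stat, len(unknown))
```

In a library search built on this sketch, reference entries would be ranked by decreasing P value. A value near 1 means the differences are well within the assumed reproducibility, and a value near 0 means they are unlikely to arise from measurement scatter alone.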