Abstract

The problem of connecting a number of different databases to produce an integrated information system has attracted considerable attention over the years, and various approaches have been developed to handle it. However, the general problem of gathering related information from a number of existing heterogeneous databases is complex because of differences in the representation and meaning of data across data sets. Many approaches to resolving this problem have been described, and some prototype systems have been built, but it is difficult to compare the effectiveness of these approaches and prototypes. This paper addresses the specific issue of assessing the generality of different approaches. To this end, it presents a framework for classifying the differences between data in different databases, and a test suite that can be used to evaluate and compare the extent to which different approaches handle different aspects of this heterogeneity.