Abstract

Introduction
Medical documentation is time-consuming, and the number of documentation requirements is growing. Improving documentation requires harmonization and standardization based on existing forms and medical concepts. Systematic analysis of forms, building on new methods for automated form comparison, can contribute to standardization. The objectives of this research are to quantify and compare data elements for breast and prostate cancer in order to discover similarities, differences and reuse potential between documentation sets. In addition, common data elements for each disease entity are to be identified by automated comparison of forms.

Materials and methods
A collection of 57 forms on prostate and breast cancer was transformed into the Operational Data Model (ODM). The forms came from quality management, registries, clinical documentation of two university hospitals (Erlangen, Münster), research datasets, certification requirements and trial documentation. The ODM files were semantically enriched with concept codes and analyzed with the compareODM algorithm. Comparison results were aggregated, and lists of common concepts were generated. Grid images, dendrograms and spider charts were used for illustration.

Results
Overall, 1008 data elements for prostate cancer and 1232 data elements for breast cancer were analyzed. Average routine documentation consists of 390 data elements per disease entity and site. Form comparison identified up to 20 comparable data elements in cancer conference forms from both hospitals. Urology forms share up to 53 comparable data elements with quality management forms and up to 21 with registry forms. Urology documentation of both hospitals shares up to 34 comparable items with international common data elements. Clinical documentation sets share up to 24 comparable data elements with trial documentation. Within clinical documentation, administrative items are the most common comparable items. Selected common medical concepts are contained in up to 16 forms.

Discussion
The amount of documentation for cancer patients is enormous. There is an urgent need for standardized, structured single-source documentation. Semantic annotation is time-consuming, but it enables automated comparison across form types, hospital sites and even languages. This approach can help to identify common data elements in medical documentation. Standardizing forms and building them on the basis of coding systems is desirable. The comparable data elements found within the analyzed forms demonstrate a harmonization potential that would enable better data reuse.

Conclusion
Identifying common data elements in medical forms from different settings with systematic and automated form comparison is feasible.
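
The idea of comparing semantically annotated forms can be illustrated with a minimal Python sketch. This is not the compareODM implementation: it assumes that each form has already been reduced to a mapping from item names to sets of concept codes, and it treats two items as comparable only when their code sets are identical. All form names, item names and codes below are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: form name -> {item name -> set of concept codes}.
# Item wording differs between forms; only the codes are compared.
FORMS = {
    "tumor_board": {"Date of birth": {"C001"}, "Gleason score": {"C010"}, "PSA value": {"C011"}},
    "registry": {"Birth date": {"C001"}, "Gleason": {"C010"}, "Tumor stage": {"C020"}},
    "trial_crf": {"DOB": {"C001"}, "PSA": {"C011"}, "Consent date": {"C030"}},
}


def form_codes(form):
    """Return the distinct code sets annotated on a form's items."""
    return {frozenset(codes) for codes in form.values()}


def comparable_count(form_a, form_b):
    """Count distinct code sets that occur in both forms (assumed matching rule)."""
    return len(form_codes(form_a) & form_codes(form_b))


def pairwise_counts(forms):
    """Compare every pair of forms; keys are name pairs in sorted order."""
    return {
        (a, b): comparable_count(forms[a], forms[b])
        for a, b in combinations(sorted(forms), 2)
    }


def concept_frequency(forms):
    """Count in how many forms each code set appears (once per form)."""
    counts = Counter()
    for form in forms.values():
        counts.update(form_codes(form))
    return counts


if __name__ == "__main__":
    for pair, n in pairwise_counts(FORMS).items():
        print(pair, n)
    for codes, n in concept_frequency(FORMS).most_common():
        print(sorted(codes), n)
```

Because matching relies only on the codes, differently worded or differently translated items can still be recognized as the same concept. The pairwise counts correspond to the kind of matrix a grid image or dendrogram would be drawn from, and the frequency table to a list of common concepts.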