Validation of XML Document Based on Parallel Bit Stream Technology

Disciplines
  • Computer Science

Abstract

Validation is a main component of XML processing. This thesis investigates single-instruction, multiple-data (SIMD) and parallel bit stream technologies for high-performance XML validation. The content models and datatypes of the schema are translated into regular expressions and then into parallel bitwise operations. The element content and data of the instance file are extracted to form byte streams, which are then transformed into parallel bit streams. Finally, the parallel bitwise operations are applied to the corresponding bit streams to validate the content model or datatype. The method is then studied by varying characteristics of the instance files, such as the proportion of content data and the number of element occurrences. Performance is also compared with Xerces, the well-known validating XML parser: the parallel bit stream validation algorithm requires fewer than 20 cycles per byte, whereas Xerces requires 40 to 300 cycles per byte.
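The pipeline described above (byte stream, then parallel bit streams, then bitwise operations derived from a regular expression) can be illustrated with a minimal sketch. It is not the thesis's implementation: Python integers stand in for SIMD registers, the function names `transpose` and `validate_unsigned_integers` are hypothetical, and the extracted data items are assumed to be newline-free byte strings checked against the single pattern `[0-9]+`. The repetition step uses the MatchStar identity known from bitwise-parallel regular expression matching.

```python
def transpose(data):
    """Split a byte stream into 8 bit streams; bit i of stream k is bit k of byte i."""
    streams = [0] * 8
    for i, byte in enumerate(data):
        for k in range(8):
            if (byte >> k) & 1:
                streams[k] |= 1 << i
    return streams


def validate_unsigned_integers(items):
    """Check every item against [0-9]+ using only bitwise operations and one addition.

    Assumption: items are bytes objects that contain no newline, because
    a newline is used here as the item terminator in the byte stream.
    """
    data = b"".join(item + b"\n" for item in items)
    n = len(data)
    mask = (1 << n) - 1
    b0, b1, b2, b3, b4, b5, b6, b7 = transpose(data)

    def inv(stream):
        return stream ^ mask  # complement restricted to the n stream positions

    # Character class streams: one bit per byte position.
    digit = inv(b7) & inv(b6) & b5 & b4 & inv(b3 & (b2 | b1))   # 0x30..0x39
    newline = inv(b7 | b6 | b5 | b4) & b3 & inv(b2) & b1 & inv(b0)  # 0x0A

    # Marker at the first position of every item.
    starts = (1 | (newline << 1)) & mask
    # Match one digit, then advance the marker by one position.
    after_first = ((starts & digit) << 1) & mask
    # MatchStar: extend each marker through the following run of digits.
    reached = ((((after_first & digit) + digit) ^ digit) | after_first) & mask
    # An item is valid when a marker arrives at its terminating newline.
    ends = reached & newline

    results = []
    pos = 0
    for item in items:
        pos += len(item)              # position of this item's terminator
        results.append(bool((ends >> pos) & 1))
        pos += 1
    return results


print(validate_unsigned_integers([b"12", b"x7", b"", b"1a", b"007"]))
# → [True, False, False, False, True]
```

All items are validated at once: each bitwise operation acts on every position of the stream, which is the property a SIMD implementation exploits with fixed-width registers.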
