Affordable Access

Publisher Website

New algorithms for text fingerprinting

Authors
Journal
Journal of Discrete Algorithms
1570-8667
Publisher
Elsevier
Publication Date
Volume
6
Issue
2
Identifiers
DOI: 10.1016/j.jda.2007.05.001
Keywords
  • Text Fingerprinting
  • Character Set
  • Text Algorithm
  • Maximal Location
  • Naming
Disciplines
  • Computer Science

Abstract

Abstract Let s = s 1 . . s n be a text (or sequence) on a finite alphabet Σ. A fingerprint in s is the set of distinct characters contained in one of its substrings. Fingerprinting a text consists of computing the set F of all fingerprints of all its substrings and being able to efficiently answer several questions on this set. A given fingerprint f ∈ F is represented by a binary array, F, of size | Σ | named a fingerprint table. A fingerprint, f ∈ F , admits a number of maximal locations 〈 i , j 〉 in S, that is the alphabet of s i . . s j is f and s i − 1 , s j + 1 , if defined, are not in f. The set of maximal locations is L , | L | ⩽ n | Σ | . We present new algorithms and a new data structure for the three problems: (1) compute F ; (2) given F, answer if F represents a fingerprint in F ; (3) given F, find all maximal locations of F in s. These problems are, respectively, solved in O ( n + | L | log | Σ | ) , Θ ( | Σ | ) , and Θ ( | Σ | + K ) time—where K is the number of maximal locations of F.

There are no comments yet on this publication. Be the first to share your thoughts.

Statistics

Seen <100 times
0 Comments

More articles like this

Faster query algorithms for the text fingerprintin...

on Information and Computation Jan 01, 2011

A new approach to create textured urban models thr...

on Information Sciences Sep 10, 2013

A new textbook.

on Medical & biological illustrat... February 1977
More articles like this..