
: The Representation of Compound Words

  • Gross, Maurice
Publication Date
Jan 01, 1986
The essential feature of a lexicon-grammar is that the elementary unit of computation and storage is the simple sentence: subject-verb-complement(s). This type of representation is obviously needed for verbs. We have applied lexicon-grammar representation not only to the two obvious predicative parts of speech, verb and adjective, but to nouns and adverbs as well.

The process of accumulation that led to the formalized lexicon-grammar of 12,000 French verbs has run into what seemed at first to be a minor problem in the representation of words: the difference between simple and compound words. On the one hand, there are simple words such as the verb 'know'; on the other, there are complex (idiomatic) forms such as 'keep in mind'.

Compound terms raise a problem of representation. The unit of representation in a linear lexicon is roughly the word as defined by its written form, that is, a sequence of letters separated from neighboring sequences by boundary blanks. As a consequence, compound words cannot be entered directly into a dictionary the way simple words are. An identification procedure is needed for their occurrences in texts, and this procedure will make use of the various simple parts of the compound utterance. Hence, the formal linguistic properties of compound terms will determine both the procedure for identifying them in texts and the type of storage they require. We will discuss the main types of compounds and single out those properties that bear on automatic parsing and dictionary lookup. Preliminary figures have shown that compound terms form the essential part of a lexicon-grammar.
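To make the identification problem concrete, here is a minimal sketch of how a lookup might recognize a compound from its simple parts. It is not the procedure described in the paper: the toy lexicon, the function names (`build_index`, `tokenize`, `identify`), and the longest-match strategy are all assumptions chosen for illustration. Each compound is indexed under its first simple word, and the text is scanned blank-delimited token by token.

```python
from collections import defaultdict

# Hypothetical toy lexicon: compound entries are stored as tuples of simple words.
# These entries are illustrative only and are not taken from the paper's data.
COMPOUNDS = [("keep", "in", "mind")]


def build_index(compounds):
    """Index each compound under its first simple word, longest entries first."""
    index = defaultdict(list)
    for compound in compounds:
        index[compound[0]].append(compound)
    for entries in index.values():
        entries.sort(key=len, reverse=True)
    return index


def tokenize(text):
    """Split on blanks, mirroring the 'written form' definition of a word."""
    return text.lower().split()


def identify(tokens, index):
    """Group tokens into lexical units, preferring the longest matching compound."""
    units = []
    i = 0
    while i < len(tokens):
        for compound in index.get(tokens[i], ()):
            if tuple(tokens[i:i + len(compound)]) == compound:
                units.append(" ".join(compound))
                i += len(compound)
                break
        else:
            # No compound starts here: treat the token as a simple word.
            units.append(tokens[i])
            i += 1
    return units


if __name__ == "__main__":
    index = build_index(COMPOUNDS)
    print(identify(tokenize("I keep in mind that you know"), index))
```

This sketch only matches contiguous, uninflected sequences. It would miss an inflected or interrupted occurrence such as 'keeps it in mind', which is one reason the formal properties of each type of compound bear on both the identification procedure and the storage format.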
