Publication View

General study of the distribution of N-tuples of letters or words based on the distributions of the single letters or words (2000)
  • Egghe, Leo

Abstract
This paper establishes the general relation between the distribution of N-tuples of letters (e.g., N-truncations, N-grams) or words (e.g., N-word phrases) and the distributions of the single letters or words. Here the very general case is treated: the case where there is dependence on the place i in the N-tuple (i = 1,…, N) in the sense that, for each i = 1,…, N, a different distribution of the letters or words is supposed. Concrete calculations are performed in the important case of Zipfian distributions (i.e., power laws) for the single letters or words. In this case, we prove that the distribution of the N-tuples (for fixed N) is a sum of power laws.
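
As a rough numerical illustration of the setting described in the abstract (not the paper's derivation), the sketch below builds a different Zipfian distribution for each place i in the N-tuple and combines them into a distribution over N-tuples. It assumes that the places are statistically independent, which the abstract does not state. The function names, the alphabet size, and the exponents are all hypothetical choices made for this example.

```python
from itertools import product


def zipf_probs(n_items, beta):
    """Normalized Zipfian probabilities, p(r) proportional to 1 / r**beta, r = 1..n_items."""
    weights = [1.0 / r**beta for r in range(1, n_items + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def tuple_distribution(position_probs):
    """Probability of every N-tuple of ranks.

    Assumption of this sketch: the places are independent, so the
    probability of a tuple is the product of its per-place probabilities.
    """
    dist = {}
    for ranks in product(*(range(len(p)) for p in position_probs)):
        prob = 1.0
        for place, r in enumerate(ranks):
            prob *= position_probs[place][r]
        # Store ranks 1-based, matching the usual Zipf convention.
        dist[tuple(r + 1 for r in ranks)] = prob
    return dist


# Hypothetical parameters: N = 2 places, 5 symbols, a different exponent per place.
position_probs = [zipf_probs(5, 1.0), zipf_probs(5, 1.5)]
dist = tuple_distribution(position_probs)

# Rank the N-tuples by decreasing probability to inspect their rank-frequency curve.
ranked = sorted(dist.items(), key=lambda kv: kv[1], reverse=True)
for rank, (ranks, prob) in enumerate(ranked[:5], start=1):
    print(rank, ranks, round(prob, 4))
```

Plotting the ranked probabilities on log-log axes for larger alphabets is one way to compare the resulting curve with a single power law.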

Publication details
Download http://hdl.handle.net/1942/787
Publisher Elsevier
Repository Document Server@UHasselt (Belgium)
Type Article
Language English
Relation http://dx.doi.org/10.1016/S0895-7177(00)00058-3

Cited publications (3)
Introduction to Informetrics : quantitative methods in library, documentation and information science (1990)
Introduction to Modern Information Retrieval (1984)
On the law of Zipf-Mandelbrot for multi-word phrases (1999)
  • Egghe, Leo