Contents of Glottometrics 4, 2002 (including abstracts)
To Honor G. K. Zipf
| Balasubrahmanyan, V.K., Naranan, S. | |
| Algorithmic information, complexity and Zipf's law | 1 - 26 |
| Zipf’s law of word frequencies for language discourses is established with statistical rigor. Data show a departure from Zipf’s power law term at low frequencies. This is accounted for by a modifying exponential term. Both arise naturally in a model for word frequencies based on Information Theory, algorithmic coding of a text preserving the symbol sequence, concepts from quantum statistical physics and computer science, and extremum principles. The Optimum Meaning Preserving Code (OMPC) of the discourse is realized when word frequencies follow the Modified Power Law (MPL). The model predicts a variant of the MPL for the relative frequencies of a small fixed set of symbols such as letters, phonemes and grammatical words. The OMPC can be viewed as containing orderly and random parts. This leads us to a quantitative definition of complexity of a string (C) that tends to 0 for the extremes of ‘all order’ and ‘all random’ but is a maximum (C = 1) for a mixture of both (Gell-Mann). It is found that natural languages have maximum complexity. The uniqueness of Zipf’s power law index (γ = 2) is shown to arise in four different ways, one of which depends on the scale invariance characteristic of fractal structures. It is argued that random text models are unsuitable for natural languages. It is speculated that a drastic change in symbol frequency distribution starting from phrases is related to the emergence of meaning and coherence of a discourse. | |
| Roelcke, Thorsten | |
| Efficiency of communication. A new concept of language economy | 27 - 38 |
| George Kingsley Zipf is known not only as the “father” of language statistics or quantitative linguistics in general, but also as one of the first to discuss the phenomenon of linguistic economy in detail. The subsequent discussion in linguistics and communication sciences shows a wide spread of more or less scientifically grounded concepts. This great conceptual diversity, however, hampers further scientific discussion. Hence, in the following contribution, a new concept of language economy that fulfils holistic (and atomistic) requirements will be presented. | |
| Schroeder, Manfred | |
| Power laws: from Alvarez to Zipf | 39 - 44 |
| This article is a reprint of a chapter from M. Schroeder, Fractals, Chaos, Power Laws: Minutes from an Infinite Paradise (33-38), New York: Freeman 1991, published here with the kind permission of the author and the Freeman Publishing House. The book, very popular among linguists, makes us realize that we are not alone in the universe of sciences but are linked with all other disciplines, at least through the omnipotent power law which, almost everywhere, bears the name of the linguist G.K. Zipf. At the same time it shows us how physicists look at language. | |
| Wheeler, Eric S. | |
| Zipf's law and why it works everywhere | 45 - 48 |
| Zipf's law is a consequence of independently categorizing items and rank ordering the categories. Therefore, it can be applied to almost anything. Testing methods on random input helps us see what is an artifact of the method rather than a property of the subject matter. | |
| Debowski, Lukasz | |
| Zipf's law against the text size: a half-rational model | 49 - 60 |
| In this article, we consider Zipf-Mandelbrot's law as applied to texts in natural languages. We present a simple model of the dependence of the law on the text size, which features a variable power-law tail and a constant ratio of the most frequent words. As a result, we derive several closed formulas, which agree with empirical data qualitatively and partially quantitatively. For example, there appears to be a minimal length of literary texts of ≈ 159 word tokens for English. | |
| Kornai, András | |
| How many words are there? | 61 - 86 |
| The commonsensical assumption that any language has only finitely many words is shown to be false by a combination of formal and empirical arguments. Zipf's Law and related formulas are investigated and a more complex model is offered. | |
| Montemurro, Marcelo A., Zanette, D. | |
| Frequency-rank distribution of words in large text samples: phenomenology and models | 87 - 98 |
| In this paper we revisit Zipf’s law in the context of linguistics. The deviations from the original simple power law are analysed and a dynamic model for text generation is proposed whose parameters can be associated with some structural features of languages. Furthermore, for the case of large corpora a novel phenomenology is disclosed. In this case a quantitative description of all the scaling regimes is possible by considering the family of solutions of a single first order differential equation. | |
Glottometrics is a scientific journal for quantitative research on language and text, published at irregular intervals.
Contributions in English or German, written with a common word-processing system (preferably WORD), should be sent to one of the editors.
Glottometrics can be downloaded from the Internet, or ordered on CD-ROM (in PDF format) or as printed copies.
Editors:
| G. Altmann | 02351973070-0001@t-online.de |
| K.-H. Best | kbest@gwdg.de |
| A. Hardie | a.hardie@lancester.ac.uk |
| L. Hrebicek | hrebicek@orient.cas.cz |
| R. Köhler | koehler@uni-trier.de |
| V. Kromer | kromer@newmail.ru |
| O. Rottmann | otto.rottmann@t-online.de |
| A. Schulz | reuter.schulz@t-online.de |
| G. Wimmer | wimmer@mat.savba.sk |
| A. Ziegler | arneziegler@compuserve.de |
Downloading: http://www.ram-verlag.de
ISSN 1617-8351