Repetition Characteristic for Single Texts

Abstract

The repetition characteristic v(t) introduced by F. Golcher is calculated for single natural texts in different languages and for random Miller’s monkey texts. The saturated value v0, which v(t) reaches at the largest times t, is shown to be governed neither by the single-character information entropy nor by the semantic-load parameter of a text. The intra-language variations of v0 are comparable with the inter-language ones. In a slightly modified calculation regime, the characteristic provides a powerful tool for detecting even small repeated textual fragments.

Description

Keywords

Golcher’s repetition characteristic, textual constants, information entropy, semantic load

Bibliographic description

Repetition Characteristic for Single Texts / Oleh Kushnir, Lyubomyr Ivanitskyi, Andriy Kashuba, Mariana Mostova, Vitaliy Mykhaylyk // CEUR Workshop Proceedings. – 2021. – Vol. 2870. – P. 629–641. (Scopus)