Among ancient, undeciphered books, none matches the sheer mystery of the Voynich manuscript. This text, filled with strange and fantastical illustrations of unidentified plants and odd astrological symbols, has defied explanation for centuries. It has also defied translation by everybody who has attempted to decode it.
“Research on the Voynich manuscript has revealed some clues about its origins,” Keagan Brewer, Research Fellow, Department of Media, Communications, Creative Arts, Language, and Literature, Macquarie University, explains in a piece for The Conversation. “Carbon dating provides a 95 percent probability the skins used to make the manuscript come from animals that died between 1404 and 1438. However, its earliest securely known owner was an associate of Holy Roman Emperor Rudolf II, who lived from 1552 to 1612, which leaves more than a century of ownership missing.”
“Certain illustrations (the zodiac symbols, a crown design and a particular shape of castle wall called a swallowtail merlon) indicate the manuscript was made in the southern Germanic or northern Italian cultural areas.”
Despite these clues, and the text itself now being available to the public, we are no closer to understanding what it actually says. There have been suggestions that it could even be an elaborate hoax.
But even though we cannot decipher the text, it is still possible to analyze the hell out of it. The manuscript has been found to follow Zipf’s law, a statistical pattern that appears to hold in nearly every language that has been studied, even if we have no idea why.
Words are used with varying frequency, as you might expect. You have more use for the word “the” than you do for the words “ecumenical” or “phubbing”, for example. But analyzing the frequency of word use in large texts reveals that it closely follows a specific statistical law.
“About 80 years ago, George Kingsley Zipf reported an observation that the frequency of a word seems to be a power law function of its frequency rank, formulated as f(r) ∝ r^α, where f is word frequency, r is the rank of frequency, and α is the exponent,” a paper on the topic explains.
To put it simply, the most frequently used word in a language – in English, “the” – is used roughly twice as often as the second most common word, three times as often as the third, four times as often as the fourth, and so on, following this power law (with an exponent close to −1) for a surprisingly long time.
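As a rough illustration of that pattern, the idealized version of the law can be sketched in a few lines of Python. The function name `zipf_expected`, the top count of 60,000 and the default exponent of 1 are assumptions chosen for this example, not figures from any of the studies quoted here, and real texts only follow the curve approximately.

```python
def zipf_expected(top_count, rank, exponent=1.0):
    """Expected count of the word at `rank` under an idealized Zipf law.

    Assumes frequency is proportional to 1 / rank**exponent, scaled so
    that rank 1 has `top_count` occurrences.
    """
    return top_count / rank ** exponent


# Hypothetical count for the most common word in some large text.
top = 60_000

for rank in range(1, 6):
    print(f"rank {rank}: about {round(zipf_expected(top, rank))} uses")
```

With an exponent of 1, the second-ranked word lands at half the top count, the third at a third, and so on, which is the "twice, three times, four times" relationship described above.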
You may think this is some weird quirk of English, but it isn’t. Zipf’s law appears to apply to almost all languages that have been looked into. No matter whether you are speaking English, Hindi, French, Mandarin, or Spanish, the frequency of a word appears to fall off in proportion to its popularity rank.

Zipf’s law applies to the first 10 million words in 30 different languages on Wikipedia.
Weirder still, it applies to languages we haven’t even deciphered yet. According to studies, the words in the Voynich manuscript follow this law, as neatly visualized by Redditor realmathtician.
“One of the strongest clues in this puzzle is the fact that the frequency of words in the Voynich text obeys Zipf’s law,” a study explains, adding that frequency-rank distributions in human languages differ significantly from random texts. The team concluded after analyzing the text using information theory that a hoax was unlikely, as it would have been beyond mathematicians of the time.
“Precise features of Zipf’s law in languages do not emerge in simple random sequences and generally require interplay between multiplicative and additive processes. Moreover, Zipf’s law was discovered centuries after the accepted date of creation of the Voynich text. Thus, proposed solutions like the use of sixteenth-century cipher methods, although not impossible, can hardly account for the presence of Zipf’s law in the Voynich text.”
“Interestingly, the network of relationships that we obtained showed that related words share similar morphological patterns, either in their prefixes or suffixes,” the team adds. “This fact suggests that any underlying code or language in the Voynich manuscript has a strong connection between morphology and semantics, recalling scripts where – as in the cases of Chinese and hieroglyphic Ancient Egyptian – the graphical form of words directly derives from their meaning.”
Interesting as this is, it does not settle the matter. There have been many claims of deciphering the text, with some suggesting that it is written in a Turkic language. And the truth is, we really don’t understand Zipf’s law all that well. Individual texts, if they are large enough, will roughly follow the law too, with the top-ranked word appearing about twice as often as the next, and so on. Even Charles Darwin can’t evolve his way out of this one, with one analysis finding it applies fairly neatly to his text On the Origin of Species. In fact, it crops up all over the place.
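For anyone who wants to try this on a text of their own, here is a minimal sketch of how such a check might be done. It is not the method used in any of the analyses mentioned here: the function names `rank_frequencies` and `fit_exponent`, the crude letters-only tokenizer, and the plain least-squares fit on a log-log scale are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def rank_frequencies(text):
    """Return word counts sorted from most to least frequent.

    Tokenization is deliberately crude (lowercase runs of letters and
    apostrophes); a real study would need something more careful.
    """
    words = re.findall(r"[a-z']+", text.lower())
    return sorted(Counter(words).values(), reverse=True)


def fit_exponent(freqs):
    """Least-squares slope of log(frequency) against log(rank).

    Needs at least two ranks. A slope near -1 is what the classic
    form of Zipf's law predicts.
    """
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Usage: counts that follow the idealized law exactly give a slope of -1.
ideal_counts = [1200, 600, 400, 300, 240]
print(fit_exponent(ideal_counts))
```

On a real book, you would pass the full text to `rank_frequencies` and feed the result to `fit_exponent`. Short samples tend to stray from the idealized slope, which is why the law is usually discussed for large texts.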
“It is worth reflecting on the peculiarity of this law,” a review of the topic explains. “It is certainly a nontrivial property of human language that words vary in frequency at all; it might have been reasonable to expect that all words should be about equally frequent. But given that words do vary in frequency, it is unclear why words should follow such a precise mathematical rule – in particular, one that does not reference any aspect of each word’s meaning.”
There are many potential explanations for the law, from statistical artifacts to constraints imposed by human memory and vocabulary. George Zipf himself proposed that the law comes from a balance of effort minimization, with speakers (or writers) attempting to minimize their own effort by using more frequently occurring words, and listeners (or readers) seeking clarity in language from less frequently used words. An extension of this is that humans attempt to convey meaning as efficiently as possible, tending towards using words that maximize the amount of information they can convey.
Another idea is that more common words tend to become more popular over time as language spreads and develops, leading to a sort of snowball effect. But none are truly accepted as the explanation, and the cause behind it remains a bit of a mystery. The fact that the Voynich manuscript appears to follow it? Doubly so.