Library and Information Science 61: 1-23 (2009)

Original Article

Determining the possibility of deciphering an unintelligible text by text clustering: The case of the Voynich Manuscript

1 Asia University ◇ Sakai 5-24-10, Musashino, Tokyo 180-8629, Japan

2 Keio University ◇ Mita 2-15-45, Minato-ku, Tokyo 108-8345, Japan

Received: March 12, 2008
Accepted: April 19, 2009
Published: June 30, 2009




Purpose: One of the most common approaches to understanding an undeciphered text is to identify and then decipher the underlying code. If a document remains undeciphered for a long period of time despite many attempts at decoding it, the possibility that it is “gibberish” must be considered. This study proposes a method for detecting the presence or absence of a coherent structure within an undeciphered text, in order to determine whether deciphering it is possible.

Methods: The present method begins with the assumption that natural language processing methods commonly employed in analyzing known languages can be applied to an undeciphered text. To detect a coherent structure in a text, the similarity of every pair of partial documents is measured, and the resulting similarity matrix is then analyzed by clustering methods. The next step is to compare the detected structure with the sections suggested by other clues, such as illustrations and the page order. In this way, it is determined whether an undeciphered text contains an identifiable structure that corresponds to those sections, or whether it is “gibberish” containing no order or structure.
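The steps described in the Methods can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the abstract does not name the similarity measure, the clustering algorithm, or the evaluation score, so cosine similarity over token counts, average-linkage agglomerative clustering, and cluster purity are stand-ins chosen here. The function names, the toy token strings, and the section labels are all hypothetical and are not data from the study.

```python
from collections import Counter
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two token-count vectors (assumed measure)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def similarity_matrix(parts):
    """Similarity of every pair of partial documents (e.g. pages)."""
    vectors = [Counter(p.split()) for p in parts]
    return [[cosine(u, v) for v in vectors] for u in vectors]


def average_linkage(sim, k):
    """Merge the two most similar clusters until only k clusters remain."""
    clusters = [[i] for i in range(len(sim))]
    while len(clusters) > k:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                total = sum(sim[i][j] for i in clusters[x] for j in clusters[y])
                score = total / (len(clusters[x]) * len(clusters[y]))
                if best is None or score > best[0]:
                    best = (score, x, y)
        _, x, y = best
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters


def purity(clusters, labels):
    """Share of parts whose cluster's majority label matches their own label."""
    hits = sum(max(Counter(labels[i] for i in c).values()) for c in clusters)
    return hits / len(labels)


# Hypothetical toy input: four "pages" of made-up tokens, plus the section
# each page would be assigned to on the basis of its illustrations.
pages = [
    "daiin chol daiin shol",
    "chol daiin chol shor",
    "qokeedy okaiin qokeedy otedy",
    "okaiin qokeedy otedy qokeedy",
]
sections = ["A", "A", "B", "B"]

sim = similarity_matrix(pages)
clusters = average_linkage(sim, k=2)
print(clusters, purity(clusters, sections))
```

A purity close to 1 means the clusters found in the text alone line up with the externally suggested sections; a text with no internal structure would be expected to score near the share of the largest section.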

Results: We applied the proposed method to the Voynich Manuscript, a renowned undeciphered text. The results clearly demonstrate that the text of the Voynich Manuscript possesses an identifiable structure, and that this structure corresponds to the existing sections of the manuscript suggested by the accompanying illustrations. Thus, the results strongly suggest that the Voynich Manuscript is not “gibberish” and that additional attempts to decipher its contents would be justified. The present experiment demonstrates the usefulness of applying this method to an undeciphered text.

