Library and Information Science

Library and Information Science ISSN: 2435-8495
Mita Society for Library and Information Science
c/o Library and Information Science Program, Faculty of Letters, Keio University, 2-15-45 Mita, Minato-ku, Tokyo 108-8345, Japan
Library and Information Science 17: 93-102 (1979)

Original Article

A clustering method of scientific literature based on averaged citation multiplicity

Published: March 25, 1980

A clustering method is proposed for forming groups of articles in a specific scientific discipline and expressing the relationships among them. The technique is based on the similarity of citations and introduces a topological space, here called the “citation space”: each cited article forms one axis of the space, and each source article is a point in it. The similarity measure for clustering is defined as the scalar product of an arbitrary pair of articles. Furthermore, any group of source articles can be represented by a single point in the space rather than by a set of points, with each coordinate given by the probability that an article in the group cites the article on the corresponding axis. Hierarchical clustering, however, restricts the amount of data that can be handled, because of its memory requirements and the complexity of the resulting dendrogram. The group representation above solves this problem by taking groups of articles, instead of individual papers, as the initial clusters for the hierarchical linkage.
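The construction described above can be illustrated with a minimal Python sketch. It assumes binary coordinates (1 if a source article cites the axis article, 0 otherwise) and an unnormalized scalar product; the abstract does not state either detail, and the article labels and function names below are hypothetical.

```python
from typing import Dict, List, Set


def citation_vector(cited: Set[str], axes: List[str]) -> List[float]:
    """Place one source article in the citation space (assumed binary coordinates)."""
    return [1.0 if axis in cited else 0.0 for axis in axes]


def group_point(members: List[List[float]]) -> List[float]:
    """Represent a group by one point: the citation probability on each axis."""
    n = len(members)
    return [sum(column) / n for column in zip(*members)]


def scalar_product(u: List[float], v: List[float]) -> float:
    """Similarity measure: scalar product of two points in the citation space."""
    return sum(a * b for a, b in zip(u, v))


# Hypothetical toy data: source article -> set of cited articles.
citations: Dict[str, Set[str]] = {
    "S1": {"A", "B"},
    "S2": {"A", "C"},
    "S3": {"B", "C", "D"},
}

# Each cited article becomes one axis of the citation space.
axes = sorted(set().union(*citations.values()))
vectors = {s: citation_vector(c, axes) for s, c in citations.items()}

# Similarity between two individual source articles (they share one citation).
sim_s1_s2 = scalar_product(vectors["S1"], vectors["S2"])

# A group {S1, S2} collapses to a single point of citation probabilities.
group = group_point([vectors["S1"], vectors["S2"]])

# Similarity between the group and a further source article.
sim_group_s3 = scalar_product(group, vectors["S3"])
```

Because a group is itself a point in the same space, the same scalar product applies to article-article, article-group, and group-group pairs.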

The method is applied to 3505 articles in instrumentation/control engineering extracted from the Science Citation Index (1977); 225 initial clusters are formed, each consisting of the source articles that cite a specific article (axis). On the resulting dendrogram, groups formed above a specified similarity are summarized and named according to their contents, giving 25 reduced clusters that are hierarchically connected. Most of the clusters reflect theoretical developments in control engineering, including 4 distinct groups of research in the USSR.
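The procedure of forming initial clusters per cited article and then linking them hierarchically might be sketched as follows. The abstract does not specify how axes are selected, which linkage rule is used, or how a merged cluster is represented. This sketch therefore assumes a hypothetical `min_size` filter, a greedy best-pair merge, and a merged point recomputed over the union of members. All data and names are illustrative.

```python
from itertools import combinations


def initial_clusters(citations, min_size=2):
    """Group source articles by the cited article (axis) they have in common.

    min_size is a hypothetical filter, not taken from the abstract.
    """
    inverted = {}
    for source in sorted(citations):
        for axis in sorted(citations[source]):
            inverted.setdefault(axis, set()).add(source)
    return {a: m for a, m in inverted.items() if len(m) >= min_size}


def cluster_point(members, citations, axes):
    """Citation probability of each axis within a group of source articles."""
    n = len(members)
    return [sum(1 for s in members if a in citations[s]) / n for a in axes]


def scalar_product(u, v):
    return sum(x * y for x, y in zip(u, v))


def agglomerate(clusters, citations, axes, threshold):
    """Greedily merge the most similar pair until no pair reaches the threshold."""
    # Drop duplicate member sets while keeping a stable order.
    groups = list(dict.fromkeys(frozenset(m) for m in clusters.values()))
    merges = []  # (similarity, left group, right group): the dendrogram levels
    while len(groups) > 1:
        best = None
        for i, j in combinations(range(len(groups)), 2):
            score = scalar_product(
                cluster_point(groups[i], citations, axes),
                cluster_point(groups[j], citations, axes),
            )
            if best is None or score > best[0]:
                best = (score, i, j)
        score, i, j = best
        if score < threshold:
            break  # equivalent to cutting the dendrogram at this similarity
        merged = groups[i] | groups[j]
        merges.append((score, groups[i], groups[j]))
        groups = [g for k, g in enumerate(groups) if k not in (i, j)]
        if merged not in groups:
            groups.append(merged)
    return groups, merges


# Hypothetical toy data: source article -> set of cited articles.
citations = {
    "S1": {"A", "B"},
    "S2": {"A", "B", "C"},
    "S3": {"B", "C"},
    "S4": {"D", "E"},
    "S5": {"D", "E"},
}
axes = sorted(set().union(*citations.values()))
clusters = initial_clusters(citations)
groups, merges = agglomerate(clusters, citations, axes, threshold=1.0)
```

Starting from groups rather than single articles keeps the number of pairwise comparisons tied to the number of initial clusters (225 in the study) rather than the number of articles (3505).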

The essential difference between Garfield's clustering and the present method is that the former is based on a set-theoretical definition of the similarity measure, whereas the latter uses the notion of a topological space.
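The contrast between a set-based and a space-based similarity can be made concrete with a small Python sketch. The abstract does not say which measure Garfield's clustering uses, so the Jaccard coefficient below is only a stand-in for a generic set-theoretical measure; the data and function names are hypothetical.

```python
def jaccard(a, b):
    """Stand-in set-theoretical similarity: shared elements over all elements."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def group_point(member_refs, axes):
    """Point in the citation space: citation probability on each axis."""
    n = len(member_refs)
    return [sum(1 for refs in member_refs if ax in refs) / n for ax in axes]


def scalar_product(u, v):
    return sum(x * y for x, y in zip(u, v))


# Hypothetical groups, each a list of reference sets (one per source article).
group_1 = [{"A", "B"}, {"A", "C"}]
group_2 = [{"A", "B"}, {"A", "D"}]
axes = ["A", "B", "C", "D"]

# Set view: a group is the union of its references, so how often an
# article is cited within the group is lost.
set_similarity = jaccard(set().union(*group_1), set().union(*group_2))

# Space view: each group is one point, and axes cited by every member
# (here "A") weigh more than axes cited by only some members.
vector_similarity = scalar_product(
    group_point(group_1, axes), group_point(group_2, axes)
)
```

In the set view every shared reference counts equally, while in the space view the contribution of each axis is scaled by how consistently both groups cite it.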
