Information content based ranking metric for linked open vocabularies

Atemezing, Ghislain Auguste ; Troncy, Raphaël
SEM 2014, 10th International Conference on Semantic Systems, 4-5 September 2014, Leipzig, Germany

It is widely accepted that controlling metadata makes it easier to publish high-quality data on the web. In the context of Linked Data, metadata refers to the vocabularies and ontologies used for describing data. As more and more data is published on the web, reusing controlled taxonomies and vocabularies is becoming a necessity. Catalogues of vocabularies are generally a starting point for finding vocabularies based on search terms. Some recent studies recommend reusing terms from "popular" vocabularies [3]. However, there is not yet agreement on what makes a vocabulary popular, since it depends on diverse criteria such as the number of properties or the number of datasets using part or all of the vocabulary. In this paper, we propose a method for ranking vocabularies based on an information content metric which combines three features: (i) the datasets using the vocabulary, (ii) the outlinks from the vocabulary and (iii) the inlinks to the vocabulary. We applied this method to 366 vocabularies described in the LOV catalogue. The results are then compared with other catalogues which provide alternative rankings.

DOI: 10.1145/2660517.2660533
Type: Conference
City: Leipzig
Date: 2014-09-04
Department: Data Science
Eurecom Ref: 4375
Copyright:
© ACM, 2014. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in SEM 2014, 10th International Conference on Semantic Systems, 4-5 September 2014, Leipzig, Germany http://dx.doi.org/10.1145/2660517.2660533

PERMALINK: https://www.eurecom.fr/publication/4375