Redmine Admin, 11-10-2019 09:25


Thesaurus construction

Definition of a thesaurus

A tool for vocabulary control of a specific subject domain (1)

It contains:
  • Preferred terms and non-preferred terms
  • The semantic relations between terms
  • Rules for use and other administrative information
It presupposes:
  • A particular collection of documents
  • A particular group of users
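
The components listed above can be sketched as a small data structure. This is only a minimal illustration, not a schema taken from any standard: the class names `Entry` and `Thesaurus`, the field names, and the sample terms are all assumptions made for the example.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Entry:
    """One preferred term with its non-preferred terms and relations (hypothetical layout)."""
    preferred: str
    used_for: set[str] = field(default_factory=set)   # UF: non-preferred terms
    broader: set[str] = field(default_factory=set)    # BT
    narrower: set[str] = field(default_factory=set)   # NT
    related: set[str] = field(default_factory=set)    # RT
    scope_note: str = ""                              # rule for use


class Thesaurus:
    def __init__(self) -> None:
        self.entries: dict[str, Entry] = {}
        self._use: dict[str, str] = {}  # non-preferred term -> preferred term (USE)

    def add(self, entry: Entry) -> None:
        self.entries[entry.preferred] = entry
        for term in entry.used_for:
            self._use[term] = entry.preferred

    def preferred_for(self, term: str) -> str | None:
        """Resolve any known term to its preferred form; return None if unknown."""
        if term in self.entries:
            return term
        return self._use.get(term)


# Sample data invented for illustration.
thesaurus = Thesaurus()
thesaurus.add(Entry(
    preferred="automobiles",
    used_for={"cars", "motor cars"},
    broader={"vehicles"},
    scope_note="Use for passenger road vehicles only.",
))
```

Here `thesaurus.preferred_for("cars")` resolves the non-preferred term to "automobiles", which is the basic vocabulary-control step for both indexers and searchers.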

According to Foskett (1980), the purposes of a thesaurus are:

  • To provide a map of a given field of knowledge.
  • To provide a standard vocabulary for a given subject field.
  • To provide a system of references between terms.
  • To provide a guide for users of the system.
  • To locate new concepts in a scheme of relationships with existing concepts in a way which makes sense to users of the system.
  • To provide classified hierarchies.
  • To provide means by which the use of terms in a given subject field may be standardized.

A thesaurus can also be used to generate keyword lists, which form the basis for planning, priority setting, and other research management tasks (Nestsel et al., 1992). It is also useful in computer-assisted indexing and abstracting, and it helps in defining terms.

A thesaurus can be used for three basic purposes (Rowley and Farrow, 2000a):

  1. In indexing but not in searching.
  2. In searching but not in indexing.
  3. In both indexing and searching.

Tools for term control

  • List
  • Synonym Ring
  • Hierarchy
  • Thesaurus

List

Simple lists of terms.

Applications: data entry (controlled vocabularies that define the domain of permitted values for units of information)
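
As a rough illustration of the data-entry use, a controlled list can act as the set of permitted values for a field. The list `ALLOWED_MATERIALS` and the function `validate_entry` are hypothetical names chosen for this sketch.

```python
# Hypothetical controlled list for a "material" field.
ALLOWED_MATERIALS = ("bronze", "ceramic", "glass", "marble")


def validate_entry(value: str, allowed: tuple[str, ...] = ALLOWED_MATERIALS) -> str:
    """Return the normalized value if it belongs to the controlled list, else raise."""
    normalized = value.strip().lower()
    if normalized not in allowed:
        raise ValueError(
            f"{value!r} is not in the controlled list: {', '.join(allowed)}"
        )
    return normalized
```

A call such as `validate_entry(" Glass ")` returns "glass", while a value outside the list is rejected before it reaches the database.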

Synonym Ring

Lists of terms and their synonyms.

Applications: search engines (to increase recall, even at the expense of precision)
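
The recall/precision trade-off can be seen in a toy query-expansion sketch. The rings, the documents, and the function names below are invented for illustration; the second ring is deliberately loose to show how expansion can pull in irrelevant items.

```python
# Hypothetical synonym rings: every term in a ring is treated as equivalent.
SYNONYM_RINGS = [
    {"car", "automobile", "auto"},
    {"jaguar", "panther"},  # deliberately loose ring
]

# Hypothetical document collection.
DOCUMENTS = {
    1: "used car for sale",
    2: "automobile repair manual",
    3: "the jaguar is a large cat",
    4: "panther tank of the second world war",
}


def expand(term: str) -> set[str]:
    """Return the whole ring containing the term, or just the term itself."""
    for ring in SYNONYM_RINGS:
        if term in ring:
            return set(ring)
    return {term}


def search(term: str, expand_query: bool = False) -> list[int]:
    """Return ids of documents containing the term (or any ring member) as a word."""
    terms = expand(term) if expand_query else {term}
    return sorted(
        doc_id for doc_id, text in DOCUMENTS.items()
        if terms & set(text.split())
    )
```

With this data, `search("car")` finds only document 1, while `search("car", expand_query=True)` also finds document 2 (higher recall). `search("jaguar", expand_query=True)` returns documents 3 and 4, the latter being about a tank (lower precision).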

Hierarchy

Terms organized in a tree structure (formally: a connected, acyclic, undirected graph of nodes and edges)

Applications: data entry

Examples:

Relations such as isPartOf/hasPart, contains/isContained, derivedFrom, BT/NT between classes or instances
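
A minimal sketch of such a hierarchy, assuming each term has at most one broader term (BT): narrower terms (NT) are derived by inverting the mapping, and the tree property is checked by requiring a single root and no cycles. The terms and function names are illustrative only.

```python
from __future__ import annotations

# child -> parent (BT); None marks the root. Hypothetical terms.
BROADER: dict[str, str | None] = {
    "vehicles": None,
    "cars": "vehicles",
    "bicycles": "vehicles",
    "sports cars": "cars",
}


def narrower(term: str, broader: dict[str, str | None] = BROADER) -> list[str]:
    """NT: the direct children of a term."""
    return sorted(child for child, parent in broader.items() if parent == term)


def ancestors(term: str, broader: dict[str, str | None] = BROADER) -> list[str]:
    """All broader terms from the direct parent up to the root."""
    chain: list[str] = []
    parent = broader[term]
    while parent is not None:
        if parent == term or parent in chain:
            raise ValueError(f"cycle detected at {parent!r}")
        chain.append(parent)
        parent = broader[parent]
    return chain


def is_tree(broader: dict[str, str | None] = BROADER) -> bool:
    """True if there is exactly one root and every term reaches it without a cycle."""
    roots = [term for term, parent in broader.items() if parent is None]
    if len(roots) != 1:
        return False
    try:
        for term in broader:
            ancestors(term, broader)
    except (ValueError, KeyError):
        return False
    return True
```

For the sample data, `ancestors("sports cars")` yields `["cars", "vehicles"]` and `narrower("vehicles")` yields `["bicycles", "cars"]`.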

Thesaurus

Several standards exist.

ISO 2788 (monolingual, 1974) [https://www.iso.org/standard/7776.html]
ISO 5964 (multilingual, 1985) [https://www.iso.org/standard/12159.html]
ISO 25964 (produced by WG8, hosted at NISO, 2008-2013) [https://www.niso.org/schemas/iso25964]
Part 1 and Part 2
In Part 2: interoperability with SKOS: https://www.niso.org/schemas/iso25964#skos
Announcement on W3C:
Alignment between SKOS and new ISO 25964 thesaurus standard (2012-12-13) [https://www.w3.org/2004/02/skos/]

UML model: https://www.niso.org/schemas/iso25964/Model_2011-06-02.jpg
XSD schema: https://www.niso.org/schemas/iso25964/iso25964-1_v1.4.xsd

Evolution from "term" (and the classic BT, NT, UF relations) toward "concept", which is more general and has greater formal expressive power.

SKOS Simple Knowledge Organization System Reference
https://www.w3.org/TR/skos-reference/

https://www.w3.org/TR/skos-reference/#vocab
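
To make the shift from terms to concepts concrete, the sketch below writes a classic thesaurus entry as a SKOS concept in Turtle. The correspondence used here (BT to skos:broader, NT to skos:narrower, RT to skos:related, UF to skos:altLabel) is a commonly used simplification, not the full ISO 25964 mapping; the `ex:` namespace, the function name, and the sample terms are assumptions.

```python
# Simplified, assumed correspondence between thesaurus tags and SKOS properties.
SKOS_PROPERTY = {
    "BT": "skos:broader",
    "NT": "skos:narrower",
    "RT": "skos:related",
    "UF": "skos:altLabel",
}

PREFIXES = (
    "@prefix skos: <http://www.w3.org/2004/02/skos/core#> .\n"
    "@prefix ex: <http://example.org/thesaurus/> .\n"  # hypothetical namespace
)


def to_turtle(concept_id: str, pref_label: str,
              relations: list[tuple[str, str]], lang: str = "en") -> str:
    """Serialize one concept as Turtle; relations are (tag, value) pairs."""
    lines = [
        f"ex:{concept_id} a skos:Concept ;",
        f'    skos:prefLabel "{pref_label}"@{lang}',
    ]
    for tag, value in relations:
        prop = SKOS_PROPERTY[tag]
        # UF values are labels (literals); the other relations point to concepts.
        obj = f'"{value}"@{lang}' if tag == "UF" else f"ex:{value}"
        lines[-1] += " ;"
        lines.append(f"    {prop} {obj}")
    lines[-1] += " ."
    return "\n".join(lines)


turtle = PREFIXES + "\n" + to_turtle(
    "cars", "cars", [("BT", "vehicles"), ("UF", "automobiles")]
)
```

In the result, the non-preferred term becomes a label of the concept rather than an entry of its own, while the BT relation becomes a link between two concepts.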

Coordination (pre vs post)

Coordination refers to the construction of phrases from individual terms. Two distinct coordination options are recognized in thesauri: precoordination and postcoordination. A precoordinated thesaurus is one that can contain phrases. Consequently, phrases are available for indexing and retrieval. A postcoordinated thesaurus does not allow phrases. Instead, phrases are constructed while searching. The choice between the two options is difficult.

The advantage of precoordination is that the vocabulary is very precise, thus reducing ambiguity in indexing and in searching. Also, commonly accepted phrases become part of the vocabulary. However, the disadvantage is that the searcher has to be aware of the phrase construction rules employed.

Thesauri can adopt an intermediate level of coordination by allowing both phrases and single words. This is typical of manually constructed thesauri. However, even within this group there is significant variability in terms of coordination level. Some thesauri may emphasize two- or three-word phrases, while others may emphasize even longer phrases. Therefore, it is insufficient to state that two thesauri are similar simply because they follow precoordination. The level of coordination is important as well. It should be recognized that the higher the level of coordination, the greater the precision of the vocabulary but the larger the vocabulary size. It also implies an increase in the number of relationships to be encoded. Therefore, the thesaurus becomes more complex.

The advantage of postcoordination is that the user need not worry about the exact ordering of the words in a phrase. Phrase combinations can be created as and when appropriate during searching. The disadvantage is that search precision may fall, as illustrated by the following well-known example from Salton and McGill (1983): the distinction between phrases such as "Venetian blind" and "blind Venetian" may be lost. A more likely example is "library school" and "school library." The problem is that unless search strategies are designed carefully, irrelevant items may also be retrieved.

Precoordination is more common in manually constructed thesauri. Automatic phrase construction is still quite difficult and therefore automatic thesaurus construction usually implies postcoordination. Section 9.4 includes a procedure for automatic phrase construction.
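
The "library school" versus "school library" case can be reproduced with two toy indexes. This is a sketch under simple assumptions: each document carries indexer-assigned phrases, the precoordinated index keeps phrases whole, and the postcoordinated index keeps single words that are combined with AND at search time. All names and data are invented for the example.

```python
# Hypothetical documents with the phrases an indexer assigned to them.
DOCUMENTS = {
    1: ["library school"],
    2: ["school library"],
}


def build_precoordinated(docs: dict[int, list[str]]) -> dict[str, set[int]]:
    """Index whole phrases: word order is part of the vocabulary."""
    index: dict[str, set[int]] = {}
    for doc_id, phrases in docs.items():
        for phrase in phrases:
            index.setdefault(phrase, set()).add(doc_id)
    return index


def build_postcoordinated(docs: dict[int, list[str]]) -> dict[str, set[int]]:
    """Index single words only: phrases are assembled while searching."""
    index: dict[str, set[int]] = {}
    for doc_id, phrases in docs.items():
        for phrase in phrases:
            for word in phrase.split():
                index.setdefault(word, set()).add(doc_id)
    return index


def search_pre(index: dict[str, set[int]], phrase: str) -> list[int]:
    """Exact phrase lookup."""
    return sorted(index.get(phrase, set()))


def search_post(index: dict[str, set[int]], phrase: str) -> list[int]:
    """AND of the individual words, ignoring their order."""
    postings = [index.get(word, set()) for word in phrase.split()]
    return sorted(set.intersection(*postings)) if postings else []


pre_index = build_precoordinated(DOCUMENTS)
post_index = build_postcoordinated(DOCUMENTS)
```

Searching for "library school" returns only document 1 from the precoordinated index, but both documents from the postcoordinated one, since word order is lost there.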

References:
Dextre Clarke, Stella G. and Zeng, Marcia Lei. From ISO 2788 to ISO 25964: The evolution of thesaurus standards towards interoperability and data modelling. Information Standards Quarterly (ISQ), 2012, vol. 24, no. 1.
[http://eprints.rclis.org/16818/1/SP_clarke_zeng_isqv24no1.pdf]

Kumbhar, Rajendra Madhavrao. Construction of vocabulary control tool thesaurus for library and information science. Dr. Babasaheb Ambedkar Marathwada University, 2003.
[http://shodhganga.inflibnet.ac.in:8080/jspui/handle/10603/150911]

Notes
--------------
(1) Rich Gazan, CONTROLLED VOCABULARY & THESAURUS DESIGN - Trainee’s Manual, p. 59 (session 5-7)
[https://www.loc.gov/catworkshop/courses/thesaurus/pdf/cont-vocab-thes-trnee-manual.pdf]

Updated by Redmine Admin over 4 years ago · 10 revisions