Version 8 (Redmine Admin, 11-10-2019 09:04) → Version 9/11 (Redmine Admin, 11-10-2019 09:04)

h1. Thesaurus construction 

 h2. Definition of a thesaurus 

 _A tool for vocabulary control of a specific subject domain_ (1) 

 _It contains_: 
 * _Preferred terms and non-preferred terms_ 
 * _The semantic relations between terms_ 
 * _Rules for use and other administrative information_ 

 _It presupposes_: 
 * _*A particular collection of documents*_ 
 * _*A particular group of users*_ 

 According to Foskett (1980), the purposes of the thesaurus are: 
 * a) To provide a map of a given field of knowledge. 
 * b) *To provide a standard vocabulary for a given subject field*. 
 * c) To provide a system of references between terms. 
 * d) To provide a guide for users of the system. 
 * e) To locate new concepts in a scheme of relationships with existing concepts in a way that makes sense to users of the system. 
 * f) To provide classified hierarchies. 
 * g) To provide means by which the use of terms in a given subject field may be standardized. 

 The thesaurus can also be used to generate keyword lists, which form the basis for planning, priority setting, and other research management tasks (Nestsel et al., 1992). It is also useful in computer-assisted indexing and abstracting, and it helps in defining terms. 

 A thesaurus can be used for three basic purposes (Rowley and Farrow, 2000a): 

 # In indexing but not in searching. 
 # In searching but not in indexing. 
 # In both indexing and searching. 

 h2. Tools for term control 


 * List 
 * Synonym Ring 
 * Hierarchy 
 * Thesaurus 

 h3. List 

 Simple lists of terms. 

 Applications: data entry (controlled vocabularies that define the domain of possible values for units of information) 
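
As a minimal sketch of the data-entry use case, a list can be modelled as a set of allowed values against which each entered value is checked. The vocabulary contents and the `validate_entry` helper are hypothetical and serve only as an illustration.

```python
# Hypothetical controlled list for a "material" field (illustrative values).
MATERIALS = frozenset({"bronze", "marble", "terracotta", "wood"})


def validate_entry(value: str, vocabulary: frozenset[str]) -> str:
    """Return the normalized value if it belongs to the list, else raise."""
    normalized = value.strip().lower()
    if normalized not in vocabulary:
        raise ValueError(f"{value!r} is not in the controlled list")
    return normalized
```

Rejecting unknown values at entry time is what keeps the field's domain closed; a real system would more likely offer the list as a picker than validate free text.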

 h3. Synonym Ring 

 Lists of terms and their synonyms. 

 Applications: search engines (to increase recall, even at the expense of precision) 
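
The recall/precision trade-off can be sketched with a small query-expansion routine. The rings, the documents, and the function names below are assumptions made for illustration; the naive substring matching is deliberately simple and is not how a production search engine would tokenize text.

```python
# Hypothetical synonym rings: every term in a ring is treated as equivalent.
RINGS = [
    {"car", "automobile", "motorcar"},
    {"film", "movie", "motion picture"},
]


def expand_query(term: str, rings: list[set[str]]) -> set[str]:
    """Return the term plus all members of any ring that contains it."""
    expanded = {term}
    for ring in rings:
        if term in ring:
            expanded |= ring
    return expanded


def search(term: str, documents: dict[str, str], rings: list[set[str]]) -> set[str]:
    """Return ids of documents whose text contains any expanded term.

    Matching is a plain substring test, so "car" would also match "scar":
    a small reminder that wider recall can cost precision.
    """
    variants = expand_query(term, rings)
    return {
        doc_id
        for doc_id, text in documents.items()
        if any(variant in text.lower() for variant in variants)
    }
```

With the rings applied, a query for "car" also retrieves documents that mention only "automobile"; with an empty ring list it does not.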

 h3. Hierarchy 

 Terms organized in a tree structure (formally: a connected, acyclic, undirected graph of nodes and edges) 

 Applications: data entry 

 Examples: 

 * administrative geographic subdivisions (regions, provinces, municipalities; ISTAT) [https://www.istat.it/it/archivio/6789] 
 * iconographic classification schemes (ICONCLASS) [http://www.iconclass.org/help/outline] 
 * Linnaean taxonomy [https://upload.wikimedia.org/wikipedia/commons/b/bb/Linnaeus_-_Regnum_Animale_%281735%29.png] 

 Relations of type isPartOf/hasPart, contains/isContained, derivedFrom, BT/NT between classes or instances 
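
A hierarchy of this kind can be sketched by storing, for each term, its single broader term (BT) and deriving the narrower terms (NT) from it. The sample terms and the helper names are assumptions for illustration only; the `is_tree` check mirrors the "connected and acyclic" condition given above.

```python
# Hypothetical hierarchy stored as term -> broader term (BT); the root has None.
BROADER = {
    "Animalia": None,
    "Mammalia": "Animalia",
    "Aves": "Animalia",
    "Primates": "Mammalia",
}


def narrower_terms(term, broader):
    """Return the direct narrower terms (NT) of a term, sorted by name."""
    return sorted(child for child, parent in broader.items() if parent == term)


def ancestors(term, broader):
    """Walk the BT links up to the root; raise if a cycle is found."""
    path = []
    seen = {term}
    parent = broader[term]
    while parent is not None:
        if parent in seen:
            raise ValueError(f"cycle detected at {parent!r}")
        path.append(parent)
        seen.add(parent)
        parent = broader[parent]
    return path


def is_tree(broader):
    """Check for exactly one root that every term reaches without cycles."""
    roots = [term for term, parent in broader.items() if parent is None]
    if len(roots) != 1:
        return False
    try:
        return all(
            (ancestors(term, broader) or [term])[-1] == roots[0]
            for term in broader
        )
    except (KeyError, ValueError):
        # A missing parent or a cycle means the structure is not a tree.
        return False
```

Storing only the BT link keeps the two directions consistent by construction, since NT is always computed rather than entered separately.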

 h3. Thesaurus 

 Several standards exist: 

 * ISO 2788 (monolingual, 1974) [https://www.iso.org/standard/7776.html] 
 * ISO 5964 (multilingual, 1985) [https://www.iso.org/standard/12159.html] 
 * ISO 25964 (the result of the NISO WG8 work, 2008-2013) [https://www.niso.org/schemas/iso25964], in two parts (Part 1 and Part 2) 
 ** In Part 2: interoperability with SKOS: https://www.niso.org/schemas/iso25964#skos 
 ** Announcement on the W3C site: Alignment between SKOS and new ISO 25964 thesaurus standard (2012-12-13) [https://www.w3.org/2004/02/skos/] 
 ** UML model: https://www.niso.org/schemas/iso25964/Model_2011-06-02.jpg 
 ** XSD schema: https://www.niso.org/schemas/iso25964/iso25964-1_v1.4.xsd 

 The model evolves from the "term" (with the classic BT, NT, and UF relations) towards the "concept", which is more general and has greater formal expressive power. 

 SKOS Simple Knowledge Organization System Reference 
 https://www.w3.org/TR/skos-reference/ 

 https://www.w3.org/TR/skos-reference/#vocab 
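
The shift from terms to concepts can be sketched with a small in-memory model whose attribute names mirror the SKOS properties `skos:prefLabel`, `skos:altLabel`, `skos:broader`, and `skos:narrower`. This is an illustrative data structure, not an RDF implementation; the `ex:` identifiers, the labels, and the helper functions are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """A concept in the SKOS sense: one identifier, many labels."""
    uri: str
    pref_label: dict[str, str] = field(default_factory=dict)        # skos:prefLabel, one per language
    alt_labels: dict[str, list[str]] = field(default_factory=dict)  # skos:altLabel (roughly UF)
    broader: list[str] = field(default_factory=list)                # skos:broader (roughly BT)
    narrower: list[str] = field(default_factory=list)               # skos:narrower (roughly NT)


def add_broader(scheme: dict[str, Concept], child: str, parent: str) -> None:
    """Record a broader link and its inverse narrower link."""
    if parent not in scheme[child].broader:
        scheme[child].broader.append(parent)
    if child not in scheme[parent].narrower:
        scheme[parent].narrower.append(child)


def lookup(scheme: dict[str, Concept], label: str, lang: str) -> list[str]:
    """Return URIs of concepts whose preferred or alternative label matches."""
    wanted = label.lower()
    hits = []
    for concept in scheme.values():
        labels = list(concept.alt_labels.get(lang, []))
        pref = concept.pref_label.get(lang)
        if pref is not None:
            labels.append(pref)
        if wanted in (candidate.lower() for candidate in labels):
            hits.append(concept.uri)
    return hits
```

Because relations attach to the concept rather than to any single term, labels in several languages can share one set of broader/narrower links.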

 h4. Coordination (pre vs post) 

 Coordination refers to the construction of phrases from individual terms. Two distinct coordination options are recognized in thesauri: precoordination and postcoordination. A *precoordinated* thesaurus is one that *can contain phrases*. Consequently, phrases are available for indexing and retrieval. A *postcoordinated* thesaurus *does not allow phrases*. Instead, phrases are constructed while searching. The choice between the two options is difficult. 

 The advantage of precoordination is that the vocabulary is very precise, thus reducing ambiguity in indexing and in searching. Also, commonly accepted phrases become part of the vocabulary. However, the disadvantage is that the searcher has to be aware of the phrase construction rules employed. 

 Thesauri can adopt an intermediate level of coordination by allowing both phrases and single words. This is typical of manually constructed thesauri. However, even within this group there is significant variability in terms of coordination level. Some thesauri may emphasize two- or three-word phrases, while others may emphasize even longer phrases. Therefore, it is insufficient to state that two thesauri are similar simply because they follow precoordination. The level of coordination is important as well. The higher the level of coordination, the greater the precision of the vocabulary but the larger the vocabulary size. It also implies an increase in the number of relationships to be encoded, so the thesaurus becomes more complex. 

 The advantage of postcoordination is that the user need not worry about the exact ordering of the words in a phrase. Phrase combinations can be created as and when appropriate during searching. The disadvantage is that search precision may fall, as illustrated by the following well-known example from Salton and McGill (1983): the distinction between phrases such as "Venetian blind" and "blind Venetian" may be lost. A more likely example is "library school" and "school library". The problem is that unless search strategies are designed carefully, irrelevant items may also be retrieved. 

 Precoordination is more common in manually constructed thesauri. Automatic phrase construction is still quite difficult, and therefore automatic thesaurus construction usually implies postcoordination. Section 9.4 of the source text includes a procedure for automatic phrase construction. 
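
The loss of precision described above can be sketched by comparing two matching strategies on the "library school" example. The function names are hypothetical, and the phrase matcher only approximates a precoordinated entry by requiring the words to be adjacent and in order.

```python
def postcoordinated_match(query: str, text: str) -> bool:
    """Match if every query word occurs in the text, in any order."""
    return set(query.lower().split()) <= set(text.lower().split())


def precoordinated_match(phrase: str, text: str) -> bool:
    """Match only if the phrase words occur adjacently and in the given order."""
    words = text.lower().split()
    target = phrase.lower().split()
    return any(
        words[i:i + len(target)] == target
        for i in range(len(words) - len(target) + 1)
    )
```

A postcoordinated query for "library school" matches a sentence about a "school library" as well, while the phrase-based match accepts only the sentence that contains the words in that order.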


 References: 
 Dextre Clarke, Stella G. and Zeng, Marcia Lei. From ISO 2788 to ISO 25964: The evolution of thesaurus standards towards interoperability and data modelling. Information Standards Quarterly (ISQ), 2012, vol. 24, no. 1 
 [http://eprints.rclis.org/16818/1/SP_clarke_zeng_isqv24no1.pdf] 

 Kumbhar, Rajendra Madhavrao. Construction of vocabulary control tool thesaurus for library and information science. Dr. Babasaheb Ambedkar Marathwada University, 2003 
 [http://shodhganga.inflibnet.ac.in:8080/jspui/handle/10603/150911] 



 Notes 
 -------------- 
 (1) Rich Gazan, Controlled Vocabulary & Thesaurus Design - Trainee’s Manual, p. 59 (session 5-7) 
 [https://www.loc.gov/catworkshop/courses/thesaurus/pdf/cont-vocab-thes-trnee-manual.pdf]