Publications

Automatic Generation of Dictionaries: the journalistic lexicon case.  (2019)

Authors:
Cristani, Matteo; Tomazzoli, Claudio; Zorzi, Margherita
Title:
Automatic Generation of Dictionaries: the journalistic lexicon case.
Year:
2019
Product type:
Conference proceedings paper
ANVUR type:
Conference proceedings paper
Language:
English
Format:
Electronic
Journal name:
LECTURE NOTES IN COMPUTER SCIENCE
Journal ISSN:
0302-9743
Volume no.:
11606
Conference title:
32nd International Conference on Industrial, Engineering & Other Applications of Applied Intelligent Systems (IEA/AIE 2019)
Location:
Graz, Austria
Dates:
July 9-11, 2019
Publisher:
Springer Verlag Germany
Publishing house:
Springer Verlag Germany
ISBN:
978-303022998-6
Page range:
744-752
Keywords:
Natural Language Processing, Lexicon generation, Document Analysis
Brief description of contents:
Text normalisation is an important task in Natural Language Processing. Normalisation maps free text into dictionaries, i.e. indexed collections of locutions recognised as typical of a particular jargon. In general, technical dictionaries are difficult to build and validate. They are typically constructed by hand on the basis of everyday human work, and they are agreement-based. This is undoubtedly time-consuming; the approach requires strong human supervision and does not provide a general methodology. In this paper, we take the first steps towards the automatic building of a dictionary for the Italian journalistic lexicon, called NewsDict, based on sub-dictionaries able to characterise the main topics occurring in newspaper articles. We exploit a dataset of annotated documents from some Italian newspapers and a statistical technique based on the Mutual Information Principle. The documents contain information such as the release date and the topic of the article, and have been directly annotated by the author. To check the accuracy of the dictionary we built, we develop an initial test: we normalise a control set of journal articles into NewsDict. Comparing the results presented in this paper against the human annotation, we provide a first measure of the performance of the described methodology.
Web page:
https://link.springer.com/chapter/10.1007/978-3-030-22999-3_63
Product ID:
107539
IRIS handle:
11562/992863
Last modified:
15 November 2022
Bibliographic citation:
Cristani, Matteo; Tomazzoli, Claudio; Zorzi, Margherita, Automatic Generation of Dictionaries: the journalistic lexicon case, in «LECTURE NOTES IN COMPUTER SCIENCE», vol. 11606, Springer Verlag Germany, in Proceedings of "32nd International Conference on Industrial, Engineering & Other Applications of Applied Intelligent Systems (IEA/AIE 2019)", Graz, Austria, July 9-11, 2019, pp. 744-752

See the full record in IRIS, the University's institutional research repository
