CHIMERA: Revista de Corpus de Lenguas Romances y Estudios Lingüísticos https://revistas.uam.es/chimera <p style="text-align: justify;">CHIMERA is an international, double-blind peer-reviewed scientific journal that publishes <strong>corpus-based studies on Romance languages</strong>.</p> <p style="text-align: justify;">The journal aims to give international reach to theoretical and applied research of proven scientific quality, especially innovative approaches to the analysis of Romance languages. CHIMERA also seeks to foster closer ties between the academic communities, both European and American, devoted to Romance languages.</p> <p>The journal accepts original manuscripts focused on the analysis of corpora of both written and spoken language, from a wide variety of theoretical perspectives and in any area of linguistics. It also publishes reviews of books related to its subject matter, of corpora (resource development and annotation), and of corpus analysis tools.</p> <p>CHIMERA publishes articles written in Romance languages and in English. There are no submission fees or article publication charges.</p> Universidad Autónoma de Madrid es-ES CHIMERA: Revista de Corpus de Lenguas Romances y Estudios Lingüísticos 2386-2629 <p>Authors who publish in CHIMERA accept the following terms:</p><ul><li>Authors retain all rights and grant the journal the right of first publication of the work under the <a href="http://creativecommons.org/">Creative Commons Attribution License</a>, which allows others to share the publications with explicit acknowledgement of their authorship and of their first publication in this journal.</li><li>Authors may enter into separate agreements for the non-exclusive distribution of their articles (e.g.
publishing them in an institutional repository or in a book), with explicit acknowledgement of their authorship and of their first publication in this journal.</li><li>Authors are permitted and encouraged to post their articles online even before they have been accepted by the journal, as this is a good way to foster exchanges and increase citations.</li></ul> Una aplicación para explorar la frecuencia léxica a partir de corpus de referencia https://revistas.uam.es/chimera/article/view/15525 <p>Lexical frequency, that is, how often a word is used in a language relative to other words, is a fundamental factor in reading and text-processing tasks. This has been demonstrated by experimental research with both adults and children. These studies have shown the close relationship between reading comprehension, lexical skill, and lexical decoding. This paper presents an online application for exploring lexical frequency in Spanish texts against a set of reference corpora. The current version includes the three corpora of the Real Academia Española (CORDE, CREA and CORPES XXI), which allows for both diachronic and synchronic research. The application lets users submit texts (or words) for processing and returns a table with varied information on the frequency of each form in the text. The frequency information includes the rank of the form in the frequency list and its absolute and normalized frequencies.
The results can easily be downloaded for use in external tools.</p> Mario Casado-Mancebo Copyright 2023 CHIMERA: Revista de Corpus de Lenguas Romances y Estudios Lingüísticos https://creativecommons.org/licenses/by-nc/4.0 2023-07-24 2023-07-24 10 89 95 Frequency distribution of inflectional properties of nouns: data on written Italian https://revistas.uam.es/chimera/article/view/16058 <p class="CHIMERAabstractkeywordsGrassettoNonGrassetto"><span lang="EN-US">The inflectional features of words show distributional peculiarities that affect the cognitive processing of language. For instance, it has been consistently reported that the association between the grammatical gender and other inflectional properties of nouns affects comprehension and production processes. However, reliable quantitative data about the distribution of nouns’ inflectional properties are scarce. The paper analyzes the inflectional system of nouns in Italian, a language where nouns are inflected for gender and number and are organized into different inflectional classes. The DeGNI lexical database (De Martino <em>et al</em>., 2019) was queried to obtain measures of the distribution of genders, gender suffixes and declensional patterns of Italian nouns.</span></p> Maria De Martino Giulia Bracco Alessandro Laudanna Copyright 2023 CHIMERA: Revista de Corpus de Lenguas Romances y Estudios Lingüísticos https://creativecommons.org/licenses/by-nc/4.0 2023-07-24 2023-07-24 10 121 133 Per studiare il vocabolario del passato https://revistas.uam.es/chimera/article/view/15698 <p>This paper proposes a new method for describing the lexicon of a language in a specific period of its history. The first paragraph outlines the two main ideas to be found in the studies concerning both synchronic and diachronic lexicology.
In the second paragraph our method for lexical inquiry is presented together with its core concepts, which are textual Corpus Representativeness, Connotation, Connotation Rate (<em>Quoziente Connotativo</em>, QC) and word Position in the Center-Periphery Vocabulary Model. The third paragraph sketches two possible research lines, the first one regarding the lexicon of a given historical period (Old Italian), the second dealing with the comparison between two different linguistic historical phases (Old Italian vs. Contemporary Italian).</p> Cosimo Burgassi Elisa Guadagnini Copyright 2023 CHIMERA: Revista de Corpus de Lenguas Romances y Estudios Lingüísticos https://creativecommons.org/licenses/by-nc/4.0 2023-03-01 2023-03-01 10 1 18 10.15366/chimera2023.10.001 Using keywords in the automatic classification of language of gender violence https://revistas.uam.es/chimera/article/view/15486 <div> <p class="CHIMERAabstractkeywordsGrassettoNonGrassetto"><span lang="EN-US">This paper employs lexical analysis tools, quantitative processing methods, and natural language processing procedures to analyze language samples and identify lexical items that support automatic topic detection in natural language processing. It discusses how keyword extraction, a technique from corpus linguistics, can be employed to obtain features that improve automatic classification; in particular, this research is concerned with extracting keywords from a corpus obtained from social networks. The corpus consists of 1,841,385 words and is subdivided into three sub-corpora, categorized according to the topic of the comments in each of them. These three topics are violence against women, violence against the LGBT community, and violence in general. The corpus was obtained by scraping comments from YouTube videos that address issues such as street harassment, femicide, feminist movements, drug trafficking, forced disappearances, and equal marriage.
The topic detection tasks performed with the corpus extracted from social media showed that the keywords yielded 98% accuracy when classifying the collections of comments from 51 videos into one of the three categories mentioned above, and 92% when classifying almost 7,500 comments individually. When the classification used all words instead of the keywords, accuracy dropped by an average of 17%. These results support the argument for keyword relevance in automatic topic detection.</span></p> </div> Héctor Castro Mosqueda Antonio Rico Sulayes Copyright 2023 CHIMERA: Revista de Corpus de Lenguas Romances y Estudios Lingüísticos https://creativecommons.org/licenses/by-nc/4.0 2023-03-01 2023-03-01 10 19 43 10.15366/chimera2023.10.002 La Knowledge Base LiLa https://revistas.uam.es/chimera/article/view/15998 <p>This article describes the LiLa Knowledge Base, which enables interoperability between linguistic resources for Latin. The resources interact through the representation of their (meta)data by means of shared ontologies and vocabularies, in accordance with the principles of the Linked Data paradigm. After presenting the architecture of LiLa, which is built on a collection of citation forms of Latin words, the article describes the ontological modelling of each of the lexical and textual resources currently connected to LiLa.
Finally, it offers a number of considerations on the prospects for future work on LiLa.</p> Marco Passarotti Copyright 2023 CHIMERA: Revista de Corpus de Lenguas Romances y Estudios Lingüísticos https://creativecommons.org/licenses/by-nc/4.0 2023-06-23 2023-06-23 10 45 72 La sezione Sardegna del corpus CLaSSES https://revistas.uam.es/chimera/article/view/16052 <div> <p class="CHIMERAabstractkeywordsGrassettoNonGrassetto"><span lang="EN-US">This paper presents the salient features of the Latin inscriptions included in the section <em>Sardinia</em> of the CLaSSES database (<em>Corpus for Latin Sociolinguistic Studies on Epigraphic Texts</em>). This section gathers 1184 inscriptions (14413 tokens) from the island, covering a broad time span (from the first century BCE to the seventh century CE) and including several text types (public and private inscriptions, as well as sacred and funerary texts). The results of our examination helped us to determine the salient features of the variety of Latin spoken in Sardinia, which on the one hand foreshadows the Romance outcomes of the Sardinian varieties and on the other hand enables us to highlight common linguistic features between Sardinia and Africa.</span></p> </div> Lucia Tamponi Copyright 2023 CHIMERA: Revista de Corpus de Lenguas Romances y Estudios Lingüísticos https://creativecommons.org/licenses/by-nc/4.0 2023-06-29 2023-06-29 10 73 87 La gestione del dialogo in lingua straniera https://revistas.uam.es/chimera/article/view/16046 <p>In this contribution we examine dialogic speech in a Foreign Language (FL), comparing the pragmatic strategies adopted by Italian-speaking learners of Spanish and German with those of native speakers of the same languages.
The aim of the work is to investigate whether, and to what extent, the strategies used in the FL show pragmatic patterns attributable to those of the L1 (Italian), to those of the target language (TL: Spanish, German) or, rather, features linked to linguistic and strategic competence in the FL (and thus independent of both the L1 and the TL). We consider the organization of textual structure, the preferences and "dispreferences" shown in introducing and managing discourse topics, and a degree of fluency based on a number of temporal parameters. Our results indicate that FL speech has a less elaborate and more fragmented textual structure than native speech: the topical entities addressed tend to be arranged linearly rather than hierarchically. At the same time, the “conversational games”, although consistently less developed, take on average a larger number of moves to complete, against a background of generally slow processing, lower overall fluency, and difficulty in managing the interaction.</p> Iolanda Alfano Loredana Schettino Copyright 2023 CHIMERA: Revista de Corpus de Lenguas Romances y Estudios Lingüísticos https://creativecommons.org/licenses/by-nc/4.0 2023-07-24 2023-07-24 10 97 120 CHROME Corpus https://revistas.uam.es/chimera/article/view/16013 <p>In this paper we describe the methodology employed for the annotation of a resource developed within the CHROME (Cultural Heritage Resources Orienting Multimodal Experience) project, aimed at the protection and promotion of cultural heritage. More specifically, the ultimate goal of the project is the modelling of multimodal data (including speech features and gestures) for the design of a virtual agent serving in museums and capable of communicating in an intelligible as well as effective and natural way.
In order to grasp the relationship between linguistic and gestural behaviours, multi-level annotation systems have been developed and implemented for the labelling of linguistic and gesture features on different levels of analysis. This article is dedicated to a general presentation of the corpus and to the description of the different levels of linguistic annotation; the final section then offers concluding remarks on the applications of the described methodology. The CHROME corpus and the mark-up methodology described in this work represent valuable multimodal resources for investigations on communicative dynamics which may offer valid support for both theoretical and practical applications.</p> Iolanda Alfano Violetta Cataldo Loredana Schettino Copyright 2023 CHIMERA: Revista de Corpus de Lenguas Romances y Estudios Lingüísticos https://creativecommons.org/licenses/by-nc/4.0 2023-07-24 2023-07-24 10 135 153 Functions of Anticipatory Completion in Mandarin Chinese https://revistas.uam.es/chimera/article/view/16725 <p class="CHIMERAabstractkeywordsGrassettoNonGrassetto"><span lang="EN-US">This study investigates anticipatory completion (Lerner 1991) in Mandarin Chinese, following previous conversation-analytic studies of English and Japanese (Lerner 1991, 1996; Lerner &amp; Takagi 1999; Hayashi 2003, 2017), combined with an examination of prosodic features using Praat (Boersma and Weenink 2018). To examine underexplored questions of joint utterance construction in Mandarin Chinese, 17 natural two-party face-to-face conversation recordings from 30 Mandarin Chinese speakers are utilized. The findings indicate that anticipatory completion can be observed in Mandarin Chinese in diverse activity contexts. The data exhibit six functions. These functions are identified in relation to their different forms and their position within the conversation.
The results show that multiple resources, including syntactic, lexical, and prosodic features, are important for accomplishing anticipatory completion and play a key role in understanding its functions.</span></p> Jia LI Copyright 2023 CHIMERA: Revista de Corpus de Lenguas Romances y Estudios Lingüísticos https://creativecommons.org/licenses/by-nc/4.0 2023-07-24 2023-07-24 10 155 178 Assessing the quality of ChatGPT’s generated output in light of human-written texts https://revistas.uam.es/chimera/article/view/17979 <p>This contribution is exploratory in nature, marking the initial phase of a broader research project aimed at achieving both descriptive and theoretical objectives. The primary goal is to evaluate the ‘quality’ of texts produced by Large Language Models (LLMs). Two key aspects are examined: the quality of generated texts in comparison to human-authored texts and the identification of distinctive features characterizing this emerging text typology. The analysis is centered on textual parameters, encompassing various phenomena related to text segmentation and three dimensions of text organization (the referential-thematic dimension, the logico-argumentative dimension, and the polyphonic-enunciative dimension). Results of different case studies based on a self-assembled corpus of biographies generated by ChatGPT-3.5 and biographies published on Wikipedia are presented.</p> Anna-Maria De Cesare Copyright 2023 CHIMERA: Revista de Corpus de Lenguas Romances y Estudios Lingüísticos https://creativecommons.org/licenses/by-nc/4.0 2023-11-01 2023-11-01 10 179 210 Prosody, gesture, and self-adaptors https://revistas.uam.es/chimera/article/view/18396 <p class="p1">Individuals with Autism Spectrum Disorder (ASD) display distinctive speech patterns and bodily movements.
This pilot study examines spontaneous interactions between an individual with ASD and a typically developing peer (age 19), incorporating monological and dialogical contexts. The analysis, grounded in the Language into Act Theory framework, explores the information structure of the speech and linguistic parameters influenced by prosody, such as utterance boundaries, information structure, speech disfluency, mean length of prosodic units, and speech rate. The study also employs Kita's model to analyze bodily movements, including gestures and self-adaptors, and their temporal relation with speech. Notable findings reveal that ASD speech is characterized by a monotonous information structure and prosodic contour, featuring slower and longer units with limited variation in rate and information type. On the gestural side, the ASD subject exhibits fewer gestures and more self-adaptors, with some instances of asynchrony between gestures and speech. This pilot study serves as a foundational step for a broader corpus-based project dedicated to exploring the development of pragmatic skills in individuals with ASD.</p> Valentina Saccone Giorgina Cantalini Massimo Moneglia Copyright 2023 CHIMERA: Revista de Corpus de Lenguas Romances y Estudios Lingüísticos https://creativecommons.org/licenses/by-nc/4.0 2023-12-20 2023-12-20 10 211 245 Le risorse della banca dati IS-LeGI per lo studio della lingua del diritto https://revistas.uam.es/chimera/article/view/16008 <div> <p class="CHIMERAabstractkeywordsGrassettoNonGrassetto"><span lang="EN-US">For several years, within the framework of institutional research aimed at the documentation of legal language, the Institute of Legal Informatics and Judicial Systems of the CNR has developed an information system (called Semantic Index for the Italian Legal Lexicon, IS-LEGI), used as operational support for the two historical databases: VOCANET-LGI (Italian Legal Language) and LLI (Italian Legislative Language).
A selection of significant terms has been made, to which the prevailing meanings are attributed along with ample relevant phraseology, in order to give a better understanding of the semantic evolution that these entries have undergone over the centuries. Among the selected terms, ‘cittadinanza’ and ‘corruzione’ deserve particular attention, as they make an important contribution to understanding the social changes and cultural influences that have taken place throughout Italy. The system attests the presence of these terms in legislation, doctrine and practice over a time span ranging from 1377 to 1966.</span></p> </div> Francesco Romano Antonio Cammelli Copyright 2024 CHIMERA: Revista de Corpus de Lenguas Romances y Estudios Lingüísticos https://creativecommons.org/licenses/by-nc/4.0 2023-12-22 2023-12-22 10 237 245