Parenthetical Units and Structures in Italian and German spoken language: Prosodic and textual analysis

This work presents the prosodic analysis of Parenthetical Units in spoken Italian and German within the theoretical framework of the Language into Act Theory (L-AcT), which defines such textual units as Parentheses (PARs), i.e., devices used by the speaker to add information to the utterance on a secondary level of the text. Data were extracted from the Italian Minicorpus of the DB-IPIC resource and from the German FOLK corpus. Linguistic content and distribution of Parentheses were analyzed. Furthermore, a subset of approximately 100 PARs was selected for each corpus and these were prosodically analyzed in relation to the mandatory textual units called Comment within the L-AcT framework, which express the illocutionary force of the utterance. The results show similar characteristics for the two languages, such as an overall decrease of f0 values of the PARs and the relationship between the length of the PARs, as well as the presence of additional units separating it from the COM, and frequency values. The authors also noted the presence of Parenthetical Structures that go beyond the level of the simple textual unit, which could suggest that Parentheses are a wider textual strategy operating as inter-utterance.


Introduction
This work presents a description and an analysis of the Textual Unit called Parenthesis in spontaneous spoken Italian and German. The analysis has been carried out on two different corpora: the Italian Minicorpus included in the DB-IPIC resource (Panunzi & Gregori 2012) 1 and a German sample that includes five communicative events chosen from the FOLK corpus (Schmidt 2016) 2. Parentheses are an information hierarchy tool used by the speaker to insert a secondary level of the text. After an overview of their general characteristics, this paper outlines the various functions of Parentheses in both spoken Italian and spoken German, beginning with the definition and descriptions given by the Language into Act Theory (Cresti 2000; Moneglia & Raso 2014) (§2). The analysis then goes into detail on various aspects: a qualitative sample of sequences has been selected from both corpora (nearly 100 Parentheses for each language) (§3) and analyzed in terms of linguistic content through a textual analysis (Ferrari 2014) and in terms of prosodic form using the Praat software and tools (Boersma & Weenink 2005) (§4-5). The results suggest that the Parenthesis is not only a type of Unit that works inside the utterance but also a possible strategy with a wider, inter-utterance scope (§6). A general overview of the results can be found in §7.

Fundamentals
Language into Act Theory (L-AcT) is a theoretical framework that derives its origins from Austin's works on Speech Acts and Illocutions (Austin 1962). Its core is the correspondence between pragmatic and prosodic units in speech and it is based on the empirical observation of linguistic corpora (see the project C-ORAL-ROM 3 ) and tonal contour analysis.
To briefly describe the theory 4 , it is important to outline a few key elements: the central role of the linguistic action performed by the speaker during the communicative act, and the importance of prosody that leads to the fulfillment of the utterance and its decoding by the listener. Accordingly, each utterance expresses an illocutionary value and corresponds to a prosodic unit recognizable in the structure of the language, which allows the segmentation of the speech flow into autonomous sequences (Illocutive Principle); furthermore, each utterance can consist of a pattern of information units whose boundaries correspond to those of a pattern of minor prosodic units (Information Patterning Principle) (Cresti 2000;Cresti & Moneglia 2010).
To sum up, an utterance can first of all be distinguished in the speech flow through its prosodic features. Indeed, every listener perceives prosodic boundaries in the speech flow that divide it into terminated sequences. In addition, an utterance can be composed of one or more Tone Units, each of which has a specific pragmatic function inside the utterance. Only one of them carries the illocutionary value of the utterance: the unit called Comment (COM), which is therefore necessary and sufficient for the accomplishment of the Speech Act 5. The Comment expresses the illocutionary value that allows the interpretation of the utterance, which is possible on the basis of its prosodic contour 6 independently of its morpho-syntactic structure.
Thus, according to L-AcT the Comment is the only necessary element to build and interpret an utterance. Usually, a terminated sequence contains only one Comment. However, corpus-based empirical analyses show cases in which more than one unit carries the illocutionary value. These are two particular prosodic and information structures formed by two or more independent units bearing an illocutionary value: Multiple Comments and Bound Comments.
The former occurs when a spoken sequence contains two or more Comments, each with its own illocutionary force, held together by a single melodic pattern that connects them. Thus, a higher Information Unit is formed, which cannot be interpreted separately and whose components are unified in a coherent prosodic configuration. It creates a sequence of illocutive information units within a compositional structure that produces rhetoric effects such as list, comparison, alternative, and reinforcement relations (Cresti 2000). Each Comment has its own characterization and can be, in most cases, pragmatically interpreted in isolation.

3 C-ORAL-ROM (Cresti & Moneglia 2005) is a European project that led to the creation of a linguistic corpus of spontaneous spoken language. Four different Romance languages are collected in it: Italian, French, Portuguese, and Spanish. The corpus consists of 123 hours, 772 texts and 1,200,000 lexical occurrences (300,000 lexical occurrences per each language); http://www.elda.org/en/proj/coralrom.html
4 For the sake of brevity, refer to the texts quoted in §2.1 for an exemplification of L-AcT theoretical concepts.
5 Around half of the utterances in spontaneous spoken language is composed of a single Comment unit. According to Cresti (2005: 220): 42.88% in Italian, 46.80% in Portuguese, 49.92% in Spanish, and 61.50% in French.
6 Cresti (2000, 2017) describes a taxonomy of five illocutionary classes (assertion, direction, expression, rite, refusal) each of which is divided in more specific subclasses.
The latter is a chain of units sharing a homogeneous and weak illocutionary force, with a continuative prosodic profile so that the Comments in the sequence appear "bound" together. The sequence of Bound Comments is functional to the realization of a unified "story": its purpose is to build an oral text rather than to accomplish a single Speech Act (Panunzi & Scarano 2009). Bound Comments usually characterize monologues and storytelling, in which the exchange between speakers is infrequent. The term Stanza is used when referring to a sequence composed of Bound Comments to differentiate it from the Comment/Multiple Comment-Utterance 7 .
Therefore, a terminated sequence can consist of more than one Tone Unit; when that happens, the listener can also distinguish non-terminal prosodic boundaries inside the terminated sequence. In these cases, the utterance combines different Information Units together: Comment Unit and other Information Units that support the fulfillment of the illocutionary act and the communicative exchange.
Information Units can have Textual or Dialogic functions. Textual Units structure the utterance and determine its semantic features (Comment, Topic, Appendix of Topic or Comment, Parenthesis, Locutive Introducer, Multiple Comment, Bound Comment); Dialogic Units help the progression of the discourse by addressing the listener directly and do not contribute semantic content to the utterance (Incipit, Conative, Phatic, Allocutive and Expressive Units, Discourse Connector). There are also some Units without informational value such as Scanning and Time Taking Units 8 .

Parentheses in Language into Act Theory
Inside the theoretical framework of L-AcT, Parentheses are described among the Textual Units that articulate the flow of speech and they are labeled as PAR. They add information to the utterance and can operate on the expression of modality of the main illocution; these Units give voice to the speaker's evaluation of his own utterance and clarify the speaker's attitude toward the illocution and the utterance. They can be removed from the sound stream without affecting the remaining parts as they are clearly separated from the other Units, sometimes by pauses and f0 shifts. They have free distribution, but they cannot appear at the beginning of a sentence; they can follow or interrupt a Textual Unit such as Topic and Comment Units (Moneglia & Raso 2014). See in (1) an example of Parenthesis in Italian that expands a Topic Unit. Figure 1 shows the prosodic shape; the PAR Unit is indicated by the darker rectangle:

(1) *LID: allora / INP Pallino / TOP il gatto l'è straordinario / PAR si mette / SCA accanto anche lui alla cagna / COB e guarda 'n su perché aspetta &lu [//1] EMP anche lui qualche cosa // COM (ifamdl02_340) [link to 1.wav]
'so / Pallino / this cat is amazing / also stands / next to the dog / and looks up because he too is expecting / something //'

As can be seen in Figure 1, the intonation curve in the darker rectangle (PAR) occupies a lower frequency level than the other Information Units 10; in this case, the PAR is clearly recognizable in the prosodic contour not only for the significant reset of f0 but also for the presence of the two pauses (p) that frame it. See in (2) an example of Parenthesis in German, in which two units of this type follow each other:

(2) *AK: und ziegenhagen is der kleinste / i-COM (.) war &äh / PAR (.) (heute/hotel) ham sie au alles weggemacht // PAR aber is immer noch der kleinste / SCA (.) luftkurort hessens (.) // COM (FOLK_E_00147_c689_03) [link to 2.wav]
'and Ziegenhagen is the smallest / i-COM it was / PAR today / hotels have all been closed / PAR but it is still the smallest / SCA Luftkurort of Hesse // COM'

Figure 2 shows the prosodic shape of the PARs, which are indicated in the rectangle:

10 The tags INP, TOP, SCA and COB stand for Incipit, Topic, Scanning Unit and Bound Comment. In the following examples tags will be used but not specified.

In-depth studies on Parentheses in the L-AcT theoretical framework have been carried out by Tucci (2010) and Firenzuoli & Tucci (2003), who have defined Parentheses according to functional, prosodic, and distributional criteria analyzing the Italian section of the C-ORAL-ROM corpus 11 (Cresti & Moneglia 2005). In these studies, metanarrative, modal and metalinguistic functions are underlined and identified as the main features of the Parentheses. From a prosodic point of view, these studies stress the significant decrease in the average values of f0 compared to the previous Unit, a mostly flat profile with a possible final upward tail which brings the f0 back to the values of the main level of the utterance, and a change of speech rate. Tucci also points out that the Parenthesis does not establish syntactic relationships with the utterance that hosts it, neither a hypotactic nor a paratactic one; the PAR is instead linearly added to the utterance as a structure in itself, not as a constituent. The Parenthesis is, therefore, a prosodically independent and syntactically autonomous insertion, semantically related to the content of the context, with the scope on the previous part of the sequence, as already seen in (1). The scope can also be on the subsequent part, as in the following example. In (4) the adverb sostanzialmente (basically) is related to the predicate expressed in the COM just after the PAR sia sottomano (is on hand).

Corpora and samples
The analysis presented here concerns two corpora of spontaneous spoken language.

The Italian sample
The Italian corpus is a section of the corpus collected in DB-IPIC (Panunzi & Gregori 2012), a database of spontaneous spoken Italian of 124,735 lexical occurrences and 20,835 terminal sequences. It includes informal speech sessions in both family and public contexts, in the form of monologues, dialogues between two speakers, and conversations with more than two participants. The database aims at analyzing the variation of speech structures in face-to-face interactions; it collects audio recordings and their transcriptions chosen from the Italian section of C-ORAL-ROM. All the recordings are annotated with PoS tags, terminal and non-terminal prosodic breaks, and the labels of the Information Units according to L-AcT (Cresti 2000) and the Informational Patterning Theory (Cresti & Moneglia 2010).
The section chosen for the study is a subset of informal Italian labeled as Minicorpus, which consists of 20 texts (32,589 words and 5,663 terminated sequences). Considering the already existing tags, the Minicorpus hosts 328 terminated sequences that contain Parentheses. Initially, the sequences were analyzed in order to recognize the pragmatic and semantic value of PAR units (§4.1) and describe their distribution (§4.2). A subset of 100 Parentheses was then selected on the basis of the best acoustic quality in order to describe the prosodic behavior of these tone units (§4.3).

The German sample
The German corpus is a section of the FOLK corpus (Forschungs- und Lehrkorpus Gesprochenes Deutsch), a corpus of spontaneous spoken German whose data have been collected since 2008 by the Department of Pragmatics of the present Leibniz-Institut für Deutsche Sprache of Mannheim (Schmidt 2016). The resource is freely available on the DGD (Datenbank für Gesprochenes Deutsch) website in its latest version (2.14), published on 27 April 2020. FOLK primarily addresses researchers and teachers in the fields of conversation research, corpus linguistics and related approaches, with the aim of providing the scientific community with a corpus of German spontaneous speech as varied as possible in terms of both the type of recorded interactions and the documented communicative contexts.
In its current version, the FOLK database comprises 332 communicative events involving a total of 1,131 speakers. The recordings have a total duration of 285 hours and 39 minutes, and the transcriptions consist of 2,719,948 words. The communicative events of FOLK (Ereignisse) are organized and classified according to the domain of interaction (Interaktionsdomäne), the linguistic region of reference (Dialektregion) and the type of interaction (Art). The corpus is transcribed according to GAT 2 (Gesprächsanalytisches Transkriptionssystem 2) transcription conventions (Selting et al. 2009) and PoS-tagged according to an extended version of the Stuttgart-Tübingen-Tagset (STTS). In the present work, however, the authors have chosen to apply the transcription and annotation conventions of the CHAT-LABLITA format.
As was done for the Italian section, the German sequences were analyzed with the aim of recognizing the pragmatic and semantic value of the PAR unit (§5.1) and describing their distribution (§5.2). A subset of 94 Parentheses was then selected according to their acoustic quality in order to describe the prosodic value of PAR units (§5.3).

4. Analysis of the Italian sample

Linguistic content of Parenthesis
A semantic and functional analysis was carried out on the Parentheses of the Italian Minicorpus. The investigation enabled the classification of the linguistic content that the Units typically host, which is mostly: exemplifications as in (5), generalizations (6), and explanations (7) or recapitulative expressions (8). Thus, Parentheses can establish logic-argumentative relations with the context and contribute to the textual and hierarchical composition of the sequences 12 .
(5) *ANT: perché / DCT con i miei amici / TOP ne ho qua / SCA due o tre carissimi / PAR non voglio / SCA andare a vivere insieme / COB perché + EMP (ifamdl05_89) [link to 5.wav]
'because / with my friends / I have here / two or three very close ones / I don't want / to move in together / because +'

(6) *LID: allora / INP Pallino / TOP il gatto l'è straordinario / PAR si mette / SCA accanto anche lui alla cagna / COB e guarda 'n su perché aspetta &lu [//1] EMP anche lui qualche cosa // COM (ifamdl02_340) [link to 6.wav]
'then / Pallino / this cat is amazing / also stands / next to the dog / and looks up because he too is expecting / something //'

There are also Parentheses that report an attenuation, a weakening of the main illocutionary force, as in (9):

(9) *ELA: no &que [//2] EMP queste persone / TOP a quanto ho capito / PAR però erano abbastanza giovani // COM (ifamcv01_576) [link to 9.wav]
'no the / these people / as far as I understand / but they were quite young //'

and Parentheses that signal the speaker's uncertainty in "finding the words", leading to a proper illocutionary change with respect to the level of the main utterance 13, as in (10):

(10) *VAL: 'so / let's say / from the point of view of / of work / it's not so much / let's say / how can you tell / correct//'

Lastly, the label PAR is used in the Minicorpus for the verbs that frame a direct speech in the inserted position, that is, within reported words; this kind of Parenthesis is clearly distinguished from the others in terms of function. What unites these occurrences with the others is mainly the operation of distinguishing between levels of speech, in this case between the reported speech and the direct speech ("quoted" and "quoting" words), creating a parenthetical frame of the reported speech (Calaresu 2000), as in (11) with the verb dice ('says'):

(11) *AND: o anche / i-COM_r dice / PAR se ce l'ha di fiume ... COM_r (ifamcv28_320) 14 [link to 11.wav]
'or even / he says / if he has it from the river …'

Distribution of Parentheses in the Italian Minicorpus
Parentheses occur in about 6% of the terminated sequences of the DB-IPIC Minicorpus. They can follow or interrupt Comment or Topic Units and can present additional Parenthetical Units within them (one level within the other). From a distributional point of view, their preferred position is within the sequence that hosts them, but they can also be found in the final position. The resulting data are collected in Table 1; the inserted position includes cases in which the PAR interrupts a Unit (typically a Comment or a Bound Comment), while those Units that are at the end of interrupted sequences, therefore not finished, are part of the number of PARs 15 in the final position.

Thus, Parentheses are mostly seen in the inserted position, that is, inside a terminated sequence, and part of them interrupt another Unit. Parentheses are not distributed in a homogeneous way between Utterances and Stanzas (terminated sequences in which the illocutionary value is conveyed by two or more Bound Comments, as previously mentioned). In fact, Parentheses occur in 4% of the Utterances and 17% of the Stanzas (which correspond to about 10% of the Minicorpus). The different percentage highlights that, despite the illocutive and prosodic monotony of Bound Comments, the speaker uses Parentheses as a means to create a prosodic and textual hierarchy even inside Stanzas.
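The overall proportion can be checked against the counts reported in the description of the Italian sample (328 terminated sequences containing Parentheses out of 5,663). The following is only a minimal sketch of that arithmetic; the function name `share_percent` is illustrative and not part of the original study.

```python
def share_percent(part: int, total: int) -> float:
    """Return part/total expressed as a percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * part / total


# Counts reported for the Minicorpus in the sample description.
sequences_with_par = 328
terminated_sequences = 5663

# Prints 5.8%, i.e. the "about 6%" quoted in the text.
print(f"{share_percent(sequences_with_par, terminated_sequences):.1f}%")
```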

Prosodic analysis
To carry out a more detailed analysis of their prosodic characteristics, a sample of Parentheses was selected from the Minicorpus: according to the criterion of the best possible acoustic quality 16, 100 PARs were chosen, 53 of which immediately follow the Comment while the others interrupt it. The study aims to describe the PAR's behavior, and thus a generic reference to the nuclear Units was used (COM label) regardless of whether they were Bound, Simple, or Multiple Comments 17.
The analysis was conducted adopting, with few changes, a Praat script available online 18, which made it possible to automatically obtain durations and f0 measures (average, maximum and minimum) for each Unit. Data were used to calculate Δf0 mean and f0 range (mean and standard deviation) of the PAR Units with respect to the COM to which they refer. We report the results in the table below; the percentages are to be understood as relative values of the PAR with respect to the related COM:

These measures highlight that the Parentheses' f0 decreases on average by 11.5% with respect to the COM, with a clear f0 shift. Furthermore, the f0 variation of the PAR Unit is similar to the one of the COM, as shown by the average of PAR f0 range which corresponds to 93.6% of the COM range; a wide f0 range seems to be relevant in comparing PAR Unit to other Textual Units, such as the Appendix, also characterized by an f0 decrease. As underlined by the standard deviation (127.9), this percentage is highly variable, signaling that the PAR set is a heterogeneous group of units, as exemplified in the comparison shown in Figure 3 19:

The last column of Table 2 reports the duration in milliseconds of the pauses after PAR, present in 21 cases out of 100: only two of these are shorter than 250 ms, i.e. below the threshold indicated as perceptually relevant silence in Moneglia (2005); 15 of the 21 pauses appear in the subgroup of Parentheses in the inserted position, as in the following example:

Relevant and worthy of further study is the prosodic contour of the Comments that are interrupted by a Parenthesis. The analysis has highlighted the decrease of the f0 mean in the second part of the Comment in 35 sequences out of the 48 in which the PAR is in an inserted position within the COM. Looking at the mean values of the Comment's f0 mean before and after the PAR, we see on average a decrease of 13.8%, greater than the decrease between COM and PAR that was previously seen (11.5%).
In fact, half of the analyzed sequences have the following feature: the continuation of the COM shows a decreasing f0 mean not only compared to the first interrupted part but also compared to the preceding part. The difference in f0 means between the two parts of the interrupted COM is not homogeneous in the sample: the percentage varies significantly, from 46.5% to 1%, distributed as shown in Table 3.

The largest group (15 sequences) shows a decrease similar to that of PAR; the second bar indicates a slightly smaller but compact group (12 sequences) in which the f0 of the second part of the COM is up to 20% lower than that of the first part; the remaining cases, as shown in the last two lines of the table, are marked by an even more drastic decrease of f0, over 20% of the range of the first part of the COM. A group of 13 cases does not appear in Table 3. Here, the f0 mean of the COM does not decrease after the interruption of the PAR; rather, it increases (up to 33.6% compared to the first part of the unit). In these cases, a relationship was observed with the following parameters: the duration and number of Parentheses that interrupt the Comment; the presence of other Units, such as Scanning Units (of the PAR) or Time Taking Units between the two parts of the COM; the presence of pauses. In other words, in this group the f0 mean of the interrupted COM increases after the Parenthesis and this behavior depends on the characteristics of the Parenthesis itself: if the Parenthesis is very long or presents pauses and Time Taking Units, it involves an increase of f0 in the next part of the COM.

19 Cfr. short and long PAR in Santos & Bossaglia (2018).
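The relative measures used in this section can be sketched as follows. This is a hedged illustration, not the authors' Praat script: the formulas are an assumption (Δf0 taken as the PAR's decrease in f0 mean expressed as a percentage of the COM's f0 mean, and the range ratio as the PAR's f0 range expressed as a percentage of the COM's f0 range), the names `UnitF0`, `delta_f0_percent` and `range_ratio_percent` are hypothetical, and the numeric values are invented for the example.

```python
from dataclasses import dataclass
from statistics import mean, stdev


@dataclass
class UnitF0:
    """f0 measures (in Hz) for one Information Unit."""
    mean_hz: float
    min_hz: float
    max_hz: float

    @property
    def range_hz(self) -> float:
        return self.max_hz - self.min_hz


def delta_f0_percent(par: UnitF0, com: UnitF0) -> float:
    """Assumed definition: positive when the PAR mean is lower than the COM mean."""
    return 100.0 * (com.mean_hz - par.mean_hz) / com.mean_hz


def range_ratio_percent(par: UnitF0, com: UnitF0) -> float:
    """Assumed definition: PAR f0 range as a percentage of the COM f0 range."""
    return 100.0 * par.range_hz / com.range_hz


# Invented (PAR, COM) pairs for illustration only; not corpus data.
pairs = [
    (UnitF0(180.0, 150.0, 220.0), UnitF0(200.0, 160.0, 260.0)),
    (UnitF0(190.0, 170.0, 230.0), UnitF0(210.0, 180.0, 240.0)),
]

deltas = [delta_f0_percent(par, com) for par, com in pairs]
ratios = [range_ratio_percent(par, com) for par, com in pairs]

print(f"mean delta f0: {mean(deltas):.1f}%")
print(f"mean range ratio: {mean(ratios):.1f}% (sd {stdev(ratios):.1f})")
```

With this sign convention, a negative Δf0 would indicate a PAR whose f0 mean is higher than that of the related COM.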

5. Analysis of the German sample

Linguistic content of Parenthesis 20
A semantic and functional analysis of the Parentheses of the German section under consideration was carried out. The contents of these units were mostly classified as: exemplifications, generalizations, and explanations or recapitulative expressions.
Below are examples of each of the mentioned functions 21. Example (13) shows a case of exemplification:

'and we were also always open to other people and we had also / which was special / that parents or no father and son or whatever / well / it was possible to play cross-generationally /'

In (15) we show an example of explanation and recapitulative expression:

(15) *JR: ja / PHA °h der eine / i-COM gut / PHA der eine war generalquartiermeister // COM er bat &äh / EMP also grob gesagt / PAR &äh bisschen vereinfacht gesagt auch en general / PAR °h und sie heiraten alle drei ausländische frauen / COM ds find ich auch bemerkenswer<t / PAR eine> französin / CMM ne holländerin / CMM und der dritte die schwedin // COM (FOLK_E_00339_c625_03) [link to 15.wav]
'yes / and one of them / well / one was a Quartermaster General // he asked / so roughly speaking / a bit simplified, a general too / and they all married three foreign women / which I also find remarkable / a Frenchwoman / a Dutchwoman / and the third the Swede //'

A further function that emerged during the analysis is that of mitigation and attenuation of the main illocutionary force, as shown in (16) and (17):

'it was great // was it // waves so / as high as a house / you can't imagine such a thing / and we / we were forty-eight / men on so a little boat that / how was it ? twelve meters wide / forty meters long //'

Finally, there are also cases of Parentheses signaling the presence of direct reported speech. This kind of Parenthesis occurs between reported words and, as already mentioned for the Italian section (§4.1), its function is to distinguish between different levels of speech, the "quoted" and "quoting" one, as shown in the following example with the verb sagt ('says'):

'I recently met a postmistress who told me / an e-Bike is good as far as it works / but when something doesn't / she said / you are really in trouble // because it / because it's too heavy //'

Distribution of Parentheses in the German sample
Overall, 306 Parentheses were found in the five communicative events selected for the analysis 22. With regard to their position, they can be inserted within a host sequence, integrating a Comment or a Topic Unit, or placed at the end of the utterance. Parenthetical Units can also present other Parentheses within them. Most of the identified Parentheses occur within a sequence, although some appear at the end of the utterance as well. The resulting data are collected in Table 4; the inserted position includes both cases in which the Parenthesis interrupts a Unit and cases in which the preceding Unit is not interrupted, while the final position only includes cases in which the Parentheses are at the end of the sequence.

Prosodic analysis
The prosodic analysis of Parenthetical Units was carried out on a subset of 94 Parentheses selected from the five communicative events considered. The group in question consists only of Parentheses that extend the Comment Units and that are characterized by the best audio quality within the sample. Within the subset taken into account, 36 Parentheses follow the Comment Unit, while 58 interrupt it. As was done for Italian, no differentiation was made between Comment Units, Multiple Comments and Bound Comments, since the analysis concerns parenthetical structures only. In addition, no significant difference between the aforementioned units with regard to their relationship to Parentheses has been detected so far.
The analysis was carried out by manually obtaining the f0 (average, maximum and minimum) values using Praat. These were used to calculate mean Δf0 and range of f0 (mean and standard deviation) of the PAR Units with respect to the related COM. The results are reported in the following two tables, with a distinction between cases in which the PARs are placed at the end of the utterance (Table 5) and cases in which the PARs interrupt the related Comment Unit (Table 6); the percentages are to be understood as relative values of the PAR with respect to the COM that they refer to:

The measurement results show that the f0 of the Parentheses decreases on average by 17.62% with respect to the COM when the PAR Unit is placed at the end of the sequence, while it decreases on average by 4.28% when the Parenthesis interrupts the COM. One aspect worth highlighting is the fact that in 17 cases the mean Δf0 value was found to be negative 23. In particular, these are all cases in which PAR Units interrupt the COM and are inserted within it, as in Figure 4:

In such cases, the mean f0 value of the PAR is higher than that of the second part of the COM, although the latter has a wider pitch range, with respectively higher and lower maximum and minimum f0 values than those of the PAR. This explains the existing difference between the percentage of decrease in terms of mean f0 values of the PARs at the end of the utterance and the percentage of intra-sequence PARs, which was in fact found to be smaller. We report in Table 7 the maximum and minimum Δf0% of the PARs' values with respect to the related COM, according to the position of the PAR Units:

When looking at the relationship between i-COMs and COMs, in more than half of the analyzed cases the second part of the COM shows a decrease from the first interrupted part. On average, we see a decrease of 5.04%, which is similar to the decrease between COM and PAR previously mentioned (4.28%).
The percentage difference in f0 mean between the two parts of the interrupted COM is not homogeneously distributed in the subset and varies significantly, from 28.65% to -26.95%. Nevertheless, in 17 cases the f0 mean of the second part of the COM was found to be higher than that of the first part of the unit, with an increase of up to 21.23%. Except for three cases, the cases with these characteristics show a relationship with: the presence of either a particularly long Parenthesis or more than one consecutive Parenthesis; the presence of other units between the i-COM and the COM, such as pauses, Time Taking Units or Scanning Units. Therefore, it was found that, if the interrupted part of the COM and the second part are separated by long and substantial material, then there is an increase in the f0 mean of the COM's second part after the Parenthesis.
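The comparison between the two parts of an interrupted Comment can be sketched in the same spirit. The snippet below is a hypothetical illustration only: it assumes that the change is expressed as a percentage of the f0 mean of the first part (positive for a decrease, negative for an increase), and the field names and values are invented rather than drawn from the FOLK sample.

```python
from typing import Dict, List, Tuple


def com_change_percent(first_part_hz: float, second_part_hz: float) -> float:
    """Assumed definition: positive = the second part of the COM is lower."""
    return 100.0 * (first_part_hz - second_part_hz) / first_part_hz


def summarize_rises(cases: List[Dict]) -> Tuple[int, int]:
    """Count f0 rises after the PAR and how many co-occur with intervening material."""
    rises = [c for c in cases if com_change_percent(c["first"], c["second"]) < 0]
    with_material = sum(1 for c in rises if c["intervening"])
    return len(rises), with_material


# Invented cases: f0 means (Hz) of the i-COM and of the resumed COM, plus a flag
# for long or multiple Parentheses, pauses, Time Taking or Scanning Units in between.
cases = [
    {"first": 200.0, "second": 180.0, "intervening": False},
    {"first": 200.0, "second": 220.0, "intervening": True},
    {"first": 150.0, "second": 147.0, "intervening": False},
]

n_rises, n_with_material = summarize_rises(cases)
print(f"{n_rises} rise(s), {n_with_material} with intervening material")
```

A tabulation of this kind only describes co-occurrence; it does not by itself establish that the intervening material causes the rise.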

Parenthetical Structures
Parentheses are not the only units in L-AcT characterized by a decrease of f0 mean compared to the Comment. Another unit called Appendix shows a similar prosodic behavior (Saccone 2021): both of them show a lowering of f0 and a variation of speech rate, but the Parenthesis gives the speaker the possibility of structuring the textual architecture of the Utterance through hierarchically embedded semantic levels and can operate on the text through illocutionary changes; furthermore, the wide variability of f0 brings the prosodic behavior of Parentheses closer to that of Comments than to that of the Units supporting the semantic core of the utterance, such as the Appendix. This suggests a greater semantic and pragmatic strength of Parentheses as Textual Units. These remarks lead to the idea of Parentheses as a strategy, a specific structure that can operate on a wider level than that of a Textual Unit, reaching the dimension of an entire terminated sequence.
As the analysis shows, Parentheses allow the speaker to overcome the limits of the linearity of speech through a change of informational level, which typically corresponds to a prosodic, and syntactic (Schneider 2014), interruption. The characteristics of the Parenthesis noted so far suggest an autonomy that distinguishes it from other Textual Units: the Parenthesis creates a hierarchical level that is lower than that of the Comment, sometimes exhibiting particular illocutionary nuances, and can be internally articulated.
A comparison between PARs in L-AcT and Parentheses as described in the Interfaccia Model (Ferrari 2014) suggests an interpretation of the Unit as an Inserted Utterance, therefore totally autonomous from the sequence that hosts it: describing the Utterance and its boundaries in written text, Ferrari & Zampese (2016) highlight the presence of Utterances with the status of a Parenthesis that gives rise to a deeper level than the main level of the text, which can be deleted without affecting the coherence of the main architecture. They can have the function of Illustration, Specification or Reformulation of the central content of the text, and they are not a constituent of the main Utterance: they have enunciative, and therefore illocutionary, autonomy and allow the creation of a new and deeper level of the text; otherwise, they affect the illocutionary force of the main Utterance and act on it as a metalinguistic comment 24.
From the empirical observation of the corpus, through an analysis of the audio recordings in terms not only of the utterance but also of the textual dimension of the monologic macrostructure, some complex structures have emerged that show characteristics of the Parenthesis and go beyond single intonation/information units 25. Indeed, they correspond to one or more terminated sequences of comment, explanation, or background information; they are inserted between other sequences (both Utterances and Stanzas) and add information at a deeper enunciative level of the text than the main one, marked by lower levels of f0 mean. Using these structures, the speaker finds a way of introducing further necessary information, based on the knowledge that the listener has about the communicative context, a function considered to be the main feature of the Parenthesis 26, in line with the description of excursus as macro-parentheses by Sornicola (1981). Indeed, Sornicola describes the excursus as a typical macro-structural effect of speech planning in "chains", close to the Bound Comment chain found in L-AcT, and considers the excursus itself as a macro-progression of the Information structure of the discourse, a transition structure.
24 Cfr. Cignetti (2011).
25 Santos & Bossaglia (2018) suggest the presence of long Parentheses borrowing an illocutionary value.
26 Cfr. Dehé & Kavalova (2007).
To bring out the characteristics of Parenthetical Structures, a detailed analysis has been conducted on a sample of monologues recorded in a family context from DB-IPIC and on an interview from FOLK. They are described below through one communicative event per language.

Parenthetical Structures in Italian
The audio choice was dictated by the quality of the recordings; to eliminate the variability of f0 due to the change of speaker, monologic interventions were selected and the rare interventions of other speakers during the recordings were cut out.
The audio files were segmented in Praat by marking the pauses and perceptually assigning each stretch of the recording to the inserted or the main textual level. Every segment was analyzed in order to calculate its duration and f0 measures (minimum, maximum, mean, and slope) and to observe its prosodic behavior; using this method, it was possible to group adjacent segments with a homogeneous f0 mean. Figure 5 schematically illustrates the succession of different levels of f0 mean in the Italian monologue ifammn15 27 . The horizontal axis shows the duration of the segments in seconds (s); the vertical axis shows the f0 mean values in Hertz (Hz) 28 .
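The per-segment measures and the grouping step described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the procedure actually used in the study: it assumes that the voiced f0 samples of each segment have already been exported from Praat as (time, f0) pairs, that "slope" means the least-squares regression slope of f0 over time, and that "homogeneous" means staying within a fixed tolerance of the running group mean. The function names, the 10 Hz tolerance, and the sample values are all hypothetical.

```python
from statistics import mean


def f0_measures(times, f0):
    """Return min, max, mean and slope (Hz/s) for one segment's voiced f0 samples."""
    t_bar, f_bar = mean(times), mean(f0)
    # Assumption: slope is the least-squares regression slope of f0 over time.
    denom = sum((t - t_bar) ** 2 for t in times)
    num = sum((t - t_bar) * (f - f_bar) for t, f in zip(times, f0))
    slope = num / denom if denom else 0.0
    return {"min": min(f0), "max": max(f0), "mean": f_bar, "slope": slope}


def group_by_f0_mean(segment_means, tolerance_hz=10.0):
    """Group adjacent segments whose f0 mean lies within tolerance_hz
    of the mean of the group built so far (hypothetical criterion)."""
    groups = []
    for m in segment_means:
        if groups and abs(m - mean(groups[-1])) <= tolerance_hz:
            groups[-1].append(m)
        else:
            groups.append([m])
    return groups


# Invented values, for illustration only.
measures = f0_measures([0.0, 0.1, 0.2, 0.3], [120.0, 118.0, 116.0, 114.0])
groups = group_by_f0_mean([130.0, 128.0, 110.0, 112.0, 131.0])
print(measures)
print(groups)
```

In this sketch a group boundary is placed wherever the f0 mean of a segment departs from the current group by more than the tolerance, so a group may span several utterances or only part of one.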
Note that the grouping of units does not directly correspond to terminal boundaries: each group can be wider than an utterance/stanza or can correspond to a set of units inside an utterance/stanza. It is possible to identify clear decreases in the level of f0 mean inside the monologues. The decreases correspond to lower information levels which are perceptually distinguishable: in other words, it is possible to perceive, and therefore distinguish, a main information level (M) of the discourse and an inserted one (I). The acoustic parameters thus provide a formal confirmation of the perceptual distinction between the two information levels. Data reported in Table 8 show that the inserted level has lower values than the main one in terms of f0 mean, slope, and f0 variation (see the values of the standard deviation in brackets). More variable values have been found in monologues that contain speech by other interlocutors or laughter, both factors that affect the width of the f0 range.
Furthermore, the decrease of the f0 mean between the two levels, main and inserted, has been calculated in order to compare the result with the findings of the study on PARs inside Utterances and Stanzas. The data for the three monologues are shown in Table 9. The percentages in the ifammn15 column refer to the difference in the f0 mean between the main level and the inserted one. The trend of ifammn15 best reflects the results obtained from the analysis of the Parenthesis as a Unit; the percentage increases in the other two monologues. The monologues analyzed here are a much more homogeneous sample than the one built for the analysis of single PARs, as shown by the standard deviation values, given that in the second study one speaker is analyzed at a time.
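The comparison between the two levels can be illustrated with a minimal Python sketch. It rests on two assumptions not stated in the text: that the standard deviation is the sample standard deviation of the per-segment f0 means, and that the percentage decrease is expressed relative to the f0 mean of the main level. The function names and the numbers are invented for illustration and are not taken from Tables 8 and 9.

```python
from statistics import mean, stdev


def level_summary(segment_means):
    """Mean and sample standard deviation of the f0 means of one textual level."""
    return mean(segment_means), stdev(segment_means)


def percent_decrease(main_mean, inserted_mean):
    """Decrease of the inserted level relative to the main level, in percent.
    A negative result means the inserted level is higher than the main one."""
    return (main_mean - inserted_mean) / main_mean * 100.0


# Invented per-segment f0 means (Hz), for illustration only.
main_level = [124.0, 126.0, 125.0]
inserted_level = [109.0, 111.0, 110.0]

main_mean, main_sd = level_summary(main_level)
ins_mean, ins_sd = level_summary(inserted_level)
print(f"M: {main_mean:.1f} Hz ({main_sd:.1f})  I: {ins_mean:.1f} Hz ({ins_sd:.1f})")
print(f"decrease: {percent_decrease(main_mean, ins_mean):.1f}%")
```

Keeping the sign of the difference makes it possible to report cases in which the inserted level turns out to be higher than the main one.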
Below is an example of Parentheses as a macro-structure that builds an inserted information level with specific textual characteristics. The Parenthetical Structures detected can take the form of Utterances, simple or compound, or of Stanzas, or they can group, as already mentioned, more than one terminated sequence. The latter is the case exemplified in (20), an extract taken from ifammn15. The expressions with a lower f0 mean are in bold. [...] (Ferrari 2014). The speaker here describes his teaching experiences and emphasizes the importance of 'developing a personal path' for every student. Stanza [18] gives support to this idea with an example. Note that the beginning of Stanza [21] is lexically marked by the incipit per tornare al discorso ('going back to the subject', underlined in the example), i.e. resuming the previous subject. With this incipit the speaker signals the end of the digression in the inserted level of the discourse and the return to the previous theme, and thus to the main level of the discourse.
In the proposed example, it seems reasonable to analyze the monologue as a coherent and uniform text that can be studied through the tools of Textual Linguistics. This is made possible by observing and interpreting the semantic and pragmatic aspects of the transcription, but also by analyzing the prosodic behavior of the recordings. In fact, the prosodic analysis shows that prosodic boundaries can signal different textual planes, such as the inserted and the main one, and therefore define a textual hierarchy.

Parenthetical Structures in German
By analyzing one communicative event from the FOLK corpus, namely FOLK_E_00339, it was possible to identify structures whose characteristics are similar to those of the Parenthetical Structures identified in Italian, although a larger amount of data should be analyzed in order to draw comprehensive and reliable conclusions. These structures thus appear to take shape as a broader textual strategy, going beyond that of the simple Information Unit and, from the prosodic point of view, they appear as one or more terminated sequences inserted between other terminated sequences and performing a function similar to that of the PAR Units, as they introduce more details, a comment or an explanation of what is said in the main textual level.
Overall, five such structures were found. Below is an example in which the Parenthetical Structure consists of a comment on what is expressed in the main textual level:
'there is also a certain concentration that I find in prayer // and a certain calmness that I also find [/3] a relaxation that I also find in prayer // now I'm not saying that one should try something else // or that I would now wander into Buddhism or / I don't know where // I have that very firmly imprinted in my mind //'
Parenthetical Structures were analyzed using Praat by manually and perceptually distinguishing an inserted and a main textual level. The f0 values (mean, maximum and minimum) were extracted. Tables 10 and 11 show the extracted values for each of the five segments. Table 12 shows the decrease of f0 mean between the main and the inserted level. As can be seen, the values are quite varied, and in one case the f0 mean of the inserted level is higher than that of the main one, so that the difference between the two textual levels is negative.
The difference between the two prosodic levels is reported schematically in Figure 6. What seems to be characteristic of this type of structure is the prosodic conclusion: Parentheses that have been identified as Parenthetical Structures are delimited by terminal prosodic breaks. Moreover, they mostly act as comments on previously introduced information, adding extra details or a personal consideration of the speaker.

Comparisons and general results
The aim of this study was to investigate peculiar features of Parentheses in spontaneous spoken language, starting from the descriptions and characteristics given by the theoretical framework of Language into Act Theory. This analysis was carried out on two samples of Italian and German taken from DB-IPIC and FOLK corpus from a double perspective: a first semantic and functional perspective and a second formal and acoustic one. The analysis of the two samples shows similar behavior and features of Parentheses as an Information Unit inside a terminated sequence.
To sum up, the observation of the distribution of Parentheses shows a preference for the inserted position, that is, inside a terminated sequence or even inside another Unit, over the final position.
From a semantic point of view, Parentheses typically correspond to exemplifications, generalizations, explanations or recapitulative expressions; they can operate on the illocution of the utterance as a weakening or an attenuation. As for the prosodic analysis, Parentheses are marked by an average f0 decrease of more than 10% (IT: 11.5%; G: 17.62%) with respect to the COM. The two languages differ in the f0 range of the PAR: the f0 variation of the PAR Unit is similar to that of the COM in the Italian sample (93.6% of the COM range) but not in the German sample, where the percentage is 54.5% for PARs in the final position and 68.5% for PARs in the inserted position.
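The range comparison reported above can be illustrated with a short Python sketch. It assumes that the f0 range of a unit is the difference between its maximum and minimum f0 in Hz and that the PAR range is expressed as a percentage of the COM range; the text does not specify the scale, so this is an assumption. The function name and the sample values are hypothetical and are not taken from the corpora.

```python
def range_ratio_percent(par_min, par_max, com_min, com_max):
    """f0 range of the PAR as a percentage of the f0 range of the COM.
    Assumption: ranges are measured in Hz as maximum minus minimum."""
    com_range = com_max - com_min
    if com_range <= 0:
        raise ValueError("the COM must have a positive f0 range")
    return (par_max - par_min) / com_range * 100.0


# Invented values (Hz), for illustration only.
print(range_ratio_percent(100.0, 150.0, 100.0, 200.0))
```

A value close to 100% indicates that the PAR moves through a range comparable to that of the COM, while lower values indicate a compressed range.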
A relevant result concerns Parentheses that interrupt a Comment, whose prosodic behavior is much more variable than that of the other PARs, as the German data in particular show. Furthermore, the presence of a PAR results in a lower f0 mean in the interrupted COM after the Parenthesis. The exceptions found depend directly on the characteristics of the Parenthesis itself: if it is very long or contains pauses and Time Taking Units, it is followed by an increase of f0 in the next part of the COM.
Analyzing the sequences and observing the general characteristics of the audio recordings with a wider scope, it was possible to recognize particular structures, especially inside monologues and monologic parts of communicative events, that share both semantic and prosodic features with Parenthetical Units. From this perspective, both samples have been analyzed, and the analysis brought out the presence of Parenthetical Structures: whole terminated sequences, or more than one sequence, united by a similar f0 mean that is lower than the general f0 level. In other words, there are segments of the discourse that can be recognized as inserted in the main textual and prosodic level, which usually collect background information or exemplifications and enrich the main discourse. From the prosodic perspective, the same decrease of f0 mean just mentioned for Parenthetical Units has been measured, and it directly correlates with a different textual function of the sequences under discussion.
Thus, Parentheses can structure the speech flow into different levels, both textual and prosodic, beyond the dimension of the utterance.