The aerodynamics of voiced stop closures

Jesus, Luis M. T.; Costa, Maria Conceição

doi:10.1186/s13636-019-0162-z

Research
Open access
Published: 28 January 2020

EURASIP Journal on Audio, Speech, and Music Processing volume 2020, Article number: 2 (2020)

Abstract

This paper presents experimental data combining complementary measures based on the oral airflow signal, exploring the view that European Portuguese voiced stops are produced in a similar fashion to those of Germanic languages. Four Portuguese speakers were recorded producing a corpus of nine isolated words with /b, d, ɡ/ in initial, medial and final word position, and the same nine words embedded in 39 different sentences. Slope of the stop release (SLP), voice onset time (VOT), release and stop durations and steady-state oral airflow amplitude characteristics preceding and following the stop were analysed. Differences between independent groups (three different places of articulation and two vowel contexts) and correlations between variables were studied; generalised linear mixed effects models were developed to study the effects of VOT, SLP and the factors place of articulation and vowel context on the mean oral airflow. A classification of the stops’ voicing was automatically extracted. Both SLP (p = .013) and VOT (p = .014) were significantly different for the three places of articulation. Weak voicing was observed for 57% of the stops. It is hypothesised that the high percentages of weakly voiced stops are a consequence of passive voicing and that the feature of contrast in Portuguese is privative [spread glottis].

1 Introduction

The concept of contrast in the phonology of a language is closely linked to the ability to isolate meaningful units such as phonemes or words. More specifically, the phonological laryngeal/voicing contrast is cued by a number of different features [13]: vocal fold vibration, duration of the adjacent phonemes and voice onset time (VOT) are just some of them.

The theoretical framework of this study is grounded in views of the laryngeal feature of contrast for stops. These views have been considerably enriched over the last decade by new acoustic and articulatory phonetics evidence, which has strengthened arguments that stop voicing is phonologically active in some languages and passive in others [6]. A clear relation between phonetic cues and phonological processes that supports this has yet to be found, so studies such as ours, based on new aerodynamic data that are more closely related to laryngeal behaviour, could contribute towards clarifying these issues.

Laryngeal contrast has been shown to be highly correlated to VOT in a variety of languages but other parameters such as the duration, the fundamental frequency (f0) and the frequency of the first formant (F1) of adjacent vowels have also been proposed as cues of voicing [3, 6, 9, 13, 14, 20, 32, 33, 40, 46, 51, 54].

The contributions of different acoustic parameters to the voicing distinction in European Portuguese (EP) have been the focus of various studies based on adults’ and children’s acoustic data [8, 33, 40]. It has been shown that stop duration, duration of the preceding and following vowel, and duration of voicing during closure are relevant acoustic properties for the classification of voicing, and that the percentage of devoiced exemplars decreases as the place of articulation moves anteriorly for word medial and word final stops [33]. A more recent cross-linguistic (Portuguese, Italian and German) speech production study looked at voicing status during closure based on time-dependent measures computed from voicing profiles [35, 40]. European Portuguese voicing patterns were different from those of other Romance languages, and EP speakers’ characteristics resembled those of German speakers. Velar stops from five out of six speakers were least likely to be produced with voicing during closure in low vowel context [40].

The motivation for this study is that although laryngeal articulation strategies used by EP speakers have recently been recognised to differ from those of other Romance languages, the inherent aerodynamic processes remain to be clarified [49]. This paper contributes towards clarifying what Solé ([49], p. 237) recently pointed out: “voicing patterns (and targets) may differ in language families and, therefore, a word of caution is in order when making generalizations about genetically related languages”. Therefore, Portuguese language-specific features and attributes are explored, and how these mediate the speech outputs in relation to the place of articulation and the preceding and following phone is determined, providing new insight into voicing contrast in EP. The corpus design and the analysis methodology of complementary experimental measures based on the oral airflow signal of voiced stops and adjacent phones are presented in detail. Novel results are discussed in the context of the most recent literature, and conclusions are presented supporting the view that voicing in Portuguese results from speech mechanisms that have also been observed for German and English [36].

1.1 The aerodynamics of stops

The aerodynamics of transient speech sounds such as stops, and more particularly their intraoral pressure and nasal airflow, have been extensively described in the literature [47, 48, 55]. We focus here on studies that have used parameters based on oral airflow because valid glottal airflow mean amplitude values, inferred from oral airflow measures, have been shown to be a reliable indicator of laryngeal characteristics [4, 11, 16, 27].

Peak oral airflow values have been reported in consonant vowel (CV), vowel consonant (VC) and vowel consonant vowel (VCV) syllables where C was one of the stops /p, b, t, d, k, ɡ/ and the vowels (V) were /i, ɑ/ [16]. Results showed that voiced stops’ peak oral airflow values were significantly lower than those of their voiceless cognates, that vowel context effects in CV and VCV sequences were nonsignificant, and that female values tended to be lower than males’ [16]. The lowest peak oral airflow average values were measured for /ibi/ syllables (66 cm³/s for females and 112 cm³/s for males) and the highest for /tɑ/ syllables (1162 cm³/s when produced by female speakers and 1324 cm³/s for male speakers). This was also one of the first papers to discuss relative flow values (linguistically more relevant than absolute values and in line with one of the central goals of speech production: to achieve broad aerodynamic targets), concluding that “air flow differences between voiced and voiceless productions may be largely attributable to the flow resistance imposed by vocal action in voicing” ([16], p. 253).
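To make the idea of relative flow values concrete, the short Python sketch below takes the four peak oral airflow averages quoted above from [16] and expresses them as ratios. It is only an illustration: reading "relative" as a simple ratio is our assumption and not necessarily the definition used in [16], and the names `PEAK_FLOW`, `flow_ratio` and `relative_flows` are hypothetical. Note also that /ibi/ and /tɑ/ are the extremes of the reported range, not a voiced/voiceless cognate pair.

```python
# Peak oral airflow averages (cm^3/s) as quoted in the text from [16].
# Keys are plain-ASCII labels for the /ibi/ and /tɑ/ syllables.
PEAK_FLOW = {
    "ibi": {"female": 66.0, "male": 112.0},
    "ta": {"female": 1162.0, "male": 1324.0},
}


def flow_ratio(numerator: float, denominator: float) -> float:
    """Return numerator / denominator.

    Assumption: a plain ratio is used here as one possible 'relative' measure.
    """
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return numerator / denominator


def relative_flows(table):
    """Hypothetical helper returning two dictionaries of ratios.

    sex_ratio:  female / male peak flow for each syllable.
    span_ratio: lowest (/ibi/) / highest (/tɑ/) peak flow for each speaker group.
    """
    sex_ratio = {
        syllable: flow_ratio(values["female"], values["male"])
        for syllable, values in table.items()
    }
    span_ratio = {
        sex: flow_ratio(table["ibi"][sex], table["ta"][sex])
        for sex in ("female", "male")
    }
    return sex_ratio, span_ratio


if __name__ == "__main__":
    sex_ratio, span_ratio = relative_flows(PEAK_FLOW)
    for syllable, ratio in sex_ratio.items():
        print(f"{syllable}: female/male peak flow ratio = {ratio:.2f}")
    for sex, ratio in span_ratio.items():
        print(f"{sex}: /ibi/ to /ta/ peak flow ratio = {ratio:.3f}")
```

With these inputs the female-to-male ratio is about 0.59 for /ibi/ and about 0.88 for /tɑ/, so the gap between speaker groups looks quite different once it is expressed relative to the overall flow level rather than in absolute cm³/s.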

Additional mean peak values during closure reported in the literature include those of Stathopoulos and Weismer [50] ([b]: 284 ± 123 cm³/s; [d]: 634 ± 164 cm³/s; [ɡ]: 293 ± 111 cm³/s).

Moreover, Cho et al. [11] studied fortis, lenis and aspirated bilabial stops in three real words (the bilabial stops were in word-initial position and followed by the vowel /e/). They reported maximum oral airflow after stop release of more than 500 cm³/s (up to 3500 cm³/s for Seoul Korean speakers) and found a significant effect of stop category (fortis, lenis and aspirated).

1.2 Contextual effects on stops’ production

Various effects of vowel context on VOT, closure and release duration have been reported in the literature, but most of them have no systematic influence across languages and some results are even contradictory [1, 18, 33, 40, 41]. In French, /p, t, k/ closures have been found to be significantly longer than those of /b, d, ɡ/ only in the /a/ vowel context, and short-lag (positive) VOT values were significantly longer in the voiceless fricative /s/ context than in the vowel /a/ context [1]. Italian short-lag VOT results showed distinct behaviour for voiceless and voiced stops, suggesting different laryngeal articulations to sustain vocal fold vibration [18]. In German, by contrast, the closure durations reported were not systematically affected by vowel context, and the percentage of devoiced stops was higher in low to mid-vowel context [41]. In EP, conflicting results have also been reported, with average VOT values not exhibiting any clear pattern regarding the influence of vowel height, and a more recent study showing that EP and German (not Italian) stops in low vowel context were more likely to be devoiced than in high vowel context [33, 40].

The effect of the place of articulation on acoustic correlates of voicing contrast in stops has also been the subject of various studies [2, 12, 15, 18, 25, 33, 40, 41], and the observation of language-specific variations in VOT has guided modifications [12, 18] to classical models of stop voicing [28] and motivated new studies on the “interaction of universal and language specific process” ([2], p. 68). In French, voiced stops’ short-lag VOT has been found to significantly increase as the place of articulation moves more posteriorly but place of articulation does not seem to have a significant effect on closure durations [2].

Although initial evidence has suggested that stops in the context of high vowels would be less likely to devoice than stops in the context of nonhigh vowels, not much support for this was found in phonology [38, 39]. However, the degree of articulatory constraint (DAC) model of speech production predicts that different stop places of articulation result in various degrees of resistance to contextual effects [24, 44].

The aerodynamic effect of vowel context on stops, and in particular its effect on airflow, is as yet unclear (e.g., whether backness and other contextual factors need to be controlled for), especially when real words are considered. Previous studies presenting speaker-specific vowel effects were based on nonsense word productions of stops, whose observed patterns have recently been shown to differ from those of real words [1, 18, 31, 40, 41, 52].

Table 1 shows key stop production-related results in the literature based on oral airflow amplitude measures. Significant vowel effects have been found in French, but in American English (AE) contextual effects are still unclear, although oral airflow signals conveying idiosyncratic elements of voicing onset and offset have been reported [10, 21, 22, 30, 31, 37].

Table 1 Key literature results
