Laboratory Phonology

Variability in English vowels is comparable in articulation and acoustics
Noiray A, Iskarous K and Whalen DH
The nature of the links between speech production and perception has been the subject of longstanding debate. The present study investigated the articulatory parameter of tongue height and the acoustic F1-F0 difference for the phonological distinction of vowel height in American English front vowels. Multiple repetitions of /i, ɪ, e, ε, æ/ in [(h)Vd] sequences were recorded in seven adult speakers. Articulatory (ultrasound) and acoustic data were collected simultaneously to provide a direct comparison of variability in vowel production in both domains. Results showed idiosyncratic patterns of articulation for contrasting the three front vowel pairs /i-ɪ/, /e-ε/ and /ε-æ/ across subjects, with the degree of variability in vowel articulation comparable to that observed in the acoustics for all seven participants. However, contrary to what was expected, some speakers showed reversals in tongue height for /ɪ/-/e/ that were also reflected in the acoustics, with F1 higher for /ɪ/ than for /e/. The data suggest the phonological distinction of height is conveyed via speaker-specific articulatory-acoustic patterns that do not strictly match feature descriptions. However, the acoustic signal is faithful to the articulatory configuration that generated it, carrying the crucial information for perceptual contrast.
Aligning the timelines of phonological acquisition and change
Beckman ME, Li F, Kong EJ and Edwards J
This paper examines whether data from a large cross-linguistic corpus of adult and child productions can be used to support an assumed corollary of the Neogrammarian distinction between two types of phonological change. The first type is regular sound change, which is assumed to be incremental and so should show continuity between phonological development and the age-related variation observed in the speech community undergoing the change. The second type is dialect borrowing, which could show an abrupt discontinuity between developmental patterns before and after the socio-historical circumstances that instigate it. We examine the acquisition of two contrasts: the Seoul Korean contrast between lax and aspirated stops, which is undergoing regular sound change, and the standard Mandarin contrast between retroflex and dental sibilants, which has been borrowed recently into the Sōngyuán dialect. Acquisition of the different contrasts patterns as predicted from the assumed differences between continuous regular sound change and potentially abrupt dialect borrowing. However, before we can fully understand the relationship between the time courses of the two, substantial gaps must be addressed in our understanding both of the extent of cross-cultural variability in language socialization and of how this variability might affect the mechanisms of phonological change.
Phonetic reduction and variation in American Sign Language: A quantitative study of sign lowering
Tyrone ME and Mauk CE
During normal sign language use, a signer's productions will often be reduced from the citation forms of signs. This study examines a form of phonetic reduction in American Sign Language, in which signs that are located at the forehead are lowered in space. In particular, we explore the effects of signing rate and phonetic environment on the lowering of specific ASL signs and on their phonetic variation along the other two movement axes. Movement data were captured as native signers produced utterances that were controlled for phonetic environment and signing rate. We found that all signers produced lowered forms as an effect of the phonetic factors that we manipulated. In addition, several rate-induced effects occurred, which we had not predicted. Results are discussed in relation to past research on variation in sign production and in speech.
A gestural account of the velar fricative in Navajo
Iskarous K, McDonough J and Whalen DH
Using the framework of Articulatory Phonology, we offer a phonological account of the allophonic variation undergone by the velar fricative phoneme in Navajo, a Southern or Apachean Athabaskan language spoken in Arizona and New Mexico. The Navajo velar fricative strongly coarticulates with the following vowel, varying in both place and manner of articulation. The variation in this velar fricative seems greater than the variation of velars in many well-studied languages. The coronal central fricatives in the inventory, in contrast, are quite phonetically stable. The back fricative of Navajo thus highlights 1) the linguistic use of an extreme form of coarticulation and 2) the mechanism by which languages can control coarticulation. It is argued that the task dynamic model underlying Articulatory Phonology, with the mechanism of gestural blending controlling coarticulation, can account for the multiplicity of linguistically-controlled ways in which velars coarticulate with surrounding vowels without requiring any changes of input specification due to context. The ability of phonological and morphological constraints to restrict the amount of coarticulation argues against strict separation of phonetics and phonology.
The distribution of speech errors in multi-word prosodic units
Choe WK and Redford MA
Sequencing errors in natural and elicited speech have long been used to inform models of phonological encoding and to understand the process by which serial ordering is achieved in speech. The present study focused on the distribution of sequential speech errors within multi-word prosodic units to determine whether such units are relevant to speech planning, and, if so, how. Forty native English-speaking undergraduate students were asked to produce sentences that varied in length and in the extent to which certain phonological features were repeated (tongue twisters or not). Participants prepared their utterances in advance of speaking and were coached to be as fluent as possible once they started speaking. The goal was to ensure the production of well-structured utterances, while maximizing the number of errors produced, and minimizing the effects that excessive self-correction might have on prosodic structure. Speech errors were perceptually identified in the recorded speech and categorized. Strong and weak prosodic boundaries were transcribed in sentences with sequencing errors. Speech error patterns were found to correspond well with the boundaries of the multi-word prosodic units defined by the strong and weak prosodic boundaries. In particular, the number of sequencing errors was found to vary as a function of position within a unit such that the fewest errors were found in initial position, more occurred in early-mid position, and even more occurred in late-mid position. This pattern of increasing errors across the multi-word prosodic unit was referred to as the cumulative error pattern. The analyses also revealed a final position effect. When multi-word prosodic units occurred in utterance-initial or utterance-medial position, a disproportionate number of errors occurred in the final position of the unit. However, when the units occurred in utterance-final position, more errors occurred in late-mid position than in final position. The cumulative error pattern and final position effect are interpreted to suggest the serial activation and decay in activation of multi-word planning domains during phonological encoding.
Dynamical account of how /b, d, g/ differ from /p, t, k/ in Spanish: Evidence from labials
Parrell B
This study examines articulatory lenition of intervocalic stops in Spanish and tests the theories that 1) /b, d, g/ have an intended target for closure equal to that of /p, t, k/ and 2) spirantization of /b, d, g/ is caused by undershoot due to their short duration phrase medially. Consistent with past acoustic studies, subjects produce /b/ with incomplete closure phrase medially and complete closure phrase initially. Additionally, /b/ is shorter than /p/ phrase medially though not initially. For /b/, though not for /p/, there is a correlation between constriction degree and duration, consistent with the theory of dynamical undershoot. The results from the study are accurately modeled with a virtual target for /b/ slightly beyond the point of articulator contact. Such a target results in full closure at long durations (such as found phrase initially) and incomplete closure at shorter durations. Based on this evidence, it is proposed that /b, d, g/ differ from /p, t, k/ in three ways: they are shorter, lack a devoicing gesture, and have a target closer to - but still beyond - the point of articulator contact.
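The virtual-target reasoning in this abstract can be illustrated with a minimal sketch. It assumes the gesture behaves as a critically damped second-order system, as is standard in task dynamics; the function name and all parameter values (starting aperture, target, stiffness) are hypothetical and chosen only for illustration, not taken from the study.

```python
import math

def lip_aperture(duration_s, start_mm=10.0, target_mm=-1.0, omega=40.0):
    """Lip aperture (mm) reached after duration_s seconds of a closing gesture.

    Closed-form solution of the critically damped system
        x'' + 2*omega*x' + omega**2 * (x - target) = 0
    with x(0) = start_mm and x'(0) = 0. A negative target_mm is a
    "virtual" target beyond the point of contact. All values are
    illustrative assumptions, not measurements from the study.
    """
    decay = (1.0 + omega * duration_s) * math.exp(-omega * duration_s)
    position = target_mm + (start_mm - target_mm) * decay
    # The lips cannot pass the point of contact, so clamp at 0 (full closure).
    return max(0.0, position)

# Compare a short (phrase-medial-like) and a long (phrase-initial-like) gesture.
for duration in (0.06, 0.15):
    print(f"{duration:.2f} s -> aperture {lip_aperture(duration):.2f} mm")
```

Under these assumed values, the shorter gesture stops short of contact while the longer one reaches full closure, which is the qualitative pattern the abstract describes for /b/.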
Word-types, not word-tokens, facilitate extraction of phonotactic sequences by adults
Richtsmeier PT
Phonotactics (the permissibility of sound sequences within a word) correspond to lexical statistics, but controversy persists over which statistics are being tracked. In this study, lexical type and token counts were compared as they contributed to phonotactic extraction from an artificial lexicon. Young-adult participants were familiarized with a set of CVCCVC nonwords contextualized as a lexicon of Martian animal names. The type and token frequencies of word-medial consonant sequences within those names were varied systematically. Participants then rated new nonwords, containing the same medial sequences, on a 7-point scale for similarity to the Martian animal names. Higher ratings only followed high type frequency familiarization conditions, suggesting that word-types drove phonotactic extraction. Additionally, participants reversed the typical preference for high frequency English sequences, likely because they rated nonwords according to their membership in an unknown language. This finding suggests cognitively separable tracking of artificial language statistics and preexisting representations.
Phonetic convergence in spontaneous conversations as a function of interlocutor language distance
Kim M, Horton WS and Bradlow AR
This study explores phonetic convergence during conversations between pairs of talkers with varying language distance. Specifically, we examined conversations between two native English talkers and between two native Korean talkers who had either the same or different regional dialects, and between native and nonnative talkers of English. To measure phonetic convergence, an independent group of listeners judged the similarity of utterance samples from each talker through an XAB perception test, in which X was a sample of one talker's speech and A and B were samples from the other talker at either early or late portions of the conversation. The results showed greater convergence for same-dialect pairs than for either the different-dialect pairs or the different-L1 pairs. These results generally support the hypothesis that there is a relationship between phonetic convergence and interlocutor language distance. We interpret this pattern as suggesting that phonetic convergence between talker pairs that vary in the degree of their initial language alignment may be dynamically mediated by two parallel mechanisms: the need for intelligibility and the extra demands of nonnative speech production and perception.
Generalizing over Lexicons to Predict Consonant Mastery
Beckman ME and Edwards J
When they first begin to talk, children show characteristic consonant errors, which are often described in terms that recall Neogrammarian sound change. For example, a Japanese child's production of the word kimono might be transcribed with an initial postalveolar affricate, as in typical velar-softening sound changes. Broad-stroke reviews of errors list striking commonalities across children acquiring different languages, whereas quantitative studies reveal enormous variability across children, some of which seems related to differences in consonant frequencies across different lexicons. This paper asks whether the appearance of commonalities across children acquiring different languages might be reconciled with the observed variability by referring to the ways in which sound change might affect frequencies in the lexicon. Correlational analyses were used to assess relationships between consonant accuracy in a database of recordings of toddlers acquiring Cantonese, English, Greek, or Japanese and two measures of consonant frequency: one specific to the lexicon being acquired, the other an average frequency calculated for the other three languages. Results showed generally positive trends, although the strength of the trends differed across measures and across languages. Many outliers in plots depicting the relationships suggested historical contingencies that have conspired to make for unexpected paths, much as in biological evolution. "The history of life is not necessarily progressive; it is certainly not predictable. The earth's creatures have evolved through a series of contingent and fortuitous events." (Gould, 1989).
Phonological and Semantic Cues to Learning from Word-Types
Richtsmeier P
Word-types represent the primary form of data for many models of phonological learning, and they often predict performance in psycholinguistic tasks. Word-types are often tacitly defined as phonologically unique words. Yet, an explicit test of this definition is lacking, and natural language patterning suggests that word meaning could also act as a cue to word-type status. This possibility was tested in a statistical phonotactic learning experiment in which phonological and semantic properties of word-types varied. During familiarization, the learning targets (word-medial consonant sequences) were instantiated either by four related word-types or by just one word-type (the experimental frequency factor). The expectation was that more word-types would lead participants to generalize the target sequences. Regarding semantic cues, related word-types were either associated with different referents or all with a single referent. Regarding phonological cues, related word-types differed from each other by one, two, or more phonemes. At test, participants rated novel wordforms for their similarity to the familiarization words. When participants heard four related word-types, they gave higher ratings to test words with the same consonant sequences, irrespective of the phonological and semantic manipulations. The results support the existing phonological definition of word-types.
The Distribution of Talker Variability Impacts Infants' Word Learning
Quam C, Knight S and Gerken L
Infants struggle to apply earlier-demonstrated sound-discrimination abilities to later word learning, attending to non-contrastive acoustic dimensions (e.g., Hay et al., 2015), and not always to contrastive dimensions (e.g., Stager & Werker, 1997). One hint about the nature of infants' difficulties comes from the observation that input from multiple talkers can improve word learning (Rost & McMurray, 2009). This may be because, when a single talker says both of the to-be-learned words, consistent talker's-voice characteristics make the acoustics of the two words more overlapping (Apfelbaum & McMurray, 2011). Here, we test that notion. We taught 14-month-old infants two similar-sounding words in the Switch habituation paradigm. The same amount of overall talker variability was present as in prior multiple-talker experiments, but male and female talkers said different words, creating a gender-word correlation. Under an acoustic-similarity account, correlated talker gender should help to separate words acoustically and facilitate learning. Instead, we found that correlated talker gender impaired learning of word-object pairings compared with uncorrelated talker gender, even when gender-word pairings were always maintained in test, casting doubt on one account of the beneficial effects of talker variability. We discuss several alternative potential explanations for this effect.
A Kinematic Study of Prosodic Structure in Articulatory and Manual Gestures: Results from a Novel Method of Data Collection
Krivokapić J, Tiede MK and Tyrone ME
The primary goal of this work is to examine prosodic structure as expressed concurrently through articulatory and manual gestures. Specifically, we investigated the effects of phrase-level prominence (Experiment 1) and of prosodic boundaries (Experiments 2 and 3) on the kinematic properties of oral constriction and manual gestures. The hypothesis guiding this work is that prosodic structure will be similarly expressed in both modalities. To test this, we have developed a novel method of data collection that simultaneously records speech audio, vocal tract gestures (using electromagnetic articulometry) and manual gestures (using motion capture). This method allows us, for the first time, to investigate kinematic properties of body movement and vocal tract gestures simultaneously, which in turn allows us to examine the relationship between speech and body gestures with great precision. A second goal of the paper is thus to establish the validity of this method. Results from two speakers show that manual and oral gestures lengthen under prominence and at prosodic boundaries, indicating that the effects of prosodic structure extend beyond the vocal tract to include body movement.