Citation

Minh PHN, Yun EM, Hong KH (2020) A Study of the Correlation between Phonetic Parameters during Sustained Vowel and Speech Production with Benign Laryngeal Disorders. Int Arch Commun Disord 3:014. doi.org/10.23937/2643-4148/1710014

Original Article | OPEN ACCESS DOI: 10.23937/2643-4148/1710014

A Study of the Correlation between Phonetic Parameters during Sustained Vowel and Speech Production with Benign Laryngeal Disorders

Phan Huu Ngoc Minh3*, Eun Mi Yun2 and Ki Hwan Hong1

1Department of Otolaryngology-HNS, Chonbuk National University Hospital, Korea

2Speech Therapy, Research Institute of Clinical Medicine, Chonbuk National University, Korea

3Department of Otolaryngology, Hue University of Medicine and Pharmacy, Vietnam

Abstract

The purpose of this study was to analyze and discuss the magnitude of the correlation between aerodynamic evaluations, acoustic measures, and auditory-perceptual parameters. We analyzed 39 voices of patients with benign vocal pathology. Four sensitive acoustic parameters were measured from a sustained vowel /a/ and aerodynamic parameters from a set of syllables /pi/, /phi/, and /p'i/. Perceptual assessment was performed using the GRBAS (Grade, Rough, Breathy, Asthenic, Strained) scale. First, the results suggested that the mean values of almost all parameters changed significantly relative to the thresholds of pathology in the evaluative setting. Second, only the heavily aspirated bilabial /phi/ was correlated with both the acoustic parameters and the perceptual parameters. Strong correlations were found between local jitter and shimmer and the parameters G, R, B, and S (P < 0.001). The present study showed that there are some significant correlations among the 3 methods of voice assessment. Combined analyses of several approaches will help to comprehensively evaluate voice disorders.

Keywords

Voice Disorders, Aerodynamic, Acoustic, Perceptual assessment, Correlation

Introduction

Voice disorder may manifest itself through a change in voice quality from structural and/or physiological changes in the larynx, in addition to the symptoms presented by the patient and the impact of this disorder on their quality of life [1]. A recent survey revealed the prevalence of benign vocal cord lesions (i.e., vocal nodules, cysts, polyps and leukoplakia) in South Korea, as investigated in a large-scale nationwide epidemiological study over 4 years. The results revealed that 1.96% (385/19636 adult subjects) were positive for abnormal findings with organic laryngeal disease, not taking into account other laryngeal pathologies, for example, recurrent laryngeal nerve paralysis, muscle tension dysphonia and spasmodic dysphonia, as well as laryngeal cancer [2]. This finding highlights the demand for methods to comprehensively assess pathological voice at the initial stage of disease, and to track the outcomes of specific voice prevention, education, and treatment programs. Voice assessment is multidimensional and includes laryngeal examination, aerodynamic measurement, acoustic evaluation, and perceptual analysis to determine how severe the alteration is and which aspects of voice production are involved in the voice disorder [3].

The presence of a tissue mass, such as vocal nodules or cysts, associated with the lesion causes incomplete closure of the vocal cords, which can lead to a further increase in vocal effort as the patient attempts to improve adduction. Likewise, polyps, which create mass lesions on the vocal cords, can result in compensatory hyperfunction in an effort to increase vibratory closure. Moreover, subtle changes in laryngeal pressure, airflow and laryngeal constriction can all provide sensory feedback related to vocal effort.

In this context, it should be noted that a multifactorial analysis plays a very important role, as it provides broad, appropriate, and effective knowledge about laryngeal function and voice quality. We emphasize that one type of assessment cannot replace another of a different nature. All methods complement each other and are constructive in the therapeutic process [4,5]. The correlation between the results of the parameters of voice assessment is still a topic of research and debate. To the best of our knowledge, several researchers have investigated the relationship between isolated acoustic measures, aerodynamic measures and perceptual evaluation. However, a thorough literature review did not reveal any studies applying a complete vocal assessment protocol that simultaneously includes aerodynamic, acoustic, and perceptual analyses of benign vocal pathology.

The purpose of this article was to present the combined analysis of several approaches and discuss the magnitude of the correlations that were found between aerodynamic evaluations, acoustic measures, and auditory-perceptual parameters of the GRBAS scale.

Materials and methods

Subjects

This is a quantitative, descriptive, and cross-sectional study. The initial study group consisted of 60 subjects with voice disorders. All patients had an otorhinolaryngological report based on laryngeal imaging. After evaluating the vocal cords, we selected 39 patients with benign vocal pathology, such as polyp, nodule, edema, sulcus, atrophy, and contact granuloma. Patients with palsy or laryngeal cancer were excluded. All participants underwent voice evaluations via laryngoscopic examination, aerodynamic measurement, acoustic analysis and perceptual assessment. All participants received the same vocal evaluations at the same voice laboratory, and all of the voice recordings followed the same protocol. The recordings were made at the Department of Otolaryngology and at the Speech and Language Laboratory in Chonbuk National University Hospital, Jeonju, Korea, between August 2015 and February 2016.

Aerodynamic analysis

Each subject held a high-frequency pressure transducer and was asked to pronounce /pi/, /phi/, and /p'i/ consecutively, repeating the sequence 3 times. The bilabial stop consonants /p, ph, p'/ served as the consonants (C) of the /CVCV/ syllables, and the vowel /i/ served as the vowels (V). The /p/ represented the lax consonant, /ph/ the strongly aspirated consonant, and /p'/ the glottalized, nonaspirated consonant. Digitally recorded data were transferred to a computer and analyzed using the software AEROPHONE II, model 6800 from Kay Elemetrics. The signal resulting from the second of each sample was analyzed; the most stable part of the segment was used. Voice onset time (VOT) was measured. The VOT of the initial /CVCV/ syllable is defined as the interval between the burst onset and the first identifiable periodic vibration of the following vowel in the acoustic wave. Laryngeal aerodynamic analysis of voice production included measurement of duration, airflow, air pressure, sound pressure level, power, efficiency, and resistance for each syllable /pi/, /phi/, and /p'i/.
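The derived aerodynamic quantities listed above (power, resistance, efficiency) can be illustrated with textbook definitions. The sketch below is an assumption for illustration only: the source does not state which expressions AEROPHONE II uses, the function names are hypothetical, and the acoustic power estimate assumes spherical radiation from a point source at a known microphone distance.

```python
import math

CMH2O_TO_PA = 98.0665   # 1 cmH2O expressed in pascals
I_REF = 1e-12           # reference sound intensity in W/m^2

def aerodynamic_power_w(pressure_cmh2o, flow_l_per_s):
    """Aerodynamic power (W) as pressure times mean airflow."""
    return pressure_cmh2o * CMH2O_TO_PA * flow_l_per_s * 1e-3

def laryngeal_resistance(pressure_cmh2o, flow_l_per_s):
    """Resistance in cmH2O/(L/s) as pressure divided by mean airflow."""
    return pressure_cmh2o / flow_l_per_s

def acoustic_power_w(spl_db, distance_m):
    """Radiated acoustic power (W), ASSUMING spherical spreading."""
    intensity = I_REF * 10 ** (spl_db / 10)
    return 4 * math.pi * distance_m ** 2 * intensity

def vocal_efficiency(spl_db, distance_m, pressure_cmh2o, flow_l_per_s):
    """Ratio of radiated acoustic power to aerodynamic power."""
    return (acoustic_power_w(spl_db, distance_m)
            / aerodynamic_power_w(pressure_cmh2o, flow_l_per_s))

# Hypothetical input values, not measurements from this study.
print(aerodynamic_power_w(10.0, 0.2))
print(laryngeal_resistance(10.0, 0.2))
print(vocal_efficiency(80.0, 0.3, 10.0, 0.2))
```

Under these assumptions efficiency is a small dimensionless ratio, because only a small fraction of the aerodynamic power is radiated as sound.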

Acoustic analysis

A microphone was placed 5 cm from the mouth and the subject was asked to phonate and sustain the vowel /a/ at the most comfortable pitch and volume. Digitally recorded data were transferred to a computer at a sampling frequency of 44,100 Hz for analysis using the software Multidimensional Voice Program (MDVP) from Kay Elemetrics. The 39 segmented voice samples were evaluated and classified according to 4 sensitive acoustic parameters, namely, fundamental frequency (Fo), jitter (relative average perturbation), shimmer (amplitude perturbation quotient) and harmonics-to-noise ratio (HNR).
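The perturbation measures named above can be sketched from a sequence of cycle-to-cycle values: periods for jitter, peak amplitudes for shimmer. The code below is a minimal illustration using commonly cited definitions, not MDVP's implementation; the function names are hypothetical, and the smoothing windows (3 cycles for relative average perturbation, 11 cycles for the amplitude perturbation quotient) are assumptions taken from the usual descriptions of these measures.

```python
def local_perturbation_percent(values):
    """Mean absolute difference between consecutive cycles,
    relative to the mean cycle value, in percent."""
    diffs = [abs(a - b) for a, b in zip(values, values[1:])]
    return 100.0 * (sum(diffs) / len(diffs)) / (sum(values) / len(values))

def smoothed_perturbation_percent(values, window):
    """Mean absolute deviation of each cycle from the average of the
    `window` cycles centered on it, relative to the overall mean, in percent.
    window=3 is assumed for RAP (periods), window=11 for APQ (amplitudes)."""
    if window % 2 == 0 or len(values) < window:
        raise ValueError("window must be odd and no longer than the data")
    half = window // 2
    deviations = []
    for i in range(half, len(values) - half):
        local_mean = sum(values[i - half:i + half + 1]) / window
        deviations.append(abs(values[i] - local_mean))
    return 100.0 * (sum(deviations) / len(deviations)) / (sum(values) / len(values))

# Hypothetical glottal periods in milliseconds, not data from this study.
periods_ms = [10.0, 10.2, 9.8, 10.1, 9.9]
print(local_perturbation_percent(periods_ms))        # local jitter, %
print(smoothed_perturbation_percent(periods_ms, 3))  # RAP-style jitter, %
```

A perfectly periodic signal gives 0% for both functions. Real analyses use many more cycles than this toy sequence.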

Perceptual analysis

The audio-perceptual scale used was the GRBAS. A native Korean professional speech-language pathologist researching voice disorders was asked to rate all voice samples. Recordings were presented by a computer in random order. The listener was blinded regarding the identity and diagnosis of the 39 subjects. For the auditory-perceptual judgment of dysphonia severity, the rater was instructed to score the G component (overall dysphonia) of the GRBAS, using a 4-point ordinal scale (i.e., 0, normal voice quality; 1, slight dysphonia; 2, moderate dysphonia; and 3, severe dysphonia) as suggested by the Japan Society of Logopedics and Phoniatrics.

Statistical analysis

The statistical analysis was performed with the Statistical Package for the Social Sciences, IBM SPSS for Windows, version 21.0 (IBM Corp., Armonk, USA). First, we performed a descriptive analysis. Means and standard deviations (SDs) were calculated for all measures. Next, we calculated two-tailed Pearson correlation coefficients between the aerodynamic, acoustic, and perceptual parameters. We used the squared Pearson correlation coefficient (r2), which expresses the common variance between 2 variables. We applied a correlation t-test to determine whether the correlation coefficients were statistically significant. The significance threshold was set at P < 0.05 for all results in our study.
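The correlation steps described above can be sketched in plain Python. This is an illustrative sketch, not the SPSS procedure used in the study; the function names are hypothetical and the data are toy values.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length samples."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two samples of equal length, n >= 3")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correlation_t(r, n):
    """t statistic for the null hypothesis of zero correlation,
    to be compared against a t distribution with n - 2 degrees of freedom."""
    if abs(r) >= 1.0:
        return math.copysign(math.inf, r)
    return r * math.sqrt((n - 2) / (1.0 - r * r))

# Hypothetical toy data, not values from this study.
jitter_percent = [0.8, 1.2, 1.9, 2.4, 3.1]
grade_score = [0, 1, 1, 2, 3]

r = pearson_r(jitter_percent, grade_score)
print("r =", r)
print("r^2 (shared variance) =", r * r)
print("t =", correlation_t(r, len(jitter_percent)))
```

With N = 39 the test has 37 degrees of freedom, for which standard t tables give a two-tailed critical value of about 2.03 at P < 0.05. A correlation is significant at that level when the absolute t statistic exceeds this value.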

Results and discussion

Participants

Records from 39 patients were included in the study. The gender distribution was 56% male voices (n = 22) and 44% female voices (n = 17). The average age was 51 years, with a range of 23-83 years. Patients included had the following diagnoses: polyp (n = 26), edema (n = 3), sulcus (n = 1), nodule (n = 5), leukoplakia (n = 3), contact granuloma (n = 1).

Descriptive statistics

This study is the first to quantify aerodynamic and acoustic parameters and perceptual evaluations of voice disorders in a speech laboratory. The results suggest that the mean values of almost all parameters changed significantly relative to the thresholds of pathology in the clinical evaluation setting. The descriptive statistics (means and standard deviations) of the aerodynamic voice-efficiency parameters of the 39 subjects for the syllables /pi/, /phi/, and /p'i/ are shown in Table 1. Voice efficiency was determined using expressions involving simultaneous values of sound pressure level, mean flow rate, and intrapulmonic pressure. We observed that the mean values of the duration, airflow, air pressure, sound pressure level and mean power parameters of /phi/ were higher than those of /pi/ and /p'i/.

Table 1: Mean Values of Aerodynamic Parameters IPIPI (Voice Efficiency) for the Bilabial Stop Consonant /pi/, /phi/ and /p'i/ (N = 39).

Aerodynamic measures reflect the vibratory behavior of the vocal cords, and show changes from normative values when measured in people with hyperfunctional or hypofunctional voice disorders [6]. Vocal nodules or polyps cause an increase in mass and stiffness, which increases the amount of subglottal pressure required for phonation [7]. These findings are consistent with the results of our study.

The descriptive statistics of the 4 common acoustic parameters for the vowel /a/ of all 39 subjects are shown in Table 2. The vowel /a/ was chosen because it can be comfortably produced and is mainly dependent on acoustic, rather than orosensory, control. We observed that the mean values of jitter and shimmer were higher than the thresholds of pathology (Multi-Dimensional Voice Program (MDVP), Kay Elemetrics, 2008, which gives thresholds of pathology of ≤ 1.04% for jitter and ≤ 3.81% for shimmer). Thus, they are considered to be a sign of potential pathology.

Table 2: Statistical Values of Acoustic Parameters for the Vowel /a/ (N = 39).

Since the voice is mainly a perceptual phenomenon, perceptual voice analysis was chosen as the gold standard. The GRBAS scale was chosen for perceptual analysis because its efficacy has been validated by numerous previous studies [8,9]. The descriptive statistics of the perceptual evaluation by GRBAS of the 39 subjects are shown in Table 3. A score of 0 on every GRBAS parameter indicates a normal, unaltered voice quality. Against this baseline, the mean of each GRBAS parameter was altered, with values varying between a mild and a moderate grade of perturbation. Grade (G), Roughness (R), and Breathiness (B) had the highest mean scores.

Table 3: Statistical Values of Perceptual Analysis by GRBAS Score (N = 39).

Correlation between the aerodynamic and acoustic measurements and GRBAS parameters

The Pearson correlation coefficients, shown in Table 4, revealed significant correlations between some aerodynamic parameters of the /pi/, /phi/, and /p'i/ sounds and the common acoustic parameters and GRBAS parameters. However, these correlations were mostly apparent for /ph/, the heavily aspirated Korean bilabial stop consonant, compared to /p/ and /p'/. In particular, the mean power parameter of /ph/ was significantly correlated with all GRBAS parameters except asthenia. In contrast, there was no correlation between the aerodynamic measures of /pi/ and /p'i/ and the acoustic parameters. We identified only an inverse correlation between the duration parameter of /pi/ and roughness (R).

Table 4: Correlation between Aerodynamic, Acoustic and Perceptual GRBAS Parameters (N = 39). First column indicates parameters and variables correlated, second column shows the correlation coefficient values, third column gives test P values.

We note that Korean stops at each place of articulation are classified into 3 categories, whereas English stops are classified into 2 types, voiceless and voiced. Hence, /p/ has been called lax and slightly aspirated, whereas /ph/ is described as heavily aspirated, and /p'/ is said to be tense and nonaspirated. Where we found significant correlations (P < 0.05), they occurred mainly for /phi/ rather than for /pi/ and /p'i/ (Table 4). Changes in the acoustic features of the speech waveform can also be associated with physiological changes in the vibrations of the vocal cords, and are often related to aerodynamic changes.

Of note, lesions in the membranous portion of the vocal cords can cause changes in the biomechanics of voice production, including the effects of increased lung volume initiation, expiratory respiration muscles and laryngeal adductor muscles, generating a greater glottal and supraglottal vocal effort. Hillman and colleagues found increased levels of subglottal pressure in patients with voice disorders that were related to hyperfunctional patterns of voice production [10]. In this study, we compared the aerodynamic and perceptual measures of 39 patients with voice disorders and found that the mean air pressure parameters of the /phi/ and /p'i/ sounds were correlated with an asthenic voice quality (P < 0.05). Vocal nodules and polyps can allow air leakage during the closed phase because they prevent full approximation of the cords. This air leakage lowers voice energy, which is perceived as a weak voice (parameter A is altered) [11,12].

The correlation coefficients between the 4 common acoustic measures and the GRBAS parameters are shown in Table 5. In our study, the acoustic measures were strongly and significantly correlated with the GRBAS parameters: jitter and shimmer with G, R, B, and S, and HNR with R and B. This means that increases in these acoustic measures (jitter, shimmer and HNR) imply higher classifications of voice quality disturbance. There was no correlation between the acoustic parameters and asthenia.

Table 5: Correlation between the Acoustic Measurements in MDVP Software and GRBAS Parameters (N = 39).

Acoustic measures are of interest because they significantly differentiate various voice disorders from normal speakers and show a strong capacity to predict perceptual dysphonia severity [13,14]. In the literature, oral airflow and jitter have been widely used to evaluate the pathological voice and have been correlated with the intensity of dysphonia [15]. Results indicated that there were strong relations between local jitter and shimmer and audio-perceptual Grade, Roughness, Breathiness and Strain parameters (P < 0.001) (Table 5). In voice disorders, aperiodicity of vocal cord vibration and additive noise in the signal are common. The irregularity of the vocal cord cycles (parameter R reflects this) due to the unhealthy vocal cords may present as a passive vibration.

Incomplete glottal closure allows excess air to escape during phonation, which creates a breathy voice (parameter B is altered) [11,12]. Jitter, a frequency perturbation measure, is affected mainly by a lack of control of vocal cord vibration, so patients with vocal pathology often have a higher percentage of jitter. Local shimmer changes with the reduction of glottal resistance and with mass lesions on the vocal cords, and it correlates with the presence of noise emission and breathiness. Grade (G) is related to the other parameters and varies according to the severity of overall voice perturbation [16].

Conclusions

Although perceptual assessment is the most widely used technique for vocal assessment, it is a subjective process that leads to some variability issues. In contrast, aerodynamic and acoustic data allow objective and noninvasive measurement of the behavior of the vocal cords [17]. Our results reveal that there are some significant correlations among these voice assessments. Combined analyses using multiparametric approaches will help to comprehensively and objectively evaluate the pathological voice at the initial stage.

References

  1. (1993) Definitions of communication disorders and variations. ASHA Ad Hoc Committee on service delivery in the schools. American Speech-Language-Hearing Association. ASHA Suppl 1993: 40-41.
  2. Woo SH, Kim RB, Choi SH, Lee SW, Won SJ (2014) Prevalence of laryngeal disease in South Korea: data from the Korea National Health and Nutrition Examination Survey from 2008 to 2011. Yonsei Med J 55: 499-507.
  3. Dejonckere PH (2010) Assessment of voice and respiratory function. Otorhinolaryngology, Head & Neck Surgery. Springer, European Manual of Medicine 563-578.
  4. Ma EP, Yiu EM (2006) Multiparametric evaluation of dysphonic severity. J Voice 20: 380-390.
  5. Martens JW, Versnel H, Dejonckere PH (2007) The effect of visible speech in the perceptual rating of pathological voices. Arch Otolaryngol Head Neck Surg 133: 178-185.
  6. Hillman RE, Holmberg EB, Perkell JS, Walsh M, Vaughan C (1989) Objective assessment of vocal hyperfunction: an experimental framework and initial results. J Speech Hear Res 32: 373-392.
  7. Jiang J, Stern J, Chen H, Solomon NP (2004) Vocal efficiency measurements in subjects with vocal polyps and nodules: A preliminary report. Ann Otol Rhinol Laryngol 113: 277-282.
  8. Bodt MD, Wuyts F, Van de Heyning P, Croux C (1997) Test-retest study of the GRBAS scale: influence of experience and professional background on perceptual rating of voice quality. J Voice 11: 74-80.
  9. Dejonckere PH, Obbens C, De Moor GM, Wieneke GH (1993) Perceptual evaluation of dysphonia: Reliability and relevance. Folia Phoniatr 45: 76-83.
  10. Hillman RE, Holmberg EB, Perkell JS, Walsh M, Vaughan C (1990) Phonatory function associated with hyperfunctionally related vocal fold lesions. Journal of Voice 4: 52-63.
  11. Hartl DM, Hans S, Vaissiere J, Riquet M, Brasnu DF (2001) Objective voice quality analysis before and after onset of unilateral vocal fold paralysis. Journal of Voice 15: 351-361.
  12. Baylor C, Yorkston K, Strand E, Eadie T, Duffy J (2005) Measurement of treatment outcome in unilateral vocal fold paralysis: a systematic review. UVFP Technical Report 5, Academy of Neurologic Communication Disorders and Sciences, Washington, DC, USA.
  13. Awan SN, Roy N (2005) Acoustic prediction of voice type in women with functional dysphonia. J Voice 19: 268-282.
  14. Lowell SY, Kelley RT, Awan SN, Colton RH, Chan NH (2011) Spectral and cepstral-based measures during continuous speech: Capacity to distinguish dysphonia and consistency within a speaker. J Voice 25: 223-232.
  15. Hirano M, Hibi S, Yoshida T, Hirade Y, Kasuya H, et al. (1988) Acoustic analysis of pathological voice. Some results of clinical application. Acta Otolaryngol 105: 432-438.
  16. Takahashi H (1979) Assessment of auditory impression of dysphonia. In: Voice Examination Methods, Japan Society of Logopedics and Phoniatrics, Ed., Interna, Tokyo, Japan.
  17. Oguz H, Demirci M, Safak MA, Arslan N, Islam A, et al. (2007) Effects of unilateral vocal cord paralysis on objective voice measures obtained by Praat. European Archives of Oto-Rhino-Laryngology 264: 257-261.
