Project 5
Voice Quality Analysis Spring 2009

Download proj5-sp09-dkp.xls from the Assignment Details web page.


A.  Introduction
In this project the Excel template proj5-sp09-dkp.xls will be used to complete an analysis of voice quality for normal young, middle-aged, and elderly voices, as well as for a pathological voice.  The analysis technique to be used involves the computation of the harmonic-to-noise ratio (HNR) for each of several voice samples.  This analysis is detailed in Section 5.2, and specific questions and hints are given below.  Once the project is complete, students should:

  • Turn in the printed report in class.
  • Turn in your Excel file and .doc file to the Oncourse DropBox.

B.  Background
Voice pathologies are often associated with noisy voice quality, sometimes described as breathy or hoarse. Normal vocal fold vibrations are quasi-periodic, with a fundamental frequency (F0) that deviates from true periodicity by less than 10%. Several measures have been proposed to quantify the periodic versus the noisy aspects of voice. One of them is the harmonic-to-noise ratio (HNR), originally proposed as a time-domain algorithm by Yumoto, Gould, and Baer (1982). Algorithms for HNR are still being investigated, as indicated by the update from Qi and Hillman (1997), who proposed a spectral-domain algorithm.

It has been demonstrated that HNR for prolonged /a/ is greater than 10 dB for normal voices and becomes negative for various voice disorders. It has also been shown that HNR decreases with age. To develop some insight into acoustic measures of voice, you will calculate HNR for four samples of the vowel /a/: one each from a young, a middle-aged, and an elderly normal female, and one from a female with Parkinsonism. Each waveform sample in the Excel spreadsheet proj5-sp09-dkp.xls contains 10 pitch pulses.
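To make the decibel thresholds above concrete, the short sketch below converts between a power ratio H/N and HNR in dB. It assumes the usual base-10 definition, HNR = 10 · log10(H/N); the function names are hypothetical and chosen only for illustration.

```python
import math

def power_ratio_to_db(h, n):
    """Convert harmonic power h and noise power n to a ratio in dB (base-10 log assumed)."""
    return 10.0 * math.log10(h / n)

def db_to_power_ratio(db):
    """Inverse conversion: the power ratio H/N that corresponds to a dB value."""
    return 10.0 ** (db / 10.0)

# An HNR of 10 dB means the harmonic power is ten times the noise power.
print(db_to_power_ratio(10.0))

# Equal harmonic and noise power gives 0 dB.
print(power_ratio_to_db(1.0, 1.0))

# A negative HNR means the noise power exceeds the harmonic power.
print(power_ratio_to_db(1.0, 2.0) < 0)
```

Under this definition, a normal voice with HNR above 10 dB carries more than ten times as much power in the periodic component as in the noise, while a negative HNR indicates that the noise dominates.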

C.  Follow the steps below and the Guidelines for Writing Project Reports to complete Project 5.

The measurement of HNR begins by identifying individual pitch pulses in the waveform. All pitch pulses have been marked and extracted for the young /a/ in proj5-sp09-dkp.xls. The Yumoto et al. procedure is as follows.

  1. Identify the individual pitch pulses in the samples (PP1 to PP10). Copy the pitch pulses into a matrix, one PPi per column. Let L be the length of the longest column. For /a/ young, PP4 and PP10 are the longest, with L = 61.  Pad columns that have fewer than L samples with zeros. Each column is then a pitch pulse PPi with L samples, indexed by n = 1, . . ., L.

  2. Calculate an average pitch pulse, [PROJECT 5 - EQUATION 1], across the rows and graph it (see example for /a/ young):

    [PROJECT 5 - EQUATION 2]

  3. Calculate the harmonic power, H, as the sum of the squares of each sample in the average pitch pulse, and multiply it by the number of pitch pulses (e.g., m = 10):

    [PROJECT 5 - EQUATION 3]

    Note: Use the sumsq() function in Excel for calculating H as shown in ah_young.

  4. Calculate the noise component, NPi, for each pitch pulse, PP1 to PP10, as:

    [PROJECT 5 - EQUATION 4]

    Store each noise component, NPi, as a column in another matrix. For /a/ young this was done with arrays. Plot all of them on a single graph to observe the amount and location of the noise component for each pitch pulse as shown for /a/ young.

  5. Calculate the total power in the noise, N, as the sum of the squares of each sample in the noise component, summed over all 10 noise components.

    [PROJECT 5 - EQUATION 5]

  6. HNR is the ratio of the power values of H to N in dB,

    HNR = 10 · log10 (H/N) dB

  7. Graduate work. Download through IU libraries the Yumoto et al. (1982) article. Scan through the article, but read enough to answer the following questions:

1) Take a look at Fig. 1. You used the "manual pitch extraction" method on an original waveform. Describe the usefulness of the filtering in your own words, and whether the marking of pitch pulses would be easier with the filtered wave.

2) Make a copy of Fig. 6 and print it. Mark clearly on it where your four HNR measurements fall (e.g., in red pen). Explain how typical your measurements were of diseased or normal voice relative to the ones shown in Fig. 6.
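
Steps 1 through 6 of the procedure above can also be sketched in code as a cross-check on the spreadsheet calculation. The sketch below is only an illustration under stated assumptions: it uses Python rather than Excel, the function name hnr_db is hypothetical, the logarithm is taken to be base 10, and the two short "pitch pulses" are made-up numbers, not data from proj5-sp09-dkp.xls.

```python
import math

def hnr_db(pulses):
    """Time-domain HNR sketch following steps 1-6.

    `pulses` is a list of pitch pulses, each a list of samples.
    Returns (HNR in dB, H, N).
    """
    m = len(pulses)                      # number of pitch pulses
    L = max(len(p) for p in pulses)      # length of the longest pulse

    # Step 1: zero-pad every pulse to length L.
    padded = [list(p) + [0.0] * (L - len(p)) for p in pulses]

    # Step 2: average pitch pulse, sample by sample across the pulses.
    avg = [sum(p[n] for p in padded) / m for n in range(L)]

    # Step 3: harmonic power H = m * (sum of squares of the average pulse).
    H = m * sum(a * a for a in avg)

    # Step 4: noise component of each pulse = pulse minus the average pulse.
    noise = [[p[n] - avg[n] for n in range(L)] for p in padded]

    # Step 5: noise power N = sum of squares over all noise components.
    N = sum(x * x for np_i in noise for x in np_i)

    # Step 6: ratio of H to N in dB (base-10 log assumed).
    return 10.0 * math.log10(H / N), H, N

# Made-up example: two nearly identical two-sample pulses.
example = [[1.1, -0.9], [0.9, -1.1]]
hnr, H, N = hnr_db(example)
print(f"H = {H:.3f}, N = {N:.3f}, HNR = {hnr:.2f} dB")  # roughly 20 dB
```

In this toy case the average pulse is [1, -1], so H = 2 · (1 + 1) = 4, each noise sample has magnitude 0.1, so N is about 0.04, and H/N is about 100, or roughly 20 dB. Note that the function fails if N is exactly zero (identical pulses), since the ratio H/N is then undefined.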


D.  Report Questions for Project 5 - Answer all!

  1. Describe in your own words what information about voice can be captured in the HNR. 
  2. Listen to and compare the /a/ vowels. Provide a perceptual description of the pitch, loudness, and degree of breathiness and hoarseness for each voice. Compare these descriptions to the HNR you calculated. Comment on whether the correspondence of decreased HNR with age and hoarseness is demonstrated across these four samples. What insights do these results give you on the clinical importance of HNR for measuring the severity of breathy and hoarse voice disorders?
  3. The HNR could be calculated using spectral-domain algorithms (analysis of the speech signal in the frequency domain rather than the time domain).  Explain one way this could be done.
  4. Explain some ways you think HNR could be used in the clinical management of voice disorders.  What evidence from your HNR values suggests that an HNR analysis would not be a particularly good diagnostic tool?
  5. What are the possible sources of noise in normal and pathological voices? Do you think a voice will be judged as "good" if there is no noise component? What would be the value of HNR in this hypothetical case?

References

Yumoto, E., Gould, W. J., and Baer, T. (1982). Harmonic-to-noise ratio as an index of the degree of hoarseness. Journal of the Acoustical Society of America, 71, 1544 – 1550.

Qi, Y., and Hillman, R. E. (1997). Temporal and spectral estimations of harmonic-to-noise ratio in human voice signals. Journal of the Acoustical Society of America, 102, 537 – 544.