PREVENTATIVE VOICE MONITORING (PVM): ASSESSING THE ABILITY OF DAILY FEEDBACK FROM DOSIMETRY TO CHANGE VOICE PRODUCTION IN OCCUPATIONAL VOICE USERS

By

Lisa M. Kopf

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

Communicative Sciences and Disorders-Doctor of Philosophy

2016

ABSTRACT

PREVENTATIVE VOICE MONITORING (PVM): ASSESSING THE ABILITY OF DAILY FEEDBACK FROM DOSIMETRY TO CHANGE VOICE PRODUCTION IN OCCUPATIONAL VOICE USERS

By

Lisa M. Kopf

This study explored the possibility of proactive voice dosimetry for use by occupational voice users to prevent future voice disorders. The study comprised two parts: the design of a Preventative Voice Monitoring (PVM) feedback system and initial testing of its effect on voice use. In Part 1, the feedback displays were designed using an iterative, user-centered approach. In Part 2, the design was tested in two phases: a laboratory phase and a real-world (classroom) phase. In Part 1, the researcher found that users want a layered display structure. In Part 2, users felt the displays were user-friendly but wanted an interactive, flexible system with well-defined measures that provides both objective feedback and suggestions for improving voice use. Pitch strength, a correlate of voice quality, was a statistically significant predictor of vocal fatigue, decreasing as vocal fatigue increased. Readiness to change increased from the start to the end of the study for a majority of the participants, suggesting that engagement with voice monitoring and feedback increases the likelihood of behavior change. Finally, pitch and phonation time showed decreasing trends with voice monitoring and feedback; such decreases are associated with a reduced risk of developing voice disorders.

Copyright by
LISA M. KOPF
2016

ACKNOWLEDGMENTS

I would like to thank everyone who supported me in the dissertation process. I would especially like to thank my family, my committee, the members of the VBALAB, and my participants. This research was supported with funding from the College of Communication Arts & Sciences and the Graduate School of Michigan State University. This work was partially supported by National Institutes of Health Grants R01DC009029 and R01DC004224 from the National Institute on Deafness and Other Communication Disorders. The content is solely the responsibility of the author and does not necessarily represent the official views of the National Institutes of Health.

TABLE OF CONTENTS

LIST OF TABLES .......................................................................................................................... x LIST OF FIGURES ....................................................................................................................... xi KEY TO ABBREVIATIONS ....................................................................................................... xx CHAPTER 1: Introduction ............................................................................................................. 1 The Problem ................................................................................................................................ 1 A Potential Solution .................................................................................................................... 2 The Current Study .......................................................................................................................
4 CHAPTER 2: Literature Review .................................................................................................... 6 Voice Use in the Workplace ....................................................................................................... 6 Traditional Methods of Voice Disorder Prevention ................................................................... 7 Persuasive Systems ..................................................................................................................... 9 Voice Dosimetry ....................................................................................................................... 11 Need for Objective Measures of Vocal Fatigue ........................................................................ 12 Voice Quality ............................................................................................................................ 13 Current Biofeedback Options for Voice ................................................................................... 19 Behavior Change ....................................................................................................................... 20 Stages of Change (SOC) ........................................................................................................... 21 Assessing SOC .......................................................................................................................... 24 Readiness to Change (RTC) ..................................................................................................... 25 Self-Efficacy (S-E) ................................................................................................................... 26 Vocal Fatigue Index (VFI) ........................................................................................................ 27 CHAPTER 3: Study Aims & Hypotheses .................................................................................... 28 Aim 1: To extract design requirements for conveying feedback to users. ............................... 28 Aim 2: To identify changes in voice behavior management after receiving feedback. ............ 29 Aim 3: To quantify changes in the voice after receiving feedback. ......................................... 29 CHAPTER 4: Feedback Requirements and Changes in Voice Behavior Management (Aims 1 & 2) ................................................................................................................................................... 31 Study Overview ........................................................................................................................ 31 Inclusion Criteria .................................................................................................................. 33 Part 1 Methods .......................................................................................................................... 33 Participants ............................................................................................................................ 34 Procedures ............................................................................................................................. 35 Part 1 Analysis ...................................................................................................................... 38 Aim 1 ................................................................................................................................ 
38 Part 1 Results ............................................................................................................................ 38 Initial Feedback Designs ....................................................................................................... 38 Iteration 1 .............................................................................................................................. 39 vi Icon Suggestions ............................................................................................................... 40 Loudness Suggestions ....................................................................................................... 41 Pause Suggestions ............................................................................................................. 42 Quality Suggestions .......................................................................................................... 43 Strain Suggestions ............................................................................................................. 43 Multi-Measure Suggestions .............................................................................................. 44 Iteration 2 .............................................................................................................................. 44 Icon Suggestions ............................................................................................................... 44 Loudness Suggestions ....................................................................................................... 45 Pause Suggestions ............................................................................................................. 45 Quality Suggestions .......................................................................................................... 46 Strain Suggestions ............................................................................................................. 46 Multi-Measure Suggestions .............................................................................................. 46 Iteration 3 .............................................................................................................................. 47 Icon Suggestions ............................................................................................................... 47 Loudness Suggestions ....................................................................................................... 47 Pause Suggestions ............................................................................................................. 48 Quality Suggestions .......................................................................................................... 48 Strain Suggestions ............................................................................................................. 48 Multi-Measure Suggestions .............................................................................................. 49 Iteration 4 .............................................................................................................................. 49 Icon Suggestions ............................................................................................................... 49 Loudness Suggestions ....................................................................................................... 
49 Pause Suggestions ............................................................................................................. 50 Quality Suggestions .......................................................................................................... 50 Strain Suggestions ............................................................................................................. 51 Multi-Measure Suggestions .............................................................................................. 51 Iteration 5 (Final) Results ..................................................................................................... 51 Summary of Results .............................................................................................................. 52 Part 2 Methods .......................................................................................................................... 52 Participants ............................................................................................................................ 52 Procedures ............................................................................................................................. 54 Interview Sessions ............................................................................................................ 56 Recording Sessions ........................................................................................................... 57 Recording Sessions- Part 2.1 ........................................................................................ 58 Recording Sessions- Part 2.2 ........................................................................................ 60 Feedback ........................................................................................................................... 61 Part 2 Analysis ...................................................................................................................... 62 Aim 1 ................................................................................................................................ 62 Aim 2 ................................................................................................................................ 62 Part 2 Results and Discussion ................................................................................................... 63 Aim 1 Interview Results and Discussion .............................................................................. 63 Theme 1: Positive Comments on Current Feedback Displays .......................................... 64 Sub-Theme 1: Displays are user-friendly ..................................................................... 64 Sub-Theme 2: Measures are helpful and should stay included in feedback ................. 65 Theme 2: Occupational Voice User Needs ....................................................................... 65 vii Sub-Theme 1: Clearer definitions of measures are needed .......................................... 66 Sub-Theme 2: Strategies for improving the voice based on feedback are needed ....... 67 Sub-Theme 3: The system should be adaptable for a range of user needs ................... 68 Theme 3: Recommended Feedback Display Improvements ............................................ 70 Sub-Theme 1: Users should be able to include notes and labels in data ...................... 70 Sub-Theme 2: Displays should show relative trends across days ................................ 
71 Sub-Theme 3: Additional feature suggestions .............................................................. 71 Aim 1 Interview Results Summary ................................................................................... 73 Aim 2 Interview Results and Discussion .............................................................................. 73 Theme 1: Reported Behavioral Changes .......................................................................... 73 Sub-Theme 1: Active changes in vocal behavior due to increased awareness and feedback ........................................................................................................................ 74 Sub-Theme 2: Changes in vocal behavior due to monitoring ...................................... 75 Sub-Theme 3: Task specific voice changes in Part 2.1 ................................................ 76 Theme 2: Increased Awareness of Voice .......................................................................... 76 Sub-Theme 1: Interpretations of feedback .................................................................... 76 Sub-Theme 2: Learning about own vocal fatigue and risk of voice problems ............. 78 Theme 3: No Observed Changes ...................................................................................... 78 Sub-Theme 1: No conscious behavioral change ........................................................... 78 Sub-Theme 2: No change observed in feedback measures ........................................... 79 Aim 2 Interview Results Summary ................................................................................... 79 Interview Results Limitations ........................................................................................... 80 Aim 2 Statistical Analysis ..................................................................................................... 81 Part 2.1 .............................................................................................................................. 82 Main Effects .................................................................................................................. 82 Interactions .................................................................................................................... 84 Qualitative Analysis ...................................................................................................... 86 Part 2.2 .............................................................................................................................. 90 Main Effects .................................................................................................................. 90 Interactions .................................................................................................................... 92 Qualitative Analysis ...................................................................................................... 93 Summary ................................................................................................................................... 97 Extracting Design Requirements for Conveying Feedback .............................................. 97 Identifying Changes in Voice Behavior Management after Receiving Feedback ............ 97 Conclusions ............................................................................................................................... 99 CHAPTER 5: Quantification of Voice Changes After Feedback ............................................... 
100 Study Overview ...................................................................................................................... 100 Part 2 Methods ........................................................................................................................ 100 Participants .......................................................................................................................... 100 Procedures ........................................................................................................................... 101 Recording Sessions ......................................................................................................... 101 First Moment Specific Loudness ................................................................................ 103 Pitch Strength: Part 2.1 ............................................................................................... 103 Pitch Strength: Part 2.2 ............................................................................................... 103 Pauses: Part 2.1 ........................................................................................................... 103 viii Pauses: Part 2.2 ........................................................................................................... 104 dB Level: Part 2.2 ....................................................................................................... 104 Analysis ............................................................................................................................... 105 Aim 3 .............................................................................................................................. 105 Hypothesis 1 ............................................................................................................... 105 Hypothesis 2 ............................................................................................................... 106 Hypothesis 3 ............................................................................................................... 106 Results and Discussion ........................................................................................................... 107 Correlation Results .............................................................................................................. 107 Comparison of pitch from accelerometer and audio signals ........................................... 107 Comparison of fundamental frequency and pitch ........................................................... 108 Comparison of pitch strength from accelerometer and audio signals ............................. 108 Comparison of first moment specific loudness from accelerometer and audio signals .. 108 Aim 3 Results and Discussion ............................................................................................ 109 Hypothesis 1.................................................................................................................... 110 Part 2.1 Main Effects .................................................................................................. 110 Part 2.1 Interactions .................................................................................................... 111 Qualitative Analyses ................................................................................................... 112 Phonation time ........................................................................................................ 
112 Vocal Intensity ........................................................................................................ 114 Pitch ........................................................................................................................ 115 Part 2.2 Main Effects .................................................................................................. 116 Part 2.2 Interactions .................................................................................................... 117 Qualitative Analysis .................................................................................................... 117 Phonation time ........................................................................................................ 118 Vocal Intensity ........................................................................................................ 119 Pitch ........................................................................................................................ 120 Hypothesis 2.................................................................................................................... 122 Part 2.1 Results ........................................................................................................... 122 Qualitative Analysis .................................................................................................... 122 Part 2.2 Results ........................................................................................................... 125 Qualitative Analysis .................................................................................................... 125 Hypothesis 3.................................................................................................................... 127 Part 2.1 Main Effects .................................................................................................. 127 Part 2.1 Interactions .................................................................................................... 127 Qualitative Analysis .................................................................................................... 128 Pitch Strength .......................................................................................................... 128 First Moment Specific Loudness ............................................................................ 130 Part 2.2 Main Effects .................................................................................................. 132 Part 2.2 Interactions .................................................................................................... 132 Qualitative Analysis .................................................................................................... 132 Pitch Strength .......................................................................................................... 133 First Moment Specific Loudness ............................................................................ 134 Summary ................................................................................................................................. 136 Conclusions ............................................................................................................................. 138 ix CHAPTER 6: Study Discussion and Conclusions ...................................................................... 
139 Summary of Findings .............................................................................................................. 139 Study Limitations .................................................................................................................... 141 Feedback Issues ...................................................................................................................... 143 Future Implications ................................................................................................................. 145 Conclusions ............................................................................................................................. 145 APPENDICES ............................................................................................................................ 147 Appendix A: Intake Form (Modified VBALAB form) .......................................................... 148 Appendix B: Initial Semi-Structured Interview (Parts 1 & 2) ................................................ 150 Appendix C: Part 1 Semi-Structured Interview Questions ..................................................... 151 Appendix D: Initial Feedback Displays .................................................................................. 152 Appendix E: Iteration 1 Feedback Displays ........................................................................... 159 Appendix F: Iteration 2 Feedback Displays ........................................................................... 166 Appendix G: Iteration 3 Feedback Displays ........................................................................... 174 Appendix H: Iteration 4 Feedback Displays: .......................................................................... 180 Appendix I: Final Feedback Displays ..................................................................................... 187 Appendix J: Part 2 Midpoint and Final Semi-Structured Interview Questions ...................... 192 Appendix K: Sample Feedback (Part 2.1) .............................................................................. 197 Appendix L: Supplemental Feedback Displays (Part 2.2) ...................................................... 207 Appendix M: Part 2.1 Readiness to Change ........................................................................... 211 Appendix N: Part 2.1 Self-Efficacy ........................................................................................ 212 Appendix O: Part 2.1 Vocal Fatigue Index ............................................................................. 213 Appendix P: Part 2.2 Readiness to Change ............................................................................ 214 Appendix Q: Part 2.2 Self-Efficacy ........................................................................................ 215 Appendix R: Part 2.2 Vocal Fatigue Index ............................................................................. 216 Appendix S: Steps for Feedback Analysis (Part 2.1) .............................................................. 217 Appendix T: Steps for Feedback Analysis (Part 2.2) ............................................................. 223 Appendix U: Part 2.1 Phonation Time .................................................................................... 229 Appendix V: Part 2.1 Vocal Intensity ..................................................................................... 
230 Appendix W: Part 2.1 Pitch .................................................................................................... 231 Appendix X: Part 2.2 Phonation Time .................................................................................... 232 Appendix Y: Part 2.2 Vocal Intensity ..................................................................................... 233 Appendix Z: Part 2.2 Pitch ..................................................................................................... 234 Appendix AA: Average Pitch Strength (Part 2.1) .................................................................. 235 Appendix AB: Average First Moment Specific Loudness (Part 2.1) ..................................... 236 Appendix AC: Average Pitch Strength (Part 2.2) ................................................................... 237 Appendix AD: Average First Moment Specific Loudness (Part 2.2) ..................................... 238 BIBLIOGRAPHY ....................................................................................................................... 239 x LIST OF TABLES Table 1: Partial correlations between spectral moments and strain ratings–––––––..–.19 Table 2: Study design overview––––––––––––––––––––––––.–.31 Table 3: List of inclusion criteria––––––––––––––––––––...–––.–33 Table 4: Part 1 Participant Demographics––––––––––––––––––––.–.35 Table 5: Part 2, Phase 1 Demographics–––––––––––––––––––––.–.53 Table 6: Part 2, Phase 2 Demographics–––––––––––––––––––––.–.54 Table 7: Basic course structure, by instructor––––––––––––––––..––..–.61 Table 8: Emergent themes related to Aim 1–––––––––––––––––––.–..64 Table 9: Emergent themes related to Aim 2–––––––––––––––––––..–.74 Table 10: Comparison of Part 2.1 scores for the initial interview. Reported are the mean difference in measures across participants (standard deviation) and p-value–––––––––––..–84 Table 11: Comparison of Part 2.1 scores for the midpoint interview. Reported are the mean difference in measures across participants (standard deviation) and p-value.–––––.––..85 Table 12: Comparison of Part 2.1 scores for the final interview. Reported are the mean difference in measures across participants (standard deviation) and p-value––––.––––––––.85 Table 13: Ethnic category–––––––––––––––––––––––––––.149 Table 14: Racial category––––––––––––––––––––––––.–––149 xi LIST OF FIGURES Figure 1: VoxLog collar and Roland recorder–––––––––––..–––.–––.––..4 Figure 2: Outline of Part 1...––––––––––––––.––––––––––––..34 Figure 3: Initial icons representing objective voice measures. Icons and the number of measures changed in later iterations based on participant input.––––––––––––––.––...36 Figure 4: Outline of Part 2––––––––––––––––––.–––––––––.55 Figure 5: Sound-treated booth configuration for Part 2.1. 
The two squares indicate the location of chairs for the participant during the session and the researcher during the presentation of the feedback–––––––––––––––––––––––––––––––––.....59 Figure 6: Average Readiness to Change scores over time by gender for Part 2.1–––––..–.86 Figure 7: Average Self-Efficacy scores over time by gender for Part 2.1–––––––..–.–.87 Figure 8: Average Vocal Fatigue Index, Factor 1 scores over time by gender for Part 2.1––...88 Figure 9: Average Vocal Fatigue Index, Factor 2 scores over time by gender for Part 2.1–––88 Figure 10: Average Vocal Fatigue Index, Factor 3 scores over time by gender for Part 2.1–––89 Figure 11: Average Readiness to Change scores over time by gender for Part 2.2–––––...93 Figure 12: Average Self-Efficacy scores over time by gender for Part 2.2–––––––––.94 Figure 13: Average Vocal Fatigue Index, Factor 1 scores over time by gender for Part 2.2––..95 Figure 14: Average Vocal Fatigue Index, Factor 2 scores over time by gender for Part 2.1–......96 Figure 15: Average Vocal Fatigue Index, Factor 3 scores over time by gender for Part 2.1–..–96 Figure 16: Average phonation time for each session by gender for Part 2.1––––––..–..113 Figure 17: Average phonation time by recording type for Part 2.1. Results are reported separately by gender.–––––––––––––––––––––...–––––––––...–...113 Figure 18: Average vocal intensity for each session by gender for Part 2.1––––––––..114 Figure 19: Average vocal intensity by recording type for Part 2.1. Results are reported separately by gender.–––––––––––––––––––––..––..–––––.––––115 xii Figure 20: Average pitch for each session by gender for Part 2.1––––––––..–.–..–116 Figure 21: Average pitch by recording type for Part 2.1. Results are reported separately by gender–––––––––––––––––––––––––––––.–––...–...116 Figure 22: Average phonation time for each session by gender for Part 2.2––––––––.118 Figure 23: Average phonation time by recording type for Part 2.2. Results are reported separately by gender–.––––––––––––––––––––..–––––––––.––..119 Figure 24: Average vocal intensity for each session by gender for Part 2.2–––––––.–119 Figure 25: Average vocal intensity by recording type for Part 2.2. Results are reported separately by gender.–––––––––––––––––––––..–––––––.–––.–.120 Figure 26: Average pitch for each session by gender for Part 2.2––––––––––––..121 Figure 27: Average pitch by recording type for Part 2.2. Results are reported separately by gender.––––––––––––––––––––––––––––––––––..121 Figure 28: Linear regression of vocal fatigue rating and pitch strength including P2101––.–.123 Figure 29: Linear regression of vocal fatigue rating and first moment specific loudness including P2101––––––––––––––––––––––––..––––––––.––124 Figure 30: Linear regression of vocal fatigue rating and pitch strength excluding P2101––––––––––––––––––––––––––––––––––...124 Figure 31: Linear regression of vocal fatigue rating and first moment specific loudness excluding P2101––––––––––––––––––––––––––––––––––...125 Figure 32: Linear regression of vocal fatigue rating and pitch strength.––––––.––....126 Figure 33: Linear regression of vocal fatigue rating and first moment specific loudness...–....127 Figure 34: Average change scores (pre Œ post) for pitch strength by session in Part 2.1. Positive values indicate greater pitch strength before the vocal loading task––––––...––––..129 Figure 35: Average change scores (pre Œ post) for pitch strength by recording type in Part 2.1. 
Positive values indicate greater pitch strength before the vocal loading task–––––.–..–129 Figure 36: Average change scores (pre Œ post) for first moment specific loudness by session in Part 2.1. Positive values indicate greater first moment specific loudness before the vocal loading task–..––––––––...––––––––––––––––––––––..–––131 xiii Figure 37: Average change scores (pre Œ post) for first moment specific loudness by recording type in Part 2.1. Positive values indicate greater first moment specific loudness before the vocal loading task––..–––––––––––––––––––––.––––––––..–––..131 Figure 38: Average change scores (pre Œ post) for pitch strength by session in Part 2.2. Positive values indicate greater pitch strength before the vocal loading task––––––––––....133 Figure 39: Average change scores (pre Œ post) for pitch strength by recording type in Part 2.2. Positive values indicate greater pitch strength before the vocal loading task–..––––.––.134 Figure 40: Average change scores (pre Œ post) for first moment specific loudness by session in Part 2.2. Positive values indicate greater first moment specific loudness before the vocal loading task–––––––––––––––––––––––––––––––––––...135 Figure 41: Average change scores (pre Œ post) for first moment specific loudness by recording type in Part 2.2. Positive values indicate greater first moment specific loudness before the vocal loading task––..––––––––––––––––––––––––––.–––..–––..135 Figure 42: Initial displays- Distance small multiple line graph–––––––––––––152 Figure 43: Initial displays- Distance bar graph–––––––––––––––––––152 Figure 44: Initial displays- Distance speedometer––––––––––––––––––152 Figure 45: Initial displays- Loudness bar graph–––––––––––––––––––153 Figure 46: Initial displays- Loudness clock––––––––––––––––––––..153 Figure 47: Initial displays- Loudness small multiple sparklines––––––––––––..153 Figure 48: Initial displays- Pauses small multiple line graph––––––––––........–...154 Figure 49: Initial displays- Pauses clock–––––––––––––––––––––..154 Figure 50: Initial displays- Pauses bar graph––––––––––––––––––––154 Figure 51: Initial displays- Quality small multiple smileys––––––––––––––.155 Figure 52: Initial displays- Quality small multiple line graphs–––––––––––––155 Figure 53: Initial displays- Quality multi-day line graph–––––––––––––––.155 Figure 54: Initial displays- Clarity single-day line graph–––––––––––––––..156 Figure 55: Initial displays- Clarity vertical line graph––––––––––––––––..156 xiv Figure 56: Initial displays- Clarity bar graph––––––––––––––––––––156 Figure 57: Initial displays- Strain small multiple line graph–..––––––––––––..157 Figure 58: Initial displays- Strain multi-day line graph––––––––––––––––157 Figure 59: Initial displays- Strain bar graph––––––––––––––––––––.157 Figure 60: Initial displays- Multi-measure matrix––––––––––––––––––158 Figure 61: Iteration 1 displays- Dynamic loudness icons–––––––––––––––.159 Figure 62: Iteration 1 displays - Dynamic pause icons––––––––––––––––.159 Figure 63: Iteration 1 displays - Dynamic quality icons––––––––––––––––160 Figure 64: Iteration 1 displays - Dynamic strain icons––––––––––––––––.160 Figure 65: Iteration 1 displays Œ Loudness small multiple sparklines––––––––––..161 Figure 66: Iteration 1 displays Œ Loudness multiple sparklines–––––––––––––161 Figure 67: Iteration 1 displays Œ Individual pause time and length–––––––––––..162 Figure 68: Iteration 1 displays Œ Pause clocks and line graph–––––––––––––..162 Figure 69: Iteration 1 displays Œ Pause line 
graph––––––––––––––––––162 Figure 70: Iteration 1 displays Œ Quality small multiple smileys––––––––––––.163 Figure 71: Iteration 1 displays Œ Quality multi-line graph–––––––––––––––163 Figure 72: Iteration 1 displays Œ Quality small multiple line graph–––––––––––..163 Figure 73: Iteration 1 displays Œ Strain multi-shade line graph–––––––––––––164 Figure 74: Iteration 1 displays Œ Strain labelled line graph––––––––––––––..164 Figure 75: Iteration 1 displays - Multi-measure matrix––––––––––––––––165 Figure 76: Iteration 2 displays- Dynamic loudness icons–––––––––––––––.166 Figure 77: Iteration 2 displays - Dynamic pause icons––––––––––––––––..166 Figure 78: Iteration 2 displays - Dynamic quality icons–––––––––––––––...167 xv Figure 79: Iteration 2 displays - Dynamic strain icons––––––––––––––––.167 Figure 80: Iteration 2 displays Œ Loudness single-day sparkline––––––––––––..168 Figure 81: Iteration 2 displays Œ Loudness small multiple sparklines––––––––––.168 Figure 82: Iteration 2 displays Œ Pause line graph––––––––––––––––––.169 Figure 83: Iteration 2 displays - Individual pause time and length–––––––––––...169 Figure 84: Iteration 2 displays Œ Pause clocks and counts–––––––––––––––169 Figure 85: Iteration 2 displays Œ Quality small multiple smileys––––––––––––..170 Figure 86: Iteration 2 displays Œ Quality small multiple line graphs–––––––––––170 Figure 87: Iteration 2 displays Œ Quality single day line graph–––––––––––––170 Figure 88: Iteration 2 displays - Strain labelled line graph––––––––––––––..171 Figure 89: Iteration 2 displays Œ Strain day and night line graph––––––––––––.171 Figure 90: Iteration 2 displays Œ Multi-measure matrix––––––––––––––––172 Figure 91: Iteration 2 displays - Dynamic icon array–––––––––––––––––172 Figure 92: Iteration 2 displays Œ Multi-measure man––––––––––––––––...173 Figure 93: Iteration 3 displays ŒIcons for all four measures, with two pause options––––.174 Figure 94: Iteration 3 displays ŒLoudness single day sparkline–––––––––––––175 Figure 95: Iteration 3 displays Œ Loudness small multiple sparklines––––––––––..175 Figure 96: Iteration 3 displays Œ Loudness single day sparkline with zeros––––––––175 Figure 97: Iteration 3 displays Œ Individual pause time and length–––––––––––..176 Figure 98: Iteration 3 displays Œ Pause bar graph with counts–––––––––––––.176 Figure 99: Iteration 3 displays Œ Quality small multiple smileys––––––––––––.177 Figure 100: Iteration 3 displays Œ Quality small multiple line graphs–––––––––.–177 Figure 101: Iteration 3 displays Œ Quality single day line graph––––––––––––.177 xvi Figure 102: Iteration 3 displays Œ Strain labelled line graph––––––––––––––178 Figure 103: Iteration 3 displays Œ Strain small multiple line graph––––––––––.–178 Figure 104: Iteration 3 displays Œ Strain day and night bar graph–––––––––––...178 Figure 105: Iteration 3 displays Œ Multi-measure matrix–––––––––––––––.179 Figure 106: Iteration 4 displays Œ Icons for the four measures–––––––––––––180 Figure 107: Iteration 4 displays Œ Loudness multi-day danger zone counts––––––––181 Figure 108: Iteration 4 displays Œ Loudness single day sparkline––––––––––––.181 Figure 109: Iteration 4 displays Œ Loudness small multiple sparklines––––––––––181 Figure 110: Iteration 4 displays Œ Pause multi-day counts–––––––––––––––182 Figure 111: Iteration 4 displays Œ Individual pause time and length–––––––––––182 Figure 112: Iteration 4 displays Œ Pause small multiple bar graphs–––––––––––.182 Figure 113: Iteration 4 displays Œ Pause bar graph––––––––––––––––––183 Figure 
114: Iteration 4 displays Œ Quality small multiple smileys––––––––––––184 Figure 115: Iteration 4 displays Œ Quality single day line graph–––––––––––.–..184 Figure 116: Iteration 4 displays Œ Quality small multiple line graphs–––––––––.–..184 Figure 117: Iteration 4 displays Œ Strain small multiple arrows––––––––––––...185 Figure 118: Iteration 4 displays Œ Strain small multiple time points–––––––––––185 Figure 119: Iteration 4 displays Œ Strain labelled line graphs–––––––––––––..185 Figure 120: Iteration 4 displays Œ Multi-measure matrix–––––––––––––––.186 Figure 121: Final displays Œ Icons for the four measures–––––––––––––––..187 Figure 122: Final displays Œ Layered structure for loudness––––––––––––––188 Figure 123: Final displays Œ Layered structure for pauses––––––––––––.–.–..189 Figure 124: Final displays Œ Layered structure for quality–––––––––––––.–..190 xvii Figure 125: Final displays Œ Layered structure for strain–––––––––––––––..191 Figure 126: Initial image seen by participants. Icons are displayed in a different random order for each participant. This participant™s order is: pauses, quality, and strain–––––––––..197 Figure 127: First pause display. This display shows the pause count (number of pauses equal to or greater than one second in length) over the course of the reading VLT–––––––––...198 Figure 128: Second pause display. This display shows the amount of time (% total time) spent in pauses equal to or greater than one second in length for each 3 minutes of reading–––––.199 Figure 129: Third pause display. Participants had one of these for each day. This display shows when these pauses of a second or greater occurred, and their individual durations–––––.200 Figure 130: First quality display. The smileys indicate the average value for quality for each day (across all 15 minutes of reading). Note that the smiley with a straight line for a mouth is equal to fibaselinefl–––––––––––––––––––––––––––––––––.201 Figure 131: Second quality display. The fibaselinefl smiley is the average of the first minute of the baseline recordings (3 total)–––––––––––––––––––––––...–202 Figure 132: Third quality display. Amplified image of one day™s quality. Participants had one of these for each day––––––––––––––––––––––––––––––.203 Figure 133: First strain display. The stick figures indicate the average value for the /after the reading VLT (whichever value they are closest to). Note that the stick figure for days 3, 5, and 7 is the fibaseline or betterfl stick figure––––––––––––..––––––..204 Figure 134: Second strain display. Note that if the value went above the second stick figure, it was considered to be in the fidanger zonefl and was colored red––––––––––––.205 Figure 135: Third strain display. Amplified image of one day™s strain. Just like for quality, participants had one of these for each day––––––––––––––...––––––206 Figure 136: Example quality display. This display shows the difference between the quality display in Part 2.1, where instead of one 15-minute segment, two 15 minute segments are shown––––––––––––––––––––––––––––––––––...207 Figure 137: First loudness display. The fidanger zonefl differs by day. It represents time spent greater than 2 standard deviations of the mean above the average dB level for that day–––208 xviii Figure 138: Second loudness display. The fidanger zonefl is indicated by the red dashed lines–––––––––––––––––––––––––––––––––––..209 Figure 139: Third loudness display. Amplified image of one day™s loudness pattern. 
Just like for quality, participants had one of these for each day–––––––––––––––––..210 Figure 140: The change in readiness to change for each participant in Part 2.1 from the initial to the midpoint to the final interview–––––––––––––––––––...––––211 Figure 141: The change in self-efficacy for each participant in Part 2.1 from the initial to the midpoint to the final interview––––––––––––––––––––––––.–212 Figure 142: The change in vocal fatigue for each participant in Part 2.1 from the initial to the midpoint to the final interview–––––––––––––––––––––––––.213 Figure 143: The change in readiness to change for each participant in Part 2.2 from the initial to the midpoint to the final interview––––––––––––––––––––––...–214 Figure 144: The change in self-efficacy for each participant in Part 2.2 from the initial to the midpoint to the final interview–––––––––––––––––––––––––.215 Figure 145: The change in vocal fatigue for each participant in Part 2.2 from the initial to the midpoint to the final interview–––––––––––––––––––––––––.216 Figure 146: The change in average phonation time for each participant in Part 2.1 across the eight recording sessions. B# indicates the baseline recording number, and F# indicates the feedback recording number––––––––––––––––––––––––––..229 Figure 147: The change in average vocal intensity for each participant in Part 2.1 across the eight recording sessions. B# indicates the baseline recording number, and F# indicates the feedback recording number––––––––––––––––––––––––––––––.230 Figure 148: The change in average pitch for each participant in Part 2.1 across the eight recording sessions. B# indicates the baseline recording number, and F# indicates the feedback recording number––––––––––––––––––––––––––––––.231 Figure 149: The change in average phonation time for each participant in Part 2.2 across the eight recording sessions. B# indicates the baseline recording number, and F# indicates the feedback recording number––––––––––––––––––––––––––..232 xix Figure 150: The change in average vocal intensity for each participant in Part 2.2 across the eight recording sessions. B# indicates the baseline recording number, and F# indicates the feedback recording number–––––––––––––––––––––––––––––.....233 Figure 151: The change in average pitch for each participant in Part 2.2 across the eight recording sessions. B# indicates the baseline recording number, and F# indicates the feedback recording number––––––––––––––––––––––––––––––.234 Figure 152: The average pitch strength for each participant in Part 2.1 before and after each vocal loading task. B# indicates the baseline recording number, and F# indicates the feedback recording number––––––––––––––––––––––––––––––.235 Figure 153: The average first moment specific loudness for each participant in Part 2.1 before and after each vocal loading task. B# indicates the baseline recording number, and F# indicates the feedback recording number–––––––––––––––––––––––––236 Figure 154: The average pitch strength for each participant in Part 2.2 before and after each vocal loading task. B# indicates the baseline recording number, and F# indicates the feedback recording number–––––.––––––––––––––––––––––––....237 Figure 155: The average first moment specific loudness for each participant in Part 2.2 before and after each vocal loading task. 
B# indicates the baseline recording number, and F# indicates the feedback recording number––––––––238

KEY TO ABBREVIATIONS

PWP  Persons with Parkinson's Disease
PVM  Preventative Voice Monitoring
F0  Fundamental frequency
dB  Decibels
GRBAS  Grade, Roughness, Breathiness, Asthenia, Strain
CAPE-V  Consensus Auditory-Perceptual Evaluation of Voice
CPP  Cepstral Peak Prominence
CPPS  Smoothed Cepstral Peak Prominence
ERB  Equivalent rectangular bandwidth
SOC  Stages of Change
TTM  Transtheoretical Model
PC  Precontemplation (stage of change)
C  Contemplation (stage of change)
PA  Preparation (stage of change)
A  Action (stage of change)
M  Maintenance (stage of change)
URICA  University of Rhode Island Change Assessment
RTC  Readiness to change
S-E  Self-efficacy
VFI  Vocal Fatigue Index
HRPP  Human Research Protection Program (Institutional Review Board)
Loudness  Vocal intensity (simplified name in feedback)
Quality  Pitch strength (simplified name in feedback)
Pauses  Frequency and duration of pauses (simplified name in feedback)
Strain  First moment specific loudness (simplified name in feedback)
Clarity  Cepstral Peak Prominence (simplified name in feedback)
Distance  Distance travelled by vocal folds (simplified name in feedback)
ANOVA  Analysis of variance
VLT  Vocal loading task
SPL  Sound pressure level
VBALAB  Voice BioAcoustics LABoratory

CHAPTER 1: Introduction

The Problem

One-third of adults will experience a voice disorder during their lifetime (Roy, Merrill, Gray, & Smith, 2005). Many voice disorders stem from vocal overuse and misuse (Child & Johnson, 1991), which suggests that many voice problems may be preventable. Voice disorders resulting from vocal overuse or misuse are often treated successfully through behavior modification. For example, a retrospective study by McCrory (2001) found that 70% of patients with vocal fold nodules experienced reduction or elimination of nodules through therapy alone. Additionally, therapists encourage clients to avoid recurrence by continuing to use the modified vocal behavior learned in therapy. If behavior modification can treat voice disorders and prevent their recurrence, why aren't more people changing their behavior to prevent voice disorders from occurring in the first place?

Many people may be unaware of their risk for developing a voice disorder, or unaware of which behaviors place them at increased risk. Approximately 25% of the U.S. workforce (over 30 million people) are at increased risk for developing voice disorders because of their chosen occupations (Hunter & Titze, 2009; Titze, Lemke, & Montequin, 1997). This increased risk stems from the pivotal role that an individual's voice plays in his/her daily work routine. These individuals have been referred to as occupational voice users and include professionals such as teachers, speech-language pathologists, salespeople, actors, broadcasters, singers, receptionists, lawyers, members of the clergy, and psychologists (Hunter & Titze, 2009; Titze et al., 1997). Because these individuals' voices are an integral part of their jobs, voice disorders would be highly detrimental to their ability to work in their chosen professions (Titze et al., 1997). Therefore, measures should be taken to educate and facilitate behavior change for individuals at risk for voice disorders. These professionals should not have to wait until a voice problem develops before they are given support and resources for behavior modification.
A Potential Solution

Quantified self is a phenomenon in which people collect data from life events to better understand and improve themselves (Choe, Lee, Lee, Pratt, & Kientz, 2014). Ubiquitous computing allows for the collection and analysis of data from personal sensors, giving people easy access to their own information when paired with mobile technology (Epstein, Borning, & Fogarty, 2013; Fogg, 2003). One popular aspect of quantified self is physical activity tracking. Devices such as the Fitbit, Jawbone UP, and Nike+ Fuelband help individuals track their physical activity (Fritz, Huang, Murphy, & Zimmermann, 2014; Harrison, Berthouze, Marshall, & Bird, 2014). Many adults focus on increasing their physical activity to reduce the risk of diseases such as heart disease (Manson et al., 1999) and Type II diabetes (Hu, Sigal, Rich-Edwards, et al., 1999). Some adults simply use a device to track their physical activity over the course of the day (Fritz et al., 2014), while others also use additional device features such as calorie-intake tracking.

Physical activity trackers are a type of persuasive system. Persuasive systems are defined as "interactive computing systems designed to change people's attitudes and behaviors" (Fogg, 2003). The persuasive nature of these systems stems from the application of principles such as self-monitoring and recognition for goal attainment (Fogg, 2003; Oinas-Kukkonen & Harjumaa, 2008). In addition, this type of system can give individuals room to explore and experiment with strategies, similar to the fish tanks and sandboxes used in video games that allow players to try different things in a relatively "safe" environment (Gee, 2008). Allowing exploration and experimentation can increase the robustness of learning, including in communication sciences and disorders (Alfieri, Brooks, Aldrich, & Tenenbaum, 2011; Folkins, Brackenbury, Krause, & Haviland, 2015).

In the field of communication sciences and disorders, a number of studies have examined clinical applications of quantified self. For example, prior research explored the use of a wearable device for monitoring saliva management in persons with Parkinson's disease (PWP) (McNaney et al., 2011). Additional studies have examined the use of Google Glass for monitoring voice and speech in PWP (McNaney et al., 2014, 2015; McNaney et al., 2016). Researchers in voice science have also explored quantified self. An iOS application was developed to help individuals with voice disorders monitor their home practice of voice therapy (van Leer, Pfister, & Zhou, 2016). In addition, researchers have investigated potential applications for voice dosimeters, devices that offer long-term voice monitoring (Carroll et al., 2006; Ghassemi et al., 2014; Hunter & Titze, 2009). Like physical activity trackers, these devices can be worn daily to track vocal activity. A voice dosimeter includes an accelerometer that is worn at the neck and can either be held in place with surgical tape or be part of a collar worn around the neck (Figure 1). Dosimeters have been used in multiple contexts, including the study of voice use patterns in populations with voice disorders and in high-risk populations (Carroll et al., 2006; Ghassemi et al., 2014; Hunter & Titze, 2010), and for potential clinical applications (Hillman, Heaton, Masaki, Zeitels, & Cheyne, 2006; Misono, Banks, Gaillard, Goding, & Yueh, 2015).
However, there is a paucity of literature exploring the potential use of these monitoring systems as persuasive systems for voice disorder prevention.

Figure 1: VoxLog collar and Roland recorder.

The same persuasive principles that have been leveraged for preventative health using physical activity trackers could be adapted for voice use monitoring. This could reduce the need to seek voice therapy, reducing the cost to both the occupational voice user and the healthcare system. In addition, if someone does need to seek treatment for a voice disorder, the increased awareness of vocal behavior gained from using this type of system may contribute to improved therapy outcomes.

The Current Study

Many individuals use self-monitoring devices, such as those that track physical activity, to observe their current performance and form future goals. This study seeks to determine whether providing feedback on voice use might have a similar impact on vocal behavior, especially for individuals who are at high risk for voice disorders. With many individuals at high risk for voice disorders, this is a potentially unmet need.

While some studies have evaluated traditional methods, such as voice hygiene education or voice therapy, for voice disorder prevention in occupational voice users and reported positive results (Nanjundeswaran et al., 2012; Richter, Nusseck, Spahn, & Echternach, 2016), wider implementation of these types of protocols may be restricted by reduced motivation as well as cost. Preventative intervention by a trained voice professional may be cost-prohibitive, especially if coverage is denied by insurance, which has been shown to be a common factor in non-adherence to voice therapy (Portone, Johns III, & Hapner, 2008). On the other hand, a voice monitoring tool may prove more cost-effective and more readily available when an individual is interested in changing their own voice.

The current study is designed to assess whether Preventative Voice Monitoring (PVM), using a dosimeter with task-based feedback, may impact vocal behavior in current and future occupational voice users. Additionally, the results will provide preliminary evidence for which measures are most sensitive to changes resulting from PVM, including both objective measures from the voice signal and behavior change measures.

CHAPTER 2: Literature Review

Voice Use in the Workplace

When the vocal folds vibrate, they create sound. The vocal folds of occupational voice users vibrate frequently throughout the day, and many occupational voice users report voice changes associated with sustained voice use. For example, one study found that 44% of air traffic controllers, a subpopulation of occupational voice users, experienced changes in voice quality after a work shift (Villar, Korn, & Azevedo, 2016). The changes due to prolonged voice use reported by occupational voice users can be attributed to vocal fatigue. Solomon (2008) defines vocal fatigue as "the self-report of an increased sense of effort with prolonged phonation." This increased sense of effort is likely attributable to vocal hyperfunction [voice production with "overstrained muscles" (Froeschels, 1952)] used to compensate for the sensation of vocal fatigue (Hillman, Holmberg, Perkell, Walsh, & Vaughan, 1989; Solomon, 2008). However, vocal fatigue is generally transient in nature, because the voice returns to normal after voice rest (Hillman et al., 1989).

Despite the temporary duration of vocal fatigue, Hillman et al.
(1989) described the hypothetical relationship between vocal fatigue, vocal hyperfunction, and the development of voice disorders. The researchers hypothesized that hyperfunction of normal vocal folds leads to vocal fatigue, with the vocal folds returning to normal with rest. However, they also hypothesized that if vocal hyperfunction persists, it can lead to the development of voice disorders. This suggests that occupational voice users, whose vocal demands often require continued voice use in the presence of vocal fatigue, are at increased risk for voice disorders.

Safety standards for occupational voice use are one way to protect occupational voice users. Safety standard recommendations have been made for exposure to multiple types of vibration in the workplace, including standards for hand and whole-body vibration exposure (Griffin, 2004) and hearing protection (OSHA, 2008). However, safety standards for voice use have only recently been recommended (Titze, 2012; Titze, Švec, & Popolo, 2003). Unlike other vibration-related standards, which are generalizable across the population, vocal dosing standards are based on multiple factors related to an individual, including gender, speaking fundamental frequency, and potential genetic factors (Titze, 2012; Titze et al., 2003). Fundamental frequency (F0) is a measure of the number of cycles per second (hertz) that occur during vocal fold vibration. The perceptual correlate of F0 is pitch (Titze, 2000). The researchers determined that the interaction of these factors (gender, fundamental frequency, genetic factors) with vocal intensity and time dose determines an individual's risk of developing a voice disorder. Intensity measures the energy of a sound in decibels (dB). The perceptual correlate of intensity is loudness (Titze, 2000). Time dose measures the accumulated voicing time, the time the vocal folds are oscillating (Švec, Popolo, & Titze, 2003). This measure involves detecting each period when the vocal folds are vibrating (a binary decision) and summing these periods. Time dose is related to phonation time, which is generally reported as the percent of time the person is phonating. Because of the variation of all these factors across individuals, individualized prevention measures are critical to the continued vocal health of occupational voice users.

Traditional Methods of Voice Disorder Prevention

A number of studies have explored the potential of voice disorder prevention. Many of these studies have investigated more traditional means of training, commonly referred to as indirect and direct intervention. Indirect intervention methods involve the management of non-voice-production contributors to voice disorders, and often include education on how to improve vocal hygiene (Speyer, 2008). Vocal hygiene education varies between protocols, but usually involves some or all of the following elements: moderating voice use, avoiding phonotraumatic behavior such as screaming, and increasing hydration (Achey, He, & Akst, 2016). On the other hand, direct intervention involves modifying voice production (Speyer, 2008). Gartner-Schmidt, Roth, Zullo, & Rosen (2013) found that some of the most common types of direct intervention for voice therapy include resonant voice, easy voice production with vibrations felt forward in the mouth (Grillo & Verdolini, 2008), and flow phonation, voice production with appropriate airflow to allow easy vibration of the vocal folds (Titze, 2015).
In prior studies of voice disorder prevention, researchers employed indirect methods, direct methods, or a combination of the two. Indirect intervention was categorized as voice education and/or vocal hygiene education (Chan, 1994; Duffy & Hazlett, 2004; Nanjundeswaran et al., 2012). The following methods were categorized as direct interventions: vocal warm-ups (Pereira, Masson, & Carvalho, 2015), breathing training (Pereira et al., 2015), voice training (Ohlsson et al., 2015), adapted Lessac-Madsen Resonant Voice Therapy (Nanjundeswaran et al., 2012), course attendance with voice tracking (Bovo, Galceran, Petruccelli, & Hatzopoulos, 2007), and non-specific fidirect trainingfl (Duffy & Hazlett, 2004). Authors reported significant (Bovo et al., 2007; Chan, 1994; Ohlsson et al., 2015; Pereira et al., 2015) or trending (Duffy & Hazlett, 2004; Nanjundeswaran et al., 2012) improvements in voice in response to intervention. In one study reporting trends, the authors did not report statistical results due to a small number of participants (Nanjundeswaran et al., 2012). 9 For the other study reporting trends (Duffy & Hazlett, 2004), there were improvements on two measures (Dysphonia Severity Index, Voice Handicap Index) for individuals in the direct intervention group, and poorer performance on two measures (Vocology Screening Profile, Voice Handicap Index) for individuals in the indirect intervention group. Voice Handicap Index scores improved for the control group, with poorer scores on the other two assessments. These results suggest that direct intervention is better than indirect for voice disorder prevention. However, the reason for decreasing scores for those in indirect intervention, and improvements for those in the control group is unclear. While there is some support for voice disorder prevention interventions, these methods often occur under the supervision of a trained voice professional. These interventions provide strategies to occupational voice users, with some interventions providing feedback on performance. However, the majority of the work is self-monitoring, and the burden is left to the occupational voice user. The outcomes from two studies (van Leer & Connor, 2012; van Leer et al., 2016) have demonstrated that technology support can increase the likelihood of home practice of therapy, and this same principle could apply to voice disorder prevention. Technology can be used as a persuasive system, a system designed to support attitude and behavior change (Oinas-Kukkonen & Harjumaa, 2008). Persuasive Systems The use of computers as persuasive technology, coined ficaptologyfl (Fogg, 2003), is not a new concept. Compared with human persuaders, Fogg (2003) identified multiple advantages for computers as persuaders: ubiquity (can be nearly anywhere), greater persistence in persuasion, greater anonymity for both the persuaders and those being persuaded, ability to manage large 10 amounts of data, and persuasion tactics can easily be scaled for a larger audience. To make the persuasive system effective, Intille (2004) suggested that messages should be easy to understand, occur at the firightfl time and place, and should not be annoying or intrusive. Fogg (2003) introduced the functional triad to categorize persuasive systems. 
The functional triad is used to categorize persuasive systems into three non-mutually exclusive categories: (1) a tool for increasing users' abilities to perform and/or analyze behaviors, (2) a medium for exploration and rehearsal of behaviors, and (3) a social actor that either engages the user directly or facilitates engagement with other users. Many suggested principles exist for persuasive systems, and the best principles to use depend on the type of system. These have been outlined in prior literature (Fogg, 2003; Oinas-Kukkonen & Harjumaa, 2008). Oinas-Kukkonen & Harjumaa (2008) summarized these principles by assigning them to one of four categories: primary task support (guiding the user to improve the target behavior), human-computer dialogue support (supporting the user in the behavior change process through rewards and suggestions), system credibility support (providing accurate and generalizable information), and optional social support (from interactions with other users). For example, tools can simplify complex tasks, tailor information for a particular user group or personalize information for a specific individual, and use self-monitoring to provide actual feedback on behavior. However, in order for a system to be persuasive, it has to include appropriate measures for providing information and support for users. In addition, an appropriate method for obtaining the measures must be established to ensure that the measures reported are as accurate as possible. Prior work in voice science has developed tools (measures and equipment) that may be applicable in preventative voice monitoring.

Voice Dosimetry

Voice dosimeters can monitor voice use patterns. These systems typically employ, at a minimum, an accelerometer attached to the skin of the neck to capture information about the vibrations of the vocal folds. The following measures can be extracted directly from the accelerometer signal: fundamental frequency (F0), vocal intensity, and time dose (Titze et al., 2003). Increases in these measures have been established as risk factors for the development of voice disorders (Titze, 2012; Titze et al., 2003). In addition, distance dose can be calculated from these three measures. Distance dose is a measure of how far the vocal folds have traveled, in centimeters, and is a mathematical combination of fundamental frequency, intensity, and time dose (Titze et al., 2003).

A number of voice dosimetry studies have observed occupational voice users over a period of days or weeks. For example, Titze, Hunter, & Švec (2007) compared periods of voicing and silence for a group of teachers over an average of 13.3 days per teacher, including both work time and non-work time (evenings and weekends). One interesting finding was that voicing episodes (the majority of which lasted less than one second) occurred an average of 1800 times per hour on weekdays and an average of 1200 times per hour on weekends, corresponding to an average of 23% voicing per hour on weekdays and 13% on weekends. In another study, Schloneger & Hunter (2016) observed voice use patterns of college/university singing students over a three-day period. Data analysis included fundamental frequency, vocal intensity, and time dose, in addition to other measures. The study found significant increases in all three measures during singing compared with other times of the day.
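To make the relationships among these dose measures concrete, the sketch below shows how frame-level dosimeter output might be accumulated into time dose, phonation time, and a distance-dose-style quantity. It is a minimal illustration in Python rather than the algorithm of any particular dosimeter: the frame length, the synthetic voicing decisions, and especially the estimate_amplitude_cm helper (standing in for the empirical amplitude estimate that Titze et al., 2003, derive from intensity and F0) are assumptions introduced here for clarity.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Frame:
    """One analysis frame of dosimeter output."""
    voiced: bool    # binary voicing decision for this frame
    f0_hz: float    # fundamental frequency estimate (Hz), meaningful only if voiced
    spl_db: float   # vocal intensity estimate (dB SPL), meaningful only if voiced

FRAME_SEC = 0.05    # assumed frame length; real devices differ

def estimate_amplitude_cm(spl_db: float, f0_hz: float) -> float:
    """Placeholder for an empirical vocal fold vibration amplitude estimate.

    Published dose calculations derive amplitude from intensity and F0 using
    empirical rules; a fixed nominal value is used here only so the sketch
    runs end to end.
    """
    return 0.1  # roughly 1 mm, an order-of-magnitude placeholder

def dose_summary(frames: List[Frame]) -> dict:
    total_sec = len(frames) * FRAME_SEC
    time_dose_sec = 0.0       # accumulated voicing time
    distance_dose_cm = 0.0    # accumulated vocal fold path length
    for fr in frames:
        if not fr.voiced:
            continue
        time_dose_sec += FRAME_SEC
        amp_cm = estimate_amplitude_cm(fr.spl_db, fr.f0_hz)
        # Per vibratory cycle the folds travel roughly four amplitudes;
        # cycles in this frame = F0 * frame length.
        distance_dose_cm += 4.0 * amp_cm * fr.f0_hz * FRAME_SEC
    return {
        "time_dose_sec": time_dose_sec,
        "phonation_time_pct": 100.0 * time_dose_sec / total_sec if total_sec else 0.0,
        "distance_dose_cm": distance_dose_cm,
    }

# Example: one minute of frames alternating between voiced and silent.
frames = [Frame(voiced=(i % 2 == 0), f0_hz=200.0, spl_db=75.0) for i in range(1200)]
print(dose_summary(frames))
```

The phonation time percentage computed this way corresponds to the percent-voicing figures reported in the monitoring studies above.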
In addition to monitoring studies, voice dosimeters are starting to be evaluated for their potential use in biofeedback. Van Stan, Mehta, & Hillman (2015) completed a proof-of-concept study to determine whether vocal intensity could be used as a biofeedback measure. While the intention was to assess the feasibility of using biofeedback from dosimetry for voice therapy applications, the study was conducted using individuals without voice disorders. During biofeedback days, six participants received a vibrotactile cue (pager vibration) to reduce vocal intensity when phonation was above a certain threshold. Results of a mixed ANOVA revealed significant reductions in vocal intensity on days with biofeedback, but vocal intensity increased to baseline levels on days without biofeedback.

Vocal dosimetry has been used primarily to study fundamental frequency, vocal intensity, and time dose in speakers. Increases in all three measures have been associated with an increased risk of voice disorders in occupational voice users. While many dosimetry studies have observed patterns in voice use in occupational voice users, newer studies support the use of voice dosimetry in biofeedback. However, while the accelerometer recordings from a voice dosimeter provide a number of measures related to risk factors for voice disorders, other measures may be able to provide information about vocal fatigue.

Need for Objective Measures of Vocal Fatigue

Currently, no objective measures have been identified that can reliably detect the presence of vocal fatigue (Solomon, 2008). Solomon reported that fundamental frequency is often the measure most correlated with fatigue, but for some individuals it increases with fatigue and for others it decreases, making it difficult to determine a specific relationship. Solomon (2008) hypothesized that a likely reason for the inability to reliably measure vocal fatigue is that vocal fatigue occurs even when the sound of the voice is within the perceptually normal range and the larynx appears normal when visualized. Despite the lack of reliable objective measures for vocal fatigue, many individuals who experience vocal fatigue report perceived changes in how the voice sounds. For this reason, it is important to understand how changes in voice quality associated with vocal fatigue may be quantified and reported to enable their use in a biofeedback application for preventative voice monitoring.

Voice Quality

In addition to research for developing safety standards for voice use, researchers are also developing objective measures that predict listener perception of voice quality. Voice disorders are often characterized as being abnormal in one or more dimensions of voice quality, including breathiness, roughness, and strain. These dimensions are subjectively rated in standard voice assessment protocols, including the Grade, Roughness, Breathiness, Asthenia, Strain scale (GRBAS; Hirano, 1981) and the Consensus Auditory Perceptual Evaluation of Voice (CAPE-V; Kempster, Gerratt, Abbott, Barkmeier-Kraemer, & Hillman, 2009). Despite their common use in both clinical and research settings, there is no clear consensus on the best objective measures to predict these voice quality perceptions. There is often a disconnect between subjective ratings of the voice and objective measures from the voice signal. For example, patients often report changes in at least one voice quality dimension when they experience vocal fatigue (Colton & Casper, 2006; Stemple, 2000), although traditional acoustic measures have been unable to capture these self-reported changes (Solomon, 2008).
Researchers have evaluated a number of measures to assess their ability to predict listener perception of voice quality. A recent systematic review identified nine categories of measures, including both subjective and objective measures (Roy et al., 2013). The categories of objective measures were: acoustic measures (i.e., measures from acoustic recordings; Werth, Voigt, 14 Döllinger, Eysholdt, & Lohscheller, 2010), aerodynamic measures (i.e., measures related to airflow and air pressure; Jiang & Stern, 2004), electroglottography (i.e., measures from an electroglottogram; Baken, 1992), and image processing measures (i.e., measures from videostroboscopy and videokimography; Deliyski & Hillman, 2010). From the acoustic signal, two types of objective measures have been explored: acoustic, measured directly from the voice signal, and auditory, measures taken after filtering the acoustic signal through an auditory processing front-end (Shrivastav, 2003; Shrivastav & Sapienza, 2003). A number of acoustic measures have been studied with both normal and dysphonic voices. Some of the more commonly cited ones are described below. Perturbation measures capture the variability in cycle-to-cycle vibrations, and the most commonly reported are: jitter (frequency perturbation), shimmer (amplitude perturbation), and noise-to-harmonic ratio (amount of aperiodicity) (Ma & Yiu, 2005). One of the major limitations of these measures is that they are only applicable to nearly-periodic signals (Ma & Yiu, 2005). The ability of these measures to separate normal and dysphonic voices has been inconsistent. For example, Bhuta, Patrick, & Garnett (2004) looked at these three measures (in addition to other measures), and from these three, found that only noise-to-harmonic ratio predicted listener perception of quality (p= .02 for roughness, p= .007 for grade). Newer measures have proven to be better predictors of listener perception of voice quality. One measure is the cepstral peak prominence (CPP). CPP is a measure taken from the cepstrum, the Fourier transformation of the spectrum of a voice signal (Heman-Ackah et al., 2003). A study found that smoothed CPP (CPPS) for sustained // vowels (CPP averaged over a number of frames) had a -0.80 correlation with grade, -0.70 with breathiness, and -0.43 for roughness. These negative correlations indicated that as each of these three dimensions 15 increased, CPPS values decreased. For running speech, the correlations were greater: -0.86 for grade, -0.71 for breathiness, and -0.50 for roughness. These findings indicate that CPPS is best at predicting overall grade, moderately good at predicting breathiness, but not well correlated with roughness perception. Another measure is a mathematical combination of multiple acoustic measures, the Acoustic Voice Quality Index (AVQI). The AVQI is a combination of CPPS, harmonic-to-noise ratio, shimmer local, shimmer local dB, spectral slope, and tilt of the regression line through the spectrum (Maryn, De Bodt, & Roy, 2010). The authors found that this measure was able to reliably distinguish between normal and dysphonic voices (p<.001). While some acoustic measures have demonstrated better predictive ability of voice quality ratings, supporters of auditory measures argued that analysis of a signal first filtered in a manner similar to the human auditory system, rather than an unfiltered acoustic signal, leads to even better matches with listener perception of voice quality (Shrivastav, 2003; Shrivastav & Sapienza, 2003). 
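Before turning to that auditory front-end, the cepstral peak prominence described above can be made concrete with a brief sketch. The version below is a simplified illustration in Python (using NumPy) rather than the specific implementation used in the studies cited: the window, the quefrency search range (corresponding here to roughly 60-300 Hz), and the use of a regression line fit only over that range as the comparison baseline are assumptions made for readability.

```python
import numpy as np

def cepstral_peak_prominence(x: np.ndarray, fs: float,
                             f0_min: float = 60.0, f0_max: float = 300.0) -> float:
    """Simplified CPP (in dB) for a single frame of a voice signal.

    Steps: log magnitude spectrum -> real cepstrum -> find the peak in the
    quefrency range of plausible F0 -> report its height above a linear
    regression baseline fit over that range.
    """
    x = x * np.hanning(len(x))                  # taper the frame
    spectrum = np.abs(np.fft.rfft(x)) + 1e-12   # magnitude spectrum
    log_spec = 20.0 * np.log10(spectrum)        # in dB
    cepstrum = np.abs(np.fft.irfft(log_spec))   # real cepstrum of the dB spectrum

    quefrency = np.arange(len(cepstrum)) / fs   # seconds per cepstral bin
    lo, hi = 1.0 / f0_max, 1.0 / f0_min         # search band for the peak
    band = (quefrency >= lo) & (quefrency <= hi)
    q_band, c_band = quefrency[band], cepstrum[band]

    peak_idx = int(np.argmax(c_band))
    slope, intercept = np.polyfit(q_band, c_band, 1)   # regression baseline
    baseline_at_peak = slope * q_band[peak_idx] + intercept
    return float(c_band[peak_idx] - baseline_at_peak)

# Example: a synthetic 200 Hz "voiced" frame with a little added noise.
fs = 16000
t = np.arange(int(0.04 * fs)) / fs
frame = np.sin(2 * np.pi * 200 * t) + 0.05 * np.random.randn(t.size)
print(f"CPP ~ {cepstral_peak_prominence(frame, fs):.1f} dB")
```

A smoothed variant (CPPS) would additionally average such values across frames, and typically across neighboring quefrency bins, before reporting a single number for a vowel or passage.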
The auditory processing front-end used in the research by Shrivastav et al. (e.g., Camacho, 2007a; Shrivastav, 2003; Shrivastav, Eddins, & Anand, 2012; Shrivastav & Sapienza, 2003) was based on the model by Moore, Glasberg, & Baer (1997). This model consists of a set of filters that correspond with the series of filters in the human ear. First, an acoustic signal passes through a filter representing the outer and middle ear. This filter acts in a manner analogous to the transfer function of the outer and middle ear. The signal is then processed through a nonlinear filterbank consisting of overlapping band-pass filters characteristic of the band-pass filters along the length of the human cochlea, which convert the signal output from the linear frequency scale to the equivalent rectangular bandwidth (ERB) scale. An ERB is the frequency range around a center frequency within which small deviations in frequency are not detected by the human auditory system, and the ERB increases as the center frequency increases (Moore, 1983). This is in contrast with critical bands, whose width is constant below 500 Hz and increases with increasing center frequency above 500 Hz (Fastl & Zwicker, 2007). The result of the summation of the overlapping ERBs is an excitation pattern (in power units). The last step of the filter involves additional nonlinear compression, which increases the gain for frequencies above 500 Hz (Moore et al., 1997), resulting in a specific loudness pattern. The specific loudness pattern is a representation of the output from each cochlear filter as a function of frequency (in ERB units). The sum of the outputs across filters represents the loudness elicited by that acoustic signal, while the specific loudness pattern itself is analogous to the output delivered to the auditory nerve.
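A schematic may help to fix these stages in mind. The sketch below is only a skeleton of such a pipeline, not the Moore, Glasberg, & Baer (1997) model itself: the ERB expressions follow the form commonly used in the auditory-modeling literature, while the outer/middle-ear filter, the filterbank, and the compressive stage are reduced to simple placeholders so that the overall flow from waveform to excitation pattern to specific-loudness-like pattern remains visible.

```python
import numpy as np

def erb_bandwidth_hz(fc_hz):
    """Equivalent rectangular bandwidth around a center frequency (common approximation)."""
    return 24.7 * (4.37 * fc_hz / 1000.0 + 1.0)

def hz_to_erb_number(f_hz):
    """Map frequency in Hz onto the ERB-number scale."""
    return 21.4 * np.log10(4.37 * f_hz / 1000.0 + 1.0)

def specific_loudness_pattern(x: np.ndarray, fs: float, n_channels: int = 40):
    """Schematic waveform -> excitation -> specific loudness pipeline (placeholder stages)."""
    # 1) Outer/middle-ear filter: identity placeholder here.
    y = x

    # 2) Filterbank: sum power within ERB-wide bands whose centers are spaced
    #    evenly on the ERB-number scale, giving a crude excitation pattern.
    spec_power = np.abs(np.fft.rfft(y)) ** 2
    freqs = np.fft.rfftfreq(len(y), d=1.0 / fs)
    centers_erb = np.linspace(hz_to_erb_number(50.0), hz_to_erb_number(fs / 2.0), n_channels)
    centers_hz = (10 ** (centers_erb / 21.4) - 1.0) * 1000.0 / 4.37   # inverse mapping
    excitation = np.array([
        spec_power[(freqs > fc - erb_bandwidth_hz(fc) / 2) &
                   (freqs < fc + erb_bandwidth_hz(fc) / 2)].sum()
        for fc in centers_hz
    ])

    # 3) Compressive nonlinearity: a simple power law standing in for the
    #    model's level-dependent compression.
    specific_loudness = (excitation + 1e-12) ** 0.23
    return centers_erb, specific_loudness   # pattern as a function of ERB number

# Example: pattern for a synthetic vowel-like harmonic complex at 150 Hz.
fs = 16000
t = np.arange(int(0.1 * fs)) / fs
x = sum(np.sin(2 * np.pi * k * 150 * t) / k for k in range(1, 20))
erb_axis, pattern = specific_loudness_pattern(x, fs)
print(np.round(pattern, 2))
```

Summing such a pattern across channels gives a single loudness-like value, while the pattern itself is the representation from which the auditory measures discussed below are computed.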
The search for the best objective measures of signal quality is not unique to disordered voices. Fastl & Zwicker (2007) describe three primary dimensions of general sound quality: pitch strength, roughness, and sharpness. These authors define pitch strength as the saliency of pitch, on a scale from weak to strong (distinct), for a given acoustic signal. This is different from pitch, which is ordered on a scale from low to high. Shrivastav, Eddins, & Anand (2012) found a strong negative correlation (-0.989) between pitch strength and perceived breathiness in voices. These results suggest that pitch strength in general acoustic signals may be analogous to breathiness in voices.

Additionally, Fastl & Zwicker (2007) define roughness in an acoustic signal as the presence of modulation greater than 20 Hz that follows a bandpass function, the width of which depends on the center frequency of the tone being modulated. Along the bandpass function, the perception of roughness increases up to a certain modulation frequency and then decreases for increasing modulation frequencies. For example, for a 100% amplitude-modulated tone with a center frequency of 1000 Hz, the perception of roughness begins at a modulation frequency of 20 Hz and increases up to a modulation frequency of 70 Hz. From there, the perception of roughness decreases until the tone is no longer heard as rough, around 300 Hz. In addition to modulation frequency, modulation depth can influence the perception of roughness. Fastl & Zwicker (2007) demonstrated that modulation depth (extent of modulation) influences fluctuation strength, a perceptual measure. Fluctuation strength is perceived in an acoustic signal with modulation less than 20 Hz. The relationship between fluctuation strength and modulation depth follows a sigmoid curve, with increasing modulation depth leading to an increased perception of fluctuation strength. For dysphonic voices, Eddins & Shrivastav (2013) found a similar relationship between modulation depth and perceived vocal roughness. These results suggest that the process or phenomenon that applies to the perception of roughness in most naturally occurring sounds may also apply to the perception of roughness in dysphonic voices.

Finally, Fastl & Zwicker (2007) describe sharpness as being related to an acoustic signal's envelope, with sharpness perception being critical-band-rate dependent. Critical bands are organized on a scale from 1 (lowest) to 24 (highest), where critical bands are adjoining but not overlapping; this scale is referred to as the critical-band rate. Sharpness is calculated using the specific loudness of a given sound (over the critical-band range), with additional weighting for critical bands above 16 Bark. General acoustic signals with increased energy at higher frequencies are perceived as having increased sharpness. While sharpness is one way to capture a change in the relative distribution of energy in the spectrum, another way is through spectral moments. The first four spectral moments are typically used to describe a spectrum and are written with an upper case M to denote the spectral moment and a subscript number to indicate which spectral moment is reported (Pinkowski, 1993). M1 is the mean of the spectrum, M2 is the dispersion (standard deviation), M3 is the skewness, and M4 is the kurtosis. In several experiments evaluating disordered voice quality, the perception of strain has been shown to be correlated with the presence of greater energy in the higher frequencies of the vocal spectrum (Bergan, Titze, & Story, 2004; Sundberg & Gauffin, 1978). Since this change mirrors the changes observed in the perception of sharpness, one may speculate that sharpness and strain are the same (or at least similar) perceptual constructs applied to different classes of sound stimuli. Research in voice has also examined the relationship between strain and spectral moments. The first spectral moment measures the mean value of the energy for a given signal and has been shown to be related to the "pressedness" (strain) of the voice (Sundberg & Gauffin, 1978). Recent research found that spectral moments are better predictors of human perception when taken from an auditory rather than an acoustic signal (Kopf, Shrivastav, & Eddins, 2013). Table 1 includes the results for each of the four spectral moments (M1, M2, M3, M4) at each stage in the auditory-processing front end, when controlling for breathiness and roughness perception. As seen in Table 1, the first spectral moment of specific loudness was the most strongly positively correlated (.832) with listener perception of vocal strain (Kopf et al., 2013).

Measure       | Acoustic Signal | After First Filter | After Filter Bank (Excitation Pattern) | After Nonlinear Compression (Specific Loudness Pattern)
First Moment  | 0.659           | 0.526              | 0.743                                  | 0.832
Second Moment | 0.303           | 0.611              | 0.013                                  | 0.309
Third Moment  | 0.170           | 0.115              | 0.209                                  | -0.112
Fourth Moment | 0.165           | 0.550              | 0.097                                  | 0.115

Table 1: Partial correlations between spectral moments and strain ratings.
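A brief sketch may clarify how such moments are computed. The function below treats any non-negative pattern over a frequency (or ERB-number) axis as an unnormalized distribution and returns M1 through M4. It is a generic illustration in Python, not the specific analysis pipeline of Kopf et al. (2013); the synthetic example signal is likewise an assumption made only for demonstration.

```python
import numpy as np

def spectral_moments(axis: np.ndarray, magnitude: np.ndarray) -> dict:
    """First four spectral moments of a non-negative pattern.

    `axis` is the frequency axis (Hz for an acoustic spectrum, or ERB number
    for an auditory representation); `magnitude` is the corresponding
    spectrum or specific loudness pattern, treated as a distribution.
    """
    weights = magnitude / magnitude.sum()
    m1 = float(np.sum(axis * weights))                        # mean (centroid)
    m2 = float(np.sqrt(np.sum((axis - m1) ** 2 * weights)))   # dispersion (std. dev.)
    m3 = float(np.sum(((axis - m1) / m2) ** 3 * weights))     # skewness
    m4 = float(np.sum(((axis - m1) / m2) ** 4 * weights))     # kurtosis
    return {"M1": m1, "M2": m2, "M3": m3, "M4": m4}

# Example: moments of the magnitude spectrum of a synthetic source with
# relatively strong high-frequency harmonics (a "strained-sounding" spectrum).
fs = 16000
t = np.arange(int(0.05 * fs)) / fs
x = sum(np.sin(2 * np.pi * k * 180 * t) / np.sqrt(k) for k in range(1, 30))
spectrum = np.abs(np.fft.rfft(x * np.hanning(t.size)))
freqs = np.fft.rfftfreq(t.size, d=1.0 / fs)
print(spectral_moments(freqs, spectrum))
```

Passing an ERB-number axis and a specific loudness pattern to the same function yields the auditory version of M1 that showed the strongest correlation with strain in Table 1.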
For occupational voice users, vocal fatigue is a common report after extended voice use, and some occupational voice users report a change in the quality of their voice with fatigue. However, there are no clear "best" objective measures for capturing voice changes due to vocal fatigue (Solomon, 2008). It is possible that auditory measures, which are more closely associated with listener perception of voice quality, will be better predictors of the voice quality changes associated with vocal fatigue.

Current Biofeedback Options for Voice

While voice dosimeters are available to capture longitudinal voice data and objective measures are being developed that are better predictors of voice quality, there is a paucity of literature on the application of automated biofeedback to changing voice production. One study investigated the use of real-time biofeedback for the prevention of voice disorders in call center workers (Schneider-Stickler, Knell, Aichstill, & Jocher, 2012). In this study, the intervention involved PC-based software (VidiVoice) that allowed an individual to monitor vocal intensity, fundamental frequency, and phonation time by providing indications of when the voice was outside an acceptable range. The researchers reported that settings were adjusted for each individual, but other than target values, few additional details were provided; for example, the target value was specified, but not the acceptable range around it. The study included a treatment group receiving the biofeedback and a control group, and biofeedback was given over the course of 4 weeks (Schneider-Stickler et al., 2012). A statistically significant decrease in Voice Handicap Index score (lower scores indicating less perceived vocal handicap) was found for both the treatment and control groups. The researchers also reported an increased Voice Range Profile for participants in the treatment group after the intervention, as evidenced by a greater maximum dB value achieved by those who were considered to have vocal hypofunction at the start of the study. No information was provided about participants' thoughts on the software or how much they paid attention to it during the workday.

Another system that has been developed for use in the prevention of voice disorders is Voice-Care (Astolfi, Carullo, Pavese, & Puglisi, 2015; Astolfi, Carullo, Vallan, & Pavese, 2013; Carullo, Vallan, & Astolfi, 2013). Publications have discussed the development of this small, lightweight device, compared its functioning to other dosimeters, and investigated its ability to monitor the voice in multiple acoustic environments. No publications regarding its use in voice disorder prevention have appeared to date.

Behavior Change

The above information may be useful feedback to promote behavior change in Preventative Voice Monitoring (PVM). However, researchers in behavior change argue that it is not only important to account for the objective changes resulting from behavior change; the subjective, internal changes in motivation that occur as part of the behavior change process should also be accounted for (Prochaska & DiClemente, 1984). Although the observed objective changes may be subtle or difficult to detect, there may be internal changes occurring that are important catalysts of eventual observed changes (Prochaska & DiClemente, 1983; van Leer, 2010).
21 While there are only a few studies looking at behavior change measures in voice (e.g., Teixeira et al., 2013; van Leer, 2010), studies in other research areas have demonstrated the importance of evaluating self-change of behavior, such as in smoking cessation (Wilcox, Prochaska, Velicer, & DiClemente, 1985). In the study by Wilcox et al. (1985), the researchers were interested in the influence of participant characteristics (demographics, smoking history, health history, life experiences) in predicting self-initiated change in smoking status. The researchers found that higher scores on pleasure from smoking and smoking duration led individuals not interested in quitting to remain smokers at a 6-month follow-up. On the other hand, individuals not interested in quitting who scored high on health were more likely to quit or consider quitting at the 6-month follow-up. For a second group of individuals from the onset of the study, firelapsers,fl increased income increased the likelihood of quitting, and those not quitting reported smoking more daily cigarettes. Based on these findings, Wilcox et al. (1985) reported the support for their hypothesis of the importance of individual characteristics influencing the likelihood that individuals would engage in smoking cessation. Overall, three common concepts when discussing behavior change are stages of change, readiness to change, and self-efficacy for behavior change. All three concepts help characterize where the individual is in the behavior process: their stage, how ready they are to make a change, and how confident they are that they can make changes to their behavior. Stages of Change (SOC) The Transtheoretical Model (TTM) is one way to describe the process of moving from lack of motivation for behavior change to taking action and maintaining changes. The TTM divides the process of behavior change into a series of five stages (Prochaska & DiClemente, 22 1984). The first stage is precontemplation (PC), where an individual lacks knowledge and/or motivation to make a needed behavior change. After an individual successfully moves through the precontemplation stage, they move into contemplation (C). In the contemplation stage, an individual considers behavior change but is not yet ready to initiate the behavior change process. In the third stage (preparation, PA), the individual is ready to start behavior change in the near future, but has not fully committed himself/herself to the process. When the individual starts the behavior change, he/she moves into the action (A) stage and becomes engaged in the active process of behavior change. If the individual successfully completes the action stage, they move into the final stage: maintenance (M). In the maintenance stage, the individual continues to display the changed behavior and works to avoid relapsing to previous behavior (DiClemente et al., 1991; DiClemente, Schlundt, & Gemmell, 2004). The authors added that it is possible for individuals to successfully move past the maintenance stage and keep the behavior change without active maintenance. As an example of the stages of change, DiClemente et al. (2004) outlined the staging process for smokers. If smokers are not contemplating quitting in 6 months, they were classified in the PC stage. If they were planning to quit in the next month and had attempted to quit at least once in the last 6 months, they were assigned to the PA stage. Those that did not fit either category but were currently smoking were classified as being in the C stage. 
If an individual reported not being a current smoker, they were classified in the A stage if they quit less than 6 months prior and the M stage if they had quit more than 6 months prior. In contrast, the main difference in findings between general behavior change processes and physical activity behavior change was the report of engagement in exercise at all stages of change, to differing degrees. For example, one study of SOC for physical activity (Kim, Hwang, 23 & Yoo, 2004) found that engagement in an exercise program leads to forward movement in stages of change for individuals in all stages below maintenance (PC, C, PA, A). In addition, a meta-analysis of 80 studies of stages of change in physical activity and exercise (Marshall & Biddle, 2001) found that even in the early stages of change, movement to later stages involved increases in physical activity. Therefore, all stages of change in increasing physical activity are marked with some increasing level of physical activity (Cardinal, 1995). This may also be true of the population at risk for voice disorders. For example, even individuals grouped in the precontemplation stage for changing vocal behavior may find themselves making some small changes in response to feedback on voice use. A framework for the use of the TTM in voice therapy has been outlined (van Leer, Hapner, & Connor, 2008), and these authors also offered suggestions for how to move individuals in one stage of change to later stages. The authors described possible voice patients in each stage of change: PC- either not realizing that behavior contributes to their voice problem, or not interested in changing their voice; C- expressing ambivalence about current vocal behavior and behavior change; PA- interested in making changes and collaborative goal setting; A- actively engaging in behavior change inside and outside of the clinic; M- patient is independently sustaining vocal behavior changes made in therapy. The authors also described 10 behavior change processes, adapted for voice therapy, including programming reminders to practice into the patient™s electronic calendar. Finally, the authors referred to some problems that may arise in the clinic related to the behavior change process and how to address them. For example, making sure that strategies taught in therapy are appropriate to the individual™s stage of change. However, as stated in this article, an empirical study of the application of this framework 24 in voice therapy is necessary. In addition, the application of stages of change to the prevention of voice disorders was not referenced. Five main stages of change have been identified both within a larger health-related context, and within the voice disorders population. However, the exact nature of these stages varies discipline by discipline. Whereas individuals who smoke are not interested in quitting in the PC stage, individuals in the PC stage for physical activity may engage in some light physical activity. While a framework of these stages has been developed for voice therapy, empirical support is still needed. Further work also needs to be done to extend stages of change to the at-risk occupational voice user population. Assessing SOC There are currently multiple methods of assessing an individual™s stage of change. For example, studies such as Boyle, O™Connor, Pronk, & Tan (1998) assess stage of change through a single question with multiple answer choices, with each answer corresponding to a single stage of change. 
In 1983, a scale was developed to assess an individual's stage of change (McConnaughy, Prochaska, & Velicer, 1983). The researchers hypothesized five stages of change (precontemplation, contemplation, action, maintenance, and relapse), but identified only four through a principal component analysis (precontemplation, contemplation, action, and maintenance). Out of 125 items, a 32-item questionnaire was retained, with eight items loading on each of the four factors representing the four stages of change. This scale, the University of Rhode Island Change Assessment (URICA), was developed and validated for a population of outpatient enrollees in psychotherapy (McConnaughy, Prochaska, & Velicer, 1983). The URICA has also been used in other populations, such as arthritis management (Keefe et al., 2000) and weight loss (Prochaska, Norcross, Fowler, Follick, & Abrams, 1992). Finally, the URICA has been adapted for use with individuals with voice disorders (URICA-VOICE; Teixeira et al., 2013), and was used as an outcome measure in a study of teachers with voice complaints (Rossi-Barbosa, Gama, & Caldeira, 2015). While there are multiple ways to evaluate stages of change, one (the URICA-VOICE) has been developed specifically for use with the voice disorders population. However, its application to the at-risk occupational voice user population still needs to be investigated.

Readiness to Change (RTC)

While the URICA-VOICE can assess stages of change, it can also be used to assess individuals' readiness to change. Readiness to change (RTC) is described as the interaction of how important an individual thinks a problem is and how confident he/she feels that a change can be made (DiClemente et al., 2004). Even though it is related to stages of change, this concept is a separate entity. RTC has been used as a predictor of future behavior change: the greater the RTC, the more likely the individual is to engage in behavior change (DiClemente et al., 2004). Readiness to change can be calculated arithmetically from the results of the URICA (DiClemente et al., 2004) and the URICA-VOICE (Teixeira et al., 2013). Readiness to change was discussed by Teixeira et al. (2013), with individuals in later stages of change demonstrating greater readiness to change. Another study used the readiness to change measure from the URICA-VOICE and concluded that a majority of teachers with vocal complaints have low readiness to change and are in the precontemplation stage of change (Rossi-Barbosa et al., 2015). Readiness to change is related to SOC, but is a different concept. RTC has been explored in the voice disorders population, but further exploration is needed, especially in the at-risk occupational voice user population.
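The arithmetic referenced above can be sketched briefly. The function below computes a readiness score from URICA-style subscale responses in the way commonly described for the instrument (summing the contemplation, action, and maintenance subscale means and subtracting the precontemplation mean). The item groupings and the 5-point responses in the example are placeholders for illustration, not the actual URICA-VOICE items or scoring key.

```python
from statistics import mean
from typing import Dict, List

def readiness_to_change(subscale_items: Dict[str, List[int]]) -> float:
    """Readiness score from URICA-style subscale responses.

    `subscale_items` maps each subscale ("PC", "C", "A", "M") to the list of
    Likert ratings for the items loading on that subscale. Readiness is
    commonly reported as C + A + M - PC, using subscale means.
    """
    pc = mean(subscale_items["PC"])
    c = mean(subscale_items["C"])
    a = mean(subscale_items["A"])
    m = mean(subscale_items["M"])
    return c + a + m - pc

# Hypothetical respondent who endorses contemplation and action items but
# rejects precontemplation items.
responses = {
    "PC": [1, 2, 1, 2, 1, 2, 1, 1],
    "C":  [4, 5, 4, 4, 5, 4, 4, 5],
    "A":  [4, 4, 3, 4, 4, 3, 4, 4],
    "M":  [3, 3, 2, 3, 3, 2, 3, 3],
}
print(f"Readiness to change: {readiness_to_change(responses):.2f}")
```

Tracking such a score at the start, midpoint, and end of a monitoring period is one simple way to summarize whether engagement with feedback coincides with increased readiness.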
Self-Efficacy (S-E)

Related to readiness to change is self-efficacy. Self-efficacy for behavior change is a measure of an individual's confidence in his/her own ability to successfully change his/her own behavior (Bandura, 1977). As individuals transition to later stages of change, self-efficacy for behavior change also tends to increase (DiClemente et al., 2004; Dijkstra, De Vries, & Bakker, 1996). Researchers found that larger increases in self-efficacy occur when moving from preparation to action and from action to maintenance than with forward progression in the earlier stages of change. The researchers attributed these larger increases in self-efficacy at later stages to individuals' successful engagement in behavior change, whereas those in earlier stages were not yet ready to engage in behavior change. Self-efficacy has been described as an important construct to evaluate in individuals with voice disorders (van Leer, 2010; van Leer & Connor, 2010, 2012). A scale was designed to assess S-E for voice therapy (van Leer & Connor, 2010, 2012), but this scale focuses on barriers to voice therapy home practice and does not cover the broader topic of voice change that may occur outside of voice therapy, such as through preventative measures initiated by the individual himself or herself. While S-E has been evaluated for multiple types of behavior change, there is a paucity of literature exploring it in the voice disorders population, and its application to the at-risk occupational voice user population remains an area of needed study.

Vocal Fatigue Index (VFI)

As stated previously, there are a number of factors to consider with vocal fatigue, including subjective and objective measures. One questionnaire has been developed to assess vocal fatigue, the Vocal Fatigue Index (VFI). As discovered by Nanjundeswaran, Jacobson, Gartner-Schmidt, & Verdolini Abbott (2015) in the development of the Vocal Fatigue Index, vocal fatigue appears to divide into three factors. Factor 1 is characterized by a feeling that the voice is "tired," which may lead to a reduction in further voice use. Factor 2 is characterized by physical discomfort, such as a sore throat. Finally, Factor 3 is characterized by alleviation of fatigue symptoms with vocal rest. These findings are not surprising given the complex nature of vocal fatigue, and they highlight the need to identify the most important factors for assessing the voices of occupational voice users.

CHAPTER 3: Study Aims & Hypotheses

The goal of the current study was to determine whether Preventative Voice Monitoring (PVM), using a dosimeter with task-based feedback, impacts vocal behavior in occupational voice users and future occupational voice users. Three study aims were defined: to design the feedback (Aim 1), and to assess its ability to influence both behavior change measures (Aim 2) and voice production (Aim 3). To address the aims, the study was divided into two parts: the creation of the feedback (Part 1) and testing of the feedback (Part 2). The experimental methods, data, and results are explained in greater detail in Chapters 4 and 5; this organization reflects the temporal sequence in which the study was conducted.

Aim 1: To extract design requirements for conveying feedback to users.

The formative evaluation was exploratory in nature. In Part 1 of the dissertation, a qualitative, interpretivist approach was used to identify the most relevant objective measure(s) and type(s) of visual displays for potential users. The researcher used iterative prototyping (Bernstein & Yuhas, 2005; Goldman & Narayanaswamy, 1992) to design the best delivery method for the feedback. For each iteration, the information gathered from additional participants shaped the display prototypes for the next iteration. One participant returned to check the final iterations of the feedback to determine their appropriateness. In Part 2, participants used the feedback displays with vocal loading tasks. Interviews provided insight into future iterations of the feedback displays.
Interviews after baseline recordings (no feedback) provided additional information on occupational voice user needs. Final interviews provided insight into aspects of the feedback that worked well and those that need further development.

Aim 2: To identify changes in voice behavior management after receiving feedback.

Increased RTC and S-E are linked to increased engagement in the behavior change process (DiClemente et al., 2004; Dijkstra et al., 1996; Marshall & Biddle, 2001). In addition, voice training can reduce self-perceived vocal fatigue symptoms (McCabe & Titze, 2002). In Part 2, RTC, S-E, and vocal fatigue were assessed at three time points: at the beginning of the study (prior to vocal loading tasks), after baseline recordings (with no feedback), and at the completion of the study (after feedback recordings). It was hypothesized that active engagement in changing voice production would manifest as improvements in RTC (increase), S-E (increase), and vocal fatigue. Improvements in vocal fatigue were defined as changes in one or more of the following VFI factors (consistent with Nanjundeswaran et al., 2015): a decrease in Factor 1, a decrease in Factor 2, and/or an increase in Factor 3. In addition to these assessments, interview results after baseline recordings provided further insight into what behavior changes were made intuitively, and final interviews provided insight into what changes were facilitated by the current feedback.

Aim 3: To quantify changes in the voice after receiving feedback.

This aim consists of three hypotheses:

1. Vocal intensity, time dose, and fundamental frequency (F0) have been identified as factors contributing to the distance travelled by the vocal folds during speech (Švec, Popolo, & Titze, 2003; Titze et al., 2003). These prior studies reported that decreasing the distance travelled by the vocal folds (distance dose) reduces the likelihood of damage to the vocal folds. In Part 2 of the current study, participants completed vocal loading tasks and received feedback on voice production using the feedback displays developed in Part 1. It was hypothesized that occupational voice users would improve voice production in response to feedback, and that these improvements would manifest as decreases in one or more of the following: vocal intensity, voicing time, and/or F0.

2. When patients report vocal fatigue, they often describe perceived changes in voice quality, including increased breathiness and vocal effort (Colton & Casper, 2006; Stemple, 2000). Part 2 participants produced three sustained // vowels before and after each vocal loading task, and average values of pitch strength and first moment specific loudness were compared. It was hypothesized that increasing vocal fatigue would result in the following changes in objective voice quality measures: increasing strain (increasing first moment specific loudness) and/or increasing breathiness (decreasing pitch strength).

3. It was further hypothesized that the changes in breathiness and vocal strain from pre- to post-vocal loading task would be greater for baseline tasks than for tasks with feedback. This was anticipated because vocal fatigue should decrease with behavior change related to feedback, and therefore should lead to smaller increases in breathiness and/or vocal strain.
CHAPTER 4: Feedback Requirements and Changes in Voice Behavior Management (Aims 1 & 2)

Study Overview

The goal of this study was to determine whether Preventative Voice Monitoring (PVM) would impact vocal behavior in occupational voice users and future occupational voice users. Additionally, the results would provide detailed user insight into the design and implementation of PVM (Aim 1), and insight on the sensitivity of behavior change questionnaires to internal changes experienced by users of PVM (Aim 2). This user-centered design study, a study involving the end user throughout the design and testing process (Beyer & Holtzblatt, 1998; Lucas Jr., 1971; Slegers, Duysburgh, & Jacobs, 2010), consisted of two parts (Table 2).

Study Part                                     | Iteration/Phase                            | # Sessions (per participant) | # Unique Subjects
Part 1: Feedback Display Prototype Development |                                            | 1                            | 15
                                               | Initial Feedback Display Prototypes        |                              | 6
                                               | Iteration 1 of Feedback Display Prototypes |                              | 3
                                               | Iteration 2 of Feedback Display Prototypes |                              | 3
                                               | Iteration 3 of Feedback Display Prototypes |                              | 3
Part 2: Feedback Display Prototype Testing     |                                            | 11                           | 18
                                               | Phase 1: Laboratory Testing                |                              | 13
                                               | Phase 2: Field Testing                     |                              | 5

Table 2: Study design overview.

The length of each part of the study and the number of participants enrolled were appropriate based on literature in both communication sciences and disorders and human-computer interaction. In a systematic review of voice therapy literature on the effects of therapy, 14 of the 47 included studies had 20 or fewer participants (Speyer, 2008). In a study of an application for voice monitoring, 14 individuals enrolled in voice therapy piloted an application and completed short interviews to assess their perceptions of the application (van Leer et al., 2016). A study of a Google Glass application to monitor vocal intensity for persons with Parkinson's disease (PWP) by McNaney et al. (2015) followed a similar structure to the current study. This study consisted of three phases: (1) a design workshop involving 7 PWP, (2) a 30-minute pilot of the application with 8 individuals who did not have Parkinson's disease, and (3) a three-day field trial of the application with 6 PWP. Consolvo et al. (2008) recruited 12 individuals to complete a study of a physical activity tracker. This study involved individuals coming in for three interviews (initial, midpoint, final) and using the physical activity tracker for a total of three weeks. Another research group tested a physical activity tracker with 8 participants over a month: one week of baseline measurements and three weeks with feedback (Lim, Shick, Harrison, & Hudson, 2011).

Part 2 of the current study was similar in length to the Lim et al. (2011) study. Participants in Part 2, Phase 1 (Part 2.1) were scheduled three days a week (Monday, Wednesday, Friday) at the same time each day, and the estimated length of time to complete the study, with no missed sessions, was four weeks. The researcher chose this scheduling format because it mirrored teaching schedules for instructors who taught a class three times per week. Due to sickness and other excused absences, all participants completed Part 2.1 within six weeks. Participants in Part 2, Phase 2 (Part 2.2) were scheduled on teaching days for lecture-based courses. Participants were included in the study if they taught at least two different days per week. All Part 2.2 participants taught two days a week (Monday, Wednesday), and the estimated time to complete the study was six weeks.
Due to sickness and other excused absences, all participants completed Part 2.2 within ten weeks. One participant in Part 2.2 (P2205) completed only seven of the eight recording sessions due to a change in course format partway through the semester. Because of this change, only one day per week was recorded for the final few weeks; the other day consisted primarily of student presentations with little instructor talking.

Inclusion Criteria

All participants were recruited from the Michigan State University community through posted flyers and word of mouth. The same inclusion criteria were used for all participants in the study and are included in Table 3. While being an occupational voice user was a criterion for inclusion, this criterion was not limited to previously established occupational voice user categories, but was broadened to include any occupation where the participant felt that talking was an integral part of the job requirements.

Category       | Inclusion Criteria
Occupation     | Occupational voice user or student studying to be an occupational voice user
Age            | 18-65 years old
Hearing Screen | Pass at 0.125 to 8.0 kHz at 20 dB (ANSI S3.21-2004)
History        | No history of (or current) voice disorders requiring medical intervention

Table 3: List of inclusion criteria.

Part 1 Methods

Part 1 involved formative evaluation of design requirements for feedback using iterative paper prototyping. In the iterative design process, the researcher presented iterations (versions) of display images to potential target users (people the product is being designed for) and asked for their opinions and suggestions on how to improve the displays (Bailey, 1993; Buley, 2013; Hansen, 1997). The researcher used hand-drawn sketches in prototyping because sketches welcome more conceptual commentary than more polished computer-generated images, which may evoke more commentary on formatting (Buley, 2013). Changes in later iterations were based on input from previous potential target users. Part 1 consisted of four iterations, and the displays were finalized at the end of the fourth iteration. An overview of the organization of Part 1 can be seen in Figure 2.

Figure 2: Outline of Part 1.

Participants

Fifteen participants enrolled in Part 1 (Table 4; 4M, 11F; M = 42.0 years, SD = 15.38 years, range = 22-64 years). Two of the older participants (P1209, P1414) failed the hearing screen at one or more of the higher frequencies (over 4000 Hz) in one or both ears. However, their hearing did not appear to affect their understanding of conversational speech, and therefore the researcher included these participants in the study. The researcher assigned participants to one of the four iterations based on the date of their participation, and the iteration is reflected in the participant number (e.g., participant numbers starting with 11 viewed the initial displays, participant numbers starting with 12 viewed the second set of displays, and so on).

Participant ID | Age | Gender | Occupation
P1101 | 35 | F | Course Instructor
P1102 | 27 | F | Office Assistant
P1103 | 30 | F | Office Assistant
P1104 | 35 | M | Podcast Producer
P1105 | 32 | F | Teaching Assistant, Student
P1106 | 54 | F | Lawyer
P1207 | 64 | F | Course Instructor
P1208 | 27 | F | Teaching Assistant, Student
P1209 | 56 | F | Library Clerk
P1310 | 22 | M | Office Assistant, Student
P1311 | 56 | F | Research Assistant, Student
P1312 | 62 | F | Radio Volunteer
P1413 | 26 | M | Course Instructor
P1414 | 63 | M | Retail, Substitute Teacher, Santa Claus
P1415 | 41 | F | Research Assistant, Student

Table 4: Part 1 Participant Demographics.
Procedures

Each participant enrolled in an individual session of no more than 1.5 hours in duration. Prior to participation, all participants first completed an informed consent form approved by the Michigan State University Human Research Protection Program (HRPP). Then, each participant completed an intake form (Appendix A). After inclusion criteria were met and the initial forms were completed, the remainder of the session was recorded using two digital recorders to ensure no data loss (Roland R-05, Lake Stevens, WA, USA; TASCAM DR-40, Montebello, CA, USA). In the semi-structured interview, the researcher had a series of printed questions but asked follow-up questions when relevant to gain a deeper understanding of a participant's answers. For example, if a participant reported that their environment was only occasionally noisy, the researcher might ask her/him to describe situations when this occurs.

After the initial interview (Appendix B), the researcher showed the participant a set of icons (Figure 3), one representing each of the objective measures to be tested. Icons were randomized between participants to ensure that presentation order did not have an effect. The researcher assigned simplified names to these objective measures to ensure easier understanding by participants. These measures were: vocal intensity (loudness), frequency/duration of pauses during speech (pauses), distance travelled by the vocal folds (distance), pitch strength (quality), and first moment specific loudness (strain). A sixth measure, cepstral peak prominence (clarity), was used only with the first participant because, while this measure is different from quality, the basic explanations of the two measures were so similar that the distinction caused confusion. Explanations of measures were provided as needed. The researcher first provided a simple definition of the measure and, if prompted by the participant, expanded on the definition as needed.

Figure 3: Initial icons representing objective voice measures. Icons and the number of measures changed in later iterations based on participant input.

After the introduction of the measures, the researcher instructed the participant to order the icons from most useful to least useful and to assign each a number (1 = not useful, 10 = the most useful). The participant could assign multiple measures the same usefulness rating. The researcher then encouraged the participant to describe why she/he assigned the given order to the measures. Using the same scale, participants also indicated how important they thought each measure was and how confident they were that they could change a given measure based on daily feedback. Next, the researcher encouraged participants to provide initial insight on the design of the feedback displays without seeing the initial prototypes. The researcher provided options for responses: verbal, drawn using a pencil and paper, or a combination of the two. Finally, the researcher introduced prototypes of visual feedback displays for the measures of interest, one measure at a time, in a random order. The participant saw the displays sequentially and then simultaneously. During this process, the researcher asked a series of questions about the visual feedback displays (Appendix C). Questions for displays included what kind of information participants were able to gather from a display (without researcher explanation), whether they could identify general patterns in the display, whether the display itself was useful, and whether the display should change (and how).
Additional questions included whether the participant would combine displays, and whether the participant could offer any additional design ideas. After displays of individual measures, participants saw one or more examples of multi-measure displays.

Part 1 Analysis

Aim 1: To extract design requirements for conveying feedback to users.

In Part 1, the researcher conducted qualitative analysis of participant comments and suggestions following the six individual sessions with the initial feedback displays. This analysis gave the researcher a better understanding of the needs of occupational voice users, and provided guidance in determining the most relevant measures, feedback displays, and display edits to incorporate into updated prototypes for the next iteration. The most common suggestions were incorporated, as well as any additional insightful suggestions judged worthy of further evaluation. The same analysis and iterative design process was completed for each successive iteration. After determining the finalized displays, the researcher brought one participant from an earlier iteration back to look over the displays. This participant had been particularly insightful in the earlier session, making her a trusted individual for a final check of the displays.

Part 1 Results

The results from Part 1 are reported below. The results highlight the major changes in feedback displays from one iteration to the next based on participant input.

Initial Feedback Designs

The initial feedback display designs can be seen in Appendix D. These initial designs were based on recommendations from the literature on data visualization (Holmes, 1984; Tufte, 1983, 2001). For example, the shape used for the daily feedback is an example of a golden rectangle (Tufte, 1983, 2001), a rectangle with dimensions that have been shown to be highly pleasing to the eye. In addition, limited "data ink" (Tufte, 1983, 2001) was used in these designs, as evidenced by the white bars in the bar graphs with only a black outline. The color palette was limited to black ink on white paper, except for occasional red to mark a particular contrast (Tufte, 1983, 2001). Finally, some of the designs, such as the speedometer for distance, were designed to look like a common object measuring a similar phenomenon (Holmes, 1984; Tufte, 1983, 2001).

Iteration 1

Two measures were omitted (distance, clarity). Clarity was merged with quality after the first participant (displays were all presented as quality to the subsequent 5 participants) because the explanations of these measures were extremely similar, even though they are unique objective measures. Distance was removed because, while participants occasionally asked for clarification about one or more measures during the study, each of the first six participants asked for clarification on the meaning of distance between one and three times during the interview. Therefore, the researcher determined that this measure may be too abstract for simple voice feedback. Other changes included introducing dynamic icons. Many participants commented that icons could be a way to track performance, and so the researcher presented two versions of each icon in Iteration 2. In addition, the pause icon would feature the pause count in the center of the stop sign. Participants also indicated that the loudness icon was confusing: the sound should be coming from the stick figure, not the wall, to convey that the system is measuring the speaker's loudness.
Participants felt strain should include two features: lightning bolts from the neck and the position of the dumbbells (lower for less strain).

Icon Suggestions

P1102 introduced the idea of dynamic icons by describing appropriate changes for the icons (Appendix E, Figures 61-64). For example, she suggested the face would change for strain to demonstrate "more relaxed or more effort" (Figure 64). In addition, she suggested that the pause icon be larger for more pauses and smaller for fewer pauses. Changes for all icons were then shaped by the idea of having dynamic icons.

P1101 suggested, "Loudness icon doesn't convey that you're being too loud. Loudness should be person talking with audience holding their ears." However, this idea would involve a greater amount of data ink, making the icon more complex. Therefore, loudness was changed to look more like other devices, such as more lines or dots on a radio or cell phone. However, the lines were drawn coming out of a person's mouth, with a wider mouth opening for louder speech (Figure 61).

For pauses, most participants pointed out that pauses included both count and length of time. While P1102 had suggested changing the size of the sign to indicate pauses, this might make it harder for individuals using the device to tell how many pauses occurred. She did add, "I like the icon because pause looks the same as stop." P1106 added, "This would be a count- how many times you paused." So the stop sign was kept, but it was simplified by replacing the word "pause" with the number of pauses made (Figure 62).

The idea for the change in thumbs in the quality icon was introduced by P1105: "You are going to use thumb up/down or just thumb up? One image is not enough- this image indicates good voice. It's good, but need to add more levels." Therefore, the dynamic icon included varying the thumbs up and/or down (Figure 63).

Finally, there was feedback on the strain icon. P1105 wanted the icon to better reflect what it was measuring: "Maybe use a picture of a person who is talking, lips moving." She also added: "Showing someone using a lot of effort in talking." Therefore, strain was changed to an open mouth with lines coming from the neck to indicate increasing strain (Figure 64).

Loudness Suggestions

Overall, participants preferred the sparkline, a small line representing a lot of information (Tufte, 2001), which can be seen in the bottom display for loudness in Appendix D (Figure 47) and the top display for loudness in Appendix E (Figure 65). This was displayed as a small multiple, with time periods presented in miniature so that all can fit (Tufte, 1983, 2001). This display was preferred over the bar graph (Figure 45) and the clock (Figure 46). P1102 commented that the sparkline is the best "if the objective is to track and adjust," and P1104 remarked, "This makes more sense to me [than another loudness display]." P1106 liked the use of the color red in the displays because "Red usually gets people's attention that something is wrong."

However, in the next iteration, the small multiple display was expanded to allow an easier view of daily detail (Appendix E, Figure 66). This change was based on comments such as "I like the 3rd one the way it is, maybe a little bigger" from P1101. Despite liking the sparkline, participants felt that this display held a lot of information, so it might also work as a display for an individual day rather than a small multiple. Therefore, this idea was added as a second display in the next iteration.
As P1105 put it, "Even if you have one day's voice tracked in one figure, that's a lot of information."

The clock and stacked bar graph were not liked overall, so they were omitted from the next iteration. For example, P1102 found the clock the most difficult to understand, and P1103 commented, "I don't know what's happening here." P1104 said that the clock "display is nice, but information is difficult to comprehend." P1103 did not like the bar graph: "It takes way too much work to understand what's happening," and P1104 added, "I really don't understand the red or the gray [in the bar graph]."

Additionally, a "pop-up" feature was included to give users a definition of the measures (presented on a sticky note, not included in figures). This was based on feedback from P1102: "But I would like to be able to go back and check definitions/hit the icon for definition."

Pause Suggestions

Participants preferred the small multiple line graph (Appendix D, Figure 48), although some liked the clock as well (Appendix D, Figure 49). Both were preferred over the stacked bar graph (Appendix D, Figure 50). Therefore, a display was created that incorporated the two (Appendix E, Figure 68), and a second display was created without the clocks (Appendix E, Figure 69). P1104 liked the small multiple (Appendix D, Figure 48) and commented, "Should be easy to track over time because we work in comparisons." P1101 commented that the clocks could be "Helpful to see overall how much break time you give students." P1105 felt that both the clock and the small multiple would be helpful, with the clock as a real-time measure and the small multiple as a summary measure across multiple days. P1104 was one of the participants who did not like the stacked bar graphs because "Having 2 sets of bars is difficult."

Another display (Appendix E, Figure 67) was created that involved tick marks for each pause, with each tick of varying length to represent pause duration. This was based on a comment from P1106 that she liked the small multiple but wanted hash marks to indicate when pauses occurred.

Quality Suggestions

There were six displays tested for quality because there were initially three designs each for quality and clarity. Out of the quality/clarity displays, participants preferred the smiley faces (Appendix D, Figure 51; Appendix E, Figure 70). P1102 commented that the smileys were "very self-explanatory," and compared with the other quality displays, P1103 stated: "I liked the smileys the best." P1104 added that the smiley display was best because "people like to spend less time reading."

Based on feedback in favor of the clarity line graph (Appendix D, Figure 54) and the quality line graph (Appendix D, Figure 53), another display was created (Appendix E, Figure 71). P1105 felt that the smileys and the line graph (Appendix D, Figure 54) were the two best displays for quality because they give different information, with the line graph giving "more information than the smiles." On the other hand, P1106 found the smileys "hokey" and preferred the line graph for quality. The researcher added color (red) to determine whether this enhanced the display, based on the early comment from P1106 about loudness. P1101 suggested combining smileys and numbers on the axes, which was incorporated into the design (Appendix E, Figure 71). Finally, a new version of the small multiple quality display (Appendix D, Figure 52) was created with greater time detail (Appendix E, Figure 72).
This design was kept based on a comment from P1106: "One thing I like about this one is that it shows your change each day."

Strain Suggestions

P1101 liked the line graphs for strain (Appendix D, Figure 58), but suggested the researcher "add numbers to label endpoints" (Appendix E, Figure 74). P1104 suggested varying the color along the lines. Therefore, the researcher tried to differentiate the lines by varying their color on the gray scale (Appendix E, Figure 73).

Multi-Measure Suggestions

Finally, the researcher simplified the multi-measure display to include the remaining four measures (Appendix E, Figure 75). While some liked the multi-measure display, P1103 commented, "I would rather look at one [measure] at a time on a daily basis." P1105 commented that the multi-measure display was "too big for a mobile phone." Despite mixed reviews for a multi-measure display, it was kept in future iterations because opinions were split.

Iteration 2

As a result of feedback on the first iteration of displays, incremental changes were made to all feedback displays.

Icon Suggestions

P1207 was not a fan of the dynamic icons, stating, "Don't change background icons, just thumbs up/thumbs down." P1209 was also uninterested in dynamic icons: "Icons are cute, but looking for overall, I won't want to work so hard. I'd rather have the graph and be able to zoom in." P1208 liked the strain icon: "I think lifting weight is good. Lightning from neck is cute. Man changing colors would be good: hotter colors for more strain." Comments like this from P1208 indicated that dynamic icons might be worth a further look, so they were maintained in the second iteration (Appendix F, Figures 76-79).

Loudness Suggestions

Based on feedback, the small multiple was kept but not the elongated one (Appendix F, Figure 81). As P1209 stated, the elongated multiple was "Intimidating if looking for general pattern," and P1207 said it "Seems scary busy." P1208 agreed: "I like short format better than long." P1209 liked the idea of the pop-out, but wanted it to be better labelled than simply clicking on one of the images on the display: "I wouldn't think to click on the axis to see these explanations." Therefore, a small question mark icon was added. While the elongated multiple was not preferred, P1209 remarked, "I would want to be able to zoom in like this." In other words, the participant wanted to be able to zoom into a single elongated day to see additional detail (Appendix F, Figure 80).

Pause Suggestions

Participants liked the idea of the tick mark display as a pop-out (Appendix E, Figure 67), so it was retained (Appendix F, Figure 83). P1207 commented, "You don't want this to be the main screen because too busy," but felt it was good as a pop-out. While P1207 and P1209 were not very interested in the clocks, P1208 was excited about them: "It's a clock! I like this a lot. I can see making a game out of the clocks [extending pause time]." Therefore, the clocks were retained (Appendix F, Figure 84). P1209 was not interested in the line graph, and felt that "seeing it as a number would work." This update was made to the clock display. The line graph was retained (Appendix F, Figure 82), although simplified, because P1209 commented, "The line makes more sense to me," a sentiment shared by prior participants as well.

Quality Suggestions

Out of the quality/clarity displays, participants P1207 and P1208 preferred the smiley faces (Appendix F, Figure 85).
P1209 commented, "Smileys are too simplistic," and voted to keep the line graph. While not liking the smileys in their own display, P1209 remarked, "I like incorporating emojis & numbers [on the axis]." Therefore, the single line graph was kept but reduced to a single day to act like a pop-up (Appendix F, Figure 87), and it was also used in a small multiple display (Appendix F, Figure 86).

Strain Suggestions

P1208 and P1209 did not like the arrows at the endpoints. P1208 liked the numbers at the endpoints better, and P1207 remarked, "Redundant numbers on endpoints can be useful to non-scientists." Therefore, the endpoint numbers were retained, and not the arrows (Appendix F, Figure 88). The second design (Appendix F, Figure 89) was based on a suggestion from P1207: "Parallel lines. One line is for morning, one is for evening." This comment led the researcher to recall a suggestion from an earlier participant, P1101, for labelling the lines: "Use morning/night as markers on axis."

Multi-Measure Suggestions

No changes were made to the multi-measure display (Appendix F, Figure 90). Again, there were mixed comments about retaining it. P1207 preferred the idea of an iconic dashboard, not a matrix (Appendix F, Figure 91). P1209 stated, "All good information but they compete with each other." P1208 liked the idea, but not the display. She suggested, "Lift weights for strain, smile/frown for quality, hand up for pause, throw out loudness. I could see all the measures in one thing." This led to the multi-measure man (Appendix F, Figure 92).

Iteration 3

While there were more changes to individual measures, two big changes occurred in Iteration 3. First, the dynamic icons were simplified to static icons (Appendix G, Figure 93). Second, the researcher changed how measures were presented. P1310 introduced the idea of a similar layered structure for each measure. This incorporated multiple displays for each measure, ordering them from the most simple to the more complex.

Icon Suggestions

Overall, there was little support for the dynamic icons. P1310 stated, "I feel the icon should be consistent." P1311 liked the idea of dynamic icons, but P1312 was not in favor of them. Therefore, icons were changed back to static images, as two of the three participants were not interested in them. P1311 had an additional idea for the pause icon: to make it more iconic, it should have "No words on icon." She suggested changing it to "Hands across mouth or something." Therefore, the researcher tried a similar idea, drawing a zipper across the mouth to indicate no talking. This was added alongside the other pause display so participants could choose which was better.

Loudness Suggestions

Participants were supportive of the two designs (Appendix G, Figures 94-95). One additional design was added by the researcher to assess whether participants might want to see loudness go to zero during pauses, because this had not been tested in prior iterations (Appendix G, Figure 96).

Pause Suggestions

Participants liked the idea of the tick mark display as a pop-out (Appendix F, Figure 83), so it was retained (Appendix G, Figure 97). P1310 liked the idea of simplification; he did not like the clocks or the lines. He suggested, "Maybe just put numbers for pauses." P1311 stated, "Bar graphs could be easier to look at." Therefore, the next iteration was a bar graph to indicate the amount of time paused, with numbers to represent the pause count (Appendix G, Figure 98).
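As an illustration only (not one of the study's actual displays), the following minimal matplotlib sketch shows the kind of display described above: one bar per session for total time paused, with the pause count printed above each bar. All labels and values are invented placeholders, not study data.

```python
import matplotlib.pyplot as plt

# Hypothetical per-session summaries (placeholders, not study data).
days = ["Day 1", "Day 2", "Day 3", "Day 4", "Day 5"]
paused_minutes = [2.5, 3.1, 1.8, 4.0, 3.4]   # total time paused per session
pause_counts = [18, 24, 12, 31, 26]          # number of pauses per session

fig, ax = plt.subplots(figsize=(5, 3))
# White bars with black outlines, echoing the limited "data ink" style noted earlier.
bars = ax.bar(days, paused_minutes, color="white", edgecolor="black")

# Print the pause count above each bar, as in the display described in the text.
for bar, count in zip(bars, pause_counts):
    ax.text(bar.get_x() + bar.get_width() / 2, bar.get_height() + 0.05,
            str(count), ha="center", va="bottom")

ax.set_ylabel("Time paused (min)")
ax.set_title("Pauses per session (bar = duration, number = count)")
plt.tight_layout()
plt.show()
```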
Quality Suggestions

Overall, participants liked these displays, so they were all retained (Appendix G, Figures 99-101).

Strain Suggestions

The line graph was retained because this display was well liked by P1312, who preferred it and was especially complimentary about the double labeling of the axis (Appendix G, Figure 102). The double line graph was not well liked. P1311 suggested changing it to a bar graph (Appendix G, Figure 104), describing it as, "Have bar overlay- yellow for day and blue for night." The third display was based on P1310's idea of making the layered structure similar across measures (Appendix G, Figure 103).

Multi-Measure Suggestions

The matrix was kept, but simplified based on feedback from P1310 (Appendix G, Figure 105). Two of the three participants liked the idea of it, but P1311 preferred something simpler, like icons.

Iteration 4

In this iteration, the researcher introduced the idea of the layered display structure to participants, and they helped to further shape it.

Icon Suggestions

The biggest icon change was the decision between the two pause icons. P1415 commented that the "zipper on the teeth I don't like." P1414 expressed a preference for the stop sign. Therefore, the zippered mouth was not included in the next iteration, only the stop sign for pauses (Appendix H, Figure 106).

Loudness Suggestions

Participants were supportive of two designs from the prior iteration (Appendix H, Figures 108-109), but did not like the zero baseline display. To match the layered structure, participants suggested adding an "at a glance" version (Appendix H, Figure 107). P1413 suggested that it read, "Loudness- exceeded __dB for __minutes." However, this was changed by the researcher to include the phrase "danger zone," a phrase used by multiple prior participants to describe what they felt the red color indicated.

Pause Suggestions

Participants liked the idea of the tick mark display as a pop-out (Appendix G, Figure 97), so it was retained (Appendix H, Figure 111). P1413 remarked that the "Average number helps me more than when it happened." This suggestion was used to create the simplified "at a glance" display for pauses (Appendix H, Figure 110). P1414 commented that the "Number on bars seems a little confusing" when looking at the pause displays. Therefore, a small multiple of bar graphs for each day was created without numbers to fit the layered structure (Appendix H, Figure 112). A zoomed-in version of the bar graph was also created to match the other zoomed-in versions, such as loudness (Appendix H, Figure 113).

Quality Suggestions

Overall, participants liked these displays, so they were all retained (Appendix H, Figures 114-116) with one small change. P1413 was bothered by the fact that increasing strain was bad while increasing quality was good. Therefore, this participant suggested flipping the axes for one of the measures so that they would look visually similar.

Strain Suggestions

P1413 suggested putting both of the axis markers on the same axis. This was incorporated into Appendix H (Figures 118-119). The bottom figure is a simplified version of the display in Appendix G (Figure 102), without the numbers on the ends, based on a suggestion from P1413 that the display was "too busy." The same simplification was applied to the small multiple display by removing the lines. However, in this case, numbers were added based on a comment from P1415 that numbers would be helpful.
P1415 wanted a simplified view for the "at a glance" display, suggesting "Baseline in the middle, but for each event, indicate whether above or below the baseline." Therefore, the small multiple was simplified to arrows indicating whether the individual was above or below the baseline (Appendix H, Figure 117).

Multi-Measure Suggestions

The matrix was kept, but edited to reflect the changes in the other measures' displays (Appendix H, Figure 120).

Iteration 5 (Final) Results

Based on input from one repeat participant (P1208), the researcher finalized the displays, with some changes. These final images can be seen in Appendix I. The loudness and quality displays stayed the same (Figures 122 & 124), other than flipping the axes back for quality (the flipped axes were confusing to P1208). P1208 was disappointed that the idea of the multi-measure guy did not impress other participants. The researcher showed her the design, and this was used as a basis for completing the strain guy in the final displays. The zoomed-in display for pauses was rejected and the other displays were kept (Figure 123). P1208 looked at both current and prior displays for strain before finalizing the design (Figure 125). The smiley faces were replaced by variations of the strain icon to better differentiate between quality (smiley faces) and strain (weight lifter).

Summary of Results

Four measures were chosen by the participants (loudness, pauses, quality, strain). In addition, participants recommended a layered structure of displays, rather than a single display, for each of the four measures. While the researcher kept the multi-measure display through all iterations, about half of the participants expressed doubt about its usefulness or the matrix design. Because of this level of disinterest in the multi-measure display, it was not used as part of the feedback for Part 2.

Part 2 Methods

Using the feedback displays developed in Part 1, the researcher tested the feedback display prototypes in Part 2. Potential target users completed vocal loading tasks (VLTs), tasks designed to induce vocal fatigue, to determine whether objective feedback influenced later voice production. Part 2 consisted of two phases: 1) laboratory testing (reading aloud for 15 minutes at an elevated vocal intensity) and 2) field testing (classroom lectures).

Participants

Fourteen participants enrolled in Part 2.1, but only 13 participants met the inclusion criteria of the study (6M, 7F; M=22.6 years, SD=9.15 years, Range=18-48 years). See Table 5 for demographic information on the participants. Only one participant, P2114, reported prior speech therapy, which was limited to a three-month period in elementary school for an articulation disorder. P2101 failed the hearing screen at 4000 Hz in the right ear (able to hear at 25 dB), but this threshold was within an acceptable range for his age (Brant & Fozard, 1990), so he was included in the study. On the other hand, P2108 failed the hearing screen at one frequency in one ear, and because individuals of her age should meet the 20 dB criterion, the researcher chose to exclude her from the study.
Participant ID   Age   Gender   Occupation/Major
P2101            48    M        Actor, radio volunteer
P2102            18    M        Undergraduate student, medicine
P2103            34    F        Graduate student, former elementary educator
P2104            18    F        Undergraduate student, neuroscience
P2105            18    F        Undergraduate student, engineering
P2106            19    M        Undergraduate student, business
P2107            18    M        Undergraduate student, criminal justice
P2108            18    F        ---
P2109            18    M        Undergraduate student, engineering
P2110            19    M        Undergraduate student, packaging
P2111            25    F        Graduate student, media & information
P2112            20    F        Undergraduate student, wants to be professor
P2113            21    F        Undergraduate student, going to med school
P2114            18    F        Undergraduate student, biology; musical theater minor

Table 5: Part 2, Phase 1 Demographics.

An additional five participants enrolled in Part 2, Phase 2 (Part 2.2), and all participants met all of the inclusion criteria (2M, 3F; M=42.6 years, SD=12.40 years, Range=33-62 years). See Table 6 for demographic information on the participants.

Participant ID   Age   Gender   Occupation
P2201            34    F        Course Instructor
P2202            36    M        Course Instructor
P2203            62    F        Course Instructor
P2204            48    F        Course Instructor
P2205            33    M        Course Instructor

Table 6: Part 2, Phase 2 Demographics.

Procedures

An overview of Part 2 can be seen in Figure 4. Prior to participation, all participants first completed an informed consent form approved by Michigan State University's Human Research Protection Program (HRPP). Participants completed three interview sessions (initial, midpoint, and final) and eight VLT sessions. The first three recording sessions (baseline recordings) did not include feedback, allowing the researcher to determine each participant's average baseline. After the midpoint interview, each participant completed five additional recording sessions (feedback recordings), receiving feedback at the beginning of each one.

Figure 4: Outline of Part 2.

Interview Sessions

During each interview session, the participant completed the following:

1. Stage of change questionnaire: The URICA-VOICE (Teixeira et al., 2013) has been developed for use with the voice disorders population. Each participant's readiness to change was calculated from this assessment. Readiness to change is assessed by adding the average scores from the C, A, and M stages and subtracting the average score for the PC stage (DiClemente et al., 2004; Teixeira et al., 2013); a minimal sketch of this calculation appears after this list.

2. Self-efficacy for voice change questionnaire: Self-efficacy was assessed using a modified version of a general self-efficacy scale (Lee, Hwang, Hawkins, & Pingree, 2008). The only modification was to the instructions: the word "voice" was substituted for the word "health."

3. Vocal Fatigue Index: Vocal fatigue was assessed along three factors using the VFI (Nanjundeswaran et al., 2015).

4. Semi-structured interviews: As in the semi-structured interviews for Part 1, the researcher had a series of printed questions, but asked follow-up questions when relevant to gain a deeper understanding of a participant's answers. The entire interview was recorded using two digital recorders to ensure no data loss (Roland R-05, Lake Stevens, WA, USA; TASCAM DR-40, Montebello, CA, USA). Some questions differed between the three interviews to better understand the participant's current and anticipated future voice use demands (initial; Appendix B), responses to voice monitoring without feedback (midpoint; Appendix J), and responses to feedback (final; Appendix J).

   a. Unique to the initial interview: Participants completed the general intake form (Appendix A), providing basic information about demographics and life factors that can influence voice, such as smoking history and caffeine intake. In addition, the researcher gave a brief introduction to the study. The researcher described the tasks that participants would complete and demonstrated how the equipment worked.

   b. Unique to the midpoint interview: The researcher provided a brief verbal explanation of each measure and a basic introduction to each feedback display. For example, for the voice quality displays, the researcher first provided background on pitch strength as a measure of pitch saliency, and then briefly stated that prior research indicated that this measure is a good predictor of listener perception of voice quality. The researcher then introduced the participant to each type of quality display, and encouraged the participant to walk her through the displays to assess understanding. Finally, the researcher answered any questions the participant had about the feedback displays.
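For illustration, a minimal Python sketch of the readiness-to-change calculation described in item 1 is shown below; the function name and example subscale values are assumptions for illustration only, not study data.

```python
def readiness_to_change(precontemplation, contemplation, action, maintenance):
    """Readiness to change (RTC) as described above:
    the sum of the Contemplation, Action, and Maintenance subscale averages
    minus the Precontemplation subscale average."""
    return contemplation + action + maintenance - precontemplation

# Hypothetical subscale averages, for illustration only.
rtc = readiness_to_change(precontemplation=2.0, contemplation=3.5,
                          action=3.0, maintenance=2.5)
print(f"RTC = {rtc}")  # RTC = 7.0
```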
Recording Sessions

During the first recording session, the participant was fitted with an accelerometer (VoxLog collar, Sonvox, Umeå, Sweden), which was attached to a Roland handheld recorder (R-05, Lake Stevens, WA, USA). The accelerometer was attached to the front of the neck using double-sided tape. The exact placement of the accelerometer varied from participant to participant, with placement dependent on where a strong signal could be detected by the accelerometer. In general, the accelerometer was placed below the thyroid notch (Adam's apple). The VoxLog collar consists of both an accelerometer and a microphone, with the accelerometer only recording audio data up to 3 kHz. Recordings were dual channel, one channel each for accelerometer data and microphone data (a sketch of separating these channels for analysis appears at the end of this subsection). The participant wore the same dosimeter (accelerometer plus recorder) in all subsequent recording sessions.

For the feedback recordings, the researcher presented the displays with data from up to five prior recording sessions at a time on printed pages. Participants were allowed to ask clarification questions about the measures and displays (e.g., "What is quality again?", "Are increasing numbers better or worse?"), but they were not allowed to ask for the researcher's interpretation of the data trends. The researcher encouraged participants to "think aloud" while looking at the feedback to better understand how participants (potential users) interpreted the feedback with limited outside guidance.
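The following is a minimal sketch, assuming the dosimeter recordings were saved as stereo WAV files with the accelerometer on the first channel and the microphone on the second; the file name, channel order, and frame length are illustrative assumptions rather than details taken from the study.

```python
import numpy as np
from scipy.io import wavfile

# Assumed file name and channel order; the actual recordings may differ.
rate, data = wavfile.read("session01_dosimeter.wav")  # data shape: (samples, 2)

accel = data[:, 0].astype(np.float64)  # accelerometer (neck) channel, content up to ~3 kHz
mic = data[:, 1].astype(np.float64)    # microphone channel

print(f"Sample rate: {rate} Hz, duration: {len(data) / rate:.1f} s")

# Frame-level RMS on the accelerometer channel (phonation-related energy),
# using 50 ms frames as an arbitrary illustrative choice.
frame_len = int(0.05 * rate)
n_frames = len(accel) // frame_len
frames = accel[: n_frames * frame_len].reshape(n_frames, frame_len)
rms = np.sqrt(np.mean(frames ** 2, axis=1))
print(f"Computed RMS for {n_frames} frames on the accelerometer channel.")
```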
Recording Sessions: Part 2.1

VLTs were all completed in the same single-walled, sound-treated booth while wearing the dosimeter. The configuration of the booth can be seen in Figure 5. The VLT consisted of 15 minutes of reading aloud, while seated, from a novel (Charlotte's Web by E. B. White) presented in electronic form on a tablet. This duration was chosen by the researcher because it is less than the 35-minute safety limit suggested for a continuous reading task (Titze et al., 2003). In addition, the researcher instructed participants to "Read aloud as though you are reading to a classroom and want to be heard by all the children," the same instructions given in a prior study to elicit louder speech (Hunter et al., 2015). Loud speech is one type of task commonly used to induce vocal fatigue in participants (Solomon, 2008).

This instruction allowed the participants in Part 2.1 to approximate the vocal effort of the course instructors, although for a shorter time. Participants were instructed to read at their own pace. The researcher informed participants that they had the option to end any VLT early if their voice became very tired or if they needed to leave the booth before the end of the timed task (such as to use the restroom). As a precaution, if a participant's initial vocal fatigue rating was 7 or higher (none were), the researcher would cancel the session for that day and reschedule for another day. Participants were told to cancel if they were sick.

The researcher instructed participants to try to maintain a similar vocal loudness throughout the task. The target dB level was established during the short tasks prior to the VLT: participants counted from 1 to 5 at a comfortable level and then at a loud level. If the level became too low (no peaks occurring above the average comfortable counting level), a lamp would come on in the room, indicating that the participant should increase their volume.

Figure 5: Sound-treated booth configuration for Part 2.1. The two squares indicate the location of chairs for the participant during the session and the researcher during the presentation of the feedback.
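The following is a minimal sketch of the kind of level check behind the lamp indicator described above, assuming frame-level dB values are already available; the decision rule and example numbers are illustrative assumptions, not the study's exact implementation.

```python
import numpy as np

def lamp_should_turn_on(recent_db_frames, comfortable_avg_db):
    """Return True if the talker has dropped too quiet: no peaks in the
    recent frames exceed the average level measured during comfortable
    counting (illustrative logic only)."""
    recent_db_frames = np.asarray(recent_db_frames)
    return recent_db_frames.max() <= comfortable_avg_db

# Hypothetical values: comfortable counting averaged 65 dB, and the last
# few seconds of reading peaked at 63 dB.
comfortable_avg_db = 65.0
recent_frames = [58.2, 60.1, 62.7, 63.0, 61.5]

if lamp_should_turn_on(recent_frames, comfortable_avg_db):
    print("Lamp on: please increase your volume.")
```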
Recording Sessions: Part 2.2

Part 2.2 had the same structure as Phase 1, but the vocal loading task consisted of recordings during classroom lectures. During the VLT, no additional instructions to maintain a particular vocal loudness were given. The recording policy was as follows: if the instructor held class and was comfortable with the recording that day, the recording was made. If an instructor was sick and did not want to record that day, their wishes were respected. If an instructor informed the researcher that there was an exam in class on a particular day and there would be limited instructor talking, the recording did not occur that day. If the instructor did not hold class on a particular day, the recording did not occur that day.

Due to the real-world nature of the Part 2.2 recordings, additional precautions were taken. During the first interview session, the researcher made participants aware that not only their voices would be recorded, but that the voices of their students might also be recorded (on the microphone recordings). The researcher instructed participants to explain this to students, giving students the option to let the instructor know if the recordings made them uncomfortable. No student concerns were reported to the researcher. As an extra precaution, participants were taught how to pause the recorder, and the researcher informed participants that they could turn off the recorder at any time if they felt there were any privacy issues, and to let the researcher know if anything sensitive was recorded so that it could be deleted.

All instructors taught on Mondays and Wednesdays (which was not planned by the researcher). Due to the real-world nature of the study, the course length and amount of lecture material varied (Table 7). Exact course length varied week to week even for the same class meeting, so the lengths reported are the assigned lengths in the course directory.

Participant   Monday Class Length (hrs)   Wednesday Class Length (hrs)   Notes
P2201         1.5                         3.0                            2 different classes
P2202         3.0                         3.0                            2 different classes: M is more student presentations, W is more lecture based
P2203         1.5                         1.5                            Same class
P2204         1.0                         1.0                            Same class
P2205         3.0                         3.0                            2 different classes: toward end of semester, more voice use in W class

Table 7: Basic course structure, by instructor.

Feedback

The feedback presented to participants resulted from Part 1 of the study. Based on the findings from Part 1, the measures for feedback were (with the names used for feedback in parentheses): intensity (loudness), pitch strength (quality), first moment specific loudness (strain), and pauses (pause frequency and pause duration). However, because the researcher instructed participants to artificially increase their vocal intensity in Part 2.1 (instructions to "read as though reading to a classroom"), feedback on loudness was not provided in that phase. Therefore, participants in Part 2.1 received feedback on three measures (quality, strain, pauses), and participants in Part 2.2 received feedback on four measures (quality, strain, pauses, loudness).

For the recordings with feedback, participants received the feedback information prior to the VLTs. An example of the feedback, with explanations of each display, can be seen in Appendix K (Part 2.1) and Appendix L (Part 2.2). Feedback was presented on paper, one display at a time, but participants could request to go back to look at prior displays or to look at more than one display at a time if they felt that a comparison was helpful. The researcher randomized the order of feedback presentation across participants to ensure that there were no order-related effects.

Part 2 Analysis

Aim 1

In Part 2, the researcher sought to determine what further design requirements could be incorporated into future iterations of the feedback displays. Therefore, midpoint and final interview responses related to occupational voice user needs, feedback design, and feedback usefulness were transcribed, coded, and analyzed. Coding was based in grounded theory (Corbin & Strauss, 2008). After each interview was transcribed, the researcher used open coding to extract the meaningful data from the interview. Open coding relies on the data to generate the codes, rather than assigning pre-conceived codes to the data. For example, if a participant remarked, "I like the strain and quality measures," two codes might be assigned: "likes strain measure" and "likes quality measure." After all coding was completed, the researcher used affinity diagramming to identify emerging themes from the data (Beyer & Holtzblatt, 1998). Affinity diagramming implements a bottom-up approach to data analysis, allowing themes (code groupings) to emerge based on the data presented. The most common themes, representing codes from multiple potential users, can then be used to inform future design iterations.

Aim 2

To identify changes in voice behavior management after receiving feedback.

The researcher used open coding of the midpoint and final interview results to provide further insight on intuitive behavior changes and changes facilitated by the feedback. In addition to identifying themes from the interviews, it was hypothesized that active engagement in changing voice production would manifest as improvements in RTC (increase), S-E (increase), and vocal fatigue (decrease for dimensions 1 and 2, increase for dimension 3). To address this hypothesis, the researcher completed a mixed analysis of variance (ANOVA).

The researcher chose this analysis because it compares multiple measures while controlling for the other variables. The analysis used repeated measures, but due to the addition of a between-subjects factor (gender), the correct term for the analysis is a mixed ANOVA. In this analysis, there were two within-subjects factors: time (initial, midpoint, and final interviews) and measure (RTC, S-E, and VFI scores). The researcher entered each of the three factors of the VFI separately, leading to a total of five measures in the analysis. This is in agreement with Nanjundeswaran et al. (2015), who recommend that each VFI factor be reported separately. In addition, gender was the between-subjects factor, included to determine whether gender influenced participant responses. Statistical analysis was completed using IBM SPSS software (IBM SPSS Statistics for Windows, Version 23.0; IBM Corp., Armonk, NY, USA).
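The analysis above was run in SPSS with two within-subjects factors plus gender. As a rough illustration only, the following sketch shows a related but simplified mixed ANOVA in Python using the pingouin package, with time as a single within-subjects factor and gender as the between-subjects factor for one measure (RTC); the data values are invented placeholders, and this is not a reproduction of the study's full model.

```python
import pandas as pd
import pingouin as pg

# Hypothetical long-format RTC scores: one row per participant per interview.
df = pd.DataFrame({
    "participant": ["P01"] * 3 + ["P02"] * 3 + ["P03"] * 3 + ["P04"] * 3,
    "gender":      ["F"] * 3 + ["F"] * 3 + ["M"] * 3 + ["M"] * 3,
    "time":        ["initial", "midpoint", "final"] * 4,
    "rtc":         [5.0, 6.5, 7.0, 4.0, 4.5, 6.0, 5.5, 5.0, 6.5, 3.5, 4.0, 5.0],
})

# Time as the within-subjects factor, gender as the between-subjects factor.
# The study additionally crossed five measures as a second within-subjects
# factor; here only RTC is shown for simplicity.
aov = pg.mixed_anova(data=df, dv="rtc", within="time",
                     subject="participant", between="gender")
print(aov)
```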
Part 2 Results and Discussion

Aim 1 Interview Results and Discussion

A total of 1,064 unique codes were generated from the interview data, with some repetition of codes when appropriate. Of those codes, 430 were classified into themes related to Aim 1. The three most common themes, comprising 323 unique codes, are reported here (Table 8). They are: positive comments on current feedback displays (66 codes), occupational voice user needs (152 codes), and recommended feedback display improvements (101 codes). These three themes encompass the most important commentary and recommendations of the participants for the feedback displays. Detailed descriptions and examples of quotes fitting these themes are included below.

Theme and sub-theme                                                      Unique codes
Positive Comments on Current Feedback Displays (66 codes)
  1. Displays are user-friendly                                          25
  2. Measures are helpful and should stay included in feedback           41
Occupational Voice User Needs (152 codes)
  1. Clearer definitions of measures are needed                          66
  2. Strategies for improving the voice based on feedback are needed     64
  3. The system should be adaptable for a range of user needs            24
Recommended Feedback Display Improvements (101 codes)
  1. Users should be able to include notes and labels in data            16
  2. Displays should show relative trends across days                    24
  3. Additional feature suggestions                                      61

Table 8: Emergent themes related to Aim 1.

Theme 1: Positive Comments on Current Feedback Displays

It is promising to note that some features of the current displays were felt to be user-friendly and helpful. While participants provided many helpful suggestions for changes in future iterations of the feedback displays, some elements were identified as positive features that should be preserved in future iterations.

Sub-Theme 1: Displays are user-friendly

Multiple participants expressed that the feedback was simple and easy to read in both Part 2.1 and Part 2.2. For example, P2106 stated, "I thought it was really easy to read, very simple," and he added, "The symbols do a pretty good job representing each quality [measure]." P2101 reported there was "nothing wrong with how it [the feedback] was presented." P2205 highlighted strain as a measure with good displays: "It shows the changes intuitively." Overall, P2205 remarked, "Already, it's kind of a practical system." In summary, while participants did identify weaknesses in the current feedback displays, they felt that one strength of the current displays is the straightforward way in which they are presented, and this presentation manner should be maintained in future iterations.
Sub-Theme 2: Measures are helpful and should stay included in feedback

All four measures were identified as being helpful by at least one individual. Of the four measures, strain was mentioned most often (10 instances), followed by quality (4 instances), pauses (3 instances), and loudness (2 instances). Some participants felt that there was not just one most important measure. P2105 commented that strain and quality were the most helpful, while P2111 felt that strain and pauses were the most helpful. P2104 remarked, "All [measures were] about the same in helpfulness."

Some student participants in Part 2.1 provided insight into how they felt occupational voice users would interpret the measures. For example, P2114 commented, "I definitely think that strain is something that's very important to look at, especially if you're a professor talking all day." This corresponded with a comment from a participant in Part 2.2, P2204: "It's nice to know how much strain is involved in the voice." While strain was identified as the most helpful measure, participant commentary suggests that all measures provide some level of usefulness and may be appropriate to maintain in future iterations.

Theme 2: Occupational Voice User Needs

The core feature of user-centered design is involving the end user in all stages of the design process to the greatest extent possible. While occupational voice users and students who are future occupational voice users were included in Part 1 of the study, they did not actually use the feedback they helped to create. The additional occupational voice user needs uncovered by the participants in Part 2 included: clearer definitions of measures need to be provided, objective feedback needs to be accompanied by strategies to improve voice use, and individual user needs demand greater flexibility in the system.

Sub-Theme 1: Clearer definitions of measures are needed

While some participants felt that the displays were easy to use, many participants struggled with understanding the measures featured in the displays. A simple explanation of each measure was provided in the midpoint interview when the feedback displays were introduced, and participants were able to ask for clarification on measures during the feedback recording sessions, but this was not perceived to be enough support for understanding the objective measures. The measures most frequently identified as needing more detailed explanations were quality (16 instances) and strain (15 instances). To a lesser degree, pauses (5 instances) and loudness (2 instances) were also identified by participants as needing more detailed explanations.

As stated by P2107, "If I was going to get an app like that on my phone, I wouldn't fully understand [the measures] unless there was a description of it." This lack of understanding of the measures led to confusion, as explained by P2112, "But I had no idea why my quality varied and why my pauses varied." P2111 reflected, "I understand that I want to improve my quality even though I have no idea what is the higher quality." Some participants, such as P2201, even remarked that not having clear definitions led to confusion between measures: "More clear distinction would be helpful." P2103 felt that not just the measures themselves needed better explanations, but also that "Specific strategies or helping to make links across the three pieces of feedback" would benefit users.
This confusion, and the need for more explicit definitions of measures and of the relations between measures, was summed up by P2203: "If I was a specialist, I could look at what it is [feedback measures] and I could do something about it, but as a novice, I'm not invested in this in the same way." With many of the participants expressing the need for a better understanding of the measures, a simple explanation of each measure appears not to be enough, and future design iterations need to provide a more explicit option for definitions, such as optional viewing when needed, as suggested by P2107: "something you can click on, or if you click on the thumb and you can see your quality and a description up at the top and individual days underneath."

Sub-Theme 2: Strategies for improving the voice based on feedback are needed

While physical activity trackers, in their simplest form, provide a daily step count for individuals to track their activity over time, basic reporting of measures is not sufficient for PVM. P2102 nicely described his interpretation of the difference between a physical activity tracker and a voice monitor: "[For a voice monitor] Give interpretation and recommendation, the most important part. If people buy this, it's what they'll be looking for. It moves past what a physical activity tracker is." This desire for providing strategies for improving the voice as part of the feedback was echoed by most participants. P2103 stated, "To me, it was informative feedback, but not very educative. The feedback didn't help me to know what to do." She later added, "In education, we not only give feedback on performance but on how to improve." P2202 felt the same: "The feedback is interesting, but it is coaching of things you can do and try that I think would be helpful also." P2204 also highlighted the importance of including strategies by describing an alternative scenario: "Otherwise, every individual will make his or her own decisions and you will have absolutely zero control over what they do." Based on this information from participants, it is clear that feedback from voice monitoring needs two components: 1) the measures themselves and 2) suggestions for how to improve voice use based on the measures. The exact nature of these suggestions, such as the depth of explanation needed, should be explored in future studies.

Sub-Theme 3: The system should be adaptable for a range of user needs

When looking at occupational voice users from a vocal health perspective, the focus is on reducing risk factors for developing voice disorders. As mentioned previously, these include decreasing loudness and decreasing the time spent talking (increasing pauses). However, just as voice therapy is tailored to individual client needs, participants suggested that PVM also needs the flexibility to adapt to specific user needs. Again, this differs from a simple version of an activity tracker, where all users are assigned the same basic goal. P2203 discussed this idea in her final interview: "I think it also speaks to the function of the person talking. If I'm talking on the radio, it's a different kind of thing. If I'm talking to a classroom where I have an audience and I have response and I'm trying to convey certain things, then my big picture of how my voice is being used is a different thing. If you're talking because you're doing audio books, that's a different thing too.
Different in how you use your voice, and different in what's important."

P2103 discussed the PVM system in relation to the specific demands placed on teachers: "Losing voice is a constant problem for teachers. Losing your voice is especially a problem for new teachers. There's a real need for it [PVM]." Despite the need, P2103 went on to discuss the difficulty of getting teachers to adopt such a device given the demands they face on a regular basis as part of their occupation: "This would be one more new thing that teachers have to think about buying into."

Not only were varying occupational voice user needs identified in the abstract, but study participants also suggested varying goals within the study. P2101, P2104, P2111, and P2113 all identified themselves as being "quiet talkers." However, these individuals reported that the VLT (reading at an elevated vocal intensity) helped them to speak louder in their daily lives. P2101 stated, "I speak louder now." P2111 integrated increased vocal intensity and pause time into her teaching: "So I trade off- I pause more but become louder for more people to hear me." On the other hand, P2201, P2110, and P2114 reported being "loud talkers" trying to speak more quietly as a result of the study. P2101 reflected, "I try to be more careful with my voice." P2110 said, "If I don't have to talk that loud, then I usually don't" in the midpoint interview (before feedback), but reported no change in vocal habits outside the study in the final interview. This discrepancy suggests either that some changes resulting from voice monitoring may be more short-term, or that this change was integrated into his vocal behavior and was no longer seen as a change from the study.

These comments from participants provide an important consideration for future development of PVM: it is not enough to simply suggest voice conservation to reduce the risk of disorders. Individual goals need to play some role in the feedback. Further investigation is needed to better understand the breadth of these different needs, and to determine whether any common ground can be identified.

Theme 3: Recommended Feedback Display Improvements

Participants felt that there were a number of ways to improve the feedback displays to increase the functionality of PVM. These included adding notes and data labels, focusing on relative data trends rather than absolute values, altering symbols and baseline values, and adding real-time alerts for the user.

Sub-Theme 1: Users should be able to include notes and labels in data

One issue commented on by multiple participants during the study, and addressed again in the final interviews, was the difficulty of recalling vocal behavior days after it occurred. The inability to analyze the voice recordings online was a limitation of the current study, but based on participant comments, difficulty with delayed recall would be present even with real-time analyses when users try to compare voice behavior across days. P2205 stated, "The voice tone is related to the context- I want more cues for context." P2202 suggested, "It would be really cool to go back through and label time periods… Part of the reason for the study is not just providing information but helping to make changes.
I think the chunking would actually help with that." Or, he suggested a more automated way of doing this: "I'd love to line it up with my syllabus because that would really generally remind me of what I was talking about that day." For non-teachers, P2112 proposed, "They could have a little calendar and they can go back and look at 'oh, this accounted for it'."

In addition to including context about what was going on in the feedback, participants also wanted a notes function. For example, P2102 felt that users would like a way of "Adding your own interpretations." P2202 also wanted a "'Notes to self' field for each day." While participants wanted both feedback and tips for how to improve voice production from a monitoring system, they also wanted to add context to better understand their voices.

Sub-Theme 2: Displays should show relative trends across days

The desire for relative trends is also partially related to difficulty with delayed recall of voice use. As stated by P2202, "Because of the time delay, it was easier to think in large time chunks." It is also related to personal preference, as expressed by P2204: "My preference is to see the small clearly and the big picture." P2205 echoed this sentiment: "Relative trends or changes were helpful for me." P2111 took this idea a step further: "For the different weeks, probably my voice will change a little bit. Sometimes I'm really tired, sometimes I'm really energetic. So doesn't mean anything- or probably an average value of one month or one semester, that probably make more sense." For the current study, the feedback was scaled to show a short amount of time (15 minutes to 3 hours) across days in a detailed form. However, the amount of detail in displays comparing the voice over full days needs to be investigated further. Based on the opinions of some study participants, a high level of detail may not be as meaningful to users as general trends across days.

Sub-Theme 3: Additional feature suggestions

Participants made many suggestions for future PVM development in addition to those described above. Some additional suggestions are included here to demonstrate the breadth of these recommendations, and the need for further research before PVM is ready for widespread adoption.

Some participants suggested alterations to the symbols used in the feedback. Specifically, suggestions were made for changing the symbols along the y-axis for both quality and strain. P2201 was concerned that the baseline face is "…sometimes perceived as a frowny face for some reason," and suggested including a frowning face as a contrast to the baseline face to help remind users. P2205 was concerned about the strain guy, stating, "Sometimes it looks not that bad because it's working harder." This participant was referring to the fact that the symbol for more strain looks like it is working harder at lifting weights, which can be good for one's health. P2201 felt that simplification of the strain image could help, stating: "the face with a little bit of effort, and the face with the lines," rather than picturing the weights. P2109 had the opposite reaction: "The stick person might be simpler- Maybe just a barbell." P2202 was concerned about the changing baseline for loudness (the baseline was based on the average dB level by day). This differed from strain and quality, which had the same baseline each time.
He felt it "would be nice if they all had same baseline so you could compare across days." P2112 suggested adding a baseline value for pauses: "It'd be nice to have a baseline for the pauses, since there's the baseline for the other two."

Participants also wanted alerts built in, with varying functionality. P2205 suggested a "nudge" feature: "I was thinking 5 minutes before the class, if the system can have access to my calendar, 'your class is coming. In your last class you did this, and this, and this,' in my Apple watch. And then 'why don't you…'" Based on voice use, P2106 wanted the PVM to "Give you alerts to say you're at high risk for voice problems right now." P2111 wanted more specific real-time suggestions, such as "I would suggest you have a rest, or you should have a pause" when the PVM detects that the speaker has been talking too long. P2205 offered that loudness "could be volume as a trigger or a cue for the adjustment."

Aim 1 Interview Results Summary

The interview results were very informative for considerations in the future development of PVM. The main findings included some positive reviews of current features, including the simple display organization and the helpfulness of the current measures. Participants also suggested many ways to improve PVM to appeal to a wider audience by allowing flexibility for varying user goals and providing greater support through clearer measure definitions and suggestions for how to improve one's voice production based on the current feedback. Finally, participants suggested adding a level of customization in the displays so that users could add notes and labels to better understand the context of the data, focusing on relative trends in the data, changes to improve interpretation of the displays, and alerts when behavior suggests an increased risk for voice disorders.

Aim 2 Interview Results and Discussion

A total of 1,064 unique codes were generated from the interview data, with some repetition of codes when appropriate. Of those codes, 421 were classified into themes related to Aim 2. The three most common themes, comprising 343 unique codes, are reported here (Table 9). These three themes encompass the overall picture of change for these participants: changes in action, changes in thinking, and lack of change. Detailed descriptions and examples of quotes fitting these themes are included below.

Theme and sub-theme                                                                Unique codes
Reported Behavioral Changes (156 codes)
  1. Active changes in vocal behavior due to increased awareness and feedback      79
  2. Changes in vocal behavior due to monitoring                                   17
  3. Task specific voice changes in Part 2.1                                       60
Increased Awareness of Voice (138 codes)
  1. Interpretations of feedback                                                   106
  2. Learning about own vocal fatigue and risk of voice problems                   32
No Observed Changes (49 codes)
  1. No conscious behavioral change                                                35
  2. No change observed in feedback measures                                       14

Table 9: Emergent themes related to Aim 2.

Theme 1: Reported Behavioral Changes

Participants reported a variety of behavioral changes in the interviews. These included changes to try to help the voice, changes stemming from knowledge of being monitored, and some specific changes reported due to the nature of the VLT in Part 2.1. These are described in detail below.

Sub-Theme 1: Active changes in vocal behavior due to increased awareness and feedback

Participants reported a range of behavioral changes resulting from participation in the study. The most common change was increasing pauses (24 unique codes).
P2111 reported increasing her pausing behavior in her teaching outside of the study: "When I find it's (lecturing) boring, I probably speak more fast. So I try to commit to thinking 'I need to slow down, I need to have pause.'" P2202 reported, "The place that I noticed the most change was how much [I talked]." He continued by revealing a strategy he used to increase pausing (voice rest): "When I was starting to notice things, I was trying to get the students to talk more."

Changing vocal loudness was the second most common change (22 unique codes), but the direction of change varied between individuals. This was an unexpected finding given that the purpose of PVM was to reduce the risk of voice disorders, and one way of doing that is reducing loudness. P2114, who reported being a "loud talker," followed the expected direction: "There are periods where I'll talk quieter if I'm like 'Oh, my voice isn't feeling the best today.'" P2110, another "loud talker," also reported, "I can be quieter [than before the study]." On the other hand, P2113, a "quiet talker," stated, "Outside of the study I might have started talking a little bit louder." Other changes were reported as well, such as by P2111, a "quiet talker," who stated that she now balances her vocal loudness and pauses in her teaching: "So I trade off- I pause more but become louder for more people to hear me. I think that will make more clear to the students." P2205 reported that during the baseline recordings he "tried a little bit to find a voice that made my throat comfortable," including changing the pitch of his voice.

Sub-Theme 2: Changes in vocal behavior due to monitoring

Not all voice changes were due to participants trying to change their voice to help voice production. Some changes were made because of the monitoring itself, and many of these diminished with continued monitoring. These changes were reported by participants in both Part 2.1 and Part 2.2. However, they probably affected how participants thought about their voices in the study, and may overemphasize the changes that were seen. For example, P2201 reported at the midpoint interview that she "tried to make sure you had enough information, so I tried to talk more." P2109 and P2112 reported focusing on enunciation, as evidenced by P2112's comment: "I think I focused more on how I was enunciating- I focused on saying every word." P2114 stated, "I notice that I eventually got louder over time [within the study]."

Sub-Theme 3: Task specific voice changes in Part 2.1

Some behavior changes noted by participants were specific to the reading task in Part 2.1. Again, these may inflate some of the participant-reported findings in the study, especially if participants made these changes sound more generalized in the interview. Some of the more common reports included talking louder than usual (10 instances) and talking more than usual (7 instances). For example, P2104 stated, "And I know I have to talk louder [in the VLT]" and "I talk more than I usually do [in the VLT]." Other changes were reported as well, such as P2114 stating, "I did notice that while reading Charlotte's Web I had specific voices for each of the characters. Sometimes one of them [the voices] would bother me so I would change it."

Theme 2: Increased Awareness of Voice

Participants reflected on their change in mindset when it comes to voice. Participants shared their interpretations of the feedback, as well as increased awareness of vocal fatigue and the risk of voice problems.
These sub-themes are further explained below. Sub-Theme 1: Interpretations of feedback Because there was little direction given to participants in how to interpret the feedback, insights into the thought processes of these individuals are critical. This information can be used in future iterations to shape the presentation of data, and may also be of use to other voice care professionals when designing treatment programs. First, 16 unique codes described disagreements between what participants saw in the feedback and their expectations. For example, P2106 felt he had a good grasp on the relationship between measures until "I had the day that completely threw my theory off, but that might have been a fluke day." P2110 discussed the unpredictable nature of the quality measure: "Some of it's random, it feels like sometimes. Like when the quality is jumping up and down." P2113 felt there was a disconnect between her perception of vocal fatigue and strain: "I feel for the fatigue one [lowest on the scale], I have no idea what I'm going to get [for strain]." While there was some disagreement with the feedback, some comments suggested a level of agreement or a connection between the participant and the feedback. P2202 commented, "This was more useful [pause bar graphs] because it's over time especially because you can see the structure of my class." P2205 saw the feedback as a way to interact with his voice on a different level: "An avatar is reacting to my performance, so making this guy calm was the objective. It is showing myself, it is showing my throat. I need to take care of myself." Another interesting finding was the level of trust in the feedback. While the feedback did not conform to some participants' ideas of how it should work, some participants indicated a high level of trust in the feedback. P2109, who noticed that his strain often decreased from the beginning of a session to the end while his fatigue increased, commented, "I paid less attention to how my voice felt. I'm not talking for that long, and it will be fine, according to the data that I've seen." This was counter to what was anticipated in the study, the goal being to increase voice awareness rather than reliance on feedback to indicate what is happening. However, the relationship described by this individual, of decreasing strain over time, was seen by multiple participants and may be due to a warm-up effect (Vilkman, Lauri, Alku, Sala, & Sihvo, 1999) rather than simply showing that strain decreases over time with use. Repeating this study with a longer VLT may lead to different results. Sub-Theme 2: Learning about own vocal fatigue and risk of voice problems One common area in which participants expressed increased awareness was vocal fatigue. P2202 was especially excited about noticing gradations in fatigue, which he had not noticed previously: "Participating in the study versus not, I was noticing those indications [of fatigue]. I was noticing a lot of differences rather than all of a sudden noticing that my voice was really, really tired. Now I can notice more gradations in between which has been really cool." P2106 also discussed his increased awareness of vocal fatigue: "I became more aware of when I needed to give my voice a rest or when it was getting more effortful to produce." P2201 took this a step further, expressing that before the study, "I never even thought there was such a thing as a voice problem," but that now she is much more careful of how she uses her voice.
Theme 3: No Observed Changes Sub-Theme 1: No conscious behavioral change While many participants reported changes in their vocal behavior, either task specific or generalized, other participants reported a lack of behavioral changes. For example, P2204 stated, "I need to communicate, and I started where I ended in the sense I talked in the same fashion." P2203 also did not feel that she changed her voice as part of the study, but rather, she felt that: "from my point of view, I'm just participating." Other participants changed some things, but not others. For example, P2201 remarked on her loudness: "I don't think much changed in terms of loudness of my voice." Sub-Theme 2: No change observed in feedback measures Some participants commented that they saw little to no change in the feedback measures over the course of the study. For example, P2107 stated, "I didn't think the vocal strain was as helpful because I didn't feel like it changed all that much. But then again, neither did the other two." P2112 attributed this lack of perceived change as possibly stemming from the scaling of the feedback, remarking, "It'd be nice if our baseline was a little more tight so that we could still see change." However, another reason why there may have been little perceived change stems from the design of the experiment. All participants reported no history of or current voice disorders requiring medical treatment. Therefore, all their voices were relatively normal, and this lack of change may stem in part from not having voice problems. It may be informative to try a similar task with individuals with voice disorders to see if greater changes are observed. Aim 2 Interview Results Summary Overall, participants reported a wide spectrum of changes in the study. These changes ranged from a lack of change to increased awareness (changes in thinking) to actual behavior changes. Within each of these categories, the reports varied. For example, some individuals reported an overall lack of change in behavior, whereas others reported a lack of change for specific measures and changes in others. Increased awareness ranged from increased awareness of different aspects of the voice through interpretation of the feedback to increased awareness of the presence of vocal fatigue and risk of voice disorders. Finally, behavior changes were reported. Some changes were generalized to a broader context while others were limited to the VLT. Interview Results Limitations There were some limitations to Part 2 that may have influenced the interview results. First, participants in Part 2.1 were predominantly students who had limited prior experience as occupational voice users. Therefore, some of their comments were based on what they thought would relate to occupational voice users. In addition, the controlled reading task may not have offered enough similarity to a real-world environment to facilitate carryover of voice use changes, and their insights may have changed with a more real-world VLT. Some of the changes reported by participants in Part 2.1 were related to the task itself rather than actual behavior change, such as P2113's reflection that the way she talked in the VLT was "Definitely louder than I normally talk." When participants clarified these statements as being specific to the demands of the task, they were coded this way, but not all participants provided this level of detail. Therefore, it is possible that the level of perceived change in Part 2.1 may be artificially inflated.
It is also possible that the length and structure of the VLT in Part 2.1 were too short to elicit the expected changes. The task may have been too short for some participants to experience fatigue (as reported by P2101), and for others, it may have led to a warm-up effect rather than a fatiguing effect. P2110 commented that it would be interesting to compare vocal performance between this length of task and a longer VLT. P2107 was interested in seeing what effect different reading material (with more emotional content) might have on the voice. Part 2.2 was the field test and involved recording in participants' natural environments. Therefore, it is more likely that the changes reported there stem from participating in the study rather than being an artifact of the task itself. However, the scope of the recordings was still limited. Other vocally demanding tasks, such as teaching other courses and attending meetings, were not recorded. Therefore, participants were not given a holistic picture of their day, but rather a snapshot. Aim 2 Statistical Analysis Out of all the questionnaires, only one missing data point was identified (for the final interview S-E questionnaire for P2204). However, because the rest of the questionnaire was answered, these data were still used with the understanding that this individual's score was potentially lower (e.g., they could have scored the question as "0") due to the missing data. Because all questionnaire scores were used, a total of 13 scores were analyzed (one per participant) for each measure at each interview in Part 2.1, and 5 scores were analyzed (one per participant) for each measure at each interview in Part 2.2. It was hypothesized that active engagement in behavior change would manifest as improvements in RTC (increase), S-E (increase), and vocal fatigue. The improvements in vocal fatigue were defined as changes in one or more of the following VFI factors (consistent with Nanjundeswaran et al., 2015): a decrease in Factor 1, a decrease in Factor 2, and/or an increase in Factor 3. For changes in individual results in Part 2.1 across the three interviews, see Appendix M, Appendix N, and Appendix O for RTC, S-E, and VFI scores, respectively. For changes in individual results in Part 2.2 across the three interviews, see Appendix P, Appendix Q, and Appendix R for RTC, S-E, and VFI scores, respectively. Part 2.1 The assumption of sphericity was met for all variables but "measure." Therefore, Greenhouse-Geisser corrections were used for all Part 2.1 questionnaire statistics for consistency. The results of the mixed ANOVA are reported below. Main Effects There was a significant main effect of time, F(1.947,21.417) = 6.299, p = .007, partial η² = .364. The mean score across questionnaires was significantly lower in the initial interview (M = 7.798, SD = 4.495) than either the midpoint interview (M = 9.081, SD = 3.237) or the final interview (M = 8.902, SD = 4.335), p = .026 and .031, respectively. No significant difference was found between the midpoint and final interviews. This finding is in the hypothesized direction for three of the five measures (RTC, S-E, and VFI Factor 3). However, it is of interest to note that the trend is for these measures to increase prior to the feedback, suggesting an effect of monitoring. Trends are explored in more detail below. Corresponding individual results can be seen in Appendices M-O (Figures 140-142). For RTC, this trend of increasing and then decreasing was seen in 9 of the 13 participants (69.2%).
This finding provides support for Aim 2: behavior change measures will show improvement. The trend suggests that most participants increased their readiness to change, likely due to actively engaging in the behavior change process. Even more interesting, all 9 of these participants had a higher RTC at the midpoint interview (monitoring with no feedback) that declined (although remained above the initial level) in the final interview. Future work should explore why RTC increased so markedly during the monitoring stage but decreased after feedback. For S-E, the same trend was seen in 6 participants (46.1%). In addition, 8 participants (61.5%) showed at least an increase from the initial to the final interview. These findings suggest that over half of the participants experienced an increase in self-efficacy over the course of the study, which is again in support of Aim 2. Again, future work should further explore the reason for the greater increase in S-E during monitoring alone (versus monitoring with feedback). For VFI scores, the findings were mixed. On VFI Factor 1, 11 participants (84.6%) followed the trend seen for RTC and S-E. This is counter to the hypothesis in Aim 2 because increasing scores on Factor 1 indicate greater vocal fatigue. However, one possible explanation for this increase is that because most of these participants were students with limited prior experience as occupational voice users, engagement in the VLT may have increased their awareness of vocal fatigue, leading to increased scores. In this case, the decrease in the final interview suggests at least some reduction in perceived fatigue due to engagement in PVM. VFI Factors 2 and 3 showed much weaker adherence to the overall trend, with 4 participants (30.8%) and 3 participants (23.1%), respectively, showing the trend. There was also a significant main effect of measure, F(2.159,23.754) = 32.402, p < .001, partial η² = .747. Almost all measure pairings were statistically significantly different at or below p = .043, with a few exceptions. The exceptions were: RTC (M = 4.621, SD = 2.153) and VFI Factor 2 (M = 4.513, SD = 2.999) at p = 1.000; S-E (M = 9.744, SD = 3.470) and VFI Factor 1 (M = 14.487, SD = 4.541) at p = 0.71; and S-E and VFI Factor 3 (M = 9.641, SD = 2.311) at p = 1.000. This main effect was less meaningful because the scaling of the questionnaires and the underlying constructs they measured were different, and these differences probably contributed to the significant differences. However, the findings are consistent with the three-factor model of vocal fatigue because all three factors were statistically significantly different. Finally, the main effect of gender was non-significant, F(1,11) = .045, p > .05, partial η² = .004. Interactions There was a non-significant interaction of time and measure, F(4.024,44.267) = 2.444, p > .05, partial η² = .182. Despite being non-significant with the Greenhouse-Geisser correction, this nearly significant result warranted further investigation of emerging trends. (If a Huynh-Feldt correction had been used instead, this result would have been significant at p = .024.) Individual comparisons can be seen in Tables 10-12 for the initial, midpoint, and final interviews, respectively. In these tables, only the statistically significant differences are shown. Some interesting findings emerge when looking at these results. The reported means are difference scores: the first measure minus the second measure (e.g., RTC minus S-E).
These differences show, for example, that S-E scores were higher than RTC scores (on average) in the initial interview. Again, while these differences were significant, these findings are of less interest to the overall purpose of the study.
Initial Interview
  RTC vs. S-E: -4.94 (3.78), p = .006
  RTC vs. VFI Factor 1: -8.13 (4.96), p = .001
  RTC vs. VFI Factor 3: -5.13 (3.23), p = .001
  VFI Factor 1 vs. VFI Factor 2: 8.18 (4.59), p < .001
  VFI Factor 2 vs. VFI Factor 3: -5.18 (2.97), p = .001
Table 10: Comparison of Part 2.1 scores for the initial interview. Reported are the mean difference across participants (first measure minus second), the standard deviation, and the p-value; only statistically significant pairings are shown.
Midpoint Interview
  RTC vs. S-E: -4.69 (3.33), p = .004
  RTC vs. VFI Factor 1: -10.91 (3.06), p < .001
  RTC vs. VFI Factor 3: -5.16 (3.21), p = .001
  S-E vs. VFI Factor 1: -6.23 (5.53), p = .019
  VFI Factor 1 vs. VFI Factor 2: 10.69 (2.58), p < .001
  VFI Factor 1 vs. VFI Factor 3: 5.75 (3.72), p = .002
  VFI Factor 2 vs. VFI Factor 3: -4.94 (3.66), p = .005
Table 11: Comparison of Part 2.1 scores for the midpoint interview. Reported are the mean difference across participants (first measure minus second), the standard deviation, and the p-value; only statistically significant pairings are shown.
Final Interview
  RTC vs. S-E: -5.56 (3.16), p = .001
  RTC vs. VFI Factor 1: -10.66 (3.48), p < .001
  RTC vs. VFI Factor 3: -4.54 (4.10), p = .021
  S-E vs. VFI Factor 1: -5.10 (4.47), p = .017
  S-E vs. VFI Factor 2: 6.08 (5.52), p = .022
  VFI Factor 1 vs. VFI Factor 2: 11.18 (3.26), p < .001
  VFI Factor 1 vs. VFI Factor 3: 6.12 (4.29), p = .003
  VFI Factor 2 vs. VFI Factor 3: -5.06 (3.92), p = .007
Table 12: Comparison of Part 2.1 scores for the final interview. Reported are the mean difference across participants (first measure minus second), the standard deviation, and the p-value; only statistically significant pairings are shown.
The other potential interactions were not statistically significant. There was not a significant interaction of measure and gender, F(2.159,23.754) = .642, p > .05, partial η² = .055. In addition, there was a non-significant three-way interaction (time, measure, gender), F(4.024,44.267) = .753, p > .05, partial η² = .064. Finally, the interaction of time and gender was non-significant, F(1.947,21.417) = 3.303, p = .057, partial η² = .231. However, as with the interaction of time and measure, this nearly significant result warranted further investigation. No statistically significant differences were found for women's scores between time points. However, some statistically significant differences were found for men's scores. Men's overall scores (across all measures) were lower in the initial interview (M = 7.121, SD = 2.588) than in the midpoint (M = 9.293, SD = 1.864) and final interviews (M = 9.079, SD = 2.495), with p = .011 and .010, respectively. Despite being non-significant, the women showed the same trend as the men, with lower scores in the initial interview (M = 8.475, SD = 2.396) than the midpoint (M = 8.870, SD = 1.725) and final interviews (M = 8.724, SD = 2.310). Qualitative Analysis Trends by gender over time for RTC can be seen in Figure 6. This figure shows an average increase in RTC for both genders between the initial and midpoint interviews. From the midpoint to the final interview, men continue to show an increase (although smaller than from initial to midpoint), and women show a decrease (although still higher than at the initial interview). For both genders, this indicates an effect of monitoring alone. However, the difference between genders from the midpoint to the final interview warrants further investigation. Perhaps the feedback is more supportive in increasing RTC for men, and ways to improve its support for women should be explored. Figure 6: Average Readiness to Change scores over time by gender for Part 2.1. Trends by gender over time for S-E can be seen in Figure 7. This figure shows an interesting difference between genders. Scores for men increase across the three interviews, while scores for women decrease across the three interviews.
As with RTC, this suggests that the current feedback may be more supportive in increasing S-E for men, and ways to improve its support for women should be explored. Figure 7: Average Self-Efficacy scores over time by gender for Part 2.1. Trends by gender over time for VFI can be seen in Figures 8-10. For Factors 1 and 2, an increase in score indicates an increase in perceived vocal fatigue. Factor 1 is characterized by a feeling that the voice is "tired," which may lead to a reduction in further voice use, and Factor 2 is characterized by physical discomfort, such as a sore throat (Nanjundeswaran et al., 2015). Therefore, the average initial increase in both genders on both Factor 1 and Factor 2 may be consistent with increased awareness increasing the perception of vocal fatigue. The average decrease for men on both factors, and the average decrease for women on Factor 2, indicates that feedback may have helped reduce some of this perceived fatigue. The continued average increase for women on Factor 1 does not follow this trend. Figure 8: Average Vocal Fatigue Index, Factor 1 scores over time by gender for Part 2.1. Figure 9: Average Vocal Fatigue Index, Factor 2 scores over time by gender for Part 2.1. On the other hand, for Factor 3 an increase in score indicates a decrease in perceived vocal fatigue. Factor 3 is characterized by alleviation of fatigue symptoms with vocal rest (Nanjundeswaran et al., 2015). Therefore, men showed an average decrease in perceived vocal fatigue from the initial to the midpoint interview, and an increase at the final interview (although still less than at the initial interview). This overall increase in score indicates a reduction in vocal fatigue. On the other hand, women showed no change (on average) between the initial and midpoint interviews, and an increase in vocal fatigue at the final interview. While not specifically explored in this study, the increasing vocal fatigue experienced by women on Factors 1 and 3 may be due to the VLT used in Part 2.1. Some participants (such as P2104) reported continuing to increase their vocal intensity over time in the study due to increased comfort with the task, and this type of behavior may have led to an increased perception of fatigue within the study, which artificially influenced the questionnaire results. Therefore, further work in this area should attempt to tease these effects apart. Figure 10: Average Vocal Fatigue Index, Factor 3 scores over time by gender for Part 2.1. Part 2.2 The assumption of sphericity was met for all variables, so no corrections were needed. The results of the mixed ANOVA are reported below. Main Effects There was a significant main effect of measure, F(4,12) = 55.464, p < .001, partial η² = .949. Statistically significant differences were found for the following comparisons: RTC (M = 6.514, SD = 2.023) and VFI Factor 1 (M = 22.467, SD = 5.330); RTC and VFI Factor 3 (M = 8.667, SD = 1.799); S-E (M = 10.133, SD = 3.226) and VFI Factor 1; VFI Factors 1 and 2 (M = 6.333, SD = 2.992); and VFI Factors 2 and 3. Again, this main effect was less meaningful because the scaling and underlying constructs measured by the questionnaires were different, and these differences probably contributed to the significant differences. The findings are consistent with the three-factor model of vocal fatigue because all three factors were statistically significantly different.
There was a non-significant main effect of time, F(2,6) = 1.374, p = .323, partial η² = .314. However, due to the significance found in Part 2.1, further exploration to identify trends was undertaken. The mean score across questionnaires was higher in the initial interview (M = 10.319, SD = 3.067) than the final interview (M = 9.830, SD = 4.388), but both were lower than the midpoint interview (M = 11.229, SD = 4.554). Relative trends are explored in more detail below. Corresponding individual results can be seen in Appendices P-R (Figures 143-145). For RTC, the trend of increasing and then decreasing was seen in 4 of the 5 participants (80%). However, only 2 of the 5 participants maintained a higher RTC in the final interview compared with the initial interview (2 participants' scores were lower in the final interview). This difference in trend from Part 2.1 should be further explored in future studies. While there were only 5 participants in Part 2.2, these individuals are occupational voice users and are therefore closer to the target user population. For S-E, less of a trend was noted. Only 2 individuals (40%) showed this trend, and both of them ended with lower S-E in the final interview than the initial interview. One individual showed a steady increase over the three interviews, one showed a decrease, and one had a decrease at the midpoint interview and a return to baseline in the final interview. These inconsistent findings suggest that more work with occupational voice users in real-world environments is needed to understand the relationship between PVM and S-E. For VFI Factor 1, a total of 3 participants (60%) followed the trend of an increase at midpoint followed by a decrease in the final interview. Two of these individuals continued to have higher scores in the final interview than the initial interview, indicating more vocal fatigue. This is consistent with the findings in Part 2.1. Taking the findings from Parts 2.1 and 2.2 together, it is possible that voice monitoring alone draws greater attention to vocal fatigue and therefore participants report greater fatigue. With feedback, they are able to reduce that fatigue by at least some margin, and future work should explore how to further reduce perceived fatigue. For VFI Factor 2, a total of 3 participants (60%) followed a similar trend of an increase followed by a decrease in fatigue. However, these results showed greater promise. In total, 4 of the 5 participants (80%) demonstrated a decrease in Factor 2 from the initial to the final interview, indicating reduced vocal fatigue. This is a promising finding that may reach statistical significance with more participants. For VFI Factor 3, a total of 2 participants (40%) followed a similar trend. However, only one participant (20%) showed an increase in Factor 3 from the initial to the final interview (indicating less fatigue). No significant difference was found for gender, F(1,3) = 2.242, p = .231, partial η² = .428. This finding indicates that the differences found were not due to differences between genders. Interactions There was a non-significant interaction of time and measure, F(8,24) = 1.513, p = .205, partial η² = .335. Unlike Part 2.1, this result was not approaching statistical significance. However, unlike Part 2.1, there was a significant interaction of measure and gender, F(4,12) = 3.610, p = .037, partial η² = .546. Post hoc pairwise comparisons revealed the same trends for men and women, with men showing greater variation between measures.
Women showed an average difference of -14.632 (SD = 2.296) between RTC and VFI Factor 1, p = .003. In addition, women showed an average difference of 14.111 (SD = 4.980) between VFI Factors 1 and 2, p = .034. Finally, women showed an average difference of 10.667 (SD = 1.269) between VFI Factors 2 and 3, p = .001. On the other hand, men showed an average difference of -24.052 (SD = 2.296) between RTC and VFI Factor 1, p = .001. In addition, men showed an average difference of 19.167 (SD = 4.980) between VFI Factors 1 and 2, p = .025. Finally, men showed an average difference of 18.500 (SD = 1.269) between VFI Factors 2 and 3, p < .001. These findings were significant even with a small number of participants, but the same interaction was non-significant in Part 2.1. Therefore, further investigation with a larger number of participants is needed to determine the nature of the interaction between gender and measure. In addition, the interaction of time and gender was non-significant, F(2,6) = 2.142, p > .05, partial η² = .417. Finally, there was a non-significant three-way interaction (time, measure, gender), F(8,24) = 1.238, p > .05, partial η² = .292. Qualitative Analysis Trends by gender over time for RTC can be seen in Figure 11. This figure shows an average increase in RTC for both genders between the initial and midpoint interviews. From the midpoint to the final interview, women continue to show an increase (although smaller than from initial to midpoint), and men show a decrease (although still higher than at the initial interview). This trend is the opposite of Part 2.1, where women showed a decrease in RTC from the midpoint to the final interview. This finding suggests that men may need additional feedback support to continue to increase RTC. The difference in trends across the two phases of Part 2 suggests that further exploration is needed to uncover the true relationship. Figure 11: Average Readiness to Change scores over time by gender for Part 2.2. Trends by gender over time for S-E can be seen in Figure 12. As with RTC, the findings for S-E are inconsistent with the findings from Part 2.1. Both genders show an average increase from the initial to the midpoint interview, and then an average decrease in the final interview. For both genders, the decrease in the final interview is below the average baseline value, indicating decreased S-E (opposite of the hypothesized change). This suggests that the value of S-E as an outcome measure for PVM should be further explored. Perhaps a more tailored self-efficacy scale for voice disorder prevention (rather than an adapted general S-E scale) would lead to different results. Figure 12: Average Self-Efficacy scores over time by gender for Part 2.2. Trends by gender over time for VFI can be seen in Figures 13-15. For Factors 1 and 2, an increase in score indicates an increase in perceived vocal fatigue. Factor 1 is characterized by a feeling that the voice is "tired," which may lead to a reduction in further voice use, and Factor 2 is characterized by physical discomfort, such as a sore throat (Nanjundeswaran et al., 2015). Women had an average increase in both factors from the initial to the midpoint interview, and men had an average increase for Factor 1 (Factor 2 remained constant). As suggested in Part 2.1, the average increases in Factor 1 and Factor 2 may be consistent with increased awareness increasing the perception of vocal fatigue.
As in Part 2.1, men showed average decreases on both factors from the midpoint to the final interview, indicating that feedback may have helped reduce some of this perceived fatigue. The continued average increase for women on both factors does not follow this trend and should be further explored in future studies. On the other hand, for Factor 3 an increase in score indicates a decrease in perceived vocal fatigue. Factor 3 is characterized by alleviation of fatigue symptoms with vocal rest (Nanjundeswaran et al., 2015). Both genders demonstrated an average decrease from the initial to the final interview, suggesting an increase in this type of vocal fatigue. However, the gender difference at the midpoint interview shows potentially different responses to voice monitoring alone: women experienced an increase in perceived fatigue (decrease in score) and men experienced a decrease in fatigue (increase in score). Overall, both genders showed an increase in perceived Factor 3 fatigue, which again may be attributed to increased awareness. Figure 13: Average Vocal Fatigue Index, Factor 1 scores over time by gender for Part 2.2. Figure 14: Average Vocal Fatigue Index, Factor 2 scores over time by gender for Part 2.2. Figure 15: Average Vocal Fatigue Index, Factor 3 scores over time by gender for Part 2.2. Summary Extracting Design Requirements for Conveying Feedback At the end of Part 1, vocal intensity (loudness), pauses, pitch strength (quality), and first moment specific loudness (strain) were identified as measures of interest by occupational voice users participating in the study. Rather than a single display for each measure, participants expressed that they preferred a layered structure which included a basic comparison between all days, a more detailed comparison between days, and finally a zoomed-in version of each day. Feedback from participants in Part 2 suggested that strain might be the most helpful measure, with all other measures demonstrating some level of helpfulness. Participants liked that the displays were user-friendly and easy to read, but wanted clearer definitions of measures and suggestions on how to improve their voices based on the feedback measures. Interview results also indicated that occupational voice users are not a homogeneous group, and feedback should be flexible in addressing a range of goals. Participants were more interested in relative trends than in absolute values, especially because they did not have context to aid with delayed recall when comparing feedback across days. Additional features were suggested that should be explored in future iterations. Identifying Changes in Voice Behavior Management after Receiving Feedback The results from Part 2 indicate that PVM can be used to elicit behavior change in occupational voice users and future occupational voice users. Some of these changes were active, including trying to pause more when speaking and increasing/decreasing vocal intensity. Some of these changes manifested as changes in thinking. Many participants reported increased awareness of the need to preserve the voice, especially from fatigue. Participants also shared insights into how they interpreted the feedback to better understand their own voices. Finally, some participants reported limited behavior change and/or little observed change in their feedback.
Future research should attempt to determine what factors lead to greater vocal behavior change in some individuals than in others, in an effort to encourage more occupational voice users to actively engage in PVM to reduce the risk of developing future voice disorders. In addition to the changes reported by participants during the semi-structured interviews, the questionnaire results suggested some behavior change as well. For Part 2.1, there was a main effect of time showing a general increase in scores on the measures from the initial to the midpoint interview, but no statistically significant change in scores from the midpoint to the final interview. While not statistically significant in Part 2.2, a similar trend was found. This is an interesting finding that possibly suggests greater behavior change from monitoring alone versus adding feedback. However, further examination suggests that some change does occur from the midpoint to the final interview that should not be discounted. For example, RTC increased from the initial to the final session for 9 of 13 participants (69.2%). Even though the increase was greater from initial to midpoint, there was still an overall increase in RTC following feedback. As another example, from Part 2.2, VFI Factor 2 scores increased at midpoint for 3 participants (60%), but when comparing initial versus final scores, 80% experienced a decrease. This decrease indicates less Factor 2 vocal fatigue (physical discomfort). Both Part 2.1 and Part 2.2 showed a statistically significant main effect of measure. While this result is interesting, it does little to contribute to the overall purpose of the study. It was anticipated that these measures would capture different aspects of behavior change, and they all use different scales. In addition, a significant interaction of measure and gender was found for Part 2.2. This finding indicated that men demonstrate greater differences in scores across the different measures, which may need to be taken into account in future studies. Finally, the three-way interaction between time, measure, and gender was explored qualitatively, as no statistically significant results were found. The trends suggest some measure-specific differences between the genders over time, some of which varied between Parts 2.1 and 2.2. This suggests that further exploration of this three-way interaction may be necessary to understand how to maximize the feedback to best support a wide variety of potential PVM users. Conclusions Four measures (loudness, strain, quality, and pauses) were chosen as feedback, each presented on a series of three displays, with each display showing a different aspect of the measure. Based on feedback from participants, strain was the most helpful measure, but the other measures were helpful as well. Participants reported increased voice awareness and behavior change as part of the study. However, they felt that better measure descriptions, suggestions on how to improve voice use, and the ability to add context to recordings would enhance the user experience. Additionally, there was a common trend of increases in RTC, S-E, and VFI scores after recordings (both midpoint and final scores were generally higher than initial scores). While this is a positive direction for RTC and S-E scores, the increased VFI scores indicate greater perceived vocal fatigue. This may be attributed to increased awareness but needs to be explored in greater detail in future studies.
CHAPTER 5: Quantification of Voice Changes After Feedback Study Overview The goal of this study was to determine whether Preventative Voice Monitoring (PVM) would impact vocal behavior in occupational voice users and future occupational voice users. Chapter 4 discussed the design of the feedback displays (Aim 1) and insight into the sensitivity of behavior change questionnaires to internal changes experienced by users of PVM (Aim 2). Chapter 5 discusses observed changes in voice production (Aim 3) as a result of PVM. Part 2 Methods Using the feedback displays developed in Part 1, the researcher tested the feedback display prototypes in Part 2. Potential target users completed vocal loading tasks (VLTs), tasks designed to induce vocal fatigue, to determine whether objective feedback influenced later voice production. Part 2 consisted of two phases: 1) laboratory testing (reading aloud for 15 minutes at an elevated vocal intensity) and 2) field testing (classroom lectures). Participants Fourteen participants enrolled in Part 2.1, but only 13 participants met the inclusion criteria of the study (6M, 7F; M = 22.6 years, SD = 9.15 years, Range = 18-48 years). P2108 was excluded from the study because she did not meet the inclusion criteria. All remaining participants were either current occupational voice users or future occupational voice users. An additional five participants enrolled in Part 2, Phase 2 (Part 2.2), and all of them met the inclusion criteria (2M, 3F; M = 42.6 years, SD = 12.40 years, Range = 33-62 years). All participants in Part 2.2 were instructors at Michigan State University who taught courses at least two days per week. More detailed summary information is provided in Chapter 4: Part 2 Methods: Participants. Procedures Part 2 consisted of data collection over 11 individual sessions for each participant, with no session lasting more than one hour. These sessions consisted of three interview sessions and eight recording sessions. Only the recording sessions will be discussed in Chapter 5. All recording sessions featured a vocal loading task (VLT). The VLT consisted of reading aloud from "Charlotte's Web" for 15 minutes in Part 2.1, and lecturing to a classroom in Part 2.2. For Part 2.1, participants were instructed to "Read aloud as though you are reading to a classroom and want to be heard by all the children," which are the same instructions given in a prior study to elicit louder speech (Hunter et al., 2015). Loud speech is a task commonly used to induce vocal fatigue in participants (Solomon, 2008). This instruction allowed the participants in Part 2.1 to approximate the vocal effort of the course instructors, although for a shorter time. Further descriptions of the VLTs can be found in Chapter 4: Part 2 Methods: Recording Sessions. Recording Sessions All recording sessions followed the same general pattern. Before and after each VLT, the participant completed a series of short tasks to allow the researcher to assess changes pre- to post-VLT. The short tasks were completed in a randomized order across sessions. In one task, participants produced three sustained vowels, 3-5 seconds in duration, at a comfortable pitch and loudness. The average reading from a sound pressure level (SPL) meter (at a distance of one meter) during the third sustained vowel was used to calibrate the dosimeter. Participants also read the Rainbow Passage (Fairbanks, 1960) aloud at a comfortable loudness, and rated vocal fatigue on a scale from 1 to 10 (1 = not at all; 10 = the most extreme).
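The exact calibration arithmetic is not described in this chapter, but one common approach in voice dosimetry is to use the SPL meter reading from the reference vowel to derive a constant offset that maps uncalibrated accelerometer levels to dB SPL at one meter. The MATLAB sketch below illustrates that assumed approach only; the file names and the SPL value are hypothetical, and this is not the study's actual calibration script.

% Assumed calibration sketch (not the study's script): derive a constant offset
% from the SPL meter reading taken during the third sustained vowel, then apply
% it to later uncalibrated accelerometer levels.
acc       = audioread('ref_vowel_acc.wav');   % accelerometer recording of the reference vowel (hypothetical file)
splRef    = 78.5;                             % SPL meter reading during that vowel, dB SPL at 1 m (hypothetical value)

dbUncal   = 20 * log10(sqrt(mean(acc.^2)));   % uncalibrated level, dB re full scale
calOffset = splRef - dbUncal;                 % offset mapping uncalibrated dB to dB SPL

% Example: estimate the calibrated level of a later excerpt from the VLT recording
frame  = audioread('vlt_excerpt_acc.wav');    % hypothetical excerpt
splEst = 20 * log10(sqrt(mean(frame.^2))) + calOffset;
fprintf('Estimated level: %.1f dB SPL at 1 m\n', splEst);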
For Part 2.1, the short tasks were performed in the same sound-treated booth used for the VLTs. For Part 2.2, the short tasks before and after the VLT were performed in a location of the instructor's choosing (e.g., laboratory, office, or classroom). All participants (Parts 2.1 and 2.2) received feedback based on measurements from the accelerometer signal. Because VLTs in Part 2.2 occurred in the classroom, where ambient noise would affect measures from the microphone recording, the researcher used the accelerometer signal for feedback analysis and presentation. This is consistent with prior dosimetry studies (Hunter & Titze, 2010). However, previous studies validated the use of pitch, pitch strength, and first moment specific loudness only for acoustic (audio) signals (e.g., Shrivastav & Camacho, 2010; Shrivastav et al., 2012). The focus of this study was to assess the feasibility of providing feedback on voice use to people performing a VLT, and therefore these measures were taken from the accelerometer for the feedback, but correlations of these measures from the two types of signals (accelerometer, audio) were conducted to determine the generalizability of these measures. The data for the feedback were analyzed using a combination of GoldWave (GoldWave Inc., St. John's, NL, Canada) and MATLAB (The MathWorks, Natick, MA, USA) processes. For the steps used in these analyses, see Appendix S (Part 2.1) and Appendix T (Part 2.2). The analysis differed for pitch strength and pauses between Parts 2.1 and 2.2 due to constraints on using the MATLAB scripts in Part 2.2 with audio files greater than one hour in length. Basic descriptions of these analyses follow. First Moment Specific Loudness The researcher ran this analysis in a consistent manner across both Parts 2.1 and 2.2. In summary, the researcher extracted the sustained vowels before and after the VLT and input the middle 500 ms of each vowel into custom MATLAB scripts to estimate spectral moments (Kopf et al., 2013). Pitch Strength: Part 2.1 Pitch strength analysis was conducted using Auditory-SWIPE′ (Camacho, 2007) for all participants. Because this is a time-intensive analysis in MATLAB, the researcher completed it for the entire VLT for participants in Part 2.1, but only for the initial and final 15-minute lecture segments in Part 2.2. Participants saw the average pitch strength reported minute by minute so they could see the variability and change over time, if any. Pitch Strength: Part 2.2 Because running Auditory-SWIPE′ is time-intensive and data collection occurred twice a week, pitch strength could not be calculated for full class periods. Rather, participants were given feedback on the first and last 15-minute segments from each class period so they could see how, or if, pitch strength varied from the beginning to the end. Pauses: Part 2.1 Pauses were calculated using modified versions of the same MATLAB script in Parts 2.1 and 2.2. For Part 2.1, pauses were calculated based on the pitch strength and pitch output. If the pitch of an output segment was <80 Hz, it was determined to be a segment of silence. The researcher chose this cutoff value because many of the error values given were due to noise at 77 Hz. Because the output of pitch strength is in 0.01 second segments, 100 or more consecutive silence segments were considered to be pauses of 1 second or greater and were reported in the feedback.
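This run-length rule can be summarized in a short sketch. The following MATLAB code is illustrative only (it is not the script used in the study) and assumes the pitch estimates are already available as a vector with one value per 0.01 s frame; the variable names are hypothetical.

% Illustrative sketch of the Part 2.1 pause rule (not the study's actual script).
% Assumes pitchHz is a vector of pitch estimates, one value per 0.01 s frame.
frameDur = 0.01;               % seconds per frame (resolution of the pitch strength output)
silent   = pitchHz(:) < 80;    % frames treated as silence (80 Hz cutoff from the text)

% Locate runs of consecutive silent frames
d        = diff([0; silent; 0]);
runStart = find(d == 1);
runLen   = find(d == -1) - runStart;

% Runs of 100 or more frames (>= 1 s) count as pauses reported in the feedback
isPause    = runLen >= 100;
pauseOnset = (runStart(isPause) - 1) * frameDur;   % onset times in seconds
pauseDur   = runLen(isPause) * frameDur;           % pause durations in seconds
fprintf('%d pauses of 1 s or longer detected\n', numel(pauseDur));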
The researcher chose the one-second cutoff based on the existing literature. In turn-taking, gaps longer than a second are rare (Heldner & Edlund, 2010; Wilson & Wilson, 2005). In addition, silences of less than a second are considered to be at the word/phrase boundary level (Titze et al., 2007), whereas silences of a second or more are considered to be at the sentence level. This cutoff is also above the threshold for long pauses in both healthy controls (502 ms) and speakers with ataxic dysarthria (767 ms) (Rosen et al., 2010). Therefore, pauses of longer duration (one second or greater) might reflect a deliberate effort to pause on the part of the speaker. Pauses: Part 2.2 For Part 2.2, because pitch strength analyses could not be conducted on the entire classroom recording due to time constraints, pauses were based on dB analyses. Those segments that were <80 Hz and below a given dB threshold (which varied across participants) were considered to be segments of silence. Because the output of the dB analysis was in 0.1 second segments, 10 or more consecutive silence segments were considered to be pauses of 1 second or greater and were reported in the feedback. dB Level: Part 2.2 The average dB level was shown for every three seconds of recording for ease of displaying results. For the feedback displays, periods of silence were replaced with the average dB value. Analysis Aim 3 Aim 3: To quantify changes in the voice after receiving feedback. For statistical analysis, a multi-measure MATLAB script, custom designed in this laboratory for a prior study (Schloneger & Hunter, 2016; Schloneger, 2014), was used to analyze phonation time, dB level, fundamental frequency, pitch, and pitch strength from both the accelerometer and audio recordings. First moment specific loudness was evaluated using the same MATLAB script as in the feedback analysis. Analyses were completed separately for Part 2.1 and Part 2.2. In addition to addressing the three Aim 3 hypotheses, correlational comparisons of pitch, pitch strength, and first moment specific loudness from the microphone (audio) and accelerometer signals were conducted. These comparisons assessed the appropriateness of using these measures from an accelerometer signal. Hypothesis 1 Hypothesis 1 states: It was hypothesized that occupational voice users would improve voice production in response to feedback, and these improvements would manifest as decreases in one or more of the following: vocal intensity, voicing time, and/or F0. To address Hypothesis 1, the researcher completed a mixed ANOVA. There were two within-subjects factors: time (baseline, feedback) and measure (vocal intensity, voicing time, fundamental frequency). These measures were taken from the accelerometer signal. In addition, gender was the between-subjects factor. Because the focus was on whether there was a change between baseline and feedback recordings, the researcher compared the 15-minute VLT average of the three baseline sessions with the 15-minute VLT average of the five feedback sessions. Hypothesis 2 Hypothesis 2 states: It was hypothesized that increasing vocal fatigue would result in the following changes in objective voice quality measures: increasing strain (increasing first moment specific loudness) and/or increasing breathiness (decreasing pitch strength). To address Hypothesis 2, the researcher completed a stepwise linear regression, with an entry value of 0.05 and a removal value of 0.10.
Pitch strength and first moment specific loudness were the independent variables, and participants' self-rated vocal fatigue was the dependent variable. The researcher took the measures from the audio signal. Pitch strength and first moment specific loudness were averaged across the three sustained vowels for each time point (e.g., before the VLT in the first baseline recording session). These average values and their corresponding vocal fatigue scores were entered into the analysis as separate data points. Therefore, each participant had a maximum of 16 data points (eight before VLTs, eight after VLTs), with some missing data due to loss of recordings. Hypothesis 3 Hypothesis 3 states: Changes in breathiness and vocal strain pre- to post-vocal loading task would be greater for baseline tasks than for tasks with feedback. To address Hypothesis 3, the researcher completed a mixed ANOVA. In this analysis, there were two within-subjects factors: time (baseline, feedback) and measure (pitch strength, first moment specific loudness). The researcher took the measures from the audio signal. In addition, gender was the between-subjects factor. Because the focus was on whether there was a change between baseline and feedback, the researcher first averaged each measure (pitch strength, first moment specific loudness) across the three sustained vowels before each VLT, and did the same for the three sustained vowels after each VLT. Then, the researcher took the difference (before minus after) of the average values for a given measure in a given session, resulting in one average difference score for pitch strength and one for first moment specific loudness per session. Finally, the researcher compared the average difference score of the three baseline sessions with the average difference score of the five feedback sessions. Results and Discussion Correlation Results An important question explored in this study was whether pitch (as measured using Auditory-SWIPE′), pitch strength, and first moment specific loudness can be reliably measured from a reduced-bandwidth (accelerometer) signal. Comparison of pitch from accelerometer and audio signals The correlation of pitch from the two channels of the recording, averaged across all tasks (vowels, Rainbow Passage, reading) and all speakers, was found to be 0.901, suggesting that this measure can be used for accelerometer signals, at least for perceptually normal adult voices. Comparison of fundamental frequency and pitch The correlation of fundamental frequency and pitch, averaged across all tasks (vowels, Rainbow Passage, reading) and all speakers, was found to be 0.961 for the accelerometer signal and 0.898 for the audio signal. These high correlations suggest that both measures are capturing the same phenomenon. However, this correlation should be treated with caution, as it does not hold true for voice signals with reduced voice quality (Shrivastav, Eddins, & Kopf, 2014). Comparison of pitch strength from accelerometer and audio signals The correlation of pitch strength from the two channels of the recording, averaged across all tasks (vowels, Rainbow Passage, reading) and speakers, was found to be 0.683, suggesting that this measure is not appropriate for use with accelerometer signals. For two of the three initial participants, P2101 and P2102, the correlations were at or above 0.79, indicating a stronger similarity between the two analyses, but this did not hold when more participants' data were included in the analysis.
It is widely recognized that pitch strength can vary with spectral shape (e.g., Fastl & Zwicker, 2007); thus, a difference in pitch strength values when estimated from the lower-bandwidth accelerometer signal versus a higher-bandwidth microphone signal is not surprising. Comparison of first moment specific loudness from accelerometer and audio signals The correlation of first moment specific loudness from the two channels of the recording, averaged across all tasks (vowels, Rainbow Passage, reading) and speakers, was found to be 0.399, suggesting that this measure is not appropriate for use with accelerometer signals. Again, correlations were higher for P2101 (0.54) and P2102 (0.41), but not high enough to justify the use of this measure to estimate strain from accelerometer signals in future studies. Since spectral moments are directly related to spectral bandwidth, these findings are not surprising. Aim 3 Results and Discussion Data from the recording sessions were used in the analyses below. Most participants did not experience any issues with the recordings, but some issues led to loss of data for other participants. Data loss in voice monitoring studies is not an uncommon problem (Hunter & Titze, 2010; Schloneger & Hunter, 2016). In Part 2.1, P2106 and P2109 experienced a technical issue that caused the loss of part of a recording from one session. P2111 experienced infrequent difficulties with the recording equipment, ultimately resulting in nearly complete data loss for one session and partial data loss for another. In Part 2.2, P2201 was sick during the last two recording days but still chose to record on those days. These data were omitted from the Hypothesis 1 and Hypothesis 3 analyses so that her reduced voice quality would not negate any positive changes that might be captured in the other three feedback days. However, data from these days were included in the Hypothesis 2 analysis (correlations between voice measures and vocal fatigue ratings) because these recordings captured suboptimal voice and provided additional variation for the analysis. On the final feedback recording day, there was an issue with the recorder set-up that led to P2202's data being completely lost. However, this participant was willing to come back for a ninth recording session (approved by the MSU HRPP), and these data were used in place of his day 8 data. Issues on days 6 and 8 of recording led to data loss for P2203. P2204 had the most data loss of any participant: data from 4 of the 8 recording days (2 baseline recordings, 2 feedback recordings) were unusable or only partially usable. Finally, due to a scheduling issue with P2205's courses, only seven recordings were made. Even though there were missing data points for individual sessions, this did not affect the overall analysis for Hypothesis 1 and Hypothesis 3. Missing data were treated as missing at random, and if a data point from a session was missing, the average for a recording type included all non-missing data. For example, for P2204, "baseline sessions" consisted of one data point (one complete, two missing). While this was not an ideal way to do the analysis, it allowed for comparisons across participants (although some caution should be used in interpretation). Hypothesis 1 The assumption of sphericity was met for all variables for the Part 2.1 and 2.2 analyses. Therefore, no corrections were needed. The results of the mixed ANOVA are reported below. Due to the high correlation between F0 and pitch for accelerometer signals, pitch was used in place of F0 for this analysis.
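As a point of reference, the design just described (two within-subjects factors, time and measure, with gender as the between-subjects factor) could be set up as follows in MATLAB's repeated-measures framework. This is only a sketch of one possible implementation, not the procedure actually used in the study; the file, table, and column names are hypothetical.

% Illustrative setup of the Hypothesis 1 mixed ANOVA (within: time [baseline, feedback]
% x measure [dB, phonation time, pitch]; between: gender). Not the study's analysis code.
T = readtable('h1_averages.csv');   % hypothetical file: one row per participant, with columns
                                    % gender, base_dB, base_PT, base_pitch, fb_dB, fb_PT, fb_pitch
T.gender = categorical(T.gender);

% Within-subjects design: one row per response column, in the same order as the table
within = table( ...
    categorical({'base';'base';'base';'fb';'fb';'fb'}), ...
    categorical({'dB';'PT';'pitch';'dB';'PT';'pitch'}), ...
    'VariableNames', {'Time','Measure'});

% The response range below assumes the six columns appear contiguously in this order
rm = fitrm(T, 'base_dB-fb_pitch ~ gender', 'WithinDesign', within);

mauchly(rm)                               % sphericity check, as reported in the text
ranova(rm, 'WithinModel', 'Time*Measure') % within-subjects effects and interactions
anova(rm)                                 % between-subjects (gender) effect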
Part 2.1 Main Effects Participant-specific results can be seen in Appendices U, V, and W for phonation time, vocal intensity, and pitch, respectively (Figures 146-148). The results of the mixed ANOVA showed a significant main effect of measure, F(2,22) = 815.148, p < .001, partial η² = .987. Pairwise comparisons demonstrated that all three measures were statistically significantly different at p < .001. Phonation time (M = 78.97, SD = 3.641) was significantly higher than dB level (M = 60.845, SD = 5.033), and both measures were significantly higher than pitch in semitones (M = 41.952, SD = 5.510). However, as with the comparison of the questionnaire scores, this comparison adds little to the overall question of the study. These measures would be expected to have different means because they quantify distinct attributes of the signal. There was no significant main effect of time, F(1,11) = 3.508, p = .088, partial η² = .242. In addition, there was a non-significant effect of gender, F(1,11) = .761, p = .402, partial η² = .065. Part 2.1 Interactions There was a significant interaction of measure and gender, F(2,22) = 34.677, p < .001, partial η² = .759. Post hoc testing with univariate ANOVAs (all results for each measure collapsed across time) found that pitch was significantly lower for men (M = 36.598, SD = 2.162) than women (M = 46.618, SD = 2.159), F(1,24) = 139.059, p < .001, partial η² = .853; even with alpha correction for multiple comparisons (0.05/3 = 0.017), this finding remained significant. This finding is consistent with prior literature that women speak at a higher pitch (fundamental frequency) than men (Titze, 2000). Statistical analysis also found that dB level was significantly higher for men (M = 62.952, SD = 4.749) than women (M = 59.093, SD = 4.748), F(1,24) = 4.265, p < .05, partial η² = .151, but with alpha correction for multiple comparisons (0.05/3 = 0.017), this finding was non-significant. This finding of increased dB level for men is somewhat different from the literature, where a prior dosimetry study found that female teachers had a higher average dB level than male teachers (Titze & Hunter, 2015). However, the tasks were different: the prior study took place in an actual workplace environment, whereas this task involved artificially increasing dB level. Finally, statistical analysis found that phonation time was not significantly higher for men (M = 80.462, SD = 3.471) than women (M = 77.658, SD = 3.472), although this finding approached significance, F(1,24) = 4.215, p = .051, partial η² = .149. The phonation times found in this study were elevated relative to those reported in the literature, where average speaking levels are less than 30% and even choral singing is not greater than 40% (e.g., Schloneger, 2014; Szabo Portela, Hammarberg, & Södersten, 2013). Further investigation into this discrepancy is warranted. There were no significant interactions of time and gender, F(1,11) = .067, p > .05, partial η² = .006, or time and measure, F(2,22) = 1.200, p = .320, partial η² = .098. Finally, there was a non-significant three-way interaction (time, measure, gender), F(2,22) = 1.196, p > .05, partial η² = .098. Qualitative Analyses In addition to the statistical analysis, qualitative analysis was completed to see where trends may be occurring that warrant further investigation in future studies with a larger number of participants. The following figures show the average results by session and by recording type (baseline, feedback) by gender, with standard error bars to indicate the variability within genders.
Phonation time Average phonation time (in percent of total time) can be seen by session (Figure 16) and by recording type (Figure 17). Again, these values are elevated relative to prior literature, and absolute values should be interpreted with caution. However, the trends (Figure 16) demonstrate an initial increase in phonation time across the three baseline sessions and a lesser degree of change during the feedback recordings. While the hypothesized direction for phonation time was a decrease with feedback, Figure 17 demonstrates a small decrease for men but an increase for women. Future work should look into what cues may be given through feedback to encourage reduction of phonation time. Figure 16: Average phonation time for each session by gender for Part 2.1. Figure 17: Average phonation time by recording type for Part 2.1. Results are reported separately by gender. Vocal Intensity Average vocal intensity can be seen by session (Figure 18) and by recording type (Figure 19). For baseline sessions (Figure 18), both genders showed similar average intensity for the first and third sessions, but a deviation from that average in the second session (with an increase for men and a decrease for women). While the hypothesized direction for vocal intensity was a decrease with feedback, Figure 19 demonstrates an increase for both men and women. Prior results discussed in Chapter 4: Part 2 Results and Discussion highlighted the fact that some participants reported being "quiet talkers," and so this increase in dB level was seen as a positive trend for these speakers. This again highlights the need for further investigation of occupational voice user goals, which should be accounted for when assessing outcome measures. Figure 18: Average vocal intensity for each session by gender for Part 2.1. Figure 19: Average vocal intensity by recording type for Part 2.1. Results are reported separately by gender. Pitch Average pitch (in semitones) can be seen by session (Figure 20) and by recording type (Figure 21). The trends (Figure 20) demonstrate an initial decrease in pitch across the three baseline sessions for men, with women showing a more stable pitch across sessions. While the hypothesized direction for pitch was a decrease with feedback, Figure 21 demonstrates a small decrease for men (from a mean of 35.48 to a mean of 35.33) but a small increase for women (from a mean of 45.80 to a mean of 45.96). Interestingly, pitch was the one outcome measure on which participants did not receive feedback. While an overall decrease in pitch (fundamental frequency) can reduce one's risk of voice disorders, there is evidence that pitch changes with vocal fatigue, but there is controversy about the direction of the change (Solomon, 2008). Therefore, future work should explore whether pitch is appropriate feedback to give for PVM, and if so, how it should be presented to occupational voice users. Figure 20: Average pitch for each session by gender for Part 2.1. Figure 21: Average pitch by recording type for Part 2.1. Results are reported separately by gender. Part 2.2 Main Effects Participant-specific results can be seen in Appendices X, Y, and Z for phonation time, vocal intensity, and pitch, respectively (Figures 149-151). The results of the mixed ANOVA found a significant main effect of measure, F(2,6) = 20.116, p = .002, partial η² = .870. The significant difference was found between dB level (M = 65.000, SD = 4.385) and pitch (M = 41.000, SD = 4.847).
Vocal Intensity
Average vocal intensity can be seen by session (Figure 18) and by recording type (Figure 19). For the baseline sessions (Figure 18), both genders showed similar average intensity for the first and third sessions, but a deviation from that average in the second session (with an increase for men and a decrease for women). While the hypothesized direction for vocal intensity was a decrease with feedback, Figure 19 demonstrates an increase for both men and women. Prior results discussed in Chapter 4: Part 2 Results and Discussion highlighted the fact that some participants reported being "quiet talkers," and so this increase in dB level was seen as a positive trend for these speakers. This again highlights the need for further investigation of occupational voice user goals, which should be accounted for when assessing outcome measures.
Figure 18: Average vocal intensity for each session by gender for Part 2.1.
Figure 19: Average vocal intensity by recording type for Part 2.1. Results are reported separately by gender.
Pitch
Average pitch (in semitones) can be seen by session (Figure 20) and by recording type (Figure 21). The trends (Figure 20) demonstrate an initial decrease in pitch across the three baseline sessions for men, with women showing a more stable pitch across sessions. While the hypothesized direction for pitch was a decrease with feedback, Figure 21 demonstrates a small decrease for men (from a mean of 35.48 to a mean of 35.33), but a small increase for women (from a mean of 45.80 to a mean of 45.96). One interesting point was that this was one outcome measure that participants did not receive feedback on. While an overall decrease in pitch (fundamental frequency) can reduce one's risk of voice disorders, there is evidence that pitch changes with vocal fatigue, but there is controversy about the direction of pitch change (Solomon, 2008). Therefore, future work should explore whether pitch is appropriate feedback to give for PVM, and if so, how it should be presented to occupational voice users.
Figure 20: Average pitch for each session by gender for Part 2.1.
Figure 21: Average pitch by recording type for Part 2.1. Results are reported separately by gender.
Part 2.2 Main Effects
Participant-specific results can be seen in Appendices X, Y, and Z for phonation time, vocal intensity, and pitch, respectively (Figures 149-151). The results of the mixed ANOVA found a significant main effect of measure, F(2,6) = 20.116, p = .002, ηp² = .870. The significant difference was found between dB level (M = 65.000, SD = 4.385) and pitch (M = 41.000, SD = 4.847). Phonation time (M = 63.421, SD = 7.823) was not significantly different from either other measure. Again, this finding, though significant, did not provide additional insight because these differences were expected between measures. There was a non-significant main effect of time, F(1,3) = .123, p = .749, ηp² = .039; gender was also non-significant, F(1,3) = 2.199, p = .235, ηp² = .423.
Part 2.2 Interactions
No statistically significant interactions were found. The interaction of measure and gender, which was significant in Part 2.1, was not significant here, F(2,6) = .769, p = .504, ηp² = .204. In addition, there were no significant interactions of time and gender, F(1,3) = .093, ηp² = .030, or time and measure, F(2,6) = .527, ηp² = .149. Finally, there was a non-significant three-way interaction (time, measure, gender), F(2,6) = 2.767, ηp² = .480.
Qualitative Analysis
In addition to the statistical analysis, qualitative analysis was completed to see where trends may be occurring that warrant further investigation in future studies with a larger number of participants. The following figures show the average results by session and by recording type (baseline, feedback) by gender, with standard error bars to indicate the variability within genders. Please note that standard error bars are not present for session 8 for the men. This is because only one male instructor was recorded for the eighth session, so no error bars were generated.
Phonation time
Average phonation time (in percent total time) can be seen by session (Figure 22) and by recording type (Figure 23). Again, these values are elevated relative to prior literature, and absolute values should be interpreted with caution. Compared with Part 2.1, the trends (Figure 22) demonstrate greater variation session by session, with greater variation among men than women. Greater variation was expected, given the difference in nature between the VLTs in Part 2.1 and Part 2.2. While the hypothesized direction for phonation time was a decrease with feedback, Figure 23 demonstrates a small decrease for men but an increase for women, the same as in Part 2.1. However, the results from Part 2.2 should be interpreted with greater caution. While a small segment (10 minutes) of the lecture was chosen where the speaker was mostly talking, these results may be influenced by segment selection and by factors outside the experimenter's control, such as the lecture content and disruptions during the lecture.
Figure 22: Average phonation time for each session by gender for Part 2.2.
Figure 23: Average phonation time by recording type for Part 2.2. Results are reported separately by gender.
Vocal Intensity
Average vocal intensity can be seen by session (Figure 24) and by recording type (Figure 25). Overall, both genders experienced a decrease in vocal intensity over the course of the study (Figure 25), but there was less of a trend for the men across the eight sessions (Figure 24).
Figure 24: Average vocal intensity for each session by gender for Part 2.2.
This trend is in the hypothesized direction for reducing the risk of voice disorders. This measure should be monitored with a larger participant pool to see if the trend becomes statistically significant.
Figure 25: Average vocal intensity by recording type for Part 2.2. Results are reported separately by gender.
Pitch
Average pitch (in semitones) can be seen by session (Figure 26) and by recording type (Figure 27). In contrast with Part 2.1, the trends (Figure 26) show greater variability across sessions for women than men.
While the hypothesized direction for pitch was a decrease with feedback, Figure 27 demonstrates similar trends to Part 2.1: a small decrease for men (from a mean of 36.53 to a mean of 36.26), but an increase for women (from a mean of 45.02 to a mean of 46.00). Therefore, future work should explore this measure with a greater number of participants to see if the trends become statistically significant. In addition, future work should explore whether pitch is appropriate feedback to give for PVM, and if so, how it should be presented to occupational voice users.
Figure 26: Average pitch for each session by gender for Part 2.2.
Figure 27: Average pitch by recording type for Part 2.2. Results are reported separately by gender.
Hypothesis 2
Part 2.1 Results
Stepwise linear regression with all participants did not yield a result (no variables were entered into the equation). However, because P2101 never reported a change in fatigue (the rating was always 1), this participant was removed from the analysis, and it was run again with 12 participants' data. The results of Pearson correlations indicate that pitch strength and fatigue are correlated at r = 0.166, while first moment specific loudness is correlated with fatigue at r = 0.161. The results of the regression indicated that a single predictor, pitch strength, accounted for 2.8% of the variance (R² = 0.028, F(1,185) = 5.261, p = .023). It was found that pitch strength decreased with increasing vocal fatigue (-4.182, p = .023).
Qualitative Analysis
In addition to the statistical analysis, qualitative analysis was completed to see where trends may be occurring that warrant further investigation in future studies with a larger number of participants. The following figures show the average results by session and by recording type (baseline, feedback) by gender, with standard error bars to indicate the variability within genders. The summary results can be seen in Figures 28-31. Figures 28 and 29 show separate linear regressions for pitch strength and first moment specific loudness, respectively, including P2101. Figures 30 and 31 show separate linear regressions for pitch strength and first moment specific loudness, respectively, excluding P2101. It is of interest to note that there is a trend in the relationship of pitch strength and fatigue that is consistent with the hypothesis (increasing fatigue will be related to decreasing pitch strength). However, the trend for first moment specific loudness was not in the hypothesized direction: increasing first moment specific loudness is correlated with increasing vocal strain, but as fatigue increased, this measure decreased. In all cases, there is still a great amount of unexplained variance in scores. This is not unlike prior findings for rating scales in voice quality work. Kreiman, Gerratt, Kempster, Erman, and Berke (1993) described the high amount of variance in voice quality ratings, especially for those that were not at either end of the rating scale. There are a number of identified factors that play a role in this variation, including random errors, such as attention to task, and criterion (systematic) errors, such as how one uses the scale (Shrivastav, Sapienza, & Nandur, 2005). Shrivastav et al. (2005) found that these errors can be corrected by standardizing the ratings and averaging multiple ratings.
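As a concrete illustration of the Hypothesis 2 analysis described above, the MATLAB sketch below regresses vocal fatigue ratings on pitch strength and shows the kind of within-participant standardization suggested by Shrivastav et al. (2005). The file name, table layout, and variable names are assumptions; this is not the dissertation's analysis script.

% Hypothetical table with one row per rating: Participant, PitchStrength, Fatigue.
T = readtable('part21_fatigue_measures.csv');      % assumed file name
T(strcmp(T.Participant, 'P2101'), :) = [];          % drop the participant who reported no fatigue change

% Simple linear regression of fatigue on pitch strength (cf. R^2 = 0.028, p = .023).
mdl = fitlm(T, 'Fatigue ~ PitchStrength');
disp(mdl.Rsquared.Ordinary);
disp(mdl.Coefficients);

% Optional clean-up in the spirit of Shrivastav et al. (2005): z-score each
% participant's ratings to reduce criterion (scale-use) differences.
G  = findgroups(T.Participant);
zF = nan(height(T), 1);
for g = 1:max(G)
    idx     = (G == g);
    zF(idx) = zscore(T.Fatigue(idx));
end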
To better understand the relationships between pitch strength and fatigue, and between first moment specific loudness and fatigue, an experimental design that allows for multiple ratings of the voice at the same moment, with later averaging and standardization in the analysis, might better determine the true relationships between these measures.
Figure 28: Linear regression of vocal fatigue rating and pitch strength including P2101.
Figure 29: Linear regression of vocal fatigue rating and first moment specific loudness including P2101.
Figure 30: Linear regression of vocal fatigue rating and pitch strength excluding P2101.
Figure 31: Linear regression of vocal fatigue rating and first moment specific loudness excluding P2101.
Part 2.2 Results
Stepwise linear regression with all participants did not yield a result (no variables were entered into the equation). However, the results of Pearson correlations indicate that pitch strength and fatigue are correlated at r = 0.019, while first moment specific loudness is correlated with fatigue at r = 0.188. The lack of entry into the regression may be due to a smaller number of data points than in Part 2.1 (where pitch strength was entered into the regression at r = 0.166), and therefore these measures should be re-examined with a greater number of data points in future research.
Qualitative Analysis
In addition to the statistical analysis, qualitative analysis was completed to see where trends may be occurring that warrant further investigation in future studies with a larger number of participants. The following figures show the average results by session and by recording type (baseline, feedback) by gender, with standard error bars to indicate the variability within genders. Please note that standard error bars are not present for session 8 for the men. This is because only one male instructor was recorded for the eighth session, so no error bars were generated. The summary results can be seen in Figures 32 and 33, which show separate linear regressions for pitch strength and first moment specific loudness, respectively. It is of interest to note that the trend for the relationship of pitch strength and fatigue is the opposite of that in Part 2.1, and is therefore in the opposite direction of the hypothesis. First moment specific loudness, on the other hand, shows almost no trend and is relatively consistent across all vocal fatigue ratings. Again, there is still a great amount of unexplained variance in scores. This is consistent with prior literature for subjective ratings. An experimental design with repeated ratings that can later be standardized in analysis would likely give a more consistent and clear picture of the actual relationships of pitch strength and first moment specific loudness with vocal fatigue.
Figure 32: Linear regression of vocal fatigue rating and pitch strength.
Figure 33: Linear regression of vocal fatigue rating and first moment specific loudness.
Hypothesis 3
Part 2.1 Main Effects
Participant-specific results can be seen in Appendices AA and AB (Figures 152 & 153), respectively. No significant main effects were found. The results of the mixed ANOVA found no significant main effect of time, F(1,11) = .415, p = .532, ηp² = .036, and no significant main effect of measure, F(1,11) = .759, p = .402, ηp² = .065. There was no between-subjects effect of gender, F(1,11) = .093, ηp² = .008.
Part 2.1 Interactions
No significant interactions were found.
There was a non-significant interaction of time and measure, F(1,11) = .298, ηp² = .026. There was also no significant interaction of measure and gender, F(1,11) = .141, ηp² = .013, or of time and gender, F(1,11) = .328, ηp² = .029. Finally, there was no significant three-way interaction (time, measure, gender), F(1,11) = .590, ηp² = .051.
Qualitative Analysis
In addition to the statistical analysis, qualitative analysis was completed to see where trends may be occurring that warrant further investigation in future studies with a larger number of participants. The following figures show the average results by session and by recording type (baseline, feedback) by gender, with standard error bars to indicate the variability within genders.
Pitch Strength
The summary results can be seen in Figures 34 and 35. Average pitch strength can be seen by session (Figure 34) and by recording type (Figure 35). It was hypothesized that change scores (the difference between the average pitch strength values from sustained /a/ vowels produced before and after the VLT) would decrease as a result of the feedback. This trend was seen for men (Figure 35), whose absolute change decreased from 0.02 to 0.01. However, the actual change was in the opposite direction between recording types: in baseline recordings, men had an increase in pitch strength post-VLT. This was not anticipated, but may be related to a warm-up effect (Vilkman et al., 1999). During feedback recordings, the hypothesized direction was seen: a decrease in pitch strength from pre- to post-VLT. On the other hand, women displayed the opposite trend of the men. First, the absolute value of their change became greater from baseline to feedback, 0.02 to 0.03. In addition, during the baseline recordings, they showed the expected trend of a decrease in pitch strength following the VLT. During the feedback recordings, the opposite occurred. It is possible that women were able to use this task as more of a warm-up in these later sessions; however, further study should be done.
Figure 34: Average change scores (pre – post) for pitch strength by session in Part 2.1. Positive values indicate greater pitch strength before the vocal loading task.
Figure 35: Average change scores (pre – post) for pitch strength by recording type in Part 2.1. Positive values indicate greater pitch strength before the vocal loading task.
Overall, these findings suggest some small change as a result of the feedback, but the direction of the results is mixed between the genders. Further study with a greater number of participants may be able to determine whether this effect holds for a more generalizable population.
First Moment Specific Loudness
The summary results can be seen in Figures 36 and 37. Average first moment specific loudness can be seen by session (Figure 36) and by recording type (Figure 37). It was hypothesized that change scores (the difference between the average first moment specific loudness values from sustained /a/ vowels produced before and after the VLT) would decrease as a result of the feedback. The trends were the opposite of those for pitch strength. This trend was seen for women (Figure 36), whose absolute change decreased from 0.54 to 0.16. However, the actual change was in the opposite direction between recording types: in baseline recordings, women had a decrease in first moment specific loudness post-VLT (indicating less strain post-VLT). However, during feedback recordings, the opposite direction was seen: an increase in first moment specific loudness from pre- to post-VLT, indicated by the negative value.
This finding is inconsistent with the pitch strength findings for women, if both measures are indicators of vocal fatigue. At baseline, they showed decreasing pitch strength and decreasing first moment specific loudness from pre- to post-VLT, potentially indicating increasing and decreasing vocal fatigue, respectively. On the other hand, men displayed the opposite trend. First, the absolute value of their change became greater from baseline to feedback, 0.02 to 0.34. While this was not in the hypothesized direction of decreasing the change from pre- to post-VLT, this finding suggests a greater decrease in first moment specific loudness (strain) after feedback. This would be consistent with a positive change as a result of the feedback.
Figure 36: Average change scores (pre – post) for first moment specific loudness by session in Part 2.1. Positive values indicate greater first moment specific loudness before the vocal loading task.
Figure 37: Average change scores (pre – post) for first moment specific loudness by recording type in Part 2.1. Positive values indicate greater first moment specific loudness before the vocal loading task.
Overall, these findings suggest some small changes as a result of the feedback, but the direction of the results is mixed between the genders. However, due to the small, non-statistically significant trends, this hypothesis needs to be substantiated with additional testing with a larger number of participants to determine if one or both measures are indicators of vocal fatigue (possibly associated with different fatigue factors).
Part 2.2 Main Effects
Participant-specific results can be seen in Appendices AC and AD (Figures 154 & 155), respectively. As in Part 2.1, there were no statistically significant main effects. The results of the mixed ANOVA found no significant main effect of time, F(1,3) = .014, ηp² = .005, and no significant main effect of measure, F(1,3) = 5.392, ηp² = .643. There was no between-subjects effect of gender, F(1,3) = 1.201, ηp² = .286.
Part 2.2 Interactions
As in Part 2.1, there were no statistically significant interactions. The results of the mixed ANOVA found no significant interaction of time and measure, no significant interaction of measure and gender, F(1,3) = .397, p = .571, ηp² = .117, and no significant interaction of time and gender, F(1,3) = .813, p = .434, ηp² = .213. Finally, there was no significant three-way interaction, F(1,3) = 1.161, p = .360, ηp² = .279.
Qualitative Analysis
In addition to the statistical analysis, qualitative analysis was completed to see where trends may be occurring that warrant further investigation in future studies with a larger number of participants. The following figures show the average results by session and by recording type (baseline, feedback) by gender, with standard error bars to indicate the variability within genders. Please note that standard error bars are not present for session 8 for the men. This is because only one male instructor was recorded for the eighth session, so no error bars were generated.
Pitch Strength
The summary results can be seen in Figures 38 and 39, which show the average change scores for pitch strength by session and by recording type, respectively. Unlike Part 2.1, where opposite trends were found for men and women, the same basic trend emerges when looking at Figure 39: for both genders, there is a decrease in the change from pre- to post-VLT. This trend, although not statistically significant, follows the hypothesized direction.
Figure 38: Average change scores (pre – post) for pitch strength by session in Part 2.2. Positive values indicate greater pitch strength before the vocal loading task.
In addition, this decrease indicates less of a decrease in pitch strength over the course of the VLT, leading to better voice quality (less fatigue) post-VLT. This trend is also seen in the session-to-session comparison in Figure 38. For men, the decrease appears more gradual. For women, there is a steep decline between feedback sessions 2 and 3.
Figure 39: Average change scores (pre – post) for pitch strength by recording type in Part 2.2. Positive values indicate greater pitch strength before the vocal loading task.
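A minimal MATLAB sketch of the change-score computation used throughout Hypothesis 3 is shown below: the measure is averaged over the sustained /a/ vowels produced before and after the VLT, and the change score is pre minus post, so positive values indicate higher values before the loading task. The matrix layout and the baseline/feedback split are assumptions for illustration.

% prePS and postPS: assumed nSessions x nParticipants matrices of the average
% measure value from the /a/ vowels recorded before and after each VLT.
changeScore = prePS - postPS;                   % positive = higher value pre-VLT

% Average change scores by recording type (assuming the first three sessions
% were the baseline sessions, as in Part 2.1).
nBaseline      = 3;
baselineChange = mean(changeScore(1:nBaseline, :), 1);
feedbackChange = mean(changeScore(nBaseline+1:end, :), 1);

% Session-by-session averages across participants (as plotted in the figures).
bySession = mean(changeScore, 2);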
First Moment Specific Loudness
The summary results can be seen in Figures 40 and 41, which show the average change scores for first moment specific loudness by session and by recording type, respectively. Unlike pitch strength in Part 2.2, where both genders showed the same trend, opposite trends were found for men and women. Women followed the hypothesized trend of a decrease in the change in first moment specific loudness (-0.17 to 0.03), but men showed the opposite trend (-0.26 to -0.37). While a negative value indicates an increase in first moment specific loudness (strain) post-VLT, the switch to an average positive value for women in the feedback recordings indicates a reduction in strain post-VLT.
Figure 40: Average change scores (pre – post) for first moment specific loudness by session in Part 2.2. Positive values indicate greater first moment specific loudness before the vocal loading task.
Figure 41: Average change scores (pre – post) for first moment specific loudness by recording type in Part 2.2. Positive values indicate greater first moment specific loudness before the vocal loading task.
The pitch strength and first moment specific loudness findings for women in Part 2.2 followed the hypothesized direction, suggesting that both of these may be possible outcome measures for female occupational voice users. However, with a small group of three individuals, this trend must be interpreted with caution. These findings do suggest that further exploration of these measures as outcome measures may be appropriate for female occupational voice users.
Summary
Correlations between accelerometer and audio recordings for the auditory measures (pitch, pitch strength, first moment specific loudness) were conducted because these measures have previously been validated only for use with audio recordings (Kopf et al., 2013; Shrivastav & Camacho, 2010; Shrivastav et al., 2012). A high correlation (0.901) indicated that pitch estimates from Auditory-SWIPE′ are appropriate to use with both accelerometer and audio recordings. This measure also correlated with fundamental frequency estimates at 0.961. However, these recordings were of individuals with normal, Type 1 voices, and the predicted correlations for Type 2 and Type 3 voices would be reduced (Shrivastav et al., 2014). The other two measures are not appropriate to analyze from accelerometer recordings due to low correlations. For Hypothesis 1, Part 2.1 found a statistically significant difference in pitch between men and women, with men having a lower pitch. These findings are consistent with the previous literature. Both Part 2.1 and Part 2.2 found a statistically significant difference of measure, which was also anticipated because the measures address different constructs and have different scales.
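The pitch values summarized above are expressed in semitones, a logarithmic transform of fundamental frequency. The MATLAB sketch below shows the conversion; the reference frequency is an assumption for illustration, and changing it shifts all semitone values equally without affecting differences between talkers.

% Semitones from fundamental frequency: st = 12 * log2(f0 / fRef),
% so a 12-semitone difference corresponds to one octave.
fRef = 13.75;                          % assumed reference frequency in Hz
f0   = [110 200];                      % e.g., typical male and female f0 values
st   = 12 * log2(f0 ./ fRef);
fprintf('f0 = %6.1f Hz -> %5.1f semitones\n', [f0; st]);

% A 10-semitone difference between talkers corresponds to a frequency ratio of
ratio = 2^(10/12);                     % ~1.78, slightly less than an octave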
While no other statistically significant differences were found, similar trends were seen across the two phases of the study for phonation time and pitch by gender. Men in both phases showed a decrease in phonation time and pitch with feedback, both in the hypothesized direction. On the other hand, women in both phases showed increases in phonation time and pitch with feedback. This finding suggests that there may be a gender-specific response to the feedback that should be further examined with a larger pool of participants. In addition, feedback may need to be tailored differently for women to encourage the same risk-reducing changes found in men. Finally, participants in Part 2.1 showed an overall increase in dB level in the feedback recordings, whereas participants in Part 2.2 showed an overall decrease (the hypothesized direction). This discrepancy may be due to the difference in VLTs, with Part 2.1 participants instructed to speak at a louder than normal volume. For Hypothesis 2, there was a significant, although small, finding that pitch strength is a predictor of perceived vocal fatigue. While most participants reported some increase in fatigue, one participant (P2101) reported no fatigue at any point in the study, and participant-rated fatigue often increased by only 1 or 2 points on a 10-point scale after the VLT. However, looking at the data, one can see the high variability of associations between ratings and pitch strength. This is not exclusive to ratings of vocal fatigue, and has been observed, for example, in clinician ratings of voice quality (Kreiman et al., 1993). Possible ways to reduce this variability, and therefore obtain a more robust rating to compare with objective measures, would be to standardize and average multiple ratings (Shrivastav et al., 2005), which would need to be done systematically in a future study. Finally, for Hypothesis 3, there was no statistically significant change in pitch strength or first moment specific loudness over time with the VLTs. However, when looking at the trends, women in Part 2.2 (current occupational voice users) showed the expected trends for both pitch strength and first moment specific loudness, with men also showing the expected trend for pitch strength only. In Part 2.1, men did show a reduction in pitch strength change scores in the feedback recordings, but they showed an increase in pitch strength (an increase in quality post-VLT) in the baseline recordings and the opposite in the feedback recordings. While women had an increase in the amount of change in the feedback recordings, this change demonstrated an increase in quality post-VLT with feedback, whereas these individuals had a decrease in quality post-VLT in the baseline recordings. The exact opposite findings were discovered for first moment specific loudness. Based on the Part 2.1 findings, the direction and magnitude of change did not move together in the hypothesized direction.
Conclusions
Overall, the statistical findings partially support the current hypotheses. Correlational analyses support the use of pitch from accelerometer recordings, but do not support calculations of pitch strength and first moment specific loudness. In addition, pitch strength was found to be a predictor of vocal fatigue in Part 2.1. Qualitative analysis of the data provided a richer context for exploring the trends. These trends suggested that men decreased phonation time and pitch in response to feedback, both of which reduce the risk of developing voice disorders.
Findings from Part 2.2 participants indicated that, regardless of gender, individuals reduced vocal intensity with feedback, again reducing voice disorder risk. In addition, Part 2.2 participants demonstrated a decrease in pitch strength change scores from pre- to post-VLT, indicating less change (possibly less fatigue) as a result of feedback. In addition, all of these average change scores were positive (pitch strength decreasing over the VLT), and this reduction in change indicates less of a decline in voice quality with feedback. Further analysis with a larger set of participants should be completed to determine whether any of these findings may be statistically significant.
CHAPTER 6: Study Discussion and Conclusions
Summary of Findings
The goal of the current study was to determine whether Preventative Voice Monitoring (PVM), using a dosimeter with task-based feedback, impacts vocal behavior in occupational voice users and future occupational voice users. Three study aims were defined to design the feedback (Aim 1) and to assess its ability to influence both behavior change measures (Aim 2) and voice production (Aim 3). To address the aims, the study was divided into two parts: the creation of the feedback (Part 1) and testing of the feedback (Part 2). Summarized findings are included below.
Aim 1: At the end of Part 1, a layered display structure for four measures (loudness, pauses, quality, strain) was developed, with three displays offering varying views of the data. The first display was an "at a glance" version, the second display allowed comparison across multiple days, and the third display allowed a more detailed view of an individual day's data. Participants in Part 2 suggested further improvements to future iterations of the feedback, including better defined measures, suggested behavior changes based on feedback results, and the ability to provide further context for later reflection (e.g., links to a calendar and the ability to add personal notes on the voice).
Aim 2: A statistically significant increase in questionnaire scores was found after PVM (both at baseline and with feedback). These increases were positive for RTC and S-E, where an increase in scores indicates positive behavior change. For the VFI, only an increase in Factor 3 demonstrates an improvement. However, increases in Factors 1 and 2 may indicate increased awareness of fatigue, which may later lead to increased desire for behavior change.
Aim 3: There were some statistically significant findings related to the three hypotheses. For Hypothesis 1 (phonation time, pitch, and/or vocal intensity will decrease with PVM), there was a main effect of measure in Parts 2.1 and 2.2, indicating that the measures were quantifying three distinct attributes. There was an interaction of gender and measure in Part 2.1, with post hoc testing revealing that men had a lower average pitch than women (consistent with prior literature). For Hypothesis 2 (pitch strength and first moment specific loudness can predict vocal fatigue), stepwise linear regression found a small but statistically significant result that pitch strength was able to account for 2.8% of the variance in vocal fatigue self-ratings. Finally, for Hypothesis 3 (the change in pitch strength and first moment specific loudness from before to after the VLT will decrease with feedback), there were no statistically significant results. Despite few statistically significant results for Aim 3, there were a number of trends identified that warrant future exploration.
For Hypothesis 1, average phonation time with feedback increased for women in both parts of the study, but decreased for men in both parts (hypothesized direction). For average vocal intensity, both genders in Part 2.1 showed an average increase with feedback, while both genders in Part 2.2 showed an average decrease (hypothesized direction). All women showed an average increase in pitch with feedback, and men showed a decrease (hypothesized direction). For Hypothesis 2, participants in Part 2.1 showed an average decrease in both pitch strength (hypothesized direction) and first moment specific loudness with increasing vocal fatigue. On the other hand, participants in Part 2.2 showed an increase in pitch strength and virtually no change in first moment specific loudness with increasing vocal fatigue. For Hypothesis 3, Part 2.1 pitch strength trends demonstrated an interaction of gender and time. While only men decreased the change in pitch strength with feedback (hypothesized direction), men changed from having higher pitch strength post-VLT at baseline to higher pitch strength pre-VLT with feedback, and women had the opposite trend. In Part 2.2, both genders saw an average decrease in the change in pitch strength with feedback (hypothesized direction), and all change scores were positive (indicating that pre-VLT scores were higher than post-VLT scores; hypothesized direction). Women in both phases of Part 2 showed reduced pre-VLT to post-VLT change in first moment specific loudness with feedback (hypothesized direction), while men showed increased change. However, opposite trends for the direction of change were seen between the genders in both phases. In Part 2.1, women saw a decrease in first moment specific loudness at baseline and an increase with feedback (hypothesized direction). Men saw the opposite trend from the women. In Part 2.2, women saw an increase in first moment specific loudness at baseline, but a decrease with feedback. Again, men showed the opposite trend from the women.
Study Limitations
While some statistically significant results were found, it is important to interpret the significance with caution due to the small number of participants in each group, especially the group in Part 2.2. However, despite these small numbers, some large effect sizes were found, which strengthens the likelihood of these findings remaining consistent in future studies. Future work in this area should include larger groups in order to increase the confidence in the statistical results. In addition to statistical limitations, other limitations include the VLTs. In Part 2.1, the task was artificial (reading aloud to an imaginary classroom), and a number of reported behavioral changes were attributed to the task itself. Another important point is that the participants in Part 2.1 were predominantly students who had limited experience as occupational voice users, and this was reflected in many of their conjectures about how feedback would be different for the target population. In Part 2.2, the VLT was an actual job-related task (course instruction) performed by occupational voice users. However, this was also limited by factors that are inherent to doing in situ research. First, there was great variability between the courses taught by the instructors in length, content, and class size. This variability may have contributed to some of the non-statistically significant findings, in addition to the small sample size. Furthermore, these individuals were monitored for only a short portion of their work day.
Monitoring these individuals in multiple settings and multiple contexts is important for future assessment of the feedback system, to assess its flexibility and other user needs that may arise in different contexts. One participant, P2109, reported shifting his focus from observing internal cues about his voice (baseline recordings) to ignoring internal cues in favor of external cues (the feedback). While only one participant reported this change, it is interesting to consider in light of further PVM development. In voice therapy, clinicians not only focus on teaching patients to produce a better voice, but also on teaching patients to focus on internal cueing of better voice production (Ramig & Verdolini, 1998). The idea behind PVM is to increase awareness of the voice and provide a way for individuals to monitor the voice over time to try to optimize production before the need for treatment arises. Therefore, future studies need to explore the best combination of objective and subjective information for PVM to support self-reflection on voice production. Another limitation of the current study is its organization. Based on participant interview data, participants spent time figuring out how measures were related and what measures meant, and only some attempted to change their voice based on the feedback. In future studies, a trial period with the feedback to see how it works may allow more participants to focus on voice behavior change.
Feedback Issues
Sometimes recording or analysis issues led to incomplete feedback being presented to participants for a particular session. However, for most participants in Part 2.1, only one or two sessions had incomplete feedback (only one or two of the three measures presented). For P2111, this unfortunately happened on multiple occasions due to an undetermined issue with the recorder: the initial and final tasks were usually usable, but the VLT information from the accelerometer was unreliable. Because of the issues with many of her recordings, P2111 received limited feedback. For P2106 and P2109, part of one day's recording was cut off at the end, but this did not happen for other participants recorded using the same equipment on the same day before and between these participants (the issue was not discovered until all of the day's recordings were completed). In Part 2.2, two participants had recurrent recording issues. P2202 nearly always had background noise of approximately 78 Hz present through the majority of the accelerometer recordings, which is believed to be due to interference from other electronic devices in the classroom itself. The initial and final tasks were usually clean (performed in the laboratory), but the classroom portion of the recording usually contained the noise. Therefore, this participant received feedback on quality from the initial and final Rainbow Passage readings rather than from the classroom recording (for post hoc analyses, we used an 80 Hz high-pass filter to remove the extraneous noise). For P2204, the first two classroom recordings were lost due to a loose battery issue that led to intermittent loss of recordings. One later classroom recording was lost due to an unknown recorder error.
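A minimal MATLAB sketch of the post hoc clean-up mentioned above: a high-pass filter with an 80 Hz cutoff to suppress the approximately 78 Hz interference in the classroom accelerometer recordings. Only the 80 Hz cutoff comes from the text; the file names, filter order, and Butterworth response are assumptions.

% High-pass filter the accelerometer signal at 80 Hz to remove ~78 Hz interference.
[x, fs] = audioread('P2202_classroom_acc.wav');   % hypothetical file name
[b, a]  = butter(4, 80/(fs/2), 'high');           % 4th-order Butterworth high-pass (assumed design)
xClean  = filtfilt(b, a, x);                      % zero-phase filtering (no added delay)
audiowrite('P2202_classroom_acc_hp80.wav', xClean, fs);

Because 80 Hz is close to a low male speaking fundamental frequency, a cutoff this close to f0 should be applied with care; for this participant the interference sat just below the cutoff.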
Another somewhat consistent issue with recordings in Part 2.2 was the presence of long periods of silence during the lecture (due to student presentations, videos, guest speakers, etc.). If one of these periods occurred in the first or last 15 minutes of the course, pitch strength values were not reported in the feedback during these silences. Occasionally, there would also be moments during the lecture where the pitch strength would drop by 0.2 or more for about a minute and then recover to the initial level for the remainder of the segment. Sometimes this was associated with increased silence duration or the presence of temporary noise, but with no appreciable difference heard in the speaker's voice quality perceptually. These data points were also omitted from the feedback. Another potential feedback issue was the choice of measures used in the feedback. While these measures were identified as salient to occupational voice users in Part 1, participants in Part 2 reported some frustration with little to no change in these measures over the course of the study. This limited range may be due to a ceiling effect, because these measures, especially the voice quality measures, are used in differentiating between normal and dysphonic voices. Participants in this study fell within the normal range of voices, so there was potentially little room for change on these measures. Therefore, future research should investigate the measures used in feedback and look at measures that have been shown to differentiate between groups of non-dysphonic voices. Some possible measures to examine in future studies include measures from the long-term average spectrum, the standard deviation of sounded (voiced) durations, and smoothed cepstral peak prominence (Warhurst et al., 2016; Warhurst, McCabe, Yiu, Heard, & Madill, 2013). These measures have been identified in studies differentiating between commercial radio voices and both public radio and control voices in Australia. Finally, biofeedback such as that provided in physical activity tracking instructs users to change behavior in a singular direction (e.g., increasing the number of steps taken). For PVM, the goal of the feedback is to help occupational voice users find a balance in voice use. For example, users should decrease phonation time to give the voice more chance to rest, but still maintain a level of phonation time sufficient for conveying their message. Therefore, future research should evaluate whether there are better means of conveying this type of information to users.
Future Implications
During the study, participants provided a vast amount of helpful input. While there were few changes in objective voice measures across participants, input provided in the interviews will inform future design. Suggestions for future exploration include: better understanding the needs of the array of potential users, including recommendations based on the feedback, adding a notes option to allow better comparison, and providing a social component to share data or potentially just advice. While students had mixed opinions on whether they would use the future app, current occupational voice users expressed interest in the app after appropriate modifications are made.
Conclusions
Overall, the results of this study support the further iteration and eventual design of a mobile application for occupational voice users to prevent voice disorders. The findings from the study indicate that users want an interactive, flexible system with well-defined measures and a layered display structure that provides both objective feedback and suggestions to improve voice use.
This insight from P2114 sums up the purpose of the study nicely: fiI really liked the study and I feel that the more people know about their voices, the less voice problems people will have.fl This sentiment suggests that PVM could fill an important void for occupational voice users who may want to change their voices but do not feel that they have the knowledge or skills to make the changes on their own. Preventative Voice Monitoring could be used to empower occupational voice users and help them avoid future voice disorders. 147 APPENDICES 148 Appendix A: Intake Form (Modified VBALAB form) Gender: M F Age: ______ a. Compared to a normal day, today I feel: a) Much less stress b) Less stress c) The same stress d) More stress e) Much more stress b. Compared to a normal day, today I feel: a) Much less fatigue b) Less fatigue c) The same fatigue d) More fatigue e) Much more fatigue c. I would describe my PRIMARY workplace as: a) Very Quiet b) Quiet c) Neutral d) Noisy e) Very Noisy e. Do I commonly experience symptoms of reflux (or heartburn)? yes no Am I experiencing reflux symptoms today? yes no f. Do I commonly experience symptoms of seasonal allergies? yes no Am I experiencing allergy symptoms today? yes no g. In the past year, I have smoked: never occasionally daily h. On average, I consume ___ caffeinated beverages per day. 0 1 2 3 4+ i. Do you use voice amplification (e.g., microphone) in your job? ___ Yes ___ No 149 It is important that the ethnic and racial makeup of our research participant pool reflects that of the local community. Please indicate which of the following ethnic and racial categories you identify: Ethnic Category Racial Categories Hispanic or Latino American Indian/Alaska Native Not Hispanic or Latino Asian Prefer not to identify Native Hawaiian or other Pacific Islander Black or African American White Prefer not to identify Table 13: Ethnic category. Table 14: Racial category. 1. Are you a native speaker of American English? ___ Yes ___ No 2. Do you have, or have you ever had a voice disorder that required treatment (therapy, surgery)? ___ Yes ___ No 3. Do you have any hearing loss? ___ Yes ___ No 4. Do you have, or have you ever had a speech disorder that required treatment (therapy, surgery)? ___ Yes ___ No 5. Do you use voice amplification (e.g., microphone) in your job? ___ Yes ___ No 150 Appendix B: Initial Semi-Structured Interview (Parts 1 & 2) 1. Tell me about your job (if occupational voice user). a. Do you talk/sing a lot as part of your job? b. How noisy is your work environment? c. Tell me about your voice use during a typical work day. d. About how many hours do you talk per day in your job? 2. Tell me about your future career goals (if participant is a student). 3. Have you ever experienced any problems with your voice? a. Have you ever been sick and lost your voice? b. Do you ever feel like your voice gets tired at the end of the day? When? c. The last time that your voice became tired, what did you do? Was it helpful? d. Have you ever sought help for these problems? i. Who did you turn to? ii. What did you do? e. Do you have any current concerns about your voice? 4. Have you ever heard that certain professionals, such as teachers, are at risk for voice disorders? a. Where did you hear this? b. What did you think? c. Do you know of anyone in your profession or another profession who has had voice disorders? Who? What action did they take? 
151 Appendix C: Part 1 Semi-Structured Interview Questions After all measures of interest are explained, the researcher will ask the following: 1. How important do you think this measure is to understanding your voice on a scale from 1-10, where 1 = not at all and 10 = extremely? Why? 2. How useful do you think this measure could be as daily feedback on a scale from 1-10, where 1 = not at all and 10 = extremely? Why? 3. How confident are you that you could change this measure based on daily feedback on a scale from 1-10, where 1 = not at all and 10 = extremely? Why? For each measure, the following question will be asked: 1. If you were to receive feedback on this voice measure, what should it look like? You can tell me in words or I have a piece of paper you can draw on. The participant will be asked to look at all three visual displays for a given measure first sequentially and then simultaneously and answer the following: Sequential Viewing (Mazza, 2006) 4. What kind of information can you gather from this feedback display? 5. Can you identify general patterns or tendencies in the feedback display? 6. Is the feedback display useful? Why? 7. Do you think the feedback should be presented differently? How? Simultaneous Viewing 1. Please tell me which of these displays would be most helpful for you. Why? 2. Please tell me which of these displays would be least helpful for you. Why? 3. Are there any parts from these visual displays that you would combine to create a better visual display? Which parts? Why? 4. (Taking out the earlier description/drawing of feedback display) How do these compare to your initial thoughts on the feedback display? a. Which do you like better? b. Do you think they could be combined? c. Now that you™ve seen these prototypes, could you tell me what your ideal feedback display would be? d. Is there anything else that might be useful that isn™t captured in this feedback display? (Mazza, 2006) 152 Appendix D: Initial Feedback Displays Distance Figure 42: Initial displays- Distance small multiple line graph. Figure 43: Initial displays- Distance bar graph. Figure 44: Initial displays- Distance speedometer 153 Loudness Figure 45: Initial displays- Loudness bar graph. Figure 46: Initial displays- Loudness clock. Figure 47: Initial displays- Loudness small multiple sparklines. 154 Pauses Figure 48: Initial displays- Pauses small multiple line graph. Figure 49: Initial displays- Pauses clock. Figure 50: Initial displays- Pauses bar graph. 155 Quality Figure 51: Initial displays- Quality small multiple smileys. Figure 52: Initial displays- Quality small multiple line graphs. Figure 53: Initial displays- Quality multi-day line graph. 156 Clarity Figure 54: Initial displays- Clarity single-day line graph. Figure 55: Initial displays- Clarity vertical line graph. Figure 56: Initial displays- Clarity bar graph. 157 Strain Figure 57: Initial displays- Strain small multiple line graph. Figure 58: Initial displays- Strain multi-day line graph. Figure 59: Initial displays- Strain bar graph. 158 Multi-Measure Figure 60: Initial displays- Multi-measure matrix. 159 Appendix E: Iteration 1 Feedback Displays Icons Figure 61: Iteration 1 displays- Dynamic loudness icons. Figure 62: Iteration 1 displays - Dynamic pause icons. 160 Icons Figure 63: Iteration 1 displays - Dynamic quality icons. Figure 64: Iteration 1 displays - Dynamic strain icons. 161 Loudness Figure 65: Iteration 1 displays Œ Loudness small multiple sparklines. Figure 66: Iteration 1 displays Œ Loudness multiple sparklines. 
162 Pauses Figure 67: Iteration 1 displays Œ Individual pause time and length. Figure 68: Iteration 1 displays Œ Pause clocks and line graph. Figure 69: Iteration 1 displays Œ Pause line graph. 163 Quality Figure 70: Iteration 1 displays Œ Quality small multiple smileys. Figure 71: Iteration 1 displays Œ Quality multi-line graph. Figure 72: Iteration 1 displays Œ Quality small multiple line graph. 164 Strain Figure 73: Iteration 1 displays Œ Strain multi-shade line graph. Figure 74: Iteration 1 displays Œ Strain labelled line graph. 165 Multi-Measure Figure 75: Iteration 1 displays - Multi-measure matrix. 166 Appendix F: Iteration 2 Feedback Displays Icons Figure 76: Iteration 2 displays- Dynamic loudness icons. Figure 77: Iteration 2 displays - Dynamic pause icons. 167 Figure 78: Iteration 2 displays - Dynamic quality icons. Figure 79: Iteration 2 displays - Dynamic strain icons. 168 Loudness Figure 80: Iteration 2 displays Œ Loudness single-day sparkline. Figure 81: Iteration 2 displays Œ Loudness small multiple sparklines. 169 Pauses Figure 82: Iteration 2 displays Œ Pause line graph. Figure 83: Iteration 2 displays - Individual pause time and length. Figure 84: Iteration 2 displays Œ Pause clocks and counts. 170 Quality Figure 85: Iteration 2 displays Œ Quality small multiple smileys. Figure 86: Iteration 2 displays Œ Quality small multiple line graphs. Figure 87: Iteration 2 displays Œ Quality single day line graph. 171 Strain Figure 88: Iteration 2 displays - Strain labelled line graph. Figure 89: Iteration 2 displays Œ Strain day and night line graph. 172 Multi-Measure Figure 90: Iteration 2 displays Œ Multi-measure matrix. Figure 91: Iteration 2 displays - Dynamic icon array. 173 Multi-Measure Figure 92: Iteration 2 displays Œ Multi-measure man. 174 Appendix G: Iteration 3 Feedback Displays Icons Figure 93: Iteration 3 displays ŒIcons for all four measures, with two pause options. 175 Loudness Figure 94: Iteration 3 displays ŒLoudness single day sparkline. Figure 95: Iteration 3 displays Œ Loudness small multiple sparklines. Figure 96: Iteration 3 displays Œ Loudness single day sparkline with zeros. 176 Pauses Figure 97: Iteration 3 displays Œ Individual pause time and length. Figure 98: Iteration 3 displays Œ Pause bar graph with counts. 177 Quality Figure 99: Iteration 3 displays Œ Quality small multiple smileys. Figure 100: Iteration 3 displays Œ Quality small multiple line graphs. Figure 101: Iteration 3 displays Œ Quality single day line graph. 178 Strain Figure 102: Iteration 3 displays Œ Strain labelled line graph. Figure 103: Iteration 3 displays Œ Strain small multiple line graph. Figure 104: Iteration 3 displays Œ Strain day and night bar graph. 179 Multi-Measure Figure 105: Iteration 3 displays Œ Multi-measure matrix. 180 Appendix H: Iteration 4 Feedback Displays: Icons Figure 106: Iteration 4 displays Œ Icons for the four measures. 181 Loudness Figure 107: Iteration 4 displays Œ Loudness multi-day danger zone counts. Figure 108: Iteration 4 displays Œ Loudness single day sparkline. Figure 109: Iteration 4 displays Œ Loudness small multiple sparklines. 182 Pauses Figure 110: Iteration 4 displays Œ Pause multi-day counts. Figure 111: Iteration 4 displays Œ Individual pause time and length. Figure 112: Iteration 4 displays Œ Pause small multiple bar graphs. 183 Pauses Figure 113: Iteration 4 displays Œ Pause bar graph. 184 Quality Figure 114: Iteration 4 displays Œ Quality small multiple smileys. Figure 115: Iteration 4 displays Œ Quality single day line graph. 
Figure 116: Iteration 4 displays Œ Quality small multiple line graphs. 185 Strain Figure 117: Iteration 4 displays Œ Strain small multiple arrows. Figure 118: Iteration 4 displays Œ Strain small multiple time points. Figure 119: Iteration 4 displays Œ Strain labelled line graphs. 186 Multi-Measure Figure 120: Iteration 4 displays Œ Multi-measure matrix. 187 Appendix I: Final Feedback Displays Icons Figure 121: Final displays Œ Icons for the four measures. 188 Loudness Figure 122: Final displays Œ Layered structure for loudness. 189 Pauses Figure 123: Final displays Œ Layered structure for pauses. 190 Quality Figure 124: Final displays Œ Layered structure for quality. 191 Strain Figure 125: Final displays Œ Layered structure for strain. 192 Appendix J: Part 2 Midpoint and Final Semi-Structured Interview Questions Midpoint Semi-Structured Interview Overall 1. What is your overall impression of the study so far? 2. What did you like most? a. Why? 3. What did you like least? a. Why? Wearing the Device (questions based on Hunter, 2012)) 1. What did you think about wearing the collar and recorder? 2. Did you experience any discomfort with the equipment? a. How did you resolve it? 3. Did you experience any difficulties with the equipment? a. How did you resolve them? 4. Do you have any suggestions for how to improve the recording system? Strategies 1. Did you change how you talked during the study? a. How? 2. Did you change how much you talked during the study? a. How? 3. Did you change how loud you talked during the study? a. How? 193 How much did you change how you talked? Indicate (make a vertical tick mark) on the scale below: ______________________________________________________________________________ Not at All Extremely How much did you change how much you talked? Indicate (make a vertical tick mark) on the scale below: ______________________________________________________________________________ Not at All Extremely How much did you change how loud you talked? Indicate (make a vertical tick mark) on the scale below: ______________________________________________________________________________ Not at All Extremely 194 Final Semi-Structured Interview Overall 1. What was your overall impression of the study? 2. What did you like most? a. Why? 3. What did you like least? a. Why? Strategies 1. Did you change how you talked during the second part of the study? a. How? 2. Did you change how much you talked during the second part of the study? a. How? 3. Did you change how loud you talked during the second part of the study? a. How? Feedback 1. What did you think about the feedback? 2. Do you feel like the feedback was helpful? 3. What did you think was most helpful? a. Why? b. How did it influence your talking? 4. What did you think was least helpful? a. Why? 5. What additional feedback would be helpful? a. Why? 6. What did you think about how the feedback was presented? 7. How would you improve the way the feedback is presented? I have a pen and paper available if you™d like to draw an example. Sharing 1. Did you share your feedback information with anyone? a. Who? b. Why? c. What were their thoughts? d. (If shared information with other participants) Did you find this helpful? Recommendations 195 1. If a similar type of system were available for personal use, what should it include? a. Any other features you would add? 2. One future idea is adding a social component, where people can share their data with others. What are your thoughts? 3. 
How much do you think people would be willing to pay for this kind of system for personal use? Why? 4. If employers made these systems available for employee use, do you think employees would use them? Why?
How much did you change how you talked? Indicate (make a vertical tick mark) on the scale below:
______________________________________________________________________________
Not at All Extremely
How much did you change how much you talked? Indicate (make a vertical tick mark) on the scale below:
______________________________________________________________________________
Not at All Extremely
How much did you change how loud you talked? Indicate (make a vertical tick mark) on the scale below:
______________________________________________________________________________
Not at All Extremely
Appendix K: Sample Feedback (Part 2.1)
Figure 126: Initial image seen by participants. Icons are displayed in a different random order for each participant. This participant's order is: pauses, quality, and strain.
Figure 127: First pause display. This display shows the pause count (number of pauses equal to or greater than one second in length) over the course of the reading VLT.
Figure 128: Second pause display. This display shows the amount of time (% total time) spent in pauses equal to or greater than one second in length for each 3 minutes of reading.
Figure 129: Third pause display. Participants had one of these for each day. This display shows when these pauses of a second or greater occurred, and their individual durations.
Figure 130: First quality display. The smileys indicate the average value for quality for each day (across all 15 minutes of reading). Note that the smiley with a straight line for a mouth is equal to "baseline."
Figure 131: Second quality display. The "baseline" smiley is the average of the first minute of the baseline recordings (3 total). Only the first minute was analyzed to ensure that there were no vocal fatigue effects. The range around the baseline was based on the range of pitch strength values seen in voice signal typing (Kopf, Shrivastav, Eddins, Skowronski, & Hunter, 2014), where the average value for Type 1 voices was 0.45 and the average value for Type 3 voices was 0.05. This range was chosen to ensure that all values for a given participant would fall within that range (e.g., even if fatigue leads to a lessening of voice quality, it should still fall within this range).
Figure 132: Third quality display. Amplified image of one day's quality. Participants had one of these for each day.
Figure 133: First strain display. The stick figures indicate the average value for the /a/ vowels after the reading VLT (whichever value they are closest to). Note that the stick figure for days 3, 5, and 7 is the "baseline or better" stick figure.
Figure 134: Second strain display. Note that if the value went above the second stick figure, it was considered to be in the "danger zone" and was colored red. "Baseline" was defined as the average of the three values obtained before the VLT in the three baseline recordings. The range around the baseline was based on the range of first moment specific loudness values seen for voices ranging from no perceived strain to high strain (Kopf et al., 2013), where the value for a voice with low strain (10.9) was approximately 8 points lower than the voice with the highest strain (19.5).
This range was chosen to ensure that all values for a given participant would fall within that range (e.g., even if fatigue leads to an increase in strain, it should still fall within this range). 206 Figure 135: Third strain display. Amplified image of one day™s strain. Just like for quality, participants had one of these for each day. 207 Appendix L: Supplemental Feedback Displays (Part 2.2) Figure 136: Example quality display. This display shows the difference between the quality display in Part 2.1, where instead of one 15-minute segment, two 15 minute segments are shown. To keep the feedback consistent across participants, the first and last 15 minutes of the course were used (regardless of whether long periods of silence were present). Participants were asked to take that into consideration when looking at these graphs, which may explain some of the fluctuations present. The fizoomed infl version would be similar to the other third quality graph, where only one day is displayed. 208 Figure 137: First loudness display. The fidanger zonefl differs by day. It represents time spent greater than 2 standard deviations of the mean above the average dB level for that day. 209 Figure 138: Second loudness display. The fidanger zonefl is indicated by the red dashed lines. For display purposes, each data point in these graphs represents the average dB level for a three second time window. Note that for display purposes, silences were not marked by zero values, but by the average value for that day. This insured that the feedback image looked more similar to the final feedback image from Part 1. 210 Figure 139: Third loudness display. Amplified image of one day™s loudness pattern. Just like for quality, participants had one of these for each day. 211 Appendix M: Part 2.1 Readiness to Change Figure 140: The change in readiness to change for each participant in Part 2.1 from the initial to the midpoint to the final interview. 212 Appendix N: Part 2.1 Self-Efficacy Figure 141: The change in self-efficacy for each participant in Part 2.1 from the initial to the midpoint to the final interview. 213 Appendix O: Part 2.1 Vocal Fatigue Index Figure 142: The change in vocal fatigue for each participant in Part 2.1 from the initial to the midpoint to the final interview. 214 Appendix P: Part 2.2 Readiness to Change Figure 143: The change in readiness to change for each participant in Part 2.2 from the initial to the midpoint to the final interview. 215 Appendix Q: Part 2.2 Self-Efficacy Figure 144: The change in self-efficacy for each participant in Part 2.2 from the initial to the midpoint to the final interview. 216 Appendix R: Part 2.2 Vocal Fatigue Index Figure 145: The change in vocal fatigue for each participant in Part 2.2 from the initial to the midpoint to the final interview. 217 Appendix S: Steps for Feedback Analysis (Part 2.1) Step 1 Directions 1. Open GoldWave 2. Go to: File -> Batch Processing 3. Open the folder FullFiles 4. Highlight all files and drag into Batch Processing window 5. At the bottom of the Batch Processing window, under Presets, choose SplitAccData 6. In the Batch Processing window, click Begin 7. Once it finishes, click OK 8. The program has now put the correct channel from the recording into the SplitFiles folder 9. Open the SplitFiles folder and rename the recording, substitute fiAccfl for fiRolandfl so that it looks like this: P2101_D1_Acc 10. Copy and paste the file into the Step 2 folder 11. 
Appendix M: Part 2.1 Readiness to Change

Figure 140: The change in readiness to change for each participant in Part 2.1 from the initial to the midpoint to the final interview.

Appendix N: Part 2.1 Self-Efficacy

Figure 141: The change in self-efficacy for each participant in Part 2.1 from the initial to the midpoint to the final interview.

Appendix O: Part 2.1 Vocal Fatigue Index

Figure 142: The change in vocal fatigue for each participant in Part 2.1 from the initial to the midpoint to the final interview.

Appendix P: Part 2.2 Readiness to Change

Figure 143: The change in readiness to change for each participant in Part 2.2 from the initial to the midpoint to the final interview.

Appendix Q: Part 2.2 Self-Efficacy

Figure 144: The change in self-efficacy for each participant in Part 2.2 from the initial to the midpoint to the final interview.

Appendix R: Part 2.2 Vocal Fatigue Index

Figure 145: The change in vocal fatigue for each participant in Part 2.2 from the initial to the midpoint to the final interview.

Appendix S: Steps for Feedback Analysis (Part 2.1)

Step 1 Directions
1. Open GoldWave.
2. Go to: File -> Batch Processing.
3. Open the folder FullFiles.
4. Highlight all files and drag them into the Batch Processing window.
5. At the bottom of the Batch Processing window, under Presets, choose SplitAccData.
6. In the Batch Processing window, click Begin.
7. Once it finishes, click OK.
8. The program has now put the correct channel from the recording into the SplitFiles folder.
9. Open the SplitFiles folder and rename the recording, substituting "Acc" for "Roland" so that it looks like this: P2101_D1_Acc.
10. Copy and paste the file into the Step 2 folder.
11. Move the audio files from FullFiles & SplitFiles into the Complete folder.

Step 2 Directions
1. Double click on ExtractdBF0 (MATLAB code); this will open MATLAB and the MATLAB editor.
2. Right click on the wav file and choose rename. Copy the name of the file (but don't change it).
3. In the MATLAB editor window, replace fname (line 5): paste the name of the wav file, but keep .wav at the end.
4. In the MATLAB editor window, replace namexls (line 6): paste the name of the wav file, but keep .xlsx at the end.
5. Save the file (click the disk icon in the upper left).
6. Click run (big green arrow).
7. Drag and drop the audio file into GoldWave.
8. Go to Effect -> Volume -> Change Volume and increase the volume by 20 dB (change 0.00 in the upper right corner to 20). Click OK.
9. Now locate the 3 "ah" vowels toward the beginning of the recording.
10. Find the start and stop time for the 3rd "ah" vowel.
11. Close GoldWave and don't save changes.
12. Open the Excel file that you just created.
13. Find the start time for the 3rd "ah" vowel and note the corresponding SPL (sound pressure level).
14. In an empty cell next to it, type =AVERAGE( and highlight the SPL values for the duration of the 3rd "ah" vowel.
15. Now, in another empty cell, type: =(value from the Actual dB Excel sheet) − (the cell where you did the averaging).
16. Go into the Step 1 Complete folder and copy both the full (Roland) and accelerometer (Acc) recordings. Paste the copies into the Step 2 ChangeLevel folder.
17. Open GoldWave and drag and drop one of the files into the program.
18. Go to Effect -> Volume -> Change Volume and increase the volume by the number of dB you calculated in step 15 (change 0.00 in the upper right corner to that number). Click OK. If it was more than 20, add 20 and then add the rest by doing this again.
19. Save the file to: Y:\Graduate students\PhD students\LisaDissertation\Step2-AdjustdB\Complete. Rename the file by adding "_cal" to the end (without quote marks).
20. Do the same for the other file.
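Steps 13-18 amount to computing a calibration offset: the difference between the level read from the sound level meter and the average SPL that ExtractdBF0 reported over the third /a/ vowel, which is then applied as a gain. The MATLAB fragment below sketches that arithmetic under stated assumptions; it is not the ExtractdBF0 script, and the frame spacing, vowel times, and target level are invented for the example (in practice the SPL values would be read from the Excel file ExtractdBF0 writes).

% Minimal sketch of the calibration arithmetic in steps 13-18 (fake data so the
% example stands alone; all specific values are hypothetical).

frameDur   = 0.05;                             % assumed frame spacing (s)
uncalSpl   = 55 + 5*randn(1, 600);             % hypothetical uncalibrated SPL per frame (30 s shown)
frameTimes = (0:numel(uncalSpl)-1) * frameDur;

vowelStart = 12.3; vowelEnd = 13.1;            % start/stop of the 3rd /a/ vowel from GoldWave (hypothetical)
inVowel    = frameTimes >= vowelStart & frameTimes <= vowelEnd;

measuredDb = mean(uncalSpl(inVowel));          % step 14: average SPL over the vowel
actualDb   = 78.0;                             % step 15: level from the "Actual dB" sheet (hypothetical)
offsetDb   = actualDb - measuredDb;            % gain applied to the "_cal" files in step 18

fprintf('Apply a %.1f dB gain to calibrate this session.\n', offsetDb);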
Step 3 Directions
1. Use the 1cc_inc files that are in this folder.
2. Open and segment a "_Roland_cal" wav file in ELAN (see the ELAN Instructions Word file).
3. Close MATLAB if it's already open.
4. Double click CreateSegmentedWaves (MATLAB file) to open MATLAB.
5. Close the Editor window.
6. Type CreateSegmentedWaves (or copy and paste it from here) into the MATLAB Command Window.
7. Follow the prompts: choose the ELAN file you just created, and choose the corresponding "_acc_cal" wav file to be segmented.
8. Once the wav files are created, rename all of them (take ".wav" off the end of each, since it appears twice; change the one ending in "xxxxx" to the appropriate label).
9. Put all files that you used (wav & ELAN) in the Complete folder under the correct participant and session.

ELAN Instructions
1. Open ELAN.
2. Go to "File" -> "New," and select the .wav file you want to work on.
3. You can listen to the waveform by clicking on it and pressing "play."
   a. After you've pressed "play" once, you can play and pause by pressing "space."
4. You can slow down the playback by moving the "rate" slider in the controls tab left or right.
5. Go to "Options" -> "Segmentation mode." At the top, choose "Two keystrokes per annotation"; this way you won't have to delete any.
   a. Click on the waveform and press "enter" to start the first segment.
   b. Press "play" and hit "enter" when you want to end the segment.
      i. You will be pressing "enter" both to start a new segment and to end a segment.
   c. Press "space" when you want to make the waveform stop playing altogether.
6. Go to "Options" -> "Transcription Mode" (once you've segmented the entire passage).
   a. It will bring each segment up (Tier).
   b. Select the tier name and click "apply."
   c. It will show all segments in that tier.
   d. Name them using the following labels: ia1 (first a vowel before reading), ia2, ia3, irp (initial rainbow passage), red (reading), ea1 (first a vowel after reading), ea2, ea3, erp (end rainbow passage). Note: sometimes I ask someone for more than 3 "ah" vowels if the first ones are short. If there are more than 3, go with the last 3 in the set.
7. Once all the segments and tiers are transcribed, go to "File" -> "Export As" -> "Tab-delimited text…"
   a. Name it using the participant # and session # (e.g., P2102_D8).
   b. That will give you a .txt file with "begin time," "end time," and "duration."
8. Go to Save and save using the same name from 7a.

Step 4 Directions
1. Copy all the /a/ vowels.
2. Rename all "ah" vowels: get rid of the .wav at the end (it's on there twice).
3. Double click the Midpoint500 MATLAB file to open it.
4. Click the green arrow to run it.
5. Once it is complete, go to the Vowels folder, rearrange the contents by Date Modified, highlight all the files ending in _mid, and cut and paste them into the allStimuli folder.
6. Double click on the sharpnessExperiment MATLAB file.
7. Scroll down to line 112 and change strainSharpnessMoments_... to strainSharpnessMoments_0121 (or whichever month and day you are running the analysis).
8. Save the script (upper left save icon).
9. Run the script (green arrow).
10. When finished, cut the files from the Vowels folder and put them in the Done folder. Also cut the files from allStimuli and put them in the Done folder.

Step 5
1. Run Auditory-SWIPE′ (which calculates pitch strength) on the "red" file from Step 3.
2. Run the MATLAB script to automatically calculate pause frequency and the length of pauses >1 second based on the output from SWIPE. The output will be a P21xx_Dx_pauses.xlsx file.
3. Open the P21xx_Dx_pauses.xlsx file with Excel.
4. Open the corresponding P21xx_PauseLocale.xlsx file.
5. [Note: these files should be open side by side to make your work easier!]
6. In the P21xx_Dx_pauses.xlsx file, in cell E1, paste the following: =COUNTIF(A:A,1).
7. [Note: this will tell you how many pauses occurred in the 15 minutes this person was reading.]
8. In P21xx_Dx_pauses.xlsx, hold down the Ctrl button and push F (this will open the Find window).
9. In the text box, type the following: 1
10. Push the Find Next button.
11. On the line containing the 1 in Column A, there will be numbers in columns B & C.
12. Copy both of these cells (Columns B & C) and paste them into the P21xx_PauseLocale.xlsx file. They will be pasted into columns A & B. If the number in column C is a 0, replace it with a 1 in the P21xx_PauseLocale.xlsx file.
13. Go back to the P21xx_Dx_pauses.xlsx file and double click on an empty cell next to the cells you just copied.
14. Click the Find Next button in the Find and Replace window, and continue for the rest of the lines containing 1 (the number in cell E1).
15. To double check your work, when you reach the end, compare the number in E1 of P21xx_Dx_pauses.xlsx with I1 of P21xx_PauseLocale.xlsx. They should be the same number.
16. Once you have copied and pasted all of the cells, you can close P21xx_Dx_pauses.xlsx.
17. Next, you will be adding the pauses together in time segments. First, you will add together all the pauses that occurred in minutes 0-2.999 (column A). To do this, type in cell F2: =SUM( and then highlight the corresponding pause lengths in column B. For example, if all the pauses between 0 and 2.999 minutes were in rows 2-5, your formula in F2 would look like this: =SUM(B2:B5).
18. Continue for each of the time segments.
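The Excel bookkeeping in Step 5 (counting pauses of at least one second and totaling pause time per 3-minute block) can be expressed compactly in MATLAB. The sketch below mirrors that bookkeeping with made-up numbers; it is not the pause-extraction script referenced in item 2, and the assumption that column A of P21xx_PauseLocale.xlsx holds pause onsets in minutes while column B holds durations in seconds is mine.

% Minimal sketch of the Step 5 bookkeeping (hypothetical data; column meanings assumed).

pauseOnset = [0.4 1.7 2.2 5.1 8.8 9.0 13.6];   % pause onset times in minutes (column A, assumed)
pauseDur   = [1.2 3.0 1.5 2.2 1.1 4.3 1.8];    % pause durations in seconds (column B, assumed)

keep       = pauseDur >= 1;                    % only pauses of one second or longer are reported
pauseOnset = pauseOnset(keep);
pauseDur   = pauseDur(keep);

fprintf('Pause count (>= 1 s): %d\n', numel(pauseDur));   % the value =COUNTIF(A:A,1) produces

edges = 0:3:15;                                % 3-minute blocks of the 15-minute reading
for b = 1:numel(edges)-1
    inBlock = pauseOnset >= edges(b) & pauseOnset < edges(b+1);
    fprintf('Minutes %d-%d: %.1f s spent in pauses\n', ...
            edges(b), edges(b+1), sum(pauseDur(inBlock)));   % the per-block =SUM(...) value
end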
Step 6
1. Create feedback displays using MATLAB scripts and the PowerPoint template.

Appendix T: Steps for Feedback Analysis (Part 2.2)

Step 1 Directions
1. Open GoldWave.
2. Go to: File -> Batch Processing.
3. Open the folder FullFiles.
4. Highlight all files and drag them into the Batch Processing window.
5. At the bottom of the Batch Processing window, under Presets, choose SplitAccData.
6. In the Batch Processing window, click Begin.
7. Once it finishes, click OK.
8. The program has now put the correct channel from the recording into the SplitFiles folder.
9. Open the SplitFiles folder and rename the recording, substituting "Acc" for "Roland" so that it looks like this: P2201_D1_Acc.
10. Move the audio files from FullFiles & SplitFiles into the Complete folder.
11. Open GoldWave and drag and drop one of the accelerometer files into the program.
12. Go to Effect -> Volume -> Change Volume and increase the volume by 20 dB.
13. Save the file to: Y:\Graduate students\PhD students\LisaDissertation\Step2-AdjustdB\Complete
    a. Rename the file by adding "_inc" to the end (without quote marks).
14. Do the same for the other files.
15. If the entire session's recording is in one file (lecture + initial and final tasks), open the file in GoldWave, find the start and end of the lecture, and save it as 3 separate files (1cc for initial tasks, 2cc for lecture, 3cc for final tasks).
16. For the 2cc file, divide it into 1-hour segments in GoldWave, labeled 2c1, 2c2, etc., and save them to the Step 2 folder.
17. For the 2cc file, save the first 15 minutes of the file to the First 15 folder (name it using this convention: P220x_Dx_2_first).
18. For the 2cc file, save the last 15 minutes of the file to the Last 15 folder (name it using this convention: P220x_Dx_2__last).
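Items 17 and 18 can also be done programmatically. The MATLAB lines below are a minimal sketch of extracting the first and last 15 minutes of a lecture recording with audioread/audiowrite; the input file name is hypothetical, and the output names follow the conventions given above.

% Minimal sketch of Step 1, items 17-18: save the first and last 15 minutes of a
% lecture recording. The input file name is hypothetical.

[x, fs] = audioread('P2201_D1_2cc.wav');   % full lecture recording
nSamp   = round(15 * 60 * fs);             % number of samples in 15 minutes

audiowrite('P2201_D1_2_first.wav', x(1:nSamp, :), fs);           % first 15 minutes
audiowrite('P2201_D1_2__last.wav', x(end-nSamp+1:end, :), fs);   % last 15 minutes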
Step 2 Directions
1. Note: this will be run for the 2cx files as well as the ia3 file created in Step 3 (for calibration purposes).
2. Double click on ExtractdBF0 (MATLAB code); this will open MATLAB and the MATLAB editor.
3. Right click on the wav file and choose rename. Copy the name of the file (but don't change it).
4. In the MATLAB editor window, replace fname (line 5): paste the name of the wav file, but keep .wav at the end.
5. In the MATLAB editor window, replace namexls (line 6): paste the name of the wav file, but keep .xlsx at the end.
6. Save the file (click the disk icon in the upper left).
7. Click run (big green arrow).
8. In the ia3 Excel sheet, note the corresponding SPL (sound pressure level).
9. In an empty cell next to it, type:
   a. =AVERAGE( and highlight the SPL values for the duration of the 3rd "ah" vowel.
10. Now, in another empty cell, type:
    a. =(value from the Actual dB Excel sheet) − (the cell where you did the averaging).
11. Use this number to "calibrate" the values in the subsequent dB analysis, done in Excel.
12. Open the Excel sheet for the 2c1 file. Copy and paste Columns D-Z from the template for determining the presence of voicing.
13. In column N, change the calibration value to the one from Step 2.11. Change the SPL and F0 minimums if needed (based on the speaker and the noise present in the recording).
14. Copy columns D-Z and paste them into the Excel sheets for 2c2, etc.
15. For dB level calculations, copy and combine the values from Column V for all Excel files into a new worksheet.
16. Find the average and standard deviation values. Determine the cutoff for the "Danger Zone" (2 standard deviations above the mean).
17. Replace all zeros with the average value.
18. Run the pause extraction MATLAB script for each of the 2cx files.
19. Use the "pauses" worksheet from each file. Sort by descending values in column 1 (so that all pauses are at the top).
20. Copy and combine the values from Columns B and C for all Excel files into a new worksheet. They will be pasted into columns A & B.
21. Calculate the count of Column A to determine the number of pauses >1 second.
22. Next, you will be adding the pauses together in time segments. The length of the segments will vary depending on the length of the lecture; a total of 4-6 approximately equal segments will be displayed to the course instructor. First, you will add together all the pauses that occurred in a given segment. For example, if it is a 10-minute segment, you will add together the values in Column B from time 0.0-9.999. To do this, type in cell F2: =SUM( and then highlight the corresponding pause lengths in column B. For example, if all the pauses between 0 and 9.999 minutes were in rows 2-50, your formula in F2 would look like this: =SUM(B2:B50).
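Items 15-17 define the loudness "danger zone" as 2 standard deviations above the mean dB level and replace silent frames with the average value before display. A minimal MATLAB sketch of that calculation follows; the per-hour dB vectors are fake data standing in for the Column V values, and treating zeros as unvoiced frames (and excluding them from the mean) is my assumption, not something specified in the directions.

% Minimal sketch of items 15-17: pool the calibrated dB values, set the "danger
% zone" cutoff at mean + 2 SD, and replace zeros with the average value.

dbPerHour = {60 + 8*randn(5000,1), 58 + 9*randn(4200,1)};   % stand-ins for Column V, one cell per 2cx file
allDb     = vertcat(dbPerHour{:});
allDb(rand(size(allDb)) < 0.2) = 0;                         % sprinkle in some "silent" frames

voiced   = allDb > 0;
meanDb   = mean(allDb(voiced));
sdDb     = std(allDb(voiced));
cutoffDb = meanDb + 2*sdDb;                                 % item 16: "Danger Zone" boundary

allDb(~voiced) = meanDb;                                    % item 17: replace zeros with the average

pctAbove = 100 * mean(allDb > cutoffDb);
fprintf('Danger zone starts at %.1f dB; %.1f%% of frames fall above it.\n', cutoffDb, pctAbove);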
Step 3 Directions
1. Use the 1cc_inc and 3cc_inc files that are in this folder.
2. Close MATLAB if it's already open.
3. Double click CreateSegmentedWaves (MATLAB file) to open MATLAB.
4. Close the Editor window.
5. Type CreateSegmentedWaves (or copy and paste it from here) into the MATLAB Command Window.
6. Follow the prompts: choose the ELAN file you just created, and choose the corresponding "_acc_inc" wav file to be segmented.
7. Once the wav files are created, rename all of them (take ".wav" off the end of each, since it appears twice; change the one ending in "xxxxx" to the appropriate label).
8. Put all files that you used (wav & ELAN) in the Complete folder under the correct participant and session.

ELAN Instructions
1. Open ELAN.
2. Go to "File" -> "New," and select the .wav file you want to work on.
3. You can listen to the waveform by clicking on it and pressing "play."
   a. After you've pressed "play" once, you can play and pause by pressing "space."
4. You can slow down the playback by moving the "rate" slider in the controls tab left or right.
5. Go to "Options" -> "Segmentation mode."
   a. At the top, choose "Two keystrokes per annotation"; this way you won't have to delete any.
   b. Click on the waveform and press "enter" to start the first segment.
   c. Press "play" and hit "enter" when you want to end the segment.
      i. You will be pressing "enter" both to start a new segment and to end a segment.
   d. Press "space" when you want to make the waveform stop playing altogether.
6. Go to "Options" -> "Transcription Mode" (once you've segmented the entire passage).
   a. It will bring each segment up (Tier).
   b. Select the tier name and click "apply."
   c. It will show all segments in that tier.
   d. Name them using the following:
      i. Note: sometimes I ask someone for more than 3 "ah" vowels if the first ones are short. If there are more than 3, go with the last 3 in the set.
      ii. ia1 (first a vowel before reading), ia2, ia3, irp (initial rainbow passage), red (reading), ea1 (first a vowel after reading), ea2, ea3, erp (end rainbow passage).
7. Once all the segments and tiers are transcribed, go to "File" -> "Export As" -> "Tab-delimited text…"
   a. Name it using the participant # and session # (e.g., P2102_D8).
   b. That will give you a .txt file with "begin time," "end time," and "duration."
8. Go to Save and save using the same name from 7a.

Step 4 Directions
1. Copy all the /a/ vowels.
2. Rename all "ah" vowels: get rid of the .wav at the end (it's on there twice).
3. Double click the Midpoint500 MATLAB file to open it.
4. Click the green arrow to run it.
5. Once it is complete, go to the Vowels folder, rearrange the contents by Date Modified, highlight all the files ending in _mid, and cut and paste them into the allStimuli folder.
6. Double click on the sharpnessExperiment MATLAB file.
7. Scroll down to line 112 and change strainSharpnessMoments_... to strainSharpnessMoments_0121 (or whichever month and day you are running the analysis).
8. Save the script (upper left save icon).
9. Run the script (green arrow).
10. When finished, cut the files from the Vowels folder and put them in the Done folder. Also cut the files from allStimuli and put them in the Done folder.

Step 5
1. Run Auditory-SWIPE′ (to calculate pitch strength) on the _first and __last files from Step 1.

Step 6
1. Create feedback displays using MATLAB scripts and the PowerPoint template.

Appendix U: Part 2.1 Phonation Time

Figure 146: The change in average phonation time for each participant in Part 2.1 across the eight recording sessions. B# indicates the baseline recording number, and F# indicates the feedback recording number.

Appendix V: Part 2.1 Vocal Intensity

Figure 147: The change in average vocal intensity for each participant in Part 2.1 across the eight recording sessions. B# indicates the baseline recording number, and F# indicates the feedback recording number.

Appendix W: Part 2.1 Pitch

Figure 148: The change in average pitch for each participant in Part 2.1 across the eight recording sessions. B# indicates the baseline recording number, and F# indicates the feedback recording number.

Appendix X: Part 2.2 Phonation Time

Figure 149: The change in average phonation time for each participant in Part 2.2 across the eight recording sessions. B# indicates the baseline recording number, and F# indicates the feedback recording number.

Appendix Y: Part 2.2 Vocal Intensity

Figure 150: The change in average vocal intensity for each participant in Part 2.2 across the eight recording sessions. B# indicates the baseline recording number, and F# indicates the feedback recording number.

Appendix Z: Part 2.2 Pitch

Figure 151: The change in average pitch for each participant in Part 2.2 across the eight recording sessions. B# indicates the baseline recording number, and F# indicates the feedback recording number.

Appendix AA: Average Pitch Strength (Part 2.1)

Figure 152: The average pitch strength for each participant in Part 2.1 before and after each vocal loading task. B# indicates the baseline recording number, and F# indicates the feedback recording number.

Appendix AB: Average First Moment Specific Loudness (Part 2.1)

Figure 153: The average first moment specific loudness for each participant in Part 2.1 before and after each vocal loading task. B# indicates the baseline recording number, and F# indicates the feedback recording number.

Appendix AC: Average Pitch Strength (Part 2.2)

Figure 154: The average pitch strength for each participant in Part 2.2 before and after each vocal loading task. B# indicates the baseline recording number, and F# indicates the feedback recording number.

Appendix AD: Average First Moment Specific Loudness (Part 2.2)

Figure 155: The average first moment specific loudness for each participant in Part 2.2 before and after each vocal loading task. B# indicates the baseline recording number, and F# indicates the feedback recording number.
BIBLIOGRAPHY

Achey, M. A., He, M. Z., & Akst, L. M. (2016). Vocal Hygiene Habits and Vocal Handicap Among Conservatory Students of Classical Singing. Journal of Voice, 30(2), 192–197. http://doi.org/10.1016/j.jvoice.2015.02.003
Alfieri, L., Brooks, P. J., Aldrich, N. J., & Tenenbaum, H. R. (2011). Does discovery-based instruction enhance learning? Journal of Educational Psychology, 103(1), 1–18. http://doi.org/10.1037/a0021017
Astolfi, A., Carullo, A., Pavese, L., & Puglisi, G. E. (2015). Duration of voicing and silence periods of continuous speech in different acoustic environments. The Journal of the Acoustical Society of America, 137(2), 565–579.
Astolfi, A., Carullo, A., Vallan, A., & Pavese, L. (2013). Influence of classroom acoustics on the vocal behavior of teachers. In Proceedings of Meetings on Acoustics (Vol. 19, p. 40123). Acoustical Society of America. Retrieved from http://scitation.aip.org/content/asa/journal/poma/19/1/10.1121/1.4800427
Bailey, G. "Skip." (1993). Iterative methodology and designer training in human-computer interface design. In Proceedings of the INTERACT '93 and CHI '93 Conference on Human Factors in Computing Systems (pp. 198–205). New York, NY, USA: ACM.
Baken, R. J. (1992). Electroglottography. Journal of Voice, 6(2), 98–110. http://doi.org/10.1016/S0892-1997(05)80123-7
Bandura, A. (1977). Self-efficacy: toward a unifying theory of behavioral change. Psychological Review, 84(2), 191.
Bergan, C. C., Titze, I. R., & Story, B. (2004). The perception of two vocal qualities in a synthesized vocal utterance: ring and pressed voice. Journal of Voice, 18(3), 305–317. http://doi.org/10.1016/j.jvoice.2003.09.004
Bernstein, L., & Yuhas, C. M. (2005). Prototyping. In Trustworthy Systems through Quantitative Software Engineering (pp. 107–136). John Wiley & Sons, Inc. Retrieved from http://onlinelibrary.wiley.com.proxy1.cl.msu.edu/doi/10.1002/0471750336.ch4/summary
Beyer, H., & Holtzblatt, K. (1998). Customer-Centered Systems. San Francisco, Calif.: Morgan Kaufmann. Retrieved from https://search.ebscohost.com/login.aspx?direct=true&db=e000xna&AN=472251&scope=site
Bhuta, T., Patrick, L., & Garnett, J. D. (2004). Perceptual evaluation of voice quality and its correlation with acoustic measurements. Journal of Voice, 18(3), 299–304. http://doi.org/10.1016/j.jvoice.2003.12.004
Bovo, R., Galceran, M., Petruccelli, J., & Hatzopoulos, S. (2007). Vocal problems among teachers: Evaluation of a preventive voice program. Journal of Voice, 12(6), 705–722.
Boyle, R. G., O'Connor, P. J., Pronk, N. P., & Tan, A. (1998). Stages of change for physical activity, diet, and smoking among HMO members with chronic conditions. American, 12(3), 170–175.
Brant, L. J., & Fozard, J. L. (1990). Age changes in pure-tone hearing thresholds in a longitudinal study of normal human aging. The Journal of the Acoustical Society of America, 88(2), 813–820. http://doi.org/10.1121/1.399731
Buley, L. (2013). The User Experience Team of One. Brooklyn, NY: Rosenfeld Media.
Camacho, A. (2007). SWIPE: A sawtooth waveform inspired pitch estimator for speech and music. University of Florida. Retrieved from http://www.kerwa.ucr.ac.cr:8080/handle/10669/536
Cardinal, B. J. (1995). The stages of exercise scale and stages of exercise behavior in female adults. The Journal of Sports Medicine and Physical Fitness, 35(2), 87–92.
Carroll, T., Nix, J., Hunter, E., Emerich, K., Titze, I., & Abaza, M. (2006). Objective measurement of vocal fatigue in classical singers: A vocal dosimetry pilot study. Otolaryngology–Head and Neck Surgery, 135(4), 595–602. http://doi.org/10.1016/j.otohns.2006.06.1268
Carullo, A., Vallan, A., & Astolfi, A. (2013). Design Issues for a Portable Vocal Analyzer. IEEE Transactions on Instrumentation and Measurement, 62(5), 1084–1093. http://doi.org/10.1109/TIM.2012.2236724
Chan, R. W. (1994). Does the voice improve with vocal hygiene education? A study of some instrumental voice measures in a group of kindergarten teachers. Journal of Voice, 8(3), 279–291.
Child, D. R., & Johnson, T. S. (1991). Preventable and nonpreventable causes of voice disorders. In Seminars in Speech and Language (Vol. 12, pp. 1–13). © 1991 by Thieme Medical Publishers, Inc. Retrieved from https://www.thieme-connect.com/products/ejournals/pdf/10.1055/s-2008-1064206.pdf
Choe, E. K., Lee, N. B., Lee, B., Pratt, W., & Kientz, J. A. (2014). Understanding quantified-selfers' practices in collecting and exploring personal data (pp. 1143–1152). ACM Press. http://doi.org/10.1145/2556288.2557372
Colton, R. H., & Casper, J. K. (2006). perspective for diagnosis and treatment. Baltimore, MD: Lippincott Williams & Wilkins.
Consolvo, S., McDonald, D. W., Toscos, T., Chen, M. Y., Froehlich, J., Harrison, B., … others. (2008). Activity sensing in the wild: a field trial of ubifit garden. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (pp. 1797–1806). ACM. Retrieved from http://dl.acm.org/citation.cfm?id=1357335
Corbin, J. M., & Strauss, A. L. (2008). Procedures for Developing Grounded Theory (3rd ed.). Los Angeles, CA: Sage Publications, Inc.
Deliyski, D. D., & Hillman, R. E. (2010). State of the Art Laryngeal Imaging: Research and Clinical Implications. Current Opinion in Otolaryngology & Head and Neck Surgery, 18(3), 147–152. http://doi.org/10.1097/MOO.0b013e3283395dd4
DiClemente, C. C., Prochaska, J. O., Fairhurst, S. K., Velicer, W. F., Velasquez, M. M., & Rossi, J. S. (1991). The process of smoking cessation: An analysis of precontemplation, contemplation, and preparation stages of change. Journal of Consulting and Clinical Psychology, 59(2), 295–304. http://doi.org/10.1037/0022-006X.59.2.295
DiClemente, C. C., Schlundt, D., & Gemmell, L. (2004). Readiness and stages of change in addiction treatment. The American Journal on Addictions / American Academy of Psychiatrists in Alcoholism and Addictions, 13(2), 103–119. http://doi.org/10.1080/10550490490435777
Dijkstra, A., De Vries, H., & Bakker, M. (1996). Pros and cons of quitting, self-efficacy, and the stages of change in smoking cessation. Journal of Consulting and Clinical Psychology, 64(4), 758–763.
Duffy, O. M., & Hazlett, D. E. (2004). The impact of preventive voice care programs for training teachers: A longitudinal study. Journal of Voice, 18(1), 63–70. http://doi.org/10.1016/S0892-1997(03)00088-2
Eddins, D. A., & Shrivastav, R. (2013). Psychometric properties associated with perceived vocal roughness using a matching task. The Journal of the Acoustical Society of America, 134(4), EL294-300. http://doi.org/10.1121/1.4819183
Epstein, D. A., Borning, A., & Fogarty, J. (2013). Fine-grained sharing of sensed physical activity: a value sensitive approach (p. 489). ACM Press. http://doi.org/10.1145/2493432.2493433
Fairbanks, G. (1960). Voice and Articulation Drillbook (2nd ed.). New York, NY, USA: Harper.
Fastl, H., & Zwicker, E. (2007). Psychoacoustics. Springer.
Fogg, B. J. (2003). Persuasive Technology: Using Computers to Change What We Think and Do. San Francisco, CA: Morgan Kaufmann Publishers.
Folkins, J. W., Brackenbury, T., Krause, M., & Haviland, A. (2015). Enhancing the Therapy Experience Using Principles of Video Game Design. American Journal of Speech-Language Pathology, 1. http://doi.org/10.1044/2015_AJSLP-14-0059
Fritz, T., Huang, E. M., Murphy, G. C., & Zimmermann, T. (2014). Persuasive Technology in the Real World: A Study of Long-term Use of Activity Sensing Devices for Fitness. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (pp. 487–496). New York, NY, USA: ACM. http://doi.org/10.1145/2556288.2557383
Froeschels, E. (1952). Chewing method as therapy: A discussion with some philosophical conclusions. Archives of Otolaryngology–Head & Neck Surgery, 56(4), 427–434.
Gartner-Schmidt, J. L., Roth, D. F., Zullo, T. G., & Rosen, C. A. (2013). Quantifying Component Parts of Indirect and Direct Voice Therapy Related to Different Voice Disorders. Journal of Voice, 27(2), 210–216. http://doi.org/10.1016/j.jvoice.2012.11.007
Gee, J. P. (2008). Learning and Literacy. New York, NY, USA: Peter Lang.
Ghassemi, M., Van Stan, J. H., Mehta, D. D., Zañartu, M., Cheyne, H. A., Hillman, R. E., & Guttag, J. V. (2014). Learning to detect vocal hyperfunction from ambulatory neck-surface acceleration features: Initial results for vocal fold nodules. IEEE Transactions on Bio-Medical Engineering, 61(6), 1668–1675. http://doi.org/10.1109/TBME.2013.2297372
Goldman, N., & Narayanaswamy, K. (1992). Software Evolution Through Iterative Prototyping. In Proceedings of the 14th International Conference on Software Engineering (pp. 158–172). New York, NY, USA: ACM. http://doi.org/10.1145/143062.143109
Griffin, M. (2004). Minimum health and safety requirements for workers exposed to hand-transmitted vibration and whole-body vibration in the European Union; a review. Occupational and Environmental Medicine, 61(5), 387–397. http://doi.org/10.1136/oem.2002.006304
Grillo, E. U., & Verdolini, K. (2008). Evidence for Distinguishing Pressed, Normal, Resonant, and Breathy Voice Qualities by Laryngeal Resistance and Vocal Efficiency in Vocally Trained Subjects. Journal of Voice, 22(5), 546–552. http://doi.org/10.1016/j.jvoice.2006.12.008
Hansen, A. L. (1997). Reflections on I/Design: User interface design at a startup. In Proceedings of the ACM SIGCHI Conference on Human Factors in Computing Systems (pp. 487–493). New York, NY, USA: ACM.
Harrison, D., Berthouze, N., Marshall, P., & Bird, J. (2014). Tracking physical activity: problems related to running longitudinal studies with commercial devices (pp. 699–702). ACM Press. http://doi.org/10.1145/2638728.2641320
Heldner, M., & Edlund, J. (2010). Pauses, gaps and overlaps in conversations. Journal of Phonetics, 38(4), 555–568. http://doi.org/10.1016/j.wocn.2010.08.002
Heman-Ackah, Y. D., Michael, D. D., Baroody, M. M., Ostrowski, R., Hillenbrand, J., Heuer, R. J., … Sataloff, R. T. (2003). Cepstral Peak Prominence: A More Reliable Measure of Dysphonia. 112(4), 324–333. http://doi.org/10.1177/000348940311200406
Hillman, R. E., Heaton, J. T., Masaki, A., Zeitels, S. M., & Cheyne, H. A. (2006). Ambulatory monitoring of disordered voices. 115(11), 795–801.
Hillman, R. E., Holmberg, E. B., Perkell, J. S., Walsh, M., & Vaughan, C. (1989). Objective assessment of vocal hyperfunction: An experimental framework and initial results. Journal, 32(2), 373–392.
Hirano, M. (1981). Psycho-acoustic evaluation of voice: GRBAS scale for evaluating the hoarse voice. In Clinical Examination of Voice (pp. 81–84). London, England: Springer London.
Holmes, N. (1984). Designer's Guide to Creating Charts and Diagrams. Broadway, NY: Watson-Guptill Publications.
Hu, F. B., Sigal, R. J., Rich-Edwards, J. W., et al. (1999). Walking compared with vigorous physical activity and risk of type 2 diabetes in women: A prospective study. JAMA, 282(15), 1433–1439. http://doi.org/10.1001/jama.282.15.1433
Hunter, E. J. (2012). Teacher response to ambulatory monitoring of voice. Logopedics Phoniatrics Vocology, 37(3), 133–135. http://doi.org/10.3109/14015439.2012.664657
Hunter, E. J., Bottalico, P., Graetzer, S., Leishman, T. W., Berardi, M. L., Eyring, N. G., … Whiting, J. K. (2015). Teachers and Teaching: Speech Production Accommodations Due to Changes in the Acoustic Environment. Energy Procedia, 78, 3102–3107. http://doi.org/10.1016/j.egypro.2015.11.764
Hunter, E. J., & Titze, I. R. (2009). Quantifying vocal fatigue recovery: Dynamic vocal recovery trajectories after a vocal loading exercise. Annals of Otology, Rhinology & Laryngology, 118(6), 449.
Hunter, E. J., & Titze, I. R. (2010). Variations in intensity, fundamental frequency, and voicing for teachers in occupational versus nonoccupational settings. Journal of Speech, Language, and Hearing Research, 53(4), 862–875.
Intille, S. S. (2004). Ubiquitous computing technology for just-in-time motivation of behavior change. Stud Health Technol Inform, 107(Pt 2), 1434–7.
Jiang, J., & Stern, J. (2004). Receiver operating characteristic analysis of aerodynamic parameters obtained by airflow interruption: A preliminary report. Annals of Otology, Rhinology & Laryngology, 113(12), 961–966.
Keefe, F. J., Lefebvre, J. C., Kerns, R. D., Rosenberg, R., Beaupre, P., Prochaska, J., … Caldwell, D. S. (2000). Understanding the adoption of arthritis self-management: stages of change profiles among arthritis patients. Pain, 87(3), 303–313.
Kempster, G. B., Gerratt, B. R., Abbott, K. V., Barkmeier-Kraemer, J., & Hillman, R. E. (2009). Consensus auditory-perceptual evaluation of voice: development of a standardized clinical protocol. American Journal of Speech-Language Pathology, 18(2), 124–132.
Kim, C.-J., Hwang, A.-R., & Yoo, J.-S. (2004). The impact of a stage-matched intervention to promote exercise behavior in participants with type 2 diabetes. International Journal of Nursing Studies, 41(8), 833–841. http://doi.org/10.1016/j.ijnurstu.2004.03.009
Kopf, L. M., Shrivastav, R., & Eddins, D. A. (2013, August). Isolating the effects of strain on voice quality perception. Poster presented at the PEVOC, Prague, Czech Republic.
Kopf, L. M., Shrivastav, R., Eddins, D. A., Skowronski, M. D., & Hunter, E. J. (2014). A Comparison of Voice Signal Typing and Pitch Strength. Presented at the American Speech-Language-Hearing Association (ASHA) Convention, Orlando, FL.
Kreiman, J., Gerratt, B. R., Kempster, G. B., Erman, A., & Berke, G. S. (1993). Perceptual evaluation of voice quality: review, tutorial, and a framework for future research. Journal, 36(1), 21.
Lee, S. Y., Hwang, H., Hawkins, R., & Pingree, S. (2008). Interplay of Negative Emotion and Health Self-Efficacy on the Use of Health Information and Its Outcomes. Communication Research, 35(3), 358–381. http://doi.org/10.1177/0093650208315962
Lim, B. Y., Shick, A., Harrison, C., & Hudson, S. E. (2011). Pediluma: motivating physical activity through contextual information and social influence. In Proceedings of the fifth (pp. 173–180). ACM. Retrieved from http://dl.acm.org/citation.cfm?id=1935736
Lucas Jr., H. C. (1971). A user-oriented approach to systems design. In Proceedings of the 1971 26th annual conference (pp. 325–338). New York, NY, USA: ACM.
Ma, E. P.-M., & Yiu, E. M.-L. (2005). Suitability of acoustic perturbation measures in analysing periodic and nearly periodic voice signals. Folia Phoniatrica et Logopaedica, 57(1), 38–47.
Manson, J. E., Hu, F. B., Rich-Edwards, J. W., Colditz, G. A., Stampfer, M. J., Willett, W. C., … Hennekens, C. H. (1999). A Prospective Study of Walking as Compared with Vigorous Exercise in the Prevention of Coronary Heart Disease in Women. New England Journal of Medicine, 341(9), 650–658. http://doi.org/10.1056/NEJM199908263410904
Marshall, S. J., & Biddle, S. J. (2001). The transtheoretical model of behavior change: a meta-analysis of applications to physical activity and exercise. Annals of Behavioral Medicine, 23(4), 229–246.
Maryn, Y., De Bodt, M., & Roy, N. (2010). The Acoustic Voice Quality Index: Toward improved treatment outcomes assessment in voice disorders. Journal of Communication Disorders, 43(3), 161–174. http://doi.org/10.1016/j.jcomdis.2009.12.004
Mazza, R. (2006). Evaluating Information Visualization Applications with Focus Groups: The CourseVis Experience. In Proceedings of the 2006 AVI Workshop on BEyond Time and Errors: Novel Evaluation Methods for Information Visualization (pp. 1–6). New York, NY, USA: ACM. http://doi.org/10.1145/1168149.1168155
McCabe, D. J., & Titze, I. R. (2002). Chant Therapy For Treating Vocal Fatigue Among Public School Teachers: A Preliminary Study. American Journal of Speech-Language Pathology, 11(4), 356–369.
McConnaughy, E. A., Prochaska, J. O., & Velicer, W. F. (1983). Stages of change in psychotherapy: Measurement and sample profiles. Psychotherapy: Theory, Research & Practice, 20(3), 368–375. http://doi.org/http://dx.doi.org.proxy2.cl.msu.edu/10.1037/h0090198
McCrory, E. (2001). Voice therapy outcomes in vocal fold nodules: a retrospective audit. International Journal of Language & Communication Disorders, 36, 19–24.
McNaney, R., Lindsay, S., Ladha, K., Ladha, C., Schofield, G., Ploetz, T., … others. (2011). Cueing swallowing in Parkinson's disease. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (pp. 619–622). ACM. Retrieved from http://dl.acm.org/citation.cfm?id=1979030
McNaney, R., Othman, M., Richardson, D., Dunphy, P., Amaral, T., Miller, N., … Vines, J. (2016). Speeching: Mobile Crowdsourced Speech Assessment to Support Self-Monitoring and Management for People with Parkinson's (pp. 4464–4476). ACM Press. http://doi.org/10.1145/2858036.2858321
McNaney, R., Poliakov, I., Vines, J., Balaam, M., Zhang, P., & Olivier, P. (2015). LApp: A Speech Loudness Application for People with Parkinson's on Google Glass (pp. 497–500). ACM Press. http://doi.org/10.1145/2702123.2702292
McNaney, R., Vines, J., Roggen, D., Balaam, M., Zhang, P., Poliakov, I., & Olivier, P. (2014). Exploring the acceptability of google glass as an everyday assistive device for people with parkinson's (pp. 2551–2554). ACM Press. http://doi.org/10.1145/2556288.2557092
Misono, S., Banks, K., Gaillard, P., Goding, G. S., & Yueh, B. (2015). The clinical utility of vocal dosimetry for assessing voice rest: Vocal Dosimetry for Voice Rest. The Laryngoscope, 125(1), 171–176. http://doi.org/10.1002/lary.24887
Moore, B. C. J. (1983). Suggested formulae for calculating auditory-filter bandwidths and excitation patterns. The Journal of the Acoustical Society of America, 74(3), 750. http://doi.org/10.1121/1.389861
Moore, B. C. J., Glasberg, B. R., & Baer, T. (1997). A Model for the Prediction of Thresholds, Loudness, and Partial Loudness. Journal of the Audio Engineering Society, 45(4), 224–240.
Nanjundeswaran, C., Jacobson, B. H., Gartner-Schmidt, J., & Verdolini Abbott, K. (2015). Vocal Fatigue Index (VFI): Development and Validation. Journal of Voice, 29(3). http://doi.org/10.1016/j.jvoice.2014.09.012
Nanjundeswaran, C., Li, N. Y. K., Chan, K. M. K., Wong, R. K. S., Yiu, E. M.-L., & Verdolini-Abbott, K. (2012). Preliminary Data on Prevention and Treatment of Voice Problems in Student Teachers. Journal of Voice, 26(6), 816.e1-816.e12. http://doi.org/10.1016/j.jvoice.2012.04.008
Ohlsson, A.-C., Andersson, E. M., Södersten, M., Simberg, S., Claesson, S., & Barregård, L. (2015). Voice Disorders in Teacher Students – A Prospective Study and a Randomized Controlled Trial. Journal of Voice. http://doi.org/10.1016/j.jvoice.2015.09.004
Oinas-Kukkonen, H., & Harjumaa, M. (2008). A systematic framework for designing and evaluating persuasive systems. In Persuasive Technology (pp. 164–176). Springer. Retrieved from http://link.springer.com/chapter/10.1007/978-3-540-68504-3_15
Pereira, L. P. de P., Masson, M. L. V., & Carvalho, F. M. (2015). Vocal warm-up and breathing training for teachers: randomized clinical trial. Revista de Saúde Pública, 49. http://doi.org/10.1590/S0034-8910.2015049005716
Pinkowski, B. (1993). LPC spectral moments for clustering acoustic transients. IEEE Transactions on Speech and Audio Processing, 1(3), 362–368.
Portone, C., Johns III, M. M., & Hapner, E. R. (2008). A Review of Patient Adherence to the Recommendation for Voice Therapy. Journal of Voice, 22(2), 192–196. http://doi.org/10.1016/j.jvoice.2006.09.009
Prochaska, J. O., & DiClemente, C. C. (1983). Stages and processes of self-change of smoking: toward an integrative model of change. Journal of Consulting and Clinical Psychology, 51(3), 390–395.
Prochaska, J. O., & DiClemente, C. C. (1984). Self change processes, self efficacy and decisional balance across five stages of smoking cessation. Progress in Clinical and Biological Research, 156, 131–140.
Prochaska, J. O., Norcross, J. C., Fowler, J. L., Follick, M. J., & Abrams, D. B. (1992). Attendance and outcome in a work site weight control program: Processes and stages of change as process and predictor variables. Addictive Behaviors, 17(1), 35–45. http://doi.org/10.1016/0306-4603(92)90051-V
Ramig, L. O., & Verdolini, K. (1998). Treatment efficacy: Voice disorders. Journal of Speech, Language, and Hearing Research, 41(1), S101–S106.
Richter, B., Nusseck, M., Spahn, C., & Echternach, M. (2016). Effectiveness of a Voice Training Program for Student Teachers on Vocal Health. Journal of Voice, 30(4), 452–459. http://doi.org/10.1016/j.jvoice.2015.05.005
Rosen, K., Murdoch, B., Folker, J., Vogel, A., Cahill, L., Delatycki, M., & Corben, L. (2010). Automatic method of pause measurement for normal and dysarthric speech. Clinical Linguistics & Phonetics, 24(2), 141–154. http://doi.org/10.3109/02699200903440983
Rossi-Barbosa, L. A., Gama, A. C. C., & Caldeira, A. P. (2015). Association between readiness for behavior change and complaints of vocal problems in teachers. CoDAS, 27(2), 170–177. http://doi.org/10.1590/2317-1782/20152013088
Roy, N., Barkmeier-Kraemer, J., Eadie, T., Sivasankar, M. P., Mehta, D., Paul, D., & Hillman, R. (2013). Evidence-Based Clinical Voice Assessment: A Systematic Review. American Journal of Speech-Language Pathology, 22(2), 212. http://doi.org/10.1044/1058-0360(2012/12-0014)
Roy, N., Merrill, R. M., Gray, S. D., & Smith, E. M. (2005). Voice Disorders in the General Population: Prevalence, Risk Factors, and Occupational Impact. The Laryngoscope, 115(11), 1988–1995. http://doi.org/10.1097/01.mlg.0000179174.32345.41
Schloneger, M. J. (2014). Assessments of Voice Use, Voice Quality, and Perceived Singing Voice Function Among College/University Singing Students Ages 18-24 Through Simultaneous Ambulatory Monitoring With Accelerometer and Acoustic Transducers. Retrieved from https://kuscholarworks.ku.edu/handle/1808/18407
Schloneger, M. J., & Hunter, E. J. (2016). Assessments of Voice Use and Voice Quality Among College/University Singing Students Ages 18-24 Through Ambulatory Monitoring With a Full Accelerometer Signal. Journal of Voice. http://doi.org/10.1016/j.jvoice.2015.12.018
Schneider-Stickler, B., Knell, C., Aichstill, B., & Jocher, W. (2012). Biofeedback on Voice Use in Call Center Agents in Order to Prevent Occupational Voice Disorders. Journal of Voice, 26(1), 51–62. http://doi.org/10.1016/j.jvoice.2010.10.001
Shrivastav, R. (2003). The use of an auditory model in predicting perceptual ratings of breathy voice quality. Journal of Voice, 17(4), 502–512. http://doi.org/10.1067/S0892-1997(03)00077-8
Shrivastav, R., & Camacho, A. (2010). A Computational Model to Predict Changes in Breathiness Resulting From Variations in Aspiration Noise Level. Journal of Voice, 24(4), 395–405. http://doi.org/10.1016/j.jvoice.2008.12.001
Shrivastav, R., Eddins, D. A., & Anand, S. (2012). Pitch strength of normal and dysphonic voices. The Journal of the Acoustical Society of America, 131, 2261.
Shrivastav, R., Eddins, D. A., & Kopf, L. M. (2014, October). What is a better descriptor of dysphonic voices: Fundamental frequency or pitch? Presented at the Fall Voice Conference, San Antonio, TX.
Shrivastav, R., & Sapienza, C. M. (2003). Objective measures of breathy voice quality obtained using an auditory model. The Journal of the Acoustical Society of America, 114(4), 2217. http://doi.org/10.1121/1.1605414
Shrivastav, R., Sapienza, C. M., & Nandur, V. (2005). Application of psychometric theory to the measurement of voice quality using rating scales. Journal of Speech, Language, and Hearing Research, 48(2), 323.
Slegers, K., Duysburgh, P., & Jacobs, A. (2010). Research methods for involving hearing impaired children in IT innovation. In Proceedings of the 6th Nordic Conference on Human-Computer Interaction (pp. 781–784). New York, NY, USA: ACM.
Solomon, N. (2008). Vocal fatigue and its relation to vocal hyperfunction. International Journal of Speech-Language Pathology, 10(4), 254–266.
Speyer, R. (2008). Effects of Voice Therapy: A Systematic Review. Journal of Voice, 22(5), 565–580. http://doi.org/10.1016/j.jvoice.2006.10.005
Stemple, J. C. (2000). (2nd ed.). Clifton Park, NY: Singular Thomson Learning.
Sundberg, J., & Gauffin, J. (1978). Waveform and spectrum of the glottal voice source. Music and Hearing Quarterly Progress and Status Report, 19(2-3), 35–50.
Švec, J. G., Popolo, P. S., & Titze, I. R. (2003). Measurement of vocal doses in speech: experimental procedure and signal processing. Logopedics Phoniatrics Vocology, 28(4), 181–192. http://doi.org/10.1080/14015430310018892
Szabo Portela, A., Hammarberg, B., & Södersten, M. (2013). Speaking Fundamental Frequency and Phonation Time during Work and Leisure Time in Vocally Healthy Preschool Teachers Measured with a Voice Accumulator. Folia Phoniatrica et Logopaedica, 65(2), 84–90. http://doi.org/10.1159/000354673
Teixeira, L. C., Rodrigues, A. L. V., Silva, A. F. G. da, Azevedo, R., Gama, A. C. C., & Behlau, M. (2013). The use of the URICA-VOICE questionnaire to identify the stages of adherence to voice treatment. CoDAS, 25(1), 8–15.
Titze, I. R. (2000). Principles of Voice Production (2nd ed.). Iowa City, IA: National Center for Voice and Speech.
Titze, I. R. (2012). (NIH PROJECT NUMBER: R01DC004224) (pp. 1–19). Salt Lake City, UT.
Titze, I. R. (2015). On flow phonation and airflow management. Journal of Singing, 72(1), 57–58.
Titze, I. R., & Hunter, E. J. (2015). Comparison of Vocal Vibration-Dose Measures for Potential-Damage Risk Criteria. Journal of Speech, Language, and Hearing Research, 58(5), 1425. http://doi.org/10.1044/2015_JSLHR-S-13-0128
Titze, I. R., Hunter, E. J., & Švec, J. G. (2007). Voicing and silence periods in daily and weekly vocalizations of teachers. The Journal of the Acoustical Society of America, 121(1), 469. http://doi.org/10.1121/1.2390676
Titze, I. R., Lemke, J., & Montequin, D. (1997). Populations in the U.S. workforce who rely on voice as a primary tool of trade: a preliminary report. Journal of Voice, 11(3), 254–259. http://doi.org/10.1016/S0892-1997(97)80002-1
Titze, I. R., Švec, J. G., & Popolo, P. S. (2003). Vocal dose measures: Quantifying accumulated vibration exposure in vocal fold tissues. 46(4), 919.
Tufte, E. R. (1983). The Visual Display of Quantitative Information. Cheshire, Connecticut: Graphics Press.
Tufte, E. R. (2001). The Visual Display of Quantitative Information (2nd ed.). Cheshire, Connecticut: Graphics Press LLC.
van Leer, E. (2010). The role of social-cognitive factors in voice therapy adherence and outcomes (Ph.D.). The University of Wisconsin - Madison, United States -- Wisconsin. Retrieved from http://search.proquest.com.proxy2.cl.msu.edu/docview/861307914/abstract?accountid=12598
van Leer, E., & Connor, N. P. (2010). Patient Perceptions of Voice Therapy Adherence. Journal of Voice, 24(4), 458–469. http://doi.org/10.1016/j.jvoice.2008.12.009
van Leer, E., & Connor, N. P. (2012). Use of Portable Digital Media Players Increases Patient Motivation and Practice in Voice Therapy. Journal of Voice, 26(4), 447–453. http://doi.org/10.1016/j.jvoice.2011.05.006
van Leer, E., Hapner, E. R., & Connor, N. P. (2008). Transtheoretical model of health behavior change applied to voice therapy. Journal of Voice, 22(6), 688–698. http://doi.org/10.1016/j.jvoice.2007.01.011
van Leer, E., Pfister, R. C., & Zhou, X. (2016). An iOS-based Cepstral Peak Prominence Application: Feasibility for Patient Practice of Resonant Voice. Journal of Voice. http://doi.org/10.1016/j.jvoice.2015.11.022
Van Stan, J. H., Mehta, D. D., & Hillman, R. E. (2015). The Effect of Voice Ambulatory Biofeedback on the Daily Performance and Retention of a Modified Vocal Motor Behavior in Participants With Normal Voices. Journal of Speech, Language, and Hearing Research, 58(3), 713. http://doi.org/10.1044/2015_JSLHR-S-14-0159
Vilkman, E., Lauri, E.-R., Alku, P., Sala, E., & Sihvo, M. (1999). Effects of prolonged oral reading on F0, SPL, subglottal pressure and amplitude characteristics of glottal flow waveforms. Journal of Voice, 13(2), 303–312.
Villar, A. C. N. W. B., Korn, G. P., & Azevedo, R. R. (2016). Perceptual-auditory and Acoustic Analysis of Air Traffic Controllers' Voices Pre- and Postshift. Journal of Voice. http://doi.org/10.1016/j.jvoice.2015.10.021
Warhurst, S., Madill, C., McCabe, P., Ternström, S., Yiu, E., & Heard, R. (2016). Perceptual and Acoustic Analyses of Good Voice Quality in Male Radio Performers. Journal of Voice. http://doi.org/10.1016/j.jvoice.2016.05.016
Warhurst, S., McCabe, P., Yiu, E., Heard, R., & Madill, C. (2013). Acoustic Characteristics of Male Commercial and Public Radio Broadcast Voices. Journal of Voice, 27(5), 655.e1-655.e7. http://doi.org/10.1016/j.jvoice.2013.04.012
Werth, K., Voigt, D., Döllinger, M., Eysholdt, U., & Lohscheller, J. (2010). Clinical value of acoustic voice measures: a retrospective study. European Archives of Oto-Rhino-Laryngology, 267(8), 1261–1271. http://doi.org/10.1007/s00405-010-1214-2
Wilcox, N. S., Prochaska, J. O., Velicer, W. F., & DiClemente, C. C. (1985). Subject characteristics as predictors of self-change in smoking. Addictive Behaviors, 10(4), 407–412. http://doi.org/10.1016/0306-4603(85)90037-1
Wilson, M., & Wilson, T. P. (2005). An oscillator model of the timing of turn-taking. Psychonomic Bulletin & Review, 12(6), 957–968.