Glottic insufficiency caused by vocal fold atrophy with or without sulcus: systematic review of outcome measurements

In this review we identified the OMIs most used to evaluate treatment effect in patients with non-paralytic glottic insufficiency caused by vocal fold atrophy with and without sulcus. A total of 50 different OMIs were identified with 19 of these accounting for 80% of total reported OMIs. Of these 19 OMIs, five showed a significant change after treatment in more than half of the studies where they were used.

Interestingly, most of the ten most frequently used parameters were acoustic, aerodynamic or stroboscopic. Only one patient self-evaluation parameter reached the top ten, the VHI-30, ranked 9th, even though self-evaluation is one of the most clinically relevant tools for measuring treatment outcome in daily practice. Additionally, several studies have proposed that it is the most reliable tool for evaluating treatment response in this patient population [2, 45, 46]. This review shows that it has a much higher percentage of significance than the acoustic or endoscopic parameters.

It is well known that assessing voice outcome after treatment is complex and that multidimensional evaluation is necessary. With the large body of OMIs available, choosing representative and reliable parameters is challenging, and much evidence indicates that disease specific core outcome sets (COS) of OMIs are needed [10]. Before formulating such a COS, a basic overview of the parameters used in the literature is required. It is important to emphasize that the parameters found in this review to be the most frequently used for patients with non-paralytic glottic insufficiency caused by vocal fold atrophy with or without sulcus may not necessarily be the most appropriate for this cohort. To properly assess the usefulness of an OMI, not only frequency of use, but also clinical relevance, applicability and psychometric validity are important factors to consider [10].

However, as a starting point, it is valuable to have insight into the choices that are currently being made by clinicians. The findings of this review can guide further initiatives on the route to a COS by indicating which parameters should be prioritized going forwards. The top OMIs revealed by this review as well as the factors for determining the ultimate relevance of an OMI are discussed below.

Subjective OMIs

The VHI-30 was the most frequently used subjective OMI (n = 10, 9th rank) and had a very high percentage of significance at 90%. The VHI-10 was the second most used (n = 7, 12th rank) with a lower percentage of significance of 57%. Around 75% of the studies in this review (26 out of 34) used a form of subjective rating. Subjective evaluation is one of the most clinically relevant tools in the communication with patients. The VHI is a robust questionnaire with translations and validations in many languages, and it has the best supported psychometric properties according to the COSMIN taxonomy [47, 48]. One may argue that the VHI is designed for dysphonic patients in general and not specifically for patients with glottic insufficiency, and that a questionnaire designed specifically for glottic insufficiency may be preferred over the VHI. More focused questionnaires for a future COS could be the glottal function index (GFI) [49], vocal fatigue index (VFI) [50] or vocal fatigue handicap questionnaire (VFHQ) [51]. Among these disease specific questionnaires, the GFI has the advantage of being the only one with a moderate positive rating on psychometric properties [48].

However, it is also important to consider that, instead of incorporating ever more detailed disease specific PROMs (patient-reported outcome measures), there is also a countercurrent in the literature supporting the development and use of generic PROMs focusing on general health aspects such as physical, mental and social health, including quality of sleep or ability to work. One initiative to develop and measure generic PROMs is PROMIS (Patient-Reported Outcomes Measurement Information System), a system for measuring generic PROMs that can be used across different health problems and diseases (www.healthmeasures.net). A generic, non-disease specific health survey such as the EQ-5D (EuroQol 5D) may also be of interest as a quality of life measurement instrument, because it can be used for cost-utility analysis by measuring quality-adjusted life years (QALYs) [52].

Perceptual OMIs

The GRBAS was the most frequently used perceptual OMI (n = 18, 5th rank), with a percentage of significance of 72%. GRBAS is a widely used perception scale. The G, general grade, has a satisfactory inter- and intra-rater reliability and is therefore suitable as a single OMI. In the latest ELS proposal the use of the complete GRBAS scale is preferred [4]. A main disadvantage of using perceptual OMIs in patients with non-paralytic glottic insufficiency is that structural defects, such as sulcus, are not always addressed by treatments such as medialization procedures, where the primary goal is to improve endurance and not the perceptual quality of the voice [17, 22].

Acoustic OMIs

Interestingly, our review showed that studies relied heavily on acoustic OMIs such as fundamental frequency (F0) (n = 20, 2nd rank), shimmer (n = 19, 4th rank), jitter (n = 18, 6th rank) and noise to harmonic ratio (NHR) (n = 11, 8th rank), even though none of these acoustic parameters achieved a percentage of significance above 50%. Their high frequency of use is likely due to them being automatically provided by most voice programs, but their clinical usefulness may be less well defined. In our experience and that of others, they are less intuitive in communication with patients, and they have been shown not to correspond to more clinically relevant parameters [2, 17, 22, 45].

Nevertheless, acoustic OMIs could potentially aid in detecting differences in the regularity of phonation that may be missed with broader parameters such as perceptual evaluation. The challenge would be to find the appropriate ones for this specific patient population from the large number of parameters available. Despite its low ranking and lack of significance in our review, one example could be the soft phonation index (SPI) (n = 4, 19th rank, 0% percentage of significance), which reflects the approximation of the vocal folds [53]. Its possible usefulness has been shown in unilateral nodules but, to our knowledge, has not been clarified in atrophy and/or sulcus [54]. Inconsistency in normal values and increased SPI for pressed phonation have been seen [53, 54]. This may hamper the interpretation of SPI in atrophy and sulcus.

Aerodynamic OMIs

Of the top 19 OMIs used, three were aerodynamic: maximum phonation time (MPT) (n = 19, 3rd rank), mean flow rate (MFR) (n = 10, 10th rank), and dynamic range (DR) (n = 5, 15th rank). MPT is a well-known voice parameter; it is simple and reliably obtainable, with the disadvantage that normative data differ between sub-populations depending on gender or age [55, 56]. MPT has been found to be the most used and most significant OMI for UVFP (90% percentage of significance) [5]. Our results indicated a less prominent role in our patient group (68% percentage of significance), possibly due to the difference in underlying pathology, including the degree of glottal gap that needs correction. Aerodynamic OMIs that require a pneumotachograph are less easy to obtain, e.g. MFR, with the phonation quotient (PQ; vital capacity/MPT) as an alternative. MFR may be of value for glottic insufficiency with mobile vocal folds, as it is for the immobile vocal fold in UVFP, where Desuter et al. reported a relatively high ranking and percentage of significance (86%) [5].
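The phonation quotient mentioned above is a simple ratio, which can be sketched as follows. This is only a minimal illustration: the function name, the choice of millilitres and seconds as units, and the input values are assumptions made here and are not taken from the reviewed studies.

```python
def phonation_quotient(vital_capacity_ml: float, mpt_s: float) -> float:
    """Return the phonation quotient (PQ) in mL/s.

    PQ is vital capacity divided by maximum phonation time (MPT),
    as given in the text. Units (mL and s) are an assumption of this sketch.
    """
    if mpt_s <= 0:
        raise ValueError("MPT must be a positive duration in seconds")
    if vital_capacity_ml < 0:
        raise ValueError("vital capacity cannot be negative")
    return vital_capacity_ml / mpt_s


# Hypothetical values for illustration only, not patient data.
pq = phonation_quotient(vital_capacity_ml=3000.0, mpt_s=15.0)
print(f"PQ = {pq:.0f} mL/s")
```

With a fixed vital capacity, a shorter MPT yields a higher PQ, which is why the ratio can serve as an indirect indicator of air use during phonation when a pneumotachograph is not available.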

Another measurement of interest is the phonation threshold pressure (Pth). It reflects the minimum subglottic pressure needed to reach phonation onset and sustain phonation [57]. It may be more appropriate for capturing subtle changes in subglottic pressure when comparing pre- and posttreatment effect. It has found only limited use to date, although a preliminary study in 2021 showed that measuring Pth in UVFP is feasible [58]. Contributing factors for this limited use may be variations in procedural methodology for task elicitation as well as environmental and participant inconsistencies that might affect phonation threshold pressure values [59].

Endoscopic OMIs

Mucosal wave was the most used OMI (n = 21, 1st rank) followed by glottic closure (n = 18, 7th rank), although both had a relatively low percentage of significance (47% and 28% respectively). It is therefore debatable whether endoscopic parameters are the most suitable OMIs for this patient population, given the inherent inter-observer bias associated with this form of assessment and the combined pathology of atrophy and sulcus, which makes the examinations even more difficult to assess [4, 60].

However, as endoscopy is broadly used in this patient group, more systematic and detailed videolaryngostroboscopic assessment protocols should be investigated, e.g. as described in VALI (Voice-Vibratory Assessment with Laryngeal Imaging) [61]. Frame-by-frame analysis (FBFA) could also be useful [62]. Another possibility would be to use disease specific laryngoscopic assessments. For vocal fold atrophy, the reliability of laryngoscopic features has been investigated with satisfactory results, and recently a validated classification of presbylarynx based on laryngoscopic findings has been published [63, 64].

As stated in the introduction, to properly assess the usefulness of an OMI, before it can be included in a COS, quality assessment has to be performed. In doing so, not only frequency of use, but also clinical relevance, applicability and psychometric validity are important factors to consider [10].

To address the issue of relevance we calculated the “percentage of significance” for the most frequently used OMIs, defined by Desuter et al. as the number of studies with a significant change in a specific OMI divided by the total number of studies using this OMI, expressed as a percentage [5]. We found the VHI-30 to be the only OMI with a percentage of significance higher than 80%, and the VHI-10, GRBAS, MPT and the APQ to be the only other parameters with 50% or more. Interestingly, Desuter et al. found percentages of significance higher than 80% for MPT (90%), mean airflow (86%) and the G of the GRBAS (85%) in their review on unilateral vocal fold paralysis. We hypothesize that this difference may reflect the pathophysiological difference between glottic insufficiency with mobile vocal folds and UVFP, supporting the notion that the relevance of OMIs may differ from disease to disease.
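The definition of the percentage of significance can be written as a short calculation. The sketch below is illustrative only: the function name is hypothetical, and the counts of studies with a significant change are not reported in this section but are back-calculated here from the reported totals and percentages (VHI-30: n = 10, 90%; VHI-10: n = 7, 57%), so they should be read as assumptions.

```python
def percentage_of_significance(n_significant: int, n_total: int) -> float:
    """Percentage of studies using an OMI that reported a significant change.

    n_significant: studies with a significant change in the OMI
    n_total: all studies that used the OMI
    """
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    if not 0 <= n_significant <= n_total:
        raise ValueError("n_significant must lie between 0 and n_total")
    return 100.0 * n_significant / n_total


# Assumed counts, back-calculated from the reported n and percentages.
example_counts = {
    "VHI-30": (9, 10),
    "VHI-10": (4, 7),
}

for omi, (n_sig, n_tot) in example_counts.items():
    print(f"{omi}: {percentage_of_significance(n_sig, n_tot):.0f}%")
```

Because the measure only counts whether a study reached statistical significance, it does not reflect effect size or sample size, which is consistent with the caution expressed in the text about equating statistical significance with clinical relevance.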

Studies tend to report mainly on the statistical significance of a change in an OMI, which does not necessarily correspond to a difference that is clinically relevant. But for patients and health professionals clinically relevant changes in outcome are of great importance.

Until now, the clinical relevance of a certain outcome has often been consensus based [31]. However, values for clinically relevant changes have been suggested for some of these OMIs. Van Gogh et al. defined what constitutes a clinically relevant change for the VHI-30 based on a selected Dutch population with dysphonia after treatment for early glottic cancer or benign voice disorders and a normal population [65]. More recently Young et al. formulated the MCID (minimal clinically important difference) for VHI-10 in patients with vocal fold paralysis. The authors highlight that not only the numerical change within a parameter that represents a minimal clinically relevant change is important, but also that this value may be disease specific [66]. Therefore, some OMIs may not be as valuable for a specific disease as traditionally assumed.

Applicability, whether a test can be performed or not, depends on logistic, technical and financial possibilities and limitations. For acoustic, aerodynamic, but also endoscopic OMIs this can be a limiting factor. For acoustic measurements, special voice program software is needed to record and store a phonetogram, and to extract, calculate and store various voice parameters. These programs are commercially available, e.g. MDVP (multidimensional voice program software, computerized speech laboratory (KayPENTAX, Montvale, NJ)), and each has its own set of parameters. For aerodynamic parameters such as MFR, a pneumotachograph is needed (phonatory aerodynamic system (PAS), KayPENTAX, Montvale, NJ).

The last important factor is psychometric validity. Psychometric validity has only been investigated for subjective OMIs [48, 67]. In the study of Francis et al., the development and validation of 32 PROMs were reviewed, revealing gross psychometric weaknesses such as lack of patient involvement, lack of robust construct validity, and lack of interpretability and scaling [67]. Speyer et al. reported on the psychometric properties of 15 PROMs and concluded that many psychometric data were missing or indeterminate, with the VHI seeming to be the most promising questionnaire [48].

This study has several weaknesses. First of all, no formal Risk of Bias (RoB) assessment was performed. We considered this to be of limited added value, because most studies (32 out of 34) were cohort studies, of which 25 were retrospective, with a comparable risk of bias. Of the 2 clinical trials, only one was a double blind RCT, which has a low RoB. Secondly, no formal meta-analysis was performed. As statistical significance does not always correspond with clinical relevance, we chose the “percentage of significance” to capture relevance, although this may not be the most thorough way of doing so. Lastly, we would like to emphasize that the most frequently used OMIs collected in this review do not de facto represent the most appropriate OMIs for this patient group, and that besides frequency of use, clinical relevance, applicability, and psychometric validity are also important factors to consider.
