Freiburg vision test (FrACT): optimal number of trials?

Figure 3 shows all results in test–retest scatter plots, faceted by run length. The acuity range covered extended from 1.22 to -0.59 LogMAR, corresponding to a factor of ≈60 in visual angle. Considering only the 48-trial runs without artificial blur, the participants’ acuity ranged from 0.45 to -0.37 LogMAR.
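As a quick check of the "factor of ≈60" statement: LogMAR is the base-10 logarithm of the minimum angle of resolution, so a LogMAR difference converts to a ratio of visual angles by exponentiation. The function name below is illustrative only.

```python
def visual_angle_factor(logmar_worst, logmar_best):
    """Ratio of visual angles between two LogMAR values.

    LogMAR is log10 of the minimum angle of resolution,
    so a difference in LogMAR is a ratio of 10**difference.
    """
    return 10 ** (logmar_worst - logmar_best)


# Range reported in the text: 1.22 to -0.59 LogMAR (a span of 1.81 LogMAR)
factor = visual_angle_factor(1.22, -0.59)
print(round(factor, 1))  # 64.6
```

The exact value of about 65 is in line with the rounded "≈60" quoted above.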

Fig. 3

Outcome for all participants (26), eye conditions (3, OU, OD, OS) and run lengths, arranged in test–retest scatter plots segregated by run length (number of trials). Note the inverted axes: good acuity rightward / upward. The bottom right insets indicate the respective limits of agreement

Data points at or close to the 45-degree identity line indicate low variability (low LoA). Inspection of the figure immediately shows that all data points far away from this line correspond to low trial numbers, as we would expect.

Obviously, short runs often show outliers (points far away from the 45° identity line).

Limits of agreement (LoA) depend on the number of trials.

When there is a marked bias (mean test–retest difference), the lower and upper limits of agreement differ in magnitude. Here, the bias was only 0.008 LogMAR, with the 95% confidence interval ranging from 0.0 to 0.017 LogMAR. A t-test for difference from zero gave a p-value of 0.058. That bias corresponds to less than one-tenth of a line (one line being 0.1 LogMAR), with the acuity from the second run being slightly better, and might well be due to chance. Given the low bias, I here simply report the mean LoA (Eq. 1).
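The relation between bias and the two limits can be sketched in a few lines. This is a minimal illustration that assumes the standard Bland–Altman definition (limits = bias ± 1.96 × SD of the differences) and takes the "mean LoA" to be the half-width 1.96 × SD. Eq. 1 is not reproduced in this section, so that reading is an assumption, as are the sign convention and the made-up acuity values.

```python
import math
import statistics


def bland_altman(test, retest):
    """Bias and limits of agreement for paired LogMAR values.

    Assumed convention: difference = test - retest, so a positive bias
    means the retest value is lower (better acuity) on average.
    Needs at least two pairs whose differences are not all identical.
    """
    diffs = [t - r for t, r in zip(test, retest)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)           # sample SD (n - 1)
    half_width = 1.96 * sd                 # "mean LoA" under the stated assumption
    t_stat = bias / (sd / math.sqrt(len(diffs)))  # one-sample t statistic vs. zero
    return {
        "bias": bias,
        "loa": half_width,
        "lower": bias - half_width,
        "upper": bias + half_width,
        "t": t_stat,
    }


# Made-up example values, for illustration only (not study data)
test_run = [0.10, -0.05, 0.30, 0.00, 0.22]
retest_run = [0.08, -0.02, 0.27, 0.01, 0.20]
print(bland_altman(test_run, retest_run))
```

With a bias near zero, the lower and upper limits are almost symmetric, which is why a single ± value suffices.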

Figure 4 depicts the test–retest limits of agreement (LoA, blue solid line) versus the number of trials. The error bars indicate the bootstrapped 95% confidence intervals. LoA decreases with a growing number of trials (p = 0.015): first steeply from 6 to 18 trials (± 0.47 to ± 0.17 LogMAR), then more slowly to ± 0.12 LogMAR. The LoA vs. number-of-trials characteristic exhibits a pronounced change in slope (“kink”) at 18 trials (Fig. 4). The test–retest differences from which LoA is calculated were found to be non-normally distributed, with an excess of outliers. A re-calculation with an “empirical LoA” (non-parametric, based on quantiles, see repository for code) found only slight deviations: the empirical LoA is typically lower by 0.01 LogMAR than the Bland–Altman LoA. This is within the symbol size of Fig. 4 and has no effect on interpretation.
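The two LoA variants and the bootstrap interval can be sketched as follows. This is not the repository code: the quantile-based definition (half the distance between the 2.5% and 97.5% quantiles of the differences) and the percentile bootstrap are assumptions chosen for illustration, and the data are synthetic.

```python
import random
import statistics


def parametric_loa(diffs):
    """Bland-Altman half-width: 1.96 x sample SD of the differences."""
    return 1.96 * statistics.stdev(diffs)


def empirical_loa(diffs):
    """Non-parametric half-width from the 2.5% and 97.5% quantiles (assumed definition)."""
    cuts = statistics.quantiles(diffs, n=40, method="inclusive")  # steps of 2.5%
    return (cuts[-1] - cuts[0]) / 2


def bootstrap_ci(diffs, stat, n_boot=2000, seed=0):
    """Approximate 95% percentile bootstrap interval for stat(diffs)."""
    rng = random.Random(seed)
    boot = sorted(stat(rng.choices(diffs, k=len(diffs))) for _ in range(n_boot))
    return boot[int(0.025 * n_boot)], boot[int(0.975 * n_boot) - 1]


# Synthetic differences (78 = 26 participants x 3 eye conditions), not study data
rng = random.Random(1)
diffs = [rng.gauss(0.0, 0.06) for _ in range(78)]

print("parametric LoA:", parametric_loa(diffs))
print("empirical LoA: ", empirical_loa(diffs))
print("bootstrap CI:  ", bootstrap_ci(diffs, parametric_loa))
```

For normally distributed differences the two estimates should roughly agree; heavy tails or outliers pull them apart.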

Fig. 4

LoAs (test–retest Limits of Agreement) vs. the number of trials. Blue disks (with connecting lines and error bars): standard LoA values with bootstrapped 95% confidence intervals. Blue circles: “empirical” non-parametric LoAs. Red crosses: a simple square-root-of-n model as explained in the text.

A reviewer enquired about theoretical grounds for the run-length findings. A simple model of the LoA behavior was therefore set up, assuming that LoA is mainly governed by the square-root-of-n law, scaling with \(1/\sqrt{n}\). This model initially provided a poor fit. Further thought revealed that the first 3 trials are “easy” and nearly always correct, since by default FrACT follows the sequence [LogMAR = 1.0, 0.7, 0.4, 0.1…] until an error occurs (cf. Figure 1). Thus the initial steps just train the participant but do not provide much threshold information; depending on the participant’s acuity, the first informative error will occur when a step nears the acuity threshold. It therefore seemed appropriate to subtract a constant trialOffset from the number of trials, and experimentation suggested a trialOffset of 3, so \(n_{\text{eff}} = nTrials - trialOffset\). LoA would then scale with run length as follows:

$$LoA \sim 1/\sqrt{n_{\text{eff}}} \tag{2}$$

Applying Eq. (2) and additionally normalizing so that model and data coincide at \(nTrials=6\) yielded the modeled values indicated by red crosses in Fig. 4. Considering that there are only two free parameters in this simplistic model, namely the trialOffset and the overall scaling that lets the first pair of points coincide, the fit is surprisingly close, bringing out all the more the “kink” at 18 trials.
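The model can be written out in a few lines. The sketch below assumes only what the text states: a 1/√(nTrials − trialOffset) dependence with trialOffset = 3, normalized to the measured LoA of ± 0.47 LogMAR at 6 trials. The function and parameter names are illustrative.

```python
import math


def model_loa(n_trials, loa_ref=0.47, n_ref=6, trial_offset=3):
    """Square-root-of-n model, Eq. (2), normalized to loa_ref at n_ref trials."""
    if n_trials <= trial_offset:
        raise ValueError("n_trials must exceed trial_offset")
    return loa_ref * math.sqrt((n_ref - trial_offset) / (n_trials - trial_offset))


# Run lengths mentioned in the text
for n in (6, 18, 48):
    print(n, round(model_loa(n), 2))
# 6 0.47
# 18 0.21
# 48 0.12
```

For comparison, the measured values quoted above are ± 0.17 LogMAR at 18 trials and ± 0.12 LogMAR at 48 trials.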

LoA decreases steeply until 18 trials, showing a “kink” (slope change) there, then declines more gradually. Differences between parametric and non-parametric LoAs are of no consequence.

Limits of agreement (LoA) do not depend on the acuity range.

Figure 5 depicts the limits of agreement (LoA) for run lengths of 18 and 48 trials across a number of visual acuity ranges. Acuity (average of test and retest) was binned in 0.3 LogMAR intervals, and the LoA computed per group. Error bars represent the bootstrapped 95% confidence intervals for LoA. An ANOVA with factors trialLength and acuityGroup revealed a significant effect of trialLength (p < 0.01), but not of acuityGroup (p = 0.13) [LoA was normally distributed]. LoA therefore does not seem to depend on visual acuity across the full 1.6 LogMAR range from 1.13 to –0.47 LogMAR (these values differ from those above because here they are derived from the averaged test–retest values).
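The grouping step can be sketched as follows. The bin origin (edges at multiples of 0.3 LogMAR) and the use of 1.96 × SD as LoA are assumptions made for illustration; the actual analysis code may bin differently.

```python
import math
import statistics
from collections import defaultdict


def loa_by_acuity_bin(pairs, bin_width=0.3):
    """Group (test, retest) pairs by their mean acuity and compute LoA per bin.

    Returns {bin_index: LoA}, where bin_index k covers
    [k * bin_width, (k + 1) * bin_width) LogMAR.
    Bins with fewer than two pairs are skipped (no SD can be computed).
    """
    groups = defaultdict(list)
    for test, retest in pairs:
        mean_acuity = (test + retest) / 2
        groups[math.floor(mean_acuity / bin_width)].append(test - retest)
    return {
        k: 1.96 * statistics.stdev(d)
        for k, d in sorted(groups.items())
        if len(d) >= 2
    }


# Made-up pairs, for illustration only (not study data)
pairs = [(0.05, 0.15), (0.10, 0.20), (0.40, 0.50), (0.45, 0.35), (-0.10, -0.20)]
print(loa_by_acuity_bin(pairs))
```

Averaging test and retest before binning avoids sorting a pair into different groups depending on which run is used, which is also why the acuity range here differs slightly from the single-run range.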

Fig. 5

LoA (test–retest Limits of Agreement) versus grouped visual acuity for run lengths of 18 (reddish) and 48 (greenish) trials. Error bars indicate the bootstrapped 95% confidence intervals for LoA

LoA is lower for 48 than for 18 trials in all acuity ranges while largely independent of acuity within the error margins.
