The primary objective of this study was to evaluate the validity and level of agreement between the new DormoTech Vlab device and gold-standard Nox A1 PSG. The study was conducted between June 2023 and August 2023 and was a comparative, observational clinical trial. Participants were recruited during attendance at the Shamir Medical Center Sleep Laboratory (Be’er Yaakov, Israel) or Millenium Sleep Clinic (Be’er Sheva, Israel) where PSG is routinely performed using the Nox A1 system (Nox Medical, Reykjavik, Iceland, K192469). All participants had been referred for PSG by their physician and were not selected in advance by the investigators. All participants gave informed consent prior to their inclusion in the study. Inclusion criteria included participant willingness and ability to comply with protocol requirements, giving informed consent, and an age restriction of 22 years or older. Participants were excluded if they were not willing to sign informed consent, had implanted electronic devices or used exterior electronic devices during the procedure, had known allergies to device materials, had skin irritation or open wounds at the device placement site, or were pregnant. All procedures were approved by the IEC committee on ethics of Shamir Medical Center on April 24, 2023, study number 0067-23-ASF.
Participants consenting to the study were required to wear the DormoTech Vlab device in parallel with the Nox system and to complete questionnaires both pre- and post-study. Attending laboratory technicians applied both the Nox and Vlab devices to the participants. Both devices were worn throughout the night, and data from both devices were collected for analysis. Participant sample size was estimated following EP09-A3 and FDA guidance. A total of 47 participants were recruited across two sites. Following descriptive analyses, 2 participants were excluded due to technical issues during the study (i.e., device not being fully charged) and 2 participants withdrew their consent during the sleep study. Of the remaining participants, 2 were defined in advance by the researchers as pre-test cases and were therefore not part of the clinical data analysis.
Recording process and data extraction
Conventional PSG recordings were performed using the Nox A1 device and Noxturnal Software System (version V.6.3). Vlab recordings were performed using the Vlab device. Recordings from both devices were saved on the laboratory’s local server and were accessible only by the laboratory technician. Separate EDF files from both devices were exported for scoring in the Noxturnal Software System. Signals recorded from the Vlab device were not modified or filtered before being exported to EDF format, to allow accurate analysis using Noxturnal software.
Channels recorded by the Vlab device included electroencephalogram (EEG), electrocardiogram (ECG), electrooculogram (EOG), electromyogram (EMG), airflow, respiratory sound/snore, oxygen saturation, heart rate, respiratory effort, and body position. The gold-standard device, Nox A1, included channels recording EEG, EOG, and EMG, pressure, respiratory effort, airflow, respiratory sound/snore, position, and ambient light.
Sleep study scoring
Sleep staging for both recordings (Vlab and Nox) was performed using Noxturnal Software System (version V.6.3), allowing for manual scoring. In this study, all data were manually scored using the American Academy of Sleep Medicine (AASM) guidelines, and respiratory events were scored according to AASM scoring rules. An apnea was scored if airflow was absent for at least 10 s, and a hypopnea if airflow dropped by ≥ 30% of pre-event baseline in association with an arousal or an oxygen desaturation of 3% or 4%, according to recording location. Results from these two desaturation rules were compared and found to be similar, so they were combined in population analyses.
Two trained PSG scorers, one for each site, performed the scoring. The same scorer scored both Vlab and Nox recordings for each participant at their site. Sleep scoring was performed blind to participant identification, as EDF files had no identifying information other than the participant’s study ID. Data from the two simultaneously recorded tests were not scored sequentially or in the same scoring session.
Collected measurements
Descriptive statistics, including age, BMI, and sex, were calculated to characterize all participants. Categorical variables (e.g., active smokers, sex) were presented as counts or frequency distributions. Continuous measurements were summarized as mean (± standard deviation).
The primary endpoint measurement was the apnea-hypopnea index (AHI) and its severity level. AHI was calculated as the number of apnea and hypopnea events per hour of sleep. AHI severity was then categorized based on accepted severity levels (normal AHI < 5, mild 5–14, moderate 15–29, severe ≥ 30).
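As an illustration of the AHI definition and severity categories above, the following is a minimal Python sketch, not the scoring software used in the study. The function names are hypothetical, and the code assumes contiguous bins with cut-offs at 5, 15, and 30 events per hour, so that non-integer values and an AHI of exactly 30 are always assigned to a category.

```python
def ahi(n_apneas: int, n_hypopneas: int, total_sleep_minutes: float) -> float:
    """Apnea-hypopnea index: apnea plus hypopnea events per hour of sleep."""
    if total_sleep_minutes <= 0:
        raise ValueError("total sleep time must be positive")
    return (n_apneas + n_hypopneas) / (total_sleep_minutes / 60.0)


def ahi_severity(value: float) -> str:
    """Map an AHI value to a severity category.

    Assumption: contiguous bins with cut-offs at 5, 15 and 30 events/h.
    """
    if value < 5:
        return "normal"
    if value < 15:
        return "mild"
    if value < 30:
        return "moderate"
    return "severe"


# 30 apneas + 42 hypopneas over 360 min (6 h) of sleep = 12.0 events/h
print(ahi_severity(ahi(30, 42, 360)))  # mild
```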
The secondary endpoint measurements comprised other PSG parameters: total sleep time (TST, in minutes), total recording time (TRT, in minutes), sleep efficiency (%, TST/TRT × 100), sleep stages (Wake, N1, N2, N3, REM, in % of total recording time), sleep latency (in minutes), wake after sleep onset (in minutes), REM latency (in minutes), ODI (oxygen desaturation index), total snore (%), and body position (supine position, left, right, up, as % of TST). In patients who did not have REM sleep, REM latency was set to 0.
Device usability was assessed using pre- and post-study questionnaires based on a 5-point Likert scale. Participants were instructed to evaluate only the Vlab device, independent of the Nox A1 device. If a participant felt they could not differentiate between the devices when answering the post-night test questionnaire, this observation would be noted in our results. See Supplementary B for pre- and post-study questionnaires.
The number of adverse events (AEs) related to device use was measured as a readout of device safety. While PSG is regarded as relatively safe, systematic evaluation of adverse events has suggested that AEs may occur, including cardiac events (mostly involving acute chest pain), falls, and neurologic, pulmonary, or psychiatric events [13, 14].
Acceptance criteria
Agreement between continuous parameters from the two devices was analysed primarily using the Bland-Altman method. For the primary endpoint analysis, the study protocol defined a priori that a deviation of < 15% between the AHI values calculated from the two devices would be considered functionally equivalent. This functional equivalence threshold was chosen because diagnostic severity thresholds are relatively broad (normal AHI < 5, mild 5–14, moderate 15–29, severe ≥ 30), so a < 15% deviation in AHI values would be unlikely to result in a classification difference. Moreover, previous clinical studies [15] have defined minimal clinically important differences in AHI as > 10–15 events per hour, which would not be reached with our stringent threshold of < 15%.
In addition, we performed t-tests and defined a priori that failure to reject the null hypothesis (i.e., p > 0.05 in comparative tests) would be taken as an indication of equivalence between the two devices. For correlation analyses, we defined a priori a strong positive correlation as a Pearson’s correlation coefficient > 0.80. Device safety was assessed based on the frequency of adverse events (AEs). The acceptance criteria for safety were decided by the physician according to the severity of the event.
Device usability was assessed via questionnaires (see Supplementary B) using a Likert type 5-point scale. We defined a priori that a score of 3 (neutral) or higher should be achieved for at least 70% of the questions in both questionnaires.
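The usability criterion can be sketched as follows. This is an illustrative reading only: the text does not state how scores are aggregated across participants, so the sketch assumes that each question is summarized by its mean Likert score. The function name and the example data are hypothetical.

```python
from statistics import mean


def usability_met(scores_by_question, min_score=3.0, min_fraction=0.70):
    """Check whether enough questions reach the minimum Likert score.

    scores_by_question maps a question id to the list of participant
    scores (1-5). Assumption: a question "achieves" the target when its
    mean score across participants is >= min_score.
    """
    if not scores_by_question:
        raise ValueError("no questions supplied")
    passing = sum(1 for s in scores_by_question.values() if mean(s) >= min_score)
    return passing / len(scores_by_question) >= min_fraction


# Hypothetical data: 3 of 4 questions have a mean score of at least 3
example = {"Q1": [4, 5, 3], "Q2": [3, 3, 4], "Q3": [2, 2, 3], "Q4": [5, 4, 4]}
print(usability_met(example))  # True (75% of questions pass)
```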
Statistical analysis techniques
Qualitative measures
For qualitative measurements, the overall agreement between the DormoTech Vlab and the Nox device was calculated using binary determinations. Specifically, this involved comparing the categorization of sleep apnea severity (normal, mild, moderate, or severe) based on AHI between the two devices for each individual patient. The percentage of cases where both devices agreed on the severity was calculated.
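The percentage agreement described above amounts to a per-participant match count. A minimal sketch, with a hypothetical function name and made-up example labels:

```python
def percent_agreement(categories_a, categories_b):
    """Percentage of participants assigned the same category by both devices."""
    if len(categories_a) != len(categories_b) or not categories_a:
        raise ValueError("inputs must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(categories_a, categories_b))
    return 100.0 * matches / len(categories_a)


# Hypothetical severity labels for five participants
vlab = ["normal", "mild", "moderate", "severe", "mild"]
nox = ["normal", "mild", "moderate", "severe", "moderate"]
print(percent_agreement(vlab, nox))  # 80.0
```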
Quantitative measures
All analyses adhered to FDA Guidance E6 GCP, E9, and ISO 14155 [16, 17].
Advanced statistical techniques, including Student’s t-test, Bland-Altman plots, Pearson’s correlation, and Passing-Bablok linear regression models were employed to compare the performance of the DormoTech Vlab and Nox A1 PSG devices. To provide a holistic view of the accuracy and validity of the measurements, the Root Mean Square Error (RMSE) was computed. A low RMSE indicates high agreement between the two devices.
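For reference, the RMSE between paired measurements can be computed as in the sketch below. The function name is hypothetical and the example values are illustrative, not study data.

```python
import math


def rmse(x, y):
    """Root mean square error between paired measurements x and y."""
    if len(x) != len(y) or not x:
        raise ValueError("inputs must be non-empty and of equal length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))


# Illustrative paired AHI values from two devices
print(rmse([10.0, 20.0, 30.0], [10.0, 20.0, 30.0]))  # 0.0
```

Because the differences are squared before averaging, a few large discrepancies raise the RMSE more than many small ones.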
Bland-Altman plots, also known as mean-difference plots, were used to analyse the agreement between the two measurements of the same continuous parameter. Specifically, for each metric we calculated the participant-specific difference and average value between the two recording methods. Differences close to zero on the Bland-Altman plot suggest good agreement between the two measurements. The limits of agreement (LoA) between the DormoTech Vlab and Nox A1 devices were defined as 2 standard deviations from the mean of the difference in score between the two systems. The upper and lower limits of agreement therefore represent the range within which approximately 95% of the measurement differences lie, with a narrower range indicating better method agreement. For example, the theoretical AHI range is 0–150 events per hour (theoretical maximum at 2.5 events per minute as each event must be at least 10 s and occur during sleep). Recently, extremely severe cases of AHI were reported at 130–140 [18, 19]. As the calculated 95% LoA for AHI is 7 events per hour, and the possible range is 0–150 events per hour, the 95% LoA is only 4.7% of the theoretical clinical range, indicating high agreement between the two devices.
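The Bland-Altman quantities described above can be sketched in a few lines of Python. The function name and example values are hypothetical. The multiplier of 2 standard deviations follows the definition in the text (1.96 is a common alternative), and the sample standard deviation is an assumption.

```python
from statistics import mean, stdev


def bland_altman(x, y, k=2.0):
    """Bias and limits of agreement for paired measurements.

    Returns the per-participant means and differences (the plot axes),
    the mean difference (bias), and bias +/- k sample standard deviations.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need at least two paired measurements")
    diffs = [a - b for a, b in zip(x, y)]
    means = [(a + b) / 2 for a, b in zip(x, y)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return {
        "means": means,
        "diffs": diffs,
        "bias": bias,
        "lower": bias - k * sd,
        "upper": bias + k * sd,
    }


# Illustrative values only: differences are [1, -2, 1, -1]
result = bland_altman([10, 20, 30, 40], [9, 22, 29, 41])
print(result["bias"], result["lower"], result["upper"])  # -0.25 -3.25 2.75
```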
The Cohen’s kappa coefficient was calculated to assess the agreement between the two raters, with values close to 1 indicating excellent agreement.
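A minimal sketch of unweighted Cohen's kappa for two sets of categorical labels is shown below. The function name and example labels are hypothetical, and the text does not say whether a weighted variant was used.

```python
from collections import Counter


def cohens_kappa(labels_1, labels_2):
    """Unweighted Cohen's kappa: (p_o - p_e) / (1 - p_e)."""
    n = len(labels_1)
    if n == 0 or n != len(labels_2):
        raise ValueError("inputs must be non-empty and of equal length")
    # Observed agreement: fraction of items given the same label
    p_o = sum(a == b for a, b in zip(labels_1, labels_2)) / n
    # Chance agreement from each rater's marginal label frequencies
    c1, c2 = Counter(labels_1), Counter(labels_2)
    p_e = sum(c1[k] * c2[k] for k in set(c1) | set(c2)) / n**2
    if p_e == 1:
        return 1.0  # degenerate case: both raters used a single identical label
    return (p_o - p_e) / (1 - p_e)


# Hypothetical labels: p_o = 0.75, p_e = 0.5, kappa = 0.5
print(cohens_kappa(["mild", "mild", "severe", "severe"],
                   ["mild", "mild", "severe", "mild"]))
```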
Passing-Bablok regression was calculated for measurements from the Vlab and Nox systems to assess the agreement between the two recording devices. The resulting intercept and slope are crucial parameters, where an intercept close to zero coupled with a slope nearing one signifies very high agreement and little bias between the two devices.
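The point estimates of Passing-Bablok regression can be sketched as follows: the slope is a shifted median of all pairwise slopes, and the intercept is the median of the residual offsets. This is a simplified illustration with a hypothetical function name. It omits confidence intervals and assumes positively correlated data, as the method itself does.

```python
import math
from statistics import median


def passing_bablok(x, y):
    """Passing-Bablok slope and intercept (point estimates only)."""
    n = len(x)
    slopes = []
    for i in range(n - 1):
        for j in range(i + 1, n):
            dx, dy = x[j] - x[i], y[j] - y[i]
            if dx == 0 and dy == 0:
                continue  # identical points carry no slope information
            s = math.copysign(math.inf, dy) if dx == 0 else dy / dx
            if s == -1:
                continue  # slopes of exactly -1 are discarded by the method
            slopes.append(s)
    if not slopes:
        raise ValueError("not enough distinct points")
    slopes.sort()
    m = len(slopes)
    k = sum(1 for s in slopes if s < -1)  # offset that shifts the median
    if m % 2:
        slope = slopes[(m - 1) // 2 + k]
    else:
        slope = 0.5 * (slopes[m // 2 - 1 + k] + slopes[m // 2 + k])
    intercept = median(yi - slope * xi for xi, yi in zip(x, y))
    return slope, intercept


# Points lying exactly on y = 2x + 1 recover slope 2 and intercept 1
print(passing_bablok([1, 2, 3, 4, 5], [3, 5, 7, 9, 11]))  # (2.0, 1.0)
```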
Pearson’s correlation coefficient quantifies the linear relationship between the measurements from the two devices, where a value of one indicates a perfect positive linear relationship.
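Pearson's correlation coefficient can be computed directly from its definition, as in the sketch below. The function name and example values are hypothetical. The 0.80 cut-off is the a priori threshold for a strong positive correlation stated in the acceptance criteria.

```python
import math


def pearson_r(x, y):
    """Pearson's correlation coefficient for paired measurements."""
    n = len(x)
    if n < 2 or n != len(y):
        raise ValueError("need at least two paired measurements")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


# Illustrative values only
r = pearson_r([5.0, 12.0, 21.0, 33.0], [6.0, 11.0, 23.0, 31.0])
print(r > 0.80)  # True: meets the a priori threshold
```

A high correlation shows only that the two measurements move together. It does not by itself rule out a systematic offset between devices, which is why it is reported alongside the Bland-Altman and Passing-Bablok analyses.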