Speech-mediated manipulation of da Vinci surgical system for continuous surgical flow

2.1 Setup

2.1.1 dVRK

The dVRK, donated by Intuitive Surgical Inc. in 2014, is the main platform for research on ergonomic improvements of the dVSS [1]. It comprises four 8-axis motor control units, a stereo viewer, two MTMs, two PSMs, and a foot pedal tray, as shown in Fig. 1. The dVRK is configured using the open-source Robot Operating System (ROS) framework and libraries developed by Johns Hopkins University [27, 28]. The dVRK is manipulated as follows. The MTMs, which are the master controllers for the PSMs, are remotely coupled to the respective PSMs: the movements of the MTMs are transferred directly to the PSMs, and the actions of the surgical instruments installed on the PSMs, such as gripping, are controlled in real time by engaging or disengaging the finger clutches at the MTMs. The scale factor that determines how far the PSM moves relative to the MTM is preset by the dVRK software to simulate the operating environment [29]. The user can therefore control the PSMs remotely with the MTMs while watching the stereo viewer in real time.

Fig. 1

dVRK: a 8-axis motor control units, b stereo viewer, c MTMs, d foot pedal tray, e PSMs, f sea spikes pod

2.1.2 EECS

Because the ECM is not included as a basic component of the dVRK, an enhanced endoscope control system (EECS) has been developed to obtain stereoscopic images from the test bed and imitate the motion of the ECM, as illustrated in Fig. 2. The EECS has a double-parallelogram structure that enables motion about a fulcrum point, and it provides roll, pitch, insertion, and rotation movements using six motors controlled through the Dynamixel Wizard software [9]. In detail, J1 (Motors 1 and 2) is for roll, J2 (Motors 3 and 4) is for pitch, J3 (Motor 5) is for insertion, and J4 (Motor 6) is for rotation. The technical specifications of the EECS are listed in Table 1. Further, the moving ratio of the EECS is preset to provide movements analogous to those of the ECM. Two camera modules with a maximum resolution of 1,920 × 1,080 pixels are installed to capture images in real time. To provide 3D visualization in the stereo viewer of the dVRK, a rectification process based on Unity and Visual Studio 2019 is implemented, so that users can see the stereoscopic images from the EECS through the stereo viewer.

Fig. 2

Table 1 Technical specifications of EECS

2.1.3 SRCI

The SRCI was designed using a Microsoft Azure model to incorporate speech recognition technology, as illustrated in Fig. 3 [30]. When a user’s speech command is input to the Linux PC, the audio data are converted into text data through the Speech SDK in Azure Cloud. In the present study, the rule-based commands “Endo up,” “Endo down,” “Endo left,” “Endo right,” “Endo in,” “Endo out,” and “Endo stop” have been implemented. The prefix “Endo,” taken from the word “endoscope,” is used to minimize the false positive rate induced by ordinary conversation. “Endo in” and “Endo out” are the commands for zooming in and out, respectively. If the user’s speech is matched to one of the predefined text commands by the sequence matching formula, the corresponding function is executed, and the EECS moves continuously in the specified direction [31]. Because the EECS was designed to mimic the operational mechanism of the ECM, the commands in the present study can control all movements of the EECS. For patient safety, the EECS operates at the preset moving ratio within its maximum speed and stops automatically if it moves beyond its workspace limit.

$$D_o=\frac{2k_m}{|S_1|+|S_2|}$$
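The matching step can be sketched with Python's standard `difflib` module. `SequenceMatcher.ratio()` computes 2M/T, where M is the number of matched characters and T is the combined length of the two strings, which has the same shape as the formula above. Whether the authors used this module is not stated in the source; the 0.8 acceptance threshold, the function names, and the lowercase normalization are assumptions made for illustration.

```python
from difflib import SequenceMatcher

# Rule-based commands listed in the text, lowercased for comparison.
COMMANDS = [
    "endo up", "endo down", "endo left", "endo right",
    "endo in", "endo out", "endo stop",
]


def similarity(a: str, b: str) -> float:
    """Return 2*M / (len(a) + len(b)), where M is the matched character count."""
    return SequenceMatcher(None, a, b).ratio()


def match_command(transcript: str, threshold: float = 0.8):
    """Return the best-matching command, or None if nothing reaches the threshold.

    The threshold is a hypothetical value; the source does not report one.
    """
    text = " ".join(transcript.lower().split())  # normalize case and spacing
    best = max(COMMANDS, key=lambda cmd: similarity(text, cmd))
    return best if similarity(text, best) >= threshold else None
```

With such a threshold, an exact or slightly garbled transcript still resolves to a command, while ordinary conversation that lacks the "Endo" prefix is rejected.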

Fig. 3

2.2 Manipulation methods

2.2.1 Traditional method

To control the surgical instruments installed on the respective PSMs, a user manipulates the MTMs according to the intended operation while continuously pressing the “coag” button on the right side of the foot pedal tray. If the working range of the instruments on the patient side becomes limited during the operation, the user manipulates the MTMs while continuously pressing the “clutch” button on the left side of the foot pedal tray. Because the movements of the MTMs are not transferred to the instruments while the “clutch” button is pressed, the user can reposition the MTMs before resuming control of the instruments. Likewise, the EECS can be controlled by manipulating the MTMs with the pedals.
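The pedal logic described above can be illustrated with a toy one-dimensional model. This is only a sketch: the `Teleop` class, the scale value of 0.5, and the rule that the clutch overrides the coag pedal when both are pressed are assumptions, not details of the dVRK software.

```python
from dataclasses import dataclass


@dataclass
class Teleop:
    """Toy 1-D model of MTM-to-PSM motion transfer (illustrative only)."""
    scale: float = 0.5     # hypothetical motion scale factor
    mtm_pos: float = 0.0   # master position, metres
    psm_pos: float = 0.0   # instrument position, metres

    def move_mtm(self, delta: float, coag: bool = False, clutch: bool = False) -> None:
        """Move the master by delta; the instrument follows only when engaged."""
        self.mtm_pos += delta
        if coag and not clutch:
            # Engaged: master motion is scaled and passed to the instrument.
            self.psm_pos += self.scale * delta
        # Clutched (or no pedal pressed): the master moves alone.
```

For example, moving the master by 0.10 m with coag pressed advances the instrument by 0.05 m; pulling the master back with the clutch pressed leaves the instrument in place, so the next engaged stroke extends the instrument's reach.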

2.2.2 Proposed method

The surgical instruments are manipulated with the MTMs and pedals in the same way as in the traditional method. To control the EECS, a user speaks the rule-based commands into the microphone installed on the stereo viewer, and the EECS then moves continuously in the intended direction. A user can speak the “Endo stop” command to stop the EECS at any time, including in abnormal situations in which the SRCI does not recognize or misinterprets a command. Diagonal movements of the EECS are technically possible; however, they are restricted in the present study to minimize differences in tendency from person to person. For safety, the EECS was designed to be controlled strictly within the inner range of the workspace simulated from real robotic surgery.
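The continuous-motion behaviour, the “Endo stop” command, and the automatic stop at the workspace limit can be sketched as a small simulation. Everything numeric here is a placeholder: the axis names, `LIMIT`, `STEP`, and the `EndoscopeSim` class are hypothetical and do not come from the EECS specification.

```python
# Command-to-axis mapping; axis names are generic, not EECS joint names.
DIRECTIONS = {
    "endo up": ("vertical", +1), "endo down": ("vertical", -1),
    "endo left": ("horizontal", -1), "endo right": ("horizontal", +1),
    "endo in": ("depth", +1), "endo out": ("depth", -1),
}
LIMIT = 1.0   # hypothetical symmetric workspace bound per axis
STEP = 0.25   # hypothetical displacement per control tick


class EndoscopeSim:
    def __init__(self):
        self.pos = {"horizontal": 0.0, "vertical": 0.0, "depth": 0.0}
        self.active = None  # (axis, sign) of the motion in progress, or None

    def command(self, text: str) -> None:
        """Start a motion, or stop on 'endo stop'; unknown text is ignored."""
        if text == "endo stop":
            self.active = None
        elif text in DIRECTIONS:
            self.active = DIRECTIONS[text]  # one axis at a time, no diagonals

    def tick(self) -> None:
        """Advance one control step; halt automatically at the workspace limit."""
        if self.active is None:
            return
        axis, sign = self.active
        nxt = self.pos[axis] + sign * STEP
        if abs(nxt) > LIMIT:
            self.active = None  # automatic stop at the boundary
        else:
            self.pos[axis] = nxt
```

Holding a single active direction mirrors the restriction on diagonal movements, and unrecognized text leaves the current state unchanged so that only “Endo stop” or the limit check ends a motion.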

2.3 Participants

Following the approval of the institutional review board (IRB) of Seoul National University Hospital (IRB No: H-2107-167-1236), a total of 38 Korean participants (age: 30.16 ± 4.09) were randomly recruited based on previous studies [22, 23]. To investigate whether the difference in task performance or usability between the traditional and proposed methods varies with the level of expertise, both a surgeon group and a novice group, the latter representing the extreme end of inexperienced surgeons, were recruited. The surgeon group comprised 15 fellows (age: 34.27 ± 2.68) and 5 residents (age: 30.40 ± 2.05) from the gastrointestinal, coloanal, endocrine, ophthalmologic, and orthopedic surgery fields. The novice group included 18 novices (age: 26.67 ± 0.97) who were not experts in the medical field. Because only male participants were recruited in the surgeon group, the sex composition of all participants was 32 males (age: 30.91 ± 4.07) and 6 females (age: 26.17 ± 1.03). Before the usability evaluations, all participants received sufficient instructions and cautions for operating the dVRK safely with both the traditional and proposed methods.

2.4 Procedure of usability evaluation

To investigate, based on ISO 9241-11 [32], whether the MTM can be replaced by the SRCI, the usability of the traditional and proposed methods has been evaluated in analogous environments using the dVSS. In detail, the participants perform tasks using the dVRK and respond to several questionnaires based on their experience of using both methods. The tasks are designed based on previous studies, and globally reliable questionnaires focused on usability are selected [32,33,34,35,36,37]. The user evaluation in the present study follows a within-subject design, in which all participants conduct identical tasks using both the traditional and proposed methods. To minimize issues related to learnability, half of the participants performed the evaluation with the traditional method first, while the remaining participants started with the proposed method.
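The counterbalanced ordering can be sketched as follows. The source states only that half of the participants started with each method; the random shuffle, the fixed seed, and the function name `assign_order` are assumptions for illustration.

```python
import random


def assign_order(participant_ids, seed=0):
    """Randomly split participants into two equal-sized method orders."""
    ids = list(participant_ids)
    random.Random(seed).shuffle(ids)  # seeded only to make the sketch repeatable
    half = len(ids) // 2
    order = {}
    for i, pid in enumerate(ids):
        if i < half:
            order[pid] = ("traditional", "proposed")
        else:
            order[pid] = ("proposed", "traditional")
    return order
```

With 38 participants this yields 19 who begin with the traditional method and 19 who begin with the proposed method, so any learning effect is spread evenly across the two conditions.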

2.5 Task performance

2.5.1 LTT

In the line tracking task (LTT), variants of which have been used in previous studies, the participants move the EECS and surgical instruments together by following the arrows in the sequence 0, 1, 2, 3, and 0, as illustrated in Fig. 4a [11, 38, 39]. The route to be tracked is a square whose sides are each 0.1 m long, so a total distance of 0.4 m is covered, which is the validated space in the abdomen model [38]. At each numbered point, the participants perform the gripping motion once with the surgical instruments. All participants perform the LTT twice, and the total time required to cover the total distance is measured and compared for the traditional and proposed methods. The mean time and standard deviation (SD) are determined to compare and analyze the results obtained from both groups for the two methods, as in previous studies [21,22,23].
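The route geometry and the summary statistics can be checked with a few lines of Python. The waypoint coordinates assume a square with one corner at the origin, and the use of the sample SD (rather than the population SD) is also an assumption, since the source does not say which was reported.

```python
import math
import statistics

# Corner coordinates in metres; origin and orientation are assumed.
WAYPOINTS = [(0.0, 0.0), (0.1, 0.0), (0.1, 0.1), (0.0, 0.1), (0.0, 0.0)]


def path_length(points):
    """Sum the straight-line distances between consecutive waypoints."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))


def summarize(times):
    """Return (mean, sample SD) of completion times in seconds."""
    return statistics.mean(times), statistics.stdev(times)
```

Calling `path_length(WAYPOINTS)` reproduces the 0.4 m total of four 0.1 m sides, and `summarize` gives the mean and SD for a list of measured completion times.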

2.5.2 SSPT

As an application, the sea spike pods task (SSPT) is designed around the sea spike pods that medical interns use to practice with the dVSS and that can be modified for diverse purposes [23, 40, 41]. In the present study, the participants control the EECS and surgical instruments to change the visual field and grip the designated spikes on the respective sea spike pods, as shown in Fig. 4b. The detailed steps are as follows. The participants grip the spikes of the same color with the right and left instruments, one spike per instrument at the same time, and then move the visual field with the EECS to the opposite sea spike pod. At the opposite sea spike pod, the participants perform the same actions and then return to the original sea spike pod. All participants perform the SSPT twice, and the corresponding completion times are measured, following [21,22,23].

Fig. 4

Usability evaluation: a LTT, b SSPT

2.6 Questionnaires

2.6.1 ASQ

The after-scenario questionnaire (ASQ) is a survey administered immediately after scenario completion in scenario-based usability studies; its global reliability is 0.96 (Cronbach's alpha, which ranges from 0, completely unreliable, to 1, perfectly reliable) [33,34,35]. Because the user evaluation in the present study includes the LTT and SSPT, both of which follow protocol scenarios, the ASQ is used to examine the satisfaction of the participants. The ASQ has three questions, each with room for comments, and uses a seven-point Likert scale ranging from 1 (strongly agree) to 7 (strongly disagree), as presented in Table 2 [34].

Table 2 ASQ (global reliability: 0.96)

2.6.2 SUS

The system usability scale (SUS) survey captures a user’s subjective rating of usability quickly and easily [36, 37]. The SUS has 10 questions and uses a five-point Likert scale ranging from 1 (strongly disagree) to 5 (strongly agree), as presented in Table 3. The questions are divided into positively and negatively worded items, and an overall score is calculated using the given formula and interpreted as an adjective rating: best imaginable, excellent, good, OK, poor, or worst imaginable [36, 37].
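The scoring can be sketched as below, assuming the standard SUS rule in which odd-numbered items are positively worded and even-numbered items negatively worded; the source refers only to "the given formula", so treating it as the standard rule is an assumption. The adjective-rating cut-offs are not reproduced because the source does not list them.

```python
def sus_score(responses):
    """Compute the 0-100 SUS score from ten responses on a 1-5 scale."""
    if len(responses) != 10 or any(r not in range(1, 6) for r in responses):
        raise ValueError("expected ten integer responses between 1 and 5")
    total = 0
    for i, r in enumerate(responses, start=1):
        # Odd items contribute (response - 1); even items contribute (5 - response).
        total += (r - 1) if i % 2 == 1 else (5 - r)
    return total * 2.5  # rescale the 0-40 sum to 0-100
```

A respondent who answers 3 to every item scores 50, while full agreement with every positive item and full disagreement with every negative item scores 100.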

Table 3 SUS (global reliability: 0.92)

2.6.3 NASA TLX

The NASA task load index (TLX) is widely used to quantify subjective workload in experimental tasks, as presented in Fig. 5 and Table 4 [42]. It has six indicators: mental demand, physical demand, temporal demand, performance, effort, and frustration. To calculate the NASA TLX score, weights for the six indicators are first computed. Then, a score for each indicator is assigned on a 21-point Likert scale ranging from 0 (very low, perfect) to 100 (very high, failure) [42]. The overall score is obtained by multiplying the weights by the scores on the Likert scale. In this study, the quantitative analysis uses raw NASA TLX scores, which simplifies the procedure by omitting the computation of weights [43].
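Both variants can be sketched in a few lines. The weighted form assumes the standard procedure in which the weights are tallies from 15 pairwise comparisons and the weighted sum is divided by 15; the raw form is the plain mean of the six ratings. The dictionary keys and function names are illustrative choices.

```python
INDICATORS = ("mental", "physical", "temporal", "performance", "effort", "frustration")


def raw_tlx(ratings):
    """Raw (unweighted) TLX: the plain mean of the six 0-100 ratings."""
    return sum(ratings[k] for k in INDICATORS) / len(INDICATORS)


def weighted_tlx(ratings, weights):
    """Weighted TLX: weights are pairwise-comparison tallies that sum to 15."""
    if sum(weights[k] for k in INDICATORS) != 15:
        raise ValueError("weights must sum to 15")
    return sum(ratings[k] * weights[k] for k in INDICATORS) / 15
```

The raw score needs only the six ratings, which is why it shortens the questionnaire session compared with the weighted form.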

Fig. 5

Table 4 Description of NASA TLX

2.7 Statistical analysis

To compare the values computed for the traditional and proposed methods, the independent T-test, Mann–Whitney U-test, or Welch’s T-test is applied at a 95% confidence level, depending on the results of the Shapiro–Wilk test for normality and Levene’s test for equality of variance. In detail, if the p value from the Shapiro–Wilk test is over 0.05, the independent T-test is used; otherwise, the Mann–Whitney U-test is performed as a nonparametric test. When the independent T-test applies but the p value from Levene’s test is under 0.05, Welch’s T-test is used instead. In the Results section, p values obtained from the Mann–Whitney U-test and Welch’s T-test are marked with single (†) and double (††) dagger symbols, respectively; p values from the independent T-test carry no superscript. All statistical analyses are carried out using the Statistical Package for the Social Sciences (SPSS).
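The decision rule can be written as a small function that takes the p values already produced by the normality and variance tests. The study used SPSS; this Python sketch is only an illustration, and requiring every Shapiro–Wilk p value (one per compared sample) to exceed 0.05 is an assumption, since the source mentions a single p value. In Python, such p values could come from `scipy.stats.shapiro` and `scipy.stats.levene`.

```python
def choose_test(shapiro_p_values, levene_p, alpha=0.05):
    """Return the name of the test selected by the decision rule in the text."""
    if any(p <= alpha for p in shapiro_p_values):
        return "Mann-Whitney U-test"   # normality not supported
    if levene_p < alpha:
        return "Welch's T-test"        # normal, but unequal variances
    return "independent T-test"        # normal with equal variances
```

For example, `choose_test([0.30, 0.45], 0.60)` selects the independent T-test, whereas a Levene p value of 0.01 with the same normality results selects Welch's T-test.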

BIOMEDICAL ENGINEERING LETTERS