Patient Perspectives on Artificial Intelligence in Healthcare Decision Making: A Multi-Center Comparative Study

The development and implementation of ML require large, accurate datasets to train predictive algorithms. Data are culled from real-world patient samples in electronic medical records and used as the foundation of algorithm training. Gathering these data demands the implicit or explicit consent of patients, the ethics of which have not been fully defined. Presumably, a patient enthusiastic about the prospects of AI and ML would be more likely to give explicit consent for such data gathering, and many factors may contribute to that willingness. Therefore, the primary goal of this study was to characterize an orthopaedic surgery population’s perspective on the use of AI/ML in their care. We found that a patient’s age, gender, education, experience with AI, and preconceived perceptions of the impact of AI/ML on healthcare quality and cost affect the patient’s comfort with AI/ML in their healthcare team. On average, patients appear comfortable with AI joining the care team.

The present study demonstrates that patients appear comfortable with AI overall, illustrated by the average comfort score of 6.4 across the entire study population. This also suggests that even among patients who are most comfortable with AI, there are still reservations, which is consistent with previously published findings [7].

This study suggests that age impacts a patient’s comfort level with AI/ML, with younger patients being less comfortable. This is an interesting finding that contradicts previously published results [7]. York et al. published data suggesting that younger individuals had a higher comfort score compared to older patients. However, their results are based on a two-question survey from a single urban center, and there is no indication of controlled logistic analysis or consideration of a weighted sensitivity analysis. Young adults have grown up in a society in which technological and computational integration has been rapid and widely accepted. It is possible that this life-long relationship with computers has left younger patients skeptical of AI/ML, or that youth is associated with a less advanced understanding of AI and ML. Further investigation is warranted on this topic.

The patient’s gender was also a significant variable. Patients who identified as male reported significantly higher comfort across the population compared to those who identified as female and those who identified as other or did not respond. This is similar to previously published results, demonstrating the repeatability of this finding [7].

Patients with a higher level of education were more likely to approve of AI in the care team, which is also similar to prior findings [7]. There may be multiple reasons for this: educated patients may have a deeper knowledge of technology, or may work with technology daily. The patient’s level of understanding of the terms “artificial intelligence” and “machine learning” was an important factor, as familiarity with AI increased overall comfort with AI. Increased familiarity with AI also positively impacted patients’ perception of AI in orthopaedic care and their belief that AI would decrease healthcare costs, and decreased the number of individuals who would refuse care if AI were to increase healthcare costs. These data demonstrate that future patient education can significantly and positively impact patient perception of AI and ML in healthcare.

Survey format impacted patient comfort level in both bivariate and controlled multivariable analyses. Our data demonstrate that individuals who completed the survey in the clinic on a tablet had a lower average comfort score, by 2.3 units, compared to patients who completed the survey via REDCap and those who completed a paper survey in the office setting. Previous studies have identified that email/mail surveys allow patients to participate from home, which is often a more comfortable environment compared to a stressful office encounter [11]. However, there is little understanding of how the mode of completion affects survey results, reliability, and validity [12]. Our study is the first, to our knowledge, to provide evidence that the environment and modality of survey completion may have an impact on how patients respond to survey questions [12]. These results should be considered in the design, implementation, and interpretation of future surveys. The REDCap survey, which was delivered by email, may have been more convenient; alternatively, the ability to obtain and complete it by email and web browser may have selected for patients who are at baseline more comfortable with technology [11]. For example, an increased frequency of computer use was associated with greater AI comfort in our data. This information is valuable, as it suggests that it may be preferable to request consent for data use in future data-gathering efforts when the patient is not in the clinical site experiencing external stressors. Further research may elucidate these unknowns.

The primary analysis demonstrates that patient ethnicity, religion, and clinical setting (urban vs. rural) do not have an identifiable impact on the patient’s comfort level. Our sample of nearly 400 patients is large; however, it is possible that a larger sample could identify a relationship not observed here. Understanding differences in opinion and perspective among racial, religious, and regional groups is a critical aspect of providing equitable AI-driven healthcare. The implications of distrust within a specific group are important for providing appropriate healthcare in the ML era; if a specific population were to refuse data gathering for building a predictive algorithm, the outputs of that algorithm may not apply to that group, thus creating an unintended health disparity.

Patients appear to distrust fully autonomous machines, as evidenced by the fact that only a few patients reported willingness to undergo robotic surgery without the immediate supervision of a human physician. This point should be clarified by further research to guide the future of robot-assisted joint arthroplasty [13, 14] and other automated surgeries.

As expected, patients who thought AI would have positive effects on their care were more likely to have a higher comfort level compared to those who thought AI would have a negative impact on their care. Similarly, perceived impact on the cost of healthcare had a significant effect on average comfort level, with 28.8% of respondents reporting that they would refuse the integration of AI if it increased the cost of care. In addition, patients with higher comfort levels also felt comfortable with a physician (or, by proxy, a healthcare entity) participating in the sale of health data to third parties in an effort to improve the quality of AI; the majority of patients, however, did not feel comfortable with this. These data should be considered by physicians, hospital systems, and technology companies when designing future data-gathering efforts.

One of the first studies to consider patient opinion on AI was an interview with a small cohort of patients undergoing radiologic imaging [15]. Those patients expressed a desire for proof of concept, reliability, efficiency, accountability, and education on the general process before they would be comfortable accepting AI into their radiology care [15]. York et al. found that patients strongly prefer the opinion of their physician over that of an AI machine [7]. A subsequent study surveyed patient opinion regarding AI implementation in radiology, concluding that patients tended to have moderate distrust of machines assuming the diagnostic tasks of a radiologist [16]. Our results support these previous studies: patients harbor some distrust of AI, but patient education may improve patient comfort with AI in healthcare. Patient comfort is important because anxiety, catastrophic thinking, and somatization have been linked to negative perioperative events and poor post-operative outcomes with elective orthopaedic surgeries [17, 18]. Thus, a positive patient attitude may not only contribute to favorable surgical outcomes in cases where intelligent machines are involved, but may also improve patient opinion of AI, which could subsequently increase patient consent for participation in critical data-gathering efforts.

The clinical implications of comfort level with AI reach far beyond the obvious patient emotional response to technology. The nature of the “learning” dataset is mission-critical for ML; the machine must be trained on data representative of the patients to whom it is applied. For this reason, it is highly relevant that age, gender, and education level impact patient comfort with the technology, for if these patients do not provide data to the training sets, then the algorithm will be biased against them. In this study, female, younger, and less-educated patients were less comfortable with the idea of the technology being utilized in their care. If, because of their discomfort, these patients’ data were excluded from the training dataset, the resultant algorithm could not be appropriately applied to them in the future should their opinion change. This would ultimately lead to an unintended bias within the algorithm, making it poorly generalizable and potentially prone to generating health disparities based upon age, gender, and education. The algorithm can only make predictions based upon what it has seen; therefore, the “training” population should be representative of the “treatment” population. Practically speaking, this should generate an impetus for improving comfort with AI among these groups via directed education or other modalities.
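To make the representativeness argument concrete, the short sketch below illustrates it with entirely synthetic data and an invented subgroup effect; it is not derived from the study's dataset or any clinical model:

```python
# A minimal, hypothetical sketch: synthetic data with an invented subgroup
# interaction, showing how a model trained on a consent-skewed sample can
# fail for the group whose data were excluded.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 5000

# Binary subgroup flag and one clinical feature whose relationship to the
# outcome differs by subgroup (an interaction effect, invented here).
group = rng.integers(0, 2, n)                    # 0 = represented, 1 = excluded
x = rng.normal(size=n)
logit = np.where(group == 0, 1.5 * x, -1.5 * x)  # opposite effect by group
y = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)
X = x.reshape(-1, 1)

# Train only on the represented group, mimicking a consent-skewed dataset.
train = group == 0
model = LogisticRegression().fit(X[train], y[train])

for g in (0, 1):
    mask = group == g
    print(f"accuracy on group {g}: {model.score(X[mask], y[mask]):.2f}")
# Accuracy on the excluded group drops to roughly chance or below,
# the kind of unintended disparity described above.
```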

Although the present study was designed to investigate the patient perspective on AI, and on a subset of AI known as ML, in healthcare, the results of our study may be beneficial to the development of other forms of AI. ML algorithms such as neural networks (NNs) are the primary modality driving the development of precision medicine, an approach that tailors disease treatment to individuals based on patient genetics, environment, and lifestyle [4, 19]. A more complex form of ML called deep learning (DL) utilizes NNs to make sense of its input in a layered fashion and can practically be used to identify abnormal findings within a radiology study, for example a potentially malignant lesion in the brain [5, 19]. Another distinct type of AI is natural language processing (NLP), which is already widely integrated into modern society in forms as common as spell check, Amazon’s Alexa, Google search, and web-based assistants [19]. NLP can be used in healthcare to converse with patients and triage patient requests, among other tasks [19, 20]. Although these tools are already in use and undergoing continuous refinement, patient opinion has not been adequately queried. One major area of AI that has not yet been realized in healthcare is autonomous physical robots (APRs); although surgical robots are widely used, they still require human direction [19, 20]. Barriers to autonomous machines are not necessarily technological but, as emphasized by our results, societal [19]. Patients in our study were uncomfortable with the idea of autonomous robots compared to assistive devices, which is consistent with prior findings suggesting that the limitation to this next step of APRs is societal in nature. The future of AI is a topic of great debate, notably the ramifications of the “singularity”, the moment a machine reaches the capability of human-level thinking, also referred to as artificial general intelligence (AGI). Although whether AGI will ever be achieved is a contentious topic [21], the fact remains that multiple facets of AI will continue to become integrated with healthcare, and understanding how to improve that integration, and the magnitude of integration with which patients are comfortable, is paramount.
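As a purely illustrative aside, the “layered” processing described above can be sketched in a few lines. The toy feed-forward network below uses arbitrary random weights and stands in for no clinical model:

```python
# A toy feed-forward network in plain NumPy: each layer re-represents the
# previous layer's output, the "layered" processing described above.
# Weights are random; this is illustrative only.
import numpy as np

rng = np.random.default_rng(1)

def relu(z):
    return np.maximum(0.0, z)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

W1, b1 = rng.normal(size=(16, 8)), np.zeros(8)  # input layer -> hidden layer
W2, b2 = rng.normal(size=(8, 1)), np.zeros(1)   # hidden layer -> output layer

x = rng.normal(size=16)           # e.g., intensities from an image patch
hidden = relu(x @ W1 + b1)        # hidden layer: learned re-representation
prob = sigmoid(hidden @ W2 + b2)  # output: probability of an abnormal finding
print(f"predicted probability: {prob[0]:.3f}")
```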

This study has several limitations. Data derived from patient surveys may be confounded by complex influences and are difficult to validate; however, several methods can be used to mitigate these uncertainties, such as utilizing multiple Likert scales and easily comprehensible language in the survey [7, 22]. Although several Likert scales were utilized, the Likert scales used in this study have not yet been validated. Furthermore, although prior studies have derived an average comfort score from multiple Likert scales, this may not be a valid method and should be considered a limitation of this study [7]. In addition, the language used in framing the questions may have provoked implicit biases among the patients. For example, replacing the term “sell” with “share” in the question, “Is it acceptable for a doctor to sell health data to a third party for the purpose of building intelligent computers for use in healthcare?” could potentially change a patient’s response.
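One way to probe whether averaging multiple Likert items into a single comfort score is defensible, a concern noted above, is to check the items' internal consistency, for example with Cronbach's alpha. The sketch below uses fabricated responses and is not the study's method:

```python
# A hedged sketch, not the study's method: before averaging Likert items
# into one comfort score, check their internal consistency with Cronbach's
# alpha. All responses below are fabricated for illustration.
import numpy as np

# Rows = respondents, columns = Likert items (e.g., 0-10 comfort questions).
items = np.array([
    [7, 6, 8],
    [3, 4, 2],
    [9, 9, 8],
    [5, 6, 5],
    [8, 7, 9],
], dtype=float)

k = items.shape[1]
item_vars = items.var(axis=0, ddof=1)      # variance of each item
total_var = items.sum(axis=1).var(ddof=1)  # variance of the summed scale
alpha = (k / (k - 1)) * (1 - item_vars.sum() / total_var)
print(f"Cronbach's alpha: {alpha:.2f}")    # >= ~0.7 is often taken as adequate
```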

Intrinsic bias may be present in the data based on the medium through which this survey was distributed. The COVID-19 pandemic interrupted in-person survey administration, limiting the in-clinic population to only 11 responses via paper and pencil and 21 responses via tablet. Thus, comparisons between survey formats should be interpreted with caution and repeated with a larger population. We were also limited by survey response rate; of the 3789 patients who were emailed the survey, only 365 (9.6%) responded. This low response rate was accounted for in the weighted analysis and directly compared to the non-weighted analysis. Although no significant difference was observed between rural and urban populations in either the non-weighted or the weighted analysis, the rural site experienced a significantly lower response rate compared to the urban center. Prior studies have also observed lower response rates from rural populations compared to urban populations [23, 24]. It is well documented that fewer than 20% of patients will respond to a survey that is emailed or mailed to them [11, 25,26,27]. Response rates can be increased by providing the survey in the clinic with the physician [11]; however, this was not possible due to the COVID-19 pandemic-associated office closure during the collection period. Current research suggests that the most effective way to increase response rate is to request survey responses several times through more than one mode of request (e.g., via email and then by phone call) [12]. In this study, multiple requests were delivered to patients who had not completed the surveys, with some improvement.

It is well established in medicine that implicit biases exist among care practitioners toward individuals of different races and religions, and that these biases impact health outcomes [28,29,30]. Involving a diverse patient population in the development of AI/ML applications is of paramount importance, as a lack of diversity could build implicit bias into the algorithm itself and, in effect, limit the applicability of AI/ML to the general population. Algorithms may be an effective tool only for those populations whose data were used to build the algorithm. Therefore, the study could be improved by increasing the demographic diversity of survey respondents. Notably, the sensitivity analysis demonstrated no major significant differences when applying weighted statistics to these results, which may suggest generalizability. The data collection could also be improved by utilizing an internally validated and standardized patient questionnaire, such as the model described by Ongena et al. [16].
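For readers unfamiliar with the weighted analysis referred to above, the sketch below illustrates one common approach, inverse-probability non-response weighting by site. The per-site invited/responded split is hypothetical; only the overall totals of 3789 invited and 365 respondents appear in the text:

```python
# An illustrative sketch of inverse-probability non-response weighting by
# site. The per-site split below is hypothetical; only the overall totals
# (3789 invited, 365 respondents) come from the text.
invited   = {"urban": 2500, "rural": 1289}  # hypothetical split of 3789
responded = {"urban":  290, "rural":   75}  # hypothetical split of 365

# Each respondent is weighted by the inverse of their site's response rate,
# so rural respondents (lower response rate) count for more.
weights = {site: invited[site] / responded[site] for site in invited}
for site, w in weights.items():
    print(f"{site}: weight = {w:.2f}")

# A weighted mean comfort score is then sum(w_i * score_i) / sum(w_i)
# over all respondents i, with w_i taken from the respondent's site.
```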
