Foundation Models, Generative AI, and Large Language Models: Essentials for Nursing

Nurses can utilize generative artificial intelligence (genAI), including large language models (LLMs), to swiftly obtain and understand medical information, guidelines, and research, thereby ensuring that care is based on evidence.1 These models can assist nurses in completing administrative tasks like documentation, scheduling, and patient records management more efficiently.2 Large language models can provide nurses with personalized recommendations and support clinical decision-making by analyzing vast amounts of data. They can serve as educational tools, offering customized learning experiences and immediate answers to clinical questions, crucial for continuous professional development. These models can improve communication by translating medical jargon into layperson's terms for better understanding and facilitating multilingual interactions, improving patient care and satisfaction.3 GenAI-based solutions can support mental health, offering therapeutic communication and counseling techniques that nurses can apply. By leveraging these models, nurses may enhance the quality and efficiency of care, reduce the risk of errors, and improve the patient experience. However, there are concerns about the tools' readiness for use within the healthcare domain and their acceptance by the current workforce.3–5 Therefore, the goal of this article is to provide nurses with the essentials they need to better understand the currently available foundation models and artificial intelligence (AI) tools, enabling them to evaluate the need for such tools and assess how they can impact current clinical practice. Although there is existing literature on the basics and essentials of AI for nurses,6 this article will mainly focus on the basics and essentials of foundation models and genAI.

Artificial intelligence is the field of study and development of computer systems that can perform tasks that usually need human intelligence, such as seeing, listening, reasoning, and/or translating languages.7 Building upon this foundation, in late 2021, the Stanford Institute for Human-Centered Artificial Intelligence introduced “foundation models,” large AI models trained on extensive data,8 serving as a base for more specific AI tools. These models are based on deep-learning architectures, primarily the “Transformer,”9 and trained through self-supervision. Table 1 includes the descriptions and references of key concepts and terms commonly used in this domain (ordered alphabetically). GenAI refers to AI systems capable of generating content such as text, images, or other media in response to prompts. GenAI models are mostly based on foundation models that learn the patterns and structure from their large input training data and then generate new data that have similar characteristics. Large language models are a type of genAI that learn the patterns and relationships in language data and use this knowledge to generate human-like text. Large language models can be used for a wide range of applications in clinical settings, including natural language processing, text summarization, and language translation.11

Table 1 - Key Concepts of Generative AI

Attention10: A technique used within deep-learning modeling that is meant to mimic cognitive attention. The effect enhances some parts of the input data while diminishing other parts.
Autoregressive: A type of statistical model that predicts future values based on past values.
Deep learning: A subset of machine learning algorithms that uses multilayered neural networks, called deep neural networks. These algorithms are the core behind the majority of advanced AI models.
F1 score: A machine learning evaluation metric that measures a model's accuracy by combining its precision and recall scores.
Factual accuracy: The state of being precise, exact, or accurate in relation to facts.
Fine-tuning: Projecting a pretrained model onto new data, allowing the model parameters (weights) to be slightly updated to better fit a specific task.
Foundation model: Commonly a neural network-based model trained on vast amounts of raw data that can be adapted to accomplish a broad range of tasks.
GenAI: A type of artificial intelligence system capable of generating content such as text, images, or other media in response to prompts. GenAI models are mostly based on foundation models that learn the patterns and structure of their large input training data and then generate new data that have similar characteristics.
Graphics processing unit: Hardware initially designed to accelerate computer graphics and image processing; it also speeds up computation for complex machine learning and deep-learning algorithms, enabling large models to be trained in a short time.
LLM: A foundation model trained on massive text data that can recognize, summarize, translate, predict, and generate text and other content based on knowledge gained from massive datasets.
Neural networks: A category of machine learning algorithms that use processes mimicking the way biological neurons work together to identify phenomena, weigh options, and arrive at conclusions, deriving decisions in a manner similar to the human brain.
Parameters: The weights learned during training that make up the machine learning model.
Precision: The fraction of relevant instances among all retrieved instances.
Pretraining: First training a foundation model on a generic task with large data, then using the resulting parameters or model to train another model on a different task or dataset.
Prompt: The input given to an LLM; it is the primary way to influence the model's output.
Prompt engineering: Creating and adapting prompts (input) to instruct AI models to generate specific text output.
Recall (sensitivity): The fraction of relevant instances that were retrieved.
Self-supervised learning: A machine learning process in which the model trains itself to learn one part of the input from another part of the input.
Sequence modeling: Machine learning models that input or output sequences of data.
Transformer: A type of deep-learning model designed to process sequential input data, such as natural language, with applications to tasks such as translation and text summarization.
BACKGROUND

The advancement in training foundation models such as LLMs has been fueled by the advances in computer hardware, especially graphic processing units, parallel and distributed computing, and the advances in machine learning and natural language processing, as well as the availability of cloud computing resources to handle such large computational loads. Early LLMs include the Generative Pretraining Transformer10 (GPT) by OpenAI,12 as well as Bidirectional Encoder Representations from Transformers (BERT)13 and the Text-to-Text Transfer Transformer (T5)14 by Google. GPT-1,10 as an example, was trained on text from over 7000 books and used a 12-layer transformer network to predict the next word in a sentence. Later versions, like GPT-3,15 expanded to 96 layers with 175 billion parameters, trained on over 570 GB of diverse text data. These pretrained LLMs were then evaluated on tasks like question answering, translation, text classification, and even common sense reasoning.
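The next-word-prediction objective described above can be illustrated at toy scale. The sketch below is our own deliberate simplification: it uses bigram word counts rather than a neural network, so it only shows what "predicting the next word from the preceding words" means, not how a transformer actually does it.

```python
from collections import Counter, defaultdict

def train_bigram(corpus: str) -> dict:
    """Count how often each word follows each other word."""
    counts = defaultdict(Counter)
    words = corpus.lower().split()
    for current, nxt in zip(words, words[1:]):
        counts[current][nxt] += 1
    return counts

def predict_next(counts: dict, word: str) -> str:
    """Return the word most frequently observed after `word`."""
    return counts[word.lower()].most_common(1)[0][0]

# Tiny made-up "training corpus" for illustration only.
corpus = "high blood pressure and high blood sugar and high blood pressure"
model = train_bigram(corpus)
print(predict_next(model, "blood"))  # prints "pressure" (seen twice vs. "sugar" once)
```

A real LLM replaces the frequency table with billions of learned parameters and predicts over subword tokens, but the training signal (the actual next word in the text) is exactly this cheap to obtain.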

Generative AI Applications

GenAI and LLMs have gained widespread attention in recent years, as their potential applications are vast. In the field of language translation, these models can be used to improve the accuracy and efficiency of machine translation, making it easier for people to communicate across different languages. In content generation, LLMs can be used to automatically generate news articles, product descriptions, and other types of text content. Similarly, image-generating AI models such as the text-to-image model DALL-E16 can generate images and schematic drawings based on provided text descriptions. GenAI and LLMs can also be used to develop chatbots and virtual assistants, such as ChatGPT, which can understand and respond to natural language input, allowing for more natural and intuitive interactions between humans and machines. Such chatbots can be used to provide customer service, answer questions, and even assist with tasks such as scheduling appointments or making reservations.

Application of LLMs to Healthcare

The potential applications of LLMs are considerable in healthcare. Large language models are capable of processing and analyzing vast amounts of natural language data, including clinical text, and can be used to perform a wide range of tasks in healthcare settings.17 Here are some examples of how LLMs can be used in healthcare:

- Diagnosis and treatment recommendations: LLMs can be used to analyze a patient's medical history, symptoms, and other relevant data to make personalized diagnoses and treatment recommendations. By processing vast amounts of data, LLMs can also help healthcare providers identify patterns and trends in patient care, which can inform better treatment decisions.
- Patient education: LLMs can generate human-like text and provide patients with clear and concise information about their medical conditions, treatment options, and health outcomes. This can help patients make more informed decisions about their care and improve their overall health literacy.
- Electronic health records (EHRs): LLMs can be used to analyze and summarize large amounts of EHR data, making it easier for healthcare providers to identify patterns and trends in patient care. This can help improve the quality of care and patient outcomes.
- Clinical decision support: LLMs can provide healthcare providers with real-time decision support, assisting them in making more accurate and informed decisions based on a patient's medical history, symptoms, and other relevant data.
- Medical research: LLMs can be used to analyze and summarize large amounts of medical literature, making it easier for researchers to identify relevant studies and extract information for systematic reviews. This can help improve our understanding of various medical conditions and inform the development of new treatments.

Although LLMs have great potential in healthcare, it is important to consider the ethical implications of their use. This includes ensuring patient privacy and data security are protected and that healthcare providers and nurses are using LLMs in a responsible and ethical manner. One of the main concerns is the potential for bias and misinformation. Large language models are trained on existing data, which may contain biases or inaccuracies.18,19 This can lead to the perpetuation of harmful stereotypes or the spread of false information. Additionally, the sheer size and complexity of these models can make them difficult to understand and audit, raising concerns about accountability and transparency. Despite these challenges, LLMs have the potential to transform the way we interact with technology and each other. As research in this field continues, it will be important to balance the potential benefits with the risks and challenges and ensure that these models are developed and used in an ethical and responsible manner (Figure 1).

FIGURE 1:

Illustrative diagram showing the inner workings of an LLM that can take patient data from the last 12 hours to generate a nursing handoff summary. The figure shows that the patient data are used as the input to the encoder part of the transformer and the output that can be generated from the decoder part. Some LLMs, such as GPT, do not have an encoder part.

BASIC TECHNICAL ASPECTS NURSES NEED TO KNOW

As described above, foundation models are large AI models that are mostly transformer-based and trained on large data using self-supervised methods that can be used to improve the performance of a wide range of downstream tasks.8 This section provides some baseline technical information that can help nurses better understand the inner workings of the model.

Transformer

The main idea behind the transformer architecture is to use a deep-learning mechanism called “attention” to build encoder and decoder layers.20 The encoder learns a representation of each word that encodes what the word means in the context of the surrounding words, and the decoder generates words based on the given input, which is commonly a sequence of preceding words known as a prompt. For a language-translation task, the input can be an encoded representation of a similar word in another language, whereas for a text-completion task, the input can be the generated representation of the preceding sentences up to a specific length predefined by the LLM.
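The attention computation itself is compact enough to sketch. The following NumPy example implements single-head scaled dot-product attention, the core operation inside transformer layers; the random matrices are toy stand-ins for the learned query, key, and value projections, and production transformers add multiple heads and many stacked layers.

```python
import numpy as np

def softmax(x: np.ndarray) -> np.ndarray:
    """Numerically stable softmax over the last axis."""
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    """Each output row is a weighted average of the value rows;
    the weights reflect how strongly each query attends to each key."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)   # similarity between queries and keys
    weights = softmax(scores)         # each row sums to 1
    return weights @ V, weights

# 3 tokens, each represented by a 4-dimensional vector (toy values)
rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 4))
K = rng.normal(size=(3, 4))
V = rng.normal(size=(3, 4))
output, weights = scaled_dot_product_attention(Q, K, V)
print(weights.sum(axis=-1))  # each token's attention weights sum to 1
```

The "enhances some parts of the input while diminishing others" effect from Table 1 is exactly the `weights` matrix: rows closer to 1 in one column mean that token's output is dominated by a single other token's value vector.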

There are three main types of pretrained LLMs, based on how they utilize different parts of the transformer architecture. First are encoder-only models, such as BERT,13 which focus on learning word representations based on the sentence's context. The output from encoder models is commonly referred to as contextualized embeddings, which can be used as the first layer or as the input for any downstream task-specific model to achieve state-of-the-art performance after fine-tuning. Second are decoder-only models, such as GPT, also known as autoregressive models; these models are mostly generative and focus on creating new words based on input text, commonly referred to as a prompt. This prompt can also include examples to enable in-context learning, steering the model to focus on the task demonstrated by the examples. The third type is encoder-decoder models, such as T5,14 which are better suited to translation-like tasks. The recent waves of LLMs that have attracted the public's attention are mainly decoder-only models.

Self-supervised Tasks

In machine learning, where the AI model learns patterns from the training data, supervision mostly reflects how we provide the correct answer (true label) for the training data. For example, for an in-hospital mortality prediction task, the input can be the patient's clinical information on admission. The AI model will learn patterns from the patient clinical information in the training set to estimate the probability that the patient will die during the admission. However, in order to evaluate the model and further improve its performance, patient samples with true labels indicating whether they died during the admission must be provided. In supervised machine learning methods, such as the classification tasks commonly used for prediction, the model will compare its predicted probabilities against the provided true labels and use the results of this comparison to adjust the learned patterns to further improve the model's accuracy. However, in unsupervised methods, true labels are not provided to the model; the model will try to learn the inherent patterns only from the given input data elements.

Self-supervised learning is a kind of unsupervised learning. In self-supervised tasks, the model trains itself to learn one part of the input from other parts of the input. For example, in BERT model pretraining, the model learns to predict randomly masked words within the input sentences from the neighboring words. GPT models, on the other hand, learn to predict the next word given the first few words of a sentence. In other words, the labels are automatically generated from the input. On the one hand, such tasks are easy to construct, and often only minimal preprocessing effort is needed for data preparation. On the other hand, such tasks are difficult enough that the model needs to “understand” the input before it can give a correct prediction. As a result, self-supervised learning can be straightforwardly scaled up, and intricate patterns in natural language can be learned.
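How these two pretraining tasks manufacture labels from raw text can be shown concretely. The sketch below is a simplification (it works on whole words, whereas real models operate on subword tokens over billions of sentences), generating BERT-style masked-word examples and GPT-style next-word examples from the same sentence:

```python
import random

def masked_lm_example(sentence: str, seed: int = 0):
    """BERT-style: hide one word; the hidden word becomes the label."""
    words = sentence.split()
    rng = random.Random(seed)
    i = rng.randrange(len(words))
    masked = words.copy()
    label = masked[i]
    masked[i] = "[MASK]"
    return " ".join(masked), label

def next_word_examples(sentence: str):
    """GPT-style: every prefix predicts the word that follows it."""
    words = sentence.split()
    return [(" ".join(words[:i]), words[i]) for i in range(1, len(words))]

sentence = "the patient reports chest pain"
print(masked_lm_example(sentence))   # (sentence with one word masked, hidden word)
print(next_word_examples(sentence))  # list of (prefix, next word) training pairs
```

Note that no human ever labeled anything here: the (input, label) pairs come entirely from the raw text, which is what makes pretraining on mountains of unannotated data possible.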

Fine-Tuning

The main advantage of foundation models is that, once pretrained, they can improve the performance of a wide range of downstream tasks. For example, the latest version of GPT (GPT-4) can be used to generate a tweet, write an SQL query, or describe an image, which are different examples of downstream tasks. Fine-tuning means that the pretrained representations learned by the foundation model are further trained on new data for a specific downstream task. The main advantage of fine-tuning foundation models rather than training new AI models from scratch is that it typically requires much less labeled data, which can be expensive and time-consuming to acquire.
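To illustrate the idea of reusing pretrained representations, the sketch below trains only a small logistic-regression "head" on top of fixed feature vectors. The features and labels are synthetic stand-ins for the frozen outputs of a foundation model; real fine-tuning would update some of the model's own weights using a deep-learning framework, but the principle (reuse learned representations, learn only a little on top, with few labels) is the same.

```python
import numpy as np

rng = np.random.default_rng(42)

# Pretend these are frozen representations produced by a pretrained model:
# 20 labeled samples, each an 8-dimensional feature vector (synthetic data).
features = rng.normal(size=(20, 8))
true_w = rng.normal(size=8)
labels = (features @ true_w > 0).astype(float)  # synthetic binary labels

# "Fine-tune" only a small logistic-regression head via gradient descent.
w = np.zeros(8)
for _ in range(500):
    probs = 1 / (1 + np.exp(-(features @ w)))           # sigmoid predictions
    grad = features.T @ (probs - labels) / len(labels)  # log-loss gradient
    w -= 0.5 * grad

accuracy = ((1 / (1 + np.exp(-(features @ w))) > 0.5) == labels).mean()
print(f"training accuracy: {accuracy:.2f}")
```

Twenty labeled samples would be hopeless for training a model from scratch, but they are enough to adapt a fixed representation, which is precisely the economic argument for fine-tuning.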

Prompt Engineering

In the context of LLMs, a prompt is an instruction or input you provide to the LLM to guide its response. The process of designing and tuning these prompts for specific tasks, with the goal of improving the performance of LLMs, is called prompt engineering. Carefully designed prompts can significantly improve the performance of LLMs as they can provide instructions and contextual information to guide the LLM's attention toward the most relevant information for a given task, leading to more accurate and reliable outputs. Chain-of-thought is a prompting technique that guides LLMs through a series of intermediate reasoning steps, which can be particularly useful in complex tasks such as drafting a patient care plan or medical diagnosis. For instance, a chain-of-thought prompt could guide the LLM through the steps of diagnosing a patient based on symptoms, medical history, and test results.21

Best practices for LLM prompt engineering involve creating prompts that are clear, specific, and aligned with the task at hand. For example, in a nursing education scenario, a prompt could be designed to guide the LLM in generating a detailed explanation of a medical procedure, with the prompt specifying the procedure, the level of detail required, and the target audience, whether they are nursing students or patients. There are many prompt formulas, frameworks, and cheat sheets available,22–27 among which we find CRISPE23,28 (capacity, role, insight, statement, personality, and experiment) and CREATE25,29 (context, result, explanation, audience, tone, and edit) to be the most comprehensive. To improve their prompt engineering skills, nurses will need a basic understanding of how these LLMs work, their capabilities and limitations, and continuous practice.30
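When the same prompt pattern is reused across many requests, a framework such as CREATE can be encoded as a simple template. The helper below is our own illustrative rendering of the six CREATE components as labeled lines; the exact label wording is an assumption for demonstration, not a standardized format.

```python
def build_create_prompt(context, result, explanation, audience, tone, edit):
    """Assemble a prompt following the CREATE framework:
    Context, Result, Explanation, Audience, Tone, Edit."""
    parts = [
        f"Context: {context}",
        f"Desired result: {result}",
        f"Explanation of the task: {explanation}",
        f"Audience: {audience}",
        f"Tone: {tone}",
        f"Edits/constraints: {edit}",
    ]
    return "\n".join(parts)

# Hypothetical nursing-education example.
prompt = build_create_prompt(
    context="You are assisting a hospital medical-surgical unit.",
    result="A one-paragraph explanation of insulin self-injection.",
    explanation="Summarize the key steps and safety checks.",
    audience="A newly diagnosed adult patient with no medical background.",
    tone="Reassuring and plain-language, no jargon.",
    edit="Keep it under 150 words and avoid abbreviations.",
)
print(prompt)
```

Filling each slot deliberately, rather than writing a single vague request, is what gives the model the task framing and contextual information the paragraph above describes.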

Zero/Few Shot Learning

Cutting-edge foundation models often require only a handful of samples during either prompting or fine-tuning, a process known as few-shot learning.15 Additionally, some of these models perform exceptionally well without any task-specific examples or fine-tuning, a phenomenon referred to as zero-shot learning.
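In prompting, few-shot learning amounts to packing a handful of worked examples into the prompt itself. The sketch below builds such a prompt for rewriting clinical jargon in plain language; the example pairs and the prompt format are invented for illustration.

```python
def few_shot_prompt(examples, query):
    """Build a few-shot prompt: an instruction, worked examples, then the new query."""
    lines = ["Rewrite the clinical phrase in plain language.", ""]
    for jargon, plain in examples:
        lines.append(f"Clinical: {jargon}")
        lines.append(f"Plain: {plain}")
        lines.append("")
    lines.append(f"Clinical: {query}")
    lines.append("Plain:")  # the model completes this line
    return "\n".join(lines)

# Two hypothetical demonstration pairs ("shots").
examples = [
    ("Patient is febrile.", "The patient has a fever."),
    ("NPO after midnight.", "Do not eat or drink anything after midnight."),
]
print(few_shot_prompt(examples, "Patient is hypertensive."))
```

A zero-shot prompt would keep only the instruction and the final query; the model must then infer the task format without demonstrations.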

Generative AI Evaluation Methods

GenAI models including LLMs are commonly assessed on their ability to generate precise, coherent, and contextually appropriate responses.31,32 This evaluation is often done through two methods. The first is an automated process, similar to a multiple-choice examination, where the AI's responses are compared with a set of predetermined correct answers.31 The second method involves human evaluators, in which experts assess the quality and relevance of the AI's output in real-world scenarios. Continuous evaluation and monitoring of genAI models are crucial to identify areas for further training and development.31,32
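The automated, examination-style method reduces to comparing model answers against an answer key. The sketch below scores a hypothetical four-item test this way; real benchmark harnesses add answer normalization and far larger question sets.

```python
def score_accuracy(model_answers, answer_key):
    """Fraction of questions where the model's answer matches the key
    (case-insensitive, surrounding whitespace ignored)."""
    correct = sum(
        m.strip().lower() == k.strip().lower()
        for m, k in zip(model_answers, answer_key)
    )
    return correct / len(answer_key)

answer_key    = ["B", "A", "D", "C"]
model_answers = ["B", "C", "D", "C"]  # hypothetical model output
print(score_accuracy(model_answers, answer_key))  # prints 0.75 (3 of 4 correct)
```

Human evaluation has no such closed-form score: experts rate each output on rubrics such as relevance and safety, which is slower but captures qualities an answer key cannot.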

In the realm of genAI models, several evaluation methods have been proposed and utilized.31,33,34 Evaluation methods utilize essential qualitative and quantitative metrics like coherence, relevance, fluency, diversity, factual accuracy, bias and fairness, user satisfaction, task completion, and efficiency, besides traditional machine learning metrics such as precision, recall (sensitivity), accuracy, and F1 score. Whereas benchmarking against trusted standards and checking grammatical validity are basic methods to evaluate the LLMs' ability to generate precise and coherent content, other evaluation methods are used to assess whether the generated text or responses are contextually appropriate. These include the following:

- Single-turn question answering without retrieval: In this setup, users pose individual questions or prompts, and a genAI model is employed to generate responses instantly. This is like answering a patient's single question, for example, "What is hypertension?" and getting an answer such as, "Hypertension is a condition characterized by consistently high blood pressure," without the need to refer to the patient's medical records. The key metric used here is accuracy, which measures how often the AI's responses match the correct answers.
- Multiturn or single-turn chat with retrieval: In this context, users engage in conversational interactions. The genAI model, equipped with retrieval mechanisms, not only generates responses but also has the capability to access and incorporate information from external sources. For example, a patient asks, "Why am I feeling dizzy?" Referring back to the patient's records, the model sees the patient is on blood pressure medication and responds, "Your dizziness could be a side effect of your blood pressure medication." The key evaluation metric in this case is relevance, which assesses how pertinent and directly related the genAI's responses are to the given questions.
- Task-based evaluation: This method evaluates the AI model in place, as it would be used in practice. For example, a nurse uses a genAI tool to analyze a patient's laboratory results. The metric used here is task completion, which measures whether the AI model was able to successfully complete the task.
- Turing-style test: This test measures how well a human can distinguish the model from another human. This is similar to a nurse using an AI tool to communicate with patients. The goal is to see if the AI can provide responses that are indistinguishable from a human's. The key metric is the pass rate, which measures how often a human evaluator cannot distinguish the AI's responses from a human's responses.
- Truthfulness: This method checks how true the model's outputs are and whether it fabricates information or reproduces real-world biases. This is similar to a nurse using an AI tool to provide medical information. The metric used here is factual accuracy, which checks if the generated medical information is true and not fabricated.
- Human-AI collaboration: This approach measures genAI performance through human-AI collaboration, for example, using a genAI tool to assist in diagnosing a patient. The AI tool can provide suggestions based on the patient's symptoms, but the clinician makes the final decision. The key metric is user satisfaction, which measures users' satisfaction with the AI's responses.

ADOPTION OF genAI IN CLINICAL SETTINGS

Excitement about genAI is everywhere.35 Top executives and CEOs are speaking about it in the press. Alongside their enthusiasm, however, these experts are discussing the need for regulations about how AI should be used.36 Nurses have a clear understanding of the current workflow but have yet to identify how AI technology may affect their work. Workflows, implementation, and project management principles must not be ignored out of excitement for technology and innovation. Frameworks, principles, and methods to support innovative tool implementation and adoption are needed. Quick decisions to adopt new technologies without considering the current state of people and processes can lead to patient harm and business losses.37 Domain experts are needed to assist in problem identification, assessment, planning, and AI implementation.

Although the specifics may differ depending on the scenario, the involvement of an interdisciplinary team, including nurses, other medical professionals, data scientists, ethicists, and policymakers, is generally essential in most clinical practice use cases.38 In the following sections, we will highlight major questions to ask when considering the implementation and adoption of genAI-based solutions.

Needs Assessment

The implementation of genAI solutions requires significant technological infrastructure and education. Seibert and colleagues39 argue that nurses need practical business use cases for AI applications in nursing care. Use cases could provide insight into how to balance technology with evidence-based recommendations to improve decision-making. These use cases could include evaluating care requirements, choosing appropriate and effective interventions, and observing and assessing health statuses and results. Such use cases would be beneficial in training staff to account for genAI system recommendations being error-prone and not always applicable to individual patients. Nurses must be able to interpret and understand the logic behind these suggestions to make informed decisions and avoid actions that might lead to inappropriate care.3 Moreover, nurses may find it challenging to adapt to new systems. Although genAI can automate specific tasks, its implementation may initially increase workloads for nurses. Continuous education will be needed to interpret genAI outputs and manage additional monitoring procedures, which may add to staff responsibilities.2

Business cases or problems will help determine each project's objectives, costs, and benefits. After the business case has been clearly stated, project feasibility must be considered.37 Project feasibility guides organizational leaders in decision-making by identifying the project's benefits, risks, barriers, constraints, and other issues that could affect its overall success. Frameworks such as the Cross-Industry Standard Process for the development of Machine Learning applications with Quality assurance methodology (CRISP-ML (Q))40 can be used to guide organizations in understanding the business case and identify risks for genAI solutions.

This process will guide the team through the essential steps of assessment and evaluation throughout the new technology's implementation. The CRISP-ML (Q)40 framework consists of six distinct phases, starting from defining the project's scope to the ongoing maintenance of the deployed machine learning system. Initially, the business and data assessment phases are undertaken concurrently, as both significantly influence the project's practicality. The subsequent stages are data preparation, modeling, evaluation, and implementation. Particular attention is given to the implementation phase, as a model operating in dynamic real-time settings necessitates continuous evaluation to monitor its performance. Throughout each stage of the process, this approach introduces a quality assurance methodology designed to tackle challenges and mitigate the risks typically encountered in a machine learning development environment.

During the feasibility analysis, critical factors such as problem identification, technical and legal issues, the business environment, and strategic planning must be considered carefully.41 Such an analysis can later be used to support leaders in determining whether to proceed with their proposed genAI implementation plan. A feasibility analysis typically involves understanding the ideas, perceptions, assumptions, and facts associated with the organization and the problem.

Nurse informaticists play a crucial role in assessing and evaluating the business case and its risks to support the implementation of genAI by disseminating their expertise and skills among other staff members, especially when there is a lack of practical knowledge within the team. Importantly, the partnership between clinical nurse leaders and nurse informaticists is crucial for the success of genAI projects.42 Nurse informaticists can assist the team in understanding the answers to the questions listed in Table 2.4,41,43 Additionally, nurse informaticists can educate other team members by providing explanations, sharing their knowledge, offering guidance, and facilitating discussions. This collaborative approach helps to enhance the understanding and efficiency of the entire team.

Table 2 - Questions for Nurse Leaders to Ask When Assessing the Need for genAI Solution Implementation

- Why should the organization undertake genAI tools rather than other standard methods?
- Is this the right solution for the problem? Can this be solved by another IT strategy, solution, or improved workflow?
- What justifies undertaking this project, and was it forecasted in the strategic plan?
- What questions do we need to answer to communicate to business leaders about the problem, and why do we need staff support and funding for this project?
- How will this project contribute to the organization's goals, objectives, innovation, success, and digital transformation?
- What is the return on investment?
- What is the benefit to the consumer?
IMPLEMENTATION AND EVALUATION

Nurses often struggle to stay up-to-date with swift advancements in digital technologies such as AI and their effects on society.3,35 This evolving landscape demands that nurse informaticists be involved in the adoption and education of new health technologies. Their key responsibility in this area is to uphold the commitment to educate and protect as organizational teams create, adopt, and apply new technologies.44–46 Nurse informaticists can support genAI implementations in various ways. For example, they may analyze and map healthcare workflows to identify inefficiencies and areas where organizations can apply genAI to improve processes.2 The process might involve automating routine tasks, streamlining data entry procedures, or improving communication channels within the healthcare team. Alternatively, they might develop systems that can collect, store, and analyze data more efficiently. Such processes could involve integrating disparate health information systems to enable seamless data exchanges and applying data standards to ensure that genAI solutions can interpret and act on data accurately.4

Nurse informaticists also play a critical role in educating nursing staff and other healthcare professionals about genAI technologies, including how to use them effectively and understand their impact on patient care and workflow.47 This education may help alleviate fears and resistance to new technologies, fostering a culture of innovation and continuous improvement. Furthermore, nurse informaticists can guide the ethical use of genAI in healthcare, ensuring that automated systems are used to respect patient privacy and consent and ensure equity in care.42 They can also help healthcare organizations navigate the complex regulatory landscape associated with digital health technologies. They can participate in research initiatives that explore the effectiveness of genAI in healthcare, contributing to evidence-based practices and innovation in the field. By leveraging their expertise in nursing and information technology, nurse informaticists can act as crucial intermediaries in successfully implementing and supporting genAI in healthcare.2

The Role of Nurse Informaticists

Nurse informaticists are instrumental in the collaboration between clinical and technical teams needed to identify precise problem definitions and user requirements, analyze project feasibility, and assess organizational readiness, including analyses of workflows covering people, processes, and technology.42 As a part of the healthcare team, they can assist in determining the current state of data completeness and accuracy, as well as how those data apply to the context of use cases, staff, and patient preferences.

During the needs assessment and feasibility analysis phases of implementing a genAI-based solution, nurse informaticists can help nurse leaders understand the importance of, and then answer, key identification questions.4,41,43 For example, if the organization is planning to implement an LLM-based program to automatically generate nursing summary note drafts, the questions posed by nurse informaticists might look like the following:

- What is the LLM-based tool intended to do, and what is the output format expected to look like?
- How will the adoption of an LLM-based tool to draft nurses' hand-off notes help to achieve the organization's strategic goals?
- What are the potential benefits of using such tools, and how can they improve patient outcomes?
- What are the potential risks and limitations of using such tools, and how can these be mitigated?
- What kind of clinical input is needed to assess, implement, and further validate the tool during the different stages of the project?
- Do we have the data required for the LLM's fine-tuning and evaluation? If yes, is it in a consumable format, or is it feasible to transform the data for LLM use? Do we need clinical input to transform the data required for the LLM-based tool?
- Will the adoption of the LLM-based tool to draft nurses' hand-off notes conform to or disrupt the existing clinical workflow?
- From a resource perspective, what will be the cost of utilizing such a tool?
- How can we validate the performance of the tool?
- Is there an approach to ethical (responsible) LLM use? How can we ensure that LLMs are being used ethically and responsibly and that patient privacy and data security are protected?
- How can we ensure that healthcare providers and nurses are adequately trained to use LLMs in healthcare and that they understand the limitations and potential biases of this technology?
- How can we evaluate the effectiveness of the LLM-based tool, and how can we continuously monitor and improve its performance?

Successfully implementing an AI project requires following a well-defined model or process; skipping any step can lead to unintended and potentially negative outcomes. The early phases of such processes focus on AI and machine learning (ML) models and on workflows, and they require domain experts and knowledge workers to collaborate with technical experts to identify current and future states and success criteria. Organizational leaders may be tempted to skip these steps because they assume their staff understand the problem and are familiar with current workflows and data. However, this assumption is often wrong and is one reason AI projects fail. A 2020 Gartner survey identified the complexity of integrating AI within existing infrastructures as a significant barrier to applying AI in practice.48 Nurses equipped with the knowledge from the questions in Tables 2 and 3 will facilitate the application of AI by informing the business and data assessment phases of the CRISP-ML(Q) process.40

Table 3 - Questions to Ask While Designing the Implementation of a GenAI-Based Tool Within a Clinical Workflow

1. What is the genAI-based tool intended to do?
GenAI is a type of artificial intelligence capable of creating content. This could be in the form of text, images, music, or even videos. An LLM, or large language model, is an AI model designed to understand, generate, or complete text sequences. It is a generative model tailored explicitly for language-related tasks.

2. Describe the goal for the genAI tool and how it aligns with the organization's strategic objectives.
The intended purposes of the LLM should be consistent with the broader long-term goals and plans of the organization.

3. What are the potential benefits of using genAI in healthcare, and how can they improve patient outcomes?
GenAI can offer a range of benefits to enhance nursing roles, improve patient care, and streamline hospital operations. Potential benefits include training and simulation, personalized patient education, optimized scheduling, wound assessment and monitoring, data synthesis for care plans, and streamlined documentation. While genAI holds significant promise for enhancing the nursing profession, AI should be viewed as a supportive tool rather than a replacement.

4. Why should the organization undertake AI?
The justifications include a return on investment (ROI) for the organization to adopt AI. AI initiatives can offer organizations a competitive edge, streamline operations, enhance customer experiences, and open up new avenues for innovation.

5. Is AI the right solution for the problem?
The nature of the problem must be assessed in terms of people, processes, and technology. The assessment will provide insights into solutions relevant to that problem. AI might not be the best solution if the problem requires complex decision-making or innovative concept development, or if there is a lack of data for training.

6. Can the problem be solved by another IT strategy, solution, or improved workflow rather than AI?
Clinical problems can often be addressed through alternative IT strategies, solutions, or improved workflows rather than relying solely on AI. Often, a combination of approaches, tailored to the specific needs and context of a clinical problem, can provide practical solutions without necessarily employing AI. Depending on the nature of the problem, various non-AI technological approaches can be beneficial, such as telemedicine solutions, clinical decision support systems, health information exchanges, pharmacy automation, patient engagement platforms, data analytics, quality assurance protocols, and training and education platforms.

7. What justifies the undertaking of an AI project, and was it forecasted in the strategic plan?
Justifying an AI project requires a comprehensive assessment of its potential benefits, alignment with organizational goals, and an expected ROI. Forecasting AI in the strategic plan can influence how the project is resourced and affect its overall success.

8. How does AI benefit nurses?
AI can support nurses by reducing administrative burdens, enhancing patient care, assisting in decision-making, and providing training and skill-development tools. By efficiently handling repetitive manual tasks and providing valuable insights, AI can free nurses to focus on patient-centered care and their own professional development.

9. How does AI benefit patients or clients?
AI offers several advantages to individuals receiving medical care or treatment. Examples include:
• Analyzing health data to recommend personalized treatment plans
• Using radiology images to detect disease early
• Monitoring medication intake to reduce errors
• Continuously monitoring a patient's vitals or condition through wearable devices
• Alerting healthcare professionals automatically
• Utilizing chatbots to answer questions and schedule appointments
• Assisting with personalized rehabilitation plans and progress tracking

10. What clinical inputs are needed to perform the data transformation required for genAI implementation?
Professionals with clinical expertise should be involved and provide guidance. Expert involvement from clinicians is vital to determine which data are relevant and to ensure that they are interpreted correctly. These professionals include domain experts such as nurses from various specializations.

11. How can the genAI tool be integrated appropriately within the current clinical workflow? What processes must we consider if we need to adjust our workflow for efficiency?
Any AI-based tool has the potential to change the way clinical tasks and procedures are carried out. The integration of such tools should be smooth and should not cause disruption or necessitate changes to how things are done. Even if the workflow changes required to improve overall efficiency are minor, stakeholders must be involved in the workflow redesign decisions and further evaluation.

12. How has the LLM's performance been validated?
Models are trained on vast amounts of text data. The more diverse these training data are, the more robust the resulting model will be.

13. How can we ensure that LLMs are being used ethically and responsibly and that patient privacy and data security are protected?
There are defined processes and procedures that protect patients' personal and private information. These data are safeguarded to ensure they are not accessed, tampered with, or shared without proper authorization. Ethical issues touch on privacy, accountability, bias, and other concerns. Addressing these concerns requires a combination of technological innovations, regulatory measures, industry standards, and public awareness.

14. What are the potential risks and limitations of using LLMs in healthcare, and how can these be mitigated?
Possible risks include bias and fairness, data privacy, and misinformation and misuse. These risks must be identified so that mitigation strategies can be developed to reduce them.

15. Do data exist for the required genAI or LLM fine-tuning and evaluation? If yes, how do we transform those data for genAI use?
The data for genAI must be available and of good quality, and it must be possible to convert them to a format suitable for consumption by genAI. For example, the data may be saved in a textual format for an LLM such as GPT-4, so that the data can be consumed and outputs generated in the desired format.

16. Draft the budget for the implementation of the genAI tool, along with a long-term cost analysis.
Overall costs may include infrastructure, hardware, software, personnel, training, maintenance, and the resources necessary to implement the LLM effectively.

17. What is the ROI for AI?
ROI is a measure used to evaluate profitability, including the indirect benefits of AI that might accrue over time. Quantitative ROI refers to tangible, numerical, or measurable returns on an investment. Qualitative ROI refers to benefits that might not be immediately measurable in dollars but still provide value to an organization, such as customer satisfaction.
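The data-transformation step raised in Table 3 (converting available records into a textual format an LLM can consume) can be sketched as follows. The field names and the output layout are hypothetical, chosen only to illustrate flattening a structured record into plain text:

```python
# Hypothetical sketch: flatten a structured vitals record into plain
# text suitable for inclusion in an LLM prompt or fine-tuning corpus.
# The record schema below is an illustrative assumption, not a real
# EHR format.
def record_to_text(record: dict) -> str:
    lines = [f"Patient record ({record['timestamp']}):"]
    for name, value in record["vitals"].items():
        lines.append(f"- {name}: {value}")
    return "\n".join(lines)

record = {
    "timestamp": "2024-05-01 08:00",
    "vitals": {"heart rate": "72 bpm", "temperature": "37.1 C"},
}
print(record_to_text(record))
```

In practice, clinicians would decide which fields are relevant and how values should be labeled, which is exactly the clinical input called for in question 10.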

Domain experts, such as nurse informaticists, can assist data scientists and engineers when developing LLM prompts. Such collaboration can help to achieve more accurate, relevant, coherent, and diverse outputs from tools such as ChatGPT. These domain experts provide relevancy, tone, and context to crucial questions to target specific, structured answers. Their expert involvement can save healthcare organizations time and money by reducing the number of staff hours required to identify the most accurate prompt to achieve precise outputs. What's more, optimized outputs may even be used to improve patient outcomes.
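The collaboration described above can be made concrete with a small sketch. The clinical framing below (role, tone, and structural constraints) stands in for the hypothetical text a nurse informaticist might supply; the function merely assembles it into the message structure a chat-style LLM expects:

```python
# Hypothetical sketch of domain-expert prompt engineering: the nurse
# informaticist supplies the role, tone, and constraints; the engineer
# wraps them into a chat-style message list.
def build_handoff_prompt(patient_summary: str) -> list[dict]:
    system = (
        "You are drafting a nursing hand-off note. "
        "Use plain clinical language, avoid speculation, and structure "
        "the note as Situation, Background, Assessment, and Recommendation."
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": f"Draft a hand-off note from: {patient_summary}"},
    ]

messages = build_handoff_prompt("Post-op day 1 after hip replacement; pain controlled.")
print(messages[0]["role"])  # -> system
```

Keeping the expert-authored framing in one reusable template, rather than retyping it per request, is one way to reduce the staff hours spent searching for prompts that yield precise outputs.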

Identifying a prompt requires recognizing and framing the problem or clinical question.40 To understand the clinical question, one must have domain knowledge in healthcare and a working knowledge of the organization's workflows, people, processes, and technology. Domain experts cannot be replaced in this process. Once a relevant clinical problem is identified, the domain experts work to address it by communicating with the healthcare team and negotiating throughout the organization to further define the problem, develop a business case, examine its feasibility, conduct assessments, implement the solution, and oversee the adoption and maintenance of the project.

Table 3 describes the types of questions that nurse informaticists can help to answer when designing an LLM-based tool to integrate within an existing clinical workflow. Most of these questions draw from Bohr and Memarzadeh's proposed framework for AI-based application adoption.49

CONCLUSION

The aim of this article was to provide nurses and nurse informaticists with an overview of the current state and potential applications of foundation models, genAI, and LLMs in nursing practice.
