Development of a Common Data Model for a Multisite and Multiyear Study of Virtual Visit Implementation: A Case Study

Consistency and efficiency in multisite data analysis can be facilitated by development and implementation of a common data model (CDM). In a CDM framework, each study site maintains core datasets in uniform and consistent formats with common measurement specifications. A reporting or analysis program may be written and tested at one site and then distributed for execution at all sites that have implemented the CDM.

A Virtual Data Warehouse (VDW), designed and implemented by member organizations of the Health Care Systems Research Network (HCSRN), is an example of a CDM.1 The HCSRN VDW datasets have a uniform design for all member organizations, are populated with data extracted from each organization’s legacy electronic health record (EHR) systems, and remain on each organization’s secured computer network. VDW datasets are specific to a content area (eg, patient demographics, outpatient and inpatient encounters, ancillary services, insurance benefits) and have a consistent configuration within each content area as to measurement specifications and code structure. Each site assesses its local data assets for populating VDW datasets and writes and tests code to conform to VDW guidelines. This uniformity of variable labels and code structures allows a SAS (SAS Institute) program written at one HCSRN member organization to be distributed and executed at any organization maintaining the VDW specifications. Thus, the HCSRN VDW supports rapid and valid ascertainment of multisite sample sizes and comparisons of cross-site patient demographic characteristics and healthcare utilization rates.
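As a minimal sketch of the kind of distributable program this uniformity supports, consider the query below. The library path and the dataset and variable names (vdw.demographics, sex) are illustrative placeholders rather than actual HCSRN VDW specifications; each site would point the libname at its own local VDW store and run the same program unchanged.

```sas
/* Minimal sketch of a distributable VDW query. Each site runs the same
   program against its own local VDW copy; only the libname path differs.
   Dataset and variable names are illustrative placeholders. */
libname vdw "/local/path/to/vdw";    /* site-specific location of VDW data */

proc freq data=vdw.demographics;
  tables sex / nocum;                /* cross-site comparable sex distribution */
run;
```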

At the time we proposed our “Virtual Visits Study,” the existing HCSRN VDW did not include sufficient detail for us to distinguish virtual visits from in-person visits nor, among virtual visits, the specific mode (eg, synchronous online chats, telephone visits, video visits). We used HCSRN VDW principles to design and implement a CDM to support our multisite research study, which investigated several questions related to the implementation of virtual visits in ambulatory care:

(1) trends in the uptake of virtual visits, overall and by department; (2) patient characteristics associated with choice of virtual versus in-person visits; and (3) differences between virtual and in-person visits in patterns of physician orders for ancillary services associated with ambulatory care for specific conditions (eg, neck or back pain) and patient fulfillment of those orders.

In this paper, we describe the CDM that we developed to support our investigation of these research questions. We present information on CDM design decisions, validation of the integrity of the CDM content, and basic descriptive statistics on the key measures included in our CDM, which may serve as a model for future, broadly applicable data standardization efforts.

STUDY SETTING

Three Kaiser Permanente (KP) regions participated in this study: KP Colorado (KPCO), KP Georgia (KPGA), and KP Mid-Atlantic States (KPMAS). Under a ceding arrangement, the institutional review boards of KPCO, KPGA, and KPMAS collectively reviewed and approved the study protocol.

KPCO provides comprehensive medical services to ∼635,000 members per year in the Denver/Boulder metropolitan areas. Beginning in 2016, KPCO implemented synchronous chats through a contract with a third-party software entity (CirrusMD). Synchronous chats allow patients to obtain “fast answers” from any available board-certified provider for questions primarily related to minor acute conditions (eg, upper respiratory symptoms). Patients initiate synchronous chats through the EHR. Video visits were implemented in 2017.

KPGA provides comprehensive medical services to ∼320,000 members per year in the Atlanta metropolitan area. Video visits were initiated in 2016 primarily in internal medicine and psychiatry.

KPMAS provides comprehensive medical services to ∼760,000 members per year in the DC, Maryland, and northern Virginia areas. Video visits were introduced in adult primary care in late 2014, and in behavioral health in mid-2016. A “House Calls” program of virtual care for a limited number of minor acute conditions (eg, rhinitis) was initiated in 2018.

DATA SOURCES

HealthConnect

KP has a national EHR system (HealthConnect) based on the Epic (Epic Systems) EHR platform. Epic is a widely used EHR in both the US and Europe and has been adapted to the EHR requirements of a broad range of healthcare systems and organizations.2,3 HealthConnect incorporates a broad range of Epic clinical modules (ambulatory, inpatient) and functions as a data integrator in which data from ancillary services systems (prescribed medications, diagnostic radiology examinations, laboratory tests) and claims for payment of medical services delivered by contract providers are consolidated into EHR databases.

Most KP regions implemented HealthConnect over a period of several years beginning around 2004–2006. HealthConnect is an adaptable EHR platform and is constantly updated as a KP region implements new clinic workflows and discontinues others. Many, but not all, EHR processes and functionality are standardized across KP regions. Regional differences in organization of healthcare delivery result in differences in data acquisition, processing, and curating.4,5 This can result in some subtle, but important, differences in identification and retrieval of healthcare events—such as virtual visits.

Existing Virtual Data Warehouses

In addition to the HCSRN VDW, KP regions collaborate in the KP Center for Effectiveness and Safety Research (CESR) VDW. The CESR VDW is intended to supplement the HCSRN VDW with datasets that are more detailed in areas selected for investigation, such as insurance benefit details and clinical evaluation results (eg, bone densitometry t-scores).

STUDY ORGANIZATIONAL STRUCTURE

Our study included representation from the 3 KP regions and was organized into 3 study teams. The Investigator Team directed project scientific activities and included several members with substantial involvement in VDW development and multisite studies. The Programmer Team included programmer/analysts with knowledge of the HCSRN VDW and HealthConnect, the KP EHR system. The Project Manager Team coordinated team and Advisory Committee meetings, identified KP virtual visit content experts, and documented CDM development.

METHODS

Scoping Review of Virtual Visit Modes for Common Data Model Development

Our initial step was a series of discussions with KP virtual visit content experts (both physicians and nonphysicians) who had experience in design and implementation of virtual visits at each of the 3 KP regions. We also included KP data administrators and programmer/analysts who had been identified by these content experts as having gained knowledge of EHR sources of data to identify virtual visits and their characteristics (eg, whether the visit mode was telephone or video based, completed or not). In the process of conducting this scoping review, we identified KP Regional and National Program Office staff who were also exploring HealthConnect datasets to identify virtual visits for operations management and quality improvement initiatives.

Our primary goal in these discussions was to gain a thorough understanding of EHR data sources to ensure complete and accurate capture of virtual visit utilization. A secondary goal was to understand how to distinguish fully executed from “incomplete” virtual visits, which could affect the accuracy of study measures, such as ancillary services orders by providers.

Scoping Review of Electronic Health Record Data Sources for Common Data Model Development

Our second step in CDM development was a scoping review of potential EHR data sources as well as opportunities and challenges in use of EHR data for specifying study measures. The first scoping review discovered the range of visit modes that had been implemented at the KP regions; however, that review raised concerns as to how EHR data might accurately distinguish visit modes [eg, virtual vs. in-person and type of virtual mode (telephone visit or video visit)]. In addition, experience among study Investigators and Programmers suggested a number of extant VDW assets might be adapted for accurate measurement of proposed patient-level, provider-level, and system-level measures (eg, factors affecting choice of a virtual versus in-person visit or factors affecting patient likelihood of fulfilling a provider order for a prescription medication).

Our goals in this second scoping review were to identify: (1) programming and analytic experts in KP quality improvement and operations management departments who were exploring use of KP HealthConnect data for identifying virtual visits; and (2) validated data assets and measures in the HCSRN VDW and CESR VDW to incorporate into our study CDM, rather than developing those measures from scratch. The Investigator, Programmer, and Project Manager Teams met on a regular basis to review the availability of existing data assets. Those assets included algorithms as well as extant datasets that might be linked, or linkable, to the CDM at the patient, provider, or clinic levels. Examples included adaptation of an HCSRN VDW algorithm to compute the Charlson Comorbidity Index6,7 and availability of an “area deprivation index”8,9 for linking patient residence to area-level socioeconomic status (SES).
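To illustrate the second kind of reuse, the sketch below links a patient-level CDM table to an area deprivation index lookup by census tract of residence. The dataset and variable names (cdm.vv_pat_db, ref.adi_by_tract, census_tract, adi_natrank) are hypothetical stand-ins, not the study’s actual specifications.

```sas
/* Illustrative linkage of patient records to an area deprivation index (ADI)
   by census tract of residence. All dataset and variable names are
   hypothetical placeholders for the study's actual assets. */
proc sql;
  create table pat_with_adi as
  select p.*, a.adi_natrank               /* national ADI percentile for the tract */
  from cdm.vv_pat_db as p
       left join ref.adi_by_tract as a
       on p.census_tract = a.census_tract;
quit;
```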

Retrospective Chart Review of Common Data Model Accuracy

We conducted 2 rounds of chart reviews to confirm the accuracy of the CDM. Round 1 focused on validating visit mode (ie, chat, phone, video). Round 2 focused on confirming the “primary” visit diagnosis, any ancillary service order associated with the visit, and fulfillment of that service. Chart reviews spanned the study period.

For Round 1, visits for review were sampled from strata defined by visit year, visit mode, and department. For Round 2, visits for review were sampled separately for selected clinical conditions of interest [neck or back pain (NBP), urinary tract infection (UTI), major depression (DEP)], stratified by the same criteria. Chart reviews were conducted independently at each KP region by trained research assistants. Due to budget and time constraints, the sample of visits for review in each round was limited to <450 total across the 3 KP regions, and only one reviewer was assigned to abstract information from each visit.
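A stratified draw of this kind could be sketched with PROC SURVEYSELECT as below; the dataset name, strata variables, and per-stratum sample size are illustrative assumptions, not the study’s actual sampling plan.

```sas
/* Illustrative stratified sampling of visits for chart review.
   Dataset and variable names and the per-stratum size are assumptions. */
proc sort data=cdm.vv_enc_db out=visits_sorted;
  by visit_year visit_mode department;        /* stratification variables */
run;

proc surveyselect data=visits_sorted out=chart_review_sample
                  method=srs sampsize=5 seed=20210601 selectall;
  strata visit_year visit_mode department;    /* selectall keeps small strata intact */
run;
```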

RESULTS

The initial virtual visit scoping review resulted in a set of tables for each of the 3 KP regions that outlined design and implementation of virtual visits relevant to our study’s measurement requirements (multisite comparison summarized in Supplement Table S1, Supplemental Digital Content 1, https://links.lww.com/MLR/C605; KPMAS details as an example summarized in Supplement Table S2, Supplemental Digital Content 1, https://links.lww.com/MLR/C605). These tables provided guidance to the Programmer Team to develop a data collection strategy from relevant HealthConnect and HCSRN VDW databases, data elements, and code structures (Supplement Table S3, Supplemental Digital Content 1, https://links.lww.com/MLR/C605).

The next scoping review, of the HCSRN and CESR VDWs at the beginning of our study, revealed that while VDW databases might be excellent sources of covariate measures that we could incorporate into our CDM (eg, patient age and sex, comorbidities, residential area SES), they would be an inadequate source of the visit mode measures outlined in our study research plan. We determined that virtual visit measurement specifications, including identification of virtual visit mode subtypes (synchronous chats, telephone visits, and video visits) and whether a visit was completed or not, would be most validly measured if derived directly from HealthConnect databases. This effort required a substantial, unanticipated redirection of study resources toward original programming of study measures; however, retrospective medical record review affirmed the validity of the effort.
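The general shape of such a derivation, with placeholder code values standing in for the site-specific HealthConnect encounter type codes (which differed by KP region and are not reproduced here), might look like the following sketch.

```sas
/* Illustrative shape of visit-mode derivation from an EHR encounter extract.
   The source table and the encounter type code values are placeholders;
   the actual HealthConnect codes differed by KP region. */
data enc_with_mode;
  set hc.encounter_extract;                  /* hypothetical HealthConnect extract */
  length visit_mode $12;
  if      enc_type_code in ('VIDEO_CODE') then visit_mode = 'video';
  else if enc_type_code in ('PHONE_CODE') then visit_mode = 'telephone';
  else if enc_type_code in ('CHAT_CODE')  then visit_mode = 'chat';
  else visit_mode = 'in-person';
run;
```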

With guidance from the Investigator Team, the Programmer Team constructed the Virtual Visit CDM Encounters Dataset from HealthConnect datasets, variables, and codes (Fig. 1) in a manner that was consistent across the 3 KP regions, but also recognized site-specific coding differences that were needed to properly identify virtual visits.

VV_PAT_DB: This dataset has one record per KP member included in the study. Inclusion criteria are: (1) at least 1 month of benefit eligibility in the period from January 2016 through June 2021; and (2) at least 19 years of age as of the first month of benefit eligibility in this period. Additional datasets were implemented for comorbidity measures, insurance product and cost-sharing, and distance to clinic. These datasets were primarily sourced through the HCSRN and CESR VDWs.

VV_ENC_DB: This dataset has one record per encounter per included KP member. An encounter record consists of information related to the encounter appointment and its fulfillment (the actual visit). This dataset is primarily sourced from HealthConnect tables.

VV_ENC_ATTRIB_DB: This dataset has one record per included visit that identifies important attributes of each encounter’s appointment and visit components, such as who initiated the appointment, when the appointment was made, whether the appointment was fulfilled, which provider attended the visit, and the primary diagnosis associated with the visit. This dataset is primarily sourced from HealthConnect tables.

VV_ENC_NBP_DB, etc.: These datasets represent additional visit details, including ancillary services orders and fulfillments, for clinical conditions selected for focused study of variation in provider ordering behavior, and patient fulfillment behavior, for related ancillary services: NBP, UTI, and DEP.10–13

FIGURE 1. Virtual visits study common data model: component dataset relationships. KPCO indicates Kaiser Permanente Colorado; KPGA, Kaiser Permanente Georgia; KPMAS, Kaiser Permanente Mid-Atlantic States.
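To illustrate how the component datasets relate (Fig. 1), the sketch below joins member-level, encounter-level, and visit-attribute records; the key variable names (mrn, enc_id) and the selected columns are assumptions for illustration only.

```sas
/* Illustrative join across CDM components: one record per member in
   VV_PAT_DB, many encounters per member in VV_ENC_DB, and one attribute
   record per visit in VV_ENC_ATTRIB_DB. Key and column names are assumed. */
proc sql;
  create table analytic_visits as
  select p.mrn,
         e.enc_id,
         e.enc_date,
         a.visit_mode,
         a.primary_dx
  from cdm.vv_pat_db as p
       inner join cdm.vv_enc_db as e        on p.mrn = e.mrn
       left  join cdm.vv_enc_attrib_db as a on e.enc_id = a.enc_id;
quit;
```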

Further details of each of the CDM datasets are presented in the Supplement (Tables S4–S9, Supplemental Digital Content 1, https://links.lww.com/MLR/C605). Key decisions that informed contents of specific CDM datasets are summarized in the Supplement (Table S10, Supplemental Digital Content 1, https://links.lww.com/MLR/C605).

Tables 1–3 profile the population and ambulatory visit data in the CDM, both overall and separately for each of the 3 KP regions in our study. Compilation of these tables in an expeditious manner was achieved through distributed SAS code applied to the standardized multisite data available in the study’s CDM. Overall, our CDM contains information on 7,476,604 person-years (Table 1), with 2,966,112 virtual visits (Table 2) and 10,004,195 in-person visits (Table 3) during the study period.
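For example, a tabulation of the kind underlying Table 1 could be distributed as a short program that each region runs against its local CDM copy; the dataset and variable names below are illustrative assumptions rather than the study’s actual program.

```sas
/* Illustrative distributed tabulation: person-years by year and age group,
   run unchanged at each region against its local CDM copy. Names assumed. */
proc freq data=cdm.vv_person_years;
  tables year*age_group / missing norow nocol nopercent;
run;
```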

TABLE 1. Count of Person-years in the Common Data Model for 2017–June 2021, n (%)

CDM variable or derived measure | Overall | KPCO | KPGA | KPMAS
Total | 7,476,604 | 2,420,947 | 1,493,214 | 3,562,443
Age group
  19–34 | 2,274,766 (30.4) | 688,036 (28.4) | 460,429 (30.8) | 1,126,301 (31.6)
  35–44 | 1,990,752 (26.6) | 626,392 (25.9) | 427,871 (28.7) | 936,489 (26.3)
  45–64 | 1,980,332 (26.5) | 618,286 (25.5) | 427,641 (28.6) | 934,405 (26.2)
  65–89 | 1,230,754 (16.5) | 488,233 (20.2) | 177,273 (11.9) | 565,248 (15.9)
Sex
  Female | 4,025,102 (53.8) | 1,286,040 (53.1) | 825,302 (55.3) | 1,913,760 (53.7)
  Male | 3,450,749 (46.2) | 1,134,329 (46.9) | 667,744 (44.7) | 1,648,676 (46.3)
  Other | 0 (0) | 0 (0) | 0 (0) | 0 (0)
  Unknown | 753 (0) | 578 (0) | 168 (0) | 7 (0)
Race/ethnicity
  White | 2,914,050 (39.0) | 1,505,938 (62.2) | 462,427 (31.0) | 945,685 (26.5)
  Black | 1,936,581 (25.9) | 108,726 (4.5) | 621,307 (41.6) | 1,206,548 (33.9)
  Asian | 579,087 (7.7) | 88,582 (3.7) | 83,139 (5.6) | 407,366 (11.4)
  HI/PI | 9673 (0.1) | 6578 (0.3) | 824 (0.1) | 2271 (0.1)
  American Indian | 17,021 (0.2) | 9645 (0.4) | 2271 (0.2) | 5105 (0.1)
  Multiple | 142,830 (1.9) | 17,881 (0.7) | 7412 (0.5) | 117,537 (3.3)
  Other | 102,251 (1.4) | 76,295 (3.2) | 0 (0) | 25,956 (0.7)
  Unknown | 901,624 (12.1) | 227,635 (9.4) | 253,850 (17) | 420,139 (11.8)
  Hispanic | 873,487 (11.7) | 379,667 (15.7) | 61,984 (4.2) | 431,836 (12.1)
Charlson Comorbidity Index
  0 | 3,605,059 (48.2) | 1,198,959 (49.5) | 679,406 (45.5) | 1,726,694 (48.5)
  1 | 726,485 (9.7) | 256,767 (10.6) | 142,192 (9.5) | 327,526 (9.2)
  ≥2 | 674,510 (9.0) | 250,541 (10.3) | 127,439 (8.5) | 296,530 (8.3)
  Missing | 2,470,550 (33) | 714,680 (29.5) | 544,177 (36.4) | 1,211,693 (34)
ADI quartile (national percentile)
  Lowest SES | 529,513 (7.1) | 48,701 (2.0) | 235,133 (15.7) | 245,679 (6.9)
  Lower mid-SES | 1,024,746 (13.7) | 189,003 (7.8) | 362,877 (24.3) | 472,866 (13.3)
  Upper mid-SES | 2,257,250 (30.2) | 865,603 (35.8) | 499,061 (33.4) | 892,586 (25.1)
  Highest SES | 3,527,893 (47.2) | 1,295,302 (53.5) | 384,797 (25.8) | 1,847,794 (51.9)
  Missing | 137,202 (1.8) | 22,338 (0.9) | 11,346 (0.8) | 103,518 (2.9)
Distance to clinic
  <5 miles | 4,657,547 (62.3) | 1,774,263 (73.3) | 662,374 (44.4) | 2,220,910 (62.3)
  5–9 miles | 1,702,456 (22.8) | 401,136 (16.6) | 496,410 (33.2) | 804,910 (22.6)
  ≥10 miles | 1,083,011 (14.5) | 235,699 (9.7) | 329,590 (22.1) | 517,722 (14.5)
  Missing | 33,590 (0.4) | 9849 (0.4) | 4840 (0.3) | 18,901 (0.5)
Enrollment in a high deductible health plan
  Yes | 3,130,455 (41.9) | 1,396,179 (57.7) | 846,494 (56.7) | 887,782 (24.9)
  No | 4,268,737 (57.1) | 1,020,236 (42.1) | 635,927 (42.6) | 2,612,574 (73.3)
  Missing | 77,412 (1) | 4532 (0.2) | 10,793 (0.7) | 62,087 (1.7)
Prior prescription mail order use
  Yes | 2,398,972 (32.1) | 1,112,334 (45.9) | 247,695 (16.6) | 1,038,943 (29.2)
  No | 5,077,632 (67.9) | 1,308,613 (54.1) | 1,245,519 (83.4) | 2,523,500 (70.8)
Year
  2017 | 1,478,037 (19.8) | 515,044 (21.3) | 293,893 (19.7) | 669,100 (18.8)
  2018 | 1,596,068 (21.3) | 510,772 (21.1) | 346,820 (23.2) | 738,476 (20.7)
  2019 | 1,515,240 (20.3) | 490,278 (20.3) | 296,363 (19.8) | 728,599 (20.5)
  2020 | 1,472,395 (19.7) | 469,464 (19.4) | 277,382 (18.6) | 725,549 (20.4)
  2021 | 1,414,864 (18.9) | 435,389 (18.0) | 278,756 (18.7) | 700,719 (19.7)

A “person-year” is defined as a person with any benefit eligibility in a given year who was aged 19 years or older as of January 1 of that year (2017–2021).

This is a “population” denominator. No utilization in the year is required.

2021 counts are only for January–June 2021.

ADI indicates Area Deprivation Index; HI/PI, Hawaiian/Pacific Islander; KPCO, Kaiser Permanente Colorado; KPGA, Kaiser Permanente Georgia; KPMAS, Kaiser Permanente Mid-Atlantic States; SES, socioeconomic status.


TABLE 2. Count of Virtual Visits in the Common Data Model for 2017–June 2021, n (%)

CDM variable or derived measure | Overall | KPCO | KPGA | KPMAS
Total | 2,966,112 | 1,039,958 | 510,357 | 1,415,797
Age group
  19–34 | 608,766 (20.5) | 244,661 (23.5) | 93,169 (18.3) | 270,936 (19.1)
  35–44 | 720,752 (24.3) | 252,176 (24.2) | 137,462 (26.9) | 331,114 (23.4)
  45–64 | 858,992 (29) | 254,328 (24.5) | 176,559 (34.6) | 428,105 (30.2)
  65–89 | 777,602 (26.2) | 288,793 (27.8) | 103,167 (20.2) | 385,642 (27.2)
Sex
  Female | 1,905,934 (64.3) | 695,868 (66.9) | 335,476 (65.7) | 874,590 (61.8)
  Male | 1,060,031 (35.7) | 343,968 (33.1) | 174,856 (34.3) | 541,207 (38.2)
  Other | 0 (0) | 0 (0) | 0 (0) | 0 (0)
  Unknown | 147 (0) | 122 (0) | 25 (0) | 0 (0)
Race/ethnicity
  White | 1,261,192 (42.5) | 706,542 (67.9) | 164,066 (32.1) | 390,584 (27.6)
  Black | 931,119 (31.4) | 57,088 (5.5) | 265,327 (52) | 608,704 (43)
  Asian | 203,555 (6.9) | 28,329 (2.7) | 21,715 (4.3) | 153,511 (10.8)
  HI/PI | 3584 (0.1) | 2467 (0.2) | 367 (0.1) | 750 (0.1)
  American Indian | 7604 (0.3) | 4519 (0.4) | 949 (0.2) | 2136 (0.2)
  Multiple | 65,276 (2.2) | 8959 (0.9) | 2696 (0.5) | 53,621 (3.8)
  Other | 33,775 (1.1) | 24,507 (2.4) | 0 (0) | 9268 (0.7)
  Unknown | 89,833 (3) | 31,925 (3.1) | 36,160 (7.1) | 21,748 (1.5)
  Hispanic | 370,174 (12.5) | 175,622 (16.9) | 19,077 (3.7) | 175,475 (12.4)
Charlson Comorbidity Index
  0 | 1,607,439 (54.2) | 581,939 (56) | 268,153 (52.5) | 757,347 (53.5)
  1 | 492,895 (16.6) | 173,576 (16.7) | 93,268 (18.3) | 226,051 (16)
  ≥2 | 604,286 (20.4) | 200,834 (19.3) | 99,923 (19.6) | 303,529 (21.4)
  Missing | 261,492 (8.8) | 83,609 (8.0) | 49,013 (9.6) | 128,870 (9.1)
ADI quartile (national percentile)
  Lowest SES | 208,592 (7) | 19,523 (1.9) | 86,599 (17) | 102,470 (7.2)
  Lower mid-SES | 406,984 (13.7) | 71,116 (6.8) | 132,901 (26) | 202,967 (14.3)
  Upper mid-SES | 926,665 (31.2) | 377,731 (36.3) | 174,048 (34.1) | 374,886 (26.5)
  Highest SES | 1,401,710 (47.3) | 564,113 (54.2) | 114,684 (22.5) | 722,913 (51.1)
  Missing | 22,161 (0.7) | 7475 (0.7) | 2125 (0.4) | 12,561 (0.9)
Distance to clinic
  <5 miles | 1,835,722 (61.9) | 747,947 (71.9) | 214,758 (42.1) | 873,017 (61.7)
  5–9 miles | 693,745 (23.4) | 184,308 (17.7) | 172,455 (33.8) | 336,982 (23.8)
  ≥10 miles | 428,080 (14.4) | 105,814 (10.2) | 122,032 (23.9) | 200,234 (14.1)
  Missing | 8565 (0.3) | 1889 (0.2) | 1112 (0.2) | 5564 (0.4)
Enrollment in a high deductible health plan
  Yes | 993,826 (33.5) | 518,969 (49.9) | 235,123 (46.1) | 239,734 (16.9)
  No | 1,955,897 (65.9) | 520,905 (50.1) | 265,513 (52) | 1,169,479 (82.6)
  Missing | 16,389 (0.6) | 84 (0) | 9721 (1.9) | 6584 (0.5)
Prior prescription mail order use
  Yes | 1,646,326 (55.5) | 694,269 (66.8) | 198,576 (38.9) | 753,481 (53.2)
  No | 1,319,786 (44.5) | 345,689 (33.2) | 311,781 (61.1) | 662,316 (46.8)
Year
  2017 | 265,678 (9.0) | 141,175 (13.6) | 22,853 (4.5) | 101,650 (7.2)
  2018 | 317,627 (10.7) | 159,016 (15.3) | 42,526 (8.3) | 116,085 (8.2)
  2019 | 361,773 (12.2) | 161,842 (15.6) | 47,213 (9.3) | 152,718 (10.8)
  2020 | 1,406,549 (47.4) | 419,018 (40.3) | 265,302 (52) | 722,229 (51.0)
  2021 | 614,485 (20.7) | 158,907 (15.3) | 132,463 (26.0) | 323,115 (22.8)

A virtual visit in a year for a person is counted only if the person is in the denominator for the year in Table 1.

Only completed visits are counted.

2021 counts are only for January–June 2021.

ADI indicates Area Deprivation Index; HI/PI, Hawaiian/Pacific Islander; KPCO, Kaiser Permanente Colorado; KPGA, Kaiser Permanente Georgia; KPMAS, Kaiser Permanente Mid-Atlantic States; SES, socioeconomic status.


TABLE 3. Count of In-person Visits in the Common Data Model for 2017–June 2021, n (%)

CDM variable or derived measure | Overall | KPCO | KPGA | KPMAS
Total | 10,004,195 | 2,909,086 | 2,112,559 | 4,982,550
Age group
