OpenSAFELY: Representativeness of Electronic Health Record platform OpenSAFELY-TPP data compared to the population of England.

Abstract

Background: Since its inception in March 2020, data from the OpenSAFELY-TPP electronic health record platform has been used for more than 50 studies relating to the global COVID-19 emergency. OpenSAFELY-TPP data is derived from practices in England using SystmOne software, and has been used for the majority of these studies. We set out to investigate the representativeness of OpenSAFELY-TPP data by comparing it to national population estimates.

Methods: With the approval of NHS England, we describe the age, sex, Index of Multiple Deprivation (IMD) and ethnicity of the OpenSAFELY-TPP population compared to national estimates from the Office for National Statistics (ONS). The five leading causes of death occurring between 1 January 2020 and 31 December 2020 were also compared to deaths registered in England during the same period.

Results: Despite regional variations, the OpenSAFELY-TPP population is largely representative of the general population of England in terms of IMD (all within 1.1 percentage points), age, sex (within 0.1 percentage points), ethnicity and causes of death. The proportions of the five leading causes of death are broadly similar to those reported by ONS (all within 1 percentage point).

Conclusions: Data made available via OpenSAFELY-TPP is broadly representative of the English population.

Summary: Users of OpenSAFELY must consider the issues of representativeness, generalisability and external validity associated with using TPP data for health research. Although the coverage of TPP practices varies regionally across England, TPP registered patients are generally representative of the English population as a whole in terms of key demographic characteristics.

Competing Interest Statement

BG has received research funding from the Laura and John Arnold Foundation, the NHS National Institute for Health Research (NIHR), the NIHR School of Primary Care Research, the NIHR Oxford Biomedical Research Centre, the Mohn-Westlake Foundation, NIHR Applied Research Collaboration Oxford and Thames Valley, the Wellcome Trust, the Good Thinking Foundation, Health Data Research UK, the Health Foundation, the World Health Organisation, UKRI, Asthma UK, the British Lung Foundation, and the Longitudinal Health and Wellbeing strand of the National Core Studies programme; he also receives personal income from speaking and writing for lay audiences on the misuse of science.

Funding Statement

This work was jointly funded by UKRI [COV0076; MR/V015737/1], NIHR, Asthma UK-BLF and the Longitudinal Health and Wellbeing strand of the National Core Studies programme. The OpenSAFELY data science platform is funded by the Wellcome Trust. BG's work on better use of data in healthcare more broadly is currently funded in part by the Wellcome Trust, NIHR Oxford Biomedical Research Centre, NIHR Applied Research Collaboration Oxford and Thames Valley, and the Mohn-Westlake Foundation; all DataLab staff are supported by BG's grants on this work. LS reports grants from Wellcome, MRC, NIHR, UKRI, British Council, GSK, British Heart Foundation, and Diabetes UK outside this work. AS is employed by LSHTM on a fellowship sponsored by GSK. KB holds a Wellcome Senior Research Fellowship (220283/Z/20/Z). BMK is also employed by NHS England, working on medicines policy, and is clinical lead for primary care medicines data.

Author Declarations

I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.

Yes

The details of the IRB/oversight body that provided approval or exemption for the research described are given below:

NHS England is the data controller; TPP is the data processor; and the researchers on OpenSAFELY are acting with the approval of NHS England. This implementation of OpenSAFELY is hosted within the TPP environment which is accredited to the ISO 27001 information security standard and is NHS IG Toolkit compliant;(22,23) patient data has been pseudonymised for analysis and linkage using industry standard cryptographic hashing techniques; all pseudonymised datasets transmitted for linkage onto OpenSAFELY are encrypted; access to the platform is via a virtual private network (VPN) connection, restricted to a small group of researchers; the researchers hold contracts with NHS England and only access the platform to initiate database queries and statistical models; all database activity is logged; only aggregate statistical outputs leave the platform environment following best practice for anonymisation of results such as statistical disclosure control for low cell counts.(24) The OpenSAFELY research platform adheres to the obligations of the UK General Data Protection Regulation (GDPR) and the Data Protection Act 2018. In March 2020, the Secretary of State for Health and Social Care used powers under the UK Health Service (Control of Patient Information) Regulations 2002 (COPI) to require organisations to process confidential patient information for the purposes of protecting public health, providing healthcare services to the public and monitoring and managing the COVID-19 outbreak and incidents of exposure; this sets aside the requirement for patient consent.(25) Taken together, these provide the legal bases to link patient datasets on the OpenSAFELY platform. GP practices, from which the primary care data are obtained, are required to share relevant health information to support the public health response to the pandemic, and have been informed of the OpenSAFELY analytics platform. 
This study was approved by the Health Research Authority (REC reference 20/LO/0651) and by the LSHTM Ethics Board (reference 21863).

I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.

Yes

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).

Yes

I have followed all appropriate research reporting guidelines and uploaded the relevant EQUATOR Network research reporting checklist(s) and other pertinent material as supplementary files, if applicable.

Yes

Data Availability

Access to the underlying identifiable and potentially re-identifiable pseudonymised electronic health record data is tightly governed by various legislative and regulatory frameworks, and restricted by best practice. The data in OpenSAFELY is drawn from General Practice data across England where TPP is the Data Processor. TPP developers (CB, JC, JP, FH, and SH) initiate an automated process to create pseudonymised records in the core OpenSAFELY database, which are copies of key structured data tables in the identifiable records. These are linked onto key external data resources that have also been pseudonymised via SHA-512 one-way hashing of NHS numbers using a shared salt. DataLab developers and PIs (BG, LS, CEM, SB, AJW, KW, WJH, HJC, DE, PI, SD, GH, BBC, RMS, ID, KB, SE, EJW and CTR) holding contracts with NHS England have access to the OpenSAFELY pseudonymised data tables as needed to develop the OpenSAFELY tools. These tools in turn enable researchers with OpenSAFELY Data Access Agreements to write and execute code for data management and data analysis without direct access to the underlying raw pseudonymised patient data, and to review the outputs of this code.
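The salted one-way hashing described above can be illustrated with a minimal sketch. It is not the OpenSAFELY or TPP implementation: the function name `pseudonymise_nhs_number`, the salt value, the choice to prepend the salt to the identifier, and the example NHS number are all assumptions made for illustration. The only details taken from the text are the use of SHA-512, the NHS number as the hashed identifier, and a salt shared between the datasets to be linked.

```python
import hashlib


def pseudonymise_nhs_number(nhs_number: str, salt: bytes) -> str:
    """Return a SHA-512 hex digest of a salted NHS number.

    Hypothetical helper: prepending the salt and ASCII-encoding the digits
    are illustrative assumptions, not the scheme used on the platform.
    """
    # Keep digits only, so "123 456 7890" and "1234567890" hash identically.
    digits = "".join(ch for ch in nhs_number if ch in "0123456789")
    if len(digits) != 10:
        raise ValueError("expected a 10-digit NHS number")
    return hashlib.sha512(salt + digits.encode("ascii")).hexdigest()


# Made-up salt and identifier, for illustration only.
shared_salt = b"example-shared-salt"

# Two data sources hash the same identifier independently with the shared salt.
primary_care = {pseudonymise_nhs_number("123 456 7890", shared_salt): {"age_band": "40-49"}}
external = {pseudonymise_nhs_number("1234567890", shared_salt): {"cause_of_death": None}}

# Records can be joined on the pseudonymous key without exchanging NHS numbers.
linked = {
    key: {**primary_care[key], **external[key]}
    for key in primary_care.keys() & external.keys()
}
```

Because both sources apply the same salt, identical NHS numbers map to identical digests and can be joined on that key, while a party without the salt cannot easily rebuild the mapping by hashing candidate NHS numbers.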
