Discrepancies Between Preliminary and Final COVID-19 Mortality Data – the Case of Serbia

Since the start of the COVID-19 pandemic, countries have scrambled to set up data collection and dissemination pipelines to produce and publish pandemic data quickly. These data include the number of COVID-related deaths, which countries began reporting daily. Online databases that aggregated these daily figures emerged quickly.

With its aging population [1] and a healthcare system that requires substantial investment [2], Serbia was initially considered vulnerable to the impact of the pandemic. However, from the start of the pandemic in March 2020, the daily reported mortality data kept pointing to a different conclusion: Serbia had few COVID-related deaths relative to its number of COVID-19 cases. Serbia's case fatality rate (CFR) remained below 0.95% at the end of 2020, and by September 2022 it had decreased further to 0.72% [3]. Based on the available data, Serbia's COVID-19 outcomes have been comparatively better than those of its neighbouring countries. In Europe, Serbia was among the best-performing nations, with fewer COVID-19 deaths per confirmed COVID-19 case even than Norway. In contrast, countries with demographic and socioeconomic profiles similar to Serbia's, such as Bosnia and Herzegovina and North Macedonia, had CFRs that were more than three times higher [4]. However, once excess mortality data were considered [5], [6], [7], it became evident that Serbia's COVID-19 outcomes were more comparable to those of neighbouring countries than to those of Norway or Denmark [8].
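For readers less familiar with the measure, the CFR quoted above is the number of reported COVID-19 deaths divided by the number of confirmed cases, expressed as a percentage. The following is a minimal sketch of that arithmetic; the function name and the counts are hypothetical and are not taken from the Serbian data.

```python
def case_fatality_rate(deaths: int, confirmed_cases: int) -> float:
    """Return the case fatality rate as a percentage of confirmed cases."""
    if confirmed_cases <= 0:
        raise ValueError("confirmed_cases must be positive")
    return 100.0 * deaths / confirmed_cases


# Hypothetical counts, chosen only to illustrate the arithmetic.
print(f"{case_fatality_rate(72, 10_000):.2f}%")  # prints 0.72%
```

Because the numerator is the reported death count, any underreporting of deaths lowers the CFR directly, which is why the measure is sensitive to the quality of the preliminary data.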

This study compares the preliminary COVID-19 mortality data from Serbia with the official vital statistics in order to determine whether discrepancies exist. We review studies that use preliminary data on COVID-19 mortality in Serbia to determine the extent to which potentially suspect data are being employed in research. We discuss the implications of these findings and provide recommendations for researchers who encounter anomalous COVID-19 mortality data from Serbia in public datasets, followed by general recommendations for handling potentially anomalous COVID-19 mortality data from other countries. By analysing the discrepancies between the final official mortality data and the preliminary daily reported data, we aim to contribute to the scientific community's understanding of the situation. We highlight the presence of problematic data that continue to be published in major databases and used widely in academic publications. In addition, we briefly compare the preliminary and final data from other countries to determine whether such discrepancies exist elsewhere and to what extent Serbia stands out.

We compared the aggregated preliminary, daily reported COVID-19 mortality data from Serbia with the final vital statistics data, which are considered more reliable, to evaluate potential differences between the two datasets. The comparison was possible only for the first two years of the pandemic (2020 and 2021) because of the vital statistics publishing schedule. We conducted a month-by-month comparison of the two datasets from March 2020 to the end of 2021. The two pipelines for gathering mortality data are illustrated in Figure 1.
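The month-by-month comparison described above can be sketched as follows. This is an illustrative outline only, not the authors' actual procedure: the function names, the data layout, and all numbers are assumptions made for the example, and none of the figures are real Serbian data.

```python
from collections import defaultdict
from datetime import date


def monthly_totals(daily_deaths):
    """Sum daily reported deaths into (year, month) totals."""
    totals = defaultdict(int)
    for day, deaths in daily_deaths.items():
        totals[(day.year, day.month)] += deaths
    return dict(totals)


def compare_monthly(preliminary, final):
    """For each month, return (month, preliminary, final, difference, ratio).

    The ratio is final / preliminary, or None when no preliminary
    deaths were reported for that month.
    """
    rows = []
    for month in sorted(set(preliminary) | set(final)):
        prelim = preliminary.get(month, 0)
        fin = final.get(month, 0)
        ratio = fin / prelim if prelim else None
        rows.append((month, prelim, fin, fin - prelim, ratio))
    return rows


# Hypothetical illustrative numbers, NOT actual Serbian data.
daily = {
    date(2020, 3, 20): 1,
    date(2020, 3, 25): 2,
    date(2020, 4, 1): 3,
    date(2020, 4, 2): 4,
}
final_monthly = {(2020, 3): 5, (2020, 4): 14}

for row in compare_monthly(monthly_totals(daily), final_monthly):
    print(row)
```

Aggregating the daily series to calendar months first is what makes the two sources comparable in this sketch, since the example assumes the final counts are available as monthly totals.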

The need for faster distribution of pandemic-related data led to the development of a new centralized COVID-19 surveillance system, using a Unique and Centralized Software solution named ‘IS COVID-19’. This database of the Serbian Ministry of Health is run by the Institute of Public Health of Serbia – Dr Milan Jovanovic Batut and the Office of Information Technology and eGovernment [9]. The database is used to report the daily COVID-19 death count (among other COVID-19 statistics). These statistics are reported by government officials, distributed by the media, and aggregated into online COVID-19 databases. The first (daily reported) COVID-19 death in Serbia occurred on 20 March 2020.

The gathering of vital statistics in Serbia is regulated by the Law on Official Statistics and the Official Statistics Programme for 2021-2025 (and its earlier edition for 2016-2020) [10]. As the exclusive provider of official government statistics, the Statistical Office of the Republic of Serbia (SORS) is responsible for designing survey instruments and managing their implementation. Following rigorous quality control measures, including analysis and tabulation, the statistical data are released according to an established calendar and through designated publication channels [10]. The official mortality data for the preceding year are routinely released in July. At the time of writing, the authors had access to the final mortality data for 2020 and 2021 [11]. The detailed mortality dataset is available in the offline mortality database maintained by the World Health Organization (WHO), which currently includes data up to 2020 [12]. The authors obtained detailed mortality data for both 2020 and 2021 through a special request made to SORS [11].

To assess the potential impact of the preliminary data from Serbia on research, the authors conducted a review of published research papers that had used these data. We used a two-pronged approach. The first prong involved a direct examination of the sources of the data, specifically the Serbian government portals covid19.rs and covid19.data.gov.rs. The second prong examined databases that aggregate the preliminary data from Serbia.

The search period extended from March 2020, when preliminary data for Serbia began to be reported, to September 2022, the time of writing.

The present study identified three main databases that contained preliminary data on COVID-19 mortality for Serbia: the Johns Hopkins database [13], Our World in Data [4], and Worldometer [14]. While there are other databases that aggregate daily reports from Serbia, we chose these three for their span, accessibility, and widespread use.

To identify research that may have been affected by the questionable preliminary mortality data, the authors conducted a search using two databases: Web of Science (WoS) and Google Scholar (GS). Appendix A contains a detailed description of the search strategy, including search strings used in the analysis.

The authors conducted the GS search using the Publish or Perish software [15] and used Zotero [16] to perform DOI lookup and export bibliographic data.

The study included research studies that explicitly focused on Serbia and used the preliminary data. In addition, cross-country studies that incorporated the preliminary COVID-19 mortality data from Serbia in their analyses were also included.

A significant number of research studies that cited one of the databases were easily excluded, such as those related to specific countries or regions that were not relevant to Serbia. Numerous research papers related to the pandemic cited the data from the databases in their introductory sections to provide overall figures concerning the pandemic, including the number of deaths. Research studies that reported on worldwide deaths caused by COVID-19 but did not include Serbia in their analyses were excluded. Although the worldwide COVID-19 death count includes the underreported numbers from Serbia, which technically could make the reported figure incorrect, we assessed that this factor alone did not have a significant impact on the research. Given the population size of Serbia, the potential underreporting could not have resulted in a significant difference in worldwide numbers. Nonetheless, we included research studies that reported the number of COVID-19 deaths specifically in Serbia because of the substantial absolute difference in reported numbers.

All studies that employed the preliminary COVID-19 mortality data from Serbia in statistical analysis were included, regardless of their methodology (even though some analyses used methodologies that were more resistant to outliers than others).

To determine whether Serbia was included in the analysis, we checked whether it was listed in the main text, tables, figures, and supplementary data (where available). Several research studies used data from the affected databases but did not explicitly mention the countries that were included in their analyses. We excluded those studies unless they analysed a considerable number of countries, in which case it was reasonable to assume that Serbia was included in the analysis. The complete list of studies is provided in Appendix B, Table B1 (research that includes mortality data from Serbia) and Table B2 (research that likely includes mortality data from Serbia).

We collected the number of times that each identified research study and paper was cited in both the WoS Core Collection and GS. The citation count was used as a criterion for selecting studies to address in the discussion section of this paper. The combined number of citations was calculated by summing the citation counts, without deduplicating the citing papers, so a paper that cites several of the identified studies is counted once for each of them.
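The difference between a combined citation count and the number of distinct citing papers can be illustrated with a small sketch. The study labels and citing-paper identifiers below are hypothetical, and the sketch assumes that the summation runs across the identified studies.

```python
# Hypothetical mapping: each identified study -> the set of papers citing it.
citing_papers = {
    "study_A": {"p1", "p2", "p3"},
    "study_B": {"p2", "p4"},
}

# Combined count: sum the per-study citation counts (no deduplication).
combined = sum(len(papers) for papers in citing_papers.values())

# Distinct count: each citing paper is counted once, however many studies it cites.
distinct = len(set().union(*citing_papers.values()))

print(combined, distinct)  # prints 5 4
```

Here paper "p2" cites both studies, so it contributes twice to the combined count but only once to the distinct count.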
