Cholesterol and breast cancer risk: a cohort study using health insurance claims and health checkup databases

Data source and study participants

This study used health insurance claims and health checkup data collected by JMDC Inc., Tokyo, Japan. In Japan, all citizens are covered by a universal health insurance program, and each employer is required to provide its employees with insurance and regular health checkup opportunities. The JMDC Claims Database is an epidemiological claims database that has accumulated claims (inpatient, outpatient, and dispensing) and health checkup data from multiple health insurers since 2005. The cumulative population is approximately 14 million (as of February 2022). The data enable the study of the prevalence and/or incidence rate of any disease in the general population, including healthy people, and allow patients to be tracked across hospital transfers or visits to multiple facilities. The JMDC Claims Database includes information on employee demographics, medical history, drug prescriptions, and hospital claims records based on International Classification of Diseases, 10th Revision (ICD-10) coding.

We recruited 1,283,619 women who were insured between April 2008 and July 2019 and had at least one health checkup during that period. We excluded participants who had been insured for less than one year at the start of follow-up (n = 179,388) to ascertain cancer incidence and medication prescription status prior to the start of follow-up. We also excluded participants with no LDL-C, HDL-C, or triglyceride (TG) data (n = 29,499) and those with a breast cancer diagnosis prior to the start of follow-up (n = 4,523). To eliminate the influence of lipid-modifying agents, we excluded participants who had been prescribed lipid-modifying agents (WHO-ATC code: C10) at least once from 1 year before the start of follow-up to the end of follow-up (n = 114,050). The final analysis cohort consisted of 956,390 women.
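The sequential exclusions described above can be sketched as a simple filter pipeline. This is a minimal illustration, not the authors' code: the `Participant` record and its field names are hypothetical, and the order of the steps simply follows the order in which the criteria are listed in the text.

```python
from dataclasses import dataclass


@dataclass
class Participant:
    # All field names are hypothetical; they mirror the criteria in the text.
    months_insured_before_start: int  # enrollment before the first checkup
    missing_lipid_data: bool          # no LDL-C, HDL-C, or TG measurement
    prior_breast_cancer: bool         # diagnosis before start of follow-up
    lipid_agent_prescribed: bool      # any WHO-ATC C10 prescription in window


def apply_exclusions(cohort):
    """Apply the exclusion criteria in order; return (remaining, counts)."""
    steps = [
        ("insured < 1 year", lambda p: p.months_insured_before_start < 12),
        ("missing lipid data", lambda p: p.missing_lipid_data),
        ("prior breast cancer", lambda p: p.prior_breast_cancer),
        ("lipid-modifying agent", lambda p: p.lipid_agent_prescribed),
    ]
    counts = {}
    remaining = list(cohort)
    for label, is_excluded in steps:
        kept = [p for p in remaining if not is_excluded(p)]
        counts[label] = len(remaining) - len(kept)  # excluded at this step
        remaining = kept
    return remaining, counts
```

Because each step only sees participants who survived the previous one, a participant who meets several criteria is counted once, at the first step that applies.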

Cholesterol levels and other covariates

Age, body mass index (BMI), blood pressure, and fasting laboratory values, including cholesterol and blood glucose levels, were collected at the time of health checkup by using a standardized protocol at each health care institution. Data on smoking status, alcohol consumption, and physical activity were collected at the time of the health checkup by using a self-administered questionnaire. Hormonal medication use was defined using the insurance claims of drug prescriptions.

On the basis of the cholesterol level at the time of the initial health checkup (the start of follow-up), the participants were classified into quintiles, and the group with the lowest cholesterol level was used as the reference. The following definitions were used for the other covariates [10]. Hypertension was defined as a systolic blood pressure of 140 mmHg or higher, a diastolic blood pressure of 90 mmHg or higher, or use of antihypertensive drugs. Diabetes mellitus was defined as a fasting blood glucose level of 125 or higher or use of diabetes medication. Current smokers were participants who had smoked a total of 100 or more cigarettes, or had smoked for 6 months or longer, and had also smoked in the last month. Physical activity was defined as ≥ 30 min of exercise more than 2 times a week or more than 1 h of walking per day. Current hormone users were participants who had been prescribed hormone drugs (WHO-ATC code: G03C estrogens, G03D progestogens, or G03F progestogens and estrogens in combination) at least once during the year before the start of follow-up. The JMDC Claims Database contained no data on hormonal contraceptives for systemic use (WHO-ATC code: G03A) because they are not covered by insurance in Japan.
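The covariate definitions and quintile classification above can be expressed as a few small functions. This is a sketch under stated assumptions: the function names are hypothetical, the glucose threshold is used exactly as written in the text (which does not state its units), and quintile cut points are taken from `statistics.quantiles`, which may differ slightly from the percentile rule used by the authors' software.

```python
import bisect
import statistics


def has_hypertension(sbp, dbp, on_antihypertensives):
    # SBP >= 140 mmHg, DBP >= 90 mmHg, or antihypertensive drug use.
    return sbp >= 140 or dbp >= 90 or on_antihypertensives


def has_diabetes(fasting_glucose, on_diabetes_medication):
    # Threshold taken from the text as written; units are not stated there.
    return fasting_glucose >= 125 or on_diabetes_medication


def assign_quintiles(values):
    """Return a quintile label (1 = lowest, the reference) for each value."""
    cuts = statistics.quantiles(values, n=5)  # four cut points
    # Values equal to a cut point fall into the lower category.
    return [bisect.bisect_left(cuts, v) + 1 for v in values]
```

Quintile 1 then serves as the reference category when the labels are entered into a regression model.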

Case identification

Breast cancer was identified using an algorithm combining the diagnosis code for breast cancer (ICD-10 code: C500 to C506, C508, and C509), breast cancer–specific procedures, radiotherapy, and drugs. Details regarding breast cancer–specific procedures, radiotherapy, and drugs are reported elsewhere [11]. This algorithm has been validated against national cancer registry data for identifying patients with newly diagnosed cancer in a Japanese claims database [11]. In addition, other validation studies conducted in Japan to identify breast cancer by using a claims database reported high accuracy when the definition combined disease codes with cancer treatment codes (surgery, chemotherapy, medication, and radiation procedure) [12, 13]. These studies also suggest that the Japanese claims database can be used to identify incident breast cancer accurately.

The month of breast cancer incidence was defined as the month in which the ICD-10 codes for breast cancer, treatment, and drugs were all recorded in the claims during the insurance coverage period. The day of breast cancer incidence was defined as the 15th day of that month.
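The date rule above can be illustrated with a short function. This sketch reads the sentence literally, requiring a diagnosis code, a treatment record, and a drug record in the same calendar month, and takes the earliest such month; the exact combination rule is the one given in reference [11], so both of these choices, along with the input layout and the labels `"diagnosis"`, `"treatment"`, and `"drug"`, are assumptions made for illustration.

```python
from datetime import date

REQUIRED = {"diagnosis", "treatment", "drug"}  # hypothetical claim labels


def incidence_date(monthly_claims):
    """Return the assigned incidence date, or None if no month qualifies.

    monthly_claims maps (year, month) to the set of claim kinds recorded
    in that month during the insurance coverage period.
    """
    for year, month in sorted(monthly_claims):
        if REQUIRED <= monthly_claims[(year, month)]:
            # The day of incidence is fixed at the 15th of the month.
            return date(year, month, 15)
    return None
```

Fixing the day at mid-month reflects that claims are recorded per month, so the exact day within the month is not available.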

Statistical analysis

The baseline characteristics of the participants are presented using means and standard deviations or medians and interquartile ranges for continuous variables and percentages for categorical variables.

The Cox proportional hazards regression model was used to calculate hazard ratios (HRs) and 95% confidence intervals (CIs) to describe the risk of breast cancer incidence associated with LDL-C, HDL-C, or TG. Person-years of follow-up for each participant were calculated from the date of the first health checkup to the date of breast cancer incidence, withdrawal from insurance due to death or job change, or the end of the study period (July 31, 2019), whichever occurred first. On the basis of the LDL-C, HDL-C, or TG level at the time of the initial health checkup (the start of follow-up), the participants were classified into quintiles, and the group with the lowest LDL-C, HDL-C, or TG level was used as the reference. P values for trend were calculated by entering the score assigned to each quintile category into the model as a continuous independent variable. Alternatively, participants were divided at the clinical cutoff values for diagnosing hyperlipidemia (140 mg/dL for LDL-C, 40 mg/dL for HDL-C, and 150 mg/dL for TG), and the group with the lowest value was used as the reference. The analysis was adjusted for known breast cancer risk factors as confounders [14]. Model 1 was stratified by age group (< 50 and ≥ 50 years) and adjusted for age (continuous). Model 2 was stratified by age group (< 50 and ≥ 50 years) and adjusted for age (continuous), BMI (< 18.5, 18.5–25, 25–30, or > 30 kg/m²), hypertension (yes or no), diabetes mellitus (yes or no), current smoker (yes, no, or missing), drinking status (daily, sometimes, rarely, or missing), physical inactivity (yes or no), and current hormone use (yes or no).
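The follow-up rule in this paragraph (first health checkup to the earliest of incidence, withdrawal, or end of study) can be sketched as follows. The function name and arguments are hypothetical, and dividing days by 365.25 to obtain person-years is an assumed convention that the text does not specify.

```python
from datetime import date

STUDY_END = date(2019, 7, 31)  # end of the study period stated in the text


def follow_up(first_checkup, incidence=None, withdrawal=None,
              study_end=STUDY_END):
    """Return (person_years, event) for one participant.

    event is True when follow-up ends with breast cancer incidence and
    False when it ends with withdrawal or the end of the study period.
    """
    candidates = [(study_end, False)]
    if withdrawal is not None:
        candidates.append((withdrawal, False))
    if incidence is not None:
        candidates.append((incidence, True))
    # Earliest date wins; on a tie, the event takes precedence.
    end, event = min(candidates, key=lambda c: (c[0], not c[1]))
    person_years = (end - first_checkup).days / 365.25  # assumed convention
    return person_years, event
```

The resulting pair of follow-up time and event indicator is the input a Cox model needs for each participant, alongside the covariates.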

A stratified analysis by age group (< 50 and ≥ 50 years) was conducted using age 50 as a surrogate indicator of menopausal status to examine effect modification by menopausal status. P values for interaction were calculated by adding product terms for LDL-C, HDL-C, or TG and age groups (< 50 and ≥ 50) to the main models, with adjustments for the aforementioned confounders. We conducted a sensitivity analysis that excluded participants with a follow-up period of less than 5 years to prevent potential reverse causation and ensure a sufficient latent period.
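One way to write the interaction model described above is shown here as a sketch. The notation is introduced for illustration and is not taken from the source: x denotes the lipid exposure (for example, the quintile score; the text does not say whether the product term used the score or category indicators), g the age-group stratum, and z the vector of adjustment covariates.

```latex
h_{g}(t \mid x, \mathbf{z})
  = h_{0g}(t)\,
    \exp\!\bigl(\beta\, x
      + \delta\, x \cdot \mathbb{1}[g = {\geq}50]
      + \boldsymbol{\gamma}^{\top}\mathbf{z}\bigr)
```

Under this formulation, each age group has its own baseline hazard h_{0g}(t), exp(β) is the hazard ratio per unit of x among women aged < 50 years, exp(β + δ) is the corresponding hazard ratio among women aged ≥ 50 years, and the P value for interaction corresponds to a test of δ = 0.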

All P values reported were two-sided, and the significance level was set at P < 0.05. Statistical analyses were performed using Stata (version 16.0; Stata Corporation, College Station, TX, USA).
