Background
Linking data collected in surveys with other sources of data, such as administrative data, can maximize limited resources and reduce respondent burden. However, just as differential nonresponse can introduce nonresponse bias, differential linkage refusal increases the possibility of ‘consent’ or ‘linkage’ bias (Hill, Atkinson, and Blakely 2002; Jenkins et al. 2006; Sala, Burton, and Knies 2012). As with nonresponse bias (Groves 2006), linkage bias can lead to inaccurate estimates of population characteristics (Sakshaug and Kreuter 2012; Sakshaug et al. 2012). Previous studies have examined health differences between those who consent to link and those who do not. Some studies reported poorer health among those who consented, other studies found no systematic health differences by consent status, and some studies found differences by consent status that varied among health endpoints (Bohensky et al. 2010; Carter et al. 2010; Cruise et al. 2015; Dunn 2004; Huang et al. 2007; Kho et al. 2009; Knies, Burton, and Sala 2012; Knies and Burton 2014; Young, Dobson, and Byles 2001).
The National Health Interview Survey (NHIS), conducted by the Centers for Disease Control and Prevention’s National Center for Health Statistics, is a principal source of information on the health of the civilian noninstitutionalized U.S. population. The cross-sectional NHIS data are routinely linked to administrative records from the Social Security Administration and the Centers for Medicare and Medicaid Services to maximize the scientific value of the survey without increasing respondents’ reporting burden (Golden et al. 2015). Respondents’ refusal to provide a Social Security Number (SSN) or Medicare Health Insurance Claim Number (HICNUM), used to facilitate linkage, may affect the composition of the linked sample. Our objective was to examine the role of health status on the propensity to refuse linkage in a large nationally representative U.S. sample. We estimated associations of health conditions, general health status, and the total number of health conditions with linkage refusal.
Methods
Data Source
Data came from the 2010–2013 NHIS. Details about the sampling frame and sample design have been published previously (Parsons et al. 2014). Interviews were conducted in respondents’ homes, with some telephone follow-ups. One person from each family answered questions about family members’ age, health insurance coverage, and general health status. One randomly selected “sample adult” aged 18 years and older from each family answered questions about himself or herself. In each year, the final response rate among sample adults was over 60 percent and data were obtained from approximately 30,000 sample adults. Details of the sample sizes and response rates are available in the survey documentation for each relevant year (e.g. NCHS 2013).
Linkage and Linkage Refusal
At the conclusion of the sample adult interview, respondents were asked: “To help us link your survey data with vital statistics and health-related records of other government agencies, we would like the last four digits of your Social Security Number. The National Center for Health Statistics uses this information for research purposes only. Providing this information is voluntary. Federal laws authorize us to ask for this information and require us to keep it strictly private. There will be no effect on your benefits if you do not provide this information. What are the last four digits of your Social Security Number?” Respondents were similarly asked for the last four digits of the HICNUM if they were Medicare eligible. Respondents who answered “no,” “refused,” “don’t know,” or “don’t have a SSN (or Medicare number),” were asked: “May we try to link your survey data without [your SSN/HICNUM]?”
The questions were the same whether administered in-person or by telephone. For the analysis, record-linkage consent was defined as permission to use the SSN or HICNUM to link, or permission to link without either an SSN or HICNUM. Record-linkage refusal was defined as refusal to provide either the SSN or HICNUM combined with refusal to allow linkage without the SSN or HICNUM. For the 2010–2013 NHIS, unweighted linkage refusal ranged from 10.0 percent in 2010 to 11.8 percent in 2013.
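The consent and refusal definitions above can be expressed as a small classification rule. The following is a minimal sketch, not the NHIS processing code: the function name, argument names, and the "undetermined" category for a missing follow-up answer are assumptions introduced here for illustration.

```python
from typing import Optional


def linkage_status(gave_identifier: bool,
                   allows_linkage_without_id: Optional[bool]) -> str:
    """Classify a respondent following the definitions in the text.

    gave_identifier: True if the respondent provided the last four digits
        of the SSN or HICNUM (hypothetical variable name).
    allows_linkage_without_id: answer to the follow-up question asking
        whether linkage may be attempted without the identifier;
        None if no answer was recorded (hypothetical variable name).
    """
    # Consent: permission to use the SSN or HICNUM to link.
    if gave_identifier:
        return "consent"
    # Consent: permission to link without either identifier.
    if allows_linkage_without_id is True:
        return "consent"
    # Refusal: no identifier AND refusal to allow linkage without it.
    if allows_linkage_without_id is False:
        return "refusal"
    # Assumption: a missing follow-up answer is left unclassified here;
    # the source does not describe how such cases were handled.
    return "undetermined"
```

Under this rule, a respondent counts as a refusal only when both the identifier request and the follow-up request are declined.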
Health Characteristics
The selected health conditions examined were hypertension, heart disease, diabetes, cancer, chronic obstructive pulmonary disease (COPD), serious psychological distress (SPD), and stroke. Hypertension was defined as having been told by a doctor or other health professional of having hypertension on two or more different visits. Heart disease was based on responses to questions about having ever been told by a doctor or other health professional of having coronary heart disease, angina (angina pectoris), a heart attack (myocardial infarction), or any heart disease or condition. Diabetes (excluding borderline diabetes) and stroke were also based on “yes” or “no” responses to questions about having ever been told by a doctor or other health professional of having these conditions. Cancer was based on responses to having ever been told by a doctor or other health professional of having cancer or a malignancy, excluding nonmelanoma skin cancer. COPD was based on positive responses to questions about having ever been told by a doctor or other health professional of having emphysema or having been told in the past 12 months of having had chronic bronchitis (American Thoracic Society 2004). SPD was measured using the K6, a series of six psychological distress questions asking how often a respondent experienced symptoms of psychological distress during the past 30 days (Kessler et al. 2003). Small numbers of respondents were missing data: hypertension (n=253); heart disease (n=54); diabetes (n=91); cancer (n=106); COPD (n=44); SPD (n=2,025); stroke (n=121); respondent-assessed health status (n=66); and total number of conditions (n=2,534). Records with missing data were not included in the health-outcome specific analyses but were retained in the overall study population.
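As an illustration of how a total-conditions count could be derived from the seven indicators, the sketch below sums the conditions and returns a missing value when any indicator is missing. The field names are hypothetical, and the any-missing rule is an assumption; the source reports only that 2,534 records lacked a total count.

```python
from typing import Dict, Optional

# Hypothetical field names for the seven conditions listed in the text.
CONDITIONS = (
    "hypertension", "heart_disease", "diabetes", "cancer",
    "copd", "spd", "stroke",
)


def total_conditions(record: Dict[str, Optional[bool]]) -> Optional[int]:
    """Count the conditions a respondent has.

    Each value is True (has the condition), False (does not), or None
    (missing). Assumption: the total is treated as missing if any of the
    seven indicators is missing.
    """
    values = [record.get(name) for name in CONDITIONS]
    if any(v is None for v in values):
        return None
    return sum(1 for v in values if v)
```

A record with a missing total would be dropped only from the analysis of total conditions, matching the statement that such records were retained in the overall study population.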
Demographic Characteristics
Demographic characteristics included race and ethnicity, age, and poverty. Race and ethnicity were categorized as Hispanic, non-Hispanic White, non-Hispanic Black, and all other races/ethnicities. Age groups included 18–44 years, 45–64 years, and 65 years and over. Annual family income was categorized into a poverty index ratio (PIR) of below 100 percent of the federal poverty level, 100 percent–199 percent of the federal poverty level, 200 percent–399 percent of the federal poverty level, and 400 percent or more of the federal poverty level. Imputation was used to assign a poverty level for records with missing income data (Schenker et al. 2009). Because a previous study found a possible association between imputation status and record-linkage refusal (Miller, Gindi, and Parker 2011), imputation status was included as an independent variable with three categories: providing all income information, providing income by categories, and providing no income information.
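The four PIR groups can be sketched as a simple binning of family income relative to the applicable poverty threshold. This is an illustrative sketch only: the function name, arguments, and category labels are assumptions, and the actual thresholds depend on family size and year, which are not reproduced here.

```python
def pir_category(family_income: float, poverty_threshold: float) -> str:
    """Assign one of the four PIR groups described in the text.

    family_income and poverty_threshold are hypothetical inputs in the
    same currency units; the threshold is assumed to be the federal
    poverty level applicable to the family.
    """
    if poverty_threshold <= 0:
        raise ValueError("poverty_threshold must be positive")
    # Income as a percentage of the federal poverty level.
    ratio = 100.0 * family_income / poverty_threshold
    if ratio < 100:
        return "<100%"
    if ratio < 200:
        return "100-199%"
    if ratio < 400:
        return "200-399%"
    return ">=400%"
```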
Statistical Analysis
Point estimates, corresponding variances, and 95 percent confidence intervals were calculated using SAS-callable SUDAAN software (SUDAAN 2008) to provide weighted estimates and account for the complex sample design. Overall associations for categorical variables were evaluated using the Rao-Scott chi-square statistic for weighted survey data.
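To make the idea of a weighted estimate concrete, the sketch below computes a weighted proportion from outcomes and survey weights. It illustrates the point estimate only; it is not SUDAAN and does not compute design-based variances or confidence intervals, which require the strata and cluster information of the complex sample design. The function and argument names are assumptions.

```python
from typing import Sequence


def weighted_proportion(outcomes: Sequence[bool],
                        weights: Sequence[float]) -> float:
    """Weighted share of respondents with the outcome (e.g. refusal).

    outcomes: True/False (or 1/0) per respondent.
    weights: survey weight per respondent (hypothetical values).
    """
    if len(outcomes) != len(weights):
        raise ValueError("outcomes and weights must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("sum of weights must be positive")
    # Sum the weights of respondents with the outcome, divide by all weights.
    return sum(w for y, w in zip(outcomes, weights) if y) / total
```

For example, one refuser with weight 1 among respondents whose weights sum to 4 gives a weighted refusal proportion of 0.25, whereas the unweighted proportion among three respondents would be one third.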
We used separate multivariable logistic regression models, each of which included one health condition or characteristic and controlled for age group, sex, race/ethnicity, PIR, and income imputation status. Trend tests were used to assess linear relationships of linkage refusal with health status and with total number of health conditions.
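One way to write the model described above is shown here. The notation is an assumption introduced for illustration: $R_i$ indicates linkage refusal for respondent $i$, $H_i$ is the single health condition or characteristic in a given model, and $\mathbf{x}_i$ collects the indicators for age group, sex, race/ethnicity, PIR, and income imputation status.

```latex
\operatorname{logit}\, P(R_i = 1)
  = \log \frac{P(R_i = 1)}{1 - P(R_i = 1)}
  = \beta_0 + \beta_1 H_i + \boldsymbol{\gamma}^{\top} \mathbf{x}_i
```

In this form, $\exp(\beta_1)$ is the adjusted odds ratio for refusal associated with the health characteristic, so a value below 1 corresponds to an inverse association with linkage refusal.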
Results
The study population included 129,253 adults with complete demographic data. Approximately 11 percent of respondents refused record linkage. Close to 50 percent were aged less than 45 years (sample mean age was 47 years) and almost 70 percent were non-Hispanic White (Table 1). The majority of adults were in the 200–399 percent PIR (30 percent) and ≥400 percent PIR (37 percent) groups. Nearly 39 percent of adults had at least one health condition.
In bivariate analysis, Hispanic adults were more likely to refuse record-linkage (13 percent) than non-Hispanic White (10 percent) and non-Hispanic Black (10 percent) adults (Table 1). Adults in the youngest age group were less likely to refuse linkage than adults in other age groups. Adults in the lowest income groups were less likely to refuse linkage than those in higher income groups (p for trend < 0.01). Adults who did not provide income information were more likely to refuse linkage (29 percent) than adults who reported their income (9 percent). For each health condition, adults with the condition were less likely to refuse linkage than adults without the condition.
In separate multivariable logistic regression models, each of the health conditions remained inversely associated with linkage refusal after controlling for age, race/ethnicity, sex, PIR, and income imputation status (Table 2). In addition, the inverse association was stronger as the number of conditions increased (p for trend <0.01; Table 2). Similarly, those with better general health status had higher rates of linkage refusal (p for trend <0.01). Sensitivity analyses that excluded the income imputation variable (an indicator of refusal to provide income information) and included the survey year (to allow for changes in record-linkage refusal over time) yielded results similar to the main findings.
Discussion
Our major finding was that adults with hypertension, heart disease, diabetes, cancer, COPD, SPD, or stroke were less likely to refuse linkage than adults without these health conditions. The finding was further supported by dose-response associations of linkage refusal with number of health conditions and with health status. As the number of health conditions increased, adults were less likely to refuse record-linkage. Similarly, refusal to link decreased with poorer reported health status.
The results are consistent with other studies showing associations between health and linkage refusal (Dunn 2004; Knies, Burton, and Sala 2012; Knies and Burton 2014; Young, Dobson, and Byles 2001). Differences in linkage refusal by health characteristics may be explained by leverage-salience theory; faced with a survey request of interest, respondents cooperate at higher rates than those less interested (Groves, Singer, and Corning 2000).
Health differences between adults who consent to linkage and those who refuse could bias study results. Weighting methods to decrease linkage bias, similar to those used for survey nonresponse bias, have been proposed for analyses of NCHS linked data (Judson, Parker, and Larsen 2013). However, weighting methods may not fully account for potential biases. The best approach may depend on the study question. Because the survey collects information on a large number of health factors for both consenters and refusers, this information can inform approaches for analyses using the linked data.
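One generic form of weight adjustment, familiar from nonresponse adjustment, rescales the weights of linkage-eligible respondents within adjustment cells so that they sum to the weight of all respondents in the cell. The sketch below illustrates that general idea under stated assumptions; it is not the specific method of Judson, Parker, and Larsen (2013), and the record keys ("cell", "weight", "linked") are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def adjust_weights(records: List[Dict]) -> List[float]:
    """Cell-based weight adjustment for linkage refusal (illustrative).

    Each record is a dict with hypothetical keys:
      "cell":   adjustment cell label (e.g. age group by health status)
      "weight": original survey weight
      "linked": True if the respondent is eligible for linkage
    Returns adjusted weights in input order; refusers get weight 0.
    """
    total = defaultdict(float)   # weight of all respondents per cell
    linked = defaultdict(float)  # weight of linkage-eligible respondents
    for r in records:
        total[r["cell"]] += r["weight"]
        if r["linked"]:
            linked[r["cell"]] += r["weight"]

    adjusted = []
    for r in records:
        if r["linked"]:
            # Inflate so linked respondents represent the whole cell.
            factor = total[r["cell"]] / linked[r["cell"]]
            adjusted.append(r["weight"] * factor)
        else:
            adjusted.append(0.0)
    return adjusted
```

An adjustment of this kind removes bias only to the extent that refusal is unrelated to the outcome within each cell, which is one reason weighting may not fully account for potential biases.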
Our report on record-linkage refusal in a national sample found that respondents without selected health conditions were more likely to refuse linkage. Researchers should evaluate potential biases in their analyses due to linkage refusal to determine appropriate adjustment.