Survey Practice
Vol. 14, Issue 1, 2021 · August 26, 2021 EDT

Impact of demographic survey questions on response rate and measurement: A randomized experiment

Jeanette Y. Ziegenfuss, Casey A. Easterday, Jennifer M. Dinh, Meghan M. JaKa, Thomas E. Kottke, Marna Canterbury
Keywords: response rates, survey research, demographic questions, survey methods, questionnaire design
https://doi.org/10.29115/SP-2021-0010
Ziegenfuss, Jeanette Y., Casey A. Easterday, Jennifer M. Dinh, Meghan M. JaKa, Thomas E. Kottke, and Marna Canterbury. 2021. “Impact of Demographic Survey Questions on Response Rate and Measurement: A Randomized Experiment.” Survey Practice 14 (1). https://doi.org/10.29115/SP-2021-0010.
Supplemental Materials A
Supplemental Materials B

Abstract

Demographic survey questions are important to describe the population of survey responders, illuminate potential disparities, and ultimately advance equity. Little is known about their impact on survey response rate or measurement.

Methods

A total of 4,448 individuals were randomly assigned to one of three conditions in a mailed paper questionnaire where demographic questions were (1) not asked, (2) integrated at the end of the survey, or (3) included as standalone questions on a separate piece of paper. Response rates to the main survey and demographic questions, as well as item nonresponse and correlation of responses to administrative records, are compared.

Results

Overall, 33.4% of individuals who were mailed the survey responded. There were no substantive or statistical differences in survey response rate when demographic questions were not asked (34.2%), were integrated into the survey (33.0%), or were standalone (33.1%; p = 0.762). Sampled individuals responded to the demographic questions at a significantly higher rate when they were integrated into the main survey (32.7%) compared to when they were standalone (28.3%). Respondents, when asked about income, declined to answer at a significantly higher rate when demographics were integrated (16.5%) compared to standalone (10.5%). Discordance between administrative and self-reported race and ethnicity data ranged from 0.6% to 1.0% and was not statistically different across arms (p = 0.64 and p = 0.88, respectively).

Discussion

While these findings are limited to the context of the experiment, our results suggest that embedding demographic questions in a survey (as opposed to on a separate page) may result in more usable demographic data. Future work could explore the differential impact of post-survey missing data adjustments on estimates of demographic characteristics and correlation with other survey content. Overall, there was little measurement error in reporting of race/ethnicity in both conditions.

Conclusion

For collection of demographic data from the largest portion of individuals via a mailed survey without negative impact on response rate or measurement error, demographic questions are best integrated into surveys rather than included as standalone items on a separate piece of paper.

Surveys are an important tool for researchers across disciplines working to reduce disparities. When detailed frame data are not available to describe respondents, self-report is needed to understand respondent characteristics. However, some demographic questions are largely recognized as sensitive (Tourangeau and Yan 2007) and may be left unanswered due to confidentiality concerns or the perception that the question is threatening or difficult to answer (Lor et al. 2017). Furthermore, respondents may choose to skip the entire survey due to confidentiality concerns (Singer, Hippler, and Schwarz 1992).

Researchers have attempted to attenuate the sensitive nature of demographic questions using a variety of methods, for example, by stating the purpose of collecting demographic items, placing demographic questions at the end of the survey (Lor et al. 2017), or assuring respondent confidentiality (Singer, Hippler, and Schwarz 1992). However, the impact of whether and how demographic questions are asked on response rates and other data quality measures is largely unstudied. To our knowledge, there is no empirical evidence evaluating the impact of demographic question inclusion in a survey (Bradburn, Sudman, and Wansink 2004; Dillman 2007; Dillman, Smyth, and Christian 2009). As such, we tested several methods for reducing the sensitivity of demographic questions in a mailed survey.

Our research aims to answer the following three questions in the context of mailed paper survey administration: (1) What impact does the inclusion of demographic questions have on survey response rate? (2) Will separating the demographic questions from the rest of the survey on a standalone piece of paper that restates the optional nature of each question impact response rate? (3) Will presenting the demographic questions as described above affect measurement properties, including individual item nonresponse or discordance between administrative and self-reported data? Answers to these questions could help shed light on the effectiveness of different strategies survey researchers use to mitigate the sensitivity of demographic questions across disciplines and potentially across modes.

Methods

This randomized experiment was embedded in a larger survey-based evaluation project designed to measure opinions and beliefs about community mental illness stigma. The population of interest included adults residing in six Midwest communities. The study was conducted within a large, integrated health system whose patients and members were used as a convenient proxy for the underlying communities in which they reside. The sample size (n=4,448) was selected to achieve a 5% margin of error on key outcomes in each community. The sample was randomly selected from the population and then randomly assigned to one of three paper questionnaire conditions: no demographic questions, integrated demographic questions, and standalone demographic questions (see Table 1).
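
To make the design concrete, the sketch below (a minimal illustration, not the authors' code) shows how a balanced three-arm random assignment and a conventional worst-case sample-size check for a 5% margin of error could be written in R; the seed, arm labels, and the assumption of p = 0.5 at 95% confidence are ours, not taken from the study.

  # Sketch only: three-arm random assignment; seed and labels are hypothetical.
  set.seed(2019)
  n <- 4448
  arms <- c("no_demographics", "integrated_demographics", "standalone_demographics")
  assignment <- sample(rep(arms, length.out = n))  # permute a near-balanced allocation
  table(assignment)

  # Conventional worst-case sample size for a 5% margin of error
  # (assumes p = 0.5 and 95% confidence; not necessarily the authors' assumptions):
  ceiling(qnorm(0.975)^2 * 0.25 / 0.05^2)  # about 385 completed responses per community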

Table 1. Demographic characteristics from administrative data by study arm.
Values are number (percentage).
Demographic Characteristic No Demographics Integrated Demographics Optional Demographics
Total 1484 1482 1482
Age
18-24 163 (11.0) 131 (8.8) 149 (10.1)
25-34 229 (15.4) 225 (15.2) 221 (14.9)
35-44 254 (17.1) 288 (19.4) 282 (19.0)
45-54 287 (19.3) 303 (20.4) 280 (18.9)
55-64 361 (24.3) 334 (22.5) 362 (24.4)
65 and older 190 (12.8) 201 (13.6) 188 (12.7)
Gender
Female 818 (55.1) 818 (55.2) 786 (53.0)
Male 666 (44.9) 664 (44.8) 696 (47.0)
Insurance Product
Commercial 1329 (89.6) 1294 (87.3) 1304 (88.0)
Medicaid 85 (5.7) 116 (7.8) 96 (6.5)
Medicare 70 (4.7) 72 (4.9) 82 (5.5)
Ethnicity
Not Hispanic/Latino 1091 (73.5) 1075 (72.5) 1075 (72.5)
Hispanic/Latino 22 (1.5) 33 (2.2) 29 (2.0)
Unknown 371 (25.0) 374 (25.2) 378 (25.5)
Race
American Indian or Alaska Native* - - -
Asian 50 (3.4) 55 (3.7) 49 (3.3)
Black or African American 61 (4.1) 63 (4.3) 58 (3.9)
Native Hawaiian/other Pacific Islander* - - -
Other 10 (0.7) 8 (0.5) 8 (0.5)
White 1318 (88.8) 1299 (87.7) 1310 (88.4)
Unknown 42 (2.8) 43 (2.9) 49 (3.3)

No significant differences by demographic characteristics were observed.
*One or more values in the row contained 5 or fewer individuals so the entire row was suppressed to protect subject identity.

The four demographic questions, when included, were adapted from standard public health surveillance instruments and asked about annual household income, education, ethnicity, and race. In the integrated survey version, the questions were at the end of the survey with the following transition, “We have a few more questions about you. These questions help us understand your responses better. As a reminder, your information will be kept confidential and secure.” In the standalone demographic question version, the questions were printed on a sheet of yellow paper in contrast to the white paper on which the main survey was printed. The questions were introduced with the following text, “We have a few more questions about you. These questions are optional and help us understand your responses better. As a reminder, your information will be kept confidential and secure. If you choose to fill out this page, please return it with your survey in the envelope provided.” (See supplemental materials.) The rest of the survey comprised four pages and included questions about experience with individuals impacted by mental illness, awareness, and willingness to take action.

The survey was mailed in May 2019 with a $2 bill and a cover letter describing the survey and its overall optional nature. (See supplemental materials.) A unique study identification number linked the survey content to the standalone page when applicable. Mail survey non-responders transitioned to telephone follow-up after at least 21 days as part of the sequential mixed-mode survey design, but the demographic experiment was limited to the mail phase. Mail data collection closed in September 2019.

The integrated health system from which the sample was drawn collects administrative data on ethnicity and race as reported by the member/patient and/or clinic staff. In the administrative data, ethnicity is recorded as Hispanic/Latino, not Hispanic/Latino, or unknown. Race is recorded as Native Hawaiian/other Pacific Islander, American Indian or Alaska Native, Asian, Black/African American, White, other, or unknown. If the administrative data indicated multiple races, the indicated race least common in the state according to US Census data current as of July 1, 2019, was assigned as the subject’s primary race. The same process was applied to self-reported data to create comparable measures between administrative and self-reported demographics.
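
As an illustration of this recoding rule (a sketch with a hypothetical ranking, not the authors' code or the actual Census ordering), the least common listed race can be selected in R as follows:

  # Illustrative only: rank race categories from least to most common in the state;
  # in practice the ordering would come from US Census estimates as of July 1, 2019.
  census_rank <- c(
    "Native Hawaiian/other Pacific Islander" = 1,
    "American Indian or Alaska Native"       = 2,
    "Asian"                                  = 3,
    "Black or African American"              = 4,
    "Other"                                  = 5,
    "White"                                  = 6
  )

  # Given the races listed for one record, return the least common as the primary race.
  assign_primary_race <- function(races) {
    races <- intersect(names(census_rank), races)
    if (length(races) == 0) return("Unknown")
    races[which.min(census_rank[races])]
  }

  assign_primary_race(c("White", "Asian"))  # "Asian" under this hypothetical ranking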

The primary outcome variable was response rate. Since these surveys were embedded within a larger evaluation and any respondent feedback was valuable, a survey returned with at least one item completed was considered a response, regardless of whether that item was related to the primary outcomes or was a demographic item (importantly, 99% of respondents completed more than half of the survey). Secondary outcomes included demographic question completion (defined as completion of at least one question in the demographic question set) and alignment of self-reported demographics with administrative records. Endorsement of “Decline to answer,” available only for the income question, was considered non-response. Pearson’s chi-squared tests were used to assess survey completion rate differences, demographic item completion, and discordance between self-reported and administrative demographics. Surveys returned undeliverable due to invalid postal address were removed from analyses. Only individuals with known self-reported and administrative demographic characteristics were included in analyses of data source alignment. Statistical analyses were conducted in R version 3.6.1 (R Core Team 2019).
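
For illustration, the main survey response comparison can be reproduced approximately in base R from the counts later reported in Table 2; this is a sketch, not the authors' analysis script, and small differences in the denominators used may shift the statistic slightly.

  # Sketch: Pearson's chi-squared test of main survey response by study arm.
  completed     <- c(no_demo = 507, integrated = 489, standalone = 491)
  fielded       <- c(no_demo = 1484, integrated = 1481, standalone = 1482)
  not_completed <- fielded - completed
  chisq.test(rbind(completed, not_completed))
  # Should land near the reported chi-squared = 0.545, p = 0.762.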

This project was embedded in a program evaluation designed for future program planning that was deemed exempt from institutional review board (IRB) oversight by the organization’s IRB.

Results

Demographic characteristics of the 4,448 sampled individuals as recorded in the administrative data were balanced across the three conditions (see Table 1). The sample of randomly selected members and patients ranged in age from 18 to 102 years, with most individuals between 55 and 64 years old; the sample was primarily female, commercially insured, not Hispanic/Latino, and White. Of the 4,448 surveys fielded, one survey was returned undeliverable due to invalid address and was removed from subsequent analyses. Of the remaining 4,447 individuals, 1,487 responded to the survey by mail (33.4% response rate; minimum response rate 2, American Association for Public Opinion Research 2016). While a survey returned with at least one item completed was considered a survey response, over 99% of respondents completed at least half of the survey items. Mail response rates to the main survey were 34.2% when no demographic questions were included, 33.1% when demographic questions were standalone, and 33.0% when demographic questions were integrated into the main survey. Differences were not statistically significant (Table 2; χ2 = 0.545, p = 0.762). Overall, responders were more likely to be female, older, White, and to have Medicaid. This pattern held across arms except for gender. Only in the standalone arm were women significantly more likely to respond (results not shown).

Table 2. Survey response rates by study arm.
No Demographics Integrated Demographics Optional Demographics
Main Survey
Number fielded 1484 1481 1482
Number returned undeliverable 0 1 0
Number completed 507 489 491
Response rate (%) 34.2 33.0 33.1
Demographics Survey
Number fielded 0 1481 1482
Number completed 0 484 420
Response rate (%) NA 32.7* 28.3

*The response rate to the demographics survey was significantly higher in the Integrated Demographics group than the Optional Demographics group (χ2 =6.38, p = .012).

Overall, in the two arms with the demographic questions included (integrated and standalone), 30.5% of individuals responded to at least one demographic question. At least one demographic question was reported by a larger subset of the sample when the demographic questions were integrated (32.7%) compared to when they were standalone (28.3%; χ2= 6.38, p = 0.012; Table 2). Of survey respondents, 99.0% responded to at least one demographic question when integrated, but only 85.5% when standalone (χ2= 60.0, p < 0.001).
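
The same approach applied to the Table 2 counts gives values close to the comparison above (a sketch, not the authors' script; chisq.test() applies the Yates continuity correction by default for a 2x2 table):

  # Sketch: response to the demographic questions, integrated vs. standalone arms.
  demo_completed <- c(integrated = 484, standalone = 420)
  demo_fielded   <- c(integrated = 1481, standalone = 1482)
  chisq.test(rbind(demo_completed, demo_fielded - demo_completed))
  # Expected to land near the reported chi-squared = 6.38, p = 0.012.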

Item nonresponse rates to demographic questions ranged across demographic questions and arms from 0.41% to 16.5% (see Table 3). There was more item nonresponse to the income question among respondents of the integrated demographic questions than among respondents of the standalone demographic questions (16.5% compared to 10.5%; χ2= 6.46, p = 0.011). However, despite this higher item nonresponse rate, due to the higher overall survey response rate, the integrated demographics condition ultimately produced more data on respondent income than the standalone demographics condition. Item nonresponse did not vary significantly across survey conditions for the three other demographic questions.

Table 3. Item nonresponse rates by demographic item and study arm.
Values are number of item non-responses / number of completed demographic surveys (percentage).
Survey Item Integrated Demographics Optional Demographics
Income 80/484 (16.5)* 44/420 (10.5)
Education 2/484 (0.41) 4/420 (0.95)
Ethnicity 7/484 (1.4) 7/420 (1.7)
Race 9/484 (1.9) 4/420 (0.95)

*The non-response rate to the survey question about respondent household income is significantly higher in the Integrated Demographics group than in the Optional Demographics group (χ2 =6.46, p = .011).

Across the study population, race and ethnicity information was available for 75% and 96% of the sample, respectively (Table 1). For respondents whose race and ethnicity were known through administrative records and who also self-reported race and ethnicity on the survey, discordance ranged from 0.6% to 1.0%. We found no significant differences between discordance rates of the integrated and standalone study arms (for ethnicity, χ2 = 0.02, p = 0.88 and for race, χ2 = 0.22, p = 0.64; see Table 4).

Table 4. Discordance rates between administrative and self-reported data for ethnicity and race. Values are number discordant / number with both administrative and self-reported data (percentage).
Integrated Demographics Optional Demographics
Ethnicity 2/353 (0.6%) 3/310 (1%)
Race 6/462 (1%) 3/405 (0.7%)

No significant differences between discordance rates of the integrated and optional study arms were observed.

Discussion

We sought to answer three questions. (1) What impact does including demographic questions have on survey response rate? We demonstrated no impact. (2) Will separating the demographic questions from the rest of the survey on a standalone page impact response rate? We demonstrated no impact. (3) Does separating the demographic questions onto a standalone page impact measurement properties, including individual item nonresponse or discordance between administrative and self-reported data? We found that whether the demographic questions are integrated or standalone can influence a respondent’s choice to answer certain demographic questions. When demographic questions were integrated, more respondents answered at least one demographic question than when they were standalone. However, integration also led to more item nonresponse: six percentage points more respondents refused to provide information about their income when demographics were integrated into the survey than when they were standalone.

Pragmatically, our results suggest that embedding demographic questions in a survey (as opposed to on a separate page) may result in more usable demographic data. Respondents who are more hesitant about answering demographic questions may complete the survey but not mail back the standalone demographic questions; had those questions been embedded in the survey, they may have mailed back the survey with at least some responses to demographic items. Nonetheless, the higher rate of missing data in the integrated condition could result in different estimates of income if the choice not to respond is correlated with income. While this could be investigated empirically, we would not be able to disentangle the impacts of selective item nonresponse, unit nonresponse, and measurement properties with our current design. With more robust frame data, future work could explore this question, as well as the differential impact of post-survey missing data adjustments on estimates of demographic characteristics and their correlation with other survey content, which is beyond the scope of this work.

Importantly, there was no impact on measurement error; self-reported ethnicity and race corresponded highly with administrative data. At most, self-reported and administrative ethnicity and race data disagreed in 1% of cases. These data suggest that standalone demographic questions can impact measurement through differential item nonresponse, but not directly through measurement error.

This work shows that demographic questions can be included in a mailed survey with confidence when needed. Response rates were not negatively impacted, and we showed high concordance between self-report of race/ethnicity and administrative data. Question integration into the survey is optimal. While results may vary by survey mode, population, and topic, among other survey design factors, our results give confidence that the inclusion of these important questions in a mail survey does not bias our data.

Our findings have limitations. The experiment was conducted in a single institution that serves a relatively homogeneous population. Compared to the adult population in Minnesota, sampled individuals were younger and more likely to be female, White and insured. Compared to the United States, Minnesota is relatively White (U.S. Census Bureau 2019). It is possible that the existing relationships with the sponsor could engender recipients’ trust and thus impact willingness to return demographic information. The study was limited to four demographic constructs and a single question for each. We saw a different pattern with income nonresponse, suggesting, not surprisingly, that not all demographic constructs are equal. We did not test variations of questions for each construct and are not commenting on the appropriateness of these specific questions; our findings are limited to the specific constructs and questions used. Results may also vary with surveys that collect more demographic questions than the four that we tested.

The study was constrained to the first part of a sequential mixed-mode survey (mail with phone follow-up) because the experiment could not be fully replicated in phone administration. Results may differ in other modes but could be relevant to self-administered web-based questionnaires. The experiment was embedded in a survey about the stigma of mental illness, and this topic’s salience could be correlated with a respondent’s willingness to volunteer demographic information. Similarly, for this experiment, we achieved an approximately 33% mail survey response rate; there may be a complex relationship between the impact of demographic questions and position along the continuum of response that could be evaluated. Finally, there was unavoidable confounding between survey length and the presence of demographic questions. However, this is mitigated somewhat by the absence of observed differences in response rates across demographic conditions.

Our study also had important strengths. We utilized a randomized design embedded in a large study without unduly increasing respondent burden or compromising the main study objectives. We had robust frame data to enable insight on measurement error across conditions, and we achieved a relatively high response rate to assuage concern about differences along the response continuum.

Conclusion

Understanding the sociodemographic characteristics of survey respondents is important to give responses context. Moreover, understanding how beliefs, opinions and behaviors may differ by subpopulations is critical if underlying disparities are to be documented and ultimately addressed. Here we have shown that relying on self-reported demographics does not negatively impact survey performance, thus further warranting their inclusion. Future work should replicate this experiment in other modes and with other demographic questions as well as consider the potential differential impact of demographic question presentation for subpopulations. For example, in our data there was a signal for differential impact on response rate by gender. We have provided evidence to support inclusion of demographic questions in surveys, enabling the important work on documentation and analysis of disparities when administrative data are not available.


Funding

This study was funded by Lakeview Health Foundation and HealthPartners. While the authors are employed by these funding sources, the sponsors themselves had no role in study design; in collection, analysis, and interpretation of data; in report writing; or in the decision to submit for publication.

Declaration of interest statement

Declarations of interest: none

Submitted: March 23, 2021 EDT

Accepted: July 14, 2021 EDT

References

American Association for Public Opinion Research. 2016. Standard Definitions: Final Dispositions of Case Codes and Outcome Rates for Surveys. Washington, DC: AAPOR.
Bradburn, Norman M., Seymour Sudman, and Brian Wansink. 2004. Asking Questions: The Definitive Guide to Questionnaire Design--for Market Research, Political Polls, and Social and Health Questionnaires. San Francisco, CA: Jossey-Bass.
Dillman, Don A. 2007. Mail and Internet Surveys: The Tailored Design Method. Hoboken, NJ: John Wiley & Sons, Inc.
Dillman, Don A., Jolene D. Smyth, and Leah Melani Christian. 2009. Internet, Mail, and Mixed-Mode Surveys: The Tailored Design Method. Hoboken, NJ: John Wiley & Sons.
Lor, Maichou, Barbara J. Bowers, Anna Krupp, and Nora Jacobson. 2017. “Tailored Explanation: A Strategy to Minimize Nonresponse in Demographic Items among Low-Income Racial and Ethnic Minorities.” Survey Practice 10 (3). https://doi.org/10.29115/SP-2017-0015.
R Core Team. 2019. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing.
Singer, Eleanor, Hans-Juergen Hippler, and Norbert Schwarz. 1992. “Confidentiality Assurances in Surveys: Reassurance or Threat?” International Journal of Public Opinion Research 4 (3): 256–68. https://doi.org/10.1093/ijpor/4.3.256.
Tourangeau, Roger, and Ting Yan. 2007. “Sensitive Questions in Surveys.” Psychological Bulletin 133 (5): 859–63. https://doi.org/10.1037/0033-2909.133.5.859.
U.S. Census Bureau. 2019. “Population Estimates, July 1, 2019 (V2019) — Minnesota.” QuickFacts. https://www.census.gov/quickfacts/MN.
