Should high response rates really be a primary objective?

Koen Beullens; Geert Loosveldt

doi:10.29115/SP-2012-0019

Many survey agencies tend to measure and express the quality of their respondent samples in terms of response rates. Moreover, AAPOR advises in its best practices to “maximize cooperation or response rates […]” (www.aapor.org). Indeed, low nonresponse rates are believed to reduce the potential for nonresponse bias. Response rates are also easy to calculate, and because everyone is expected to appreciate their meaning and relevance, they are easy to report. Moreover, a response rate is a clear strategic objective to focus on during fieldwork activities. Nevertheless, the high response rate objective can also tempt agencies and interviewers to prioritize the high propensity cases by following the path of least resistance, potentially causing or preserving nonresponse bias rather than reducing it.

Method

We will investigate the response rate–nonresponse bias dilemma by running different simulations of a hypothetical fieldwork situation in face-to-face surveying. Suppose we start from a sample s consisting of n = 100 persons. Each of them has a particular response propensity $\rho_{\text{i}}$, the individual probability of participating after a single survey request. The sample mean of these propensities is $\bar{\rho}$ = 0.25 and the variance is $\text{S}^{2}_\rho$ = 0.02, indicating that different sample members may have different response probabilities. The survey budget allows exactly 300 contact attempts, which have to be divided among the 100 individuals. Similar to the formula that estimates the probability of throwing at least one six when rolling a die k times, we define the final response propensity as $\rho_{\text{i,final}} = 1-(1-\rho_{\text{i}})^{\text{k}_{\text{i}}}$, where $\text{k}_{\text{i}}$ is the number of contact attempts allocated to unit i. For simplicity, we assume the $\rho_{\text{i}}$’s to be independent in time.
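The final-propensity formula can be checked with a few lines of Python. This is only an illustrative sketch; the function name `final_propensity` is hypothetical and does not come from the original study.

```python
def final_propensity(rho, k):
    """Probability of at least one success in k independent attempts,
    each succeeding with probability rho: 1 - (1 - rho) ** k."""
    return 1.0 - (1.0 - rho) ** k

# A unit with propensity 0.25 that is allocated three contact attempts.
p_three_attempts = final_propensity(0.25, 3)   # 1 - 0.75**3 = 0.578125

# The die analogy from the text: at least one six in four rolls.
p_six_in_four = final_propensity(1.0 / 6.0, 4)
```

Because each extra attempt multiplies the remaining nonresponse probability by (1 - rho), the gain from an additional attempt shrinks as k grows.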

The crucial indices to evaluate the sample that is eventually obtained are the mean final response propensity (= response rate or $\bar{\rho}_{\text{final}}$) and the variance of the final propensities ($\text{S}^{2}_{\rho_{\text{final}}}$). The risk of bias can be expressed as $\text{S}_{\rho_{\text{final}}}/\bar{\rho}_{\text{final}}$. This fraction determines the maximum standardized difference possible between the respondent set and the full sample for a particular parameter (e.g., Schouten et al. 2009).
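As a small illustration, the risk-of-bias index is the coefficient of variation of the final propensities. The sketch below assumes the standard deviation is computed with denominator n (population form); the text does not say whether n or n - 1 was used, and the function name is hypothetical.

```python
from statistics import mean, pstdev

def risk_of_bias(final_propensities):
    """Standard deviation of the final propensities divided by their mean.
    Assumption: population standard deviation (denominator n)."""
    return pstdev(final_propensities) / mean(final_propensities)

# Equal propensities carry no risk of bias under this index.
equal = risk_of_bias([0.5, 0.5, 0.5])

# Two units with final propensities 0.2 and 0.6: mean 0.4, S = 0.2.
unequal = risk_of_bias([0.2, 0.6])
```

The index falls either when the propensities become more equal or when their mean rises, which is why the two goals can pull in different directions.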

Based on the expression of risk of bias, three strategies can be defined in order to distribute the 300 contact attempts:

1. Maximize the response rate $\bar{\rho}_{\text{final}}$. The underlying idea is to limit the potential for nonresponse bias by pursuing a low nonresponse percentage.
2. Minimize the variance of the final propensities ($\text{S}^{2}_{\rho_{\text{final}}}$). Schouten et al. (2009) define representativeness as the equality of response propensities.
3. Minimize the risk of bias $\text{S}_{\rho_{\text{final}}}/\bar{\rho}_{\text{final}}$.

These three strategies can be compared to a random allocation strategy. Here, every nonresponding unit has an equal probability of being revisited.
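To make the response rate maximization strategy concrete, here is a minimal sketch under simplifying assumptions that are not taken from the study: each unit receives a fixed whole number of planned attempts, the budget counts planned attempts, and every unit is attempted at least once. It uses a greedy rule, a common approach when each extra attempt yields a diminishing gain; it is not the SAS OPTMODEL formulation used by the authors, and `allocate_max_response` is a hypothetical name.

```python
import heapq

def allocate_max_response(rhos, budget):
    """Greedily hand out attempts to raise the mean final propensity.
    Giving one more attempt to a unit that already has k attempts adds
    rho * (1 - rho) ** k to that unit's final propensity."""
    n = len(rhos)
    if budget < n:
        raise ValueError("budget must allow one attempt per unit")
    k = [1] * n  # assumption: every unit is attempted at least once
    # Max-heap via negated gains; each entry is (-gain, unit index).
    heap = [(-r * (1.0 - r), i) for i, r in enumerate(rhos)]
    heapq.heapify(heap)
    for _ in range(budget - n):
        _, i = heapq.heappop(heap)
        k[i] += 1
        heapq.heappush(heap, (-rhos[i] * (1.0 - rhos[i]) ** k[i], i))
    return k

# Two units, four attempts: both extra attempts go to the easier case.
attempts = allocate_max_response([0.1, 0.5], 4)
```

In this toy case the unit with propensity 0.5 receives three attempts and the unit with propensity 0.1 only one, which mirrors the "path of least resistance" described in the text.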

For each of these four strategies, we will assess the obtained survey sample with respect to the final response rate, the risk of bias, and the effective sample size. The latter indicator reflects the statistical power of the sample: responding units may need to be weighted in order to restore the representativeness of the sample, which usually inflates the estimation variance. The effective sample size is expressed as $\text{n}/(1+(\sum_{\text{s}}\rho_{\text{i}}(\text{w}_{\text{i}}-1)^{2})/(\text{n}\bar{\rho}))$, where the weight is $\text{w}_{\text{i}} = \bar{\rho}/\rho_{\text{i}}$. This expression is based on the variance inflation factor ($\text{n}/(1 + \text{S}^{2}_{\text{w}}/\bar{\text{w}}^{2})$) as proposed by Kish (1965), but slightly modified since not all sample units will eventually be in the respondent sample. Because every sample unit has some probability of ending up in the respondent sample, every unit has some probability of being weighted, whereas the original variance inflation expression treats the weights as fixed because the respondent sample is considered to be already realized.
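The effective sample size expression can be transcribed directly. The sketch below is a literal reading of the formula as printed, applied to whatever propensity vector is passed in (presumably the final propensities, which is an assumption); the function name is hypothetical.

```python
def effective_sample_size(rhos):
    """n / (1 + sum(rho_i * (w_i - 1)**2) / (n * rho_bar)),
    with weights w_i = rho_bar / rho_i, as printed in the text."""
    n = len(rhos)
    rho_bar = sum(rhos) / n
    # Propensity-weighted squared deviation of the weights from 1.
    penalty = sum(r * (rho_bar / r - 1.0) ** 2 for r in rhos)
    return n / (1.0 + penalty / (n * rho_bar))

# Equal propensities need no weighting, so nothing is lost.
no_loss = effective_sample_size([0.5, 0.5, 0.5, 0.5])

# Unequal propensities (0.2 and 0.6) shrink the effective size.
with_loss = effective_sample_size([0.2, 0.6])
```

The more the propensities differ, the further the weights move from 1 and the smaller the effective sample size becomes.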

The distribution of the 300 attempts is optimized numerically using the OPTMODEL procedure in SAS: for the first three objectives, the algorithm searches over different distributions of the attempts until (1) no higher response rate, (2) no lower propensity variance, or (3) no lower risk of bias can be found. For the fourth fieldwork objective, the algorithm searches for the most suitable probability to revisit a nonrespondent, equal among all sample units. All scenarios start from a vector of response propensities satisfying $\bar{\rho}$ = 0.25 and $\text{S}^{2}_{\rho}$ = 0.02. As an additional constraint, all sample cases should be attempted at least once.
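The text does not say how the starting vector of propensities was constructed. One possible way to obtain a comparable vector, purely as an assumption, is to draw from a Beta distribution whose mean and variance are matched to 0.25 and 0.02 by the method of moments. The function names below are hypothetical, and a random draw only approximates the target moments rather than satisfying them exactly.

```python
import random

def beta_params(mean, var):
    """Method-of-moments Beta(alpha, beta) parameters for a given
    mean and variance."""
    common = mean * (1.0 - mean) / var - 1.0
    if common <= 0:
        raise ValueError("variance too large for this mean")
    return mean * common, (1.0 - mean) * common

def draw_propensities(n, mean=0.25, var=0.02, seed=0):
    """Draw n propensities in [0, 1] from the matched Beta distribution.
    Assumption: Beta-distributed propensities (not stated in the text)."""
    alpha, beta = beta_params(mean, var)
    rng = random.Random(seed)  # fixed seed for reproducibility
    return [rng.betavariate(alpha, beta) for _ in range(n)]

alpha, beta = beta_params(0.25, 0.02)   # alpha = 2.09375, beta = 6.28125
propensities = draw_propensities(100)
```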

Results

Table 1 shows that the maximization of the response rate and the minimization of propensity variance are two incompatible fieldwork strategies. Response rate maximization implies the prioritization of the highest propensity cases (a strong positive correlation of 0.85), whereas pursuing representativeness implies the exact opposite (an almost perfect negative correlation of -0.97). Notice that the pursuit of representativeness and the pursuit of absence of bias are strategically very similar, as both strategies share the same strongly negative correlation between propensities and prioritization.

Table 1 Correlation between propensities and probabilities of being revisited (prioritization), according to different fieldwork strategies.

	Correlation between propensity and revisit probability
Maximization of $\bar{\rho}_{\text{final}}$	0.85
Minimization of $\text{S}^{2}_{\rho_{\text{final}}}$	-0.97
Minimization of $\text{S}_{\rho_{\text{final}}}/\bar{\rho}_{\text{final}}$	-0.97
Random (uninformed) fieldwork	Constant revisit probability

Response rate maximization can be termed “the path of least resistance” as it focuses almost exclusively on the least problematic cases. As such, one runs the risk of only generating more of the same type of respondents. Trying to reduce the nonresponse rate, therefore, offers no guarantee that the potential for nonresponse bias will decrease. Consequently, it may be even more interesting to also focus on the low propensity cases in order to obtain a more balanced respondent set, even if this means that the objective of a high response rate will not be achieved. Table 2 seems to confirm this idea.

Table 2 Quality indices of the obtained samples for four different simulated fieldwork strategies.

	Response rate	Risk of bias	Effective sample size
Maximization of $\bar{\rho}_{\text{final}}$	0.55	0.48	27.38
Minimization of $\text{S}^{2}_{\rho_{\text{final}}}$	0.44	0.09	44.17
Minimization of $\text{S}_{\rho_{\text{final}}}/\bar{\rho}_{\text{final}}$	0.45	0.09	44.33
Random (uninformed) fieldwork	0.51	0.26	46.95

Although response rate maximization yields the highest response rate (0.55), it performs far worse on the other two quality indicators than the bias minimization strategy: its risk of bias is much higher (0.48 versus 0.09) and its effective sample size is much lower (27.38 versus 44.33). Even the random allocation of renewed contact attempts leads to a lower risk of bias (0.26) than response rate maximization.

Conclusion

Particularly in a climate of declining response rates, survey researchers have become increasingly concerned with the potential threat of bias. Although it is still a dominant fieldwork strategy, response rate maximization may not be the best way to combat nonresponse bias. First, it appears from the simulations that response rate maximization and bias minimization are strategically hard to unify. Second, response rate maximization is surpassed by each of the other simulated strategies with respect to bias and statistical power.

But why then do survey researchers still insist on maximizing response rates? A first reason is the belief that low nonresponse rates correlate with lower levels of bias and large sample sizes (implying lower standard errors). Indeed, when the proportion of nonresponse is relatively low, the potential for bias to occur is rather limited. However, this does not necessarily mean that response rates should be an end in themselves. In this respect, response rate maximization is probably a good example of goal displacement. If avoiding bias is the main objective and low nonresponse rates are believed to restrict the potential for nonresponse bias, the fieldwork objective may shift toward the maximization of the response rate, losing sight of the initial objective, i.e., bias reduction. A second reason why response rates are so dominant is the ease with which they can be pursued, calculated, and reported. Focusing on the risk of bias is much more difficult; it usually requires a set of auxiliary variables in order to measure and reduce the differences between respondents and nonrespondents.

If the aim is low nonresponse bias, how should fieldwork be conducted? Some researchers have recently tried to identify and prioritize low propensity cases (e.g., Luiten and Wetzels 2010; Peytchev et al. 2010). Usually, auxiliary information (e.g., age, gender, or neighborhood characteristics) is used to approximate the response probability of individual sample members, allowing specifically targeted fieldwork efforts. These interventions include such strategies as incentivizing interviewers to recontact specific nonresponse profiles and alternating contact modes. Such innovative fieldwork operations are challenging since individual response propensities are hard to estimate based on a limited set of auxiliary variables. As these sets of variables leave much propensity variance obscured, one runs the risk of only making the respondent set representative with respect to the known auxiliary variables. Consequently, such a strategy may extend the path of least resistance beyond the levels of known response propensities, implying only a partial reduction of bias. Furthermore, as the managerial differences between rate maximization and bias minimization strategies seem to be so fundamental, many organizational fieldwork parameters may need to be drastically altered. These include, for example, the training, remuneration, and allocation of interviewers or the incentivizing of nonrespondents. Many of these organizational parameters, however, still seem to be deeply rooted in the tradition of response rate maximization.
