Program evaluators and survey researchers often do not have the budget to conduct surveys using multiple modes. Being limited to a single mode, such as phone or Web administration, may result in mode effects that could skew results, reduce generalizability, affect data quality, and lead to misleading conclusions.
Mode effects are the differences in responses to a survey or survey question attributable to the survey administration mode. Past research has found that survey mode can affect the demographics of survey respondents, specifically income and education (Datta, Walsh, and Terrell 2002). Further, a recent meta-analysis of survey characteristics and representativeness found that mixed-mode surveys and other-than-Web modes tend to be more representative (Cornesse and Bosnjak 2018).
Mode effects are composed of two components: differential non-observation and observation effects (de Leeuw 2018). Non-observation effects refer to who responds to different modes and observation effects refer to how respondents respond to questions presented via a given mode.
Non-observation effects may result in systematic underrepresentation when different types of respondents choose or are only able to complete the survey in a specific mode. As such, non-observation effects may result in a form of self-selection bias that can affect the sample frame if groups of individuals are unable or unwilling to provide one or more forms of contact information. For example, some members of the population of program participants may be unwilling to provide email contact information or telephone contact information. A survey performed using a single mode of contact in these cases may result in biased results if those willing to provide a form of contact information differ systematically from those who are not willing to provide that information.
In contrast, observation effects occur when the mode of administration affects the types of responses provided. For example, responses gathered orally may provide different information than would be collected in a written or online survey because the interviewer may provide clarifying information to the respondent or probe the respondent to clarify their response. In addition, some individuals may be more hesitant to share private information or to disclose information that they fear may result in being stigmatized when speaking with another person. For example, mode effects were demonstrated in a comparison of Web and telephone responses that found that Web respondents gave less differentiated answers to sets of questions with the same response options (Fricker et al. 2005). The same study also found a mode effect for survey response time. Specifically, Web respondents took longer to complete the survey than telephone respondents, although the authors noted the slower completion times mainly related to entering open-ended responses.
Survey administration mode can also affect patterns of providing non-substantive responses (i.e., refusing to answer or selecting “don’t know”) or item nonresponse (skipping questions) to certain questions. For example, one study found that item nonresponse differed significantly across modes, with telephone respondents having significantly lower item nonresponse than Web and mail respondents (Lesser, Newton, and Yang 2012). Another study found that an interviewer-led survey had lower item nonresponse than PC or Web-based survey designs (Lee et al. 2019). A more recent meta-analysis of item nonresponse in Web versus other modes found no statistically significant difference in the average item nonresponse rate in Web versus other survey modes (Čehovin, Bosnjak, and Lozar Manfreda 2022).
The study presented here aims to add to the existing literature on survey administration mode effects by exploring a real-world program evaluation issue, namely whether the type of contact information that members of the population provide relates to non-substantive responses (choosing “don’t know” or “prefer not to answer”).
We examine this issue by comparing responses given by respondents who provided email contact information to those who did not provide email contact information but who did provide a telephone number. In this analysis, we compare the portion of respondents in each group that provided non-substantive responses while controlling for demographic factors.
In August, November, and December 2020, we conducted an English-language survey of residential energy efficiency program participants in the Southern United States via phone and Web. Program participants were either sent an email to complete a Web-based survey or contacted via telephone to complete the survey by telephone. The mode of administration was based on the available contact information: participants with email addresses were contacted by email and those without were contacted by telephone. In both modes, the contact was provided an overview of the study and asked if they would be willing to provide their feedback.
The email survey was launched in two batches for quarter one and quarter two participants. We emailed an invitation and one reminder to 1,619 participants who had valid email addresses. This represented all participants with a valid email address. Quarter two email invitations were sent on August 3rd and 5th, with reminders on August 7th. Quarter three email invitations were sent November 9th, with reminders on November 12th.
When administering the survey by telephone, we called 912 of the 1,102 participants without email addresses in two waves; the first wave was between August 9 and September 1, and the second wave was between November 11 and December 1. A total of 190 participants were not invited to take the survey because of budget and time constraints. These contacts only had telephone numbers in the tracking data. The list had been randomized before calling and did not systematically exclude certain customers in any manner. We left one voicemail and made one follow-up call to potential phone respondents. The data did not contain information regarding respondent device type for email responses nor did it contain information regarding phone service type (cell or landline).
We received 337 Web and 165 telephone responses, for a 21% Web and 22% telephone response rate, respectively (see Table 1).
Phone and Web respondents received instruments designed to be as similar as possible. For all questions analyzed in this study, respondents were given “don’t know,” “prefer not to say,” or both response options. The most notable difference between phone and Web administration was that if a phone respondent answered a question before being prompted with response options, phone administration staff were not required to read all response options. The Web respondents received a prompt to answer the question if they left a question blank. (Questions were “soft-required.”)
The analysis for this study focused on 9 of the 13 demographic/home characteristics questions and up to 12 program indicator questions for non-substantive response. We excluded four home characteristics variables—home type (single-family detached, single-family attached, etc.); age of home; main space heating fuel; and main water heating fuel—as being unlikely to produce results that would be easily interpretable.
We operationalized the non-substantive response variable as a dichotomous variable: each respondent who provided one or more “don’t know” or “prefer not to answer” responses to a demographic or program indicator question was given a value of 1 for the non-substantive response variable, and each respondent who did not provide one or more of those responses was given a value of 0. In total, 12 program indicator questions included up to nine Likert scale questions and up to three yes/no questions. Table 2 summarizes additional information about each of these dependent variables, including number of response categories or levels, and the type of bivariate analysis performed to assess differences between phone and Web respondents.
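The coding rule above can be sketched in a few lines. This is a minimal illustration, not the authors' actual processing code: the response labels, the respondent IDs, and the function name `non_substantive_flag` are assumptions made for the example.

```python
# Labels treated as non-substantive. These strings are assumed for illustration;
# the survey's actual response codes are not given in the text.
NON_SUBSTANTIVE = {"don't know", "prefer not to answer", "prefer not to say"}


def non_substantive_flag(responses):
    """Return 1 if any answer is a non-substantive response, otherwise 0."""
    for answer in responses:
        # Normalize case, whitespace, and curly apostrophes before comparing.
        label = str(answer).strip().lower().replace("’", "'")
        if label in NON_SUBSTANTIVE:
            return 1
    return 0


# Hypothetical respondents and answers.
respondents = {
    "r1": ["Yes", "4", "Own"],
    "r2": ["Don't know", "5", "Rent"],
    "r3": ["No", "Prefer not to answer", "Don't know"],
}

flags = {rid: non_substantive_flag(answers) for rid, answers in respondents.items()}
print(flags)
```

A respondent with several non-substantive answers still receives a value of 1, which matches the dichotomous definition: the variable records whether any such response occurred, not how many.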
We used logistic regression analyses to assess the relationship of item nonresponse to survey mode. Demographic factors were included in the regression analysis to explore possible correlations with item nonresponse and to understand how mode may affect item nonresponse, while controlling for various background characteristics.
In addition to the outcomes previously discussed, we investigated survey response time. The survey collection tool recorded the start and submission date and time for each response. However, there are limitations to interpreting these times. For both email and phone responses, it appears responses were not always submitted promptly at survey completion (i.e., the survey-taker or survey administration staff may have failed to hit submit immediately after they had completed the survey). To address this, we removed 90 respondents’ response times that were determined to be outliers (not falling within the interquartile range). A total of 63 Web (19%) and 27 phone (16%) response times were removed. After this preliminary step, we found a median survey response time of 17 minutes 35 seconds and an average response time of 24 minutes 16 seconds. The average and median telephone survey response times were about 7 and 3 minutes longer than the Web surveys, respectively. The questions used in this study were located in the latter half of the survey, with the demographic questions being the final section.
Table 3 and Table 4 show demographic and home and household characteristics, respectively, by survey mode. To control for multiple comparisons, we divided the conventional level of statistical significance (p < .05) by the number of comparisons made (19) across the nine demographic/home characteristic variables (variables listed in Table 3 and Table 4, excluding home type, home age, space heating fuel type, and water heating fuel type), which produced a criterion significance level of p < .0026. For nominal level variables, we counted each level except prefer not to answer, and for ordinal variables, we counted the entire item as one comparison.
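Dividing the significance level by the number of comparisons is the Bonferroni adjustment. The arithmetic can be checked with a short sketch; the function names and the example p-values are hypothetical and are not taken from the study's tables.

```python
def bonferroni_threshold(alpha, n_comparisons):
    """Per-comparison significance threshold under a Bonferroni adjustment."""
    if n_comparisons < 1:
        raise ValueError("n_comparisons must be at least 1")
    return alpha / n_comparisons


def is_significant(p_value, threshold):
    """True when a p-value falls below the adjusted threshold."""
    return p_value < threshold


# Values from the text: alpha = .05 and 19 comparisons.
threshold = bonferroni_threshold(0.05, 19)
print(f"{threshold:.4f}")  # 0.0026

# Hypothetical p-values, for illustration only.
print(is_significant(0.001, threshold))  # True
print(is_significant(0.020, threshold))  # False
```

A difference with p = .02 would pass the conventional .05 cutoff but not the adjusted criterion, which is why some apparent differences between modes are reported as not reaching significance.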
Among the demographic variables, age, education, household size, home ownership, and two of the employment status variables had statistically significant relationships to survey mode.
The Web mode had a larger portion of respondents identifying as 44 years old or younger (36% vs. 22%), while the phone survey gathered a larger portion of responses from people aged 65 or older (45% vs. 26%).
Web survey respondents noted having completed more education than phone respondents: 58% of Web respondents said they had a four-year degree or higher compared to 41% of phone respondents.
A higher percentage of Web than phone survey respondents (47% vs. 30%) reported working at least 30 hours per week, while phone survey respondents were more likely than Web respondents to report being retired (48% vs. 28%). A larger portion of Web respondents identified as male (51% vs. 41%), but this difference did not reach the above criterion for statistical significance.
We did not find a statistically significant difference between Web and phone respondents’ reported household income, though a substantial portion of respondents to both modes preferred not to share this information (see Table 4).
Regarding race, the portion of respondents that identified as white or Caucasian was larger in the Web survey; however, the magnitude of the difference was small. We did not find a statistically significant difference between modes in the portion of respondents who identified as non-white.
Web respondents tended to own their homes at a higher rate than phone respondents, while phone respondents were less aware of their home size (see Table 4). Phone respondents reported living alone more than Web respondents.
We used regression analysis to examine whether response differences between phone and Web respondents reflect the demographic differences rather than mode effects. We regressed the non-substantive response variable on demographic characteristics and mode of administration using a series of hierarchical nested logistic regression models.
We included all demographic and home characteristics variables that had statistically significant bivariate relationships with survey mode in the regression analyses.
For these models, we dummy coded the nominal variables (i.e., education, sex, and employment). Age and household size were input into the model as ordinal variables.
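The two coding schemes can be illustrated with a small sketch. The category labels, the age bands, the reference category, and the helper names below are assumptions made for the example; the text does not report the actual categories or reference levels used in the models.

```python
# Hypothetical ordered age bands; the text names only the youngest and oldest groups.
AGE_ORDER = ["44 or younger", "45 to 64", "65 or older"]


def ordinal_code(value, ordered_levels):
    """Map an ordered category to its integer rank (0 = lowest level)."""
    return ordered_levels.index(value)


def dummy_code(values, reference):
    """Create one 0/1 indicator per level, omitting the reference level."""
    levels = sorted(set(values) - {reference})
    return [{f"is_{level}": int(v == level) for level in levels} for v in values]


# Hypothetical employment categories, with "other" as the assumed reference level.
employment = ["retired", "works_30_plus_hours", "other", "retired"]
employment_dummies = dummy_code(employment, reference="other")
age_codes = [ordinal_code(a, AGE_ORDER) for a in ["65 or older", "44 or younger"]]

print(employment_dummies[0])  # {'is_retired': 1, 'is_works_30_plus_hours': 0}
print(age_codes)              # [2, 0]
```

Omitting one reference level per nominal variable avoids perfect collinearity with the intercept, so each dummy coefficient is read as a contrast against that reference group.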
In Model 1, we regressed the non-substantive response variable on the variables that had significant differences between Web and phone respondents (employment status, age, education, and household size). We found the overall model was statistically significant; however, only employment significantly predicted non-substantive response.
Model 2 regressed non-substantive response on demographic factors and a dummy variable for the administration mode. We found that the mode variable was statistically significant (p < .002) in this model and therefore remains a predictor of item nonresponse, even while controlling for demographic factors.
Table 5 summarizes the model statistics. Both models were statistically significant overall. Pseudo-R² values indicate the model with the administration mode had a better overall fit than the model with only demographic independent variables.
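As a rough illustration of how nested logistic models can be compared, the sketch below computes McFadden's pseudo-R² and a likelihood-ratio test for one added parameter. Both choices are assumptions: the text does not say which pseudo-R² was used and does not report a likelihood-ratio test. The log-likelihood values are invented for the example and are not the values in Table 5.

```python
import math


def mcfadden_r2(ll_model, ll_null):
    """McFadden's pseudo-R²: 1 minus the ratio of model to null log-likelihood."""
    return 1.0 - ll_model / ll_null


def lr_test_1df(ll_restricted, ll_full):
    """Likelihood-ratio test for nested models that differ by one parameter.

    Returns the test statistic and its p-value. For a chi-square distribution
    with 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    """
    stat = max(2.0 * (ll_full - ll_restricted), 0.0)
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical log-likelihoods (NOT from the study).
ll_null = -300.0    # intercept-only model
ll_model1 = -285.0  # demographics only
ll_model2 = -280.0  # demographics plus the mode dummy

r2_model1 = mcfadden_r2(ll_model1, ll_null)
r2_model2 = mcfadden_r2(ll_model2, ll_null)
stat, p_value = lr_test_1df(ll_model1, ll_model2)

print(round(r2_model1, 3), round(r2_model2, 3))
print(round(stat, 2), round(p_value, 4))
```

With these invented values, the pseudo-R² rises when the mode dummy is added and the likelihood-ratio statistic is 10 on one degree of freedom. This is the kind of pattern that would support keeping mode in the model.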
This article explored mode effects on respondent characteristics, nonresponse/self-selection, and item nonresponse. We found that survey mode was related to non-substantive responding: phone respondents had higher non-substantive response rates, even when controlling for demographic and background factors.
This study’s limitations include potential unmeasured differences between the two samples, sample size, and the formulation of non-substantive response (combining don’t know and refused responses). Though the survey gathered demographic information and we sought to control for these factors through logistic regression, reviewers may identify additional factors that may influence the likelihood of item nonresponse that we did not investigate (for example, time of day or quality of phone connection). Similarly, a larger sample size could potentially increase the strength of the analysis, and future analyses could focus on distinguishing between the two types of non-substantive response, “don’t know” and “prefer not to say.”
Our findings contribute to a growing body of research on item nonresponse. Čehovin et al.'s meta-analysis found that item nonresponse rates in Web surveys were similar to those in other modes. Lesser et al. found that telephone respondents had significantly lower item nonresponse than Web and mail respondents, and Lee et al. found lower item nonresponse in an interviewer-led survey than in PC or Web-based designs. Some research has suggested that interactions between survey mode, demographics, and questionnaire characteristics contribute to variations in item nonresponse (Messer, Edwards, and Dillman 2012). Our study found that phone respondents had higher rates of non-substantive response, even while controlling for respondent characteristics.
Strategies for reducing item nonresponse may take the form of forcing a response, prompting a response if someone attempts to skip an item, or offering “don’t know” or “no response” options. The latter may be done in combination with forced or prompted responses. None of these is ideal. Forcing responses can increase drop-out rates and offering “no response” options can increase item nonresponse (Kmetty and Stefkovics 2021). Also, offering “don’t know” responses in online surveys can result in a higher rate of this response than in a telephone survey that allows but does not explicitly offer it (Zeglovits and Schwarzer 2014). Our study’s findings and other past research illuminate an area in need of continued research: whether and how survey administration mode relates to non-substantive responses, and the most appropriate methods for addressing this data quality issue.