Giving Respondents Voice? The Feasibility of Voice Input for Mobile Web Surveys

Melanie Revilla; Mick P. Couper; Carlos Ochoa

doi:10.29115/SP-2018-0007

Introduction

More and more respondents are completing Web surveys using mobile devices, mainly tablets and smartphones. Previous research has investigated the impact of the device used on the comparability and quality of the data obtained from open narrative questions. Several studies have found that nonresponse to open questions is higher on mobile devices (Lambert and Miller 2015; Mavletova 2013), completion times for such questions are longer (Mavletova 2013; Revilla and Ochoa 2016), and the length of open-ended responses is shorter (Lambert and Miller 2015; Lugtig and Toepoel 2016; Peterson et al. 2013; Struminskaya, Weyandt, and Bosnjak 2015; Wells, Bailey, and Link 2014). However, Buskirk and Andrus (2014) found no significant differences in length by device, and Antoun, Couper, and Conrad (2017) found that smartphone respondents provided longer answers to open questions.

Based on the evidence suggesting potential problems with open questions on mobile devices, Revilla and Ochoa (2016) suggested investigating the potential use of voice input functions to make it easier and quicker for respondents to answer open narrative questions on mobile devices. The goal of this paper is to study (1) how often people use voice input on their mobile devices in everyday life, (2) to what extent they would be willing to use voice input to answer open questions in Web surveys, and (3) which factors are associated with willingness to use voice input. This area is largely understudied.

A search for data on the extent of general voice input use yielded only three studies. Parks Associates (2016) reported that 39% of smartphone owners use some sort of voice recognition software such as Siri or Google Now. A Google study (Callaham 2014) revealed that 59% of teens and 41% of adults reported using voice search more than once a day (see also Google 2014). Bajarin (2016) reported that 65% of a consumer panel said that they had used Siri, Google’s “OK Google or voice search”, or Microsoft’s Cortana. None of these studies provides sufficient detail to evaluate these estimates. Voice input is also increasingly used in some instant messaging apps (e.g., WhatsApp) instead of typing, but again data are hard to find.

We are aware of no studies that have explored the issue of voice input in the survey context. However, Cape (2015) tested the feasibility of video input, finding that about half of Survey Sampling International panelists were able to record a video to respond to an open question, and about half of those actually did so when asked. Schober et al. (2015) examined voice input (relative to text input) in a survey experiment among iPhone users. They reported that voice users provided better quality data (rounded less, straight-lined less, and reported more sensitive information). However, respondents expressed a stronger preference for text input than for voice input for answering mostly closed-ended questions.

Given the rise in voice input technology and the increased use of smartphones, the potential use of voice input for answering open questions on a mobile device seems an area ripe for exploration.

Method and Data

We used data from a survey implemented in Spain in September–October 2016 within the Netquest panel, an opt-in online panel (www.netquest.com). The target sample was restricted to panelists who had Internet access through both PC and smartphone. Cross quotas for age and gender were used to ensure that the distributions of these variables in the sample were similar to the ones observed in the full panel.

The survey contained a maximum of 69 questions. Answers were not mandatory. The full survey (in Spanish) can be found at http://ww2.netquest.com/respondent/glinn/mobile2016. A total of 1,476 respondents (i.e., 48.4% of those who started; 90.9% of those who answered the first main survey question) completed the survey and formed the focus of our analyses. We focused on two key questions:

VI1: How often do you use voice input on a smartphone or tablet? (“1-Never” to “5-Always”)
VI2: If it were possible to use the voice input option to answer open questions in the surveys we send you, would you use it? (“Definitely yes”, “Probably yes”, “Probably no”, and “Definitely no”).

We looked at the answers to these two questions and at how these are related to each other and to other variables described in the following section. Given the nonprobability nature of the sample, all analyses used unweighted estimates.

Hypotheses

We expect people who already use voice input in their everyday life to be more interested in also using it in surveys. Furthermore, we expect elderly people, less educated people, people using the Internet on a smartphone less frequently, and people who do not have Spanish as a mother tongue to have more difficulty typing their answers to open questions on mobile devices, and thus to show more interest in the possibility of using the voice input option. For similar reasons, we expect respondents who reported that the current survey was difficult to answer and those in the highest quartile of estimated completion time (i.e., over 15 minutes) to show more interest in the possibility of using the voice input option. In addition, if respondents are answering from home and if they are alone while answering the survey, using voice input might be easier than at work or in the presence of others. We thus expect answering from home and being alone while answering to have a positive impact on the stated interest in using voice input to answer open questions.

Results

Frequency of Use of Voice Input on Mobile Devices

First, we look at how often respondents use voice input on a smartphone or tablet. Table 1 shows the distribution of the answers to this question (VI1).

Table 1 Frequency of use of voice input.

How often do you use voice input	% (N=1,469)
1. Never	49.1
2	19.5
3	17.3
4	5.7
5. Daily	8.4

Around half of the sample report never using voice input. Fewer than one in ten respondents (8.4%) report daily use of voice input. Still, this suggests that voice input is already being used at least to some extent on mobile devices.

Stated Use of Voice Input to Answer Open Questions

Second, we consider the willingness to use voice input to answer open questions in surveys (VI2). Table 2 shows the distribution of responses.

Table 2 Willingness to use voice input to answer open questions.

Would you use voice input for open questions	% (N=1,469)
Definitely yes	12.7
Probably yes	41.3
Probably no	33.9
Definitely no	12.0

Overall, the distribution is quite balanced, with a small shift to the positive side: a majority (54.1%) of respondents said they definitely or probably would use voice input to answer open questions if it were possible.

Relationships With Other Variables

To what extent is willingness to use voice input to answer open questions related to current use of voice input and to other variables described above? We first examine bivariate relationships using a series of chi-square tests and then use a multivariable regression.

Bivariate Analyses

Table 3 presents the row percentages and significance tests for willingness to use voice input. We collapsed the dependent variable (VI2) into two categories (combining definitely and probably yes into “would use” and definitely and probably no into “would not use”). Similarly, we collapsed current use of voice input (VI1) into never use (1) versus use (2-5).
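The collapsing described above can be sketched as follows. This is only an illustration: the numeric codes are assumptions (the paper does not state how the response options were stored), and the function names `collapse_vi1` and `collapse_vi2` are hypothetical.

```python
# Assumed numeric codes (not given in the paper):
#   VI1: 1 = "Never", higher codes = more frequent use (up to 5)
#   VI2: 1 = "Definitely yes", 2 = "Probably yes",
#        3 = "Probably no",   4 = "Definitely no"

def collapse_vi1(code):
    """Collapse current voice input use: code 1 is 'Never use', any higher code is 'Use'."""
    if code not in (1, 2, 3, 4, 5):
        raise ValueError(f"unexpected VI1 code: {code!r}")
    return "Never use" if code == 1 else "Use"


def collapse_vi2(code):
    """Collapse willingness: both 'yes' options versus both 'no' options."""
    if code not in (1, 2, 3, 4):
        raise ValueError(f"unexpected VI2 code: {code!r}")
    return "Would use" if code <= 2 else "Would not use"


# Hypothetical respondents as (VI1, VI2) pairs, for illustration only.
for vi1, vi2 in [(1, 3), (4, 1), (2, 2)]:
    print(collapse_vi1(vi1), "|", collapse_vi2(vi2))
```

Respondents with a missing answer to either question would have to be dropped before this step, which is consistent with the n values in Table 3 being slightly below the number of completed surveys.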

Table 3 Percent willing to use voice input (VI) in surveys, by selected characteristics and chi-square tests.

Variables	Values	Would use VI (%)	n	chi-square	P-value
Use voice input	Never use	40.8	716	96.538	.000
	Use	66.4	747
Gender	Women	53.7	844	1.111	.292
	Men	56.6	541
Age	18-24	54.0	285	4.236	.375
	25-34	51.2	402
	35-44	53.1	426
	45-54	58.5	241
	55+	58.4	113
Education	Secondary or less	58.7	385	4.404	.036
	More than secondary	52.5	1,082
Internet use on Smartphone	Daily	54.8	1,345	4.315	.038
	Less often	45.1	122
Mother tongue Spanish	No	44.7	103	4.033	.045
	Yes	54.9	1,361
Answer survey from home	No	52.8	415	0.808	.369
	Yes	55.4	1,033
Alone while answering survey	No	56.3	513	1.105	.293
	Yes	53.5	939
Difficulty answering the survey	No	55.0	1,396	8.069	.005
	Yes	37.3	67
Completion time	<=15 minutes	53.7	1,107	2.590	.107
	>15 minutes	58.8	313
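As an illustration of the tests in Table 3, the sketch below recomputes the chi-square statistic for the first row (current use of voice input). The cell counts are not published; they are reconstructed here by rounding percentage times n, which is an assumption, and the test is assumed to be a Pearson chi-square without continuity correction. The function names are hypothetical.

```python
import math


def pearson_chi2_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


def p_value_df1(chi2):
    """Upper-tail p-value of a chi-square distribution with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))


# Row sizes and percentages taken from Table 3; counts are reconstructed (assumption).
never_n, use_n = 716, 747
never_yes = round(0.408 * never_n)  # willing among those who never use voice input
use_yes = round(0.664 * use_n)      # willing among those who use voice input

chi2 = pearson_chi2_2x2(never_yes, never_n - never_yes,
                        use_yes, use_n - use_yes)
p = p_value_df1(chi2)
print(f"chi-square = {chi2:.3f}, p = {p:.2e}")
```

Under these assumptions the statistic comes out at about 96.54, in line with the value reported in the table, and the p-value is far below .001.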

We find significant associations with willingness to use voice input for several variables. Specifically, those who currently use voice input on their smartphones are more willing to do so for open questions in surveys. Willingness is higher for those with lower education, frequent smartphone users, native Spanish speakers, and those who did not report difficulty answering the current survey. We expected that those who may benefit more from voice input (i.e., those who use the Internet on a smartphone less frequently, those who do not have Spanish as a mother tongue, and those who had more difficulty answering the survey) would show more interest in the possibility of using voice input, but we found the opposite. This might be because these people do not see the voice input option as a way to help them answer more easily (as we expected) but instead as a new tool that may make it even more difficult for them to participate.

Regression Analyses

Finally, we move from bivariate to multivariate analyses and run a regression with VI2 as the dependent variable. We use the original 4-category variable, reverse-coded so that a high score (and thus a positive coefficient) means greater willingness.
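Reverse-coding a 4-point item amounts to mirroring each score around the midpoint of the scale. A minimal sketch, assuming the original codes run from 1 (most willing) to 4 (least willing); the function name `reverse_code` is hypothetical:

```python
def reverse_code(value, low=1, high=4):
    """Mirror a score on a low..high scale, so that low becomes high and vice versa."""
    if not low <= value <= high:
        raise ValueError(f"value {value!r} is outside the {low}-{high} scale")
    return low + high - value


# Under the assumed coding, 1 (most willing) becomes 4 and 4 (least willing) becomes 1.
print([reverse_code(v) for v in (1, 2, 3, 4)])
```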

We use the same variables as in Table 3 as independent variables, except that we consider the original variables instead of the categorical variables created for Table 3 in the case of VI1 (1= “Never”, 5= “Daily”), frequency of Internet use on a smartphone (1= “Less than once a month”, 6= “Daily”) and difficulty of answering the survey (1= “Very easy”, 4= “Very difficult”). Table 4 shows the coefficients and p-values for this linear regression analysis.

Table 4 Regression analysis with VI2 (1=“Definitely not”, 4=“Definitely yes”) as the dependent variable.

Independent variables	Coefficients	P-values
Use voice input (VI1)	.219	.000
Men	.040	.381
Age	.038	.047
Education	-.032	.191
Freq. Int. use on Smartphone	.093	.005
Mother tongue Spanish	.097	.264
Answer survey from home	.090	.071
Alone while answering survey	-.018	.698
Difficult to answer the survey	-.209	.000
Long estimated completion times	.090	.092
Constant	1.777	.000

R²=.143; Adjusted R²=.136; N=1,309
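The two fit statistics reported under Table 4 can be related through the standard adjusted R² formula, 1 − (1 − R²)(n − 1)/(n − k − 1). The sketch below is only a consistency check, assuming k = 10 predictors (the rows of Table 4 excluding the constant); the function name is hypothetical.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R-squared for a linear model with n cases and k predictors (plus intercept)."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


# Values from Table 4; k = 10 is counted from the table rows (assumption).
adj = adjusted_r2(r2=0.143, n=1309, k=10)
print(round(adj, 3))
```

With these inputs the result rounds to .136, matching the reported adjusted R².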

When controlling for the effect of other variables in a multivariate analysis, several of the relationships found in the bivariate analyses (i.e., the use of voice input in everyday life, the frequency of Internet use on a smartphone, and the perceived difficulty of answering the current survey) remain significant. On the other hand, two variables (education and Spanish as a mother tongue) are no longer significant, and age shows a marginally significant positive relationship with willingness to use voice input for answering open questions. We also ran a logistic regression (1=willing, 0=not willing) and generally found similar results. Education is statistically significant (p=.023) in the logistic model, and age falls just short of significance (p=.055), but the signs of the coefficients for all predictors remain the same.

Discussion

Overall, we are interested in the possibility of using a voice input option to make it easier for respondents to answer open narrative questions in Web surveys. We used data from a Web survey implemented in an online opt-in panel in Spain in 2016, in which respondents were asked about their current use of voice input in everyday life and about their willingness to use voice input in a survey to answer open questions. All respondents had Internet access on smartphones.

We found that around half of the sample report using voice input on a mobile device (smartphone or tablet) at least sometimes, and a little more than half (54.1%) report being willing to use this option to answer open questions if possible. We found three variables that have a robust relationship with willingness to use voice input for answering open questions in surveys: the use of voice input in everyday life (the greater the use, the greater the willingness); the frequency of Internet access through a smartphone (the greater the frequency, the greater the willingness); and the perceived difficulty of answering the current survey (the greater the difficulty, the lower the willingness).

An important limitation of this study is that we only considered what respondents said they would do and had no information about actual use of voice input in a survey if allowed (and/or encouraged) to do so. Even though previous research has shown that stated willingness is a useful measure in its own right, especially if the goal is to examine reasons for and covariates of (un)willingness (Couper and Singer 2013; Couper et al. 2008), we expect that actual compliance rates may be lower than those based on expressed willingness. Given that we surveyed members of an opt-in panel who had Internet access on their smartphones, these estimates of voice input use and willingness are likely to be high relative to the general population.

More research is needed to explore the likely use of voice input in surveys, but also to explore how the use of voice input could affect survey participation, data quality, and respondent satisfaction with the survey. Further research is also needed to explore the potential technical and respondent-related barriers to the use of voice input and to replicate these findings in other settings. The next step is to test whether smartphone users actually use voice input to answer open questions in a survey when given the option and encouraged to do so.