Introduction
High retention rates in longitudinal survey research are key to ensuring study success. Sample retention helps ensure sufficient statistical power and variability to detect study effects. These issues are of particular concern when assessing outcome behaviors with relatively low incidence, such as tobacco use among youth. Strong retention also minimizes bias due to differential attrition. Such bias is common in studies that examine risk behaviors, as those with higher levels of risk are more likely to be lost to follow-up (Morrison et al. 1997). Retaining study participants over time can be especially difficult with teen and young adult populations (Price et al. 2016). Obtaining parental consent for respondents younger than 18 and the increasing mobility and life course changes of young adulthood pose significant challenges for long-term studies (Boys et al. 2003; Hanna, Scott, and Schmidt 2014; Seibold-Simpson and Morrison-Beedy 2010).
A variety of monetary incentives and communication strategies have been described or tested for retaining youth and young adult participants in longitudinal studies. Monetary payments are commonly used in studies with young people (Borzekowski et al. 2003; Capaldi and Patterson 1987; Gregory, Lohr, and Gilchrist 1992; Morrison et al. 1997; Seed, Juarez, and Alnatour 2009) and have been found to be effective in improving responses to mailed (Collins et al. 2000; Edwards et al. 2009) and electronic (Brueton et al. 2013; Millar and Dillman 2011) questionnaires. Offering the opportunity to win a prize, through a lottery or sweepstakes drawing, has also been tested to retain young study participants, but research on the effectiveness of this strategy has been mixed (Boys et al. 2003; Henderson et al. 2010; Stiglbauer, Gnambs, and Gamsjäger 2011). Contact with participants through mail, email, or phone is another common strategy used to increase response and retention among youth and young adults (Boys et al. 2003; Cotter et al. 2002; Davis et al. 2016; Faden et al. 2004; Hanna, Scott, and Schmidt 2014; Koo and Rohan 1996; Morrison et al. 1997; Pirie et al. 1989; Scott 2004; Seibold-Simpson and Morrison-Beedy 2010; Vincent et al. 2012). These communications can function to enhance the legitimacy of the study and to increase trust in study researchers (Cotter et al. 2002; Millar and Dillman 2011).
Many longitudinal studies use a combination of methods to successfully retain young participants (Boys et al. 2003; Capaldi and Patterson 1987; Gregory, Lohr, and Gilchrist 1992; Morrison et al. 1997; Seed, Juarez, and Alnatour 2009; Vincent et al. 2012). A systematic review of retention efforts in cohort studies identified several youth and young adult studies that were successful at maintaining adequate retention rates using combinations of multiple methods, including varying incentive amounts and communications by mail, phone, and email (Booker, Harding, and Benzeval 2011). The use of multiple methods may be valuable as the effectiveness of each method may vary by participant characteristics: while a personalized mailing may be sufficient motivation for one person to participate, the addition of a cash incentive may be necessary to recruit another person (Boys et al. 2003).
Despite the well-documented success of implementing a variety of approaches to improve participant response and retention, few studies with younger populations have used an experimental design to determine the relative efficacy of different strategies. In addition, few of these retention efforts have been tested in Web-based studies with young people, with some exceptions. Stiglbauer, Gnambs, and Gamsjäger (2011) found that lottery incentives improved retention rates in a longitudinal Web-based sample of adolescents, but only when trust in anonymity was low. Millar and Dillman (2011) found that email and Web contacts combined with monetary incentives improved response rates to a Web-based survey of young adults. Web-based data collection has the advantages of lower cost and greater acceptability, particularly among young people who are familiar with the technology. However, some research suggests response rates to Web surveys may be lower than to mailed surveys. More evidence is needed on approaches to improve retention over time in Web-based surveys of youth and young adults.
The focus of this study was to examine the efficacy of strategies to enhance retention in a probability-based national online panel of youth and young adults aged 15-21, recruited primarily via address-based sampling (ABS) and followed over time. The sample, known as the Truth Longitudinal Cohort (TLC), was designed to evaluate the effectiveness of the truth® campaign, a nationally branded mass media effort focused on youth and young adult tobacco use prevention. More specifically, this study examines the efficacy of varying incentives and communication strategies to improve survey completion rates at six-month follow-up for this three-year longitudinal study. At the time of the study, this sample was, to our knowledge, the first national ABS sample of youth and young adults surveyed solely online over time. Given the novelty of the panel and the somewhat mixed literature on incentives for online surveys and communication methods, we developed the following set of research questions:
Research Question 1: Is there an increase in survey response at six-month follow-up with higher monetary incentives?
Research Question 2: Is there a difference in survey response at six-month follow-up with different methods of communication contact?
Research Question 3: Does the impact on survey response of different communication methods depend on the level of monetary incentive offered at six-month follow-up?
Methods
Study Background and Sample
In 2014, a probability-based sample of youth and young adults was recruited for the TLC study via direct mail. Additional details regarding participant recruitment are available elsewhere (Cantrell et al. 2017). Briefly, potential participants were directed to an online screener where they provided the ages of all members of the household. If multiple 15-21-year-olds resided in a household, one was randomly selected. If the selected participant was under age 18, a parent or guardian provided consent. The baseline and all subsequent surveys were conducted online every six months for three years. The panel was recruited primarily via ABS (n=10,257), with a subsample recruited through random digit dial (RDD) (n=1,966), for a total baseline sample of 12,223 respondents. Six months after recruitment, follow-up survey requests were delivered to the 12,223 baseline participants. These requests included, in chronological order: an initial email invitation; three reminder emails; a postcard mailing; an additional email; a text message (for respondents who provided cellphone numbers); alternative contact reminders (i.e., additional email or other contact); and a $10 incentive increase, raising the original $10 incentive to $20. The response rate for this follow-up reached 63% (n=7,756) after approximately nine weeks and then plateaued. The 4,467 participants who did not respond to these requests were included in a randomized experiment to identify strategies for improving the response rate.
Study Design
The study was a 3 x 6 factorial design with three incentive conditions ($30, $40, $50) and six communication conditions (for phone scripts and email/postcard communications, see supplemental materials). The monetary incentive levels were devised based on the study budget and what had been previously offered (i.e., an additional $20). The communication conditions included the following varying strategies:
- Email prompt
- Email prompt with sweepstakes offer ($500 prize)
- Interactive Voice Response (IVR) reminder call
- Live phone prompt
- Direct mail postcard
- All of the above
The 4,467 nonrespondents were randomly assigned to one of 18 experimental conditions as shown in Table 1. The study fielded from March 20, 2015, until April 15, 2015.
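The 3 x 6 factorial assignment described above can be sketched in a few lines of Python. This is a hypothetical illustration only (the condition labels, block-randomization scheme, and seed are assumptions; the study's actual assignment procedure and software are not described here):

```python
import random
from itertools import product

INCENTIVES = [30, 40, 50]
COMMUNICATIONS = [
    "email", "email_sweepstakes", "ivr", "live_phone", "postcard", "all",
]

# The 18 cells of the 3 x 6 factorial design.
CONDITIONS = list(product(INCENTIVES, COMMUNICATIONS))

def assign_conditions(participant_ids, seed=42):
    """Randomly assign each nonrespondent to one of the 18 cells,
    keeping cell sizes as equal as possible (block randomization)."""
    rng = random.Random(seed)
    ids = list(participant_ids)
    rng.shuffle(ids)
    # Cycle through the 18 conditions so each cell gets ~n/18 participants.
    return {pid: CONDITIONS[i % len(CONDITIONS)] for i, pid in enumerate(ids)}

# 4,467 nonrespondents split across 18 cells (~248 per cell).
assignments = assign_conditions(range(4467))
```

Shuffling before cycling through the cells yields near-equal cell sizes while keeping the assignment random with respect to participant characteristics.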
Measures
The outcome variable was survey response status (no response vs. completed survey). The independent variables included a factor variable for each condition. We also collected demographic variables on gender; age group (15-17 vs. 18-21); race/ethnicity; region (Northeast, South, Midwest, West); and smoking status (never smoked, ever smoked but not in the past 30 days, and smoked in the past 30 days).
Data Analysis
Descriptive analyses examined the percentage of respondents who completed the survey by experimental condition. Among the 4,467 respondents assigned to a treatment condition, 1.4% (n=64) responded but did not complete the survey. These respondents were excluded from the analyses, yielding a final analytic sample of 4,403. All data analysis was conducted using Stata version 14.2 (StataCorp 2015).
We conducted chi-square tests to examine differences in basic demographic variables across conditions as a check to ensure random assignment was successful. With response status as the outcome, we ran logistic regression models predicting the odds of survey completion for each condition within each factor (i.e., incentive or communication) compared with a reference condition. In follow-up analyses, we also examined all additional pairwise comparisons between conditions within each factor. We adjusted all comparisons with the Bonferroni correction (Bland and Altman 1995). Next, we examined the interaction of the two main effects to determine whether the effect of the communication enhancement depended on the level of the monetary incentive.
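For concreteness, the Bonferroni adjustment for the pairwise comparisons within each factor amounts to the following arithmetic (an illustrative Python sketch; the study's models were fit in Stata):

```python
from math import comb

def bonferroni_alpha(n_levels, alpha=0.05):
    """Adjusted per-comparison alpha for all pairwise comparisons
    among n_levels conditions within one factor."""
    n_comparisons = comb(n_levels, 2)
    return n_comparisons, alpha / n_comparisons

# Incentive factor: 3 levels -> 3 pairwise comparisons.
k_inc, alpha_inc = bonferroni_alpha(3)   # alpha of 0.05/3 per comparison
# Communication factor: 6 levels -> 15 pairwise comparisons.
k_com, alpha_com = bonferroni_alpha(6)   # alpha of 0.05/15 per comparison

# Corresponding Bonferroni-adjusted confidence levels for odds-ratio CIs.
ci_inc = 1 - alpha_inc
ci_com = 1 - alpha_com
```

The adjusted confidence intervals reported alongside the unadjusted ones are simply intervals computed at these stricter confidence levels.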
Results
Table 2 shows unweighted demographics across the sample. A majority of the sample was male, aged 18-21, and white. A total of 10.3% were African American, and 11.1% were Hispanic. A larger proportion were from the South and Midwest compared with the Northeast and West, and 67% were never smokers. Results of the chi-square analysis found no significant differences in demographics across conditions.
Table 3 shows survey completion rates for the 4,403 eligible study participants overall and for each group within the incentive and communication conditions. Overall, 15.9% (n=699) of the 4,403 participants assigned to the experiment completed the survey. Across the six communication conditions, completion rates ranged from a low of 6.3% (IVR reminder call only) to a high of 33.4% (all communication enhancements); across the dollar incentive conditions, they ranged from 14.1% ($30) to 16.8% ($40).
Table 4 presents odds ratios from the logistic regression models comparing the incentive and communication conditions, with confidence intervals both unadjusted and adjusted with a Bonferroni correction. Respondents who received the $40 and $50 incentives had significantly higher odds of completing the survey than those who received $30 (ORs of 1.25 and 1.23, respectively). However, after adjusting for multiple comparisons, no pairwise comparison among the incentive conditions remained significant.
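For readers less familiar with odds ratios, an OR can be approximated directly from the reported group completion rates; the result differs slightly from the model-based estimates in Table 4 because the published percentages are rounded (an illustrative Python sketch, not the study's actual computation):

```python
def odds(p):
    """Convert a proportion to odds."""
    return p / (1 - p)

def odds_ratio(p_treatment, p_reference):
    """Ratio of the odds of completion in two groups."""
    return odds(p_treatment) / odds(p_reference)

# Reported completion rates: 16.8% ($40 condition) vs 14.1% ($30 condition).
or_40_vs_30 = odds_ratio(0.168, 0.141)  # ~1.23 from the rounded rates
```

Because completion rates here are well below 50%, the odds ratios are only modestly larger than the corresponding ratios of completion rates.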
For the communication conditions, models indicated that compared with communicating via email alone, there were no significant differences in completion when adding a sweepstakes offer or replacing the email with a live phone prompt. Respondents who received an IVR call were approximately half as likely to complete the survey compared with those who received the email, while those who received a mailed postcard were over twice as likely to respond compared with email only. Those who received all five prompts were over four times as likely to complete the survey compared with those who received the email alone. These differences in IVR, the postcard and the multiple prompt groups compared with email alone were all statistically significant both before and after adjusting for multiple comparisons.
Follow-up analyses of each pairwise communication comparison demonstrated similar patterns. After adjusting for multiple comparisons, respondents in the IVR condition were significantly less likely to respond than those in any other communication condition; respondents receiving all five prompts were significantly more likely to respond than those in any other condition; and respondents receiving the postcard alone were significantly more likely to respond than those in any other single-method condition (data not shown).
A model examining the interaction of each factor found no significant interactive effects (data not shown).
Conclusions
This study assessed the influence of various incentive and communication strategies on survey completion among a sample of youth and young adults. Results indicated that using multiple methods of communication to prompt survey response was more efficacious than any single method. However, when using single methods of communication to prompt response, a mailed postcard prompt was more efficacious than any other method tested, including email, sweepstakes, IVR phone contact, and live phone contact. IVR contacts were consistently the least efficacious method compared with all other contact methods. In addition, there were no substantive improvements in survey response with increases in monetary incentives. Furthermore, there were no interaction effects between the dollar incentive and communication conditions with respect to survey response.
These findings may be valuable for survey researchers working with youth and young adult populations in the online environment to help improve sample retention among this challenging group. Consistent with some prior research among adults, higher monetary incentives were not effective in improving online survey response. In terms of communication methods, the high efficacy of multiple contacts may reflect different methods working for different participants, as suggested by previous research (Boys et al. 2003), or simply the fact that these respondents received more contacts than the single-method groups. Adding a sweepstakes offer did not stand out as especially efficacious, while IVR was a particularly poor method of contact and may have reduced perceived study legitimacy (Boys et al. 2003). The single postcard contact was the most efficacious single method and may have functioned to increase study legitimacy (Boys et al. 2003; Hanna, Scott, and Schmidt 2014; Millar and Dillman 2011). Although using multiple methods of communication is likely to prompt the highest response, multiple contacts are more labor-intensive and costly than single contacts, even if the low-efficacy IVR method is dropped from the combination. Given the tradeoff between achieving higher response rates and limiting costs, researchers may consider using a single communication contact to improve survey response. In this case, the postcard is the option with the highest efficacy and lowest cost. Furthermore, given no significant differences across monetary incentive levels, the lowest incentive combined with the postcard would be the most cost-effective option.
It is important to note that the study sample only included respondents who did not respond to the initial survey invitations, follow-up contacts, and an additional incentive up to $20. Therefore, the results of this study may be applicable primarily to resistant or later-stage respondents. We also did not examine nonresponse bias. Additional research is needed to evaluate attrition in longitudinal online panels and test strategies for reducing both nonresponse and differential nonresponse. Such approaches may include targeted incentives tailored to specific populations and individuals with a lower propensity to respond, targeted personalized communication contacts, and prepaid incentives. Research on the most effective monetary incentive levels for this age group and for earlier stage respondents would also be valuable, especially for studies with lower levels of survey compensation.
Given high rates of Internet use and comfort with digital communication (Perrin and Duggan 2015), Web-based data collection can be a valuable mode for surveying younger populations. Compared with older adults, youth and young adults are more likely to respond to an Internet survey than a mail survey (Kaplowitz, Hadlock, and Levine 2004; Shih and Fan 2008). Yet little research has examined how best to recruit and retain this population in Web-based research, particularly for probability-based samples. Retaining younger populations in online and offline longitudinal surveys presents challenges, and more research is needed to understand how to leverage the strengths of the Internet, in addition to other methods, to maintain robust study samples over time.
Acknowledgments
The authors would like to thank Alexandria Smith and Larry Osborne for their contributions to the implementation of this study.