Introduction
The proportion of the U.S. population with Internet access has risen dramatically over the past several decades. In 1997, only 18% of U.S. households used the Internet at home. By the 2013 American Community Survey, nearly four out of five U.S. adults (78.5%) reported having a computer with an Internet connection (File and Ryan 2014). Not surprisingly, the growth in Internet access has generated widespread adoption of Internet surveys in market and public opinion research. Current estimates are that more than 50% of global research revenues are generated by online surveys (ESOMAR 2013). The vast majority of online population surveys are conducted as non-probability surveys of existing online panels or as one-time surveys using river sampling of Internet users from selected sites (Baker et al. 2010).
The American Association for Public Opinion Research (AAPOR) Report on Online Panels summarized research published between 2000 and 2009 showing that online panels were disproportionately composed of whites, more active Internet users, and those with higher educational attainment (Baker et al. 2010). These findings, however, are based on the characteristics of respondents to particular online surveys compared to the general population. The differences may arise from coverage error associated with online access, biases in recruitment for individual panels, or self-selection of participants into particular surveys. Coverage error associated with online access is relatively small and declining. Nonresponse bias in panel surveys is readily calculated and corrected by comparing respondent and nonrespondent characteristics from the panel profile. Consequently, the most serious errors are likely due to differences between the total population and the population who participate in online panels.
Differences between panels have recently been discussed by Keeter et al. (2016), who highlight the variability across panel vendors, and by Yeager et al. (2011), who assess the accuracy of online panels against known benchmarks; Iachan et al. (2016) extend this work to variances. However, because online panels frequently partner with other panels to generate samples that are larger, more diverse, or more targeted than what is available within a single panel, the population who participate in any online panel is more critical to evaluating coverage error for this form of survey than the errors of individual panels.
The Current Population Survey (CPS) provides annual estimates of households with Internet access, but not estimates of the population who participate in online panels. Indeed, we have found no published study that estimates the size and characteristics of the online panel universe. The only previous estimate comes from an AAPOR conference presentation, which found that 9% of a national landline random digit dial (RDD) sample in 2009 reported participating in an Internet survey panel (Boyle 2010).
The purpose of this paper is to investigate the current coverage of Internet panels among American adults. Rather than look at respondents to any one online panel or survey, we consider the population who participate in any online panel survey.
Methods
In order to explore online panel participation in the United States, we added a set of questions about online panel participation to the demographic section of a national dual-frame RDD survey, conducted in English and Spanish from April to June 2015. The survey included 503 adults whose interviews included the online panel questions. The computer-assisted telephone interview (CATI) averaged 31.1 minutes for landlines and 33.1 minutes for cell phones in the general population sample. The response rate, using AAPOR Response Rate 3 (RR3), was 27.2% for landlines and 17.6% for cell phones. At the end of the interview, all respondents were asked, “Do you have Internet access at home, at work or on a smart phone?” Those who reported that they had Internet access from any of these sources were asked, “Are you a member of an online or internet panel for which you receive invitations to complete surveys?” Respondents who identified themselves as members of such panels were subsequently asked, “As a panelist, how often are you contacted to participate in Internet or online surveys?” and “As a panelist, how often do you participate in Internet or online surveys, either in response to invitations or by visiting the survey website?” with precoded response categories. The completed sample was weighted to population parameters by race and Hispanic ethnicity, census region by housing tenure, age by educational attainment, sex by marital status, and sex by age group.
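To make the weighting step concrete, the sketch below shows raking (iterative proportional fitting), a standard way of adjusting a sample to marginal population targets of the kind listed above. This is a minimal illustration, not the survey’s actual weighting program: the dimension names, categories, and target proportions are hypothetical, and a cross-classified target such as sex by age group can be handled by treating the cross as a single raking dimension.

```python
# Minimal raking (iterative proportional fitting) sketch.
# All variable names and target proportions are hypothetical.
import numpy as np

def rake(records, dims, targets, max_iter=100, tol=1e-10):
    """Return weights whose margins on `dims` match `targets`.

    records: list of dicts, one per respondent
    targets: {dim: {category: population proportion}}
    """
    w = np.ones(len(records))
    for _ in range(max_iter):
        worst = 0.0
        for dim in dims:
            for cat, target in targets[dim].items():
                mask = np.array([r[dim] == cat for r in records])
                share = w[mask].sum() / w.sum()   # current weighted margin
                if share > 0:
                    w[mask] *= target / share     # scale toward the target
                    worst = max(worst, abs(target / share - 1.0))
        if worst < tol:                           # all margins converged
            break
    return w * len(records) / w.sum()             # normalize to mean weight 1.0

# Hypothetical usage with two raking dimensions:
sample = [{"sex": "f", "age": "18-34"}, {"sex": "m", "age": "35+"},
          {"sex": "f", "age": "35+"},   {"sex": "m", "age": "18-34"}]
weights = rake(sample, ["sex", "age"],
               {"sex": {"f": 0.51, "m": 0.49},
                "age": {"18-34": 0.30, "35+": 0.70}})
```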
Prevalence
This 2015 survey found that 71.1% of American adults reported having Internet access at home. (CPS estimates do not specify the location of access.) Nearly two out of five (39.8%) reported Internet access at work, and two-thirds (66.7%) reported Internet access on a smartphone. Collectively, 86.2% of the sample reported Internet access through at least one of these means (Table 1). This survey’s estimate of adult Internet access (86%) is nearly identical to the Pew Research Center’s estimate that 84% of all American adults used the Internet in 2015 (Perrin and Duggan 2015).
The survey found that 6.0% of respondents reported being a member of an online or Internet panel for which they received invitations to complete surveys. The survey estimate had an expected sampling variability of ±1.6 percentage points at the 95% confidence level. The adult population of the United States was estimated at 245,273,438 in July 2015. Consequently, it would appear that approximately 15 million Americans were members of Internet panels in 2015 (with a confidence interval between 11.1 million and 18.9 million). The top 10 online panels reported a total of 33 million adult members of their online panels in the United States (Table 2). Since individuals can be members of multiple panels, and some companies may include partner organizations in their panel counts, the number of unique panelists will be smaller than the sum of members across all panels. Consequently, the combined, duplicated total reported by the top 10 panels (33 million) is not inconsistent with the survey’s estimate of 15 million unique panelists.
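The arithmetic behind this projection uses only figures reported above: the 6.0% point estimate, the ±1.6 percentage point margin of error, and the July 2015 adult population estimate. The interval appears to be applied to the rounded 15 million midpoint, which reproduces the 11.1 to 18.9 million range:

```python
# Worked arithmetic for the panel population projection.
ADULT_POP = 245_273_438   # estimated U.S. adult population, July 2015
p_hat = 0.060             # survey estimate of online panel membership
moe = 0.016               # 95% margin of error (proportion scale)

point = p_hat * ADULT_POP          # ~14.7 million, reported as ~15 million
half_width = moe * ADULT_POP       # ~3.9 million
low, high = 15e6 - half_width, 15e6 + half_width
print(f"point ~{point/1e6:.1f}M, 95% CI ~{low/1e6:.1f}M to ~{high/1e6:.1f}M")
# -> point ~14.7M, 95% CI ~11.1M to ~18.9M
```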
Respondents who said that they were a member of an online panel were asked how often they were contacted to participate in online surveys (Table 3a). Frequency of contact may reflect the number of Internet panels to which the respondent belonged (not asked) or the match between the respondent’s characteristics on their panel member profile and the nature of the surveys being conducted. It could also be affected by response propensity, if those who respond more or less frequently are sampled at different rates for individual surveys. What is notable, however, is that four out of five respondents who considered themselves online panelists (80%) reported being contacted at least monthly to participate in online surveys.
Online panelists were then asked, “As a panelist, how often do you participate in Internet or online surveys, either in response to invitations or by visiting the survey website?” Collectively, 74.1% of panel members reported that they participated in an online survey at least monthly (Table 3b). The reported frequencies of survey invitations and survey participation appear to be a credible description of online panel behavior, because most online panelists are contacted for surveys on a regular basis, and panelists who do not participate regularly are routinely removed from panels under standard panel maintenance practices.
Demographic Composition
We had hoped to use a national probability sample to compare the characteristics of online panelists to the general population, as well as to estimate the size of that population. This would be an opportunity to estimate the frame bias of online panels in general, as distinct from the frame and nonresponse biases of individual online surveys. Unfortunately, the relatively low incidence of online panel members in the population, coupled with the modest size of the national telephone sample, yields a very small sample of online panelists (n=35) as the basis for these estimates. The maximum expected sampling error in a true random population sample of this size would be about ±16.6 percentage points at the 95% confidence level.
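For reference, the ±16.6 point figure is consistent with the standard conservative margin-of-error formula, evaluated at the variance-maximizing proportion p = 0.5:

$$ \text{MOE}_{\max} \;=\; z_{0.975}\sqrt{\frac{0.5\,(1-0.5)}{n}} \;=\; 1.96\sqrt{\frac{0.25}{35}} \;\approx\; 0.166, $$

that is, about ±16.6 percentage points for n = 35.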
Nonetheless, as the first national probability survey of online panelists, a comparison of the point estimates for the demographic characteristics of the panel subsample and the total sample may provide some preliminary direction. For example, it has been reported that Internet surveys overrepresent women relative to their population proportion. This survey likewise finds that women are more likely than men to be members of online panels: among online panelists, 36% are men, compared with 49% of the total weighted sample.
Internet panels have also been described as biased toward younger adults and against older adults. The findings from the 2015 survey are limited by the relatively small number of Internet panelists in the sample, particularly when considering categorical variables. Nonetheless, the proportion of panelists under age 35 is somewhat larger (40%) than among all survey respondents (30%), while the proportion aged 65 and older is somewhat smaller (12.2% versus 17.5%).
Small sample size notwithstanding, the racial and ethnic composition of the online panelists is quite similar to that of the total sample. The proportion of non-Hispanic whites is virtually the same among online panelists (68%) as among all survey respondents (66%). The proportion of Hispanics is slightly higher among panelists (18% versus 15%), as is the proportion of non-Hispanic blacks (14% versus 12%).
The differences between the panelists and the total sample in Table 4, while consistent with the literature, may be more limited than might have been expected. Given the very small sample size, however, no conclusions should be drawn from this survey about the demographic representativeness of the population of online panelists. Further research with larger probability samples may yet yield useful information about the representativeness of the panel population as a whole.
Limitations
Estimates of the size of the online panel population based on a probability survey may be biased by the method itself. On the one hand, participants in one survey may be more likely to participate in other surveys, including online panels. On the other hand, those who participate in compensated online panels may be less likely to participate in non-panel surveys, like this one. This potential bias could be evaluated by asking identical questions about frequency of participation, by type of survey, in both probability surveys and Web panels covering a common population (e.g., national adults).
Although one or both of these biases may affect the estimate of the size of the online panel population, the survey estimate seems generally consistent in magnitude with the population implied by the reported sizes of the major panels. Estimates of the characteristics of the online panel population from the survey are limited by the small subsample size. The confidence intervals around the estimates in this sample are too large to support statistically generalizable comparisons to the population. However, readers may find them useful in evaluating the internal consistency of the survey results.
Discussion
Non-probability samples, like online panels, have a number of limitations. The most prominent is the inability to use statistical designs for inference. But while the reliability of estimates from non-probability samples cannot be calculated, those estimates may be as accurate as estimates from probability samples. The most notorious examples of errors in non-probability samples, such as the Literary Digest projections for the 1936 election, can be explained by biases in specific sample sources and selection procedures, rather than by anything inherent to non-probability samples.
While the government and academic sectors of the public opinion industry have steadfastly insisted on retaining probability methods for virtually all surveys, the commercial sector has adopted non-probability methods for most survey data collection. Speed and cost have undoubtedly been the main reasons for the popularity of non-probability methods, such as online panels, in commercial opinion and market research. However, these methods also appear to generate estimates that are at least “good enough” in the commercial sector to sustain their usefulness. Increasingly, in the government sector, we also see comparisons in which probability and non-probability samples yield surprisingly similar findings.
Whether we “like” or “don’t like” non-probability sampling methods, the most surprising thing is how little we know about them. We are not aware of any published estimates of the size of the non-probability panel population from which so many market research population estimates are made. We also know nothing of the characteristics or behavior of the general population of online panelists who generate these estimates. We need to know much more about the population from which all online panel respondents are drawn if we are to explain why non-probability estimates are often accurate, and when they are not.
The sample of online panelists in this survey was too small to determine whether the total population of Internet panelists is even broadly representative of the general population. Such a finding in future, larger probability surveys would, however, be a useful step in understanding the results of individual non-probability samples. If the characteristics of the population of online panelists are generally representative of the total population, and the relationships among key variables in the two populations are similar, we may have the beginnings of a model for evaluating the likely accuracy and reliability of non-probability samples. For now, we can only conclude that this issue needs more attention and more information.