Paradata are data about the process of collecting survey data. They can include call record data, computer-assisted interviewing (CAI) keystroke files and interviewer observations. Such data can be put to a variety of uses that can improve surveys. In this article we describe the work of a network that has been set up in the UK to explore and promote the use of survey paradata, and we outline some of the key challenges in making the most of paradata.
The creation of the UK survey paradata network has stimulated a number of related activities. Events have included two one-day seminars, an expert workshop and a researcher exchange. These follow an earlier special session on paradata at the European Survey Research Association conference. The first seminar, which took place at the London School of Economics in August 2009, addressed the question of why and how researchers should use paradata. The first invited speaker, Lars Lyberg, reviewed the history of survey paradata and discussed the use of paradata in quality control and quality improvement. François LaFlamme then described a range of ways in which Statistics Canada use paradata to improve data collection processes, while Frauke Kreuter discussed the use of paradata in responsive designs to reduce non-response bias. The final speaker was Mick Couper, who focussed on ways in which paradata can be used to reduce measurement error, particularly in computer-assisted and Web surveys.
The six presentations in the second seminar addressed the application of paradata in social surveys. Jennifer Sinibaldi presented on how interviewer personality traits can influence the likelihood of contacting sample households, while Annelies Blom showed how paradata can provide additional explanatory power for non-response adjustment weighting. Gabi Durrant used paradata from six major UK government surveys to predict the best times to make contact with households in face-to-face surveys and how these vary with household and interviewer characteristics. The presentation by Kristen Olson contrasted the use of paradata to predict dynamic response propensities during fieldwork with their use in post-fieldwork non-response adjustments. Sunghee Lee discussed responsive designs for random digit dialling (RDD) surveys. The final paper, by James Wagner but presented in his absence by Frauke Kreuter, discussed the limitations of response rates as indicators of survey quality and described some alternatives using paradata.
Following the two seminars, a workshop of 24 invited experts from five countries, most of whom had attended both seminars, was convened at the Office for National Statistics in London. The workshop aimed to identify key messages arising from the seminars, to outline a future research agenda, and to propose ways of progressing the agenda.
Some of the main points identified in discussion at the three events are summarised in the following sections of this article, organised under six headings. The discussions revealed considerable enthusiasm for using paradata effectively, but also some caution and scepticism. Participants felt that methodologists need to provide more evidence of the importance of collecting certain types of paradata, of what can and should be done to improve their quality, and of how best to use them.
Use of Paradata in Fieldwork Monitoring
Several participants expressed the view that paradata should be incorporated into the mainstream of survey management. Barriers to this were identified as a lack of agreement about the best process variables to use, a lack of suitable analysis tools for survey managers and a need for tools for communicating the information effectively to fieldwork managers and interviewers.
Research to date has largely demonstrated only that paradata can make a difference; there has been little or no comparison of alternative methods or alternative choices of variables. We can therefore conclude that some methods may be helpful, but we do not know which methods are best.
Examples exist of the use of statistical process control charts in a survey management context, but the practice is not yet widespread. There is a need to develop simultaneously both the means to identify problems in the process and the means to ensure that appropriate action is taken. Communication between researchers, field managers, supervisors, and interviewers is a crucial component.
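To make this concrete, the sketch below shows one way a statistical process control chart could be applied to call-record paradata: a p-chart that flags interviewer-days whose contact rate falls outside 3-sigma limits. It is a minimal Python sketch under assumed data; the call and contact counts, and the choice of control limits, are purely illustrative.

```python
# Minimal sketch of a p-chart on call-record paradata (illustrative data).
import math

def p_chart_limits(contacts, calls):
    """Centre line and 3-sigma control limits for per-day contact rates."""
    p_bar = sum(contacts) / sum(calls)  # overall contact rate
    limits = []
    for n in calls:
        sigma = math.sqrt(p_bar * (1 - p_bar) / n)
        limits.append((max(0.0, p_bar - 3 * sigma),
                       min(1.0, p_bar + 3 * sigma)))
    return p_bar, limits

def flag_out_of_control(contacts, calls):
    """Indices of interviewer-days whose contact rate lies outside the
    control limits and may therefore warrant follow-up by a field manager."""
    _, limits = p_chart_limits(contacts, calls)
    flags = []
    for i, (c, n) in enumerate(zip(contacts, calls)):
        lo, hi = limits[i]
        if not lo <= c / n <= hi:
            flags.append(i)
    return flags

# Hypothetical contact and call counts for five interviewer-days.
contacts = [18, 22, 5, 20, 19]
calls = [40, 45, 38, 42, 41]
print(flag_out_of_control(contacts, calls))  # -> [2]
```

In practice the charted statistic, the limits and the escalation rules would all need to be agreed with field managers, which is precisely the communication challenge noted above.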
Use of Paradata to Address Measurement Error
With regard to measurement error, paradata can be used to improve the questionnaire, the data collection process and the analysis. There is potential for paradata to be used in place of more resource-intensive techniques, such as behaviour coding and digital recording, for identifying and understanding sources of measurement error. The relationship between measurement error and paradata such as response latencies, back-tracking and correction of responses has received some attention in the literature (e.g., Heerwegh 2003), but without any clear practical lessons as yet. One possible area for development might be the use of paradata to trigger in-interview edit checks. For example, it is common practice to query answers that are internally inconsistent, but it would equally be possible to query answers when considerable uncertainty is detected in the response process, e.g., if the respondent altered their response several times. In a question-testing context, paradata can also help to identify outlier questions in terms of response behaviour.
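As a sketch of what such a trigger might look like in a CAI instrument, the Python fragment below raises a soft check when the response process shows signs of uncertainty. The paradata fields (latency_s, n_changes) and the thresholds are assumptions made for illustration, not recommendations.

```python
# Minimal sketch of a paradata-triggered in-interview edit check.
# Thresholds are illustrative assumptions, not recommendations.
LATENCY_THRESHOLD_S = 30  # unusually long deliberation (assumed cut-off)
CHANGE_THRESHOLD = 2      # answer altered more than twice

def should_probe(item_paradata):
    """Decide whether the instrument should query the answer, based on
    uncertainty signals in the response process rather than on
    internal inconsistency alone."""
    latency = item_paradata["latency_s"]
    changes = item_paradata["n_changes"]
    return latency > LATENCY_THRESHOLD_S or changes > CHANGE_THRESHOLD

# Example: the respondent revised this answer three times.
item = {"question": "Q12_income", "latency_s": 14.2, "n_changes": 3}
if should_probe(item):
    print(f"Soft check for {item['question']}: please confirm your answer.")
```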
Use of Paradata to Address Non-response
The main use to date of paradata in tackling non-response in the field has been in the analysis of call records and, more recently, in responsive designs (Groves and Heeringa 2006). Extensions to the basic approaches, to incorporate information from previous rounds of a repeated survey or previous waves of a panel survey, hold promise.
Responsive designs have, thus far, been applied only at the survey level. However, there may be scope for analogous approaches at the institution level, to help survey organisations decide how best to target and allocate interventions across surveys being carried out simultaneously.
Some types of paradata have long been used in non-response adjustment (Lynn et al. 1996), but acceptance is not widespread. Other, newer types of paradata seem not to have been considered in this context. Studies of the potential gains from incorporating paradata (e.g., Blom 2008) in non-response adjustments are as yet limited.
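The sketch below illustrates the general idea of bringing paradata into a response-propensity adjustment. It is not a reconstruction of any study cited here: the frame variable, the paradata fields and the data are synthetic, and a logistic model is used simply as a familiar choice.

```python
# Minimal sketch of response-propensity weighting with paradata covariates.
# All variables and data below are synthetic and purely illustrative.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 1000

# Covariates observed for the whole issued sample: a frame variable plus
# call-record paradata and an interviewer observation.
X = np.column_stack([
    rng.integers(0, 2, n),   # urban/rural (frame)
    rng.integers(1, 10, n),  # number of call attempts (paradata)
    rng.integers(0, 2, n),   # any evening call (paradata)
    rng.integers(0, 2, n),   # entry barrier observed (interviewer obs.)
])
responded = rng.integers(0, 2, n)  # placeholder response outcome

# Estimate response propensities; weight respondents by their inverse.
model = LogisticRegression(max_iter=1000).fit(X, responded)
propensity = model.predict_proba(X)[:, 1]
weights = np.where(responded == 1, 1.0 / propensity, 0.0)
print(weights[responded == 1][:5])
```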
Use of Paradata in Editing and Coding
Paradata have potential to aid in the identification of cases requiring editing as part of a selective editing approach.
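One way this could work is sketched below: a paradata-derived "suspicion" signal (e.g., many answer revisions) is combined with an assumed measure of each case's impact on estimates, and only the highest-scoring cases are routed for manual editing. All fields and scores are hypothetical.

```python
# Minimal sketch of paradata-informed selective editing (hypothetical data).

def cases_to_edit(cases, budget=2):
    """Rank cases by suspicion x impact and return those worth editing,
    subject to an editing budget."""
    scored = sorted(cases, key=lambda c: c["suspicion"] * c["impact"],
                    reverse=True)
    return scored[:budget]

cases = [
    {"id": 1, "suspicion": 0.9, "impact": 0.2},  # uncertain but minor item
    {"id": 2, "suspicion": 0.4, "impact": 0.9},  # plausible but influential
    {"id": 3, "suspicion": 0.8, "impact": 0.7},  # uncertain and influential
]
print([c["id"] for c in cases_to_edit(cases)])  # -> [3, 2]
```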
An unexplored area is the use of paradata in the coding process itself, for example, indicators of when coders consult code books or other material, indicators of when coders change their mind, and coding latencies. Such data might help to indicate coding quality. There could be scope to use paradata in dependent coding verification to improve the process rather than relying solely on independent verification, which is more costly.
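The sketch below illustrates the kind of coder-level indicators such paradata might support, assuming a hypothetical coding log that records, per coded case, the time taken, whether the code book was consulted and whether the coder revised the code.

```python
# Minimal sketch of coder-level quality indicators (hypothetical log).
from statistics import mean

coding_log = [
    # (coder, latency_s, consulted_codebook, changed_code)
    ("A", 12.0, False, False),
    ("A", 95.0, True,  True),
    ("B", 20.0, False, False),
    ("B", 18.0, False, True),
]

def coder_indicators(log):
    """Summarise latency, look-up and revision rates per coder; unusual
    values might flag work worth routing to dependent verification."""
    out = {}
    for coder in {row[0] for row in log}:
        rows = [r for r in log if r[0] == coder]
        out[coder] = {
            "mean_latency_s": mean(r[1] for r in rows),
            "lookup_rate": mean(r[2] for r in rows),
            "revision_rate": mean(r[3] for r in rows),
        }
    return out

print(coder_indicators(coding_log))
```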
Analysis of Paradata
Paradata often have a complex structure and consequently require complex models if they are to be fully exploited. Techniques such as data mining, regression trees and random forests may have something to offer, as may parallel modelling of a substantive process of interest and the data collection process. There may, however, sometimes be a choice between seeking better ways to use existing data and seeking ways of collecting better data.
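As one concrete instance of the tree-based techniques mentioned above, the sketch below fits a random forest to synthetic call-record features and ranks their importance for predicting eventual response. Every field name and data point is illustrative; with real paradata, such importance measures might help prioritise which process variables are worth collecting and cleaning.

```python
# Minimal sketch of a tree-based model on paradata (synthetic data).
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)
n = 500
X = np.column_stack([
    rng.integers(1, 12, n),  # number of call attempts so far
    rng.integers(0, 2, n),   # any contact yet
    rng.uniform(0, 1, n),    # share of calls made in the evening
])
y = rng.integers(0, 2, n)    # placeholder final response outcome

forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# Relative importance of each paradata feature for predicting response.
print(dict(zip(["n_calls", "any_contact", "evening_share"],
               forest.feature_importances_)))
```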
Decision models, as used in operational research, may have potential for estimating the likely outcomes of alternative intervention scenarios.
There could be significant value in making paradata available to secondary analysts (for both methodological and substantive research purposes). Issues of commercial advantage and disclosure risk would however need careful consideration.
Data Quality and General Issues
Little is known about how the quality of paradata affects their utility, and hence whether attempts to improve quality would be worthwhile. Researchers need to demonstrate the value of paradata in order to justify collecting them or improving their quality. There is a distinction between paradata that are a genuine by-product of the survey process, requiring no extra effort to design or collect, and paradata that require special effort to collect. The latter include information that is recorded by interviewers rather than automatically. Some research into the reliability of interviewer observation data is underway.
Ethical issues in collecting and using paradata require more attention than they have received to date.
The definition of paradata was debated at both seminars, but the feeling was that having a clear definition was rather less important than knowing how best to collect and use the data.
To help promote good practice, it would be useful to build up a body of case studies that demonstrate gains that have been achieved through the use of paradata.
Conclusion
More methodological research into the uses of survey paradata is needed. Researchers are confident of the potential of such data, but as yet less clear about the best data to collect and the best ways to use them. The issues identified at the events reported here will be taken forward at an open meeting planned to take place at the Royal Statistical Society in London in December 2010, at a short course, and at a session at the International Statistical Institute conference in Dublin in August 2011. In December 2010 a call for papers will be issued for a special issue on survey paradata of the Journal of the Royal Statistical Society Series A. Further details of these activities, and all of the presentations from the seminars reported in this article, can be found at: http://www.natcen.ac.uk/ncrm-paradata-network.
Acknowledgements
The network on survey paradata is funded by the UK Economic and Social Research Council (ESRC) via the ESRC National Centre for Research Methods (www.ncrm.ac.uk) and is directed by Gerry Nicolaas, Deputy Director of the Survey Methods Unit at the National Centre for Social Research. We are grateful to all the participants in the seminars and workshops, whose collective ideas we have summarised here. We claim neither that all views expressed here are the consensus views of all participants nor that we share all these views ourselves. We report these views simply to stimulate further debate.