Program at a glance 2021

Novel methods: Using machine learning to aid survey administration

Session Organiser Mr Peter Lugtig (Utrecht University)
Time Friday 9 July, 15:00 - 16:30

New Approaches to Case Prioritization in a Panel Survey: Using Machine Learning Techniques to Identify Hard to Survey Households in the PASS Panel

Professor Mark Trappmann (Institute for Employment Research) - Presenting Author
Dr Jonas Beste (Institute for Employment Research)
Dr Corinna Frodermann (Institute for Employment Research)
Dr Stefanie Unger (-)

Panel surveys provide particularly rich data for implementing adaptive or responsive survey designs (Plewis & Shlomo 2017, Lynn 2017). Not only are data from the current wave's fieldwork available; paradata, survey data and interviewer observations from all previous waves can also be used to predict fieldwork outcomes in an ongoing wave.

In the German panel survey “Labour Market and Social Security” (PASS, Trappmann et al. 2019), a sequential mixed-mode survey of the general population that oversamples welfare benefit recipients, an adaptive survey design has until now primarily been implemented for refreshment samples (Trappmann & Müller 2015).

As panel attrition is increasing (Williams & Brick 2018), panel cases at greater risk of attrition were also targeted in the 14th wave in 2020 and prioritized during fieldwork. Prioritization included increased respondent incentives and an increased interviewer premium.

In order to select the panel households to be prioritized, we first used data (survey data, paradata, interviewer observations) from waves 4 to 12 of the panel to train different machine learning algorithms. Next, we used the parameters from this training to predict wave 13 response. The quality of this prediction was assessed by comparison with the actual wave 13 fieldwork outcomes, and the best-performing algorithm was then used to predict wave 14 response based on data from waves 4 to 13. The adaptive design was implemented experimentally on roughly half of the panel cases with estimated response propensities in the lower half of the distribution.
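The wave-based train/validate/predict cycle described above can be sketched as follows. This is an illustrative sketch only: the data are synthetic stand-ins for PASS survey data and paradata, and the two candidate algorithms (scikit-learn's logistic regression and random forest) are examples, not necessarily the models the authors compared.

```python
# Sketch of a wave-based adaptive-design workflow with synthetic data:
# train on waves 4-12, validate against wave-13 outcomes, then use the
# best performer to predict wave-14 response propensities.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
coef = rng.normal(size=10)  # fixed "true" effects shared across waves

def make_wave(n=2000):
    """Synthetic stand-in for one wave of panel data: covariates X
    (survey data, paradata, interviewer observations) and a binary
    response outcome y."""
    X = rng.normal(size=(n, 10))
    p = 1.0 / (1.0 + np.exp(-X @ coef))
    y = (rng.random(n) < p).astype(int)
    return X, y

X_hist, y_hist = make_wave()  # stands in for pooled waves 4-12
X_val, y_val = make_wave()    # wave 13, used to assess prediction quality
X_new, _ = make_wave()        # wave 14 covariates; outcome not yet observed

candidates = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "random forest": RandomForestClassifier(n_estimators=200, random_state=0),
}

# Train each candidate on the historical waves; compare on wave-13 outcomes.
auc = {
    name: roc_auc_score(y_val, model.fit(X_hist, y_hist).predict_proba(X_val)[:, 1])
    for name, model in candidates.items()
}
best = max(auc, key=auc.get)

# Refit the best performer on all observed waves, predict wave-14 propensities.
final = candidates[best].fit(np.vstack([X_hist, X_val]),
                             np.hstack([y_hist, y_val]))
propensities = final.predict_proba(X_new)[:, 1]

# Prioritize cases in the lower half of the propensity distribution.
prioritized = propensities < np.median(propensities)
```

The split into historical waves, a held-out validation wave, and an unseen prediction wave mirrors the design described in the abstract; in practice model selection would use the real wave-13 fieldwork outcomes rather than simulated ones.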

In the presentation, we show which algorithm worked best in our setting at predicting response propensities and how well these propensities predicted actual wave 14 outcomes. Furthermore, we demonstrate that panel attrition among high-risk groups can be reduced by case prioritization and that attrition bias is thereby reduced as well.

Lynn, P. (2017, April). From standardised to targeted survey procedures for tackling non-response and attrition. In Survey Research Methods (Vol. 11, No. 1, pp. 93-103).

Plewis, I., & Shlomo, N. (2017). Using Response Propensity Models to Improve the Quality of Response Data in Longitudinal Studies. Journal of Official Statistics, 33(3), 753-779.

Trappmann, M., Bähr, S., Beste, J., Eberl, A., Frodermann, C., Gundert, S., Schwarz, S., Teichler, N., Unger, S. & Wenzig, C. (2019). Data resource profile: Panel Study Labour Market and Social Security (PASS). International Journal of Epidemiology.

Trappmann, M. & Müller, G. (2015). Introducing adaptive design elements in the panel study “Labour Market and Social Security” (PASS). In: Statistics Canada (Ed.), Beyond traditional survey taking: adapting to a changing world. Proceedings of Statistics Canada Symposium 2014, Quebec.

Williams, D., & Brick, J. M. (2018). Trends in US face-to-face household survey non-response and level of effort. Journal of Survey Statistics and Methodology, 6(2), 186-211.

The collection of bio-markers: nurses, interviewers, or participants?

Dr Jonathan Burton (ISER, University of Essex) - Presenting Author
Professor Michaela Benzeval (ISER, University of Essex)
Professor Meena Kumari (ISER, University of Essex)

In the 12th wave of the Understanding Society Innovation Panel (IP12) we experimented with the collection of bio-markers from sample members. Sample members were randomly allocated to three groups: (1) nurses carried out the interview and collected bio-markers; (2) social interviewers carried out the interview, collected a sub-set of bio-markers, and asked participants to collect and return hair and dried blood spot samples; and (3) participants completed the interview online and were then sent a kit through the post, enabling them to take and return their own hair and dried blood spot samples. Embedded within the study were two further experiments: (1) different ways to encourage people to take their own blood pressure before the interview; and (2) whether promising feedback of blood results increased take-up.
This presentation describes the design of IP12 and reports the results of the experiments, looking at response rates and take-up of the biological measures, and the potential for response bias. We also assess the quality of the samples collected across the different groups, and the effect of feedback on response.

Natural Language Processing for Survey Researchers: Can Sentence Embedding Techniques Improve Prediction Modelling of Survey Responses and Survey Question Design?

Mr Qixiang Fang (Utrecht University) - Presenting Author
Dr Dong Nguyen (Utrecht University)
Dr Daniel Oberski (Utrecht University)

It is well-established in survey research that textual features of survey questions can influence responses. For instance, question length, question comprehensibility and the type of rating scales often play a role in how respondents choose their answers. Prior research, typically via controlled experiments with human participants, has resulted in many useful findings and guidelines for survey question design. Nevertheless, there is room for methodological innovation. In particular, it remains a challenge to build prediction models of survey responses that can properly incorporate survey questions as predictors. This is an important task because such models would allow survey researchers to learn in advance how responses may vary due to nuanced and specific textual changes in survey questions. In this way, the models can guide researchers towards better survey question design. Furthermore, because of the use of survey questions as additional predictors, such models will likely improve their prediction of survey responses. This can benefit aspects of survey planning like sample size estimation.

To meet this challenge, we propose to leverage sentence embedding techniques from the field of natural language processing. Sentence embedding techniques map sequences of words to vectors of real numbers, namely, sentence embeddings, which previous research has shown to contain both syntactic and semantic information about the original texts and even certain common-sense knowledge. This suggests that with such techniques, survey questions can be transformed into meaningful numerical representations, which offers two promising solutions. First, given that survey questions as sentence embeddings can readily serve as input for any statistical or machine learning model, we can incorporate sentence embeddings as additional predictors and hopefully achieve more powerful prediction models. Second, we can now manipulate any textual features of survey questions, obtain the corresponding new sentence embeddings as input for prediction models, observe how the responses estimated by the prediction models change, and thus make informed adjustments to the questions.

Our study investigates the feasibility of these two solutions. We borrow the survey questions and the individual responses from the European Social Survey (wave 9). First, by employing BERT (Bidirectional Encoder Representations from Transformers), a technique successfully applied in many other research contexts, we transform the survey questions into sentence embeddings and train models to predict responses to (unseen) survey questions. Our preliminary results show that the use of sentence embeddings substantially improves prediction of survey responses (compared to baselines), suggesting that sentence embeddings do encode some relevant information about survey questions. Second, we manipulate various aspects of survey questions, such as the topic words, choice of vocabulary and the type of rating scales, and thus artificially generate many variants of the original questions. Then, we feed the sentence embeddings of these generated questions into a high-performance prediction model and examine whether the sizes and directions of the changes in the predicted responses are consistent with hypotheses, established experimental findings and labelled data. This also allows us to determine what kind of information about survey questions sentence embeddings actually encode.
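The embed-predict-perturb pipeline can be illustrated with a minimal sketch. To keep it self-contained, a toy bag-of-words vectorizer stands in for BERT sentence embeddings, and the questions, response means and ridge predictor below are invented for illustration; the real study uses BERT embeddings and ESS wave 9 responses.

```python
# Toy sketch of the pipeline: embed survey questions, fit a predictor on
# the embeddings, then perturb a question's text and observe the shift
# in the predicted response. The embedding is a normalised bag-of-words
# stand-in for BERT; all data here are invented.
import numpy as np

VOCAB = ["how", "satisfied", "are", "you", "with", "your", "life", "trust",
         "do", "people", "government", "health", "overall", "these", "days"]

def embed(question: str) -> np.ndarray:
    """Toy sentence embedding: L2-normalised bag-of-words counts."""
    tokens = question.lower().replace("?", "").split()
    v = np.array([tokens.count(w) for w in VOCAB], dtype=float)
    n = np.linalg.norm(v)
    return v / n if n else v

questions = [
    "How satisfied are you with your life overall?",
    "How satisfied are you with your health?",
    "Do you trust the government?",
    "Do you trust people these days?",
]
mean_responses = np.array([7.1, 6.4, 4.2, 5.0])  # invented 0-10 scale means

X = np.vstack([embed(q) for q in questions])

# Ridge regression on the embeddings (closed-form solution).
lam = 0.1
W = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ mean_responses)

def predict(question: str) -> float:
    """Predicted mean response for an arbitrary (possibly unseen) question."""
    return float(embed(question) @ W)

# Manipulate a textual feature (topic word) and observe the predicted shift.
base = predict("How satisfied are you with your life overall?")
variant = predict("How satisfied are you with your health overall?")
```

With real BERT embeddings the principle is the same: any statistical or machine learning model can consume the embedding vector, so textual manipulations of a question translate directly into changes in the model's predicted responses.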

Photos instead of text answers: An experiment within a housing survey

Mr Goran Ilic (Utrecht University)
Mr Peter Lugtig (Utrecht University) - Presenting Author
Professor Barry Schouten (Statistics Netherlands)
Mr Seyyit Hocuk (CentERdata)
Mr Joris Mulder (CentERdata)
Mr Maarten Streefkerk (CentERdata)

In general population housing surveys, respondents may be requested to give descriptions of their indoor and outdoor housing conditions. Such conditions may concern the general state of the dwelling, insulation measures the household has implemented to reduce energy use, the setup of their garden, the use of solar panels and the floor area. Part of the desired information may be burdensome to provide or may not be central to the average respondent. Consequently, data quality may be low or sampled households/persons may decide not to participate at all. In some housing surveys, households are asked to give permission to a housing expert to make a brief inspection and evaluation. Response rates to such face-to-face inspections are typically low.
An alternative to answering questions may be to ask respondents to take pictures of parts of their dwelling and/or outdoor area. This option may reduce some burden and may improve data richness, but, obviously, may also be considered intrusive.
In this paper, we present the results of an experiment in which a sample of households from the Dutch LISS panel was allocated to one of three conditions: only text answers, only photos or a choice between text answers and photos.
Respondents were asked to provide information on three parts of their house: their heating system, their garden, and their favorite spot in the house. In this presentation we focus on two key aspects of survey error that vary across our experimental conditions:
1) selection error. We study which respondents are likely to participate in the survey, which respondents answer the picture questions, and what happens to coverage and nonresponse error when we give respondents a choice between taking a picture and answering questions.
2) measurement error. The picture data provide much more contextual information about someone's housing conditions than a text answer. However, meaningful information still has to be extracted from the pictures using image recognition methods.
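The extraction step can be illustrated with a deliberately simple example. Real analyses would use trained image-recognition models; here a green-pixel heuristic stands in for such a model, and the synthetic arrays stand in for respondent photos (a hypothetical garden measurement, not the authors' actual pipeline).

```python
# Toy illustration of extracting a measurement from a photo: estimate
# garden greenery from the share of green-dominant pixels. The arrays
# below are synthetic stand-ins for respondent photos.
import numpy as np

def green_share(image: np.ndarray) -> float:
    """Fraction of pixels whose green channel exceeds both red and blue."""
    r, g, b = image[..., 0], image[..., 1], image[..., 2]
    return float(((g > r) & (g > b)).mean())

rng = np.random.default_rng(1)
paved = rng.integers(100, 140, size=(64, 64, 3))  # greyish "paved yard" photo
lawn = paved.copy()
lawn[..., 1] += 60                                # boost green channel: "lawn"

paved_share = green_share(paved)
lawn_share = green_share(lawn)
```

The point of the sketch is only that a photo becomes a usable survey variable once some extraction rule or model maps pixels to a measurement; the quality of that mapping then becomes part of the measurement error being studied.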

Finally, we evaluate the combined effect of selection and measurement errors, and the tradeoff between both. In which condition do we learn most about people's housing conditions?