Program at a glance 2021



Advances in dealing with non-response

Session Organiser Dirk Schubotz
Time Friday 23 July, 13:15 - 14:45

The Geography of Response: Comparing low-response areas for an IRS Survey with those identified by the U.S. Census

Dr Jocelyn Newsome (Westat) - Presenting Author
Dr Hanyu Sun (Westat)
Mr Michael Giangrande (Westat)
Dr Kerry Levin (Westat)
Mr Rizwan Javaid (IRS)
Mr Patrick Langetieg (IRS)
Dr Scott Leary (IRS)
Ms Brenda Schafer (IRS)

Identifying who is unlikely to respond to a survey is a critical issue for researchers. It influences decisions related to sampling, questionnaire design, contact methods, and mode choice, as researchers search for ways to encourage response from hard-to-reach populations. The U.S. Census, with its mandate to enumerate the entire U.S. population, held a competition in 2012 (offering a monetary prize) challenging researchers to develop a model that could predict response to its mailed questionnaire. The resulting model was the “Low Response Score” (LRS). It is based on 25 variables from Census data and is publicly available (Erdman & Bates, 2017).
Researchers often use the LRS, along with other variables, to develop a response propensity model to inform an adaptive design (Jackson, McPhee, & Lavrakas, 2019). Zhu, Baskin, and Morganstein (2018) recently assessed the predictive power of the LRS for two face-to-face household surveys. They found that the LRS was not a good predictor of response, hypothesizing that mode differences (face-to-face versus mail) may explain the discrepancy.
This paper will examine whether the LRS proves predictive for a nationwide, multi-mode household survey where the primary mode of data collection is a paper questionnaire. The Individual Taxpayer Burden (ITB) Survey measures the time and money taxpayers spend complying with their tax reporting responsibilities. It has been conducted annually since 2010 with 20,000 taxpayers each year. We will compare ITB response rates at the tract level over eight survey administrations to the Census LRS for each of those tracts.
Identifying when the LRS is predictive for non-Census studies (and when it isn’t) can help researchers better understand the factors that influence who is hard-to-reach, and why.
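
The tract-level comparison described above can be illustrated with a minimal sketch. It assumes case-level records of (tract, responded) and a per-tract LRS lookup; the function names and toy values are hypothetical and are not taken from the ITB Survey. Because the LRS is meant to flag low-response areas, a negative association with observed response rates would be expected if it is predictive.

```python
from collections import defaultdict
from math import sqrt


def tract_response_rates(cases):
    """Return {tract_id: response rate} from (tract_id, responded) pairs."""
    sampled = defaultdict(int)
    completed = defaultdict(int)
    for tract_id, responded in cases:
        sampled[tract_id] += 1
        completed[tract_id] += int(responded)
    return {t: completed[t] / sampled[t] for t in sampled}


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical toy data: three tracts, four sampled cases each.
cases = [
    ("A", True), ("A", True), ("A", False), ("A", True),
    ("B", True), ("B", False), ("B", False), ("B", True),
    ("C", False), ("C", False), ("C", True), ("C", False),
]
# Hypothetical LRS values per tract (higher = lower expected response).
lrs = {"A": 15.0, "B": 22.0, "C": 31.0}

rates = tract_response_rates(cases)
tracts = sorted(rates)
r = pearson([lrs[t] for t in tracts], [rates[t] for t in tracts])
print(rates, round(r, 3))
```

In practice the correlation would be computed over many tracts and pooled or compared across survey administrations; the toy data only show the mechanics.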

References
Erdman, C., & Bates, N. (2017). The Low Response Score (LRS): A Metric to Locate, Predict, and Manage Hard-to-Survey Populations. Public Opinion Quarterly, 81(1), 144-156.
Jackson, M. T., McPhee, C. B., & Lavrakas, P. J. (2019). Using Response Propensity Modeling to Allocate Noncontingent Incentives in an Address-Based Sample: Evidence from a National Experiment. Journal of Survey Statistics and Methodology.
Zhu, X., Baskin, R. M., & Morganstein, D. (2018, August). Evaluating the Census Planning Database, MSG, and paradata as predictors of household propensity to respond. Joint Statistical Meetings, Vancouver, Canada.


The geography of nonresponse: Can spatial econometric techniques improve survey weights for nonresponse?

Mr Christoph Zangger (University of Zurich) - Presenting Author

Unit nonresponse is an all too well-known fact in cross-sectional and longitudinal survey research (Särndal and Lundström, 2005). Different strategies address this challenge, with calibration and inverse propensity weighting among the most common approaches. Moreover, it has been recognized that nonresponse varies geographically (Hansen et al., 2007). The geographic clustering of survey nonresponse has helped to identify segments of the population that are less likely to participate (Bates and Mulry, 2011; Erdman and Bates, 2017). Consequently, researchers have included geographically aggregated measures to account for nonresponse and to construct survey weights (Kreuter et al., 2010). This paper extends this literature by building on the argument that people with similar characteristics tend to live in comparable places due to similar residential preferences, endowments with resources that enable or restrict residential mobility, and discriminatory practices by landlords (Auspurg, Hinz, and Schmid, 2017; Dieleman, 2001). The resulting clustering and segregation induce a spatial correlation (Anselin, 1995) among characteristics that are also relevant for predicting survey nonresponse, for example, education, and that are likely correlated with outcome measures in the survey. As a consequence, the residuals of regressing survey response status on a set of available characteristics are themselves spatially correlated, biasing estimates and predictions (Pace and LeSage, 2010).

While aggregated characteristics can pick up some of the spatial correlation, there is another, more direct approach that accounts for spatial correlation: spatial econometric models (LeSage and Pace, 2009). Moreover, these models can directly incorporate other units' response status (and other characteristics) in the estimation and prediction of an individual unit's response propensity, thereby accounting for the socio-spatial interdependence induced by unobserved selective residential patterns. Using Monte Carlo simulations, this paper demonstrates how spatial econometric models improve predicted response propensities, yield more accurate survey weights, and are thus superior to common approaches to weighting for survey nonresponse, even if the data generating process is incorrectly modeled. The results are robust across a wide variety of model specifications, including the underlying response pattern and its spatial correlates. Consequently, using spatial econometric techniques can be a fruitful approach for obtaining more accurate survey weights to account for unit nonresponse in social surveys.
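
One building block of such models is the spatial lag: a weighted average of neighbouring units' response status that can enter a propensity model alongside a unit's own characteristics. The sketch below is illustrative only and assumes a row-standardised k-nearest-neighbour weights matrix; the function names, coordinates, and response values are hypothetical, and the abstract does not state which weights or estimator the paper uses. It computes the lag and does not estimate a spatial model.

```python
from math import dist


def knn_weights(coords, k):
    """Row-standardised k-nearest-neighbour weights: w[i][j] = 1/k for i's k nearest units."""
    n = len(coords)
    weights = []
    for i in range(n):
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: dist(coords[i], coords[j]),
        )
        row = [0.0] * n
        for j in others[:k]:
            row[j] = 1.0 / k
        weights.append(row)
    return weights


def spatial_lag(weights, values):
    """Weighted average of neighbours' values for each unit (W times y)."""
    return [sum(w * v for w, v in zip(row, values)) for row in weights]


# Hypothetical sample: two spatial clusters of three addresses each.
coords = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
responded = [1, 1, 0, 0, 0, 1]  # 1 = responded, 0 = nonrespondent

W = knn_weights(coords, k=2)
lag = spatial_lag(W, responded)
print(lag)
```

A full spatial autoregressive specification would estimate the dependence parameter jointly with the covariate effects, which typically requires a dedicated spatial econometrics library.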


Representativeness of individual-level data in phone surveys: Findings from Sub-Saharan Africa

Mr Philip Wollburg (World Bank) - Presenting Author
Dr Talip Kilic (World Bank)
Mr Joshua Brubaker (World Bank)

The COVID-19 pandemic has created urgent demand for timely data on its impacts, leading to a surge in mobile phone surveys, including in low- and lower-middle income countries where they had been sparsely implemented prior to the crisis. However, phone survey data can suffer from selection biases, especially in the aforementioned context, because phone ownership is not universal and is skewed towards wealthier, male-headed, urban, better-educated households and individuals. Further, non-response has been traditionally higher in phone surveys compared to face-to-face surveys.

This paper uses public-use datasets from monthly national phone surveys implemented in Ethiopia, Malawi, Nigeria and Uganda since April 2020 to monitor the socioeconomic impacts of COVID-19. Our research objective is to assess, and attempt to correct, selection biases inherent in individual-level analyses based on the national phone survey data. The phone surveys use as a sampling frame a recent round of the longitudinal, nationally-representative face-to-face household survey that had been implemented under the World Bank Living Standards Measurement Study-Integrated Surveys on Agriculture (LSMS-ISA) initiative in each country. All face-to-face survey households with an available phone number were called, with the first contact usually being the head of household. The phone survey sampling weights in the public-use datasets are adjusted to deal with coverage and non-response biases at the household level, leveraging the rich, pre-COVID-19 face-to-face survey data on (i) households that participate in the phone survey; (ii) households that do not own a mobile phone and are excluded from the sampling frame; and (iii) households that were contacted but could not be reached.

The availability of pre-COVID-19 face-to-face survey data allows us to compare phone survey respondents with the general adult population. The analysis reveals that selected phone survey respondents are most often household heads or their spouses, are older, and are more likely to be better educated and to be on-farm enterprise owners relative to the general adult population. To account for these differences and improve the representativeness of individual-level phone survey data, we recalibrate the household-level phone survey sampling weights based on propensity score adjustments that are derived from a cross-country comparable model of an adult individual’s likelihood of being interviewed as a function of a rich set of individual- and household-level attributes.

The reweighting generally improves the representativeness of the individual-level estimates, that is, reweighting moves variable means for phone survey respondents closer to those of the general adult population. This holds for both women and men and in the full range of demographic, education and labor market outcomes we test. Despite these improvements, reweighting fails to fully overcome selection biases, with differences in means remaining statistically significant in many of the variables analyzed. Moreover, individual-level reweighting tends to increase the variance of the estimates. Obtaining reliable individual-level data from these phone surveys therefore requires fundamental changes to the individual respondent selection protocols.
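
The general idea of a propensity-based weight adjustment can be sketched as follows. This is a simplified illustration, not the authors' model: it assumes the interview propensity is estimated as the weighted share of adults interviewed within a cell (a weighting-class stand-in for a fitted propensity model), and all field names and values are hypothetical.

```python
from collections import defaultdict


def cell_propensities(adults):
    """Weighted share of adults interviewed within each cell."""
    total = defaultdict(float)
    interviewed = defaultdict(float)
    for a in adults:
        total[a["cell"]] += a["hh_weight"]
        if a["interviewed"]:
            interviewed[a["cell"]] += a["hh_weight"]
    return {c: interviewed[c] / total[c] for c in total}


def reweight(adults):
    """Return (adult, adjusted weight) for respondents: household weight / propensity."""
    p = cell_propensities(adults)
    return [(a, a["hh_weight"] / p[a["cell"]]) for a in adults if a["interviewed"]]


# Hypothetical adult roster from the face-to-face survey: heads are
# interviewed by phone more often than other household members.
adults = (
    [{"cell": "head", "hh_weight": 1.0, "interviewed": i < 2} for i in range(4)]
    + [{"cell": "other", "hh_weight": 1.0, "interviewed": i < 1} for i in range(4)]
)

adjusted = reweight(adults)
total_weight = sum(w for _, w in adjusted)
other_share = sum(w for a, w in adjusted if a["cell"] == "other") / total_weight
print(total_weight, other_share)
```

In this toy roster the adjusted weights restore the population share of non-head adults, but the larger weights on fewer respondents also illustrate why such reweighting tends to increase variance.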


Predicting Web Survey Breakoffs Using Machine Learning Models

Mr Zeming Chen (University of Manchester) - Presenting Author
Dr Alexandru Cernat (University of Manchester)
Professor Natalie Shlomo (University of Manchester)
Dr Stephanie Eckman (RTI International)

Web surveys have become one of the most important tools of social scientists. However, this survey mode tends to have more breakoffs, which occur when respondents quit the survey partway through. One way to minimise the occurrence of breakoffs is to deploy a model to continuously monitor the breakoff risk during the response process and trigger interventions (e.g., displaying motivational messages) when respondents are predicted to break off at the next question. The success of this method requires models to be built at the question level and to have high prediction accuracy. Nonetheless, existing breakoff models are either developed at the questionnaire level or have lower predictive power than machine learning models. Also, there is no consensus about the treatment of time-varying question-level predictors. While some researchers utilise the lagged value, others aggregate the predictor value from the beginning of the survey. No study has compared these two treatments, especially in terms of their effect on model prediction accuracy. This study will develop both traditional and machine learning survival models along with different treatments of the question-level variables for predicting web breakoffs. The most accurate breakoff prediction model identified in this study would help trigger interventions and generate weights for breakoff adjustments. Also, this study will contribute to the improvement of breakoff prediction models and to theoretical debates around the causes of breakoff.
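
The two treatments of a time-varying, question-level predictor mentioned above can be made concrete with a small sketch. It assumes the predictor is a per-question response time and that features for a given question may only use information from earlier questions; the function names, the fill value for the first question, and the use of a running mean as the aggregate are illustrative assumptions.

```python
def lagged(values, fill=0.0):
    """Lagged treatment: the predictor value observed at the previous question."""
    return [fill] + list(values[:-1])


def cumulative_mean(values, fill=0.0):
    """Aggregated treatment: running mean over all previous questions."""
    out = []
    running = 0.0
    for i, v in enumerate(values):
        out.append(running / i if i else fill)
        running += v
    return out


# Hypothetical response times (seconds) for one respondent, questions 1 to 4.
response_times = [4.0, 6.0, 20.0, 2.0]

print(lagged(response_times))
print(cumulative_mean(response_times))
```

The lagged feature reacts sharply to a single slow question, whereas the cumulative feature smooths it out, which is one reason the two treatments could lead to different prediction accuracy.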


How AI can boost your survey statistic - The use of machine-learning algorithms for the imputation of pension data

Dr Robert Hartl (Kantar) - Presenting Author
Dr Thorsten Heien (Kantar)
Mr Marvin Kraemer (Kantar)
Dr Dina Frommert (Deutsche Rentenversicherung Bund)

When collecting income or other sensitive data, survey researchers are often confronted with missing values because respondents are unable or unwilling to provide the requested information. To maximise the number of available cases for analyses and to reduce the risk of biased results, data scientists have developed numerous methods and algorithms to impute missing values, including simple (conditional) mean imputation, regression-based approaches, hot-deck imputation and iterative expectation-maximization (EM) algorithms. In the survey on “Life courses and old-age provision” (Lebensverläufe und Altersvorsorge; LeA), innovative machine-learning ensemble techniques combining various statistical models and implemented in XGBoost were used to estimate missing values for pension entitlements of people aged 40 to 59 years in Germany. Since missing values can’t be assumed to occur (completely) at random, as many relevant variables as possible were included. The paper analyses the use of ensemble techniques based on a simulation with non-missing cases and compares the results to those of other imputation approaches.
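
A simulation of this kind can be sketched as a simple harness: hide values that are actually observed, impute them with each method, and compare the imputations with the truth. The example below is a minimal illustration using only mean and conditional mean imputation; it does not use XGBoost, and the record layout, function names, and values are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def mean_imputer(train):
    """Impute every case with the overall mean of the observed values."""
    overall = sum(y for _, y in train) / len(train)
    return lambda group: overall


def conditional_mean_imputer(train):
    """Impute with the mean of observed values in the same group."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for group, y in train:
        sums[group] += y
        counts[group] += 1
    overall = sum(sums.values()) / sum(counts.values())
    return lambda group: sums[group] / counts[group] if group in counts else overall


def rmse(imputer, held_out):
    """Root mean squared error of imputed versus true held-out values."""
    errors = [(imputer(group) - y) ** 2 for group, y in held_out]
    return sqrt(sum(errors) / len(errors))


# Hypothetical complete cases: (earnings group, pension entitlement).
train = [("low", 100.0), ("low", 120.0), ("high", 300.0), ("high", 320.0)]
# Observed values masked as if missing, used only for evaluation.
held_out = [("low", 110.0), ("high", 310.0)]

rmse_mean = rmse(mean_imputer(train), held_out)
rmse_cond = rmse(conditional_mean_imputer(train), held_out)
print(rmse_mean, rmse_cond)
```

Any other imputation method, including a gradient-boosted model, could be plugged in as another imputer function and scored on the same held-out cases.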