The Challenges of Survey and Administrative Data Linkage
Convenor: Dr Tarek Mostafa (UCL Institute of Education)
In this paper we report findings from a feasibility study carried out in the Umbria region of Italy. The aim of the study is to create a unique dataset containing detailed information on people’s health and socio-economic conditions by linking three data sources: (i) the Cancer registry, (ii) survey data from the Italian Statistical Institute (ISTAT) and (iii) administrative data held by the local municipalities of the Umbria region. Note that, in the last case, all local municipalities were contacted and asked for the relevant information, as these data are not held by a
This paper considers the matching of housing data from three separate data sources: the American Housing Survey, data obtained from a commercial vendor, and administrative records from the U.S. Department of Housing and Urban Development. We document characteristics of both addresses and housing units that are associated with lower match rates. The results show clear geographic variation in match rates and indicate that units in multi-unit structures tend to match at lower rates. We discuss potential improvements to the existing address match process that lead to higher match rates, and we demonstrate the potential benefits and drawbacks of these changes.
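As a rough illustration of the kind of address match process the abstract above refers to, the sketch below normalises address strings and computes match rates by structure type. Everything in it is an assumption made for illustration: the abbreviation table, the function names `normalize_address` and `match_rates`, and the exact-match rule are hypothetical and are not taken from the paper, whose actual matching procedure is not described here.

```python
import re
from collections import defaultdict

# Hypothetical abbreviation table; a real address standardiser would be far larger.
ABBREVIATIONS = {
    "street": "st",
    "avenue": "ave",
    "apartment": "apt",
    "unit": "apt",
    "#": "apt",
}


def normalize_address(raw):
    """Lower-case, drop punctuation, and map common words to one abbreviation."""
    text = raw.lower().replace("#", " # ")      # make '#' its own token
    text = re.sub(r"[^a-z0-9# ]", " ", text)    # punctuation becomes whitespace
    tokens = [ABBREVIATIONS.get(tok, tok) for tok in text.split()]
    return " ".join(tokens)


def match_rates(survey_records, reference_addresses):
    """Share of survey addresses found in the reference list, per structure type.

    survey_records: iterable of (address, structure_type) pairs.
    reference_addresses: iterable of address strings from another source.
    """
    reference = {normalize_address(a) for a in reference_addresses}
    matched = defaultdict(int)
    total = defaultdict(int)
    for address, structure_type in survey_records:
        total[structure_type] += 1
        if normalize_address(address) in reference:
            matched[structure_type] += 1
    return {kind: matched[kind] / total[kind] for kind in total}
```

Exact matching on normalised strings is the simplest possible rule. Unit designators such as "Unit B" versus "#B" are one plausible reason multi-unit addresses fail to match, which is why the sketch folds several unit markers into a single token.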
Asking research participants for their consent to add information from administrative records to their survey responses is now a common feature of many surveys, because of the enormous potential value such linkage can offer.
However, collecting consents from research participants in the context of a sequential, mixed mode survey is new ground and there is relatively little literature which offers robust evidence about the optimal approach to take.
This paper charts our experience of implementing a wide range of data linkage consents for the Next Steps Age 25 Survey, which uses a sequential, mixed mode design involving web, telephone and face-
Although the linkage of survey and administrative data provides distinct advantages, the implementation of the necessary informed consent may undermine response stability in longitudinal surveys. We conducted an experiment that allows us to make causal claims about the consequences of consent requests on longitudinal response behavior. Respondents from a SOEP refreshment sample were randomly assigned to one of three treatment groups or to a control group: the first group was asked for consent in wave 1; the second group in wave 2; the third group in wave 1 and, if necessary, again in wave 2; the control group was never asked for permission.
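The random assignment described above can be sketched as follows. This is a minimal illustration under stated assumptions: the arm labels, the function name `assign_arms`, the fixed seed, and the equal-sized allocation are all hypothetical, since the abstract does not say how respondents were allocated or how large each group was.

```python
import random

# Hypothetical labels for the three treatment groups and the control group.
ARMS = [
    "consent_wave1",
    "consent_wave2",
    "consent_wave1_then_wave2",
    "control",
]


def assign_arms(respondent_ids, seed=0):
    """Shuffle respondents, then deal them into the arms in turn.

    Dealing after a shuffle keeps group sizes within one of each other
    (an assumption; the study's actual allocation scheme is not stated).
    """
    rng = random.Random(seed)   # seeded so the allocation is reproducible
    ids = list(respondent_ids)
    rng.shuffle(ids)
    return {rid: ARMS[i % len(ARMS)] for i, rid in enumerate(ids)}
```

Because assignment depends only on the shuffle and not on respondent characteristics, later differences in response behaviour between arms can be attributed to the timing of the consent request.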
This study constitutes the first longitudinal exploration of consent to link survey and administrative data. It relies on a theoretical framework distinguishing between passive, active, consistent and inconsistent consent behaviour. The findings show that, in general, consent behaviours are passive and consistent. On the one hand, the majority of respondents in the Millennium Cohort Study exhibit consent behaviour that is consistent over time. On the other hand, the likelihood of giving consent and the likelihood of switching behaviour over time depend on extrinsic factors and are characterised by respondent passiveness.