
ESRA 2023 Preliminary Glance Program

All time references are in CEST

Experiments in asking for informed consent to data linkage in general population studies 1

Session Organisers: Dr Jonathan Burton (Institute for Social and Economic Research, University of Essex)
Professor Annette Jäckle (Institute for Social and Economic Research, University of Essex)
Time: Wednesday 19 July, 14:00 - 15:00
Room U6-03

Linking survey and administrative data offers the possibility of combining the strengths, and mitigating the weaknesses, of both. Such linkage is therefore an extremely promising basis for future empirical research in social science. For ethical and legal reasons, linking administrative data to survey responses will usually require obtaining explicit consent. It is well known that not all respondents give consent. Past research on consent has generated many null and inconsistent findings. A weakness of the existing literature is that little effort has been made to understand the cognitive processes by which respondents decide whether or not to consent. The overall aim of this session is to improve our understanding of how to pursue the twin goals of maximising consent and ensuring that consent is genuinely informed.
We welcome papers which employ an experimental design to:
1. Understand how respondents process requests for data linkage: which factors influence their understanding of data linkage, which factors influence their decision to consent, and how to open the black box of consent decisions.
2. Develop and test methods of maximising consent in web surveys, by understanding why web respondents are less likely to give consent than face-to-face respondents.
3. Develop and test methods of maximising consent with requests for linkage to multiple data sets, by understanding how respondents process multiple requests.
4. Test the effects of different approaches to wording consent questions on informed consent.

Keywords: consent, administrative data, experiments

How to ask for consent to data linkage: Things we’ve learnt

Dr Jonathan Burton (Institute for Social and Economic Research, University of Essex) - Presenting Author
Professor Mick Couper (University of Michigan)
Professor Thomas Crossley (European University Institute)
Professor Annette Jäckle (Institute for Social and Economic Research, University of Essex)
Dr Sandra Walzenbach (University of Konstanz)

Linking survey and administrative or other process-generated data is increasingly popular, whether to reduce respondent burden or to augment the scope or quality of the data. Data linkage usually requires the informed consent of respondents, whether for legal or ethical reasons. A common problem is that when consent questions are asked in self-completion surveys, respondents are much less likely to consent than when they are asked for consent in interviewer-administered surveys. In the existing literature, predictors of consent are inconsistent, both between studies and between different consent requests asked within one study. In addition, experiments with the wording of consent questions have often had null or inconsistent effects. Why is this? And what can be done to increase informed consent to data linkage? In this presentation we provide an overview of what we have learnt from qualitative in-depth interviews and a series of experiments implemented in two UK probability household panels (the Understanding Society Innovation Panel and COVID-19 study) and in the UK PopulusLive online access panel. We address the following questions. (1) How do respondents decide whether to consent to data linkage? (2) Why are respondents less likely to consent in web than in CAPI surveys? (3) How best to ask for multiple consents within a survey? (4) Which wordings and formats affect informed consent, and why? We end the overview with a summary of the practical implications for how best to ask for consent to data linkage.

Consent to link survey and Twitter data in panel surveys - experimental evidence

Mr Curtis Jessop (The National Centre for Social Research) - Presenting Author
Dr Tarek Al Baghal (University of Essex)
Dr Luke Sloan (Cardiff University)

Previous research has shown that consent rates to link survey and Twitter data are relatively low (between 27% and 36% of eligible respondents), with rates varying depending on factors such as mode of interview. This is a problem for studies as lower consent rates increase the risk of bias being introduced into the sample and of having insufficient data for robust analysis. This paper presents evidence from three studies using representative samples of Twitter users looking at methods for improving consent rates.

Previous qualitative evidence has suggested that respondents may disengage from long pieces of information at consent questions, ignoring them and relying on cognitive shortcuts to make decisions. The first study uses experimental data from Innovation Panel 15 to test how moving ‘additional information’ to a separate page of the questionnaire, allowing respondents to focus on key messages, affects consent rates. The second study looks at experimental data from a non-probability web panel, where relationships with respondents may be more ‘transactional’. It explores whether, and to what extent, offering a £2 incentive affects participants’ likelihood of consenting to data linkage. Finally, we compare consent rates from a parallel run of Twitter data linkage consent questions with a probability sample from the NatCen Panel and a sample from a non-probability panel to provide insight into how different types of samples may respond differently to consent questions, and how applicable approaches developed in one may be to the other.

Findings from these experiments will help to inform the design of future studies looking to collect consent to link survey and Twitter data and will likely be applicable to other types of request for data linkage including other digital trace data and administrative records.

Is consent to link survey and Twitter data associated with demographic characteristics or reported Twitter behaviour?

Mr Curtis Jessop (The National Centre for Social Research) - Presenting Author
Dr Tarek Al Baghal (University of Essex)
Dr Luke Sloan (Cardiff University)

Previous research has shown that consent rates to link survey and Twitter data are relatively low (between 27% and 36% of eligible respondents). These lower consent rates increase the risk of bias being introduced into the sample – for example, previous research has indicated that women and older respondents are less likely to consent to data linkage, although this was based on relatively small sample sizes, limiting the statistical power to detect differences.

This paper will update that research using a larger sample: a representative sample of c. 4,000 Twitter users from a non-probability web panel and c. 600 Twitter users from the probability-based NatCen Panel. It will look first at the socio-demographic characteristics associated (or not) with consent to data linkage, including sex, age, education, internet use, economic circumstances, and political opinions. It will then look in more detail at whether and how consent is associated with self-reported Twitter use: frequency and purpose of use and types of activities.

Doing so will provide two insights: the extent to which any resulting sample of Twitter data may be biased on these measures, and whether certain types of user may be more reticent to share the data. For example, ‘lurkers’ or low-level posters could be reluctant to share their data as they feel it has low value. Conversely, frequent sharers of content may be less likely to consent because they may not want their opinions to be given additional attention.

These results will therefore provide important context for analysts regarding the bias that may be present in linked survey and Twitter data sets (and potentially other types of linked data). They will also provide insights into reasons for non-consent which could be addressed in consent questions to improve future consent rates.

Linking survey data with webtracking data - challenges due to risk of re-identification via browsing behavior

Ms Barbara Binder (GESIS - Leibniz Institute for the Social Sciences) - Presenting Author

I discuss the opportunities and challenges of linking data from a probabilistic survey, namely the ALLBUS survey, with webtracking data. Such digital trace data allow survey data to be enriched with measures of respondents' online behavior that are free from the measurement error introduced by recall problems or social desirability bias in survey self-reports.
ALLBUS collects data every two years using a two-stage sampling design: it first draws a probabilistic sample of municipalities and then a probabilistic sample of people aged 18 and older. In 2023, ALLBUS will be mixed-mode. Respondents who answer the survey in a self-administered mode, either by mail or on the web (N=3000), will be asked for consent to be re-contacted for participation in an Online Access Panel. Participants can subsequently choose to install a webtracking browser plugin which will transmit the URLs and HTML content of visited websites over a period of a few weeks. With the consent of participants, the survey data of the ALLBUS survey, the online access panel, and the webtracking data will be linked.
On this basis, we will explore the willingness to share data on internet behavior among survey respondents in the ALLBUS.
My talk will highlight the challenges of linking webtracking data and survey data that contain highly sensitive information, such as respondents' political or religious views. Even though small-scale geodata from ALLBUS are only available through restricted access and not in the SUF, respondents’ place of residence can be easily identified through tracked online behavior, such as repeated searches for the opening hours of nearby post offices or the menus of nearby restaurants. This risk of re-identification of respondents requires proper management and supervised access to linked data, which must be planned at an early stage of the data linkage project.