All time references are in CEST
Approximating Probability Samples in the Absence of Sampling Frames 1
Session Organisers: Dr Carina Cornesse (German Institute for Economic Research); Dr Mariel McKone Leonard (DeZIM Institute)
Time: Wednesday 19 July, 14:00 - 15:00
Research shows that survey samples should be constructed using probability sampling approaches to allow valid inference to the intended target population. However, for many populations of interest high-quality probability sampling frames do not exist. This is particularly true for marginalized and hidden populations, including ethnic, religious, and sexual minorities. In the absence of sampling frames, researchers are faced with the choice to discard their research questions or to try to draw inferences from nonprobability and other less conventional samples.
For the latter, both model-based and design-based solutions have been proposed in recent years. This session focuses on data collection techniques designed to result in samples that approximate probability samples. We also invite proposals on techniques for approximating probability samples using already collected nonprobability sample data as well as by combining probability and nonprobability sample data for drawing inferences. The session scope covers but is not limited to research on hard-to-reach and hard-to-survey populations. We are particularly interested in methodological research on techniques such as
- Respondent-driven sampling (RDS) & other network sampling techniques
- Quasi-experimental research designs
- Weighting approaches for nonprobability data (especially those that make use of probability sample reference survey data)
- Techniques for combining probability and nonprobability samples (e.g. blended calibration)
Keywords: nonprobability sample, respondent-driven sampling, blended calibration, weighting, data integration
Ms Carli Lessof (Institute for Jewish Policy Research) - Presenting Author
The Institute for Jewish Policy Research (JPR) is an independent research institute, specialising in the state of contemporary Jewish communities in Britain and the EU (www.jpr.org.uk). JPR aims to provide a better understanding of the identities, attitudes and behaviours of Jewish people and attitudes towards Jews in the general population. Our research provides evidence for community organisations to plan for the future, and for UK and EU government agencies.
JPR has been conducting surveys of Jews in Britain for over 25 years. Since there is no existing sampling frame that allows for the selection of a random probability sample of any religious group, JPR has tested several approaches: sampling using distinctive names, postal surveys using random samples of geographical areas stratified by population density of Jewish people, and parallel surveys of Jews who are members of general population panels. Representativeness is assessed against two sources. The first is census data, which in the UK has included a voluntary question on religion since 2001. The second is a series of five-yearly surveys of synagogue membership, conducted to understand the shifting denominational composition of the population, which is strongly associated with attitudes and behaviours.
Soon after the Covid-19 pandemic began, JPR established an online panel to provide an immediate and sustainable approach to future surveys. The fourth wave is scheduled for this spring. Initial recruitment of panel members relied heavily on promotion of the survey by community organisations, which tends to under-represent those least connected to the core Jewish community. Additional efforts have therefore included asking participants to invite others while tracking chains of referral, a digital campaign concentrated in areas with high population density, and the use of influencers. Subject to funding, future approaches could include comparison with a small push-to-web sub-sample. The paper will share learning and recent lessons from the panel.
Ms Jessica Donzowa (Max Planck Institute for Demographic Research) - Presenting Author
Dr Daniela Perrotta (Max Planck Institute for Demographic Research)
Professor Emilio Zagheni (Max Planck Institute for Demographic Research)
Data availability varies globally, with a lack of timely administrative or survey data in developing countries. In this study, we employ a network reporting approach to estimate demographic indicators in Senegal. Network reporting offers the opportunity to expand coverage beyond the online population and to gain indirect insights about members of the offline population. We use survey data collected via the “Senegal Demographic Survey” (SDS), which was conducted via Facebook, and apply the network reporting approach to estimate demographic outcomes related to fertility. The goal is to investigate the potential benefits of this methodology in a context with low internet and Facebook penetration rates. First, this will expand the so far limited knowledge about the reliability of data collected via surveys that use social media as a recruitment tool in a context with extremely low Facebook coverage. Second, it will introduce a new sampling methodology to overcome under-coverage bias in such a context.
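To make the idea of network reporting concrete, the following is a minimal sketch of a simple ratio estimator, not the estimator used in the SDS study. It assumes each respondent reports two hypothetical quantities: the number of women of reproductive age in a defined personal network, and how many of them gave birth in the past 12 months. The function name and the example numbers are illustrative only.

```python
def network_reporting_rate(reports):
    """Ratio estimate of an event rate from network reports.

    `reports` is a list of (network_members, members_with_event) pairs,
    one pair per respondent. Both quantities are hypothetical survey items.
    """
    total_members = sum(members for members, _ in reports)
    total_events = sum(events for _, events in reports)
    if total_members == 0:
        raise ValueError("no network members were reported")
    # Pool all reported network members, online or offline, into one ratio.
    return total_events / total_members


# Illustrative data: four respondents report on 20 network members in total.
reports = [(4, 1), (3, 0), (5, 1), (8, 2)]
rate = network_reporting_rate(reports)
print(f"Estimated share of network members with a birth: {rate:.2f}")
```

Because respondents report on people who may not use Facebook themselves, the pooled ratio draws on a wider population than the respondents alone. A real application would also need to address reporting errors and differences in network size, which this sketch ignores.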
Ms Martha McRoy (Abt Associates) - Presenting Author
The Feed the Future Egypt Rural Agribusiness Strengthening Project (ERAS) aims to strengthen the horticulture sector in Upper Egypt and the Delta regions by helping farmers increase their agriculture-related incomes. By prioritizing opportunities for smallholder farmers, youth, and women, ERAS also contributes to increased inclusivity in the horticulture sector. To this end, ERAS conducts crop-specific trainings with interested farmers, with the expectation that farmers will apply some of what they learn on topics including good agricultural practices, harvesting, cultivation, pest management, and water management.
To evaluate ERAS’s impact, baseline studies of farmers are first conducted across the two regions to estimate incomes. Then, annual surveys for each crop of interest are performed to measure changes in the yearly incomes of farmers who attended a training. There is one problem with this strategy: there is neither a sampling frame of farmers in Egypt nor any official statistics on this target population for inference.
In this paper, we focus on how we were able to implement the 2022 seasonal surveys. We start with an overview of the sample designs used to prioritize smallholder farmers, youth, and women. Next, we walk through the processes used to create a frame for a population of unknown size. Then, we highlight our approach to estimating control totals and the weighting implemented for each crop survey. Finally, we acknowledge some of the unexpected challenges encountered, especially when trying to use a probability-based approach without a sampling frame, and how they were overcome.
We highlight the necessity of collaborating with staff based in-country to create and continually update a sampling frame through cooperation with local organizations. Using this list, we sample and survey farmers cultivating targeted crops, create control totals using details from local organizations, and weight the surveys to help reduce bias.
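As an illustration of weighting to control totals, here is a minimal post-stratification sketch. It is an assumption about one common way such weights can be built, not a description of the ERAS procedure. The cell labels and control totals are hypothetical.

```python
from collections import Counter


def poststratify(sample_cells, control_totals):
    """Return one weight per respondent so weighted counts match control totals.

    `sample_cells` lists the weighting cell of each respondent;
    `control_totals` maps each cell to its estimated population size.
    """
    counts = Counter(sample_cells)
    missing = set(counts) - set(control_totals)
    if missing:
        raise ValueError(f"no control total for cells: {sorted(missing)}")
    # Each respondent represents (population in cell) / (respondents in cell).
    return [control_totals[cell] / counts[cell] for cell in sample_cells]


# Hypothetical cells (region, gender) and hypothetical control totals.
sample_cells = [
    ("Delta", "female"), ("Delta", "female"),
    ("Delta", "male"),
    ("Upper Egypt", "male"), ("Upper Egypt", "male"), ("Upper Egypt", "male"),
]
control_totals = {
    ("Delta", "female"): 40,
    ("Delta", "male"): 10,
    ("Upper Egypt", "male"): 90,
}
weights = poststratify(sample_cells, control_totals)
print(weights)
```

In this sketch the weights sum to the total of the control totals, so any error in those totals carries over directly into the weighted estimates.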
Dr Sebastian Rinken (Institute for Advanced Social Studies, Spanish Research Council (IESA-CSIC)) - Presenting Author
Mr Juan Antonio Domínguez (Institute for Advanced Social Studies, Spanish Research Council (IESA-CSIC))
Dr Regina Lafuente (Institute for Advanced Social Studies, Spanish Research Council (IESA-CSIC))
Mr Manuel Trujillo (Institute for Advanced Social Studies, Spanish Research Council (IESA-CSIC))
Dr Rafael Serrano-del-Rosal (Institute for Advanced Social Studies, Spanish Research Council (IESA-CSIC))
The availability of ubiquitous connectivity and easy-to-use management tools has increasingly tempted people who lack a solid methodological background to run questionnaire operations. Such data collections are routinely branded as surveys, despite being usually administered to non-probabilistic samples and oftentimes entailing uncontrolled respondent-driven recruitment via social networks or messaging apps. The underlying assumption seems to be that sample size, as such, is associated with data quality.
We analyze a snowball sample that was collected in Spain by a survey on the social dimension of COVID-19 (ESPACOV). For substantive purposes, ESPACOV employed a combination of random SMS invitations and segmented web advertisements for sampling, quite successfully as it turned out (Survey Research Methods 14(2), https://doi.org/10.18148/srm/2020.v14i2.7733). To inquire into the methodological implications of snowballing, respondents were invited to forward the questionnaire to other people “living in Spain and at least 18 years old”. Within a week, we obtained almost 17,000 complementary questionnaires, about half of which can be traced back to the respondents who forwarded the invitation.
Our analysis of selection bias relies on the paradata that relate seeds to their (direct or indirect) invitees. We observe a pattern of cumulative degradation: a large proportion of complementary questionnaires originates in very few seeds, and higher numbers of additional respondents per seed induce bigger deviations from population parameters. In short, our data suggest that in the context of web-based respondent-driven sampling, the relation between data quality and sample size is just the opposite of what lay conceptions appear to take for granted. We conclude that when snowballing cannot be avoided, it is vital to control the seeds’ profiles and modulate re-directing options accordingly.
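The kind of paradata analysis described here can be sketched as follows. This is a minimal illustration under assumed data structures, not the authors' code: it supposes a hypothetical mapping from each respondent ID to the ID of the person who forwarded the questionnaire (`None` for seeds), traces every respondent back to a seed, and measures how concentrated recruitment is.

```python
from collections import Counter


def find_seed(respondent, referred_by):
    """Follow the referral chain upwards until a respondent without a referrer."""
    seen = set()
    # A referrer that is missing from the paradata is treated as a seed.
    while referred_by.get(respondent) is not None:
        if respondent in seen:
            raise ValueError("referral chain contains a cycle")
        seen.add(respondent)
        respondent = referred_by[respondent]
    return respondent


def recruits_per_seed(referred_by):
    """Count direct and indirect invitees attributed to each seed."""
    counts = Counter()
    for respondent, referrer in referred_by.items():
        if referrer is not None:
            counts[find_seed(respondent, referred_by)] += 1
    return counts


# Hypothetical paradata: two seeds (s1, s2) and five invitees.
referred_by = {
    "s1": None, "s2": None,
    "a": "s1", "b": "a", "c": "a", "d": "b",
    "e": "s2",
}
counts = recruits_per_seed(referred_by)
top_share = max(counts.values()) / sum(counts.values())
print(dict(counts), f"share from largest seed: {top_share:.0%}")
```

With counts per seed in hand, one could compare the profile of each seed's invitees against population benchmarks to see whether larger chains deviate more.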