
Wednesday 15th July, 11:00 - 12:30 Room: HT-103


Enhancing survey data with geocoded auxiliary data 2

Convenor Dr Sarah Butt (City University London)
Coordinator 1 Mr Rory Fitzgerald (City University London)
Coordinator 2 Ms Kaisa Lahtinen (City University London)

Session Details

Combining survey data with auxiliary data from other sources gives researchers a wealth of opportunities to improve survey data collection and the quality of the inferences that can be drawn from survey data. One type of auxiliary data that is increasingly widely available is geocoded data, i.e. data that can be linked to survey data based on the geographic location of sampled addresses. This includes census data, administrative data from government agencies and other public sector bodies, commercial databases and geospatial maps. Such data can be used to answer substantive research questions about the effect of location on attitudes and behaviour. Because they provide information about all sample units, respondents and non-respondents alike, geocoded data are also a potentially valuable tool for aiding data collection and overcoming non-response bias.

However, using auxiliary data from pre-existing sources presents a number of challenges. Identifying suitable auxiliary variables that are correlated with the survey variables of interest (and, in the case of non-response analysis, with response propensity) can be difficult. There are concerns over the coverage, accuracy and timeliness of external databases, the extent to which data that are often highly aggregated can characterise sampled households, and the increased likelihood of deductive disclosure as a result of combining different data sources.

This session invites researchers who have combined survey data with geocoded auxiliary data to share what they have learned about the opportunities and challenges associated with this approach. We are interested in papers that provide insights into any of the following:
• The pros and cons of using different sources of geocoded auxiliary data
• Strategies for linking geocoded auxiliary data to survey data
• Modelling item or unit non-response using auxiliary data
• Combining auxiliary data and survey data cross-nationally

Paper Details

1. Grandparents, Nurseries and Employment Options: The Geography of the return to work for Mothers in the Czech Republic
Dr Thomas Emery (NIDI)
Mrs Alzbeta Bartova (University of Edinburgh)

In this paper we examine the extent to which mothers' return to work is affected by their proximity to various resources, facilities and opportunities. Data from waves 1 and 2 of the GGS in the Czech Republic are matched with geocoded data on childcare provision from childcare websites, regional-level data on unemployment, local-level data on employment opportunities, and the proximity of individuals' parents and other relatives who might act as childcare alternatives. The findings suggest that geographic proximity to resources plays a strong role in determining women's return to work.


2. Combining Sample, Survey and Geocoded Auxiliary Data for Predicting Sales Volumes at Gasoline Stations
Dr Kurt Pflughoeft (MaritzCX)
Ms Sharon Alberg (MaritzCX)

Clients often ask market researchers to link survey and sample data to business outcomes. However, the breadth of this data is limited, which can lead to specification errors in prediction models. This research augments a mystery shop survey with geocoded auxiliary data. The goal was to predict sales volumes for a chain of gas stations.

Two geocoded auxiliary data sources were used: U.S. Census data and Factual, a point-of-interest (POI) database. The model results show that mystery shop ratings were important in the context of location-specific characteristics.


3. Examining neighbourhood effects on educational opportunities: Facing challenges in combining survey data with geocoded auxiliary data cross-nationally
Ms Dafina Kurti (GESIS - Leibniz Institute for the Social Sciences)

To examine neighbourhood effects on the school attainment of residents with a migration background, it is necessary to draw on geographic information that is available through public sector agencies or commercial databases. This paper reviews the challenges associated with using geocoded auxiliary data and combining them with survey data in Germany, the UK, Sweden and France. It details my experience in identifying survey variables that are correlated with geocoded variables and suitable for testing the hypotheses. The paper also presents the challenges faced in cross-national research owing to regional differences in data quality and accessibility standards.


4. Enhancing longitudinal surveys with geocoded time-series data: Examples from research on school-to-work transitions in Germany
Miss Katarina Weßling (University of Tuebingen)
Mr Andreas Hartung (University of Tuebingen)
Professor Steffen Hillmert (University of Tuebingen)

We aim to understand how socio-economic spatial contexts contribute to explaining disparities in school-to-work transitions. To capture the spatial extent of context effects from a longitudinal perspective, geocoded small-scale data in time-series format would make it possible to illustrate spatial structures, contiguities and distances precisely, and to calculate flexible levels of aggregation. However, there are various difficulties. We will provide examples of (1) typical problems regarding data availability, (2) strategies to overcome those problems, (3) possibilities for linking large-scale surveys (NEPS, GSOEP) with geocoded aggregate data and (4) spatial analytical strategies.