ESRA 2023 Glance Program

All time references are in CEST

Linking Survey Data with Geospatial Data: Potentials, Methods, and Challenges 2

Session Organisers Professor Simon Kühne (Bielefeld University)
Mr Dorian Tsolak (Bielefeld University)
Time Friday 21 July, 09:00 - 10:30
Room U6-21

Adding geospatial information to survey data offers new perspectives for social science research. It allows researchers to address regional context effects that play an important role in many research areas. Moreover, combining individual survey data with (aggregated) regional data can help close data gaps and reduce study costs. In recent years, survey projects have been increasingly enriched with geo-information, for instance, by adding geopositions or municipality codes of survey participants’ residence. In addition, many external data sources of regional indicators across various levels of aggregation are readily available. These include, but are not limited to, administrative data, social media data, smart data, and satellite imagery. However, survey practitioners still face many challenges in managing, linking, and analyzing survey data in conjunction with geospatial data. As a result, much of the potential for innovative research that combines survey and geodata remains untapped.

The proposed conference session offers the opportunity to exchange expertise on new developments in linking survey data with geospatial information and regional indicators. Possible questions to be discussed are:

- What are potential sources of regional indicators and geospatial information that can be combined with survey data?
- What methods and procedures exist to retrieve and manage big spatial data from online sources?
- How to harmonize area changes over time when analyzing longitudinal survey data?
- How to assure the high data security standards needed for sensitive georeferenced survey data?
- Which techniques from geospatial analytics can be incorporated in common methods for cross-sectional and longitudinal survey data analysis?
- What are applications that highlight the potential of analyzing georeferenced survey data?

Keywords: survey data, geospatial data, record linkage, regional analytics


Getting personal – Moving from regional to ego-centered provider density to capture healthcare availability in survey data

Ms Barbara Stacherl (Socio-Economic Panel (SOEP), German Institute for Economic Research (DIW Berlin)) - Presenting Author

Regional provider densities, e.g., physician-to-population ratios in administrative regions, are frequently used to depict healthcare availability. Research using survey data also relies on regional provider densities to account for healthcare infrastructure as part of the regional context. However, regional provider densities are based on administrative areas, ignoring border-crossing for healthcare use. Therefore, individualized spatial approaches for measuring healthcare availability in survey research are needed. To capture the healthcare infrastructure (physicians and hospitals) available to individuals, I generate ego-centered provider densities for households in the German Socio-Economic Panel (SOEP), a representative longitudinal survey. Variation across individuals and over time as well as intraregional variation are analyzed. To outline the information gain achieved through ego-centered measures, they are compared to standard regional provider densities. Geocoded address data for all hospitals (2003-2019) and outpatient physicians (2009-2019) in Germany were linked to SOEP data. Using georeferenced population data (100x100m grid), ego-centered provider-to-population ratios, namely the number of physicians and hospital beds per 100,000 inhabitants within a 10km radius, were computed for each household for each year. For comparison, physician and hospital densities at district level (2015-2019) were linked to the SOEP. Although relatively stable over time, some external (=among non-movers) within-individual changes in ego-centered physician and hospital density were observed. For all years, higher variability was observed for ego-centered measures compared to regional measures. 
There was substantial variation in ego-centered provider density within the administrative regions at which standard densities are aggregated – in 2019, 61% and 57% of the variation in the ego-centered physician density and hospital density, respectively, remained unexplained when accounting for district-level grouping. Individualized spatial measures, such as ego-centered provider densities, therefore hold considerable potential to inform survey research.
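
As a rough illustration of the ego-centered measure described in this abstract, the following Python sketch counts providers within a radius of a household and relates them to the population living within the same radius. The function names, the tuple-based input format, and the use of great-circle (haversine) distance to grid-cell centers are assumptions made for illustration; the abstract does not state how distances were computed.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two WGS84 points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def ego_centered_density(household, providers, population_cells, radius_km=10.0):
    """Providers per 100,000 inhabitants within radius_km of a household.

    household:        (lat, lon)
    providers:        iterable of (lat, lon), one entry per provider
    population_cells: iterable of (lat, lon, inhabitants), e.g. grid-cell centers
    Returns None if no population falls inside the radius.
    """
    lat, lon = household
    n_providers = sum(
        1 for plat, plon in providers
        if haversine_km(lat, lon, plat, plon) <= radius_km
    )
    population = sum(
        count for clat, clon, count in population_cells
        if haversine_km(lat, lon, clat, clon) <= radius_km
    )
    if population == 0:
        return None
    return 100_000 * n_providers / population
```

Because the radius is centered on each household rather than on an administrative unit, providers just across a district border are counted, which is the property the abstract contrasts with regional densities.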

How Local Labor Markets Affect Refugees’ Chances of Returning to Their Trained Occupations

Mr Dorian Tsolak (Bielefeld University)
Mr Marvin Bürmann (Bielefeld University) - Presenting Author

The labor market integration of refugees is seen as a cornerstone of integration into the host society. However, the regional distribution of refugees in Germany is not designed for the most efficient labor market integration. Instead, refugees are distributed according to the so-called "Königssteiner Schlüssel", which mainly relies on the population sizes of German regions. While many refugees bring substantial labor market experience from their countries of origin, this distribution policy does not consider the type of labor supply they can provide. Ultimately, they may find themselves in a region where their skills are not needed or where the labor market situation is extremely tight, which can make it very difficult to find employment in their trained occupation and thus create occupational mismatches. To examine the extent to which local labor market characteristics such as unemployment rates, job postings, and the share of foreigners within occupations affect refugees' chances of finding employment in the occupation they held prior to migration, we combine regional survey data from the IAB-BAMF-SOEP survey with official statistics from the Federal Employment Agency (BA). We examine the conditions under which these local labor market characteristics in Germany exacerbate or mitigate the risk of occupational mismatch. Specifically, we test whether refugees living in a region with a higher local unemployment rate and fewer job postings for their target occupation, which we regard as a tighter labor market situation, are at higher risk of occupational mismatch. Furthermore, we consider the share of foreign workers and hypothesize that a high share of foreigners helps refugees access their trained occupation more easily because employers are used to hiring foreign employees. The results have direct policy implications for facilitating adequate occupational matching and better integration into the labor market.

Monitoring Internal Migration Flows Using Origin–Destination Data Based on Twitter User Locations

Mr Long Nguyen (Bielefeld University) - Presenting Author

Origin–destination (OD) data on internal migration allow for a rich set of spatial analyses and can – in combination with survey data – help to reveal relationships between mobility patterns and the social, economic, environmental, and political conditions of the regions involved. In Germany and many other countries, however, official statistics available to the public only record the volumes of in- and out-migration (and net migration) per region, but not the origin or destination of the migratory flows. In this paper, I present a method for estimating internal migration flows based on the geocoding of the locations that Twitter users report in their profiles. OD data are extracted and aggregated from a continuous data collection and preprocessing pipeline that has collected and geocoded approximately two billion German tweets since October 2018. Compared to the usual approach of using only geotagged tweets, user profile locations provide much higher coverage of all Twitter users and more accurately reflect where users live. Evaluating regional aggregates of the extracted internal migration flows against official statistics shows that Twitter data can be used to reliably estimate migration patterns across regions in Germany, but should be treated with caution due to demographic discrepancies between Twitter users and the real population. Estimates of internal migration flows based on Twitter data, which are freely available and offer broad spatial and temporal coverage as well as highly customisable levels of spatial and temporal aggregation, can help fill the gaps left open by the lack of OD data in official statistics.
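
A minimal Python sketch of how OD flows could be aggregated from geocoded profile locations is shown here. It assumes the input is a list of `(user_id, period, region)` observations, with `period` a sortable string such as `"2019-03"`, and it counts a move whenever a user's region differs between two consecutive observations. The function name and the input format are hypothetical; the abstract does not describe the pipeline at this level of detail.

```python
from collections import Counter


def od_flows(observations):
    """Aggregate region changes per user into origin-destination counts.

    observations: iterable of (user_id, period, region) tuples
    Returns a Counter mapping (origin, destination) to the number of moves.
    """
    by_user = {}
    for user_id, period, region in observations:
        by_user.setdefault(user_id, []).append((period, region))

    flows = Counter()
    for history in by_user.values():
        history.sort()  # chronological order, given sortable period labels
        for (_, origin), (_, destination) in zip(history, history[1:]):
            if origin != destination:
                flows[(origin, destination)] += 1
    return flows
```

Summing the resulting counts by destination or by origin yields in- and out-migration volumes per region, which is the level at which such estimates can be compared with official statistics.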

Linking Survey Data with OpenStreetMap and Geospatial Census Data to Contextualize Intergroup Relations at the Neighborhood Level: the Case of Belgian National Election Study 2019

Ms Daria Dementeva (KU Leuven) - Presenting Author
Professor Cecil Meeusen (KU Leuven)
Professor Bart Meuleman (KU Leuven)

Augmenting survey data with fine-grained geospatial data brings theoretical and methodological innovations in the domain of ethnic intergroup relations, such as intergroup contact and threat hypotheses.

We present a case study of the Belgian National Election Study 2019 (BNES 2019), a probability-based survey that concentrates on general political attitudes and behavior, with a special focus on intergroup attitudes and relations, to illustrate how to increase geospatial contextuality at the neighborhood level with data linkage.

Specifically, increasing geospatial contextuality at the neighborhood level is particularly advantageous, as spurred anti-immigration rhetoric and the overall political polarization coupled with a transformation of the Belgian neighborhoods into ethnically diverse and culturally mixed areas accelerated exposure to negative attitude formation towards ethnic minorities.

In particular, we describe the geospatial data linkage process for three data sources: BNES 2019, geospatial data from the Belgian Census 2011, and OpenStreetMap, a global database of spatial attributes with a granular spatiotemporal resolution. We present a specific hands-on approach to the workflow of ex-post geospatial data linkage, focusing on initial data pre-processing issues, geocoding, georeferencing, geocoarsening, column-wise geomatching, overall database management, and the possible alleviation of privacy concerns. We continue by reviewing the linkage and matching processes for BNES 2019 and discussing how to pre-select and post-select geomatched variables from auxiliary data sources that are most relevant for substantive analyses of survey responses in the domain of ethnic intergroup relations.
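
One simple reading of the geocoarsening step mentioned in this abstract is to replace exact coordinates with the grid cell that contains them, so that respondents are linked to context data at cell level rather than at their precise address. The Python sketch below illustrates that idea only; the function names and the 0.01-degree cell size are assumptions, and the abstract does not say which coarsening scheme the authors use.

```python
import math


def coarsen(lat, lon, cell_deg=0.01):
    """Return the integer (row, col) index of the grid cell containing a point."""
    return math.floor(lat / cell_deg), math.floor(lon / cell_deg)


def cell_center(row, col, cell_deg=0.01):
    """Return the (lat, lon) center of a grid cell, usable as a coarsened location."""
    return (row + 0.5) * cell_deg, (col + 0.5) * cell_deg


def link_by_cell(respondents, context_by_cell, cell_deg=0.01):
    """Attach cell-level context values to respondents.

    respondents:     dict mapping respondent id to (lat, lon)
    context_by_cell: dict mapping (row, col) to a context value
    Returns a dict mapping respondent id to the context value, or None if
    no context data exist for the respondent's cell.
    """
    return {
        rid: context_by_cell.get(coarsen(lat, lon, cell_deg))
        for rid, (lat, lon) in respondents.items()
    }
```

Larger cells reduce re-identification risk but also blur neighborhood-level variation, so the cell size is a trade-off between privacy and contextual detail.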

Determinants of Regional Mobility After Job Loss in Germany – Examining the Importance of Locational Factors in the Decision For or Against Residential Mobility

Ms Katrin Rickmeier (Bielefeld University) - Presenting Author

This study investigates the effects of locational factors on the propensity to relocate after an involuntary job loss in Germany. The increasing instability of career paths, recently reinforced by global crises, has provoked a great amount of literature on the consequences of job loss.
But although sociological theorists have long pointed out the importance of the broader social context for individual behavior, empirical research on residential mobility after a job loss has paid less attention to locational effects.
Research shows that job loss leads to a higher propensity for regional mobility. However, it is not yet clear which factors drive the relocation decision. While previous studies have generally limited their attention to individual factors, it is important to examine various regional structure factors and the broader context in which regional mobility after job loss occurs.
I contribute to the literature on the determinants of regional mobility after job loss by explicitly modeling the causal effect of home region characteristics and investigating the role of economic conditions and societal characteristics. I argue that regions serve as opportunity structures for their inhabitants and thus encourage or discourage workers’ mobility and influence their willingness to relocate.
I apply logistic regression models based on survey data from the German Socio-Economic Panel (SOEP) complemented by spatial structure indicators from the INKAR database. This combination of georeferenced survey and aggregated regional data allows for new insights into the importance of the place of living in the regional mobility of economically vulnerable groups.
First results show that home region characteristics are not as important in residential mobility after job loss as the individual’s family context. The absence of effects of the former suggests that there are no regional inequalities in Germany concerning the opportunity structures of the recently unemployed.