Wednesday 19th July, 09:00 - 10:30 Room: N 101

Overview of open access European survey data 5

Chair Dr Annette Scherpenzeel (SHARE – Survey of Health, Ageing and Retirement in Europe)
Coordinator 1 Ms Sabine Friedel (Munich Center for the Economics of Aging, Max Planck Institute for Social Law and Social Policy)

Session Details

In recent years, many large sets of survey data have been made available to the scientific community. Large national and European surveys, such as ESS, SHARE, SOEP, Understanding Society, etc., disseminate their data to registered users. For researchers it can be difficult to get a good overview of what is offered and to find the specific variables and samples of interest to them.

This session aims to give researchers more insight into the variety of variables available in large survey datasets. For that purpose, we invite survey practitioners to present their data sets, longitudinal as well as cross-sectional, to potential users. Presentations should address the following survey characteristics: research field, target population and sample, survey design, data access regulations, available survey variables and paradata, linked administrative data (if applicable), and some examples of data use. Moreover, we especially welcome overviews including information which can be used for methodological analysis, such as keystroke data, auxiliary information, interviewer characteristics and observations, response behavior, experimental designs, etc.

Paper Details

1. Design and Research Potential of the German Family Panel "Pairfam"
Ms Kristin Hajek (LMU Munich)
Professor Josef Brüderl (LMU Munich)

The German Family Panel “pairfam” (‘Panel Analysis of Intimate Relationships and Family Dynamics’) is a panel survey providing rich data on the formation and development of intimate relationships and families in Germany. The main topics are partnership dynamics and partnership dissolution, fertility attitudes and generative behavior, parenting and child development, and intergenerational relationships. The survey started in 2008 with a nationwide random sample from the population registers for three age cohorts (aged 15-17, 25-27, or 35-37 years). One-hour CAPI interviews have been conducted annually. Respondents are referred to as “anchors” because they are asked every year for permission to interview their partner, parents, and children above age 8 as well (multi-actor approach). This is done to get a full picture of a family’s life. Interviews with anchors’ partners and parents are conducted by PAPI (20-30 pages). A 15-minute CAPI interview is conducted with children aged 8 to 15. In addition, for each child aged 6 and above the anchor and his/her partner fill out a parenting PAPI (3 pages). Finally, for each child below age 8 the anchor’s CAPI contains an age-specific child module. Since wave 4, we have invited participants in the children survey who have turned 16 to join our panel as regular respondents (step-up respondents): meanwhile about 150 juveniles grow out of the children survey and over 90% of them become regular respondents. The survey fieldwork is conducted by TNS Infratest Sozialforschung in Munich.

With dependent interviewing (DI) we feed forward information collected in the previous wave to the present interview. To reduce respondents' burden we use a graphic event history calendar (EHC) to collect information, mainly for partnership, employment and residential history. The combination of DI and EHC, used for the first time in a large population survey, should ease the cognitive task of the respondent and produce more consistent data with less measurement error. The pairfam data are released as a scientific use file one year after the fieldwork of the respective wave has closed. Together with the collected data, a number of generated variables and additional datasets with accumulated biographical information in the form of spell data are issued, in order to make relevant information easily accessible.
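
As an illustration of what "spell data" means here, the following minimal sketch collapses a month-by-month calendar into one record per uninterrupted episode. The function name, the field names (`status`, `begin`, `end`) and the example calendar are hypothetical and are not taken from the pairfam scientific use files.

```python
from itertools import groupby


def months_to_spells(monthly_status):
    """Collapse (month_index, status) pairs into spells.

    Assumes the pairs are sorted by month and cover consecutive months.
    Each spell records the status and its first and last month.
    """
    spells = []
    # groupby merges runs of adjacent entries that share the same status.
    for status, group in groupby(monthly_status, key=lambda pair: pair[1]):
        months = [month for month, _ in group]
        spells.append({"status": status, "begin": months[0], "end": months[-1]})
    return spells


# Hypothetical employment calendar for one respondent (month index, status).
calendar = [
    (1, "education"), (2, "education"),
    (3, "employed"), (4, "employed"), (5, "employed"),
    (6, "unemployed"),
]
spells = months_to_spells(calendar)
for spell in spells:
    print(spell)
```

Storing one row per episode rather than one row per month is what makes durations and transitions easy to analyse with event history methods.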

In the presentation, we will give an overview of pairfam’s design, the available data and the research potential. In particular, we will highlight the panel structure, the multi-actor design, and the available paradata. The scientific use files contain information about the interview process (e.g. number of interviewer-respondent contacts or the date, situation and duration of the interview) and interviewer characteristics (sex, age, school degree). Upon request, the pairfam team also offers users the opportunity to conduct regional-level analyses on fine-grained regional information or even geo-coordinates. Moreover, the survey data can be linked with Microm data giving the type of residential structure, socioeconomic characteristics and mobility of the neighborhood. This rich set of available data enables family research as well as methodological analysis of various aspects.

2. TwinLife – An open access twin family panel on genetic and social causes of inequalities
Professor Martin Diewald (Bielefeld University)
Mr Volker Lang (Bielefeld University)

TwinLife is a genetically sensitive open access panel study on the development of inequalities in different life domains. The data collection started in 2014, is planned for 10 years, and surveys over 4,000 monozygotic and same-sex dizygotic twin pairs living in Germany and their families. The comparison of monozygotic and dizygotic twins facilitates not only analyses of social mechanisms but also the identification of genetic differences as well as research on the interaction and covariation of social and genetic causes of inequalities. As part of the presentation we will show exemplary genetically sensitive analyses of twin family data.

To these ends the TwinLife data collection covers six important inequality domains: I. Education, academic performance, and skill development; II. occupational careers and labor market attainment; III. social, cultural, and political integration and participation; IV. subjective quality of life and perceived capabilities; V. physical and psychological health; VI. behavioral problems and deviant behavior.

Survey Design

The TwinLife panel combines a sequential cohort design with an extended twin family design (ETFD). The related surveys are conducted yearly, with the mode alternating between face-to-face interviews at home, including some tests, and CATI interviews. Parts of the face-to-face surveys are conducted in parallel modes, i.e., as computer-assisted or paper-and-pencil self-interviews, enabling related methodological analyses.

The sequential cohort-design comprises four cohorts: The youngest twins in cohort 1 (birth years 2009 and 2010) are about 5 years of age at the time of the first survey in 2014 and 2015. The oldest twins in cohort 4 (birth years 1990 to 1993) are about 31 to 32 years of age at the time of the last survey in 2022 and 2023. The twins in cohorts 2 and 3 are born in the years 2003 to 2004 and 1997 to 1998, respectively. This design enables the TwinLife panel to cover an age range between 5 and 32 years with a data collection phase of 10 years. This age range covers important life-course transitions from school entry to the labor market entry phase as well as critical life stages for mating and family formation.
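
The age coverage described above can be checked with rough calendar-year arithmetic. The sketch below uses only the birth years and survey years stated in the abstract; the function and constant names are hypothetical, and the calculation ignores birth months and the exact year in which each family was interviewed, so it yields outer bounds rather than the "about" figures quoted in the text.

```python
# Birth years per cohort and survey years, as stated in the abstract.
COHORT_BIRTH_YEARS = {
    1: (2009, 2010),
    2: (2003, 2004),
    3: (1997, 1998),
    4: (1990, 1993),
}
FIRST_SURVEY = (2014, 2015)
LAST_SURVEY = (2022, 2023)


def age_range(birth_years, survey_years):
    """Return (lowest, highest) approximate age as survey year minus birth year."""
    lowest = min(survey_years) - max(birth_years)
    highest = max(survey_years) - min(birth_years)
    return lowest, highest


for cohort, births in COHORT_BIRTH_YEARS.items():
    first = age_range(births, FIRST_SURVEY)
    last = age_range(births, LAST_SURVEY)
    print(f"cohort {cohort}: first survey {first}, last survey {last}")
```

Under these assumptions, the ages quoted in the abstract (about 5 for cohort 1 at the first survey, about 31 to 32 for cohort 4 at the last survey) fall inside the computed bounds.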

As part of the ETFD, in addition to the twins themselves the biological and if applicable social parents as well as the sibling that is closest in age to the twins are surveyed. Moreover, the partners of adult twins are included as well. This family perspective enables comparisons regarding different degrees of genetic similarity, and it is important to analyze the manifold influences of the family environment on the development of the twins in greater detail.
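
To illustrate why comparing groups with different degrees of genetic similarity is informative, the sketch below applies the classical Falconer formulas to twin correlations. This is a textbook approximation and not necessarily the method the authors use; the function name and the example correlations are hypothetical. It assumes that monozygotic twins share all and dizygotic twins on average half of their segregating genes, and that both types of twins share their family environment to the same extent.

```python
def falconer_ace(r_mz, r_dz):
    """Decompose trait variance from MZ and DZ twin correlations.

    A: additive genetic share, C: shared environment, E: non-shared
    environment (including measurement error). Classical approximation only.
    """
    a2 = 2.0 * (r_mz - r_dz)   # MZ pairs share twice as much genetic variance as DZ pairs
    c2 = 2.0 * r_dz - r_mz     # similarity not explained by genetic resemblance
    e2 = 1.0 - r_mz            # whatever makes even MZ twins differ
    return {"A": a2, "C": c2, "E": e2}


# Hypothetical correlations for some outcome, for illustration only.
estimates = falconer_ace(r_mz=0.75, r_dz=0.5)
print(estimates)
```

The three components sum to one by construction. Extended designs that add parents, siblings and partners allow some of these assumptions to be relaxed, which is beyond this sketch.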

Data access

Since November 2016, data from the first face-to-face at-home interviews of 2,009 twin families have been available as a scientific use file in the GESIS data catalogue. The data and documentation are obtainable in English and German. Besides the survey data collected, the release contains mode-related paradata on a variable-by-variable basis as well as socio-demographic information on the interviewers.

3. The German PIAAC-Longitudinal Survey: A Wealth of Data
Ms Anouk Zabal (GESIS Leibniz Institute for the Social Sciences)
Ms Silke Martin (GESIS Leibniz Institute for the Social Sciences)

The German PIAAC-Longitudinal project (PIAAC-L) was created as a national longitudinal extension to the Programme for the International Assessment of Adult Competencies (PIAAC). Researchers and policy-makers alike have shown great interest in the PIAAC data, but this data also has limitations. One of the central objectives of the PIAAC-L project is to enrich and enhance the German PIAAC 2012 data. Thus, in order to be able to address more, and more elaborate, research questions, PIAAC-L implemented a longitudinal design with three follow-up waves of data collection. The design also extended the focus to include the household: The follow-up waves targeted not only German PIAAC respondents (anchor persons), but also adult members of the household. Furthermore, a very varied set of instruments was implemented: core questionnaires adopted from the German Socio-Economic Panel (SOEP), question modules from a number of other surveys, and also new questions. In addition, the second wave administered a direct cognitive assessment not only with PIAAC literacy and numeracy instruments, but also with reading and mathematics instruments from the NEPS (National Education Panel Survey).

As a result, a number of very rich and detailed datasets have already been made available at the GESIS Data Archive / Research Data Center PIAAC. We will give an overview of the accessible data and discuss some of the challenges encountered in releasing the data for scientific use and some of the ensuing restrictions. Furthermore, we will give an example of paradata obtained as a by-product of the computer-based cognitive assessment using PIAAC instruments.