ESRA 2019 Draft Programme at a Glance
Survey Data Harmonisation: Potentials and Challenges 2
Session Organisers: Dr Ilona Wysmulek (IFiS PAN), Dr Irina Tomescu-Dubrow (IFiS PAN and CONSIRT)
Time: Friday 19th July, 13:00 - 14:00
Survey data harmonization - its theory and methodology - is growing into a new scientific field that pushes forward the methods of survey data analysis while emphasizing the continuous relevance of surveys for understanding society. Depending on whether researchers intend to design a study to collect comparable data, or to use existing data that were not designed a priori to be comparative, the literature distinguishes between input and ex-ante output harmonization on the one hand, and ex-post output (or simply ex-post) harmonization on the other. Applied ex-ante, harmonization facilitates comparability of survey data collected in multinational, multiregional and multicultural contexts (3MC, www.csdiworkshop.org). Applied ex-post, harmonization enhances the effective use of extant surveys and represents a way to overcome the limited time and space coverage inherent in any single comparative project. In both its forms, ex-ante and ex-post, harmonization is a complex, labor-intensive and multistage process, which poses numerous challenges at different stages of the survey lifecycle.
This session welcomes papers on both opportunities and difficulties inherent in ex-ante and ex-post survey data harmonization. We invite theoretical and empirical contributions that deal, among others, with (a) transparency of harmonization procedures, (b) variability in source data quality, (c) minimizing information loss in harmonization, (d) measurement equivalence, and (e) substantive analyses on survey data harmonized ex-post.
Keywords: Data harmonization, transparency of harmonization process, comparability of survey data, data quality
A Pragmatic Approach to Survey Data Harmonisation
Professor Zbigniew Sawiński (Institute of Philosophy and Sociology Polish Academy of Sciences) - Presenting Author
Harmonization of survey data presupposes creating a variable that is common across a number of surveys carried out within-country or cross-nationally. To this end, information from survey-specific items (i.e. source variables) needs to be mapped into the target variable common across all surveys, which often involves information loss. When loss differs across surveys, harmonization can distort conclusions, for example when effects of the same variable, like the impact of education on achievement, are compared between countries.
To minimize information loss in cross-country harmonization, I propose a 'pragmatic' approach to creating common measures, namely one that takes into account knowledge and experience of national research teams. For certain concepts, decades of national research have resulted in classifications that probably measure these concepts best within countries. Put differently, each country has its own harmonized target variable, for example, for education. This information, instead of survey items, could be used to create a common classification applied in cross-country comparisons. To enable peculiarities of each country to fit, the common classification would be rather general with broad and flexible categories.
Having a common classification is the first step, but not the last, because in some countries the variable could be concentrated within only a few categories of the classification, which would again lead to information loss. To avoid such loss, any category of the common classification can be further subdivided in ways that are country-specific, rather than common for all countries. The decisions about which categories, if any, should be subdivided would take into account national expertise. In short, pragmatic harmonization consists of creating a general common classification, whose categories can be further subdivided by countries. I will illustrate both advantages and disadvantages of pragmatic harmonization using ESS data on education.
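The two-level idea described in this abstract (a broad common classification plus optional country-specific subdivisions) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the country codes "A" and "B", the national category labels, the common categories "low", "medium" and "high", and the function name `harmonize_education` are all hypothetical and are not taken from the ESS or from the author's work.

```python
# Hypothetical national education classifications mapped to a broad,
# common classification. All labels are illustrative assumptions.
COMMON_MAP = {
    "A": {"primary": "low", "vocational": "medium",
          "general secondary": "medium", "tertiary": "high"},
    "B": {"basic": "low", "secondary": "medium", "university": "high"},
}

# Optional country-specific subdivisions of a common category.
# Country "A" splits "medium"; country "B" keeps the common categories.
SUBDIVISIONS = {
    "A": {"vocational": "medium/vocational",
          "general secondary": "medium/general"},
}


def harmonize_education(country, national_category):
    """Return (common category, country-specific detailed category)."""
    common = COMMON_MAP[country][national_category]
    # Fall back to the common category when no subdivision is defined.
    detailed = SUBDIVISIONS.get(country, {}).get(national_category, common)
    return common, detailed


print(harmonize_education("A", "vocational"))
print(harmonize_education("B", "secondary"))
```

In this sketch, cross-country comparisons would use the first element of the returned pair, while within-country analyses could use the second, more detailed one.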
How can Research Data Management Help to Produce Data for Comparative Research
Mrs Irena Vipavc Brvar (ADP - Slovenian Social Science Data Archives) - Presenting Author
Mrs Ellen Leenarts (DANS - Data Archiving and Networked Services)
Dr Peter Doorn (DANS - Data Archiving and Networked Services)
Good data management is essential in every research process, and even more so in international or comparative research. Large research teams and the legal regulations in collaborating countries make international research particularly challenging. In this presentation, we will present CESSDA's Data Management Expert Guide (www.cessda.eu/dmeg) with a special focus on the tasks needed to make international research and (re-)use of collected data possible. The guide is an online tutorial based on the extensive experience of the Social Science Data Archives joined in the CESSDA consortium. It combines the data archives’ knowledge from engaging with researchers and answering their questions about Research Data Management (RDM). The guide is based on the research data life cycle and can be used either by individual researchers as self-study material or as part of an online or face-to-face workshop on research data management. Many sources of information on RDM already exist, and finding exactly the information you need can be a challenge. The CESSDA Data Management Expert Guide was designed to provide a comprehensive overview of all relevant aspects of data management within one online environment. It offers guidance for social scientists throughout all stages of their research while taking into account the diversity of the specific requirements that scientists may have to deal with. Based on responses from current users, this guide can be helpful outside the social sciences as well.
In addition to the guide, we will present the work done by Science Europe (S.E.), which has recently published core RDM requirements. These are being implemented by a growing number of national and international research funders in Europe. On top of this, S.E. invites communities to formulate domain protocols for data management that would ease data management planning. A draft for such a protocol for social science research will be presented.
Left-Right Orientation: A Harmonisation Case Study
Dr Ranjit K. Singh (GESIS - Leibniz Institute for the Social Sciences) - Presenting Author
Dr Natalja Menold (GESIS - Leibniz Institute for the Social Sciences)
We present a harmonization case study focusing on the left-right orientation, a construct frequently used in social and political science research that is included in most large German social science survey programs. Despite the concept’s popularity, there is no single standard of how to measure it. Instead, the survey programs use operationalizations which differ regarding crucial design aspects such as whether or not to present a scale midpoint or whether or not to present an explicit “don’t know”-option. Consequently, the left-right orientation is well suited to examine the need for greater harmonization.
Our study will address two central issues regarding harmonization. (1) What are the consequences of this lack of harmonization? To answer this we will examine if the different operationalizations lead to different distributions of the left-right orientation and if they impact reliability and criterion and construct validity. (2) Can the different left-right scales be ex-post harmonized? To answer this we will use and compare different ex-post harmonization approaches. We will also examine the limitations and drawbacks of an ex-post harmonized left-right orientation.
We will present evidence from our comparison of the left-right scale in (the German parts of) eight survey programs: the German General Social Survey (ALLBUS/GGSS), the International Social Survey Programme (ISSP), the European Values Study (EVS), the European Social Survey (ESS), the German Longitudinal Election Study (GLES), the GESIS Panel, the Socio-Economic Panel (SOEP), and the Politbarometer.
With our study, we hope to (1) inform users of the survey programs’ data about the limitations due to non-harmonized operationalizations of the left-right orientation, to (2) present evidence of the use of harmonization to the organizers of the survey programs, and to (3) contribute to the developing literature on harmonization methodology.
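The abstract does not say which ex-post harmonization approaches are compared. As a point of reference, one simple and widely used option is linear stretching, which maps each response onto a common numeric range. The sketch below assumes, purely for illustration, one source scale running from 1 to 10 and another from 0 to 10; the function name `rescale` and the target range of 0 to 10 are likewise assumptions. Linear stretching does not correct for design differences such as the presence of a scale midpoint or a “don’t know” option.

```python
def rescale(value, source_min, source_max, target_min=0.0, target_max=10.0):
    """Linearly map a response from its source scale onto a common target scale."""
    if not source_min <= value <= source_max:
        raise ValueError("response lies outside the source scale")
    # Position of the response within the source scale, between 0 and 1.
    share = (value - source_min) / (source_max - source_min)
    return target_min + share * (target_max - target_min)


# Hypothetical responses from a 1-10 scale and a 0-10 scale.
print(rescale(1, 1, 10))   # left endpoint of the 1-10 scale -> 0.0
print(rescale(10, 1, 10))  # right endpoint of the 1-10 scale -> 10.0
print(rescale(5, 0, 10))   # midpoint of the 0-10 scale -> 5.0
```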
Harmonisation of Institutional Trust Measures for Eastern European and Eurasian Surveys: Issues of Variable Commensurability, Weighting and Missing Value Imputation
Mr David Wutchiett (Université de Montréal) - Presenting Author
Eastern European and Eurasian countries have seen a proliferation of public opinion surveys collected since the early 1990s. Most of these studies include questions addressing public opinion regarding trust in institutions. Despite the potential for aggregate comparative evaluation of national yearly trends, data harmonization remains a considerable challenge given the frequent presence of differences in survey methodological approaches including variation in question formatting and gaps in longitudinal coverage. In addition, the impact of respondent sample weighting must be addressed. Important differences in social and political composition and circumstances between countries likewise present challenges for efforts at drawing inferences relevant to the region as a whole.
The present study examines survey data harmonization processes in the context of 484 surveys collected from 600,000 participants residing in 29 Eastern European and Eurasian countries between the years 1991 and 2016. Several methodological aspects of data harmonization will be discussed in depth including evaluation and reconciliation of differences in variable scale and distribution across survey projects and waves. Further, strategies and options for the imputation of missing values within and between surveys following the merging of multinational survey data sets will be considered. Finally, the utilization of external country-level descriptive social, economic and political variables during imputation and weighting procedures will be examined through the implementation and evaluation of methodological approaches including multiple imputation and raking.
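Raking, one of the weighting approaches named above, adjusts respondent weights iteratively until the weighted sample matches known population margins. The following is a minimal pure-Python sketch of the general technique (iterative proportional fitting), not the author's implementation. The variable names, the toy respondents and the target shares are hypothetical, and the sketch assumes that every category in the targets occurs at least once in the sample.

```python
def rake(rows, margins, max_iter=100, tol=1e-9):
    """Rake weights so that weighted category shares match target margins.

    rows:    list of dicts, one per respondent, holding categorical values
    margins: {variable: {category: target share}}, shares sum to 1 per variable
    """
    weights = [1.0] * len(rows)
    for _ in range(max_iter):
        max_change = 0.0
        for var, targets in margins.items():
            total = sum(weights)
            # Current weighted total per category of this variable.
            current = {}
            for w, row in zip(weights, rows):
                current[row[var]] = current.get(row[var], 0.0) + w
            # Scale each weight so this variable's margin matches its target.
            for i, row in enumerate(rows):
                factor = targets[row[var]] * total / current[row[var]]
                max_change = max(max_change, abs(factor - 1.0))
                weights[i] *= factor
        if max_change < tol:  # all margins already match; stop iterating
            break
    return weights


# Hypothetical respondents and population margins.
sample = [
    {"sex": "f", "age": "young"}, {"sex": "f", "age": "old"},
    {"sex": "m", "age": "young"}, {"sex": "m", "age": "old"},
    {"sex": "m", "age": "old"},
]
targets = {"sex": {"f": 0.5, "m": 0.5}, "age": {"young": 0.4, "old": 0.6}}
raked = rake(sample, targets)
```

In a harmonized multi-survey file, such margins would typically come from external country-level sources, and the procedure would be run separately for each country and year.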
Data Visualisation for Comparative Social Science Survey Data and Metadata at GESIS
Miss Julia Hermann (GESIS) - Presenting Author
Mr Wolfgang Zenk-Moeltgen (GESIS)
At GESIS, numerous data from national and international comparative studies are prepared, documented, and archived. These data are currently offered on different GESIS portals for reuse. Users receive metadata about these study collections (such as overviews of trends, scales, thematic categories, survey years and participating countries) via the value-added products, the data catalog and the homepages of the study collections - mostly in the form of tables or long lists. This ensures the completeness of the data, but it is at the expense of clarity.
The already enormous and constantly growing amount of data makes it increasingly difficult for users to get an overview, to quickly select the needed information, and to decide on dataset selection. For this reason, we are developing a tool with which metadata and survey data can be displayed using various graphics. The advantage of data visualization is that it makes key relationships in the data understandable at a glance, summarizes information, and reduces the complexity of understanding the data. Graphics should have a main message in order to draw the users' attention to certain datasets, developments, and contexts from a study collection. In addition to standard graphics, country maps will also be created. Moreover, the use of interactive and animated graphics is planned.
As part of the project, different solutions and approaches are currently being developed to make the data visible and as understandable to users as possible. The main benefits of the project are that different software solutions are compared, including customized, individually programmed solutions, and that the internal workflow of creating and providing visualizations is being considered. This enables us to offer practical support to data archive staff in giving secondary users a better overview of the data.