ESRA 2019 Draft Programme at a Glance


Contemporary issues in the assessment of measurement invariance 1

Session Organisers Dr Daniel Seddig (University of Cologne & University of Zurich)
Professor Eldad Davidov (University of Cologne & University of Zurich)
Professor Peter Schmidt (University of Giessen)
Time: Tuesday 16th July, 14:00 - 15:30
Room: D23

The assessment of the comparability of cross-national and longitudinal survey data is a prerequisite for meaningful and valid comparisons of substantive constructs across contexts and time. A powerful tool to test the equivalence of measurements is multiple-group confirmatory factor analysis (MGCFA). Although measurement invariance (MI) testing procedures are increasingly used by applied researchers, several issues remain under discussion and unresolved. For example:

(1) Can we trust models with small deviations (approximate MI)? Is partial MI sufficient? How should one deal with the lack of scalar MI, as is the case in many large-scale cross-national surveys?
(2) How should one decide whether a model with a higher level of MI is to be preferred over a model with a lower level of MI? Which fit indices should be used?
(3) Is MI needed at all, or would it be better to start with a robustness calculation?

Recent approaches have tackled the issues subsumed under (1) and aimed at relaxing certain requirements when testing for measurement invariance, using Bayesian approximate MI (Muthén and Asparouhov 2012; van de Schoot et al. 2013) or the alignment method (Asparouhov and Muthén 2014). Furthermore, researchers have addressed the issues subsumed under (2) and recommended the use of particular fit statistics (e.g., CFI, RMSEA, SRMR) to decide among competing models (Chen 2007). The question raised under (3) is a more general one and raises concerns about the contemporary uses of the concept of MI. Researchers (Welzel and Inglehart 2016) have argued that variations in measurements across contexts can be ignored, for example in the presence of theoretically reasonable associations of a construct with external criteria.
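As a concrete illustration of the model-comparison logic raised under (2), the change-in-fit criteria commonly attributed to Chen (2007) can be expressed as a simple decision rule. The sketch below is illustrative only; the function name and the cutoff values (ΔCFI ≤ .010, ΔRMSEA ≤ .015) are assumptions drawn from that literature, not part of this programme:

```python
def accept_more_constrained(cfi_base, cfi_constrained,
                            rmsea_base, rmsea_constrained,
                            d_cfi=0.010, d_rmsea=0.015):
    """Compare a model with a higher level of MI (e.g., scalar) against a
    less constrained one (e.g., metric) using change-in-fit criteria in the
    spirit of Chen (2007). Returns True if the fit deterioration caused by
    the added equality constraints stays within the assumed cutoffs."""
    # Constraining parameters can only worsen fit; the question is by how much.
    cfi_drop = cfi_base - cfi_constrained      # positive = fit got worse
    rmsea_rise = rmsea_constrained - rmsea_base
    return cfi_drop <= d_cfi and rmsea_rise <= d_rmsea

# Hypothetical example: metric model CFI=.952, RMSEA=.041;
# scalar model CFI=.945, RMSEA=.049 (ΔCFI=.007, ΔRMSEA=.008).
print(accept_more_constrained(0.952, 0.945, 0.041, 0.049))  # True
```

In practice such deltas would come from fitted MGCFA models (e.g., configural vs. metric vs. scalar), and SRMR would typically be checked alongside CFI and RMSEA.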

This session aims at presenting studies that assess measurement invariance and/or address one of the issues listed above or related ones. We welcome presentations that (1) are applied and make use of empirical survey data, and/or (2) take a methodological approach to address and examine measurement invariance testing, using, for example, Monte Carlo simulations to study the above-mentioned issues.

Keywords: measurement invariance, comparability, cross-cultural research, structural equation modeling

Why Are Gender Role Attitudes Not Equivalent across Countries? A Cultural Explanation Using Multilevel Structural Equation Modeling

Dr Vera Lomazzi (GESIS-Leibniz Institute for the Social Sciences) - Presenting Author
Dr Daniel Seddig (University of Cologne & University of Zurich)

Differences in societal views on the roles of men and women have been addressed in many large-scale comparative studies in recent years. A prerequisite for valid comparisons of attitudes towards gender roles, however, is that the measures are comparable across countries. Thus, measurement equivalence must be assessed before drawing substantive conclusions. The current study has three main goals. First, we show that the comparability of gender role attitudes is limited when traditional methods (multiple-group confirmatory factor analysis) are used to test measurement invariance with data from the International Social Survey Programme 2012. However, the recently established alignment optimization procedure suggests that comparability is given. Second, we correlate the national mean levels of gender role attitudes obtained with the alignment method with cultural values to show that societal views on the roles of men and women vary with respect to shared goals and views about what is desirable. Societies that emphasize the importance of the collective and the status quo (embeddedness), as well as those with a strong preference for the maintenance of societal roles (hierarchy), tend to show more traditional gender role attitudes. Societies with more egalitarian values (egalitarianism) also display more egalitarian attitudes towards gender roles. While new methods such as the alignment procedure can alleviate the risks of drawing invalid conclusions, the reasons for noninvariance often remain unexplained. The third aim of this study is to investigate possible sources of noninvariance with multilevel structural equation modeling. We use two country-level variables to explain the absence of measurement invariance: the cultural value embeddedness explains noninvariance to a considerable degree, while the Gender Inequality Index (from the UNDP) does not.
Thus, the issues of comparability of gender role attitudes are related to cultural rather than structural differences between countries.


Is measurement invariance necessary to draw meaningful conclusions in multilevel modelling?

Professor Artur Pokropek (IFiS PAN) - Presenting Author
Professor Eldad Davidov (University of Cologne)
Professor Peter Schmidt (Justus Liebig University Giessen)

The relationship between individuals and society has been at the heart of social science research since its inception. A major tool to study the relations between contextual (macro) phenomena and individual (micro) processes has been multilevel modeling (MM). Heisig, Schaeffer, and Giesecke (2017) showed that more than 20% of articles in three leading sociological journals (American Journal of Sociology, American Sociological Review, and European Sociological Review) during the years 2011-2014 utilized MM as a tool to test their hypotheses. However, virtually none of these articles examined a crucial assumption when performing MM: the cross-country (or cross-group) comparability of the indicators used in the analysis, that is, the so-called measurement invariance or measurement equivalence assumption. This lacuna is unfortunate, because the methodological literature in recent decades has demonstrated that full cross-country comparability is very rarely supported by the data in international surveys (Davidov et al., 2014).
Lack of measurement invariance is likely to lead to wrong conclusions when parameters of interest (means or association measures such as regression coefficients or covariances) are compared across cultures (van de Vijver, 2011). However, nothing is known as to whether and to what extent lack of measurement invariance may also lead to biased parameters and wrong conclusions in MM.
To the best of our knowledge, the current study is the first attempt to fill this gap. It presents the results of Monte Carlo simulations in which we examine how different types of measurement noninvariance across countries affect various parameters of interest in MM. The results suggest that in many situations the potential bias caused by lack of measurement invariance in MM is substantial and poses serious threats to the validity of conclusions drawn from MM.
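The basic logic of such a simulation can be sketched in a few lines: generate two groups with identical latent means but a noninvariant item intercept, and observe the spurious mean difference that a naive composite comparison produces. All numbers below are illustrative assumptions, not the design of the study:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 5000  # respondents per group

# Two groups with the SAME latent mean (0), but group B has a
# noninvariant intercept on item 1 (+0.5): scalar non-invariance.
eta_a = rng.normal(0.0, 1.0, n)
eta_b = rng.normal(0.0, 1.0, n)

loadings = np.array([1.0, 0.8, 0.9])
intercepts_a = np.array([0.0, 0.0, 0.0])
intercepts_b = np.array([0.5, 0.0, 0.0])

items_a = intercepts_a + np.outer(eta_a, loadings) + rng.normal(0, 0.5, (n, 3))
items_b = intercepts_b + np.outer(eta_b, loadings) + rng.normal(0, 0.5, (n, 3))

# A naive composite-score comparison attributes the intercept shift to a
# latent mean difference that does not exist (expected bias = 0.5/3).
diff = items_b.mean(axis=1).mean() - items_a.mean(axis=1).mean()
print(round(diff, 2))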


How Much Can We Trust Conventional SEM Goodness-of-Fit Measures in Large Cross-National Invariance Tests?

Mr Boris Sokolov (Higher School of Economics) - Presenting Author

Measurement invariance is an important prerequisite for cross-cultural studies, since it ensures that latent constructs of interest are comparable across countries. Strikingly, most applied invariance studies follow guidelines based on a few decade-old simulation studies of the two-group setting. The assumption that those guidelines are applicable to the much larger, heterogeneous, and complex samples typical of modern international surveys is unrealistic. One negative consequence of using these old-fashioned and inappropriate guidelines for determining whether a particular MGCFA model satisfies invariance requirements is that researchers often find that popular sociological constructs lack cross-cultural comparability. Using Monte Carlo simulation experiments, this project examines how well popular SEM goodness-of-fit measures, such as CFI, TLI, RMSEA, and SRMR, perform in the context of measurement invariance testing in large samples. Its contribution to the existing methodological literature on cross-national survey research is three-fold. First, it explores how sensitive the aforementioned fit measures are to various amounts of measurement non-invariance in large samples (10, 30, or 50 groups) under various conditions imitating typical features of such survey data. Second, it tests how other model misspecifications affect model fit in the multigroup setting, thus disentangling the impact of different fit-worsening factors. The results suggest that CFI and SRMR are superior to RMSEA and TLI as measures of model misfit due to non-invariance, but the existing cut-off values for all these measures are too strict and should be somewhat relaxed. Finally, it examines how critical different levels of non-invariance are in terms of bias in the hierarchy of latent means.
The results show that the danger of measurement non-invariance might be somewhat exaggerated, since even under conditions with the highest levels of metric and scalar non-invariance the estimated latent means do not deviate strongly from the true population values.


Finding Subsets of Groups That Hold Measurement Invariance – A Simple Method and Shiny app

Dr Maksim Rudnev (ISCTE-IUL) - Presenting Author

Measurement invariance of constructs across many (>10) groups is rarely supported for all groups and indicators. One meaningful strategy in this situation is to look for subsets/clusters of groups for which a required level of invariance holds. The alignment procedure can help in finding outlier groups but provides biased results in the presence of clusters. Multilevel mixture models are applicable; however, they are very complex and have limited availability. I suggest a simple method to find clusters of groups that may hold measurement invariance. The method involves k-means clustering based on differences between group-specific factor parameters (loadings and intercepts) or, alternatively, on MGCFA model fit indices computed for each pair of groups. An interactive R Shiny app with a graphical interface facilitates the development of hypotheses regarding clusters. A simulation study demonstrates that, compared to the alignment procedure, the simple method is more efficient in the presence of group clusters and performs similarly in detecting outlier groups.
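A minimal sketch of the clustering step described above, assuming hypothetical group-specific loadings and intercepts rather than real MGCFA estimates (the abstract's own implementation is an R Shiny app; Python with scikit-learn is used here purely for illustration):

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical parameters: each row is one country's estimated loadings
# and intercepts for a 3-item scale (3 loadings followed by 3 intercepts).
# Two invariance clusters are built in by construction.
rng = np.random.default_rng(1)
cluster1 = rng.normal([0.8, 0.7, 0.9, 0.0, 0.1, 0.0], 0.02, (6, 6))
cluster2 = rng.normal([0.6, 0.9, 0.7, 0.5, 0.4, 0.6], 0.02, (6, 6))
params = np.vstack([cluster1, cluster2])  # 12 "countries"

# k-means on the parameter vectors: groups whose loadings/intercepts are
# close to each other may hold (approximate) invariance within a cluster.
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(params)
print(labels)
```

In a real application the rows of `params` would be configural-model estimates, the number of clusters would itself be a hypothesis to probe, and invariance within each recovered cluster would still need to be verified with MGCFA.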