ESRA 2019 Programme at a Glance

Dealing with International Comparative Survey Instruments in the Field of Education and Occupational Status: Challenges and Achievements

Session Organiser Dr Alexandra Mergener (Federal Institute for Vocational Education and Training)
Time Friday 19th July, 09:00 - 10:30
Room D21

Over the past decades, the number of international comparative approaches and studies in the fields of educational and occupational research has increased substantially. Although these approaches yield tremendous insights and help researchers reflect on and interpret their own country's results from another perspective, there are challenges, and sometimes limitations, in achieving effectively comparable results. Different cultural and economic settings lead to diverse understandings of concepts. Especially in the case of education and occupational status, there are system-specific characteristics researchers have to deal with in every country, which hinder the measurement, harmonisation and, ultimately, comparison of concepts.
International classifications, scales and indices of education and occupational status (e.g. ISCED, CASMIN, ISCO, ISEI, SIOPS, EGP) are survey instruments intended to enhance international comparability. Nevertheless, it has been shown, for instance, that the International Standard Classification of Education (ISCED) or 'years of education' measure educational attainment in cross-national surveys inadequately, as they do not sufficiently capture the distinct value of vocational training in some countries. Additionally, surveys may ask about occupations or occupational status in different ways (e.g. asking for job titles or job tasks), depending on the national concept and understanding of occupations.
Thus, in the proposed session we look for contributions focusing on experiences with the international comparability of classifications of education and occupations or occupational status. We are interested in how researchers handle the coding of different educational and occupational variables and how they transfer them into international classifications. What kinds of analytical approaches could enhance international comparability? Survey researchers who are active in these fields are welcome to present their (recent) studies and their strategies for dealing with these issues, or to discuss existing problems and challenges with the international comparability of these survey instruments.

Keywords: International comparability, cross-cultural research, occupations, education, classifications

Implementing an In-Field Coding Tool for Occupations in an International Survey - Experiences from SHARE

Ms Stephanie Stuck (SHARE) - Presenting Author

Measuring and coding occupations in an international context is not only challenging but can be very expensive and time-consuming if coding needs to be done 'by hand' afterwards. The Survey of Health, Ageing and Retirement in Europe (SHARE) developed and implemented an in-field coding tool - the so-called job coder - to deal with these challenges. The tool was first used in the fifth wave of SHARE to measure and code the occupations of respondents as well as of their parents. The tool draws on a large multilingual database to measure occupations and provide ISCO codes. Within the context of the SERISS project, this tool was further developed, since this joint project of several European surveys aimed at developing tools to harmonise measurement and coding in an international context. Based on previous experience from SHARE, the tool, including the multilingual database with thousands of job-title entries, was improved and went into the field again in 28 countries. But even though SHARE is ex ante harmonised and uses a common interview tool that is centrally programmed and implemented in all countries in the same way, many challenges remain: translating questions and interviewer instructions, training interviewers and, last but not least, dealing with the resulting data, e.g. jobs that remain uncoded. This talk gives an insight into the development and implementation of the job coder tool, the problems we faced and how we dealt with them.
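The core idea of such an in-field coder can be sketched as a lookup from a normalised, language-tagged job title to an ISCO-08 code. The function names and the tiny index below are purely illustrative assumptions, not SHARE's actual database or interface:

```python
def normalise(title):
    """Lower-case a free-text job title and collapse whitespace."""
    return " ".join(title.lower().split())

# Illustrative (language, title) -> ISCO-08 entries; the real SHARE
# database holds thousands of job titles per language.
ISCO_INDEX = {
    ("en", "nurse"): "2221",
    ("en", "software developer"): "2512",
    ("de", "krankenpfleger"): "2221",
}

def code_job(language, title):
    """Return an ISCO-08 code, or None so the case can be flagged for manual coding."""
    return ISCO_INDEX.get((language, normalise(title)))

print(code_job("en", "  Software  Developer "))  # 2512
print(code_job("en", "astronaut"))               # None -> remains uncoded
```

An in-field tool along these lines lets the interviewer resolve ambiguous titles with the respondent immediately, while unmatched answers (the `None` cases) end up as the remaining uncoded jobs that still need manual treatment.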

Qualifications and Duration as Measures of Level of Education

Mr Harry Ganzeboom (VU University Amsterdam) - Presenting Author
Mrs Ineke Nagel (VU University Amsterdam)
Mrs Heike Schröder (IB Research)

In comparative research, the level of education is routinely measured using one of two methods. The qualification method measures the level by the highest (or most recently achieved) diploma. Best practice here is to measure qualifications in country-specific terms and then to post-harmonise these using a common denominator. The recent development of the three-digit International Standard Classification of Education 2011 (ISCED-2011) has been a major game-changer in this methodology, because for the first time a detailed and rigorous harmonisation framework has become available, which allows researchers to scale qualifications to an internationally valid linear metric (Schröder & Ganzeboom 2014). Alternatively, comparative research measures the level of education by its duration, best collected by asking respondents about the (net) length of their educational careers. Both methods have their pros and cons, and their fervent proponents and opponents (Braun & Müller 1997; Schneider 2009). We examine these arguments and conclude that the discussions have overlooked the fact that qualification measures and duration measures are strongly correlated and can usefully be regarded as parallel indicators of the same underlying construct.

We then examine the quality of the qualification and duration measures empirically using a Saris & Andrews (1991) Multi-Trait Multi-Method (MTMM) model. This reformulation of the classical MTMM models allows one to derive separate validity and reliability coefficients. The model is estimated on EU-SILC household data, in which both types of measurement have been obtained for all members of the household (both partners and children). The provisional estimates indicate almost equal validity for the qualification and duration measures – also in countries for which the validity of duration measurement has been contested – but show that duration suffers from about 10% more unreliability than qualification measurements. Finally, it is shown that double-indicator measurement – by both qualifications and duration – improves both validity and reliability.
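The true-score MTMM decomposition referred to above is commonly written as follows (a sketch in standard notation, following Saris & Andrews 1991; the exact specification estimated in the paper may differ):

```latex
y_{tm} = r_{tm}\, T_{tm} + e_{tm}, \qquad
T_{tm} = v_{tm}\, F_{t} + m_{tm}\, M_{m},
```

where $y_{tm}$ is the observed measure of trait $t$ (here, level of education) by method $m$ (qualification or duration), $r_{tm}$ the reliability coefficient, $v_{tm}$ the validity coefficient, $F_t$ the trait factor, $M_m$ the method factor and $e_{tm}$ random error; total measurement quality is then $r_{tm}^{2} v_{tm}^{2}$.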

Developing an Instrument to Measure Mathematics Self-Efficacy through a Survey with Macau Students: Could We Also Establish Cross-National Comparability?

Miss Ka Hei Lei (University of Manchester) - Presenting Author
Miss Maria Pampaka (University of Manchester)

We focus on the development of a survey instrument aiming to construct a psychometrically valid measure of mathematics self-efficacy (MSE). Self-efficacy is defined as an individual's self-belief in their capability to organise their actions in order to produce given attainments; MSE should therefore be contextualised within relevant mathematical tasks. There is very little research focusing on the comparability of this construct across national/cultural contexts, a gap we aim to address by implementing the same instrumentation/validation methodology as in a large UK study for Macau students (i.e. of a Chinese cultural context). The instrument development is underpinned by two main objectives: 1) to ensure the instrument produces a healthy unidimensional MSE measure for Macau students, and 2) to enable the comparison of MSE between the UK and Macau student samples. To achieve these objectives, we consider the following for item selection: firstly, we select maths tasks which are appropriate for the age groups of the Macau students involved, according to their respective mathematics curriculum. We also include some maths tasks used and validated for MSE measurement within a large-scale UK study, to serve as anchoring items that allow us to check cultural comparability. The analytical approach we use for validation is based on Rasch analysis (the rating scale model), including differential item functioning, to test for measurement invariance between gender groups within Macau and between the Macau and UK national groups. Once measurement validity and invariance are established, we proceed with further statistical modelling with the constructed measures and other variables from the survey (also compared with the UK sample, N=8000+) to provide deeper insight into the development of MSE in Macau students and into cultural differences. Data will be collected at the beginning of 2019 (expected N=500).
The presentation will include methodological challenges, and preliminary findings.
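The rating scale model used for validation is commonly written as follows (a sketch in standard Andrich notation; the study's exact parameterisation may differ):

```latex
P(X_{ni} = k) =
\frac{\exp \sum_{j=0}^{k} \left( \theta_n - \delta_i - \tau_j \right)}
     {\sum_{l=0}^{M} \exp \sum_{j=0}^{l} \left( \theta_n - \delta_i - \tau_j \right)},
\qquad \tau_0 \equiv 0,
```

where $\theta_n$ is person $n$'s MSE, $\delta_i$ the difficulty of task $i$, and the thresholds $\tau_j$ are shared across items; differential item functioning then tests whether $\delta_i$ shifts between groups (e.g. Macau vs. the UK, or by gender).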

Comparing and Validating New Methods to Control for Response Biases in Self-Report Educational Data

Mr Marek Muszyński (Jagiellonian University) - Presenting Author

Response bias is a general term for a wide array of processes that result in inaccurate or false responses in self-report data (Furnham, 1986). Response biases are perceived as a threat to validity as they lead to lower data quality (Maniaci & Rogge, 2014). Many methods to control for response biases have been developed, but none has acquired 'gold standard' status. Most of the methods are of unproven utility and have to be implemented before the data are collected. This calls for the development of methods that can also be applied after data collection. More validation studies are needed, preferably using non-self-report criteria (e.g. cognitive tests).
The present research was performed using the PISA 2012 dataset, concentrating on the 'math familiarity' scale. Numerous methods of identifying biased responses were used: the overclaiming questionnaire, long-string analysis, psychometric synonyms, intraindividual response variability (IRV), Cattell's sabotage index, the modified individualised chance score (MFIC), Mahalanobis distance, the dr* outlier measure and polytomous person-fit statistics.
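Two of the simpler indices listed above, long-string and IRV, can be sketched as follows (a minimal illustration on made-up data, not the study's implementation):

```python
from statistics import pstdev

def long_string(responses):
    """Length of the longest run of identical consecutive answers."""
    best = run = 1
    for prev, cur in zip(responses, responses[1:]):
        run = run + 1 if cur == prev else 1
        best = max(best, run)
    return best

def irv(responses):
    """Intra-individual response variability: the SD of one respondent's answers."""
    return pstdev(responses)

# Made-up responses on a ten-item Likert scale
straightliner = [3] * 10                       # long_string = 10, irv = 0.0
varied = [1, 4, 2, 5, 3, 1, 4, 2, 5, 3]        # long_string = 1
print(long_string(straightliner), irv(straightliner))  # 10 0.0
```

Respondents whose long-string index exceeds a chosen cut-off, or whose IRV falls below one, are flagged as potential straightliners; choosing those cut-offs is itself an open question.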
As response biases are known to suppress inter-variable relations, the magnitude of the regression coefficient between the self-assessment scale and a cognitive math test, and the magnitude of the R2 statistic from multilevel regression, were used to validate the methods.
The IRV and long-string indices proved to be effective methods of eliminating straightliners, but the outlier-detection methods yielded hard-to-interpret results, as eliminating flagged outliers did not have a noticeable effect. The latent profile analysis yielded three clusters, two of which contained aberrant responses: one consisting of low-consistency participants, the other of straightliners.
The conducted analyses brought new evidence on which response bias methods should be used and to what end, but also raised new questions, e.g. what cut-offs should be used to identify outliers, and how the generated indices should then be used in quantitative analyses of survey results.