Thursday 20th July, 16:00 - 17:30 Room: Q4 ANF3


Occupation coding 3

Chair: Professor Matthias Schonlau (University of Waterloo)
Coordinator 1: Mr Malte Schierholz (IAB)

Session Details

Occupation coding refers to coding a respondent’s text answer (or the interviewer’s transcription of the text answer) about the respondent’s job into one of many hundreds of occupation codes. We welcome any papers on this topic, including, but not limited to:
- measurement of occupations (e.g., mode, question design, …)
- handling of different occupational classifications (e.g., ISCO and national classifications)
- problems of coding (e.g., costs, data quality, …)
- techniques for coding (e.g., automatic coding, computer-assisted coding, manual coding, interview coding)
- computer algorithms for coding (e.g., machine learning, rule-based, …)
- cross-national and longitudinal issues
- measurement of derived variables (e.g., ISEI, ESeC, SIOPS, job-exposure matrices, …)
- other methodological aspects related to occupation coding

Paper Details

1. German classification of occupations and occupational fields: A suitable way to smooth breaks between KldB88/92 and KldB2010?
Dr Michael Tiemann (Federal Institute for Vocational Education and Training)
Mr Tobias Maier (Federal Institute for Vocational Education and Training)

The year 2010 saw the introduction of a new German classification of occupations, the KldB2010. It was developed mainly by the Federal Employment Agency, with support from the Federal Statistical Office and in consultation with an advisory board of different stakeholders and institutions. This was seen as a major step: the old national classifications (from 1988 by the Employment Agency and from 1992 by the Statistical Office) were not only partly incompatible with each other but had also remained essentially unchanged for decades, since both were updates of the 1975 classification. The implementation, however, was not easy. On the one hand, official statistics were delayed by a year or more. On the other hand, while the classification of 2010 offers an almost seamless transfer to the International Standard Classification of Occupations (ISCO-08), this is not the case for the old national classifications, so there are now considerable breaks between the old and new national classifications.
Data sources spanning longer periods of time either have to live with these breaks, if they use the old and new classifications with a cut-off date, or have to recode old data into the new classification (or vice versa). Recoding is problematic in itself, however, because changes in occupational content may have turned occupational positions into something distinctly different over time.
Before the current classification of 2010 was devised, the Federal Institute for Vocational Education and Training developed the “occupational fields” (BIBB-Berufsfelder). This system explicitly addressed major drawbacks of the old classifications and grouped occupations by one simple characteristic: occupational fields are homogeneous in their main tasks. Although developed for use in the BIBB/IAB qualification and occupational field projections (QuBe), occupational fields soon became widely used, among others by the Employment Agency in its service “Berufe im Spiegel der Statistik”.
With the new national classification, the occupational fields were updated. This update is challenging because the classifications differ in nature. While the old national classifications used several characteristics, such as materials, branches, workplaces and qualifications, the current classification relies strictly on specialization and requirements. The reliance on qualifications in the old classifications poses particular problems. The presentation will show how these challenges were met in the update of occupational fields.
This means that there is now an intuitive way to mend problems in longitudinal data caused by breaks between the old and current national classifications: using occupational fields can successfully smooth these breaks. This presentation thus gives an overview of the updated occupational fields and of their use for long-term occupational time series, with applications in Microcensus data using national classifications from 1975 to 2010. The key aspects are thus methodological issues in updating occupational fields, methodological and practical issues in the old and current national classifications of occupations, and possible uses for occupational fields, e.g. in long-term data series on occupations or tasks within occupations.
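The bridging idea described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' procedure: the codes, field labels and the two lookup tables (`OLD_TO_FIELD`, `NEW_TO_FIELD`) are invented placeholders, and the sketch simply assumes that every old and every new code can be assigned to exactly one occupational field.

```python
from collections import Counter

# Hypothetical lookup tables (invented codes and field labels, not real
# KldB or BIBB values): each national code is assigned to one field.
OLD_TO_FIELD = {"old_101": "F1", "old_102": "F1", "old_205": "F2"}
NEW_TO_FIELD = {"new_11": "F1", "new_27": "F2", "new_28": "F2"}


def field_counts(records):
    """Count records per (year, field).

    records: iterable of (year, scheme, code), where scheme is "old" or "new".
    """
    counts = Counter()
    for year, scheme, code in records:
        table = OLD_TO_FIELD if scheme == "old" else NEW_TO_FIELD
        counts[(year, table[code])] += 1
    return counts


# Made-up records: one wave coded in the old scheme, one in the new scheme.
records = [
    (2005, "old", "old_101"), (2005, "old", "old_102"), (2005, "old", "old_205"),
    (2012, "new", "new_11"), (2012, "new", "new_27"), (2012, "new", "new_28"),
]
series = field_counts(records)
print(series[(2005, "F1")], series[(2012, "F1")])  # 2 1
```

Because both waves end up on the same field labels, the resulting series can be compared across the classification change without recoding old codes into new ones.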


2. Job Activity Descriptions and an Auxiliary Classification for Simultaneous Coding into two Official Classifications
Mr Malte Schierholz (Institute for Employment Research)

Occupational classifications like the International Standard Classification of Occupations 2008 (ISCO-08) or the German Classification of Occupations 2010 (KldB 2010) cluster the variety of different occupations into categories of “similar” occupations. The definitions for each category are usually quite sophisticated, but too long to ask in a survey. Instead, one often relies on short job titles for coding purposes, but job titles are often not precise enough to avoid all ambiguities. To simplify the communication about occupational categories and to increase the data quality from occupational measurement, we propose to use job activity descriptions as a middle ground between long category definitions and short job titles. The job activity descriptions are part of an auxiliary classification that we developed in order to facilitate simultaneous coding into ISCO-08 and KldB 2010.

In this talk, we explain the reasons that led us to develop the new auxiliary classification, discuss guiding principles for development, and share insights we have gained.


3. Occupational classifications in Germany. Same same but different?
Dr Florian G. Hartmann (Universität der Bundeswehr München (University of the federal armed forces Munich))

The Classification of Occupations 2010 (KldB 2010) by the German Federal Employment Agency depicts the occupational landscape in Germany. It is a hierarchical classification. At the first level, 10 different occupational areas are distinguished (1-digit code). The function of these areas is to present a rough thematic overview. At the next three levels, occupations that are similar in terms of required knowledge, skills and abilities are grouped into classes. There are 37 classes at level two (Occupational Main-Groups; 2-digit code), 144 classes at level three (Occupational Groups; 3-digit code) and 700 classes at level four (Occupational Sub-Groups; 4-digit code). As the differentiation increases from level two to level four, occupations with the same 4-digit code are supposed to be highly similar in the activities to be carried out.
Another possibility for categorizing occupations, one that, unlike the KldB 2010, is used around the world, is based on Holland’s theory of vocational personalities and work environments. According to Holland (1997), occupations can be classified by six basic types of working environments (R, I, A, S, E, C). Usually only the three dominant types are used to describe an occupation. As a result, a so-called Holland code is generated (e.g. RIA). With this method, there are a total of 120 occupational classes (6*5*4). Like occupations with the same 4-digit code of the KldB 2010, occupations with the same Holland code are supposed to be very similar in their main activities.
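The count of 120 classes follows from choosing three distinct types out of six with order taken into account (6*5*4). A short Python sketch makes this concrete; it assumes, as the example RIA suggests, that the order of letters matters and that no type is repeated within a code.

```python
from itertools import permutations

TYPES = "RIASEC"  # the six environment types named in the abstract

# Ordered selections of three distinct types: RIA and IRA are different codes.
holland_codes = ["".join(p) for p in permutations(TYPES, 3)]

print(len(holland_codes))  # 120
print(holland_codes[:3])   # ['RIA', 'RIS', 'RIE']
```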
There are two widespread applications of Holland’s classification in Germany: the occupational database BERUFENET by the German Federal Employment Agency as well as the occupational list of the self-assessment tool EXPLORIX (Joerin Fux, Stoll, Bergmann & Eder, 2012). While the structure of the KldB 2010 is mainly based on the results of a cluster analysis, the Holland codes of the BERUFENET and the EXPLORIX were mainly generated by expert ratings. Both the KldB 2010 and Holland’s occupational classification are, in principle, based on the similarity of occupational activities. Therefore, occupations with the same 4-digit code of the KldB 2010 should have the same Holland code.
This hypothesis is examined by comparing the codes of the KldB 2010, the BERUFENET and the EXPLORIX. First results concerning the EXPLORIX show that only 41% of the KldB 2010 Occupational Sub-Groups (4-digit codes) subsume occupations with the same Holland code. The results are discussed in the light of international studies.
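One way to compute a share of this kind is sketched below in Python. The abstract does not spell out the exact procedure, so the sketch rests on assumptions: a sub-group counts as homogeneous only if all of its occupations carry an identical three-letter code, sub-groups with a single occupation count as homogeneous, and the function name `share_homogeneous` and all records are invented for illustration.

```python
from collections import defaultdict


def share_homogeneous(occupations):
    """Share of sub-groups whose occupations all have the same Holland code.

    occupations: iterable of (kldb_4digit, holland_code) pairs.
    """
    codes_by_subgroup = defaultdict(set)
    for kldb4, holland in occupations:
        codes_by_subgroup[kldb4].add(holland)
    homogeneous = sum(1 for codes in codes_by_subgroup.values() if len(codes) == 1)
    return homogeneous / len(codes_by_subgroup)


# Made-up records; the 4-digit codes are placeholders, not real KldB codes.
sample = [
    ("0001", "RIA"), ("0001", "RIA"),
    ("0002", "SEC"), ("0002", "SCE"),
    ("0003", "IRC"),
    ("0004", "ECS"), ("0004", "ESC"), ("0004", "ECS"),
]
print(share_homogeneous(sample))  # 0.5
```

A looser criterion, for example agreement on the first letter only, would give higher shares; the strict version is used here because the hypothesis speaks of "the same Holland code".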


4. Structured Derivation of Variables from Occupational Classifications with Stata
Mr Daniel Bela (LIfBi)
Mr Knut Wenzig (DIW/SOEP)

Perhaps the most widely used source for deriving prestige and status scores (e.g. ISEI, SIOPS, EGP) from occupational classifications is Harry Ganzeboom’s SPSS code (Ganzeboom 2016), which he publishes on his website. The well-known Stata modules by John Hendrickx (2002, 2004) adapt these scripts for Stata.
Although these scripts are the most sophisticated publicly available mechanism for calculating prestige and status scores, the approach via syntax code has two main shortcomings: (1) the necessarily complex architecture of the scripts makes it hard for users to fully comprehend the derivation process in detail; (2) currently, code for deriving prestige and status score variables from the latest ISCO-08 is available for SPSS only. Packages for other statistical platforms, such as R or SAS, are likewise only partially available.
We present a Stata module which tries to overcome these issues. By creating a framework that establishes all variables’ derivation via lookup tables, the whole process becomes more flexible. This approach can produce the same results as the established way of using Harry Ganzeboom’s SPSS or John Hendrickx’s Stata code, when used with the appropriate lookup tables delivered within the package. However, several benefits emerge from the concept of lookup tables: end users can easily understand why a certain code led to a certain status score by taking a detailed look at the tables. Additionally, they can customize the whole process to their needs by using self-administered lookup tables, either by only slightly adapting the original tables or by replacing the lookup information with completely different tables.
Taking the latter path, the full flexibility of the lookup table approach leads to considerable advantages. It enables the user to derive scores other than the “standard” prestige or status scores, such as the Magnitude Prestige Scale (MPS) or Blossfeld’s occupational classification (BLK). Our approach also makes it possible to run the derivation process on entirely different input, from national classifications (like the German KldB) to raw answers from surveys, and to create cross-walk tables between classifications. Throughout, error checking and reporting, “inline documentation” of the derivation process (by citation of the lookup tables) and (multilingual) labeling of values are handled by the Stata procedure in a structured, standard way. User-defined lookup tables can, when properly documented, be submitted to the authors and become part of the package and its documentation.
Finally, our approach of using lookup tables for variable derivation can be ported to other statistical software platforms, like SPSS, R or SAS. All tables are convertible between platforms, so that only the program logic of the derivation process has to be translated into the specific platform’s language. This can eventually lead to a default way of deriving occupational scores from classifications across different software platforms and harmonization of these variables between surveys producing research data, such as NEPS, SOEP or PASS.
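The lookup-table idea can be illustrated independently of any platform. The following Python sketch is not the Stata module described above and does not reproduce its interface: the function `derive`, the table contents and the scores are invented placeholders, chosen only to show table-driven derivation, reporting of unmatched codes, and customization by overriding entries.

```python
def derive(codes, lookup, missing=None):
    """Map each source code through a lookup table.

    Returns the derived values and a sorted list of codes with no table entry,
    so unmatched input is reported rather than silently dropped.
    """
    derived, unmatched = [], set()
    for code in codes:
        if code in lookup:
            derived.append(lookup[code])
        else:
            derived.append(missing)
            unmatched.add(code)
    return derived, sorted(unmatched)


# Hypothetical table: codes and scores are placeholders, not real ISEI values.
status_lookup = {"A1": 70, "B2": 45, "C3": 30}

# A user-adapted table: copy the original and override a single entry.
custom_lookup = {**status_lookup, "B2": 50}

scores, problems = derive(["A1", "B2", "Z9"], status_lookup)
print(scores)    # [70, 45, None]
print(problems)  # ['Z9']
print(derive(["B2"], custom_lookup)[0])  # [50]
```

Since the derivation logic never inspects the table contents, swapping in a table for a different score or a different source classification requires no change to the code, which is the portability argument made in the abstract.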