ESRA 2021 Program at a glance



Investigating the effects of machine translation and post-editing in the TRAPD: an experimental approach

Session Organisers: Dr Diana Zavala-Rojas (Universitat Pompeu Fabra, ESS ERIC)
Dr Dorothée Behr (GESIS)
Time: Friday 9 July, 13:15 - 14:45

This session addresses the effects of machine translation and post-editing (MT+PE) when used in the Harkness (2003) TRAPD model. As part of the EU-funded SSHOC project, a controlled experiment using MT+PE at the translation stage was conducted on the basis of 40 survey questions from the ESS and EVS, with English as the source language and German and Russian as the target languages. The first presentation outlines the experimental design and set-up. The second presentation investigates the quality of the final Review output, and to what extent the effects of MT+PE are conditional on language group. The third presentation investigates which translations/post-edited versions led to the final Review output. The fourth presentation explores how different the raw MT outputs are from human translations. The fifth presentation centres on usability (effectiveness, efficiency, satisfaction) as a key factor for the adoption of MT+PE.

Keywords: survey translation, TRAPD, machine translation

Investigating the effects of Machine Translation and Post-editing in the TRAPD: experimental design and methodological considerations

Dr Diana Zavala-Rojas (Universitat Pompeu Fabra, ESS ERIC) - Presenting Author
Ms Veronika Keck (GESIS)
Dr Brita Dorer (GESIS)
Dr Dorothée Behr (GESIS)
Ms Danielly Sorato (Universitat Pompeu Fabra)

The effects of implementing machine translation and post-editing (MT+PE) in the Harkness (2003) TRAPD model (Translation, Review, Adjudication, Pretesting, Documentation) were assessed using a controlled experimental environment. In this presentation, we outline the experimental design, selection of survey items, implementation and fieldwork challenges. We describe the resulting data and the analysis strategy of the research team.

A sample of 40 survey questions from the ESS and EVS questionnaires was translated from the English source language into German and Russian, following a generic TRAPD model. In the baseline group, the initial translations were produced by human translators, and these versions were discussed in a Review session. The team at the Review step produced the final translation output.

In one of the treatment groups, one human translation was substituted by machine translation (MT) + full post-editing (PE), i.e. revision of the MT output with the goal of producing a fieldable translation. Both the human translation and the MT + full PE version were discussed at the Review step, from which the final translation output was produced. Finally, a second treatment group followed the same methodology combining human translation and MT+PE, but instead of full PE, light PE was implemented, i.e. revision of the MT output to render the text understandable and accurate, without necessarily improving style or grammar.

Translators and reviewers were a combination of social scientists and professional translators; they carried out their translation tasks using the MateCat software. All participants answered questionnaires before and after the experiment. Post-editors were additionally surveyed on their PE experience.

This design allowed for the collection of rich and unique experimental data on the effects of MT in the TRAPD model. The analysis strategy consists of a triangulation of qualitative and quantitative methods of data analysis.


On the impact of machine translation on the quality of the final Review outputs

Dr Brita Dorer (GESIS) - Presenting Author
Dr Diana Zavala-Rojas (Universitat Pompeu Fabra, ESS ERIC)
Ms Danielly Sorato (Universitat Pompeu Fabra)
Ms Veronika Keck (GESIS)
Dr Dorothée Behr (GESIS)
Dr Olga Kushch (Universitat Pompeu Fabra)

The effects of implementing machine translation and post-editing (MT+PE) in the Harkness (2003) TRAPD model (Translation, Review, Adjudication, Pretesting, Documentation) were assessed using a controlled experimental environment. For each language combination – English-German and English-Russian – three teams were set up (six teams altogether): one baseline team, in which the initial two translations were produced from scratch, and two treatment teams, in which machine translation and post-editing (that is, revision of machine-translated output) was implemented to different degrees at the translation stage.

This presentation discusses the quality of the final output of the overall translation process, that is, the translations agreed during the Review sessions. The quality of the six Review outputs will be assessed by combining different methods: experienced and qualified coders will code the final translations based on an error scheme developed for this experiment (based on the MQM approach). In this coding scheme, error categories are combined with a severity level to determine the expected impact of each error on the resulting survey data. We will additionally produce crude similarity and/or distance measures between the translations/post-edited versions and the final Review outputs, allowing us to indicate the influence that the initial outputs had on the final Review output.
Secondly, similarity/distance measures between the baseline and treatment groups’ texts will be analysed in a regression model in which the similarity metric is the dependent variable and the covariates correspond to features of the experimental setting, e.g. the background of the participants translating the survey questions (a minimal sketch of this step follows below).
The analyses will allow us to investigate the impact of MT+PE on the Review output, both within and across language combinations.
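
To make this analysis step concrete, the following Python sketch computes a crude character-level similarity between each initial translation and the final Review output and regresses it on experimental covariates. This is a minimal sketch, not the authors' actual pipeline: the abstract does not specify the similarity metric or model specification, and all texts, column names and covariates below are invented.

```python
# Minimal sketch of the similarity-regression step (illustrative only):
# all data, column names and covariates below are invented.
from difflib import SequenceMatcher

import pandas as pd
import statsmodels.formula.api as smf

# Toy stand-in for the experimental data: one row per initial translation.
df = pd.DataFrame({
    "initial_text": [
        "Wie zufrieden sind Sie mit Ihrem Leben?",
        "Wie zufrieden sind sie mit dem Leben?",
        "Wie gluecklich sind Sie mit Ihrem Leben?",
        "Насколько вы довольны своей жизнью?",
        "Насколько Вы удовлетворены жизнью?",
        "Насколько Вы довольны своей жизнью в целом?",
    ],
    "review_text": [
        "Wie zufrieden sind Sie mit Ihrem Leben?",
        "Wie zufrieden sind Sie mit Ihrem Leben?",
        "Wie zufrieden sind Sie mit Ihrem Leben?",
        "Насколько Вы довольны своей жизнью?",
        "Насколько Вы довольны своей жизнью?",
        "Насколько Вы довольны своей жизнью?",
    ],
    "condition": ["baseline", "light_pe", "full_pe",
                  "baseline", "light_pe", "full_pe"],
    "language": ["de", "de", "de", "ru", "ru", "ru"],
})

# Crude character-level similarity in [0, 1] (difflib's ratio).
df["similarity"] = [
    SequenceMatcher(None, a, b).ratio()
    for a, b in zip(df["initial_text"], df["review_text"])
]

# Similarity as the dependent variable, experimental features as covariates.
model = smf.ols("similarity ~ C(condition) + C(language)", data=df).fit()
print(model.params)
```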


Survey translation according to the team approach: On the impact of post-edited translations on final review output

Dr Dorothée Behr (GESIS) - Presenting Author
Dr Brita Dorer (GESIS)
Ms Veronika Keck (GESIS)
Dr Diana Zavala-Rojas (Universitat Pompeu Fabra, ESS ERIC)
Ms Danielly Sorato (Universitat Pompeu Fabra)
Dr Olga Kushch (Universitat Pompeu Fabra)

Machine translation (MT), in particular the neural paradigm, has tremendously changed the translation industry. It is increasingly used in connection with post-editing, that is, the revision of machine-translated text. Against this backdrop, we are testing the potential of MT and post-editing for survey translation in this project. Based on the TRAPD team translation approach by Harkness (2003), we set up an experiment along the following lines: while in the baseline condition the two initial translations were produced from scratch, in the two treatment groups one of the two initial translations was machine-translated and then post-edited, lightly and fully, respectively (light post-editing: rendering the translation understandable and accurate, without necessarily improving style or grammar; full post-editing: modifying the MT output to produce a fieldable translation). In all three settings, the translators/post-editors subsequently came together with a reviewer to discuss the versions and reconcile them into a final review output. The experiment was set up for both the English-German and the English-Russian language combination (altogether six teams). This presentation will look into how the final review output came about. We will focus on challenging item elements (e.g. terminology, fills, response categories) that had driven the source item selection process in the first place. Starting with the review solution for these elements, we will trace back where the solution came from, e.g. the human-translated version, the post-edited version or the review discussion itself. In this way, we can complement the project analyses by closely investigating the role that post-edited versions play in the final output, and thus gain a more complete picture of the strengths and weaknesses of post-edited versions in the team translation approach.


Assessment of machine translations of survey questions and response scales

Ms Danielly Sorato (Universitat Pompeu Fabra) - Presenting Author
Dr Diana Zavala-Rojas (Universitat Pompeu Fabra, ESS ERIC)
Ms Veronika Keck (GESIS)
Dr Dorothée Behr (GESIS)
Dr Brita Dorer (GESIS)

The effects of machine translation and post-editing (MT+PE) in the Harkness (2003) TRAPD model (Translation, Review, Adjudication, Pretesting, Documentation) were assessed using a controlled experimental environment. Forty survey items were sampled from the ESS and EVS questionnaires. In this presentation, we zoom into the raw machine translation outputs, i.e. texts produced by a machine translation tool before human editing, to explore in detail how they differ from the final versions of the ESS and EVS survey projects. We also compare how different the raw outputs are from the full and light post-edited versions produced in a second step of the experiment. Finally, we compare the raw MT outputs with the final outputs produced after the Review step in the experiment.

Texts from the English source questionnaires of the ESS and EVS were machine-translated into German and Russian using the MateCat computer-assisted translation software. Similarity between the raw machine translations, the post-edited versions, the final ESS and EVS translations, and the final translations after the Review step in the experiment is assessed using distance measures for text analysis (a minimal sketch follows below). In the presentation, we discuss differences between the texts, grouping them by language (German and Russian) and distinguishing between question stems and response scales.
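
As one concrete illustration of such a comparison, the sketch below computes pairwise normalized token-level edit distances between four versions of a single invented German question stem: raw MT output, post-edited version, published ESS/EVS translation, and Review output. The abstract does not name the specific distance measures used; edit distance is just one common choice for comparing MT output with human text.

```python
# Illustrative only: pairwise normalized token-level edit distance between
# four invented versions of one German question stem.
from itertools import combinations

def edit_distance(a: list, b: list) -> int:
    """Classic dynamic-programming Levenshtein distance over tokens."""
    prev = list(range(len(b) + 1))
    for i, ta in enumerate(a, 1):
        curr = [i]
        for j, tb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ta != tb)))   # substitution
        prev = curr
    return prev[-1]

def normalized_distance(a: str, b: str) -> float:
    """0 = identical token sequences, 1 = maximally different."""
    ta, tb = a.split(), b.split()
    return edit_distance(ta, tb) / max(len(ta), len(tb))

versions = {
    "raw_mt": "Wie sehr stimmen Sie dieser Aussage zu?",
    "post_edited": "Inwieweit stimmen Sie dieser Aussage zu?",
    "ess_final": "Inwieweit stimmen Sie der folgenden Aussage zu?",
    "review_output": "Inwieweit stimmen Sie dieser Aussage zu?",
}

for (na, a), (nb, b) in combinations(versions.items(), 2):
    print(f"{na:13s} vs {nb:13s}: {normalized_distance(a, b):.3f}")
```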


Usability of neural machine translation application for translation of measurement instruments

Ms Veronika Keck (GESIS) - Presenting Author
Dr Dorothée Behr (GESIS)
Dr Brita Dorer (GESIS)
Dr Diana Zavala-Rojas (Universitat Pompeu Fabra, ESS ERIC)
Ms Danielly Sorato (Universitat Pompeu Fabra)

In recent years, machine translation has evolved from rule-based and statistical engines to neural translation engines based on deep learning. It seems that any translation can now be provided quickly, effortlessly, and in apparently good quality without any human intervention. As part of the EU-funded SSHOC project, machine translation was integrated into the workflow of the Harkness TRAPD model (double translation & team review) to explore its potential and use for survey research: in four team set-ups (2 x English-Russian / 2 x English-German), one of the initial translations was replaced by machine translation and post-editing, i.e., the revision of machine-translated text. The author will zoom into the translation step (‘T’) of the TRAPD model to measure the usability of machine translation in the context of questionnaire translation, since usability is one of the key factors for increasing the adoption of machine translation. Three dimensions of usability, i.e. effectiveness, efficiency, and satisfaction, will be analysed in this regard, taking up a categorisation of the ISO 9241 usability standards. Effectiveness will be measured by analysing the errors produced by the machine translation engine. Efficiency will be analysed by comparing the effort needed to produce a text from either a translator or a post-editor perspective. Satisfaction will be captured by a post-task questionnaire. The analysis of the three usability dimensions will be performed from a user-centred perspective.
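
As one concrete example of how the satisfaction dimension might be quantified, the sketch below scores a standard 10-item System Usability Scale (SUS) questionnaire. This is purely illustrative: the abstract mentions a post-task questionnaire but does not state that the SUS instrument was used, and the example responses are invented.

```python
# Illustrative only: scoring a 10-item System Usability Scale (SUS)
# questionnaire, one widely used instrument for the 'satisfaction'
# dimension of usability. The abstract does not say the SUS was used.

def sus_score(responses: list) -> float:
    """responses: ten answers on a 1-5 Likert scale, in item order.

    Odd-numbered SUS items are positively worded (contribution = answer - 1),
    even-numbered items negatively worded (contribution = 5 - answer).
    The summed contributions are scaled to a 0-100 score.
    """
    if len(responses) != 10:
        raise ValueError("SUS has exactly 10 items")
    total = sum((r - 1) if i % 2 == 1 else (5 - r)
                for i, r in enumerate(responses, start=1))
    return total * 2.5

# Example: a fairly satisfied post-editor.
print(sus_score([4, 2, 5, 1, 4, 2, 4, 2, 5, 1]))  # -> 85.0
```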