All time references are in CEST
Measurement Error: Factors Influencing the Quality of Survey Data and Possible Corrections 2
|Session Organisers||Dr Lydia Repke (GESIS - Leibniz Institute for the Social Sciences)
Ms Fabienne Krämer (GESIS - Leibniz Institute for the Social Sciences)
Dr Cornelia Neuert (GESIS - Leibniz Institute for the Social Sciences)|
|Time||Wednesday 19 July, 16:00 - 17:30|
High-quality survey data are the basis for meaningful data analysis and interpretation. The choice of specific survey characteristics (e.g., mode of data collection) and instrument characteristics (e.g., number of points in a response scale) affects data quality, meaning that there is always some measurement error. There are several methods and tools for estimating these errors (e.g., the Survey Quality Predictor) and approaches for correcting them in data analysis. This session will discuss factors that influence data quality, methods or tools for estimating their effects, and approaches for correcting measurement errors in survey data.
We invite papers that
(1) identify and discuss specific survey characteristics and their influence on data quality;
(2) identify and discuss specific instrument characteristics and their impact on data quality;
(3) discuss methods of estimating measurement errors and predicting data quality;
(4) present or compare tools for the estimation or correction of measurement errors;
(5) show how one can account for and correct measurement errors in data analysis.
Keywords: measurement error, data quality, correction, survey characteristics, item characteristics
Dr Barbara Felderer (GESIS) - Presenting Author
Dr Ludwig Bothmann (University of Munich)
Dr Lydia Repke (GESIS)
Mr Jonas Schweisthal (University of Munich)
Dr Wiebke Weber (University of Munich)
The Survey Quality Predictor (SQP) is an open-access system to predict the quality of survey questions measuring continuous latent variables based on the characteristics of the questions (e.g., properties of the response scale). The prediction algorithms for reliability and validity are based on a meta-regression of many multitrait-multimethod (MTMM) experiments in which characteristics of the survey questions were systematically varied. Thus, SQP can be used to predict and compare the quality of newly designed questions and to correct for measurement errors in the analysis.
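The last point, using quality predictions to correct for measurement error in analysis, can be illustrated with the standard correction for attenuation: an observed correlation is divided by the square root of the product of the two variables' quality coefficients. This is a hedged sketch only; the quality values below are invented for illustration and are not real SQP output.

```python
# Illustrative sketch of disattenuating a correlation with quality estimates
# of the kind SQP predicts. The numeric values are made up, not SQP output.
import math

def disattenuate(r_observed: float, q2_x: float, q2_y: float) -> float:
    """Correct an observed correlation for measurement error in both variables.

    q2_x and q2_y are total quality coefficients (reliability times validity)
    of the two survey questions.
    """
    return r_observed / math.sqrt(q2_x * q2_y)

# Example: observed r = 0.30, illustrative quality estimates 0.65 and 0.72
corrected = disattenuate(0.30, 0.65, 0.72)
print(round(corrected, 3))
```

With perfect measurement (both qualities equal to 1), the corrected and observed correlations coincide; lower quality inflates the correction.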
In 2022, 21 years after the first and seven years after the last release of SQP, version 3.0 was published. The database of SQP now includes approximately 600 MTMM experiments in 28 languages and 33 countries, making it necessary to redo the meta-analysis. To find the best method for analyzing the complex data structure of SQP (e.g., the existence of various uncorrelated predictors), we compared four suitable machine learning methods in terms of their ability to predict survey quality indicators: two penalized regression methods (i.e., LASSO and elastic net) and two regression tree-based methods (i.e., boosting and random forest).
The regression tree-based algorithms outperform both penalized regression models in terms of prediction error. While the prediction error is slightly smaller for the boosting algorithm, random forest has the advantage of readily available prediction intervals.
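A minimal sketch of such a model comparison with scikit-learn, using synthetic data as a stand-in for question characteristics and a quality indicator. The real analysis uses the SQP MTMM meta-database, and the model settings here are illustrative defaults, not the tuned SQP 3.0 models.

```python
# Hedged sketch: comparing the four model families named above (penalized
# regression vs. tree ensembles) by cross-validated prediction error.
# Synthetic data only; not the actual SQP meta-analysis.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import ElasticNet, Lasso
from sklearn.model_selection import cross_val_score

# Toy stand-in: 30 question characteristics predicting a quality indicator
X, y = make_regression(n_samples=500, n_features=30, noise=10.0, random_state=0)

models = {
    "LASSO": Lasso(alpha=1.0),
    "elastic net": ElasticNet(alpha=1.0, l1_ratio=0.5),
    "boosting": GradientBoostingRegressor(random_state=0),
    "random forest": RandomForestRegressor(random_state=0),
}

# Compare prediction error via 5-fold cross-validated RMSE
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5,
                             scoring="neg_root_mean_squared_error")
    print(f"{name}: RMSE = {-scores.mean():.1f}")
```

For the prediction intervals mentioned above, a random forest offers a simple route: the spread of the individual trees' predictions can be summarized into an interval, which boosting does not provide as directly.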
The presentation will highlight the results of the model comparison and give details about the importance of the different predictor variables in the model chosen for SQP 3.0.
Dr Fernanda Alvarado-Leiton (Universidad de Costa Rica) - Presenting Author
Dr Rachel Davis (University of South Carolina)
Dr Sunghee Lee (University of Michigan)
Acquiescent response style (ARS) is the tendency to disproportionately agree with survey items and can impair survey measurement. ARS has been linked to Agree/Disagree (A/D) rating scales, which are broadly used in the social sciences, making ARS a vexing problem in the measurement of social constructs. Despite extensive research on the topic, there is little consensus on how to mitigate the effects of ARS. This study proposed that the order of A/D responses may be used to deter ARS. The extant literature shows that primacy effects in non-aural surveys increase the rate at which respondents select the first listed response category. Following this logic, listing agreement categories first may exacerbate ARS, particularly in Web or paper-and-pencil surveys. The aim of this study was to evaluate whether the response order of A/D rating scales affects the use of ARS. Participants from three groups were recruited into a Web survey: non-Hispanic whites (n=1,200), Hispanics in the U.S. (n=1,200), and Hispanics in Mexico (n=1,200). An experiment was conducted using three measurement scales assessing emotional expressivity, affective orientation, and purpose in life. Respondents were randomly assigned to one of two response orders: disagreement to agreement (agreement categories last) or agreement to disagreement (agreement categories first). Respondents chose agreement responses significantly less often when the disagreement response options were presented first than when the agreement responses were offered first. This response order similarly reduced the number of simultaneous agreeable responses to opposite scale items. When the disagreement options were placed first, the reliability and convergent validity of the scales also improved. This study shows that placing the agreement categories last in the response scale may be a useful design option for addressing ARS in Web surveys.
Dr Frances Barlas (Ipsos Public Affairs) - Presenting Author
Mr Randall Thomas (Ipsos)
Ms Megan Hendrich (Ipsos)
How response formats are presented can significantly affect survey results. Researchers are often concerned that how a respondent answers a question can be affected by the order in which the responses are presented, known as response order effects. For graded scales, response order effects have been found in a number of interview modes. While recency effects (where the last or near-last response is more often selected) have been commonly found in telephone interviews, some attempts to replicate these results have been unsuccessful. Self-administered surveys (those visually presented, like paper-and-pencil and online surveys) have more often been associated with primacy effects, where the first or second response is more often selected. However, while primacy effects appear to be more frequent in web-based surveys, they too have not been consistently supported in replication attempts. As such, researchers still do not have a clear understanding of when response order effects will occur. To better understand the conditions under which order effects are most likely to be observed in online surveys, we assembled over 20 online studies representing data collected from over 600,000 respondents in which response order was randomly assigned. These studies included questions with scales varying from 2 to 11 categories, presented both vertically and horizontally. Measured dimensions included intensity measures (e.g., likelihood, usefulness, importance) and evaluation measures (e.g., good-bad, satisfied-dissatisfied). We generally found that primacy effects were most likely to appear when scales were presented vertically rather than horizontally, and were more likely for evaluative scales than for intensity scales. In cases where we could evaluate the validity of the scale response, we found that the least-to-most (or negative-to-positive) arrangement had slightly higher validity coefficients.
Dr Eva Zeglovits (IFES) - Presenting Author
Dr Julian Aichholzer (IFES)
Dr Reinhard Raml (IFES)
It has become quite common to conduct mixed-mode CATI/CAWI surveys, and differences between these two modes (e.g., item nonresponse) are well researched. But more and more respondents use a mobile device to participate in an online survey, and questions look different on a mobile device than on a bigger screen. This means that we actually combine three modes in the standard CATI/CAWI survey: telephone, online on a big screen, and online on a mobile device. When it comes to item batteries, items have to be presented individually on a mobile device but are usually presented as a table on a bigger screen.
The paper analyses data from several mixed-mode surveys in Austria to assess response patterns and measurement errors in item batteries, comparing the three modes: telephone, online, and online mobile. We will particularly focus on item nonresponse, straightlining, and satisficing, distinguishing between mode effects and selection effects.
The data are obtained from several applied research projects, so we are able to use different kinds of item batteries. The item batteries analysed vary in length (number of items), types of scales (unipolar/bipolar, number of response options including odd and even numbers, fully verbalized scales and scales with only the endpoints verbalized), types of items (single words versus complete sentences), and content. Thus, we will be able to provide a full picture of measurement effects.
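Straightlining, one of the response patterns mentioned above, is commonly operationalized as giving the identical response to every item in a battery. A minimal sketch of that check, assuming a rows-by-respondents, columns-by-items layout (an assumption for illustration, not the authors' actual coding):

```python
# Hedged sketch: flag straightliners in an item battery, i.e. respondents
# who select the identical response option for every item. Layout assumed:
# rows = respondents, columns = items.
import numpy as np

def straightlining_rate(battery: np.ndarray) -> float:
    """Share of respondents giving the same answer to all items in the battery."""
    same_everywhere = np.all(battery == battery[:, [0]], axis=1)
    return float(same_everywhere.mean())

# Toy battery: 4 respondents x 3 items on a 5-point scale
answers = np.array([
    [3, 3, 3],   # straightliner
    [1, 2, 5],
    [4, 4, 4],   # straightliner
    [2, 3, 2],
])
print(straightlining_rate(answers))  # 2 of 4 respondents -> 0.5
```

In practice one would also track near-straightlining (low within-battery variance) and relate the rate to mode, as the paper proposes.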