ESRA 2023 Glance Program

All time references are in CEST

Assessing the Quality of Survey Data 3
Session Organiser	Professor Jörg Blasius (University of Bonn)
Time	Thursday 20 July, 09:00 - 10:30
Room	U6-23

This session will provide a series of original investigations on data quality in both national and international contexts. The starting premise is that all survey data contain a mixture of substantive and methodologically-induced variation. Most current work focuses primarily on random measurement error, which is usually treated as normally distributed. However, there are many different kinds of systematic measurement error, or more precisely, many different sources of methodologically-induced variation, and all of them may have a strong influence on the “substantive” solutions. The sources of methodologically-induced variation include response sets and response styles, misunderstandings of questions, translation and coding errors, uneven standards between the research institutes involved in the data collection (especially in cross-national research), item and unit non-response, as well as faked interviews. We consider data to be of high quality when the methodologically-induced variation is low, i.e. when the differences in responses can be interpreted based on theoretical assumptions in the given area of research. The aim of the session is to discuss different sources of methodologically-induced variation in survey research, how to detect them, and the effects they have on the substantive findings.

Keywords: Quality of data, task simplification, response styles, satisficing

Papers

Designing a Global Quality Profile Using a Total Survey Error and Fitness-for-Use Framework

Dr Jennifer Kelley (Westat) - Presenting Author
Mr Brad Edwards (Westat)
Mr Dave DesRoches (Westat)
Mr Dan Tomlin (Westat)

A global quality profile consists of multiple quality indicators observed and summarized into one global score. Global quality profile scores (GQPS) are widely used in manufacturing production and to market products to consumers. In survey research, few organizations have developed a GQPS. This is not surprising given the nature of survey products, namely survey estimates. Survey estimates are less tangible than a physical manufactured product; thus, data quality is more difficult to measure.
However, survey research has a framework for building a GQPS. The gold standard for assessing error contributions of various sources is the total survey error (TSE) framework. Capturing data to measure all potential sources of error is daunting, and producing a global error measure has not been a TSE goal. For most surveys, this would not be feasible given budget and resource constraints.
Another quality framework, fitness-for-use, has an explicit focus on the user. In this context, fitness-for-use means that researchers and data users may have very different views about survey data (Biemer, 2010). Data quality researchers may be interested in creating a GQPS that accounts for all possible sources of error. However, data users may prioritize the timeliness or relevance of the data. Creating a GQPS should be grounded in the fitness-for-use framework and aim towards a TSE paradigm. Further, identifying which quality indicators should feed into a GQPS is challenging. Decisions are often based on the resources and effort needed to operationalize data quality measures instead of how well each quality indicator predicts data quality and the magnitude of the effect of each indicator on data quality.
This presentation will discuss the challenges of designing a GQPS, identifying key quality measures for each error source, formulating a GQPS, and implementing a GQPS system.
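To illustrate the idea of summarizing several quality indicators into one global score, the following is a minimal sketch and not the authors' formulation, which the abstract does not specify. It assumes each indicator has already been rescaled to the range 0 to 1 (1 being best) and combines them as a weighted mean. The function name, indicator names, and weights are all hypothetical.

```python
from typing import Dict


def global_quality_score(indicators: Dict[str, float],
                         weights: Dict[str, float]) -> float:
    """Weighted mean of quality indicators already scaled to [0, 1].

    Hypothetical illustration only: the indicator names and the
    weighted-mean rule are assumptions, not taken from the abstract.
    """
    if set(indicators) != set(weights):
        raise ValueError("indicators and weights must have the same keys")
    for name, value in indicators.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"indicator {name!r} must lie in [0, 1]")
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted_sum = sum(indicators[k] * weights[k] for k in indicators)
    return weighted_sum / total_weight


# Hypothetical indicators for one survey, each scaled so that 1 is best.
indicators = {"response_rate": 0.5, "item_completeness": 1.0, "timeliness": 0.0}

# Two hypothetical user profiles: a methodologist and a data user who
# prioritizes timeliness, in the spirit of fitness-for-use.
methodologist = {"response_rate": 2.0, "item_completeness": 2.0, "timeliness": 0.0}
data_user = {"response_rate": 1.0, "item_completeness": 1.0, "timeliness": 2.0}

score_methodologist = global_quality_score(indicators, methodologist)
score_data_user = global_quality_score(indicators, data_user)
```

Under these assumed weights the same survey receives different scores for the two user profiles, which is one way to see why the choice and weighting of indicators matters as much as the indicators themselves.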

Is Dirty Data Worth Less?: Data Cleaning Effects on Bias and Analytic Model Quality

Ms Megan Hendrich (Ipsos US Public Affairs) - Presenting Author
Professor Randall Thomas (Ipsos US Public Affairs)
Dr Frances Barlas (Ipsos US Public Affairs)

Many researchers believe that data cleaning leads to better data quality and often clean out participants exhibiting sub-optimal behaviors (e.g., speeding, response non-differentiation, nonresponse, compliance trap failure). In service of this belief, researchers sometimes use aggressive cleaning criteria that remove up to 20% of their sample, raising questions about the validity of the survey results and potentially incurring additional costs if replacement sample is needed. Contrary to expectations, most prior research has found that data cleaning does not improve accuracy for point estimates (e.g., proportions, means). In this study, we were interested in assessing the effects of data cleaning on bias and covariance (specifically in multiple regression models). In an online study with over 9,000 completes from three different sample sources (a probability-based sample and two opt-in samples), we first assessed bias by examining how estimates obtained from the different samples compared to external benchmarks from high-quality sample surveys. Next, we computed regression coefficients for three regression models (political attitudes predicting vote choice and political party identification, and demographics and life experiences predicting life satisfaction). We deleted cases in gradations from 2.5% up to 50% of the sample based on speed of completion, weighted each dataset, and then ran the analyses. We found that no matter how much or how little data cleaning we performed, it neither reduced nor increased bias. Regarding covariance, we found that small to moderate amounts of data cleaning did not substantially affect the direction or magnitude of coefficients; however, some coefficients became more unstable at 30% deletion and higher. We urge caution with any data cleaning protocol that might eliminate a higher proportion of participants, since this may actually increase bias in covariance.
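The deletion procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: it drops the fastest respondents at several gradations and refits an ordinary least squares regression at each step. The weighting step mentioned in the abstract is omitted, the data are synthetic, and all function and variable names are hypothetical.

```python
import numpy as np


def drop_fastest(durations: np.ndarray, fraction: float) -> np.ndarray:
    """Boolean mask that keeps all but the fastest `fraction` of cases."""
    if not 0.0 <= fraction < 1.0:
        raise ValueError("fraction must be in [0, 1)")
    n_drop = int(round(fraction * len(durations)))
    order = np.argsort(durations, kind="stable")  # shortest durations first
    keep = np.ones(len(durations), dtype=bool)
    keep[order[:n_drop]] = False
    return keep


def ols_coefficients(X: np.ndarray, y: np.ndarray) -> np.ndarray:
    """Unweighted OLS coefficients, intercept first."""
    design = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta


def coefficients_by_deletion(X, y, durations, fractions):
    """Refit the regression after removing each fraction of speeders."""
    results = {}
    for fraction in fractions:
        keep = drop_fastest(durations, fraction)
        results[fraction] = ols_coefficients(X[keep], y[keep])
    return results


if __name__ == "__main__":
    # Synthetic data only; these are not the study's data or results.
    rng = np.random.default_rng(0)
    n = 1000
    X = rng.normal(size=(n, 2))
    y = 0.5 + X @ np.array([1.0, -0.5]) + rng.normal(size=n)
    durations = rng.lognormal(mean=5.0, sigma=0.4, size=n)
    fractions = [0.0, 0.025, 0.10, 0.30, 0.50]
    for fraction, beta in coefficients_by_deletion(X, y, durations, fractions).items():
        print(f"deleted {fraction:.1%}: {np.round(beta, 3)}")
```

Comparing the coefficient vectors across deletion levels gives a rough view of how stable a model is under increasingly aggressive cleaning. In this synthetic example completion time is unrelated to the outcome by construction, so it says nothing about what real speeders do to estimates.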

Military survey standards and guidelines: A total-survey-quality approach

Dr Zhigang Wang (Department of National Defence) - Presenting Author

While high-quality survey research is time-consuming and usually costly, high-quality survey data leads to informed decisions. Many military organizations face the challenge of supporting survey data requests with limited survey research capability. To address this challenge, a NATO Research Task Group (RTG) was established to develop concise standards and guidelines for survey design, operation, analysis, and dissemination in a military context. These standards and guidelines reflected a total-survey-quality approach, ensuring the scientific rigor of survey results while dealing with the military’s special requirements and the unique methodological and operational challenges such research presents. The total-survey-quality perspective covers additional survey components that the total-survey-error framework does not include (e.g., assessing clients’ needs, determining survey objectives, data editing, and results dissemination) and their associated errors in the military survey lifecycle (e.g., assessment error, design error, analysis error, interpretation error, and documentation error). The total-survey-quality perspective also promotes an implementation-oriented approach to disseminating military survey results—researchers need to go beyond their role as researcher or information-provider and work with military clients and stakeholders to identify problems, plan and conduct survey projects, communicate survey findings to the target audiences, and work with stakeholders to identify and overcome barriers to implementing organizational change based on survey findings.

Minimising survey data errors in complex humanitarian settings: Lessons learned from the field

Ms Nayana Das (IMPACT Initiatives) - Presenting Author

The 2022 Global Humanitarian Overview states that $41 billion is needed to assist 182.8 million people in need across 63 countries. With this being one of the highest figures in decades, the importance of robust data to inform effective aid delivery is greater than ever. For years, IMPACT has conducted research and analysis to support aid actors planning for and responding to crises. In 2021 alone, IMPACT teams collected data across 20 humanitarian crises globally, through approximately 200,000 household surveys, 270,000 key informant interviews and 1,500 focus group discussions.

Building on IMPACT’s experiences and lessons learned, this paper will analyse how the production of high quality survey data in humanitarian settings can be challenged by certain measurement errors, and how these errors can be systematically detected to ensure continued production and use of quality humanitarian data.

This paper will address two key research questions:
• What types of random and systematic measurement errors can impact production of high-quality survey data in humanitarian settings, and what are the error sources?
• What measures can be taken to ensure errors are detected and addressed in a timely and systematic manner?

The paper will primarily rely on IMPACT’s own lessons learned across different data collection contexts, including preliminary results from two methods explored to better detect data falsification patterns. This will be complemented by an in-depth literature review on quality assurance of survey data, and consultations (key informant interviews) with field teams involved in data collection.

Engaging with these questions is especially relevant now with the humanitarian community undergoing a so-called “data revolution” – humanitarian organisations are collecting and sharing more data than ever before. Understanding and proactively addressing the limits of this data production process is key to ensuring that data-driven aid action is based on high quality survey data.

Correcting for Total Survey Error in National and Sub-National Administrative-Data Based Estimates of Crimes Reported to the Police

Dr Marcus Berzofsky (RTI International) - Presenting Author
Dr Dan Liao (RTI International)
Mr G. Lance Couzens (RTI International)
Ms Erica Smith (U.S. Bureau of Justice Statistics)
Dr Cindy Barnett-Ryan (Federal Bureau of Investigation)

Using administrative data sources to develop official statistics offers governments several benefits, such as lower costs than traditional sample-based surveys and the ability to collect data at a more granular level. However, because administrative data are often collected from a large number of entities, there are several errors from the total survey error paradigm that may need to be addressed before representative estimates can be produced. For example, if not all entities contribute administrative data, then the set of participants may not be considered a random sample, leading to nonresponse bias. Furthermore, because entities are providing data, quality can be inconsistent, leading to item nonresponse error, processing error, and measurement error. Where these errors cannot be directly corrected, the increased level of uncertainty in the estimates needs to be reflected.

In this presentation, we discuss the process for producing estimates from the National Incident-Based Reporting System (NIBRS). The U.S. Federal Bureau of Investigation (FBI) uses NIBRS to collect detailed incident-level information on all crimes reported to the police. Prior to 2021, the FBI used a combination of data submitted through an older summary system – which only collected aggregate counts of crime – and NIBRS. Beginning in 2021, the FBI allowed submissions only through NIBRS, and the 2021 estimates, published in October 2022, were based solely on NIBRS. However, unlike the prior summary system, which covered 95% of the U.S. population, in 2021 NIBRS covered 65% of the population. Furthermore, the additional complexity of the incident-based form creates the potential for data quality issues that did not exist under the summary system. We discuss how we developed an estimation system to address the various total survey error issues and created a robust measure of uncertainty to account for any bias in the NIBRS-based estimates produced.
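As a rough illustration of why incomplete coverage matters, the sketch below applies a simple population-ratio adjustment within strata: reported counts are scaled up by the ratio of total to covered population. This is a generic textbook-style adjustment and an assumption on our part. The abstract does not describe the actual estimation system, and the strata, counts, and populations shown are invented for the example.

```python
from typing import Dict, Tuple

# Each stratum maps to (reported_count, covered_population, total_population).
Stratum = Tuple[float, float, float]


def coverage_adjusted_total(strata: Dict[str, Stratum]) -> float:
    """Scale each stratum's reported count by total / covered population.

    Hypothetical sketch: assumes non-reporting agencies in a stratum have
    the same per-capita rate as reporting ones, which may not hold.
    """
    total = 0.0
    for name, (count, covered_pop, total_pop) in strata.items():
        if covered_pop <= 0 or covered_pop > total_pop:
            raise ValueError(f"invalid populations for stratum {name!r}")
        total += count * (total_pop / covered_pop)
    return total


# Invented figures: one stratum with half its population covered,
# one with full coverage.
example_strata = {
    "large_cities": (100.0, 50.0, 100.0),
    "small_towns": (30.0, 30.0, 30.0),
}
adjusted = coverage_adjusted_total(example_strata)
```

The adjustment relies on reporting and non-reporting entities being alike within a stratum. When that assumption fails, the point estimate is biased, which is why the uncertainty measure discussed in the abstract needs to reflect more than sampling variance.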