
ESRA 2023 Preliminary Glance Program

All time references are in CEST

Human-like communication forms in web surveys

Session Organisers: Dr Jan Karem Höhne (University of Duisburg-Essen)
Professor Frederick Conrad (University of Michigan)
Time: Friday 21 July, 09:00 - 10:30
Room: U6-02

Since the inception of web surveys, researchers have incorporated cutting-edge communication technology to improve data quality and respondent experience. One approach has been to use these technologies to give web surveys a human touch, facilitated by the ubiquity of multimedia-enabled mobile devices and high-speed internet. For example, it is relatively easy to program the web survey so that respondents can self-administer spoken questions by playing audio-recordings on their smartphones or tablets and answer by speaking via the built-in microphones in those devices. This potentially recreates key aspects of daily conversation, which respondents might prefer to clicking and typing in answers to textual questions. It is also possible to answer survey questions by uploading photos and videos, which may result in more accurate information, as it may not rely on respondents’ memory to the same extent as traditional self-reports. Video communication platforms, such as Skype and Zoom, support in-person interviews that are conducted remotely, reducing geographic barriers and interviewer field costs. In addition, advances in AI technology facilitate the use of life-like virtual interviewers that may appeal to survey respondents, potentially decreasing nonresponse and social desirability bias. These technology-driven methodologies expand the existing methodological toolbox for all substantive research fields that rely on web survey data. However, there is only a small body of research on the general feasibility of these approaches and their implications for data quality and participants’ satisfaction. In this session, we therefore invite contributions that report experimental and non-experimental research on human-like technology-mediated communication in web surveys carried out in different settings (e.g., lab or field) and with different study designs (e.g., cross-sectional or longitudinal).

Keywords: communication forms, digital data collection, mobile devices, new data sources, web surveys

Interviewer Effects in Live Video and Prerecorded Video Interviewing

Dr Brady West (University of Michigan-Ann Arbor) - Presenting Author

Live video communication tools (e.g., Zoom) have the potential to provide survey researchers with many of the benefits of in-person interviewing, while at the same time greatly reducing data collection costs, given that interviewers do not need to travel and make in-person visits to sampled households. The COVID-19 pandemic has exposed the vulnerability of in-person data collection to public health crises, forcing survey researchers to explore remote data collection modes, such as live video interviewing, that seem likely to yield high-quality data without in-person interaction. Given the potential benefits of these technologies, the operational and methodological aspects of video interviewing have started to receive research attention from survey methodologists. Although it is remote, video interviewing still involves respondent-interviewer interaction that introduces the possibility of interviewer effects, and no research to date has evaluated this potential threat to the quality of the data collected in video interviews. This research note presents an evaluation of interviewer effects in a recent experimental study of alternative approaches to video interviewing, including both “live video” interviewing and the use of prerecorded videos of the same interviewers asking questions embedded in a web survey (“prerecorded video” interviewing). We find little evidence of significant interviewer effects when using these two approaches, which is a promising result. We also find that when interviewer effects were present, they tended to be slightly larger in the live video approach, as would be expected in light of its being an interactive mode. We conclude with a discussion of the implications of these findings for future research using video interviewing.

Combining Dictation and/or Voice Recordings with Text to Answer Narrative Open-ended Survey Questions

Dr Melanie Revilla (IBEI) - Presenting Author
Professor Mick P. Couper (Survey Research Center, University of Michigan)

While the advantages of voice input for answering open questions in web surveys seem clear, its implementation has met with difficulties. Several studies have found respondents unwilling or unable to use voice input, and requiring them to use voice input to answer open questions has resulted in higher levels of breakoff and non-compliance. Thus, the challenge remains of maximizing the use of voice input while still giving respondents alternatives.
This study experimentally explores three options for encouraging voice input in web surveys:
a) PushDictation: respondents are first asked to answer using dictation (sometimes called Automatic Speech Recognition). If they try to continue without providing an answer, they are given the option to type in a textbox.
b) PushRecording: respondents are first asked to answer using voice recording. If they try to continue without providing an answer, they are given the option to type in a textbox.
c) Choice: respondents are offered three options to answer (dictation, voice recording or type in a textbox).
These three options are compared to a control group in which participants can only answer by typing in a text box.
Using data from two open questions in a survey about nursing homes that will be implemented in early 2023 (expected N=1,000) in an opt-in online panel in Spain (Netquest), we aim to answer three research questions: (RQ1) What are the overall rates of response to open questions? (RQ2) What are the rates of use of voice input to answer open questions? and (RQ3) What is the overall quality of the data across the different conditions?
Our results will contribute to the growing but still very limited literature on the use of voice input in web surveys, by adding new empirical evidence for several designs encouraging voice input that have not yet been tested.

Innovating web probing: Comparing written and oral answers to open-ended probing questions in a smartphone survey

Dr Jan Karem Höhne (University of Duisburg-Essen) - Presenting Author
Dr Timo Lenzner (GESIS - Leibniz Institute for the Social Sciences)
Dr Konstantin Gavras (Nesto Software GmbH)

Cognitive interviewing in the form of probing is key for developing methodologically sound question and questionnaire designs. For a long time, probing has been tied to the lab, inducing small sample sizes and a high burden on both researchers and participants. Therefore, researchers have recently started to implement probing techniques in web surveys where participants are asked to provide written answers. As observed in studies on open-ended questions, participants frequently provide very short or no answers at all because entering answers is tedious. This particularly applies when completing the web survey via a smartphone with a virtual on-screen keypad that shrinks the viewing space. In this study, we therefore compare written and oral answers to open-ended probing questions in a smartphone survey. Oral answers were collected via the open-source SurveyVoice (SVoice) tool. We conducted a survey experiment in the German Forsa Omninet Panel (N = 1,001) in November 2021 and probed two questions from the module “National Identity and Citizenship” of the German questionnaires of the International Social Survey Programme (ISSP) in 2013/2014. Specifically, we probed for respondents’ understanding of key terms in both questions (comprehension probing). Preliminary analyses indicate that oral answers result in higher item non-response than their written counterparts. However, oral answers are longer, suggesting more in-depth information. In order to provide a clear-cut evaluation of written and oral answers to open-ended probing questions in web surveys we will conduct further, refined analyses. For example, we will code the answers with respect to the number and variety of themes mentioned and examine whether one answer format elicits more detailed and elaborated answers than the other. In addition, we will investigate respondent characteristics associated with high-quality written and oral answers to open-ended probing questions.

Mode-Specific Discomfort Answering a Sensitive Question in a Live Video Interview

Ms Shlomit Okon (The New School) - Presenting Author
Professor Michael Schober (The New School)
Ms Rebecca Dolgin (The New School)

Recent evidence suggests that survey respondents’ mode-specific discomfort answering a sensitive question in live video directly predicts their being less willing to participate in a hypothetical live video survey, along with how hard they perceive live video is to use and how little they enjoy live video in other contexts (Schober et al., in press). In that study, 598 online respondents rated the extent to which they would be willing to participate in a hypothetical survey conducted in each of five modes (in person, live video, phone, web, and self-administered “prerecorded video” surveys in which respondents play a video of an interviewer asking a question and respond by clicking or typing) and then rated how uncomfortable they would feel answering a particular sensitive question (“How many sex partners have you had in the last 12 months?”) in each of the five modes. Here we present analyses of the open-ended explanations respondents in that study gave every time they judged answering this question in a live video interview to be either more or less uncomfortable than in the other modes. The reasons given for live video discomfort reflect the specific social presence that live video engenders relative to the other modes, as well as the technology’s affordances. Some respondents reported they would be more uncomfortable answering this sensitive question in live video than in person because they worried about being recorded, and more uncomfortable in live video than web because they worried about being judged and seen by the interviewer. Major reasons given for being more comfortable in video than in other modes included feeling that video provides greater privacy than in person, that it allows better access to the interviewer’s reaction, and that it is more interesting than a web survey.

Sentiment Analysis in the Wild

Mr Denis Bonnay (Université Paris Nanterre) - Presenting Author
Mr Orkan Dolay (Bilendi)

The development of digital tools has brought new promises to qualitative survey research, in terms of volume, access and spontaneity, also raising new challenges on the analysis side. How should we handle numerous verbatim records?

Bilendi has recently developed Bilendi Discuss, a proprietary solution for “qual at scale”. It allows researchers to directly interact with respondents via their favourite messaging apps, say WhatsApp or Messenger, or, alternatively, via an ad hoc platform. We are also developing NLP tools to facilitate the analysis of those verbatim reports. The aim of this presentation is to present the specific opportunities and challenges for sentiment analysis in this context.

A major issue with neural-network-based NLP concerns the availability of labelled data. The success of Large Language Models has been built on the use of masking tasks in pre-training, which make it possible to bypass the need for human labelling. However, sentiment analysis arguably cannot evade the need for labelled data sets. We explore how to use mixed interrogation modes, with quantitative appraisal supplemented by open-ended justifications (“how much did you like X?” and “tell us why!”). We will show how this data is amenable to improving the performance of models in a virtuous circle, where the data generated by the tool itself can be used to improve its analysis capabilities.
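As a rough illustration of the mixed interrogation idea, the sketch below pairs each open-ended justification with a sentiment label derived from the accompanying rating. It is not the authors' implementation: the 1-5 rating scale, the cut-offs, and the names `Response`, `rating_to_label` and `build_training_pairs` are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Response:
    rating: int         # answer to "how much did you like X?" (assumed 1-5 scale)
    justification: str  # answer to "tell us why!"


def rating_to_label(rating: int) -> str:
    """Map an assumed 1-5 rating to a coarse sentiment label (hypothetical cut-offs)."""
    if not 1 <= rating <= 5:
        raise ValueError(f"rating out of range: {rating}")
    if rating <= 2:
        return "negative"
    if rating == 3:
        return "neutral"
    return "positive"


def build_training_pairs(responses: List[Response]) -> List[Tuple[str, str]]:
    """Use the rating as a label for the justification text, skipping empty texts."""
    return [
        (r.justification.strip(), rating_to_label(r.rating))
        for r in responses
        if r.justification.strip()
    ]


pairs = build_training_pairs([
    Response(5, "Loved the packaging"),
    Response(1, "Too slow to arrive"),
    Response(3, "   "),  # no usable text, so it is dropped
])
```

The resulting (text, label) pairs could then feed whatever classifier is being trained; how well rating-derived labels match the sentiment actually expressed in the text is an empirical question the sketch does not address.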

On the challenge side, verbatim records in this context typically consist of texts which do express a sentiment and texts which do not. Sentiment classification then needs to be combined with subjectivity detection in order to prevent spurious sentiment attribution (“I have one kid” is neither negative, positive, nor neutral; it simply states a fact). We will show the limits of the previous training strategy for subjectivity detection and discuss the specific issues with human labelling in this context.
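One way to picture the combination of subjectivity detection and sentiment classification is a two-stage gate: a text only receives a sentiment label if it is first judged subjective. The sketch below is a toy illustration under that assumption; the keyword-based `toy_is_subjective` and `toy_sentiment` functions are hypothetical stand-ins for trained models and do not reflect the tools described in the abstract.

```python
from typing import Callable, Optional, Set

# Tiny hypothetical lexicons, used only to make the example runnable.
POSITIVE_WORDS: Set[str] = {"love", "like", "great"}
NEGATIVE_WORDS: Set[str] = {"hate", "awful", "boring"}


def _tokens(text: str) -> Set[str]:
    """Lower-case the text and strip basic punctuation from each word."""
    return {w.strip(".,!?").lower() for w in text.split()}


def toy_is_subjective(text: str) -> bool:
    """Stand-in subjectivity detector: true if any opinion word occurs."""
    return bool(_tokens(text) & (POSITIVE_WORDS | NEGATIVE_WORDS))


def toy_sentiment(text: str) -> str:
    """Stand-in sentiment classifier over the same toy lexicons."""
    words = _tokens(text)
    if words & NEGATIVE_WORDS:
        return "negative"
    if words & POSITIVE_WORDS:
        return "positive"
    return "neutral"


def classify_gated(
    text: str,
    is_subjective: Callable[[str], bool],
    sentiment: Callable[[str], str],
) -> Optional[str]:
    """Return a sentiment label only for subjective texts, otherwise None."""
    if not is_subjective(text):
        return None  # factual statement: no sentiment is attributed
    return sentiment(text)


factual = classify_gated("I have one kid", toy_is_subjective, toy_sentiment)
opinion = classify_gated("I love it!", toy_is_subjective, toy_sentiment)
```

Returning `None` for factual texts keeps "no sentiment expressed" distinct from a "neutral" sentiment, which is the distinction the example sentence in the abstract draws.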