ESRA 2019 Draft Programme at a Glance

The Power and Pitfalls of Combining Survey and Sensor Data 2

Session Organisers Miss Anne Elevelt (Utrecht University)
Dr Peter Lugtig (Utrecht University)
Dr Vera Toepoel (Utrecht University)
Time Wednesday 17th July, 16:30 - 17:30
Room D19

Sensor data offer great potential for social scientists interested in studying attitudes and behaviors. These kinds of data are particularly interesting when they can be linked to and compared with other data sources. With more and more possibilities to collect additional data through smartphones (for example, through smartphone apps or activity trackers), large-scale population surveys could rather easily be enriched. Participants carry their smartphones everywhere, enabling scientists to ask respondents to take pictures, or to collect GPS and accelerometer data and track, for example, how much participants move around and where they go. Opportunities abound.

However, there are still many unsolved and unique methodological questions and issues in collecting and using sensor data. This session invites presentations that investigate the potential and challenges of combining survey and sensor data. We especially welcome papers that have collected and used these kinds of data and that address:

The power of sensor data
o Higher data quality?
o Lower respondent burden.

The pitfalls of sensor data
o Implementation issues; nonresponse, willingness, device use.
o Technical problems.
o Issues in collecting and accessing these data across the general population.
o Data storage.

Keywords: Big data; sensor data

Consideration of device-related error sources in integrated collection of smartphone sensor data and survey data

Dr Nejc Berzelak (University of Ljubljana, Faculty of Social Sciences) - Presenting Author
Mr Uroš Podkrižnik (University of Ljubljana, Faculty of Social Sciences)
Ms Jasna Urbančič (Artificial Intelligence Laboratory, Jožef Stefan Institute)
Mr Matej Senožetnik (Artificial Intelligence Laboratory, Jožef Stefan Institute)
Professor Vasja Vehovar (University of Ljubljana, Faculty of Social Sciences)

Modern smartphones incorporate sensors for collecting data about position, orientation, motion and environment. Previous studies have demonstrated how passively collected location and motion data can be effectively used in social science research. However, there is a lack of a detailed elaboration of such data collection from the perspective of social science methodology and its integrative placement among survey data collection methods. This extends to appropriate consideration of errors arising from sensor data in the context of their integration with survey data. While error sources have been comprehensively elaborated for survey research, particularly by the Total Survey Error framework, systematic efforts to accomplish a consistent conceptualisation for complementary sensor data remain limited.

This paper contributes a critical elaboration of specific error sources in data collection using smartphone sensors to complement survey data. It focuses predominantly on technical aspects that may have important methodological implications for social science research and addresses three main research questions:
1) How can technical characteristics of smartphones and behaviour of research participants in interacting with their devices affect the quality of data relevant for social science research?
2) How can these error sources be placed into the conceptual framework of the Total Survey Error?
3) How can device paradata contribute to a better understanding of the potential influence of these factors on data quality?

The elaboration applies findings from studies in various fields to the context of survey research and further highlights potential error sources by evaluating a prototype mobile application for integrated collection of survey and sensor data. On this basis, the paper identifies technological and behavioural factors that can affect data collection performance, discusses the placement of potential biasing effects within the Total Survey Error framework, and underlines the importance of implementing appropriate measures for better understanding and monitoring of the technical environment during data collection.

Stop detection decisions in a travel survey app

Mrs Danielle McCool (Utrecht University)
Dr Peter Lugtig (Utrecht University)
Professor Barry Schouten (Statistics Netherlands/Utrecht University) - Presenting Author

Apps are promising tools for travel surveys and have been explored by various commercial and non-commercial institutions. Statistics Netherlands recently developed a cross-platform travel app that actively measures time-location sensor data. Respondents get daily overviews of their travels and stops and are asked to supplement these with travel motives and transportation modes. Respondents can also indicate for each stop whether it was correctly detected or not. The current version of the app does not allow for adding or removing stops.
The definition of stops and travels depends on the motives for visiting the specific locations. A change of transportation mode, e.g. from train to bus, or a slow traffic light is not a real stop, whereas dropping someone off at the train station is a stop. As a consequence, without stop motives, of two stops with similar features one may be real and the other may not. In the app, a candidate stop is detected when a respondent remains within a specified radius for at least a specified amount of time. The two parameters do not depend on the location or time.
In the paper, we show the results of a large-scale field test in which the two stop detection parameters have been varied and randomly assigned to different sample units. We compare the parameter values to the number of candidate stops and to the number of real stops.
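
The radius-and-duration rule for candidate stops described above can be illustrated with a minimal Python sketch. It is not the algorithm used in the Statistics Netherlands app, whose details the abstract does not give. The input format (timestamp in seconds, latitude, longitude), the function names, the choice to anchor the radius at the first point of each window, and the parameter values in the usage note are all assumptions made for illustration.

```python
import math
from typing import List, Tuple

# Hypothetical input format: (timestamp in seconds, latitude, longitude).
Point = Tuple[float, float, float]


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in metres between two coordinates."""
    earth_radius_m = 6371000.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * earth_radius_m * math.asin(math.sqrt(a))


def detect_candidate_stops(
    points: List[Point], radius_m: float, min_duration_s: float
) -> List[Tuple[float, float]]:
    """Return (start_time, end_time) for each candidate stop.

    Assumption: a window is anchored at its first point and grows while
    consecutive points stay within radius_m of that anchor.
    """
    stops = []
    i, n = 0, len(points)
    while i < n:
        t0, lat0, lon0 = points[i]
        j = i
        while j + 1 < n and haversine_m(lat0, lon0, points[j + 1][1], points[j + 1][2]) <= radius_m:
            j += 1
        if points[j][0] - t0 >= min_duration_s:
            # Long enough inside the radius: record it and skip past the window.
            stops.append((t0, points[j][0]))
            i = j + 1
        else:
            i += 1
    return stops
```

Calling `detect_candidate_stops(points, radius_m=50, min_duration_s=300)` on a time-ordered trace returns the candidate stops only. Whether a candidate is a real stop still depends on the motive, which this sketch cannot infer.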

Predicting Completion Conditions in Mobile Web Surveys with Acceleration Data

Dr Christoph Kern (University of Mannheim) - Presenting Author
Mr Stephan Schlosser (University of Göttingen)
Dr Jan Karem Höhne (University of Mannheim)
Dr Melanie Audrey Revilla (Universitat Pompeu Fabra)

Participation in web surveys via smartphones has increased continuously in recent years due to a rapidly growing proportion of smartphone owners and an increase in mobile Internet access. However, previous research has shown that smartphone respondents are frequently distracted and/or multitasking, which might affect response behavior in a negative way. In this study, we therefore predict respondents’ completion conditions (e.g., standing or walking) and study their effects on data quality in mobile web surveys. For this purpose, we train machine learning models based on acceleration data of smartphone respondents (measured by means of the JavaScript-based tool “SurveyMotion”) that were collected in a lab experiment (N = 89) and in a field experiment (N = 521) that systematically varied the completion conditions. The lab experiment data were collected at the Center of Methods in Social Sciences at the University of Göttingen (Germany) in 2017 and the field experiment data were collected by the online fieldwork company Netquest (Spain) in 2018. We extract features from the acceleration data by aggregating over the repeated acceleration measurements that were collected for each respondent-page. Regularized regression and tree-based models were trained and tested using grouped cross-validation, reflecting the hierarchical structure of the data with pages nested in respondents. When building the prediction models, both a multiclass (sitting, standing, walking, climbing stairs) and a binary (moving, not moving) version of the outcome variable were considered. The results show that the acceleration features can be used to build sparse prediction models that almost perfectly discriminate between completion conditions on hold-out sets, with cross-validated ROC-AUCs between 0.981 and 0.998. The evaluation results indicate that the trained prediction models can be used to precisely predict completion conditions in (new) mobile web surveys that collect acceleration data.
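
The two data-handling steps named above, aggregating repeated acceleration measurements per respondent-page and splitting folds by respondent, can be sketched in plain Python. This is an illustration under stated assumptions, not the authors' pipeline. The abstract does not list the features used, so the summary statistics (mean, standard deviation and maximum of the acceleration magnitude) are assumed, and the function names are hypothetical.

```python
import math
import random
from typing import Dict, Iterator, List, Sequence, Tuple


def page_features(samples: Sequence[Tuple[float, float, float]]) -> Dict[str, float]:
    """Aggregate repeated (x, y, z) acceleration readings for one respondent-page.

    Assumed features: mean, standard deviation and maximum of the magnitude.
    """
    mags = [math.sqrt(x * x + y * y + z * z) for x, y, z in samples]
    mean = sum(mags) / len(mags)
    var = sum((m - mean) ** 2 for m in mags) / len(mags)
    return {"mean": mean, "sd": math.sqrt(var), "max": max(mags)}


def grouped_k_fold(
    groups: Sequence[str], k: int, seed: int = 0
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_idx, test_idx) pairs in which no respondent is in both sets.

    groups[i] is the respondent ID of row i (one row per respondent-page).
    """
    unique = sorted(set(groups))
    random.Random(seed).shuffle(unique)
    # Assign whole respondents, not single pages, to folds.
    fold_of = {g: i % k for i, g in enumerate(unique)}
    for fold in range(k):
        test = [i for i, g in enumerate(groups) if fold_of[g] == fold]
        train = [i for i, g in enumerate(groups) if fold_of[g] != fold]
        yield train, test
```

Keeping all pages of a respondent in the same fold prevents a model from being evaluated on pages of a person it has already seen during training, which would otherwise inflate hold-out performance. Library routines such as scikit-learn's `GroupKFold` provide the same guarantee.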