ESRA 2017 Programme

Thursday 20th July, 16:00 - 17:30 Room: F2 104

Panel and Survival Techniques for Complex Survey Data

Chair Dr Arne Bethmann (German Youth Institute)
Coordinator 1 Dr Ulrich Pötter (German Youth Institute)

Session Details

Many modern statistical methods, ranging from panel and survival
analysis techniques to bootstrap methods, are routinely used for the
study of data generated by longitudinal survey designs. However,
survey design issues and in particular the impact of longitudinal
accrual of information combined with panel attrition are very rarely
discussed when advanced statistical methods are applied to data from
complex surveys. This is certainly no accident: statistical methods
are generally developed, analyzed and justified based on the
assumption of independent and identically distributed observations.
These results can then often be transferred to the simplest sampling
situations like (stratified) simple random sampling without
non-response, using only minor adjustments.

But the increasing availability of data from complex survey designs,
including longitudinal studies and multi-frame surveys, together with
the decrease in response rates over the years, also increases the
disparity between simple justifications of statistical procedures and
their practical applications. In fact, it is quite unclear how crucial
features of the survey design as well as non-response can be combined
with standard statistical procedures in order to provide valid
inferences. In some special cases, it is possible to justify weighting
schemes derived from design information for procedures based on
estimating equations. But this approach is mainly restricted to
generalized regression type models as well as Cox-models without
time-dependent covariates. For general statistical procedures, there
is little experience and scarce theoretical guidance on how one may
combine information on the sampling design and non-response process
with the standard estimation strategies.
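
As a minimal sketch of the special case mentioned above, the following Python code solves a design-weighted estimating equation for a linear regression, i.e. it finds the beta for which sum_i w_i * x_i * (y_i - x_i' beta) = 0. The data, the inclusion probabilities and the function name are hypothetical and serve only as an illustration; the weights are assumed to be inverse inclusion probabilities, and non-response is not modelled.

```python
import numpy as np

def weighted_estimating_equation_fit(X, y, w):
    """Solve sum_i w_i * x_i * (y_i - x_i' beta) = 0 for beta."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    # The weighted normal equations are linear in beta.
    XtWX = X.T @ (w[:, None] * X)
    XtWy = X.T @ (w * y)
    return np.linalg.solve(XtWX, XtWy)

# Hypothetical toy data: intercept plus one covariate.
rng = np.random.default_rng(0)
n = 200
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
y = 1.0 + 2.0 * x + rng.normal(size=n)

# Assumed inclusion probabilities; design weights are their inverses.
incl_prob = rng.uniform(0.1, 0.9, size=n)
w = 1.0 / incl_prob

beta = weighted_estimating_equation_fit(X, y, w)

# The weighted score evaluated at the solution should vanish.
score = X.T @ (w * (y - X @ beta))
print(beta, score)
```

The same weighted-score idea carries over to generalized regression models, where the equation is no longer linear in beta and has to be solved iteratively.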

We invite contributions that enrich analytical practice by reporting
on current approaches to combine survey design aspects with modern
statistical techniques. We welcome theoretical considerations and/or
simulations to compare different approaches, as well as applied
techniques for dealing with complex longitudinal designs and
non-response in substantive research.

Topics may include (but are not restricted to):
- General uses of (non-response) weights in dynamic regression or
survival models including the use of time-varying covariates and the
estimation of time-varying effects.
- Using augmented inverse probability weighting.
- Non-weighting methods to deal with design and non-response including
response modeling.
- Bootstrapping methods preserving sampling designs.
- Using sampling process information as well as population level
information to increase credibility of statistical estimates.

Paper Details

1. Weighted moments cum likelihood estimation in a survey population setup for longitudinal categorical data
Professor Brajendra Sutradhar (Carleton University)

Ignoring the sample selection process when fitting models to survey data can have severe effects on inference. There exist some studies in the panel data or longitudinal setup where repeated observations are collected from individuals selected based on a sampling design, but they are confined to the linear correlated model setup for continuous observations. In this talk, we consider dynamic models for repeated count and multinomial data in a finite population setup and develop likelihood estimating equations based on sampling design weights for the estimation of the survey population parameters. Properties of the estimators are discussed.


2. Projecting long-term trends in mobility limitations: impact of excess weight, smoking and physical inactivity
Dr Tommi Härkänen (National Institute for Health and Welfare, Helsinki, Finland)
Ms Päivi Sainio (National Institute for Health and Welfare, Helsinki, Finland)
Dr Sari Stenholm (University of Turku, Finland)
Dr Annamari Lundqvist (National Institute for Health and Welfare, Helsinki, Finland)
Dr Heli Valkeinen (National Institute for Health and Welfare, Helsinki, Finland)
Professor Arpo Aromaa (National Institute for Health and Welfare, Helsinki, Finland)
Professor Seppo Koskinen (National Institute for Health and Welfare, Helsinki, Finland)

Background
Mobility is a prerequisite to participation in civic life and an important component of quality of life. The future development of mobility limitations will largely depend on modifiable risk factors, including excess weight, smoking and physical inactivity, but also on structural changes in the population, such as ageing and rising levels of education. This study aimed to project the prevalence and number of people with severe mobility limitations up to 2044, based on scenarios for the development of risk factors.
Methods
We applied a multistate model to repeated measures in the Health 2000 and 2011 Surveys (BRIF8901), representing the Finnish population, to account for individual risk factors and their changes over time. Unit nonresponse and sampling variability in the Health 2000 Survey were handled using a weighted bootstrap based on the poststratification weights. Item nonresponse in the Health 2000 and Health 2011 Surveys was handled using multiple imputation (MI) based on chained equations and regression trees. Projections of both the outcome and the risk factor values were generated using the same MI technique, assuming the same transition probabilities as between 2000 and 2011.
Results
The number of people with severe mobility limitations was projected to double by the year 2044 in Finland, due to the rapid ageing of the population. Excess weight was the most important modifiable risk factor predicting severe mobility limitations. Eliminating half of the excess weight would reduce the number of persons with severe mobility limitations by one fifth. Reductions in the prevalence of smoking and physical inactivity would have only a small impact on the prevalence of severe mobility limitations. Even if excess weight, smoking and physical inactivity were completely eliminated, the number of persons with severe mobility limitations is projected to increase.
Conclusions
Designing and implementing strategies to promote healthy weight and weight reduction are top priorities for public policy to slow down the rapid increase in mobility limitations due to population ageing. MI using chained equations seemed to be a plausible approach to handle changes over time not only in the outcome but also in the risk factors, which are often categorical and can involve interactions and nonlinearities that are generally time consuming to model with parametric MI techniques.


3. A comparison between variance estimation with bootstrap replicate weights and TSL
Dr Tobias Schmidt (Deutsche Bundesbank)
Mr Matthias Kaeding (RWI Essen)

In this paper we analyse the difference between variance estimation using bootstrap replicate weights and Taylor series linearization (TSL) methods. We are particularly interested in the impact of calibration on the difference between the two variance estimation techniques. In order to gauge the size of the effects, we start with a set of weights from an actual study conducted in 2010 by the Bundesbank (“Panel on Household Finance”). This study provides both sample design information and a set of 1,000 bootstrap replicate weights. To be able to assess the impact of calibration, we simulate a large set of study variables more or less related to the calibration variables and estimate their variance in Stata using both bootstrap and TSL. We find that the linearization methods yield systematically higher variances than the bootstrap methods, and that the gap between the two methods widens as the correlation between the variable under study and the indicators used in the calibration of the weights increases. At very high levels of correlation, the bootstrap replicate method underestimates the true variance, while the linearization method always yields a conservative estimate.
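
The two variance estimators compared in this abstract can be sketched for a weighted mean as follows. This is an illustration only and not the authors' Stata code: the data are simulated, the replicate weights are generated with a Rao-Wu rescaling bootstrap for a single stratum in which every unit is its own PSU (an assumption; the study's replicate weights may be constructed differently), and the calibration step that the abstract focuses on is omitted. All function names are hypothetical.

```python
import numpy as np

def weighted_mean(y, w):
    return np.sum(w * y) / np.sum(w)

def replicate_variance(y, rep_weights, full_weights):
    """Mean squared deviation of replicate estimates from the full-sample one."""
    theta = weighted_mean(y, full_weights)
    theta_b = np.array([weighted_mean(y, wb) for wb in rep_weights])
    return np.mean((theta_b - theta) ** 2)

def linearization_variance(y, w):
    """Taylor linearization for a weighted (ratio) mean, with-replacement design."""
    n = len(y)
    # Linearized variable of the ratio sum(w*y) / sum(w).
    z = w * (y - weighted_mean(y, w)) / np.sum(w)
    return n / (n - 1) * np.sum((z - z.mean()) ** 2)

def rao_wu_replicates(w, n_reps, rng):
    """Draw n-1 units with replacement and rescale weights by n/(n-1)."""
    n = len(w)
    counts = rng.multinomial(n - 1, np.full(n, 1.0 / n), size=n_reps)
    return w * counts * n / (n - 1)

# Hypothetical toy sample with unequal weights.
rng = np.random.default_rng(1)
n = 500
y = rng.normal(loc=10.0, scale=3.0, size=n)
w = rng.uniform(1.0, 5.0, size=n)

rep_w = rao_wu_replicates(w, n_reps=1000, rng=rng)
v_boot = replicate_variance(y, rep_w, w)
v_tsl = linearization_variance(y, w)
print(v_boot, v_tsl)
```

Without calibration the two estimates should be close in this simple setting; the abstract's finding concerns how they diverge once the weights are calibrated to variables correlated with the study variable.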