ESRA logo

Friday 17th July, 11:00 - 12:30 Room: HT-101

Robust Methods in Survey Design and Analysis with Applications

Convenor Dr Marco Geraci (University of South Carolina)
Coordinator 1 Dr James Hardin (University of South Carolina)
Coordinator 2 Dr Andrew Ortaglia (University of South Carolina)

Session Details

The violation of the assumptions that underlie parametric statistical methods is potentially a serious issue when drawing inferences about a population. Resulting bias in the estimates may lead to incorrect conclusions. Typical problems include, but are not limited to, the presence of outliers, untenable normality assumptions, and model misspecification.

This session aims to showcase recent developments in robust methods for survey design and survey data analysis, with emphasis on applications. Submissions on topics such as semi- and non-parametric modelling, estimation of distribution functions and quantiles, variance estimation and methods for missing data are particularly welcome. The presentations will illustrate the application of robust methods to studies in the life, social and natural sciences. Examples of the use of related statistical software are also encouraged.

Paper Details

1. Design and Estimation Considerations for Stratum Jumping in the National Survey of College Graduates
Professor Jay Breidt (Colorado State University)
Professor Jean Opsomer (Colorado State University)
Mr Michael White (US Census Bureau)

The National Survey of College Graduates (NSCG) is conducted by the US Census Bureau on behalf of the National Science Foundation. The NSCG's primary focus is on the science and engineering workforce. Frame information for sampling is obtained from the American Community Survey (ACS). A graduate may belong to an ACS stratum of relatively low importance while also belonging to an important NSCG domain. The high sampling weight of such a "stratum jumper" leads to unstable estimates and variance estimates. Options for addressing such stratum jumpers at both the design and estimation stages are considered, theoretically and via a detailed simulation.

2. On the influence of transforming skewed distributions on survey analysis using imputed data
Mr Tobias Enderle (GESIS - Leibniz Institute for the Social Sciences)
Professor Ralf Münnich (University of Trier)

One way to compensate for item nonresponse is to use a multiple imputation routine that relies on the assumption of joint normality. Since research data follow the normal distribution only in the rarest of cases, one can approximate normality by transforming the data before imputation. We therefore study the handling of skewed distributions by different transformation approaches that rely on either the method of moments or the maximum likelihood method. The aim is to obtain correct inference in current survey data analyses. The paper also addresses the criticism raised in recent years.
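
As a rough illustration of why one might transform a skewed variable before a normality-based imputation step, the sketch below compares the moment-based sample skewness of a small right-skewed sample before and after a log transform. This is not the authors' procedure: the data are made up, and the log transform is only an assumed stand-in for the transformation approaches studied in the paper.

```python
import math


def sample_skewness(values):
    """Moment-based skewness: m3 / m2**1.5, using central moments."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical right-skewed observations (illustrative only, not survey data).
raw = [1.0, 1.2, 1.5, 2.0, 2.4, 3.0, 4.5, 7.0, 12.0, 30.0]

# Assumed transformation: natural log, applied before any imputation step.
logged = [math.log(v) for v in raw]

skew_raw = sample_skewness(raw)
skew_log = sample_skewness(logged)
print(f"skewness before: {skew_raw:.2f}, after log: {skew_log:.2f}")
```

In a full workflow, imputed values would be drawn on the transformed scale and then back-transformed; that step is omitted here.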

3. Robust quantile regression of surveys with data missing at random
Miss Xinling Xu (University of South Carolina)
Dr Marco Geraci (University of South Carolina)
Dr Andrew Ortaglia (University of South Carolina)

When addressing complex survey data, the estimation of population parameters requires statistical modeling that accounts for design features. A substantial complication arises when data are affected by unit and item nonresponse. We address estimation issues for models which target the conditional quantile of a continuous outcome. Survey design variables are properly included in the analysis, and we propose a bootstrap variance estimator. The proposed imputation method preserves conditional skewness and kurtosis, and successfully handles bounded outcomes. Stata and R code implementations will be demonstrated and made available.
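
To make the notion of a design-weighted quantile concrete, here is a minimal sketch of the simplest case: an intercept-only model, where the weighted quantile minimises the weighted check loss. It is not the method proposed in the paper (there are no covariates, no imputation and no bootstrap), and the outcomes and sampling weights are hypothetical.

```python
def check_loss(residual, tau):
    """Quantile (check) loss: tau * r if r >= 0, (tau - 1) * r if r < 0."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def weighted_quantile(values, weights, tau):
    """Smallest value whose cumulative share of the total weight reaches tau."""
    pairs = sorted(zip(values, weights))
    total = sum(w for _, w in pairs)
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        if cumulative >= tau * total:
            return value
    return pairs[-1][0]


def weighted_loss(candidate, values, weights, tau):
    """Sum of weighted check losses for a candidate quantile value."""
    return sum(w * check_loss(y - candidate, tau) for y, w in zip(values, weights))


# Hypothetical outcomes and sampling weights (illustrative only).
y = [2.0, 3.5, 1.0, 8.0, 5.0]
w = [10.0, 40.0, 5.0, 15.0, 30.0]
tau = 0.5

q = weighted_quantile(y, w, tau)
print("weighted median:", q)
print("weighted check loss at q:", weighted_loss(q, y, w, tau))
```

With covariates, the same weighted loss would be minimised over regression coefficients rather than a single constant.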

4. A new statistical approach to quantile regression of complex survey dietary data
Mr David Pell (University of Cambridge)
Dr Ivonne Solis-Trapala (Medical Research Council)

The National Diet and Nutrition Survey (NDNS) uses a representative sample to assess the diet of the UK population. Selecting individuals becomes challenging when sampling over a large geographic area; the NDNS therefore uses a complex survey design. Participants provide clustered dietary data, which can be examined using mixed-effects models. These models were developed to estimate the conditional mean, however, and so fail to provide a complete picture of the relationships between dietary intake and explanatory variables. Quantile regression provides a comprehensive description of the intake distribution. A novel method of quantile regression of clustered data collected using a complex survey design is