
Tuesday 14th July, 16:00 - 17:30 Room: HT-101

Surveys, ipsative and compositional data analysis (CODA)

Convenor: Dr Berta Ferrer-Rosell (University of Girona - Department of Economics)
Coordinator 1: Dr Josep Daunis-i-Estadella (University of Girona - Department of Computer Science, Applied Mathematics and Statistics)
Coordinator 2: Professor Vera Pawlowsky-Glahn (University of Girona - Department of Computer Science, Applied Mathematics and Statistics)

Session Details

Statistical compositions are common in chemical and biological analyses in fields such as geology and biology. Typically the total size is irrelevant and only the proportion, or relative importance, of each component is of interest. In survey measurement, so-called ipsative data likewise consist of positive data arrays with a fixed sum, which convey information only on the relative importance of each component. Examples include surveys measuring compositions of household budgets (% spent in each product category), time-use surveys (24-hour total), educational instruments allocating a total number of points to different abilities or orientations (e.g. Kolb’s learning styles or Boyatzis’ philosophical orientations), and social network compositions (% of family members, friends, neighbours, etc.).

Statistical analysis of compositional data is challenging because they lie in a restricted space and components cannot vary independently from one another ("all other things constant"): the relative importance of one component can only increase if the relative importance of at least one other component decreases. A popular solution is to transform compositional data by means of logarithms of ratios of components before applying standard analysis methods, while interpreting the results with great care.
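As a minimal sketch of the log-ratio idea described above, the following Python snippet closes a set of positive parts to a unit sum and applies the centred log-ratio (clr) transformation, one of several possible log-ratio transformations. The time-use figures and the function names are hypothetical and chosen only for illustration.

```python
import math

def closure(parts):
    """Rescale positive parts so they sum to 1 (only relative sizes matter)."""
    total = sum(parts)
    return [p / total for p in parts]

def clr(parts):
    """Centred log-ratio: log of each part over the geometric mean of all parts."""
    logs = [math.log(p) for p in parts]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]

# Hypothetical time-use composition in hours (sleep, work, leisure), total 24.
hours = [8.0, 9.0, 7.0]
shares = closure(hours)   # proportions summing to 1
coords = clr(shares)      # unbounded real values that sum to 0
```

Because the clr values depend only on ratios between parts, `clr(hours)` and `clr(shares)` give the same result: the total (24 hours or 1) carries no information.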

Standard statistical methods such as ANOVA, linear regression and cluster analysis have a well-documented tradition in compositional data analysis, although there is room for improving the methods and making them friendlier to a wider audience. Less has been done regarding typical survey research analysis methods, for instance, multivariate analysis methods and latent-variable methods. The naive analysis of raw proportions is common practice even though it is plagued with statistical problems (inconsistent inferences, heteroskedasticity, non-normality, censoring, perfect collinearity, and unclear interpretation, among others). The session aims to bridge methodological knowledge between the natural and social sciences in order to narrow this gap.

Paper Details

1. Multiplicative Ipsative Data and Compositional Data. Why, Why Not, How and How Not?
Dr Glòria Mateu-Figueras (University of Girona)
Dr Josep Daunis-i-Estadella (University of Girona)
Dr Berta Ferrer-Rosell (University of Girona)

Multidimensional forced choice instruments ask respondents to rank traits along a set of items and convey information on the relative importance of traits. Data are ipsative and have a fixed sum. This is why they violate the assumptions of standard statistical techniques while they are fit for COmpositional Data Analysis (CODA). The paper discusses how CODA gets the most out of the relative information in the data and solves all concerns about ipsativity and assumptions. The simplest CODA approaches are presented: how alternative log-ratio transformations of the data are computed and how zeros are dealt with.
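The abstract does not say which zero treatment or which log-ratio transformations the paper presents. Purely as an illustration, the sketch below uses multiplicative zero replacement followed by the additive log-ratio (alr) transformation. The replacement value `delta`, the example shares and the function names are assumptions, and the input is assumed to sum to 1.

```python
import math

def multiplicative_replacement(shares, delta=0.005):
    """Replace zeros by a small value delta (an assumed choice) and shrink
    the non-zero shares so that the total remains 1 and their ratios are kept."""
    n_zeros = sum(1 for s in shares if s == 0)
    return [delta if s == 0 else s * (1 - n_zeros * delta) for s in shares]

def alr(shares):
    """Additive log-ratio: log of each part over the last part (the reference)."""
    ref = shares[-1]
    return [math.log(s / ref) for s in shares[:-1]]

# Hypothetical shares of points given to four traits; the last one received none.
raw = [0.5, 0.3, 0.2, 0.0]
filled = multiplicative_replacement(raw)   # all parts now strictly positive
coords = alr(filled)                       # three log-ratios for four parts
```

Logarithms of ratios are undefined when a part is zero, which is why some replacement step has to come first. The multiplicative form leaves the ratios among the non-zero parts unchanged.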

2. Relating Sets of Ipsative Variables from Forced Choice Questionnaires. A Compositional Canonical Correlation Analysis Approach
Dr Josep Daunis-i-Estadella (University of Girona)
Dr Glòria Mateu-Figueras (University of Girona)
Dr Josep Antoni Martín-Fernández (University of Girona)

We propose a method combining canonical correlation analysis and compositional data analysis in order to relate two sets of ipsative variables obtained from multidimensional forced choice questionnaires. In these questionnaires respondents are asked to rank a set of dimensions over a number of items. We derive some key desirable statistical properties of the proposed method, which solves all concerns about ipsativity while revealing the information on the relative importance of the dimensions carried by ipsative data. Once the data have been appropriately transformed, the method is no more complex than standard canonical correlation analysis.

3. Application of CODA to the Experiential Learning Theory. The Third Learning Style Dimension.
Professor Joan Manuel Batista Foguet (ESADE BS. Universitat Ramon Llull. Spain)
Mr Ricard Serlavós (ESADE BS. Universitat Ramon Llull. Spain)
Professor Germà Coenders (Universitat de Girona. Spain)
Professor Richard Boyatzis (Case Western Reserve University. Cleveland)

Kolb’s experiential learning theory suggests four learning modes, which are dialectically related by pairs. The dimension of grasping knowledge opposes the Concrete Experience mode to the Abstract Conceptualization mode; the dimension of transforming knowledge opposes Reflective Observation to Active Experimentation. Grasping and transforming as a whole have never been compared.
Being a 4-term composition, learning modes require three log ratios. In this paper we show how CODA allows the third dimension comparing grasping and transformation to emerge as a log-ratio. The paper gathers evidence of the third dimension by relating it to student nationality and social and
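One way to picture the three log-ratios of a 4-term composition is as scaled log-ratios of geometric means (often called balances). The sketch below assumes that form. The scaling constant, the scores and the function name are assumptions made for illustration and are not taken from the paper.

```python
import math

def balance(numerator, denominator):
    """Scaled log-ratio of the geometric means of two groups of parts."""
    r, s = len(numerator), len(denominator)
    gm_num = math.exp(sum(math.log(v) for v in numerator) / r)
    gm_den = math.exp(sum(math.log(v) for v in denominator) / s)
    return math.sqrt(r * s / (r + s)) * math.log(gm_num / gm_den)

# Hypothetical scores for Concrete Experience, Abstract Conceptualization,
# Reflective Observation and Active Experimentation.
ce, ac, ro, ae = 20.0, 35.0, 25.0, 40.0

grasping = balance([ce], [ac])        # Concrete Experience vs Abstract Conceptualization
transforming = balance([ro], [ae])    # Reflective Observation vs Active Experimentation
third = balance([ce, ac], [ro, ae])   # grasping as a whole vs transforming as a whole
```

With these made-up scores `third` is negative, meaning the two grasping modes jointly receive less weight than the two transforming modes.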

4. It All Adds Up: A Comparison of Constant Sum Tasks on Self-reported Behavior
Mr Randall Thomas (GfK Custom Research)
Dr Frances Barlas (GfK Custom Research)

We experimentally compared a number of response formats in reporting either frequency or monetary spend for two different topics (DVDs by type of movie; quick service restaurants by brand). Our formats included constant sum, open numeric, and grid response formats. Using two measures of validity, we found that constant sum tasks had generally lower validity than independent measures.

5. Clustering compositional data. An example with a repeated cross-sectional travel budget survey
Dr Germà Coenders (University of Girona)

Tourists are heterogeneous in the way they adapt their trip budget to economic crises by reducing expenditure in some parts of the budget. We derive tourist market segments from the trip budget allocation (share of transportation, accommodation and food, and activities) and study the evolution of the resulting segments during the current crisis by using year (2006-2012) as an illustrative variable.
Budget shares are a particular case of compositional data, which makes clustering difficult. After the centred log-ratio transformation, the Euclidean distance between two observations equals the Aitchison distance between the original compositions, so standard clustering techniques such as k-means can be used.
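The equivalence between the two distances can be checked numerically. The sketch below computes the Aitchison distance directly from all pairwise log-ratios and compares it with the Euclidean distance between centred log-ratio (clr) vectors. The two budget compositions are hypothetical and the function names are chosen for illustration.

```python
import math

def clr(parts):
    """Centred log-ratio: log of each part over the geometric mean of all parts."""
    logs = [math.log(p) for p in parts]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]

def euclidean(u, v):
    """Ordinary Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def aitchison_distance(x, y):
    """Aitchison distance computed from all pairwise log-ratios of the parts."""
    d = len(x)
    total = 0.0
    for i in range(d):
        for j in range(d):
            total += (math.log(x[i] / x[j]) - math.log(y[i] / y[j])) ** 2
    return math.sqrt(total / (2 * d))

# Hypothetical budget shares: transportation, accommodation and food, activities.
tourist_a = [0.40, 0.45, 0.15]
tourist_b = [0.25, 0.50, 0.25]

direct = aitchison_distance(tourist_a, tourist_b)
via_clr = euclidean(clr(tourist_a), clr(tourist_b))
```

The two values agree up to floating-point error, so any algorithm that relies on Euclidean distance, k-means included, can be run on the clr-transformed shares.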