Wednesday 17th July 2013, 11:00 - 12:30, Room: No. 12

Construction of Response Scales in Questionnaires 1

Convenor: Dr Natalja Menold (GESIS)
Coordinator 1: Mrs Kathrin Bogner (GESIS)

Session Details

Researchers are invited to submit papers dealing with the design of response scales for questions/items that measure opinions or behaviour in surveys. Papers could address design aspects of response scales such as the number of categories, the middle category, unipolar versus bipolar scales, numerical and/or verbal labels, ascending or descending order of categories, or the scale's visual design. Of particular interest are the effects of these design aspects on respondents' answers as well as on data reliability and validity. In addition, studies could focus on the effects of cognitive or motivational factors. Specifics of response scale design in different survey modes, their comparability in mixed-mode surveys, and their intercultural comparability are further topics of interest.


Paper Details

1. Does the polarity of rating scales matter? How unipolar, bipolar and mixed rating scales affect response sets and factorial validity.

Dr Natalja Menold (GESIS)

Which rating scales are optimal has been widely discussed in the literature. While there are numerous studies addressing different issues, e.g. the number of categories, research on scale polarity is quite scarce. Bipolar scales reflect two opposing alternatives (e.g. agree - disagree) with a conceptual zero as midpoint (neither/nor). Unipolar scales (e.g. level of importance; do not agree - agree) reflect varying levels of the same dimension, with the middle category representing the conceptual dimension's midpoint (e.g. moderately). In surveys, so-called "mixed" rating scales are often used, in which, for instance, a midpoint reflecting a conceptual zero point is placed in a unipolar rating scale. The current study addresses the question of how response sets and the factorial validity of multi-item measures are affected by unipolar, bipolar and mixed rating scales. In addition, several personality factors and attitude strength were controlled. A 2x2 randomized experimental design was implemented, varying polarity (unipolar vs. bipolar) and the middle category (matching vs. non-matching to scale polarity). Different constructs were measured using these rating scales. The participants were 522 members of the GESIS online access panel, representing a probability sample of German residents. The results show that respondents endorsed the middle category "neither/nor" more often than the category "moderately". Factorial validity was rather poor with bipolar rating scales, whereas unipolar rating scales showed the best or acceptable factorial validity. The results are discussed in terms of their applicability for surveyors and researchers.
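
To make the 2x2 design concrete, the following sketch illustrates how middle-category endorsement might be compared across the polarity conditions. It is a minimal illustration, not the author's analysis: the data, variable names and the chi-square comparison are assumptions.

```python
# Illustrative sketch only -- hypothetical data, not the study's analysis.
import pandas as pd
from scipy.stats import chi2_contingency

# Hypothetical respondent-level data from a 2x2 design:
# polarity (unipolar vs. bipolar) x midpoint label (matching vs. non-matching),
# plus whether the respondent endorsed the middle category of one item.
df = pd.DataFrame({
    "polarity": ["bipolar"] * 4 + ["unipolar"] * 4,
    "midpoint": ["matching", "matching", "non-matching", "non-matching"] * 2,
    "chose_middle": [1, 0, 1, 0, 0, 0, 1, 0],
})

# Endorsement rate of the middle category in each experimental condition.
print(df.groupby(["polarity", "midpoint"])["chose_middle"].mean())

# Chi-square test: does middle-category endorsement depend on polarity?
table = pd.crosstab(df["polarity"], df["chose_middle"])
chi2, p, dof, _ = chi2_contingency(table)
print(f"chi2 = {chi2:.2f}, p = {p:.3f}")
```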



2. Impact of Different Response Formats on Measurement Quality

Professor Dagmar Krebs (University of Giessen)

Characteristics of response scales are important factors in guiding the cognitive processes underlying the choice of a response category. The paper compares two requests addressing the same topic, one presented in an agree-disagree response format, the other in a construct-specific response format. For example, on the one hand, job motivation can be assessed by a number of statements describing the importance of job characteristics, where respondents are asked to express how much they agree or disagree with each statement on an agree-disagree response scale. On the other hand, in the construct-specific response format, job motivation is assessed by asking respondents to rate the degree of importance of specific job characteristics.
It is expected that measurement quality is better in the construct-specific response format, where "better" refers to less missing data as well as higher reliability and validity. For all questions in both response formats a 4-point response scale was used.
The study is based on a population probability sample using a split-ballot design with repeated measurement. Respondents were randomly assigned to the experimental conditions of completely versus endpoint-verbalized response formats. Within these conditions, the first measurement used construct-specific response options while the second (repeated) measurement used agree-disagree response options. Since each respondent answered on construct-specific as well as agree-disagree response scales, response behavior for these response formats can be compared on completely as well as endpoint-verbalized response scales.
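
As an illustration of the kind of comparison described above, the sketch below computes a missing-data rate and Cronbach's alpha for item sets in two formats. The data are synthetic and the variable names invented; the study itself may rely on different reliability and validity estimates.

```python
# Illustrative sketch only -- synthetic data, so the alpha values carry no meaning.
import numpy as np
import pandas as pd

def cronbach_alpha(items: pd.DataFrame) -> float:
    """Cronbach's alpha for a respondents-by-items DataFrame."""
    items = items.dropna()
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_vars / total_var)

rng = np.random.default_rng(0)
# Hypothetical 4-point responses (1-4) to five items per format.
agree_disagree = pd.DataFrame(rng.integers(1, 5, size=(200, 5)))
construct_specific = pd.DataFrame(rng.integers(1, 5, size=(200, 5)))

for name, data in [("agree-disagree", agree_disagree),
                   ("construct-specific", construct_specific)]:
    missing = data.isna().mean().mean()  # share of missing answers (0 here)
    print(f"{name}: missing rate = {missing:.2f}, "
          f"alpha = {cronbach_alpha(data):.2f}")
```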


3. Verbal labels for rating scales: A scaling study

Dr Fanney Thorsdottir (University of Iceland)

One of the issues raised in connection with rating scale construction is the selection of the verbal labels attached to the response options. It has been suggested that verbal labels should have relatively precise meanings for respondents (Jones & Thurstone, 1955; Krosnick & Fabrigar, 1997), reflect equal intervals along the response dimension (Krosnick & Fabrigar, 1997; Schaeffer & Presser, 2003) and represent positions from the greatest degree of disagreement to the greatest degree of agreement (Jones & Thurstone, 1955). The purpose of the present study was to determine the meaning of a set of Icelandic words and phrases that can be used as verbal labels. In total, 598 respondents were randomly selected from a prerecruited probability-based panel in Iceland. The data were collected via the Internet and the overall response rate was 65.9% (n=398). Each respondent was presented with a list of 96 phrases and asked to judge the meaning of each phrase. Scale values were derived for the 96 phrases to determine the meaning of each phrase, and standard deviations were calculated to assess the precision of their meanings. On the basis of the findings, survey researchers in Iceland can select verbal labels that have reasonably precise meanings, span the entire response dimension and are spread at equal intervals across the dimension.
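
A minimal sketch of such a scaling analysis, under assumed data, is given below: respondents rate each candidate phrase on a numeric dimension, and the mean (scale value) and standard deviation (precision) are derived per phrase. The phrase list, sample size and 0-10 scoring are placeholders, not the study's materials.

```python
# Illustrative sketch only -- placeholder phrases and simulated judgements.
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
phrases = ["label A", "label B", "label C", "label D", "label E"]  # placeholders

# Each of 398 respondents judges every phrase on a 0-10 agreement dimension.
judgements = pd.DataFrame(
    {p: rng.normal(loc=i * 2.5, scale=0.8, size=398)
     for i, p in enumerate(phrases)}
)

summary = pd.DataFrame({
    "scale_value": judgements.mean(),   # estimated position of the label
    "sd": judgements.std(ddof=1),       # smaller SD = more precise meaning
}).sort_values("scale_value")
print(summary)
# Suitable labels: small SD and roughly equal spacing of scale values.
```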


4. The effects of the visual presentation of rating scales on middle, extreme and don't know responses

Mrs Kathrin Bogner (GESIS)

In self-completion questionnaires, information is transmitted to respondents mainly via the visual channel. It should therefore be considered that, in addition to verbal information, nonverbal features (e.g. numbers, font, spacing) used in the layout of a response scale may systematically influence the respondents' question-answering process.
This research investigates how two different visual layouts of a don't know category (DKC) in rating scales affect respondents' response behavior. The assumption is that a DKC is perceived as accentuated if it is set off from the substantive rating scale by a divider line. This accentuation is assumed to attract respondents' attention, resulting in a higher rate of item nonresponse than when the DKC is simply attached to the scale categories. For the latter layout, however, the question is whether respondents recognize that the DKC is not part of the substantive scale.
Two randomized experiments were conducted, varying the presentation of the DKC layout. The first, a paper-and-pencil experiment, included 307 German students. The second experiment was conducted with 450 members of the GESIS online access panel. Results show that respondents in the attached DKC layout are more likely to select extreme responses but less likely to select middle and don't know responses than those in the detached DKC layout. Additionally, the results suggest that respondents' confidence in giving an answer as well as the frequency of thinking about the topic of interest significantly influence their choice of response category.
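
For illustration, the layout comparison could be summarized with a contingency table of response types per condition, as sketched below; the counts are invented and the test choice is an assumption, not the authors' analysis.

```python
# Illustrative sketch only -- invented counts, not the experiments' results.
import pandas as pd
from scipy.stats import chi2_contingency

# Hypothetical counts of middle, extreme and don't-know responses per layout.
counts = pd.DataFrame(
    {"middle": [90, 120], "extreme": [160, 110], "dont_know": [20, 45]},
    index=["attached DKC", "detached DKC"],
)
chi2, p, dof, _ = chi2_contingency(counts)
print(counts)
print(f"chi2({dof}) = {chi2:.2f}, p = {p:.4f}")
```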