Weights of survey data
When conducting a survey, having a representative sample of the population is of paramount importance. But in practice, you are prone to over-sample some kinds of people and under-sample others. Weighting is a statistical technique to compensate for this type of 'sampling bias'. A weight is assigned to:
- Reflect the data item's relative importance based on the objective of the data collection;
- Take into account the characteristics of sampling design;
- Reduce bias arising from nonresponse when the characteristics of the respondents differ from those not responding;
- Correct identifiable deviations from population characteristics.
Each case in the data file is assigned a coefficient, its individual weight, by which the case is multiplied so that the weighted sample attains the desired characteristics.
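As a minimal sketch of what "multiplying the case" means in practice, the snippet below compares an unweighted and a weighted share. The responses and weights are made-up toy values, and `weighted_mean` is a hypothetical helper name, not part of any survey package.

```python
# Hypothetical toy data: 1 = voted, 0 = did not vote.
responses = [1, 1, 1, 0, 0]
# Hypothetical individual weights: values below 1 shrink over-sampled cases,
# values above 1 boost under-sampled ones.
weights = [0.5, 0.5, 0.5, 1.75, 1.75]


def weighted_mean(values, case_weights):
    """Multiply each case by its weight, then divide by the sum of weights."""
    if len(values) != len(case_weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(case_weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(v * w for v, w in zip(values, case_weights)) / total_weight


unweighted = sum(responses) / len(responses)
weighted = weighted_mean(responses, weights)
print(f"unweighted share: {unweighted:.2f}")
print(f"weighted share:   {weighted:.2f}")
```

In this toy case the voters are assumed to be over-sampled, so down-weighting them lowers the estimated share.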
Different types of weights and their different purposes
Different types of weights serve different purposes and affect data analysis in different ways.
Whether or not to use weights is not a straightforward question. For particular methods of analysis (e.g., estimating associations or regressions), using weights may be counterproductive. There are also general theoretical and methodological issues which discourage some researchers from using weights. However, different types of weights are useful for different purposes, and in some situations you must take an appropriate weight into account in your analysis (see the types of weighting described below).
In all cases, if there are any weights in your data file, the rationale and calculation of the weights must be detailed in the data documentation.
Consider the following ...
An example: Using weights in European Social Survey data
The following table provides an illustration of using weights in the data from the European Social Survey (n.d.) (ESS). There are three different weights available in the ESS Source Main Questionnaire data file (see European Social Survey, 2014):
- The design weight takes into consideration the different probabilities of being sampled given the sampling methods implemented in individual countries;
- The post-stratification weight corrects for differences between the sample and selected population characteristics that are caused by other sampling and non-sampling errors;
- The population size weight corrects for the fact that the individual countries’ sample sizes are very similar while there are large variations in the size of their actual populations.
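To make the first two weights in the list above more concrete, here is a simplified sketch. It is not the ESS procedure itself. It assumes that a design weight is the inverse of a respondent's inclusion probability, rescaled to average 1, and that post-stratification scales those weights so that the weighted group shares match known population shares. All data, field names, and function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical respondents with an inclusion probability and a demographic group.
sample = [
    {"id": 1, "prob": 0.002, "group": "young"},
    {"id": 2, "prob": 0.001, "group": "young"},
    {"id": 3, "prob": 0.002, "group": "old"},
    {"id": 4, "prob": 0.002, "group": "old"},
]
# Hypothetical known population shares for the same groups.
population_share = {"young": 0.4, "old": 0.6}


def design_weights(rows):
    """Inverse inclusion probability, rescaled so the weights average 1."""
    raw = [1.0 / row["prob"] for row in rows]
    mean_raw = sum(raw) / len(raw)
    return [w / mean_raw for w in raw]


def post_stratification_weights(rows, base_weights, targets):
    """Scale base weights so that weighted group shares match the targets."""
    group_total = defaultdict(float)
    for row, w in zip(rows, base_weights):
        group_total[row["group"]] += w
    grand_total = sum(base_weights)
    adjusted = []
    for row, w in zip(rows, base_weights):
        sample_share = group_total[row["group"]] / grand_total
        adjusted.append(w * targets[row["group"]] / sample_share)
    return adjusted


dweights = design_weights(sample)
psweights = post_stratification_weights(sample, dweights, population_share)
for row, dw, pw in zip(sample, dweights, psweights):
    print(row["id"], row["group"], round(dw, 3), round(pw, 3))
```

Real surveys usually adjust on several characteristics at once, so treat the single-variable adjustment here as an illustration of the idea only.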
Different types of data analysis then require different weights or combinations of weights. When analysing data from one country alone, or comparing data from two or more countries separately, only the design weight or the post-stratification weight needs to be applied. When combining data from different countries into totals or averages, the design or post-stratification weight should be applied in combination with the population size weight.
| Example: voter turnout (% of respondents voting in the last election) | Weights to be used: Design weight / Post-stratification weight | Weights to be used: Population weight |
| --- | --- | --- |
| To examine data from a single country, whether a single variable or a cross-tabulation | | |
| Voter turnout in Germany | X | |
| Voter turnout in Germany by age and gender | X | |
| To compare results for two or more countries separately, without using totals or averages | | |
| Compare voter turnout in France, Germany, and the UK | X | |
| To combine countries, whether on a single variable or via a cross-tabulation | | |
| Voter turnout in Scandinavia | X | X |
| Voter turnout in the EU | X | X |
| Voter turnout across all countries participating in the ESS | X | X |
| Compare voter turnout between EU member states and accession countries | X | X |
| Voter turnout by age group across all ESS participating countries | X | X |
Source: European Social Survey, 2014.
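The rule summarised in the table can be sketched in code as follows. The column names `cntry`, `dweight`, `pspwght`, and `pweight` are assumed here to match the ESS data file and should be checked against the documentation of the file you use. The `voted` column, the helper functions, and all values are hypothetical toy data.

```python
# Hypothetical toy rows: two countries with equal sample sizes but
# different population sizes, reflected in the population size weight.
rows = [
    {"cntry": "A", "dweight": 1.0, "pspwght": 1.0, "pweight": 3.0, "voted": 1},
    {"cntry": "A", "dweight": 1.0, "pspwght": 1.0, "pweight": 3.0, "voted": 1},
    {"cntry": "B", "dweight": 1.0, "pspwght": 1.0, "pweight": 1.0, "voted": 1},
    {"cntry": "B", "dweight": 1.0, "pspwght": 1.0, "pweight": 1.0, "voted": 0},
]


def analysis_weight(row, combine_countries, base="pspwght"):
    """Pick the weight for one respondent following the table's rule."""
    if base not in ("dweight", "pspwght"):
        raise ValueError("base must be 'dweight' or 'pspwght'")
    weight = row[base]
    if combine_countries:
        # Totals or averages across countries also need the population size weight.
        weight *= row["pweight"]
    return weight


def weighted_turnout(data, combine_countries, base="pspwght"):
    """Weighted share of respondents who voted."""
    numerator = 0.0
    denominator = 0.0
    for row in data:
        w = analysis_weight(row, combine_countries, base)
        numerator += w * row["voted"]
        denominator += w
    return numerator / denominator


# Single country: design or post-stratification weight only.
country_b = [r for r in rows if r["cntry"] == "B"]
print("Country B:", weighted_turnout(country_b, combine_countries=False))

# Countries combined: add the population size weight.
print("A and B combined:", weighted_turnout(rows, combine_countries=True))
```

Without the population size weight, the combined figure would treat both countries as equally large, which is the distortion this weight is meant to remove.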