95% confidence interval
Surveys: What are survey data?
This is a measure of the range within which we can be 95% sure that the population figure lies, based on a sample statistic. For a sample percentage, it is calculated as the sample percentage minus 1.96 times the standard error, to the sample percentage plus 1.96 times the standard error.
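As a rough illustration of the calculation described above, the following Python sketch computes the interval for a sample percentage. It assumes a simple random sample and uses the percentage standard error formula given under "Standard error" in this glossary; the function name and the example figures (40% from 1,000 cases) are hypothetical.

```python
import math

def confidence_interval_95(pct, n):
    """Return (lower, upper) bounds of a 95% confidence interval for a sample percentage.

    pct is the sample percentage (0-100) and n is the number of cases.
    Assumes a simple random sample.
    """
    se = math.sqrt(pct * (100 - pct) / n)  # standard error of the percentage
    margin = 1.96 * se
    return pct - margin, pct + margin

# Hypothetical example: 40% of 1,000 respondents gave a particular answer.
lower, upper = confidence_interval_95(40.0, 1000)
print(f"{lower:.1f}% to {upper:.1f}%")
```

With these made-up figures the interval runs from roughly 37% to 43%.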
Bias
Surveys: What are survey data?
A problem that results in incorrect estimates being produced from a sample. Results will be too high or too low. This may arise where the sample is not representative of the population.
Case
Surveys: What are survey data?
A case is a unit for which values are captured. When data are collected from a survey the case is usually a respondent. Most statistics packages show each case in a single row of data.
Categorical variable
Surveys: Exploring data
A variable that holds values which represent discrete classes of responses. For example, a marital status variable contains the classes (or categories) single (never married), married, civil partnership, divorced, widowed, etc.
Clustering
Surveys: What are survey data?
The process of dividing a population into groups and selecting the sample from only some of them. Geographical clustering is common in surveys conducted face to face. The population is split into geographical units and the sample is selected from only some of them. This is undertaken to reduce the amount of travel the interviewer needs to do.
Codebook
Surveys: Exploring data
A codebook describes the contents, structure, and layout of a data collection. Codebooks begin with basic front matter, including the study title, name of the principal investigator(s), table of contents, and an introduction describing the purpose and format of the codebook. Some codebooks also include methodological details, such as how weights were computed, and data collection instruments, while others, especially with larger or more complex data collections, leave those details for a separate user guide and/or data collection instrument.
Control variable
Surveys: What are survey data?
A control variable is a variable that is included in an analysis in order to account for its effect, and so to distinguish any effect that it has from the effect of another variable which may be of more interest. For example, if we were looking at the impact of having a degree on health amongst adults over 40, we might need to consider the impact of age at the same time. Older people tend to be less likely to have a degree and to have poorer health. If we control for age, we can see whether graduate health is generally better than non-graduate health once age has been accounted for.
Cross-sectional data
Surveys: What are survey data?
Cross-sectional data is collected at a single point in time. It has been likened to taking a snapshot.
Derived variable
Surveys: What are survey data?
A variable that is created after data collection following some sort of calculation or other processing.
Descriptive statistic
Surveys: What are survey data?
A statistic which simply describes a characteristic of a variable for a group of cases. It generally does not have any explanatory power.
Estimate
Surveys: What are survey data?
A statistic, produced using a sample of cases, which is designed to produce information about the characteristics of the population.
Longitudinal data
Surveys: What are survey data?
Data that contain information about the same units (usually respondents) over time are called longitudinal data. They can be contrasted with cross-sectional data, which are collected at a single point in time.
Microdata
Surveys: What are survey data?
Data stored as cases and variables, usually with one case per respondent.
Missing value
Surveys: What are survey data?
A value that is differentiated from a 'valid' response, such as 'did not answer' or 'not applicable'. Missing values are excluded from most procedures.
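A minimal sketch of what excluding missing values looks like in practice is given below. The codes -9 ('did not answer') and -1 ('not applicable') are assumptions made for illustration; real surveys document their own missing-value codes in the codebook.

```python
# Hypothetical missing-value codes; check the survey's codebook for the real ones.
MISSING_CODES = {-9, -1}  # -9 = 'did not answer', -1 = 'not applicable'

# Made-up responses on a 1-5 scale, including two missing values.
responses = [3, 4, -9, 2, 5, -1, 4]

# Keep only valid responses before calculating a statistic.
valid = [r for r in responses if r not in MISSING_CODES]
mean_valid = sum(valid) / len(valid)

print(len(valid), "valid cases, mean =", mean_valid)
```

Including the codes -9 and -1 in the calculation would drag the mean down, which is why most procedures leave missing values out.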
Multivariate methods
Surveys: What are survey data?
These are methods that use data from more than two variables at a time. Often several variables are included as controls. Statistical modelling is a particularly common form of multivariate method.
Nominal variable
Surveys: Exploring data
This is a categorical variable that contains values which represent categories that do not have a natural order. For example, there is no natural order to a set of categories describing religion followed. The values assigned to the classes are wholly arbitrary.
Ordinal variable
Surveys: Exploring data
This is a categorical variable that contains values which represent categories that have a natural order. For example, a highest level of qualification variable might follow an ordering such as: higher degree, first degree, further education below degree, GCSE or equivalent, no qualification. The values assigned to the classes should respect the natural order.
Population
Surveys: What are survey data?
The population is a defined set of units, for example all 'residents in England and Wales in 2011'. The population is the group that we seek to describe.
Precision
Surveys: What are survey data?
Precision is the accuracy with which a sample statistic is able to estimate a population statistic. A precise sample is characterised by a low standard error and a narrow 95% confidence interval.
Probability sample
Surveys: What are survey data?
A sample based on random selection of elements. It should be possible for the sample designer to calculate the probability with which an element in the population is selected for inclusion in the sample.
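The simplest probability sample is a simple random sample, in which every element has the same known chance of selection. The sketch below is illustrative only: the function name, the population of 1,000 numbered units and the sample size of 50 are all hypothetical.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n elements without replacement and report each element's selection probability."""
    rng = random.Random(seed)  # a seed makes the draw reproducible
    sample = rng.sample(population, n)
    inclusion_probability = n / len(population)  # the same for every element
    return sample, inclusion_probability

# Hypothetical population of 1,000 numbered units.
population = list(range(1, 1001))
sample, p = simple_random_sample(population, 50, seed=1)
print(len(sample), "units selected, each with probability", p)
```

More complex designs, such as clustered samples, give different elements different selection probabilities, but those probabilities can still be calculated by the sample designer.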
PSPP
Surveys: Exploring data
PSPP is an open source statistics package which has a similar design and basic functionality to SPSS. Visit the website for more information.
Raw variable
Surveys: What are survey data?
A variable that stores responses given to a question in the survey in their original form.
Representative sample
Surveys: What are survey data?
A representative sample is one that replicates the characteristics of the population.
Respondent
Surveys: What are survey data?
A person, or other entity, who responds to a survey.
Sample
Surveys: What are survey data?
A sample is a subset of a population.
SPSS
Surveys: What are survey data?
SPSS is a commercial statistics package. Visit the website for more information.
Standard error
Surveys: What are survey data?
The standard error for a statistic is a measure of the amount that a sample statistic (such as a percentage) varies from the true population statistic. The standard error for a percentage can be calculated by se(pct) = sqrt(pct * (100 - pct) / n). In other words: multiply the sample percentage by 100 minus the sample percentage. Divide the product by the number of cases used to calculate the percentage. Finally, take the square root of the answer to get the standard error.
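The formula can be tried out with a few lines of Python. This is a sketch assuming a simple random sample; the function name and the example sample sizes are made up for illustration.

```python
import math

def standard_error_pct(pct, n):
    """Standard error of a sample percentage (pct on a 0-100 scale) based on n cases."""
    return math.sqrt(pct * (100 - pct) / n)

# Hypothetical example: a sample percentage of 50% at three sample sizes.
for n in (100, 400, 1600):
    print(n, round(standard_error_pct(50, n), 2))
```

In this example each fourfold increase in the number of cases halves the standard error, which is why larger samples give more precise estimates.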
Statistical model
Surveys: What are survey data?
A statistical model is a theoretical construction of the relationship between a set of explanatory variables and another variable of interest which depends on them, built in order to better understand those relationships. The model is expressed as a function. For example, a researcher may use a function such as:
y = b1*x1 + b2*x2 + ... + bm*xm
to express the relationship between variable y and the m variables it depends on, x1 to xm, where the impact of a change of 1 unit in each explanatory variable is given by the factors b1 to bm respectively, if (s)he believes this function is an appropriate mathematical description of the relationships in the data.
Standard models have well-known methods to determine the b values and the strength of evidence that each variable has an effect on y.
Statistical modelling is a major topic and outside the scope of the module. Readers who want to know more will find extensive accounts of statistical models, including linear regression and logistic regression, in statistical texts and online.
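To give a flavour of how the b values can be determined, the sketch below fits the function above for the special case of two explanatory variables using least squares, one common method. The function name and the data are invented for illustration, and the data were constructed so that y is exactly 2*x1 + 3*x2; real survey data would not fit this neatly.

```python
def fit_two_variable_model(x1, x2, y):
    """Least-squares estimates of b1 and b2 in y = b1*x1 + b2*x2 (no constant term)."""
    # Sums of squares and cross-products that make up the normal equations.
    s11 = sum(a * a for a in x1)
    s22 = sum(b * b for b in x2)
    s12 = sum(a * b for a, b in zip(x1, x2))
    s1y = sum(a * c for a, c in zip(x1, y))
    s2y = sum(b * c for b, c in zip(x2, y))

    det = s11 * s22 - s12 * s12
    if det == 0:
        raise ValueError("x1 and x2 are perfectly collinear; b1 and b2 cannot be separated")

    # Solve the two normal equations using Cramer's rule.
    b1 = (s1y * s22 - s2y * s12) / det
    b2 = (s2y * s11 - s1y * s12) / det
    return b1, b2

# Invented data in which y = 2*x1 + 3*x2 exactly.
x1 = [1, 2, 3, 4]
x2 = [1, 0, 1, 0]
y = [5, 4, 9, 8]
b1, b2 = fit_two_variable_model(x1, x2, y)
print(b1, b2)
```

Statistics packages such as SPSS and PSPP carry out this kind of estimation for any number of variables and also report the strength of evidence for each b value.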
Structured interview
A survey undertaken by an interviewer, usually face to face or over the phone, that is constructed from standard questions.
Survey non-response
Surveys: Exploring data
Sample members may not be contactable, or may refuse to participate in the survey. Sample members who do not take part in the survey are known as 'non-respondents'.
Unit of analysis
Surveys: What are survey data?
The unit which is being analysed. This is synonymous with the case.
Value
Surveys: What are survey data?
A representation of a characteristic for one case, for one variable. In many packages the value is represented by a number.
Value label
Surveys: What are survey data?
A label associated with a value to enable humans to understand what it means.
Variable
A variable is anything which can vary. In surveys, this is usually a characteristic that varies between cases.
Weighting
Surveys: Exploring data
A means by which the relative importance of cases can be changed. By default all cases count as one unit. Weighting changes this so that a case with high weight counts more and a case with low weight counts less. The process is usually undertaken in order to fix problems of unrepresentativeness in a sample.
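The effect of weighting on an estimate can be seen in a small sketch. The function name, the four cases and their weights are hypothetical; in real surveys the weights are supplied with the data and described in the codebook or user guide.

```python
def weighted_percentage(flags, weights):
    """Percentage of cases with flag True, counting each case by its weight."""
    total = sum(weights)
    weighted_yes = sum(w for f, w in zip(flags, weights) if f)
    return 100 * weighted_yes / total

# Four made-up cases: whether each answered 'yes', and each case's weight.
flags = [True, False, True, False]
weights = [2.0, 1.0, 0.5, 0.5]

unweighted = weighted_percentage(flags, [1] * len(flags))  # every case counts as one unit
weighted = weighted_percentage(flags, weights)
print(unweighted, weighted)
```

Here the unweighted percentage is 50%, but the weighted percentage is 62.5% because the first 'yes' case carries a high weight.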