My World: Statistical methodology

The first phase of a statistical investigation is the identification of the population together with the variables which will be measured.

A population is a totality of entities about which we hope to draw conclusions.

A property of a member of a population which varies from one individual to another is called a random variable.

Data can be classified by its level of measurement:

Nominal: items are classified into categories that have no natural order.
Ordinal: items are classified into categories that are ranked in a logical order.
Metric: values are measured on a numerical scale, so the quantity has a physical significance.

Data can also be classified by the continuity of the scale:

Continuous: can take any value within a certain range.
Discrete: there are gaps in the scale between allowable values.

A factor is an effect we can control by setting its level before any variables are measured. Notice that a factor is not a random variable as its value is completely within our control as part of the experiment design.

Each factor can be operative at two or more levels and we define a treatment to be a combination of specific levels of each factor present.
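As a minimal sketch of this definition, assume a hypothetical experiment with two factors, fertilizer and watering. Neither the factors nor their levels come from the text above; they are made up for illustration. The treatments are then every combination of one level per factor:

```python
from itertools import product

# Hypothetical factors and levels, chosen only for illustration.
factors = {
    "fertilizer": ["none", "low", "high"],
    "watering": ["daily", "weekly"],
}

# A treatment is one specific level of each factor, so the full set of
# treatments is the Cartesian product of the level lists.
treatments = [
    dict(zip(factors.keys(), levels))
    for levels in product(*factors.values())
]

for t in treatments:
    print(t)
```

With three levels of one factor and two of the other, this gives 3 x 2 = 6 treatments, and so six separate samples.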

Each treatment defines a separate sample, that is, a set of entities drawn from a population. The items in a sample must be homogeneous with respect to the characteristics we are studying. The fundamental assumption in a statistical analysis is that members of a sample are identical to each other except for the variability we are prepared to write off as random, unexplained variation.

The Law of Large Numbers states that as the size of the sample grows, its average tends towards the corresponding average of the population. In practice, a larger sample generally gives a better estimate.
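A small simulation can make this concrete. The sketch below assumes a fair six-sided die as the population, an example that is not in the text above, and compares sample averages of increasing size with the known population average of 3.5. Any single run is subject to chance, so the error need not shrink at every step, but it tends to shrink as the sample grows.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable


def sample_mean(n):
    """Average of n simulated rolls of a fair six-sided die."""
    return sum(rng.randint(1, 6) for _ in range(n)) / n


population_mean = 3.5  # exact mean of a fair die: (1 + 2 + ... + 6) / 6

# Sample averages for increasingly large samples.
means = {n: sample_mean(n) for n in (10, 1_000, 100_000)}

for n, m in means.items():
    print(n, round(m, 3), "error:", round(abs(m - population_mean), 3))
```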

Most statistical techniques assume that the sample is randomly selected, so that every member of the population has an equal chance of being included in the sample. Individual measurements must also be statistically independent of each other and ideally should not interact at all.
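One way to draw such a sample, sketched here with Python's standard library, is to select members without replacement. The population of 100 numbered members and the sample size of 10 are assumptions made for illustration.

```python
import random

rng = random.Random(42)  # fixed seed, only so the example is repeatable

# Hypothetical population: 100 members labelled 1 to 100.
population = list(range(1, 101))

# random.sample draws without replacement, and every member has the same
# chance of ending up in the sample.
sample = rng.sample(population, k=10)

print(sorted(sample))
```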

Methods of analysis range from the calculation of summary statistics like averages and the drawing of diagrams like bar charts to more sophisticated techniques like analysis of variance and regression. Because the information content of the whole operation is being compressed into a diagram or a simple statement, the result can be like a woman's bikini - what it reveals is interesting but what it covers up is vital.
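To illustrate how much a single summary statistic can hide, the sketch below uses two made-up samples that share the same average but have very different spreads. The numbers are invented for this example.

```python
from statistics import mean, stdev

# Two made-up samples, used only for illustration.
sample_a = [4, 5, 6]
sample_b = [0, 5, 10]

# The averages agree, so a report of the mean alone would treat the
# two samples as equivalent.
print(mean(sample_a), mean(sample_b))

# The sample standard deviations show how different the spreads are.
print(stdev(sample_a), stdev(sample_b))
```

Reporting a measure of spread alongside the average recovers part of what the average alone covers up.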

Saturday, December 20, 2008
