 # Standard Deviation

This page continues on from the "Understanding Averages" page. Standard deviation is a measure of the spread of the data about the arithmetic mean value. It represents, roughly, the typical amount by which each value differs from the mean.

Alarm and despondency are spreading through the ranks of the RSPB (Royal Society for the Protection of Beards). The trustees of the organisation are deeply concerned that stubbly “designer beards” have been observed infecting the chins of some members. These “shorty” appendages are strictly illegal (a minimum permissible length of 0.47 m and a willingness to house a homeless badger being the basic requirements for members). You have been employed to investigate suspect groups and report back to the Minimum Standards (Pogonophilia) Committee.

You collect the following data:

Table showing maximum beard lengths for three categories of members of the RSPB

| Membership category | Beard lengths in metres |
| --- | --- |
| Ecologists | 0.1, 0.9, 0.4, 0.5, 0.5, 0.5, 0.5, 0.6, 0.5, 0.5 |
| Druids | 0.4, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.6 |
| Eco-Warriors | 0.1, 0.2, 0.5, 0.8, 0.5, 0.7, 0.4, 0.5, 0.6, 0.7 |

Notice that the mean beard length of all groups is the same (0.5 m). The data for each group are, however, obviously different in the way that individual data points are scattered about the mean.

Most Ecologists have beards of 0.5 m (the mean value), but there are a couple of illegally short beards and two bigger growths (one, you will note, is of such massive proportions that it could conceal a veritable menagerie of nesting creatures).

Druids are very consistent in their beard lengths. Eight out of ten measured growths are of 0.5 m. There is just one shorter beard (probably a trainee) and one slightly bigger version belonging to a non-conformist druid.

The Eco-Warriors’ beard lengths are much more spread out around the mean, reflecting an inconsistent approach to bristle growth and (in 3 cases) blatant disregard for the constitution of the RSPB.

How can we describe the differences in the 3 data sets? What is needed is a measure that indicates the average amount that each piece of data is different from the mean. The standard deviation is just such a number.

Here is how it’s calculated:

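Standard deviation of Ecologists’ data

| Beard length (m) | Distance from the mean (m) | Distance from the mean, squared |
| --- | --- | --- |
| 0.1 | -0.4 | 0.16 |
| 0.9 | 0.4 | 0.16 |
| 0.4 | -0.1 | 0.01 |
| 0.5 | 0 | 0 |
| 0.5 | 0 | 0 |
| 0.5 | 0 | 0 |
| 0.5 | 0 | 0 |
| 0.6 | 0.1 | 0.01 |
| 0.5 | 0 | 0 |
| 0.5 | 0 | 0 |
| Total | 0 | 0.34 |

The first column (from the left) is each piece of raw data. The second column from the left is the distance of each piece of raw data from the mean. Note that if we just added this column up, the values would cancel each other out and we would end up with a value of zero. Obviously this is no use as a measure of scatter of the data about the mean. The way around this problem is to square each of the values from the second column. The negative values now become positive values and we can add them up. This has been done at the bottom of the third column (= 0.34).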
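The column-by-column working described above can be sketched in a few lines of Python. This is only an illustration: the variable names are invented for this example, and the data are the Ecologists' beard lengths from the table of raw data.

```python
# Ecologists' beard lengths in metres (first column of the working).
ecologists = [0.1, 0.9, 0.4, 0.5, 0.5, 0.5, 0.5, 0.6, 0.5, 0.5]

mean = sum(ecologists) / len(ecologists)

# Second column: distance of each value from the mean (some negative, some positive).
deviations = [x - mean for x in ecologists]

# Third column: squaring removes the minus signs so the values no longer cancel.
squared_deviations = [d ** 2 for d in deviations]

sum_of_deviations = sum(deviations)
sum_of_squares = sum(squared_deviations)

print("sum of deviations:", sum_of_deviations)
print("sum of squared deviations:", round(sum_of_squares, 2))
```

The sum of deviations comes out as zero apart from tiny floating-point rounding, while the sum of squared deviations is 0.34.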

So we now have a value (0.34) that represents the total squared deviation of all our pieces of data from the mean. If we divided that value by the number of items of data we would have the mean squared amount by which each bit of data varies from the mean. There is a slight complication though. If we could measure every single ecological beard in the world we could calculate the actual mean length of the whole population, and dividing by the number of beards would be correct. Our sample, however, has only ten measurements in it, and the deviations are measured from the sample mean rather than the true population mean, which tends to make the spread look a little smaller than it really is. So to give us a more realistic estimate of the spread in the whole population we divide by one less than the number of measurements. In our case that = 10 – 1 = 9. Statisticians call this value (n – 1) the degrees of freedom. When working from a sample, as you almost always will be, use n – 1 to calculate standard deviation.
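To see what difference the choice of divisor makes, here is a small sketch comparing division by n with division by n – 1 for the Ecologists' data. The hand-rolled variables are named for this example only; `statistics.pvariance` and `statistics.variance` are the Python standard library's population and sample versions, used here as a cross-check.

```python
import statistics

ecologists = [0.1, 0.9, 0.4, 0.5, 0.5, 0.5, 0.5, 0.6, 0.5, 0.5]

n = len(ecologists)
mean = sum(ecologists) / n
sum_of_squares = sum((x - mean) ** 2 for x in ecologists)

# Dividing by n treats the ten beards as the whole population.
variance_n = sum_of_squares / n

# Dividing by n - 1 treats them as a sample from a larger population.
variance_n_minus_1 = sum_of_squares / (n - 1)

print("divide by n:    ", round(variance_n, 4), round(statistics.pvariance(ecologists), 4))
print("divide by n - 1:", round(variance_n_minus_1, 4), round(statistics.variance(ecologists), 4))
```

The n – 1 version is always slightly larger, and the gap shrinks as the sample gets bigger.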

0.34 / 9 = 0.0378

We now have a number (0.0378) representing the scatter or spread of our Ecologists’ beard lengths about the mean value. Statisticians call this number the variance of the data and it is used in numerous ways (see the pages on t-tests for example).

Remember we squared all our differences from the mean to get rid of negative values. To complete our calculation of standard deviation we must take the square root to convert the number back to its original units.

√0.0378 = 0.194 = standard deviation of Ecologists’ beard lengths in metres

The calculation we have just done can be represented by this formula:

$$\sigma_{n-1} = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}}$$

where $X$ is each individual piece of data, $\bar{X}$ is the mean and $n$ is the number of items of data.

Now, think back to our original beard length data. We needed a measure of the scatter of data points about the mean value. We now have it: the standard deviation. If we do the calculations for the Druids and the Eco-Warriors as well, we get the following values:

$\sigma_{n-1}$ (Ecologists) = 0.194

$\sigma_{n-1}$ (Druids) = 0.047

$\sigma_{n-1}$ (Eco-Warriors) = 0.221

Notice that the Druids (whose beards were all clustered around the mean) have a very small value. The Ecologists (whose beards were mostly clustered around the mean, but not to the same extent as the Druids) have a bigger value, and the Eco-Warriors, whose beard lengths were all over the place (i.e. widely scattered about the mean), have the biggest value of all.
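For readers who would rather check the three values by computer than by hand, the following Python sketch repeats the calculation for each group. The function name `sample_sd` and the dictionary layout are choices made for this example. Values are computed without intermediate rounding, so the last digit can differ slightly from a hand calculation that rounds the variance first.

```python
import math

beard_lengths = {
    "Ecologists":   [0.1, 0.9, 0.4, 0.5, 0.5, 0.5, 0.5, 0.6, 0.5, 0.5],
    "Druids":       [0.4, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.6],
    "Eco-Warriors": [0.1, 0.2, 0.5, 0.8, 0.5, 0.7, 0.4, 0.5, 0.6, 0.7],
}

def sample_sd(values):
    """Standard deviation using the n - 1 divisor."""
    n = len(values)
    mean = sum(values) / n
    sum_of_squares = sum((x - mean) ** 2 for x in values)
    variance = sum_of_squares / (n - 1)
    return math.sqrt(variance)

results = {group: sample_sd(values) for group, values in beard_lengths.items()}

for group, sd in results.items():
    print(f"{group}: {sd:.3f}")
# Ecologists: 0.194
# Druids: 0.047
# Eco-Warriors: 0.221
```

Python's built-in `statistics.stdev` uses the same n – 1 divisor and should agree with `sample_sd` on these data.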

Most scientific calculators will save you the tedium of the above calculations if you know how to operate their statistical functions. The instructions will tell you how to do this. Alternatively, you can use the formula below, which is a bit quicker (mathematically it’s the same thing):

$$\sigma_{n-1} = \sqrt{\frac{\sum X^2 - \frac{(\sum X)^2}{n}}{n - 1}}$$

Where:

$\sum X^2$ = each individual piece of data squared and then added up

$(\sum X)^2$ = each individual piece of data added up and then squared

$n$ = the number of items of data
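As a rough check that the quicker route gives the same answer, here is a Python sketch. It assumes the quicker formula is the usual computational form: ΣX² minus (ΣX)² / n, divided by n – 1, then square-rooted. The function name `sample_sd_shortcut` is invented for this example.

```python
import math

def sample_sd_shortcut(values):
    """Standard deviation (n - 1 divisor) from the two running totals."""
    n = len(values)
    sum_of_x_squared = sum(x ** 2 for x in values)  # each value squared, then added up
    square_of_sum = sum(values) ** 2                # values added up, then squared
    return math.sqrt((sum_of_x_squared - square_of_sum / n) / (n - 1))

druids = [0.4, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.6]
eco_warriors = [0.1, 0.2, 0.5, 0.8, 0.5, 0.7, 0.4, 0.5, 0.6, 0.7]

druid_sd = sample_sd_shortcut(druids)
eco_warrior_sd = sample_sd_shortcut(eco_warriors)

print(f"Druids: {druid_sd:.3f}")              # Druids: 0.047
print(f"Eco-Warriors: {eco_warrior_sd:.3f}")  # Eco-Warriors: 0.221
```

The shortcut only needs two running totals, which is why it suits a calculator. On a computer it can lose accuracy when the mean is very large compared with the spread, so the longer subtract-the-mean-first method is usually the safer choice there.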

Is our estimate of the mean any good? Read on: Standard Error

