Learning to handle Data

[We do it all the time!]

An Introduction...

We live in a world where facts and figures abound. In our daily lives we are continually assessing numerical situations and making rapid judgments. How many bread rolls should I buy? How much petrol do I need? How much money is in my bank account? We cannot escape the need to make many hundreds of small mental calculations each day.

However, are we always right? Is simply guessing the amount of petrol needed for a journey such a good idea? Surely it would be better to have some means of estimating the required quantity so that we don't run out before reaching the destination. For this to work, we need access to certain data: in this case, an accurate measure of the distance involved and an approximation of the number of miles per gallon (mpg) that the car will do.

The manufacturer of the car will have carried out numerous road tests, i.e. experiments under many driving conditions, to give an average figure for the car's petrol consumption.
You might say that the 'experiment' to establish the average mpg has been replicated many times, so the average mpg figure is based on a range of results that have been 'condensed' to yield just one figure: the average mpg. Because the figure has been arrived at in a logical and repeatable way, we have cause to have faith in its accuracy, even though we are putting our trust in others.
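As a rough sketch of how a range of results is 'condensed' into one figure, the short Python example below averages a handful of test runs. The mpg readings are invented purely for illustration and are not real test data.

```python
from statistics import mean

# Hypothetical mpg results from repeated road tests (invented for illustration).
test_runs_mpg = [33.0, 36.0, 35.0, 34.0, 37.0]

# Condense the range of results into one figure: the average mpg.
average_mpg = mean(test_runs_mpg)

print(f"Lowest result:  {min(test_runs_mpg)} mpg")
print(f"Highest result: {max(test_runs_mpg)} mpg")
print(f"Average of {len(test_runs_mpg)} runs: {average_mpg} mpg")
```

Note that the single average hides the spread of the individual results, which here run from 33 to 37 mpg.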

Another point to note is that our car is only a sample, and so were the few cars actually tested at the factory. The 'population' from which these samples were taken would be every single example of that model built to the same specification. This could be tens of thousands of vehicles, and it is clearly impossible to check the fuel consumption of each and every one of them.

Now we are beginning to think, "How typical is my car? How closely does it conform to the specifications of all the examples of this model?" We are going to have to make an inference about the car based on values derived from other cars and not the one we are using. So in order to make an inference about the amount of petrol to buy, we need to do a little mental arithmetic...

...140 miles ÷ 35 mpg = 4 gallons needed.
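The same arithmetic can be written as a small Python function. This is only a sketch; the function name gallons_needed is made up for this example.

```python
def gallons_needed(distance_miles: float, miles_per_gallon: float) -> float:
    """Fuel required is the distance divided by the fuel economy."""
    if miles_per_gallon <= 0:
        raise ValueError("miles_per_gallon must be positive")
    return distance_miles / miles_per_gallon

# The journey from the text: 140 miles in a car that does 35 mpg.
print(gallons_needed(140, 35))  # prints 4.0
```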

Whilst the calculation is precise, the conclusion may be inaccurate. We may be held up in traffic, the car engine may not be running at peak efficiency, or we may just have a heavy right foot! Our brains seem to have an amazing ability to adjust mathematical data with no apparent effort. The likelihood, therefore, is that we would buy 5 gallons of fuel "just to be on the safe side". In essence, what happened here was that our brain said, "I know that 4 gallons should be sufficient, but how confident can I be that the result is correct?"
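To see how much protection that extra gallon buys, the Python sketch below works out the safety margin and the lowest fuel economy at which 5 gallons would still cover the 140 miles. The variable names are chosen for this example only.

```python
distance_miles = 140
quoted_mpg = 35

calculated_gallons = distance_miles / quoted_mpg   # 4.0 gallons from the sum above
gallons_bought = 5                                 # "just to be on the safe side"

# Extra fuel as a percentage of the calculated requirement.
margin_percent = (gallons_bought - calculated_gallons) / calculated_gallons * 100

# The worst fuel economy at which the purchased fuel still covers the trip.
lowest_mpg_covered = distance_miles / gallons_bought

print(f"Safety margin: {margin_percent}%")                       # 25.0%
print(f"Journey still possible down to {lowest_mpg_covered} mpg")  # 28.0 mpg
```

In other words, buying 5 gallons allows the car to do as badly as 28 mpg, rather than the quoted 35 mpg, and still arrive.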

In general conversation we may use phrases such as:

"I am 90% sure the car will make it!"

However, in statistics such use of "90%" would have a far more precise meaning and would involve the use of some inferential test to establish whether or not we are justified in saying that something is 90% certain to happen.

In other words, we need to ask, "What is the mathematical likelihood that the results of my estimations are in fact the true state of affairs?" We could set an arbitrary 'cut-off' point at, say, 90%. This 'hypothesis' would say, "If I did this journey 10 times, I am confident that the car would complete the journey nine times; I think I would fail to get there one time in ten."

Probability theory is often difficult to grasp, but essentially we are looking at the chance of an event (outcome) taking place. We can measure this as a proportion on a scale of 0 to 1 or as a percentage from 0% to 100%. 0% says the event will not happen at all, whilst 100% says it is certain to happen; values close to 0% mean the event is very unlikely and values close to 100% mean it is very likely.
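The two scales are simply different ways of writing the same number. A minimal Python sketch, using the 'one time in ten' figure from the text and a helper name (to_percentage) invented for this example:

```python
from fractions import Fraction

def to_percentage(proportion: Fraction) -> float:
    """Convert a probability on the 0-to-1 scale to the 0%-to-100% scale."""
    if not 0 <= proportion <= 1:
        raise ValueError("a probability must lie between 0 and 1")
    return float(proportion * 100)

# 'One time in ten' the journey fails, so nine times in ten it succeeds.
p_fail = Fraction(1, 10)
p_success = 1 - p_fail

print(f"Chance of failing:   {float(p_fail)} or {to_percentage(p_fail)}%")
print(f"Chance of succeeding: {float(p_success)} or {to_percentage(p_success)}%")
```

Exact fractions are used here only to avoid floating-point rounding in the conversion; the two outcomes always add up to 1, or 100%.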

So what is the chance of that one in ten disaster journeys happening on this particular trip?

In reality what our brain tells us is that we don't want to run the risk that this journey will be that 'one in ten' and therefore we say... "I am not sufficiently happy with the odds to take the risk and therefore I will buy 5 gallons instead".



Cautionary note: All data and examples given in these pages relate to possible real situations but the data used has been constructed for illustrative purposes and should not be used to infer actual recorded measurements or policies.
