Eye-balling Meths & Stats



The starting point for any data analysis is seeing what the data look like: eye-balling.

There are quite sophisticated ways of eye-balling your data, mainly using Exploratory Data Analysis, but these require yet another level of statistical knowledge. Some people find that level easy; others find it very difficult.

The informal approach to eye-balling starts with frequency distributions. You should always generate frequency distributions for your data.

Under SPSS you should request all the available statistics, including skew and kurtosis.

The starting point for eye-balling is simply to inspect the type of distribution you have. You ask yourself the question: are there any odd-bods?

Technically, you are looking for outliers - cases which are extreme relative to the rest of the distribution. For example:

    Rating    Frequency
    1         20
    2         37
    3         28
    4         12
    5         8
    6         2
    7         7

This distribution could mean that there is a legitimate "lump" at the bottom end (the 7 cases with a rating of 7), but it could also mean that you have a few cases which do not fit too well.
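A frequency distribution of this kind can also be produced outside SPSS. The following is a minimal sketch in Python using only the standard library; the raw ratings are rebuilt from the example table above, and the names `table`, `ratings` and `frequency_distribution` are illustrative rather than part of any package.

```python
from collections import Counter

# Ratings rebuilt from the example frequency table above
# (rating -> number of cases). Names here are illustrative.
table = {1: 20, 2: 37, 3: 28, 4: 12, 5: 8, 6: 2, 7: 7}
ratings = [r for r, n in table.items() for _ in range(n)]


def frequency_distribution(values):
    """Return (value, count, percent) rows, sorted by value."""
    counts = Counter(values)
    total = len(values)
    return [(v, counts[v], 100.0 * counts[v] / total) for v in sorted(counts)]


# Print the table with a crude bar of asterisks, one per case,
# so that lumps and gaps are visible at a glance.
for value, count, pct in frequency_distribution(ratings):
    print(f"{value:>6} {count:>9} {pct:6.1f}%  {'*' * count}")
```

The row of asterisks is only a rough text histogram, but it is usually enough to spot a lump at one end.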

Another example could be:

    Age       Frequency
    18-25     100
    25-30     57
    31-35     22
    36-40     8
    41-45     0
    46-50     0
    51-60     3

The 3 cases in the 51-60 category are well outside the range for the rest of the sample. You would have to ask yourself: even if they belong to the sample, might they distort the result?

To check out possible outliers, you usually have to go back to your raw data and look at individual cases. Alternatively, you can sort your data and look at the potential outliers in isolation from the rest.
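Sorting and isolating the extreme cases might look something like the sketch below. The case numbers and ages are invented purely for illustration, and the 1.5 x interquartile range fence is a common rule of thumb from Exploratory Data Analysis, not something this page prescribes. It only suggests which cases to go and look at; it does not decide whether they belong in the sample.

```python
import statistics

# Hypothetical raw data: (case number, age). Invented for illustration only.
cases = [(1, 19), (2, 22), (3, 58), (4, 21), (5, 24), (6, 23),
         (7, 21), (8, 54), (9, 28), (10, 25), (11, 20), (12, 26)]


def extreme_cases(rows, k=3):
    """Sort rows by value and return the k lowest and the k highest."""
    ordered = sorted(rows, key=lambda row: row[1])
    return ordered[:k], ordered[-k:]


def outside_fences(rows, width=1.5):
    """Return rows lying more than width * IQR beyond the quartiles."""
    values = [value for _, value in rows]
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - width * iqr, q3 + width * iqr
    return [row for row in rows if row[1] < low or row[1] > high]


lowest, highest = extreme_cases(cases)
print("lowest: ", lowest)
print("highest:", highest)
print("flagged:", outside_fences(cases))
```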

Basic descriptives

For those data elements which are suitable for parametric analysis, you should ask your stats package to print out the descriptives. SPSS can give you the mean, standard deviation, range and number of valid cases.

What you are looking for here is:

    dirty data

    unexpected results

    the number of missing cases you have for each item in the database.
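The three checks above can be sketched for a single item as follows. The scores are hypothetical: the item is assumed to be a 1-7 rating, `None` stands for a missing response, and 99 is the sort of value a keying error produces. The function name `describe` and its `low` and `high` arguments are illustrative only.

```python
import statistics

# Hypothetical scores for one item on an assumed 1-7 scale.
# None marks a missing response; 99 is a deliberately dirty value.
scores = [4, 5, None, 3, 6, 99, 2, None, 5, 4]


def describe(values, low=None, high=None):
    """Basic descriptives plus counts of missing and out-of-range cases."""
    valid = [v for v in values if v is not None]
    summary = {
        "valid": len(valid),
        "missing": len(values) - len(valid),
        "mean": statistics.mean(valid),
        "sd": statistics.stdev(valid),  # sample standard deviation
        "min": min(valid),
        "max": max(valid),
    }
    summary["range"] = summary["max"] - summary["min"]
    # Values outside the permitted scale are candidates for dirty data.
    summary["out_of_range"] = [
        v for v in valid
        if (low is not None and v < low) or (high is not None and v > high)
    ]
    return summary


for name, value in describe(scores, low=1, high=7).items():
    print(f"{name:>12}: {value}")
```

A maximum far above the top of the scale, or a mean that makes no sense for the item, is the kind of unexpected result that should send you back to the raw data.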

The shape of the distribution

The next thing to look at is the shape of the distribution:

    How flat is it - kurtosis

    How much is it off centre - skew

If your distribution (for parametric data) is too distorted, you have to ask whether you can use parametric stats. You may have to resort to the non-parametric equivalents.
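Skew and kurtosis can be computed directly from the moments of the distribution. The sketch below uses the simple moment-based definitions on the ratings from the first frequency table in this section; stats packages often apply a small-sample correction, so their figures may differ slightly from these. A positive skew means the tail stretches towards the high values. Kurtosis is reported here as excess kurtosis, which is 0 for a normal distribution and negative for a flatter one. The function names are illustrative.

```python
# Ratings rebuilt from the first frequency table in this section
# (rating -> number of cases).
table = {1: 20, 2: 37, 3: 28, 4: 12, 5: 8, 6: 2, 7: 7}
ratings = [r for r, n in table.items() for _ in range(n)]


def central_moment(values, k):
    """k-th moment about the mean, dividing by n (no bias correction)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** k for v in values) / len(values)


def skewness(values):
    """Moment-based skew: third moment over the 1.5 power of the variance."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def excess_kurtosis(values):
    """Moment-based kurtosis minus 3, so a normal distribution scores 0."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3.0


print(f"skew     = {skewness(ratings):.2f}")
print(f"kurtosis = {excess_kurtosis(ratings):.2f}")
```

How large a value counts as "too distorted" is a judgment call; these figures are best read alongside the frequency distribution rather than in place of it.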

AND....

When you have finished eye-balling your data, you will have some basic ideas about how coherent your database is.

You will then be able to say that there is nothing odd about the data, because you have sorted out any problems which have arisen. This means that, whatever results you get, you should be able to interpret them without worrying that odd data might have distorted the outcomes.