Data Entry


Data entry is one of those topics that gets little attention in research.

It is assumed that it "comes naturally". But there are essential points to consider, for instance:

    Who is going to enter the data?

    Which application is going to be used for data entry?

    What application are you going to use to analyse the data?

    How much time do you have to decode data?

These questions are interlinked, so the material below deals with them under the headings that follow.
Ways to go about it

    A common way to do data entry is via a spread sheet. Maybe this is a consequence of spread sheets including basic statistical analyses.

    If you have any real quantity of data, a spread sheet is very slow.

      Can you imagine how long it would take to enter a questionnaire with 40 items for, say, 250 subjects using a spread sheet?

    The most efficient way is to use a word processor in text mode.

    On my current Celeron I have installed WordStar 4 (1983 vintage) because it is still the best data entry program I have come across - outside those used by professional data entry personnel.

    WS4 in N-file mode gives you all the things you need -

    • column and row specification

    • the ability to go into column mode

    • unlimited line length

    Column and row specification is a part of most word processors.

    I have not found another word processor which uses column-mode where you can isolate a column, or group of columns, and delete them or move them to another position.
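
    For readers without a column-mode editor, the same operations can be imitated on a plain text file. The following is only a minimal sketch in Python, not a description of how WordStar works: the function names and the sample records are hypothetical, and it assumes every line is at least as long as the columns being touched.

```python
def delete_columns(lines, start, end):
    """Remove character columns start..end (1-based, inclusive) from every line."""
    return [line[:start - 1] + line[end:] for line in lines]


def move_columns(lines, start, end, dest):
    """Cut columns start..end (1-based, inclusive) from every line and
    reinsert the block so that it begins at column dest of the shortened line."""
    moved = []
    for line in lines:
        block = line[start - 1:end]
        rest = line[:start - 1] + line[end:]
        moved.append(rest[:dest - 1] + block + rest[dest - 1:])
    return moved


# Hypothetical records: columns 1-3 subject ID, column 4 a group code,
# columns 5-8 four single-digit item responses.
records = ["00113421", "00224312", "00311234"]

without_group = delete_columns(records, 4, 4)   # drop the group code
group_at_end = move_columns(records, 4, 4, 8)   # move the group code to the end
```

    Because each record is handled as a string of characters, the column numbers here are the same ones you would read off the editor's column indicator.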

    You can switch to landscape mode in a word processor and get about 140 columns of data, but if you have, for example, 50 items each entered over three columns, you need 150 columns and you have a problem.

    The value of unlimited line length is that you do not have word-wrap getting in the way of being able to do visual inspections of the data.

Preparing your data

    You make life easier for yourself if you spend time, before doing data entry, deciding what you want your data to look like.

    • Will it be read in as formatted lines or will you be using a column delimiter? - assuming you are not using a spread sheet.

    • How will you handle missing data? - any data entry method

    • How will you handle your Don't Know, Not Applicable or Other responses? - ditto

    • What will you do about non-numeric categories? - ditto

    • How will you deal with extended text data? - ditto

    Column layout

      Statistical packages give you a wide range of possible methods for reading in a data file, including reading directly from spread sheets and data bases. If your data is to be read from a text file, the stats package should be able to handle:

      • formatted input using Fortran-type formatting

      • fixed column - i.e. all data is entered within a prescribed column width

      • delimited text - using a space, comma, tab etc.
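
      To make the difference between fixed-column and delimited input concrete, here is a minimal sketch in Python using only the standard library. The variable names, the column layout and the sample records are all assumptions made for illustration; a statistics package would have its own way of declaring the same layout. In Fortran-type formatting this particular layout would be written roughly as (A3,I1,I2,2I1).

```python
import csv
import io

# Hypothetical layout: (variable name, first column, last column),
# with columns numbered from 1 and both ends inclusive.
LAYOUT = [("id", 1, 3), ("sex", 4, 4), ("age", 5, 6),
          ("item1", 7, 7), ("item2", 8, 8)]


def parse_fixed(line, layout=LAYOUT):
    """Slice one fixed-column record into a dict of strings."""
    return {name: line[first - 1:last].strip() for name, first, last in layout}


def parse_delimited(text, delimiter=","):
    """Read delimited text whose first line holds the variable names."""
    return [dict(row) for row in csv.DictReader(io.StringIO(text), delimiter=delimiter)]


# The same hypothetical subject entered both ways.
fixed_row = parse_fixed("00123442")
delimited_rows = parse_delimited("id,sex,age,item1,item2\n001,2,34,4,2\n")
```

      Both calls give the same record. The fixed-column version needs fewer keystrokes per subject, but it depends entirely on every value sitting in its prescribed columns; the delimited version tolerates values of varying width at the cost of typing a delimiter after each one.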


To be completed