Organizing Data
Graphical representation of the available data is a very important
step of statistical analysis. Here we discuss the organization of
data. The word 'data' is the plural of 'datum'; a datum is a fact.
Statistically, the term is used for numerical facts such as measures
of height, weight and scores on achievement and intelligence tests.
Tests, experiments and surveys in education and psychology provide us with valuable data, mostly in the form of numerical scores. To understand the available data and derive meaning and useful conclusions from them, the data have to be organized or arranged in some systematic way. This can be done in the following ways:
1. statistical tables
2. rank order
3. frequency distribution
Statistical tables
The data are tabulated, that is, arranged into rows and columns under different headings. Such tables can list the original raw scores as well as percentages, means, standard deviations and so on. Example:
Table for group mean and S.D. of anxiety test of dancers and non-dancers
| Group | Mean | Standard deviation | N |
| Dancers | 22.66 | 6.018 | 15 |
| Non-dancers | 27.66 | 8.741 | 15 |
Rules for constructing tables:
1. The title of the table should be simple, concise and unambiguous. As a rule, it should appear at the top of the table.
2. The table should be suitably divided into columns and rows according to the nature of data and purpose. These columns and rows should be arranged in a logical order to facilitate comparison.
3. The heading of each column or row should be as brief as possible. Two or more columns or rows with similar headings may be grouped under a common heading to avoid repetition, with subheadings or captions where needed.
4. A subtotal for each separate classification and a general total for all combined classes are to be given. These totals should be given at the bottom or to the right of the items concerned.
5. The units in which the data are given must invariably be mentioned.
6. Necessary footnotes, providing essential explanation of any points in the tabulated data that are open to ambiguous interpretation, must be given at the bottom of the table.
7. The sources from which the data have been obtained should be given at the end of the table.
8. In tabulating long columns of figures, space should be left after every five or ten rows.
9. If the numbers tabulated have more than three significant figures, the digits should be grouped in threes. For example, 4394756 is written as 4 394 756.
10. Above all, the table should be as simple as possible, so that readers can study it with the minimum possible strain and form a clear picture and interpretation of the data.
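The digit grouping described in the rules above can be sketched in Python. This is only an illustration: the function name group_in_threes is a hypothetical helper, not part of any library, and it assumes whole numbers.

```python
def group_in_threes(n: int) -> str:
    """Return n with its digits grouped in threes, separated by spaces."""
    # The "," format specifier groups digits in threes with commas;
    # the commas are then swapped for spaces.
    return f"{n:,}".replace(",", " ")


print(group_in_threes(4394756))  # 4 394 756
```

Numbers with three or fewer digits are returned unchanged.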
Rank order
The original raw scores can be arranged in an ascending or a descending series, exhibiting an order with respect to the rank or merit position of the individual. Example:
Sixteen students of the BA final psychology class obtained the following scores on an achievement test:
5 8 4 12 15 17 18 12 20 7 8 19 6 9 10 11
Arranged in descending order, the data are tabulated as follows:
S. No.  Scores   S. No.  Scores   S. No.  Scores   S. No.  Scores
1       20       5       15       9       10       13      7
2       19       6       12       10      9        14      6
3       18       7       12       11      8        15      5
4       17       8       11       12      8        16      4
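The rank-order arrangement of these sixteen scores can be reproduced with a short Python sketch. The function name rank_order is a hypothetical helper chosen for illustration. Tied scores simply receive consecutive serial numbers, as in the table above.

```python
# The sixteen achievement-test scores from the example.
scores = [5, 8, 4, 12, 15, 17, 18, 12, 20, 7, 8, 19, 6, 9, 10, 11]


def rank_order(raw_scores, descending=True):
    """Return (serial number, score) pairs with the scores sorted."""
    ordered = sorted(raw_scores, reverse=descending)
    # Serial numbers start at 1 and follow the sorted position.
    return list(enumerate(ordered, start=1))


ranked = rank_order(scores)
for serial, score in ranked:
    print(f"{serial:>2}  {score:>2}")
```

Passing descending=False gives the ascending series instead.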
Frequency Distribution
Arranging the data in rank order does not summarize a series of raw scores, nor does it tell us the frequency of each raw score. In a frequency distribution we group the data into arbitrarily chosen groups or classes and count how many times a particular score or group of scores occurs in the given data. This is known as the frequency distribution of numerical data.
Construction of Frequency distribution table
Finding the range:
First of all, the range of the series to be grouped is found. This is done by subtracting the lowest score from the highest. In the present problem the range of the distribution is 46 - 12, i.e. 34.
Determining class interval:
After finding the range, we find the class interval, represented by 'i'. The formula for i is
i = Range / number of class intervals desired
i = 34/8
i = 4.25
Rounding up to a convenient whole number, we take the class interval to be 5.
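The range and class-interval arithmetic above can be checked with a small Python sketch. The function name class_interval and the round_to parameter are assumptions made for illustration. Rounding up to the next multiple of 5 is one possible convention; the text only states that 5 was chosen.

```python
import math


def class_interval(highest, lowest, desired_classes, round_to=5):
    """Return (range, raw interval, rounded interval) for a set of scores."""
    score_range = highest - lowest
    raw_i = score_range / desired_classes
    # Assumed convention: round up to the next multiple of round_to.
    i = math.ceil(raw_i / round_to) * round_to
    return score_range, raw_i, i


score_range, raw_i, i = class_interval(highest=46, lowest=12, desired_classes=8)
print(score_range, raw_i, i)  # 34 4.25 5
```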
Writing the contents of the frequency distribution table:
Writing the classes of the distribution.
In the first column we write the classes of the distribution. The lowest class is settled first, and the subsequent classes are then written down. In this case we take 10-14 as the lowest class; the higher classes are then 15-19, 20-24, and so on up to 45-49.
Tallying the scores into proper classes.
The given scores are tallied into the proper classes in the second column; the tallies are then counted against each class to obtain the frequency of the class. Example:
| Class interval | Tallies | Frequency for non-dancers |
| 45-49 | l | 1 |
| 40-44 | 0 | 0 |
| 35-39 | ll | 2 |
| 30-34 | lll | 3 |
| 25-29 | llll | 4 |
| 20-24 | ll | 2 |
| 15-19 | ll | 2 |
| 10-14 | l | 1 |
Total frequency (N) = 15
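The tallying procedure can be sketched in Python. Because the raw scores of the non-dancers are not listed here, this sketch uses the sixteen achievement-test scores from the rank-order example instead. The function name frequency_distribution, the lowest class limit of 0 and the class interval of 5 are assumptions made for illustration, and the scores are assumed to be whole numbers not below the lowest limit.

```python
from collections import Counter


def frequency_distribution(scores, lowest_limit, i):
    """Return (class label, tallies, frequency) rows, highest class first."""
    # Index of the class each score falls in, counted from the lowest class.
    counts = Counter((s - lowest_limit) // i for s in scores)
    top = (max(scores) - lowest_limit) // i
    rows = []
    for k in range(top, -1, -1):
        lower = lowest_limit + k * i
        upper = lower + i - 1
        f = counts.get(k, 0)  # classes with no scores get frequency 0
        rows.append((f"{lower}-{upper}", "l" * f, f))
    return rows


scores = [5, 8, 4, 12, 15, 17, 18, 12, 20, 7, 8, 19, 6, 9, 10, 11]
table = frequency_distribution(scores, lowest_limit=0, i=5)
for label, tallies, f in table:
    print(f"{label:<6} {tallies:<7} {f}")
print("Total frequency (N) =", sum(f for _, _, f in table))
```

The total frequency should always equal the number of scores tallied, which is a quick check that no score was missed or counted twice.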