NOMINAL DATA: Summarizing Univariate Distributions
A. Central tendency: mode = the category with the largest frequency
Example: Rape: Relationship of offender to the victim, Kansas, 1988
| RELATIONSHIP | f |
| ACQUAINTANCE | 219 |
| STRANGER | 213 |
| FRIEND | 47 |
| HUSBAND/COMMON-LAW HUSBAND | 32 |
| EX-BOYFRIEND | 31 |
| BOYFRIEND | 10 |
| EX-HUSBAND | 15 |
| FATHER/STEP FATHER | 20 |
| BROTHER | 8 |
| GRANDFATHER | 2 |
| UNCLE | 1 |
| OTHER FAMILY/IN-LAW | 14 |
| UNKNOWN | 141 |
| TOTAL | 782 |
SOURCE: Caputi, J. & D. E. H. Russell, "Femicide: Speaking the Unspeakable," Ms., Sept./Oct. 1990
Q: What is the average or typical relationship of a rapist to the victim?
A: First decide what type of variable this is: it is nominal data. Then look at the question: it asks for central tendency, which for nominal data is the mode. Mode = acquaintance (f = 219).
*If two categories tie for the largest frequency, there are 2 modes. If all categories have the same frequency, there is no central tendency.
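The mode rule, including the tie and no-mode cases in the note above, can be sketched in Python. The function name `modes` is hypothetical and the frequencies are copied from the table; this is an illustration under those assumptions, not part of the original notes.

```python
def modes(freq):
    """Return the modal categories of a frequency table.

    freq maps category -> frequency. An empty list means there is no
    central tendency (every category has the same frequency).
    """
    counts = list(freq.values())
    if not counts:
        return []
    # With two or more categories all tied, no category stands out.
    if len(counts) > 1 and len(set(counts)) == 1:
        return []
    top = max(counts)
    return [category for category, f in freq.items() if f == top]


relationship = {
    "ACQUAINTANCE": 219, "STRANGER": 213, "FRIEND": 47,
    "HUSBAND/COMMON-LAW HUSBAND": 32, "EX-BOYFRIEND": 31, "BOYFRIEND": 10,
    "EX-HUSBAND": 15, "FATHER/STEP FATHER": 20, "BROTHER": 8,
    "GRANDFATHER": 2, "UNCLE": 1, "OTHER FAMILY/IN-LAW": 14, "UNKNOWN": 141,
}

print(modes(relationship))               # ['ACQUAINTANCE']
print(modes({"a": 5, "b": 5, "c": 2}))   # two modes: ['a', 'b']
print(modes({"a": 4, "b": 4}))           # no mode: []
```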
ORDINAL DATA: Summarizing Univariate Distributions
A. Median: Central tendency
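1. Put the variable in order.
2. The case number of the median is:
a) If N is an odd number, (N + 1) / 2.
b) If N is an even number, the median falls between cases N / 2 and (N + 2) / 2.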
Example: Nudity Tolerance Score: (ungrouped data)
| | La Jolla | San Diego | |
| | 91 | 98 | |
| | 90 | 98 | |
| | 87 | 83 | |
| | 86 | 83 | |
| N = 12 | 85 | 83 | N = 11 |
| median -> | 83 | 82 | <- median |
| median -> | 81 | 82 | |
| | 78 | 71 | |
| | 77 | 42 | |
| | 76 | 31 | |
| | 76 | 30 | |
| | 73 | | |
Q: What is the average score in La Jolla? In San Diego?
A: In La Jolla, N = 12 is even, so the median falls between cases 6 and 7: the median is both 83 and 81. In San Diego, N = 11 is odd, so the median is case 6: 82.
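As a minimal sketch of the ungrouped procedure, the following Python picks case (N + 1) / 2 when N is odd and cases N / 2 and N / 2 + 1 when N is even. The function name `median_cases` is hypothetical; the scores are those in the table above.

```python
def median_cases(scores):
    """Return the middle score(s) of an ungrouped ordinal variable.

    Odd N: one score, case number (N + 1) / 2.
    Even N: two scores, case numbers N / 2 and N / 2 + 1.
    """
    if not scores:
        raise ValueError("need at least one score")
    ordered = sorted(scores, reverse=True)   # step 1: put the variable in order
    n = len(ordered)
    if n % 2 == 1:
        return [ordered[(n + 1) // 2 - 1]]   # case numbers start at 1
    return [ordered[n // 2 - 1], ordered[n // 2]]


la_jolla = [91, 90, 87, 86, 85, 83, 81, 78, 77, 76, 76, 73]
san_diego = [98, 98, 83, 83, 83, 82, 82, 71, 42, 31, 30]

print(median_cases(la_jolla))    # [83, 81]
print(median_cases(san_diego))   # [82]
```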
Example: Past Soc201 final grades (grouped data, ordinal w/odd number of cases)
| Grade | f | Cum f | Cases |
| A | 16 | 16 | 1-16 |
| B | 18 | 34 | 17-34 |
| C | 21 | 55 | 35-55 |
| D | 23 | 78 | 56-78 |
| F | 13 | 91 | 79-91 |
| N = | 91 |
Q: What is the average grade for Soc201 classes?
A: The average grade for Soc201 is "C." Since N is odd, the median case is
(N + 1) / 2 = (91 + 1) / 2 = 46. The median is the
ordinal category that contains case #46 (C, cases 35-55), not the number 46!
Example: Persons who have known someone with AIDS (grouped data, ordinal w/even number of cases).
| SES (socio-economic status) | f | cum f | cases |
| low | 190 | 190 | 1-190 |
| middle | 221 | 411 | 191-411 |
| high | 33 | 444 | 412-444 |
| N = | 444 |
SOURCE: GSS73-91 SURVEY SUBSET
Q: What is the average SES of persons who have known someone with AIDS?
A: The median SES is "middle."
444 / 2 = 222 and (444 + 2) / 2 = 223, so the median falls between
cases 222 and 223, both of which lie in the "middle" category (cases 191-411).
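The two grouped examples above follow the same steps: find the middle case number(s), then find the category whose cumulative frequency first reaches them. A rough Python sketch, with a hypothetical helper name `median_category`, might look like this:

```python
from itertools import accumulate


def median_category(categories, freqs):
    """Return the ordinal category (or categories) holding the middle case(s)."""
    n = sum(freqs)
    if n <= 0:
        raise ValueError("frequencies must sum to a positive N")
    # Middle case numbers: one for odd N, two for even N.
    middle = [(n + 1) // 2] if n % 2 == 1 else [n // 2, n // 2 + 1]
    cum_f = list(accumulate(freqs))
    found = []
    for case in middle:
        # First category whose cumulative frequency reaches this case number.
        for category, cf in zip(categories, cum_f):
            if case <= cf:
                if category not in found:
                    found.append(category)
                break
    return found


# Soc201 grades, N = 91 (odd): case 46
print(median_category(["A", "B", "C", "D", "F"], [16, 18, 21, 23, 13]))  # ['C']
# SES, N = 444 (even): cases 222 and 223
print(median_category(["low", "middle", "high"], [190, 221, 33]))        # ['middle']
```

If the two middle cases of an even-N table fell in different categories, the function would return both, mirroring the "both 83 and 81" answer for ungrouped data.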
INTERVAL DATA OR CONTINUOUS ORDINAL DATA: Summarizing Univariate Distributions
A. Median: Central tendency
In some instances, such as income or housing costs,
a small minority of cases skews the mean (i.e., inflates
or deflates it). Additionally, some ordinal data are rendered in
numerical form (e.g., IQ scores, GPAs). In these cases,
the median is used. The following formula gives a
precise numerical answer. It is ordinarily used for
grouped data with intervals wider than 1.
$\text{Mdn} = \text{LTL} + i \left[ \dfrac{N(.50) - cf_{\text{below}}}{f} \right]$
where $cf_{\text{below}}$ = cumulative frequency below the interval containing the median
f = the number of cases in the interval containing the median
i = width of the interval
N = total number in the sample
LTL = lower true limit of the interval containing the median
Example: Ages (at last birthday)
| Ages | | f | Cum f |
| 18-19 | 19.00 | 23 | 23 |
| 20-21 | 21.00 | 44 | 67 |
| 22-23 | 23.00 | 44 | 111 |
| 24-25 | 25.00 | 52 | 163 |
| 26-27 | 27.00 | 50 | 213 |
| 28-29 | 29.00 | 79 | 292 |
| 30-31 | 31.00 | 74 | 366 |
| 32-33 | 33.00 | 66 | 432 |
| 34-35 | 35.00 | 72 | 504 |
| 36-37 | 37.00 | 83 | 587 |
| 38-39 | 39.00 | 85 | 672 |
| 40-41 | 41.00 | 79 | 751 |
| 42-43 | 43.00 | 72 | 823 |
| 44-45 | 45.00 | 62 | 885 |
| N = | | 885 | |
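The notes give no worked answer for this table, so here is a hedged sketch of how the formula could be applied to it. It assumes the usual half-unit convention for true limits (so the interval 34-35 has LTL = 33.5); the function name `interpolated_median` and the `true_limit_offset` parameter are hypothetical.

```python
def interpolated_median(lower_limits, freqs, width, true_limit_offset=0.5):
    """Mdn = LTL + i * (0.50 * N - cf_below) / f for grouped interval data.

    lower_limits: stated lower limit of each interval, in ascending order.
    true_limit_offset: how far the lower true limit sits below the stated
    limit (0.5 is the half-unit convention, assumed here).
    """
    n = sum(freqs)
    if n <= 0:
        raise ValueError("frequencies must sum to a positive N")
    target = 0.50 * n
    cf_below = 0
    for lower, f in zip(lower_limits, freqs):
        # The median interval is the first one whose cumulative
        # frequency reaches the halfway point.
        if f > 0 and cf_below + f >= target:
            ltl = lower - true_limit_offset
            return ltl + width * (target - cf_below) / f
        cf_below += f
    raise ValueError("lower_limits and freqs do not cover the halfway point")


lower_limits = list(range(18, 45, 2))   # 18, 20, ..., 44
freqs = [23, 44, 44, 52, 50, 79, 74, 66, 72, 83, 85, 79, 72, 62]

print(round(interpolated_median(lower_limits, freqs, width=2), 2))  # 33.79
```

Here N(.50) = 442.5 falls in the 34-35 interval (cf below = 432, f = 72), so Mdn = 33.5 + 2(442.5 - 432) / 72, or about 33.79. If age at last birthday is instead treated as running from 34.0 up to 36.0, passing `true_limit_offset=0` shifts the result up by half a year.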
B. Mean: Central tendency
(abbreviated $\bar{x}$, called x-bar)
1. ungrouped data:
$\bar{x} = \sum x / N$
Where: x = variable
N = total in sample
Example:
| | weight |
| | 80 |
| N = 3 | 90 |
| | 60 |
| $\sum x$ = | 230 |
Q: What is the average weight? There are 3 meanings of "average": for nominal data, the mode; for ordinal data, the median; for interval data, the mean (unless the mean is disproportionately skewed by the upper or lower end of the range, in which case the median is used).
A: $\bar{x}$ = 230 / 3 = 76.67
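The calculation, and the caveat about skew, can be checked with Python's standard `statistics` module. The first list is the weight example above; the second list is hypothetical data, added only to show how a single extreme value moves the mean much more than the median.

```python
from statistics import mean, median

weights = [80, 90, 60]            # from the example above
print(round(mean(weights), 2))    # 76.67
print(median(weights))            # 80

# Hypothetical skewed data (not from the notes): one extreme value added.
skewed = [80, 90, 60, 400]
print(mean(skewed))               # 157.5
print(median(skewed))             # 85.0
```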
2. grouped data:
$\bar{x} = \sum fm / N$
Where: f = frequency of a category
m = midpoint of a category
N = total number of cases in sample
Example:
| # of children | f | m | fm |
| 1 - 3 | 400 | 2 | 800 |
| 4 - 6 | 300 | 5 | 1500 |
| 7 - 9 | 200 | 8 | 1600 |
| 10 - 12 | 100 | 11 | 1100 |
| | N = 1000 | | $\sum fm$ = 5000 |
Q: What is the average number of children?
A: $\bar{x}$ = 5000 / 1000 = 5 children
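A short sketch of the grouped-data mean follows: each interval is replaced by its midpoint m, weighted by its frequency f, and the total is divided by N. The function name `grouped_mean` is hypothetical, and the last interval is taken as 10-12 so that its midpoint is 11, matching the m column of the table (an assumption about the source).

```python
def grouped_mean(intervals, freqs):
    """Mean of grouped data: sum of f * m divided by N (m = interval midpoint)."""
    n = sum(freqs)
    if n <= 0:
        raise ValueError("frequencies must sum to a positive N")
    total_fm = sum(f * (low + high) / 2 for (low, high), f in zip(intervals, freqs))
    return total_fm / n


# Last interval assumed to be 10-12, giving the midpoint 11 shown in the table.
intervals = [(1, 3), (4, 6), (7, 9), (10, 12)]
freqs = [400, 300, 200, 100]

print(grouped_mean(intervals, freqs))   # 5.0
```

Because every case in an interval is treated as if it sat at the midpoint, the result is an approximation of the mean of the underlying ungrouped values.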