AN EXERCISE IN STATISTICS
LEARNING OBJECTIVES
 Determine the variance and standard deviation for a set of quantitative data
 Use the t-test to determine the probability that two sets of data belong to the same
population
BACKGROUND
 Definitions
 The t-test
The t-test is a valid statistical technique for random samples of continuous variables
from normally distributed populations. It estimates the probability that the
null hypothesis concerning the means of two small samples is correct; that is, it
indicates how probable it is that the two samples represent a single population rather than
different populations. We will use the following formula to determine the value
of t:
(3)
where x_{1} = mean of sample 1, x_{2} = mean of sample 2, n_{1} = the number in sample 1,
n_{2} = the number in sample 2; s_{1}^{2} = the variance of sample 1, and s_{2}^{2} = the variance of sample 2.
If the sample sizes are equal, then n_{1} = n_{2} = n, which allows formula (3), above,
to be simplified as:
t = \frac{x_{1} - x_{2}}{\sqrt{(s_{1}^{2} + s_{2}^{2}) / n}}    (4)
Note that t can be either a positive or a negative value. The sign is not important, since the curve
for t is symmetrical and we are concerned only with the magnitude of the variation
from the mean. Thus, we use only the absolute value of t.
Having determined the value of t, we must refer to the table on the Distribution of
t Probability (Table 1). In the table, you will note a series of p values
along the top, and a listing of the degrees of freedom (d.f.) along the left margin.
The degrees of freedom is the number of values that are free to vary in a
given sample. For a single sample, it is equal to one less than
the number of values in that sample. In using the t-test, it will be two less than
the total number in both samples.
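Reading the table can be sketched in Python as follows. This is only an illustration: the names `CRITICAL_T`, `degrees_of_freedom`, and `smallest_p_reached` are hypothetical, and only a few rows of Table 1 are copied in, so the lookup works only for those degrees of freedom.

```python
# A few rows copied from Table 1 (not the whole table).
# Each row lists the critical t values for p = 0.10, 0.05, 0.01, 0.001.
P_LEVELS = (0.10, 0.05, 0.01, 0.001)
CRITICAL_T = {
    8: (1.860, 2.306, 3.355, 5.041),
    10: (1.812, 2.228, 3.169, 4.587),
    18: (1.734, 2.101, 2.878, 3.922),
    28: (1.701, 2.048, 2.763, 3.674),
}


def degrees_of_freedom(n1, n2):
    """Two less than the total number of values in both samples."""
    return n1 + n2 - 2


def smallest_p_reached(t, df):
    """Return the smallest tabulated p whose critical value |t| meets or exceeds.

    Returns None when |t| falls below the p = 0.10 column.
    """
    reached = None
    for p, critical in zip(P_LEVELS, CRITICAL_T[df]):
        if abs(t) >= critical:
            reached = p  # columns run from the largest p to the smallest
    return reached


df = degrees_of_freedom(10, 10)
print(df, smallest_p_reached(-2.50, df))
```

With two samples of 10 values each, the degrees of freedom are 18. A t of -2.50 is treated as 2.50, which passes the 0.05 column (2.101) but not the 0.01 column (2.878), so the probability is at most 0.05 but greater than 0.01.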
Criteria for using the t-test:
 Samples must be chosen randomly.
 Samples must have the characteristics of a "normal" distribution.
 Measurements must be of continuous variables.
Procedure for the t-test
 Arrange data in a table with the following headings:
 specimen label (number, letter, name, etc.)
 measurement of specimen (height, weight, etc.)  as many as necessary
 deviations from the mean for each measurement
 squares of deviations from the mean for each measurement
 Determine the mean value (xm) for each measurement heading
 State the null hypothesis (H0)
 Sum the deviations from the mean
 Sum the squares of the deviations from the mean
 Compute the variances (formula #1)
 Determine which formula (#3 or #4) for t is applicable and use it to calculate
the value of t
 Determine the number of degrees of freedom (df)
 Refer to Table 1 and determine the level of probability that your populations
differ from each other by chance
 Accept or reject the null hypothesis based on the probability (p); H0 is usually rejected
if p < 0.05
 State a conclusion in terms of the results of the experiment.
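The steps above can be sketched in Python for two samples of equal size. This is a minimal illustration under stated assumptions: the lung volumes are made-up numbers, the variance is taken to be the sum of squared deviations divided by n - 1, and t is computed with the equal-sample-size form, (mean 1 - mean 2) divided by the square root of (variance 1 + variance 2) / n. The function names are hypothetical.

```python
import math

# Hypothetical lung volumes (L) for two equal-sized samples.
sample_1 = [5.1, 4.8, 5.4, 5.0, 5.2]
sample_2 = [4.2, 4.6, 4.4, 4.1, 4.7]


def mean(values):
    return sum(values) / len(values)


def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1 (assumed)."""
    m = mean(values)
    squared_deviations = [(x - m) ** 2 for x in values]
    return sum(squared_deviations) / (len(values) - 1)


def t_equal_n(a, b):
    """t for two samples of the same size n (equal-sample-size form only)."""
    if len(a) != len(b):
        raise ValueError("this sketch only covers equal sample sizes")
    n = len(a)
    pooled = (sample_variance(a) + sample_variance(b)) / n
    return (mean(a) - mean(b)) / math.sqrt(pooled)


t = t_equal_n(sample_1, sample_2)
df = len(sample_1) + len(sample_2) - 2  # two less than the total number
critical_05 = 2.306                     # Table 1, df = 8, p = 0.05
reject_h0 = abs(t) > critical_05
print(f"t = {t:.2f}, df = {df}, reject H0: {reject_h0}")
```

For these made-up data the means are 5.1 and 4.4, the variances are 0.05 and 0.065, and t comes out near 4.6 with 8 degrees of freedom. That exceeds the p = 0.05 value of 2.306, so the null hypothesis would be rejected.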
 Lung Volume Investigation
 Prepare the spirometer for collecting lung volume data using a disposable mouthpiece
for each subject tested. Metersticks and a bathroom scale should be readied for
collecting data as well.
 Collect the following data for at least 50 students: gender, age, height, weight,
and lung volume. Record the data for each subject in a spreadsheet. DO NOT collect
names with these data; they are irrelevant. Be sure each subject is a willing participant in the investigation by explaining exactly what information will be obtained
and that it will be used anonymously. It will be most convenient if age is recorded
in months, and the metric system is used for height (cm), weight (kg), and lung volume
(L). If height is collected in English units, it can be converted to centimeters by
first converting to inches and then multiplying by 2.54 cm/in. Pounds are converted
to kilograms by dividing by 2.2 lbs/kg.
Table 1. Distribution of t Probability
 df \ p =   0.10     0.05     0.01     0.001
   1       6.314   12.706   63.657   636.619
   2       2.920    4.303    9.925    31.598
   3       2.353    3.182    5.841    12.941
   4       2.132    2.776    4.604     8.610
   5       2.015    2.571    4.032     6.859
   6       1.943    2.447    3.707     5.959
   7       1.895    2.365    3.499     5.405
   8       1.860    2.306    3.355     5.041
   9       1.833    2.262    3.250     4.781
  10       1.812    2.228    3.169     4.587
  11       1.796    2.201    3.106     4.437
  12       1.782    2.179    3.055     4.318
  13       1.771    2.160    3.012     4.221
  14       1.761    2.145    2.977     4.140
  15       1.753    2.131    2.947     4.073
  16       1.746    2.120    2.921     4.015
  17       1.740    2.110    2.898     3.965
  18       1.734    2.101    2.878     3.922
  19       1.729    2.093    2.861     3.883
  20       1.725    2.086    2.845     3.850
  21       1.721    2.080    2.831     3.819
  22       1.717    2.074    2.819     3.792
  23       1.714    2.069    2.807     3.767
  24       1.711    2.064    2.797     3.745
  25       1.708    2.060    2.787     3.725
  26       1.706    2.056    2.779     3.707
  27       1.703    2.052    2.771     3.690
  28       1.701    2.048    2.763     3.674
  29       1.699    2.045    2.756     3.659
  30       1.697    2.042    2.750     3.646
  40       1.684    2.021    2.704     3.551
  60       1.671    2.000    2.660     3.460
 120       1.658    1.980    2.617     3.373
  ∞        1.645    1.960    2.576     3.291

<===== accept H_{0}                reject H_{0} =====>

 Use the organizing functions in the spreadsheet to compare two populations with
respect to their lung volume. These populations could be established on the basis
of male v. female students, younger v. older students, shorter v. taller students,
or lighter v. heavier students. Where the difference between groups is not obvious (as it
is in gender), the groups could be differentiated on whether their ages, heights,
or weights fell below or above the mean or median for all the data in that category.
In addition, each lab group could compare lung volumes of two populations based on different
criteria.
 Write a formal report using the prescribed report format regarding the Relationship
Between (the selected criterion) and Lung Volume for (the population). Your paper should address the question of whether there is a significant difference
in lung volume between the two populations.
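The unit conversions and the split into two populations described above can be sketched in Python. This is an illustration only: the records are made-up values, the function names are hypothetical, and the choice to place anyone at or above the median in the "taller" group is an assumption.

```python
import statistics

CM_PER_INCH = 2.54
LB_PER_KG = 2.2


def height_cm(feet, inches):
    """Convert feet and inches to centimeters by way of total inches."""
    return (feet * 12 + inches) * CM_PER_INCH


def weight_kg(pounds):
    """Convert pounds to kilograms by dividing by 2.2 lbs/kg."""
    return pounds / LB_PER_KG


# Hypothetical records: (height in cm, lung volume in L).
records = [
    (172.7, 5.1),
    (160.3, 4.8),
    (181.0, 5.6),
    (155.2, 4.1),
    (168.4, 4.9),
    (175.9, 5.3),
]

# Split the subjects into two populations around the median height.
median_height = statistics.median([height for height, _ in records])
shorter = [volume for height, volume in records if height < median_height]
taller = [volume for height, volume in records if height >= median_height]

print(height_cm(5, 8), weight_kg(110))
print(median_height, shorter, taller)
```

The two lists of lung volumes, `shorter` and `taller`, are then the two samples to compare. The same pattern applies to age or weight by changing which column supplies the median.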
Procedure for entering data in a computer spreadsheet
 Boot ClarisWorks and open a new spreadsheet.
 In Row 1 of the spreadsheet, label the columns with the headings corresponding
to the data that you are taking for each subject tested (e.g., subject number, age,
gender, height, weight, lung volume). Note in Figure 2 that column widths
can be changed to fit these headings and that units are easiest to interpret if they are given
in the metric system. You can also set the number of decimal places used for
data and/or calculations in each cell.
Figure 2. Sample setup for data in a spreadsheet with some sample data entered.

      A         B        C      D        E        F
  1   Subject   Gender   Age    Height   Weight   Lung Volume
  2             M/F      (mo)   (cm)     (kg)     (L)
  3   1         M        197    172.7    70.5     5.1
  4   2         F        199    160.3    52.3     4.8
  5   3
  6   4
  7   5
  8   6
  9   7
 10   8
 11   9
 12   10
 13   11
 14   12
 Beginning in Row 3, enter the data for each subject. Enter values without units
(the units are given in the heading for each column). It is a good idea to save
your data after entering each subject.
 When all of the data are entered, the information may be sorted by any of the criteria
represented by the data (age, gender, height, weight, etc.). CAUTION: when sorting, you must select all of the data
in the spreadsheet; if you select only the data in the column you want to serve as
the basis for the sort, only that column will be sorted and it will no longer be
aligned with the appropriate subject.
 At the bottom of the spreadsheet, you can designate various cells to summarize
data for you. For example, you can show the number of data items in each population
(n1 and n2), the sum, or the average of each column or selected data range by entering a formula
from the Paste Function submenu (Edit Menu). The variance and standard deviation values for selected ranges
can also be calculated using the Paste Function submenu. The values calculated by the spreadsheet can then be used to calculate
the "t" value for comparing the two populations, as described earlier.
 You can also use the spreadsheet to calculate the value of "t" by entering the appropriate formula using cell positions to identify the variables
in the formula. Your teacher should be able to help you in this endeavor.
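For readers working outside a spreadsheet, the sorting and summary steps above can be sketched in Python. This is a rough equivalent, not the spreadsheet's own functions: the first two rows repeat the sample data in Figure 2, the other two rows are made up, and the helper name `summarize` is hypothetical. Python's `statistics.variance` and `statistics.stdev` use the n - 1 form.

```python
import statistics

# Rows: (subject, gender, age in months, height cm, weight kg, lung volume L).
# Subjects 1 and 2 come from Figure 2; subjects 3 and 4 are hypothetical.
rows = [
    (1, "M", 197, 172.7, 70.5, 5.1),
    (2, "F", 199, 160.3, 52.3, 4.8),
    (3, "F", 201, 158.0, 50.1, 4.2),
    (4, "M", 195, 180.2, 77.0, 5.6),
]

# Sort whole rows by height so every value stays with its own subject.
# Sorting the height column by itself would break that alignment.
by_height = sorted(rows, key=lambda row: row[3])


def summarize(values):
    """Count, sum, mean, sample variance, and standard deviation of a range."""
    return {
        "n": len(values),
        "sum": sum(values),
        "mean": statistics.mean(values),
        "variance": statistics.variance(values),
        "stdev": statistics.stdev(values),
    }


# Collect the lung volumes for each population (here, by gender).
volumes = {"M": [], "F": []}
for row in rows:
    volumes[row[1]].append(row[5])

summary = {gender: summarize(values) for gender, values in volumes.items()}
for gender, stats in summary.items():
    print(gender, stats)
```

The `n`, `mean`, and `variance` entries for the two populations are the quantities needed to calculate t.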
