# SPSS & Statistics

## Tests for Differences

#### AA Team Guide for the Student T-test

The Student T-test (aka: Independent Samples T-test) compares the means of two independent groups to determine whether there is reasonable evidence (within the sample) that the population means for these two groups differ by a statistically significant amount.

#### Test Assumptions

(1) The dependent variable (test variable) is continuous (interval or ratio).

(2) The independent variable (factor variable) should be two independent groups in which the samples (participants) have no relationship with the other participants in their group or with the participants from the other group.

(3) The samples (participants) for each group are taken at random from the population.

(4) Both groups have roughly the same number of participants.

(5) The dependent variable (test variable) has a reasonably normal distribution for both groups. Normal distribution means you do not want data heavily skewed to the left or right tails, and you do not want significant outliers (better to have no outliers).

(6) The dependent variable (test variable) has equal variance (homogeneity) between both groups. Homogeneity means you want the standard deviation measurements for the two groups to be reasonably the same as each other.

#### Quick Quiz

(Q1) Does the dependent variable (Weight) have a reasonably normal distribution for both groups (male and female)? (Answer: Yes). The data for both groups is certainly not heavily skewed to the left or right tails, and the data values (blue bins) pretty much gather in and around the centre of the bell curve... happiness.

(Q2) Does the dependent variable (Weight) have an equal variance between both groups (male and female)? (Answer: Yes). The variance (whisker-to-whisker) for both groups, although not exactly equal, is certainly not excessively different from each other. But wait... no, no, no... there are a few outliers, and this could violate one of the assumptions of this test. One interesting point regarding these outliers is that none are measured as extreme. In SPSS extreme outliers are marked with the asterisk (*) symbol in a Boxplot chart.

Here is where SPSS will not help you. You as the researcher must look at the SPSS results and make some relevant interpretation. In this example there are 3 outliers, and the Student T-test would prefer 0 outliers. You the researcher will need to make a decision and support that decision with evidence.

In the write-up for this test you could indicate that you elected to run the Student T-test because the data met the assumptions of normal distribution and homogeneity of variance across the two groups. Equally, there are similar sample sizes in the two groups with 21 females to 22 males (include the histogram charts, a gender frequency table, and a Kolmogorov-Smirnov or Shapiro-Wilk test as evidence). However, not all the assumptions for this test were met perfectly. There were 3 outliers, which 1) is only 7.0% of the data (3 of 43 participants), and 2) none of the 3 outliers were measured as extreme (include the Boxplot chart as evidence). In an ideal world this test prefers 0 outliers, but the few outliers that exist are certainly not excessive in number or significant in distance from the median.
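The outlier judgement above can be made concrete with a small sketch. It assumes the common boxplot convention of flagging values more than 1.5 box lengths (IQR) beyond the box as outliers, and values more than 3 box lengths beyond it as extreme. The weights are made-up illustration values, not the guide's dataset, and the quartile method used here may differ slightly from the hinges SPSS uses.

```python
import statistics


def classify_outliers(values):
    """Split values into (outliers, extreme_outliers) using IQR fences."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    outliers, extreme = [], []
    for v in values:
        # Distance beyond the nearest box edge; zero or negative means inside the box.
        distance = max(q1 - v, v - q3)
        if distance > 3 * iqr:
            extreme.append(v)
        elif distance > 1.5 * iqr:
            outliers.append(v)
    return outliers, extreme


# Hypothetical weights in kg (illustration only).
weights = [70, 72, 74, 75, 76, 78, 79, 80, 81, 82, 83, 85, 86, 88, 104]
outliers, extreme = classify_outliers(weights)
share = 100 * len(outliers + extreme) / len(weights)
print(outliers, extreme, f"{share:.1f}% of the data")
```

Reporting both the count and the share of outliers, as in the write-up above, gives the reader the evidence behind your decision.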

#### Student T-test

To start the analysis, click Analyze > Compare Means > Independent-Samples T Test. This will bring up the Independent-Samples T Test dialogue box. To carry out the test, move the dependent (scale) variable into the Test Variable(s): placard. Next move the independent (nominal or ordinal) variable into the Grouping Variable: placard. Click on the Define Groups... button, and enter the correct numeric values that represent each group. Click the Continue button, and then click the OK button at the bottom of the main dialogue box.

#### The Result

The results will appear in the SPSS Output Viewer. In the Group Statistics table there are the key group metrics -- sample size (N), mean, and standard deviation. From these measurements you should develop an intuitive perspective as to whether the test will indicate a statistically significant difference or not. Here in this example, there is approximately a 3 kg difference in weight between the females and the males -- the females (83.42 kg) weigh only about 3.8% more than the males (80.40 kg). You would not expect this small difference to be statistically significant. Equally, there is a reasonably similar standard deviation measurement for the two gender groups, and therefore, you would expect that the two groups do not violate homogeneity of variance.

In the Independent Samples Test table there are the key test metrics -- equality of variance and then all the t-test measurements. In this example, first we see (as we estimated earlier from the two standard deviations) the two groups do not violate homogeneity, as the p-value (0.147) in the Levene's Test for Equality of Variances is above the critical 0.05 alpha level. Therefore, in the second part of this table, we read (and report) all the metrics from the top row, which is labeled Equal variances assumed.
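As a rough sketch, the two quick checks described above (how big is the difference relative to the means, and which row of the table to read) can be written out as follows. The function names are hypothetical helpers made up for illustration. The means and the Levene p-value are the ones quoted in this example.

```python
def percent_difference(larger, smaller):
    """Difference between two means as a percentage of the smaller mean."""
    return 100 * (larger - smaller) / smaller


def row_to_report(levene_p, alpha=0.05):
    """Pick which row of the Independent Samples Test table to read."""
    # A Levene p-value above alpha means homogeneity of variance is not violated.
    if levene_p > alpha:
        return "Equal variances assumed"
    return "Equal variances not assumed"


pct = percent_difference(83.42, 80.40)  # female mean vs male mean, in kg
row = row_to_report(0.147)              # Levene's test p-value from the example
print(f"{pct:.1f}% difference; read the row: {row}")
```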

These measurements in this second part of the table give you the t-score, the degrees of freedom, the p-value, the mean difference, and the 95% C.I. of the difference. Here in this example the t-score (1.407) is relatively small, and we were expecting that as there is only a 3 kg difference in weight. Equally, the p-value (0.167) is above the critical 0.05 alpha level, indicating this difference between the females' weight and the males' weight is not statistically significant, which we were also expecting as the 3 kg difference is only about a 3.8% magnitude of change.
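To see where a t-score like this comes from, here is a minimal sketch of the equal-variances (pooled) calculation using only the Python standard library. The two small samples are hypothetical. The p-value is left to SPSS, since it needs the t distribution.

```python
import math
import statistics


def pooled_t(group_a, group_b):
    """Return (t_score, degrees_of_freedom) assuming equal variances."""
    n_a, n_b = len(group_a), len(group_b)
    mean_diff = statistics.mean(group_a) - statistics.mean(group_b)
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_var = (
        (n_a - 1) * statistics.variance(group_a)
        + (n_b - 1) * statistics.variance(group_b)
    ) / (n_a + n_b - 2)
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return mean_diff / standard_error, n_a + n_b - 2


# Hypothetical data (illustration only).
t_score, df = pooled_t([1, 2, 3, 4, 5], [2, 3, 4, 5, 6])
print(f"t = {t_score:.3f}, df = {df}")
```

A small mean difference relative to the spread of the data gives a small t-score, which is the intuition used in the paragraph above.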

Finally, the 95% C.I. of the difference provides a high / low range as to where the difference (3 kg) between these two gender groups might actually exist in the population. Here the males' weight could actually be 7.3 kg lower than the females' weight, or the males' weight could actually overtake and exceed the females' weight by 1.3 kg. This is a range of about 8.6 kg from high to low, which is fairly narrow. But keep in mind this range starts on the negative side, crosses the 0 threshold, and then moves into the positive side. So the difference in the population could plausibly be 0 kg, that is, the females and males weigh the same -- a nil difference. This agrees with the p-value above, which was not statistically significant.
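The confidence interval can be roughly reconstructed from the numbers already quoted. This sketch assumes df = 21 + 22 - 2 = 41 and uses a two-tailed 5% critical t value of about 2.0195 taken from a t table. Both are assumptions made for illustration, not figures read from the SPSS output.

```python
mean_diff = 3.02     # 83.42 kg - 80.40 kg, from the Group Statistics table
t_score = 1.407      # from the Independent Samples Test table
t_critical = 2.0195  # assumed two-tailed 5% critical value for df = 41

# The t-score is the mean difference divided by its standard error.
standard_error = mean_diff / t_score
margin = t_critical * standard_error
low, high = mean_diff - margin, mean_diff + margin

# An interval that crosses zero is consistent with a non-significant result.
crosses_zero = low < 0 < high
print(f"95% C.I.: {low:.2f} to {high:.2f} kg; crosses zero: {crosses_zero}")
```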

#### Further Study

Happiness... you should now understand how to perform the Student T-test in SPSS and how to interpret the result. However, if you want to explore further, here are two sites:

#### AA Team Guide for the Mann-Whitney U Test

The Mann-Whitney U test compares the medians or mean ranks of two independent groups and is commonly used when the dependent variable is either categorical (ordinal) or continuous (interval or ratio), and does not meet the assumptions for the Independent Samples T-test (aka: Student T-test).

#### Test Assumptions

(1) The dependent variable (test variable) can be categorical (ordinal) or continuous (interval or ratio) in its measurement type.

(2) The independent variable should be two independent groups in which the samples (participants) have no relationship with the other participants in their group or with the participants from the other group.

(3) The samples (participants) for each group are taken at random from the population.

(4) The sample size can be disproportionate or unbalanced in the number of participants in each group.

(5) The dependent variable (test variable) for one or both groups can be non-normal in its distribution. Non-normal distribution means the data can be heavily skewed to the left or right tails, and/or it may have significant outliers.

(6) The dependent variable (test variable) for one or both groups may (or may not) have a similar shape (homogeneity) in its variance. It is extremely unlikely that the variance of the two groups will be identical, and therefore, the Mann-Whitney U test will test between the mean ranks of the dependent variable for both groups.

#### Quick Quiz

(Q1) Would you use the Mann-Whitney U test for the following data on users and non-users of a weight training supplement? (Answer: Yes). The participant count (frequency) for the two groups is certainly not in a balanced proportion, with User at 52 (73%) and Non-user at 19 (27%). Also, the dependent variable (Muscle_kg) violates normal distribution for the smaller Non-user group, as indicated by both the Kolmogorov-Smirnov (p = .006) and the Shapiro-Wilk (p = .020) tests of normality, which are below the critical 0.05 alpha level.

(Q2) Does the Boxplot give support for using the Mann-Whitney U test to compare between the users and non-users of the weight training supplement? (Answer: Yes). The total variance (whisker-to-whisker and including outliers) between the two groups, although not exactly equal, is certainly not wildly different from each other. And you could argue the two groups have homogeneity of variance. However, there are several outliers in the non-user group, and one is an extreme outlier, as marked with the asterisk (*) symbol. This number and condition of outliers in the non-user group would give support for choosing the Mann-Whitney U test to analyse the data.

#### Mann-Whitney U Test

To start the analysis, click Analyze > Nonparametric Tests > Legacy Dialogs > 2 Independent Samples... This will bring up the Two-Independent-Samples Tests dialogue box. To carry out the test, move the dependent variable (scale or ordinal) into the Test Variable List: placard. Next move the independent variable (nominal or ordinal) into the Grouping Variable: placard. Click on the Define Groups... button, and enter the correct numeric values that represent each group. Click the Continue button. Verify that the Mann-Whitney U test is selected in the Test Type section. Finally, click the OK button at the bottom of the main dialogue box.

#### The Result

The results will appear in the SPSS Output Viewer. In the Ranks table there are the key group metrics -- sample size (N) and mean rank. From these measurements you should develop an intuitive perspective as to whether the test will indicate a statistically significant difference or not. Here in this example, there is approximately a 3.9 point difference between the mean rank of the non-users (19.84) and the mean rank of the users (23.71) as regards their muscle mass. This is a difference of about 4 places in rank, and you would not expect this small difference to be statistically significant.

In the Test Statistics table there are the key test metrics -- the Mann-Whitney U score and the p-value (Asymp. Sig.); some researchers will also report the Z score. In this example, we see (as we estimated earlier from the two mean ranks) the difference between the two groups is not statistically significant, as the p-value (0.316) is above the critical 0.05 alpha level. In your report write-up you should also include the Mann-Whitney U score as further support that the difference is not statistically significant.

The Mann-Whitney U test converts the raw data values for the dependent variable into a rank -- 1st, 2nd, 3rd, 4th, and so forth. Then it adds all the converted ranks for all the participants in each group to achieve that group's total "sum of ranks". If you divide the sum of ranks by the number of participants, you will get the mean rank (or what is the typical participant's rank). Remember, in statistics we tend to determine 1) what is a typical member in my sample and 2) what is the variance around that typical member.
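The ranking steps just described can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than SPSS's own routine: tied values share the average of their positions, the smaller of the two U values is reported, and the data are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1 upward; tied values share the average position."""
    ordered = sorted(values)

    def rank_of(value):
        positions = [i + 1 for i, v in enumerate(ordered) if v == value]
        return sum(positions) / len(positions)

    return [rank_of(v) for v in values]


def mann_whitney_summary(group_a, group_b):
    """Return the mean rank of each group and the Mann-Whitney U score."""
    n_a, n_b = len(group_a), len(group_b)
    # Rank both groups together, then split the ranks back by group.
    ranks = average_ranks(list(group_a) + list(group_b))
    sum_a, sum_b = sum(ranks[:n_a]), sum(ranks[n_a:])
    u_a = sum_a - n_a * (n_a + 1) / 2
    u_b = sum_b - n_b * (n_b + 1) / 2
    return {
        "mean_rank_a": sum_a / n_a,
        "mean_rank_b": sum_b / n_b,
        "U": min(u_a, u_b),
    }


# Hypothetical muscle mass values in kg (illustration only).
summary = mann_whitney_summary([1, 2, 2], [2, 3])
print(summary)
```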

The Mann-Whitney U test is much simpler to understand and appreciate, as it is not concerned with normal distribution of the dependent variable, and it is not concerned with homogeneity of variance between the two groups.

#### Further Study

Happiness... you should now understand how to perform the Mann-Whitney U test in SPSS and how to interpret the result. However, if you want to explore further, here are two sites:

#### AA Team Guide for the One-Way ANOVA Test

In this tutorial, we will look at how to conduct the One-Way ANOVA test in SPSS (aka: One Factor ANOVA or One-Way Analysis of Variance), and how to interpret the results of the test. The One-Way ANOVA test compares the means of three or more independent groups to determine if there is reasonable evidence that the population means for these three or more groups have a statistically significant difference.

#### Test Assumptions

(1) The dependent variable (test variable) is continuous (interval or ratio).

(2) The independent variable (factor variable) is categorical (nominal or ordinal) and should be three or more independent groups in which the samples (participants) have no relationship with the other participants in their group or with the participants from the other groups.

(3) The samples (participants) for each group are taken at random from the population.

(4) All the groups have roughly the same number of participants.

(5) The dependent variable (test variable) has a reasonably normal distribution for each group. Normal distribution means you do not want data heavily skewed to the left or right tails, and you do not want significant outliers (better to have no outliers).

(6) The dependent variable (test variable) has equal variance (homogeneity) between all the groups. Homogeneity means you want the standard deviation measurements for the groups to be roughly the same as each other.

#### Quick Quiz

(Q1) Does the dependent variable (Fat %) have a reasonably normal distribution across the three groups? (Answer: Yes). The data for the three groups is certainly not heavily skewed to the left or right tails, and the data values (blue bins) pretty much gather in and around the centre of the bell curve... happiness.

(Q2) Does the dependent variable (Fat %) have an equal variance between the three groups? (Answer: Yes). The variance (whisker-to-whisker) for the three groups, although not exactly equal, is certainly not excessively different from each other. But wait... no, no, no... there are a few outliers, and this could violate one of the assumptions of this test. One interesting point regarding these outliers is that none are measured as extreme. In SPSS extreme outliers are marked with the asterisk (*) symbol in a Boxplot chart.

Here is where SPSS will not help you. You as the researcher must look at the SPSS results and make some relevant interpretation. In this example there are two outliers, and the One-Way ANOVA test would prefer zero outliers. You the researcher will need to make a decision and support that decision with evidence.

In the write-up for this test you could indicate that you elected to run the ANOVA test because the data met the assumptions of normal distribution and homogeneity of variance across the three groups. Equally, there are similar sample sizes from 13 to 15 participants in each group (include the histogram charts, a participants per sessions Frequency table, and a Kolmogorov-Smirnov or Shapiro-Wilk test as evidence). However, not all the assumptions for this test were met perfectly. There were two outliers, which 1) is only 4.6% of the data, and 2) neither of the two outliers was measured as extreme (include the Boxplots as evidence). In an ideal world this test prefers 0 outliers, but the few outliers that exist are certainly not excessive in number nor significant in distance from the median.

#### One-Way ANOVA Test

To start the analysis, click Analyze > Compare Means > One-Way ANOVA. This will bring up the One-Way ANOVA dialogue box. To carry out the test, move the dependent (scale) variable into the Dependent List: placard. Next move the independent (nominal or ordinal) variable into the Factor: placard. There is an Options... button, where you can select descriptive statistics for the three groups and a homogeneity of variance test, and there are other extra statistics and a means plot.

Note: If the dependent variable violates the homogeneity of variance test (a p-value below the critical 0.05 alpha level), then researchers will re-run the One-Way ANOVA test and in the Options... section they will select the Welch statistic. This variant of the One-Way ANOVA test is not concerned with homogeneity of variance between the different groups.

After selecting any extra options that you want, click the Continue button, and then click the OK button at the bottom of the main dialogue box.

#### The Result

The results will appear in the SPSS Output Viewer. In the Descriptives table there are the key group metrics -- sample size (N), mean, standard deviation, and 95% C.I. From these measurements you should develop an intuitive perspective as to whether the test will indicate a statistically significant difference or not. Here in this example, there is approximately a 1.5 and 1.9 point difference in fat (%) between the three groups, with 3 times a week at 14.8%, 4 times a week at 16.3%, and 5 times a week at 14.4%. You would not expect this small difference to be statistically significant. Equally, there are almost identical standard deviation measurements for the three groups, and therefore, you would expect that the groups do not violate homogeneity of variance.

In the Test of Homogeneity of Variances table there are the key test metrics for homogeneity (equality) of variance. If the data were normally distributed and there were no significant outliers, then you would expect all four measurements to agree. And this is true in our example, with all the p-values virtually the same between 0.95 and 0.96. The measurement you would refer to (and quote) in your write-up would be the top row titled Based on Mean. Here in our example the p-value is 0.955 (well above the critical 0.05 alpha level), which indicates that between the three groups the dependent variable does not violate homogeneity of variance.

Finally, in the ANOVA table you have the degrees of freedom, the F-score, and the p-value. Here in this example the F-score (1.418) is relatively small, and we were expecting that as there is only a 1.5 and 1.9 point difference in fat (%). Equally, the p-value (0.254) is above the critical 0.05 alpha level, indicating this difference between the three groups is not statistically significant, which we were expecting based on the results in the Descriptives table as mentioned earlier.
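For readers who want to see what sits behind the F-score, here is a minimal sketch of the one-way ANOVA calculation: the variance between the group means divided by the variance within the groups. The three small groups are hypothetical. The p-value is left to SPSS, since it needs the F distribution.

```python
import statistics


def one_way_f(groups):
    """Return (f_score, df_between, df_within) for a list of groups."""
    all_values = [v for g in groups for v in g]
    grand_mean = statistics.mean(all_values)
    # Between-groups: how far each group mean sits from the grand mean.
    ss_between = sum(
        len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups
    )
    # Within-groups: how far each value sits from its own group mean.
    ss_within = sum(
        sum((v - statistics.mean(g)) ** 2 for v in g) for g in groups
    )
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_score = (ss_between / df_between) / (ss_within / df_within)
    return f_score, df_between, df_within


# Hypothetical fat (%) style data for three groups (illustration only).
f_score, df_between, df_within = one_way_f([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
print(f"F({df_between}, {df_within}) = {f_score:.3f}")
```

When the group means sit close together relative to the spread inside each group, the F-score stays small, which matches the intuition drawn from the Descriptives table.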

#### Post Hoc Testing

Secondary post hoc testing can be completed if the original One-Way ANOVA result indicated that the differences between the groups were statistically significant. You would need to re-run the test, and in the One-Way ANOVA dialogue box there is a Post Hoc button you would click. This will open the One-Way ANOVA: Post Hoc Multiple Comparisons dialogue box, and you can select one of the Post Hoc tests recommended by your tutor. The three most common seem to be LSD, Bonferroni, and Tukey; in my example I have selected the LSD test.

#### Post Hoc Results

In the Multiple Comparisons table you are looking for any comparison with a large mean difference, which should result in a corresponding p-value below the critical 0.05 alpha level. In my example there are two comparisons like this, which are the 3 times / week to the 5 times / week, and the 4 times / week to the 5 times / week. In your write-up you would list these two comparisons with the evidence of the mean difference and p-value respectively.

#### Further Study

Happiness... you should now understand how to perform the One-Way ANOVA in SPSS and how to interpret the result. However, if you want to explore further, here are two sites:

#### AA Team Guide for the Kruskal-Wallis Test

The Kruskal-Wallis test compares the medians or mean ranks of three or more independent groups and is commonly used when the dependent variable is either categorical (ordinal) or continuous (interval or ratio), and does not meet the assumptions for the One-Way ANOVA test.

#### Test Assumptions

(1) The dependent variable (test variable) can be categorical (ordinal) or continuous (interval or ratio) in its measurement type.

(2) The independent variable should be three or more independent groups in which the samples (participants) have no relationship with the other participants in their group or with the participants from the other groups.

(3) The sample size can be disproportionate or unbalanced in the number of participants in each group.

(4) The dependent variable (test variable) for one or all the groups can be non-normal in its distribution. Non-normal distribution means the data can be heavily skewed to the left or right tails, and/or it may have significant outliers.

(5) The dependent variable (test variable) for one or all the groups may (or may not) have a similar shape (homogeneity) in its variance. It is extremely unlikely that the variance for the groups will be identical, and therefore, the Kruskal-Wallis test will test between the mean ranks of the dependent variable for all the groups.

#### Quick Quiz

(Q1) Do the three bread types (White, Brown, Seeded) have balanced or equal proportions? (Answer: No) The Brown and Seeded bread types are fairly balanced (equal) in their sample size at 24 (16.1%) and 28 (18.8%) respectively. However the White bread type has a sample size that is more than 3 times larger at 97 (65.1%). The Kruskal-Wallis test is more suited to manage groups with disproportionate (unequal) sample sizes.

(Q2) Do the three bread types (White, Brown, Seeded) have a normal distribution? (Answer: No) The Seeded bread appears the most normal with the data values located around the mean (top of the bell curve). However, the Brown bread is starting to show a higher distribution of data values on the left tail and some outliers above 2.0 grams. And the White bread has increased this same skewed distribution (overweight on the left tail and outliers on the right tail) to a much higher degree with several extreme outliers at 4.0 to 8.0 grams. The Kruskal-Wallis test is more suited to manage groups where their test data are not normally distributed and/or have a high number of outliers.

#### Kruskal-Wallis Test

To start the analysis, click Analyze > Nonparametric Tests > Legacy Dialogs > K Independent Samples. This will bring up the Tests for Several Independent Samples dialogue box. To carry out the test, move the dependent (scale or ordinal) variable into the Test Variable List: placard. Next move the independent (nominal or ordinal) variable into the Grouping Variable: placard. Click on the Define Range... button, and enter the correct numeric values that represent all the groups. Click the Continue button. Verify that the Kruskal-Wallis test is selected in the Test Type section. Finally, click the OK button at the bottom of the main dialogue box.

#### The Result

The results will appear in the SPSS Output Viewer. In the Ranks table there are the key group metrics -- sample size (N) and mean rank. From these measurements you should develop an intuitive perspective as to whether the Kruskal-Wallis test will indicate a statistically significant difference or not. Here in this example, there is approximately a 9.2 point difference between the mean rank of the White bread (65.12) and the mean rank of the Brown bread (74.29) as regards their saturated fat. You would not expect this moderate (14.1%) difference to be statistically significant.

However, there is approximately a 44.7 point difference between the mean rank of the White bread (65.12) and the mean rank of the Seeded bread (109.84). You would expect this large (68.7%) difference to be statistically significant. Equally, there is approximately a 35.5 point difference between the mean rank of the Brown bread (74.29) and the mean rank of the Seeded bread (109.84). You would expect this large (47.9%) difference to be statistically significant.

In the Test Statistics table there are the key test metrics -- the Kruskal-Wallis H score, the degrees of freedom (df), and the p-value (Asymp. Sig.). In this example, we see (as we estimated earlier from the mean ranks between the bread types) the difference between the three groups is statistically significant, as the p-value (displayed by SPSS as 0.000, which you should report as p < 0.001) is below the critical 0.05 alpha level. In your report write-up you should also include the Kruskal-Wallis H score as further support that the difference is statistically significant.

The Kruskal-Wallis test converts the raw data values for the dependent variable into a rank -- 1st, 2nd, 3rd, 4th, and so forth. Then it adds all the converted ranks for all the participants in each group to achieve that group's "sum of ranks". If you divide the sum of ranks by the number of participants, you will get the mean rank (or what is a typical participant's rank). Remember, in statistics we tend to determine 1) what is a typical member in my sample and 2) what is the variance around that typical member.
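As a rough illustration of how the mean ranks feed into the test, the sketch below computes an H score directly from group sizes and mean ranks. It assumes the group sizes from the quiz (97, 24, and 28) are the ones behind the Ranks table, and it applies no correction for tied values, so the H score reported by SPSS may differ slightly.

```python
def kruskal_wallis_h(group_sizes, mean_ranks):
    """Kruskal-Wallis H from group sizes and mean ranks (no tie correction)."""
    n_total = sum(group_sizes)
    # With no group differences, every mean rank would sit near this value.
    overall_mean_rank = (n_total + 1) / 2
    spread = sum(
        n * (rank - overall_mean_rank) ** 2
        for n, rank in zip(group_sizes, mean_ranks)
    )
    return 12 / (n_total * (n_total + 1)) * spread


sizes = [97, 24, 28]                 # White, Brown, Seeded (from the quiz)
mean_ranks = [65.12, 74.29, 109.84]  # from the Ranks table
h_score = kruskal_wallis_h(sizes, mean_ranks)
df = len(sizes) - 1
print(f"H = {h_score:.2f}, df = {df}")
```

Most of the H score here comes from the Seeded group, whose mean rank sits furthest from the overall mean rank of 75.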

The Kruskal-Wallis test is much simpler to understand and appreciate, as it is not concerned with normal distribution of the dependent variable, and it is not concerned with homogeneity of variance between the three groups.

Sadly, what it does not report is exactly between which groups the statistical difference exists. In our example, is the statistical difference between the White and Brown breads, or between the White and Seeded breads, or between the Brown and Seeded breads? To find exactly where the statistical difference exists between our three bread groups, you would need to run three separate Mann-Whitney U tests, one for each of the three pair-wise comparisons listed above.
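The follow-up comparisons can be listed mechanically, as in the sketch below. The Bonferroni-adjusted alpha level shown is an assumption added here for illustration: it is a common way to guard against false positives when running several tests, but it is not part of the guide's own procedure, so check what your tutor recommends.

```python
from itertools import combinations

groups = ["White", "Brown", "Seeded"]

# Every pair-wise comparison between the three bread types.
pairs = list(combinations(groups, 2))

alpha = 0.05
# Bonferroni adjustment (an assumption, not from the guide): split alpha across the tests.
adjusted_alpha = alpha / len(pairs)

for first, second in pairs:
    print(f"Mann-Whitney U: {first} vs {second}, compare p to {adjusted_alpha:.4f}")
```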

#### Further Study

Happiness... you should now understand how to perform the Kruskal-Wallis test in SPSS and how to interpret the result. However, if you want to explore further, here are two sites:

#### AA Team Guide for the Paired Samples T-test

The Paired Samples T-test (aka: Repeated Measures T-test) compares the means of two measurements taken from the same participant or sample object. It is commonly used for a measurement at two different times (e.g., pre-test and post-test score with an intervention administered between the two scores), or a measurement taken under two different conditions (e.g., a test under a control condition and an experiment condition).

The Paired Samples T-test determines if there is evidence that the mean difference between the paired measurements is significantly different from a zero difference.

#### Test Assumptions

(1) The dependent variable (test variable) is continuous (interval or ratio).

(2) The independent variable consists of two related groups. Related groups means the participants (or sample objects) for both measurements of the dependent variable are the same participants.

(3) The participants (or sample objects) are taken at random from the population.

(4) The dependent variables (test variables) have a reasonably normal distribution. Normal distribution means you do not want data heavily skewed to the left or right tails, and you do not want significant outliers (better to have no outliers).

(Note) When testing the assumptions related to normal distribution and outliers, you must create and use a new variable that represents the difference between the two paired measurements. Do not test the original two paired measurements themselves.

#### Quick Quiz

(Q1) You want to examine the alertness of both male and female students at 09:00 am lectures and at 1:00 pm (after lunch) lectures. Do you use the Independent Samples T-test or the Paired Samples T-test? (Answer: Independent Samples T-test). Although the experiment design sounds like a before and after intervention, it would be highly unlikely that at the two different times (09:00 am and 1:00 pm) the students in the lectures would be the same identical students.

(Q2) You have surveyed students on what they eat (over one week) for breakfast and lunch. A diet-plan app has calculated the energy level of the food eaten for each participant. You used SPSS to create a new variable which is the difference between the breakfast meal and the lunch meal, and you created a histogram to check for normal distribution and outliers. From the histogram below, would you use the Independent Samples T-test or the Paired Samples T-test? (Answer: Paired Samples T-test). Here in this experiment design there are the same students surveyed for their breakfast meal and for their lunch meal. Equally, the histogram shows a very reasonable normal distribution (no extreme skewness on the left or right tails) and with no significant outliers... happiness!

#### Paired Samples T-test

To start the analysis, click Analyze > Compare Means > Paired-Samples T Test. This will bring up the Paired-Samples T Test dialogue box. To carry out the test, move the two dependent (scale) variables into the Paired Variables: placard. And then click the OK button at the bottom of the dialogue box.

#### The Result

The results will appear in the SPSS Output Viewer. In the Paired Samples Statistics table there are the key group metrics -- sample size (N), mean, and standard deviation. From these measurements you should develop an intuitive perspective as to whether the test will indicate a statistically significant difference or not. Here in this example, there is approximately a 505 calorie difference (on average) between the lunch meal (880 calories) and the dinner meal (1385 calories), which is a 57.4% increase in calories from lunch to dinner. You would expect this sizeable difference to be statistically significant.

In the Paired Samples Test table there are the key test metrics -- 95% confidence intervals, the t-score, the degrees of freedom (df), and the p-value. In this example, we can see (as we estimated earlier from the two means) the t-score (34.242) is extremely large, and we were expecting this as there was a 505 calorie difference between the two meals. Equally, the p-value (displayed by SPSS as 0.000, which you should report as p < 0.001) is well below the critical 0.05 alpha level, indicating the difference in calories between the lunch meal and dinner meal is statistically significant, which we were also expecting as the 505 calorie difference is a 57.4% magnitude of change.

Finally, the 95% C.I. of the difference provides a high / low range of accuracy as to where this difference (505 calories) between the two meals might actually exist in the population. Here the calorie difference could actually be as high as 534 calories or as low as 474 calories. This is only a 60 calorie range from high to low providing strong confidence that the mean difference in our sample accurately represents what is likely to be the mean difference in the population.
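A minimal sketch of the calculation behind the paired test is shown here. It works on the difference variable (second measurement minus first) rather than on the two original measurements. The calorie values are hypothetical, and the p-value is left to SPSS.

```python
import math
import statistics


def paired_t(first, second):
    """Return (differences, t_score, degrees_of_freedom) for paired data."""
    # One difference per participant: second measurement minus first.
    diffs = [s - f for f, s in zip(first, second)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    standard_error = statistics.stdev(diffs) / math.sqrt(n)
    return diffs, mean_diff / standard_error, n - 1


# Hypothetical paired measurements for four participants (illustration only).
lunch = [1, 2, 3, 4]
dinner = [2, 4, 5, 8]
diffs, t_score, df = paired_t(lunch, dinner)
print(diffs, f"t = {t_score:.3f}, df = {df}")
```

The `diffs` list is the same difference variable you would inspect with a histogram when checking the normality and outlier assumptions.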

#### Further Study

Happiness... you should now understand how to perform the Paired Samples T-test in SPSS and how to interpret the result. However, if you want to explore further, here are two sites:

#### AA Team Guide for the Wilcoxon Sign Test

The Wilcoxon Sign test (aka: Wilcoxon Signed-Rank) compares two measurements taken from the same participant or sample object by ranking the differences between the paired measurements, rather than by comparing their means. It is commonly used for a measurement at two different times (e.g., pre-test and post-test score with an intervention administered between the two scores), or a measurement taken under two different conditions (e.g., a test under a control condition and an experiment condition).

The Wilcoxon Sign test determines if there is evidence that the median difference between the paired measurements is significantly different from a zero difference.

#### Test Assumptions

(1) The dependent variable (test variable) is continuous (interval or ratio) or it can be categorical (ordinal).

(2) The independent variable consists of two related groups. Related groups means the participants (or sample objects) for both measurements of the dependent variable are the same participants.

(3) The participants (or sample objects) are taken at random from the population.

(4) The dependent variables (test variables) do not need to have a normal distribution. This test does not require normality or homoscedasticity (the data having the same scatter or spread) within the dependent variables. Non-normal distribution means the data can be skewed to the left or right tails, and the data can have a significant number of outliers.

#### Quick Quiz

(Q1) You want to examine caffeine markers in a group of students. One week the students will receive a normal cup of coffee (control group), and the next week the same students will receive a cup of coffee with an additive (experiment group). The research is set up as a double-blind study, so that neither the students nor the researchers know which cup of coffee is normal and which has the additive. Could you use the Wilcoxon Sign test to analyse the data? (Answer: Yes) The experiment design is set up as two related (dependent) groups: the same students are tested twice, once as the control group and once as the experiment group.

(Q2) You have collected the data for the two coffee groups (control and experiment). You used SPSS to create a boxplot to visualise the data side-by-side. From the boxplot, as an intuitive perspective, would the Wilcoxon Sign test indicate a statistically significant difference? (Answer: Yes) Here in this boxplot there is very little overlap between the two interquartile ranges (IQR). Remember, the IQR represents the middle 50% of the data values. Therefore, for almost 50% of the experiment group (or more if we include the whiskers), the caffeine markers are different from when the same person was in the control group.
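The "how much do the two boxes overlap" judgement can be sketched in Python. The numbers below are hypothetical toy values invented for illustration, not the data from this example, and the helper names `iqr_limits` and `iqr_overlap` are likewise made up for this sketch. Note also that `statistics.quantiles` may use a slightly different quartile rule than SPSS.

```python
from statistics import quantiles

def iqr_limits(values):
    """Return (Q1, Q3), the lower and upper edges of the box in a boxplot."""
    q1, _median, q3 = quantiles(values, n=4)
    return q1, q3

def iqr_overlap(group_a, group_b):
    """Length of the overlap between the two interquartile ranges (0 if none)."""
    a_q1, a_q3 = iqr_limits(group_a)
    b_q1, b_q3 = iqr_limits(group_b)
    return max(0.0, min(a_q3, b_q3) - max(a_q1, b_q1))

# Hypothetical caffeine marker readings (illustration only).
control = [60, 62, 63, 65, 66, 68, 70, 71]
experiment = [72, 74, 75, 77, 79, 80, 82, 84]

print("Control IQR:   ", iqr_limits(control))
print("Experiment IQR:", iqr_limits(experiment))
print("IQR overlap:   ", iqr_overlap(control, experiment))
```

An overlap of zero (or close to it) is the pattern that suggests the test will find a statistically significant difference. The boxplot is only a visual hint, though, because it ignores the pairing between each student's two measurements.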

#### Wilcoxon Sign Test

To start the analysis, click Analyze > Nonparametric Tests > Legacy Dialogs > 2 Related Samples. This will bring up the Two-Related-Samples Tests dialogue box. To carry out the test, move the two dependent (scale or ordinal) variables into the Test Pairs: placard. Then click the OK button at the bottom of the dialogue box.

#### The Result

The results will appear in the SPSS Output Viewer. In the Ranks table there are the key group metrics -- sample size (N), mean rank, and sum of ranks. From these measurements you should develop an intuitive perspective as to whether the test will indicate a statistically significant difference or not. In this example, there are 0 negative ranks to 50 positive ranks -- take note there are only 50 students in the sample. So, out of 50 students, all of them had a positive rank and not a single student had a negative rank.

If you had 50 darts and threw them at a dart board (a random action), would they all land on the top half with not a single dart landing on the bottom half? Never! Something is happening here that cannot reasonably be put down to random chance. If there is no bias, trickery, or tom-foolery, and if everything is equal with the students and the coffee, then you would expect roughly 25 negative ranks and 25 positive ranks. As the result is extremely skewed with everyone in a positive rank, we would expect the test to indicate the difference is statistically significant.

As the footnotes under the Ranks table indicate, a negative rank is where the experiment group's caffeine marker (BPM) is lower than the same person's caffeine marker when they were in the control group. In other words, their second test measurement (experiment) was lower than their first test measurement (control). And, of course, a positive rank is just the opposite. As mentioned earlier, under random chance we are expecting roughly a 25 to 25 split, that is, half the students to have a lower second score and half the students to have a higher second score. The further we move away from this equal and random split, the more likely the result will be statistically significant.
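To make the Ranks table less of a black box, here is a minimal Python sketch of how negative and positive ranks are typically built: take each person's second measurement minus their first, drop zero differences, rank the absolute differences (sharing the average rank among ties), and then split the ranks by sign. The function name `signed_rank_summary` and the five pairs of readings are hypothetical and are not the data from this example. The last lines put a number on the dart board argument.

```python
def signed_rank_summary(first, second):
    """Summarise negative and positive ranks for paired measurements."""
    # Difference = second measurement minus first; zero differences are dropped.
    diffs = [b - a for a, b in zip(first, second) if b != a]

    # Rank the absolute differences; tied magnitudes share their average rank.
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    pos = 0
    while pos < len(order):
        end = pos
        while (end + 1 < len(order)
               and abs(diffs[order[end + 1]]) == abs(diffs[order[pos]])):
            end += 1
        average_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = average_rank
        pos = end + 1

    negative = [r for d, r in zip(diffs, ranks) if d < 0]
    positive = [r for d, r in zip(diffs, ranks) if d > 0]
    return {
        "negative_n": len(negative), "negative_sum": sum(negative),
        "positive_n": len(positive), "positive_sum": sum(positive),
    }

# Hypothetical readings for five students (illustration only).
control = [70, 72, 68, 75, 71]
experiment = [74, 71, 73, 75, 79]
summary = signed_rank_summary(control, experiment)
print(summary)

# The dart board argument: if each student were equally likely to go up or
# down, the chance that all 50 go up is 0.5 multiplied by itself 50 times.
chance_all_positive = 0.5 ** 50
print(f"Chance of 50 out of 50 positive ranks: {chance_all_positive:.2e}")
```

In the toy data one student has a lower second reading, one is unchanged (and is dropped), and three are higher, so the summary shows 1 negative rank and 3 positive ranks.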

In the Test Statistics table there are the key test metrics -- the test score (Z) and the p-value (Asymp. Sig). In this example, we can see (as we estimated earlier from the negative and positive ranks) that the z-score (6.169) is extremely large, and we were expecting this as there was a 0 to 50 negative to positive split in the ranks. Equally, the p-value (displayed by SPSS as 0.000, which means p < 0.001 rather than exactly zero) is well below the critical 0.05 alpha level, indicating the difference in the caffeine markers between the control and the experiment is statistically significant. We were expecting this too, as all 50 students (100% of the sample) changed in the same direction.
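The size of the z-score can be roughly reproduced with the standard large-sample (normal approximation) formula for the signed-rank statistic. The Python sketch below assumes all 50 differences are non-zero and positive and applies no correction for tied ranks. Those are assumptions, not details read from the SPSS output, which is probably why the result lands near, but not exactly on, the reported 6.169.

```python
import math

n = 50  # number of students, all with positive ranks in this example

# With every rank positive, the positive rank sum is 1 + 2 + ... + n.
positive_rank_sum = n * (n + 1) / 2

# Under the "no difference" assumption, the expected rank sum and its
# standard deviation (no tie correction applied in this sketch).
expected_sum = n * (n + 1) / 4
std_dev = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)

z = (positive_rank_sum - expected_sum) / std_dev

# Two-sided p-value from the standard normal distribution.
p_two_sided = math.erfc(abs(z) / math.sqrt(2))

print(f"z = {z:.3f}")
print(f"two-sided p = {p_two_sided:.2e}")
```

The p-value from this approximation is far smaller than 0.001, which is consistent with SPSS displaying it as 0.000.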

#### Further Study

Happiness... you should now understand how to perform the Wilcoxon Sign Test in SPSS and how to interpret the result. However, if you want to explore further, here are two sites: