
SPSS and Statistics: Basic Statistics

Basic Statistics

AA Team Guide for Descriptive Statistics

There are a number of different ways to calculate descriptive statistics in SPSS. We will use the Frequencies menu option. To start the analysis, click on Analyze > Descriptive Statistics > Frequencies.

Frequencies menu path in SPSS

 

This will bring up the Frequencies dialogue box. Move the scale variable for which you wish to calculate descriptive statistics into the Variable(s) box. You can drag and drop the scale variable, or first select it and then click the arrow button in the centre of the dialogue box.

Frequencies dialogue box in SPSS

 

Once you have moved the scale variable into the right-hand Variable(s) box, first untick the Display frequency tables option. Next, click the Statistics button. This will bring up the Frequency Statistics dialogue box, where it is possible to choose a number of descriptive measures.

Frequencies statistics dialogue box in SPSS

 

Once you have ticked the descriptive measures you want, click the Continue button, and then click the OK button in the Frequencies dialogue box to carry out the analysis.

 


The Result

The table of statistics is displayed in the SPSS Output Viewer. It is fairly self-explanatory, displaying all the descriptive measures that you selected.
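For readers who want to see what these measures amount to outside SPSS, here is a minimal Python sketch using only the standard library. The `weight_kg` values and the `describe` helper are made up for illustration (they are not the guide's dataset), and the standard deviation and variance shown are the sample versions (n - 1 denominator).

```python
import statistics

# Hypothetical weights in kg (illustrative values, not the guide's dataset).
weight_kg = [62.0, 70.5, 68.0, 81.2, 75.4, 90.1, 66.3, 73.8]

def describe(values):
    """Return a dict of common descriptive measures for a scale variable."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "std_dev": statistics.stdev(values),      # sample SD (n - 1 denominator)
        "variance": statistics.variance(values),  # sample variance
        "minimum": min(values),
        "maximum": max(values),
        "range": max(values) - min(values),
    }

summary = describe(weight_kg)
for name, value in summary.items():
    print(f"{name:>8}: {value:.2f}")
```

Each key in `summary` corresponds to one tick box you might choose in the Frequencies: Statistics dialogue box.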

Table of descriptive statistics in SPSS

 


Further Study

Happiness... you should now be able to calculate descriptive statistics in SPSS. However, if you want to explore further, here are two sites:


AA Team Guide for Frequency Tables

A frequency table will display the count and percentage for each level (group) in a categorical variable. We will use the Frequencies menu option. To start the analysis, click on Analyze > Descriptive Statistics > Frequencies.

Frequencies menu path in SPSS

 

This will bring up the Frequencies dialogue box. Move the categorical variable (nominal or ordinal) for which you wish to create the frequency table into the Variable(s) box. You can drag and drop the categorical variable, or first select it and then click the arrow button in the centre of the dialogue box.

Frequencies dialogue box in SPSS

 

Once you have moved the categorical variable into the right-hand Variable(s) box, be sure the Display frequency tables option is ticked. Next, click the OK button to create the table.

 


The Result

The frequency table is displayed in the SPSS Output Viewer. It is fairly self-explanatory, displaying the count (frequency) and the percentage for each level (group) within the categorical variable that you selected (below are two examples).
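The arithmetic behind a frequency table is simple enough to sketch in a few lines of Python. This is only an illustration: the `gender` values and the `frequency_table` helper are hypothetical, and the sketch shows just the count and percentage columns.

```python
from collections import Counter

# Hypothetical categorical data (illustrative, not the guide's dataset).
gender = ["Male", "Female", "Female", "Male", "Female", "Female", "Male", "Female"]

def frequency_table(values):
    """Return (level, count, percent) rows, one per level, sorted by level name."""
    counts = Counter(values)
    total = len(values)
    return [(level, count, 100.0 * count / total)
            for level, count in sorted(counts.items())]

table = frequency_table(gender)
for level, count, percent in table:
    print(f"{level:<8} {count:>3} {percent:6.1f}%")
```

The percentages always sum to 100 because every value belongs to exactly one level.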

Frequency table results in SPSS

 


Further Study

Happiness... you should now be able to complete frequency tables in SPSS. However, if you want to explore further, here are two sites:


AA Team Guide for Charts

There are a number of excellent charts in SPSS to give visual interpretation to your data. We will look at four key charts as a starting reference, but you should be able to develop more charts as a follow-on from this guide.

  1. Histogram
  2. Bar
  3. Boxplot
  4. Scatter/Dot

For all the charts in this guide, we will use the Chart Builder. To start, click on Graphs > Chart Builder.

Menu path for the Chart Builder in SPSS

This will open the Chart Builder dialogue box, and I have labelled 6 areas to help navigate through the Chart Builder:

  1. list of variables
  2. chart construction tabs
  3. list of chart types
  4. variants for each chart type
  5. preview / sandbox area to construct the chart
  6. expand button for properties side panel

The Chart Builder dialogue box in SPSS

 


1) Histogram

After opening the Chart Builder, select Histogram from the (3) list of chart types, and choose the first variant from the (4) list of variants. Next, drag the scale variable you want to chart onto the X-axis placard in the (5) preview / sandbox area (in my example I used Weight_kg). Finally, open the (6) properties side panel and tick the Display normal curve option. Click the OK button when finished.

The Chart Builder dialogue box for a Histogram in SPSS

 


2) Bar

After opening the Chart Builder, select Bar from the (3) list of chart types, and choose the first variant from the (4) list of variants. Next, drag the categorical variable you want to chart onto the X-axis placard in the (5) preview / sandbox area (in my example I used Gender). Then drag the scale variable you want to chart onto the Y-axis placard in the (5) preview / sandbox area (in my example I used Weight_kg). Finally, open the (6) properties side panel, tick the Display error bars option, and select the type of error bars -- Confidence Intervals, or Standard Error (with 1 as the multiplier), or Standard Deviation (with 1 as the multiplier). Click the OK button when finished.
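To make the error bar options more concrete, the sketch below computes the bar height (the group mean) and the ends of a Standard Deviation or Standard Error bar for each group. The data and the `bar_with_error` helper are hypothetical. Confidence Interval bars are left out because they need a t critical value, which the Python standard library does not provide.

```python
import math
import statistics

# Hypothetical weights (kg) per group; illustrative values only.
weights_by_gender = {
    "Female": [58.0, 62.0, 66.0, 70.0],
    "Male": [72.0, 78.0, 84.0, 90.0],
}

def bar_with_error(values, kind="sd", multiplier=1.0):
    """Return (mean, lower, upper) for a bar with SD or SE error bars."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation
    if kind == "sd":
        half_width = multiplier * sd
    elif kind == "se":
        half_width = multiplier * sd / math.sqrt(len(values))
    else:
        raise ValueError("kind must be 'sd' or 'se'")
    return mean, mean - half_width, mean + half_width

female_sd = bar_with_error(weights_by_gender["Female"], kind="sd")
female_se = bar_with_error(weights_by_gender["Female"], kind="se")
for group, values in weights_by_gender.items():
    mean, lower, upper = bar_with_error(values, kind="sd")
    print(f"{group}: mean={mean:.1f}, SD bar from {lower:.1f} to {upper:.1f}")
```

A Standard Error bar is always shorter than the Standard Deviation bar for the same group, because the standard error divides the standard deviation by the square root of the group size.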

The Chart Builder dialogue box for a Bar chart in SPSS

 


3) Boxplot

After opening the Chart Builder, select Boxplot from the (3) list of chart types, and choose the first variant from the (4) list of variants. Next, drag the categorical variable you want to chart onto the X-axis placard in the (5) preview / sandbox area (in my example I used Gender). Then drag the scale variable you want to chart onto the Y-axis placard in the (5) preview / sandbox area (in my example I used Weight_kg). Click the OK button when finished.

The Chart Builder dialogue box for a Boxplot in SPSS

 


4) Scatter/Dot

After opening the Chart Builder, select Scatter/Dot from the (3) list of chart types, and choose the first variant from the (4) list of variants. Next, drag the first scale variable you want to chart onto the X-axis placard in the (5) preview / sandbox area (in my example I used Weight_kg). Then drag the second scale variable you want to chart onto the Y-axis placard in the (5) preview / sandbox area (in my example I used Muscle_kg). Finally, open the (6) properties side panel, tick the Linear Fit Lines option, and select Total as the type of line. Click the OK button when finished.
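A linear fit line of this kind is normally an ordinary least-squares line through all the points. The sketch below shows that calculation on made-up data. The `linear_fit` helper and the values are assumptions for illustration, and the example makes no claim about how SPSS computes the line internally.

```python
def linear_fit(x, y):
    """Ordinary least-squares slope and intercept for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sum of squares around the means.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data lying exactly on a line (illustrative only).
weight_kg = [70.0, 75.0, 80.0, 85.0, 90.0]
muscle_kg = [30.0, 32.5, 35.0, 37.5, 40.0]

slope, intercept = linear_fit(weight_kg, muscle_kg)
print(f"Muscle_kg = {intercept:.2f} + {slope:.2f} * Weight_kg")
```

With real data the points scatter around the line rather than sitting on it, and the slope tells you how much the Y variable changes on average per unit of the X variable.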

 


5) Chart Editor

After you create any chart in SPSS, it will appear in the SPSS Output Viewer. If you double-click on the chart, the Chart Editor will open. It has menus and quick tools to change the text formatting, rescale the X-axis and Y-axis, add data labels, add trendlines, and much more. When finished, close the Chart Editor and the changes will update on the original chart.

The Chart Editor in SPSS

 


Further Study

Happiness... you should now be able to create charts in SPSS. However, if you want to explore further, here are two sites:


AA Team Guide for Parametric Assumptions

There are a number of parametric assumptions that are requirements for certain statistical tests in SPSS. We will look at four key assumptions as the starting requirement for the majority of these tests.

  1. Scale Variable
  2. Normal Distribution
  3. Outliers
  4. Homogeneity of Variance

 


1) Scale Variable

The variable must be a scale measurement type. You are not concerned with parametric assumptions for variables that are nominal or ordinal measurement types.

List of variables in SPSS

 

A scale variable (interval or ratio) measures quantity, and every unit of measure is an equal division. Equal divisions mean that 4 feet is twice as long as 2 feet and that 10 minutes is five times as long as 2 minutes.

Ruler as example of equal intervals

 


2) Normal Distribution

The normal distribution, also known as the Gaussian distribution, is a probability function that describes how the values of a variable are spread out. It is a symmetric distribution showing that data near the mean are more frequent in occurrence and the probabilities for values further away from the mean taper off equally in both directions. In a graph, normal distribution will appear as a bell-shaped curve.

Bell curve of normal distribution

 

You can examine a scale variable for normal distribution either with a histogram (as above) or with a Q-Q plot (not shown). You can test for normal distribution with the Kolmogorov-Smirnov test or the Shapiro-Wilk test. To start the analysis, click on Analyze > Descriptive Statistics > Explore.

Explore menu path in SPSS

 

This will open the Explore dialogue box. Move the scale variable to be tested into the right-hand Dependent List: box. [As a side note: you can put a categorical variable in the Factor List: box if you want to split the dependent list variable in order to test each group separately.] Next, in the Display section (at the bottom), select the Plots radio button. Finally, click the Plots button (on the far right side).

Explore dialogue box in SPSS

 

This will open the Explore: Plots dialogue box. Tick the Normality plots with tests option. There are other options you may (or may not) want to tick. When finished click the Continue button, and then the OK button in the original Explore dialogue box.

Explore: Plots dialogue box in SPSS

 


2A) The Result for Normal Distribution

The result will appear in the SPSS Output Viewer. The Kolmogorov-Smirnov test and the Shapiro-Wilk test results appear in the Tests of Normality statistics table.

Test of normality result table in SPSS

 

Most often they will agree. However (as is the case in our example), the Kolmogorov-Smirnov test (p = .042) is below the conventional .05 significance level, so the data failed normal distribution, while the Shapiro-Wilk test (p = .090) is above it, so the data passed normal distribution. When they do not agree, most researchers will select the Shapiro-Wilk result. It is a more robust test, it does not have the Lilliefors correction applied, and it manages small sample sizes better.
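To give a feel for what the Kolmogorov-Smirnov test measures, the sketch below computes its test statistic D: the largest gap between the data's empirical cumulative distribution and a normal curve fitted with the sample mean and standard deviation. This is a rough illustration only. It does not compute a p-value, it does not implement Shapiro-Wilk, the `ks_statistic` helper is hypothetical, and the data are synthetic.

```python
import statistics
from statistics import NormalDist

def ks_statistic(values):
    """Largest gap between the empirical CDF and a fitted normal CDF."""
    data = sorted(values)
    n = len(data)
    fitted = NormalDist(statistics.mean(data), statistics.stdev(data))
    d = 0.0
    for i, x in enumerate(data, start=1):
        cdf = fitted.cdf(x)
        # Compare the fitted curve with the step just after and just before x.
        d = max(d, i / n - cdf, cdf - (i - 1) / n)
    return d

# Synthetic data (an assumption for illustration): 20 evenly spaced normal quantiles.
bell_shaped = [NormalDist().inv_cdf((i - 0.5) / 20) for i in range(1, 21)]
# The same data with one far-away value appended.
with_outlier = bell_shaped + [10.0]

d_bell = ks_statistic(bell_shaped)
d_outlier = ks_statistic(with_outlier)
print(f"D without outlier: {d_bell:.3f}")
print(f"D with outlier:    {d_outlier:.3f}")
```

A larger D means the data sit further from the fitted bell curve. In this synthetic case a single far-away value is enough to raise D noticeably.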

There are also a variety of charts in the result -- Q-Q plot, Stem & Leaf, Histogram (if you ticked this option), and Boxplot. All of these provide good visual evidence of normal (or non-normal) distribution, serving as a visual check on the normality test result.

 


3) Outliers

Another important property within parametric assumptions involves outliers -- you do not want to have many of these. There are several ways to detect these little monsters in the data with charts, such as Stem & Leaf, Histogram, Q-Q Plots, and Boxplots. Below is a Q-Q plot and a Boxplot of the Muscle_kg data from the earlier Explore result (the outliers are underlined in green).

Q-Q plot in SPSS

 

Boxplot in SPSS

 

There are three outliers in the data, and one is an extreme outlier (marked as an asterisk symbol in the boxplot). With three of these little monsters in the data, you can understand better why the two normality tests are disagreeing. And you can also understand why I call them 'monsters'. In this case, as already stated, you would accept the Shapiro-Wilk result and consider the data as having normal distribution.
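The usual boxplot convention can be sketched in code: a value more than 1.5 box lengths (interquartile ranges) beyond the edge of the box counts as an outlier, and a value more than 3 box lengths beyond it counts as an extreme outlier. The data and the `boxplot_outliers` helper below are hypothetical, and the quartile method used here may differ slightly from the one SPSS uses for its boxplots, so borderline values could be classified differently.

```python
import statistics

def boxplot_outliers(values):
    """Split values lying outside the box into mild and extreme outliers."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1  # box length
    mild, extreme = [], []
    for v in sorted(values):
        distance = max(q1 - v, v - q3)  # distance beyond the box (<= 0 if inside)
        if distance > 3 * iqr:
            extreme.append(v)
        elif distance > 1.5 * iqr:
            mild.append(v)
    return mild, extreme

# Hypothetical muscle mass values in kg (illustrative, not the guide's data).
muscle_kg = [28, 29, 30, 30, 31, 31, 32, 32, 33, 34, 35, 42, 60]

mild, extreme = boxplot_outliers(muscle_kg)
print("Outliers:", mild)
print("Extreme outliers:", extreme)
```

In this made-up sample the box runs from 30 to 34, so 42 lands in the outlier band and 60 lands beyond the extreme threshold.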

If you scroll back up to the Q-Q plot, and imagine the three outliers not there, you can see that the rest of the data (40 out of 43 values, which is 93%) has a fairly good distribution around the line of fit. Again, this may help explain why these two tests of normality are contradicting each other. It seems the Kolmogorov-Smirnov test is more influenced by the outliers (thus failing normal distribution), while the Shapiro-Wilk test gives more weight to the 93% majority (thus passing normal distribution).

 


4) Homogeneity of Variance

The final property we want to add into the mix of parametric assumptions is homogeneity of variance. This means a scale variable should have fairly equal variance when split into the respective levels within a categorical variable. For example, the males' data for Muscle_kg should have a similar variance to the females' data for Muscle_kg. I have used a Bar chart (see below) with standard deviation error bars as a good visual check for homogeneity of variance.

Bar chart with standard deviation error bars in SPSS

 

You can see the two error bars (black whiskers) are not exactly equal, so initially you might think the two groups (male and female) do not have homogeneity of variance. However, you can allow for a certain amount of discrepancy from exact equality and still not violate homogeneity of variance.

The error bar for the female data is about 40% longer than the error bar for the male data. A difference of this size is allowable; in fact, it is not until one error bar is double (200%) or even triple (300%) the length of the other that the property of homogeneity of variance is violated... amazing!
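The same visual comparison can be written as a small calculation: take the standard deviation of each group and divide the larger by the smaller. The sketch below is an informal check, not a formal test of equal variances. The data and the `sd_ratio` helper are hypothetical, and the `max_ratio` threshold of 2.0 simply encodes this guide's rule of thumb.

```python
import statistics

def sd_ratio(group_a, group_b):
    """Ratio of the larger sample standard deviation to the smaller one."""
    sd_a = statistics.stdev(group_a)
    sd_b = statistics.stdev(group_b)
    return max(sd_a, sd_b) / min(sd_a, sd_b)

# Hypothetical Muscle_kg values per group (illustrative only).
male_muscle_kg = [30.0, 32.0, 34.0, 36.0]
female_muscle_kg = [24.0, 27.0, 30.0, 33.0]

max_ratio = 2.0  # rule-of-thumb threshold taken from the guide's text (an assumption)
ratio = sd_ratio(male_muscle_kg, female_muscle_kg)
roughly_equal = ratio < max_ratio
print(f"SD ratio: {ratio:.2f} -> spread roughly equal: {roughly_equal}")
```

Here the longer error bar would be about 50% longer than the shorter one, which stays under the doubling threshold.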

Homogeneity of variance must also be checked when testing two scale variables against each other. In this case a Scatter/Dot chart can be used as a good visual check.

Scatter/Dot chart for homogeneity of variance in SPSS

 

We can see that across the Weight variable (70kg - 75kg - 80kg - 85kg - 90kg) the Muscle_kg values are spread within a fairly parallel band. The exception is at 100kg; however, there are only two values out that far, which is a small percentage (4.5%) of all the values. Therefore it is fairly reasonable to say these two variables have homogeneity of variance.

 


Review

A quick review of our top four parametric assumptions:

  • a scale (interval or ratio) measurement
  • normal distribution
  • few (or no significant) outliers
  • homogeneity of variance

 


Further Study

Happiness... you should now be able to test for parametric assumptions in SPSS. However, if you want to explore further, here are two sites: