CS130 Lecture 5

SPSS

SPSS is a statistical analysis program that allows:

Goals for this section of the course include:

Note: This is not a statistics course such as Math 207. We will only concentrate on basic statistical concepts.

Examining the Help utility within SPSS

SPSS has a very nice help utility as part of the application. Let's briefly examine this utility before diving into SPSS.

Creating a Simple Dataset

Let's go to the Tutorial section entitled "Using the Data Editor" and discuss Data View versus Variable View.

Data View: __________________________________________________________

____________________________________________________________________

____________________________________________________________________

Variable View: _______________________________________________________

____________________________________________________________________

____________________________________________________________________

5.1 Problem

  1. Create the variables needed for the following dataset. They will all be numeric for starters.
Brand     Name                    ServingPerPkg  OzPerPkg  Calories  TotalFatInGrams  SatFatInGrams
M&M/Mars  Snickers Peanut Butter  1.0            2.00      310       20.0             7.0
Hershey   Cookies 'n Mint         1.0            1.55      230       12.0             6.0
Hershey   Cadbury Dairy Milk      3.5            5.00      220       12.0             8.0
M&M/Mars  Snickers                3.0            3.70      170       8.0              3.0
Charms    Sugar Daddy             1.0            1.70      200       2.5              2.5

Note: Variable names must begin with a letter and cannot contain spaces or other illegal characters. Let's use the following convention for variable names: 1) the name begins with a letter and 2) the name contains only letters, numbers, underscores, or periods.

  2. Switch to Data View and look at your variables.
  3. Going back to Variable View, change the type of Name to String and set the Decimals column to 0, 0, 1, 2, 0, 1, 1 (in variable order).
  4. In the Values column, create the Value Labels for Brand, where 1 = "M&M/Mars", 2 = "Hershey", and 3 = "Charms".
  5. Enter the data into the correct SPSS cells in Data View. As you enter the data, you will notice that some of it cannot be entered correctly. You will need to switch to Variable View to fix these problems.
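
To see what the Value Labels for Brand accomplish, here is a minimal Python sketch (not SPSS itself) of the same idea: Brand is stored as a numeric code and a separate lookup maps each code to its label. The names BRAND_LABELS, CANDY, and brand_label are made up for this illustration.

```python
# Value labels for Brand: numeric code -> label, as in the Values column.
BRAND_LABELS = {1: "M&M/Mars", 2: "Hershey", 3: "Charms"}

# One tuple per case (row), in variable order:
# Brand, Name, ServingPerPkg, OzPerPkg, Calories, TotalFatInGrams, SatFatInGrams
CANDY = [
    (1, "Snickers Peanut Butter", 1.0, 2.00, 310, 20.0, 7.0),
    (2, "Cookies 'n Mint", 1.0, 1.55, 230, 12.0, 6.0),
    (2, "Cadbury Dairy Milk", 3.5, 5.00, 220, 12.0, 8.0),
    (1, "Snickers", 3.0, 3.70, 170, 8.0, 3.0),
    (3, "Sugar Daddy", 1.0, 1.70, 200, 2.5, 2.5),
]


def brand_label(code):
    """Return the label for a Brand code (hypothetical helper)."""
    return BRAND_LABELS[code]


# Show each case with its stored code and the label a reader would see.
for brand, name, *_ in CANDY:
    print(f"{brand} = {brand_label(brand):<10} {name}")
```

The data stays numeric, which is why Brand can remain a numeric variable while Name has to become a String.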

Summary Statistics

In the tutorial, let's go to "Examining Summary Statistics for Individual Variables"

SPSS contains the following data types:

Question: For the data in Problem 5.1, what is the type of data for each of the variables?

Different summary measures are used for the different data types:

5.2 Problem

For the previous problem, do the following:

Go to "Analyze -> Descriptive Statistics -> Frequencies" and display the appropriate statistics for each of the variables.
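
As a way to check the numbers SPSS reports, here is a small Python sketch that computes a frequency count for Brand and a few common summary measures for Calories from the Problem 5.1 data. Which statistics are "appropriate" for each variable is the point of the exercise; the choice below (counts for a categorical variable; mean, median, and sample standard deviation for a numeric one) is one reasonable assumption, not the official answer.

```python
from collections import Counter
from statistics import mean, median, stdev

# Brand and Calories columns from the Problem 5.1 dataset, in row order.
brand = ["M&M/Mars", "Hershey", "Hershey", "M&M/Mars", "Charms"]
calories = [310, 230, 220, 170, 200]

# Categorical variable: a frequency table is the natural summary.
brand_counts = Counter(brand)

# Numeric variable: centre and spread.
calorie_mean = mean(calories)
calorie_median = median(calories)
calorie_sd = stdev(calories)  # sample standard deviation (n - 1 divisor)

for name, count in brand_counts.items():
    print(f"{name:<10} {count}")
print(f"mean={calorie_mean} median={calorie_median} sd={calorie_sd:.2f}")
```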

Types of Data Analysis

When doing data analysis, we are interested in two types of summaries:

  1. Statistical Summaries (e.g. descriptive, hypothesis testing)
  2. Visual Summaries (e.g. tables, graphs)

Statistics is sometimes broken up into two different areas:

  1. Descriptive Statistics - describing a situation through the collection, organization, summarization, and presentation of data.
  2. Inferential Statistics - drawing inferences about a population from samples (e.g. smokers who smoke a pack of cigarettes per day have higher cholesterol). This area leads into hypothesis testing.

5.3 Problem

A paint manufacturer tested two experimental brands of paint over a period of months to determine how long they would last without fading. Here are the results:

Brand A  Brand B
10       25
20       35
60       40
40       45
50       35
30       30
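
The handout does not say which statistics to compute for the two brands, so the following Python sketch is only one plausible comparison: a measure of centre (mean, median) and a measure of spread (range, sample standard deviation) for each brand. It assumes the values are months without fading; the variable and function names are made up for this example.

```python
from statistics import mean, median, stdev

# Results from the table above (assumed to be months without fading).
brand_a = [10, 20, 60, 40, 50, 30]
brand_b = [25, 35, 40, 45, 35, 30]


def summarize(values):
    """Return centre and spread measures for one brand (hypothetical helper)."""
    return {
        "mean": mean(values),
        "median": median(values),
        "range": max(values) - min(values),
        "sd": stdev(values),  # sample standard deviation (n - 1 divisor)
    }


summary_a = summarize(brand_a)
summary_b = summarize(brand_b)

for label, summary in (("Brand A", summary_a), ("Brand B", summary_b)):
    print(label, {key: round(value, 2) for key, value in summary.items()})
```

Comparing the centre and the spread side by side shows why a single summary number is not enough to decide between the brands.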

5.4 Problem

  1. To make sure we stay fresh with Excel for the final, here's a little problem. Generate 100 random integers (i.e. numbers without any decimal places) between 1 and 20. Beside each number, output "EVEN" or "ODD". Save this file as random.xls.
  2. Import this data into SPSS and create a Histogram and Pie Chart of the dataset.
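
The logic of step 1 can also be sketched in Python, as a stand-in for the Excel work rather than a replacement for it. The sketch assumes "between 1 and 20" includes both endpoints; the function names and the seed value are arbitrary choices for this example.

```python
import random


def even_or_odd(n):
    """Label an integer as "EVEN" or "ODD"."""
    return "EVEN" if n % 2 == 0 else "ODD"


def make_rows(count=100, low=1, high=20, seed=None):
    """Return (number, label) pairs; low and high are both included."""
    rng = random.Random(seed)
    numbers = (rng.randint(low, high) for _ in range(count))
    return [(n, even_or_odd(n)) for n in numbers]


# A fixed seed makes the run repeatable; omit it for fresh numbers each time.
rows = make_rows(seed=130)
for number, label in rows[:5]:
    print(number, label)
```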