Comparing SAS Data Sets using PROC COMPARE

Instructions on how to enter SAS data sets for comparison using PROC COMPARE. The handout also explains how to identify and correct errors in data sets by comparing them element-wise, with SAS code examples and output.

Proc Compare – Stat 480
James D. Abbey
January 19, 2006
1. Enter your SAS data sets for comparison. Ideally, the data sets will match
   entirely. In this exercise, data1 will be correct while data2 will have flaws.
   a. SAS Code for Data Entry:
      i. Data data1; specifies that you are entering data into SAS and
         names the data set “data1”.
      ii. infile 'C:\Temp\data1.txt'; points to an external file on your
          hard drive. You could also have used a “datalines;” statement to
          enter the data directly into SAS.
      iii. Input var1 1 var2 2…; specifies which variables to input. The
           number following each variable name is a column position, which
           tells SAS to look for that variable's data in column 1, 2, etc.
           1. Other input options are a space, tab, comma, etc.
              a. A space requires no special syntax. A tab generally needs
                 an infile option such as dlm='09'x or expandtabs before
                 SAS treats it like a space. Use of a comma will be
                 covered later.
      iv. Run; tells SAS to execute the commands above.
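The statements in step 1.a can be put together as in the sketch below. It is only an illustration under stated assumptions: the handout elides the variable list after var2, so var3 and var4 are assumed names (the error list in 1.b mentions four columns), and the file name data2.txt is hypothetical.

```sas
/* Read data1 with column input, following step 1.a. */
data data1;
    infile 'C:\Temp\data1.txt';   /* path taken from the handout */
    /* var3 and var4 are assumed; the handout shows only var1 and var2 */
    input var1 1 var2 2 var3 3 var4 4;
run;

/* Assumed companion step for data2; data2.txt is a hypothetical file name. */
data data2;
    infile 'C:\Temp\data2.txt';
    input var1 1 var2 2 var3 3 var4 4;
run;
```

Both steps use the same input statement so that the two data sets share variable names, which is how PROC COMPARE pairs up variables by default.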
   b. A view of our sample data:
      i. Data1: Data2:
      ii. As you can see, the errors occur in the following observations
          (lines):
          1. Observation 2 swaps the second and third columns
          2. Observation 3 swaps the first and second columns
          3. Observation 5 swaps the third and fourth columns
          4. Observation 7 is missing the first piece of data
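To make the four kinds of error concrete, here is a small sketch that enters both data sets with datalines. The values are invented for illustration and are not the handout's data; the four single-digit variables are also an assumption. In data2, observation 2 swaps columns 2 and 3 (2345 becomes 2435), observation 3 swaps columns 1 and 2 (3456 becomes 4356), observation 5 swaps columns 3 and 4 (5678 becomes 5687), and observation 7 has a period, which SAS reads as a missing numeric value, in column 1.

```sas
/* Hypothetical values only; the handout's actual listings are not reproduced. */
data data1;   /* treated as the correct data set */
    input var1 1 var2 2 var3 3 var4 4;
    datalines;
1234
2345
3456
4567
5678
6789
7891
8912
;
run;

data data2;   /* contains the four kinds of error listed above */
    input var1 1 var2 2 var3 3 var4 4;
    datalines;
1234
2435
4356
4567
5687
6789
.891
8912
;
run;
```

The data lines start in column 1 because column input reads by position. "Missing the first piece of data" is interpreted here as a missing value in column 1 rather than a line shifted left, which is an assumption.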
2. Using PROC COMPARE
   a. SAS syntax:
      i. PROC COMPARE can compare data sets in many ways. In our example,
         we will compare them element-wise (value by value).
      ii. base = data1 compare = data2 informs SAS that we are
          comparing data1 to data2.
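A minimal sketch of the call described in 2.a, assuming data1 and data2 have already been created as in step 1:

```sas
/* Compare data2 against data1, which is treated as the reference (base). */
proc compare base=data1 compare=data2;
run;
```

With no ID statement, PROC COMPARE matches observations by position (first with first, second with second, and so on) and matches variables by name.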

   b. SAS output:
      i. First, we receive a variable error output summary:
      ii. Second, we receive a more detailed breakdown by variable and
          observation:
          1. Example of analysis: “Observation 2 shows an error in the
             entry of variable 2. Upon analysis of the data sets, I found
             that variables two and three were swapped in the data file.”
          2. Note: Often, you will not be able to determine which data set
             is flawed unless you revisit the raw data.