Observations to Multiple SAS Data Sets - Lecture Notes | STAT 321, Study notes of Statistics

Material Type: Notes; Professor: Davenport; Class: INTRO TO STATISTICAL COMPUTING; Subject: Statistics; University: Virginia Commonwealth University; Term: Unknown 1989;

Uploaded on 02/12/2009

Writing Observations to Multiple SAS Data Sets

The SAS System allows you to create multiple SAS data sets in a
single DATA step. The basic tool is the OUTPUT statement.
The basic syntax is OUTPUT <SAS-data-set-name>;

- If you use an OUTPUT statement without specifying a data
  set name, SAS writes the observation to every data set
  named in the current DATA statement.
- If you want to write to a specific data set, then it must be
  named in the OUTPUT statement.
- Any data set named in an OUTPUT statement MUST be
  listed in the DATA statement.

Suppose you want to output two data sets: one with guide =
'Lucas' and one with all other guides.
(See program SAS_Output1_arts.sas)
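
The referenced program is not reproduced in these notes, so the following is only a minimal sketch of the idea. The input data set arts, its sample values, and the data set name lucastour are assumptions made for illustration; only the variable guide and the data set othrtour are named in the notes.

```sas
/* Hypothetical sample data; only the variable guide comes from the notes. */
data arts;
   input city $ nights guide $;
   datalines;
Rome 4 Lucas
Paris 5 Dupont
London 3 Lucas
Madrid 6 Garcia
;
run;

/* Both output data sets must be named in the DATA statement. */
data lucastour othrtour;
   set arts;
   if guide = 'Lucas' then output lucastour;  /* tours guided by Lucas */
   else output othrtour;                      /* every other guide     */
run;
```

Each observation is written to exactly one of the two data sets because the IF-THEN/ELSE names a specific data set in each OUTPUT statement.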

Note that when you create more than one data set in a DATA
statement, the last one listed is the most recently created data
set and hence is the default data set for the procedures that
follow (in this case, othrtour). To use another data set in a
procedure, you must name it with the DATA=SAS-data-set-name option.
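
As a small illustration of the DATA= option, the sketch below prints each of the two data sets. The sample values and the name lucastour are assumptions; the statement about which data set is used by default simply follows the notes above.

```sas
/* Hypothetical data read directly into the two output data sets. */
data lucastour othrtour;
   input city $ guide $;
   if guide = 'Lucas' then output lucastour;
   else output othrtour;
   datalines;
Rome Lucas
Paris Dupont
London Lucas
;
run;

/* DATA= names the data set explicitly. */
proc print data=lucastour;
run;

/* No DATA= option: the procedure uses the most recently created
   data set, which according to the notes is othrtour here. */
proc print;
run;
```

Naming the data set with DATA= in every procedure step is a common habit because it removes any dependence on the order of earlier steps.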

Using an OUTPUT statement suppresses the automatic output of
observations at the end of a DATA step. Therefore, if you plan
to use an OUTPUT statement in a DATA step, then you must
program ALL output for that step with OUTPUT statements.
(See program SAS_Output2_arts.sas)
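
A minimal sketch of this rule follows. The data set names tours, longtour, and alltours, the variables, and the cutoff of 7 nights are all hypothetical and are not taken from the referenced program.

```sas
/* Hypothetical sample data. */
data tours;
   input country $ nights;
   datalines;
Japan 8
Greece 12
Brazil 5
;
run;

/* Because an OUTPUT statement appears in this step, SAS no longer
   writes observations automatically at the end of each iteration. */
data longtour alltours;
   set tours;
   if nights > 7 then output longtour;  /* only the long tours */
   output alltours;  /* explicit; without it alltours would be empty */
run;
```

Here longtour receives the two tours longer than 7 nights, and alltours receives all three observations only because of the second OUTPUT statement.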

Understanding the OUTPUT Statement

An OUTPUT statement tells the SAS System to write the observation at the moment the OUTPUT statement is processed, NOT at the end of the DATA step. This can cause problems if you are not careful.
(See program SAS_Output3_arts.sas)

The problem with the example is that the assignment statement that computes the variable “days” is misplaced in the programming stream.
(See program SAS_Output4_arts.sas)

After the SAS System processes an OUTPUT statement, the observation remains in the program data vector, so you can continue to program with that observation. You can even output it again, to the same SAS data set or to a different one.
(See program SAS_Output5_arts.sas)
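
The referenced programs are not shown here, so the sketch below only illustrates the kind of problem described. The data set names, the sample values, and the formula days = nights + 1 are assumptions; only the variable name days comes from the notes.

```sas
/* Hypothetical sample data. */
data tours;
   input country $ nights;
   datalines;
Japan 8
Greece 12
;
run;

/* Misplaced assignment: the observation is written before days is
   computed, so days is missing in every observation of wrong. */
data wrong;
   set tours;
   output;
   days = nights + 1;
run;

/* Corrected order: compute days first, then write the observation. */
data right;
   set tours;
   days = nights + 1;
   output;
run;

/* The observation stays in the program data vector after OUTPUT,
   so it can be changed and written a second time. */
data twice;
   set tours;
   days = nights + 1;
   output;
   days = days + 1;  /* hypothetical change before the second write */
   output;
run;
```

The data set twice ends up with two observations for each input observation, differing only in the value of days.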

If the values of the variable consist only of letters, then the sorting is done alphabetically (in ascending order by default). If you omit the OUT=newone option, the sorted version of the data set keeps the name old.one and replaces the original. The SORT procedure writes a message to the SAS log confirming that the sort was executed.
(See program SAS_Sort1_tourtypes.sas)

Grouping BY More Than One Variable

The first variable is sorted; then, within each value of the first, the second is sorted; then, within those, the third is sorted; and so on.
(See program SAS_Sort2_tourtypes.sas)

Arranging in Descending Order

proc sort data=old.one out=newone;
   by descending tourtype vendor landcost;
run;
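
One detail worth making explicit is that the DESCENDING keyword applies only to the variable that immediately follows it. The sketch below uses a hypothetical WORK data set named tourtypes with made-up values; the variable names tourtype, vendor, and landcost come from the notes.

```sas
/* Hypothetical sample data. */
data tourtypes;
   length tourtype $ 12 vendor $ 8;
   input tourtype $ vendor $ landcost;
   datalines;
scenery World 748
architecture Express 1101
scenery Hispana 1390
architecture World 982
;
run;

/* tourtype descending; vendor and landcost still ascending. */
proc sort data=tourtypes out=mixed;
   by descending tourtype vendor landcost;
run;

/* Every variable descending: repeat the keyword before each one. */
proc sort data=tourtypes out=alldesc;
   by descending tourtype descending vendor descending landcost;
run;
```

Both steps use OUT=, so the original tourtypes data set is left in its unsorted order.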

Finding the First or Last Observation in a Group

Suppose you want to create a data set containing the least expensive tour that features architecture and the least expensive tour that features scenery. How do you do this without first displaying the data set and seeing which observations to select?

- First sort by TOURTYPE and LANDCOST.
- When you use a BY statement in a DATA step, the SAS System automatically creates two additional “hidden” variables for each variable in the BY statement:

  1. FIRST.variable (“variable” is your variable name)
  2. LAST.variable

Their values are either one or zero (one if true and zero if false): FIRST.variable is one on the first observation of a BY group, and LAST.variable is one on the last. They exist in the program data vector and are available for use in program statements. However, FIRST. and LAST. variables are NOT written to the SAS output data sets.
(See program SAS_Sort3_tourtypes.sas)
(See program SAS_Sort4_tourtypes.sas)

Sorting Data with the SORT Procedure

Sometimes it is more important to work with the actual sorted observations instead of just the grouped data. You may also want to delete any duplicate observations, if they exist.
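
Sorting is also the step that makes the FIRST. and LAST. variables described above usable. As a minimal sketch of the least-expensive-tour problem, the code below sorts by TOURTYPE and LANDCOST and then keeps the first observation in each TOURTYPE group. The data set names tourtypes, sorted, and cheapest and the sample values are hypothetical; the variable names come from the notes.

```sas
/* Hypothetical sample data. */
data tourtypes;
   length tourtype $ 12 vendor $ 8;
   input tourtype $ vendor $ landcost;
   datalines;
scenery World 748
architecture Express 1101
scenery Hispana 1390
architecture World 982
;
run;

/* Within each tourtype, the cheapest tour comes first. */
proc sort data=tourtypes out=sorted;
   by tourtype landcost;
run;

/* FIRST.tourtype is 1 only on the first observation of each
   tourtype group, so the subsetting IF keeps one row per group. */
data cheapest;
   set sorted;
   by tourtype;
   if first.tourtype;
run;

proc print data=cheapest;
run;
```

With this sample data, cheapest contains one architecture tour (landcost 982) and one scenery tour (landcost 748). The FIRST.tourtype variable itself does not appear in the output data set.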