SAS Data Sets: Subsetting, Variables, and Multiple Sets - Prof. James Davenport, Study notes of Statistics

Virginia Commonwealth University (VCU)Statistics

Prof. James Davenport

How to access and modify existing SAS data sets: creating subsets of observations and variables, keeping or dropping specific variables, and creating multiple data sets within a single Data Step. The notes also discuss the differences between using the DROP= and KEEP= options in the Set and Data statements.

Typology: Study notes

Pre 2010

Uploaded on 02/12/2009

Let’s assume a permanent SAS data set exists in the SAS data library called "asdl" (libref name), and the data set is called "one".

We already know how to create and examine the contents of such a data set, but how do you access and modify/alter an existing data set?

    data two;
    set asdl.one;   /* This produces sequential processing of the observations in one. */

There are no input or cards statements!!!

While creating data two we can:

. create new variables via transformations
. choose selected observations
. choose selected variables

And we can create additional permanent SAS data sets within a single Data Step.

(See Program SAS_ModifyVariables_census_data5.sas)
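
As a rough illustration of the three possibilities listed above, here is a minimal sketch of such a Data Step. It is not taken from the referenced program: the library path and the variables x and y are hypothetical stand-ins, assumed to be numeric variables in asdl.one.

```sas
/* Hypothetical path: point the libref at your own library folder. */
libname asdl 'C:\mydata';

data two;
   set asdl.one;       /* read the observations of asdl.one one at a time      */
   total = x + y;      /* new variable via a transformation (x, y hypothetical) */
   if x > 0;           /* subsetting IF: keep only selected observations        */
   keep x y total;     /* keep only selected variables                          */
run;
```

Because the data set name "two" has no libref, it is written to the temporary WORK library; writing it as asdl.two would make it permanent.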

The following examples use the data set called “origin”.
(See Program SAS_Create_origin_.sas)
(See Program SAS_ProcContents_origin.sas)

Let’s now focus on accessing a permanent SAS data set and forming new SAS data sets that are subsets of the original data set.
(Go over the “sub-setting” diagram)

***** Selecting subsets of observations: *****

firstobs=n & obs=m

e.g.
    data asdl.origin_subsets1;
    set asdl.origin (firstobs=7);          /* first obs kept */
or
    data asdl.origin_subsets3;
    set asdl.origin (obs=10);              /* last obs kept */
or
    data asdl.origin_subsets2;
    set asdl.origin (firstobs=4 obs=10);   /* obs= is the last observation read, not the number of observations */

(See program: SAS_Subset_obs_origin.sas)

We could also use "IF statements" (see example).
(See program: SAS_Subset_if_origin.sas)
We will discuss this in more detail later in the semester.
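
To make the point about obs= concrete, the following sketch builds a small temporary data set and subsets it. The data set names work.demo, work.sub_obs, and work.sub_if and the variable i are hypothetical and are not from the referenced programs.

```sas
/* Hypothetical demo data: 12 observations with i = 1, 2, ..., 12. */
data work.demo;
   do i = 1 to 12;
      output;
   end;
run;

/* Start reading at observation 4 and stop after observation 10. */
data work.sub_obs;
   set work.demo (firstobs=4 obs=10);
run;

/* The same subset selected with a subsetting IF on the value of i. */
data work.sub_if;
   set work.demo;
   if 4 <= i <= 10;
run;
```

Both steps keep observations 4 through 10, which is 10 - 4 + 1 = 7 observations, not 10. The difference is that firstobs= and obs= select by position in the input data set, while the IF statement selects by the values of a variable.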

■ Creating two Data Sets in a single Data Step

    data asdl.origin_first (keep=name a b c d .... )     /* no semicolon here */
         asdl.origin_second (keep=name a d e f .... );
    set asdl.origin;

In this case, you must use the KEEP= data set option. If you use the KEEP Statement, then all data sets created in this Data Step would contain the same variables.

(See Program SAS_Two_Subsets_origin.sas)

Differences in DROP= & KEEP= as options in the Data Statement vs the Set Statement

■ Using these options in the Set statement determines which variables are read from the permanent SAS data set being used as input; hence they determine how the program data vector is built. (Excluded variables are never read into the program data vector at all.)

■ Using these options in the Data statement determines which variables are written from the program data vector to the resulting SAS data set.
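
A self-contained sketch of creating two data sets in one Data Step is shown here with made-up data. The names work.origin_demo, work.first_demo, and work.second_demo and the data values are hypothetical; only the pattern follows the notes.

```sas
/* Hypothetical input data with one character and four numeric variables. */
data work.origin_demo;
   input name $ a b c d;
   datalines;
Ann 1 2 3 4
Bob 5 6 7 8
;
run;

/* One Data Step, two output data sets, each with its own KEEP= list.
   The semicolon comes only after the last data set name. */
data work.first_demo  (keep=name a b)
     work.second_demo (keep=name c d);
   set work.origin_demo;
run;
```

Each output data set receives every observation but only its own list of variables. Replacing the two KEEP= options with a single KEEP statement inside the step would apply the same variable list to both data sets.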

■ You can use a variable from an input data set (the one read with the Set statement) to perform a calculation, and it must be in the program data vector in order for you to use it. But if you do not want the variable to appear in the resulting data set, then you can use the DROP= option in the Data statement to exclude it when the program data vector is written to the new data set.

    data asdl.first (keep=name a b c d)
         asdl.second (keep=name a d e f);
    set asdl.origin (drop=id);

This is an example of these options used in both statements. The variable “id” is NEVER read into the program data vector.

■ In the Set Statement, this controls which variables are read into the program data vector.

■ In the Data Statement, this controls which variables are written from the program data vector to the data set.

NOTE: Using a DROP or KEEP Statement within a Data Step is comparable to using DROP= or KEEP= options in the Data Statement. All variables are included in the program data vector; they are excluded when the observation is written.
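
The case described in the first bullet, a variable that is needed for a calculation but not wanted in the output, might look like the sketch below. The data set names, the variable total, and the data values are hypothetical.

```sas
/* Hypothetical input data. */
data work.scores_demo;
   input id name $ a b;
   datalines;
1 Ann 10 20
2 Bob 30 40
;
run;

data work.totals_demo (drop=b);      /* b is excluded only when the observation is written */
   set work.scores_demo (drop=id);   /* id is never read into the program data vector      */
   total = a + b;                    /* b is still available here for the calculation      */
run;
```

The resulting data set contains name, a, and total. Had b been dropped in the Set statement instead, it would not be in the program data vector when total is computed, so total would come out missing.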
