SAS Programming Fundamentals, Exams of Nursing

Covers a wide range of SAS programming concepts and techniques, including statistics, PROC MEANS, PROC FREQ, ODS, the DATA step, the BY statement, data set options, string manipulation functions, and more. Useful for university-level courses related to data analysis, statistics, and programming.

Typology: Exams

2024/2025

Available from 09/21/2024

rosze-macharia

SAS Exam 2
Proc means: Which variables are analyzed by default? - Answer - PROC MEANS analyzes every numeric variable in the SAS data set. Missing values are excluded when the statistics are calculated.

Proc means: Which default statistics are produced? - Answer - N (the count of nonmissing values), mean, standard deviation, minimum, and maximum.

Proc means: How to choose specific statistics and the number of decimal places - Answer - To request specific summary statistics, add their keywords as options in the PROC MEANS statement. To limit the report to k decimal places, use MAXDEC=k in the PROC MEANS statement.

Proc means: How to choose variables to be analyzed - Answer - The VAR statement chooses the variables to be processed by PROC MEANS. Place it after the PROC MEANS statement.

Proc means: The CLASS statement and BY statement - Answer - The CLASS statement in the MEANS procedure groups the observations of the SAS data set for analysis. Unlike the BY statement, the CLASS statement does not require the data to be sorted first. Place it after the PROC MEANS statement.
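
The PROC MEANS answers above can be tried with a short sketch like the following. It assumes the SASHELP.CLASS sample data set that ships with SAS (variables Sex, Height, Weight); the choice of statistics and the name work.class_sorted are only illustrative.

```sas
/* Assumption: sashelp.class is available (standard SAS sample library). */

/* Statistics keywords and MAXDEC= go on the PROC MEANS statement. */
/* CLASS groups the analysis and needs no prior sort.              */
proc means data=sashelp.class n mean median maxdec=1;
   class sex;
   var height weight;     /* analyze only these numeric variables */
run;

/* The BY statement gives grouped output too, but requires sorted data. */
proc sort data=sashelp.class out=work.class_sorted;
   by sex;
run;

proc means data=work.class_sorted mean maxdec=1;
   by sex;
   var height weight;
run;
```

With CLASS the groups appear in one table; with BY each group gets its own table.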

Proc means: Creating an output data set - Answer - Include an OUTPUT statement in the PROC MEANS step with the following syntax:
   OUTPUT OUT=sas-data-set MEAN=Av_var1 Av_var2 MEDIAN=Med_var1 Med_var2;

Proc freq: Which variables are analyzed by default? - Answer - PROC FREQ creates a one-way frequency table for every variable in the SAS data set (numeric and character) and displays each distinct data value.

Proc freq: Choosing variables to be analyzed with the TABLES statement - Answer - Use the TABLES statement to limit the variables for which PROC FREQ produces frequency reports. Typically you want such reports on variables that have a limited number of distinct values. Place TABLES after the PROC FREQ statement.

Proc freq: Effect of format in proc freq - Answer - PROC FREQ groups observations by formatted value, so a format created with PROC FORMAT can be used to redefine the categories of the variables in the TABLES statement.

Proc freq: Two-way tables - Answer - A two-way, or crosstabular, frequency report analyzes all possible combinations of the distinct values of two variables.

Proc freq: Additional syntax for the TABLES statement (slide 36) - Answer - Each request on the left is equivalent to the request on the right:
   tables A*(B C);        tables A*B A*C;
   tables (A B)*(C D);    tables A*C B*C A*D B*D;
   tables (A B C)*D;      tables A*D B*D C*D;
   tables A -- C;         tables A B C;
   tables (A -- C)*D;     tables A*D B*D C*D;

Proc freq: NOFREQ, NOPERCENT, NOROW, NOCOL options in the TABLES statement - Answer -
   NOFREQ: suppresses the cell frequency
   NOPERCENT: suppresses the cell percentage
   NOROW: suppresses the row percentage
   NOCOL: suppresses the column percentage
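
As a rough illustration of the OUTPUT statement and the TABLES options described above, the sketch below again assumes the SASHELP.CLASS sample data set. The format name agegrp and the output data set work.class_stats are hypothetical names chosen for this example.

```sas
/* Assumption: sashelp.class is available (standard SAS sample library). */

/* Write the statistics to a data set instead of a report (NOPRINT). */
proc means data=sashelp.class noprint;
   var height weight;
   output out=work.class_stats
          mean=Av_height Av_weight
          median=Med_height Med_weight;
run;

/* Hypothetical format that collapses ages into two categories. */
proc format;
   value agegrp low-12  = '12 and under'
                13-high = '13 and over';
run;

proc freq data=sashelp.class;
   tables sex;                               /* one-way table */
   tables sex*age / nopercent norow nocol;   /* two-way table, frequencies only */
   format age agegrp.;                       /* PROC FREQ groups by formatted value */
run;
```

Because of the FORMAT statement, the two-way table has two age columns rather than one column per distinct age.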

Sum statement and accumulator variables: By default an accumulator variable is initialized as 0 - Answer - Sum statement: Variable + Expression; The sum statement adds the result of the expression to the numeric variable, which is known as an accumulator variable. At the beginning of the DATA step this variable is not set to missing. It starts with a value of 0 and retains its new value in the program data vector (PDV) from one iteration to the next.

Sum statement and accumulator variables: Initializing an accumulator variable with a RETAIN statement - Answer - The RETAIN statement assigns an initial value to a retained variable and prevents the variable from being reinitialized each time the DATA step executes.
   Syntax: RETAIN variable initial-value;
RETAIN can be used to give a variable whose value is assigned by a sum statement an initial value other than the default of 0.

IF-THEN ELSE: Numeric constants and IF statements - Answer - A numeric value used as a condition is true if it is nonzero and nonmissing, and false if it is 0 or missing.

IF-THEN ELSE: DO groups - Answer -
   IF condition THEN DO;
      statement; statement; ...
   END;
   ELSE IF condition THEN DO;
      statement; statement;
   END;

How variable lengths are assigned - Answer - By default, a numeric variable has a length of 8 bytes. A character variable is assigned the length of its first occurrence in the DATA step.
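
The sum statement, RETAIN, DO groups, and length rules above can be seen together in a small DATA step. This is only a sketch: the data set name work.running, the variables, and the three scores are made up for illustration.

```sas
data work.running;
   length grade $ 9;      /* declared first so 'Excellent' is not truncated */
   retain total 100;      /* accumulator starts at 100 instead of the default 0 */
   input score;

   total + score;         /* sum statement: total is retained across iterations */
   n_obs + 1;             /* no RETAIN needed: starts at 0 and is retained */

   if score >= 90 then do;
      grade = 'Excellent';
      bonus = 1;
   end;
   else if score >= 70 then do;
      grade = 'Good';
      bonus = 0;
   end;
   else do;
      grade = 'Low';
      bonus = 0;
   end;
   datalines;
95
72
50
;
run;

proc print data=work.running;
run;
```

With these three scores, total takes the values 195, 267, and 317, and n_obs counts 1, 2, 3. Without the LENGTH statement, grade would still get length 9 here because 'Excellent' happens to be its first assignment in the code, but relying on statement order is fragile.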

Length statement - Answer - LENGTH variable-name <$> length-specification...; Because the length of a character variable is determined by the first character value encountered, the LENGTH statement must appear before the statement that contains that first occurrence.

Difference between WHERE and IF for subsetting data - Answer - A WHERE condition can only use variables that exist in an input SAS data set. When merging data sets, a WHERE condition is applied BEFORE the merge, while a subsetting IF is applied AFTER it. Only WHERE can be used inside a PROC step (there is no IF condition in a PROC step). WHERE is usually used to select observations based on variables that already exist.

DROP= and KEEP= data set options. Know the difference when used in SET vs. DATA statements - Answer -
The DROP= data set option in the SET statement is applied to the input data set. The listed variables are not read, so they are excluded both from processing and from the output SAS data sets.
The DROP= data set option in the DATA statement is applied to the output data set. SAS does not write these variables to the output data set, but all variables are available for processing.
The KEEP= data set option in the SET statement is applied to the input data set. Only the variables listed in KEEP= are read for processing and all others are excluded.
The KEEP= data set option in the DATA statement is applied to the output data set. SAS writes only the variables listed in KEEP= to the output data set, but all variables are available for processing.
Controlling output with KEEP= and DROP=: when you create multiple output data sets, you can use the DROP= and KEEP= data set options to write different variables to different data sets.
Controlling variable input with KEEP= and DROP=: the DROP= and KEEP= data set options can apply to both input and output SAS data sets.

Drop and keep statements - Answer - To drop variables that are read or created during the DATA step, use a DROP statement. Variables dropped with a DROP statement are read into the PDV but are not output to the new SAS data set. They are available for processing during the DATA step.
To keep variables that are read or created during the DATA step, use a KEEP statement. Variables that are not listed in a KEEP statement are read into the PDV but are not output to the new SAS data set. They are available for processing during the DATA step.
By default, SAS writes all variables from every input data set to every output data set.
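
A minimal sketch of the WHERE versus IF distinction and of where DROP= and KEEP= take effect, assuming the SASHELP.CLASS sample data set. The output data set names and the derived variable ratio are hypothetical.

```sas
/* Assumption: sashelp.class is available (standard SAS sample library). */
data work.teens (drop=ratio);      /* DROP= on DATA: ratio is usable below, but not written */
   set sashelp.class (keep=name age height weight);   /* KEEP= on SET: only these are read */
   where age >= 13;                /* WHERE: age already exists in the input data set */
   ratio = weight / height;
   if ratio > 1.6;                 /* subsetting IF: ratio is created in this step */
   drop age;                       /* DROP statement: age stays in the PDV, is not output */
run;

/* Different variables written to different output data sets. */
data work.boys  (keep=name height)
     work.girls (keep=name weight);
   set sashelp.class;
   if sex = 'M' then output work.boys;
   else output work.girls;
run;
```

Moving the ratio test into the WHERE statement would fail, because ratio does not exist in the input data set.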

WHEN (when3-exp) statement3; OTHERWISE statement4; END; If when1-exp is true, statement1 is executed and selection stops. If when1-exp is false, when2-exp is evaluated. If when2-exp is true, statement2 is executed and selection stops, and so on. If every when-exp is false, the OTHERWISE statement is executed.

By statement in data step - Answer -
   DATA pilotn (drop=state);
      SET mylib.pilots (drop=id city homephone);
      BY Jobcode;
   RUN;
NOTE: The BY Jobcode statement requires that mylib.pilots be sorted by Jobcode. You can use PROC SORT for this purpose. When you use BY Jobcode with the SET statement above, SAS creates two temporary variables in the PDV:
   FIRST.Jobcode: has a value of 1 when the first observation in the BY group is read, and 0 otherwise.
   LAST.Jobcode: has a value of 1 when the last observation in the BY group is read, and 0 otherwise.

Using by statement and accumulator variable in data step to get subtotals - Answer - Now you can sum the annual payroll by job code for each manager. Here, the payroll for only two managers (Coxe and Delgado) is listed.
   proc print data=company.budget2 noobs;
      by manager;
      var jobtype;
      sum payroll;
      where manager in ('Coxe','Delgado');
      format payroll dollar12.2;
      title 'Payroll sum by Job Type and Manager';
   run;

Output statement - Answer - By default, an observation is written to the output data set at the end of each iteration of the DATA step, when the RUN statement is reached during the execution phase. This is called implicit output. In many situations we need to output observations explicitly, using an OUTPUT statement:
   OUTPUT <SAS-dataset(s)>;
This statement forces SAS to write the current observation to the data sets listed or, if none are listed, to all data sets named in the DATA statement. The data sets listed must appear in the DATA statement. NOTE: Once an OUTPUT statement is used in a DATA step, SAS no longer performs implicit output in that step.

Point= option to read specific observations with output and stop statements - Answer - By default, observations are read from the input data set sequentially. With the POINT= option in the SET statement you can read specific observations directly. General syntax:
   DATA output-SAS-dataset;
      varname = n;
      SET sas-data-set POINT=varname;
      /* varname is a temporary variable that must be assigned before the SET statement; n is a positive integer */
      OUTPUT;
      STOP;
   RUN;
POINT= must be set to a temporary variable and cannot be set directly to a positive integer. The STOP statement exits the DATA step after the observation specified by POINT= has been read. It is needed because direct access means the end-of-file marker is never reached, so without STOP the DATA step would read the data set in an infinite loop.
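
The BY-group, explicit OUTPUT, and POINT= notes above might be combined as in the sketch below. It assumes the SASHELP.CLASS sample data set; the output data set names and the variables total_weight, n_students, and obsnum are hypothetical.

```sas
/* Assumption: sashelp.class is available (standard SAS sample library). */
proc sort data=sashelp.class out=work.class_sorted;
   by sex;                         /* BY processing needs sorted input */
run;

/* One subtotal row per BY group, using FIRST./LAST. and accumulators. */
data work.weight_by_sex (keep=sex total_weight n_students);
   set work.class_sorted;
   by sex;
   if first.sex then do;           /* reset the accumulators at the start of each group */
      total_weight = 0;
      n_students   = 0;
   end;
   total_weight + weight;          /* sum statements */
   n_students + 1;
   if last.sex then output;        /* explicit output: only the last row of each group */
run;

/* Direct access: read only observation 5. */
data work.fifth_obs;
   obsnum = 5;                     /* temporary POINT= variable, set before SET */
   set sashelp.class point=obsnum;
   output;
   stop;                           /* no end-of-file with POINT=, so stop explicitly */
run;
```

The POINT= variable obsnum is not written to work.fifth_obs, and because an OUTPUT statement appears in the first step, no implicit output occurs there.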

If the data sets contain variables with the same name, these variables MUST have the same data type (numeric or character); otherwise SAS stops processing the DATA step and generates an error message. SAS takes the length attribute of the variable from the FIRST data set listed in the SET statement. The same is true for the label, format, and informat attributes.

Interleaving SAS data sets - Answer - If you use a BY statement when you concatenate data sets, the result is interleaving. Interleaving intersperses observations from two or more data sets, based on one or more common variables. The new data set includes all the variables from all the input data sets, and it contains the total number of observations from all input data sets. NOTE: Each input data set must be sorted on the BY variable(s). You can use PROC SORT for this purpose.

Rename= option - Answer - In the previous example, the variables Jcode and Jobcode indicate the same characteristic but have different names in the two data sets, so RENAME= is used to rename Jcode as Jobcode. You can use a RENAME= option in the SET statement to change the name of a variable so that its values are read into a particular slot in the PDV. General form of the RENAME= data set option:
   SAS-data-set(RENAME=(old-name-1=new-name-1 old-name-2=new-name-2 ... old-name-n=new-name-n))
RENAME= can be used to rename variables of a SAS data set in a SET statement, a MERGE statement, or a DATA statement.

Proc append - Answer - Basic general syntax:
   PROC APPEND BASE=SAS-data-set DATA=SAS-data-set2;
   RUN;
This form requires that both data sets have "matching" variables (see slide 35). General syntax for appending when the variables don't "match":
   PROC APPEND BASE=SAS-data-set DATA=SAS-data-set2 FORCE;
   RUN;
Only two data sets can be used at a time: a BASE= data set and a DATA= data set. The observations of the DATA= data set are appended to the BASE= data set. No new SAS data set is created, and the observations of the BASE= data set are not read. The variable information in the descriptor portion of the BASE= data set does not change. APPEND is a PROC step, not a DATA step, but it does not create a report; it only appends observations to the base data set.
Note: Before appending, you might create a copy to use as the BASE= data set so that your original data set is not altered.
Without the FORCE option, the variables in the BASE= and DATA= data sets must "match up" (see next slide). If they do not match and FORCE is not specified, the APPEND procedure generates errors and stops processing, and the BASE= data set is not altered. The result of PROC APPEND contains all of, and only, the variables that are in the BASE= data set.

Force option - Answer - When the variables don't "match up" we must use the FORCE option. When using the FORCE option:
   If a data value in the DATA= data set is longer than the length of the corresponding variable in the BASE= data set, the value is truncated.
   If there are variables in the DATA= data set that are not in the BASE= data set, they are dropped.
   If a variable in the DATA= data set has a different type from the variable with the same name in the BASE= data set, the values of that variable are set to missing.
Note that non-matching labels, formats, or informats do not require the FORCE option; the label, format, or informat from the BASE= data set is used.

Match merging - Answer - Match merging combines observations from two or more data sets into single observations in a new data set, according to the values of a common variable.
General form of a DATA step match-merge: DATA SAS-data-set; MERGE SAS-data-sets;
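
The concatenation, interleaving, RENAME=, and PROC APPEND notes above can be tried on two tiny data sets. Everything in this sketch (the data set names work.crew_a and work.crew_b, the names, and the job codes) is invented for illustration; only the variable names Jcode and Jobcode come from the notes.

```sas
/* Hypothetical input data sets: the same characteristic under two names. */
data work.crew_a;
   input Jobcode $ Name $;
   datalines;
FA1 Adams
PT1 Baker
;
run;

data work.crew_b;
   input Jcode $ Name $;
   datalines;
FA2 Clark
PT1 Davis
;
run;

/* Concatenate: all of crew_a, then all of crew_b, with Jcode read into Jobcode. */
data work.crew_all;
   set work.crew_a
       work.crew_b (rename=(Jcode=Jobcode));
run;

/* Interleave: same SET statement plus BY; both inputs must be sorted. */
proc sort data=work.crew_a; by Jobcode; run;
proc sort data=work.crew_b; by Jcode;   run;

data work.crew_interleaved;
   set work.crew_a
       work.crew_b (rename=(Jcode=Jobcode));
   by Jobcode;
run;

/* Append to a copy so the original base data set is left untouched. */
data work.base_copy;
   set work.crew_a;
run;

proc append base=work.base_copy
            data=work.crew_b (rename=(Jcode=Jobcode));
run;
```

Each result has four observations. After the rename the variables match, so FORCE is not needed in the PROC APPEND step.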

Mean and sum functions - Answer - Function-name(OF variable-list);
Example: MEAN(OF Var1 - Var4) is the same as MEAN(Var1, Var2, Var3, Var4); it computes the mean of Var1 to Var4. MEAN(Var1 - Var4) does not compute the mean of Var1 to Var4; instead, it computes the mean of the single value Var1 MINUS Var4.
The target variable is the variable to which the result of a SAS function is assigned. For example, in Avg_score = MEAN(OF Quiz1 - Quiz5); Avg_score is the target variable.
Some useful syntax for computing sample statistics using the SUM function:
   SUM(x1, x2, x3, x4);
   SUM(OF x1 - x4);
   SUM(OF x -- y);
   SUM(y, z, OF x1 - x4);
   SUM(4, 24, 10, 6);
NOTE: SUM(x1 - x4); computes x1 MINUS x4.
NOTE: Missing values are ignored in the computation.

Converting character to numeric with input function - Answer - INPUT(source, informat);
Source is the character variable, constant, or expression to be converted. Informat tells SAS how to convert the character value into a numeric value. A "special" informat is needed if the character variable, constant, or expression is not in the form of a standard numeric data value (see the next slide for a reminder of what standard numeric data values are). For example, assume payment is a character variable and a typical value for payment is $4,624.75.

In this case you might use the informat dollar9.2 with the INPUT function to read the values of the variable payment and convert them to standard numeric values: Num_Pay = INPUT(payment, dollar9.2);

Automatic conversion of character variables - Answer - When a numeric operation has a character variable as an operand, SAS attempts to convert the character variable into a temporary numeric variable for computational purposes. For example, assume you have the following assignment statement in a DATA step: Salary = payrate*hours; If payrate is a character variable and hours is a numeric variable, SAS identifies this mismatch and attempts to create a temporary numeric variable to hold each converted value of payrate, then uses the converted values in the computation. This conversion is only possible if the values of payrate are in standard numeric form (values such as 23.45, not $23.45). If the conversion is not possible, the value of payrate is treated as missing in the computation and Salary is assigned a missing value. The character values of payrate are NOT replaced by numeric values. It is good practice to always do the conversion explicitly with INPUT or another function.

Converting numeric to character using put function - Answer - The PUT function performs numeric-to-character conversion: PUT(source, format); Source is the numeric variable to be converted to character. Format specifies how the source value is written into a character string. The format MUST be a numeric format. A numeric format right-aligns the converted character string.

Concatenation operator - Answer - We want to concatenate such values to get (for this observation and others) values such as: RB109, BSU 47304. We use the assignment statement Com_Address = address || ZIP;
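
The function and conversion notes above can be checked in a single DATA step. This is a sketch with made-up values: the quiz scores, the numeric variable zip_num, and the data set name work.convert are hypothetical, while payment, Num_Pay, address, ZIP, and Com_Address follow the names used in the notes.

```sas
data work.convert;
   /* Variable lists in functions */
   Quiz1 = 80; Quiz2 = 90; Quiz3 = .; Quiz4 = 70;
   Avg_score  = mean(of Quiz1-Quiz4);   /* 80: the missing Quiz3 is ignored */
   Diff_score = mean(Quiz1-Quiz4);      /* 10: mean of the single value 80 - 70 */
   Total      = sum(of Quiz1-Quiz4);    /* 240 */

   /* Character to numeric with INPUT */
   payment = '$4,624.75';
   Num_Pay = input(payment, dollar9.2); /* numeric 4624.75 */

   /* Numeric to character with PUT, then concatenation */
   zip_num = 47304;
   ZIP     = put(zip_num, 5.);          /* character '47304' */
   address = 'RB109, BSU';
   Com_Address = address || ' ' || ZIP; /* 'RB109, BSU 47304' */
run;

proc print data=work.convert;
run;
```

The explicit blank in the concatenation is an addition for this example; address || ZIP alone would join the two values with no space between them unless address carries trailing blanks.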

Interval: can be DAY, WEEKDAY, WEEK, TENDAY, SEMIMONTH, MONTH, QTR, SEMIYEAR, YEAR, DTMONTH, DTWEEK, HOUR, MINUTE, SECOND.
Start-from: specifies the starting SAS date, time, or datetime value.
Increment: specifies a negative integer (back into the past) or a positive integer (into the future).
Alignment: forces the returned date to be aligned to the beginning ('b'), middle ('m'), or end ('e') of the time interval. The default is the beginning.
The type of interval must match the type of value in start-from (date, time, or datetime).

Scan, substr, trim, left, upcase, lowcase, propcase, catx, index and find functions - Answer -
   SCAN: Extract a specific word from a character string
   SUBSTR: Extract a substring or replace character values
   TRIM: Trim trailing blanks from character values
   LEFT: Left-align a character string that has leading blanks
   UPCASE: Convert character values to UPPER case
   LOWCASE: Convert character values to lower case
   PROPCASE: Convert character values to Proper case
   CATX: Concatenate strings, remove leading and trailing blanks, and insert separators
   INDEX: Search a character value for a specific string
   FIND: Search for a specific substring within a character string that the user specifies

Specifying do loops using 'start to stop by increment' - Answer -
   DO index-variable = start TO stop BY increment;
      SAS statements;
   END;
Index-variable stores a value during each iteration of the DO loop. The start, stop, and increment values are set upon entry to the DO loop and cannot be changed while the loop is processing. They can be numbers, variables, or SAS expressions. The BY clause is optional. If none is present, the increment is 1.

Increment may be negative, which processes the DO loop backwards. In this situation, start should be larger than stop. After the DO loop completes, the value of the index-variable is not the stop value but the first value that passes it, which is stop + increment when the loop lands exactly on stop.

Specifying do loops with lists - Answer - DO index-variable = item-1 <,...item-n>; The items item-1 through item-n can be either all numeric or all character constants, or they can be variables. The DO loop is executed once for each value in the list.

Specifying do loops with 'do while' and 'do until' - Answer - The DO WHILE statement executes the statements in a DO loop while a condition is true. The expression is evaluated at the top of the loop, so the statements in the loop never execute if the expression is initially false. The DO UNTIL statement executes the statements in a DO loop until a condition is true. The expression is evaluated at the bottom of the loop, so the statements in the loop are executed at least once.

Nested do loops - Answer - Nested DO loops are DO loops within DO loops. When you nest DO loops, use a different index variable for each loop, and be certain that each DO statement has a corresponding END statement.
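
The loop forms above can be compared in one DATA _NULL_ step that writes to the log. This is only a sketch; all variable names and values are arbitrary.

```sas
data _null_;
   /* Iterative DO: i takes the values 1, 5, 9 */
   do i = 1 to 10 by 4;
      put 'inside loop: ' i=;
   end;
   put 'after loop: ' i=;            /* i=13, the first value past the stop value */

   /* Negative increment: j takes the values 5, 3, 1 */
   do j = 5 to 1 by -2;
   end;
   put 'after loop: ' j=;            /* j=-1 */

   /* List of character constants */
   do month = 'JAN', 'FEB', 'MAR';
      put month=;
   end;

   /* DO WHILE tests at the top; DO UNTIL tests at the bottom */
   k = 10;
   do while (k < 10);                /* condition is false at the start: body never runs */
      k = k + 1;
   end;
   do until (k >= 10);               /* body runs once before the condition is checked */
      k = k + 1;
   end;
   put 'after both loops: ' k=;      /* k=11 */

   /* Nested loops: a separate index variable and END for each */
   do row = 1 to 2;
      do col = 1 to 3;
         put row= col=;
      end;
   end;
run;
```

The first loop shows why the final index value is not always stop + increment: the loop ends at 13, not 14, because it never lands exactly on 10.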