Statistical Programming - Examination | STA 402, Exams of Statistics

Miami University - Oxford Statistics

Prof. A. John Bailer

Material Type: Exam; Professor: Bailer; Class: Statistical Programming; Subject: Statistics; University: Miami University-Oxford; Term: Fall (First Sem) 2004;


Uploaded on 08/19/2009


Roberts Excel spreadsheet imported: CONTENTS

The CONTENTS Procedure

Week 07/08 [13+ Oct.] Class Activities

C:\Documents and Settings\John Bailer\My Documents\baileraj\

Classes\Fall 2004\sta402\handouts\week-07-08-13oct04.doc

based on:

C:\Documents and Settings\John Bailer\My Documents\baileraj\

Classes\Fall 2003\sta402\handouts\week7-08oct03.doc

&

C:\Documents and Settings\John Bailer\My Documents\baileraj\

Classes\Fall 2003\sta402\handouts\week8-15oct03.doc

SAS PROGRAMMING

* Arrays

* DO groups

* Statements: RETAIN, RENAME, LABEL, FORMAT, SUM

* Using formats in DATA steps

* Conditional execution

* More on missing values

Additional Ref: Cody, R. and Pass, R. (1995) SAS® Programming by Example. SAS

Institute Inc., Cary, NC. – Chapters 7 (“arrays”), 8 (“retain”), 5 (“SAS functions”)

ARRAYS

* look to use if writing the same set of code multiple times

* “arrays” can contain lists of variables

* “arrays” also good for restructuring data sets

Common example 1: Recoding a set of variables

/*

Suppose you have a data set “old_data” containing

Variables: a_var, b_var, var3, var4, var5


(all numeric with missing values coded as -999)

Recode -999 as missing (.)

*/

data old_data;
input a_var b_var var3 var4 var5 @@;
datalines;
1 2 3 4 5 6 7 -999 8 9 10 11 12 -999 14
;
run;

data recode_ex; set old_data;

array all[5] a_var b_var var3 var4 var5;

do ii=1 to 5;

if all[ii] = -999 then all[ii]=.;

end;

drop ii;

/* can use either [], {}, () to reference array elements */

options nocenter nodate;

proc print;

run;

Obs    a_var    b_var    var3    var4    var5
 1        1        2       3       4       5
 2        6        7       .       8       9
 3       10       11      12       .      14

/* alternative to get SAS to count array size &
   dimension of array */

data recode_ex2; set old_data;
array all{*} a_var b_var var3 var4 var5;
do ii=1 to dim(all);
if all{ii} = -999 then all{ii}=.;  * loop body repeated from recode_ex;
end;
drop ii;
run;

Recode 3: Using _NUMERIC_ to select elements

Obs    char_var    a_var    b_var    var3    var4    var5
 1     a              1        2       3       4       5
 2     b              6        7       .       8       9
 3     c             10       11      12       .      14
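The program that produced the "Recode 3" listing is not included in this copy. A minimal sketch of how _NUMERIC_ could be used to build the array is given below; the data set names old_data3 and recode_ex3 are hypothetical, and the data values are read off the listing above with -999 assumed wherever a missing value is printed.

```sas
/* Sketch only: old_data3 and recode_ex3 are hypothetical names. */
data old_data3;
  input char_var $ a_var b_var var3 var4 var5;
  datalines;
a 1 2 3 4 5
b 6 7 -999 8 9
c 10 11 12 -999 14
;
run;

data recode_ex3; set old_data3;
  array allnum{*} _numeric_;     /* every numeric variable defined so far */
  do ii = 1 to dim(allnum);
    if allnum{ii} = -999 then allnum{ii} = .;
  end;
  drop ii;
run;

proc print data=recode_ex3;
  title 'Recode 3: Using _NUMERIC_ to select elements';
run;
```

Because the ARRAY statement appears before ii is created, the loop counter is not part of the array, and char_var is skipped because it is a character variable.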

Common example 2: Creating multiple observations from a single observation

data one;
input x1 x2 x3 x4;
datalines;
60 62 64 68
80 84 90 98
;

data two; set one;

array xx[4] x1-x4;

do time=1 to 4;

x=xx[time];

output;

end;

drop x1-x4;

run;

proc print;

title 'Expand one record to multiple records';

run;

Expand one record to multiple records

Obs    time     x
 1       1     60
 2       2     62
 3       3     64
 4       4     68
 5       1     80
 6       2     84
 7       3     90
 8       4     98

Common example 3: Creating one observation from multiple observations

data multi;

input id time heart_rate;

datalines;

proc sort data=multi; by id time;

/*
data sorted by the variable "id"
FIRST.id = 1 if first occurrence of new by group variable
LAST.id = 1 if last occurrence of a by group variable
*/

data one; set multi;

by id;

array xx[4] x1-x4;

retain x1-x4; * values kept from previous observation;

if FIRST.id=1 then do ii=1 to 4;

xx[ii]=.; * elements initialized to missing;

end;

xx[time]=heart_rate;

if LAST.id=1 then output;

keep id x1-x4;

run;
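The data lines for "multi" are not shown in this copy, so the effect of resetting the array at FIRST.id is hard to see. The sketch below repeats the same steps on hypothetical heart-rate values (multi_demo and wide_demo are made-up names), with subject 2 deliberately missing times 2 and 4.

```sas
/* Sketch only: the values below are hypothetical, not the handout's data. */
data multi_demo;
  input id time heart_rate;
  datalines;
1 1 60
1 2 62
1 3 64
1 4 68
2 1 80
2 3 90
;
run;

proc sort data=multi_demo; by id time; run;

data wide_demo; set multi_demo;
  by id;
  array xx[4] x1-x4;
  retain x1-x4;                  /* keep values across observations */
  if first.id then do ii = 1 to 4;
    xx[ii] = .;                  /* clear values left from the previous id */
  end;
  xx[time] = heart_rate;         /* time selects the array element */
  if last.id then output;        /* one record per id */
  keep id x1-x4;
run;

proc print data=wide_demo; run;
```

With these values, subject 2 should end up with x2 and x4 missing. Without the reset at FIRST.id, the retained 62 and 68 from subject 1 would carry over into that record.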

title NITROFEN: t-test of ( 0 , 160 ) concentrations;

class conc;

var total;

run ;

/*

NITROFEN: t-test of (0, 160) concentrations

The TTEST Procedure

Statistics
                             Lower CL           Upper CL   Lower CL             Upper CL
Variable  conc          N        Mean    Mean       Mean    Std Dev   Std Dev    Std Dev   Std Err
total     0            10      28.827    31.4     33.973     2.4737    3.5963     6.5654   1.
total     160          10      26.612    28.3     29.988     1.6229    2.3594     4.3073   0.
total     Diff (1-2)           0.2424     3.1     5.9576     2.2981    3.0414     4.4977   1.

T-Tests
Variable  Method          Variances     DF    t Value   Pr > |t|
total     Pooled          Equal         18       2.28   0.
total     Satterthwaite   Unequal     15.5       2.28   0.

Equality of Variances
Variable  Method     Num DF   Den DF   F Value   Pr > F
total     Folded F        9        9      2.32   0.

*/

proc print ;

title NITROFEN: print of ( 0 , 160 ) concentrations;

var conc total;

run ;

NITROFEN: print of (0, 160) concentrations

Obs    conc    total
  1       0       27
  2       0       32
  3       0       34
  4       0       33
  5       0       36
  6       0       34
  7       0       33
  8       0       30
  9       0       24
 10       0       31
 11     160       29
 12     160       29
 13     160       23
 14     160       27
 15     160       30
 16     160       31
 17     160       30
 18     160       26
 19     160       29
 20     160       29

proc transpose data=test prefix=xx out=tran_out;

var total;

run;

data obs_test; set tran_out;
type = 'O';
run;

proc print data=obs_test;
title 'Randomization test: observed data';
run;

/*

Randomization test: observed data

Obs  _NAME_  xx1  xx2  xx3  xx4  xx5  xx6  xx7  xx8  xx9  xx10
 1   total    27   32   34   33   36   34   33   30   24    31

Obs  xx11  xx12  xx13  xx14  xx15  xx16  xx17  xx18  xx19  xx20  type
 1     29    29    23    27    30    31    30    26    29    29   O

*/

proc plan ;

factors test= 4000 ordered in= 20 ;

output out=d_permut;

run ;

proc transpose data=d_permut prefix=in out=out_permut(keep=in1-in20); by test;

run ;

proc print data=out_permut;

run ;

data _null_; set obs_test;

file 'D:\baileraj\Classes\Fall 2003\sta402\SAS-programs\week7-perm.data';

put type xx1-xx20;

run ;

data _null_; set out_permut;

type = 'P'; * permutation data;

file 'D:\baileraj\Classes\Fall 2003\sta402\SAS-programs\week7-perm.data'

mod; /* mod option adds lines to existing file */

put type in1-in20;

run ;

/* week7-perm.data ...

O 27 32 34 33 36 34 33 30 24 31 29 29 23 27 30 31 30 26 29 29

P 8 14 4 11 3 2 12 1 6 13 17 9 15 16 5 19 20 7 10 18

P 12 2 8 10 13 7 9 16 4 19 15 3 5 14 17 1 20 11 6 18

P 18 17 13 14 5 8 19 16 3 12 11 9 10 7 2 20 4 6 1 15

P 6 12 4 20 19 16 11 5 15 18 1 8 3 13 17 14 10 9 7 2

P 8 17 4 19 2 11 1 7 6 3 9 13 20 14 12 18 15 10 5 16

P 11 7 17 6 18 13 3 12 8 10 19 16 2 20 4 5 15 1 9 14

P 17 11 4 7 20 6 9 16 1 2 14 12 5 18 10 8 15 13 3 19

*/

data perm_data;

array both{ 20 } x1-x10 y1-y10; /* array for observed values */

array ins{ 20 } in1-in20; /* index array */

array perms{ 20 } xp1-xp10 yp1-yp10; /* array for permuted values */
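The rest of the perm_data step is cut off in this copy. As a rough illustration of what the three arrays make possible, the sketch below applies a single permutation to the observed values; it is an assumption about the general idea, not the author's program. The data set name perm_sketch and the variables obs_diff and perm_diff are hypothetical. The observed values are the "O" line and the index values are the first "P" line of week7-perm.data listed above.

```sas
/* Sketch only: one permutation, values hard-coded from the listing above. */
data perm_sketch;
  /* observed values: first 10 are conc=0, last 10 are conc=160 */
  array both{20} x1-x10 y1-y10
    (27 32 34 33 36 34 33 30 24 31 29 29 23 27 30 31 30 26 29 29);
  /* one permutation of the indices 1..20 */
  array ins{20} in1-in20
    (8 14 4 11 3 2 12 1 6 13 17 9 15 16 5 19 20 7 10 18);
  array perms{20} xp1-xp10 yp1-yp10;

  do ii = 1 to 20;
    perms{ii} = both{ins{ii}};   /* observed value at the permuted index */
  end;

  obs_diff  = mean(of x1-x10)   - mean(of y1-y10);    /* observed difference */
  perm_diff = mean(of xp1-xp10) - mean(of yp1-yp10);  /* permuted difference */
  keep obs_diff perm_diff;
run;

proc print data=perm_sketch; run;
```

Repeating the permuted difference over all of the "P" records and counting how often it is at least as extreme as obs_diff would give a randomization p-value.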


clustered patterns of response.

Problem:

4 trees were observed in a hypothetical square plot
are these trees clustered in this plot? regularly spaced?
how can you check?

Strategy:

1. Determine nearest-neighbor distances

2. Calculate the average NN distance

3. Generate a sample of observations that are randomly

distributed in the region of interest

4. Calculate the average NN distance for this set

5. Repeat steps 3 and 4 a large number of times

6. P-value is the proportion of generated samples

that were more extreme than observed
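The six steps above can be sketched in a single DATA step. This is a compact illustration under stated assumptions rather than the handout's program: the tree coordinates are hypothetical, the plot is taken to be the unit square, the seed 402 is arbitrary, and the names nn_sketch, nsim and p_cluster are made up for the example.

```sas
/* Sketch only: observed coordinates are hypothetical. */
data nn_sketch;
  array xobs{4} _temporary_ (0.10 0.15 0.20 0.12);
  array yobs{4} _temporary_ (0.10 0.12 0.18 0.20);
  array xsim{4} _temporary_;
  array ysim{4} _temporary_;

  /* Steps 1-2: observed average nearest-neighbor (NN) distance */
  sumnn = 0;
  do i = 1 to 4;
    nn = .;
    do j = 1 to 4;
      if j ne i then do;
        d = sqrt((xobs{i}-xobs{j})**2 + (yobs{i}-yobs{j})**2);
        nn = min(nn, d);         /* MIN ignores missing values */
      end;
    end;
    sumnn = sumnn + nn;
  end;
  avgnnobs = sumnn/4;

  /* Steps 3-5: average NN distance for many random plots */
  nsim = 1000;
  numle = 0;
  do isim = 1 to nsim;
    do i = 1 to 4;
      xsim{i} = ranuni(402);
      ysim{i} = ranuni(402);
    end;
    sumnn = 0;
    do i = 1 to 4;
      nn = .;
      do j = 1 to 4;
        if j ne i then do;
          d = sqrt((xsim{i}-xsim{j})**2 + (ysim{i}-ysim{j})**2);
          nn = min(nn, d);
        end;
      end;
      sumnn = sumnn + nn;
    end;
    if sumnn/4 <= avgnnobs then numle = numle + 1;
  end;

  /* Step 6: share of random plots at least as tightly packed as observed */
  p_cluster = numle/nsim;
  keep avgnnobs nsim numle p_cluster;
run;

proc print data=nn_sketch; run;
```

A small p_cluster points toward clustering. Counting simulated averages that are greater than or equal to the observed one instead would address regular spacing.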

History:

First examined this problem (12apr95) in

[-.classes.ies612]monte_spatial.sas (old VAX file)

options ls=74;

data plot1;

title plot1 assessment of pattern;

array xobs xobs1-xobs4;

array yobs yobs1-yobs4;

array nnobs nnobs1-nnobs4;

input xobs1-xobs4 yobs1-yobs4 @@;

/* Determine the observed NN distance and average */

sumnnobs = 0;

do i=1 to 4; * find NN distance for each point ;

nnobs(i) = 100; * initialize distances to be large;

do j=1 to 4; * compare the ith point to all others;

d=sqrt( (xobs(i)-xobs(j))**2 + (yobs(i)-yobs(j))**2 );

if (d<nnobs(i)) and (d>0) then nnobs(i)=d;

* output; * output if debugging desired;

end;


sumnnobs=sumnnobs+nnobs(i);

end;

avgnnobs = sumnnobs/4; * observed average NN distance;

datalines;

proc print;

data mccsr1; set plot1;

array xobs xobs1-xobs4;

array yobs yobs1-yobs4;

array xsim xsim1-xsim4;

array ysim ysim1-ysim4;

array nnobs nnobs1-nnobs4;

array nncsr nncsr1-nncsr4;

/* Generate a large number of CSR plots with 4 trees */

/* CSR = completely spatially random */

* initialize counters of nn avg dist le or ge than observed;

numle = 0; numge = 0;

do isim = 1 to 1000;

do ii = 1 to 4;

xsim(ii) = ranuni(0);

ysim(ii) = ranuni(0);

end;

/* Find NN distance for the simulated trees */

sumnncsr = 0;

do i=1 to 4;

nncsr(i) = 100; * initialize;

do j=1 to 4;

d=sqrt( (xsim(i)-xsim(j))**2 + (ysim(i)-ysim(j))**2 );

if (d<nncsr(i)) and (d>0) then nncsr(i)=d;

* output; * debugging;

end;


data retain_demo1;

input dobs time x;

retain subject 0 ;

if time= 1 then subject=subject+ 1 ;

datalines;
1 1 60
2 2 62
3 3 64
4 4 68
5 1 80
6 2 84
7 3 90
8 4 98
;

proc print ;

id dobs;

run ;

dobs    time     x    subject
  1       1     60       1
  2       2     62       1
  3       3     64       1
  4       4     68       1
  5       1     80       2
  6       2     84       2
  7       3     90       2
  8       4     98       2

data retain_demo2;

input dobs time x;

if time=1 then subject+ 1 ; * implicitly retains values for calculations;

datalines;

options nocenter;

proc print ;

title2 'implicitly retain with subject+1 statement';

id dobs;

run ;

implicitly retain with subject+1 statement

dobs    time     x    subject
  1       1     60       1
  2       2     62       1
  3       3     64       1
  4       4     68       1
  5       1     80       2
  6       2     84       2
  7       3     90       2


example: find the average weight by subject using

DATA step programming

/* STEP 1: read in the data file */

data diet;

input id @3 date mmddyy8. weight;

format date mmddyy8.;

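datalines;
1 10/01/92 155
1 10/08/92 158
1 10/15/92 158
1 10/22/92 158
2 09/02/92 200
2 09/09/92 198
2 09/16/92 196
2 09/23/92 202
;

proc print;

title 'diet data';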

run;

diet data

Obs    id    date        weight
 1      1    10/01/92      155
 2      1    10/08/92      158
 3      1    10/15/92      158
 4      1    10/22/92      158
 5      2    09/02/92      200
 6      2    09/09/92      198
 7      2    09/16/92      196
 8      2    09/23/92      202

data diet2; set diet;


STEPS 2 and 3 ALTERNATIVE:

Accumulate cumulative weight and average of measurements

And then extract the last measurement for each ID

data diet5; set diet;

retain total 0 count 0 ;

if id = lag(id) then do;

total=total+weight;

count+ 1 ;

wt_avg = total/count;

end;

else if id NE lag(id) then do;

total = weight;

count= 1 ;

wt_avg = total/count;

end;

proc print ;

run ;

Obs    id    date        weight    total    count    wt_avg
 1      1    10/01/92      155      155       1      155.
 2      1    10/08/92      158      313       2      156.
 3      1    10/15/92      158      471       3      157.
 4      1    10/22/92      158      629       4      157.
 5      2    09/02/92      200      200       1      200.
 6      2    09/09/92      198      398       2      199.
 7      2    09/16/92      196      594       3      198.
 8      2    09/23/92      202      796       4      199.

data diet6; set diet5; by id;

if LAST.id;

keep id wt_avg;

proc print;

run;

Obs    id    wt_avg
 1      1    157.
 2      2    199.
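The same per-subject averages can also be accumulated with BY-group processing instead of LAG. The sketch below is a swapped-in alternative, not the handout's Steps 2 and 3 (which are missing from this copy). It uses sum statements with FIRST.id and LAST.id; the data set names diet_demo and diet_avg are hypothetical, and the data lines are copied from the "diet data" listing.

```sas
/* Sketch only: diet_demo and diet_avg are hypothetical names. */
data diet_demo;
  input id @3 date mmddyy8. weight;
  format date mmddyy8.;
  datalines;
1 10/01/92 155
1 10/08/92 158
1 10/15/92 158
1 10/22/92 158
2 09/02/92 200
2 09/09/92 198
2 09/16/92 196
2 09/23/92 202
;
run;

data diet_avg; set diet_demo;
  by id;                         /* data must be sorted by id */
  if first.id then do;           /* reset accumulators for each subject */
    total = 0;
    count = 0;
  end;
  total + weight;                /* sum statements retain automatically */
  count + 1;
  if last.id then do;
    wt_avg = total/count;
    output;                      /* one record per subject */
  end;
  keep id wt_avg;
run;

proc print data=diet_avg; run;
```

For these data the totals are 629 and 796 over 4 records each, so wt_avg should be 157.25 for id 1 and 199 for id 2. Using FIRST.id avoids calling LAG inside conditional code, where each LAG call keeps its own queue and is updated only when it executes.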

example: find the total time enrolled for each participant

[motivated by an example where people may enroll/

disenroll in a program during different quarters]


options formdlim="-";

data test;

input id xstart xstop;

datalines;

proc print;

run;

data test2; set test; by id;

array start{9} start1-start9;

array stop{9} stop1-stop9;

array times{9} times1-times9;

retain count 0;

retain start1-start9 stop1-stop9 times1-times9;

if FIRST.id=1 then do; * initialize count and arrays with new ID;

count = 0;

do ii=1 to 9;

start{ii} = .;

stop{ii} = .;

times{ii} = .;

end;

end; * closes the FIRST.id DO group;

count = count + 1;

start{count} = xstart;

stop{count} = xstop;

times{count} = xstop - xstart;

if LAST.id=1 then output; * output results if last obs for ID;

drop xstart xstop ii;

run;

data test3; set test2;

total_time = sum(of times1-times9);

run;

proc print;

run;
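When only the total is needed, the arrays can be skipped and the enrollment time accumulated directly. The sketch below is a simpler alternative under stated assumptions, not the handout's approach: the enrollment records are hypothetical (the data lines for "test" are not shown in this copy), and enroll and enroll_total are made-up names.

```sas
/* Sketch only: hypothetical start/stop quarters for two participants. */
data enroll;
  input id xstart xstop;
  datalines;
1 1 3
1 5 6
2 2 4
;
run;

data enroll_total; set enroll;
  by id;                              /* data must be sorted by id */
  if first.id then total_time = 0;    /* restart for each participant */
  total_time + (xstop - xstart);      /* sum statement retains total_time */
  if last.id then output;
  keep id total_time;
run;

proc print data=enroll_total; run;
```

With these values total_time should be 3 for id 1 and 2 for id 2. The array version above is still the better choice when the individual start, stop and duration values need to be kept on the output record.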
