



















Notes on transforming SAS data sets using DATA steps, creating variables, and working with functions. Topics include statements such as DATA, SET, OUTPUT, RETURN, WHERE, IF, DROP, KEEP, and LENGTH, SAS functions and operators, date values, and missing values.




















Hello there Hello there Hello
data numeric_format_show;
   /* character formatting illustrated first */
   test_num = 1277695.384;
   put 'BEST6. / BEST9. / BEST12.';
   put test_num BEST6.;
   put test_num BEST9.;
   put test_num BEST12.;
   put '-------------------------------';
   put 'COMMA9. / COMMA12.1 / COMMA13.3';
   put test_num COMMA9.;
   put test_num COMMA12.1;
   put test_num COMMA13.3;
   put '-------------------------------';
   put 'E7.';
   put test_num E7.;
   put '-------------------------------';
   put today weekdate29.;
   put '-------------------------------';
   put 'WORDDATE12. / WORDDATE18.';
   put today worddate12.;
   put today worddate18.;
run ;
DATE7. / DATE9. 29SEP 29SEP
DAY2. / DAY7. 29 29
EURDFDD8. 29.09.
MMDDYY8. / MMDDYY6. 09/29/ 092903
WEEKDATE15. / WEEKDATE29. Mon, Sep 29, 03 Monday, September 29, 2003
WORDDATE12. / WORDDATE18. Sep 29, 2003 September 29, 2003
data time_format_show;
   start = 0;
   time_test = 1380442000;
   put start DATETIME13.;
   put time_test DATETIME17.;
run;
01JAN60:00: 29SEP03:08:06:
data test;
   input @1  date  MMDDYY10.
         @21 time  TIME8.
         @31 money DOLLAR10.2;
datalines;
*ODS RTF file='D:\baileraj\Classes\Fall 2007\sta402\SAS-programs\week6-prt1.rtf';
ODS RTF file="\Muserver2\USERS\B\BAILERAJ\public.www\classes\sta402\examples\week06-prt1.rtf";

proc print;
   title 'print of date and time w/o formatting - internal SAS representation';
   var date time money;
run;

proc print;
   title 'print of date and time w/ formatting';
   var date time money;
   format date MMDDYY10. time TIME8. money DOLLAR10.2;
run;
Obs     date     time     money
  1        0     3600      100.
  2    15977    35399    12693.

Obs          date       time       money
  1    01/01/1960    1:00:00       $100.
  2    09/29/2003    9:49:59    $12,693.
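The unformatted listing shows SAS's internal values: a date is stored as the number of days since 1 January 1960 and a time as the number of seconds since midnight. The following is a minimal sketch that rebuilds the second observation with the standard MDY and HMS functions; the variable names d and t are illustrative and do not come from the notes.

```sas
data _null_;
   d = mdy(9, 29, 2003);          /* days since 01JAN1960            */
   t = hms(9, 49, 59);            /* seconds since midnight          */
   put d= t=;                     /* raw internal values             */
   put d= mmddyy10. t= time8.;    /* the same values, formatted      */
run;
```

The first PUT line should report the same raw values as the unformatted listing (15977 and 35399), and the second should match the formatted listing.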
label loggnp = 'Per capita Gross National Product (log10-transformed)';
label ienglish = 'Indicator variable that primary language is English';

proc format;
   value Mlifefmt LOW-54   = ' First quartile'
                  54<-63   = 'Second quartile'
                  63<-68   = ' Third quartile'
                  68<-HIGH = 'Fourth quartile';
   value Wlifefmt LOW-56   = ' First quartile'
                  56<-67   = 'Second quartile'
                  67<-73   = ' Third quartile'
                  73<-HIGH = 'Fourth quartile';
   value Literfmt LOW-53   = ' First quartile'
                  53<-76   = 'Second quartile'
                  76<-90   = ' Third quartile'
                  90<-HIGH = 'Fourth quartile';
   value catlit   1 = 'First quartile'
                  2 = 'Second quartile'
                  3 = 'Third quartile'
                  4 = 'Fourth quartile';
categ_lit6 = 1*(0<liter<=53) + 2*(53<liter<=76) + 3*(76<liter<=90)
*ODS RTF file='D:\baileraj\Classes\Fall 2007\sta402\SAS-programs\week6-freq1.rtf';
ODS RTF file="\Muserver2\USERS\B\BAILERAJ\public.www\classes\sta402\examples\week06-freq1.rtf";
                                      Cumulative    Cumulative
categ_lit1    Frequency    Percent     Frequency       Percent
         1           20      25.97            20           25.
         2           19      24.68            39           50.
         3           19      24.68            58           75.
         4           19      24.68            77          100.
Frequency Missing = 2
                                      Cumulative    Cumulative
categ_lit6    Frequency    Percent     Frequency       Percent
         1           20      25.97            20           25.
         2           19      24.68            39           50.
         3           19      24.68            58           75.
         4           19      24.68            77          100.
Frequency Missing = 2
                                           Cumulative    Cumulative
categ_lit7         Frequency    Percent     Frequency       Percent
.                          2       2.53             2            2.
 First quartile           20      25.32            22           27.
 Third quartile           19      24.05            41           51.
Fourth quartile           19      24.05            60           75.
Second quartile           19      24.05            79          100.
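The categ_lit7 table appears to come from a character variable built by applying a user-defined format, since its levels are the format labels and the missing values show up as an ordinary "." row rather than under "Frequency Missing". The sketch below illustrates that mechanism under stated assumptions: the data set lit_demo, the variable categ, and the five test values are hypothetical, and the format simply repeats the Literfmt ranges defined earlier.

```sas
proc format;
   value literfmt LOW-53   = ' First quartile'
                  53<-76   = 'Second quartile'
                  76<-90   = ' Third quartile'
                  90<-HIGH = 'Fourth quartile';
run;

data lit_demo;                       /* hypothetical test values       */
   length categ $ 15;
   input liter @@;
   categ = put(liter, literfmt.);    /* format label stored as text    */
   datalines;
. 40 60 80 95
;
run;

proc freq data=lit_demo;
   tables categ;                     /* levels sort by their text      */
run;
```

A numeric missing value is not covered by LOW, so PUT returns a period as text for it, and PROC FREQ then counts that string like any other level. Because the levels are sorted as character strings, the labels that begin with a blank sort ahead of those that do not.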
(see example below)
lots of examples...
sqrt_total = sqrt(total);
conc2 = conc**2;
Iplastic = (condition="Plastic");
categ_lit6 = 1*(0<liter<=53) + 2*(53<liter<=76)
Order of Operations/Precedence of operations …
data preced_test;
   x1a = 3*2**2;
   x1b = (3*2)**2;
   x2a = 3-2/2;
   x2b = (3-2)/2;
   x3a = -2**2;
   x3b = (-2)**2;
   put '-------------------------';
   put '| Order of operations |';
   put '| illustrated |';
   put '-------------------------';
   put ' 3*2**2 = ' x1a;
   put '(3*2)**2 = ' x1b;
   put ' 3-2/2 = ' x2a;
   put ' (3-2)/2 = ' x2b;
   put ' -2**2 = ' x3a;
   put ' (-2)**2 = ' x3b;
run;
| Order of operations |
3*2**2 = 12
(3*2)**2 = 36
3-2/2 = 2
(3-2)/2 = 0.
-2**2 = -
(-2)**2 = 4
ODS RTF file="\Muserver2\USERS\B\BAILERAJ\public.www\classes\sta402\SAS-programs\week-06-tab1.rtf";
proc print; run;
proc tabulate;
   class conc brood;
   var count;
   table conc*brood, count*(min q1 median q3 max);
run;
ODS RTF CLOSE;
NOTE: There were 50 observations read from the data set CLASS.NITROFEN.
NOTE: The data set WORK.NITROFEN2 has 40 observations and 7 variables.
NOTE: DATA statement used:
      real time           0.01 seconds
      cpu time            0.01 seconds

data nitrofen3; set class.nitrofen;
   where conc<310;
run;

NOTE: There were 40 observations read from the data set CLASS.NITROFEN.
      WHERE conc<310;
NOTE: The data set WORK.NITROFEN3 has 40 observations and 7 variables.
NOTE: DATA statement used:
      real time           0.66 seconds
      cpu time            0.03 seconds
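The two log excerpts differ in how many observations were read (50 versus 40) even though both output data sets hold 40 observations. One common explanation is that the first step used a subsetting IF while the second used WHERE. The code that created WORK.NITROFEN2 is not shown here, so the first step below is an assumption, and the data set names nitrofen_if and nitrofen_where are hypothetical.

```sas
/* Subsetting IF: every observation is read into the program   */
/* data vector, then discarded if the condition is false.      */
data nitrofen_if;
   set class.nitrofen;
   if conc < 310;
run;

/* WHERE: the condition is applied before observations enter   */
/* the DATA step, so only matching observations are read.      */
data nitrofen_where;
   set class.nitrofen;
   where conc < 310;
run;
```

Comparing the "observations read" notes in the log for the two steps shows the difference. WHERE can only reference variables that already exist in the input data set, whereas a subsetting IF can also test variables created earlier in the same step.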
"\Muserver2\USERS\B\BAILERAJ\public.www\classes\sta402\examples\week06-MC-fig.rtf";
data twogroup;
   array x{10} x1-x10;
   array y{10} y1-y10;

   do isim = 1 to 10000;

      /* generate samples X~N(0,1) Y~N(0,4) - normal case */
      do isample = 1 to 10;
         x{isample} = rannor(0);
         y{isample} = 2*rannor(0);
      end;

      /* calculate the t-statistic */
      xbar = mean(of x1-x10);
      ybar = mean(of y1-y10);

      xvar = var(of x1-x10);
      yvar = var(of y1-y10);

      /* pooled variance with 9 + 9 = 18 degrees of freedom */
      s2p = (9*xvar + 9*yvar)/18;

      tstat = (xbar-ybar)/sqrt(s2p*(2/10));
      Pvalue = 2*(1-probt(abs(tstat),18));
      Reject05 = (Pvalue <= 0.05);

      keep xbar ybar xvar yvar s2p tstat Pvalue Reject05;
      output;
   end;  * end of the simulation loop;
run;

/* proc print; run; */
proc freq; table Reject05; run;
                                     Cumulative    Cumulative
Reject05    Frequency     Percent     Frequency       Percent
-------------------------------------------------------------
       0         9443       94.43          9443           94.
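The table shows 9443 of the 10000 simulated tests not rejecting, which implies 557 rejections at the nominal 0.05 level. A rough way to judge whether that rate differs from 0.05 is a binomial standard error for the estimated rejection rate. The sketch below is an illustration only: it assumes all 10000 P-values are non-missing, and the variable names are hypothetical.

```sas
data _null_;
   nsim    = 10000;                         /* simulated data sets        */
   nreject = nsim - 9443;                   /* rejections implied above   */
   phat    = nreject / nsim;                /* estimated rejection rate   */
   se      = sqrt(phat*(1 - phat)/nsim);    /* binomial standard error    */
   lower   = phat - 1.96*se;                /* approximate 95% limits     */
   upper   = phat + 1.96*se;
   put phat= se= lower= upper=;
run;
```

The estimated rate is 0.0557 with a standard error of roughly 0.0023, so the simulation suggests a rejection rate slightly above the nominal level when the two groups have unequal variances.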
data mle_exact; set m_out;
   if _STAT_='MEAN';
   lambda_MLE = 1/TIME;

proc print data=mle_exact; run;
/*
                                              lambda_
Obs    _TYPE_    _FREQ_    _STAT_      time       MLE
  1       0        25       MEAN    0.95268        1.
*/
Iterative Phase
                      Sum of
Iter     lambda      Squares
   0     0.2500          40.
   1     1.1829          24.
   2     0.9727          23.
   3     1.0355          23.
   4     1.0341          23.
   5     1.0341          23.
   6     1.0343          23.
   7     1.0344          23.
   8     1.0344          23.
   9     1.0344          23.
  10     1.0344          23.
  11     1.0344          23.
  12     1.0344          23.
  13     1.0344          23.
NOTE: Convergence criterion met but a note in the log indicates a possible problem with the model.
Estimation Summary
Method                   Gauss-Newton
Iterations               13
Subiterations            5
Average Subiterations    0.
R                        8.253E-
PPC                      8.915E-
RPC(lambda)              3.603E-
Object                   5.51E-
Objective                23.
Observations Read        25
Observations Used        25
Observations Missing     0
NOTE: An intercept was not specified for this model.
                              Sum of        Mean                 Approx
Source               DF      Squares      Square    F Value      Pr > F
Model                 1     -23.7907    -23.7907       -24.           .
Error                24      23.7907          0.
Uncorrected Total    25            0

                           Approx
Parameter    Estimate    Std Error    Approximate 95% Confidence Limits
/* alternative code using NLMIXED where likelihood is directly entered */
/* added: 6 Oct 04 */

proc nlmixed data=gen_exp;
   parms lambda=0.25;
   ll = log(lambda) - lambda*time;
   model time ~ general(ll);   * could also use gamma(lambda,1) in model;
run;
Specifications
Data Set                               WORK.GEN_EXP
Dependent Variable                     time
Distribution for Dependent Variable    General
Optimization Technique                 Dual Quasi-Newton
Integration Method                     None

Dimensions
Observations Used        25
Observations Not Used    0
Total Observations       25
Parameters               1

Parameters
lambda    NegLogLike
  0.25           40.
Iteration History
Iter    Calls    NegLogLike        Diff     MaxGrad       Slope
   1        2    25.0547568    15.55684    9.516393       -380.
   2        4    23.8231484    1.231608    1.308295         -0.
   3        5    23.7908217    0.032327    0.359401         -0.
   4        6    23.7880391    0.002783     0.01845         -0.
   5        7    23.7880316    7.492E-6    0.000274         -0.
   6        9    23.7880316    1.655E-9    1.785E-8     -3.31E-
NOTE: GCONV convergence criterion satisfied.
Fit Statistics
-2 Log Likelihood            47.
AIC (smaller is better)      49.
AICC (smaller is better)     49.
BIC (smaller is better)      50.

Parameter Estimates
                         Standard
Parameter    Estimate       Error    DF    t Value    Pr > |t|    Alpha     Lower     Upper    Gradient
lambda         1.0497      0.2099    25       5.00      <.0001     0.05    0.6173    1.4820     1.785E-
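The NLMIXED estimate can be checked against the closed-form maximum likelihood estimate. The derivation below is a sketch that assumes the times are independent exponential observations with rate lambda, which is the model implied by the ll statement in the NLMIXED code; the notation (n, t_i, and the bar for the sample mean) is introduced here for illustration.

```latex
\begin{align*}
f(t_i \mid \lambda) &= \lambda e^{-\lambda t_i}, \qquad t_i > 0, \\
\ell(\lambda) &= \sum_{i=1}^{n} \bigl(\log\lambda - \lambda t_i\bigr)
               = n\log\lambda - \lambda \sum_{i=1}^{n} t_i, \\
\frac{d\ell}{d\lambda} &= \frac{n}{\lambda} - \sum_{i=1}^{n} t_i = 0
  \;\Longrightarrow\;
  \hat\lambda = \frac{n}{\sum_{i=1}^{n} t_i} = \frac{1}{\bar t}, \\
-\frac{d^{2}\ell}{d\lambda^{2}} &= \frac{n}{\lambda^{2}}
  \;\Longrightarrow\;
  \widehat{\mathrm{se}}(\hat\lambda) \approx \frac{\hat\lambda}{\sqrt{n}}.
\end{align*}
```

With the sample mean 0.95268 and n = 25 reported by the earlier PROC PRINT output, this gives an estimate of about 1.0497 and a standard error of about 0.2099, in agreement with the Parameter Estimates table.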